
How to Run a Complete Website Audit Step by Step

Published by SiteCrawlIQ Team

A website audit is only as useful as its thoroughness. Miss a category and you miss the issues hiding in it. This guide walks you through every phase of a modern website audit - from preparation to action plan - covering SEO, GEO, content quality, and conversion readiness.

Whether you use SiteCrawlIQ or another tool (or a combination of tools and manual checks), this process applies.

Phase 1: Preparation (15 Minutes)

Before you crawl a single page, set yourself up for a productive audit.

Define Your Goals

What triggered this audit? Your goals shape what you prioritize:

  • Traffic dropped - Focus on technical SEO and indexing issues

  • Launching a new site - Focus on foundational technical checks and content completeness

  • Competitors outranking you - Focus on content gaps and competitive benchmarking

  • AI search visibility - Focus on GEO readiness factors

  • Conversion rate optimization - Focus on CRO and user experience checks

  • Routine quarterly check - Cover everything at a high level

    Gather Baseline Data

    Pull these before you start so you have a comparison point:

  • Google Search Console - Impressions, clicks, average position for the last 90 days

  • Google Analytics - Traffic by channel, bounce rates, conversion rates

  • Previous audit results - If you have them, compare against prior findings

  • Core Web Vitals report - From Search Console or PageSpeed Insights

    Set Your Scope

    Decide what you are auditing:

  • Full site - Every crawlable page (recommended for first audit or quarterly reviews)

  • Specific section - Just the blog, just product pages, just landing pages

  • Sample audit - A representative sample of 50-100 pages (useful for very large sites as a quick check)

    Phase 2: Technical Crawl (1-5 Minutes)

    This is where automated tools earn their value. A crawler visits every page and checks structural health.

    Using SiteCrawlIQ

  • Log in and add your site URL

  • Click "Start Crawl" - the hybrid crawler (Cheerio + Playwright) discovers pages via sitemap and internal links

  • For a 200-page site, this typically completes in 30-60 seconds

  • For larger sites, adjust the page limit in your plan settings

    What the Crawl Checks

    For every discovered page, the crawl records:

  • Status code (200, 301, 302, 404, 500)

  • Title tag (presence, length, uniqueness)

  • Meta description (presence, length, uniqueness)

  • H1 tag (presence, count, uniqueness)

  • Content length (word count)

  • Internal links (count, broken links)

  • External links (count, broken links)

  • Canonical tag (presence, correctness)

  • Robots meta directives (noindex, nofollow)

  • Page load time (in milliseconds)

  • Content type and encoding
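
To make the list above concrete, here is a minimal sketch of how a few of these fields could be read from a page's HTML using only the Python standard library. It is not how SiteCrawlIQ or any particular crawler works; the class name `PageFieldParser` and the function `extract_fields` are hypothetical, and status code and load time are left out because they come from the HTTP response rather than the markup.

```python
from html.parser import HTMLParser


class PageFieldParser(HTMLParser):
    """Collects a handful of on-page fields from one HTML document."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.canonical = None
        self.robots = None
        self.h1_count = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name == "description":
                self.meta_description = attrs.get("content")
            elif name == "robots":
                self.robots = attrs.get("content")
        elif tag == "link":
            rel = (attrs.get("rel") or "").lower().split()
            if "canonical" in rel:
                self.canonical = attrs.get("href")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text inside <title> is accumulated.
        if self._in_title:
            self.title += data


def extract_fields(html):
    """Return a dict of on-page fields for a single HTML string."""
    parser = PageFieldParser()
    parser.feed(html)
    return {
        "title": parser.title.strip(),
        "meta_description": parser.meta_description,
        "canonical": parser.canonical,
        "robots": parser.robots,
        "h1_count": parser.h1_count,
    }
```

Running `extract_fields` over every crawled page gives a table you can sort and filter in the later phases.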

    What to Look For Immediately

    After the crawl completes, check these first:

  • Any 500 errors - Server errors indicate infrastructure problems that affect all users

  • Redirect chains - URLs that redirect more than once waste crawl budget and slow page loads

  • Missing title tags - Pages without titles leave search engines to guess a headline, which weakens relevance signals and click-through

  • Duplicate content signals - Multiple pages with the same title or content
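
Redirect chains are easy to spot once the crawl has recorded where each redirecting URL points. The sketch below assumes a hypothetical `redirects` mapping (source URL to target URL) exported from your crawl data; the function names are illustrative only.

```python
def redirect_chain(url, redirects, max_hops=10):
    """Follow redirects from `url` and return every URL visited, start first."""
    chain = [url]
    seen = {url}
    while chain[-1] in redirects and len(chain) <= max_hops:
        target = redirects[chain[-1]]
        chain.append(target)
        if target in seen:  # redirect loop: stop instead of spinning
            break
        seen.add(target)
    return chain


def find_redirect_chains(redirects):
    """Return chains with more than one hop, keyed by their starting URL."""
    chains = {}
    for url in redirects:
        chain = redirect_chain(url, redirects)
        if len(chain) > 2:  # start URL plus at least two hops
            chains[url] = chain
    return chains
```

A single redirect (one hop) is usually fine; the chains returned here are the ones worth collapsing so the first URL points straight at the final destination.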

    Phase 3: Technical SEO Analysis (20 Minutes)

    With crawl data in hand, systematically work through the technical categories.

    Crawlability and Indexing

  • Are all important pages returning 200 status codes?

  • Is your XML sitemap complete and submitted to Search Console?

  • Does robots.txt block anything it should not?

  • Are there orphan pages (pages with no internal links pointing to them)?

  • Check crawl depth - can every important page be reached within 3 clicks from the homepage?
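
The orphan-page and crawl-depth questions can both be answered from the internal link graph. This is a rough sketch, assuming a hypothetical `links` dict (page to the pages it links to) and an `all_pages` set taken from your sitemap; click depth is computed with a breadth-first search from the homepage.

```python
from collections import deque


def crawl_depths(links, homepage):
    """Breadth-first search: minimum number of clicks from the homepage."""
    depths = {homepage: 0}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def orphans_and_deep_pages(links, homepage, all_pages, max_depth=3):
    """Return (pages nothing links to, pages deeper than max_depth clicks)."""
    depths = crawl_depths(links, homepage)
    linked_to = {target for targets in links.values() for target in targets}
    orphans = sorted(p for p in all_pages if p != homepage and p not in linked_to)
    too_deep = sorted(p for p, depth in depths.items() if depth > max_depth)
    return orphans, too_deep
```

Orphans only show up if `all_pages` comes from a source other than link discovery, which is why comparing the crawl against the sitemap matters.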

    Site Architecture

  • Is the URL structure logical and hierarchical?

  • Do breadcrumbs reflect the site structure accurately?

  • Is navigation consistent across all pages?

  • Are pagination and canonical tags handling multi-page content correctly?

    Security and Performance

  • Is HTTPS enforced site-wide with no mixed content?

  • Are Core Web Vitals passing? Check LCP (2.5s or less), INP (200ms or less), CLS (0.1 or less)

  • Is the site mobile-responsive?

  • Are images optimized (compressed, modern formats, correct dimensions)?
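
The Core Web Vitals thresholds above can be applied mechanically to whatever numbers you pulled in Phase 1. A small sketch, assuming a hypothetical `metrics` dict with keys `lcp_s`, `inp_ms`, and `cls` (these key names are made up for the example):

```python
# Upper bounds for a "good" rating, as listed in the checklist above.
THRESHOLDS = {"lcp_s": 2.5, "inp_ms": 200, "cls": 0.1}


def failing_vitals(metrics):
    """Return the names of metrics that exceed their threshold."""
    return sorted(
        name for name, limit in THRESHOLDS.items() if metrics[name] > limit
    )
```

For example, `failing_vitals({"lcp_s": 3.1, "inp_ms": 180, "cls": 0.1})` flags only `lcp_s`. Field reports are typically assessed at the 75th percentile of real visits, so a single lab run may not match what Search Console shows.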

    For a comprehensive technical checklist, see our [complete site audit checklist](/blog/complete-site-audit-checklist-2026).

    Phase 4: On-Page SEO Review (20 Minutes)

    Title Tags

  • Every page should have a unique title under 60 characters

  • Titles should include the primary keyword naturally

  • Check for truncation in search results

  • Avoid duplicate titles across different pages

    Meta Descriptions

  • Every page should have a unique meta description under 160 characters

  • Descriptions should be compelling and include a call to action where appropriate

  • While not a direct ranking factor, good descriptions improve CTR
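
The title and meta description rules (present, unique, under a length limit) follow the same pattern, so one helper can cover both. This sketch assumes a hypothetical `pages` dict mapping each URL to its extracted fields; the function name and issue labels are illustrative.

```python
from collections import defaultdict


def audit_text_field(pages, field, max_length):
    """Flag missing, over-length, and duplicated values of `field` per URL."""
    issues = defaultdict(list)
    urls_by_value = defaultdict(list)
    for url, data in pages.items():
        value = (data.get(field) or "").strip()
        if not value:
            issues[url].append("missing")
            continue
        if len(value) >= max_length:  # the guideline is "under" the limit
            issues[url].append("too long")
        urls_by_value[value.lower()].append(url)
    for urls in urls_by_value.values():
        if len(urls) > 1:
            for url in urls:
                issues[url].append("duplicate")
    return dict(issues)
```

Call it as `audit_text_field(pages, "title", 60)` for titles and `audit_text_field(pages, "meta_description", 160)` for descriptions.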

    Heading Structure

  • Exactly one H1 per page

  • H2s break content into logical sections

  • H3s provide subsection structure

  • No skipped heading levels

  • Pages with proper heading hierarchy are 2.8x more likely to be cited by AI engines
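
The "exactly one H1" and "no skipped levels" rules can be checked from the sequence of heading levels on a page. A minimal sketch, assuming you already have the levels in document order (for example `[1, 2, 3, 2]` for H1, H2, H3, H2); the function name is hypothetical.

```python
def heading_issues(levels):
    """Return a list of problems found in a page's heading level sequence."""
    issues = []
    h1_count = levels.count(1)
    if h1_count != 1:
        issues.append(f"expected exactly one H1, found {h1_count}")
    previous = None
    for level in levels:
        # Going deeper by more than one level (H1 -> H3) skips a level;
        # jumping back up (H3 -> H2) is fine.
        if previous is not None and level > previous + 1:
            issues.append(f"skipped level: H{previous} -> H{level}")
        previous = level
    return issues
```

An empty list means the page passes both rules.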

    Internal Linking

  • Every important page should receive internal links from other pages

  • Anchor text should be descriptive (avoid "click here")

  • Link to your most valuable pages from your highest-authority pages

  • Check for broken internal links

    Phase 5: Content Quality Assessment (30 Minutes)

    This phase benefits most from human review, even when using automated tools.

    Quantitative Checks (Automated)

  • Word count distribution - flag pages under 300 words

  • Content uniqueness - check for internal duplicate content

  • Reading level - is your content appropriate for your audience?

  • Freshness - when was content last updated?
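
Two of the quantitative checks, thin pages and internal duplicates, are simple enough to sketch. The example assumes a hypothetical `pages` dict of URL to visible body text. It only catches bodies that are identical after lowercasing and whitespace normalization; near-duplicates need a fuzzier comparison that is not shown here.

```python
import hashlib
import re


def content_flags(pages, min_words=300):
    """Return (thin page URLs, groups of URLs with identical body text)."""
    thin = []
    urls_by_hash = {}
    for url, text in pages.items():
        words = re.findall(r"\w+", text.lower())
        if len(words) < min_words:
            thin.append(url)
        # Hash the normalized word sequence so spacing and case differences
        # do not hide an exact duplicate.
        digest = hashlib.sha256(" ".join(words).encode("utf-8")).hexdigest()
        urls_by_hash.setdefault(digest, []).append(url)
    duplicates = [sorted(urls) for urls in urls_by_hash.values() if len(urls) > 1]
    return sorted(thin), duplicates
```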

    Qualitative Review (Manual)

    For your top 10-20 pages by traffic or business importance:

  • Does the content fully answer the user's query?

  • Is the information accurate and current?

  • Are there content gaps compared to competing pages?

  • Is E-E-A-T demonstrated (author credentials, citations, experience)?

  • Does the content include statistics, examples, and actionable advice?

    Phase 6: GEO Readiness Audit (10 Minutes)

    This is the phase most auditors still skip - and it is increasingly the most impactful.

    AI Crawler Access

    Check your robots.txt for these essential directives:

  • GPTBot (OpenAI/ChatGPT) - Allow

  • Google-Extended (Gemini and Vertex AI; AI Overviews follow your regular Googlebot rules) - Allow

  • ClaudeBot (Anthropic/Claude) - Allow

  • PerplexityBot (Perplexity) - Allow

  • Applebot-Extended (Apple Intelligence) - Allow

    About 26% of top websites still block GPTBot. If you are among them, you are invisible to ChatGPT's 883 million monthly users.
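
One way to check these directives without reading robots.txt by eye is Python's standard `urllib.robotparser`. The sketch below takes the robots.txt contents as a string and reports which of the crawlers listed above are blocked from a given URL; the function name and the default test URL are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = [
    "GPTBot",
    "Google-Extended",
    "ClaudeBot",
    "PerplexityBot",
    "Applebot-Extended",
]


def blocked_ai_crawlers(robots_txt, url="https://example.com/"):
    """Return the AI crawler tokens that robots.txt disallows for `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [agent for agent in AI_CRAWLERS if not parser.can_fetch(agent, url)]
```

Paste in your live robots.txt and test a few representative URLs (homepage, a blog post, a product page), since a rule can block one path while leaving others open.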

    llms.txt File

    Check for presence at yoursite.com/llms.txt. If it does not exist, create one. This is a 30-minute task with outsized impact. See our [llms.txt guide](/blog/how-to-check-llms-txt) for format details.
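
The presence check can be scripted for a list of sites. This is a sketch only: the function names and the `audit-sketch/0.1` user agent string are made up, and a 200 response does not prove the file is valid, because some servers return an HTML "not found" page with a 200 status.

```python
from urllib.error import URLError
from urllib.parse import urljoin
from urllib.request import Request, urlopen


def llms_txt_url(site_url):
    """Build the root-level llms.txt URL for any URL on the site."""
    return urljoin(site_url, "/llms.txt")


def has_llms_txt(site_url, timeout=10):
    """Return True if the site answers 200 for /llms.txt."""
    request = Request(
        llms_txt_url(site_url),
        headers={"User-Agent": "audit-sketch/0.1"},  # hypothetical UA string
    )
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (URLError, TimeoutError):
        # Covers HTTP errors such as 404, DNS failures, and timeouts.
        return False
```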

    Schema Markup Coverage

  • Organization schema on the homepage (minimum)

  • WebSite schema with SearchAction

  • Article/BlogPosting schema on content pages

  • FAQPage schema where appropriate

  • BreadcrumbList on interior pages
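
As an illustration of the minimum item on this list, the sketch below builds an Organization JSON-LD snippet. The function name and the example values are placeholders; `@context`, `@type`, `name`, `url`, and `logo` are standard schema.org properties, but which ones your site should publish is a judgment call.

```python
import json


def organization_jsonld(name, url, logo_url):
    """Return a <script> tag containing Organization structured data."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo_url,
    }
    return (
        '<script type="application/ld+json">'
        + json.dumps(data, indent=2)
        + "</script>"
    )
```

The returned tag goes in the homepage `<head>`; validate the output with a structured data testing tool before shipping it.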

    Content Citability

    Evaluate your content against AI citation factors:

  • Answer-first formatting

  • Factual density (statistics, data points)

  • Clear heading hierarchy

  • Lists and tables for structured information

  • Freshness signals (dates, "last updated" timestamps)

    SiteCrawlIQ runs all 40+ GEO checks automatically and produces a citability score for each page.

    Phase 7: AI-Powered Analysis (5 Minutes)

    After collecting all the data, AI analysis connects the dots and prioritizes what matters.

    In SiteCrawlIQ, click "Run AI Analysis" after your crawl completes. The GPT-5 analysis reads your entire crawl dataset and produces:

  • "If you only do 3 things" - The three highest-impact actions

  • Fix Now - Critical issues affecting performance today

  • Fix Next - Important issues for the next sprint

  • Fix Later - Low-priority improvements

  • Multi-agent insights - Separate analyses from technical, content, CRO, GEO, and competitive specialist agents

    Each recommendation includes evidence from your crawl data, the scope of the issue (how many pages affected), and specific remediation steps.

    Phase 8: Create Your Action Plan (15 Minutes)

    An audit without an action plan is just information. Convert findings into a prioritized task list:

    Priority Framework

    | Priority | Criteria | Timeline |
    |----------|----------|----------|
    | P0 - Critical | Breaks indexing, causes errors, blocks revenue | This week |
    | P1 - High | Significant ranking/visibility impact | Next 2 weeks |
    | P2 - Medium | Moderate improvement opportunity | Next month |
    | P3 - Low | Minor optimization, polish | Next quarter |

    Example Action Plan Format

  • [P0] Fix 12 pages returning 500 errors - Server misconfiguration on /products/ path. Contact hosting provider.

  • [P0] Unblock GPTBot in robots.txt - Currently blocked, losing all ChatGPT visibility.

  • [P1] Create llms.txt file - High-impact GEO fix, 30 minutes of work.

  • [P1] Fix 45 missing meta descriptions - Use CMS bulk editor.

  • [P2] Add FAQ schema to 20 blog posts - Install schema plugin, configure.

  • [P3] Compress 200+ unoptimized images - Use ShortPixel bulk optimizer.
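
If your findings live in a spreadsheet or script, the format above is easy to generate. A small sketch, assuming each finding is a hypothetical `(priority, task, note)` tuple:

```python
PRIORITY_ORDER = ["P0", "P1", "P2", "P3"]


def format_action_plan(findings):
    """Sort findings by priority tier and render one line per task."""
    # sorted() is stable, so tasks keep their original order within a tier.
    ordered = sorted(findings, key=lambda item: PRIORITY_ORDER.index(item[0]))
    return [f"[{priority}] {task} - {note}" for priority, task, note in ordered]
```

Keeping the tier in a fixed list makes an unexpected label such as "P4" fail loudly instead of being sorted silently.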

    Track Progress

    Re-run the audit after completing each priority tier. Compare scores to confirm improvements and catch any regressions introduced by the changes.
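
Comparing two audit runs can be as simple as diffing issue counts. The sketch below assumes each run is summarized as a hypothetical dict of issue name to number of affected pages; the issue names in the usage note are invented.

```python
def compare_audits(before, after):
    """Classify each issue as fixed, improved, regressed, or new."""
    report = {"fixed": [], "improved": [], "regressed": [], "new": []}
    for issue in sorted(set(before) | set(after)):
        old = before.get(issue, 0)
        current = after.get(issue, 0)
        if issue not in before and current > 0:
            report["new"].append(issue)
        elif old > 0 and current == 0:
            report["fixed"].append(issue)
        elif current < old:
            report["improved"].append(issue)
        elif current > old:
            report["regressed"].append(issue)
    return report
```

For instance, going from `{"server_errors": 12, "broken_links": 3}` to `{"server_errors": 0, "broken_links": 5}` reports `server_errors` as fixed and `broken_links` as regressed, which is the kind of side effect a re-audit is meant to catch.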

    Key Takeaways

  • Preparation matters - define goals and gather baselines before crawling

  • Technical crawling should be automated; content review benefits from human judgment

  • GEO readiness is the most commonly skipped audit phase and often the most impactful

  • AI-powered analysis connects data points and prioritizes fixes that humans might miss

  • Every audit must end with a prioritized action plan, not just a report

  • Re-audit after fixes to confirm improvements and catch regressions

    Frequently Asked Questions

    How long does a complete website audit take?

    Using automated tools with manual review of key pages, a thorough audit of a 200-page site takes about 2 hours. The automated crawl and analysis take 5-10 minutes; the rest is manual review and action plan creation. Sites over 1,000 pages may need a full day.

    Should I audit staging or production?

    Always audit production. Staging environments often have different configurations, blocked crawlers, and placeholder content that produces misleading results. The only exception is pre-launch audits of a new site or redesign, where staging is all that exists.

    What if my audit finds hundreds of issues?

    This is normal, especially for first audits. Do not try to fix everything at once. Focus on P0 and P1 issues first. Many lower-priority issues can be addressed through improved processes (e.g., requiring meta descriptions before publishing) rather than retroactive fixes.

    Can I audit a competitor's site?

    You can crawl publicly accessible competitor pages to benchmark technical factors. You will not have access to their analytics or Search Console data, but crawl-level data (titles, schema, page speed, GEO readiness) provides valuable competitive intelligence.

    ---

    Ready to run your first complete audit? [Start free with SiteCrawlIQ](https://sitecrawliq.com) - 200 pages, full SEO + GEO analysis, no credit card required.
