Advanced Tool Builds & Capstone Projects
This is the final and most technically ambitious tutorial in the programme. You'll build three professional-grade SEO tools: a log file analyser that diagnoses crawl budget waste, a redirect chain auditor that maps and flags redirect problems programmatically, and a content brief generator that uses the Claude API itself to produce AI-powered briefs from live SERP data. By the end, you'll have a toolkit genuinely capable of replacing hours of manual agency work every week.
- Parse and analyse large server log files with Python
- Build a redirect chain follower that detects loops, long chains, and bad destinations
- Call the Claude API programmatically from within a Python script
- Combine live web scraping with AI analysis to generate content briefs
- Understand how to structure and maintain a professional tools repository
- Know where to go next after completing the programme
1. Your Completed Tools Repository
Before building the final three tools, here's what your seo-tools repository should look like by the end of this tutorial — a complete, professional toolkit any member of your team can use:
Ask Claude Code to write your README: Once all tools are built, open Claude Code and say: "Read all the Python scripts in the scripts/ folder and write a comprehensive README.md that explains what each tool does, what input files it needs, how to run it, and what output it produces." It will do this automatically.
What it does
Server log files contain every request made to your site — including every Googlebot visit. Analysing them reveals what Google is actually crawling, how often, and where it's wasting budget on low-value URLs. This tool parses a raw Apache or Nginx access log, filters for search engine bot requests, and produces a prioritised report of crawl budget waste.
Key concepts
Access logs are plain text, one line per request. Each line contains: IP, date, HTTP method, URL, status code, bytes, referrer, and user agent. We filter for lines where the user agent contains "Googlebot".
Log files can run to many gigabytes, so we read them line by line rather than loading the whole file into memory — essential for anything over ~100 MB.
URLs crawled frequently that return 4xx, 5xx, or redirect responses are wasting crawl budget. So are low-value URL patterns like faceted navigation, session IDs, and print pages.
We look at how Googlebot's crawls are distributed across the site — are key commercial pages being crawled as often as low-value pages?
Building the tool
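Claude Code will generate the full script from your prompt, but it helps to understand the core loop it's building. Here's a minimal sketch, assuming the standard Apache/Nginx combined log format; the file path and the "non-2xx = waste" shortcut are illustrative placeholders rather than the exact logic Claude Code will produce:

```python
import re
from collections import Counter

# Combined log format: IP, identity, user, [date], "request", status, bytes,
# "referrer", "user agent" — the user agent is the final quoted field.
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\S+) "([^"]*)" "([^"]*)"'
)

status_counts = Counter()
url_counts = Counter()

# Read line by line so multi-gigabyte logs never have to fit in memory.
with open("data/access.log", encoding="utf-8", errors="replace") as log:
    for line in log:
        match = LOG_PATTERN.match(line)
        if not match:
            continue  # skip malformed lines
        ip, date, method, url, status, size, referrer, user_agent = match.groups()
        if "Googlebot" not in user_agent:
            continue  # only interested in Google's crawler
        status_counts[status] += 1
        url_counts[url] += 1

total = sum(status_counts.values())
waste = sum(n for status, n in status_counts.items() if not status.startswith("2"))
if total:
    print(f"Googlebot requests: {total:,}")
    print(f"Crawl budget waste (non-2xx responses): {waste:,} ({waste / total:.1%})")
    print("Most-crawled URLs:")
    for url, hits in url_counts.most_common(10):
        print(f"  {hits:>6}  {url}")
```

The full tool adds the date-range calculation, low-value URL pattern matching, and the HTML/CSV report output shown below.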
Sample output
```
Log File Analysis Complete
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
File: data/access.log | Size: 847 MB
Date range: 2026-01-01 to 2026-02-19 (50 days)
Googlebot requests: 24,847 (avg 497/day)
Unique URLs crawled: 8,312

Status breakdown:
  200 OK             18,204  (73.3%)
  301 Redirect        3,891  (15.7%)  ← budget waste
  404 Not Found       2,196   (8.8%)  ← budget waste
  500 Server Error      556   (2.2%)  ← budget waste

⚠ Crawl budget waste: 6,643 requests (26.7% of total)
⚠ Faceted nav URLs crawled: 1,204 unique URLs, 3,812 requests

Reports saved to output/log_analysis_report.html
Waste URLs saved to output/log_analysis_waste.csv
```
How to get a log file from a client
Log files live on the web server. For clients on shared hosting, they can download them from cPanel → Logs → Raw Access. For VPS/dedicated servers, they're typically at /var/log/apache2/access.log or /var/log/nginx/access.log. For clients on cloud hosting (WP Engine, Kinsta, etc.), check their dashboard for a log download option. A week's worth of logs is usually sufficient for analysis.
What it does
Takes a CSV of URLs (from a crawl, a sitemap, or a manual list), follows every redirect chain for each URL, and produces a report flagging: chains longer than 2 hops, redirect loops, chains that end in non-200 responses, and HTTP-to-HTTPS redirect opportunities. This is one of the most-requested tools in SEO migrations and site audits.
Why redirect chains matter: Each hop in a redirect chain adds latency, dilutes PageRank, and risks Googlebot abandoning the chain before reaching the destination. Google's John Mueller has stated Google will follow up to 10 redirects, but best practice is a maximum of 1–2 hops. Chains from old site migrations often go undetected for years.
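To make the mechanics concrete, here's a minimal sketch of the chain-following logic: requests are made with redirects disabled so each hop can be inspected individually. The hop threshold and classification labels are illustrative, not necessarily the exact ones Claude Code will choose:

```python
import requests
from urllib.parse import urljoin

MAX_HOPS = 10  # Google documents that Googlebot follows up to 10 redirect hops

def follow_chain(url: str, timeout: int = 10) -> dict:
    """Follow a redirect chain one hop at a time and classify the result."""
    chain = [url]
    seen = {url}
    current = url
    for _ in range(MAX_HOPS):
        # Some servers reject HEAD; a production version would fall back to GET.
        resp = requests.head(current, allow_redirects=False, timeout=timeout)
        if resp.status_code not in (301, 302, 303, 307, 308):
            hops = len(chain) - 1
            if resp.status_code != 200:
                label = "BAD_DESTINATION"
            elif hops > 2:
                label = "LONG_CHAIN"
            else:
                label = "OK"
            return {"chain": chain, "final_status": resp.status_code, "label": label}
        next_url = urljoin(current, resp.headers.get("Location", ""))
        if next_url in seen:
            return {"chain": chain + [next_url], "final_status": None, "label": "LOOP"}
        chain.append(next_url)
        seen.add(next_url)
        current = next_url
    return {"chain": chain, "final_status": None, "label": "TOO_MANY_HOPS"}

print(follow_chain("http://example.com/old-page"))
```

The full tool wraps this in a loop over the input CSV, adds HTTP-to-HTTPS detection, and writes the classified results to an HTML report.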
Useful follow-up improvements
| Improvement | Follow-up prompt |
|---|---|
| Generate fix recommendations | "For each LONG_CHAIN URL, add a 'Recommended Fix' column showing the direct URL the source should redirect to (skipping intermediate hops)." |
| Bulk import from Screaming Frog | "Add support for a Screaming Frog redirect export CSV as an alternative input format — the column is called 'Address'." |
| Check redirect type consistency | "Flag any chains that mix 301 and 302 redirects — all hops should use 301 for permanent redirects." |
What it does & why it's different
This is the most sophisticated tool in the programme. Unlike the previous tools, which use Python to process and report on data, this one calls the Claude API directly to perform AI analysis as part of the script itself. The result is a tool that fetches the top 10 SERP results for a target keyword, extracts each page's heading structure, feeds everything into Claude, and receives back a full, structured content brief — all automatically.
Understanding the Claude API call
What's the Claude API? The same Claude you've been using in your browser is accessible via an API — meaning your Python script can send it a message and receive a reply, exactly like you do in the chat interface. The difference is it happens inside your code, automatically, as part of a larger workflow. You need an Anthropic API key for this (separate from your Claude for Teams subscription — see below).
API key for this tool: The Claude API requires an Anthropic API key from console.anthropic.com. This has a cost component based on usage, but content brief generation uses a modest number of tokens — typically a few pence per brief. Add the key to your .env file as ANTHROPIC_API_KEY.
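Assuming the scripts follow the earlier tutorials' convention of loading keys from .env with the python-dotenv package (adjust if your project loads environment variables differently), the key is picked up like this:

```python
import os
from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from the .env file in the project root
api_key = os.environ["ANTHROPIC_API_KEY"]
```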
How the tool works — build phases
SERP scraping
Fetch top 10 Google results for the target keyword. Extract each result's URL, title, and meta description.
Page heading extraction
Fetch each of the 10 pages and extract its H1–H3 headings along with the first sentence of each major section.
Claude API analysis
Send all extracted data to Claude with a detailed system prompt asking it to synthesise a content brief.
Brief generation
Claude returns a structured brief. The script formats and saves it as a clean HTML file.
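As a concrete illustration of phase 2, heading extraction with requests and BeautifulSoup looks roughly like this. The user agent string and example URL are placeholders, and Claude Code's generated version will add error handling and politeness delays:

```python
import requests
from bs4 import BeautifulSoup

def extract_headings(url: str) -> list[dict]:
    """Fetch a competitor page and return its H1-H3 headings in document order."""
    resp = requests.get(
        url,
        headers={"User-Agent": "Mozilla/5.0 (brief-generator)"},
        timeout=15,
    )
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    return [
        {"level": tag.name.upper(), "text": tag.get_text(strip=True)}
        for tag in soup.find_all(["h1", "h2", "h3"])
    ]

for heading in extract_headings("https://example.com/technical-seo-audit-guide"):
    print(f"{heading['level']}: {heading['text']}")
```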
The Claude API call in plain Python
For reference, here's the core API call Claude Code will generate. You don't need to write this yourself — it's shown here so you understand what's happening:
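A representative version, using the official anthropic Python package — the system prompt is abbreviated here, and Claude Code's generated version will be more detailed:

```python
import os
import anthropic

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

# serp_data is the serialised SERP results and competitor heading structures
# gathered in the earlier phases of the script.
serp_data = "..."

message = client.messages.create(
    model="claude-sonnet-4-6",  # see the model note below
    max_tokens=4000,
    system=(
        "You are an expert SEO content strategist. Using the SERP data and "
        "competitor heading structures provided, produce a structured content "
        "brief covering search intent, recommended format, and heading structure."
    ),
    messages=[
        {
            "role": "user",
            "content": f"Target keyword: technical seo audit\n\nSERP data:\n{serp_data}",
        }
    ],
)

brief_text = message.content[0].text  # the brief as plain text, ready to format as HTML
print(brief_text)
```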
Choosing the right model: Use claude-sonnet-4-6 for brief generation — it's the best balance of quality and cost for this task. For simpler tasks like summarising a single page's headings, claude-haiku-4-5-20251001 is faster and cheaper. Claude Code will use whichever model you specify.
Sample brief output snippet
```
CONTENT BRIEF — "technical seo audit"
Generated: 19 Feb 2026 | Analysed 8 competitor pages

1. SEARCH INTENT ANALYSIS
Mixed intent: primarily informational (users learning what a technical SEO
audit involves) with a secondary commercial layer (users evaluating whether
to hire an agency vs. do it themselves). Content must serve both: explain
the process credibly while positioning the brand as the expert to trust
with the work.

2. RECOMMENDED FORMAT
Comprehensive guide with embedded tool/checklist. All top-ranking pages are
long-form guides (2,400–4,800 words). A checklist component would provide
differentiation and increase time-on-page.

5. RECOMMENDED HEADING STRUCTURE
H1: What Is a Technical SEO Audit? (Complete Guide for 2026)
  H2: What Does a Technical SEO Audit Cover?
    H3: Crawlability and indexation
    H3: Site architecture and internal linking
    H3: Page speed and Core Web Vitals
    H3: Structured data and schema markup
    H3: Mobile usability
    H3: International SEO (hreflang)  ← gap: only 2/8 pages cover this
  H2: How to Run a Technical SEO Audit: Step-by-Step
...
```
2. Where to Go From Here
Completing this programme means your team can build, maintain, and iterate on professional SEO tools. But this is a starting point, not a finish line. Here are the most valuable directions to explore next:
3. Practice Exercises
Build Tool 1 and run it on a real client log file:
- Use the Claude Code prompt from Section 1 to build the script
- Obtain a week's worth of access logs from one client (see "How to get a log file" in Section 1)
- Run the analyser and review the HTML report
- Identify the top 3 crawl budget waste issues in the report
- Follow up: "For each waste pattern identified, write a one-paragraph recommendation I can include in the client's audit report"
Run the redirect auditor on a real migration or audit project:
- Build the script using the Claude Code prompt from Section 1
- Create a URLs CSV from a recent Screaming Frog crawl (export redirect URLs)
- Run the auditor and review classifications in the HTML report
- Identify any LOOP or BAD_DESTINATION cases — these need immediate attention
- Follow up in the same session: "Generate a redirect fix plan CSV with columns: source_url, current_chain, recommended_direct_target"
Build Tool 3 and generate a real brief for a client keyword:
- Get an Anthropic API key from console.anthropic.com and add it to .env
- Get a SerpAPI or ValueSERP key (both offer free trial credits) and add it to .env
- Use the Claude Code prompt from Section 1 to build the script
- Run it for a target keyword from one of your client's wish lists
- Review the generated brief — is the heading structure reasonable? Compare it to what you'd have written manually. What did it miss?
- Send the brief to a content writer and ask for their feedback on usefulness
4. Programme Complete
Final thought: The most important thing you've built over these eight tutorials isn't any individual tool — it's the habit of reaching for Claude Code when you hit a repetitive task, the discipline of maintaining a shared prompt library, and the confidence to say "we can build that" when a client need arises. Those compound over months and years in ways that a single script never will.