
SEO as Infrastructure

I pointed an AI agent at my site's SEO and it built systems instead of applying patches.

I pointed OpenAI’s GPT 5.4/fast at tonyseets.com and told it to make the site present well to machines. Search engines, social previews, LLMs. The brief was open-ended on purpose. I didn’t hand it a checklist. I said: audit the current state, fix what’s broken, build whatever’s missing.

What came back wasn’t a pile of tag fixes. It was infrastructure.

The starting point

SEO basics were mostly in place, but the quality was uneven. Blog posts had proper OG images and structured data; those got left alone. But short-form pages (field notes and colophon entries) fell back to the global site description because nobody had written per-page meta for them. The activity page was missing schema entirely. The homepage had a SearchAction pointing at a search page that didn't exist. And there was no llms.txt surface anywhere.

None of this was catastrophic. It was the kind of slow erosion that happens when you add content types faster than you update the SEO plumbing around them.

What got built

Five things came out of the session, each one a system, not a one-time fix. The full technical breakdown is in the colophon entry.

Build-time llms.txt and llms-full.txt. Two endpoints that generate machine-readable site inventories from the content collections. The short version lists start-here pages and top writing. The full version lists everything. They stay current automatically because they pull from the same source as the pages themselves. Add a blog post, it shows up in llms-full.txt on the next build. No manual step.
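To make the "pulls from the same source as the pages" idea concrete, here is a minimal sketch of the formatting step such an endpoint might use. The post shape, URL scheme, and section layout are assumptions, not the site's actual schema; in Astro this function would be fed from `getCollection()` inside a `src/pages/llms.txt.ts` endpoint.

```typescript
// Hypothetical post shape — the real content collection fields may differ.
interface PostEntry {
  title: string;
  slug: string;
  pubDate: Date;
}

// Render an llms.txt body from a list of posts, newest first.
function renderLlmsTxt(siteName: string, posts: PostEntry[]): string {
  const sorted = [...posts].sort(
    (a, b) => b.pubDate.valueOf() - a.pubDate.valueOf(),
  );
  return [
    `# ${siteName}`,
    "",
    "## Writing",
    ...sorted.map((p) => `- [${p.title}](/blog/${p.slug}/)`),
    "", // trailing newline keeps the file POSIX-friendly
  ].join("\n");
}
```

Because this runs at build time against the same collection the pages render from, there is no separate file to forget to update.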

A shared metadata resolver. A single module that resolves page titles, meta descriptions, and OG images through a cascading priority: explicit SEO override, then the description field, then body text. Applied across all detail page types. The real fix here is that short-form pages now get descriptions pulled from their actual content instead of falling back to a generic default. One piece of infrastructure instead of per-template copy-paste.
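The cascade can be sketched in a few lines. Field names (`seoDescription`, `description`), the 160-character truncation, and the default string are assumptions for illustration; only the priority order comes from the description above.

```typescript
// Hypothetical page-metadata shape; real frontmatter fields may differ.
interface PageMeta {
  seoDescription?: string; // explicit SEO override
  description?: string;    // regular description field
  body?: string;           // rendered body text, used as a last resort
}

const SITE_DEFAULT = "A personal site."; // placeholder global fallback

// Resolve a meta description: override → description → body excerpt → default.
function resolveDescription(page: PageMeta): string {
  if (page.seoDescription) return page.seoDescription;
  if (page.description) return page.description;
  if (page.body) {
    const plain = page.body.replace(/\s+/g, " ").trim();
    return plain.length > 160 ? `${plain.slice(0, 157)}...` : plain;
  }
  return SITE_DEFAULT;
}
```

Every detail-page template calls the same resolver, so a new content type gets sensible descriptions the moment it exists.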

Reusable JSON-LD schema components. Route-specific structured data: the homepage gets WebSite, /about gets ProfilePage + Person, /activity gets WebPage, listing pages get CollectionPage, blog posts get BlogPosting, projects get CreativeWork. The dead SearchAction on the homepage got removed.
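The route-to-type mapping is the heart of that component. A toy version, using the routes named above (the builder itself is a simplification; real BlogPosting or Person nodes carry many more properties):

```typescript
// Route → schema.org type, mirroring the mapping described in the post.
const SCHEMA_BY_ROUTE: Record<string, string> = {
  "/": "WebSite",
  "/about": "ProfilePage",
  "/activity": "WebPage",
};

// Emit a minimal JSON-LD string for a route; unknown routes get WebPage.
function jsonLd(route: string, name: string): string {
  const type = SCHEMA_BY_ROUTE[route] ?? "WebPage";
  return JSON.stringify({
    "@context": "https://schema.org",
    "@type": type,
    name,
  });
}
```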

An SEO validation script. It crawls the built sitemap and checks every page for required meta tags (description, canonical, OG, social cards), JSON-LD presence, route-specific schema types, SearchAction validity, default-description detection, and llms.txt integrity. It runs against the build output, not source code. If I add a new page template and forget to wire up its metadata, this catches it.
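One check from such a validator might look like this. A real script would crawl every URL in the built sitemap and run many more checks; the tag list here is an assumed subset of the ones named above.

```typescript
// Scan one rendered HTML page for required SEO surfaces.
// Returns the names of anything missing (empty array = page passes).
function findMissingMeta(html: string): string[] {
  const required: [string, RegExp][] = [
    ["meta description", /<meta\s+name="description"/i],
    ["canonical link", /<link\s+rel="canonical"/i],
    ["og:title", /<meta\s+property="og:title"/i],
    ["JSON-LD", /<script\s+type="application\/ld\+json"/i],
  ];
  return required.filter(([, re]) => !re.test(html)).map(([name]) => name);
}
```

Run in CI against the build output, a nonempty result fails the build, which is what makes forgetting to wire up a new template's metadata impossible to ship.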

Content schema expansion. Added seoTitle and seoDescription fields to the colophon and field notes schemas so I can override generated metadata when the automatic version isn’t good enough.
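In Astro, content schemas are defined with zod in a config module, so the addition is a couple of optional fields. A sketch, with the collection name and other fields assumed:

```typescript
// Hypothetical fragment of src/content/config.ts — only seoTitle and
// seoDescription come from the post; the rest is illustrative.
import { defineCollection, z } from "astro:content";

const fieldNotes = defineCollection({
  schema: z.object({
    title: z.string(),
    description: z.string().optional(),
    // New: per-entry overrides for generated metadata.
    seoTitle: z.string().optional(),
    seoDescription: z.string().optional(),
  }),
});

export const collections = { "field-notes": fieldNotes };
```

Optional fields mean existing entries stay valid; the resolver simply prefers the override when it's present.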

The incidental find

While checking the activity page's data surface, the agent noticed my GitHub activity dataset had gone stale since February 10. It dug into why and found a date-window overlap bug in the sync pipeline: the build-time sync and the runtime fetcher covered overlapping date ranges, counting one day twice and inflating my commit count by 61. Fixed, regenerated, verified against GitHub.
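A toy reconstruction of that failure mode, with illustrative dates and counts rather than the site's actual data: if one window covers `[start, boundary]` inclusive and the next covers `[boundary, end]`, the boundary day is summed twice. Making the first window half-open fixes it.

```typescript
// Per-day commit counts keyed by ISO date (lexicographic order == date order).
type DayCounts = Record<string, number>;

// Sum commits in a date window; the end is inclusive or exclusive.
function totalCommits(
  days: DayCounts,
  from: string,
  to: string,
  inclusiveEnd: boolean,
): number {
  return Object.entries(days)
    .filter(([d]) => d >= from && (inclusiveEnd ? d <= to : d < to))
    .reduce((sum, [, n]) => sum + n, 0);
}

const days: DayCounts = { "2026-02-09": 2, "2026-02-10": 5, "2026-02-11": 3 };

// Buggy: both windows include the boundary day 2026-02-10 → it counts twice.
const buggy =
  totalCommits(days, "2026-02-09", "2026-02-10", true) +
  totalCommits(days, "2026-02-10", "2026-02-11", true);

// Fixed: the first window is half-open, so the boundary day counts once.
const fixed =
  totalCommits(days, "2026-02-09", "2026-02-10", false) +
  totalCommits(days, "2026-02-10", "2026-02-11", true);
```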

I didn’t ask it to look at that. It was auditing the activity page’s schema and noticed the numbers didn’t add up.

Systems vs. cleanup

This is the distinction I keep coming back to. A cleanup is a snapshot. You fix every tag, verify every page, and the site is correct today. Tomorrow you add a new content type and it starts drifting again. Nobody remembers to add the JSON-LD component. The meta description falls back to the default. The llms.txt file goes stale because it was a static file someone wrote by hand.

The agent didn’t do a cleanup. It built machinery. The metadata resolver means new pages get proper meta without anyone remembering to configure it. The validator catches regressions at build time. The llms.txt files regenerate from content collections, so they can’t fall out of sync with the site. Each piece runs automatically, which means the standard holds even when I’m not thinking about SEO at all.

Wide work

SEO touches every page template, every content type, every route. Doing it well means checking everything, not just the three pages you looked at recently. That’s exactly the kind of work agents are built for. They don’t get bored, don’t skip the tedious parts, don’t declare victory after fixing the homepage.

This is a personal site. A handful of content types, maybe a hundred pages. The stakes are low and the tolerance for roughness is high. But the pattern scales. A marketing site with fifty landing pages and twelve content types has the same structural problem, just more of it. More templates missing schema. More pages falling back to default descriptions. More surfaces that drift every time someone adds a section and forgets the meta.

The difference on a bigger site is that you’d actually have guardrails, staging checks, content review workflows. The agent-built systems slot right into that. A validation script that runs in CI. A metadata resolver that new templates inherit automatically. An llms.txt that regenerates on every deploy. The infrastructure doesn’t care whether it’s protecting ten pages or ten thousand.

I told the agent what “good” looks like. It applied that standard everywhere, then built the systems to keep it there. That’s the part that matters. Not the audit itself, but the fact that the standard is now self-enforcing.