Technical AEO

7 Schema Markup Mistakes That Are Silently Killing Your AI Citations

Jan 16, 2026
8 min read

Invalid, incomplete, or misapplied schema can be worse than no schema at all. Here are the 7 most common schema mistakes and the exact fix for each.

Schema markup mistakes are frustrating because they are invisible: they cause no page errors, your site keeps working normally, and traditional SEO tools often miss them. But AI citation systems evaluate schema quality, and these mistakes silently suppress your citation probability.

Why Bad Schema Can Hurt More Than No Schema

Some schema mistakes create active negative signals:

  • Conflicting schema tells AI models your content is inconsistent or untrustworthy
  • Schema that does not match page content signals attempted manipulation
  • Outdated dates in schema signal stale, unmaintained content

In these cases, simply removing the bad schema would improve your citation probability as an interim step, until you can add correct schema in its place.

Mistake 1: Using WebPage Instead of Article for Blog Content

What it looks like:

{
  "@type": "WebPage",
  "name": "Your Blog Post Title"
}

Why it hurts: A bare WebPage block like this one carries no author, publisher, or date signals. AI systems cannot assess content authority without these fields. Article, BlogPosting, and NewsArticle are the types built for editorial content, and with the fields below they provide the full metadata graph that AI systems need.

The fix:

{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Your Blog Post Title",
  "author": { "@type": "Person", "name": "Author Name" },
  "datePublished": "2025-01-15",
  "dateModified": "2025-06-01",
  "publisher": { "@type": "Organization", "name": "Brand Name" }
}
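
To spot this mistake across many pages, a small script can flag blocks that use the wrong type or lack the fields shown in the fix. The following is a minimal sketch: the function name and the choice to treat all five fields as expected are assumptions made for illustration, not a rule from any validator.

```python
import json

# Types the article recommends for blog content.
ARTICLE_TYPES = {"Article", "BlogPosting", "NewsArticle"}
# Fields shown in the fix above; treating them as expected is this sketch's assumption.
EXPECTED_FIELDS = ("headline", "author", "datePublished", "dateModified", "publisher")


def audit_article_block(raw_json: str) -> list:
    """Return a list of human-readable problems for one JSON-LD block."""
    data = json.loads(raw_json)
    problems = []
    schema_type = data.get("@type")
    # In JSON-LD, @type may be a single string or a list of strings.
    types = set(schema_type) if isinstance(schema_type, list) else {schema_type}
    if not types & ARTICLE_TYPES:
        problems.append(f"@type is {schema_type!r}, expected one of {sorted(ARTICLE_TYPES)}")
    for field in EXPECTED_FIELDS:
        if field not in data:
            problems.append(f"missing field: {field}")
    return problems
```

Calling `audit_article_block` on the WebPage example returns one type problem plus one entry per missing field; an empty list means the block has the expected shape.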

Mistake 2: Missing dateModified (or Leaving It Stale)

What it looks like: Article schema with datePublished but no dateModified, or dateModified that is the same as datePublished for a 3-year-old post.

Why it hurts: AI models use dateModified as the freshness signal. A post published in 2022 with no dateModified is treated as potentially stale. Perplexity in particular weights freshness heavily.

The fix: Add dateModified to all Article schema, and update it whenever you meaningfully update the content. "Meaningfully" means adding new information or updating data — not fixing a typo.
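
A freshness check along these lines can be scripted. This is a rough sketch: the `freshness_issue` name and the 365-day threshold are assumptions chosen for illustration, since no AI system publishes an exact cutoff.

```python
from datetime import date
from typing import Optional


def freshness_issue(article: dict, today: date, max_age_days: int = 365) -> Optional[str]:
    """Return a description of the freshness problem, or None if there is none.

    max_age_days is an illustrative threshold, not a documented cutoff.
    """
    published = article.get("datePublished")
    modified = article.get("dateModified")
    if not modified:
        return "dateModified is missing"
    # Keep only the YYYY-MM-DD part so full ISO 8601 timestamps also parse.
    age_days = (today - date.fromisoformat(modified[:10])).days
    if age_days > max_age_days:
        if modified == published:
            return f"dateModified equals datePublished and is {age_days} days old"
        return f"dateModified is {age_days} days old"
    return None
```

Passing `today` explicitly keeps the check reproducible; in a real audit you would pass `date.today()`.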

Mistake 3: FAQPage Answers That Are Too Long

What it looks like: FAQPage schema where Answer.text fields are 400-600 words each.

Why it hurts: AI systems extract FAQPage answers as citation snippets. Long answers get truncated unpredictably. The extraction algorithm also scores shorter, more direct answers higher for relevance.

The fix: Keep FAQPage Answer.text under 150 words — ideally 50-100 words. If you need to cover a topic in depth, the detailed content belongs in the page body, not the schema answer field. The schema answer should be the direct, concise answer; the body expands on it.
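
The 150-word limit is easy to check automatically. The sketch below assumes the standard FAQPage layout (`mainEntity` holding Question items, each with an `acceptedAnswer`); the function name is hypothetical, and the count is a plain whitespace split, so any HTML inside an answer inflates it slightly.

```python
MAX_WORDS = 150  # upper bound recommended in the fix above


def long_faq_answers(faq_page: dict, max_words: int = MAX_WORDS) -> list:
    """Return (question, word_count) pairs for answers longer than max_words."""
    questions = faq_page.get("mainEntity", [])
    # mainEntity may be a single object rather than a list.
    if isinstance(questions, dict):
        questions = [questions]
    flagged = []
    for question in questions:
        answer = question.get("acceptedAnswer") or {}
        word_count = len(answer.get("text", "").split())
        if word_count > max_words:
            flagged.append((question.get("name", ""), word_count))
    return flagged
```

Any pair it returns is a candidate for trimming: move the detail into the page body and leave the concise answer in the schema.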

Mistake 4: Schema in the Body Instead of the Head

What it looks like: JSON-LD script tags inserted via a CMS plugin that places them in the page body, after the main content.

Why it hurts: Some AI crawlers parse pages sequentially and may not reach body-injected schema before making their extraction decisions. JSON-LD is valid in either <head> or <body>, but placing it in <head> ensures it is read before the page content.

How to check: View page source (Ctrl+U) and search for application/ld+json. If it appears after </main> or near the bottom of the page, it is body-injected.

The fix: Configure your CMS or theme to inject JSON-LD schema into the <head> section. In WordPress, this typically means updating your schema plugin settings or using a hook that targets wp_head.
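
The manual view-source check can also be automated. Here is a minimal sketch using Python's standard `html.parser`; the class and function names are made up for illustration, and it assumes the page has explicit <head> tags (a page that omits them would have every block reported as "body").

```python
from html.parser import HTMLParser


class JsonLdLocator(HTMLParser):
    """Record whether each JSON-LD script tag sits inside <head> or outside it."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.locations = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "script":
            script_type = (dict(attrs).get("type") or "").lower()
            if script_type == "application/ld+json":
                self.locations.append("head" if self.in_head else "body")

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def jsonld_locations(html: str) -> list:
    """Return 'head' or 'body' for every JSON-LD block, in document order."""
    locator = JsonLdLocator()
    locator.feed(html)
    return locator.locations
```

A result containing "body" points to a block that should be moved into the head.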

Mistake 5: Author Schema Without sameAs Links

What it looks like:

{
  "@type": "Person",
  "name": "Jane Smith"
}

Why it hurts: "Jane Smith" is not a verifiable entity without external references. AI models cannot cross-reference this author against professional databases, LinkedIn, or the Knowledge Graph. The author schema provides no authority signal without at least one sameAs link.

The fix:

{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Jane Smith",
  "jobTitle": "Head of Content",
  "sameAs": [
    "https://www.linkedin.com/in/janesmith-real",
    "https://twitter.com/janesmith"
  ]
}
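
Checking for this is a one-function job. The sketch below treats any author with at least one http(s) URL in sameAs as passing; the function name is hypothetical, and the check only confirms that a link is present, not that the profile exists.

```python
def has_verifiable_sameas(person: dict) -> bool:
    """Return True if the Person object lists at least one http(s) sameAs URL."""
    same_as = person.get("sameAs") or []
    # sameAs may be a single URL string instead of a list.
    if isinstance(same_as, str):
        same_as = [same_as]
    return any(
        isinstance(url, str) and url.startswith(("http://", "https://"))
        for url in same_as
    )
```

Run it against the author object of every Article block and list the pages where it returns False.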

Mistake 6: Mismatched Schema Content and Page Content

What it looks like: FAQPage schema with Q&A pairs that are not present in the visible page body — added purely for schema manipulation.

Why it hurts: AI models cross-reference schema against page content. When the schema lists a question and answer that does not appear on the page, the schema is flagged as potentially manipulative. This reduces overall schema trust weight for the page.

The fix: Every FAQ in your FAQPage schema should be answered (possibly in shorter form) in the visible page body. Write the FAQ section in your content, then adapt it for the schema — not the other way around.
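
One way to catch schema-only questions is to compare each schema question against the page's visible text. The following is a sketch under simplifying assumptions: it ignores script and style content, matches on lowercased text with punctuation removed, and does not account for content hidden with CSS. All names are hypothetical.

```python
import re
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, skipping the contents of script and style tags."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", "", text.lower()).split())


def questions_missing_from_page(html: str, faq_page: dict) -> list:
    """Return schema questions whose text does not appear in the visible page."""
    parser = VisibleText()
    parser.feed(html)
    visible = normalize(" ".join(parser.chunks))
    missing = []
    for question in faq_page.get("mainEntity", []):
        name = question.get("name", "")
        if normalize(name) not in visible:
            missing.append(name)
    return missing
```

Exact matching is deliberately strict. A question that was reworded in the schema will be reported too, which is usually worth a manual look.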

Mistake 7: Multiple Conflicting Schema Blocks

What it looks like: A page with both a manually added Article schema and a CMS auto-generated WebPage schema, with different name, description, or author values in each.

Why it hurts: Conflicting property values across schema blocks create parsing ambiguity. Which author is correct? Which description should be used? AI systems resolve this ambiguity by downgrading the trust weight of both schema blocks.

How to check: Use the Schema Markup Validator (validator.schema.org) or Google's Rich Results Test to see the schema blocks detected on a page. Look for duplicate type definitions with conflicting values.

The fix: Audit your page source for all application/ld+json blocks. Remove or consolidate duplicates. If your CMS auto-generates schema, disable it for the content types where you have manual schema.
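
Once you have extracted every JSON-LD block from a page, the conflict check itself is simple. This sketch compares the three properties named above across already-parsed blocks; the function name and the list of compared keys are assumptions for illustration.

```python
import json
from collections import defaultdict

# Properties the article names as common sources of conflict.
COMPARED_KEYS = ("name", "description", "author")


def find_conflicts(blocks: list) -> dict:
    """Map each compared property to its distinct values when blocks disagree."""
    seen = defaultdict(set)
    for block in blocks:
        for key in COMPARED_KEYS:
            if key in block:
                # Serialize so nested values (such as an author object) can be compared.
                seen[key].add(json.dumps(block[key], sort_keys=True))
    return {key: sorted(values) for key, values in seen.items() if len(values) > 1}
```

An empty result means the blocks agree on every property they share. Any key it returns shows which values need to be consolidated.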

The Full Schema Audit Process

  1. Run RankAsAnswer's schema audit on your top 20 pages
  2. Export the schema completeness report — it flags all 7 mistake types above
  3. For each issue, use the auto-generated fix to replace the broken schema
  4. Validate with Google Rich Results Test after implementing
  5. Re-run the AEO audit in 4-6 weeks to confirm improvement
