Methodology · AEO · v1.0
How we measure AI Engine Optimization, and what we deliberately don't measure.
This page is the trust contract for Stork Pro and Pro+. It explains exactly what we test, how we score, what we can verifiably do, and what we explicitly refuse to sell. Last updated 2026-05-26.
1. Engines we measure
We run buyer-intent queries against three AI engines every Monday at 18:00 UTC:
- ChatGPT via OpenAI's gpt-4o-search-preview (search-grounded responses with annotations)
- Claude via Anthropic's claude-sonnet-4-6 with the web_search tool
- Perplexity via the sonar-pro API (explicit citation array per response)
Pro+ adds Gemini (gemini-2.5-flash with grounding) and Google AI Overviews (scraped via SERP vendors). Google breaks scrapers regularly, so AI Overviews results are reported as "best-effort", not guaranteed.
We do not measure Grok at scale yet — its API is unstable and citation behavior changes weekly. We do not separately measure model variants on each engine (e.g. ChatGPT free vs Plus) because the buyer-side cost is high and the signal-to-noise is low at our scale.
2. How we pick queries
Each tool tracked gets 25 queries (Pro) or 100 queries (Pro+) per week. Three sources fill that quota:
- Category defaults: we seed standard buyer-intent prompts for the tool's category — "best AI X for Y", "alternative to popular tool", "what's the most reliable Z" — based on patterns we've observed drive citation behavior in that category.
- Owner-added queries: you can add custom queries via the dashboard, up to your tier's limit. These typically include your direct competitive prompts.
- Brand lookups: a small number of "what is [your tool name]" queries to track parametric-knowledge presence vs retrieval-driven mentions.
We cap at 25 (Pro) and 100 (Pro+) because statistical noise dominates above that for a single tool. More queries do not mean more signal; they mean higher compute cost and a worse signal-to-noise ratio.
3. Why directional scoring, not absolute numbers
Most AEO tools sell precise-looking numbers — "your prompt has 3.2M monthly Perplexity volume" — that are extracted from Chrome-panel data, keyword-filtered, and statistically extrapolated. Independent comparisons in May 2026 found these volumes routinely inflated by 17,000× vs same-keyword Google Search Console data.
We refuse to put a precise-looking number on something we can't measure precisely. Instead, we report directional buckets:
- Very Low — appears in less than 5% of tracked queries on a given engine
- Low — 5–20%
- Medium — 20–40%
- High — 40–65%
- Very High — over 65%
Share of Voice is reported as a percentage (your brand mentions / total brand mentions across self + competitors), because that ratio is meaningful even with our sample size. Absolute mention counts are also shown for transparency but framed as "this week's runs," not as market-level demand.
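The buckets and the Share of Voice ratio can be sketched in a few lines of Python. This is an illustration, not Stork's production scorer: the function names are hypothetical, and because the published ranges share their edges (20% and 40% each appear in two buckets), the sketch assumes a shared edge belongs to the higher bucket. "Over 65%" is read literally, so exactly 65% stays in High.

```python
def visibility_bucket(appearance_rate: float) -> str:
    """Map an appearance rate (0.0 to 1.0) to a directional bucket.

    Assumption: a value sitting on a shared edge (5%, 20%, 40%) goes to
    the higher bucket; exactly 65% is still "High".
    """
    if not 0.0 <= appearance_rate <= 1.0:
        raise ValueError("appearance_rate must be between 0 and 1")
    if appearance_rate < 0.05:
        return "Very Low"
    if appearance_rate < 0.20:
        return "Low"
    if appearance_rate < 0.40:
        return "Medium"
    if appearance_rate <= 0.65:
        return "High"
    return "Very High"


def share_of_voice(own_mentions: int, competitor_mentions: int) -> float:
    """Own mentions as a percentage of all brand mentions (self + competitors)."""
    total = own_mentions + competitor_mentions
    if total == 0:
        return 0.0  # no brand was mentioned at all this week
    return 100.0 * own_mentions / total
```

For example, a tool that appears in 3 of 25 tracked queries on one engine has a rate of 0.12 and lands in Low; 3 own mentions against 9 competitor mentions is a 25% Share of Voice.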
4. How we detect brand mentions
For each response text, we run:
- Exact-string match for your brand and known competitor brands (case-insensitive, word-boundary aware).
- Fuzzy match if exact fails: collapses non-alphanumeric characters so "Stork.AI" / "Stork AI" / "stork-ai" all resolve to the same brand.
- Sentiment heuristic: a naive keyword scan in a ±40-character window around the mention, classifying it as positive, neutral, or negative. We label this an approximation, not real NLP. It is visible in your dashboard but never used to gate product decisions.
Edge case: when a brand name is also a common word (e.g. "Notion" the company vs notion the concept), match counts can over-report. We surface this in the dashboard with a "matchMethod: fuzzy" badge so you can audit individual mentions.
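The three passes above can be approximated as follows. This is a minimal sketch under stated assumptions: the function names, the keyword lists, and the choice to implement the fuzzy pass as a separator-tolerant regex are all hypothetical, not the production matcher.

```python
import re

# Hypothetical keyword lists, for illustration only.
POSITIVE = {"best", "great", "reliable", "recommend"}
NEGATIVE = {"worst", "buggy", "avoid", "unreliable"}

# Boundary guards: a match must not be glued to other letters or digits.
_LEFT = r"(?<![A-Za-z0-9])"
_RIGHT = r"(?![A-Za-z0-9])"


def find_mention(text, brand):
    """Return (match_method, start, end) for the first mention, or None."""
    exact = re.search(_LEFT + re.escape(brand) + _RIGHT, text, re.IGNORECASE)
    if exact:
        return ("exact", exact.start(), exact.end())
    # Fuzzy pass: drop non-alphanumerics from the brand, then tolerate any
    # run of non-alphanumerics between its characters in the text.
    chars = [re.escape(c) for c in brand if c.isalnum()]
    if not chars:
        return None
    pattern = _LEFT + r"[^A-Za-z0-9]*".join(chars) + _RIGHT
    fuzzy = re.search(pattern, text, re.IGNORECASE)
    if fuzzy:
        return ("fuzzy", fuzzy.start(), fuzzy.end())
    return None


def sentiment(text, start, end, window=40):
    """Naive keyword scan in a +/- `window` character span around a mention."""
    snippet = text[max(0, start - window): end + window].lower()
    words = set(re.findall(r"[a-z']+", snippet))
    positive_hits = len(words & POSITIVE)
    negative_hits = len(words & NEGATIVE)
    if positive_hits > negative_hits:
        return "positive"
    if negative_hits > positive_hits:
        return "negative"
    return "neutral"
```

Returning the match method alongside the span is what makes a per-mention audit possible: a reviewer can see which pass produced each hit and discount the looser ones.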
5. How we parse citations per engine
Citation parsing is per-engine because each engine reports differently:
- Perplexity: top-level citations[] array of URLs in citation order. Easiest. Inline [N] markers in the text correspond to array positions.
- Anthropic Claude: citation blocks attached to text blocks, each with url, cited_text, and source_url fields. Walked recursively.
- OpenAI ChatGPT: annotations under message.annotations[] with type url_citation. Falls back to URL regex extraction from response text when annotations are missing (gpt-4o-search-preview is inconsistent).
- Gemini: groundingMetadata.groundingSupports[] + groundingChunks[].
- AI Overviews: scraped from rendered HTML by extracting <a> elements from the answer block. Brittle.
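Two of the steps above are simple enough to sketch without touching any vendor SDK: resolving inline [N] markers against a citations array, and the regex fallback used when structured annotations are missing. The function names are hypothetical, and the sketch assumes [N] markers are 1-based (so [1] points at the first array entry); the regex is a deliberately loose approximation, not a full URL grammar.

```python
import re


def resolve_markers(text, citations):
    """Map inline [N] markers to URLs, in order of first appearance.

    Assumption: N is 1-based. Markers pointing outside the array are skipped.
    """
    resolved = []
    for match in re.finditer(r"\[(\d+)\]", text):
        index = int(match.group(1)) - 1
        if 0 <= index < len(citations):
            url = citations[index]
            if url not in resolved:
                resolved.append(url)
    return resolved


# Loose pattern: stop at whitespace, quotes, angle brackets, ")" and "]".
_URL_RE = re.compile(r"https?://[^\s<>\"')\]]+")


def fallback_urls(text):
    """Extract bare URLs from response text, trimming trailing punctuation."""
    return [url.rstrip(".,;:") for url in _URL_RE.findall(text)]
```

Skipping out-of-range markers rather than raising keeps one malformed response from failing a whole weekly run.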
Each parsed URL is canonicalized (strip protocol redirects, tracking params, fragments) before being stored, so example.com/?utm_source=X and example.com are counted as the same source.
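A canonicalization pass along these lines can be written with Python's standard urllib.parse. This is a sketch under assumptions: the list of tracking parameters is hypothetical and incomplete, forcing every URL to https is one reading of "strip protocol redirects", and dropping the trailing slash is an extra normalization not stated above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical tracking-parameter rules; a real list would be longer.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"gclid", "fbclid"}


def canonicalize(url):
    """Normalize a cited URL so trivially different forms compare equal."""
    if "://" not in url:
        url = "https://" + url  # assume https when no scheme is given
    parts = urlsplit(url)
    kept_params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
        and key.lower() not in TRACKING_KEYS
    ]
    host = parts.netloc.lower()
    path = parts.path.rstrip("/")
    # The empty final element drops the fragment.
    return urlunsplit(("https", host, path, urlencode(kept_params), ""))
```

With this sketch, canonicalize("example.com/?utm_source=X") and canonicalize("example.com") both return "https://example.com", while a non-tracking parameter such as ref=1 is preserved.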
6. Update cadence
Once per week, every Monday 18:00 UTC. Not daily.
Why not daily: AI citation patterns shift on weekly cadences. Reddit's ChatGPT share moved from 60% to 10% over five weeks in late 2025; Gemini's overall citation rate dropped 23 percentage points in two weeks. Day-to-day noise dominates the signal at this sample size. Weekly is the right cadence for both customer signal and our compute budget.
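For readers who want to line up their own reporting with the run, the next Monday 18:00 UTC slot can be computed with the standard library. The function name is hypothetical, and this is only a scheduling sketch, not a description of how the job is actually triggered.

```python
from datetime import datetime, timedelta, timezone


def next_weekly_run(now):
    """Return the next Monday 18:00 UTC strictly after `now` (timezone-aware)."""
    now = now.astimezone(timezone.utc)
    days_until_monday = (0 - now.weekday()) % 7  # Monday is weekday 0
    candidate = (now + timedelta(days=days_until_monday)).replace(
        hour=18, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        candidate += timedelta(days=7)  # this week's slot has already passed
    return candidate
```

Called with noon UTC on 2026-05-26 (a Tuesday), this returns 2026-06-01 18:00 UTC.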
7. What we cannot do
These are explicit non-promises:
- We cannot modify your website. AEO requires technical hygiene (schema markup, llms.txt, semantic HTML) on the tool's own domain. We can audit and advise; you implement.
- We cannot influence LLM training data. Model weights are frozen; nothing we do retroactively affects parametric knowledge.
- We cannot guarantee ChatGPT citations. Wikipedia accounts for 47% of ChatGPT's source citations. The remaining 53% is heavily concentrated in a small set of news outlets (Reuters, AP, FT, BBC). Indie brands rarely land in either, regardless of effort.
- We cannot promise sub-90-day citation lift. Industry data shows 30–60 days for initial movement on tracked queries, 3–6 months for durable lift. We tell you this upfront so you don't churn at week 4.
- We cannot post to Reddit on your behalf. Reddit's ToS and community culture both reject automated posting. We do the research and drafting; you post from your own account.
8. Reddit Signals: draft and edit-gate mechanics
Reddit drove ~24% of Perplexity citations and ~21% of Google AI Overviews citations in early 2026 — the largest single non-Wikipedia citation source for category and product queries. Getting cited on Reddit is the highest-leverage AEO action available to indie tools.
The Reddit Signals workflow:
- Every 6 hours, we scan 12+ AI-cited subreddits matched to your tool's category. PullPush.io is the data source (free, community-run Pushshift successor).
- Each thread is scored on intent (does the OP sound like a buyer?), freshness (linear decay over 7 days), rule fit (does the subreddit allow this kind of mention?), and engagement velocity (upvotes per hour).
- Threads above the composite score threshold are surfaced in your dashboard.
- You click "Generate draft." Claude writes a reply in the Reddit Citation Block format — 300–600 words, structured as problem acknowledgment → numbered framework → concrete outcome with data → optional contextual mention of your tool as one option among 2–3 alternatives. This is the format that gets cited by LLMs.
- You see the draft in a textarea. The "I posted this" button is locked until you've edited at least 20% of the characters. This is enforced both client-side (live progress bar) and server-side (server rejects if the gate isn't met).
- You open the Reddit thread in a new tab, paste your edited reply, post from your account.
- You return to Stork, paste the URL of your posted comment. Stork tracks the comment's upvote/reply trajectory and watches for the thread appearing in future weekly citation queries.
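The thread-scoring step in the workflow above combines four signals. The sketch below shows one plausible shape for it. The linear 7-day freshness decay and upvotes-per-hour velocity come from the description; the weights, the velocity cap, the one-hour age floor, and the assumption that intent and rule fit arrive as 0-to-1 scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    intent: float     # 0..1, how buyer-like the original post sounds
    rule_fit: float   # 0..1, how well subreddit rules allow a mention
    age_hours: float  # hours since the thread was posted
    upvotes: int


def freshness(age_hours, horizon_days=7.0):
    """Linear decay: 1.0 for a brand-new thread, 0.0 at seven days or older."""
    return max(0.0, 1.0 - age_hours / (horizon_days * 24.0))


def velocity(upvotes, age_hours):
    """Upvotes per hour; threads under an hour old are treated as one hour old."""
    return upvotes / max(age_hours, 1.0)


def composite(thread, weights=(0.4, 0.2, 0.2, 0.2), velocity_cap=10.0):
    """Weighted sum of the four signals, each scaled to the 0..1 range."""
    w_intent, w_fresh, w_rules, w_velocity = weights
    scaled_velocity = min(velocity(thread.upvotes, thread.age_hours) / velocity_cap, 1.0)
    return (
        w_intent * thread.intent
        + w_fresh * freshness(thread.age_hours)
        + w_rules * thread.rule_fit
        + w_velocity * scaled_velocity
    )
```

Capping velocity keeps a single viral thread from outweighing intent, which is the signal most directly tied to whether a reply is welcome.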
Why the edit gate: AI-flavored replies get removed by subreddit moderators within an hour. Forcing meaningful edits keeps your account safe and ensures the reply sounds like you, not like a draft from a tool.
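A server-side check for the 20% gate could look like the sketch below. How "edited characters" is counted is not specified above, so this assumes one reasonable definition: characters not covered by difflib's matching blocks, measured against the longer of the two texts. The function names are hypothetical, and the comparison uses integers to avoid floating-point rounding at exactly 20%.

```python
from difflib import SequenceMatcher


def unchanged_chars(draft, edited):
    """Count characters shared between draft and edited text, in order."""
    matcher = SequenceMatcher(None, draft, edited, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())


def edit_gate_open(draft, edited, threshold_pct=20):
    """True when at least `threshold_pct` percent of characters changed."""
    longest = max(len(draft), len(edited))
    if longest == 0:
        return False  # nothing to post
    changed = longest - unchanged_chars(draft, edited)
    return changed * 100 >= threshold_pct * longest
```

Running the same function on the client (for the progress bar) and on the server (for the final check) keeps the two from disagreeing about whether the gate is met.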
9. What we will not sell
This is the honest list of things this product does not include, even if asked. These are anti-patterns we've watched competitors use to inflate revenue at the cost of customer trust:
- "Guaranteed ChatGPT citations" — see §7. Anyone selling this is lying.
- "llms.txt will rank you" — a 300,000-domain study by SE Ranking in late 2025 found zero correlation between llms.txt presence and AI citation frequency. We help you set up an llms.txt as hygiene, but we won't sell it as an outcome.
- 30 sponsored AI-written articles per month — Google's May 2026 spam policy update explicitly targets this content pattern. We previously sold individual sponsored blog posts at $99; we killed that product in May 2026 for exactly this reason.
- Inflated absolute prompt-volume metrics — see §3.
- Automated Reddit posting — see §8.
- Pay-to-feature in our newsletter — editorial bar applies to free and Pro+ tools equally. Pro+ customers get newsletter eligibility, not guaranteed placement.
10. Methodology version history
Material changes to this document trigger a version bump and a row in our public methodology table. Older AEO scores remain tagged with the methodology version they were computed under.
- v1.0 (2026-05-26) — Initial publication. Three engines tracked (OpenAI, Anthropic, Perplexity), weekly cadence, directional scoring, Reddit Signals workflow with 20% edit gate.
If you see a Stork score that references an older methodology version than the one above, that score was computed under the older rules. We don't retroactively re-score historical data; old scores stay annotated with the version that produced them.