What is GEO (Generative Engine Optimization)?

GEO (Generative Engine Optimization) is the practice of getting AI engines such as ChatGPT, Gemini, Claude and Perplexity to mention and recommend your brand when answering user questions. Unlike SEO, which targets search-engine rankings, GEO targets your visibility, hit-rate and citation position inside AI-generated answers.

Which AI platforms does Maxfound AI monitor?

Maxfound AI connects to the leading international AI engines: ChatGPT, Gemini, Claude and Perplexity. Every scan issues real queries against the live APIs, so results reflect what each engine actually says about your brand.

What is the PAWC metric?

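PAWC (Position-Adjusted Word Count) is the core visibility metric from the Princeton GEO paper (ACM KDD 2024, Eq. 3): Imp_pwc = Σ |s|·e^(-pos(s)/|S|) / Σ |s|, where |s| is a sentence's word count, pos(s) is its position in the answer, and |S| is the number of sentences in the answer. The numerator sums over the sentences attributed to your brand; the denominator sums over every sentence in the answer. It measures how visible your brand is inside an AI answer, weighting earlier positions more heavily.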
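
As a rough illustration, position-adjusted word count can be sketched in a few lines of Python. This is a minimal sketch, not Maxfound AI's implementation: it assumes the answer is already split into sentences, that each sentence carries the set of sources it is attributed to, that pos(s) is zero-based, and that the exponent is normalized by the number of sentences in the answer. The function name and input shape are hypothetical.

```python
import math

def position_adjusted_word_count(sentences, source):
    """Share of an answer's words attributed to `source`, decayed by position.

    sentences: list of (text, set_of_sources) tuples in answer order (assumed shape).
    """
    n = len(sentences)
    total_words = sum(len(text.split()) for text, _ in sentences)
    if total_words == 0:
        return 0.0
    weighted = 0.0
    for pos, (text, sources) in enumerate(sentences):
        if source in sources:
            # Earlier sentences (small pos) keep more of their word count.
            weighted += len(text.split()) * math.exp(-pos / n)
    return weighted / total_words

# Hypothetical two-sentence answer citing two hypothetical sources.
answer = [
    ("Acme leads this category today", {"acme.example"}),
    ("Globex is a cheaper alternative", {"globex.example"}),
]
print(round(position_adjusted_word_count(answer, "acme.example"), 3))
print(round(position_adjusted_word_count(answer, "globex.example"), 3))
```

Both sentences have the same length here, so the gap between the two scores comes entirely from position.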

What can the AEO rewriting tool do?

The AEO rewriting tool uses real LLMs to restructure your source content into an inverted-pyramid format (headline + direct answer + evidence + context + FAQ) plus JSON-LD, measurably improving how often AI engines cite you.

How do I try Maxfound AI?

You can request a free visibility scan and a product demo. For the full capability list and commercial plans, get in touch via Request a demo.

📚 AEO Glossary

AEO Glossary · 62+ terms

Generative Engine Optimization · AI citation monitoring · authoritative industry definitions

From the fundamentals of AEO / GEO to industry-standard metrics like Citation Diversity Index / TTFC / E-E-A-T, covering core terms from leading voices such as Lily Ray (E-E-A-T), Mike King (entity-first SEO), and Ethan Smith (the 80/20 framework).

A

4 terms

AEO (Answer Engine Optimization)

Answer Engine Optimization

Definition: A content methodology for generative answer engines such as ChatGPT, Gemini, Claude, and Perplexity. The goal is for your brand to be cited and recommended when an AI answers a user's question directly.

Why it matters: Traditional SEO optimizes for rankings (the ten blue links); AEO optimizes for the sentence the AI answers with. When users stop clicking links and only read the AI summary, AEO decides whether your brand enters the decision.

Related: GEO Score · Citation Rate · Long Tail

Anti-Pattern

Definition: Practices proven ineffective or harmful in AEO — for example over-engineered schema stuffing, AI content farms, keyword stuffing, and faking citations.

Why it matters: Ethan Smith (founder of Graphite.io) has written repeatedly that 80% of an AEO budget should go to content quality and distribution, and only 20% to technical work.

Source: Ethan Smith / Graphite.io

Related: 80/20 Budget

Authority Score

Authority Score (0–100)

Definition: A composite score for how much an AI engine trusts a source (a domain, account, or entity) on a given topic. It blends Wikidata anchoring, backlink quality, citation frequency, and recency.

Why it matters: Rankings only tell you what position a keyword holds; the authority score tells you whether AI engines believe whatever this site says. Key accounts should track their domain's authority score over time.

Related: TAI · Anchor Entity

Anchor Entity

Definition: An entity that holds a unique QID in the knowledge graph (Wikidata). Having a QID for your brand, product, or person is the prerequisite for AI engines to recognize it.

Why it matters: The core of Mike King's (iPullRank) entity-first SEO: a brand with no Wikidata entry effectively does not exist to an AI engine.

Source: Mike King / iPullRank

Related: QID · Wikidata · Knowledge Graph

B

3 terms

Bot Affinity Index

Definition: Measures how often AI crawlers (GPTBot, ClaudeBot, PerplexityBot, and others) visit your site, how deep they crawl, and how regularly they come back.

Why it matters: If AI engines don't crawl you, they won't cite you. This index is an early signal of whether your site is entering AI training and RAG data pools.

Related: Crawler · ClawBot

BGE-M3

Multilingual embedding model

Definition: An open-source multilingual embedding model from BAAI, strong on long documents and cross-language retrieval. A common base for RAG and semantic clustering, especially across languages.

Why it matters: For prompt clustering, citation de-duplication, and merging similar queries, a dedicated multilingual model can outperform general English embeddings on non-English text.

Related: Embedding

Brand Coverage

Definition: The share of AI answers across a prompt set that mention the brand name at least once. A mention-level metric — weaker than a citation.

Why it matters: High coverage with a low citation rate means you're named but not recommended — often a neutral or negative mention, so check the context.

Related: Mention Detection · Citation Rate

C

8 terms

Citation

Definition: When an AI engine cites a brand or source as the basis for its answer. A citation is stronger than a mention: a mention names you, a citation uses you as evidence.

Why it matters: In the AI era, the search result is the citation, not the ranking. The core metrics of leading GEO tools and Maxfound AI all revolve around citations.

Citation Diversity Index (CDI)

Definition: A 1–100 score for how evenly a brand is cited across source types (community Q&A, social, Wikipedia, news, blogs, and so on). Normalized using Shannon entropy.

Why it matters: A single-point-of-failure alert. A CDI below 50 means one source dominates (e.g. 80% of citations come from a single account); if that source goes dark, your brand presence collapses.

Benchmark tool: Maxfound AI differentiator

Related: Citation Heatmap · Source Type
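
One way to picture the entropy normalization is the sketch below. It is an illustration under stated assumptions, not the product's formula: the function name is hypothetical, the number of source-type categories is passed in by the caller, and normalized entropy is scaled linearly to 0–100 (the exact scaling and any floor value are not specified in this glossary).

```python
import math
from collections import Counter

def citation_diversity_index(source_types, num_categories):
    """Normalized Shannon entropy of citation source types, scaled to 0-100.

    source_types: one label per citation, e.g. "news", "blog" (assumed shape).
    num_categories: how many source types are tracked in total (assumption).
    """
    counts = Counter(source_types)
    total = sum(counts.values())
    if total == 0 or num_categories < 2:
        return 0.0
    # Shannon entropy: sum of p * log(1/p) over the observed source types.
    entropy = sum((c / total) * math.log(total / c) for c in counts.values())
    # Dividing by log(num_categories) maps a perfectly even spread to 1.0.
    return 100.0 * entropy / math.log(num_categories)

# Hypothetical citation logs over eight tracked source types.
concentrated = ["social"] * 8 + ["news"] * 2
spread = ["social", "news", "blog", "wikipedia", "community", "owned", "youtube", "other"]
print(round(citation_diversity_index(concentrated, 8), 1))
print(round(citation_diversity_index(spread, 8), 1))
```

A log where one source type supplies most citations scores far lower than one spread evenly across all tracked types.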

Citation Heist Map

Definition: A visualization of the prompts where a competitor took the citation that should have been yours, along with how much ground they captured.

Why it matters: Reverse-engineers a competitor's RAG fuel — what they wrote that you didn't, which sources they placed that you missed — and turns it straight into a content topic list.

Benchmark tool: Maxfound AI differentiator

Related: Citation

Citation Heatmap

Definition: Time on the x-axis, AI engines (ChatGPT / Gemini / Claude / Perplexity and more) on the y-axis; cell shading shows citation frequency.

Why it matters: Shows at a glance which engines are forgetting your brand and which are picking it up, so you can decide where to focus next week's content.

Citation Rate (CR)

Definition: The share of AI answers across a prompt set that contain at least one brand citation. The north-star metric for GEO.

Why it matters: Stricter than coverage and more concrete than rankings. It's the core metric of leading GEO tools and the first number on the Maxfound AI dashboard.

Related: Citation · Brand Coverage · GEO Score
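
The difference between brand coverage and citation rate can be shown with a small sketch. It assumes each AI answer is stored as a dict with the answer text and the list of source URLs it cited; the field names, the brand "Acme", and the domain "acme.example" are all hypothetical. Real mention detection would also need to handle name variants and disambiguation.

```python
from urllib.parse import urlparse

def coverage_and_citation_rate(answers, brand, domain):
    """Return (brand coverage, citation rate) over a set of AI answers.

    answers: list of {"text": str, "sources": [url, ...]} dicts (assumed shape).
    """
    if not answers:
        return 0.0, 0.0

    def cites_domain(url):
        host = urlparse(url).hostname or ""
        return host == domain or host.endswith("." + domain)

    # Coverage: the brand name appears in the answer text at least once.
    mentioned = sum(1 for a in answers if brand.lower() in a["text"].lower())
    # Citation rate: at least one cited source belongs to the brand's domain.
    cited = sum(1 for a in answers if any(cites_domain(u) for u in a["sources"]))
    return mentioned / len(answers), cited / len(answers)

answers = [
    {"text": "Acme is a solid option.", "sources": ["https://acme.example/pricing"]},
    {"text": "Many teams pick Acme.", "sources": ["https://reviews.example/crm"]},
    {"text": "Consider Globex.", "sources": ["https://globex.example"]},
    {"text": "It depends on budget.", "sources": []},
]
coverage, citation_rate = coverage_and_citation_rate(answers, "Acme", "acme.example")
print(coverage, citation_rate)
```

In this toy set the brand is named in half the answers but cited in only a quarter, which is the "named but not recommended" gap worth investigating.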

Conversation Topics

Definition: The semantic clusters you get from topic-clustering a set of user prompts. Each cluster represents one kind of thing users actually ask AI engines.

Why it matters: Traditional keyword research gives you search terms; conversation topics give you the natural-language questions users ask AI — long-tail and high-intent.

Related: Prompt Corpus · Long Tail

Crawler

Crawler (AI bot)

Definition: User-agents such as GPTBot, ClaudeBot, PerplexityBot, and Google-Extended that must be explicitly allowed or denied in your site's robots.txt.

Why it matters: Blocking them in robots.txt means giving up on being cited by AI. But allowing everything carries its own risk (content scraped for training), so allow selectively.

Related: Bot Affinity Index · ClawBot
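
To check what a selective policy actually permits, Python's standard-library `urllib.robotparser` can evaluate a robots.txt file offline. The rules and paths below are hypothetical examples, not a recommended policy. Note that Python's parser applies the first matching rule within a group, while some crawlers use longest-match precedence; the sample is written so both interpretations agree.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: GPTBot may read the blog only, ClaudeBot is blocked,
# everything else is allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /blog/
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""

def crawler_access(robots_txt, agents, paths):
    """Map (agent, path) to True/False according to the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {(agent, path): parser.can_fetch(agent, path)
            for agent in agents for path in paths}

access = crawler_access(
    ROBOTS_TXT,
    ["GPTBot", "ClaudeBot", "PerplexityBot"],
    ["/blog/geo-guide", "/pricing"],
)
for (agent, path), allowed in access.items():
    print(f"{agent:14} {path:16} {'allow' if allowed else 'deny'}")
```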

ClawBot

AI-crawler entry monitor

Definition: A Maxfound AI module that monitors how each AI crawler behaves on your site: frequency, paths, depth, dwell time, return visits, and which chunks they fetch.

Why it matters: Traditional GA / Plausible track human traffic; ClawBot tracks what knowledge AI engines take from your site and how much they use. It's the earliest signal before a citation.

Benchmark tool: Built by Maxfound AI

Related: Crawler · Bot Affinity Index

D

3 terms

DeepSeek

DeepSeek (LLM)

Definition: An open-weight large-model family (V3 / R1 and others) developed by the company DeepSeek. Widely used in Chinese-language markets.

Why it matters: Relevant when your audience uses Chinese-market AI assistants; for English and cross-border markets, focus on ChatGPT, Gemini, Claude, and Perplexity.

Related: Doubao

DRAM (Drift / Response Drift Metric)

Drift / Response drift

Definition: How semantically stable an AI engine's answers are for the same prompt across different times and sessions. High drift means unstable answers.

Why it matters: Before reporting a citation lift to a client, rule out drift noise first — the bump may just be model variance, not a real gain.

Related: Fluency Score

Doubao

Doubao (LLM)

Definition: ByteDance's consumer-facing large model and assistant, with an API on its cloud platform. Used primarily in Chinese-market apps.

Why it matters: Relevant for Chinese-market mobile audiences; for English and cross-border markets, focus on the leading global engines instead.

Related: DeepSeek

E

4 terms

E-E-A-T

Experience / Expertise / Authoritativeness / Trust

Definition: Google's four Search Quality pillars: Experience, Expertise, Authoritativeness, and Trustworthiness. AI engines reuse this framework to screen sources.

Why it matters: Lily Ray (Amsive Digital) repeatedly stresses that for YMYL categories (health, finance, legal, automotive) E-E-A-T is a hard gate — fail it and AI engines won't cite you.

Source: Lily Ray / Amsive

Related: YMYL · Authority Score

Earned Distribution

Definition: Distribution where other people speak for you — earned through third-party communities like Reddit, Quora, and industry forums.

Why it matters: AI engines trust third-party voices far more than a brand's own. One highly upvoted community answer outweighs a hundred of your own SEO posts.

Related: Owned Distribution · Source Type

Embedding

Definition: A representation that maps text (or images) into a high-dimensional numeric vector, where semantically similar items sit close together. The basis for RAG, clustering, and de-duplication.

Why it matters: Prompt clustering, citation similarity merging, and content-gap analysis all rely on embeddings. Choose a model that handles your target languages well.

Related: BGE-M3 · RAG
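
"Close together" is usually measured with cosine similarity. The sketch below uses tiny hand-made 3-dimensional vectors purely for illustration; real embeddings come from a model and have hundreds or thousands of dimensions, and the prompts and numbers here are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors standing in for real model output.
prompts = {
    "best crm for startups": [0.9, 0.1, 0.2],
    "which crm should a small startup use": [0.85, 0.15, 0.25],
    "how to bake sourdough": [0.05, 0.95, 0.1],
}
base = prompts["best crm for startups"]
for text, vector in prompts.items():
    print(f"{cosine_similarity(base, vector):.3f}  {text}")
```

The two CRM prompts score close to 1.0 against each other and would be merged into one cluster; the unrelated prompt scores much lower.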

Entity SEO

Definition: An SEO methodology that optimizes for entities (brands, products, people, places) rather than keywords. The point is to make AI engines and Google's knowledge graph recognize your entity.

Why it matters: Per Mike King's RAG-Web framework, AI engines index web pages as attributes of an entity — so first they need to know what the entity is called, what it is, and what attributes it has.

Source: Mike King / iPullRank

Related: Anchor Entity · Wikidata · QID

F

2 terms

Fact Claim

Definition: An objective, independently verifiable statement in your content (a number, date, quote, or source) — as opposed to opinion or marketing language.

Why it matters: The Princeton GEO paper shows that adding more fact claims (statistics, quotations, citations) can lift AI citations by 30–41%.

Source: Princeton GEO Paper (Aggarwal et al. 2024)

Fluency Score

Definition: A score for how fluent and natural an AI engine's answer reads. Low fluency usually means the model didn't really understand the topic or is improvising.

Why it matters: A sudden drop in fluency signals falling model confidence on a topic — an early warning that citations are about to slide.

Related: DRAM

G

2 terms

GEO Score

Definition: A composite scoring system from the Princeton GEO paper (Aggarwal et al., KDD 2024) that weights multiple signals — citations, quotations, statistics, and fluency.

Why it matters: The paper reports that adding cited sources lifted visibility by 115% for sites ranked fifth in search results, and that quotation addition, the strongest single lever, lifted visibility by 41%. It's the most authoritative quantitative benchmark in the field today.

Source: Princeton GEO Paper (Aggarwal et al. 2024 KDD)

Grounding

Definition: The process by which an AI engine ties its answer to external, verifiable sources. Gemini grounding, Perplexity sources, and ChatGPT browsing are all implementations of grounding.

Why it matters: Grounding is the root reason AEO matters more than prompt engineering: if you don't optimize the external sources, rewriting prompts alone won't save you.

Related: RAG · Rehydration

H

3 terms

Hallucination

Definition: When an AI engine confidently fabricates facts, numbers, citations, or regulations that don't exist.

Why it matters: Hallucinations hurt brands: an AI engine may attribute a competitor's feature to you, present outdated info as current policy, or invent a case study about you. Monitor and address them proactively.

Hermes Agent

Autonomous sales agent

Definition: An AI sales and support agent from Maxfound AI that connects GEO monitoring data, AutoMedia distribution, and your CRM to follow up on opportunities automatically.

Why it matters: It closes the loop from "we spotted a problem" to "here's the next action." Others hand you a report; Maxfound AI drives execution.

Benchmark tool: Built by Maxfound AI

Related: ClawBot

Helpcenter Engine

Definition: A method that takes the Cartesian product of feature × integration × use-case to auto-generate hundreds to thousands of high-quality help-center articles.

Why it matters: Ethan Smith found that 80% of a B2B SaaS's long-tail AI traffic comes from the help center, because AI engines favor structured content that answers specific questions.

Source: Ethan Smith / Graphite.io

Related: Long Tail · Earned Distribution
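
The Cartesian-product step can be sketched with `itertools.product`. The feature, integration, and use-case names and the title template below are hypothetical; the sketch only generates topic titles, and each one would still need a properly written article behind it.

```python
from itertools import product

def helpcenter_topics(features, integrations, use_cases):
    """One candidate article title per feature x integration x use-case combination."""
    return [
        f"How to use {feature} with {integration} for {use_case}"
        for feature, integration, use_case in product(features, integrations, use_cases)
    ]

# Hypothetical inputs for a B2B SaaS product.
features = ["scheduled reports", "role-based access"]
integrations = ["Slack", "Salesforce", "Zapier"]
use_cases = ["weekly sales reviews", "customer onboarding"]

topics = helpcenter_topics(features, integrations, use_cases)
print(len(topics))
print(topics[0])
```

Two features, three integrations, and two use cases already yield twelve topics, which is how a modest product surface grows into hundreds of specific, question-shaped pages.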

I

2 terms

Inngest

Async job orchestration

Definition: An event-driven, serverless orchestration platform for background jobs — with steps, retries, and observability.

Why it matters: GEO monitoring involves many long-running jobs (multi-engine probes, crawling, embedding, scoring); serverless timeouts are too short, so Inngest is a common production choice.

Intent Score

Definition: A score for a prompt's intent — commercial, decision-stage, research, or casual. High-intent prompts map directly to deals.

Why it matters: Not every citation is worth the same. A citation won on a high-intent prompt is the kind that actually brings customers — an effective citation.

Related: Conversation Topics

K

2 terms

Kimi

Kimi (LLM)

Definition: A long-context large model from Moonshot AI, used mainly in Chinese-language markets.

Why it matters: Relevant for Chinese-market, long-document use cases (contracts, policies, research). For English and cross-border markets, focus on the leading global engines.

Knowledge Graph

Definition: A structured knowledge network of entity–relation–entity triples. Wikidata and the Google Knowledge Graph are leading examples.

Why it matters: The knowledge graph is the backbone of an AI engine's factual memory. If your brand isn't in it, the engine can't tie you to any topic.

Related: Wikidata · Anchor Entity · Entity SEO

L

3 terms

LLM Wiki

Definition: Structured knowledge pages optimized for AI reading — fact-dense, easy for RAG to chunk, and annotated with schema.

Why it matters: Maxfound AI included an LLM Wiki module in its design; its public release is currently on hold while the other GEO modules ship as planned.

Related: RAG · Fact Claim

LSI (Latent Semantic Indexing)

Latent Semantic Indexing

Definition: A semantic keyword-expansion technique from the classic SEO era. Its relevance has faded under AEO (AI engines model semantics themselves), but it's still a starter tool for keyword planning.

Why it matters: SEO teams moving to AEO often misuse LSI as a KPI; make clear that AI engines no longer rank by LSI.

Long Tail

Definition: Low-volume queries that are 10+ words, conversational, and question-shaped. Their share rises in the AI era.

Why it matters: An estimated 60%+ of AI queries are long-tail, precisely because users ask AI in natural language rather than search-box keywords.

M

2 terms

Mention Detection

Definition: The algorithm layer that finds a brand name (including variants, abbreviations, and misspellings) in an AI answer. A prerequisite step before citation extraction.

Why it matters: Brand names often collide with common words and need disambiguation; otherwise the false-positive rate is high and clients stop trusting the data.

Related: Citation

Momentum Score

Definition: A rate-of-change score (a first-order time difference) over metrics like citation rate, CDI, and authority score that tells you whether they're rising or falling.

Why it matters: In a client report, "+12% citation rate" is an absolute number; "momentum is positive and has accelerated for four straight weeks" is far more persuasive.
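
A first-order difference is simply each reading minus the one before it. The sketch below uses hypothetical weekly citation rates expressed in whole percentage points; the function names and the "accelerating" rule (every recent change is positive and larger than the previous one) are illustrative assumptions, not a fixed definition.

```python
def momentum(series):
    """First-order differences: change from each period to the next."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

def is_accelerating(series, periods):
    """True if the last `periods` changes are all positive and strictly growing."""
    deltas = momentum(series)[-periods:]
    if len(deltas) < periods:
        return False
    rising = all(d > 0 for d in deltas)
    growing = all(b > a for a, b in zip(deltas, deltas[1:]))
    return rising and growing

# Hypothetical weekly citation rate, in percentage points.
weekly_citation_rate = [10, 11, 13, 16, 20]
print(momentum(weekly_citation_rate))
print(is_accelerating(weekly_citation_rate, 4))
```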

O

2 terms

Owned Distribution

Definition: Content channels the brand fully controls — its own website, official social accounts, and help center.

Why it matters: Owned and earned should run about 1:2. All owned reads as self-promotion; all earned cedes content sovereignty.

Related: Earned Distribution · Source Type

OpenRouter

LLM aggregation API

Definition: A proxy layer that puts many providers (OpenAI, Anthropic, Google, and others) behind a single OpenAI-compatible API.

Why it matters: For multi-engine probing, OpenRouter saves you from maintaining many SDKs and unifies billing and rate limiting.

P

4 terms

PIPL

China's data-protection law (PIPL)

Definition: China's Personal Information Protection Law (in force since Nov 2021). It governs the collection of personal data, cross-border transfers, user consent, and the handling of sensitive data.

Why it matters: Compliance is required when monitoring clients with China operations, running cross-border prompt probes, or collecting user logs. (Outside China, the comparable regimes are GDPR in the EU and CCPA in California.)

Prompt Corpus

Definition: A carefully built, representative set of prompts for a brand or industry — covering brand-name, category, comparison, recommendation, and negative scenarios.

Why it matters: Pick the wrong prompts and every downstream number is wrong. Corpus coverage directly determines whether your conclusions are credible. Also called a Prompt Set.

Princeton GEO

Princeton GEO paper

Definition: Aggarwal et al., "GEO: Generative Engine Optimization" (KDD 2024) — the paper that formalized the GEO evaluation framework and quantified the citation uplift of each lever.

Why it matters: The most-cited academic paper in GEO, and the scientific backing to point to when explaining the methodology to clients.

Source: Aggarwal et al. 2024 KDD

Related: GEO Score · Fact Claim

Overseas GEO tools

Leading GEO tools

Definition: The leading GEO monitoring SaaS products, centered on multi-engine citation monitoring, prompt sets, and brand-visibility reporting.

Why it matters: Maxfound AI differentiates on citation-heist analysis, hallucination alerting, and the Citation Diversity Index.

Q

2 terms

Quotation Score

Definition: A density score for direct quotations in your content — expert statements, named sources, user testimonials. The single highest-uplift lever in the Princeton GEO paper.

Why it matters: Want AI engines to cite you? The most effective move isn't fancy writing — it's giving your content quotable lines worth citing.

Source: Princeton GEO Paper

Related: GEO Score · Fact Claim

QID

QID (Wikidata entity ID)

Definition: The unique identifier for each Wikidata entity, starting with "Q" (for example, Google is Q95).

Why it matters: Getting a QID is how you enter an AI engine's formal knowledge system. No QID means you're a nobody in its eyes.

Related: Wikidata · Anchor Entity

R

2 terms

RAG (Retrieval-Augmented Generation)

Retrieval-Augmented Generation (RAG)

Definition: Before answering, an AI engine retrieves from external knowledge bases or the web, adds the results to the prompt as context, then generates. Perplexity, ChatGPT browsing, and Gemini grounding are all RAG.

Why it matters: The physical foundation of AEO. Your content must clear all three steps — retrievable, chunkable, and selected by the model — before it can be cited.

Related: Grounding · Embedding · Rehydration

Rehydration

Definition: The process by which an AI engine stitches multiple retrieved chunks back into a coherent answer.

Why it matters: Not all well-written content reassembles well. Short paragraphs, clear subheadings, and explicit conclusions are easier for rehydration to pick.

Related: RAG · Fact Claim

S

5 terms

Sentinel

Definition: A subsystem that continuously monitors a class of anomaly signals and alerts automatically.

Why it matters: GEO data is noisy; without automated sentinels you'd have to watch it by hand, which breaks the moment you scale.

SoV (Share of Voice)

Share of Voice (SoV)

Definition: Across a prompt set's AI answers, the percentage of all peer-brand mentions and citations that go to your brand.

Why it matters: More competitive than an absolute citation rate: clients care most about how they stack up against rivals.

Related: Citation Rate · Brand Coverage
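
Share of voice reduces to counting and dividing. The sketch below counts plain substring mentions of each brand across a set of answers; the brand names are hypothetical, and a production version would rely on proper mention detection (variants, abbreviations, disambiguation) rather than naive matching.

```python
from collections import Counter

def share_of_voice(answers, brands):
    """Fraction of all tracked-brand mentions that belong to each brand."""
    counts = Counter()
    for text in answers:
        lowered = text.lower()
        for brand in brands:
            # Naive substring count; a stand-in for real mention detection.
            counts[brand] += lowered.count(brand.lower())
    total = sum(counts.values())
    return {brand: (counts[brand] / total if total else 0.0) for brand in brands}

# Hypothetical answers and peer brands.
answers = [
    "Acme and Globex both offer this.",
    "Acme is the usual pick.",
    "Try Initech.",
]
sov = share_of_voice(answers, ["Acme", "Globex", "Initech"])
print(sov)
```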

Source Type

Definition: The category dimension assigned to each citation — typically: community Q&A / social / Wikipedia / news / blog / owned / YouTube / other.

Why it matters: CDI is computed from the distribution of source types. Without source-type labels there's no notion of diversity.

Statistics Score

Definition: A density score for concrete numbers, percentages, and study results in your content. The second-highest-uplift lever in the Princeton GEO paper (+37% on Perplexity).

Why it matters: "We're better" is opinion; "in a 2024 test across 12 cities we averaged a 31% lift" is fact. AI engines prefer the latter.

Source: Princeton GEO Paper

Related: GEO Score · Fact Claim

Sumtotal

Sumtotal (GEO total score)

Definition: A single 0–100 score that Maxfound AI produces by weighting the sub-metrics — citation rate, CDI, authority, momentum, and hallucination.

Why it matters: Executives don't read the detail; they read one number. Sumtotal is the monthly red/green light for the CMO.

Related: GEO Score

T

3 terms

TAI (Topical Authority Index)

Topical Authority Index (TAI)

Definition: A brand's cumulative authority within a specific topic sub-domain. A brand that's top-5 industry-wide can still have a low TAI on a particular sub-topic.

Why it matters: For niche sub-domains, site-wide authority is useless — TAI is the real signal.

Related: Authority Score

TTFC (Time-to-First-Citation)

Time-to-First-Citation (TTFC)

Definition: The number of days from publishing a new piece of content to its first appearance in an AI engine's answer.

Why it matters: In Maxfound AI's own measurements, 65% of high-GEO-score articles earned their first citation within 21 days. Clients care most about how long until results show — TTFC is the answer.

Related: GEO Score
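
TTFC is a date subtraction once you have a publish date and a log of citation sightings. The sketch below assumes citation sightings are recorded as dates; the function name and sample dates are hypothetical.

```python
from datetime import date

def time_to_first_citation(published, citation_dates):
    """Days from publication to the earliest citation on or after it, else None."""
    on_or_after = [d for d in citation_dates if d >= published]
    if not on_or_after:
        return None  # Not cited yet.
    return (min(on_or_after) - published).days

# Hypothetical article and citation sightings.
published = date(2024, 3, 1)
sightings = [date(2024, 3, 19), date(2024, 3, 12), date(2024, 4, 2)]
print(time_to_first_citation(published, sightings))
```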

80/20 Budget

Definition: Ethan Smith's two-ring rule: 80% of budget to content quality and distribution, 20% to technical work (schema, robots, sitemap). The ratio holds in the AEO era; the focus shifts from keyword density to fact density.

Why it matters: The moment leadership hears "AEO" they want more schema and hreflang, and technical work eats the budget. This rule is the money-saving maxim to set clients straight.

Source: Ethan Smith / Graphite.io

Related: Anti-Pattern

W

3 terms

Wikidata

Definition: Wikipedia's structured-data sibling, where every entity has a QID and attribute triples. A high-weight source in the training data of Google, OpenAI, and Anthropic.

Why it matters: Registering a Wikidata entry is your passport into AI engines.

Related: QID · Wikipedia · Anchor Entity

Wikipedia

Definition: The world's largest encyclopedia, and one of the highest-weight sources in AI training data.

Why it matters: Brands with a stable Wikipedia entry are cited by AI engines far more often than peers without one.

Related: Wikidata · Authority Score

Workflow

Definition: Chaining multiple GEO steps (collect prompts → multi-engine probe → citation extraction → CDI calculation → report) into a reusable, schedulable pipeline.

Why it matters: You can review one run by hand; running ten depends on a workflow. Inngest, Temporal, and self-hosted cron are common foundations.

Related: Inngest

Y

1 term

YMYL (Your Money Your Life)

Your Money Your Life (YMYL)

Definition: High-bar categories where content materially affects users: money, health, legal, and safety. Medical aesthetics, automotive, finance, and legal all qualify.

Why it matters: Lily Ray has written repeatedly that in YMYL categories, failing E-E-A-T means AI engines flatly refuse to cite you. That's why medical and automotive clients are so willing to pay.

Source: Lily Ray / Amsive

Related: E-E-A-T · Authority Score

Z

2 terms

Zhihu

Zhihu (Q&A community)

Definition: China's largest Q&A community and a key source in Chinese-language RAG data pools. (For English markets, the equivalent earned-distribution battlegrounds are Reddit and Quora.)

Why it matters: The top earned-distribution battleground for Chinese-language audiences — one highly upvoted answer can drive half a brand's AI citations.

Related: Earned Distribution · Source Type

Zhipu

Zhipu GLM (LLM)

Definition: Zhipu AI's GLM family of large models, including open-API versions such as GLM-4-Flash. Used mainly in Chinese-market and enterprise contexts.

Why it matters: Relevant for Chinese-market enterprise and public-sector audiences. For English and cross-border markets, focus on the leading global engines.

Want to see your brand's citation rate across AI answers?

Maxfound AI probes the leading AI answer engines — citation rate, CDI, hallucination alerts, and priority queues in one place. First report in 10 minutes.

