The AI Citation Dictionary: 100 Terms Every Marketer Should Know
100 essential terms across 10 categories -- the canonical reference for AI visibility, GEO, citation behavior, and AI-era marketing measurement. Each definition is concise, structured for AI extraction, and grounded in verified research.
The vocabulary of AI visibility moves faster than any single glossary can keep up with. Terms that didn't exist two years ago -- GEO, query fan-out, llms.txt, brand persistence, parametric memory -- are now the difference between a marketer who understands what's happening and one who doesn't. This dictionary defines 100 essential terms across 10 categories. Every definition is concise enough to extract, structured for AI to cite, and grounded in verified research where applicable.
Use this as a quick reference (jump to any category below) or read straight through for a comprehensive understanding of the field. Each definition links back to verified sources where underlying research informs the terminology. For context on how these concepts fit together, start with Ranqo's complete GEO guide.
10 Categories, 100 Terms
Jump to any category, or read straight through for the complete glossary.
Why a Glossary, and Why Now
AI platforms cite definitional content disproportionately often. When a user asks ChatGPT "What is GEO?" or Perplexity "What does query fan-out mean?", the AI looks for clear, structured term-definition pairs it can quote. Wikipedia is the most-cited single domain for ChatGPT for exactly this reason: it's the canonical reference for definitions across nearly every topic.
Most companies don't realize this opportunity. They write blogs explaining concepts in narrative form, embedding definitions in long paragraphs that AI struggles to extract. A clean term-definition format -- the format you're reading right now -- is among the highest-citation-rate content structures available, especially when paired with schema markup like FAQ schema (which produces a 3.2x AI Overview boost per Frase's research).
The terminology in this dictionary is also where most AI-era marketing conversations break down. Senior leaders hear "mention rate" and "share of voice in AI" without a shared definition. Engineers hear "llms.txt" and ask whether it's a real standard. Marketing teams confuse GEO, AEO, and AISO. A common vocabulary unblocks all of those conversations.
1. Foundational AI Concepts
The base layer of vocabulary -- terms that describe how LLMs actually work under the hood. You can't reason about AI visibility without understanding the difference between parametric memory and retrieval, or what RAG actually does.
- Large Language Model (LLM)
- An AI model trained on massive text datasets to understand and generate human-like language. LLMs power ChatGPT, Claude, Gemini, Perplexity, and Grok.
- Generative AI
- AI systems that produce new content (text, images, code, audio) rather than only classifying or predicting from existing data.
- Transformer
- The neural network architecture that powers all modern LLMs. Introduced in the 2017 paper "Attention Is All You Need."
- Embedding
- A numerical representation of text, images, or other data that captures semantic meaning. AI uses embeddings to find conceptually similar content.
- Vector Database
- A specialized database that stores embeddings and supports similarity search. The infrastructure layer behind retrieval-augmented generation.
- Retrieval-Augmented Generation (RAG)
- An architecture where AI retrieves external sources before generating a response, then cites those sources. The basis for AI citation.
- Fine-Tuning
- Adapting a pre-trained LLM to a specific domain or task by continuing training on a focused dataset. Distinct from prompting.
- Training Data
- The corpus of text and other content an LLM learns from during training. Cutoff dates determine what the model "knows" parametrically.
- Parametric Memory
- Information an LLM learned during training and stores in its weights. Distinct from real-time retrieval -- it has a knowledge cutoff.
- Hallucination
- When an LLM produces a confident but incorrect or fabricated statement. A leading reason AI platforms increasingly rely on real-time retrieval and citations.
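To make "embedding" and "vector database" concrete, here's a minimal Python sketch of the similarity math those systems run. The three-dimensional vectors are hypothetical stand-ins invented for illustration; real embeddings come from a model API and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: "GEO" and "AI visibility" point in similar
# directions; "coffee" points elsewhere in the space.
geo = [0.9, 0.1, 0.2]
ai_visibility = [0.8, 0.2, 0.3]
coffee = [0.1, 0.9, 0.1]

# A vector database answers "which stored vectors are closest to this one?"
# using exactly this kind of comparison, at scale.
print(cosine_similarity(geo, ai_visibility) > cosine_similarity(geo, coffee))  # True
```

This nearest-neighbor lookup over embeddings is the retrieval step in RAG: the user's question is embedded, the closest stored passages are fetched, and the LLM generates its answer (and citations) from them.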
2. Search Paradigms
The vocabulary of how search itself is changing. SEO is no longer the only game -- and the GEO/AEO distinction matters when you're scoping work and explaining strategy. The term "GEO" itself was formalized in the Princeton/Georgia Tech KDD 2024 paper.
- SEO (Search Engine Optimization)
- The practice of optimizing content to rank in traditional search engines like Google and Bing.
- GEO (Generative Engine Optimization)
- The practice of optimizing content so AI platforms cite it in their responses. Term formalized in Princeton/Georgia Tech KDD 2024 research.
- AEO (Answer Engine Optimization)
- Broader umbrella covering AI answers, featured snippets, and voice assistants. Often used interchangeably with GEO.
- AISO (AI Search Optimization)
- Synonym for GEO/AEO, emphasizing optimization specifically for AI-powered search experiences.
- AI Search
- Search experiences where AI synthesizes a direct answer from multiple sources, typically with citations, instead of returning a list of links.
- Generative Search
- Search powered by generative AI that composes answers in natural language. Examples: Google AI Overviews, ChatGPT Search, Perplexity.
- Conversational Search
- Multi-turn search dialogue where the AI maintains context across follow-up questions. Replaces single-query search behavior.
- Zero-Click Search
- When a user gets their answer directly from the search results page (or AI overview) without clicking through to any website.
- AI Overview
- Google's AI-generated summary appearing above traditional search results. Now appearing on roughly a quarter of all Google searches.
- Featured Snippet
- A direct answer extracted from a webpage and displayed at the top of Google search results. The traditional precursor to AI Overviews.
3. AI Platforms
The AI assistants and search experiences your buyers use. Market share data here is current to First Page Sage's April 2026 report. For a deeper breakdown of how each platform selects sources, see Ranqo's platform-specific playbook.
- ChatGPT
- OpenAI's AI chatbot. As of April 2026, holds 60.2% AI chatbot market share (First Page Sage). Its search results overlap 87% with Bing's.
- Claude
- Anthropic's AI assistant, known for nuanced, balanced analysis with high disclaimer rates. ~5% market share.
- Perplexity
- AI-powered answer engine with strong inline citations and real-time web crawl. Maintains an independent index from Google/Bing.
- Gemini
- Google's AI assistant. Inherits Google's infrastructure and Knowledge Graph. ~15% AI chatbot market share (April 2026).
- Grok
- xAI's chatbot integrated with X/Twitter for real-time social signal access. Differentiated by current discourse data.
- Microsoft Copilot
- Microsoft's AI assistant powered by OpenAI models. Built into Bing, Office 365, and Windows.
- ChatGPT Search (SearchGPT)
- OpenAI's search-optimized ChatGPT mode. Built on Bing's index, with 87% citation match to Bing top results.
- Google AI Mode
- Google's full conversational AI search experience, distinct from AI Overviews. Replaces traditional results with conversational responses.
- DeepSeek
- Open-source Chinese LLM that gained significant adoption in 2025. Increasingly used in cost-sensitive RAG applications.
- Generative Engine
- Umbrella term for any AI system that synthesizes answers from multiple sources -- ChatGPT, Perplexity, Gemini, Claude, Copilot all qualify.
4. AI Crawlers & Bots
The user agents that actually visit your site to feed AI systems. Most platforms run multiple bots with distinct purposes (training vs. real-time vs. search index). For everything they fetch (and don't execute), see Ranqo's walkthrough of what AI sees when it crawls.
- GPTBot
- OpenAI's crawler for training data collection. Does not execute JavaScript. Respects robots.txt.
- ChatGPT-User
- OpenAI bot triggered when a ChatGPT user actively requests web content. Makes 3.6x more requests than Googlebot per Search Engine Journal data.
- OAI-SearchBot
- OpenAI's crawler for ChatGPT Search index. Distinct from GPTBot (training) and ChatGPT-User (real-time fetches).
- ClaudeBot
- Anthropic's training data crawler. Part of a three-bot system alongside Claude-User and Claude-SearchBot.
- Claude-User
- Anthropic bot used for user-triggered web fetches inside Claude. Requires explicit user request to activate.
- Claude-SearchBot
- Anthropic's search infrastructure crawler that determines what Claude can cite in its answers.
- PerplexityBot
- Perplexity's primary indexing crawler. Cloudflare documented inconsistent robots.txt compliance in August 2025.
- Google-Extended
- Google's robots.txt token governing whether crawled content trains Gemini. The fetching is done by Googlebot, so it inherits full JavaScript rendering -- unique among the AI crawlers listed here.
- CCBot
- Common Crawl's bot. Its web archive serves as training data for many LLMs, including earlier GPT models.
- Bingbot
- Microsoft's search crawler. Critical for ChatGPT Search visibility because ChatGPT relies on Bing's index, not Google's.
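You can test which of these bots your robots.txt actually admits without deploying anything, using only Python's standard library. The robots.txt content and URLs below are invented examples -- swap in your own file to audit a real site.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt: GPTBot is blocked from one directory,
# PerplexityBot is blocked entirely, everyone else is allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /private/

User-agent: PerplexityBot
Disallow: /

User-agent: *
Disallow:
"""

AI_CRAWLERS = ["GPTBot", "ChatGPT-User", "OAI-SearchBot", "ClaudeBot", "PerplexityBot"]

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

for bot in AI_CRAWLERS:
    allowed = parser.can_fetch(bot, "https://example.com/blog/geo-guide")
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

Running a check like this during an audit catches the most expensive AI visibility mistake: a blanket Disallow left over from a staging config that silently removes the site from entire platforms.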
5. Citation Mechanics
How AI platforms select, attribute, and reference sources. This is the layer most marketers don't understand -- query fan-out alone explains why optimizing for one keyword rarely transfers to AI visibility. For the disconnect with Google rankings, see Ranqo's research on the Google-AI gap.
- Citation
- When an AI platform references a specific URL or source as the origin of information in its response.
- Source Attribution
- The practice of naming the source of information within an AI response. Distinct from a hyperlinked citation.
- Inline Citation
- Source references embedded directly in the AI's response text. Perplexity does this for 95% of claims; ChatGPT often does not.
- Citation Half-Life
- The time period over which a citation's value decays. AI visibility typically declines 30-60 days before measurable performance drops.
- Brand Persistence
- Whether a brand cited in one AI response continues to appear in subsequent responses. Only ~30% of brands persist between consecutive responses.
- Mention Rate
- The percentage of relevant AI queries in your category that include your brand. The foundational AI visibility KPI.
- Position
- Where your brand is mentioned within an AI response (first, second, third). An average position of 1.2 is excellent (DerivateX B2B SaaS data).
- Share of Voice (SoV)
- Your brand's share of total mentions in AI responses for category-level queries. The AI equivalent of search market share.
- Query Fan-Out
- AI technique of breaking one user question into multiple sub-queries to retrieve diverse sources. Often produces 8+ sub-queries per ChatGPT prompt.
- Grounding
- The process of backing AI responses with verifiable external sources rather than relying solely on parametric memory.
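Mention rate and share of voice are simple enough to compute by hand once you have a sample of AI responses for your category queries. A minimal sketch, using invented brand names and response texts ("BrandRadar" is a made-up competitor):

```python
responses = [
    "For GEO tracking, Ranqo and BrandRadar are common picks.",
    "Popular options include BrandRadar.",
    "Ranqo is one tool teams use for AI visibility.",
    "There are several vendors in this space.",
]

def mention_rate(brand, responses):
    """Share of responses that mention the brand at least once."""
    return sum(brand.lower() in r.lower() for r in responses) / len(responses)

def share_of_voice(brand, brands, responses):
    """Brand's share of total brand mentions across all responses."""
    counts = {b: sum(r.lower().count(b.lower()) for r in responses) for b in brands}
    total = sum(counts.values())
    return counts[brand] / total if total else 0.0

brands = ["Ranqo", "BrandRadar"]
print(mention_rate("Ranqo", responses))            # 0.5 (2 of 4 responses)
print(share_of_voice("Ranqo", brands, responses))  # 0.5 (2 of 4 total mentions)
```

The hard part in practice isn't this arithmetic -- it's sampling: because only ~30% of brands persist between consecutive responses, a single query run tells you little, and each metric needs repeated runs over time.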
6. Content Optimization
Tactical content terminology, with citation impact data attached where research exists. For the full 7-step implementation framework, see Ranqo's optimization playbook.
- Answer-First Formatting
- Placing the direct answer in the first 40-60 words of each section. Onely measured a 140% ChatGPT citation increase from this technique alone.
- Listicle
- A numbered or bulleted list-format article. Listicles account for 21.9% of all AI citations -- the largest share of any content format.
- Comparison Content
- "X vs Y" pages comparing products or options head-to-head. Achieves 45-60% citation rates -- the highest of any single format.
- Pillar Content
- A comprehensive, definitive resource that covers a topic thoroughly and links to subtopics. Designed to be the canonical reference.
- Hub Page
- A central page that organizes and links to related content. Helps AI understand topical relationships across your site.
- Content Depth
- The substantive thoroughness of content. Articles of 1,500+ words receive 4.7x more AI citations (Hashmeta).
- Content Freshness
- How recently content has been updated. Pages updated within 30 days receive 3.2x more AI citations (rank.bot).
- Definition Statement
- An explicit "[X] is [definition]" sentence pattern. AI extracts these directly when answering "what is" queries.
- Third-Party Mention
- When external sites (review platforms, press, analyst reports) reference your brand. Brands are 6.5x more likely to be cited via third-party sources.
- Original Research
- Proprietary surveys, analyses, or experiments your brand publishes. Adding original statistics increases AI visibility by 41% (Princeton GEO).
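A definition statement is easy to extract mechanically -- which is exactly why AI favors it. Here's a toy Python sketch of the "[X] is [definition]" pattern at work; the regex is deliberately minimal for illustration, not production-grade.

```python
import re

text = (
    "GEO is the practice of optimizing content so AI platforms cite it. "
    "Many teams start with audits. "
    "Query fan-out is the technique of splitting one question into sub-queries."
)

# Capture "<Term> is <definition>." where Term opens a sentence.
pattern = re.compile(r"(?:^|(?<=\. ))([A-Z][\w -]*?) is ([^.]+)\.")

for term, definition in pattern.findall(text):
    print(f"{term}: {definition}")
```

The first and third sentences match because they lead with an explicit term-definition pair; the middle sentence, written in narrative form, yields nothing -- a small-scale version of why definitions buried in long paragraphs are hard for AI to extract.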
7. Technical Standards
File formats, schemas, and protocols that AI crawlers respect (or are expected to). For the complete llms.txt analysis, including verified adoption data, see Ranqo's llms.txt complete guide.
- robots.txt
- A plain-text file telling crawlers which URLs to avoid. Critical for AI: blocking GPTBot or ClaudeBot here removes you from those platforms.
- sitemap.xml
- An XML file listing all the URLs on your site, helping search and AI crawlers discover content efficiently.
- Schema Markup
- Structured data added to your HTML to help AI and search engines understand content meaning. Implemented as JSON-LD.
- JSON-LD
- JavaScript Object Notation for Linked Data. The recommended format for adding schema markup to web pages.
- Structured Data
- Standardized format for providing information about a page and classifying its content. Only 12.4% of websites use it.
- FAQ Schema
- Structured data type for question-and-answer content. Pages with FAQ schema are 3.2x more likely to appear in AI Overviews (Frase).
- HowTo Schema
- Schema type for step-by-step instructional content. Helps AI extract sequential steps cleanly.
- Article Schema
- Schema type for article content, with fields for author, datePublished, dateModified, and headline.
- llms.txt
- A proposed standard (authored by Jeremy Howard, September 2024): a plain-text file at /llms.txt that gives LLMs a curated map of your site. ~10.13% adoption.
- llms-full.txt
- Companion to llms.txt that includes concatenated full page content, allowing LLMs to load core content without separate requests.
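Here's what FAQ schema looks like in practice: a short Python sketch that emits FAQPage JSON-LD for a glossary page. The questions and answers are placeholders drawn from this dictionary.

```python
import json

faqs = [
    ("What is GEO?",
     "GEO (Generative Engine Optimization) is the practice of optimizing "
     "content so AI platforms cite it in their responses."),
    ("What is query fan-out?",
     "Query fan-out is an AI technique that splits one user question into "
     "multiple sub-queries to retrieve diverse sources."),
]

# Schema.org FAQPage structure: a list of Question entities,
# each carrying an acceptedAnswer.
schema = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": q,
            "acceptedAnswer": {"@type": "Answer", "text": a},
        }
        for q, a in faqs
    ],
}

# Embed the output inside <script type="application/ld+json">...</script>.
print(json.dumps(schema, indent=2))
```

Generating the JSON-LD from the same data that renders the visible Q&A keeps the markup and the page content in sync -- a mismatch between the two is a common reason structured data gets ignored.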
8. Crawling & Rendering
The technical concepts that determine what AI can actually read on your site. The single most important fact in this category: AI crawlers do not execute JavaScript -- a finding Vercel verified across millions of crawler requests.
- Server-Side Rendering (SSR)
- Generating HTML on the server before sending it to the browser. Critical for AI visibility -- AI crawlers don't execute JavaScript.
- Client-Side Rendering (CSR)
- Generating page content via JavaScript after the browser receives an empty HTML shell. Invisible to AI crawlers without SSR.
- Static Site Generation (SSG)
- Pre-building HTML pages at build time. Functionally equivalent to SSR for AI visibility -- content is in raw HTML.
- Hydration
- The process where JavaScript activates a server-rendered page in the browser. Adds interactivity without affecting AI visibility.
- JavaScript Execution
- The process of running JS code in a browser. AI crawlers do NOT do this -- 500M+ GPTBot fetches showed zero JS execution (Passionfruit).
- Crawl Budget
- The number of pages a crawler will visit on your site within a given time period. AI crawler budgets are growing rapidly.
- Indexation
- Whether a page has been added to a search engine or AI platform's index of known content.
- Core Web Vitals (CWV)
- Google's page performance metrics (LCP, INP, CLS -- INP replaced FID in 2024). Acts as a gate for AI: severe failures hurt visibility, but going from good to great has minimal impact.
- First Contentful Paint (FCP)
- How quickly the first content appears on screen. Pages with FCP under 0.4s receive 3x more AI citations (ZipTie).
- Mobile-First Indexing
- Search and AI systems primarily evaluate the mobile version of your site, not desktop. Content parity across viewports matters.
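You can approximate what an AI crawler "sees" by taking the raw HTML and keeping only the text outside scripts and styles -- since AI crawlers fetch HTML but never execute JavaScript. A standard-library sketch, with two invented page snippets (a CSR shell vs. an SSR page):

```python
from html.parser import HTMLParser

class VisibleTextParser(HTMLParser):
    """Collect text content, skipping <script> and <style> bodies."""
    def __init__(self):
        super().__init__()
        self.skip = 0
        self.chunks = []
    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip += 1
    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip:
            self.skip -= 1
    def handle_data(self, data):
        if not self.skip and data.strip():
            self.chunks.append(data.strip())

def crawler_visible_text(html):
    p = VisibleTextParser()
    p.feed(html)
    return " ".join(p.chunks)

# A client-side-rendered shell: all content arrives later via app.js.
csr_shell = '<html><body><div id="root"></div><script src="/app.js"></script></body></html>'
# A server-rendered page: content is present in the raw HTML.
ssr_page = "<html><body><h1>What is GEO?</h1><p>GEO is the practice of...</p></body></html>"

print(repr(crawler_visible_text(csr_shell)))  # '' -- nothing for an AI crawler to cite
print(crawler_visible_text(ssr_page))
```

The empty result for the CSR shell is the whole SSR argument in one line: if the text isn't in the HTML the server sends, it doesn't exist for GPTBot, ClaudeBot, or PerplexityBot.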
10. Measurement & Analytics
The KPIs and metrics for tracking AI visibility over time. Traditional SEO metrics don't cover this surface -- mention rate, share of voice in AI, sentiment, and citation source tracking are the new dashboard. For the audit framework that maps to all of these, see Ranqo's AI readiness audit guide.
- AI Visibility
- The umbrella metric for how present your brand is across AI platforms. Composed of mention rate, position, sentiment, and SoV.
- AI Referral Traffic
- Traffic to your site originating from AI platforms (ChatGPT, Perplexity, etc.). Grew 156% YoY through 2025.
- Click-Through Rate (CTR)
- The percentage of impressions that result in a click. Organic CTR for AI Overview queries dropped 61% in 15 months (Seer Interactive).
- Conversion Rate
- The percentage of visitors who complete a desired action. AI-referred traffic converts 4.4x better than organic search (Semrush).
- Sentiment Analysis
- Analyzing the tone (positive, neutral, negative) of how AI platforms describe your brand. Negative sentiment is worse than no mention.
- Citation Source Tracking
- Identifying which third-party sites AI platforms cite when discussing your category. Reveals optimization targets.
- AI Mention Tracking
- The practice of monitoring brand mentions across AI platforms over time. Required because AI responses are volatile.
- Brand Sentiment
- The aggregate tone of brand mentions across AI responses. Tracked at the platform level since each platform has different patterns.
- Attribution
- Determining which marketing channel drove a conversion. Increasingly difficult as AI traffic appears as "direct" or generic referrers.
- Citation Decay
- The reduction in AI citations over time without ongoing optimization. Visibility decays 30-60 days before performance metrics show it.
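A first pass at AI referral tracking is just referrer classification. The sketch below buckets hits by hostname; the hostname list is illustrative and incomplete, real referrer strings vary by platform and over time, and much AI traffic arrives with no referrer at all -- the attribution problem noted above.

```python
from urllib.parse import urlparse

# Illustrative, incomplete mapping of referrer hosts to AI platforms.
AI_REFERRER_HOSTS = {
    "chat.openai.com": "ChatGPT",
    "chatgpt.com": "ChatGPT",
    "perplexity.ai": "Perplexity",
    "www.perplexity.ai": "Perplexity",
    "gemini.google.com": "Gemini",
    "claude.ai": "Claude",
    "copilot.microsoft.com": "Copilot",
}

def classify_referrer(referrer):
    """Return the AI platform for a referrer URL, or None if not AI."""
    host = urlparse(referrer).netloc.lower()
    return AI_REFERRER_HOSTS.get(host)

hits = [
    "https://chatgpt.com/",
    "https://www.perplexity.ai/search?q=geo",
    "https://www.google.com/",
    "",  # no referrer: lands in "direct", where much AI traffic hides
]
counts = {}
for h in hits:
    platform = classify_referrer(h)
    if platform:
        counts[platform] = counts.get(platform, 0) + 1
print(counts)  # {'ChatGPT': 1, 'Perplexity': 1}
```

Even this rough bucketing makes the 156% YoY growth in AI referral traffic visible in your own analytics, and it's the denominator you need before computing AI-referred conversion rates.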
How to Use This Dictionary
For marketers: bookmark this page. When a vendor, agency, or consultant uses unfamiliar AI vocabulary, look it up here first. The categories are organized roughly by abstraction level -- foundational concepts at the top, measurement at the bottom.
For content teams: this dictionary itself is an example of citation-optimized content. Each term and definition is in a <dl> structure with <dt> and <dd> elements -- the HTML pattern AI extracts most reliably. Apply the same structure to your own glossary, FAQ, and definitional pages.
For executives: if your team can't define mention rate, share of voice in AI, and brand persistence, you have a measurement gap. Those three metrics are the foundation of any credible AI visibility KPI dashboard.
For agencies: share this with clients during onboarding. Pre-empting the terminology questions saves weeks of explanation and accelerates strategy alignment.
A common vocabulary is the cheapest, fastest way to align an organization on AI visibility strategy. Definitions before tactics; tactics before tools.
Track every metric in this dictionary
Mention rate, share of voice, position, sentiment, citation sources -- across ChatGPT, Claude, Perplexity, Gemini, and Grok. For deeper context, also see the 15 anti-GEO mistakes to avoid and the AI readiness audit framework.
Written by
Nisha Kumari
Nisha Kumari is Co-Founder at Ranqo, where she leads growth strategy and client acquisition. With a background in digital marketing and financial management, she specializes in SEO, Generative Engine Optimization, and helping brands build visibility across AI platforms.