Why You Can't Measure AI Visibility Just Once
Run the same question through ChatGPT twice and you can get two different answers. So is AI visibility just noise you cannot act on? Mostly, no. In our study of 102 brands across five engines, 77.5% of brand, prompt, and engine combinations were deterministic: the brand was either always cited or never cited, run after run. The catch is that you cannot tell which bucket a prompt is in from a single measurement — and the one genuinely volatile signal, sentiment, flips 6.7x more often than whether you are mentioned at all.
77.5% of brand × prompt × engine cells are deterministic: always cited or never cited, run after run.
This is from our 102-brand study, published openly on arXiv. This post is the deep dive on the finding that changes how you should track: most of your visibility is stable and improvable, but you only learn that by measuring more than once.
Sentiment is the noisy part
Plenty of tools sell an "AI sentiment" number. Our data says be careful with a single reading of it.
Figure: Sentiment is 6.7x noisier than mention. How often each signal flips between consecutive measurements (% of brand-engine pairs).
Between consecutive measurements, whether a brand was mentioned at all flipped about 6.8% of the time. How the brand was framed — positive, neutral, negative — flipped 45.5% of the time. That makes sentiment roughly 6.7x noisier than mention. A single "your AI sentiment dropped this week" alert is close to a coin flip; the real signal only shows up as a trend across many runs.
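To make "flip rate" concrete, here is a minimal sketch of how one could compute it for a single brand-engine pair. The function name `flip_rate` and the sample readings are hypothetical, and the study's exact aggregation across pairs is not described in this post, so treat this as an illustration of the idea rather than the study's method.

```python
from typing import Hashable, Sequence


def flip_rate(readings: Sequence[Hashable]) -> float:
    """Fraction of consecutive reading pairs whose value changed."""
    if len(readings) < 2:
        raise ValueError("need at least two readings to observe a flip")
    flips = sum(a != b for a, b in zip(readings, readings[1:]))
    return flips / (len(readings) - 1)


# Hypothetical readings for one brand-engine pair over six runs.
mentioned = [True, True, True, True, True, True]
sentiment = ["positive", "neutral", "neutral", "positive", "positive", "negative"]

mention_flips = flip_rate(mentioned)    # no changes between runs
sentiment_flips = flip_rate(sentiment)  # 3 changes across 5 consecutive pairs

# The headline ratio follows from the two study figures quoted above.
noise_ratio = 45.5 / 6.8  # roughly 6.7
```

In this made-up series the brand is mentioned every time while its framing changes in three of five consecutive pairs, which is the pattern the study reports in aggregate.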
But visibility itself is mostly stable
Given all that volatility, you might conclude AI visibility is chaos. It is mostly not.
Figure: Most visibility is not a dice roll. Share of brand × prompt × engine cells that are deterministic vs stochastic.
Across brand, prompt, and engine combinations, 77.5% were deterministic — always cited or never cited. Only the remaining ~22.5% sat in the stochastic middle. That is good news: most of your visibility is a stable state you can actually move, not a dice roll. The catch is you cannot tell, from one run, whether a given prompt is in the stable majority or the volatile tail. Independent work reaches the same conclusion — the "Don't Measure Once" paper argues AI visibility should be treated as a distribution, not a single observation.
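The deterministic-versus-stochastic split can be sketched in a few lines. This is an illustrative example, not the study's pipeline: the record layout `(brand, prompt, engine, cited)`, the label strings, and the sample data are all assumptions. A cell with only one run is labeled "unknown", which is the point of the section: one observation cannot place a cell in either bucket.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Cell = Tuple[str, str, str]  # (brand, prompt, engine)


def classify_cells(runs: Iterable[Tuple[str, str, str, bool]]) -> Dict[Cell, str]:
    """Label each cell from its repeated runs."""
    outcomes: Dict[Cell, List[bool]] = defaultdict(list)
    for brand, prompt, engine, cited in runs:
        outcomes[(brand, prompt, engine)].append(cited)

    labels: Dict[Cell, str] = {}
    for cell, cited_flags in outcomes.items():
        if len(cited_flags) < 2:
            labels[cell] = "unknown"  # a single run cannot tell the buckets apart
        elif all(cited_flags):
            labels[cell] = "always cited"
        elif not any(cited_flags):
            labels[cell] = "never cited"
        else:
            labels[cell] = "stochastic"
    return labels


def deterministic_share(labels: Dict[Cell, str]) -> float:
    """Share of classifiable cells that are always cited or never cited."""
    known = [label for label in labels.values() if label != "unknown"]
    if not known:
        raise ValueError("no cell has been measured more than once")
    deterministic = sum(label != "stochastic" for label in known)
    return deterministic / len(known)


# Hypothetical data: three runs for each of four cells.
runs = (
    [("acme", "best crm", "engine_a", True)] * 3
    + [("acme", "best crm", "engine_b", False)] * 3
    + [("acme", "crm pricing", "engine_a", True)] * 3
    + [("acme", "crm pricing", "engine_b", c) for c in (True, False, True)]
)
labels = classify_cells(runs)
share = deterministic_share(labels)  # 3 of 4 cells are deterministic
```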
So measure the same prompts, repeatedly
Different signals deserve different trust. Here is how to read each one.
| Signal | Run-to-run stability | How to read it |
|---|---|---|
| Whether you are mentioned | Stable (~6.8% flip) | Act on it directly |
| Position / rank | Fairly stable | Act on it |
| Sentiment | Volatile (~45.5% flip) | A slow trend across many runs, never one reading |
The discipline is simple: track the same set of prompts on a steady cadence so the stable majority separates from the 22.5% noise. That is the difference between reacting to one screenshot and watching a real trend — and it is why measuring share of voice only works when you do it repeatedly. Continuous search-visibility tracking exists for exactly this.
What this means for how you track
1. Never trust one run
A single check tells you almost nothing about whether a prompt is stable or volatile. One screenshot is an anecdote, not a measurement.
2. Act on mention and position; trend sentiment
Mention and rank are stable enough to act on. Sentiment is a slow-moving trend — if a tool shows it swinging wildly week to week, that is usually the measurement, not your brand.
3. Measure the same prompts on a cadence
Repetition is what turns a noisy snapshot into a signal you can improve against, and what tells you when a change actually moved something.
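One way to read sentiment as a trend rather than a single reading is a rolling average over repeated runs of the same prompts. The sketch below is an assumption-laden illustration: the numeric mapping in `SCORES`, the window size, and the sample labels are all hypothetical choices, not values from the study.

```python
from typing import List, Sequence

# Assumed numeric mapping for sentiment labels; any monotone scale would do.
SCORES = {"positive": 1, "neutral": 0, "negative": -1}


def rolling_sentiment(labels: Sequence[str], window: int = 4) -> List[float]:
    """Mean sentiment score over each run of `window` consecutive readings."""
    if window < 1:
        raise ValueError("window must be at least 1")
    scores = [SCORES[label] for label in labels]
    return [
        sum(scores[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(scores))
    ]


# Hypothetical weekly readings for one brand on one engine.
weekly = ["positive", "neutral", "positive", "negative", "positive", "neutral"]
trend = rolling_sentiment(weekly, window=4)
```

In this made-up series the individual readings swing from positive to negative and back, yet every four-week average is the same. A wider window smooths more but reacts more slowly to a real change, so the window length is a judgment call.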
AI visibility is not random, but a single measurement of it is. Measure the same prompts again and again, and the real picture holds still.
Track your AI visibility on a cadence
Ranqo runs your prompts across ChatGPT, Claude, Perplexity, Gemini, and Grok on a schedule — so you see the trend, not one noisy run. Read the full study, or check your AI visibility free to start.
Written by
Nisha Kumari
Nisha Kumari is Co-Founder at Ranqo, where she leads growth strategy and client acquisition. With a background in digital marketing and financial management, she specializes in SEO, Generative Engine Optimization, and helping brands build visibility across AI platforms.