01 // The two corpora
DriftMetrics computes a single number — drift score — from the gap between two text corpora that
both claim to represent "what the internet thinks about X." The gap is what the
algorithmic sort suppresses. The drift score is how big that gap is.
The Enclosure
The first ten organic results from a mainstream commercial SERP. By default this is Bing (with Yahoo as fallback when Bing rate-limits). These are the URLs and snippets a typical user sees when they type the keyword into a search engine — heavily ranked, heavily filtered for commercial viability, optimized for clickthrough.
The Baseline
Two sources fused: html.duckduckgo.com/html/ (the legacy DDG endpoint, more
keyword-literal and less aggressively re-ranked) and the Wikipedia OpenSearch + REST summary
API (organic encyclopedic intent, ranked by reader behavior, not advertiser bidding).
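As a rough illustration of where the Baseline comes from, the sketch below builds the three request URLs involved. The function names are hypothetical, and the query parameters shown (`q` for the DDG HTML endpoint, `action=opensearch` for the MediaWiki API, the `page/summary` route of the Wikimedia REST API) are the publicly documented ones, not necessarily the exact calls DriftMetrics makes. Nothing here touches the network.

```python
from urllib.parse import quote, urlencode


def ddg_html_url(keyword):
    """Legacy DuckDuckGo HTML endpoint; `q` carries the search keyword."""
    return "https://html.duckduckgo.com/html/?" + urlencode({"q": keyword})


def wikipedia_opensearch_url(keyword, limit=10, lang="en"):
    """MediaWiki OpenSearch: returns matching article titles for a keyword."""
    params = {
        "action": "opensearch",
        "search": keyword,
        "limit": limit,
        "format": "json",
    }
    return f"https://{lang}.wikipedia.org/w/api.php?" + urlencode(params)


def wikipedia_summary_url(title, lang="en"):
    """Wikimedia REST summary for one article title (spaces become underscores)."""
    slug = quote(title.replace(" ", "_"), safe="")
    return f"https://{lang}.wikipedia.org/api/rest_v1/page/summary/{slug}"


if __name__ == "__main__":
    print(ddg_html_url("electric cars"))
    print(wikipedia_opensearch_url("electric cars", limit=5))
    print(wikipedia_summary_url("Electric car"))
```

A plausible flow, under these assumptions, is OpenSearch first to resolve titles, then one summary request per title, with the DDG snippets appended to form the fused Baseline text.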
02 // The math
Both corpora get tokenized with a tight stopword filter. Each corpus becomes a TF-IDF vector over the union vocabulary. From those two vectors:
- SEM — 1 − cosine(E, B). Semantic distance.
- LEX — 1 − jaccard(terms_E, terms_B). Vocabulary distance.
- OVL — jaccard(hosts_E, hosts_B). URL host overlap.
- SUB — mass of top "subsidy" terms (high TF-IDF in E, near-zero in B).
- SUP — mass of top "suppressed" terms (high TF-IDF in B, near-zero in E).
- COM — commercial-marker share gap between corpora.
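The metrics above can be sketched in a few dozen lines of plain Python. This is a minimal illustration, not the production pipeline: the stopword list is a tiny stand-in, the IDF is the smoothed form ln((1 + N) / (1 + df)) + 1 with N = 2 (chosen here so that terms shared by both corpora keep non-zero weight), and `one_sided_mass` is one possible reading of "mass of top terms" for SUB and SUP. COM is left out because the commercial-marker list is not specified here.

```python
import math
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical stand-in; the real stopword filter is not published in this section.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on", "with"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, drop stopwords and 1-char tokens."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower())
            if len(t) > 1 and t not in STOPWORDS]


def tfidf_pair(tokens_e, tokens_b):
    """TF-IDF vectors (as dicts) for two corpora over their union vocabulary."""
    tf_e, tf_b = Counter(tokens_e), Counter(tokens_b)
    vec_e, vec_b = {}, {}
    for term in sorted(set(tf_e) | set(tf_b)):
        df = (term in tf_e) + (term in tf_b)       # 1 or 2 documents
        idf = math.log(3 / (1 + df)) + 1           # smoothed IDF, N = 2 (assumption)
        vec_e[term] = tf_e[term] / max(len(tokens_e), 1) * idf
        vec_b[term] = tf_b[term] / max(len(tokens_b), 1) * idf
    return vec_e, vec_b


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def jaccard(a, b):
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 0.0


def host(url):
    """Hostname with a leading 'www.' removed, so www.x.com and x.com match."""
    h = urlparse(url).hostname or ""
    return h[4:] if h.startswith("www.") else h


def one_sided_mass(vec_a, vec_b, k=10, eps=1e-9):
    """Share of vec_a's weight held by its top-k terms that are near-zero in vec_b."""
    only_a = sorted((w for t, w in vec_a.items()
                     if w > 0 and vec_b.get(t, 0.0) <= eps), reverse=True)[:k]
    total = sum(vec_a.values())
    return sum(only_a) / total if total else 0.0


def metrics(text_e, text_b, urls_e, urls_b):
    tok_e, tok_b = tokenize(text_e), tokenize(text_b)
    vec_e, vec_b = tfidf_pair(tok_e, tok_b)
    return {
        "SEM": 1 - cosine(vec_e, vec_b),
        "LEX": 1 - jaccard(tok_e, tok_b),
        "OVL": jaccard(map(host, urls_e), map(host, urls_b)),
        "SUB": one_sided_mass(vec_e, vec_b),
        "SUP": one_sided_mass(vec_b, vec_e),
    }
```

Two identical corpora give SEM and LEX of 0. Two corpora with no shared terms give SEM and LEX of 1, and with ten or fewer distinct terms per side SUB and SUP are also 1, which matches the intuition that everything in each corpus is absent from the other.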
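The composite drift score is a weighted blend:
0.55·SEM + 0.20·LEX + 0.15·(1−OVL) + 0.10·COM. The weights sum to 1, and OVL enters as (1−OVL) because host overlap measures agreement rather than distance. SUB and SUP do not enter the blend. The blend is weighted toward semantic delta
because that's the signal that maps to "the algorithm is showing you a different reality."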
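For a concrete feel of the blend, here is a small sketch with one worked input. The function name is hypothetical, and it assumes every component is already scaled to the range 0 to 1 (the text implies this for SEM, LEX, and OVL; for COM it is an assumption).

```python
def drift_score(sem, lex, ovl, com):
    """Weighted blend from the formula above; inputs assumed to lie in [0, 1]."""
    return 0.55 * sem + 0.20 * lex + 0.15 * (1 - ovl) + 0.10 * com


if __name__ == "__main__":
    # 0.55*0.6 + 0.20*0.8 + 0.15*(1 - 0.2) + 0.10*0.3
    #   = 0.33 + 0.16 + 0.12 + 0.03 = 0.64 (up to float rounding)
    print(round(drift_score(sem=0.6, lex=0.8, ovl=0.2, com=0.3), 2))
```

Because SEM carries more than half the weight, a keyword whose two corpora share hosts and vocabulary but weight the terms very differently can still land well above the midpoint.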
03 // What this isn't
Drift score is not bias detection. It does not say which side is "right." A high drift score just means the two corpora disagree heavily on which words matter. Both could be lying, neither could be lying; the gap is the data.
Drift score is not a search engine quality measure. Bing and DDG aren't being graded against each other. They're being used as proxies for two different sampling mechanisms — commercial ranking (Enclosure) and lexical / organic-encyclopedic surfacing (Baseline). The score describes the distance between those mechanisms applied to one keyword, not which mechanism is better.
Drift score is not stable across time. The same keyword scanned a week apart will produce different scores because the internet underneath both corpora moved. The number is a snapshot, not a constant.
04 // Honest limitations
- Source variability. Bing/Yahoo/DDG return different content based on IP geolocation. A scrape from a US-based Acurast processor will differ from a scrape from an EU one.
- Captcha walls. Bing aggressively rate-limits scrapers. When it walls, the engine falls back to Yahoo; when both wall, the report still posts but with confidence ≈ 0 as the signal that the data underneath is thin.
- English-only tokenizer. The stopword list and tokenizer are tuned for English. Non-English keywords will produce misleading TF-IDF vectors until the tokenizer is internationalized.
- Wikipedia bias. Wikipedia surfaces by reader-traffic and editor consensus. That's an organic signal, but it's not "raw user intent" — there's curation in there, just a different shape of curation than Bing's commercial ranking.
- Sample size. Top 10 SERP results per corpus is fast but shallow. Deep crawl would produce more stable numbers; the tradeoff is fleet time and IP burn.
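The captcha-wall behavior in the list above (Bing first, Yahoo as fallback, a report with confidence near zero when both wall) can be sketched as a simple fallback chain. Everything named here is hypothetical: the `Walled` exception, the `EnclosureResult` shape, and especially the confidence formula, which simply scales with how many of the expected ten results came back. The text only states that confidence is about 0 when both engines wall.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


class Walled(Exception):
    """Raised by a fetcher when the engine answers with a captcha or rate limit."""


@dataclass
class EnclosureResult:
    source: Optional[str]                      # engine that answered, or None
    results: List[dict] = field(default_factory=list)
    confidence: float = 0.0                    # 0.0 signals thin data


def fetch_enclosure(
    keyword: str,
    fetchers: List[Tuple[str, Callable[[str], List[dict]]]],
    expected: int = 10,
) -> EnclosureResult:
    """Try each engine in order; return the first non-empty answer."""
    for name, fetch in fetchers:
        try:
            results = fetch(keyword)
        except Walled:
            continue                           # this engine walled, try the next
        if results:
            kept = results[:expected]
            return EnclosureResult(name, kept, len(kept) / expected)
    # Every engine walled or returned nothing: still produce a report.
    return EnclosureResult(None, [], 0.0)
```

Passing the fetchers in as a list keeps the order (Bing, then Yahoo) explicit and lets the chain be exercised with stubs instead of live scrapes.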
05 // Why this exists
Most "search transparency" tools are built by ad-tech companies measuring how to optimize ad spend. DriftMetrics is the inverse: built by a small studio (Asleepius Games) to give everyone — not just paying customers — a way to look at the gap between algorithmic surfaces and lexical surfaces, freely, with the same view everyone else gets.
The architecture enforces that. There is no admin tier, no enterprise SSO, no internal dashboard with "real" numbers. The numbers in the index are the numbers we have. If you used a paid code to scan a keyword, anyone hitting the cache in the next hour sees what you saw. That's the trade.