Methodology — technical

For engineers, legal, and DPA correspondence · generated from current data

At a glance — current sites

Figures from the most recent trial set. Users = cookies present in at least 3 of 5 hardened-headless (stealth) trials. Monitoring = cookies present in the production DB within the last 30 days. Gap = in Users but not in Monitoring.

| Site | Total cookies | Pre-consent | Post-consent | Users | In monitoring | Gap (users \ monitoring) | Historical only | Pre-consent · non-essential |
|---|---|---|---|---|---|---|---|---|
| ahaonline-cz (ahaonline.cz, production) | 395 | 319 | 76 | 252 | 99 | 209 | 54 | 281 |
| beta-labrador-ahaonline-cz (beta-labrador.ahaonline.cz, beta) | 340 | 204 | 136 | 216 | 25 | 192 | 0 | 172 |

"Pre-consent · non-essential" = the legally significant subset: cookies in the advertising, analytics, or unknown categories that fire before the consent click in at least one trial.
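The rule above can be sketched as a filter over per-trial observations. This is a minimal illustration, not the report generator's actual code: the `Observation` shape, its field names, the `essential` category label, and the function name are all assumptions made for the example.

```typescript
// Hypothetical shape of one cookie observation in one trial.
// The real trial JSON may use different fields.
interface Observation {
  name: string;
  category: string;       // e.g. "advertising", "analytics", "unknown"
  trial: number;          // trial index
  firstSeenMs: number;    // first sighting, ms since navigation start
  consentClickMs: number; // consent click in the same trial, same clock
}

// Categories the definition above treats as non-essential.
const NON_ESSENTIAL = ["advertising", "analytics", "unknown"];

// A cookie qualifies if ANY single trial saw it before the consent click.
function preConsentNonEssential(observations: Observation[]): string[] {
  const names = new Set<string>();
  observations.forEach((o) => {
    const nonEssential = NON_ESSENTIAL.indexOf(o.category) !== -1;
    if (nonEssential && o.firstSeenMs < o.consentClickMs) {
      names.add(o.name);
    }
  });
  return Array.from(names).sort();
}

// Usage with made-up observations.
const sample: Observation[] = [
  { name: "_ga", category: "analytics", trial: 1, firstSeenMs: 120, consentClickMs: 900 },
  { name: "_ga", category: "analytics", trial: 2, firstSeenMs: 1500, consentClickMs: 900 },
  { name: "sid", category: "essential", trial: 1, firstSeenMs: 50, consentClickMs: 900 },
  { name: "ad_id", category: "advertising", trial: 1, firstSeenMs: 1200, consentClickMs: 900 },
  { name: "x1", category: "unknown", trial: 3, firstSeenMs: 10, consentClickMs: 800 },
];
const flagged = preConsentNonEssential(sample);
```

Note the "at least one trial" semantics: `_ga` is flagged even though it appeared after the click in trial 2, because trial 1 saw it before the click.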

What this audit does not cover

Limitations the hardened scanner still has. Read these before drawing conclusions.

  1. IndexedDB writes — we capture indexedDB.databases() (names only), not IDBObjectStore.put / add. Didomi, OneTrust, Permutive write into IndexedDB.
  2. Service Worker storage and Cache Storage — not intercepted.
  3. Fingerprinting probes — Canvas, WebGL, AudioContext, font enumeration, navigator.userAgentData not captured.
  4. Cross-page persistence — every trial loads only the homepage. Article pages, search, login flows not exercised.
  5. Geographic variance — all trials from one egress IP. Country-conditional ad stacks not stressed.
  6. Consent vendor coverage — Didomi and CPEx only. OneTrust, Cookiebot, TrustArc, Sourcepoint, custom CMPs fail consent capture.
  7. Tamper-evidence — trial JSONs and HTML reports are not cryptographically signed. The reports carry an "Evidence: unsigned" chip.
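To make limitation 7 concrete: the weakest form of tamper-evidence is a content digest recorded separately from the artifact. The sketch below is an illustration only and is not part of the current pipeline. The helper name is hypothetical, and a bare digest is not a signature: it only detects changes if the digest itself is stored somewhere the artifact's editor cannot reach.

```typescript
import { createHash } from "crypto";

// Hypothetical helper: hex SHA-256 digest of a trial JSON's exact text.
// Any change to the text, even whitespace, produces a different digest.
function evidenceDigest(trialJson: string): string {
  return createHash("sha256").update(trialJson, "utf8").digest("hex");
}

// Usage with a made-up trial payload.
const original = evidenceDigest('{"site":"ahaonline-cz","cookies":395}');
const modified = evidenceDigest('{"site":"ahaonline-cz","cookies":394}');
const changed = original !== modified; // true: the edit is detectable
```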

Worked examples

Real cookies pulled from the current data, one per pattern. Each example shows how the three measurements differ for one specific cookie.

The four scanners

| Scanner | Where it runs | Disguise | What it captures well |
|---|---|---|---|
| Production | src/cookies-checker.ts · nightly cron → MariaDB | Optional stealth via --stealth | Legacy operational dataset. Carries the 28 F-flaws (dedup races, regex name overwrite, value truncation, no SS/CS/IDB/TC-string). |
| Corrected | scripts/true-scan.ts | None | F-flaws fixed: SessionStorage + CookieStore + JAR + TC-string + name preservation. Bot-suppressed. |
| Hardened headless (workhorse) | scripts/scanner-lib.ts | playwright-extra + stealth plugin + --disable-blink-features=AutomationControlled | F-flaws fixed AND bot disguise intact. Reproducible in CI. |
| Hardened headed + human | Same as hardened headless, in headed mode | Same as hardened headless, plus mouse moves, scroll, 3 s dwell | Closest proxy to a real interactive session. One-run ablation only. |
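The difference between the two hardened rows can be summarised as a small configuration object. This is a sketch under assumptions: the type and function names are invented for illustration and do not come from scripts/scanner-lib.ts. Only the Chromium flag, the stealth plugin, and the 3 s dwell are taken from the table.

```typescript
type Mode = "hardened-headless" | "hardened-headed-human";

// Hypothetical profile describing how one hardened trial is launched.
interface ScannerProfile {
  headless: boolean;
  args: string[];            // extra Chromium command-line flags
  useStealthPlugin: boolean; // register the stealth plugin before launch
  human: { mouseMoves: boolean; scroll: boolean; dwellMs: number } | null;
}

function scannerProfile(mode: Mode): ScannerProfile {
  const headed = mode === "hardened-headed-human";
  return {
    headless: !headed,
    args: ["--disable-blink-features=AutomationControlled"],
    useStealthPlugin: true,
    // Human emulation applies only to the headed ablation run.
    human: headed ? { mouseMoves: true, scroll: true, dwellMs: 3000 } : null,
  };
}

const workhorse = scannerProfile("hardened-headless");
const ablation = scannerProfile("hardened-headed-human");
```

In a real run, `headless` and `args` would be passed to the browser launch call; that step is omitted here because it needs a browser binary.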

A fifth tier — attaching to a real Chrome via CDP — is the gold standard but cannot be automated, so it is out of scope.

The three measurements (columns on every per-site report)

| Column | Inclusion rule | Answers |
|---|---|---|
| Seen by users | Cookie present in ≥3 of 5 hardened-headless trials | What a real visitor reliably encounters on this site. |
| In monitoring | Cookie name present in the production cookies table, last 30 days | What our operational compliance system has on record. |
| Seen in audit (detailed view only) | Cookie present in any of 7 trials (5 hardened + 1 headed-human + 1 baseline) | The full audit surface, including stochastic RTB winners. |
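The two trial-based inclusion rules are both "present in at least k of n trials", with k = 3 for Seen by users and k = 1 for Seen in audit. A minimal sketch, assuming each trial is reduced to the set of cookie names it observed (the `Trial` type and function names are illustrative, not the aggregator's real ones):

```typescript
// Hypothetical representation: one trial = set of observed cookie names.
type Trial = ReadonlySet<string>;

// Names present in at least `quorum` of the given trials, sorted.
function seenInAtLeast(trials: Trial[], quorum: number): string[] {
  const counts = new Map<string, number>();
  trials.forEach((trial) => {
    trial.forEach((name) => counts.set(name, (counts.get(name) ?? 0) + 1));
  });
  const result: string[] = [];
  counts.forEach((count, name) => {
    if (count >= quorum) result.push(name);
  });
  return result.sort();
}

// "Seen by users": at least 3 of the 5 hardened-headless trials.
const seenByUsers = (hardened: Trial[]): string[] => seenInAtLeast(hardened, 3);
// "Seen in audit": any of the 7 trials.
const seenInAudit = (all: Trial[]): string[] => seenInAtLeast(all, 1);

// Usage with made-up cookie names.
const hardenedTrials: Trial[] = [
  new Set(["_ga", "sid", "rtb_a"]),
  new Set(["_ga", "sid"]),
  new Set(["_ga", "rtb_b"]),
  new Set(["sid"]),
  new Set(["_ga"]),
];
const users = seenByUsers(hardenedTrials);
const audit = seenInAudit(hardenedTrials);
```

Here `rtb_a` and `rtb_b` each appear in a single trial, so they land in the audit set but not in the users set. That is the intended treatment of stochastic RTB winners.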

Cross-measurement gaps

| Gap | Meaning |
|---|---|
| Users \ Monitoring | Reliably reaches users; not in nightly DB. The compliance under-reporting gap. |
| Audit \ Monitoring | Caught by the deep scan at least once; not in nightly DB. Broader version of the gap above, including stochastic RTB. |
| Monitoring \ Audit | Only in the DB historically. Three possible causes: (a) regex name normalisation (F-010); (b) ad inventory rotation; (c) trials missed it. |
| Pre-consent · non-essential | Subset of the audited cookies with category ∈ {advertising, analytics, unknown} and first seen before the consent click. Candidates for ePrivacy Directive Art. 5(3). |
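The first three gaps are plain set differences over cookie names. The sketch below assumes the three measurements are already available as name lists; the function and field names are illustrative and not taken from scripts/aggregate-tables.ts.

```typescript
// Names in `a` that are not in `b`, sorted.
function difference(a: string[], b: string[]): string[] {
  const exclude = new Set(b);
  return a.filter((name) => !exclude.has(name)).sort();
}

interface Gaps {
  usersNotMonitored: string[]; // Users \ Monitoring
  auditNotMonitored: string[]; // Audit \ Monitoring
  historicalOnly: string[];    // Monitoring \ Audit
}

function crossMeasurementGaps(
  usersSet: string[],
  auditSet: string[],
  monitoringSet: string[],
): Gaps {
  return {
    usersNotMonitored: difference(usersSet, monitoringSet),
    auditNotMonitored: difference(auditSet, monitoringSet),
    historicalOnly: difference(monitoringSet, auditSet),
  };
}

// Usage with made-up cookie names.
const gaps = crossMeasurementGaps(
  ["_ga", "sid"],                   // seen by users
  ["_ga", "rtb_a", "sid"],          // seen in audit
  ["sid", "old_tracker"],           // in monitoring
);
```

Because the comparison is by exact name, a cookie whose name was rewritten by regex normalisation in the DB would show up on both sides: once as a gap and once as historical only.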

How to reproduce

# 1. Tunnel to MariaDB
ssh -i ~/.ssh/cnc_analyticsstack -fN -L 3316:127.0.0.1:3316 cnc_user@analyticsstack.aws.cnci.tech

# 2. Run 14 trials (2 sites × 7 trials each: 5 hardened + 1 headed-human + 1 baseline, ≈4 min)
npx ts-node --transpile-only scripts/run-trials.ts

# 3. Pull last 30 days from production DB
npx ts-node --transpile-only scripts/pull-db-cookies.ts

# 4. Aggregate per-site Users / Monitoring / Audit
npx ts-node --transpile-only scripts/aggregate-tables.ts

# 5. Generate reports
npx ts-node --transpile-only scripts/generate-canonical-report.ts ahaonline-cz
npx ts-node --transpile-only scripts/generate-canonical-report.ts beta-labrador-ahaonline-cz
npx ts-node --transpile-only scripts/generate-detailed-report.ts ahaonline-cz
npx ts-node --transpile-only scripts/generate-detailed-report.ts beta-labrador-ahaonline-cz
npx ts-node --transpile-only scripts/generate-methodology-technical.ts