A user couldn't sign up on an Android device. Two coding agents investigated the same anonymous session. This page puts their full runs next to each other — every step, who drove it, and how deep each one pushed. The headline difference: Claude ran the whole thing autonomously; Codex had to be steered by a human three times (including being told to use the replay skill at all). Personal data, identifiers, session/replay IDs, IPs, and secrets are REDACTED.
| Dimension | Claude | Codex |
|---|---|---|
| Model | Opus 4.8 · 1M context | GPT-5.5 · x-high reasoning |
| Human-in-the-loop | 0 nudges: zero-shot, zero-touch autonomous | 3 human interventions, incl. being told to use the replay skill |
| Reached the replay by | Its own decision — recognised the session was anonymous and triangulated it | A human pointing it at posthog-replay-analysis |
| SportHead (suspected) | Ruled out: all calls 200 | Ruled out: all calls 200 |
| Signup blocker | Duplicate account (success:false on a 200) | Duplicate account (success:false on a 200) |
| Root cause | Gmail dot-variant typo: one mechanism explaining every symptom | "Duplicate account", not traced to a single cause |
| Reset email | Proven never sent: 3 layers + positive-control user | "Request accepted (200), completion not observed" |
| Account nuance | Flagged expired provisional_minor, 0 parent links (dead-end) | (none) |

Signup was never broken. The session was anonymous, so
there was nothing to search on — Claude found it by replay
triangulation, then proved a 16-year-old already had a confirmed account.
She failed login 10× (HTTP 400), was
correctly blocked from re-registering, and her password reset
never sent. Root cause: a Gmail address with the dot in
the wrong place — same inbox to Gmail, a different (non-existent)
user to Supabase. Done end-to-end without a human nudge.
Codex's run: SportHead was the suspected area; replay showed its calls all
returned 200. The real blocker was create_pending_profile
returning success:false ("account already exists"). A
password reset was triggered and accepted by Supabase
(200), but completion was not observed. Codex reached these
findings only after three human course-corrections.
[…] git checkout no-ops) to read the PostHog / Sentry / Supabase / Brevo keys.
Failed to fetch on Mac/Chrome. Key realisation: an abandoned signup never fires identify, so the failing session has no email/username: it is anonymous, and identity search is a dead end.
$os = Android on /register & /login within the window, matched on registration step custom events. One anonymous distinct ID fit on time, platform, and flow.
session_recordings metadata (~8 min, 227 clicks, 2 console errors), listed snapshot blobs, pulled blob_v2 ranges and gzip-decoded the rrweb events into a page/console/network timeline. Platform: https://localhost = Android Capacitor native WebView.
POST /auth/v1/token ×5 → all 400. She tried to log in before registering, and couldn't.
generate_sporthead_handle 200, many check_handle_availability 200, prefix/suffix lookups 200. The named-suspect area was healthy.
[…] 200 but took 11.9 s; then create_pending_profile → 200 with body {"success":false, "...already exists"}. Console logged the thrown Registration error. A handled refusal, not a crash (which is why Sentry was empty).
account_status = provisional_minor, 16 years old, provisional window expired ~2 months ago, parent_links = 0. A re-entry problem, not a creation problem.
/auth/v1/token → 400, with a $rageclick; then /forgot-password → /auth/v1/recover → 200.
A 200 from /recover proves nothing (GoTrue returns 200 for any address). Checked three independent layers: recovery_sent_at (NULL), the user_recovery_requested audit row (absent), and Brevo delivery events (none). A separate user who reset successfully nearby lit all three green as a control.
The human supplied a partial identity clue, then warned that "signup may not capture an email." Codex changed strategy from a direct user lookup to searching recent anonymous signup sessions.
The human asked Codex to check "current signup sessions now" and explicitly pointed it at the replay skill — the tool it needed to make any progress. Scope finally narrowed to fresh /register activity.
[…] 200.
create_pending_profile returned 200 with success:false and a duplicate-account business-rule message.
The human changed the focus from "why did signup fail?" to "did the password reset actually go through?" Only then did Codex pivot to recovery events.
POST /auth/v1/recover → 200 (PostHog + auth logs). Read as "request accepted."
[…] /reset-password, no /auth/callback, no successful login). Concluded "completion not observed"; did not test whether an email was ever generated or delivered.
PostHog "replays" are not video; they are rrweb event streams (JSON). The /posthog-replay-analysis skill decodes them from the command line. This is the technique both engines relied on once they reached the replay: Claude on its own, Codex after being pointed at it.
Query session_recordings for the candidate session (matched on time, $os, and signup events). Metadata — duration, click/rageclick, console-error counts — is the first smell test.
Each recording is a series of blob_v2 snapshots. Gaps between blob timestamps reveal a dead/killed renderer; continuity means a live tab.
Pull blob ranges (max 20 keys/request). Events tagged cv:"2024-10" carry a gzip-compressed data field (latin-1 → gunzip → JSON).
type 4 = page load (href); type 6 plugins = rrweb/console@1 (console) and rrweb/network@1 (fetch/XHR); type 2 = full DOM snapshot.
Replay the URL sequence, per-second network buckets, and console lines. This is where the five 400 logins and the success:false create_pending_profile body surfaced.
The type 2 full snapshot exposes element values. Here the unmasked register field revealed the exact Gmail dot-variant — the root cause.
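The decode and classify steps above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: the event shape (a dict with type, cv, and data keys, and plugin events carrying plugin and payload) is an assumption, and decode_data and classify are hypothetical helper names.

```python
import gzip
import json

COMPRESSED_TAG = "2024-10"  # value of the cv marker named in the text


def decode_data(event):
    """Return event["data"] as a Python object, decompressing if tagged."""
    data = event.get("data")
    if event.get("cv") == COMPRESSED_TAG and isinstance(data, str):
        raw = data.encode("latin-1")  # one character per byte -> bytes
        return json.loads(gzip.decompress(raw).decode("utf-8"))
    return data


def classify(event):
    """Map an rrweb event to a (kind, detail) pair for a timeline."""
    data = decode_data(event) or {}
    kind = event.get("type")
    if kind == 4:  # page load
        return ("page", data.get("href"))
    if kind == 2:  # full DOM snapshot
        return ("snapshot", None)
    if kind == 6:  # plugin event (assumed to be a dict with plugin/payload)
        plugin = data.get("plugin")
        if plugin == "rrweb/console@1":
            return ("console", data.get("payload"))
        if plugin == "rrweb/network@1":
            return ("network", data.get("payload"))
    return ("other", None)


# Synthetic event, packed by running the decode steps in reverse.
packed = gzip.compress(json.dumps({"href": "https://localhost/register"}).encode("utf-8"))
sample = {"type": 4, "cv": COMPRESSED_TAG, "data": packed.decode("latin-1")}
print(classify(sample))
```

Sorting the resulting pairs by event timestamp would give the page/console/network timeline described above; that step is left out because the timestamp field name is not stated in the text.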
Gmail ignores dots in the local part (same inbox); Supabase compares the raw string (different user). One misplaced dot explains everything.
real: ⟨name⟩.⟨word⟩NN@gmail.com ← dot AFTER name
typed: ⟨name⟩⟨word⟩.NN@gmail.com ← dot BEFORE number
→ explains the 400 logins, the dead reset, and why "already exists" still matched (autofill gave the correct spelling only on the register screen).
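The mismatch can be illustrated with a short sketch. The two addresses are hypothetical stand-ins that follow the redacted pattern above, gmail_inbox_key is an illustrative helper, and same_auth_user models the auth store as a plain string comparison, which is an assumption based on the description rather than on Supabase's code.

```python
def gmail_inbox_key(address):
    """Collapse a Gmail address to the inbox it is delivered to."""
    local, _, domain = address.strip().lower().rpartition("@")
    if domain == "gmail.com":
        local = local.replace(".", "")  # Gmail ignores dots in the local part
    return local + "@" + domain


def same_auth_user(a, b):
    """Assumed model of the auth store: compare the raw strings."""
    return a == b


# Hypothetical addresses that follow the pattern in the text.
real = "sam.river07@gmail.com"    # dot after the name
typed = "samriver.07@gmail.com"   # dot before the number

print(gmail_inbox_key(real) == gmail_inbox_key(typed))  # same inbox?
print(same_auth_user(real, typed))                      # same auth user?
```

Under these assumptions the first check passes and the second fails, which is the combination that produces failed logins and a reset that targets a non-existent user.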
Concluded "existing account conflict" and recommended a clearer login/reset path. It did not isolate why the same person could neither log in nor reset — the dot-variant mechanism was never surfaced.
A 200 from /recover means nothing — GoTrue
returns 200 for any address (anti-enumeration). Verified across three
independent layers, with a control user who reset successfully nearby:
recovery_sent_at: NULL (control: set)
user_recovery_requested audit row: absent (control: present)
Brevo delivery events: none (control: present)
Conclusion: the email was never sent.
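The reasoning behind the layered check with a positive control can be sketched as below. This is an illustration only: ResetEvidence, layers, and verdict are hypothetical names, the values would in practice come from the auth tables and the mail provider, and the control timestamp is a placeholder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResetEvidence:
    recovery_sent_at: Optional[str]  # None models a NULL column
    audit_row_present: bool          # user_recovery_requested row found?
    delivery_events: int             # mail-provider delivery events seen


def layers(e):
    """One boolean per independent layer: True means the layer saw a send."""
    return [e.recovery_sent_at is not None, e.audit_row_present, e.delivery_events > 0]


def verdict(subject, control):
    """Trust the subject's result only if the control lights every layer."""
    if not all(layers(control)):
        return "inconclusive: control did not light every layer"
    hits = layers(subject)
    if not any(hits):
        return "never sent"
    if all(hits):
        return "sent"
    return "inconclusive: layers disagree"


subject = ResetEvidence(recovery_sent_at=None, audit_row_present=False, delivery_events=0)
control = ResetEvidence(recovery_sent_at="<timestamp>", audit_row_present=True, delivery_events=1)
print(verdict(subject, control))
```

The control matters because an empty result on its own could mean the probes are broken; a nearby user who lights all three layers shows that the probes would have detected a send.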
Treated POST /recover → 200 as "request
accepted" and noted completion was not observed in the session. Did not
test whether an email was actually generated or delivered, so non-delivery
was never established.
[…] session_recordings, blob_v2)
[…] auth.users, audit_log_entries, vault, /config/auth)
[…] /v3/smtp/statistics/events), delivery truth
[…] /recover = 200)
rg, sed, jq, curl
[…] provisional_minor / no-parent dead-end.
[…] password_reset_requested.
[…] password_reset_completed.