Auth, Passkey & Redis Hardening — May 2026

On May 12, 2026 we conducted an internal audit of our authentication, passkey, and Redis infrastructure. We found eleven issues — every one of them in code we wrote ourselves. This article documents what we found, why it mattered, and exactly what we changed.

Executive Summary

Audit date May 12, 2026
Scope Authentication, passkey (WebAuthn), and Redis infrastructure
Total findings 11
Severity breakdown 3 High 5 Medium 3 Low
Remediation status All findings resolved in commit 493d5e44. Full test suite passed. No regression.
Exploitation evidence None

Why We Are Publishing This

We found these issues in our own application code and infrastructure layers — not in third-party dependencies or external services. Code we reviewed, committed, and shipped.

We are publishing this because security transparency requires more than disclosing external CVEs or passing audits. It means being publicly accountable when your own team ships code that does not meet the bar you set for yourself.

We are not embarrassed by this. The greater failure would have been discovering these issues and choosing not to disclose them.

Audit Methodology

This audit was conducted internally by the engineering team on May 12, 2026. The review covered all code paths related to authentication state management, WebAuthn credential lifecycle, and Redis key handling.

  • Manual code review of all controller, domain, and infrastructure files involved in session creation, passkey registration, passkey login, and account recovery flows.
  • Static analysis via PHPStan at Level 9 — zero-tolerance for type-unsafe or unreachable code paths.
  • Threat modeling against the WebAuthn Level 2 specification (§6.1 authenticator data, §7.1 registration ceremony, §7.2 authentication ceremony).
  • Regression testing with the full PHPUnit regression suite post-remediation. All tests passed.

No external auditor, bug bounty report, or security incident preceded this review. These issues were identified through routine internal process.

Our Engineering Philosophy

This audit surfaced failures across three principles we hold as foundational:

  • Atomicity before correctness. If two operations must happen together, treat them as one operation or do not attempt the design at all. A system that is “correct most of the time” is not correct.
  • Layered defense. No single check should be the sole barrier to a security boundary. If the database flags a credential as revoked, the registration path must also enforce it. Defense must not have gaps between layers.
  • Information asymmetry as a design goal. An attacker who probes the system should learn as little as possible about what is happening inside it. Error messages, log entries, and response timing are all surfaces.

Finding 1 — Non-Atomic hset + expire (Redis Race Condition) High

Category: Redis / Atomicity

Across eight callsites, a Redis hash was written with HSET and then immediately had a TTL applied with a separate EXPIRE command:

// Before — two separate round trips
Database::hset($key, $fields);
Database::expire($key, $ttlSeconds);

These are two separate round trips to Redis. If the PHP process dies, is interrupted, hits a timeout, or if Redis experiences a momentary failure between them, the hash is written without an expiry — and lives forever in Redis.

The affected callsites and their security implications:

Callsite Key type Consequence of missing TTL
Authentication::createSession() Session record Session never expires — account accessible past intended lifetime
PasskeyController (signup challenge) WebAuthn challenge Stale challenge data persists beyond intended lifetime, increasing replay risk
PasskeyController (register challenge) WebAuthn challenge Same as above
PasskeyController (login challenge) WebAuthn challenge Same as above
AccountRecoveryController Recovery passkey challenge Recovery session data never expires
RecoveryEmailController (code issue) Recovery email code One-time codes survive past their intended expiry window
RecoveryEmailController (code resend) Recovery email code Same as above
CapabilityTokenService One-shot admin tokens Tokens designed to expire in 5 minutes may survive indefinitely
AccountRecoveryTransaction Recovery transaction record Recovery transaction state never cleaned up

For sessions, this is a direct access-lifetime violation. A session should have a hard ceiling. If the TTL is never set, that ceiling does not exist.

For one-shot capability tokens, a token designed to be valid for exactly 300 seconds may still be valid days later.

The fix: We introduced Database::hsetex() — a wrapper that executes both operations inside a Redis MULTI/EXEC transaction, making them atomic. The operations are executed in the same execution unit so the key cannot exist without its TTL being applied. The key either has data and a TTL, or it has nothing.

// After — atomic MULTI/EXEC
Database::hsetex($key, $fields, $ttlSeconds);

Every callsite that issued a hset followed by expire on the same key was converted.

Finding 2 — Logout and CSRF Invalidation Could Silently Fail High

Category: Redis / Logout, CSRF

The Database::del() method — responsible for deleting Redis keys by pattern — was enumerating keys using the read replica and then issuing DEL commands to the primary:

// Before — key enumeration on replica
$keys = self::getReadInstance()->client->keys($pattern);

Redis replication is asynchronous. If the replica lags — even by milliseconds — it may not yet hold the key that was just written. In that case keys() returns an empty list and no DEL is issued to the primary. The key survives.

The two most critical callers of del():

  • destroySession() — logout: When a user logs out, we delete their session key. If the replica is behind, the session key list returns empty, the delete never fires, and the session continues to exist on the primary. The user believes they are logged out. They are not.
  • validateCSRFToken() — nonce invalidation: CSRF tokens are one-shot nonces. After first use they must be deleted. If the delete never fires, the token can be reused on a second request. One-shot becomes reusable.

This bug is subtle because it only surfaces under load or transient replica lag. In development against a single Redis instance it never triggers.

The fix: Key enumeration and deletion must target the same instance.

// After — enumerate against write instance
$keys = self::getWriteInstance()->client->keys($pattern);

Finding 3 — WebAuthn User Verification Bypass High

Category: Authentication

In AccountRecoveryController, when a passkey was being registered as part of account recovery, the processCreate() call passed false for requireUserVerification:

// Before — UV not enforced on verification
$result = $webauthn->processCreate(
    $clientDataJSON, $attestationObject, $challengeBinary,
    false,  // requireUserVerification — should be true
    true
);

The challenge issued to the client specified userVerification: 'required' — the authenticator was told the user must complete a biometric check or PIN. But when verifying the response, we were telling the library not to enforce that the UV flag was set.

A modified client could submit an authenticator response with the UV bit cleared. Our server would accept it without requiring biometric verification to have actually occurred.

The account recovery flow is the path a user takes when they have lost access to their other credentials. This is the highest-risk authentication surface we operate. Weakening biometric enforcement here is exactly the wrong trade-off.

The fix: UV is now enforced. A response where the authenticator data does not carry the UV flag set is rejected.

// After — UV enforced
$result = $webauthn->processCreate(
    $clientDataJSON, $attestationObject, $challengeBinary,
    true,   // requireUserVerification — enforced
    true
);

Finding 4 — Sign Count Clone Detection Missed Replay Attacks Medium

Category: Authentication

Our passkey clone detection was checking:

// Before — misses equal-count replay
$suspectedClone = $newSignCount > 0 && $oldSignCount > 0 && $newSignCount < $oldSignCount;

The WebAuthn Level 2 specification (§6.1) states: if the stored sign count is non-zero and the new sign count is not strictly greater than the stored value, the credential should be considered as possibly cloned. Our condition required <, not <=, so an equal sign count — as in a replay attack — passed without triggering the clone flag.

The fix: Aligned to the spec.

// After — covers replay (equal) and rollback (less-than)
$suspectedClone = $oldSignCount > 0 && $newSignCount <= $oldSignCount;

Finding 5 — Sign Count Not Always Persisted Medium

Category: Authentication

After a successful passkey login, the sign count update was gated on it being non-zero:

// Before — zero counts never written
if ($newSignCount > 0) {
    $updateFields['sign_count'] = (string) $newSignCount;
}

Some authenticators return 0 as a sentinel meaning “this device does not implement a counter.” If a device later starts returning a real counter (firmware update, or the user registers the same credential on a counter-supporting platform), we would never persist the initial real count because we had stored 0 forever.

Clone detection (Finding 4) requires the stored count to be non-zero — an authenticator we permanently tag as 0 is permanently opted out of counter-based protection.

The fix: The sign count is always written. The clone detection threshold handles interpretation.

// After — always persist sign count
$updateFields['sign_count'] = (string) $newSignCount;

Finding 6 — Revoked Passkey Could Be Re-Registered Medium

Category: Authentication

When a credential is marked revoked (clone detection triggered), there was no check in the registration path preventing re-registration of that same credential_id. An adversary with the raw passkey credential and account access could re-register the revoked credential, clearing its tainted history.

Revocation is only meaningful if it is permanent. If it can be overwritten by re-registration using the same credential, clone detection provides no lasting protection.

The fix: If revoked_at is non-empty on an existing credential record, the re-registration is blocked with HTTP 403 and a security log entry is written.

if (($existing['revoked_at'] ?? '') !== '') {
    SecurityLog::log('passkey_revoked_reregistration_blocked', [...]);
    Response::error('Registration failed.', ['error' => 'passkey_revoked'], HttpStatus::HTTP_FORBIDDEN);
    return;
}

Finding 7 — Account Enumeration via Differing Error Responses Medium

Category: Information Disclosure

When a passkey login was attempted with an unrecognized email, the error response body took a different shape than other failure cases — an empty [] data payload vs. the {'error': 'passkey_invalid'} body returned elsewhere. A client probing the API could distinguish “this email has no account” from “this email exists but the challenge failed” by inspecting the response body.

Additionally, the raw email address was being written to the observability log. Log aggregation pipelines should never hold raw user email addresses — if the log system is compromised, every enumeration attempt becomes a list of email addresses.

The fix: Both “email not found” and “no credentials registered” now return the same error body. The observability log records a SHA-256 hash of the email only — sufficient for incident correlation, insufficient to reconstruct the address.

// Before
Lens::add('[PASSKEY] Login email not found', ['email' => $email]);
Response::error('Authentication failed.', [], HttpStatus::HTTP_UNAUTHORIZED);

// After
Lens::add('[PASSKEY] Login email not found', ['email_hash' => hash('sha256', $email)]);
Response::error('Authentication failed.', ['error' => 'passkey_invalid'], HttpStatus::HTTP_UNAUTHORIZED);

Finding 8 — Recovery Key State Written Before Email Delivery Confirmed Medium

Category: Data Integrity

During account recovery key generation, the server was writing recovery_key_generated = 1 and recovery_proof_key to the user record before sending the recovery key email:

// Before — DB written first, email second
Database::hset(Keys::USER.':'.$user->user_uuid, [
    'recovery_key_generated' => '1',
    'recovery_proof_key' => $recoveryProofKey,
]);
$sent = EmailGarum::sendRecoveryKeyEmail(...);

If the email failed to send, the database would show recovery_key_generated = 1 — the system believes a key was issued. The user never received it.

There is no regeneration path for a user in this state. Account recovery is permanently broken for that account until manual intervention.

The fix: Email delivery is confirmed first. Database state reflects what actually happened.

// After — email first, then persist
$sent = EmailGarum::sendRecoveryKeyEmail(...);
if ($sent) {
    Database::hset(Keys::USER.':'.$user->user_uuid, [
        'recovery_key_generated' => '1',
        'recovery_proof_key' => $recoveryProofKey,
    ]);
}

Finding 9 — Disabled Registration Path Still Collected Password Fields Low

Category: Attack Surface

RegistrationController was still reading password and confirm_password from POST even though password-based registration has been disabled. PayCal registration is passkey-only.

Collecting fields that serve no function is not harmless. Every value read from user input is a surface: it can be logged, audited, passed to other functions inadvertently, or included in error payloads. The principle of least surface area requires that we do not collect what we do not use.

The fix: The two fields were removed from the input collection map.

Finding 10 — User Email in Email Verification 403 Response Low

Category: Information Disclosure

EmailVerificationGuard — the middleware enforcing email verification before granting access to protected resources — was including user_email in the 403 response body:

// Before
Response::error('Email verification required...', [
    'email_verified' => false,
    'user_email' => $user->email,  // disclosed to caller
], HttpStatus::HTTP_FORBIDDEN);

If an attacker gains a valid but unverified session token (through session fixation or a compromised temporary link), they can learn the email address associated with the account from the 403 response body — without having supplied the email themselves. The only party who benefits from the email in this error payload is someone who has the session token but not the email.

The fix: The email field was removed from the error payload.

Finding 11 — Dead Code in EmailGarum::verifyNewUserEmail() Low

Category: Dead Code / Attack Surface

EmailGarum contained a 90-line method, verifyNewUserEmail(), handling a password-based email change flow. This flow was superseded when the platform moved to passkey-only authentication. The method was not called anywhere in the codebase.

Dead code is not neutral. It occupies space in the security review surface, in static analysis, and in the cognitive load of anyone reading the file. It also represents a risk that a future developer, not knowing it was intentionally abandoned, might re-wire it into a new flow without full context.

The fix: Deleted. All callsites were confirmed empty before removal.

Summary of All Findings

# Finding Severity Category
1Non-atomic hset + expire across 9 callsitesHighRedis / Atomicity
2del() using read replica for key enumerationHighRedis / Logout, CSRF
3WebAuthn UV bypass in account recovery registrationHighAuthentication
4Sign count clone detection missed replay attacksMediumAuthentication
5Sign count not persisted when authenticator returns zeroMediumAuthentication
6Revoked passkey could be re-registeredMediumAuthentication
7Account enumeration via error body + raw email in logsMediumInformation Disclosure
8Recovery key DB state written before email confirmedMediumData Integrity
9Disabled registration still collecting password fieldsLowAttack Surface
10User email in email verification 403 responseLowInformation Disclosure
11Dead method verifyNewUserEmail() in EmailGarumLowDead Code

What We Got Right

In the interest of a complete picture, the foundations that were already in place:

  • Passkey-first authentication. The platform runs on WebAuthn with no password fallback for passkey users. The UV bypass and clone detection issues were defects within a fundamentally sound architecture.
  • One-shot capability tokens. Admin-level mutations already required fresh, time-limited tokens. The atomicity fix hardened an existing protection rather than adding a missing one.
  • Signed security log. Every security event — including the new passkey_revoked_reregistration_blocked events added in this commit — is written to a signed, append-only log with structured fields.
  • PHPStan at Level 9. All 11 modified files were validated at maximum static analysis strictness. The full regression suite passed without regression.
  • Clone detection existed. The logic was present and partially correct. Finding 4 was a boundary condition error, not a missing feature.

Customer Impact

  • No evidence of exploitation. All findings were identified internally through routine code review. No external report, CVE, or incident preceded this disclosure.
  • No plaintext credential exposure. No passwords or recovery keys were exposed. Credential data at rest remains encrypted. Biometric data never leaves the authenticator device and is never transmitted to or stored by PayCal.
  • No evidence of unauthorized account access. Security logs show no anomalous patterns consistent with exploitation of these vectors.
  • All findings remediated before disclosure. Every issue described in this article was fixed, committed, and tested before this page was published.
  • Full regression suite passed. Complete PHPUnit suite and PHPStan Level 9 static analysis completed cleanly post-remediation.
  • Monitoring expanded. New security log events were added for passkey revocation enforcement (Finding 6) to surface future anomalies earlier.

Prevention and Recurrence Controls

Two engineering rules adopted as standing policy from this audit:

hsetex is the default Redis write pattern

Any future code that needs to write a hash with a TTL must use Database::hsetex(). The old two-step pattern is no longer permitted. PHPStan rules will be written to flag new occurrences.

Write-instance primacy for all key operations

Any Redis operation whose correctness depends on reading back what was just written must use the write instance. Read replicas are for read-heavy non-critical queries only.

Self-audits at this level of specificity are a standing commitment. We will continue publishing what we find. Future reports will be posted to the Transparency Hub.

Disclosure Timeline

Date Event
Findings identified during routine internal audit session
All fixes implemented and committed (493d5e44)
Full PHPUnit regression suite passed, PHPStan Level 9 clean
Pushed to origin/main
This transparency article published

All findings were identified internally. No external report, CVE, or breach preceded this disclosure. There is no evidence that any finding was exploited.