How We Made the Business Members Page ~100x Faster

The Business Members page was taking about 1.8 seconds of server time on every load. We traced it to two compounding design flaws, fixed both, and the same page now renders from cache in single-digit milliseconds. This article explains what was slow, why, and exactly what we changed.

Performance Improvement

  • Before: ~1.8 seconds
  • After: <10 milliseconds
  • Improvement: ~100x+

Measured using PayCal Lens, our built-in server timing instrumentation.

For business administrators, the Members page now appears effectively instant, even for larger organizations.

Executive Summary

  • Page affected: Business Members, the grid listing every member of a business with computed financial columns
  • Before: ~1.8 seconds of server time per page load
  • After: single-digit milliseconds on cache hits (~100x+ improvement); cache misses are also faster than the old path
  • Root causes: an N+1 query pattern against Redis, and full recomputation of a year of payroll math on every view
  • The fix: batched (pipelined) Redis lookups, plus a materialized cache of the finished grid data with explicit invalidation
  • Data freshness: the cache is invalidated immediately on any member-related change; a 5-minute expiry bounds staleness as a safety net

Why We Are Publishing This

Performance is part of transparency. When a page is slow, users deserve to know whether the slowness is inherent to the work being done or the result of an avoidable design flaw. In this case it was the latter, twice over, and we think the details are worth sharing because both flaws are among the most common performance mistakes in web software.

Nothing in this article involves a security issue or any user data exposure. It is purely an engineering story about making a slow page fast.

What the Page Does

The Business Members page lists every member of a business. Alongside each member's name and role, the grid shows five computed financial columns:

  • Year-to-date gross: total earnings so far this year
  • Total hours: all hours worked this year
  • Regular hours: hours at the standard rate
  • Overtime hours: hours beyond the regular threshold
  • Trailing baseline: a rolling reference figure used for comparison

None of these values are stored anywhere as finished numbers. They are computed from each member's raw work entries, the individual shift records stored in our Redis database. Producing one row of the grid means loading a member's profile, loading their full year of work entries, splitting hours into regular and overtime, and summing gross pay. Multiply that by every member of the business, and that is the work the page does.
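The per-row work described in this section can be sketched roughly as follows. This is a minimal illustration, not the production code: the function name, the entry shape (`hours`, `rate`), the 8-hour per-entry threshold, and the 1.5x overtime multiplier are all assumptions made for the example, and the trailing baseline is left out because its definition is not given here.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: fold one member's work entries into grid columns.
 * The threshold and multiplier defaults are illustrative assumptions,
 * not PayCal's actual payroll rules.
 *
 * @param array<int, array{hours: float, rate: float}> $entries
 * @return array{total_hours: float, regular_hours: float, overtime_hours: float, ytd_gross: float}
 */
function summarizeEntries(array $entries, float $threshold = 8.0, float $otMultiplier = 1.5): array
{
    $regular = 0.0;
    $overtime = 0.0;
    $gross = 0.0;

    foreach ($entries as $entry) {
        // Hours up to the threshold are regular; anything beyond is overtime.
        $reg = min($entry['hours'], $threshold);
        $ot = max($entry['hours'] - $threshold, 0.0);

        $regular += $reg;
        $overtime += $ot;
        $gross += $reg * $entry['rate'] + $ot * $entry['rate'] * $otMultiplier;
    }

    return [
        'total_hours' => $regular + $overtime,
        'regular_hours' => $regular,
        'overtime_hours' => $overtime,
        'ytd_gross' => $gross,
    ];
}

// Two example shifts at a hypothetical rate of 20 per hour.
$row = summarizeEntries([
    ['hours' => 8.0, 'rate' => 20.0],
    ['hours' => 10.0, 'rate' => 20.0],
]);
```

Even in this simplified form, the cost is proportional to the number of work entries per member, which is why repeating it for every member on every view adds up.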

The Problem, Part 1 — The N+1 Query Pattern

The original implementation made one query to fetch the list of members, and then, for each of roughly 100 members, made separate, sequential round-trips to Redis: one for the member's profile, and more for their full year of work entries.

This is the classic “N+1” pattern: one query to get the list, then N more queries issued one at a time for the items in it. Each individual Redis lookup is fast, well under a millisecond of actual work, but every round-trip also pays a fixed cost in network latency. Issued sequentially, those costs do not overlap; they stack linearly. Hundreds of sequential round-trips meant hundreds of stacked latency payments before the page could render anything.

Small businesses with a handful of members were largely unaffected; the stacked latency became noticeable only as organizations grew to dozens or hundreds of members, where the per-member round-trips accumulated into seconds of server time.

The database was never the bottleneck. The conversation with the database was.

The Problem, Part 2 — Recomputing Everything on Every View

The second flaw compounded the first. All of that payroll math, splitting a year of work entries into regular and overtime hours, computing year-to-date gross, deriving the trailing baseline, for every member, was redone from scratch on every single page view.

Work entries do not change very often. Between two consecutive views of the Members page, the underlying data is almost always identical. Yet every visit paid the full cost of recomputing results that had just been computed moments earlier and then thrown away.

Combined, the two flaws produced about 1.8 seconds of server time per page load, measured with PayCal Lens, our built-in server timing instrumentation.

The Fix, Part 1 — Redis Pipelining

Redis supports pipelining: sending a batch of commands in a single round-trip and receiving all of the answers together. Instead of asking one question, waiting, asking the next, and waiting again, the server now asks all of its questions at once.

We added a batched lookup method, Database::pipelineHgetall(), and converted the members grid to use it. All member profile lookups are gathered into a single round-trip, and all work-entry lookups into another, rather than one round-trip per member per data type.

// $profileKey is a callable that maps a member ID to its Redis key.

// Before - one round-trip per member, latency stacks linearly
$profiles = [];
foreach ($memberIds as $id) {
    $profiles[$id] = Database::hgetall($profileKey($id));
}

// After - one round-trip for the whole batch
$profiles = Database::pipelineHgetall(array_map($profileKey, $memberIds));

This change alone collapsed hundreds of sequential latency payments into a handful.
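To make the difference in round-trips concrete, here is a small self-contained sketch. It does not talk to Redis: `FakeRedis` is a hypothetical in-memory stand-in that only counts how many request/response exchanges each approach would need, and the key format and the shape of the batched return value are assumptions rather than the real `Database::pipelineHgetall()` contract.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical in-memory stand-in for a Redis connection.
 * It models the request/response pattern only, not real network I/O.
 */
final class FakeRedis
{
    public int $roundTrips = 0;

    /** @param array<string, array<string, string>> $hashes */
    public function __construct(private array $hashes) {}

    /** One command, one round-trip. */
    public function hgetall(string $key): array
    {
        $this->roundTrips++;
        return $this->hashes[$key] ?? [];
    }

    /**
     * Many commands sent together, one round-trip for the whole batch.
     *
     * @param list<string> $keys
     * @return array<string, array<string, string>> replies indexed by key
     */
    public function pipelineHgetall(array $keys): array
    {
        $this->roundTrips++;
        $replies = [];
        foreach ($keys as $key) {
            $replies[$key] = $this->hashes[$key] ?? [];
        }
        return $replies;
    }
}

// Hypothetical key format, used only for this example.
$profileKey = fn (int $id): string => "member:{$id}:profile";
$memberIds = range(1, 100);

$data = [];
foreach ($memberIds as $id) {
    $data[$profileKey($id)] = ['name' => "Member {$id}"];
}

// Sequential: every lookup waits for its own reply.
$sequential = new FakeRedis($data);
foreach ($memberIds as $id) {
    $sequential->hgetall($profileKey($id));
}

// Batched: all lookups travel together.
$batched = new FakeRedis($data);
$profiles = $batched->pipelineHgetall(array_map($profileKey, $memberIds));

printf("sequential: %d round-trips, batched: %d\n", $sequential->roundTrips, $batched->roundTrips);
// prints "sequential: 100 round-trips, batched: 1"
```

A real implementation would queue the HGETALL commands on the Redis client's pipeline and flush them once; the counting logic here only illustrates why the fixed per-round-trip latency stops scaling with the number of members.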

The Fix, Part 2 — A Materialized Cache

Pipelining makes the computation cheaper to feed; the second fix avoids repeating the computation at all. We introduced BusinessMembersCache, a server-side cache that stores the finished computed grid data, the per-member financial summaries, in Redis.

Two rules govern cache freshness:

  • A 5-minute expiry. Every cached grid automatically expires after 5 minutes, bounding how old the data can possibly be.
  • Explicit invalidation on change. Any member-related change (a role update, a member added, a member removed) deletes the cached grid immediately, so the next request regenerates it from fresh data.

Member identity and permission checks are deliberately not cached. Every request still runs the full access-control path; only the expensive financial arithmetic is reused.
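The two freshness rules can be sketched as a small class. This is an illustrative sketch under stated assumptions, not the actual BusinessMembersCache: the class name `GridCacheSketch`, its method names, and the business ID key are hypothetical, a PHP array stands in for Redis, and the clock is injected so expiry can be exercised without waiting five minutes.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of a materialized grid cache with a TTL safety net
 * and explicit invalidation. An array stands in for Redis.
 */
final class GridCacheSketch
{
    private const TTL_SECONDS = 300; // the 5-minute expiry

    /** @var array<string, array{expires_at: int, rows: array}> */
    private array $store = [];

    /** @param \Closure(): int $clock returns the current Unix time */
    public function __construct(private \Closure $clock) {}

    /** Returns the cached rows, or null on a miss or an expired entry. */
    public function get(string $businessId): ?array
    {
        $entry = $this->store[$businessId] ?? null;
        if ($entry === null || $entry['expires_at'] <= ($this->clock)()) {
            unset($this->store[$businessId]);
            return null; // caller recomputes the grid and calls put()
        }
        return $entry['rows'];
    }

    /** Stores freshly computed rows with a new expiry. */
    public function put(string $businessId, array $rows): void
    {
        $this->store[$businessId] = [
            'expires_at' => ($this->clock)() + self::TTL_SECONDS,
            'rows' => $rows,
        ];
    }

    /** Called from every write path that changes membership or roles. */
    public function invalidate(string $businessId): void
    {
        unset($this->store[$businessId]);
    }
}

$now = 1000;
$cache = new GridCacheSketch(function () use (&$now): int { return $now; });
$rows = [['member' => 'Ada', 'ytd_gross' => 380.0]];

$cache->put('biz-1', $rows);
$hit = $cache->get('biz-1');               // served from the cache

$cache->invalidate('biz-1');               // e.g. after a role update
$afterInvalidate = $cache->get('biz-1');   // null: next request recomputes

$cache->put('biz-1', $rows);
$now += 301;                               // just past the 5-minute expiry
$afterExpiry = $cache->get('biz-1');       // null: the safety net applied
```

In this sketch the cache would be consulted only after the request has passed access control, matching the rule that identity and permission checks are never cached.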

The Impact

  • Cache hits serve the grid in single-digit milliseconds — a ~100x+ improvement over the ~1.8 seconds the page previously took.
  • Cache misses are still faster than the old page ever was, because the recomputation now runs on pipelined, batched lookups instead of hundreds of sequential round-trips.
  • Correctness is tested. Contract and unit tests cover the cache invalidation behavior, verifying that role updates and membership changes evict the cached grid, and that expired or mismatched cache entries are never served.

[Figure: Performance improvement summary. PayCal Lens measured server time before and after the fix: ~1.8 seconds before, under 10 milliseconds after, roughly 100x improvement.]

[Figure: Before. PayCal Lens ranked the per-member financial recomputation and sequential Redis round-trips as the dominant costs (~1812 ms total).]

[Figure: After. The same page on a cache hit completes in single-digit milliseconds (~7 ms), with the materialized grid read as the primary work.]

[Figure: The members grid loads immediately on repeat visits while access control still runs on every request.]

What We Took Away From This

  • Measure before optimizing. PayCal Lens told us exactly where the 1.8 seconds went. Without per-request timing instrumentation, both flaws would have been guesses.
  • Latency stacks; batch it. Many small fast queries issued sequentially are slower than one large batched query. Round-trips, not data volume, dominated this page.
  • Cache finished work, invalidate eagerly. A cache is only trustworthy if every write path that affects it also clears it. The expiry is a safety net, not the mechanism.

We will continue publishing engineering write-ups like this one to the Transparency Hub.

Engineering Facts

  • Commit(s): 3db2229b, 2b3eafb8, f63773ea
  • Files Changed: 14 (under html/)
  • Tests Added: 12 (BusinessMembersCacheTest — 9 cases; LensRenderTest — 3 cases)
  • Tests Passing: 1901 (full PHPUnit suite, June 2026)
  • Performance Impact: ~1.8 s → ~7 ms (<10 ms on cache hits)
  • Production Status: Deployed