Yesterday a friend reached out to me. He has a Claude subscription and wanted to try Anthropic's new code review feature, which is supposed to be very good (and indeed it is). He doesn't write code, so he came to me looking for something interesting to test it on.
As it happened, I had a real problem.
In NextApp, I derive a checksum from the local database on each device. The purpose is simple: if two devices are fully synchronized, they should produce the same checksum. If they don't, something is out of sync.
Recently I noticed that after a device had been running for a while since its last full sync, the checksums no longer matched. My desktop, phone, and laptop all appeared to have the same data, yet they generated different hashes.
So we handed the problem to Claude.
Claude Finds the Bug
Claude spent 500k tokens on the codebase. I think it consumed roughly half of my friend's five-hour usage limit. Eventually, it identified a problem in the way I generate database hashes.
Most of the local database is replicated from the server, so if synchronization works correctly, the data should be identical across devices.
The exception is the WorkSession table.
A work session tracks the time spent working on an action. While a session is active, the client application updates certain fields locally every second or two. These fields include accumulated duration and pause time.
Claude correctly observed that I calculate these values locally and write them to the database.
That means two devices can be perfectly synchronized and still generate different hashes simply because their local timers update at different moments.

The screenshot shows the work sessions in NextApp on my desktop.
Why Measuring Work Time Is Hard
To understand the problem, it's useful to understand how work sessions work in NextApp.
When I start working on an action, neither the client nor the server starts a stopwatch.
That sounds strange at first, but there is a good reason.
Imagine I start a task on my desktop computer. Then I shut down the computer, get on my motorcycle, travel somewhere, and continue working. An hour later I finish the task using my phone.
The desktop app wasn't running during that hour. The phone app wasn't running during the entire hour either. Even the server might have restarted for an upgrade during that period.
No single device can reliably measure the elapsed time.
Instead, NextApp records events on a timeline:
- Start events
- Pause events
- Resume events
- Correction events
- Completion events
From these events, an algorithm reconstructs the actual work time and pause time whenever needed. The displayed values are calculated from the event history rather than measured directly by a running timer.
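To make the idea concrete, here is a minimal Python sketch of reconstructing work and pause time from an event timeline. It is not NextApp's actual code: the names `Kind`, `Event`, and `reconstruct` are invented for illustration, timestamps are assumed to be plain seconds, and correction events are left out because their exact semantics are not described here.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    START = "start"
    PAUSE = "pause"
    RESUME = "resume"
    DONE = "done"


@dataclass(frozen=True)
class Event:
    kind: Kind
    at: int  # timestamp in seconds (assumed representation)


def reconstruct(events, now):
    """Return (work_seconds, pause_seconds) derived purely from the events."""
    work = paused = 0
    state = None  # None (not started), "working", "paused" or "done"
    since = None  # timestamp of the most recent event
    for ev in sorted(events, key=lambda e: e.at):
        # Credit the interval that just ended to whatever state we were in.
        if state == "working":
            work += ev.at - since
        elif state == "paused":
            paused += ev.at - since
        since = ev.at
        if ev.kind in (Kind.START, Kind.RESUME):
            state = "working"
        elif ev.kind is Kind.PAUSE:
            state = "paused"
        elif ev.kind is Kind.DONE:
            state = "done"
    # A session that is still open keeps accumulating up to "now".
    if state == "working":
        work += now - since
    elif state == "paused":
        paused += now - since
    return work, paused


timeline = [
    Event(Kind.START, 0),
    Event(Kind.PAUSE, 600),
    Event(Kind.RESUME, 900),
    Event(Kind.DONE, 3600),
]
print(reconstruct(timeline, now=10_000))  # (3300, 300)
```

Note that for a session that is not done, the result depends on `now`, which is exactly why two devices computing it at slightly different moments end up with different numbers.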
It's a surprisingly interesting problem, and honestly it might deserve its own blog post someday.
The Wrong Conclusion
Claude identified the root cause correctly: values that are updated independently on each device should not participate in synchronization checksums.
However, its proposed solution was to stop writing the local values to the database. That could have side effects and cause other issues.
I fed Claude's findings into Codex and asked for a review and an implementation plan.
Codex reached a similar conclusion. It agreed that the locally updated values were problematic, but it didn't suggest a better solution.
What surprised me is that both models had access to the full codebase. The code is the documentation. Everything needed to understand the design was right there.
The Actual Solution
The answer was obvious to me as soon as I read the findings.
A work session can exist in three states:
- Active — the task currently being worked on.
- Paused — work that has been temporarily suspended.
- Done — completed sessions.
For a given user, only one session can be active at a time. Several sessions may be paused. Everything else is completed.
The locally updated duration and pause counters only matter for active and paused sessions. Those values are continuously recalculated and naturally differ between devices.
Completed sessions are different. Once a session is done, its values become stable.
So the solution was simple:
- Include all timeline events in the checksum.
- Include completed session data in the checksum.
- Exclude the calculated duration and pause values for active and paused sessions.
The event history is already synchronized through the server. Whenever work starts, pauses, resumes, or finishes, those events are replicated to every client. They are the true source of truth and should absolutely participate in the hash.
The derived timer values are merely local calculations and should not.
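The rule can be sketched in a few lines of Python. This is an illustration under assumptions, not the real NextApp implementation: the row layout (`id`, `state`, `duration`, `paused` for sessions; `session`, `kind`, `at` for events), the use of SHA-256 over JSON, and the function name `checksum` are all made up for the example.

```python
import hashlib
import json


def checksum(sessions, events):
    """Hash replicated data, skipping locally derived timer values."""
    h = hashlib.sha256()
    # Every timeline event participates; sort for a device-independent order.
    for ev in sorted(events, key=lambda e: (e["session"], e["at"], e["kind"])):
        h.update(json.dumps(ev, sort_keys=True).encode())
    for s in sorted(sessions, key=lambda s: s["id"]):
        row = dict(s)  # copy so the caller's data is not modified
        if row["state"] != "done":
            # Active and paused sessions: drop the locally calculated values.
            row.pop("duration", None)
            row.pop("paused", None)
        h.update(json.dumps(row, sort_keys=True).encode())
    return h.hexdigest()


events = [{"session": 1, "kind": "start", "at": 0}]
desktop = [{"id": 1, "state": "active", "duration": 118, "paused": 0}]
phone = [{"id": 1, "state": "active", "duration": 121, "paused": 0}]
print(checksum(desktop, events) == checksum(phone, events))  # True
```

The two devices disagree about the running duration (118 versus 121 seconds), yet they produce the same checksum because only the events and the stable session fields are hashed. A completed session still contributes its duration and pause values, so a genuine mismatch there would still be detected.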
An Interesting Lesson
What I found interesting is that both Claude and Codex successfully identified the problem and even pointed toward the problematic area of the code.
Neither of them, however, managed to connect that insight with the underlying domain model and arrive at the correct and simplest fix.
Both of them had a clearer understanding than I did of exactly what that code does. Still, my very human brain saw the simple solution immediately.
Claude found a good handful of other problems, so this was a very nice experience and a productive day.
