What Went Wrong
Not everything worked smoothly. Context limits, HealthKit crashes, WatchConnectivity quirks, and CloudKit surprises all tripped us up. This post covers the hard parts, the things that no amount of good prompting could solve on the first try.
Context Limits
Long conversations hit context limits. Claude Code warns at 20% remaining, then 10%, then 5%. When this happens, we either use /compact to summarise the conversation or start a new session.
For complex features, we break work into phases. Each phase gets its own session. The CLAUDE.md file and PRD documents carry context between sessions. It’s not seamless; some nuance always gets lost in the handover, but it works well enough.
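The handover document is just plain markdown. A hypothetical CLAUDE.md excerpt (the contents here are illustrative, not FoxFit's actual file) might look like:

```markdown
## Current work: workout sync (phase 2 of 3)

- Phase 1 (done): local SwiftData model + HealthKit read permissions
- Phase 2 (this session): Watch -> iPhone sync via WatchConnectivity
- Phase 3 (next session): CloudKit backup

### Lessons carried over
- Flush SwiftData writes before the app is backgrounded
- Test WatchConnectivity on real devices only; the Simulator drops messages
```

The point is that each new session starts by reading this file, so the phases and hard-won lessons survive even when the conversation itself doesn't.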
Plan for context limits up front. If a feature needs three sessions, design the work in phases from the start.
The 0xdead10cc Crash
This one was memorable. FoxFit kept crashing in the background with a 0xdead10cc exception code. This is Apple’s way of telling you your app is holding a file lock while suspended.
The problem: our SwiftData writes were still in flight when iOS suspended the app. The system detected the lock, killed the process, and left us with a crash report and a hex code that sounds like a threat.
This wasn’t something the AI could diagnose from documentation alone. We had to read the crash logs, understand what 0xdead10cc actually meant (it’s a known iOS behaviour, but not well-documented outside WWDC sessions), and work through the fix iteratively. The solution involved ensuring all pending writes complete before the app enters the background. That’s something that sounds obvious in hindsight but took several attempts to get right.
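The shape of the fix was roughly this: take out a background task assertion before flushing pending writes, so iOS grants the app a few seconds to finish instead of suspending it mid-write. This is a sketch with names of our own invention, not FoxFit's production code, and it assumes a SwiftData `ModelContext` and UIKit:

```swift
import UIKit
import SwiftData

// Hypothetical helper: wrap the pending SwiftData save in a background
// task assertion so iOS won't suspend the process (and flag the held
// file lock as 0xdead10cc) while the write is still in flight.
@MainActor
func flushWritesBeforeSuspension(context: ModelContext) {
    var taskID: UIBackgroundTaskIdentifier = .invalid
    taskID = UIApplication.shared.beginBackgroundTask(withName: "FlushWrites") {
        // Expiration handler: we ran out of time; release the assertion.
        UIApplication.shared.endBackgroundTask(taskID)
        taskID = .invalid
    }
    defer {
        if taskID != .invalid {
            UIApplication.shared.endBackgroundTask(taskID)
        }
    }
    do {
        try context.save() // complete the write while we still have execution time
    } catch {
        print("Save failed before suspension: \(error)")
    }
}
```

Calling something like this from the scene's did-enter-background callback is the usual pattern; the key insight is that the save must finish before the system seals the app's files.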
HealthKit Edge-Cases
HealthKit has behaviours that aren’t well documented. The AI’s initial implementations handled the happy path. The edge cases took a LOT of iteration.
The nastiest: workout session management during app transitions. A user starts a workout, switches to Spotify to put on music, comes back, and the session state might not be what you expect. Background state transitions, calorie tracking across session boundaries, and handoffs between iPhone and Apple Watch all had undocumented quirks.
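One defensive pattern that helped, sketched here with assumed names rather than FoxFit's actual code: when the app comes back to the foreground, re-read the session's state from HealthKit instead of trusting whatever we cached before the user switched away.

```swift
import HealthKit

// Illustrative sketch: reconcile cached workout state with HealthKit's
// actual session state after an app transition (watchOS side).
final class WorkoutManager: NSObject, HKWorkoutSessionDelegate {
    var session: HKWorkoutSession?

    func reconcileOnForeground() {
        guard let session else { return }
        switch session.state {
        case .running:
            break // still live; nothing to do
        case .paused:
            session.resume() // or surface the paused state in the UI
        case .ended, .stopped:
            // The system ended the session while we were away; tear down
            // local state rather than pretending the workout continues.
            self.session = nil
        default:
            break
        }
    }

    // The delegate is the only reliable source of truth for transitions,
    // so persist them here in case the app is relaunched mid-workout.
    func workoutSession(_ workoutSession: HKWorkoutSession,
                        didChangeTo toState: HKWorkoutSessionState,
                        from fromState: HKWorkoutSessionState,
                        date: Date) {
        // Record the transition for crash/relaunch recovery.
    }

    func workoutSession(_ workoutSession: HKWorkoutSession,
                        didFailWithError error: Error) {
        // Handle failure, e.g. the session being invalidated in the background.
    }
}
```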
These situations require patience. We described the symptoms and the AI proposed hypotheses. We tested them on real devices and fed back results. Through iteration, we eventually converged on a series of fixes. Then we added the lesson to CLAUDE.md so it doesn’t happen again. This is probably the thing that caused the single biggest delay in release.
WatchConnectivity in the Simulator
The Watch app’s sync with iPhone is unreliable in the Simulator. Messages get lost, reachability states don’t update correctly, and the Simulator simply doesn’t behave like real hardware for WatchConnectivity.
We had to test on physical devices. This is slower as deploying to and testing on a real Apple Watch takes time, but there was no workaround we could find. The AI can write correct WatchConnectivity code but it can’t make the Simulator work properly. Some problems are infrastructure limitations, not code problems.
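For completeness, here is the kind of code that's correct but still unreliable in the Simulator. A hedged sketch (the helper name is ours): prefer the fast interactive message when the counterpart is reachable, and fall back to the queued transfer when it isn't.

```swift
import WatchConnectivity

// Send a payload to the counterpart device. On real hardware both paths
// work; in the Simulator even this can silently drop messages.
func send(_ payload: [String: Any]) {
    let session = WCSession.default
    guard session.activationState == .activated else { return }

    if session.isReachable {
        session.sendMessage(payload, replyHandler: nil) { _ in
            // Interactive delivery failed; queue it for later instead.
            _ = session.transferUserInfo(payload)
        }
    } else {
        // Counterpart not reachable: use the store-and-forward queue.
        _ = session.transferUserInfo(payload)
    }
}
```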
This one competes with the HealthKit edge cases for the single longest delay.
CloudKit Configuration
TestFlight apps use the production CloudKit database, not development. We didn’t realise this initially. Data created during testing ended up in production.
It wasn’t a disaster: TestFlight testers are early adopters, and their data should persist through launch anyway. But it caught us off guard. This is the kind of deployment behaviour the AI doesn’t know about and that isn’t clear in the API documentation. It’s tribal knowledge, the kind of thing you usually learn from a colleague who’s been burned by it before.
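Once you know the behaviour, the mitigation is simple. One option, sketched here with a hypothetical record type and field name (not FoxFit's actual schema): tag records created by non-release builds so test data in the production database is at least identifiable.

```swift
import CloudKit

// Illustrative workaround: TestFlight builds talk to the production
// CloudKit database, so make internally-created data identifiable.
// The record type and field name here are hypothetical.
func makeWorkoutRecord() -> CKRecord {
    let record = CKRecord(recordType: "Workout")
    #if DEBUG
    // Xcode debug builds hit the development environment, so this flag
    // mainly matters for internal builds that run against production.
    record["isTestData"] = 1 as CKRecordValue
    #endif
    return record
}
```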
The Pattern
Look at all of those problems and you’ll notice they fall into two camps. Either the AI didn’t know about something because it wasn’t in the documentation (the 0xdead10cc crash, the HealthKit quirks, the CloudKit gotcha), or the AI couldn’t test something because it only happens on real hardware or in production (Simulator vs device, dev vs prod). Both kinds need a human in the room watching what actually happens. The AI is fast, but it’s not psychic.