Staff Engineer at Series B Fintech — Nexus Prime Customers

We’d been running Claude Code for three months. Our token spend was climbing 40% month-over-month — not because we were doing more, but because every session started from zero. Claude re-read our monorepo, re-learned our conventions, re-understood our service boundaries. Every. Single. Time.

I found Nexus Prime on a Hacker News thread. The “99.4% re-reads” stat felt exaggerated. I ran a token audit that afternoon. We were at 96.8%. The remaining 3.2% was actual generation.

One npm i -g nexus-prime and a weekend later, our agents started sessions with context already loaded. Week-one token spend dropped 62%. That number has held for six months.

The dashboard at localhost:3377 turned our AI usage from a black box into something observable. I can see exactly where tokens are going, which sessions are memory-heavy, which agents are still re-reading.

It’s become infrastructure. We don’t think about it anymore — which is exactly what infrastructure should do.