The uncomfortable truth about Token Maxing
AI usage is climbing. Customer value is flat. Here's what's missing.
Here's the uncomfortable truth about token maxing.
The idea is simple. Spend as many tokens as possible. More usage, more reps, faster learning curve. Run that for six months and the operating system of the business actually changes.
I push it for that reason.
But token maxing on its own is a bad metric. Tie someone's job to token spend and you'll get exactly what you measured. Agents run on low-value work all day, memos get rewritten twenty times, compute burns to keep a dashboard green. Spend climbs. The business stays exactly where it was last month.
This is why every AI usage metric needs a pairing metric. Token spend is the input. The pairing metric is the output.
I track AI usage more like production than SaaS licensing. A license can sit dormant for months and nobody notices. AI can't work that way. If the bill is climbing, the output has to climb with it.
That's the difference between activity and capability.
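One way to make the pairing concrete is to compare how fast spend grows against how fast output grows. The sketch below is a minimal illustration, not a real dashboard: `MonthlyUsage`, `artifacts_shipped`, and the 10% tolerance are hypothetical names and values chosen for the example, and the numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class MonthlyUsage:
    month: str
    tokens_spent: int       # the input metric
    artifacts_shipped: int  # the pairing (output) metric; a hypothetical count


def pct_change(previous: float, current: float) -> float:
    """Relative change from previous to current; 0.0 when previous is zero."""
    if previous == 0:
        return 0.0
    return (current - previous) / previous


def spend_outpacing_output(prev: MonthlyUsage, curr: MonthlyUsage,
                           tolerance: float = 0.10) -> bool:
    """True when token spend grew faster than output by more than `tolerance`."""
    spend_growth = pct_change(prev.tokens_spent, curr.tokens_spent)
    output_growth = pct_change(prev.artifacts_shipped, curr.artifacts_shipped)
    return spend_growth - output_growth > tolerance


# Illustrative numbers only: spend is up 50%, output is flat.
april = MonthlyUsage("April", tokens_spent=1_000_000, artifacts_shipped=4)
may = MonthlyUsage("May", tokens_spent=1_500_000, artifacts_shipped=4)
flagged = spend_outpacing_output(april, may)
```

If `flagged` comes back true, the bill is climbing and the output is not. That is the month to ask harder questions.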
Run the artifact test
At the end of the month, stop reading the usage dashboard. Instead, ask each person on your team one question: show me the artifact.
Show me the agent. Show me the workflow that didn't exist a month ago. Show me the campaign that moved faster because AI handled briefs, variants, and reporting. Show me the dashboard that updates itself.
Token spend without an artifact attached is waste. Token spend tied to a shipped system is an investment.
Here's the exact test:
Pick one team. Give them 30 days.
End of month, each person presents what they built — not what they used, what they built.
Keep the workflows that produced real output. Kill the ones that only burned compute.
Repeat next month. The bar goes up.
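The month-end review above can be sketched as a simple keep-or-kill pass. This is a rough illustration under stated assumptions: the `Workflow` record, its fields, and the sample entries are all hypothetical, and "has an artifact" stands in for whatever your team counts as real output.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Workflow:
    owner: str
    name: str
    tokens_spent: int
    artifact: Optional[str] = None  # what was built; None if nothing shipped


def month_end_review(workflows: List[Workflow]) -> Tuple[List[Workflow], List[Workflow]]:
    """Split workflows into (keep, kill) based on whether an artifact exists."""
    keep = [w for w in workflows if w.artifact]
    kill = [w for w in workflows if not w.artifact]
    return keep, kill


def wasted_tokens(kill: List[Workflow]) -> int:
    """Total spend on workflows that produced no artifact."""
    return sum(w.tokens_spent for w in kill)


# Made-up entries for illustration.
month = [
    Workflow("Ana", "campaign briefs agent", 400_000, artifact="brief generator"),
    Workflow("Ben", "memo rewrites", 250_000),
    Workflow("Cy", "reporting", 300_000, artifact="self-updating dashboard"),
]
keep, kill = month_end_review(month)
waste = wasted_tokens(kill)
```

Next month, run the same pass with a higher bar for what counts as an artifact.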
Why the first round usually disappoints
You'll probably run this test and get back less than you expected. I see it constantly.
The reason is almost always the same. The agents have no context. People throw generic prompts at generic models, get generic answers, and decide AI doesn't work for them.
The fix is the memory layer. Your agents need to know your customers, your offers, your campaigns, your transcripts, your positioning, your past decisions. The same prompt against that context produces work that compounds. The same prompt without it produces noise.
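To show what "the same prompt against that context" might look like in practice, here is a minimal sketch that prepends business context to a task before it goes to a model. The function `build_prompt`, the section layout, and the sample context are assumptions made for illustration; this is not how any particular product implements its memory layer.

```python
from typing import Dict


def build_prompt(task: str, context: Dict[str, str]) -> str:
    """Prepend non-empty business context sections to the task.

    With no usable context, the task is returned unchanged, which is the
    generic-prompt case described above.
    """
    sections = [
        f"{name.upper()}:\n{text.strip()}"
        for name, text in context.items()
        if text.strip()
    ]
    if not sections:
        return task
    return "Business context\n\n" + "\n\n".join(sections) + "\n\nTask:\n" + task


# Hypothetical context entries; real ones would come from your own records.
context = {
    "customers": "Mid-market SaaS marketing teams.",
    "positioning": "Speed to launch over feature depth.",
    "past decisions": "",  # empty entries are skipped
}
generic = build_prompt("Draft three ad variants.", {})
grounded = build_prompt("Draft three ad variants.", context)
```

The task text is identical in both calls. Only `grounded` tells the model who the customer is and how the business positions itself.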
This is the problem we're solving with Single Brain: agents that carry the business context, so the artifact test actually produces artifacts worth keeping.
Run the test this month and see what shipped.
To building AI systems that actually ship,
Eric Siu