After Burning Through My Codex and Gemini Limits
Apr 08, 2026
Heavy use of coding agents taught me two practical lessons: Claude Code plus MiniMax is the best-value combo I have tried, and context management matters more than people think.
Over the past few days, I used several coding tools heavily enough to burn through both my Codex and Gemini limits.
That gave me a much clearer picture of what actually matters in day-to-day use. Part of it is about model quality, but a bigger part is about rate limits, recovery behavior, and what happens when context gets too large.
Here are the subscriptions I am currently paying for:
| Service | Price | My current take |
|---|---|---|
| ChatGPT Plus | $19.99/month | Strong coding performance, but the hard limits are noticeable |
| Google AI Pro (5TB) | $19.99/month | Usable, but coding still feels weaker and large context slows it down |
| MiniMax Coding Plan | RMB 119/month | Flexible pricing and very good value when paired with Claude Code |
If I compress everything into two conclusions, they are these:
- Claude Code plus MiniMax Coding Plan is the best combo I have used so far.
- If you do not manage context aggressively, your agent workflow degrades faster than you think.
My Current Ranking
I am not trying to do a benchmark here. This is a practical ranking based on how these tools feel in real work.
Claude Code + MiniMax Coding Plan Feels Best Right Now
This is the combination I currently like the most.
The main reason is simple: it gives me the best balance of performance, cost, and usable time. The limits reset every five hours, but unless you are doing something like fully automated long-running research, it is actually hard to hit those limits in normal coding work. Real workflows usually include manual steps: reading code, changing direction, checking outputs, deciding what to do next. Once human work is part of the loop, the limit feels much less restrictive than it sounds.
I also like MiniMax’s pricing structure. There are multiple plan choices, so you can start with the cheapest option and scale up if needed. At a bit over one hundred RMB per month, it is not free, but for heavy users it is very reasonable.
Right now, this is the setup that feels the least annoying and the most sustainable.
Codex Is Very Good, but the Limits Are Hard
I do think Codex is good at coding.
When it is available, it is often excellent. The problem is that its limits feel much harder than the limits in the Claude Code plus MiniMax setup. There is a five-hour limit, and there is also a weekly limit. Once the weekly limit is hit, there is not much to do except wait or pay for additional credit.
That matters because the real cost is not only money. The real cost is interruption. You can be deep in a working rhythm and then suddenly hit a wall for the week. For anyone trying to push a project forward continuously, that is a worse experience than a tool that is merely a bit slower.
My current view is that Codex is a strong coding tool, but it is a tool with stricter capacity management than I would like.
Gemini Is Not Useless, but It Is Not My Main Coding Tool
Gemini feels weaker than Codex for coding.
That is not even my biggest complaint. The bigger issue is how strongly context size seems to affect it. Once context usage goes above roughly fifteen percent of the window, it starts to feel slow. And the slowdown is not just about response time. I also get the impression that quality drops along with speed.
Its recovery behavior is better than a hard weekly wall. If I hit the limit, waiting half a day or a full day is often enough for it to become usable again. That makes it easier to rotate back in later.
So I do not see Gemini as useless. I see it more as a secondary tool than a primary one.
The Bigger Problem Is Not the Model. It Is Context
The strongest lesson from all this is that model choice is only part of the story.
In practice, context management often matters more.
My advice now is straightforward: use subagents frequently and do not keep dumping everything into the main agent.
Once the main agent’s context becomes large, performance starts to degrade in ways that are easy to feel:
- responses get slower
- reasoning feels less sharp
- outputs become less stable
- the agent is more likely to loop
- the agent is more likely to look busy without moving the work forward
This is why I now restart the main agent more aggressively, and why I prefer to offload work to subagents before the main session becomes bloated.
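The restart-or-offload rule I follow can be sketched as a simple heuristic. Everything in this sketch is an illustrative assumption, not the behavior of any particular tool: the four-characters-per-token estimate, the window size, and both thresholds are numbers I made up for the example.

```python
# Illustrative sketch of a "continue / offload / restart" decision rule.
# The token estimate, window size, and thresholds are assumptions for
# this example only, not values from any real agent tool.

CONTEXT_WINDOW_TOKENS = 200_000
OFFLOAD_THRESHOLD = 0.15   # above this, prefer handing work to a subagent
RESTART_THRESHOLD = 0.50   # above this, restart the main session


def estimate_tokens(text: str) -> int:
    """Rough token estimate: about four characters per token for English."""
    return len(text) // 4


def next_action(session_text: str) -> str:
    """Decide whether to continue, offload to a subagent, or restart."""
    usage = estimate_tokens(session_text) / CONTEXT_WINDOW_TOKENS
    if usage >= RESTART_THRESHOLD:
        return "restart"
    if usage >= OFFLOAD_THRESHOLD:
        return "offload"
    return "continue"
```

The exact numbers do not matter much; the point is having an explicit trigger instead of waiting until the session visibly degrades.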
A Simple Self-Check
I now use a very simple self-check:
If you always resist restarting the session, your context management is probably already broken.
I think a lot of people quietly assume that a long conversation is an asset in itself. They think that because the session has accumulated so much history, the AI must be getting more informed and more effective from it.
In reality, the long thread may be carrying a lot of low-quality state:
- outdated decisions
- abandoned approaches
- partial truths that are now misleading
- stale local context that no longer matches the current task
That material does not automatically disappear. It stays in the system and keeps influencing later behavior.
I have seen this very clearly in my own workflow. When I start a fresh session, skills like brainstorming in superpowers tend to work much more cleanly. But in a long thread, if I try to start a new feature with the same skill, the process is more likely to drift and stop strictly following the intended flow.
That is not because the skill stopped being good. It is because the surrounding context has become noisy enough to bend the behavior.
So if you are stuck, looping, or patching the same area over and over without real progress, try restarting the session. Or start a subagent and let it handle the next chunk of work in a cleaner context.
This Is Another Argument for Small Steps
This also gave me a much stronger appreciation for the old agile idea of taking small steps.
In agent workflows, that principle becomes very concrete. The more things you try to do inside one session, the faster context grows. The faster context grows, the more performance degrades. And once performance degrades, errors, drift, and wasted motion accumulate quickly.
That makes a small-step workflow much more attractive:
- let one session do a limited amount of work
- keep the scope narrow
- externalize state often
- make handoff easy for a fresh agent
- do not rely on one long conversation to carry everything forever
Externalize State or Pay for It Later
This is why I increasingly save working context into files or other durable places so a new agent can pick things up quickly.
My own setup is simple: I use a small agent-kanban tool that I wrote for myself, plus superpowers. For my use case, that is already good enough.
The important part is not the exact stack. The important part is that the state is no longer trapped inside one chat:
- what the current task is
- what has already been done
- what the next step should be
- what decisions are already settled
- what a new agent needs to know to continue
Once that information is externalized, restarting stops feeling expensive. A session becomes disposable, which is exactly what you want.
The asset is not the chat thread itself. The asset is the state that survives the thread.
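As one way to make that concrete, here is a minimal sketch of a handoff record covering the state listed above. The schema, field names, and the `handoff.json` filename are hypothetical, invented for this example; they are not from my actual agent-kanban tool.

```python
import json
from dataclasses import dataclass, field, asdict

# Hypothetical handoff schema for externalizing session state so a
# fresh agent can pick up where the last one stopped.


@dataclass
class Handoff:
    current_task: str
    done: list[str] = field(default_factory=list)
    next_step: str = ""
    settled_decisions: list[str] = field(default_factory=list)
    notes_for_next_agent: str = ""


def save(handoff: Handoff, path: str = "handoff.json") -> None:
    """Write the state to a durable file outside the chat thread."""
    with open(path, "w") as f:
        json.dump(asdict(handoff), f, indent=2)


def load(path: str = "handoff.json") -> Handoff:
    """Restore the state at the start of a brand-new session."""
    with open(path) as f:
        return Handoff(**json.load(f))
```

Once something like this exists, the chat thread really does become disposable: the new session reads the file instead of inheriting the old session's noise.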
My Two Conclusions
After all this heavy use, I am left with two practical conclusions.
First, Claude Code plus MiniMax Coding Plan is the best-value coding setup I have used so far.
Second, restart sessions often and use subagents aggressively. Do not romanticize long context. Context is not an asset by itself. It only becomes an asset after it has been structured, externalized, and made reusable by the next agent.
That may matter more than which model you choose.
The ideas in this post are mine; Codex helped me write it.
If you'd like to follow what I'm learning about AI tools and workflows, you can subscribe here → Subscribe to my notes