Hi,
Starting April 4 at 12pm PT / 8pm BST, you’ll no longer be able to use your Claude subscription limits for third-party harnesses including OpenClaw. You can still use them with your Claude account, but they will require extra usage, a pay-as-you-go option billed separately from your subscription.
Your subscription still covers all Claude products, including Claude Code and Claude Cowork. To keep using third-party harnesses with your Claude login, turn on extra usage for your account. This will be enforced April 4 starting with OpenClaw, but this policy applies to all third-party harnesses and will be rolled out to more shortly (read more).
To make the transition easier, we’re offering a one-time credit for extra usage equal to your monthly subscription price. Redeem your credit by April 17. We’re also introducing discounts when you pre-purchase bundles of extra usage (up to 30%).
We’ve been working to manage demand across the board, but these tools put an outsized strain on our systems. Capacity is a resource we manage carefully and we need to prioritize our customers using our core products. You will receive another email from us tomorrow where you’ll have the ability to refund your subscription if you prefer.
In other words this is about Anthropic subsidizing their own tools to keep people on their platform. OpenClaw is just a good cover story for that. You can maximize plans just as easily w/ /loop. I do it all the time on max 20x. The agent consuming those tokens is irrelevant.
For what it's worth I don't use OpenClaw and don't intend to, but I do use claude -p all the time.
You are paying to be using that limit some of the time. There are 5 hour windows when you are sleeping and can't use it. There are weekend limits.
Theoretically you can max out every 5 hour window, but they lose money on that.
It's structured so users can have bursts of unlimited usage, and spend ~15% of the theoretical max cap, and that's still cheaper than a subscription for that user.
An OpenClaw user can use 6, 7, 8 times what a human subscriber is using.
Ah, to be human!
I grew up in an Asian household of six. We definitely took food home at AYCE places. My parents definitely knew it wasn't OK, but they felt like they were gaming the system (like a dubious life hack of sorts) and saving money, so they were actually quite proud of it, bragging to friends how much they were able to get.
To be human indeed!
Goes to show just how fragile a high-trust society is. Theft and corruption can easily be normalized to such an extent that not participating gets reframed as immoral.
If the factory is yours, then everything inside is yours ;)
But it's funny how low wages under the broken Soviet economic system turned such things into a semi-official, informal work perks, allowing people to make ends meet.
I wouldn't call it "funny" though. It was quite sad and I'm glad it's over.
That said, the general unavailability of everything was caused by an incompetent government rather than by the system itself, but the system itself produced that government. My point is that a succession of demagogueries hiding personal interests caused the recurring and unrecoverable tragedies of that state. Being controlled and misguided is not exclusive to any particular government or political system.
I don't think communism is a good form of government and I don't think the soviet union was marching the right way.
But the biggest blunders came from other, much more serious mistakes caused by politicians ignoring science, like the great famine and many others, including the Chernobyl fiasco.
I guess if it’s your moral obligation to steal from the workplace it reframes it somewhat.
No, there is a weekly limit as well. Maxing out a single 5h window uses ~10% of the weekly limit
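Taking the parent's ~10% figure at face value (it's from the comment, not anything official), the implied ceiling works out like this:

```python
# Back-of-envelope on the claim above: one maxed 5-hour window costs
# ~10% of the weekly limit. The 10% figure is the commenter's estimate,
# not official documentation.

HOURS_PER_WEEK = 7 * 24          # 168
WINDOW_HOURS = 5
WINDOW_WEEKLY_COST = 0.10        # fraction of weekly limit per maxed window

windows_per_week = HOURS_PER_WEEK / WINDOW_HOURS   # 33.6 windows in a week
maxable_windows = 1 / WINDOW_WEEKLY_COST           # 10 windows before the cap
usable_fraction = maxable_windows / windows_per_week

print(f"~{usable_fraction:.0%} of the theoretical 24/7 max is reachable")
```

So under that assumption a subscriber can only ever extract about 30% of the theoretical 24/7 maximum, which is at least the same order as the ~15% figure mentioned upthread.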
https://code.claude.com/docs/en/scheduled-tasks
Perhaps people at Anthropic should ask Sonnet (or Kimi, it's much better value) how power laws and Pareto distributions work. You are advertising to people who can justify a virtually unlimited amount of tokens; why is it surprising that they would use as many as you're offering them in the plan?
PS: interesting that you'd use a throwaway account to post this
If you manage developers or product folk, do you allow them to work when you're not looking over their shoulder? All developers can be managers/team leads now. You plan, you delegate, you review.
You're welcome to not do this, and surely that's appropriate in quite a few areas of work, but many of us do it because we can get more work done than if we were micromanaging every line of code change. For startups, where a bit of quality can suffer in favor of finding market fit, this is huge.
This is just the morning ones, and it saves shitloads of time clicking around from tool to tool, freeing up time for the thinking and deciding.
They could easily structure their limits to enforce that kind of pattern fairly on both human and automated users. They could e.g. force a cooldown period between your daily activity bursts, by decreeing that continued heavy use on a 24h basis would count exponentially more towards your limit. That would be transparent and force the claws to lighten their load below that of a typical human user. We're talking about a company that's worth hundreds of billions of dollars and targeting highly sophisticated enterprise users, not consumers; it's just not credible that they'd be technically unable to set that up.
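A minimal sketch of that hypothetical scheme (every name, constant, and the penalty curve here is invented for illustration): tokens consumed in the trailing 24 hours make each new token count more against the limit, so human-style bursts stay cheap while 24/7 saturation exhausts the budget quickly.

```python
# Hypothetical limiter: sustained heavy use over a rolling 24h window
# counts super-linearly toward the cap. All names/constants invented.

import time
from collections import deque

class BurstFriendlyLimiter:
    def __init__(self, daily_budget=1_000_000, exponent=2.0, window_s=86_400):
        self.daily_budget = daily_budget   # weighted tokens per 24h
        self.exponent = exponent           # >0 penalizes sustained load
        self.window_s = window_s           # rolling window, seconds
        self.events = deque()              # (timestamp, weighted_tokens)

    def _rolling(self, now):
        # Drop events that fell out of the 24h window.
        while self.events and self.events[0][0] < now - self.window_s:
            self.events.popleft()
        return sum(t for _, t in self.events)

    def charge(self, tokens, now=None):
        """Record usage; return True while still under the budget."""
        now = time.time() if now is None else now
        used = self._rolling(now)
        # Multiplier ramps from 1x (idle) toward 2**exponent (saturated).
        multiplier = (1 + used / self.daily_budget) ** self.exponent
        self.events.append((now, tokens * multiplier))
        return self._rolling(now) <= self.daily_budget
```

With exponent=2, a bursty user who spends half the budget in one sitting pays close to face value per token, while a claw that keeps the trailing 24h hot pays up to 4x per token and hits the wall several times sooner.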
Even the API pricing is subsidized by investors. When that stops, pricing will escalate.
Most estimates I've seen have shown that API usage seems to be at least unit profitable (paying for infra and electricity, not R&D)
The issue is, and always will be, competing views on what these services are for. Most see them as augments of their normal everyday workflow. Others see them as the tool that lets their creativity flow as fast as their thoughts do. The problem is the service is more than capable of catering to both, but the creative vibe commander will hit those limits far faster. Simply telling them to “take a break” is akin to those video game screen nags that developers were forced to put into games to remind people to pee.
I downgraded from my $200 a month plan to my $20 plan and hit limits constantly. I try to use the API access I purchased separately, and it doesn't work with Claude Code (something about the 1 million context requiring extra usage), so I have to use Continue. Then I get instantly rate limited when it's trying to read 1-2 files.
It just sucks. This whole landscape is still emerging, but if this is what it's like now, pre enshittification, when these companies have shitloads of money - it's going to be so much worse when they start to tighten the screws.
Right now my own incentive is to stop being dependent on Claude for as much as I can as quickly as I can.
Either you get a flat rate fee based on certain allowed usage patterns or everyone has to be billed à la carte.
Your comparisons are all also "unlimited" situations to Claude's very much limited situation. You can't buy a plan for Claude that is marketed as being unlimited. They're already selling people metered usage. They're just also adding restrictions on top of that.
So they further restricted the metered caps, which were only ever offered on the assumption that not many people would actually reach them.
Simple as that.
Then they should figure out how to structure an offering that accommodates this type of usage not just blanket ban it
They did, didn't they? You can pay the non-plan rate.
> not just blanket ban it
They didn't do that. The email specifically tells you how to use Openclaw with Anthropic. There is no "blanket ban".
You got that right; in this case they are signalling that AI token providers are not going to be able to run at a profit anytime soon.
Not sure if that helps or hurts your argument, though.
Not all power users. Some re-invent the wheel and/or do things inefficiently, and in most cases there's no business incentive to adapt the service to fit the usage patterns of those users, or of other users that deviate from the norm in regards to resource usage.
Because it is clear that there is a market demand for it.
Yes, and that's exactly the problem I'm pointing at.
Your comment "that people would love an offering without the discussed restriction" ignores the pricing burden of that, which is why the question of why Anthropic doesn't just offer this is confused.
The API has no restrictions; what is people's objection to that?
I've had to unwind "unlimited" within startups that oversold. I've been bit by ISPs, storage providers, music streamers, fuckin _Ubers_, now AI subscription services, that all dealt in "unlimited". None of them delivered in the long run.
I'd be mad at Anthropic if it weren't for the fact that my experience now can see this sort of thing from a mile away. There are a lot of folks, even on HN, that haven't been around for as long. I understand the outrage. I've been there. But these computers cost money to run, and companies don't operate at a loss in the fullness of time.
Once you know that unlimited trends towards limited, the real question is whether we're equipped as a society to deal with the fact that the capital-L Labor input to the economic equation is about to be replaced with a Capital input for which only a handful of companies have a non-zero value.
Reminds me of when AT&T had a fake 5G decoration on phones.
"AT&T won’t remove fake 5G logo even after ad board says it’s misleading"[0]
You can just get away with lying. That's the level of enforcement that exists against unethical behavior in business today.
0. https://www.theverge.com/2020/5/20/21265048/att-5g-e-mislead...
But now you might get things like “unlimited” 1Gbps… which reverts to 10Mbps (1% speed) or worse after 3.6TB (eight hours). And so your new theoretical maximum is about 6.8TB per month rather than 330TB.
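The arithmetic holds up; a quick check, assuming decimal units and a 30-day month:

```python
# Sanity-checking the figures above: "unlimited" 1 Gbps that throttles
# to 10 Mbps after 3.6 TB. Decimal units; 1 Gbps = 125 MB/s.

FULL_RATE = 125e6 * 3600        # bytes/hour at 1 Gbps: 450 GB/h
THROTTLED_RATE = 1.25e6 * 3600  # bytes/hour at 10 Mbps: 4.5 GB/h
MONTH_HOURS = 30 * 24           # 720 hours

cap = 3.6e12                          # throttle kicks in after 3.6 TB
hours_at_full = cap / FULL_RATE       # 8 hours at full speed
throttled = (MONTH_HOURS - hours_at_full) * THROTTLED_RATE

unthrottled_max = MONTH_HOURS * FULL_RATE   # ~324 TB if never throttled
real_max = cap + throttled                  # ~6.8 TB with the throttle

print(f"unthrottled ~{unthrottled_max/1e12:.0f} TB, "
      f"throttled ~{real_max/1e12:.1f} TB")
```

So the throttle cuts the effective monthly ceiling from roughly 324-330 TB to about 6.8 TB, a ~98% reduction on a plan still marketed as 1 Gbps.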
Not the best example. The upkeep cost of a gym is pretty flat regardless of how much people use the facilities. Two people can't use a single machine at the same time to make it wear out twice as fast. The price of memberships is not correlated to usage; it's inversely correlated to the number of memberships sold.
The machine doesn't care about the number of people using it. If it's constantly being used, it will wear out faster. You are conflating "we price based on expected under-utilization" with "costs don't scale with usage." Those are different things.
The inverse correlation you talk about isn't relevant here - People buy gym memberships intending to go, feel good about the intention, and then don't follow through. The business model is built on that gap. That's pretty specific to fitness and a handful of similar industries where aspiration drives purchase.
Anthropic doesn't sell based on a "golly gee I hope people dont use this" gap - they sell compute. Different business.
There is nothing anywhere hinting at that.
They don’t sell compute. They sell a subscription for LLM token budgets that they hope people don’t use because the compute is vastly more expensive than what they charge or what users are ever willing to pay.
Especially with enterprise subscription plans the idea is for customers to never utilize anywhere close to their limits.
Yeah, but there's an absolute limit to that, beyond which the cost doesn't keep increasing. Beyond that point, the QoS goes down (queues).
>You are conflating "we price based on expected under-utilization" with "costs don't scale with usage."
I'm not conflating anything, I'm responding to what you said:
>If gyms faced a situation where people would go and spend 18 hours working out every day for a month, they would probably change how they billed things.
Why would a gym need to change how they bill things if all their customers were aiming for maximal utilization, when their costs would barely see any change? I doubt your typical gym operates on razor-thin margins.
Setting that aside, even if we accept your argument that gym costs barely scale with usage, then that makes gyms a bad comparison case for Anthropic, whose costs directly scale with usage. You can't use the gym model to defend Anthropic's pricing decisions if the two cost structures are nothing alike.
I'm arguing that both gyms and Anthropic have costs that scale with usage, but the gym business model assumes a large margin of under-utilization, and there's a hard cap on what a "power user" can consume - I think neither of those extremes applies to Anthropic's situation. Under-utilizers aren't paying for AI; they have a free tier. There's also a natural ceiling on how much any one person can use a gym. There's no equivalent constraint on API usage.
Yes. In fact I remember hearing about a gym which offered a flat-rate pricing model but explicitly excluded certain professions from partaking in it. I remember the deal excluded police, bouncers, models, actors and air stewardesses. They had a separate, more costly tier for these people. (And I think I heard about it from the indignation the deal had caused online.)
Am I? I think you read something into my comments that I didn't write.
Sure they do. Free tiers suck. I may not always need to use AI, but when I need it, I don't want to immediately get hit by stupidly low quotas and rate limits, or get anything but SOTA models.
What do you expect them to do? You are looking at a business currently running at a loss, and complaining about their billing even though this is not a price-rise?
Unrelated, but is it still possible to use $10k/m worth of tokens on their $200/m plan?
Internal projections show the company reaching cash-flow break-even in 2028, after stopping cash burn in 2027.
They’ve already implemented several of the features that put OpenClaw on the map.
I don't know what that means in this context.
> Internal projections show the company reaching cash-flow break-even in 2028, after stopping cash burn in 2027.
What does that have to do with them implementing restrictions on their plans because they are currently running at a loss?
Okay, let's say their internal projections[1] are accurate: were those made before or after OpenClaw was released? Maybe their projections were made on the assumption that people would stop using $10k/m worth of tokens on a $200/m plan? Or that those users doing that will only be doing code? Or that the plan users won't be running requests at a rate of 5/minute, every minute of every hour of every day?
--------------------------------
[1] Where did you find those projections? I'm skeptical, at their current prices and current plans, that break-even at any point in the future is possible unless they shut off or severely scale down training. Running at a per-unit loss means the more you sell, the larger your loss.
I'm sorry, is there anything even close to Sonnet, much less Opus, that can be run on a 4080? Or in 64GB of RAM, even slowly?
* Weird thing of the day: https://huggingface.co/Jackrong/Qwen3.5-27B-Claude-4.6-Opus-...
This typically results in a ban for TOS violations after a few windows in a row on a Claude subscription.
I got neither a warning nor a ban nor anything - and that was with the double token amount during those days.
So I don't see human usage being something they ban for TOS violation, like you describe. But as always YMMV.
Do you have any proof of your statement?
Then it's not priced correctly. As I said, you can do all of this without OpenClaw; Claude Code ships with everything you need to maximize the limits.
I mean, you can. Electricity is already sold that way. Subscribers with uncharacteristic usage spikes don't get blackouts, they get a slightly larger bill, and perhaps get moved up a tier.
It's just how hyperscaling works. You are not wrong, but in the wrong timeline.
Just because outliers can be money-losing doesn’t mean you should raise the price for everyone.
If they are losing money then it's not priced correctly. That's what I responded to.
Yes, subscriptions work as you say. Plenty of people under utilize subscriptions from prime, to credit cards, to netflix. But if they lost money overall, they too would raise prices. Because that's how economics works. Shortage of capacity, high demand, raise prices until equilibrium.
There's other knobs beyond ToS. They just didn't choose those options.
Raising prices is a bad strategy if you have a small base of users that cost enormously more than the rest. With a million users that cost $1 each and one user that costs $10 million, charging everyone $10 at "equilibrium" screws over almost all of your users. The $20/month sub price is basically just not trying to capture the OpenClaw users; it doesn't make sense for all of the vanilla Claude users to subsidize them (and in fact it wouldn't even work, because they would just go to Gemini or ChatGPT if your cheapest paid plan were priced to subsidize the other users).
Just a few years ago this was the standard business model for startups: attract VC money, offer plans at a loss, capture a huge market, boil the frog with incremental price increases to become profitable.
Companies like Uber wouldn't have been anywhere near as successful if they had been forced to make a profit from day one.
This makes zero sense. I'm paying to use that limit all of the time. If that's too much for Anthropic, they are free to lower the limits or increase the price. Claiming otherwise would be false advertising.
The erosion of the norm of things doing what they advertise rather than being weasel-worded BS is particularly unfortunate, and leads to claims like this.
Whether it's human token use, or future OpenClaws
I think an LLM trained to communicate in telegram style might even be faster and way cheaper.
https://www.tbench.ai/news/terminus
.- -. -.. / .. --..-- / ..-. --- .-. / --- -. . --..-- / .-- . .-.. -.-. --- -- . / --- ..- .-. / -. . .-- / - . .-.. . --. .-. .- -- -....- -... .- ... . -.. / --- ...- . .-. .-.. --- .-. -.. ...
Terse.
This mainly just affects hobbyists.
I’m glad they give us the leeway to experiment, and I’m also glad they weed the garden from time to time. To switch metaphors, I’m deeply frustrated when my very modest, commuter-grade use gets run off the figurative highway by figurative hot-rodders. It’s been extra bad this week, and it’s about time they reined it in a little.
You’re always welcome to pay-as-you-go for as many tokens as you’d like to burn on their infrastructure… or to compute against any of the wide array of ever-improving open models on commodity compute providers…
That's an interesting way of phrasing it - so is there a way to use the quota that's not 'abuse'? MCP/Claude Code seems to be what they want you to use it with - are loops or ralph abuse as well?
> is there a way to use the quota that's not 'abuse'?
I think my answer is “no.” In that I’ve never thought of the limits as “quotas,” and I don’t think I’ve heard Anthropic speak of them that way. Quotas are to be used up, while limits are to signal that what you’re doing is outside the envelope of acceptable use. Quotas are to be met, limits are to be avoided.
I interpret the intention of the subscription, like a membership at a makerspace, to be to allow novices to experiment with stuff, to take on personal-scale projects, to allow them to learn without having to understand the tool’s economics upfront. To play without fear of expensive mistakes.
And, like the makerspace, it can only offer generous limits to the extent that most of us rarely bump up against them. If you’re doing production runs in the makerspace, you’re crowding out the other members, and something’s gotta give.
To the extent that we do bump against the limits during “ordinary” use—and we do with Claude Code, especially those of us around here—it’s really frustrating. The limits need to rise in order for it to remain attractive to casual users like me, the economics still need to add up for the subscription program as a whole, and part of that is separating out what patterns of use belong under a different regime.
If these harnesses or OpenClaws or whatever stop making sense as soon as they have to pay their actual costs, then that’s a pretty good sign they’re abusing the spirit of the subscription.
But Anthropic seem more than happy to service those uses via the API or metered usage, and even to sweeten the deal with more reliable access and bulk discounts. I certainly wouldn’t characterize the same automated usage as “abuse” via that channel.
Fair enough.
>> But that’s not what the subscription product is for.
This was the point I was trying to make - I pay for XX tokens/usage. But somehow using them all is 'taking advantage' ?
BTW - I'm actually not complaining about the limits - I probably only use half my tokens in an average week. I'm just annoyed at having to jump thru hoops if I want to try something 'API' oriented. For me, AI is still the new shiny - I try all sorts of different things, learning/playing. There was an article posted today about writing agent harnesses. That could be interesting - maybe I want to try my hand at it. But then I've got to mess around/pay extra to _try_ something that my subscription already easily covers.
[added:] >>to take on personal-scale projects, to allow them to learn without having to understand the tool’s economics upfront. To play without fear of expensive mistakes.
This is exactly what I'm trying to do - however, as soon as you want to try anything 'API' oriented, the 'fear of expensive mistakes' comes right back.
More users spinning up OpenClaw means that balance starts to shift towards more users maxing their tokens, thus the average increases, so I think their explanation makes sense still.
So they profit overall if I use all my tokens either way? Again, I understand usage limits - I just don't understand why some usage is 'good' and some 'bad' if I'm using the same either way.
>>More users spinning up OpenClaw
I'm pretty sure that's a small percentage of overall users, and probably skewed towards the very people that would be recommending/implementing your model for work/businesses. Seems like that would be the group you want to encourage/cultivate?
I wonder if anyone else has experienced this?
Perhaps because your Claude agent usage is not representative of the average user, and closer to the average OpenClaw user levels...
Basically: spin-up in the morning eats a lot of tokens because the cache is cold. This has actually gotten worse now that Opus supports a 1M-token context.
So: compact before closing up for the night (reduces the size of the cache that needs to be spun up); and the default cache life is 5 minutes, so keep a heartbeat running when you step away from the keyboard to keep the cache warm.
Also, things like web-research eat context like crazy. Keep those separate, and ask for an md report with the key findings to feed into your main.
This is not an exhaustive list and it's potentially subtly wrong sometimes. But it's a good band-aid.
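The heartbeat trick above can be sketched as a tiny loop. The `claude -p "ping"` invocation is an assumption (any cheap prompt works), as are the function name, the 240-second default, and the 5-minute TTL it's tuned against:

```python
# Keep-warm heartbeat: re-invoke the model just inside the cache TTL so
# the prompt cache stays warm while you're away from the keyboard.
# The default command is a hypothetical example, not an official recipe.

import subprocess
import time

def keep_cache_warm(cmd, interval_s=240, beats=None):
    """Run `cmd` every `interval_s` seconds (inside the ~5-min cache TTL).

    `beats=None` loops until interrupted; an integer runs that many
    heartbeats (handy for testing). Returns the number of beats sent.
    """
    sent = 0
    while beats is None or sent < beats:
        subprocess.run(cmd, check=True, capture_output=True)
        sent += 1
        if beats is None or sent < beats:
            time.sleep(interval_s)
    return sent

# e.g. keep_cache_warm(["claude", "-p", "ping"])  # hypothetical usage
```

Mind that each heartbeat itself burns a few tokens, so this only pays off when re-warming a large cold cache would cost more.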
https://news.ycombinator.com/item?id=47616297
Know what's funny? OpenClaw might actually burn fewer tokens than a naive Claude Code user, if configured correctly. %-/
And I'm skeptical of the 6x-8x claim myself. They'd have to explain that in more detail.
With data, it's an engineering target.
They could just 429 badly behaved clients.
Power users always cost these services more than they pay, and OpenClaw turns every user into a power user. A recalculation was rational.
From Anthropic's perspective, everyone pays to be in bins with a given max.
And to everyone's benefit, there is a wide distribution of actual use. Most people pay for the convenience of knowing they have a max if they need it, not so they always use it.
So Anthropic does something nice, and drops the price for everyone. They kick back some of the (actual/potential) savings to their customers.
But if everyone automates the use of all their tokens Anthropic must either raise prices for everyone (which is terribly unfair for most users, who are not banging the ceiling every single time), or separate the continuous ceiling thumpers into another bin.
That's economics. Service/cost assumptions change, something has to give.
And of the two choices, they chose the one that is fair to everyone. As opposed to the one that is unfair (in different directions) to everyone.
From the email:

> but these tools put an outsized strain on our systems. Capacity is a resource we manage carefully and we need to prioritize our customers using our core products
OpenClaw doesn't put any more strain on their systems than Anthropic's own tools do. They just happen to have more demand than they can serve, and they benefit more when people use their own tools. They just aren't saying that explicitly.
It has nothing to do with fairness or being nice.
which said customer paid for. And now they want to back out of it because their assumption that users wouldn't actually do that turned out to be wrong.
I say they ought to be punished by consumer competition laws - they need to uphold the terms of the subscription as understood by the customer at the time of the sign up.
Except when people start using OpenClaw, and the distribution narrows (to that of a power user).
I hate companies that oversell capacity but hide it in the expected usage distribution. Same goes for internet bandwidth from ISPs (or download limits - rarer these days, but they exist).
Or airplane seats. Or electricity.
Except they charge you less because of the distribution. Competition for customers doesn't evaporate.
They might charge you less, but they don't have to, and they won't if the market allows it.
That's a "fixed" constraint, because maximizing future adjusted value is what companies do.
So they don't play little games with mass products. If they did they would be harming their own bottom line/market cap.
(For small products, careful optimization often doesn't happen, because they are not a priority.)
Note this thesis explains what is going on here. What was previously one kind of customer (wide distribution of use), is now identifiably two. The non-automated token maxers (original distribution) and automated token maxers (all maxed, and growing in number). To maintain margins Anthropic has to move the latter to a new bin.
But the customer centric view also holds. By optimizing margins, that counter intuitively incentivizes reduced pricing on lower utilized products. (Because margin optimization is a balance to optimize total value, i.e. margins are not the variable being maximized.)
The alternatives would be bad for someone. Either they under-optimize their margins, or they charge regular customers more, which is unfair. Neither of those would be a rational choice.
(Fine tuning: Well-run companies don't play those games. But companies with sketchy leaders do all kinds of strange things, primarily because they are attempting to manage contradictory stories in order to optimize their personal income/wealth over the company's. But I don't see Anthropic in that category.)
Instead, you can prioritize people "earnestly" bursting to the usage limits, like the users who are actually sitting at their computer using the service over someone's server saturating the limit 24/7.
The goal is to have different tiers for manual users vs automated/programmatic tools. Not just Anthropic, this is how we design systems in general.
When your least automated, most interactive users are competing for capacity with fully-automated tools, let's say, you're forced to define some sort of periphery between these groups.
OpenClaw is a self-directed, automated loop that sits on a server. It's wowing its owner by shitposting on moltbook and starring in any number of crazy stories you can find online that amount to "omg I can't believe my self-directed claude loop spent all day doing this crazy thing haha."
On the other end of the spectrum is someone using Claude.app's interface.
And then in the middle, you can imagine "claude -p" inside a CI tool that was still invoked downstream of a user's action. Still quite different from the claude loop.
I'm sorry, but this framing just doesn't make sense.
- The intention of subscriptions, as anywhere, is a combination of trying to promote brand loyalty, and the gym membership model of getting people to pay for oversubscribed resources that many will never use. As the parent noted, people maxxing out their allowed usage, for whatever reason, are not the most profitable customers, and in this case probably not profitable at all
- OpenClaw is now owned by a competitor, OpenAI, and Anthropic are trying to compete in this space
https://www.semafor.com/article/04/03/2026/anthropic-eyes-it...
- Anthropic are capacity constrained, having sensibly chosen to err on the side of safety (not going bankrupt), and are now trying to do the best they can to manage that.
Presumably they might be acting differently if they had capacity to spare, but even then helping a competitor to build market share in a potentially lucrative segment doesn't make strategic sense.
I do wonder about the wisdom of Anthropic promoting usage-maxxing development patterns such as running a dozen agents in parallel ... maybe not the wisest thing to do when capacity constrained! It would make more sense to promote usage at night with low priority "batch jobs" rather than encourage people to increase usage during periods of maximum demand.
Do you have an example of how this is how they have advertised or sold the plan? I don’t recall ever seeing any advertisement that their plan is simply pre paying for tokens.
As you said, I would imagine where the token usage comes from is irrelevant - you generate the same load whether you do it from Claude Code or some other agent. So it seems like the rules have more to do with encouraging Claude Code usage than with Claude model usage.
OpenClaw just happens to also get telemetry, of probably higher value, out of the same tokens. It also happens to be owned by their competitor.
edit: I'm wrong OpenClaw surprisingly doesn't collect telemetry. Good for them.
At least that’s my read. I don’t believe it is nefarious
These agents (Claude Code/Cowork/claude.ai) are separate from raw model tokens, and Anthropic wants to discount usage of its own products.
The subscription they sell is a package of these products, not tokens. They never sold token subscriptions, so why do we need to equate tokens with the subscription? Fundamentally, they never meant to sell raw token usage in that subscription, just like any other SaaS company that sells API usage separately from its apps.
Nothing beyond fumbling the PR around it.
This is so wrong.
The subscription is to Claude (the app, Claude code, etc) not the API.
Anthropic subsidizes Claude code because they collect a ton of super useful telemetry and logs so they can improve… Claude code.
Wanting to pay for a subscription to Claude and treat it like an API discount is like going to an all you can eat buffet and asking them to bring unlimited quantities of raw ingredients to you so you can cook at home. Ok, not a perfect analogy, but you get the idea.
You just paraphrased my argument
(Maybe I'm just being paranoid here).
I haven't even heard of claude -p before your comment.
OpenClaw is for sure not just a good cover story. Or rather, it's the public face of the broader issue of automated tool workflows.
I don't think they are bothered too much about other frontends who do the same as claude code.
I mean, humans sleep and do other things than work, so they likely don’t hit their weekly limits or their 5 hour limits every single 5 hour chunk :)
If you max out your token limits, you are costing Anthropic more than you are paying them. They only expect a small percentage of their users to do this, but OpenClaw changed the dynamic.
Anthropic knows that they will lose more users by lowering limits than they will by blocking OpenClaw, because OpenClaw users will overwhelmingly switch to API pricing, while chatbot users will leave for competitors with higher limits.
They are a business. They hope to become profitable. This was the correct move.
It’s a shame they do all this sketchy stuff. I switched to Codex; I’ve had enough of their BS.
I’m pretty happy knowing that it supports my development workflow for a week at a time. Recent features like the built-in browser in Claude Desktop, Cowork, Claude in Chrome, and remote control matter way more to me than the number of tokens. But that’s me.
It also depends on their target ICP, which they are free to define. Is it those users maxing out tokens per dollar? I have the feeling there are even better alternatives on the market for them right now.
For many it doesn't. It's opaque, it changes, and they bury the news in fucking twitter. https://x.com/trq212/status/2037254607001559305
There's a lot to love about Anthropic. But man do they suck at PR.
Subscriptions are crazy subsidized.
So you can’t use OpenClaw, OpenCode, etc. because they take you outside their applications/lock in and their ability to easily monetize in the future.
Second, OpenAI is burning UNIMAGINABLE sums of money. Three days ago they raised $122 billion [2], the largest funding round in history. By comparison, Anthropic has emphasized a more capital-efficient approach, with a ~30% burn rate. [3]
[1] https://x.com/sama/status/2023150230905159801
[2] https://openai.com/index/accelerating-the-next-phase-ai/
[3] https://www.wsj.com/tech/ai/openai-anthropic-profitability-e...
it's obvious they will tighten everything and raise prices for years to come
If Anthropic miscalculated the amount of tokens, or simply pushed too hard to capture market share, that is a costly mistake because people in this market are very sensitive to price hikes.
They have to be honest about what they can offer for $200. Sure, people don't max their subscriptions but when they're large they make the best of it, or they will likely cancel it. The typical subscription works well below capacity because it's cheap enough that the optionality may be worth it. $200 is not the typical subscription.
Their expectation must have been a human using the service at a human capacity.
This is different from an automated agent orchestrating a ton of different agents at the same time doing a lot of things.
There is a difference.
They already have the regular subscription plans (Pro, Max) and a separate billing process for direct API usage. They could absolutely introduce another type of plan optimized for this kind of usage, or just accept that it's a dumb pipe that is being paid for. These random, arbitrary limitations just make things more confusing and are a bad plan for the future.
Clawdbot was clearly against the Consumer Terms of Use the whole time, they’ve just started actively detecting and blocking it.
> Except when you are accessing our Services via an Anthropic API Key or where we otherwise explicitly permit it, [it is forbidden] to access the Services through automated or non-human means, whether through a bot, script, or otherwise.
If the actual concern is use pattern, enforce that directly. What we have instead is metered usage + behavioral restrictions + product fragmentation across three separate offerings.
That's not a clean billing philosophy, it's layers of control stacked on top of each other with no coherent logic tying them together.
If subscriptions are for humans and API is for automation, fine. But then don't meter the human product arbitrarily and don't sell a subscription tier for automation while also restricting automation. Pick a lane.
Except it's not. It's a desktop, web, mobile, and CLI subscription product built on top of a usage-based API with a generous token allowance bundled with it. That generous allowance comes with the restriction that those tokens can only be spent through Claude product surfaces. Why would Anthropic offer their API at a loss and subsidize the profits and growth of other businesses?
Sure there is a difference. It's like when most mobile companies wouldn't allow tethering because then people would actually use the service.
You can try to stop that, but people will price in those inconveniences. They will simply learn that the fee pays for much less than the token limit and that the company is enforcing some unwritten limits by adding extra limitations to usage.
We will see it play out.
Isn't that exactly what they just did?
being honest would be to just adjust the limits rather than adding piecewise limitations
but of course with honesty comes that people can actually gauge your product accurately and they may not want that
nobody wants them to add fine print every time users find effective ways to actually use the service to its advertised limits; it only benefits those content to be milked for recurring revenue on sporadic usage while paying handsomely for the privilege
If you had to pay for APIs yourself for any provider then you'd know that SOTA tokens are not cheap, and Claude Code for $100 is almost a too good to be true bargain for what you can get out of it.
I don't think that's accurate for professional users. Personal users, especially those for whom $200/m is a significant cost, will definitely try to get the most out of it.
I know several $200/m users (I'm on the $100 personally), and they've all had the same experience I had when first upgrading to the Max package: initially you try to use it as much as you can and feel like you need to keep it busy. But that goes away after a few days and you use it when you have a need. The primary point of the Max tiers for my peers is to not hit limits during their work when they occasionally use it intensively, because it's disruptive to have to wait X hours to continue.
If you get a benefit from using it, and you bill at $200 an hour, and you work 160+ hours a month, the $200 monthly cost doesn't register as a significant cost and you won't make it determine your usage patterns. I'm sure that'd be different if VC money goes away and it turns out the true price would need to be closer to $5k, but at this point it's similar to your ISP for fiber costing $80 a month. You enjoy the speed for a few days, but then it becomes the new normal.
As a senior engineer, you get an assistant that never gets tired and can do quite a lot on its own. For me, it’s been an eye-opening experience. I used to have a collaborator called M who had good general culture but was not too smart. The calculation going through my mind every time I ask Claude for something is: how much would it cost, in terms of time and effort, to get M to do that? M was a resource that cost many thousands of dollars per month, plus the time I spent correcting and directing, while Claude is actually smarter and does what it is asked with a degree of autonomy and common sense that M could never dream of.
The flipside of the coin is obvious: Anthropic will find a way to claw back - no pun intended - some of this value by raising the cost of subscription. They would be crazy not to.
is claude that good? the last time i tried claude it was sonnet 4.5. it was ok, not worth the api money clearly. but i only use api tokens for llms.
But… anecdotally, Claude is just that good. Gemini needs a lot of hand-holding, and it will still tell you it’s done when it achieved half the work. Or say, “this test isn’t passing, I’ll just delete it”. Every now and then I get tired of it and give the same task to Sonnet 4.6; 5 minutes later I’m done. Bug fixed, UI properly working, React hooks not being conditionally rendered, theme variables used properly. It’s wonderful.
I’m not sure about large agentic work or deep thinking, but I’m mostly automating away the drudgery of dealing with React Native. I still want to do the deeper work myself, but even there Opus is usually a really good sparring partner.
The whole point is that the users can have it doing shit for them instead of them having to babysit the computer.
The fact that users still have to sit there and argue with it erodes their value proposition, the proposition that you can pay fewer salaries.
For now, too many people will use AI for stuff where deterministic, dumb code would be much more efficient.
But the interesting thing is, my actual token usage running agents is way less than people here seem to assume. Most of the time the agent is waiting for tools, reading files, thinking. The bursts are intense but short. I probably use fewer tokens per hour than someone doing a long manual coding session with lots of back and forth.
The real issue for me isn't cost, it's that they can just change the rules whenever. I had to drop everything today to verify my setup still works. That's the tax of building on someone else's platform, I guess.
Sucks to be pushed back to Claude Code with opaque system behavior and inconsistency. I bet many would rather pay more for stability than less for gambling on the model intelligence.
Or maybe I’ll just get a Codex subscription instead. OpenAI has semi-officially blessed usage of third party harnesses, right?
"Developers should code in the tools they prefer, whether that's Codex, OpenCode, Cline, pi, OpenClaw, or something else, and this program supports that work."
https://developers.openai.com/community/codex-for-oss
Obviously, the context is that OpenAI is telling open source developers who are using free subscriptions/tokens from the Codex for Open Source program that they can use any harness they want. But it would be strange for that to not extend to paying subscribers.
It seems that installing claude code directly from npm shields from some of the current issues.
That sounds like their problem, not ours
You can vote with your wallet though. So don’t throw money at them or just deal with it. Plain and simple.
If they expect X money for Y tokens, they'd better provide Y tokens for that X money. If they can't provide that, then change the pricing plans. That's not the user's problem.
Indeed. And this model breaks in several cases that overlaps with the current AI business model:
- marginal cost of incremental usage is too high (Movie Pass)
- adverse selection (all you can eat monthly steak subscriptions)
- demand is synchronized (WeWork)
People want a free lunch. If the API was cheaper than the subscription then everyone would use the API. Instead people flock to an apparently unsustainable price at a fixed monthly rate, presumably subsidized by others who don't use their full capacity every month.
This is (almost) universally true of flat rate subscriptions; but there are usage-billed ones, too (and even those often have an aspect of subsidies).
A great example of the shakeup is when dial-up went from "connect, do the thing, disconnect" to "leave the computer online all the time" - they had to change the billing model because it wasn't built for continuous connections.
My meal kit delivery service doesn't.
You don’t use more tokens than with Claude Code
Is your unlimited 5G plan actually unlimited, or does your download rate drop to dialup speeds after you crest a certain amount of bandwidth usage?
Have you ever had an ISP in a populated area? What's the reliability like? Is it worse during certain times of the day?
The Anthropic subs are likely priced at marginal cost (Amp's CEO recently said as much on a podcast). It just doesn't serve Anthropic to operate as the service layer for OpenClaw.
Customers have their own value calculations. If they can't use Claude for autonomous agents at a reasonable price, they will move to providers that are cheaper and more flexible. An autonomous agent adds way more utility than a marginally better LLM (assuming that's even true).
Honestly, this just looks like what Dylan of SemiAnalysis suggested on Dwarkesh – that they've massively under-provisioned capacity / under-spent on infrastructure.
That would honestly be a comforting answer if true, because I would gladly take 'we can't afford to do this right now' over 'we are self-preferencing, and the FTC should really take a look at us, even if we're technically not a monopoly right now, since we're the only strongly-instruction-following model in town and we clearly know it'.
You can use these tools with most providers today, just without a subscription plan. If you have enough spend, you can likely get bulk deals.
Tell me you have zero clue what a monopoly is or what the law is, without telling me.
Monopoly law relies on broad categories, not narrow ones. You can’t call Microsoft a monopoly because they are the only company that makes Windows. You can’t call Amazon a monopoly because they are the only company that makes AmazonBasics. You can’t call Anthropic a monopoly because their product is 20% better for your use case, otherwise by definition no company has any incentive to do a good job at anything.
Monopoly law is subject to reinterpretation over time, and anybody who has studied its history knows this. The only people who argue for "strict" interpretations of current monopoly law are those who currently benefit from the status quo.
> Monopoly law relies on broad categories, not narrow ones.
And this is currently a gigantic problem. Because of relying on broad categories to define "monopoly", every single supply chain has been allowed to collapse into a small handful of suppliers who have no downstream capacity thanks to Always Late Inventory(tm). This prevents businesses from mounting effective competition since their upstream suppliers have no ability to support such activities thanks to over-optimization.
To be effective on the modern incarnation of businesses, monopoly law needs to bust every single consolidated narrow vertical over and over and over until they have enough downstream capacity to support competition again.
Then don’t make BS up like implying Anthropic is a monopolist for the crime of competence.
> tell me you don't understand how a small quantitative gap can result in a step change in capability
The law does not give a darn about this. Being a good competitive option does not make you a league of your own. If I invent a new flavor of shake, the Emerald Slide, am I a monopolist in shakes because I’m the only one selling Emerald Slides? If you go and then start a local business reselling shakes and I’m your only supplier, am I a monopolist then? Absolutely not.
We have a similar situation in mobile where Apple may not be considered a monopoly, but people have walked around for a decade with a supercomputer in their pocket that is wildly underused.
Things have gotten faster; things are different than they were decades ago when a lot of this was devised.
The reality of the matter is that some of us just want to see innovation actually happen apace, and not see 5, 10, or 30 years of slowdown while we litigate whether or not such a company is holding all the cards, while everyone is collectively waiting at the spigot for a company to get its shit together because we're not allowed to fix the situation.
For what it's worth, I'm hopeful that the other model providers will catch up and put us in a situation where this conversation is irrelevant.
What I'm afraid of is a situation where we see continued divergence, and we end up with another Apple situation.
That is not calling out that they are “absolutely not a monopoly by the law” in any way, shape, or form. You’re framing it as though they aren’t by a technicality, when they aren’t anywhere near discussion by even the most extreme of legal theories. You won’t find Lina Khan or Margrethe Vestager, both ousted for going too far, complaining about Anthropic.
> “We have a similar situation in mobile where Apple may not be considered a monopoly, but people have walked around for a decade with a supercomputer in their pocket that is wildly underused.”
In that we can’t run a Torrent client to download illegally redistributed media 99% of the time? Otherwise, in what way, are they underused? For the degrees of public addiction, a more underutilized phone would be a social benefit.
I'm looking forward. Things are moving very quickly. As I said above, I'm afraid of us diverging into another Apple situation in the future. If I suggest that they should be looked at and thought about, it's not for today, it's for tomorrow. If divergence continues. Because as with everything in AI, it might hit us a lot faster than people expect. Hell, given their approach to morality, I suspect that Anthropic folks have already thought deeply about these sorts of concerns. That's why it's actually a lot more in character for them to be doing this not due to self-preferencing, but due to unaffordability, which - if you look at my first post - is what I said seems to be happening.
Suffice to say that I have a graveyard of things that I think phones could have been, where unfortunately we've ended up with these - as you say - addicting consumerist messes.
Gonna stop here so I don't flood the thread. We're getting very off topic.
I haven't tried it to see if it's any good but it's $20/mo.
I don't think this is particularly about the financial impact of people using OpenClaw - they can adjust the amount of tokens in a subscription quite easily.
I think the root cause is that Anthropic is capacity constrained so is having to make choices about the customers they want to serve and have chosen people who use Claude Code above other segments.
We know Anthropic weren't as aggressive as OpenAI through 2025 in signing huge capacity deals with the hyperscalers and instead signed smaller deals with more neo-clouds, and we know some of the neo-clouds have had trouble delivering capacity as quickly as they promised.
We also know Claude Code usage is growing very fast - almost certainly faster since December 2025 than Anthropic predicted 12 months ago when they were doing 12-month capacity planning.
We know Anthropic has suffered from brown-outs in Claude availability.
Put this all together and a reasonable hypothesis is that Anthropic is choosing which customers to service rather than raising prices.
I doubt they actually want to do this.
They clearly see having a wide set of paying customers as valuable (otherwise they'd just raise prices) but if you are stuck having to make hard choice then I can see the attraction of this approach.
And where’s the difference between the Claude Desktop app and OpenClaw at this point? Anthropic have been hard at work porting the most important features. You can easily shoot yourself in the foot with both now.
OpenClaw and OpenCode are open source projects with zero warranty and nobody to sue if they have an npm Trojan in them.
When has any technology company been sued for pushing accidental malware in their updates?
The reality is that you have never had anyone to sue.
The risk with OpenClaw et al isn't that the software itself is compromised. The risk is that what it does is fundamentally insecure, and Claude Code isn't any better.
Once again, despite everyone's protestations about not anthropomorphising things, LLMs are, to first approximation, best seen as little people on a chip. So with that in mind, it should be obvious why enterprise would prefer dealing with Anthropic's official products than OpenClaw - it's similar to contracting a team of software engineers from another well-known corporation and giving them keys to the castle, vs. inviting in any randos that show up at the door on any given day and can pass FizzBuzz test. Even if, in both cases, these turned out to be the same people, having an organizational/legal-level relationship changes the expectations and trust levels involved.
Anthropic wants you to use their subscription only for Anthropic products.
I don’t think the difference is that difficult to see.
It's pretty clear that they already do continually adjust the amount of tokens in a subscription (at best they offer sort-of estimates of quotas). The same activity exhausts my session quota on one day, yet is a minor contributor on another. They made this very explicit with the "2x" event for the past two weeks, but anyone who uses it knows this is basically an ongoing reality: if you stick to using it off hours, you generally enjoy a more liberal usage grant.
But if they just "adjust the amount of tokens in a subscription", they would be punishing everyone for the outliers. The average normal user has spurts of usage where occasionally they need more and then there are gaps where they use little.
Subscription services rely upon this behaviour, and the economics only work if they "oversell". That's why OpenClaw users want to sneak in under a subscription, because the tokens come at a discounted rate over using the API based upon that assumption, but they are breaking the model because those users aren't conforming to expectations. It's basically the tragedy of the commons and a small number of users want to piss in the well.
Dealing with Claude going into stupid mode 15 times a day, constant HTTP errors, etc. just isn't really worth it for all it does. I can't see myself justifying $200/mo. on any replacement tool either, the output just doesn't warrant it.
I think we all jumped on the AI mothership with our eyes closed and it's time to dial some nuance back into things. Most of the time I'm just using Opus as a bulk code autocomplete that really doesn't take much smarts comparatively speaking. But when I do lean on it for actual fiddly bug fixing or ideation, I'm regularly left disappointed and working by hand anyway. I'd prefer to set my expectations (and willingness to pay) a little lower just to get a consistent slightly dumb agent rather than an overpriced one that continually lets me down. I don't think that's a problem fixed by trying to swap in another heavily marketed cure-all like Gemini or Codex, it's solved by adjusting expectations.
In terms of pricing, $200 buys an absolute ton of GLM or Minimax, so much that I'd doubt my own usage is going to get anywhere close to $200 going by ccusage output. Minimax generating a single output stream at its max throughput 24/7 only comes to about $90/mo.
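For a sanity check on that figure, here's the back-of-envelope arithmetic; both input numbers below are my assumptions (plausible sustained throughput and a plausible blended output price), not Minimax's published rates:

```python
# Back-of-envelope check: one output stream at max throughput, 24/7.
# Both inputs are assumptions, not published Minimax figures.
tokens_per_second = 100          # assumed sustained output throughput
price_per_mtok = 0.35            # assumed $ per million output tokens

seconds_per_month = 60 * 60 * 24 * 30
tokens_per_month = tokens_per_second * seconds_per_month  # 259,200,000
cost_per_month = tokens_per_month / 1e6 * price_per_mtok
print(f"${cost_per_month:.0f}/mo")  # → $91/mo
```

With those assumptions you land right around the ~$90/mo ballpark; real usage has idle gaps, so actual spend would be lower still.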
I must be missing something or supremely lucky because I feel like I’ve never hit these “stupid” moments.
If I do, it’s probably because I forgot to switch off of haiku for some tiny side thing I was doing before going back to planning.
But even when Opus is running healthy, it still doesn't address the underlying issue that these models can only do so much. I have had Opus build out a bunch of apps, but I still find my time absorbed as soon as it comes to anything genuinely exceeding "CRUD level difficulty". Asking it to fix a subtle visual alignment issue, make a small change to a completely novel algorithm, or just fix a tiny bug without having to watch for "Oh, this means I should rewrite module <X>" simply isn't possible while still being able to stand over the work.
It's not to say I don't get a massive benefit from these tools, I just think it's possible to be asking too much of them, and that's maybe the real problem to solve.
2 weeks ago, I had only hit my limit a single time and that was when I had multiple agents doing codebase audits.
They didn’t do a great job of explaining it. I wonder how many people got used to the 2X limits and now think Anthropic has done something bad by going back to normal
That's EXACTLY and ALL I've been doing!
Using Codex and Claude both side by side to view my Godot components framework open source project (link in profile)
Claude has been..ugh.. bad, to put it mildly, on the same content and the same prompts.
On top of that their $20 plan has much higher usage limits than Anthropic's $20 plan and they allow its use in e.g. opencode. So you can set up opencode to use both OpenAI's codex plan plus one of the more intelligent Chinese models so you can maximize your usage. Have it fully plan things out using GPT 5.4, write code using e.g. Qwen 3.6, then switch back to GPT 5.4 for review
For JSON to text formatting it works well on a one-round basis. So I think you should realistically have an evaluation ready to go so you can use it on these models. I currently judge them myself but people often use a smart LLM as judge.
Today, writing an eval harness with Claude is a 5-minute job. Do it yourself so you can keep exploring as the quants of Gemma get better.
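A minimal sketch of what such a harness can look like for a JSON-to-text task. Every name here is made up; the two stub functions mark where a real harness would call the quantized model under test and a stronger judge LLM:

```python
import json

def format_with_model(raw_json: str) -> str:
    # Stub: deterministic formatting standing in for a call to the
    # model under test (e.g. a local Gemma quant).
    record = json.loads(raw_json)
    return f"{record['name']} scored {record['score']}"

def judge(expected: str, actual: str) -> bool:
    # Stub judge: exact match. A real harness would ask a smarter
    # LLM to grade the output semantically instead.
    return expected.strip() == actual.strip()

def run_eval(cases: list[dict]) -> float:
    hits = sum(judge(c["expected"], format_with_model(c["input"]))
               for c in cases)
    return hits / len(cases)

cases = [
    {"input": '{"name": "alice", "score": 7}', "expected": "alice scored 7"},
    {"input": '{"name": "bob", "score": 3}',   "expected": "bob scored 9"},
]
print(run_eval(cases))  # → 0.5
```

Swap the stubs for real model calls and re-run the same cases each time a new quant drops; the pass rate is the only number you need to compare.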
That is, I get more variance between Opus 4.6 and itself than I do between the SOTA models.
I don’t have the budget for statistical relevance but I’m convinced people claiming broad differences are just vibing, or there are times when agent features make a big difference.
Oh no, there's plenty of us willing to say we told you so.
What's more interesting to me is what it's going to look like if big companies start removing "AI usage" from their performance metrics and cease compelling us to use it. More than anything else, that's been the dumbest thing to happen with this whole craze.
Would really love some path forward where the AI parts only poke out as single fields in traditional user interfaces and we can forget this whole episode
My primary interest is using small edge models to perform specific engineering tasks. In this pursuit I do like to use gemini-cli or Antigravity with Claude a few times a week as coding assistants, but I am using relatively few tokens to do this.
I also waste a lot of time, but this is fun time: experimenting with open source coding agents with local models just to see what kinds of results I can get. This is mostly a waste of time, but I enjoy it.
My other favorite use pattern: once or twice a week I like to use the iOS Gemini app in voice mode, and once a month also use video input. I really like this, but it is not life changing.
Externalities matter: I never use frontier LLM-based AI without thinking of energy, data center, and environmental costs.
And video calling did take off: plenty of people use FaceTime, and almost everybody working in an office uses some form of video call. Criticizing the early attempts at video calling because they hadn't taken off yet misses the point (I remember them being advertised on "video phones" with 56k modems); of course someone was going to have the idea and implement it before it was quite practical.
To help with understanding that perspective, I cannot imagine a scenario where I would ask a device connected to the internet to turn off the lights. I literally never wanted this. A physical switch is a 100% non negotiable for me. I feel the same way about non-mechanical car doors.
Perhaps due to that outlook I was always puzzled by the entire idea of an "assistant". It's interesting for me to see that there are people out there who actually want that "assistant".
Ever end up cooking or something when the phone/doorbell rings and you want to pause the music? Have your hands full and wanted to open a door? Hear the weather and then the news as you brew coffee or put your shoes on (without interaction with a bright screen)?
You should save some money and keep some privacy doing it your way :)
Maybe you're a little strange but it cannot be that much of a stretch for you to consider using speech to ask for things.
Not wanting to hide things behind Internet connected computers is fine, being unable to imagine wanting to use your voice to ask for things is a little silly.
I regret paying Google for a one-year AI subscription last spring (although it was a deep discount over the regular $20/month cost) because it has kept me from experimenting with many vendors (but it was a fantastic deal financially).
I just put a reminder on my calendar to try OpenCode zen when my subscription ends.
I’m kind of confused by these takes from HN readers. I could see LinkedIn bros getting reality checked when they finally discover that LLMs aren’t magic, but I’m confused about how a developer could go all-in on AI and not immediately realize the limitations of the output.
I'm "all-in" on AI code generation. I very much realise their limitations; it's like any tool really. I do think they're magic, you just need to learn how to wield the power.
I would like to point out something else. I have Z.ai subscription and they have a dashboard on my usage.
When trying out OpenClaw a while ago, I noticed something worrying. It was constantly consuming tokens, every single hour of the day. Over a period of 30 days I could see token usage climb and climb and climb, then shrink back to the bottom, as if OpenClaw did a context-window compaction.
Note, this usage was happening even though I wasn’t using it. It was always running and doing something in the background.
I believe it’s their Heartbeat.md mechanism. By default it’s set to run every half an hour. I changed it to twice a day, and that was enough for me.
I can imagine that if thousands of users were connecting their OpenClaw instances with the default config to Claude with the latest and greatest Opus model, Anthropic must have felt it.
OpenClaw was still using Claude Code as the harness (via claude -p)[0]. I understand why Anthropic is doing this (and they’ve made it clear that building products around claude -p is disallowed) but I fear Conductor will be next.
[0]: See “Option B: Claude CLI as the message provider” here https://docs.openclaw.ai/providers/anthropic#option-b-claude...
Imagine not being able to connect services together or compose building-blocks to do what you want. This is absolute insanity that runs counter to decades of computing progress and interoperability (including Unix philosophy); and I'm saying this as someone who doesn't even care for using AI.
The disrespect Anthropic has for their user base is constant and palpable.
All AI prices will rise soon - probably shortly after the IPOs. The new prices will be eye-watering compared with today’s. This billing change is lengthening the time until Anthropic has to raise subscription prices, so those of us who aren’t doing 24hr claw stuff can continue to use the tools the way we’ve gotten used to.
If you want unrestricted and unlimited usage, it's available through the API. Complaining about the subscription like this is basically saying, I want what you're offering, but I demand it for cheaper than what you charge for it. That doesn't make any more sense here than it does at the grocery store.
I'm not sure what to say. You're either listening to the actions of these companies, or you're not in a place where you feel the need to be concerned by their actions.
I'm in a place where I'm concerned by their actions, and the impact that their claims and behavior have on the working environment around me.
Honest question from my end, I try to not read every AI related news that keeps telling me “it’s over, good luck feeding your family in 9-12 months”.
Or are you also upset about the modern plight of the telephone operator, farrier, or coal miner?
It is not a class of labor ... it is all digital labor. Do you or do you not understand this?
It is digital knowledge itself, and then all communication labor, and then all physical labor with robotics.
Is this clear to you?
Marx's whole idea of Communism was predicated on his assumption that industrialization would lead to a post-scarcity society requiring virtually no work and an overhaul of how everything was owned and produced. Boy, was he wrong.
But OpenClaw is not a product. It's just a pile of open source code that the user happens to choose to run. It's the user electing to use the functionality provided to them in the manner they want to. There's nothing fundamental to distinguish the user from running claude -p inside OpenClaw from them running it inside their own script.
I've mostly defended Anthropic's position on people using the session ids or hidden OAuth tokens etc. But this is directly externally exposed functionality and they are telling the user certain types of uses are banned arbitrarily because they interfere with Anthropic's business.
This really harms the concept of it as a platform - how can I build anything on Claude if Anthropic can turn around and say they don't like it and ban me arbitrarily.
Where it leaves me is sort of like the DoD - nobody should use Claude for anything. Because Anthropic has set as principle here that if they don't like what you do, they will interfere with your usage. There is no principle to guide you on what they might not like and therefore ban next. So you can't do anything you want to be able to rely on. If you need to rely on it, don't use Claude Code.
And to be clear, I'm not arguing at all against using their API per-token billed services.
Try this one: https://code.claude.com/docs/en/overview#run-agent-teams-and...
Or perhaps: https://code.claude.com/docs/en/overview#pipe-script-and-aut...
You know what they say about looking and quacking.
If yes, why do Anthropic provide this cli flag?
When they shut down opencode, I thought it was a lame move and was critical of them, but I could at least understand where they were coming from. This, though, is ridiculous. Claude's core tools are still being used in this case. Shelling out to it is no different from what a normal user would do themselves.
If this continues, I'll be taking my $200 subscription over to OpenAI.
OpenAI will soon do the same thing, don't be delusional.
When this happens I will have to look at other providers and downgrade my subscription. Conductor is just too powerful to give up. It’s the whole reason why I’m on a max plan.
EDIT: confused by downvotes. In this thread people are saying it runs on top of `claude -p` and others saying it's on pi.
The `claude -p` option is allowed per https://x.com/i/status/2040207998807908432 so I really don't understand how they're enforcing this.
Also, what's the point of `claude -p` if not integration with third-party code? (They have a whole Agent SDK which does the same thing... but I think that one requires per-token pricing.) I guess they regret supporting subscription auth on the -p flag.
that's a ridiculous position to take - gemini and others work just great with claw...
OpenClaw managed to burn 2.46 trillion tokens just in the last 30 days.
I'm not even gonna judge why someone needs an AI Assistant running 24/7, the core issue is that coding plans are being ruined because they're not paying for ridiculous amount of tokens burned.
Anthropic is actually making the right decision: You want a LOT of tokens for your 24/7 agent? Ok, just use the API and pay for your tokens.
I enjoy paying for a sub that I actually use to code, and what we pay today is not even enough to cover the costs of running AI servers.
Either they pay up or get off
https://upload.wikimedia.org/wikipedia/commons/1/14/Bill_Gat...
It’s really that straightforward. If tomorrow they decide GPUs are better allocated to enterprise use, they could start removing the $20 plan just as quickly overnight, the same way they did tonight.
Subscriptions assume “human usage” — bursty, limited, mostly interactive. Agent systems are closer to autonomous infrastructure load running continuously.
OpenClaw is a good example of this. Once agents operate freely, they don’t behave like users — they behave like infrastructure.
That’s why this kind of restriction isn’t too surprising.
Long term, it seems likely this pushes things toward: - API-first usage - or local / open models
rather than agents sitting on top of subscription-based UIs.
I'm hoping with this change we see the rate limits start to not be as rough.
https://news.ycombinator.com/item?id=46936105 Billing can be bypassed using a combo of subagents with an agent definition
> "Even without hacks, Copilot is still a cheap way to use Claude models"
20260116 https://github.blog/changelog/2026-01-16-github-copilot-now-...
https://github.com/features/copilot/plans $40/month for 1500 requests; $0.04/request after that
https://docs.github.com/en/copilot/concepts/billing/copilot-... Opus uses 3x requests
Mind you, I think GHCP is a great service at an excellent price, but the hardcore vibe coders complain about the rate limits that I've never personally experienced using the CLI.
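For scale, the quoted plan numbers work out roughly as follows (simple arithmetic on the figures above; the 3x Opus multiplier is as stated in the linked docs):

```python
# Arithmetic on the quoted Copilot plan: $40/month for 1500 requests,
# $0.04/request overage, Opus billed at 3x requests.
monthly_price = 40.0
included_requests = 1500
overage_per_request = 0.04
opus_multiplier = 3

per_request = monthly_price / included_requests       # ~$0.027 per plain request
opus_in_plan = per_request * opus_multiplier          # effective cost per Opus call in-plan
opus_overage = overage_per_request * opus_multiplier  # cost per Opus call past the cap

print(round(opus_in_plan, 3))   # 0.08
print(round(opus_overage, 2))   # 0.12
```

So an Opus call runs about eight cents inside the plan and twelve past the cap, which is indeed cheap relative to per-token API pricing for heavy use.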
Meanwhile I can't even seem to spend my $20 Cursor Composer 2 tokens using their agent. I've been doing useless shit just to see how much usage I can cram in there and it'd probably take 10 hours of vibecoding like a loser every day to hit the limits at this point.
With that said, I'm not going to pay for something that doesn't allow me to use whatever I want (in terms of harness, etc.), so both Anthropic (who were already disqualified because of their ridiculous limits) and Cursor are out (AFAIK you can't use an agent other than their `agent` binary without some ridiculous hack like proxying all of the calls through `agent`).
I can't imagine all of the providers pretending their agents are real value going forward, but even if they do there's still stuff like OpenRouter which doesn't give a shit, may as well use something like that.
For a good existing example developed by a known company, check Cline Kanban: https://cline.bot/kanban
They don't have the MCP-bundling idea that I'm experimenting with, however.
I imagine how they treat these things will be contextual and maybe inconsistent. There aren't really hard lines between what they probably want editors that integrate with them to do and generic tools that try to sit a layer above the vendors' agent TUIs.
I think the usage patterns of a lot of harnesses are pushing against their planned capacity. I would say they can certainly explain themselves a lot better.
I use `claude -p` for a lot, if not most, of my coding workflows.
If you are not aware, ACP creates a persistent session for steering rather than using the models directly.
And you don't have to get anyone's permission to use tmux.
AKA when you fully use the capacity you paid for, that's too much!
Similarly, on a home internet connection you might pay for a given size of pipe, but most residential ISPs don't allow running publicly accessible servers on your connection because you'll typically use way more of the bandwidth.
That argument would have been valid in the beginning, when the 5-hour blocks were unlimited.
The business model of an ISP involves fixed capital investments into infrastructure with constant opex and very little variable costs.
The marginal cost of sending a gigabyte is basically zero. The limited resource here is bandwidth and ISPs split their tiers based on bandwidth.
The problem is that some users may consume the local bandwidth that is shared with other users. More bandwidth requires more investment into infrastructure. This means that bandwidth in itself doesn't produce costs for the ISP either, it is the maximum bandwidth capacity that costs money.
Hence, oversubscription is a viable business as long as neighbors aren't impacted by power users.
This doesn't apply to LLMs. Token economics is more like steel economics. There is high capex to get started, but the real killer is the variable cost per unit of steel.
You can't sell steel on an oversubscribed subscription model. It's nonsensical.
If the subscription is more expensive than buying what you need, nobody is going to pay for the subscription unless they consume all of it.
Hence the subscription must contain a subsidy to make it competitive.
However, the people who consume the full subscription are still there and each token they request adds up on your electricity bill.
Ergo, the subscription must be more expensive than the API, but with a smart billing limit that removes the cognitive burden of pay-as-you-go billing.
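That subsidy argument can be sketched with toy numbers (every figure below is an assumption for illustration, not Anthropic's actual pricing or cost structure):

```python
# Illustrative numbers only: what a flat subscription earns against
# marginal token cost at different levels of cap utilization.
SUB_PRICE = 200.0      # $/month subscription (e.g. a top tier) - assumed
API_VALUE = 1300.0     # $/month of API-priced tokens the caps could allow - assumed
COST_FRACTION = 0.5    # provider's marginal cost as a share of API price - assumed

def provider_margin(utilization: float) -> float:
    """Profit on one subscriber who consumes `utilization` of the cap."""
    tokens_at_api_price = API_VALUE * utilization
    marginal_cost = tokens_at_api_price * COST_FRACTION
    return SUB_PRICE - marginal_cost

# A bursty human at ~15% of the cap is profitable...
print(round(provider_margin(0.15), 2))   # 102.5
# ...a 24/7 agent pinning the cap is deeply unprofitable.
print(round(provider_margin(1.0), 2))    # -450.0
```

The exact numbers don't matter; the shape does: the model only works while most subscribers stay far below the theoretical maximum.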
The API keys are how you use models directly.
You can pay for the capacity, using the per token price.
Prior to Anthropic I had bad experiences with Windsurf and Cursor, same shit - I pay for the plan, they shrink my usage quota after a short time, a couple of weeks or months. I never returned to Windsurf after they abused me, and never used Cursor after I got my Claude sub. I have no idea where I'll end up next. Too bad Anthropic is pushing my $200/mo away.
I am running opus to make changes to my code then running the code. I am genuinely curious how we are having such disparate experiences here. And at this point, IMO you're in too deep not to share...
Genuinely wondering if you're running gastown or some other crazy mixture of agents pretending they're an AI startup. I get by with a developer agent and a reviewer agent ping ponging off each other encouraged to be rude, crude, and socially unacceptable about it.
One loop of this can take 20-60 min and eat 2-5% of my week limit. I have to actively slow myself down to not burn through more than 15-20% of my weekly limit in a day (as I also like to work on it on weekends).
Sadly I can't share the actual problem I'm working on as it's not my secret to disclose, but it's nothing "crazy", and I'm so surprised others don't have a similar experience.
My observations are these things impact both quality and token consumption a lot.
Typically my starting prompts for the plan phase are 1-2 pages long and take 30-60 minutes just to get into the CLI text box. From there, I first generate detailed ADRs and documentation and break down issues in the ticketing system using the agent, then review and correct the plan several times before letting it write a single line.
It is no different from how we would do it with a human; typing the lines of code was always the easy part once you know exactly what you want.
It feels faster to bypass the thinking phase and offload entirely to the agent, letting it either stumble around on low-res feedback or, worse, just wing it. Either way you're adding a lot of debt, and things slow down quickly.
In that context, I don't understand the difference between a "third party harness" and a shell script.
How are they even detecting OpenClaw?
Like what? I legitimately don't understand what is prohibited. Using claude as part of a shell script? Am I only allowed to use claude if I physically type the commands into a terminal via my keyboard? Why even ship `claude -p` at all?
Is it infrastructure? Are they unable to control costs?
Everyone else is spending like money is water to try to get adoption. Claude has it and is dialing back utility so that its most passionate users will probably leave.
I don’t understand this move.
For SaaS, use the SaaS API. For product, use the product.
They subsidize the product with "don't care how much" pricing so they have users to build out features without users worrying about cost. If it's not actual users using the product, then features will be built in OpenClaw instead of Claude.
The earlier they draw this line, the better.
However, announcing it the day before it is effective is a huge unforced error, even if it were just a consequence of the TOS. They gain nothing by making people scramble.
Also better to announce, at the same time, new ways to support plugging into Claude Code - something to encourage integration/cooperation. No fences unless the field inside is flowering.
Despite their power, frontier models are threatened by open-source equivalents. If AGI is not on the horizon and model performance is likely not going to be enough of a differentiator to keep the momentum going, the only other way is to go horizontal - enterprise solutions, proprietary coding agent harnesses, market capture, etc.
If AGI is in sight, none of these short-term games really matter. You just need to race ahead.
Thanks openclaw for getting me ahead, I’ve taken that and am in Claude code again.
I switched OpenClaw to MiniMax 2.7. This combined with Claude over telegram does enough for me.
OpenClaw used to burn through all my Claude usage anyway.
Here are the docs. https://code.claude.com/docs/en/channels - actually I couldn’t even get this going myself so I asked Claude to do it (meta) and it did, I also had it set this up with a launcher and it all starts up automatically just like openclaw.
When I do use AI, I already have a solid plan of what I need. Sometimes I ask it to look something up. I never do both in one prompt.
GLM 5.1 can do both, and its way way cheaper. I also don't hit my limit that fast (Plus I get to use it in OpenCode).
1. Make a better product/alternative to OpenClaw and start eating into their userbase. They hold the advantage because the ones "using their servers too much" are already their clients, so they could reach out and keep trying to convert. OpenClaw literally brought customers to their door.
2. Screw everyone over royally and drive them off the platform - with a strong feeling of dislike or hatred towards Anthropic.
Let's see how 2 goes for them. This is not the space to be treating your clients this way.
Why hatred btw? They're not even banning accounts left and right like Google?
There's a good chance they do not have the infrastructure to do that.
Okay, that got my attention. What harnesses are those?
I'm hoping that they won't bother you unless you specifically max out the subscription limits every time
https://x.com/bcherny/status/2040206440556826908?s=20
Graceful handling from Anthropic
Less than 24 hours notice and on a holiday weekend
The API is there. It's straightforward and easy to use. But these users want to piss in the well, tragedy of the commons style.
Claude Code seems designed to terminate quickly; mine always finds excuses to declare victory prematurely when given a task that should take hours.
The premise of the subscription isn't "giant bucket of ultra-cheap tokens that you can use however you want", it's "giant bucket of ultra-cheap tokens that you can use with OUR tools, within reasonable limits". Even if their TOS didn't prohibit OpenClaw-oids, I wouldn't consider this bait-and-switch, I'd consider it a reasonable and needed move.
Like, an API key on a subscription that could be used for third-party tools would count 2x towards usage compared to the same model used through Claude Code.
Or it'd count the same towards weekly or 5-hour limits across all models BUT would have a separate, more grounded limit for API keys under subscriptions. A bit like how they already have a separate Sonnet usage counter.
That’d both allow them not to go broke and also not lose so much community goodwill AND give subscription users an alternative to paying for their enterprise-oriented (overpriced) tokens.
Even the $20 subscription is ridiculously limited and they keep adding more and more limits. The $200 a month sub is insane and only going to get worse and yet still limited
I'm hitting rate limits within 1h45 in the afternoons.
I can’t justify extra usage since it’s a variable cost, but I can justify a higher subscription tier.
There's gotta be a limit; nobody can afford to have tons of users who are losing them money every month.
Time to compete on value with the Chinese.
My guess is a plan with double the limits would need to be 5-10x as expensive.
https://support.claude.com/en/articles/12429409-manage-extra...
It's like I was a graphic designer and my finance company said "photoshop is too expensive". I wouldn't be mad at Adobe for it
Usage of such tools should be forbidden in companies - it's cheating and using code you didn't even write. That's literally a crime.
I presume that if you're such a vocal opponent of CC, you're also fighting using IDEs and other tools useful in software engineering, like CI pipelines?
Emacs/vim and make should be the maximum a person is permitted!
I wouldn't say it's drying.
https://x.com/OpenAI/status/2039085161971896807
https://techcrunch.com/2026/04/01/startup-funding-shatters-a...
We're all just getting too used to having great models for a fraction of the value they give us.
If you started plugging tools into GPT5.4 you may soon discover that you don't need anything beyond a single conversation loop with some light nesting. A lot of the openclaw approach seems to be about error handling, retry, resilience and perspectives on LLM tool use from 4+ months ago. All of these ideas are nice, but it's a hell of a lot easier to just be right the first time if all you need is a source file updated or an email written. You can get done in 100 tokens what others can't seem to get done in millions of tokens. As we become more efficient, the economic urgency around token smuggling begins to dissipate.
For example...
We recently moved a very expensive sonnet 4.6 agent to step-3.5-flash and it works surprisingly well. Obviously step-3.5-flash is nowhere near the raw performance of sonnet, but step works perfectly fine for this case.
Another personal observation is that we are most likely going to see a lot of micro coding agent architectures everywhere. We have several such cases. GPT and Claude are not needed if you focus the agent to work on specific parts of the code. I wrote something about this here: https://chatbotkit.com/reflections/the-rise-of-micro-coding-...
inb4 skill issue I could probably beat you coding by hand with you using Claude code
> Obviously step-3.5-flash is nowhere near the raw performance of sonnet
I feel like these two statements conflict with each other.
What's the exact definition of third-party harnesses? They have an Agent SDK in Claude Code that can be used. Are they trying to say that only Anthropic products can use pro/max plans?
but couldn't i use this in off times only?
The problem Anthropic is running into is that OpenClaw made it easy for everyone to become one of those folks that washes their car three times a week or more.
I’m sure they were losing money on subscriptions in general but now they are really losing money. Shutting off OpenClaw specifically probably helps stem some of the bleeding.
Why would they actively subsidize the ticking timebomb? When OpenClaw has an especially large security incident, Anthropic will probably be affected just for the association.
Like, right alongside this post on the front page, we have a post about a relatively serious privilege escalation vulnerability in OpenClaw.
Extra usage is very sneaky: you don't get any notice that you're using extra usage, so you could end up with unnecessary costs when you'd have preferred to just wait an hour or so.
Real PMF sells itself. The risk is of course the competition catching up, I bet switching costs are very low on this setup.
https://focusoverfeatures.substack.com/p/claude-max-blocks-o...
UPDATE:
Reply on X from Thariq (@trq212): only flagged accounts, but you can still claim the credit.
So this change has actually forced a reckoning of sorts. Maybe the best option is to outsource the thinking to another model, and then send it back to Opus to package up.
Ironically this is how the non-agent works too to an extent.
Forgive me if someone asked this already and I can't find it in the comments.
headers['X-Title']
You can change that
The other simple method is to only accept certain system prompts
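This is why client-side identifiers are a weak enforcement signal: anything the client sends can be rewritten in transit. A hypothetical sketch (the header names and replacement value are made up for illustration, not a documented detection list):

```python
# Hypothetical proxy step that strips or rewrites headers which could
# identify the calling harness before a request is forwarded.
IDENTIFYING_HEADERS = {"x-title", "x-app", "user-agent"}

def scrub_headers(headers: dict) -> dict:
    """Drop identifying headers, then present a generic client string."""
    clean = {k: v for k, v in headers.items()
             if k.lower() not in IDENTIFYING_HEADERS}
    clean["User-Agent"] = "generic-client/1.0"  # made-up replacement value
    return clean

req = {"X-Title": "OpenClaw", "User-Agent": "openclaw/1.0",
       "Authorization": "Bearer <token>"}
print(scrub_headers(req))  # only Authorization survives, plus the generic UA
```

System-prompt fingerprinting is harder to dodge than headers, but it's the same cat-and-mouse dynamic.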
I've been meaning to do some dumb little proxy system where all your i/o can pass through any specified system such as a web page, harness, whatever...
Essentially a local model tool-calls an "Oracle", which is just something like a wrapper around Claude Code, or anything else you've figured out how to scrape, and then you talk to the small model that mostly uses the Oracle. And... there you go.
There's certainly i/o shuffling and latency but given model speeds and throughput it'll be relatively very small
Now people probably care
Doesn't mean I know how to market it, I'll certainly fail at that, but at least I can build it
I can do that now with claude code and a "while true" bash loop.
Or with the built-in "/schedule" in claude code to set an agent to run say once every few minutes.
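A minimal sketch of that bash loop (the prompt text and five-minute cadence are illustrative; the loop only starts when explicitly invoked with `run`, and only if the `claude` CLI is installed):

```shell
#!/bin/sh
# The "while true" pattern: hand claude a prompt in print mode (-p),
# wait, repeat. This is all an always-on agent amounts to at the CLI level.
agent_loop() {
  while true; do
    claude -p "Continue the task in TODO.md and report progress."
    sleep 300   # pause between passes
  done
}

# Guard: never loop unless asked to and the CLI actually exists.
if [ "${1:-}" = "run" ] && command -v claude >/dev/null 2>&1; then
  agent_loop
fi
```

Which is the crux of the enforcement problem: nothing here is a "harness" in any detectable sense.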
They become how you think, and then the company has you: hook, line, and sinker.
Instead of not driving to work to save fuel, frugal companies are going to have their engineers work on weekends to save tokens.
I understand people from the US will have an anti-Chinese reaction, but for us in the "third world" who can use both techs, the openness is always good.
We are paying for a certain amount of token consumption
Why then, is this an outsized strain on your system Anthropic?
It's like buying gasoline from Shell, and then Shell's terms of services forcing you to use that gas in a Hummer that does 5 MPG, while everyone else wants to drive any other vehicle.
To use your analogy, if Shell sold you a subscription to fill up your Hummer up to 30 times a month, they wouldn't let you use that subscription to fill gas cans with a GMC logo taped to the side. They couldn't, without overcharging the people who just want to average out their cost of driving.
You don't get to sell a subscription described primarily as being for some quantity of X and then change the terms every time people find creative ways to use the stream of X they believe themselves to have purchased from you. People thought they were purchasing in bulk.
> We are paying for a certain amount of token consumption
I don't think you are. The specific arrangement you have is that you pay for a subscription to be used with Claude Code. It isn't access to tokens that you can do whatever you please with.
---
An analogy would be a refillable soda cup at a restaurant. They will let you refill however many times you want, but only using the store-provided cup - you can't bring your own 2L Hydro Flask or whatever. You're paying not just for the liquid, but for the entire setup.
It would be like the restaurant saying "you can buy the 2-liter soda pack" and then getting all uppity when you bring your own 2L hydroflask in.
They have a per-token payment option where you can use any tool you like
The plans do not say how many tokens you get. People are paying for access. Higher plans get more usage. The marketing and support material of the plans only use the word "usage" and never "tokens."
So, to me, it's a "we built it into our world, so use ours."
Edit: FWIW I am an avid hater of all claw things; they're a security nightmare.
Btw even at insane markups $200/mo means GPUs break even pretty fast.
Claude Code is subsidized because of data collection.
Our engineering team averages $1.5k per dev per month in credit costs, without busting Max limits today.
Personally, idk why they don't just make Claude Code more open-source friendly. Let the community do PRs for Claude Code. Let us change the tooling; if I could use their own client but swap out the tools it calls and how it calls them, I would use like 90% fewer tokens.
It's simply identical to how people use Claude Code locally.
The lines drawn by their consumer vs commercial TOS were clear, and I never subscribed because of it.
We have had the ability to automate browser activities for a long time—but, online service providers don’t want to be behind a layer of automation, which is why captchas were invented.
Automating things on the Internet has never been a technology obstacle, it has been a social one.
I don’t see how anything has changed!
In fact I recently received an updated ToS from eBay saying I am not allowed to use an AI agent to buy stuff on their site. Just a matter of time until others follow suit!
Edit: I misunderstood what was happening. Thanks to the comment below for clarifying.
I also bundle a default agent with it, forked from ZeroClaw, with the goal of being more or less prompt-injection proof and hopefully able to centralize configuration and permissions for most or all of the agents it manages, though that part is a very rough sketch/plan at the moment, and I'd love feedback and help on it from anyone interested. Two projects, clash and nono, caught my eye in this space; I think both leverage Linux Landlock, but I may also use landrun for similar control of other processes (like openclaw) that it may manage for the user. I'm still figuring out how and where to fit all the pieces together, and what's pragmatic, what's overkill, and what overlaps or duplicates across the various strategies and tools. Right now there are real bash wrappers that evaluate Starlark policies. I'm hoping to fully validate end to end, but if you're interested, a few other users testing, validating, and/or contributing Claude tokens to the project could be invaluable at this stage. I plan to open source it ASAP, maybe tonight or tomorrow if there's interest and I have time to finish cleanup and the rename (I was calling it PolyClaw, but that gets confused with some weird Polymarket Claude skill, so now the router is going to be ZeroClawed and the agent will stay NonZeroClaw, in homage to ZeroClaw, which it's forked from; we may also integrate the new Claw Code port, which is also Rust, just for good measure, as a native coding agent alongside the native claw agent).
Anyway, the main reason I mention it is that it already has a working ACP integration for any code agent, and I'm now working on using Claude Code's native channel integration to make it appear as a full-fledged channel of its own, as it more or less already does to OpenClaw, for anyone wanting to gradually migrate away from their existing OpenClaw installation towards Claude or some other agent. Email me or respond here if interested, or I'll try to post a link here once it's fully public/open source.
Say goodbye to my $600/month, Anthropic.
I do a lotta stuff don’t need to get into it here.
Interestingly, it looks like I haven't received a non-receipt email from them since August 2025.
edit: see Boris' tweet about it https://x.com/bcherny/status/2040206443094446558
It is a pity though. For less than an hour of setup the Nanoclaw bot proved enormously useful at tracking meal times, training progress, etc and the interface was easy enough for the family to get involved. The ease of setup was really remarkable, and Anthropic creating artificial barriers just seems user hostile.
Public model inference quality is almost at SOTA levels, why would anyone pay these VC-subsidized companies even a cent? For a shitty chat interface? Give me a break.
I suspect the same for the forced high AI usage quotas for developers at MS etc. We've had multiple generations of models trained on all of the code that's available and there are diminishing returns on how much that data can do for training now. Newly published publicly available data is also made up of a significant portion of slop.
The best way to get fresh training data from real human brains might be to have real humans use your first party tools where you control all of the telemetry.
No, Anthropic, just because you added a clause that says "we can change these terms whenever" doesn't make it right. I'm paying you a set amount of money a month for a set amount of tokens (that's what limits are), and I should be able to use these tokens however I want.
Luckily, there are alternatives.
Anthropic not allowing Claude Code subscriptions to be used with other projects isn't "pulling the rug out"; you paid for an API subscription to use Claude Code, and now you're using it for a different purpose and a different product.
If Tesla offered $10/month charging for your Tesla, and then a bunch of people turned around and used their Tesla Charge subscription to charge all sorts of different electric vehicles and battery packs, and also hooked up a crypto-mining rig to it, would you be surprised if they said "Nope, we're cutting this off. You can only use your Tesla Charge subscription for your Tesla vehicle"?
> If Tesla offered $10/month charging for your Tesla
No, "if Tesla offered $10/month for 100 kWh of charging", and yes, I expect to use those 100 kWh with any vehicle I want, because there's a limit on the resource I'm paying for.
I can understand caps on unlimited, I can't understand caps when there are strict limits.
You are the reason these changes are happening. You may well be the reason that subscription prices go up.
ChatGPT found it was a great idea and that I can use Claude for planning and gave me instructions on how to best hand off the building part. Claude told me it’s a horrible idea.
Claude also burns much more liberally through tokens, eg reading through entire irrelevant docs.
Openclaw is great for resolving this, since I have much more control over which work goes where, and it also gives a much better user experience without all the back and forth to understand what context it has (my use case is building things from my phone while I'm in senseless meetings at my day job).
Fully agree on the alternatives. In the end Claude’s experience is worse, while it still makes bad decisions if you let it. Better to get a good workflow on a less capable model.
the $200 tier math only works because humans have to type, read, and eventually go to sleep. OpenClaw replaced that human latency with a non-blocking while true loop. tbh they aren't really defending an ecosystem here, they are just desperately patching a hole in their unit economics that collapsed the second the meat bottleneck was removed.
It's like if I buy a hot dog every month and they tell me they're raising the price next month, or discontinuing honey mustard. Inconvenient but they're not doing anything wrong.
Especially since, given my back of the napkin math, they're giving us a pretty decent discount on the subscription plans.
This one passes the Golden Rule test for me. I treat them as I would have them treat me which is that we both will work with whatever makes economic sense.
How is what you are asking for different from what they are saying?
I'm doing a side-by-side with GPT-5.4 for $20/mo and Sonnet for $20/mo and I can tell you that all my 5 hour tokens are eaten in 30 minutes with Claude. I still haven't used my tokens for OpenAI.
Code quality seems fine on both. Building an app in Go
I think using it to write small documentation or small scripts would be a good use case, but for serious development work you hit the usage limits way too fast.
Only thing now is that the cheaper (worse) chinese model coding plans have huge limits, so I lean on those now. Requires a lot more hand-holding though.
The Anthropic casino wants you to continue gambling tokens at their casino only on their machines (Claude Code) only by giving more promotional offers such as free spins, $20 bets and more free tokens at the roulette wheels and slot machines.
But you cannot repurpose your subscription on other slot machines that are not owned by Anthropic and if you want it badly, they charge you more for those credits.
The house (Anthropic) always wins.
> you’ll no longer be able to use your Claude subscription limits for third-party harnesses including OpenClaw.
My understanding is that Conductor and others aren't using it.
Anthropic's current business model is to sell access to their tools to subscribers at a loss. Users maxing out their $200/month plan can realistically cost Anthropic $500-600 in actual compute costs.
Anthropic is okay with this right now because they want to amass as many users as they can, and eventually hope that GPUs will increase in power and efficiency, and their LLMs will become more efficient as well. They can eventually profit off of their current pricing, or with modest price increases, if that comes to fruition.
But letting OpenClaw wake up every 30 minutes and start sending requests is a surefire way to max out your weekly limits, and that certainly isn't something Anthropic planned for.
Claude innovation will come from being open, not closed.
If you haven't been paying attention, Anthropic burned a lot of their developer goodwill in the last 2 weeks, with some combination of bugs and rate limits.
But the writing is on the wall about how bad things are behind the scenes. The circa 2002 sentiment filter regex in their own tool should have been a major clue about where things stand.
The question everyone should be asking at this point is this: is there an economic model that makes AI viable? The "bitter lesson" here is in AI's history: expert systems were amazing, but they could not be maintained at cost.
The next race is the scaling problem, and google with their memory savings paper has given a strong signal what the next 2 years of research are going to be focused on: scaling.
I've been calling for local LLMs as owning the means of production. I ain't wrong.
On my desktop RTX 5060 TI (16GB) and 96GB ram, I routinely get 25-30 tokens/sec using an 80B model quantized to int8. Uses 65GB system ram and 15GB gfx ram.
And its plenty fast for many of my purposes.
I could easily run a 30B model bf16 (full) and do like 50tok/s
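The reported memory split is consistent with a back-of-envelope weights calculation (bytes per parameter times parameter count; KV cache and activations come on top, and these numbers are rough):

```python
# Rough weight footprint: parameters (in billions) times bytes per
# parameter gives gigabytes of weights, ignoring cache/activations.
def weight_footprint_gb(params_billions: float, bytes_per_param: float) -> float:
    return params_billions * bytes_per_param

print(weight_footprint_gb(80, 1.0))  # 80.0 -> int8 80B: matches ~65 GB RAM + ~15 GB VRAM
print(weight_footprint_gb(30, 2.0))  # 60.0 -> bf16 30B fits the same 96+16 GB box
```

The 25-30 tok/s figure then follows mostly from memory bandwidth, since CPU-offloaded layers are bandwidth-bound rather than compute-bound.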
You can use your Claude Code subscription with third-party tools, but you have to use the Claude Code harness. Or, you use the API. OpenClaw could use the Claude Code harness, but they don't.
Open weights will always trail SOTA. Forever. So let's say they continue to get better every year. In 100 years, the open weight model will be 100x better than today. But the SOTA model will be 101x better. And still, people will make this argument that you should pay a premium for SOTA. Despite the open weights being 100x better than what we have today.
The open weights today are better than the SOTA models from a year ago. Yet people were using the SOTA models for coding a year ago. If people used SOTA models a year ago, then it was good enough, right? So why isn't the same (or better) good enough now?
The answer is: it is good enough. But people are irrationally afraid of missing out (FOMO). They're not really using their brains. They're letting fear lead their decisions. They're afraid "something bad" will happen if they don't use the absolute latest model. Despite the repeatable, objective benchmarks telling us all that open weights are perfectly capable of doing real work today, the fear is that we're missing out on something better. So people throw away their money and struggle with rate-limits because of their fear.
I'm not sure how much I trust those benchmarks; I have a feeling everyone is playing up to them in some way. Still, if you're willing to accept the latency, they're definitely usable.
Of course everyone has realized this, so the hardware you need to run them is a little bit on the expensive side right this minute.
CPU manufacturers are working on improvements so that you can more practically run models on regular CPU+RAM (it's already possible with llama.cpp, just even slower).
The GPU also costs around $500-$1000 in electricity, and even then you won't be able to run a model as good as Anthropic's.
It's also hard to justify since who knows how quickly it will be outdated; maybe soon you'll need a Blackwell chip (like a $100k PC, check out the NVIDIA DGX Station) to run a decent model.
... It'll take a lot more than a year for a machine capable of running OpenClaw with any sort of reasonable performance to pay for itself.
Or can you report that you've had good luck with a Strix Halo or local GPU for less than $40k up-front costs?
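To put the electricity claim in perspective, here's a rough sketch of the annual running cost; the wattage, daily hours, and price per kWh below are assumed figures, not from the thread:

```python
def annual_electricity_cost(watts: float, hours_per_day: float, usd_per_kwh: float) -> float:
    """Yearly electricity cost of a GPU: convert watts to kWh/year,
    then multiply by the utility rate."""
    kwh_per_year = watts / 1000 * hours_per_day * 365
    return kwh_per_year * usd_per_kwh

# Example assumptions: a ~350 W GPU running 12 h/day at $0.30/kWh
print(round(annual_electricity_cost(350, 12, 0.30)))  # ~460 per year
```

At heavier utilization or higher rates, that figure lands in the $500-$1000 range mentioned above.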
Just look at how Sam Altman has led OpenAI step by step to dominate—and choke out—Anthropic, a company founded by the group of engineers who were once part of the turmoil at OpenAI.
Anthropic's product thinking is terrible, even though its technology is very good.
OpenAI seems to mostly be chasing the consumer market, but not doing great at it.
Based on the limited public information out there, the AI chat tools with the most users are ChatGPT, Meta, Gemini, Alibaba, Baidu, Copilot, and Grok. Anthropic is nowhere near the top.
IMO, the goal here is clear: they want people to use their software, build an ecosystem around their software, and give them visibility around their software.
It's never about capacity or usage; they just want to own the Claude ecosystem. There's a reason they don't support AGENTS.MD or other initiatives: they want everything to be theirs and theirs alone. You can argue "well, fair," but to me this is a clear abuse of their position in the market.