Comments (456)
- abuani: I take a peek every month or so at spend for my company, and notice more and more people consuming $1k in tokens a month, and it is bewildering to me how. I use LLMs daily, and see anywhere from $200-$400 tops. This is using the most expensive models, in deep thinking mode. So I'm not a Luddite against the usage of them. I just can't figure out _how_ to burn that much money a month responsibly.

  I genuinely challenge someone spending $5-$10k a month to demonstrate how that turns into $50-$100k in value. At a corporate level, I'd much rather hire a junior engineer who spends $100-$200/month and becomes productive than try to rationalize $100k/year in token spend.
- internetter: I know I'm responding to AI right now, but

  > which means figuring out if the company can afford this level of productivity at scale.

  If it was actually productive, then the revenue would increase and affordability wouldn't be a question.
- MichaelNolan:

  > 95% of Uber engineers now use AI tools monthly with 70% of committed code originating from AI.

  Well, that's to be expected when using AI tools becomes relevant in your performance evaluation.
- hyperpape: I love how these articles drop, and all of a sudden HN is filled with people who think engineering productivity is simple to measure.

  Yes, productivity implies revenue (or cost reduction), and revenue is measurable. However:

  1. You spend money today to build features that drive revenue in the future, so when expenses go up rapidly today, you don't yet have the revenue to measure.
  2. It's inherently a counterfactual consideration: you have these features completed today, using AI. You're profitable/unprofitable. So AI is productive/unproductive, right? No. You have to estimate what you would've gotten done without AI, and how much revenue you would've had then.
  3. Business is often a Red Queen's race. If you don't make improvements, it's often the case that you'll lose revenue as competitors take advantage.
  4. Most likely, AI use is a mixture of working on things that matter and people throwing shit against the wall "because it's easy now." Actually measuring the potential productivity improvements means figuring out how to keep the first category and avoid the second.

  This isn't me arguing for or against AI. It's just me telling you not to be lazy and say "if it were productive you'd be able to measure it."
- trjordan:

  > figuring out if the company can afford this level of productivity at scale

  This is the thing that boggles my mind. They spent their budget. They have 4 months of data. What do they have to show for it?

  I'm not a hater; I'm not a luddite. I have a $200 Max plan and I use it.

  But are you saying that Uber made this tool available, urged everybody to use it, and is confused about what happened when it worked? It's one thing if they decide AI isn't productive enough to be worth the cost.

  Are they out of ideas on what to build next, or something?
- ninjagoo: According to [1], there are about 5,500 people in Engineering at Uber. Using $1,250 as the mid-point of the $ spend range, that comes to about $6.8 million in engineering AI spend, ballpark, with the range being $2.75 million - $12 million. The article lists $3.4 billion as the R&D spend.

  The AI spend does not appear to be a significant chunk of R&D spending (0.3% in 4 months, or 1% annualized). If they didn't plan for it, sure, it's not peanuts in the budget, but in context not that much.

  The real question is, what did they get for that amount? The article claims that 70% of committed code is now AI-generated, so presumably the code passed review and tests. Did it accelerate the feature count? Did it reduce quality problems? Did it lead to other benefits?

  Sadly, the article is silent on the outcomes, besides the higher spend.

  Maybe 4 months is too soon to assess the benefits. On the other hand, in an agile world ...

  [1] https://www.unifygtm.com/insights-headcount/uber
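  The arithmetic in that estimate can be checked with a quick script. All inputs are the comment's own assumptions (~5,500 engineers, a $500-$2,000 per-engineer range, $3.4B annual R&D), not figures from any primary source; note the answer shifts by 4x depending on whether the range is read as a 4-month total or a monthly figure.

  ```python
  # Back-of-envelope check of the estimate above. All figures are
  # assumptions taken from the comment, not from a primary source.
  engineers = 5_500
  midpoint = (500 + 2_000) / 2          # $1,250 per engineer

  # Reading 1: $500-$2,000 is each engineer's TOTAL spend over 4 months.
  total_4mo = engineers * midpoint      # ~$6.9M
  # Reading 2: the range is MONTHLY (as other comments quote it), so 4x.
  total_4mo_monthly = total_4mo * 4     # ~$27.5M

  rd_annual = 3.4e9                     # reported annual R&D budget
  for label, total in [("4-month total", total_4mo),
                       ("monthly", total_4mo_monthly)]:
      print(f"range read as {label}: ${total / 1e6:.1f}M, "
            f"{total / rd_annual:.2%} of annual R&D")
  ```

  Under reading 1 the spend is roughly 0.2% of annual R&D, consistent with the comment's "not that much" conclusion; under reading 2 it is closer to 0.8%.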
- jjcm: Speaking as someone who's bootstrapping here, I'm often envious of engineers at these larger companies, but I also worry that the incentives are screwed up.

  If I were an engineer at Uber, why wouldn't I select gpt 5.5 pro @ very high thinking + fast mode for a prompt? There's no incentive not to use the most powerful (and thus most expensive) model for even the smallest of changes.

  I tried one of these prompts for some tests I'm doing for image->html conversion, and a single prompt cost me $40. As someone paying for that myself, I'd pretty much never use this configuration. As someone at a large company where someone else is footing the bill, I'd spin these up regularly (the output was significantly better, fwiw). Engineers are being rated on what they deliver, not the expenditure to get there.

  There are ways to do this cheaply, but there are no incentives for engineers to do so.
- tunesmith: As it becomes more common for executives to think we can replace software engineering with agents, I wonder if they might be basing their decisions on unrealistic perceptions of the average software engineer. I'm mulling two somewhat contradictory senses:

  1. You get out of it what you put into it. A savvy CTO might be incredibly excited by everything they can do with agents, and improperly think that all the software engineers can do the same thing, when in reality your org's average software engineers might not have the creativity to even think of many cases where it could save them work. So by mandating agent usage, you might find that productivity hasn't improved while AI costs have increased.

  2. When using AI, two gaps become more obvious. First: who tells the agent what to do? In many orgs, product isn't technically savvy enough to come up with a detailed spec/plan that an LLM can use. And many cog-in-machine developers aren't positioned to come up with the spec; they just want to implement it. By expecting work to be implemented by agent-using developers, you might instead find a lot of idle workers waiting for work to show up. Second is the QA/review cycle. You've introduced a big change to the org, but are you really saving cost or shifting it?

  I'm all for introducing LLMs as optional tools to help existing developers increase velocity and quality, but I think the "let's restructure the org" movement is really dicey, especially for mid-size or smaller employers.
- Animats: What is Uber developing? They're an app and a car-allocator back end. Both work OK. Why are they spending so much?

  They gave up on self-driving, so that's not it.
- woah: It's very easy to blow through hundreds of dollars a session using API tokens, especially with the 1M context, if you aren't careful about clearing old context.

  At the same time, the subscription will allow the same usage for hundreds of dollars a month.

  Either Anthropic is absolutely hosing API users, massively subsidizing subscriptions, or a little bit of both.
- paulbjensen: This Claudemaxxing phenomenon is amusing as hell.

  I've been able to get by with the $20pm Pro subscription and reap great value out of Claude Code. I feel like it really is about:

  - Don't feed the works of Shakespeare into the context window if all it's working on is a few files. I actually don't have a Claude.md file in my projects.
  - I write the prompt as if I were giving instructions to another developer, or to myself, on how I want to approach a specific coding task, with a numbered step plan. I've actually been able to take the details written into a Jira ticket on a work project, feed it into Claude Code, and get really good results from it.
  - If you are responsible for the output, then you need to review the output. That puts a natural constraint on the tool's usage, but ultimately it is you who uses the tool, not the other way around.

  I feel like that's the thing - you have to find the right cadence, just like with running or driving a car: the level at which you control the car, maintain a consistent pace, and get code that does what you need it to do and meets the quality threshold you want.
- NicuCalcea: Can these AI-generated articles not be prompted to at least cite their primary sources? How do I know any of this is true?

  Here's a much better article: https://aimagazine.com/news/why-uber-has-already-burned-thro...
- retired: Have we reached a point yet where companies are spending millions a year on software licenses, cloud, and AI to the point where the return isn't worth it?

  Years ago I did work for a company that was spending over a million on Oracle product licenses, and I was part of the consultant team they hired to rip it all out and just go for simple, maintainable code based on open-source products. Not only did it transform into a codebase that the average newly hired developer could maintain, you also had the savings of not paying Oracle a significant portion of your revenue.

  I feel like that will repeat itself in a few years' time with the current cloud and AI train everyone is on.

  I haven't been in a professional setting for a while - I just code for fun nowadays - so perhaps I'm somewhat out of the loop.
- monooso:

  > Uber's unexpected budget burn matters because it signals how valuable AI tools have become to engineering productivity.

  This infers value from spend, which makes no sense. Burning the budget tells us engineers like the tool, not that it's producing value.

  Show me how to make two dollars whilst spending one, and budget isn't a problem.
- phillipcarter:

  > Monthly API costs per engineer ranged from $500 to $2,000 as adoption skyrocketed across the company.

  That's... not exactly a lot per engineer. It sounds like they just didn't budget correctly. Especially if the net of that work is more features that would have otherwise required hiring more engineers, which would cost a lot more than $500 to $2,000 a month.
- tokyoproductj: AI might not make engineering cheaper - just more elastic. Instead of paying for engineers, you're effectively paying per unit of thinking. At scale, that could get very expensive very quickly.
- keeda: Relevant Pragmatic Engineer newsletter with many more cases along these lines, along with how some people are handling them: https://newsletter.pragmaticengineer.com/p/the-pulse-token-s...

  Tokenmaxxing seems more and more like a way to encourage experimentation and learning, and incidents like this are part of that learning. Today, devs simply use the most expensive model by default, even for extremely simple things. This is obviously wasteful and costly, and budgets will soon be imposed, but this is how they're figuring out the economics.

  For instance, like we estimate story points, we may estimate token budgets. At that point, why waste time and money invoking a model for a simple refactor when you could do it with a few keystrokes in an IDE? And why use a frontier model when an open-source local model could spit out that throwaway script? Local models can be tokenmaxxed, but frontier models will still be needed and will be used judiciously. Those are essentially trade-offs, and will eventually be empirically driven, which is what engineering is largely about.

  So economics will soon push engineers back to doing what they're paid to do: engineering. Just that it will look very different compared to what we're used to.
- bhagyeshsp: Wonderful, so when will I see novel features in my Uber app?
- dcre: While this is a fundamentally stupid story to begin with, it was at least reported somewhat better in other venues. The original report came from The Information, and at least this Yahoo Finance [0] writeup mentioned that. This article has very little content and no sourcing.

  [0]: https://finance.yahoo.com/sectors/technology/articles/ubers-...
- ssfrr: It's wild that the article frames this as

  > what started as an experiment in productivity became a runaway success

  and

  > figuring out if the company can afford this level of productivity at scale

  It seems like they're equating "developers are spending a ton of money on this" with "this is creating a ton of value".

  I'm not saying that AI tools aren't valuable, but the article doesn't question this equivalence at all.
- cassianoleal:

  > Uber's unexpected budget burn matters because it signals how valuable AI tools have become to engineering productivity

  That's a bit of a logical leap with no demonstrable increase in productivity.

  All this shows is that they're spending a lot more on AI than they budgeted for. Nothing else.
- Painsawman123: If they burned through their ML budget in four months while using heavily subsidized models, we're going to see companies burn through their ML budgets in less than a week once those subsidies are no longer in place and they have to pay per token used.
- the_arun: I didn't see the article mention the outcomes achieved by using AI compared to not using AI. I might be missing it. Mainly, Uber is a business, so both profit and loss need to be measured to understand the equation.
- glimshe: I spend $20/month on Gemini Pro and it has greatly increased my productivity. I'm still in charge and only use AI for the more tedious or toughest problems. I can't see how these people could be spending this much productively.
- linkregister: I wonder how much of this AI budget was spent on their LLM-heavy CI/CD pipeline: https://www.uber.com/us/en/blog/ureview/

  I'm considering rolling out something similar, but am not sure if it would exceed the expense of Claude Code review, at an estimated $20 per PR.
- maplethorpe: In the Uber Eats app I can't even request a refund for an incorrect order anymore, because the UI doesn't allow me to scroll down to the "submit" button.

  It's been like this for months. I finally got my explanation.
- bilekas: I don't know - maybe this will make companies see the actual value in their engineering teams. In my company they are starting to see the rotten fruits of the AI push, but it's come at the cost of many jobs, little planning, and big ideas.

  Exactly how Anthropic, OpenAI, and co are selling it.
- jimnotgym: I didn't see a bit where they said how this transformed into more productivity and more profit. What is the point of using AI to make developers more productive if you don't either have more features coded, making more money, or fewer developers, saving cost?
- dwa3592: I am confused - what did they ship based on this spending? It is totally alright to spend that money if it made significant progress in some area.

  Or did the engineers just chill and let Claude take over daily duties? (That is also a benefit for employees, in my opinion.)
- pier25:

  > the AI coding tools represent a meaningful chunk that nobody expected would require this much capital so quickly

  Surprised Pikachu moment.

  And it's going to become even more expensive when AI companies start charging enough to actually make a profit.
- ilia-a: Not surprising. I hit my 5h limit on the Claude Code Max plan, had some credits, so I switched to extended (API). 40 minutes later, $30 in credits gone... so yeah, I can see how this can happen.
- saos: Interesting. Some companies have rolled it out to every department with a small budget.

  I wonder how this will end as AI becomes more expensive to use. If you can't quantify ROI then I guess you're cooked.
- J_Shelby_J: I use a CLI tool to build a document of all relevant code, then use ChatGPT 5.5 pro to plan a feature and generate an implementation plan, then review and edit it and paste it into Codex on high to implement.

  And it works, because it won't stop until the Rust compiles. But the code is garbage and makes bad decisions that no junior would. Unmaintainable junk, and sometimes I spend more time refactoring than if I would have just built it myself.

  People here are talking about generating 100ks of LoC a month, and I'm wondering if it's a skill issue with me, or with Codex, or if I should pull all my investments out of companies heavily invested in AI, like Uber.
- deferredgrant: AI coding tools probably need the same boring governance as cloud spend: budgets, alerts, team-level visibility, and a way to spot runaway usage before finance notices.
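  A minimal sketch of what that governance could look like: per-team monthly budgets with an alert threshold, checked against a spend feed. The team names, thresholds, and the shape of the spend data here are illustrative assumptions, not any real tool's API.

  ```python
  # Sketch of per-team AI-spend governance: flag teams approaching or
  # exceeding a monthly budget, before finance notices. All names and
  # numbers are hypothetical.
  from dataclasses import dataclass


  @dataclass
  class TeamBudget:
      team: str
      monthly_usd: float
      alert_at: float = 0.8  # warn once 80% of the budget is consumed


  def check_budgets(budgets, spend_by_team):
      """Return (team, status, spent) tuples for teams worth flagging."""
      alerts = []
      for b in budgets:
          spent = spend_by_team.get(b.team, 0)
          if spent >= b.monthly_usd:
              alerts.append((b.team, "over budget", spent))
          elif spent >= b.monthly_usd * b.alert_at:
              alerts.append((b.team, "approaching budget", spent))
      return alerts


  # A team at $8,500 of a $10,000 budget trips the 80% alert:
  print(check_budgets([TeamBudget("payments", 10_000)],
                      {"payments": 8_500}))
  # -> [('payments', 'approaching budget', 8500)]
  ```

  The same check could run as a scheduled job against billing exports, which is exactly how many teams already police cloud spend.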
- mattas: Wonder how many tokens would be saved if everyone just put "be brief" in their prompts.

  Also wonder if there is some perverse incentive for models to be verbose, to juice token counts.
- hybrid_study: The more I use Claude Code, the harder it is for me to believe this behavior is a byproduct of the model. Behavior = ridiculous token inefficiency.
- Cyphus:

  > what started as an experiment in productivity became a runaway success

  Successfully burning through cash and tokens, alright - but what have they gotten out of it?
- segmondy: They could have bought all their engineers their own massive GPUs. They could have built out their own DC. Nuts...
- theusus: It's GPT 5.5 and it still can't do exactly what I want. So I think companies should call AI a lost cause.
- dataranger: We run an agentic pipeline in a different domain (data sourcing), and the only way the math works is to be ruthless about which stages actually need which model.

  As a founder, the question I always have is: "what is the marginal value per token, relative to engineer-hours saved?" More of a gut feel at the moment, but it would be great to calculate.
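  One hedged way to turn that gut feel into a number: a prompt (or pipeline stage) is worth its tokens while the loaded cost of the engineer-hours it saves exceeds its token cost. The rates below are made-up placeholders, not benchmarks.

  ```python
  # Marginal value of a token spend, under hypothetical rates: positive
  # means the tokens paid for themselves, negative means they did not.
  def marginal_value(hours_saved: float, loaded_hourly_rate: float,
                     token_cost: float) -> float:
      """Dollars gained (or lost, if negative) by running the prompt."""
      return hours_saved * loaded_hourly_rate - token_cost

  # A $40 agent run that saves two hours of a $120/hr engineer's time:
  print(marginal_value(2.0, 120.0, 40.0))    # -> 200.0

  # The same $40 run that only saves fifteen minutes is a net loss:
  print(marginal_value(0.25, 120.0, 40.0))   # -> -10.0
  ```

  The hard part, of course, is estimating `hours_saved` honestly; the formula only makes the trade-off explicit.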
- ookblah: This is pointless without knowing what they are measuring. You could genuinely be moving faster, or you could be optimizing for engineers in a rat race to push more code because all their peers are now doing it - because those are the metrics you are measuring for "AI productivity".
- alansaber: There's a line past which unfettered spending is just wasteful, and we are well past that line.
- tribune: Might as well get while the getting is good and Anthropic is subsidizing the cost of compute.
- KolmogorovComp: Honest question: does Uber need that much R&D? And do they expect the ROI to be positive?
- DarenWatson: There is a major disconnect in that people think token usage is tied exclusively to human typing rates. It isn't. When software developers evolve to using self-managing CLI tools (like Claude Code - the source article mentions this), they are not merely chatting; they are unleashing loops of agency.

  When you enter one single inquiry - "find and fix the memory leak in the billing service" - you are not submitting just one query. The tool is searching through the entire code repository for relevant code, pulling 15 related files into context (easily 200k+ tokens), proposing a fix, running the test suite and failing, taking an entire stack trace of errors into context, and looping to keep iterating towards the solution. That process can loop many times (10+) in a very short period (within 5 minutes). While you grab a cup of coffee, you will have consumed $20 in token usage. At the enterprise level (like Uber), multiply that out by thousands of software developers using it as a personal shell tool, and your budget disappears very, very quickly.

  And on your point about the junior developer: comparing $100,000/year in tokens to hiring a junior developer is a false equivalency. The cost to a business of one junior engineer with a $100,000 salary is not just the $100,000 in salary, but also an additional $40,000+ in benefits, taxes, and hardware.

  You are also disregarding another cost of hiring junior engineers: mentorship. Each week, your senior and staff engineers spend hours mentoring junior engineers by reviewing their code, pairing with them, and unblocking their progress. Mentoring takes a substantial amount of time and is expensive for your business.

  The ROI of the $10,000 monthly expenditure on tokens is not so much about replacing the junior engineer with AI. Instead, the ROI is that your senior engineers can use the huge amount of compute to create boilerplate and tests, and refactor their code 3x quicker than if they had to mentor junior engineers. In addition, LLMs do not sleep, require one-on-ones, or leave for another company for 20% more pay in 18 months, just when their knowledge of the code base made them an asset to your business.

  Lastly, the main reason Uber has this problem: due to the UX of these agentic tools, developers think of the API calls made to the AI as free, and as a result treat them like a basic grep command.
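  The loop described above is easy to simulate. This sketch uses hypothetical token counts and pricing (roughly frontier-model input rates), and shows why cost grows super-linearly: the agent re-sends the whole accumulated context on every iteration.

  ```python
  # Why one "fix the memory leak" request is not one API call: each agent
  # iteration re-sends the entire accumulated context, so input tokens
  # grow roughly quadratically with the number of fix/test/fail loops.
  def agent_loop_cost(base_context: int, growth_per_iter: int,
                      iterations: int, usd_per_mtok: float = 15.0) -> float:
      total_input = 0
      context = base_context
      for _ in range(iterations):
          total_input += context        # whole context re-sent this turn
          context += growth_per_iter    # + stack traces, diffs, test logs
      return total_input * usd_per_mtok / 1_000_000

  # 200k tokens of pulled-in files, +20k per failed test run, 10 loops:
  print(f"${agent_loop_cost(200_000, 20_000, 10):.2f}")  # -> $43.50
  ```

  Ten iterations of a 200k-token context burn about 2.9M input tokens - the "$20 while you grab a coffee" order of magnitude, from a single request.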
- jeffbee: It's obvious that the word productivity has been used in this discussion to mean something other than the plain meaning of the word. If AI were productive, there would be no question about whether it could be afforded. If you're asking whether you can afford it, then it isn't productive, by definition.

  They are using it to mean a mechanism that produces prodigious amounts of toxic waste. That does not conform to the historical understanding of the word.
- robmay: Most people don't have the team and time to do heavy token-efficiency engineering. But that's all we do. marketplace.neurometric.ai has a bunch of task-specific small models, and we charge flat monthly fees. We bear the token risk.
- mkozlows: This terrible unsourced article seems to be citing this piece from The Information: https://www.theinformation.com/newsletters/applied-ai/uber-c...

  ...but the key fact about "$500-$2,000" per engineer does not appear there, and seems to be fabricated.
- tzury: What are the sources for the "facts" presented in this post?
- geetee: No mention of whether it actually improved outcomes.
- mancerayder: And here comes the reining in of spending. If companies are anywhere like what I'm seeing:

  1. Company mandate: start using AI.
  2. You're afraid? Here's a mandate!
  3. (Devs and others discover Claude Code features where the coolest burn mad tokens.)
  4. Um, yeah, we're going to have to take a look at the spend here.
  5. What's 5?

  We know steps 3 and 4 will cycle a bit more, and we know it's going to cost more - these were startup teaser costs.
- PessimalDecimal: Is this a submarine? https://paulgraham.com/submarine.html
- S0y: But did it make them more productive?
- bahmboo: There are no sources or references.
- somewhereoutth:

  > When developer productivity tools become so valuable that engineers blow the entire budget in four months, the issue isn't the tool but that the budget was invented too early to forecast this adoption curve.

  Where oh where can I find clients like these??
- wald3n: This doesn't work at all.
- AndrewKemendo: This continues to boggle my mind, so hopefully somebody can explain how this is happening.

  I've been using all these tools since they started popping up around 2021, personally and professionally. I've probably built four or five products at this point with assistance, not to mention the thousands and thousands of back-and-forth conversations for research, search, rubber-ducking, or whatever. I have never spent more than whatever the professional max plan is, which is consistently $20 a month.

  I asked a friend of mine who spent a couple hundred dollars in a few hours how they did it. The answer was that they basically got groups of agents stuck in a loop, constantly generating verbose bullshit that is never interrogated and doesn't produce any inspectable artifact, no matter how expert you are.

  The couple of stories I have heard of these massive, crazy spends are people literally just assuming these things can complete an entire human task in one shot, so they continue to hit the "spin the wheel" button until they get something closer to what they want.

  But I've yet to see that actually work, and it flies in the face of every instruction guide, piece of documentation, and prompt-engineering process that has been described over the last almost 5 years.
- taf2: I bet someone mentioned openclaw one too many times.
- AtNightWeCode: Uber must be the biggest tech company that got lucky with timing. They are so incredibly stupid and incompetent. How on earth do you end up with that cost for AI per user?
- dyauspitr: I don't understand. On the ChatGPT pro plan for $200/month, I am essentially running it 24/7, including nights, and I can barely get it under the 40% usage mark. Why are companies not using this?
- pstuart: My company has an all-you-can-eat policy, but I think we'd be well served by being thoughtful in optimizing usage, so that we keep the overall capabilities but don't burn extra tokens through sloppy use.
- jcgrillo: AI token austerity when?
- uncircle: Now AI slop factories make the HN front page?
- davidcann:

  > 70% of committed code originating from AI.

  How are they calculating that? They could be using my tool, Buildermark, but I don't think they are: https://buildermark.dev