
Comments (56)

  • cheschire
    Absolutely loved the article, the process, and the results. Hated the price. You could pay a human to read receipts, 1 every 30 seconds (that’s slow!), at $15/hr (twice the US federal minimum wage!). With tax and overhead ($15 x 1.35) that comes out to $20.25/hr, and over 5 hours, $101 all in. Sure, sure, a human solution doesn’t scale. But this sort of project makes me feel like we haven’t quite hit the industrialization moment that I thought we had.
  • hbarka
    It’s so exciting to read more and more articles like this, using LLMs to discover clever solutions. I mean, how many of us have dreamed of scanning years of receipts, waiting for that moment when you know a DIY solo application is at hand? I’m not being sarcastic; I too have a drawer full of Costco receipts, which to me are data waiting for insight, not just crinkly paper. It’s more than being clever, it’s the realization of using a device not as a tool, but as an equal partner who can suggest which tools and approaches to use. The end product of the LLM is not the point (although it can produce it better than ever), it’s the way an LLM can elevate messy knowledge work. A single person can now say that analysis knows no bounds.
  • ProllyInfamous
    > Everyone needs a rewarding hobby. I’ve been scanning all of my receipts since 2001. I never typed in a single price - just kept the images. I figured someday the technology to read them would catch up, and the data would be interesting. This is perhaps among the best openers I've ever read. [spoiler: the tech caught up, the data is interesting] I read a lot. This article, entirely.
  • egeozcan
    I usually avoid shallow comments but I feel like this time it has to be said as a conversation starter: That's a lot of eggs! Also, ignoring the benefits of subscriptions, an estimate on the order of thousands of dollars for extracting egg prices still makes me feel like we aren't "there" yet. This should have been a problem with a much more efficient solution given the advances in the AI, data analysis and OCR space. I am sort of disillusioned.
  • dinohlm
    The most surprising thing about this whole story is that he's been scanning all his receipts for the past 25 years. I've never heard of anyone doing this before and don't really know why you would want to. Still, it made for a somewhat interesting exploration of AI techniques.
  • PaulHoule
    I am amused that in the classic 1955 Asimov story https://en.wikipedia.org/wiki/Franchise_(short_story) the protagonist is interviewed as a one-man "focus group" in lieu of a national election, and one of the questions he is asked is "What do you think about the price of eggs?" and he says roughly "I have no idea, my wife does the shopping."
  • ismailmaj
    I don't know why people mess with Tesseract in 2026; attention-based OCRs (and more recently VLMs) have outperformed any LSTM-based approach since at least 2020. My guess is that it's the entry point to OCR and the internet is flooded with it, just like pandas for data processing.
  • rdiddly
    This is the perfect job for AI, in that it's handling work the human didn't care enough to do manually. Although of course I don't care either. No value judgment there, just an observation. Imagine a place - a field let's say, part of a farm, long ago, but it had a road built through it, and thereby became a non-place, a patch of ground nobody dwells in or pays attention to or cares about, because when they're on it they're always heading somewhere else. The AI phenomenon is like that.
  • EdNutting
    The AI writing of the article made me give up halfway through. It’s a neat idea but the writing style of these AI models is brain-grating, especially when it’s the wrong style choice for this kind of technical report.
  • PowerElectronix
    The inflation-adjusted data just tells us that either eggs have been outpacing the CPI for 25 years or that actual CPI is way higher than what the BLS calculates.
  • smcg
    Many states passed requirements for cage-free eggs that went into effect by the end of 2024, so that has had some effect on prices.
  • eeixlk
    Apart from the comical cost of extracting this data from paper receipts: is it more likely that stores will publish their product costs over time so trends can be observed, or that they will be more like gas stations where no prices are listed? I have no idea why a box of Cheerios costs $7 for processed oats, but I see millions of reasons to obscure that data.
  • MarceliusK
    Overall this feels less like a quirky egg project and more like a blueprint for how messy real-world data pipelines are going to look going forward.
  • tkgally
    I haven't tried it with receipts, but I've gotten excellent OCR results with Gemini 3.0 and now 3.1 on some challenging texts: handwritten letters I couldn't fully decipher myself, vertically printed Japanese texts with tiny furigana readings next to the kanji, a 19th century book in English with extensive use of italics and small caps. Gemini is good at extracting text and formatting from complex layouts, and it might work with egg receipts, too.
  • gib444
    > Estimated token cost $1,591 I can assume this person does in fact NOT need to worry about the price of eggs?
  • flurb
    Great article through and through. The total number of places you've bought eggs at made me feel a tad depressed though: 4 places where you lived or spent a longer time, 5 you traveled to*. I tend to grow bored of a location after a year or two, though I'm certainly in the minority. * Of course you didn't buy eggs every time you traveled somewhere, so probably not the entire truth.
  • Metacelsus
    And if the price reflected the externalities of factory farming, eggs would be even more expensive!
  • brcmthrowaway
    Question: Do big chat providers make a tool call to a dedicated OCR engine, or is OCR part of the LLM?
  • sgbeal
    > Estimated token cost $1,591 > Confirmed egg receipts 589 > Total egg spend captured $1,972 > Total eggs 8,604 ... > I can’t wait to see what 30 years of eggs looks like. At $2.70 per receipt, I'd be in no hurry to find out!
  • BoredPositron
    There is a reason why receipt transcription is still the task with the highest demand on Mechanical Turk.
  • DeathArrow
    Without 25 years of photographing receipts, weeks of agents coding and billions of tokens spent, I can predict that egg prices increased, and that the graph of my egg consumption over time is concave, partly because my income has risen, partly because while all prices get inflated, eggs are still cheaper than other sources of protein, and I did it in less than 1 microsecond. I will use them tokens to be able to afford more eggs.
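
Several comments above do back-of-the-envelope cost math (cheschire's human-reader estimate, sgbeal's per-receipt token cost). A minimal Python sketch that reproduces that arithmetic follows. Every number is copied from the comments; the function names are hypothetical and only for illustration. Note that 589 counts confirmed egg receipts only, and the thread does not state how many receipts were scanned in total, so the comparison is rough.

```python
# All figures are quoted from the comments above; nothing is measured here.
# Function names are hypothetical helpers, not part of any tool in the article.

def human_reading_cost(wage_per_hour, overhead_multiplier, hours):
    """Return (loaded hourly rate, total cost) for a person reading receipts."""
    loaded_rate = wage_per_hour * overhead_multiplier
    return loaded_rate, loaded_rate * hours

def receipts_read(hours, seconds_per_receipt):
    """How many receipts fit in the given time at a fixed reading speed."""
    return int(hours * 3600 // seconds_per_receipt)

def per_unit(total, count):
    """Plain average: total divided by count."""
    return total / count

if __name__ == "__main__":
    # cheschire: $15/hr, 1.35x tax and overhead, 5 hours, 30 s per receipt
    rate, total = human_reading_cost(15.00, 1.35, 5)
    print(f"loaded rate: ${rate:.2f}/hr, total: ${total:.2f}")
    print(f"receipts in 5 h at 30 s each: {receipts_read(5, 30)}")

    # sgbeal: $1,591 token cost over 589 confirmed egg receipts
    print(f"token cost per egg receipt: ${per_unit(1591, 589):.2f}")

    # From the quoted stats: $1,972 egg spend over 8,604 eggs
    print(f"spend per egg: ${per_unit(1972, 8604):.2f}")
```

Under these quoted assumptions, 5 hours at 30 seconds per receipt covers 600 receipts, close to the 589 egg receipts, which is presumably why the two estimates get compared directly in the thread.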