comments (10)

  • I’ve always thought the US Postal Service is such a technological marvel. They somehow manage to identify and route billions of pieces of mail, and I have to imagine their tech is significantly more primitive than this. Not only that, but US addresses are absurdly non-standardized: you can often write the same address multiple ways and have it delivered to the same location. I’m sure there’s plenty of published knowledge in this area, but whenever I see announcements about OCR it feels like this should be a solved problem if it’s been accomplished at the scale of USPS for many years.

    ericyd

  • A tangential observation: the video on the linked page wasn't what I expected. I thought Mistral was a European AI company, so I didn't expect the video to be filmed in San Francisco featuring three people who don't seem to be European.

    I'm not against them being a global organization, that's wonderful. I was just surprised. I expected a Parisian office and European accents.

    andrewmutz

  • It'll be interesting to see how this ranks against https://github.com/baidu/Unlimited-OCR

    mdrzn

  • It's cheap at $4/1k, but I'm hesitant to even benchmark this one again since the previous versions were all "98% accurate based on internal benchmarks of 4 pdfs" and ended up falling short of almost everything else on the market [1].

    Even in this one, they just report that OlmOCRBench and OmniDocBench have "known limitations" and that's why they report flagship numbers from their internal benchmark.

    [1] https://getomni.ai/blog/benchmarking-open-source-models-for-...

    themanmaran

  • All AI labs really need to stop using truncated y-axes for benchmark bar charts...

    https://mistral.ai/_astro/cm-engish_ZhlvoT.webp?dpl=6a3a94bd...

    beklein

  • Are there any open models focused on LPR (license plate recognition)?

    I have found some old ones, but I'm curious whether new ones are being developed like this OCR model. I may even try it for that purpose and see if it does well.

    zhivota

  • Tested with Malayalam: normal handwriting was recognized accurately, but a slightly different style was detected as Kannada. I have samples if required; Sarvam handled them with 99% accuracy, leaving one text error.

    sreekanth850

  • There's little on the differences from their previous OCR v3 model from December, other than bounding boxes and double the price - https://mistral.ai/news/mistral-ocr-3/ - and other benchmarks were used back then.

    mcbetz

  • "A note on out-of-scope use. OCR 4 is a document-understanding model, not a decision-maker. It is not intended for medical diagnosis, legal advice or judgment, high-stakes financial decisions, safety-critical systems, real-time/latency-sensitive processing, or non-document inputs (raw audio, video, etc.)."

    Can't wait for the "oh so innovative" manager who will suggest during the next meeting "Ok... but what if WE used it for high-stakes financial decisions on non-document inputs like a photo from my phone?"

    I guarantee you somebody on HN is going to comment about this "idea" next week.

    utopiah

  • Recently I tried OCR with Opus 4.8. (I know, not technically the right tool for the job.) All I needed to do was extract dates from receipts. It got about 20% of the dates wrong yet rated all of them as “high confidence”.

    I probably should have tried a more OCR-specific model.

    Insanity