One real estate finance customer cut manual review time by 80% 🤯
Rent rolls are huge multi-page tables with dozens of columns.
Most tools fail:
❌ Headers only on the first page
❌ Rows break across pages
❌ Layouts shift mid-doc
Agentic document processing solves this!
We just rolled out a new agentic loading experience! And you can actually see it thinking.
🔄 Each document flows through layout correction, contrast enhancement, OCR selection, structure recognition, and more.
Check it out, and watch it think before your eyes. 👀
One of the trickier document types we’ve worked on lately is the multi-page table. Header’s on page 1. Data’s on page 5.
Agentic document processing tracks header context across pages and keeps your table structured from start to finish.
Drop a comment to see it on your documents
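To make "tracking header context across pages" concrete, here is a minimal sketch of the simplest version of the idea: remember the header row from the first page and apply it to every later page. The function name `stitch_pages` and the input shape (each page as a list of rows of cell strings) are assumptions for illustration, not OmniAI's actual pipeline, and the sketch does not handle rows that break across pages or layouts that shift mid-document.

```python
from typing import Optional


def stitch_pages(pages: list[list[list[str]]]) -> list[dict[str, str]]:
    """Merge per-page table rows into records keyed by the first page's header.

    Assumption: the first row of the first non-empty page is the header.
    """
    header: Optional[list[str]] = None
    records: list[dict[str, str]] = []
    for page in pages:
        for row in page:
            if header is None:
                # First row seen becomes the header carried across all pages.
                header = row
                continue
            if row == header:
                # Some documents repeat the header on later pages; skip repeats.
                continue
            records.append(dict(zip(header, row)))
    return records


pages = [
    [["Unit", "Tenant", "Rent"], ["101", "A. Lee", "1000"]],  # header only here
    [["102", "B. Cruz", "1100"]],                             # no header
    [["Unit", "Tenant", "Rent"], ["103", "C. Diaz", "1200"]], # repeated header
]
records = stitch_pages(pages)
```

Every record keeps the same column names no matter which page its row came from, which is the property a reviewer needs when the header only appears once.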
Bounding boxes for extracted fields are here! 🔍
Now you can see exactly where each extracted field came from.
🔍 It draws a box around the extraction
📄 Auto-scrolls to it
⚙️ Works on all documents
Personally, this has made reviewing extractions way easier!
A new round of OpenAI models (4.1 series) came out today and GPT is finally back in the running. But not with the model you'd expect...
Turns out 4.1-mini dominates when it comes to document understanding.
Gemini 2.5 Pro is still top-ranked, but at a higher cost and latency.
Excited to welcome Terry as OmniAI’s founding Growth! He built and scaled BuildStream (YC S19) and brings a ton of experience in driving growth from 0 to 1.
💡 Fun fact: The twinning at Omni continues. Terry isn’t a twin himself, but he’s the proud dad of twin daughters and a son!
It’s difficult to measure document extraction accuracy!
Our benchmark compares the OCR / extraction JSON to the ground truth JSON, and then calculates the number of JSON differences divided by the total fields in the ground truth JSON.
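As a rough illustration of that calculation, the sketch below flattens both JSON documents to leaf paths, counts the paths whose values differ (or that exist on only one side), and divides by the number of ground-truth fields. The helper names `flatten` and `json_accuracy`, the path format, and reporting accuracy as one minus the difference ratio are all assumptions made for this example; the benchmark's real diffing rules may differ.

```python
from typing import Any


def flatten(value: Any, prefix: str = "") -> dict[str, Any]:
    """Flatten nested JSON into {path: leaf_value}. Empty containers yield no leaves."""
    if isinstance(value, dict):
        items = [(f"{prefix}.{k}" if prefix else str(k), v) for k, v in value.items()]
    elif isinstance(value, list):
        items = [(f"{prefix}[{i}]", v) for i, v in enumerate(value)]
    else:
        return {prefix: value}
    flat: dict[str, Any] = {}
    for path, child in items:
        flat.update(flatten(child, path))
    return flat


def json_accuracy(predicted: Any, truth: Any) -> float:
    """Return 1 - (differences / ground-truth fields), clamped at 0."""
    truth_flat = flatten(truth)
    pred_flat = flatten(predicted)
    if not truth_flat:
        return 1.0 if not pred_flat else 0.0
    missing = object()  # sentinel so a missing field never equals a real value
    diffs = sum(
        1
        for path in truth_flat.keys() | pred_flat.keys()
        if pred_flat.get(path, missing) != truth_flat.get(path, missing)
    )
    return max(0.0, 1 - diffs / len(truth_flat))


truth = {"tenant": "Acme", "rent": 1200,
         "lease": {"start": "2024-01-01", "end": "2024-12-31"}}
predicted = {"tenant": "Acme", "rent": 1250,
             "lease": {"start": "2024-01-01"}}
score = json_accuracy(predicted, truth)  # 2 differences over 4 fields
```

In this example the wrong rent and the missing lease end date count as two differences against four ground-truth fields. Extra fields in the prediction also count as differences here, which is why the result is clamped at zero.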
Comment with providers you’d like to see.
Introducing the Omni OCR Benchmark, the most comprehensive evaluation of OCR tools.
We evaluated traditional OCR providers and multimodal LLMs across 1,000 documents for accuracy, cost, and speed with an open-source approach.
See how each provider ranks: getomni.ai/ocr-benchmark
What do you do when someone sends you a PDF with 12,640 rows you've got to extract? OmniAI automates it in minutes, no code!
Companies spend hundreds of engineering hours figuring out how to parse that data.
We can’t stop PDFs, but at least we can turn them into real data!
We just added Gemini 2.0 Flash to Zerox! ⚡️
These are early results from our VLM benchmark. While it still has a ways to go on the accuracy side (about 80%), it easily beats GPT-4o and traditional OCR providers like AWS Textract and Unstructured.
And it's cheap!
New year, new space for OmniAI! Today’s our first official day in the new office 🚚 Super excited to make great memories, close big deals, and host amazing events!
We’re still furnishing it, but of course we have all the necessities covered - coffee, monitors, and whiteboards ☕️
Wow, 7 days since the last post and Zerox went from 8,000 to 9,000 stars! 🚀 We didn't even have time to ship the extra features we promised last week 🤣
Coming soon:
- Structured schema extraction
- Edge detection & cropping
- More model options (including Deepseek and Qwen!)
Exciting updates are coming from OmniAI this Q1.
Here’s what’s on the way:
1️⃣ Document fine-tuning
2️⃣ Benchmark
3️⃣ Excel & Google Sheets Plugin
We’re also growing! We’re hiring for both growth and engineering roles.
Can’t wait to share all we’ve been working on.
Happy Q1! 🚀
We hit 7,000 stars on our OCR library! Only 3k to go till the double digits.
It's been awesome seeing the community traction on Zerox. Turns out everyone's got documents!
Up next on the roadmap:
- Structured schema extraction
- Llama 3.2 support
- Dockerized deployment