Latest blog on teaching VLMs to understand fine-grained objects.
VLM-FO-1 is equipped with a novel object-enhanced vision tower and achieves remarkable object understanding performance with only 3B parameters. Larger and RL-enhanced models are coming soon.
om-ai-lab.github.io/2025_08_15.html
1/3: 🚀 Thrilled to share VLM-R1’s latest results! After hitting SoTA in REC & Math, we’ve supercharged RL for open vocab detection (OVD).
TL;DR: With the right rewards, RL-powered VLM nails SoTA on OVD + sparks cool "aha" moments.
Dive in: om-ai-lab.github.io/2025_03_20.html
🌟 VLM-R1 just got SUPERCHARGED!🚀
🔥 Multi-Node Training for GRPO: Scale training across clusters! Tackle massive vision-language tasks 2x faster with our new multinode_training_demo.sh script.
🎛️ Fine-Grained Parameter Control: Tweak num_iterations for high-precision…
🚀 OmAgent v0.2.4 is here with exciting new features!
🔹 OmAgent Lite mode: No more dependency on Conductor or other middleware! It’s fully Python-based and supports local execution. Just set OMAGENT_MODE=lite to get started.
🔹 All examples now default to Lite mode – no need…
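For readers who want to try the Lite mode switch from Python rather than the shell, here is a minimal sketch. Only the variable name OMAGENT_MODE and the value lite come from the announcement above; the helper name enable_lite_mode is hypothetical, and setting the variable before OmAgent is imported is an assumption about when the framework reads it.

```python
import os


def enable_lite_mode(env=None):
    """Set OMAGENT_MODE=lite in an environment mapping.

    `env` defaults to the current process environment (os.environ).
    This helper is illustrative and is not part of OmAgent itself.
    """
    target = os.environ if env is None else env
    target["OMAGENT_MODE"] = "lite"
    return target["OMAGENT_MODE"]


# Assumption: call this before importing OmAgent so the setting is
# already in place if the framework reads it at import time.
enable_lite_mode()
```

The shell equivalent is exporting OMAGENT_MODE=lite before launching an example.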
We added an HF demo Space to showcase the reasoning path. Although it is not perfect yet, some reasonable rationales do emerge from the R1-style learning.
huggingface.co/spaces/omlab/V…
Introducing VLM-R1!
GRPO has helped DeepSeek R1 learn to reason. Can it also help VLMs perform better on general computer vision tasks?
Our preliminary answer is YES and it generalizes better than SFT.
github.com/om-ai-lab/VLM-…
OPENAI ROADMAP UPDATE FOR GPT-4.5 and GPT-5:
We want to do a better job of sharing our intended roadmap, and a much better job simplifying our product offerings.
We want AI to “just work” for you; we realize how complicated our model and product offerings have gotten.
We hate…
At EMNLP, I attended a great agent tutorial hosted by @ysu_nlp, @Diyi_Yang, and @ShunyuYao12, and learned a lot about language agents and the progress in agentic ops, e.g. ReAct, CoT, etc. How do these methods perform given the same model? I did a study to find out! github.com/om-ai-lab/open…
I just created this cool RPG text-based game with a dynamic storyline that adapts based on your choices — and it only took me 5 minutes! ⏳
While it looks complex, OmAgent made it incredibly easy to build and customize.
Best part? You can connect it to your phone and interact…
🚀 OmAgent now supports calling DeepSeek r1 through @ollama!
This means you can integrate DeepSeek’s incredible model into your workflows seamlessly using OmAgent.
🎉Thanks to @deepseek_ai for training such an excellent model at such a low cost, and it's completely…
Came across some amazing work by @OmAI_lab! 🔥
✨ Open Agent Leaderboard: Track the best AI agents in one place!
huggingface.co/spaces/omlab/o…
✨ OmAgent: A multimodal agent framework for video understanding, handles everything from CCTV to full-length films 🎥…