PhD student @VT_CS, supervised by @tuvllms. Interested in search-augmented LLMs. Ex AI resident @VinAI_Research
thinhphp.github.io
Blacksburg, VA
Joined July 2023
New paper! We explore a radical paradigm for AI evals: assessing LLMs on *unsolved* questions.
Instead of contrived exams where progress ≠ value, we eval LLMs on organic, unsolved problems via reference-free LLM validation & community verification. LLMs solved ~10/500 so far:
Excited to share that our paper on efficient model development has been accepted to #EMNLP2025 Main conference @emnlpmeeting. Congratulations to my students @linusdd44804 and @Sub_RBala on their first PhD paper! 🎉
A few weeks ago, I started a new job at @OpenAI. I wrote a document about my interview process and recommendations for anyone on the job market for AI research positions. I hope it's helpful!
docs.google.com/document/d/1ZV…
1/n I’m thrilled to share that our @OpenAI reasoning system scored high enough to achieve gold 🥇🥇 in one of the world’s top programming competitions - the 2025 International Olympiad in Informatics (IOI) - placing first among AI participants! 👨‍💻👨‍💻
Most search models need the cloud.
II-Search-4B doesn’t.
4B model tuned for reasoning with search tools, built for local use.
Performance of models 10x its size.
Search that is small, smart, and open.
🥳Congrats @ii_posts for an impressive result on SEAL-0, a challenging benchmark for search-augmented LLMs.
🤩Looking forward to the evaluation standards it shapes in this field.
📚Read more: arxiv.org/abs/2506.01062
. @EMostaque came back on the show to chat about:
--how we can't compete against AI agents
--his solution for a POSITIVE AI world
--Why UBI won't work but UBAI might..
--we need to be focused on incentivizing the right outcomes
-- Nations need sovereign AI stacks or…
We just released the evaluation of LLMs on the 2025 IMO on MathArena! Gemini scores best, but is still unlikely to achieve the bronze medal with its 31% score (13/42). 🧵(1/4)
Tokenization has been the final barrier to truly end-to-end language models.
We developed the H-Net: a hierarchical network that replaces tokenization with a dynamic chunking process directly inside the model, automatically discovering and operating over meaningful units of data
🔥 SEAL-0 Leaderboard 📈
Our results on SEAL-0 show a large room for improvement in LLMs' ability to reason over conflicting evidence. 🤯
👉Check out our paper: arxiv.org/abs/2506.01062
👉Dataset: huggingface.co/datasets/vtllm…
✨ New paper ✨
🚨 Scaling test-time compute can lead to inverse or flattened scaling!!
We introduce SealQA, a new challenge benchmark w/ questions that trigger conflicting, ambiguous, or unhelpful web search results. Key takeaways:
➡️ Frontier LLMs struggle on Seal-0 (SealQA’s…
17K Followers 6K Following
Neurodivergent physics student with a keen interest in multisensory integration and emergent perception. Exploring research on a proposed ‘sixth sense’. Δ
3K Followers 6K Following
Humanist technologist and AI optimist. Currently CTO at @welcomeaccount_. Building for an inclusive economy through #AI, #MachineLearning, and #Tech4Good
83K Followers 8K Following
Compiling in real-time, the race towards AGI.
🗞️ Don't miss my daily top 1% AI analysis newsletter directly to your inbox 👉 https://t.co/6LBxO8215l
432 Followers 4K Following
AI Enthusiast | Trying to launch own AI app, currently @ Building something cool, Be Kind and Curious #AGI #ASI
#Bitcoin #NRG Forever
522 Followers 7K Following
Founder @Setica —
🌐 https://t.co/k41rINekVX. Alien on planet Earth. AI researcher and indie developer (web & apps). Building AI models, AI agents, and SaaS apps
2K Followers 2K Following
NLP postdoc at @SheffieldNLP
Ex @Imperial_NLP PhD, @Apple AI/ML Scholar, @UCL MSc
Model robustness and now uncertainty quantification
242K Followers 21 Following
We’ll help you make it like nobody’s business. Multimodal media generation and editing tools to get your idea to production. Self-deploy? 👍 Need a partner? 🤝
23K Followers 110 Following
Mathematician, @UCBerkeley professor, author of LOVE & MATH (published in 20 languages), host of AfterMath series on YouTube, music explorer as DJ Moonstein
54K Followers 0 Following
We are building a world class AI R&D company in Tokyo. We want to develop AI solutions for Japan’s needs, and democratize AI in Japan. https://t.co/1q07mb3TzE
14K Followers 138 Following
Cofounder/CEO @Genspark_ai | Serial entrepreneur, built business from 0 to $5.5B | Ex-CPO @Baidu Search, Ex-Principal Dev Mgr @Microsoft Bing
19K Followers 11 Following
Bot. I daily tweet progress towards machine learning and computer vision conference deadlines. Maintained by @chriswolfvision.
19K Followers 1K Following
Agents @Meta MSL TBD Lab. previously posttraining research @OpenAI train LLMs to do things: deep research, chatgpt agent, etc. CS PhD @LTIatCMU
37K Followers 565 Following
Assistant professor at Stanford; Co-founder of Voyage AI (https://t.co/wpIITHLgF0);
Working on ML, DL, RL, LLMs, and their theory.
26K Followers 876 Following
Research Scientist Director in Meta FAIR. Reasoning, Optimization and Understanding LLM. Novelist in spare time. PhD in @CMU_Robotics.
4.3M Followers 3 Following
OpenAI’s mission is to ensure that artificial general intelligence benefits all of humanity. We’re hiring: https://t.co/dJGr6Lg202
45K Followers 1K Following
AI Developer Experience @GoogleDeepMind | prev: Tech Lead at @huggingface, AWS ML Hero 🤗 Sharing my own views and AI News 🧑🏻‍💻 https://t.co/7IosdlNz22
20K Followers 9K Following
Programme Director @ARIA_research | accelerate mathematical modelling with AI and categorical systems theory » build safe transformative AI » cancel heat death