Applications are now open for the next cohort of the Cohere Labs Scholars Program! 🌟
This is your chance to collaborate with some of the brightest minds in AI & chart new courses in ML research. Let's change the spaces where breakthroughs happen.
Apply by Aug 29.
Excited to reveal what I've been working on for the last few months. Command-A-Vision is our new flagship 112B VLM that outperforms Llama 4 Maverick, Mistral Medium/Pixtral Large, GPT 4.1, and others. We release weights on HF huggingface.co/blog/CohereLab… and hope you'll like it.
This is very cool. One of the reasons I think muP hasn't caught on is that it is not seamlessly integrated with torch. Optax can make some things annoying, but this one is nice :)
How can we make language models more flexible to adapt to new languages after pretraining? 🌏
🧠 Our latest work investigates whether a tokenizer trained on more languages than the pretraining target can improve language plasticity without compromising pretraining performance.
🔠 UTF-8 was never meant for language models.
Yet every major tokenizer still uses it, creating unfair "byte premiums".
Why should your native script cost more to tokenize? It's time for a change. 🧵👇
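To make the "byte premium" idea concrete, here is a minimal sketch, assuming the term refers to how many more UTF-8 bytes some scripts need per character than others. The sample greetings and the helper name `utf8_bytes_per_char` are illustrative choices, not taken from the thread, and the sketch measures raw bytes only, not the output of any particular tokenizer.

```python
# Illustrative sample strings (an assumption for this sketch, not from the thread).
samples = {
    "English": "hello",
    "Greek": "γεια",
    "Hindi": "नमस्ते",
    "Japanese": "こんにちは",
}


def utf8_bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per Unicode code point in `text`."""
    return len(text.encode("utf-8")) / len(text)


for name, text in samples.items():
    n_bytes = len(text.encode("utf-8"))
    print(f"{name}: {len(text)} code points, {n_bytes} bytes, "
          f"{utf8_bytes_per_char(text):.1f} bytes per code point")
```

ASCII letters take one byte each in UTF-8, Greek letters take two, and Devanagari and Japanese kana take three, so a byte-level tokenizer starts from a longer sequence for those scripts before any merges are applied.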
I'm excited to share our new pre-print
ShiQ: Bringing back Bellman to LLMs!
arxiv.org/abs/2505.11081
In this work, we propose a new, Q-learning inspired RL algorithm for finetuning LLMs 🎉
(1/n)
1/ Science is only as strong as the benchmarks it relies on.
So how fair—and scientifically rigorous—is today’s most widely used evaluation benchmark?
We took a deep dive into Chatbot Arena to find out. 🧵 https://t.co/8RuGVBXBZH
103 Followers 1K Following
*Looking for PhD and jobs!*
MSc in Math from Regensburg; BSc in Math and CS minor from Utah
AI/ML/DL & Extremal combinatorics and theoretical computer science
46K Followers 1K Following
Writer https://t.co/TquuQXlLOJ. O'Reilly Author https://t.co/Fl3uPAZHLg. LLM Builder @Cohere. Visualizing AI one concept at a time.
112 Followers 66 Following
Former Senior AI researcher @Aleph__Alpha
EVE Online player since 2013
Co-Founder Pageshift Entertainment - Building the worst best story telling AI
1K Followers 614 Following
Member of Technical Staff in Retrieval-Augmented Generation Team @cohere, previously PhD in neural Information Retrieval @tu_wien
7K Followers 652 Following
Research Scientist @AIatMeta
Previously Researcher @ Samsung AI
Outstanding Paper Award @icmlconf 2023
Action Editor @TmlrOrg
I tweet about ML papers and math
1K Followers 301 Following
Asst Professor at @JohnsHopkins (@JohnsHopkinsAMS and @HopkinsDSAI). Previously: @SimonsInstitute, @oxfordstats, @Polytechnique. I like to scale up things!
6K Followers 695 Following
RL Scientist @OpenAI. Prev. co-founder @diffeo, acquired by @salesforce // co-authored The Principles of Deep Learning Theory // studied gravity.
18K Followers 4K Following
Associate Professor at UC Berkeley. Former Research Scientist at Google DeepMind. ML/AI Researcher working on foundations of LLMs and deep learning.
4K Followers 20 Following
At Essential AI, we're building an open platform to democratize frontier AI capabilities and accelerate breakthroughs globally through collaborative science.
3K Followers 216 Following
Neural network speedrunner and community-funded open source researcher. Set the CIFAR-10 record several times. Send me consulting/contracting work!
46K Followers 2K Following
Senior correspondent covering AI @WIRED • Subscribe to my newsletter https://t.co/jxLAFHz8UP • Robison (rah-beh-son) not Robinson • Send tips on Signal @ kylie.01
30K Followers 123 Following
Mechanistic Interpretability lead DeepMind. Formerly @AnthropicAI, independent. In this to reduce AI X-risk. Neural networks can be understood, let's go do it!