Cohere intends to acquire Perplexity immediately after Perplexity's own acquisitions of TikTok and Google Chrome close.
We will continue to monitor the progress of those deals closely so we can submit our term sheet upon completion.
Thoughts: LLMs provide powerful priors for RL, but several recent studies suggest the gains often come from simply narrowing the model's output distribution, which improves measured performance while exhausting the model's capacity for further exploration and improvement.
*Is it a blessing or a curse?* 🤔
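A quick numeric sketch of the "narrowing" point (my own illustration, not from any of the studies alluded to above): sharpening a softmax policy, here via temperature, raises the top probability while collapsing entropy, which is exactly the exploration budget RL relies on.

```python
# Minimal sketch: a sharper (lower-temperature) softmax policy has lower
# entropy, so it answers more confidently but explores less.
import math

def softmax(logits, temperature=1.0):
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)

logits = [2.0, 1.0, 0.5, 0.1]  # hypothetical next-token logits

for t in (1.0, 0.5, 0.1):  # lower temperature = narrower output distribution
    p = softmax(logits, temperature=t)
    print(f"T={t}: top prob={max(p):.3f}, entropy={entropy(p):.3f} nats")
```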
I want to applaud @OpenAI on releasing GPT-4.5. It's not a benchmark beater, and they released it anyway.
That takes some courage on their part, because they will get a lot of dumb criticism on eval scores.
(If you think it needs to top evals to be valuable, you are wrong.)
A Reddit grandfather uploaded a 27-year-old EXE of a Visual Basic game, and Claude one-shotted a Python recreation of the game in under 5 minutes!!
From the binary.
Excited to be with the team in NYC today rolling out the new Alexa+.
Across Amazon, we’re harnessing the transformative power of GenAI to reimagine the experiences we offer customers, and Alexa+ is the latest example.
She’s smarter, more capable, more personalized, and unlike…
Say hello to Alexa+. Need dinner plans? She'll book your favorite restaurant, grab an Uber, and text your sitter — all in one conversation. Want concert tickets? She'll scout for the best prices. Need to check if the garbage went out? She'll find that exact Ring clip in seconds.…
Let me add a bit of context to the latest DeepSeek code release, as I feel it was a bit bare-bones.
Mixture-of-Experts (MoE) is a simple extension of transformers that is rapidly establishing itself as the go-to architecture for mid-to-large LLMs (20B-600B parameters).
It…
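For readers new to the architecture, here is a minimal sketch of a top-k routed MoE feed-forward block in PyTorch; the sizes, the linear router, and k=2 routing are illustrative defaults of mine, not DeepSeek's actual configuration.

```python
# Sketch of a top-k MoE layer: a router scores experts per token, each token
# is processed by only its top-k experts, and outputs are gate-weighted.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                              # x: (tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)       # (tokens, n_experts)
        weights, idx = gate.topk(self.k, dim=-1)       # route to top-k experts
        weights = weights / weights.sum(-1, keepdim=True)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            token_ids, slot = (idx == e).nonzero(as_tuple=True)
            if token_ids.numel():                      # tokens routed to expert e
                out[token_ids] += weights[token_ids, slot, None] * expert(x[token_ids])
        return out

tokens = torch.randn(16, 512)
print(MoELayer()(tokens).shape)  # torch.Size([16, 512])
```

The key property: each token activates only k of the n experts, so total parameter count scales with n while per-token compute scales only with k.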
🚀 Day 2 of #OpenSourceWeek: DeepEP
Excited to introduce DeepEP - the first open-source EP communication library for MoE model training and inference.
✅ Efficient and optimized all-to-all communication
✅ Both intranode and internode support with NVLink and RDMA
✅…
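DeepEP's own API isn't shown in the announcement, so as a conceptual stand-in, here is a plain torch.distributed sketch of the expert-parallel all-to-all pattern such a library accelerates; `ep_dispatch` and all shapes are my own illustrative choices, not DeepEP code.

```python
# Not DeepEP's API — a generic torch.distributed sketch of EP dispatch:
# each rank ships its tokens to the ranks hosting the experts those tokens
# were routed to, and receives the tokens destined for its local experts.
import torch
import torch.distributed as dist

def ep_dispatch(tokens, dest_rank, world_size):
    """tokens: (n, d) on this rank; dest_rank: (n,) target EP rank per token."""
    order = dest_rank.argsort()                    # bucket tokens by destination
    send = tokens[order].contiguous()
    in_splits = torch.bincount(dest_rank, minlength=world_size)
    out_splits = torch.empty_like(in_splits)
    dist.all_to_all_single(out_splits, in_splits)  # exchange bucket sizes first
    recv = tokens.new_empty(int(out_splits.sum()), tokens.shape[1])
    dist.all_to_all_single(recv, send,
                           output_split_sizes=out_splits.tolist(),
                           input_split_sizes=in_splits.tolist())
    return recv  # tokens this rank's local experts should process

if __name__ == "__main__":
    # launch with: torchrun --nproc_per_node=2 ep_dispatch.py
    # CPU demo backend; at scale you'd use nccl with CUDA tensors (NVLink/RDMA)
    dist.init_process_group("gloo")
    ws = dist.get_world_size()
    tokens, dest = torch.randn(8, 4), torch.randint(0, ws, (8,))
    print(dist.get_rank(), ep_dispatch(tokens, dest, ws).shape)
    dist.destroy_process_group()
```

The return trip (gathering expert outputs back to the token-owning ranks) is the same all-to-all with the splits reversed; DeepEP's contribution is making these exchanges fast over NVLink intranode and RDMA internode.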
🚀 Day 0: Warming up for #OpenSourceWeek!
We're a tiny team @deepseek_ai exploring AGI. Starting next week, we'll be open-sourcing 5 repos, sharing our small but sincere progress with full transparency.
These humble building blocks in our online service have been documented,…
After 6+ months in the making and burning over a year of GPU compute time, we're super excited to finally release the "Ultra-Scale Playbook"
Check it out here: hf.co/spaces/nanotro…
A free, open-source book to learn everything about 5D parallelism, ZeRO, fast CUDA kernels,…
imo the improvements on FrontierMath are even more impressive than ARC-AGI. Jump from 2% to 25%
Terence Tao said the dataset should "resist AIs for several years at least" and "These are extremely challenging. I think that in the near term basically the only way to solve them,…
@_philschmid @amazon Hmmm, it's actually not bad. Tried my standard Martian railgun test.
Prompt:
calculate how long a mass driver rail would need to be to accelerate people comfortably at max 2Gs on mars travelling along the slope of and launching from the top of mount olympus mons and what speed…
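For reference, a back-of-envelope check of what the prompt asks (my assumptions, not the model's answer): take the target speed to be Mars escape velocity (~5.03 km/s; the ~21 km summit altitude of Olympus Mons changes it only slightly), cap acceleration at 2 g, and apply constant-acceleration kinematics v² = 2aL.

```python
# Sanity-check numbers for the prompt: constant acceleration, ignoring the
# thin Martian atmosphere. Assumed target speed: Mars escape velocity.
g = 9.81            # m/s^2, Earth gravity defines the "2G" comfort limit
a = 2 * g           # 19.62 m/s^2
v = 5.03e3          # m/s, Mars escape velocity (assumption, see above)

L = v**2 / (2 * a)  # rail length from v^2 = 2*a*L
t = v / a           # time spent accelerating on the rail

print(f"rail length ~ {L/1e3:.0f} km, time on rail ~ {t:.0f} s")
# rail length ~ 645 km, time on rail ~ 256 s
```

That puts the rail at roughly 645 km with a bit over four minutes on the track, a useful yardstick for judging the model's answer.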
Excited to share what the team has been cooking up recently. A few more big things on the horizon! 👀
PS: Please excuse the error with bold numbers in the scores.
417 Followers · 176 Following · PhD in ML, now AI Research Lead in 🇱🇺. Here mostly AI, including sharing paper reviews. Chess, philosophy, and a travel pic may appear. Opinions are my own.
2K Followers · 839 Following · Assistant Professor at @BristolUni, PhD from @UCL, prev. intern at @TikTok & @Microsoft. ✨ Reinforcement Learning, Causality, World Models.
3K Followers · 3K Following · Fixing machine learning @ https://t.co/x06CbGClKL. There is no AGI without energy-based models. As seen on HN: https://t.co/WpbTAjLvPv
17K Followers · 929 Following · Co-founder and CTO of @CoreViewHQ. GenAI/LLM addicted, Apple MLX, Microsoft 365, Azure, Kubernetes, investor in innovation, and Mensa member.
3K Followers · 342 Following · I'm a software engineer building high-performance kernels and compilers at Anthropic! Previously at Facebook/Meta (PyTorch, HHVM, ReDex)
5K Followers · 2K Following · Assistant Prof @CIS_Penn and ML Researcher at @Apple (MLR) | ex-FAIRer | PhD @HKUniversity | Research on generative AI for multimodal. I can also speak Japanese.
4K Followers · 2K Following · Researcher at @MSFTResearch. Prev: PhD at @Mila_Quebec, intern at @Apple MLR and FAIR Labs @MetaAI, math undergraduate at @PKU1898.
49K Followers · 9K Following · I lead @Cohere_Labs. Formerly Research @Google Brain @GoogleDeepmind. ML efficiency at scale, LLMs, ML reliability. Changing spaces where breakthroughs happen.
7K Followers · 805 Following · The IJCAI conference has been the premier international gathering of researchers & practitioners in AI since 1969. 🗓️ #IJCAI2025 ❗ 16-22 August 2025 ❗ Montreal 🇨🇦
3K Followers · 2K Following · Research Scientist at Meta. 10-yr Test-of-Time ACL 22, Best Demo ACL 25, Best Resource Paper ACL 24, Best Theme Paper ACL 24, Best Student Paper NAACL 15 🏳️🌈