Mixing the Mexican rice and biryani I got from the restaurant with my chicken rice in ratios corresponding to perceived cooking skill
We doing Bayesian model averaging now
With such a large diversity in environments, we will also be able to make the first scaling laws for RL Envs. As the number of Envs increases, the overall number of reinforcements needed per task should decrease, and it should be a metric we track.
We should also track the…
They should have mandatory hearing exams for Starbucks employees so when I say "whole milk" for the 30th time in a row they don't put "oat milk" instead
attention sinks may be a bias in causal transformers.
as some of you know, i've been writing a long blogpost on attention and its properties as a message-passing operation on graphs. while doing so, i figured i might have found an explanation for which attention sinks may be an…
I thought this said body building and was excited for him but then I remembered it's kalo and he's washed and dumb so it'd never happen and I reread and confirmed my hypothesis
My favorite bit of twitter psychology is that polls always have substantially more votes than likes, even on polls that aren't political or opinionated.
Somehow, basically every single one of us has a "if I like this post I should interact with it exactly once" algorithm for…
idk who needs to hear this but your ML frameworks should be imperative. For every trainer.train() your inner light dims a little. "for data in dataset" or you're bad at programming
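A minimal sketch of the "for data in dataset" style, in plain Python with no framework. The toy dataset, the single weight `w`, and the learning rate `lr` are all illustrative assumptions, not anything from the post:

```python
# Toy data for y = 3x (assumed for illustration only).
dataset = [(x, 3.0 * x) for x in (0.0, 0.5, 1.0, 1.5, 2.0)]

w = 0.0   # single trainable weight
lr = 0.1  # learning rate, chosen small enough for this toy data

for epoch in range(200):
    for x, y in dataset:                # the loop is written out, not hidden in a trainer
        pred = w * x                    # forward pass
        grad = 2.0 * (pred - y) * x     # d/dw of the squared error (pred - y)^2
        w -= lr * grad                  # plain SGD step
```

Every step is visible, so adding logging, early stopping, or a custom update is one more line inside the loop.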
did you know you can log arbitrary HTML into wandb
this has finally solved my "how are you supposed to log RL rollouts" woes, at the expense of being really cursed
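One way this could look, as a rough sketch: render a rollout as an HTML table and hand it to `wandb.Html`. The helper names `rollout_to_html` and `log_rollout`, and the (observation, action, reward) tuple layout, are assumptions made for illustration; the post does not say how the rollouts were actually formatted.

```python
import html


def rollout_to_html(steps):
    """Render (observation, action, reward) tuples as an HTML table string."""
    rows = "".join(
        "<tr><td>{}</td><td>{}</td><td>{:.2f}</td></tr>".format(
            html.escape(str(obs)), html.escape(str(act)), float(rew)
        )
        for obs, act, rew in steps
    )
    header = "<tr><th>observation</th><th>action</th><th>reward</th></tr>"
    return "<table>" + header + rows + "</table>"


def log_rollout(steps, key="rollout"):
    """Log one rollout to the active run; assumes wandb.init() was already called."""
    import wandb  # imported lazily so the renderer works without wandb installed

    wandb.log({key: wandb.Html(rollout_to_html(steps))})
```

Escaping the observation and action text keeps stray `<` or `&` characters in model output from breaking the table.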
The ether is real btw there is an essence that permeates all physical space and natural things. When data passes from physical space to cyberspace it loses some of this. Further digital operation on the data can preserve many things but not essence. Things created purely in…
life update: for those who don’t know, i joined Unemployed Inc. 22 years ago to work on closed source AGI. Incredibly excited about what I'm building 🚀
114 Followers 105 Following: poetic militant, literally iconic. formally proven to be the greatest logician of all time; retains the right to gloat. parlor illusionist @redsecretaire.
31 Followers 205 Following: 1+ year uni student sabbatical, restaurant busser, quant trading (larper), tech building in public (never started), whimsical & delulu. Non ducor, duco.
242 Followers 481 Following: PhD student at @EPFL🇨🇭 working on improved understanding of deep neural networks and their optimization. Previously did NN training @Tesla_AI @CerebrasSystems
23 Followers 5K Following: Like to try new things you never know; trying to prove all software can be automated 😅 😅 😅
| ML/AI, | C++/Java/Go |
GitHub : Dyl777
22K Followers 52 Following: Community account for sharing ClaudeCode related projects and releases. Views/shares independent from @AnthropicAI positions.
2K Followers 529 Following: Assistant Professor at @TelAvivUni and Research Scientist at @GoogleResearch; previously postdoc at @GoogleDeepMind and @allen_ai
29K Followers 1K Following: AI, national security, China. Part of the founding team at @CSETGeorgetown (opinions my own). Author of Rising Tide on substack: https://t.co/LKAoyL00iB
12K Followers 3K Following: PhD student @MIT_CSAIL & cooking @thinkymachines.
Working on scalable and principled algorithms in #LLM and #MLSys. In open-sourcing I trust 🐳.
she/her/hers
5K Followers 7 Following: Interactive AI explainers.
Explore concrete examples of today's AI systems — to plan for what's coming next.
A project of @sage_future_
4K Followers 861 Following: Member of Technical Staff at @AnthropicAI. Making Claude more reliable. Matt Levine and Scott Alexander fan. Prev SRE @Google and FDE @PalantirTech.
20K Followers 452 Following: physics of language models @ Meta (FAIR, not GenAI)
🎓:Tsinghua Physics — MIT CSAIL — Princeton/IAS
🏅:IOI x 2 — ACM-ICPC — USACO — Codejam — math MCM
4K Followers 20 Following: At Essential AI, we're building an open platform to democratize frontier AI capabilities and accelerate breakthroughs globally through collaborative science.