It's very difficult to improve the *exponent* in scaling laws for loss vs compute, especially by changing the optimizer!
Our new paper shows that scaling momentum correctly can *provably* improve the scaling exponent on a theoretical model. Empirically, it works on LSTMs too!