Andrew Carr (e/🤸) @andrew_n_carr
science @getcartwheel AI writer @tldrnewsletter advisor @arcade_ai Past - Codegen @OpenAI, Brain @GoogleAI, world ranked Tetris player
andrewnc.github.io
RuntimeError: shape is invalid
Joined July 2015
Tweets 6K - Followers 15K - Following 3K - Likes 23K
Every serious AI company should have regular data labeling pizza parties.
This makes sense. Reka performs best on a video task we have. Easily beating Gemini 1.5 and other models.
Here's a great read by @douglasahorr on understanding Transformers by looking at Gemma 2B graphcore-research.github.io/posts/gemma/
We are excited to have @andrew_n_carr present at the next AWS Utah event. meetup.com/aws-utah/event…
We're all going to look foolish if GPT-4 was just some gpt2-chatbots in a trenchcoat...
How would you run an efficient forward pass for a model such as this?
Many don't know that GPUs automatically leverage ternary and fine-grained sparsity to accelerate your matmuls! e.g. A matmul with ternary + 90% sparsity results in 33% more FLOPs in my benchmark. (not joking) I explore this "optimization" here: thonking.ai/p/strangely-ma… (1/3)
huggingface.co/blog/sc2-instr… We finally released StarCoder2 Instruct! SC2-Instruct is the very first entirely self-aligned code LLM trained with a fully permissive and transparent pipeline. On benchmarks, we are beating even versions of StarCoder2 trained on GPT-4 distilled data!
gpt2-chatbot is another low key research preview?
We're still growing our Python team! I've heard some talk of layoffs at Google in this space, so, please retweet, and share with anyone you know who might be affected.
maybe a dumb question, but why rotary_embeddings IN the attention module? github.com/pytorch/torcht…
A great article, but it's important to note that star count has no correlation to code quality and final model humaneval scores. https://t.co/xwsRFrGgBy
"How to Start Google" (paulgraham.com/google.html) was a talk at my older son's school. It was one of a series of talks about careers. I just heard the students voted it the best one, and I'm happier than if I'd won some prestigious award.
Two free medium-compute Mixture-Of-Experts research ideas: Prerequisite: Mixtral 8x7B is 32 layers, at each layer there are 8 experts, each token is assigned to 2 experts at a given layer. 1) Dynamic Expert Assignment in MoE Models Every token is assigned to 2*32=64 experts in…
We're on our first VC market map 🤸
How do the unsloth and axolotl guys not have a billion job offers?!
I've just released llamafile v0.8 which features LLaMA3, Mixtral 8x22b, and Grok support. It goes 25x faster than ollama at running LLaMA3 70B on CPU. My new tensor multiplication kernels let llamafile eval MoE models 2x faster than llama.cpp github.com/Mozilla-Ocho/l…
Sebastian Raschka @rasbt
267K Followers 906 Following Machine learning & AI researcher writing at https://t.co/A0tXWzG1p5. LLM research engineer @LightningAI. Previously stats professor at UW-Madison.
Alfredo Canziani @alfcnz
86K Followers 268 Following Musician, math lover, cook, dancer, 🏳️🌈, and an ass prof of Computer Science at New York University
Eric Jang @ericjang11
69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0p
(((ل()(ل() 'yoav))).. @yoavgo
46K Followers 2K Following
Soumith Chintala @soumithchintala
187K Followers 884 Following Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.
Jeremy Howard @jeremyphoward
222K Followers 5K Following 🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; Hon Professor: @UQSchoolITEE ; Digital Fellow: @Stanford
Rosanne Liu @savvyRL
33K Followers 969 Following Cofounded & running @ml_collective. Host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS. Writing https://t.co/IbycyGfnDR
Horace He @cHHillee
24K Followers 450 Following Working at the intersection of ML and Systems @ PyTorch "My learning style is Horace twitter threads" - @typedfemale
Omar Sanseviero @osanseviero
32K Followers 2K Following Chief Llama Officer @huggingface 🦙 Founder @AI_Learners. Xoogler (SWE @Google Assistant, 20% PM TF Graphics). 100% Hacker Llama🇵🇪🇲🇽
clem 🤗 @ClementDelangue
91K Followers 5K Following Co-founder & CEO @HuggingFace 🤗, the open and collaborative platform for AI builders
Julien Chaumond @julien_c
47K Followers 1K Following Co-founder and CTO at @huggingface 🤗. ML/AI for everyone, building products to propel communities fwd. @Stanford + @Polytechnique
Hamel Husain @HamelHusain
23K Followers 2K Following Researcher focusing on LLMs: https://t.co/iVZDFdIQiE Previously, dev tools and infra for ML. Ex @Github, @Airbnb, @DataRobot. @fastdotai core contributor.
Miles Brundage @Miles_Brundage
43K Followers 10K Following Policy research at @openai. I mostly tweet about AI, animals, and sci-fi. He/him. Views my own.
Dan Roy @roydanroy
45K Followers 2K Following ML / AI researcher, emphasis on theory. Research Director and Canada CIFAR AI Chair, @VectorInst Professor, @UofT (Statistics/CS)
Eugene Vinitsky @EugeneVinitsky
13K Followers 2K Following Lets make multi-agent learning easy. Anti-cynic. RS at Apple, Asst. Prof at @nyutandon. He/him. Anonymous feedback: https://t.co/Mmmg7uPm1t
Sara Hooker @sarahookr
39K Followers 8K Following I lead @CohereForAI. Formerly Research @Google Brain @GoogleDeepmind. ML Efficiency at scale, LLMs, @trustworthy_ml. Changing spaces where breakthroughs happen.
Shane Gu @shaneguML
28K Followers 1K Following Research Scientist & Manager @GoogleDeepMind Tokyo/MTV. ex: @GoogleAI Brain, @OpenAI. (JP: @shanegJP)
Aleksa Gordić 🍿.. @gordic_aleksa
19K Followers 217 Following https://t.co/mcuQvV8wEa proud father of 16 A100s & 16 H100s flirting with LLMs, tensor core maximalist x @GoogleDeepMind @Microsoft
Sanyam Bhutani @bhutanisanyam1
35K Followers 994 Following 👨💻 Sr Data Scientist @h2oai | Previously: @weights_biases 🎙 Podcast Host @ctdsshow 👨🎓 International Fellow @fastdotai 🎲 Grandmaster @Kaggle
Marko Krema @markokrema
7 Followers 586 Following
brownie @catgirlpitou
59 Followers 237 Following
Kimbo Chen @kimbochen
147 Followers 363 Following Writes about ML systems, compilers, and hardware accelerators
dino_dna @dino_dna_
507 Followers 4K Following
az0th @Az0thSZ
123 Followers 2K Following
Boonoathoo @boonoathoo52090
15 Followers 20 Following
Rajat Patel @patel_raj55
49 Followers 550 Following ML Engineer I retweet to keep notes of important stuff I find here
Deatou @Deatou166245
4 Followers 276 Following
Chris Watkins @cwatkins346
747 Followers 2K Following Sales Manager @ Usher Inc. *Generac Dealer/ Asst Fball Coach @ Mayfield High School/ United States Marine Veteran #0311
Joey (e/λ) @shxf0072
2K Followers 388 Following I speak fluent Python and Sarcasm. researcher at @NousResearch
Weyland @weymarjohansson
454 Followers 2K Following the antics of this fly alarm me: can an eagle tell a lie?
Morgan Smith @mmsmithlegal
636 Followers 687 Following Corporate Attorney. Reps Startups. Biz. VC/PE. Loves history, finance. Fluent English & French. DMs open. Business Inquiries: [email protected]
Ledell Wu @LedellWu
701 Followers 247 Following AI Research Scientist (Generative AI/LLM/Multimodal) Co-founder @CreatifyLab, Past: FAIR @MetaAI, @BAAIBeijing Recipient of ICML 2023 Test-of-Time Award
Kangwook Lee @Kangwook_Lee
2K Followers 676 Following Assistant Professor, ECE, UW-Madison / Leading deep learning research @ KRAFTON
はにたすみえ @hanitasumi84458
6 Followers 291 Following
Weaviate • vector d.. @weaviate_io
12K Followers 3K Following The easiest way to build and scale AI applications. 🐙 https://t.co/9ZP8iC4iFd 📰 https://t.co/XiFW3Ks5fK
Nicolas Keller @Nicolas_Keller
851 Followers 5K Following Interested in science-based startups. Having the time of my life @meshcapade; angel investor; ex Vsquared Ventures, ex @FRANKAROBOTICS; @iGEM alumnus
Henri Virta @HenriVirta
49 Followers 472 Following
Bitsky @bitsky7
286 Followers 4K Following Not financial advice for entertainment purposes only. #dyor
Seiiti Arata ⚡️ @seiitiarata
23K Followers 4K Following profile hacked by ÐēēzИц†ž, shitposting only. if you want serious official content, visit Nostr or YouTube
ɢʀɛǟȶK̶i̶n̶g.. @GreatKingCnut
468 Followers 2K Following But the sea came up as usual and disrespectfully drenched the king's feet and shins. I want the good ending pls, not the bad one. transhumanist, ML, RL, lmao
Jenni Leder (⌐■_.. @thoughtbrain
1K Followers 624 Following Lead Designer at @_bottlerocket. Formerly Design @vevo, @StorehouseHQ. Mayor of Awesometown. Kerning keeps me from spacing out. she/her -- mastodon link in bio
Lucas Pickup (yes, Pi.. @lupickup
223 Followers 1K Following We're trying to have fun and learn together, everything said is not forever nor is it “correct” | MSFT Azure AI 🦀🐍 | Trying to get AI to go brrrrrr
Telt 🍕 @twofifteenam
4K Followers 2K Following something new | former L7 FAANG manager (now L8) | gpu rich | xgboost appreciatoor
puf @pub_uni_friend
66 Followers 219 Following
Alejandro Maza @alehandromz
2K Followers 978 Following Mathematics and economics, I work at @OPI_Global. All good.
Itamar Ravid @iravid_
2K Followers 2K Following Building @scroll_ai. Prev: @coralogix, @hunters_ai, @zivergetech, Riskified, @bigpanda. @zioscala contributor.
Michael DeMaria @demaria_michael
370 Followers 2K Following Building something new | Previously at Serent Capital, Union Square, & Moelis
Sean Hughes @hughesthe1st
555 Followers 269 Following AI Ecosystem @ServiceNow @ServiceNowRSRCH @BigCodeProject #TheAIAlliance - formerly @IntelAI @ActianCorp @HPE - All tweets are my own opinion.
Idris @aloma85
99 Followers 655 Following @northwestern CIS @jhucompsci CS PhD student 🤓 Brazilian Jiu-Jitsu brown belt 🤼♂️ Let’s talk tech, philosophy, and public policy 🗣
Rodrigo PU3RNW @rodrigowue
181 Followers 875 Following Someone who loves to learn how everything works. PhD Student in Microelectronics. Coding, Radios, Vintage Electronics
Uzay @uzpg_
1K Followers 1K Following CS / Math / philosophy | Researching how we learn and create // what I want to do with my life | DMs open | MIT | Emergent Ventures | France-US
Ed @ed_bjr
93 Followers 2K Following
Muffin Flap @muffinflap
93 Followers 136 Following
Bryan O'Neal @1ALbryan
196 Followers 774 Following Greek among Romans. I run https://t.co/ZnSfoRcyQx among other things
Ruben @RubenWinkl80451
125 Followers 383 Following
Rohekael Part @rohekael
45 Followers 59 Following
Sebastian Raschka @rasbt
267K Followers 906 Following Machine learning & AI researcher writing at https://t.co/A0tXWzG1p5. LLM research engineer @LightningAI. Previously stats professor at UW-Madison.
Peyman Milanfar @docmilanfar
67K Followers 264 Following Distinguished Scientist at Google Research. Computational Imaging, Machine Learning, and Vision. Tweets = personal opinions. May change or disappear over time.
Alfredo Canziani @alfcnz
86K Followers 268 Following Musician, math lover, cook, dancer, 🏳️🌈, and an ass prof of Computer Science at New York University
Eric Jang @ericjang11
69K Followers 3K Following physical AGI at 1X. Author of "AI is Good for You" https://t.co/eFg4WXhg0p
Lucas Beyer (bl16) @giffmana
56K Followers 446 Following Researcher (Google DeepMind/Brain in Zürich, ex-RWTH Aachen), Gamer, Hacker, Belgian. Mostly gave up trying mastodon as [email protected]
(((ل()(ل() 'yoav))).. @yoavgo
46K Followers 2K Following
Soumith Chintala @soumithchintala
187K Followers 884 Following Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.
Clément Canonne @ccanonne_
31K Followers 928 Following Senior Lecturer @Sydney_Uni. Postdocs @IBMResearch, @Stanford; PhD @Columbia. Converts ☕ into puns: sometimes theorems. He/him. @[email protected]
PyTorch @PyTorch
380K Followers 77 Following Tensors and neural networks in Python with strong hardware acceleration. PyTorch is an open source project at the Linux Foundation. #PyTorchFoundation
Jeremy Howard @jeremyphoward
222K Followers 5K Following 🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; Hon Professor: @UQSchoolITEE ; Digital Fellow: @Stanford
Rosanne Liu @savvyRL
33K Followers 969 Following Cofounded & running @ml_collective. Host of Deep Learning Classics & Trends. Research at Google DeepMind. DEI/DIA Chair of ICLR & NeurIPS. Writing https://t.co/IbycyGfnDR
Horace He @cHHillee
24K Followers 450 Following Working at the intersection of ML and Systems @ PyTorch "My learning style is Horace twitter threads" - @typedfemale
abhishek @abhi1thakur
81K Followers 664 Following 🤗 I build AutoTrain @huggingface 👨🏽💻 World's First 4x Grand Master @kaggle 🎥 YouTube 100k+: https://t.co/BHnem8fTu5 ⭐ GitHub Star
merve @mervenoyann
56K Followers 4K Following open-sourceress at @huggingface 🧙🏻♀️ proud mediterrenean 🍋 I do TL;DR on ML papers
Gautam Kamath @thegautamkamath
44K Followers 507 Following Assistant Prof of CS @UWaterloo, Faculty @VectorInst, Canada @CIFAR_News AI Chair. Co-EiC @TmlrOrg. I lead @TheSalonML. Privacy, robustness, machine learning.
François Fleuret @francoisfleuret
31K Followers 460 Following Prof. @Unige_en, Adjunct Prof. @EPFL_en, Research Fellow @idiap_ch, co-founder @nc_shape. AI and machine learning since 1994. I like reality.
Jürgen Schmidhuber @SchmidhuberAI
107K Followers 0 Following Invented principles of meta-learning (1987), GANs (1990), Transformers (1991), very deep learning (1991), etc. Our AI is used many billions of times every day.
Omar Sanseviero @osanseviero
32K Followers 2K Following Chief Llama Officer @huggingface 🦙 Founder @AI_Learners. Xoogler (SWE @Google Assistant, 20% PM TF Graphics). 100% Hacker Llama🇵🇪🇲🇽
Milo Vanegas @miloanimates
2K Followers 397 Following Art & Animation. Previous projects: Peridot, Star Wars, Hades, Pyre, Transistor.
Piotr Nawrot @p_nawrot
3K Followers 226 Following PhD student in #NLProc @Edin_CDT_NLP | Previously intern @Nvidia & @MetaAI
Bo Wang @BoWang87
8K Followers 2K Following Assistant Prof. CS,LMP @UofT; CIFAR AI Chair @VectorInst; Chief AI Scientist, @UHN; former PHD, CS @Stanford; opinions my own. #AI #healthcare #combio
Federico Cassano @ellev3n11
130 Followers 68 Following Undergraduate Researcher @neu_prl Upcoming @scale_AI Previous industry research @cursor_ai, @Roblox, @trailofbits Papers here: https://t.co/PgUSaxXs1B
Kangwook Lee @Kangwook_Lee
2K Followers 676 Following Assistant Professor, ECE, UW-Madison / Leading deep learning research @ KRAFTON
Volodymyr Kyrylov @darkproger
2K Followers 2K Following AI student at USI/ETH. Donate https://t.co/GDSkWG2tak
Tongzhou Wang @TongzhouWang
1K Followers 1K Following representation of type 1→2 agi @mit Ex @pytorch @MetaAI @berkeley_ai
Sean Hughes @hughesthe1st
555 Followers 269 Following AI Ecosystem @ServiceNow @ServiceNowRSRCH @BigCodeProject #TheAIAlliance - formerly @IntelAI @ActianCorp @HPE - All tweets are my own opinion.
Aaron Gokaslan @SkyLi0n
3K Followers 345 Following Creator of the OpenWebText and OpenGPT2. @PyTorch Core Reviewer. PhD Student at @Cornell (interning at @MosaicML) Previously at @FacebookAI and @BrownUniversity
Jiquan Ngiam @JiquanNgiam
524 Followers 176 Following Building @Lutra_AI Previously: Google Brain, Coursera, Stanford ML Group
Joe Fioti @joefioti
284 Followers 208 Following https://t.co/7pk0ZVnuVx | https://t.co/VzqoakhN9U | https://t.co/btF8VnpAdM
MR3D-Dev @MR3Dev
2K Followers 112 Following 3D Artist and Unreal Engine educator. Aspiring Film maker 🇻🇪 Patreon: https://t.co/EqiCLUZpVA
Ryan Schmidt @rms80
7K Followers 470 Following Back @gradientspace. Made Modeling Mode & Geometry Script @epicgames. Invented Autodesk @meshmixer. PhD in Computer Shapes. @[email protected]
Monumental Labs @Monumental_Labs
4K Followers 2 Following Building AI-enabled robotic stone carving factories to unleash a renaissance.
victor @sun_qingfeng
129 Followers 108 Following
Vivek Raghunathan @vivek7ue
4K Followers 2K Following * AI + search at @snowflakedb. * Co-founder @Neeva (acquired by @snowflakedb). #NeevaAI = AI search engine with LLMs. * Ex-VP of Engineering @Google
Hieu Pham @hyhieu226
2K Followers 26 Following Making GPUs go brrrr @augmentcode 🤖 Past: Research Scientist at Google Brain 🧠 IMO Silver Medalist 🥈 waiting for LLMs to beat me. Tweets are my own opinions.
Jonathan Ho @hojonathanho
4K Followers 152 Following
Shahbuland Matiana - .. @shah_bu_land
373 Followers 171 Following Co-Founder @carperai and researcher at @stabilityai prev @uwaterloo I want to make AI generated video games
Nathan Godey @nthngdy
537 Followers 840 Following 3rd year PhD student @InriaParisNLP Working on the representations of language models, architectures, and pretraining methods https://t.co/CTHFx1ZqPo
Aston Zhang @astonzhangAZ
5K Followers 92 Following Research Scientist at the #llama team of Meta Generative AI, designing and training large language models. Opinions are my own.
Laurens van der Maate.. @lvdmaaten
667 Followers 1K Following Distinguished Research Scientist at Meta AI. t-SNE. DenseNet. Web-scale weakly supervised vision. CrypTen. Currently herding Llamas.
Mike Lewis @ml_perception
6K Followers 227 Following Llama3 pre-training lead. Partially to blame for things like the Cicero Diplomacy bot, BART, RoBERTa, kNN-LM, top-k sampling & Deal Or No Deal.
Ahmad Al-Dahle @Ahmad_Al_Dahle
4K Followers 53 Following #Girldad of twins. Leading GenAI @ Meta (llama, imagine, meta ai and more)
Quentin Anthony @QuentinAnthon15
999 Followers 129 Following I make models more efficient. Google Scholar: https://t.co/kzVsAKPdrp
John Galt @StudioMilitary
240 Followers 39 Following Designer, @NousResearch & Military Studio (https://t.co/C9kUZNyPEB)
catid (e/acc) @MrCatid
3K Followers 637 Following Engineer at Juice Labs. Prior: Anduril, Oculus VR, Game Closure, MSEE@GATech
Ryan Smith @RyanQualtrics
91K Followers 452 Following Founder @Qualtrics, Chairman/CEO Smith Entertainment Group, NBA @UtahJazz, @RealSaltLake, @BYU alum, 5 ninjas, 1 amazing wife. “I’ll go where you want me to go”
Sean Baxter @seanbax
8K Followers 193 Following The road to Memory-safe C++. https://t.co/IoFMbCXdOw CppNow 2022: https://t.co/S9EzkyQBDf
xAI @xai
997K Followers 36 Following
Nando Metzger @NandoMetzger
238 Followers 301 Following PhD Student @ETH_en | Computer Vision @Meta Research in: Computer Vision | Remote Sensing | Super Resolution | Monocular Depth | Population Mapping
Micronics @micronics3d
573 Followers 8 Following Maker of the world's best desktop SLS 3D printers https://t.co/FAwB5824ku
Jeremy Fox 🦊 @JeremyDanielFox
624 Followers 610 Following Neural nets @AnthropicAI. Ex @google. My views are my own.
Ethan @Ethan_smith_20
3K Followers 688 Following a boy and his gpu vs the world. directing research at @leonardoai_. learning as I go. uf psych. generative models and representation learning
Lingjie Liu @LingjieLiu1
3K Followers 643 Following Assistant Professor at UPenn. Research interests: Neural Scene Representation, Neural Rendering, Human Performance Modeling and Capture.
Yufu Wang @YufuWang_
28 Followers 65 Following
Kostas Daniilidis @KostasPenn
4K Followers 1K Following Ruth Yalom Stone Professor @Penn @PennEngineers @PennCIS @GRASPlab
Ziyun (Claude) Wang @ZiyunClaudeWang
88 Followers 328 Following Ph.D. Student, Event-based Computer Vision Researcher @Penn @GRASPlab
Eric Alcaide @eric_alcaide
778 Followers 427 Following Physics, Medicine, Machine Learning || Universe, Life, Intelligence • From Bits to Molecules
Sebastian Hofstätter @s_hofstaetter
1K Followers 254 Following RAG & tool use modelling co-lead @Cohere; PhD in efficient neural information retrieval from @tu_wien
Niels Hoven @NielsHoven
20K Followers 2K Following Founded @MentavaInc to support high achieving kids. Seeker of truth, critic of tribalism, lover of ice cream. Tweets about startups, education, and my four kids
Just spent a day making a board deck and I… enjoyed it? It felt like a good opportunity to reflect on our progress. Am I a psycho CEO?
Most of what I know about research, I learned from Noam.
Many have pointed out that LLM benchmarks are broken and gamed. Happy to see my former resident @hughbzhang, @summeryue0, and the great @scale_AI folks do something about it! They made a private version of GSM8k and evaled GPT-4, Claude, Mixtral, Phi, etc: arxiv.org/pdf/2405.00332
@andrew_n_carr down for it! we should host one @argilla_io 👀
@andrew_n_carr til you've been having labeling pizza parties without me
Wow, Medusa can be used for pre-training and leads to a better and faster generation! 😍
Meta presents Better & Faster Large Language Models via Multi-token Prediction - training language models to predict multiple future tokens at once results in higher sample efficiency - up to 3x faster at inference arxiv.org/abs/2404.19737
Nous and @Teknium1 killing it as always, finding more performance over the L3-Instruct that everyone has been struggling to outperform.
Announcing Hermes 2 Pro on Llama-3 8B! Nous Research's first Llama-3 based model is now available on HuggingFace. Hermes Pro comes with Function Calling and Structured Output capabilities, and the Llama-3 version now uses dedicated tokens for tool call parsing tags, to make…
Released Hermes 2 Pro on Llama-3 8B today! Get it here: huggingface.co/NousResearch/H… or the GGUF here: huggingface.co/NousResearch/H…
@andrew_n_carr @aidan_mclau well, we (the community) are still working on it: arxiv.org/abs/2308.13111 and arxiv.org/abs/2404.11599 for example
@andrew_n_carr thank you andrew, tbh we still have so much work to do to make video understanding better
Kolmogorov-Arnold strikes again! Little known fact: this theorem features inside one of the seminal papers on permutation invariant neural nets (Deep Sets), showing an intricate connection between such representations and the way set/GNN aggregators are built (as a special case).
MLPs are so foundational, but are there alternatives? MLPs place activation functions on neurons, but can we instead place (learnable) activation functions on weights? Yes, we KAN! We propose Kolmogorov-Arnold Networks (KAN), which are more accurate and interpretable than MLPs.🧵
PSA: ASKING AN LLM ABOUT ITSELF IS NOT A RELIABLE WAY OF INVESTIGATING THE LLM "Oh the model told me it's made by OpenAI" so what?! I can't believe smart people keep making this rookie mistake!
Iterative RL finetuning seems to be sota right now. SPIN -> DNO -> IRPO
Meta presents Iterative Reasoning Preference Optimization Increasing accuracy for Llama-2-70B-Chat: - 55.6% -> 81.6% on GSM8K - 12.5% -> 20.8% on MATH - 77.8% -> 86.7% on ARC-Challenge arxiv.org/abs/2404.19733
Everything in the world is about FLOPS except FLOPS. FLOPS are about power.
This new writeup by @cHHillee uncovered some very unexpected reasons for why we can never reach the theoretical TFLOPS advertised by accelerator vendors thonking.ai/p/strangely-ma… spoiler: it's all about the power. Make sure to read it!
I recently left @scale_AI. I'm so thankful to the team there and for @alexandr_wang's bet to acquire our startup nearly 4 years ago. When I joined Scale, it was a single-product company building the data engine for autonomous vehicles. It's amazing to see how far Scale has come:…
I wrote a long comment on the @ApacheArrow issue tracker giving a high-level overview of what would it take to properly support if-then-else constructs in arrow/compute while preserving the ability to vectorize the compute kernels. Link below 👇