Rithesh Kumar @ritheshkumar_
Research Scientist @AdobeResearch. Ex @DescriptApp, @Mila_Quebec ritheshkumar.com Toronto, Ontario Joined November 2015
Tweets 406
Followers 944
Following 573
Likes 2K
if you want to learn about how we trained KREA Flux, we prepared a detailed blog in the link below: krea.ai/blog/flux-krea…
In our continued commitment to open-science, we are releasing the Voxtral Technical Report: arxiv.org/abs/2507.13264 The report covers details on pre-training, post-training, alignment and evaluations. We also present analysis on selecting the optimal model architecture, which…
As one of the people who popularized the field of diffusion models, I am excited to share something that might be the “beginning of the end” of it. IMM has a single stable training stage, a single objective, and a single network — all are what make diffusion so popular today.
Nice paper on the trade-off between decoding quality and modelability in 2-stage generative models. I disagree with this framing though: the trade-off is quite clear from an information-theoretic perspective. Do most people really believe this? Maybe it's time for a blog post🤔 https://t.co/kv5DXSntkE
new paper! 🗣️Sketch2Sound💥 Sketch2Sound can create sounds from sonic imitations (i.e., a vocal imitation or a reference sound) via interpretable, time-varying control signals. paper: arxiv.org/abs/2412.08550 web: hugofloresgarcia.art/sketch2sound
📢 Audio AI Job opportunity at Adobe! The Sound Design AI Group (SODA) is looking for an exceptional research engineer to join us in building the future of AI-assisted audio and video creation. Strong ML background, GenAI experience a plus. Details: adobe.wd5.myworkdayjobs.com/external_exper…
A common question nowadays: Which is better, diffusion or flow matching? 🤔 Our answer: They’re two sides of the same coin. We wrote a blog post to show how diffusion models and Gaussian flow matching are equivalent. That’s great: It means you can use them interchangeably.
🎥 Introducing MultiFoley, a video-aware audio generation method with multimodal controls! 🔊 We can ⌨️Make a typewriter sound like a piano 🎹 🐱Make a cat meow like a lion roars! 🦁 ⏱️Perfectly time existing SFX 💥 to a video
New tutorial! I spent 3 weeks realizing flow-matching/rectified flows can be viewed in a simple way that end-runs the usual pages of math: "Basic physics provides a 'straight, fast' way to get up to speed with flow-based generative models" Colab included! drscotthawley.github.io/blog/posts/Flo…
Is pixel diffusion passé? In 'Simpler Diffusion' (arxiv.org/abs/2410.19324) , we achieve 1.5 FID on ImageNet512, and SOTA on 128x128 and 256x256. We ablated out a lot of complexity, making it truly 'simpler'. w/ @tejmensink @JonathanHeek @KayLamerigts @RuiqiGao @TimSalimans
What a thrill to present on the big stage! So excited to reveal our Sound Effects GenAI tech in #ProjectSuperSonic #AdobeMAX Text-to-SFX and *VOICE*-to-SFX for expressive control! Huge kudos to @urinieto @pseetharaman @hugggof and our collaborators in design & prototyping!
Adobe just announced Generative Extend for Premiere Pro (beta) at #AdobeMAX! Use GenAI to extend your video clip *including the audio* @pseetharaman @urinieto and me in the Sound Design AI Group at @AdobeResearch worked on the audio part and we're so excited to see it go out!
Ultra-fast text-to-music generation w/o degrading quality? Introducing Presto! Distilling Steps and Layers for Accelerating Music Generation 🎹: buff.ly/4dC3rpl 📖: buff.ly/3TZBiBU w/@__gzhu__ @CasebeerJonah @BergKirkpatrick @McAuleyLabUCSD @NicholasJBryan 🧵
ICML in Vienna is coming to a close! 🇦🇹 Here are the top-10 general (and audio) trends from ICML 2024. A thread 🧵 1. Open vs. Closed AI: The debate was very present, notable in @soumithchintala's keynote or by the release of Llama 3.1 (among others). icml.cc/virtual/2024/p…
🚨 Contextual Position Encoding (CoPE) 🚨 Context matters! CoPE is a new positional encoding method for transformers that takes into account *context*. - Can "count" distances per head dependent on need, e.g. i-th sentence or paragraph, words, verbs, etc. Not just tokens. -…
Happy to release "DAC-JAX: A JAX Implementation of the Descript Audio Codec." This can reuse PyTorch weights of all model sizes, and it includes a device-parallel training script. It uses the standard JAX libraries: Flax, Optax, Orbax, and CLU. github.com/DBraun/DAC-JAX
Last week when presenting Parti (parti.research.google) at ICLR, I explained at least 20 times how I felt about autoregressive text-to-image generation models vs. diffusion models. So this is my take: The major benefit of autoregressive image generation models is that they…
MusicHiFi: Fast High-Fidelity Stereo Vocoding. Fast, high-fidelity stereophonic vocoding for music generation. 📝: arxiv.org/abs/2403.10493 🎵: musichifi.github.io/web/ w/ @j_p_caceres @ZhiyaoDuan @NicholasJBryan
Super happy to share this preview into what we've been building with the Speech AI team @ Adobe! Please reach out if you're interested in building large-scale audio models like this..
Explore the future of sonic creativity 🔊 with Project Music GenAI Control! Emerging experimental tech from the Adobe Research team can create audio tracks using text prompts and even transform your music based on reference melodies. Learn more: adobe.ly/3uMBr27

AK @_akhaliq
425K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo ,submit papers here: https://t.co/UzmYN5YmrQ
Sander Dieleman @sedielem
63K Followers 2K Following Research Scientist at Google DeepMind (WaveNet, Imagen, Veo). I tweet about deep learning (research + software), music, generative models (personal account).
Delip Rao e/σ @deliprao
61K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈
Christian Steinmetz @csteinmetz1
5K Followers 2K Following AI for Audio & Music • Research Scientist @sunomusic • PhD Student @c4dm MSc @mtg_upf • Previously Intern @Adobe @Meta @Dolby
Oriol (Uri) Nieto @urinieto
3K Followers 1K Following (INACTIVE ACCOUNT, FIND ME ON LINKEDIN/BLUESKY). Researcher at Adobe Research. Machine learning on audio. Oaklander born in Barcelona. Titan. He/they 🌈
Joan Serrà @serrjoa
2K Followers 565 Following Does research on machine learning at Sony AI, Barcelona. Works on audio analysis, synthesis, and retrieval. Likes tennis, music, and wine.
Chris Donahue @chrisdonahuey
5K Followers 1K Following GenAI for *human* creativity in music + more. Assistant prof at CMU CSD, 🎼 G-CLef lab. Part time Google DeepMind, Magenta (views my own)
최형석 (Hyeong-Seo... @92HsChoi
937 Followers 415 Following Love almost everything related to music. Research @elevenlabsio. Previously Co-founder and Research Lead @ Supertone, PhD @ Seoul National University, MARG
Luigui Sánchez @LuiguiSnchez3
28 Followers 826 Following
Amr Kayid @amr_kayid
95 Followers 4K Following 🪼🪄prev: @runwayml @Cohere 🐳 Research FORai / @CohereForAI 🧙♂️ @ManifoldRG @OpenMinedOrg 🕵 @TU_Muenchen 🤖🇩🇪🧠
Ukicen @Ukicen355432
16 Followers 2K Following
Siddhant @siddhantoon
38 Followers 167 Following MLE at @omegalabsai, ✨Blogs here: https://t.co/8pooyx598b, learning to not feel an imposter as a Data Scientist, mail: [email protected]
Nishit Anand @nishitanand99
100 Followers 2K Following MS CS @umdcs | Former ML Research - @iitdelhi, @IIITDelhi | Computer Vision | Multimodal LLMs | Photography
Anna Bianchi @nawiae87
157 Followers 1K Following
Louis Bradshaw @loubbrad
162 Followers 226 Following ML/CS PhD student at @C4DM. Interested in audio/multimodal.
Bartek Chmielewski @poprostu11
2 Followers 120 Following
Utopic e/λ @UtopicDev
258 Followers 4K Following AI Designer and Builder. Technology to save the world. There Is No Planet B...
Arnon @ArnonTu
10 Followers 124 Following
Amelia @ManteMisso41805
13 Followers 284 Following
Tirsod @Tirsodfp64
44 Followers 1K Following
Thijs Bergkamp @ThijsBergkamp
82 Followers 7K Following
gabedaramola @gabedaramola
0 Followers 8K Following
nora_fxtrade219 @LFxstock
377 Followers 2K Following 📍| ʟᴏɴᴅᴏɴ,ᴜᴋ🇬🇧 💱| 6 ʏᴇᴀʀꜱ ᴛʀᴀᴅɪɴɢ ꜰᴏʀᴇx💱 🗺| ᴡᴏʀʟᴅ🧳ᴛʀᴀᴠᴇʟᴇʀ 🎓ɪ ᴍᴇɴᴛᴏʀ 👨🏫| ʜᴇʟᴘ ᴘᴇᴏᴘʟᴇ ᴍᴀᴋᴇ ᴇɪɢʜᴛ + ꜰɪɢᴜʀᴇꜱ 💰 ᴡᴇᴇᴋʟʏ Trading crypto 📈
Sumeet Motwani @sumeetrm
1K Followers 2K Following Research Intern@Microsoft Phi | ML PhD at Oxford, Previously CS at UC Berkeley
Anuj Diwan @anuj_diwan
781 Followers 1K Following PhD Student @UTCompSci. Prev. Student Researcher @GoogleDeepmind, FAIR (@metaai), @AdobeResearch. 2021 BTech CSE @iitbombay. Interests: NLP, ASR, ML. 🇮🇳🇺🇸
SK Saiful @sksaiful_
212 Followers 4K Following Effective altruist | Working @fortify_health | Ex-Fellow @mercatus | Passionate about making an impact on people's lives, even in small ways | Proud Indian 🇮🇳
K @Kargichauhan_
196 Followers 360 Following ML Engineer | LLMs, Neuro-symbolic AI and AVs | Ex-NASA
Gallil Maimon @GallilMaimon
269 Followers 405 Following Research Scientist intern @ Meta (FAIR); PhD student @CseHuji; Speech Language Modelling
senthil shravan @senthilshrav
0 Followers 27 Following
Shawlesmp @ShawlesmpyLBjb
34 Followers 4K Following
Yining Shi @yining_shi
3K Followers 744 Following Director of Product Eng + Founding Engineer @runwayml | Adjunct professor @ITP_NYU, @ml5js, she/her.
Sam @Klikman007
151 Followers 5K Following
elizabeth @elizabe80740184
200 Followers 4K Following
McThede @McThedeO4zgSDN
29 Followers 2K Following
chaehunshin @chaehunshin
4 Followers 107 Following
Tsheighd @Tsheighd_fMLD
20 Followers 789 Following
CaraWild @nuIuBjgU05zaZbq
69 Followers 7K Following
Gokul @gokulkarthikk
119 Followers 577 Following
⨀❂ Sᴜη—Sρe... @Sun_SpeakUpNow
14 Followers 191 Following Awakening consciousness & truth, designing sustainability & regenerative technologies—healing the Earth & revealing universal mysteries. Join us & Speak Up Now!
Lassey @LasseyGqcOb6S
40 Followers 3K Following
EnidHornby @9GT61dpI9hZ14
92 Followers 7K Following
Milin Bhade @MilinBhade
199 Followers 6K Following Post Grad Student at IISc, Bangalore Masters in Computer Science & Automation
Delia Darwin @C83zVeOfdij21
37 Followers 3K Following
DinahTed @2KEo7Wb2I05S4G
78 Followers 7K Following
Sungwon Kim @sungwon__kim
82 Followers 115 Following Research Scientist @NVIDIA, Seoul National University
Yann LeCun @ylecun
949K Followers 764 Following Professor at NYU. Chief AI Scientist at Meta. Researcher in AI, Machine Learning, Robotics, etc. ACM Turing Award Laureate.
AK @_akhaliq
425K Followers 3K Following AI research paper tweets, ML @Gradio (acq. by @HuggingFace 🤗) dm for promo ,submit papers here: https://t.co/UzmYN5YmrQ
Andrej Karpathy @karpathy
1.4M Followers 1K Following Building @EurekaLabsAI. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets.
Soumith Chintala @soumithchintala
250K Followers 1K Following Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.
Google DeepMind @GoogleDeepMind
1.2M Followers 279 Following We’re a team of scientists, engineers, ethicists and more, committed to solving intelligence, to advance science and benefit humanity.
Sander Dieleman @sedielem
63K Followers 2K Following Research Scientist at Google DeepMind (WaveNet, Imagen, Veo). I tweet about deep learning (research + software), music, generative models (personal account).
Jim Fan @DrJimFan
325K Followers 3K Following NVIDIA Director of Robotics & Distinguished Scientist. Co-Lead of GEAR lab. Solving Physical AGI, one motor at a time. Stanford Ph.D. OpenAI's 1st intern.
François Chollet @fchollet
572K Followers 813 Following Co-founder @ndea. Co-founder @arcprize. Creator of Keras and ARC-AGI. Author of 'Deep Learning with Python'.
Delip Rao e/σ @deliprao
61K Followers 5K Following Busy inventing the shipwreck. @Penn. Past: @johnshopkins, @UCSC, @Amazon, @Twitter ||Art: #NLProc, Vision, Speech, #DeepLearning || Life: 道元, improv, running 🌈
Kyunghyun Cho @kchonyc
77K Followers 2K Following a combination of a mediocre scientist, a mediocre manager, a mediocre advisor & a mediocre physicist at @nyuniversity (@CILVRatNYU) & @PrescientDesign
Christian Steinmetz @csteinmetz1
5K Followers 2K Following AI for Audio & Music • Research Scientist @sunomusic • PhD Student @c4dm MSc @mtg_upf • Previously Intern @Adobe @Meta @Dolby
Heiga Zen (全 炳河... @heiga_zen
9K Followers 204 Following Principal Scientist (Director) @GoogleDeepMind / GDM Tokyo site lead.波瀬小⇒一志中⇒鈴鹿高専⇒名工大 (IBM TJ Watson intern)⇒東芝欧州研⇒Google (Speech🇬🇧⇒Brain🇯🇵) ⇒GoogleDeepMind
Jeremy Howard @jeremyphoward
259K Followers 6K Following 🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; Prev: professor @ UQ; Stanford fellow; @kaggle president; @fastmail/@enlitic/etc founder https://t.co/16UBFTX7mo
AI at Meta @AIatMeta
712K Followers 288 Following Together with the AI community, we are pushing the boundaries of what’s possible through open science to create a more connected world.
Lucas Beyer (bl16) @giffmana
108K Followers 519 Following Researcher (now: Meta. ex: OpenAI, DeepMind, Brain, RWTH Aachen), Gamer, Hacker, Belgian. Anon feedback: https://t.co/xe2XUqkKit ✗DMs → email
Irina Rish @irinarish
10K Followers 1K Following prof UdeM/Mila; Canada Excellence Research Chair; AAI Lab head https://t.co/UzlrC7ZrGF; CSO @ https://t.co/NgFagZ4pqY; advisor @ https://t.co/EyXleEdfQV
Durk Kingma @dpkingma
50K Followers 404 Following @AnthropicAI. Prev. @Google Brain/DeepMind, founding team @OpenAI. Computer scientist; inventor of the VAE, Adam optimizer, and other methods. ML PhD.
Zachary Nado @zacharynado
13K Followers 753 Following Research eng @GoogleDeepMind on Gemini pretrain. Personal acct. Past: swe intern @SpaceX, ugrad researcher in @tserre lab @BrownUniversity. All opinions my own.
Jason Lee @jasondeanlee
18K Followers 4K Following Associate Professor at UC Berkeley. Former Research Scientist at Google DeepMind. ML/AI Researcher working on foundations of LLMs and deep learning.
Shengjia Zhao @shengjia_zhao
52K Followers 231 Following Chief Scientist @ Meta MSL. Formerly MTS @ OpenAI, PhD @ Stanford. I train models. All opinions my own.
Chris Paxton @chris_j_paxton
19K Followers 3K Following Mostly posting about robots. currently AI @agilityrobotics prev embodied AI @AIatMeta, @NVIDIAAI. All views my own. writing: https://t.co/iNLA4djfZo
Seunghyun Seo @SeunghyunSEO7
2K Followers 798 Following deep learning enjoyer. from speech to llm, now exploring image space.
Yining Shi @yining_shi
3K Followers 744 Following Director of Product Eng + Founding Engineer @runwayml | Adjunct professor @ITP_NYU, @ml5js, she/her.
Minje Kim @minje_research
412 Followers 246 Following Associate Professor at CS@UIUC; Visiting Academic at Amazon Lab126; Want to share my thoughts on audio & AI research, graduate studies, and life.
Alexander Kolesnikov @__kolesnikov__
12K Followers 192 Following
Xiaohua Zhai @XiaohuaZhai
11K Followers 311 Following Researcher at Meta (previously at OpenAI Zürich, Google DeepMind)
Michael Tschannen @mtschannen
3K Followers 674 Following Research Scientist @GoogleDeepMind. Representation learning for multimodal understanding and generation. Personal account.
William Chen @chenwanch1
807 Followers 422 Following PhD Student @LTIatCMU @SCSatCMU | Masters @LTIatCMU | Formerly @TXInstruments | @UCF ‘21
Yoshua Bengio @Yoshua_Bengio
25K Followers 206 Following Working towards the safe development of AI for the benefit of all @UMontreal, @LawZero_ & @Mila_Quebec A.M. Turing Award Recipient and most-cited AI researcher.
Sungwon Kim @sungwon__kim
82 Followers 115 Following Research Scientist @NVIDIA, Seoul National University
Karan Goel @krandiash
8K Followers 965 Following founder ceo @cartesia_ai, phd @stanfordailab, @mldcmu @iitdelhi alum
Albert Gu @_albertgu
18K Followers 88 Following assistant prof @mldcmu. chief scientist @cartesia_ai. leading the ssm revolution.
Lucas Theis @lucastheis
3K Followers 833 Following Building something new. Previously @GoogleDeepMind, @twitter, Magic Pony, @bethgelab.
Ziyang Chen @CzyangChen
383 Followers 427 Following Research Scientist @LumaLabsAI. Ph.D. @UMich multimodal learning, audio-visual learning prev @Adobe and @AIatMeta
John Schulman @johnschulman2
65K Followers 1K Following Recently started @thinkymachines. Interested in reinforcement learning, alignment, birds, jazz music
Hao-Wen (Herman) Dong... @hermanhwdong
1K Followers 306 Following Assistant Professor at University of Michigan | PhD from UC San Diego | Human-Centered Generative AI for Content Creation
Charlie Nash @charlietcnash
2K Followers 841 Following Research @OpenAI - Previously @udiomusic, @GoogleDeepMind
udio @udiomusic
37K Followers 0 Following Generative music maker. Make your music. Discord https://t.co/4deihzV5gN Reddit r/udiomusic
Puyuan Peng @PuyuanPeng
2K Followers 511 Following Research Scientist @Meta Superintelligence Lab. Speech & Audio. Previously @utaustin @uchicago @bnu_1902
Emiel Hoogeboom @emiel_hoogeboom
3K Followers 178 Following Research Scientist at Google Deepmind. Previous PhD with @wellingmax at the Univ. of Amsterdam, Research intern at Qualcomm AI Research and Google Brain.
Feiteng @FeitengLi
788 Followers 1K Following Speech & LLM & RL & Video. Algorithms person at heart, has written backend code, wants to write frontend this year. 🤣🥵🤖😺 WeChat public account: Generative AI. Zhihu: https://t.co/MXuiNWWVOO
Vinod Khosla @vkhosla
690K Followers 600 Following entrepreneurship zealot, grounded technology possibilist, believer in the power of ideas, passionate about sustainability & impact
vinh q. tran @vqctran
2K Followers 352 Following research scientist @GoogleDeepMind, all thoughts my own, he/him
Shinji Watanabe @shinjiw_at_cmu
4K Followers 362 Following I'm working at CMU (2021-). I was working at NTT (2001-2011), MERL (2012-2017), and JHU (2017-2020). Speech and Audio Processing is my main research topic.
Benjamin Lefaudeux @BenTheEgg
1K Followers 2K Following Back in the EU after some time in sunny California and happy Copenhagen. Mistral, prev Photoroom, Meta (xformers, FairScale), EyeTribe (acq)
Hyung Won Chung @hwchung27
38K Followers 302 Following AI Research Scientist @Meta Superintelligence Labs. Past: @OpenAI / @Google Brain / PhD @MIT
Saining Xie @sainingxie
23K Followers 1K Following researcher in #deeplearning #computervision | assistant prof at @nyu_courant | rs @googledeepmind | past: rs @meta (FAIR) @ucsandiego | ynwa
Tim Brooks @_tim_brooks
37K Followers 102 Following
tobi lutke @tobi
416K Followers 2K Following @Shopify CEO by day, Dad in evening, hacker at night. Aspiring comprehensivist. (tweets auto delete eventually) retweet=noteworthy share, not endorsement
mrfakename @realmrfakename
2K Followers 386 Following LLMs, TTS, & Open Source https://t.co/PIhamCNjhp
Dan Lyth @danlyth
700 Followers 306 Following Research engineer at @sesame. Previously leading speech research at @StabilityAI and @RockstarGames.
Xu Tan @xutan_tx
2K Followers 603 Following Working on Large Language Models and Video/Audio Multimodality
Yoach @yoachlacombe
1K Followers 84 Following Audio ML Engineer | previously @HuggingFace 🤗 | Opinions are my own