Saketh Rambhatla @rssaketh
PhD student at University of Maryland, College Park rssaketh.github.io College Park, MD Joined October 2010
Tweets 79
Followers 197
Following 584
Likes 7K
Inference time objectives are amazing :) We show that LLMs can be upgraded to multimodal beings by a simple trick :) No training needed! Works on image generation, editing, style transfer and more!
Super excited to share some recent work that shows that pure, text-only LLMs, can see and hear without any training! Our approach, called "MILS", uses LLMs with off-the-shelf multimodal models, to caption images/videos/audio, improve image generation, style transfer, and more!
Super cool to see transformers scaling so effectively for image/video autoencoders! Our model also offers a flexible way to implement variable token length
How can we better animate images solely following text descriptions? We present Motion Focal Loss (MotiF) (arxiv.org/abs/2412.16153) to better align motions with text descriptions in text-image-to-video (TI2V) task and release TI2V-Bench, a comprehensive TI2V benchmark. (1/n)
Flow matching can transform one distribution to another. So why do text-to-image models map noise to images instead of directly mapping text to images? Wouldn't it be cool to directly connect modalities together? CrossFlow accomplishes exactly that! cross-flow.github.io
How can we make Imitation Learning generalize? In my latest work we show that a keypoint-based representation can generalize to novel instances of an object and is agnostic to background changes.
🚨 Internship in Meta GenAI NYC 🚨 I have an open PhD internship position for 2025! Interested in exploring visual generative models (or any other exciting ideas) inside the team that brought you Movie Gen and Emu Video? 📩 Send me a DM with CV, website, and GScholar profile
Meta Movie Gen is just freakin cool! Generative Video Foundation models with this quality, precise editing and personalization unlock value for creators, new creative tools and enable Agents that can interact in richer ways closing the loop on learning to unlock world models!
I’m thrilled and proud to share our model, Movie Gen, that we've been working on for the past year, and in particular, Movie Gen Edit, for precise video editing. 😍 Look how Movie Gen edited my video! https://t.co/0YawTGo217
Lights, camera, action - introducing Meta's Movie Gen! Our latest breakthrough in AI-powered media generation, setting a new standard for immersive AI content creation. We're also releasing a 92 page detailed report of what we learned, along with evaluation prompts that we hope…
Check out Movie Gen 🎥 Our latest media generation models for video generation, editing, and personalization, with audio generation! 16 second 1080p videos generated through a simple Llama-style 30B transformer. Demo + detailed 92 page technical report 📝⬇️
And not just the paper, early next week we'll be releasing our full evaluation sets - the field of media generation would really benefit from having canonical benchmarks. Stay tuned!
So, this is what we were up to for a while :) Building SOTA foundation models for media -- text-to-video, video editing, personalized videos, video-to-audio One of the most exciting projects I got to tech lead at my time in Meta!
So proud to be part of the Movie Gen project, pushing GenAI boundaries! Two key insights: 1. Amazing team + high-quality data + clean, scalable code + general architecture + GPUs go brr = SOTA video generation. 2. Video editing *without* supervised data: train a *single* model…
Hi friends, say hello to Movie Gen. Over the past couple of months, we've been working hard behind the scenes to bring you the latest advancements in video generation. Movie Gen not only packs with text-to-video capability, but also comes with video personalization, editing, and…
And here is the most exciting model we have been working on with special capabilities in text-to-video generation, video personalization, editing, and audio generation! Plus, an invaluable tech report released! Welcome to the world, Movie Gen!
We released 92 pages worth of detail including how to benchmark these models! Super critical for the scientific progress in this field :) We'll also release evaluation benchmarks next week to help the research community 💪
📢 Point tracking 🤝 action recognition at #ECCV2024 We've set the new SoTA of few-shot action recognition by harnessing motion data from point tracking and semantic features from SSL. Curious? Visit Poster #203 Thursday AM to see the future of action recognition🔥. Details:🧵
Website: cs.umd.edu/~pulkit/tats/ Work done in collaboration with @namithap10, Luke Luo, @rssaketh and @abhi2610. 3/3
🚀 Excited to share InstanceDiffusion @cvpr2024! It adds precise instance-level control for image gen: free-form text conditions per instance and diverse location specs—points, scribbles, boxes & instance masks Code: shorturl.at/dtxSW arXiv: shorturl.at/rQS14 1/n

KittyFranklin @COXday7230v2EpP
97 Followers 2K Following
Sukriti Paul @sukritiollie
531 Followers 594 Following PhD @umdcs @ml_umd || Prev @Nonexomics, @AmericanExpress & IISc || Google WTM, CSRMP, GHC, and ACM-W Scholar. || She/Her. viewsOwn()
Steven (Shaobo) Wang ... @ShaoboWang6
386 Followers 1K Following Ph.D Candidate @sjtu1896, Intern @yaledatascience and @Alibaba_Qwen. Exploring Data-Centric AI on Foundation Models.
Mele @meleawi
101 Followers 2K Following AI, Computer Vision, deep learning, and Autonomous System Motion Planning and Control Software Engineer.
Anh Nguyen (Aengus) @aengusng8
108 Followers 2K Following Son & brother; AI Research Resident @Qualcomm; Contributor @huggingface🤗; Prev: @VinAI_Research. Update gradients in generative dimensions of computer vision.
Ollin Boer Bohan @madebyollin
3K Followers 2K Following Made sdxl-vae-fp16-fix, taesd, that pokemon-emulation-via-dnn thing.
praveen penumaka @praveenpenumaka
134 Followers 414 Following
miru @miru_why
1K Followers 1K Following 3e-4x engineer, unswizzled wagmi. specialization is for warps
Elausterio T Ferreira @Elausterio97035
32 Followers 927 Following
neeks @neeksww
285 Followers 589 Following a cluster of several atoms making a unique me - curious about nature and the signals (and systems) which make it. More here: https://t.co/6bsq8xhtTG
István Kerek @istvankerek
366 Followers 7K Following University Lecturer, Founder of the ChatGPT Hungarian Facebook Group and @ai2knowit, AI Business Development Expert
Crosesez @CrosesezHvFQ7n
24 Followers 38 Following
J @Jstl2bw
13 Followers 664 Following
Chen Sun @jesu9
2K Followers 494 Following Assistant Professor @BrownCSDept; Part-time Research Scientist @GoogleDeepMind. Opinions are my own.
Shijie Wang @ShijieWang20
197 Followers 420 Following Multimodal learning | CS PhD student @BrownUniversity, ex-Intern @GoogleDeepMind @meta, BS @Tsinghua_Uni.
Mara Levy @mlevy1221
78 Followers 104 Following PhD student @umdcs. | Excited to make robots work in the real world!
Fiona @tashethee72832
76 Followers 7K Following Don't wait for a leader; do it alone, one person at a time.
QuantiPhy @DebrupPaul2946
16 Followers 486 Following
Yuval Kirstain @YKirstain
703 Followers 657 Following Research Scientist @Meta | Building GenAI capabilities
Adam Polyak @adam_polyak90
156 Followers 245 Following
Kevin Chih-Yao Ma @chihyaoma
604 Followers 243 Following Building multimodal foundation models @MicrosoftAI | Past: a lead IC & babysitter of Meta's MovieGen, Emu, Imagine, ...
MagHolmes @JCQfep50n0H2He
56 Followers 7K Following
Shraman Pramanick @Shramanpramani2
204 Followers 541 Following Ph.D. @JohnsHopkins | Interned @AIatMeta FAIR, @google | Multimodal LLMs
Andrew white @Andreww95636515
136 Followers 2K Following 3d modeling. Gaussian splatting, NeRF, Diffusion models, GANs.
Silas Walkotte @SilasWalkotte
0 Followers 54 Following
jaiswati @jaiswati
22 Followers 451 Following
Jonas Gottschalk @JoSGottschalk
65 Followers 564 Following Building generative AI Solutions | Partner @Deyan7 GmbH & Co. KG | Hiring smart people who are passionate about developing effective digital solutions
Chris Chiasson @ChrisPChiasson
126 Followers 3K Following
Chester Jungseok Roh @chester_roh
4K Followers 2K Following Chester Jungseok Roh / BFACTORY Founder & CEO
Praveen @pravnx
453 Followers 4K Following Software Engineer; Interests: HPC, AI, Product Management, Entrepreneurship
prafulk @prafulk
239 Followers 4K Following
Guilherme @gpmarques1993
36 Followers 851 Following
Filip Kučera @PonekudOnekom
281 Followers 3K Following pronouns: e/acc; *manifesting*; PhD in mechinterp to debias VLMs @Uni_WUE, prev @CVUTPraha
Scott Nguyen @ScottNguye81334
0 Followers 11 Following
Dan @danredblack
62 Followers 2K Following
Maheedhar Gunturu @Vanguard_space
1K Followers 5K Following Mahee is a father, technologist, and a builder - formerly @aws @zscaler @smartthings @scylladb @mapr @VoltActiveData @qualcomm
iman jenabzadeh @imanjenabzadeh
46 Followers 1K Following Awandering with other wanderers in this wonderful world
Sergey Gulyaev @sergeygulyaev
60 Followers 138 Following #Mobile #Telecommunications #Marketing #Pricing #CRM
Nihaar Shah @non_gaussian
166 Followers 3K Following working on AI for new wearables @meta priors: @oxengsci @columbia @geresearch
Chris @chris___sun
331 Followers 2K Following Maker of @Elser_AI | AI Scientist & Software Engineer @Microsoft | Get your shit done, nobody cares your tech stack. Learning investment.
Jonathon Luiten @JonathonLuiten
3K Followers 2K Following Research Scientist at Meta Reality Labs in Boston Prev PhD at RWTH Aachen + Carnegie Mellon + Uni Oxford Dynamic 3D Gaussians + SplaTAM + HOTA + more From NZ
Reve @reveimage
12K Followers 20 Following Bring any idea to life 🌜 Try out our first image model at https://t.co/3Ig8tItH9k
Lilian Weng @lilianweng
163K Followers 166 Following Co-founder of Thinking Machines Lab @thinkymachines; Ex-VP, AI Safety & robotics, applied research @OpenAI; Author of Lil'Log
Ideogram @ideogram_ai
66K Followers 0 Following Turn your ideas into creative graphic designs, in a matter of seconds. What will you create?
Divya Kothandaraman @DivyaKRaman1
608 Followers 312 Following Sr. Researcher @Dolby. GenAI and Multimodal Learning. Earlier CS PhD, Univ. of Maryland @umdcs, @GoogleDeepMind @AdobeResearch @iitmadras
A Jabri @ajabri
5K Followers 282 Following research scientist @openai – 🇦🇺🇱🇧🇨🇳- ex @berkeley_ai @princeton
Susan Zhang @suchenzang
33K Followers 641 Following @ Google Deepmind. Past: @MetaAI, @OpenAI, @unitygames, @losalamosnatlab, @Princeton etc. Always hungry for intelligence.
Ollin Boer Bohan @madebyollin
3K Followers 2K Following Made sdxl-vae-fp16-fix, taesd, that pokemon-emulation-via-dnn thing.
Deepti @deeptigp
1K Followers 1K Following Asst. Professor in Computer Vision @ BU; Researcher @runwayml; ex-researcher @MetaAI, @UTCompSci , @iiit_hyderabad
Shijie Wang @ShijieWang20
197 Followers 420 Following Multimodal learning | CS PhD student @BrownUniversity, ex-Intern @GoogleDeepMind @meta, BS @Tsinghua_Uni.
Devi Parikh @deviparikh
26K Followers 209 Following Co-CEO @yutori_ai. Join the waitlist at https://t.co/zD3StYi8db.
Zhuang Liu @liuzhuang1234
11K Followers 1K Following Assistant Professor @PrincetonCS. researcher in deep learning, vision, models. previously @MetaAI, @UCBerkeley, @Tsinghua_Uni
Yuval Kirstain @YKirstain
703 Followers 657 Following Research Scientist @Meta | Building GenAI capabilities
Demis Hassabis @demishassabis
488K Followers 146 Following Nobel Laureate. Co-Founder & CEO @GoogleDeepMind - working on AGI. Solving disease @IsomorphicLabs. Trying to understand the fundamental nature of reality.
Nando de Freitas @NandoDF
105K Followers 776 Following Some projects I was lucky to be part of AlphaGo tuning, AlphaCode, Gato, ReST, r-Gemma, Imagen3, Veo, Genie, MAI. Ex Berkeley, UBC, Oxford Prof, Google DeepMind
Shelly Sheynin @ShellySheynin
1K Followers 209 Following Research Scientist @AIatMeta; Working on Media Generation; Meta Movie Gen, Emu Edit, Make-a-Video 3D, KNN Diffusion, Make-A-Scene
Wei-Ning Hsu @mhnt1580
2K Followers 133 Following Research Scientist @ Meta FAIR / audio generation, self-supervised learning, speech processing
Roshan Sumbaly @rsumbaly
2K Followers 750 Following Senior Director of AI, @AIatMeta - Llama & Movie Gen. Prior life @coursera, @linkedIn, @stanford
Adam Polyak @adam_polyak90
156 Followers 245 Following
Kevin Chih-Yao Ma @chihyaoma
604 Followers 243 Following Building multimodal foundation models @MicrosoftAI | Past: a lead IC & babysitter of Meta's MovieGen, Emu, Imagine, ...
Nathan Lambert @natolambert
56K Followers 853 Following Figuring out AI @allen_ai, open models, RLHF, fine-tuning, etc Contact via email. Writes @interconnectsai Wrote The RLHF Book Mountain runner
Yoshua Bengio @Yoshua_Bengio
25K Followers 206 Following Working towards the safe development of AI for the benefit of all @UMontreal, @LawZero_ & @Mila_Quebec A.M. Turing Award Recipient and most-cited AI researcher.
Ranjay Krishna @RanjayKrishna
6K Followers 436 Following Assistant Professor, University of Washington Co-Director RAIVN lab (https://t.co/f0BWKyjW48) Director PRIOR team (https://t.co/l9RzTetkIk)
Tanmay Gupta @tanmay2099
2K Followers 546 Following Senior Research Scientist @allen_ai (Ai2) | Developing the science and art of multimodal AI agents | Prev. CS PhD, UIUC and EE UG, IIT Kanpur
Laurens van der Maate... @lvdmaaten
4K Followers 2K Following Member of Technical Staff at Anthropic. Ex-Meta. t-SNE. Llama 3. DenseNet. Web-scale weakly supervised vision. CrypTen.
Jürgen Schmidhuber @SchmidhuberAI
163K Followers 0 Following Invented principles of meta-learning (1987), GANs (1990), Transformers (1991), very deep learning (1991), etc. Our AI is used many billions of times every day.
Berkeley AI Research @berkeley_ai
225K Followers 367 Following We're graduate students, postdocs, faculty and scientists at the cutting edge of artificial intelligence research.
Lucas Beyer (bl16) @giffmana
108K Followers 519 Following Researcher (now: Meta. ex: OpenAI, DeepMind, Brain, RWTH Aachen), Gamer, Hacker, Belgian. Anon feedback: https://t.co/xe2XUqkKit ✗DMs → email
Neha Kalibhat @NehaKalibhat
2K Followers 575 Following research @GoogleDeepMind | PhD @umdcs | safety and interpretability | she/her
Ajay Jain @ajayj_
7K Followers 4K Following Co-founder @genmoai. Co-created denoising diffusion (DDPM), DreamFusion, Dream Fields. Ex Ph.D. @berkeley_ai, @googleai, @facebookai, @nvidiaai, @mit
Samaneh Azadi @smnh_azadi
959 Followers 110 Following Research Scientist @GenAI @Meta . Ph.D. graduate from Berkeley AI Research and former intern @GoogleBrain and @AdobeResearch.
Quentin Duval @quduval
380 Followers 318 Following Research Engineer in Artificial Intelligence at Meta, Software Engineer and Functional Programming enthusiast.
XuDong Wang @XDWang101
1K Followers 646 Following Research Scientist @AIatMeta | PhD from @Berkeley_AI @UCBerkeley | Prev.: @GoogleDeepMind, FAIR @MetaAI
Andrew Brown @Andrew__Brown__
3K Followers 482 Following Research Scientist GenAI NY @AIatMeta working on video generation (Meta Movie Gen) | PhD @Oxford_VGG with Andrew Zisserman, Previously @oxengsci
ICLR 2026 @iclr_conf
52K Followers 53 Following International Conference on Learning Representations #ICLR2026. SPC is @BharathHarihar3 and GC is @cvondrick
paul @paul_okewunmi
1K Followers 4K Following ML/AI Engineer | MLH Fellow'23 @ Meta | Drone Hobbyist
Center for AI Safety @ai_risks
7K Followers 3 Following Reducing societal-scale risks from AI. https://t.co/5I9YG8IZa7 https://t.co/u91FCIyeSV
toly 🇺🇸 @aeyakovenko
628K Followers 6K Following Co-Founder of Solana Labs. Award winning phone creator. NFA, don’t trust me, mostly technical gibberish. https://t.co/LomgbTpb6h
Kosta Derpanis @CSProfKGD
68K Followers 197 Following #CS Assoc Prof @YorkUniversity, #ComputerVision Scientist Samsung #AI, @VectorInst Faculty Affiliate, TPAMI AE, @ELLISforEurope Member #ICCV2025 Publicity Chair
Chelsea Finn @chelseabfinn
82K Followers 399 Following Asst Prof of CS & EE @Stanford Co-founder of Physical Intelligence @physical_int PhD from @Berkeley_EECS, EECS BS from @MIT
Judea Pearl @yudapearl
79K Followers 275 Following Student of causal inference, human reasoning, and history of ideas, all viewed through the sharp lens of artificial intelligence.
Thomas G. Dietterich @tdietterich
57K Followers 619 Following Distinguished Professor (Emeritus), Oregon State Univ.; Former President, Assoc. for the Adv. of Artificial Intelligence; Robust AI & Comput. Sustainability
Ishan Misra @imisra_
7K Followers 245 Following Director, GenAI Research @Meta. Tech Lead of Movie Gen Meta. Past: MIT TR35, Llama3, Emu Video, ImageBind, DINO. Tweets and opinions are my own.