My students called the new CDIS building “state-of-the-art”. I thought they were exaggerating.
Today I moved in and saw it for myself. Wow. Photos cannot capture the beauty of the design.
#ICCV2025 Introducing X-Fusion: Introducing New Modality to Frozen Large Language Models
It is a novel framework that adapts pretrained LLMs (e.g., LLaMA) to new modalities (e.g., vision) while retaining their language capabilities and world knowledge! (1/n)
Project Page:…
LLaVA-Prumerge, the first work on visual token reduction for MLLMs, finally got accepted after being cited 146 times since last year.
Congrats to the team! @yuzhang_shang @yong_jae_lee
See how to do MLLM inference much more cheaply while maintaining performance. llava-prumerge.github.io
Training text-to-image models?
Want your models to represent cultures across the globe but don't know how to systematically evaluate them?
Introducing ⚕️CuRe⚕️ a new benchmark and scoring suite for cultural representativeness through the lens of information gain
(1/10)
Congratulations Dr. Mu Cai @MuCai7! Mu is my 8th PhD student and first to start in my group at UW–Madison after my move a few years ago. He made a number of important contributions in multimodal models during his PhD, and recently joined Google DeepMind. I will miss you a lot Mu!
Check out our new ICLR 2025 paper, LLaRA, which transforms a pretrained vision-language model into a robot vision-language-action policy! Joint work with @XiangLi54505720, @ryoo_michael, et al from Stony Brook U, and @MuCai7. github.com/LostXine/LLaRA
🔥Poster: Fri 13 Dec 4:30 pm - 7:30 pm PST (West)
This is my first time trying to sell a new concept that I believe in but that is not yet a trend. I truly believe the language between LLMs/LMMs is embeddings, and that interfacing with embeddings will be essential in the future!
Everyone is welcome to come 😀
📢 Come join our 1st Workshop on Video-Language Models at #NeurIPS 2024.
We have seen great progress on image-language models; now it is time for videos! Our invited speakers will talk about how we can move forward!
…and-language-workshop-2024.webflow.io
Special invited talks…
🚨 I’ll be at #NeurIPS2024! 🚨On the industry job market this year and eager to connect in person!
🔍 My research explores multimodal learning, with a focus on object-level understanding and video understanding.
📜 3 papers at NeurIPS 2024:
Workshop on Video-Language Models
📅…
I am not at #EMNLP2024, but @bochengzou is in Florida!
Go check out vector graphics, a promising format for visual representation that is completely different from pixels. Thanks to LLMs, vector graphics are more powerful now!
Go chat with @bochengzou if you are interested!