Yong Jae Lee @yong_jae_lee

Professor, Computer Sciences, UW-Madison. I am a computer vision and machine learning researcher.

cs.wisc.edu/~yongjaelee/ · Madison, WI · Joined April 2018

Tweets: 94

Followers: 973

Following: 125

Likes: 338

Sharon Y. Li @SharonYixuanLi

2 weeks ago

My students called the new CDIS building “state-of-the-art”. I thought they were exaggerating. Today I moved in and saw it for myself. Wow. Photos cannot capture the beauty of the design.

17 13 310 39K 35


#ICCV2025 Introducing X-Fusion: Introducing New Modality to Frozen Large Language Models. It is a novel framework that adapts pretrained LLMs (e.g., LLaMA) to new modalities (e.g., vision) while retaining their language capabilities and world knowledge! (1/n) Project Page:…

2 25 82 6K 29


Mu Cai @MuCai7

2 months ago

LLaVA-Prumerge, the first work on visual token reduction for MLLMs, finally got accepted after being cited 146 times since last year. Congrats to the team! @yuzhang_shang @yong_jae_lee See how to make MLLM inference much cheaper while maintaining performance. llava-prumerge.github.io

AI Bites | YouTube Channel @ai_bites

a year ago

0 1 2 6K 3


2 12 57 6K 11


Aniket Rege @wregss

3 months ago

Training text-to-image models? Want your models to represent cultures across the globe but don't know how to systematically evaluate them? Introducing ⚕️CuRe⚕️ a new benchmark and scoring suite for cultural representativeness through the lens of information gain (1/10)

1 11 29 5K 5


Yong Jae Lee @yong_jae_lee

3 months ago

Thank you @_akhaliq for sharing our work!

AK @_akhaliq

3 months ago


4 14 140 32K 71


0 0 7 1K 1

Yong Jae Lee @yong_jae_lee

4 months ago

Congratulations Dr. Mu Cai @MuCai7! Mu is my 8th PhD student and first to start in my group at UW–Madison after my move a few years ago. He made a number of important contributions in multimodal models during his PhD, and recently joined Google DeepMind. I will miss you a lot Mu!

12 8 232 22K 10


Jianwei Yang @jw2yang4ai

4 months ago

🚀 Excited to announce our 4th Workshop on Computer Vision in the Wild (CVinW) at @CVPR 2025! 🔗 computer-vision-in-the-wild.github.io/cvpr-2025/ ⭐We have invited a great lineup of speakers: Prof. Kaiming He, Prof. @BoqingGo, Prof. @CordeliaSchmid, Prof. @RanjayKrishna, Prof. @sainingxie, Prof.…

1 26 103 27K 9


Ernest Ryu @ErnestRyu

5 months ago

Public service announcement: Multimodal LLMs are really bad at understanding images with *precision*. x.com/lukeprog/statu… A thread🧵: 1/13.

Luke Muehlhauser @lukeprog

5 months ago


171 107 3K 437K 479


1 11 53 10K 30

Yong Jae Lee @yong_jae_lee

5 months ago

Congratulations again @MuCai7!! So well deserved. I will miss having you in the lab.

Mu Cai @MuCai7

5 months ago


61 45 2K 132K 141


1 0 46 7K 2

Yong Jae Lee @yong_jae_lee

7 months ago

Check out our new ICLR 2025 paper, LLaRA, which transforms a pretrained vision-language model into a robot vision-language-action policy! Joint work with @XiangLi54505720, @ryoo_michael, et al. from Stony Brook U, and @MuCai7. github.com/LostXine/LLaRA

Xiang Li @XiangLi54505720

7 months ago

1 7 45 13K 15


2 6 49 4K 8

Xueyan Zou @xyz2maureen

9 months ago

🔥Poster: Fri 13 Dec 4:30 pm - 7:30 pm PST (West) This is the first time I have tried to sell a new concept that I believe in but that is not in trend. I truly believe the language between LLMs/LMMs is embeddings, and that interfacing with embeddings will be essential in the future! Everyone is welcome to come😀

1 14 45 6K 7


Jiasen Lu @jiasenlu

9 months ago

📢Come join our 1st Workshop on Video-Language Models at #NeurIPS 2024. We have seen great progress on image-language models; now it is time for videos! Our invited speakers will talk more about how we can move forward! …and-language-workshop-2024.webflow.io Special invited talks…

Sangdoo Yun @oodgnas

9 months ago

1 6 19 6K 2

2 17 50 11K 8


Mu Cai @MuCai7

9 months ago

🚨 I’ll be at #NeurIPS2024! 🚨On the industry job market this year and eager to connect in person! 🔍 My research explores multimodal learning, with a focus on object-level understanding and video understanding. 📜 3 papers at NeurIPS 2024: Workshop on Video-Language Models 📅…

5 20 133 30K 36


Mu Cai @MuCai7

10 months ago

I am not at #EMNLP2024 but @bochengzou is in Florida! Go check out vector graphics, a promising format for visual representation that is completely different from pixels. Thanks to LLMs, vector graphics are more powerful now! Go chat with @bochengzou if you are interested!