Yiming Dou @_YimingDou

Ph.D. student at UMich | B.Eng. from SJTU | Computer Vision, Multimodal, Robotics | yimingdou.com | Shanghai ↔️ Ann Arbor | Joined March 2022

Tweets: 78 | Followers: 763 | Following: 895 | Likes: 280

Paul Liang @pliang279

3 months ago

Despite much progress in AI, the ability for AI to 'smell' like humans remains elusive. Smell AIs 🤖👃 can be used for allergen sensing (e.g., peanuts or gluten in food), hormone detection for health, safety & environmental monitoring, quality control in manufacturing, and more.…

7 18 133 15K 44

Linyi Jin @jin_linyi

3 months ago

Hello! If you are interested in dynamic 3D or 4D, don't miss the oral session 3A at 9 am on Saturday: @zhengqi_li will be presenting "MegaSaM", I'll be presenting "Stereo4D", and @QianqianWang5 will be presenting "CUT3R".

1 6 36 1K 1

Ayush Shrivastava @ayshrv

3 months ago

Excited to share our CVPR 2025 paper on cross-modal space-time correspondence! We present a method to match pixels across different modalities (RGB-Depth, RGB-Thermal, Photo-Sketch, and cross-style images) — trained entirely using unpaired data and self-supervision. Our…

1 28 121 9K 80

Jeongsoo Park @jespark0

3 months ago

Can AI image detectors keep up with new fakes? Mostly, no. Existing detectors are trained using a handful of models. But there are thousands in the wild! Our work, Community Forensics, uses 4800+ generators to train detectors that generalize to new fakes. #CVPR2025 🧵 (1/5)

1 9 23 2K 0

Daniel Geng @dangengdg

3 months ago

Hello! If you like pretty images and videos and want a rec for a CVPR oral session, you should def go to Image/Video Gen, Friday at 9am: I'll be presenting "Motion Prompting", @RyanBurgert will be presenting "Go with the Flow", and @ChangPasca1650 will be presenting "LookingGlass".

3 16 64 5K 1

Chris Rockwell @_crockwell

4 months ago

Ever wish YouTube had 3D labels? 🚀Introducing🎥DynPose-100K🎥, an Internet-scale collection of diverse videos annotated with camera pose! Applications include camera-controlled video generation🤩and learned dynamic pose estimation😯 Download: huggingface.co/datasets/nvidi…

2 38 178 42K 97

Yuanchen Ju @ju_yuanchen

5 months ago

🧩#CVPR2025🌷Introducing Two By Two✌️: The First Large-Scale Daily Pairwise Assembly Dataset with SE(3)-Equivariant Pose Estimation. 🤖2BY2 helps robots master daily 3D assembly tasks, like plugging sockets or arranging flowers, across diverse objects! 🐨Co-led by @yuqi_Beijing

2 22 89 9K 23

Yiming Dou @_YimingDou

5 months ago

Thanks to @OpenAI, got a chance to grow up again in Ghibli anime🤗

0 0 15 616 0

Sarah Jabbour @SarahJabbour_

8 months ago

I’m on the PhD internship market for Spr/Summer 2025! I have experience in multimodal AI (EHR, X-ray, text), explainability for image models w/ genAI, clinician-AI interaction (surveyed 700+ doctors), and tabular foundation models. Please reach out if you think there’s a fit!

1 11 65 6K 8

Yuanchen Ju @ju_yuanchen

9 months ago

🍌We present DenseMatcher! 🤖DenseMatcher enables robots to acquire generalizable skills across diverse object categories from a single demo by finding correspondences between 3D objects, even ones with different types, shapes, and appearances.

9 28 115 24K 40

Daniel Geng @dangengdg

9 months ago

What happens when you train a video generation model to be conditioned on motion? Turns out you can perform "motion prompting," just like you might prompt an LLM! Doing so enables many different capabilities. Here’s a few examples – check out this thread 🧵 for more results!

20 146 673 93K 334

Junyi Zhang @junyi42

11 months ago

Excited to share MonST3R, a simple way to estimate geometry from unposed video of dynamic scenes! We achieve competitive results on several downstream tasks (video depth, camera pose) and believe this is a promising step toward feed-forward 4D reconstruction. monst3r-project.github.io

22 140 734 131K 316

Zichen Wang @Zichen2501

11 months ago

Differentiable rendering made SIMPLE❗️ Differentiating physically based renderers is hard: Dirac-delta discontinuities arise at object silhouettes. Our #SIGGRAPHAsia2024 work shows how a simple relaxation can save the day, enabling easy 3D reconstruction and relighting! (1/N)

5 55 347 44K 211

Ayush Shrivastava @ayshrv

11 months ago

We present Global Matching Random Walks, a simple self-supervised approach to the Tracking Any Point (TAP) problem, accepted to #ECCV2024. We train a global matching transformer to find cycle-consistent tracks through video via contrastive random walks (CRW).