Seungwook Han @seungwookh

phd-ing @MIT_CSAIL, prev @MITIBMLab @columbia hanseungwook.github.io Joined June 2017

Tweets: 136
Followers: 403
Following: 575
Likes: 639

Seungwook Han @seungwookh

4 hours ago

Why do models forget less with RL than SFT?

Jyo Pari @jyo_pari

4 hours ago

1 48 235 17K 194

0 0 0 105 0

We have a fun collaboration of @GPU_MODE x @scaleml coming up! We’re hosting a week-long online bootcamp that explores the core components of GPT-OSS while also diving into cutting-edge research that pushes beyond what’s currently in GPT-OSS! For example, how can MoE's power…

1 20 71 22K 27

Seungwook Han @seungwookh

a month ago

uncertainty-aware reasoning, akin to how humans leverage our confidence

Mehul Damani @MehulDamani2

a month ago

13 268 897 93K 613

0 1 3 331 1

Seungwook Han @seungwookh

a month ago

was actually wondering with @hyundongleee the fundamental differences between diffusion and autoregressive modeling other than the structure imposed in the modeling of the sequential conditional distribution and how they manifest. a poignant paper that addresses this thought

Mihir Prabhudesai @mihirp98

a month ago

128 192 1K 230K 937

0 1 13 987 2

Seungwook Han @seungwookh

2 months ago

omw to trying this out 👀

Pika @pika_labs

2 months ago

234 207 2K 512K 719

0 0 0 255 0

Seungwook Han @seungwookh

2 months ago

how particles can act differently under different scales and conditions and how we can equip it as part of design is cool

MIT Architecture @MITarchitecture

8 months ago

0 0 2 872 0

0 0 4 470 0

Laker Newhouse @LakerNewhouse

2 months ago

[1/9] We created a performant Lipschitz transformer by spectrally regulating the weights—without using activation stability tricks: no layer norm, QK norm, or logit softcapping. We think this may address a “root cause” of unstable training.

14 78 585 139K 567

Seungwook Han @seungwookh

2 months ago

But actually this is the og way of doing it and should stop by E-2103 to see @jxbz and Laker Newhouse whiteboard the whole paper.

Jeremy Bernstein @jxbz

2 months ago

But actually this is the og way of doing it and should stop by E-2103 to see @jxbz and Laker Newhouse whiteboard the whole paper. https://t.co/NjV3qnxCaK

3 21 197 29K 101

1 6 75 8K 11

Jyo Pari @jyo_pari

2 months ago

If you are interested in questioning how we should pretrain models and create new architectures for general reasoning, then check out E606 @ ICML, our position by @seungwookh and me on potential directions for the next generation of reasoning models!

0 6 22 2K 7

Seungwook Han @seungwookh

2 months ago

At #ICML 🇨🇦 this week. I'm convinced that the core computations are shared across modalities (vision, text, audio, etc). The real question is the (synthetic) generative process that ties them. Reach out if you have thoughts or want to chat!

0 3 16 2K 3

Seungwook Han @seungwookh

2 months ago

wholeheartedly agree with this direction that games can be a good playground for learning reasoning. makes us think what other synthetic environments we can design and grow over complexity

Bo Liu (Benjamin Liu) @Benjamin_eecs

2 months ago

4 50 273 65K 182

0 1 7 522 2

Seungwook Han @seungwookh

2 months ago

robot arms becoming more human-like. now with a wrist 🦾

Martin Peticco @martinpeticco

2 months ago

8 54 293 28K 131

0 0 3 389 0

Phillip Isola @phillip_isola

3 months ago

Our computer vision textbook is now available for free online here: visionbook.mit.edu We are working on adding some interactive components like search and (beta) integration with LLMs. Hope this is useful and feel free to submit Github issues to help us improve the text!