Proudly presenting three doctors from my research group! 🤗
Congratulations 🥳
Dr. Yuchen Zeng @yzeng58
Dr. Ziqian Lin @myhakureimu
Dr. Ying Fan @yingfan_bot
+ I will be posting highlights of their amazing research achievements soon.... stay tuned ... ;)
How do language models generalize from information they learn in-context vs. via finetuning? We show that in-context learning can generalize more flexibly, illustrating key differences in the inductive biases of these modes of learning — and ways to improve finetuning. Thread: 1/
Thank you @hengjinlp so much for mentoring me since my junior year! It feels like yesterday that you provided detailed feedback and helped refine my submission for my very first paper in ACL 2018. To prospective students and interns: I'm currently recruiting passionate students…
working on a post that's basically "how to get a paper accepted," using as a case study one of my own papers that went from reject (2.5, 3, 3) to accept (4, 4.5, 4.5) with just one week of revisions
1/ Super excited to share our new work “LLM-Lasso,” led by my collaborators from Stanford!
tldr; We've reimagined the classic Lasso algorithm (by @robtibshirani), which uses ℓ1 regularization to select a sparse subset of features!
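For readers less familiar with the classic algorithm, here is a minimal scikit-learn sketch of plain ℓ1-regularized (Lasso) feature selection; the LLM-guided penalties that LLM-Lasso adds are not shown, and the synthetic data is purely illustrative.

```python
# Minimal sketch: classic Lasso feature selection with scikit-learn.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic data: 100 samples, 20 features, only 5 of which are informative.
X, y = make_regression(n_samples=100, n_features=20, n_informative=5,
                       noise=0.1, random_state=0)

# The l1 penalty drives most coefficients exactly to zero.
model = Lasso(alpha=1.0)
model.fit(X, y)

selected = np.flatnonzero(model.coef_)
print(f"Selected {selected.size} of {X.shape[1]} features: {selected}")
```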
💥New Paper!
Algorithmic Phases of In-Context Learning:
We show that transformers learn a superposition of different algorithmic solutions depending on the data diversity, training time and context length!
1/n
Happy to share our latest work on VersaPRM!
github.com/UW-Madison-Lee…
TL;DR: VersaPRM is the first fully open-source Process Reward Model (PRM), including data, code, and weights.
It enhances LLM accuracy using test-time compute algorithms — extending beyond just mathematics!
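To make "test-time compute with a PRM" concrete, here is a minimal, hypothetical best-of-N sketch. `generate_candidates` and `score_steps` are placeholder callables standing in for an LLM sampler and a process reward model; they are not VersaPRM's actual interface.

```python
# Hypothetical sketch of PRM-guided best-of-N selection at test time.
from typing import Callable, List

def best_of_n(question: str,
              generate_candidates: Callable[[str, int], List[List[str]]],
              score_steps: Callable[[str, List[str]], List[float]],
              n: int = 8) -> List[str]:
    """Sample N step-by-step solutions and keep the one whose weakest
    step receives the highest process reward (min aggregation)."""
    candidates = generate_candidates(question, n)   # each candidate = list of reasoning steps

    def candidate_score(steps: List[str]) -> float:
        rewards = score_steps(question, steps)      # one reward per intermediate step
        return min(rewards) if rewards else float("-inf")

    return max(candidates, key=candidate_score)

# Dummy stand-ins just to exercise the function.
demo_gen = lambda q, n: [[f"step {i}" for i in range(3)] for _ in range(n)]
demo_score = lambda q, steps: [0.5] * len(steps)
print(best_of_n("What is 7 * 8?", demo_gen, demo_score, n=4))
```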
Just 10 days after o1's public debut, we’re thrilled to unveil the open-source version of the groundbreaking technique behind its success: scaling test-time compute 🧠💡
By giving models more "time to think," LLaMA 1B outperforms LLaMA 8B in math—beating a model 8x its size.…
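As one concrete way of spending more test-time compute, here is a small self-consistency (majority-voting) sketch; `sample_answer` is a placeholder for any small model's sampling call, not the exact search recipe from the release.

```python
# Minimal sketch: spend extra test-time compute via majority voting.
import random
from collections import Counter
from typing import Callable

def majority_vote(question: str,
                  sample_answer: Callable[[str], str],
                  n_samples: int = 64) -> str:
    """Sample many answers and return the most frequent one."""
    answers = [sample_answer(question) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

# Dummy sampler to exercise the function.
print(majority_vote("2 + 2 = ?", lambda q: random.choice(["4", "4", "5"])))
```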
🎉 Milestone: Our LIFT paper has hit 100+ citations! We introduced a simple method to adapt LLMs to new domains, and researchers are now achieving success with it across predictive chemistry, metamaterial physics & more!
Check our work at uw-madison-lee-lab.github.io/LanguageInterf…
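The core LIFT idea is to express a prediction task as natural-language text so a standard LLM fine-tuning pipeline can consume it. Below is a hypothetical serialization sketch; the feature names and prompt template are illustrative assumptions, not the paper's exact format.

```python
# Hypothetical sketch: turn a tabular row into a text prompt for LLM fine-tuning.
def row_to_prompt(features: dict, target_name: str, target_value=None) -> str:
    description = ", ".join(f"{name} is {value}" for name, value in features.items())
    prompt = f"Given that {description}, what is the {target_name}?"
    if target_value is not None:            # training example: append the label
        prompt += f" Answer: {target_value}"
    return prompt

# Illustrative (made-up) chemistry-style example.
print(row_to_prompt({"molecular weight": 180.16, "logP": -0.5},
                    "solubility class", "high"))
```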
😎TLDR😎
LLMs can simultaneously solve many in-context learning tasks!
How? By giving the LLM randomly shuffled examples from multiple tasks!
This super fun project all started with the out-of-the-box thinking of @DimitrisPapail and great team effort led by @zheyangxiong!
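A minimal sketch of what such a mixed-task prompt can look like; the tasks and demonstrations below are made up for illustration.

```python
# Minimal sketch: build one in-context prompt from shuffled multi-task examples.
import random

tasks = {
    "antonym":    [("hot", "cold"), ("big", "small"), ("fast", "slow")],
    "past tense": [("go", "went"), ("eat", "ate"), ("run", "ran")],
    "capital":    [("France", "Paris"), ("Japan", "Tokyo"), ("Italy", "Rome")],
}

demos = [(x, y) for pairs in tasks.values() for (x, y) in pairs]
random.shuffle(demos)                        # interleave examples across tasks

prompt = "\n".join(f"{x} -> {y}" for x, y in demos) + "\nGermany -> "
print(prompt)  # the model must infer which task the final query belongs to
```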
🚀 Excited to share our latest research on Looped Transformers for Length Generalization!
TL;DR: We trained a Looped Transformer that dynamically adjusts the number of iterations based on input difficulty—and it achieves near-perfect length generalization on various tasks!
🧵👇
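A minimal PyTorch sketch of the looped idea: one weight-tied block applied repeatedly, with the loop count scaled to the input. The dimensions and the length-based iteration rule here are my own assumptions, not the paper's exact architecture.

```python
# Minimal sketch: a weight-tied "looped" transformer block with an
# input-dependent number of iterations.
import torch
import torch.nn as nn

class LoopedTransformer(nn.Module):
    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        # A single shared block reused at every iteration.
        self.block = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor, n_loops: int) -> torch.Tensor:
        for _ in range(n_loops):             # more loops for harder inputs
            x = self.block(x)
        return x

model = LoopedTransformer()
x = torch.randn(2, 30, 64)                   # batch of length-30 sequences
out = model(x, n_loops=x.shape[1])           # iterations tied to input length here
print(out.shape)
```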