Exploration over Exploitation.
RA @Mila_Quebec, Research Fellow @UniofOxford. MSc @UWindsor. Interested in Adversarial attacks, security & reliability of LLMs
Joined April 2014
🎉 Thrilled that our work has been accepted at #EMNLP2025 (Main Conference)!
TL;DR: We propose a framework to predict & explain unintended side effects in models (e.g., emergent toxicity, forgotten knowledge) using OOD data.
Huge thanks to @gfarnadi, @negar_rz, and Zhuan Shi 🚀
We observed similarly biased behavior when evaluating LLM routers, even with commercial solutions such as Amazon Bedrock.
By bias, I mean that the router tends to favor certain categories or keywords, consistently directing them to the more powerful model.
arxiv.org/abs/2504.07113