What happens when we have access to two different explainability metrics: gradients *and* attention weights on transformers? How do they differ per layer? How are they affected by task complexity, or model choice?
Click here to explore more: yifan0sun.github.io/BERTGradGraph/
See more at…
How do task complexity and optimality really affect loss sharpness/flatness? What does sharpness really look like, especially along the training path? Check out this loss landscape visualizer!
yifan0sun.github.io/LossLandscapeV…
See more at optimalvisualizer.com
There are a lot of intuitions and studies about which layers are responsible for which level of task complexity, and about what fine-tuning does to the saliency of attention layers. Now you can visualize it in real time!
bert-attention-visualizer.vercel.app
See more at optimalvisualizer.com
Ok can someone tell me what tools people are using to make these amazing simulation videos? I feel so technologically behind. What people are doing today seems so professional, and the sheer volume suggests a tool out there that does it much faster than whatever I'm using
Blog post: what does it mean to be subgaussian?
thehappyoptimist.com/2024/10/29/wha…
Tl;dr: shouldn't the subgaussian bound be parameter-dependent? Somehow just basing it on the size of the support seems too loose.
am I overthinking it?
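A quick numerical sketch of the question, using a Bernoulli(p) variable as the test case. Hoeffding's lemma gives a variance proxy of (b - a)^2 / 4 = 1/4 for anything supported on [0, 1], regardless of p. The smallest valid proxy is sup over λ of 2ψ(λ)/λ², where ψ is the log-MGF of the centered variable, and for a Bernoulli this has a known closed form, (1 - 2p) / (2 ln((1 - p)/p)), which does depend on p. The function names below are made up for illustration, and the grid search is only an approximation of the supremum.

```python
import math

def log_mgf_centered_bernoulli(lam, p):
    """log E[exp(lam * (B - p))] for B ~ Bernoulli(p)."""
    return math.log(1.0 - p + p * math.exp(lam)) - lam * p

def numeric_variance_proxy(p, lam_max=30.0, step=0.01):
    """Approximate sup_lam 2*psi(lam)/lam^2 on a symmetric grid (lam != 0)."""
    best = 0.0
    n_steps = int(lam_max / step)
    for k in range(1, n_steps + 1):
        for lam in (k * step, -k * step):
            ratio = 2.0 * log_mgf_centered_bernoulli(lam, p) / lam ** 2
            best = max(best, ratio)
    return best

def closed_form_variance_proxy(p):
    """Known optimal subgaussian variance proxy for Bernoulli(p), 0 < p < 1."""
    if abs(p - 0.5) < 1e-12:
        return 0.25  # limiting value at p = 1/2
    return (1.0 - 2.0 * p) / (2.0 * math.log((1.0 - p) / p))

# Hoeffding's support-only proxy for a variable on [0, 1]: (b - a)^2 / 4.
HOEFFDING_PROXY = 0.25

for p in (0.5, 0.3, 0.1, 0.01):
    print(
        f"p={p}: variance={p * (1 - p):.4f}, "
        f"grid proxy={numeric_variance_proxy(p):.4f}, "
        f"closed form={closed_form_variance_proxy(p):.4f}, "
        f"Hoeffding={HOEFFDING_PROXY}"
    )
```

The support-only bound is tight only at p = 1/2. As p moves toward 0 or 1 the optimal proxy shrinks, but more slowly than the variance p(1 - p), so the parameter-dependent bound is tighter than Hoeffding without collapsing to the variance.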
We will present our #COLM2024 paper, Does RoBERTa Perform Better than BERT in Continual Learning: An Attention Sink Perspective, on Monday 11:00 AM – 1:00 PM at #20 Poster Area. Please stop by if you are interested!
Paper: openreview.net/pdf?id=VHhwhmt…
Code: github.com/StonyBrookNLP/…
What are people's favorite tablet platforms? When I use GoodNotes, I can't seem to fit a lot on the page, but when I use Jamboard equivalents, the technology just isn't as good. Is there something that mimics a blackboard better, where you can fit a lot of content on one page?
Question for twitterverse: how hard is it to implement XGBoost from scratch (or some barebones version of it)? I'm thinking of adding it as an undergrad assignment. Would that be cruel or just eccentric?
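For a sense of scale, here is one possible barebones version, sketched under heavy assumptions: a single feature, squared loss, depth-1 trees (stumps), and exact split search. It keeps the parts that make the method XGBoost-like (gradient and Hessian statistics, the L2-regularized leaf weight -G/(H + λ), and the matching gain formula) and drops everything else (multiple features, deeper trees, the γ pruning penalty, subsampling, histogram splits, missing-value handling). All function names are hypothetical and this is not how the XGBoost library is organized.

```python
def fit_stump(xs, grads, hess, lam=1.0):
    """Find the threshold on a 1-D feature that maximizes the regularized gain.

    Returns (threshold, left_weight, right_weight); threshold is None when
    no split improves on a single leaf.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    G, H = sum(grads), sum(hess)
    root_weight = -G / (H + lam)
    best_gain, best = 0.0, (None, root_weight, root_weight)
    GL = HL = 0.0
    for rank in range(len(order) - 1):
        i, j = order[rank], order[rank + 1]
        GL += grads[i]
        HL += hess[i]
        if xs[i] == xs[j]:
            continue  # cannot split between equal feature values
        GR, HR = G - GL, H - HL
        gain = 0.5 * (GL ** 2 / (HL + lam) + GR ** 2 / (HR + lam) - G ** 2 / (H + lam))
        if gain > best_gain:
            threshold = (xs[i] + xs[j]) / 2.0
            best_gain = gain
            best = (threshold, -GL / (HL + lam), -GR / (HR + lam))
    return best

def predict_stump(stump, x):
    threshold, left_weight, right_weight = stump
    if threshold is None or x < threshold:
        return left_weight
    return right_weight

def fit_boosted_stumps(xs, ys, n_rounds=50, eta=0.3, lam=1.0):
    """Boost stumps on squared loss 0.5 * (y - f)^2, so g = f - y and h = 1."""
    preds = [0.0] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        grads = [p - y for p, y in zip(preds, ys)]
        hess = [1.0] * len(xs)
        threshold, wl, wr = fit_stump(xs, grads, hess, lam)
        stump = (threshold, eta * wl, eta * wr)  # shrinkage folded into leaves
        stumps.append(stump)
        preds = [p + predict_stump(stump, x) for p, x in zip(preds, xs)]
    return stumps

def predict(stumps, x):
    return sum(predict_stump(s, x) for s in stumps)

# Toy usage: learn a step function at x = 4.5.
xs = [float(i) for i in range(10)]
ys = [0.0 if x < 5 else 1.0 for x in xs]
model = fit_boosted_stumps(xs, ys)
print(predict(model, 2.0), predict(model, 7.0))
```

At this level of simplification the core fits in about fifty lines. Most of the extra difficulty in an assignment would come from extending the split search to many features and recursing to deeper trees.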
Whew, disappeared for a few months to do some teaching. Updated my giant pile of machine learning problems, which I've used in my undergrad and graduate courses. Do let me know if you find them useful! sites.google.com/site/yifansunw…