Hamel Husain @HamelHusain

Researcher focusing on LLMs: https://t.co/iVZDFdIQiE Previously, dev tools and infra for ML. Ex @Github, @Airbnb. @fastdotai core contributor.

hamel.dev

Portland, OR

Joined September 2012

Tweets: 10K

Followers: 23K

Following: 2K

Likes: 11K

Grant♟️ @granawkins

9 hours ago

sota RAG in 2024

37 105 652 73K 245


It's @huggingface Accelerate release time and there are a TON of exciting features to get through: new optimizers, FP8 fixes, DataLoader improvements, documentation, and so much more! For a quick read, check out the full notes: github.com/huggingface/ac… Otherwise let's dig in🧵

2 7 26 2K 5


Gagan Biyani 🏛 @gaganbiyani

9 hours ago

Maven's top AI course just added a ton of new guest speakers. Incredible talent convening on Maven to teach LLM Fine-Tuning:
- Wing Lian: Creator of Axolotl library for LLM fine-tuning
- Shreya Shankar: LLMOps and LLM Evaluations researcher
- Zachary Mueller: Lead maintainer…

Dan Becker @dan_s_becker

20 hours ago

3 4 43 11K 27

1 2 13 5K 2

Nate Raw @_nateraw

a day ago

brushed up my personal site and brain dumped a post on 🎶musicgen-songstarter-v0.2🎶 It covers:
- 🧠my thought process/motivation behind it
- ✏️notes on my previous experiments over the last 9 months
- 👀 training deets, @weights_biases logs w/ hparams
nateraw.com/posts/training…

2 9 59 5K 24


Hamel Husain @HamelHusain

2 days ago

Classic example of overfitting to the validation set re: LLMs, when I started working with @_cartermp I found few-shot examples from the validation set in the prompt (we fixed it!). There are lots of reasons for a separate eval set. Overfitting can come in many forms.

2 2 21 4K 6
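
The leak described in the tweet above (few-shot examples in the prompt that also appear in the validation set) can be caught with a simple overlap check. The following is only a minimal sketch: `normalize`, `find_leaks`, and the sample strings are hypothetical and not taken from the original thread, and it only detects near-exact duplicates after case and whitespace normalization.

```python
def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting differences do not hide a match."""
    return " ".join(text.lower().split())


def find_leaks(few_shot_examples, eval_examples):
    """Return the few-shot examples that also appear in the evaluation set."""
    eval_set = {normalize(example) for example in eval_examples}
    return [ex for ex in few_shot_examples if normalize(ex) in eval_set]


# Hypothetical data, for illustration only.
few_shot = ["Show slow requests by endpoint", "Count errors per service"]
eval_inputs = ["count  errors per SERVICE", "Average latency last hour"]

leaks = find_leaks(few_shot, eval_inputs)
print(leaks)
```

Any example this returns should be removed from either the prompt or the evaluation set before scores are reported. Paraphrased leaks would need a fuzzier comparison than this.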

Hamel Husain @HamelHusain

2 days ago

Has someone created materials around “fundamentals of ML for AI Engineers”, not focused on building models but things like evaluations, error analysis, etc.? Maybe something already exists? I don’t want to do it lol - looking for a resource I can share with people

35 31 367 175K 543

Hugo Bowne-Anderson @hugobowne

3 days ago

📺Tune in next week as @rasbt and I riff on "Developing and Training LLMs From Scratch" in a live podcast recording for @VanishingData 💫 lu.ma/build-llms-fro… This will likely be a sprawling convo in which we tell you everything you need to know about LLMs, but were too…

0 9 54 30K 21

Stas Bekman @StasBekman

4 days ago

NVIDIA has just added CUDA checkpointing functionality via: github.com/NVIDIA/cuda-ch… which should allow CRIU to do application-level checkpointing, that includes GPU state save/restore. Thank you for addressing this long-outstanding request, @NVIDIAAI Discovered via this…

2 46 274 26K 130

Hamel Husain @HamelHusain

3 days ago

I’ve tried 7+ AI note takers and my favorite one by far is @circlebackai. Here is some code I have been playing with to automate the tedious process of writing up consulting proposals based on meeting summaries using circleback webhooks + @modal_labs gist.github.com/hamelsmu/ac72d…

5 15 148 17K 193

Hamel Husain @HamelHusain

3 days ago

I’m getting lots of questions about why this is a bad idea. Repeatedly peeking at the validation set in the process of optimizing anything makes that validation set very biased. It’s very bad hygiene to intermingle your validation and test/eval set. The consequences of this…

Hamel Husain @HamelHusain

3 days ago

10 3 77 46K 38


4 5 83 17K 34
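
The bias from repeatedly peeking at a validation set can be illustrated with a toy simulation. This is a sketch under stated assumptions, not anything from the thread: every candidate is a hypothetical random guesser with a true accuracy of 50%, and `simulate` simply keeps whichever one scores best on a small validation set. The selected candidate looks good on the data it was chosen on and falls back to chance on a separate held-out set.

```python
import random


def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def simulate(n_candidates=200, n_val=50, n_test=2000, seed=0):
    """Pick the best of many random guessers on a validation set, then score it on held-out data."""
    rng = random.Random(seed)
    val_labels = [rng.randint(0, 1) for _ in range(n_val)]
    test_labels = [rng.randint(0, 1) for _ in range(n_test)]

    best_val, best_test = -1.0, None
    for _ in range(n_candidates):
        # Each hypothetical candidate guesses at random, so its true accuracy is 50%.
        val_preds = [rng.randint(0, 1) for _ in range(n_val)]
        test_preds = [rng.randint(0, 1) for _ in range(n_test)]
        val_acc = accuracy(val_preds, val_labels)
        if val_acc > best_val:
            best_val = val_acc
            best_test = accuracy(test_preds, test_labels)
    return best_val, best_test


val_acc, test_acc = simulate()
print(f"validation accuracy of selected candidate: {val_acc:.2f}")
print(f"held-out accuracy of the same candidate:   {test_acc:.2f}")
```

The gap between the two numbers comes purely from selecting on the validation set, which is why a separate test/eval set that is never used for selection gives the more honest estimate.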

anton @abacaj

3 days ago

@HamelHusain The secret is training on test set

2 2 34 4K 0

Hamel Husain @HamelHusain

3 days ago

Am I misunderstanding something? dspy-docs.vercel.app/docs/quick-sta…

10 3 77 46K 38


Wing Lian (caseus) @winglian

a week ago

Here's the 256k (262k) version built on OSS tools so that anyone can reproduce on their own. Trained using PoSE further extending our previous 64k version at the original RoPE theta. Per our previous experiments, I expect this should handle passkey retrieval up to 512k. 🤗Model:…

Gradient @Gradient_AI_

a week ago

31 80 426 198K 289


9 24 151 37K 83

Hamel Husain @HamelHusain

4 days ago

This is the way I feel about some LLM frameworks

Andrej Karpathy @karpathy

4 days ago

This is the way I feel about some LLM frameworks

15 7 419 91K 93

5 2 76 14K 7

Zach Mueller @TheZachMueller

4 days ago

Happy to say that @huggingface accelerate has hit 100 MILLION downloads today! It's been so much fun enabling so many users to have their code just run on any system with as minimal friction as possible. Here's to 200M 🚀🚀🚀

3 14 84 10K 6


jason liu @jxnlco

4 days ago

Finetuning Embeddings: Most people don't know that if you had any production-ready data, you should be able to fine-tune and outperform OpenAI.
1. With even 2,000 examples, you can fine-tune an embedding.
2. By using the Hugging Face Inference Server and Modal Labs, we showed…

8 51 416 66K 752


Jeremy Howard @jeremyphoward

5 days ago

There's a new bill, SB-1047 "Safe and Secure Innovation for Frontier Artificial Intelligence Models Act". I think it could do a great deal of harm to startups, American innovation, open source, and safety. So I've written a response to the authors: 🧵 answer.ai/posts/2024-04-…