My Hugging Face repos: https://t.co/yh7J4DFGTc
Discord server: https://t.co/5h6rGsGfBx
Patreon: https://t.co/yfQwFggGtx (patreon.com/TheBlokeAI)
UK, joined July 2010
FYI to anyone using @MistralAI's Mixtral for long context tasks -- you can get even better performance by disabling sliding window attention (setting it to your max context length)
config.sliding_window = 32768
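A minimal sketch of how that one-liner might be applied when loading through Transformers. The helper names and the idea of passing the patched config back into `from_pretrained` are assumptions for illustration; only `config.sliding_window = 32768` comes from the post.

```python
def disable_sliding_window(config, max_context=32768):
    """Set the sliding window to the full context length, as the tip suggests."""
    config.sliding_window = max_context
    return config


def load_mixtral_long_context(model_id, max_context=32768):
    """Hypothetical loader: patch the config, then load the model with it."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoConfig, AutoModelForCausalLM

    config = disable_sliding_window(AutoConfig.from_pretrained(model_id), max_context)
    return AutoModelForCausalLM.from_pretrained(model_id, config=config, device_map="auto")
```

Calling `load_mixtral_long_context` with a Mixtral repo id would download the full weights, so it is only worth running on hardware that can hold the model.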
Transformers now supports Mixtral GPTQs and I've updated my READMEs accordingly. It was awesome working with @_marcsun and @younesbelkada of @huggingface on this!
Credit to LaaZa for coding the AutoGPTQ quant and inference implementation which enabled me to get GPTQs out fast!
@TheBlokeAI joined me to share his work in the open-source AI space - don't miss it! happening right now
server link: discord.gg/peBrCpheKE
(see the general channel or events channel for google meet link)
Blazing fast text generation using AWQ and fused modules! 🚀
Up to 3x speedup compared to native fp16 that you can use right now on any models supported by @TheBlokeAI
Simply pass an `AwqConfig` with `do_fuse=True` to `from_pretrained` method!
huggingface.co/docs/transform…
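A rough sketch of the fused-module loading described above, assuming a Transformers release with AWQ support, the `autoawq` package, and a CUDA GPU. The helper names and the `fuse_max_seq_len` value of 512 are illustrative assumptions, not part of the original post.

```python
def awq_fuse_settings(max_seq_len=512):
    """Keyword arguments for AwqConfig with fused modules enabled."""
    # fuse_max_seq_len should cover prompt length plus generated tokens.
    return {"bits": 4, "do_fuse": True, "fuse_max_seq_len": max_seq_len}


def load_fused_awq(model_id, max_seq_len=512):
    """Hypothetical loader for an AWQ checkpoint with fused modules."""
    # Imported lazily so the settings helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AwqConfig

    quant_config = AwqConfig(**awq_fuse_settings(max_seq_len))
    model = AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=quant_config
    )
    return model.to(0)  # move to the first CUDA device
```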
It's been awesome to see Transformers getting support for more and more quantisation methods. And I've loved collaborating with @younesbelkada and @huggingface again!
All my AWQ uploads now support Transformers. READMEs will update soon to show a Transformers Python example.
oh hello @TheBlokeAI
I want to bookmark your 'Recent models' Collection on @huggingface 🔥
Well... you can now upvote Collections!
and browse upvoted collections on your profile ❤️
Thanks again to @latitudesh for the loan of a beast 8xH100 server this week. I uploaded over 550 new repos, maybe my busiest week yet!
Quanting is really resource intensive. Needs not only fast GPUs, but many CPUs, lots of disk, and 🚀 network. A server that ✅ all is v. rare!
🔥Excited to introduce LMSYS-Chat-1M, a large-scale dataset of 1M real-world conversations with 25 cutting-edge LLMs!
This dataset, collected from chat.lmsys.org, offers insights into user interactions with LLMs and intriguing use cases.
Link: huggingface.co/datasets/lmsys…
New feature alert in the @huggingface ecosystem!
Flash Attention 2 natively supported in huggingface transformers, supports training PEFT, and quantization (GPTQ, QLoRA, LLM.int8)
First pip install flash attention and pass use_flash_attention_2=True when loading the model!
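A minimal sketch of those two steps, assuming the pip package is `flash-attn` (imported as `flash_attn`) and a GPU that supports it. The helper names are hypothetical; `use_flash_attention_2=True` is the flag named in the post, and newer Transformers releases may prefer a different argument.

```python
import importlib.util


def has_flash_attn():
    """Return True if the flash_attn package can be imported."""
    return importlib.util.find_spec("flash_attn") is not None


def load_with_flash_attention(model_id):
    """Hypothetical loader that enables Flash Attention 2 in half precision."""
    # Imported lazily so has_flash_attn() works without torch or transformers.
    import torch
    from transformers import AutoModelForCausalLM

    if not has_flash_attn():
        raise RuntimeError("Install flash attention first: pip install flash-attn")
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # Flash Attention 2 expects fp16 or bf16
        use_flash_attention_2=True,
    )
```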
This is fantastic! Git clone was already dead for HF as far as I was concerned - I had my own hf_upload.py and hf_download.py scripts (wrapping HfApi) for fast, efficient transfers.
But huggingface_hub v0.17 makes those redundant! I will be using this now. Awesome stuff,🤗
This new filter 🔎 on @huggingface user's profile is very helpful, especially to check if @TheBlokeAI has quantized and released the last trending models 😁
Chronos 70B v2 release! Thanks to Pygmalion for generously providing the compute and @TheBlokeAI for quantizing the model. As usual, the model is optimized for chat, roleplay, and storywriting, and now includes vastly improved reasoning skills.
huggingface.co/elinas/chronos…
Transformers 4.32.0 now supports GPTQ models natively!
Over the last couple of days I have updated 296 of my GPTQ repos to provide automatic support for this.
It's awesome you can now load a GPTQ model directly in Transformers with only two lines of code!
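A sketch of what the two-line load could look like, plus a small version check. The helper names are hypothetical, and it is an assumption here that the `optimum` and `auto-gptq` packages are installed alongside Transformers 4.32.0 or later.

```python
import re


def supports_native_gptq(version):
    """True if a Transformers version string is 4.32.0 or newer."""
    parts = [int(p) for p in re.findall(r"\d+", version)[:3]]
    parts += [0] * (3 - len(parts))  # pad short versions such as "4.32"
    return tuple(parts) >= (4, 32, 0)


def load_gptq(model_id):
    """The two-line load: import, then from_pretrained on a GPTQ repo."""
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
```

For example, `load_gptq("TheBloke/Llama-2-7B-Chat-GPTQ")` would fetch and load that repo, which is named here only as an illustration.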