Excited to see Orthogonal Finetuning (OFT) and Quantized OFT (QOFT) now merged into LLaMA-Factory! 🎉
OFT & QOFT are memory/time/parameter-efficient and excel at preserving pretraining knowledge. Try them in:
🔗 LLaMA-Factory: github.com/hiyouga/LLaMA-…
🔗 PEFT:…
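The core idea behind OFT, sketched in plain Python under stated assumptions: instead of adding low-rank deltas, OFT multiplies a frozen weight by a learned *orthogonal* matrix, which preserves the angles between neuron directions and with them much of the pretrained knowledge. The 2×2 Cayley parameterization and toy weight below are illustrative only, not the PEFT implementation.

```python
def cayley_rotation(a):
    """Orthogonal 2x2 matrix from one unconstrained parameter `a` via the
    Cayley transform R = (I - S)(I + S)^{-1} with skew-symmetric
    S = [[0, a], [-a, 0]]. Closed form: a rotation with
    cos = (1 - a^2)/(1 + a^2), sin = 2a/(1 + a^2)."""
    d = 1.0 + a * a
    c, s = (1.0 - a * a) / d, 2.0 * a / d
    return [[c, -s], [s, c]]

def matmul(A, B):
    """Plain list-of-lists matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Frozen pretrained weight (hypothetical toy values); in OFT only the
# parameters of R are trained, W itself never changes.
W = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
R = cayley_rotation(0.3)
W_adapted = matmul(R, W)  # orthogonal rotation of the frozen weight
```

Because R is orthogonal by construction, the adapted layer stays close to the pretrained geometry no matter how the single parameter moves; QOFT applies the same idea on top of a quantized base model.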
Falcon-H1 technical report is now available! The latest open hybrid Transformer–Mamba model family.
The 80+ page report details the key design decisions behind H1, from architectural innovations and data strategies to training recipes that challenge conventional practices in the field.
New tech report out! 🚀
Scaling Up RL: Unlocking Diverse Reasoning in LLMs via Prolonged Training
An expanded version of our ProRL paper — now with more training insights and experimental details.
Read it here 👉 arxiv.org/abs/2507.12507
📢📢📢 Releasing OpenThinker3-1.5B, the top-performing SFT-only model at the 1B scale! 🚀
OpenThinker3-1.5B is a smaller version of our previous 7B model, trained on the same OpenThoughts3-1.2M dataset.
Introducing Easy Dataset
A no-code framework for synthesizing fine-tuning data from unstructured documents using LLMs (including local models served via Ollama)
Supports OCR, chunking, QA augmentation, and export to LlamaFactory/Unsloth fine-tuning frameworks
huggingface.co/papers/2507.04…
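The chunking step in a pipeline like this can be sketched in a few lines. The helper below is hypothetical (Easy Dataset's own chunker is more elaborate, e.g. structure-aware): it slides a character window with overlap so that a QA-synthesis prompt never loses context at a chunk boundary.

```python
def chunk_text(text, max_chars=200, overlap=20):
    """Greedy character-window chunking with overlap: each chunk repeats the
    last `overlap` characters of the previous one, so facts straddling a
    boundary still appear whole in at least one chunk.
    (Illustrative sketch, not Easy Dataset's actual chunker.)"""
    if max_chars <= overlap:
        raise ValueError("max_chars must exceed overlap")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        start += max_chars - overlap
    return chunks
```

Each chunk would then be sent to the LLM to generate question–answer pairs before export in a format the downstream fine-tuning framework accepts.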
PPO and GRPO — a workflow breakdown of the most popular reinforcement learning algorithms
➡️ Proximal Policy Optimization (PPO): The Stable Learner
It’s used everywhere from dialogue agents to instruction tuning because it balances learning fast with staying safe.
▪️ How PPO…
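The "learn fast but stay safe" behavior comes from PPO's clipped surrogate objective. A minimal per-sample sketch (names and the 0.2 clip range are illustrative defaults, not from the thread):

```python
import math

def ppo_clip_loss(logp_new, logp_old, advantage, clip_eps=0.2):
    """Clipped surrogate loss for one (state, action) sample, to be averaged
    over a batch. The probability ratio new/old is clipped to
    [1 - clip_eps, 1 + clip_eps], so a single update cannot push the policy
    too far from the one that collected the data."""
    ratio = math.exp(logp_new - logp_old)        # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantage
    clipped = max(min(ratio, 1.0 + clip_eps), 1.0 - clip_eps) * advantage
    # PPO maximizes min(unclipped, clipped); return the negated value
    # so gradient descent on this loss performs that maximization.
    return -min(unclipped, clipped)
```

When the new policy matches the old one the ratio is 1 and the loss is just the negated advantage; once the ratio drifts past the clip range, the gradient through the clipped term vanishes, which is what keeps updates conservative.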
Fine-tune Llama-3.1 8B with Llama-Factory on AMD GPUs with this step-by-step guide: bit.ly/4k14ORL
Discover more fine-tuning tutorials on the ROCm AI Developer Hub: bit.ly/4kLQiOQ
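A LLaMA-Factory run of this kind is driven by a single YAML file passed to `llamafactory-cli train`. A hedged sketch of what such a config might look like; the dataset name and hyperparameters below are placeholders, and the AMD guide linked above is authoritative for ROCm-specific setup:

```yaml
# sketch of a LoRA SFT config for LLaMA-Factory (values are illustrative)
model_name_or_path: meta-llama/Meta-Llama-3.1-8B-Instruct
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
dataset: identity            # placeholder; use your own dataset name
template: llama3
cutoff_len: 2048
output_dir: saves/llama3.1-8b-lora
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
bf16: true
```

On ROCm the same config applies; only the PyTorch/ROCm installation differs from the CUDA path.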
DeepSeek 671b and Qwen3 236b support with Megatron backend is now available as preview in verl v0.4.0 🔥🔥🔥
We will continue optimizing MoE model performance down the road.
DeepSeek 671b: verl.readthedocs.io/en/latest/perf…
verl v0.4: github.com/volcengine/ver…