nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16 Text Generation • 561B • Updated about 8 hours ago • 56.9k • 183
ML Intern 🤖 • 398 • Chat with an AI‑powered ML Intern for instant help
Distilling 100B+ Models 40x Faster with TRL 📝 • 85 • TRL distillation for 100B+ teachers, 40x faster
The Synthetic Data Playbook: Generating Trillions of the Finest Tokens 📝 • 246 • Explore synthetic data benchmarks via an interactive bookshelf
The Jagged AI Frontier is a Data Frontier 🧭 • 17 • Why AI capabilities are shaped by data availability
The Smol Training Playbook 📚 • 3.2k • The secrets to building world-class LLMs
Unlocking On-Policy Distillation for Any Model Family 📝 • 111 • Visualize on‑policy distillation token alignment