Models
Datasets
Spaces
Buckets new
Docs
Enterprise
Pricing
- Website
- Community
- Solutions
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2604.13016

This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recip

lllyx/Qwen3-1.7B-SFT

Text Generation • 2B • Updated 27 days ago • 700 • 4
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 109
lllyx/Qwen3-4B-Base-GRPO

Text Generation • 4B • Updated May 3 • 411 • 3
lllyx/OpenThought3-Qwen3-4B

Viewer • Updated 27 days ago • 305k • 184 • 2

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 109

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21, 2024 • 61
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17, 2024 • 53
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 45
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20, 2024 • 64

about 20 hours ago

Code as Agent Harness

Paper • 2605.18747 • Published 21 days ago • 213
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

Paper • 2605.12500 • Published 27 days ago • 191
From Context to Skills: Can Language Models Learn from Context Skillfully?

Paper • 2604.27660 • Published May 3 • 166
PhysBrain 1.0 Technical Report

Paper • 2605.15298 • Published 25 days ago • 143

Visual Spatial Tuning

Paper • 2511.05491 • Published Nov 7, 2025 • 53
Adam's Law: Textual Frequency Law on Large Language Models

Paper • 2604.02176 • Published Apr 2 • 506
Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Paper • 2604.10098 • Published Apr 11 • 82
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 109

This collection includes the models used in the paper "Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recip

lllyx/Qwen3-1.7B-SFT

Text Generation • 2B • Updated 27 days ago • 700 • 4
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 109
lllyx/Qwen3-4B-Base-GRPO

Text Generation • 4B • Updated May 3 • 411 • 3
lllyx/OpenThought3-Qwen3-4B

Viewer • Updated 27 days ago • 305k • 184 • 2

about 20 hours ago

Code as Agent Harness

Paper • 2605.18747 • Published 21 days ago • 213
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture

Paper • 2605.12500 • Published 27 days ago • 191
From Context to Skills: Can Language Models Learn from Context Skillfully?

Paper • 2604.27660 • Published May 3 • 166
PhysBrain 1.0 Technical Report

Paper • 2605.15298 • Published 25 days ago • 143

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 109

Visual Spatial Tuning

Paper • 2511.05491 • Published Nov 7, 2025 • 53
Adam's Law: Textual Frequency Law on Large Language Models

Paper • 2604.02176 • Published Apr 2 • 506
Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Paper • 2604.10098 • Published Apr 11 • 82
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 109

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21, 2024 • 61
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17, 2024 • 53
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 45
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20, 2024 • 64

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs