Text Generation • 2B • Updated • 6 • 1
Text Generation • 0.2B • Updated • 14 • 2
Text Generation • 0.5B • Updated • 10 • 1
Text Generation • 1B • Updated • 2.36k • 1
Text Generation • 7B • Updated • 61 • 3
fla-hub/SmolLM-1.7b-predecay • 2B • Updated • 4
Text Generation • 0.2B • Updated • 169 • 5
Text Generation • 0.5B • Updated • 1.31k
Text Generation • 2B • Updated • 343
Text Generation • 3B • Updated • 268 • 2
Text Generation • 7B • Updated • 37 • 2
fla-hub/Qwen2.5-3B-Instruct • 3B • Updated • 32
8B • Updated • 2
fla-hub/Qwen2.5-7B-Instruct • 8B • Updated • 3
Text Generation • 3B • Updated • 88 • 3
Text Generation • 2B • Updated • 369 • 9
Text Generation • 0.2B • Updated • 53 • 1
Text Generation • 0.5B • Updated • 272 • 1
Text Generation • 1B • Updated • 15
Text Generation • 0.4B • Updated • 8
Text Generation • 0.2B • Updated • 8 • 4
fla-hub/transformer-340M-4K-0.5B-20480-lr3e-4-decay0.1-sqrt • 0.4B • Updated • 3
fla-hub/transformer-340M-4K-0.5B-20480-lr3e-4-cosine • 0.4B • Updated • 57 • 1
fla-hub/transformer-3B-qwen2.5 • 3B • Updated • 3
fla-hub/transformer-3B-qwen2.5-instruct • 3B • Updated • 3
fla-hub/transformer-1.5B-qwen2.5-instruct • 2B • Updated • 2
fla-hub/transformer-1.5B-qwen2.5 • 2B • Updated • 5 • 1
fla-hub/transformer-340M-10B • Text Generation • 0.3B • Updated • 8
fla-hub/delta_net-1.3B-100B • Text Generation • 1B • Updated • 1.59k
Text Generation • 3B • Updated • 7