Tiny models used for testing
Inference Optimization
community
Qwen3.6-35B-A3B mixed-precision HIGGS model variants, plus base FP16/FP8/NVFP4 references.
- inference-optimization/Qwen3.6-35B-A3B-5.0-bits-mode-heuristic
  Image-Text-to-Text • 24B • Updated • 100
- inference-optimization/Qwen3.6-35B-A3B-5.0-bits-mode-hybrid
  Image-Text-to-Text • 24B • Updated • 100
- inference-optimization/Qwen3.6-35B-A3B-5.0-bits-mode-noise
  Image-Text-to-Text • 24B • Updated • 63
- inference-optimization/Qwen3.6-35B-A3B-5.5-bits-mode-heuristic
  Image-Text-to-Text • 26B • Updated • 45
models 386
inference-optimization/Qwen3-8B-speculators.peagle-qwen3arch-ckpt4
2B • Updated
inference-optimization/Qwen3-30B-A3B-Instruct-2507-speculator.dflash
0.7B • Updated
inference-optimization/Qwen3-8B-speculator.dflash.swa.non-qwen3-step21004
2B • Updated
inference-optimization/Laguna-XS.2-speculator.dflash-Qwen235B-500k-ckpt4
0.6B • Updated • 41
inference-optimization/Qwen3-8B-speculator.dflash.swa.non-qwen3-step126024
2B • Updated • 74
inference-optimization/Qwen3-8B-speculator.dflash.swa.non-qwen3-step56712
2B • Updated • 408
inference-optimization/Laguna-XS.2-speculator.dflash-Qwen235B-500k-ckpt3
0.6B • Updated • 113
inference-optimization/Laguna-XS.2-speculator.dflash-Qwen235B-500k-ckpt2
0.6B • Updated • 44
inference-optimization/Qwen3-8B-speculator.dflash.swa.non-qwen3-step21k
2B • Updated • 68
inference-optimization/Qwen3-8B-from-Qwen3-8B_regen-speculators.eagle3-qwen3arch-ckpt1
1B • Updated • 10
datasets 25
inference-optimization/DeepSeek-V4-Flash-responses
Viewer • Updated • 508k
inference-optimization/every-eval-ever-demo
Updated • 35
inference-optimization/Qwen3.5-4B-responses
Viewer • Updated • 7.47k • 68
inference-optimization/Qwen3.5-0.8B-responses
Viewer • Updated • 7.47k • 96
inference-optimization/Qwen3.5-9B-responses
Viewer • Updated • 7.67k • 47
inference-optimization/Qwen3-8B-Regenerated-Collection
Preview • Updated • 195
inference-optimization/Qwen3-30B-A3B-responses
Preview • Updated • 64
inference-optimization/Qwen3-32B-responses
Preview • Updated • 40
inference-optimization/ctest-Qwen3.6-27B-speculator-dataset
Viewer • Updated • 5.61k • 34
inference-optimization/Gemma4-Responses-Nemotron
Viewer • Updated • 762k • 64 • 1