How to use from
Unsloth Studio
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for mudler/Qwopus3.6-35B-A3B-v1-APEX-GGUF to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for mudler/Qwopus3.6-35B-A3B-v1-APEX-GGUF to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for mudler/Qwopus3.6-35B-A3B-v1-APEX-GGUF to start chatting
Quick Links

Qwopus 3.6 35B-A3B v1 APEX GGUF

APEX (Adaptive Precision for EXpert Models) quantizations of Jackrong/Qwopus3.6-35B-A3B-v1.

Brought to you by the LocalAI team | APEX Project

Available Files

File Profile Size Best For
Qwopus3.6-35B-A3B-v1-APEX-I-Quality.gguf I-Quality 23 GB Highest quality with imatrix
Qwopus3.6-35B-A3B-v1-APEX-Quality.gguf Quality 23 GB Highest quality standard
Qwopus3.6-35B-A3B-v1-APEX-I-Balanced.gguf I-Balanced 25 GB Best overall quality/size ratio
Qwopus3.6-35B-A3B-v1-APEX-Balanced.gguf Balanced 25 GB General purpose
Qwopus3.6-35B-A3B-v1-APEX-I-Compact.gguf I-Compact 17 GB Consumer GPUs, best quality/size
Qwopus3.6-35B-A3B-v1-APEX-Compact.gguf Compact 17 GB Consumer GPUs
Qwopus3.6-35B-A3B-v1-APEX-I-Mini.gguf I-Mini 14 GB Smallest viable, fastest inference

What is APEX?

APEX is a quantization strategy for Mixture-of-Experts (MoE) models. It classifies tensors by role (routed expert, shared expert, attention) and applies a layer-wise precision gradient — edge layers get higher precision, middle layers get more aggressive compression. I-variants use diverse imatrix calibration (chat, code, reasoning, tool-calling, agentic traces, Wikipedia).

See the APEX project for full details.

Architecture

  • Base Model: Jackrong/Qwopus3.6-35B-A3B-v1
  • Architecture: Qwen3.5-MoE 35B-A3B
  • Layers: 40
  • Experts: 256 routed (8 active per token)
  • Total Parameters: ~35B
  • Active Parameters: ~3B per token
  • APEX Config: 6+6 symmetric edge gradient across 40 layers
  • Calibration: v1.3 diverse dataset (chat, code, reasoning, tool-calling, multilingual)

Run with LocalAI

local-ai run mudler/Qwopus3.6-35B-A3B-v1-APEX-GGUF@Qwopus3.6-35B-A3B-v1-APEX-I-Balanced.gguf

Credits

APEX is brought to you by the LocalAI team. Developed through human-driven, AI-assisted research. Built on llama.cpp.

Downloads last month
9,254
GGUF
Model size
35B params
Architecture
qwen35moe
Hardware compatibility
Log In to add your hardware

We're not able to determine the quantization variants.

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for mudler/Qwopus3.6-35B-A3B-v1-APEX-GGUF

Quantized
(20)
this model

Collection including mudler/Qwopus3.6-35B-A3B-v1-APEX-GGUF