Michael Goin

mgoin

mgoin_

AI & ML interests

LLM inference optimization, compression, quantization, pruning, distillation

Recent Activity

updated a model 11 days ago

google/gemma-4-E4B-it-qat-mobile-ct

updated a model 11 days ago

google/gemma-4-E2B-it-qat-mobile-ct

published a model 13 days ago

google/gemma-4-E4B-it-qat-mobile-ct

mgoin's models (103)

mgoin/Qwen3.6-35B-A3B-2Bit-GSQ-ct

Image-Text-to-Text • 35B • Updated 18 days ago • 22

mgoin/Qwen3-0.6B-MXFP8

0.6B • Updated Feb 16 • 29

mgoin/GLM-4.6-FP8-BLOCK

Text Generation • 357B • Updated Feb 10 • 9

mgoin/Qwen3-0.6B-NVFP4

0.6B • Updated Aug 26, 2025 • 2

mgoin/mlperf-inference-llama3.1-8b-data

Updated Jul 15, 2025

mgoin/Llama-3.1-8B-Instruct-FP8-BLOCK

8B • Updated Jul 1, 2025 • 3

mgoin/SEMIKONG-70B-W4A16-G128

71B • Updated Jun 16, 2025 • 4

mgoin/llama-4-tiny-random

Text Generation • 6.69M • Updated May 14, 2025 • 3

mgoin/Qwen1.5-14B-Chat-GPTQ

Text Generation • Updated Mar 5, 2025 • 3

mgoin/pixtral-12b

Image-Text-to-Text • 13B • Updated Feb 7, 2025 • 256 • 1

mgoin/Llama-3.2-1B-Instruct-FP8-ATTN

1B • Updated Dec 23, 2024 • 2

mgoin/Llama-3.2-1B-Instruct-FP8-dynamic-ATTN

1B • Updated Dec 23, 2024 • 1

mgoin/Pixtral-Large-Instruct-2411

Updated Nov 19, 2024 • 1

mgoin/Qwen2.5-Coder-32B-Instruct-fp8

Updated Nov 13, 2024

mgoin/nemotron-3-8b-chat-4k-sft-hf

Text Generation • 9B • Updated Nov 13, 2024 • 355

mgoin/llava-onevision-qwen2-7b-ov-hf-bnb-full-4bit

Image-Text-to-Text • 8B • Updated Nov 5, 2024 • 5

mgoin/MiniCPM-Llama3-V-2_5-int4

Visual Question Answering • 9B • Updated Oct 31, 2024 • 1

mgoin/DeepSeek-Coder-V2-Lite-Instruct-FP8

16B • Updated Sep 20, 2024 • 3

mgoin/Mixtral-8x7B-Instruct-v0.1-FP8

47B • Updated Sep 20, 2024 • 4

mgoin/Nemotron-nemo-checkpoints

Updated Aug 30, 2024

mgoin/Minitron-4B-Base-FP8

Text Generation • 4B • Updated Aug 16, 2024 • 4 • 3

mgoin/Nemotron-4-340B-Base-hf

Text Generation • 341B • Updated Aug 8, 2024 • 5 • 1

mgoin/Nemotron-4-340B-Instruct-hf-FP8

Text Generation • 341B • Updated Aug 8, 2024 • 43 • 3

mgoin/Nemotron-4-340B-Base-hf-FP8

Text Generation • 341B • Updated Aug 8, 2024 • 5 • 2

mgoin/Nemotron-4-340B-Instruct-hf

Text Generation • 341B • Updated Aug 8, 2024 • 2.65k • 4

mgoin/SparseLLama-2-7b-ultrachat_200k-pruned_50.2of4-compressed-tensors

4B • Updated Aug 5, 2024 • 5

mgoin/Minitron-8B-Base-FP8

Text Generation • 8B • Updated Jul 26, 2024 • 4 • 3

mgoin/Nemotron-4-340B-Instruct-FP8-Dynamic

Text Generation • 341B • Updated Jul 23, 2024 • 4

mgoin/Nemotron-4-340B-Instruct-vllm

Text Generation • 341B • Updated Jul 23, 2024 • 6

mgoin/Mistral-Nemo-Instruct-2407-FP8-KV

Text Generation • 12B • Updated Jul 18, 2024 • 77 • 1