---
license: apache-2.0
library_name: mlx
pipeline_tag: text-generation
base_model: Qwen/Qwen3-Next-80B-A3B-Thinking
tags:
  - mlx
  - qwen3_next
  - 8-bit
  - affine
  - text-generation
quantization_config:
  bits: 8
  mode: affine
  group_size: 64
model-index:
  - name: Qwen3-Next-80B-A3B-Thinking 8-bit (MLX)
    results: []
---

# Qwen3-Next-80B-A3B-Thinking — MLX 8-bit (affine)

An 8-bit affine-quantized MLX checkpoint of the base model Qwen/Qwen3-Next-80B-A3B-Thinking, packaged for local inference on Apple Silicon.

## Key details

- Format: MLX runtime, sharded safetensors weights
- Quantization: affine int8, `group_size=64` (see the conversion sketch below)
- Task: text generation / chat
- Tokenizer: BPE, provided via `tokenizer.json`, with `chat_template.jinja` for chat formatting
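
The quantization settings above map directly onto mlx-lm's conversion utility. A minimal sketch of how a checkpoint with these settings can be produced (parameter names follow `mlx_lm.convert`; the output directory is an assumption):

```python
# Sketch: reproduce this checkpoint's quantization with mlx-lm's convert API.
# Assumes mlx-lm is installed; the output path is illustrative.
from mlx_lm import convert

convert(
    "Qwen/Qwen3-Next-80B-A3B-Thinking",               # base model on the Hub
    mlx_path="Qwen3-Next-80B-A3B-Thinking-8bit-mlx",  # hypothetical output dir
    quantize=True,    # enable affine weight quantization
    q_bits=8,         # 8 bits per weight
    q_group_size=64,  # one scale/zero-point pair per 64 weights
)
```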

## Usage (MLX)

```bash
pip install mlx-lm
```

```python
from mlx_lm import load, generate

repo_id = "abnormalmapstudio/Qwen3-Next-80B-A3B-Thinking-8bit-mlx"
model, tokenizer = load(repo_id)
out = generate(model, tokenizer, "List 5 creative dinner ideas.", max_tokens=200)
print(out)
```
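
Since this is a chat-tuned thinking model, prompts should normally go through the bundled chat template; thinking models also emit reasoning tokens before the answer, so a larger `max_tokens` budget helps. A sketch using the standard `apply_chat_template` API (the user message is illustrative):

```python
from mlx_lm import load, generate

model, tokenizer = load("abnormalmapstudio/Qwen3-Next-80B-A3B-Thinking-8bit-mlx")

# Render the conversation with the checkpoint's chat_template.jinja.
messages = [{"role": "user", "content": "List 5 creative dinner ideas."}]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

out = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(out)
```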

## Benchmarks

- Will be added after upload completes; see `scripts/bench/qwen_mxfp4_vs_int4.py` and `scripts/bench/model_queue_eval.py`. In the meantime, a rough local throughput check is sketched below.
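
mlx-lm's `verbose` flag prints prompt and generation tokens-per-second (plus peak memory) after each run, which is enough for a rough local check; the prompt below is arbitrary:

```python
from mlx_lm import load, generate

model, tokenizer = load("abnormalmapstudio/Qwen3-Next-80B-A3B-Thinking-8bit-mlx")

# verbose=True prints generation-speed and peak-memory stats after the run.
generate(
    model,
    tokenizer,
    prompt="Summarize the benefits of on-device inference.",
    max_tokens=256,
    verbose=True,
)
```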

## License

- Apache-2.0 for this packaging; see `LICENSE`.
- The base model's license and terms also apply (Qwen/Qwen3-Next-80B-A3B-Thinking).