	---
	language:
	- en
	library_name: mlx
	tags:
	- qwen-coder
	- MOE
	- pruning
	- compression
	- mlx
	license: apache-2.0
	name: cerebras/Qwen3-Coder-REAP-25B-A3B
	description: 'This model was obtained by uniformly pruning 20% of experts in Qwen3-Coder-30B-A3B-Instruct
	using the REAP method.

	'
	readme: 'https://huggingface.co/cerebras/Qwen3-Coder-REAP-25B-A3B/main/README.md

	'
	license_link: https://huggingface.co/cerebras/Qwen3-Coder-REAP-25B-A3B/blob/main/LICENSE
	pipeline_tag: text-generation
	base_model: cerebras/Qwen3-Coder-REAP-25B-A3B
	---

	# Qwen3-Coder-REAP-25B-A3B-qx64-hi-mlx

	The regular Deckard (qx) formula quantizes the embeddings at the same bit width as the data stores, in this case 4 bits.

	The head and select attention paths are raised to 6 bits, and the model is quantized with group size 32 (the `hi` suffix).
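
To make the storage cost of the smaller group size concrete, here is a minimal sketch of the bits-per-weight arithmetic. It assumes group-wise affine quantization in which every group stores one 16-bit scale and one 16-bit bias alongside the packed weights. The function name and the group size 64 comparison rows are illustrative and are not taken from this model's conversion recipe.

```python
def effective_bits_per_weight(bits: int, group_size: int,
                              scale_bits: int = 16, bias_bits: int = 16) -> float:
    """Average storage per weight for group-wise affine quantization.

    Assumption: each group of `group_size` weights carries one scale and
    one bias, so their cost is spread evenly over the group.
    """
    return bits + (scale_bits + bias_bits) / group_size


# Compare the 4-bit and 6-bit layers at group size 32 and, for reference, 64.
for bits, group_size in [(4, 64), (4, 32), (6, 64), (6, 32)]:
    bpw = effective_bits_per_weight(bits, group_size)
    print(f"{bits}-bit, group size {group_size}: {bpw:.2f} bits per weight")
```

Under these assumptions, halving the group size from 64 to 32 adds half a bit per weight in exchange for finer-grained scales.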

	There is an updated model, [Qwen3-Coder-REAP-25B-A3B-qx65x-hi-mlx](https://huggingface.co/nightmedia/Qwen3-Coder-REAP-25B-A3B-qx65x-hi-mlx), that uses 6-bit embeddings on a 5-bit base and should perform slightly better on long context.

	Metrics coming soon.

	-G

	This model [Qwen3-Coder-REAP-25B-A3B-qx64-hi-mlx](https://huggingface.co/nightmedia/Qwen3-Coder-REAP-25B-A3B-qx64-hi-mlx) was
	converted to MLX format from [cerebras/Qwen3-Coder-REAP-25B-A3B](https://huggingface.co/cerebras/Qwen3-Coder-REAP-25B-A3B)
	using mlx-lm version 0.28.3.

	## Use with mlx

	```bash
	pip install mlx-lm
	```

	```python
	from mlx_lm import load, generate

	# Load from the Hugging Face Hub by full repo id (a local path also works)
	model, tokenizer = load("nightmedia/Qwen3-Coder-REAP-25B-A3B-qx64-hi-mlx")

	prompt = "hello"

	# Wrap the prompt in the model's chat template when one is defined
	if tokenizer.chat_template is not None:
	    messages = [{"role": "user", "content": prompt}]
	    prompt = tokenizer.apply_chat_template(
	        messages, add_generation_prompt=True
	    )

	response = generate(model, tokenizer, prompt=prompt, verbose=True)
	```