This is an MXFP4_MOE quantization of the model MiniMax-M2-THRIFT.
Original model: https://huggingface.co/VibeStudio/MiniMax-M2-THRIFT
Download the latest llama.cpp to use it.
Compression: 25% expert pruning (256 -> 192 experts), top_k = 8
Keep in mind that this is a coding model.
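As a quick-start sketch, the quantized GGUF can be run with llama.cpp's `llama-cli` once the latest build is installed. The file name and generation settings below are assumptions for illustration; substitute the actual GGUF file name from this repository.

```shell
# Minimal llama.cpp usage sketch. The model file name is an assumption --
# replace it with the GGUF file actually shipped in this repository.
llama-cli \
  -m MiniMax-M2-THRIFT-MXFP4_MOE.gguf \
  -p "Write a Python function that reverses a linked list." \
  -c 8192 \
  -ngl 99   # offload all layers to GPU if VRAM allows; lower or omit for CPU
```

Since this is a coding model, prompts like the one above (concrete programming tasks) are the intended use case.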