# Model Card for Behemoth-ReduX-123B-v1c-GGUF
I made this quant so the model fits on my GPU with a little more headroom.
I had to quantize from their Q5_K_M GGUF rather than the original weights.
In theory this is bad practice, according to llama.cpp; so much so that the quantize tool requires an explicit flag (`--allow-requantize`) before it will requantize an already-quantized GGUF. In practice I haven't found that to be the case: every time I've quantized from a GGUF, the result has been great.
So don't worry. This model, while quantized twice, should perform fine.
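For reference, here is a minimal sketch of how a requantization like this can be driven from Python with llama.cpp's `llama-quantize` tool. The file names are hypothetical placeholders, not the exact files used for this repo; `--allow-requantize` is the flag that permits quantizing from a source GGUF that is already quantized.

```python
# Minimal sketch: requantize an existing Q5_K_M GGUF down to Q3_K_M with
# llama.cpp's llama-quantize tool. File names are hypothetical placeholders.
import subprocess

subprocess.run(
    [
        "llama-quantize",
        "--allow-requantize",  # required when the source GGUF is already quantized
        "Behemoth-ReduX-123B-v1.1-Q5_K_M.gguf",  # source (already quantized)
        "Behemoth-ReduX-123B-v1c-Q3_K_M.gguf",   # output
        "Q3_K_M",                                # target quantization type
    ],
    check=True,
)
```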
## Bias, Risks, and Limitations

Uncensored.

### Recommendations
See the base model card for more information. This model uses the Mistral V2/V3 instruct template.
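As a usage sketch (not part of the original card), the quant can be loaded with llama-cpp-python. The model path, context size, and GPU settings below are placeholder assumptions; `chat_format="mistral-instruct"` is one way to have the Mistral instruct template applied for you.

```python
# Minimal usage sketch with llama-cpp-python; the file path and settings are
# placeholders, not values from the original card.
from llama_cpp import Llama

llm = Llama(
    model_path="Behemoth-ReduX-123B-v1c-Q3_K_M.gguf",  # hypothetical local path
    n_ctx=8192,                      # context window; adjust to your VRAM
    n_gpu_layers=-1,                 # offload all layers to the GPU if they fit
    chat_format="mistral-instruct",  # applies the Mistral [INST] ... [/INST] template
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a short scene set on a desert caravan."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```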
## Model tree for IIEleven11/Behemoth-ReduX-123B-v1c-Q3_K_M_GGUF

- Base model: mistralai/Mistral-Large-Instruct-2407
- Quantized from: TheDrummer/Behemoth-ReduX-123B-v1.1-GGUF