Model Card for Behemoth-ReduX-123B-v1c-GGUF

I just made this so I could fit it on my GPU with a little more room for comfort.

I had to quantize down to Q3_K_M from their Q5_K_M GGUF rather than from the original full-precision weights.

Requantizing an already-quantized GGUF is, in theory, bad according to llama.cpp. So much so that the quantize tool requires a flag (--allow-requantize) before it will do it. In practice, though, this hasn't been the case for me: every time I've quantized from a GGUF the result has been great.

So don't be worried. This model, while quantized twice, should still perform well.
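
If you want to reproduce that step yourself, here is a minimal sketch of the requantization, assuming a local llama.cpp build with the llama-quantize binary on your PATH; the file names are illustrative, not the exact files I used.

```python
# Sketch of requantizing an already-quantized GGUF with llama.cpp's quantize tool.
# Assumes llama-quantize is on PATH; file names below are placeholders.
import subprocess

subprocess.run(
    [
        "llama-quantize",
        "--allow-requantize",  # required because the input GGUF is already quantized (Q5_K_M)
        "Behemoth-ReduX-123B-v1c-Q5_K_M.gguf",  # source quant (illustrative file name)
        "Behemoth-ReduX-123B-v1c-Q3_K_M.gguf",  # output quant (illustrative file name)
        "Q3_K_M",  # target quantization type
    ],
    check=True,
)
```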

Bias, Risks, and Limitations

Uncensored

Recommendations

See the base model card for more information. This model uses the Mistral V2/V3 prompt template.
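
For a quick local test, here is a minimal loading sketch using llama-cpp-python (my assumption; any GGUF-capable runtime works), with the prompt wrapped in the Mistral-style [INST] tags. The path, context length, and prompt are illustrative, not tuned recommendations.

```python
# Minimal sketch: load the Q3_K_M GGUF and send one Mistral-style prompt.
from llama_cpp import Llama

llm = Llama(
    model_path="Behemoth-ReduX-123B-v1c-Q3_K_M.gguf",  # illustrative path
    n_gpu_layers=-1,  # offload as many layers as possible; reduce if you run out of VRAM
    n_ctx=8192,       # illustrative context length
)

out = llm(
    "[INST] Write a short scene aboard a storm-battered airship. [/INST]",
    max_tokens=256,
)
print(out["choices"][0]["text"])
```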
