Apertus-8B-Instruct-2509-FP8-Dynamic

This is an FP8 dynamically quantized version of swiss-ai/Apertus-8B-Instruct-2509, produced with llm-compressor.

Quantization Details

  • Quantization Scheme: FP8_dynamic
  • Method: Weights stored in FP8; activations quantized to FP8 dynamically at runtime (per-tensor/per-token scales computed on the fly, so no calibration data is required)
  • Targets: All Linear layers
  • Ignored Layers: lm_head (kept in higher precision for better output quality)
  • Tool: llm-compressor (Neural Magic)
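The core idea behind the dynamic scheme above can be illustrated in plain Python: pick a scale from the tensor's observed max magnitude, then round each value to the nearest number representable in FP8 E4M3 (max finite value 448, 3 mantissa bits). This is a minimal sketch of the math only, not llm-compressor's implementation; the function names are hypothetical, and edge cases (NaNs, denormals) are ignored.

```python
import math

E4M3_MAX = 448.0  # largest finite value in FP8 E4M3


def round_to_e4m3(x: float) -> float:
    # Round to the nearest FP8 E4M3-representable value
    # (simplified: no NaN or denormal handling).
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    m, e = math.frexp(abs(x))  # abs(x) = m * 2**e, with m in [0.5, 1)
    # 1 implicit + 3 explicit mantissa bits => m snaps to multiples of 1/16.
    m = round(m * 16) / 16
    return sign * min(math.ldexp(m, e), E4M3_MAX)


def fp8_dynamic_quantize(values):
    # Dynamic scaling: map the observed max magnitude onto E4M3_MAX.
    amax = max(abs(v) for v in values) or 1.0
    scale = amax / E4M3_MAX
    q = [round_to_e4m3(v / scale) for v in values]  # quantized codes
    return q, scale


def dequantize(q, scale):
    # Recover approximate original values.
    return [v * scale for v in q]
```

Values at or near powers of two of the max round-trip almost exactly, while others pick up the roughly 3% relative error inherent to a 3-bit mantissa; keeping lm_head out of this scheme avoids compounding that error at the output logits.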
  • Model size: 8B params (Safetensors)
  • Tensor types: BF16, F8_E4M3

Model tree for starbix/Apertus-8B-Instruct-2509-FP8_dynamic

  • Base model: swiss-ai/Apertus-8B-Instruct-2509 (this repository is a quantized variant)