---
license: mit
tags:
- text-classification
- int8
- Intel® Neural Compressor
- neural-compressor
- PostTrainingDynamic
- onnx
datasets:
- glue
metrics:
- f1
---
# INT8 deberta-v3-base-mrpc

## Post-training dynamic quantization

### ONNX

This is an INT8 ONNX model quantized with [Intel® Neural Compressor](https://github.com/intel/neural-compressor).

The original FP32 model comes from the fine-tuned model [Intel/deberta-v3-base-mrpc](https://huggingface.co/Intel/deberta-v3-base-mrpc).
#### Test result

|                        |  INT8  |  FP32  |
|------------------------|:------:|:------:|
| **Accuracy (eval-f1)** | 0.9239 | 0.9223 |
| **Model size (MB)**    |  350   |  705   |
#### Load ONNX model:

```python
from optimum.onnxruntime import ORTModelForSequenceClassification

model = ORTModelForSequenceClassification.from_pretrained('Intel/deberta-v3-base-mrpc-int8-dynamic')
```