hf-tuner/donut-classification-turbo

Image-Text-to-Text · vision-encoder-decoder · Generated from Trainer

donut-classification-turbo

This model is a fine-tuned version of donut-base-finetuned-rvl-cdip on the rvl-cdip-document-classification dataset. It achieves the following results on the evaluation set:

Loss: 0.0499
Accuracy: 93.34%
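
For reference, accuracy on a classification task like this one is usually the fraction of evaluation documents whose predicted class matches the label. The sketch below assumes that definition; the card does not state how the metric was computed, and the helper name and sample labels are illustrative only.

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predictions equal to their reference label (hypothetical helper)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy on an empty evaluation set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Illustrative labels only, not taken from the actual evaluation set.
preds = ["invoice", "letter", "memo", "email"]
refs = ["invoice", "letter", "memo", "form"]
print(f"{exact_match_accuracy(preds, refs):.2%}")
```

Under this definition, 93.34% accuracy corresponds to an error rate of 6.66% on the evaluation set.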

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 2
eval_batch_size: 8
seed: 42
optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 3
mixed_precision_training: Native AMP
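
The hyperparameters above map roughly onto keyword arguments of `transformers.TrainingArguments`. The following is a sketch of that mapping, not the author's actual training script: `output_dir` is a placeholder, and reading "Native AMP" as `fp16=True` is an assumption. The values are kept in a plain dict so the snippet runs without `transformers` installed.

```python
# Keyword arguments mirroring the hyperparameters listed on the card.
# Intended use: transformers.TrainingArguments(**training_kwargs).
training_kwargs = {
    "output_dir": "donut-classification-turbo",  # assumption: not stated on the card
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 2,  # card: train_batch_size
    "per_device_eval_batch_size": 8,   # card: eval_batch_size
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3,
    "fp16": True,  # assumption: "Native AMP" taken to mean fp16 autocast
}

for name, value in training_kwargs.items():
    print(f"{name} = {value!r}")
```

Settings the card does not list (warmup, weight decay, gradient accumulation, logging and evaluation intervals) are left at whatever the library defaults are, which may differ from the original run.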

Training results

Training Loss	Epoch	Step	Validation Loss
0.1319	0.5	2000	0.1200
0.1365	1.0	4000	0.0845
0.1203	1.5	6000	0.0751
0.1128	2.0	8000	0.0677
0.0734	2.5	10000	0.0541
0.0707	3.0	12000	0.0499
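
A few quantities can be derived from this table together with the listed hyperparameters. The sketch below assumes a single device, no gradient accumulation, and a linear schedule with zero warmup steps; none of these is confirmed by the card, so the sample count and learning rates are estimates.

```python
# (training loss, epoch, step, validation loss) rows copied from the table.
rows = [
    (0.1319, 0.5, 2000, 0.1200),
    (0.1365, 1.0, 4000, 0.0845),
    (0.1203, 1.5, 6000, 0.0751),
    (0.1128, 2.0, 8000, 0.0677),
    (0.0734, 2.5, 10000, 0.0541),
    (0.0707, 3.0, 12000, 0.0499),
]

BASE_LR = 1e-5             # learning_rate from the card
TRAIN_BATCH_SIZE = 2       # train_batch_size from the card
TOTAL_STEPS = rows[-1][2]  # last logged step, taken as the end of training


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS):
    """Linear decay from base_lr to 0, assuming zero warmup steps."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)


# 2000 steps cover 0.5 epoch, so one epoch is 4000 optimizer steps.
steps_per_epoch = round(rows[0][2] / rows[0][1])
# Assumes one device and no gradient accumulation.
samples_per_epoch = steps_per_epoch * TRAIN_BATCH_SIZE

first_val, last_val = rows[0][3], rows[-1][3]
relative_drop = (first_val - last_val) / first_val

print(f"steps per epoch: {steps_per_epoch}")
print(f"approx. training samples per epoch: {samples_per_epoch}")
print(f"validation loss reduction: {relative_drop:.1%}")
for _, epoch, step, val_loss in rows:
    print(f"epoch {epoch:.1f}  step {step:>5}  lr {linear_lr(step):.2e}  val_loss {val_loss:.4f}")
```

The final row's validation loss (0.0499) matches the evaluation loss reported at the top of the card.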

Framework versions

Transformers 4.56.1
Pytorch 2.8.0+cu126
Datasets 4.0.0
Tokenizers 0.22.1

Safetensors

Model size: 0.2B params
Tensor type: I64 · F32

Model tree for hf-tuner/donut-classification-turbo

Base model: hf-tuner/donut-efficient-test2
Finetuned (1): this model
