# Apertus-8B-Instruct-2509-tool-use
This model is a LoRA fine-tuned version of swiss-ai/Apertus-8B-Instruct-2509, trained using TRL. It was created during the Lausanne Hackathon, as part of the Swiss AI Weeks.
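Since the model is published as a LoRA adapter on top of the instruct base, it can be loaded with `peft` and `transformers`. The sketch below is a minimal, hedged example: the two repo ids come from this card, while the chat messages and generation settings are illustrative assumptions, not the authors' exact setup.

```python
# Hedged sketch of running the adapter for tool use. The repo ids come from
# this card; the chat roles and generation settings are illustrative
# assumptions, not the authors' exact configuration.

BASE_ID = "swiss-ai/Apertus-8B-Instruct-2509"
ADAPTER_ID = "mattiaferrarini/Apertus-8B-Instruct-2509-tool-use"

def build_messages(user_query: str) -> list[dict]:
    """Assemble a minimal chat history to prompt the model (illustrative)."""
    return [
        {"role": "system", "content": "You can call tools when helpful."},
        {"role": "user", "content": user_query},
    ]

def main() -> None:
    # Heavy imports kept local so the helper above stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
    model = AutoModelForCausalLM.from_pretrained(BASE_ID, device_map="auto")
    model = PeftModel.from_pretrained(model, ADAPTER_ID)  # attach the LoRA adapter

    prompt = tokenizer.apply_chat_template(
        build_messages("What is the weather in Lausanne?"),
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the prompt.
    print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

if __name__ == "__main__":
    main()
```

Alternatively, the adapter can be merged into the base weights with `PeftModel.merge_and_unload()` if a standalone checkpoint is preferred.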
## Training procedure

The training notebook can be found here. It is an adaptation of this notebook by Hugging Face, using most of the hyperparameters recommended by Swiss AI.
## Evaluation

We evaluated the model on 357 function-calling examples, with the following results:
| Metric | Value |
|---|---|
| eval_loss | 0.5191 |
| eval_entropy | 0.5170 |
| eval_num_tokens | 2,674,282 |
| eval_mean_token_accuracy | 0.8914 |
## Framework versions

- TRL: 0.23.0
- Transformers: 4.56.2
- PyTorch: 2.8.0+cu126
- Datasets: 4.0.0
- Tokenizers: 0.22.0