# Apertus-8B-Instruct-2509-tool-use
This model is a LoRA fine-tuned version of swiss-ai/Apertus-8B-Instruct-2509, trained using TRL. It was created during the Lausanne Hackathon, as part of the Swiss AI Weeks.
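Since the model is published as a LoRA adapter on top of the instruct base, it can be loaded with `peft` and `transformers`. The sketch below is a minimal, hedged example: the two repo ids come from this card, while the chat messages and generation settings are illustrative assumptions, not the authors' exact setup.

```python
# Hedged sketch of running the adapter for tool use. The repo ids come from
# this card; the chat roles and generation settings are illustrative
# assumptions, not the authors' exact configuration.

BASE_ID = "swiss-ai/Apertus-8B-Instruct-2509"
ADAPTER_ID = "mattiaferrarini/Apertus-8B-Instruct-2509-tool-use"

def build_messages(user_query: str) -> list[dict]:
    """Assemble a minimal chat history to prompt the model (illustrative)."""
    return [
        {"role": "system", "content": "You can call tools when helpful."},
        {"role": "user", "content": user_query},
    ]

def main() -> None:
    # Heavy imports kept local so the helper above stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
    model = AutoModelForCausalLM.from_pretrained(BASE_ID, device_map="auto")
    model = PeftModel.from_pretrained(model, ADAPTER_ID)  # attach the LoRA adapter

    prompt = tokenizer.apply_chat_template(
        build_messages("What is the weather in Lausanne?"),
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the prompt.
    print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

if __name__ == "__main__":
    main()
```

Alternatively, the adapter can be merged into the base weights with `PeftModel.merge_and_unload()` if a standalone checkpoint is preferred.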
## Training procedure

The training notebook can be found here. It is an adaptation of this notebook by Hugging Face, using most of the hyperparameters recommended by Swiss AI.
## Evaluation

We evaluated the model on 357 function-calling examples, with the following results:
| Metric | Value |
|---|---|
| eval_loss | 0.5191 |
| eval_entropy | 0.5170 |
| eval_num_tokens | 2,674,282 |
| eval_mean_token_accuracy | 0.8914 |
## Framework versions

- TRL: 0.23.0
- Transformers: 4.56.2
- PyTorch: 2.8.0+cu126
- Datasets: 4.0.0
- Tokenizers: 0.22.0