Apertus-8B-Instruct-2509-tool-use

This model is a LoRA fine-tuned version of swiss-ai/Apertus-8B-Instruct-2509. It has been trained using TRL. It was created during the Lausanne Hackathon within the context of the Swiss AI Weeks.
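Since the model is fine-tuned for tool use, requests are typically built as a list of chat messages plus a list of tool definitions. The snippet below is a minimal sketch of the OpenAI-style structure that `transformers`' `apply_chat_template(..., tools=...)` accepts; the tool name and schema are hypothetical, and the exact prompt rendering is defined by the base model's chat template, not shown here.

```python
import json

# Hypothetical tool definition in the JSON-schema style accepted by
# `apply_chat_template(..., tools=...)`. The `get_weather` function is
# an illustrative assumption, not part of this model card.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [
    {"role": "user", "content": "What is the weather in Lausanne?"},
]

# The tokenizer of the base model would render this into the prompt, e.g.:
# prompt = tokenizer.apply_chat_template(
#     messages, tools=tools, add_generation_prompt=True, tokenize=False)
print(json.dumps({"messages": messages, "tools": tools}, indent=2))
```

The model is then expected to emit a function call (name plus JSON arguments) that the calling application executes before returning the result as a tool message.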

Training procedure

The training notebook can be found here. It is an adaptation of this notebook by Hugging Face, using most of the hyperparameters recommended by Swiss AI.

Evaluation

We evaluated the model on 357 function-calling examples, with the following results:

Metric                     Value
eval_loss                  0.5191
eval_entropy               0.5170
eval_num_tokens            2,674,282
eval_mean_token_accuracy   0.8914
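As a toy illustration of what the two accuracy-style metrics measure: mean token accuracy is the fraction of non-masked target tokens whose argmax prediction matches the label, and entropy measures how peaked the model's predictive distribution is. This is a simplified sketch of the metrics' meaning, not TRL's exact implementation.

```python
import math

def mean_token_accuracy(logits, labels, ignore_index=-100):
    """Fraction of non-masked tokens where argmax(logits) == label.
    Toy version of what eval_mean_token_accuracy reports (an
    assumption about the metric, not TRL's exact code)."""
    correct = total = 0
    for token_logits, label in zip(logits, labels):
        if label == ignore_index:  # padding / prompt tokens are skipped
            continue
        pred = max(range(len(token_logits)), key=lambda i: token_logits[i])
        correct += pred == label
        total += 1
    return correct / total

def token_entropy(probs):
    """Shannon entropy (in nats) of one predictive distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Toy vocabulary of 3 tokens over 4 positions; the last label is masked.
logits = [[2.0, 0.1, 0.1], [0.1, 3.0, 0.2], [1.0, 0.9, 0.8], [0.5, 0.5, 2.5]]
labels = [0, 1, 2, -100]
print(mean_token_accuracy(logits, labels))  # 2 of 3 unmasked tokens correct
print(token_entropy([0.5, 0.25, 0.25]))     # low entropy = confident model
```

Under this reading, an eval_mean_token_accuracy of 0.8914 means roughly 89% of target tokens in the evaluation set were predicted exactly.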

Framework versions

  • TRL: 0.23.0
  • Transformers: 4.56.2
  • Pytorch: 2.8.0+cu126
  • Datasets: 4.0.0
  • Tokenizers: 0.22.0
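To reproduce the environment, the versions above can be pinned in a requirements file. A sketch (assuming pip; the `+cu126` suffix on PyTorch indicates a CUDA 12.6 build, which comes from the wheel index rather than the version pin):

```
trl==0.23.0
transformers==4.56.2
torch==2.8.0
datasets==4.0.0
tokenizers==0.22.0
```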
Model tree

mattiaferrarini/Apertus-8B-Instruct-2509-tool-use (this model) is fine-tuned from swiss-ai/Apertus-8B-Instruct-2509.