fix: update `generation_config.json` to default to stochastic sampling (temp 0.15)

#24

by casinca - opened 8 days ago

base: refs/heads/main ← from: refs/pr/24

casinca • 8 days ago • edited 8 days ago

Hello,

This is a mirror PR for Devstral-Small, based on: https://huggingface.co/mistralai/Devstral-2-123B-Instruct-2512/discussions/18

It adds the hyperparameters needed in generation_config.json to enable stochastic sampling (temperature 0.15) instead of greedy decoding.
That way, users who load the mistralai/Devstral-Small-2-24B-Instruct-2512 model automatically get the default sampling settings intended by Mistral.
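For readers unfamiliar with these settings, here is a minimal sketch of what such a change could look like. The thread does not show the diff, so the exact keys are an assumption: `do_sample` and `temperature` are the field names the transformers `GenerationConfig` uses for this purpose, and `merge_generation_config` is a hypothetical helper written only for this illustration.

```python
import json

# Assumed sampling-related fields. The actual file in the PR may contain
# other keys that are not shown in this thread.
sampling_defaults = {
    "do_sample": True,    # sample from the distribution instead of greedy decoding
    "temperature": 0.15,  # temperature value named in this PR
}


def merge_generation_config(existing: dict, overrides: dict) -> dict:
    """Return a copy of `existing` with `overrides` applied on top (hypothetical helper)."""
    merged = dict(existing)
    merged.update(overrides)
    return merged


# Assumed starting point: a config that defines no sampling keys.
existing_config = {}
updated = merge_generation_config(existing_config, sampling_defaults)

# Serialized form, as it might appear in generation_config.json.
print(json.dumps(updated, indent=2))
```

Users who prefer greedy decoding can still override these defaults at generation time.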

Motivation: Not all users know about these sampling hyperparameters or what they do. Defaulting to what Mistral recommends could reduce complaints about poor generations or model performance.

I opened this as a separate PR in case you want to keep greedy decoding as the default. It was originally linked to: https://huggingface.co/mistralai/Devstral-2-123B-Instruct-2512/discussions/9
(It's true that a low temperature is close to greedy, but it is still stochastic.)
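To illustrate the point about low temperature, here is a small sketch of a temperature-scaled softmax. The logits are made-up values chosen for the example, not taken from the model, and `softmax_with_temperature` is a hypothetical helper. It shows that at temperature 0.15 the distribution sharpens toward the top token, yet the other tokens keep nonzero probability, whereas greedy decoding always picks the top token.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for three candidate tokens (illustrative only).
logits = [2.0, 1.8, 0.5]

probs_t1 = softmax_with_temperature(logits, 1.0)
probs_low = softmax_with_temperature(logits, 0.15)

print("temperature 1.0 :", [round(p, 4) for p in probs_t1])
print("temperature 0.15:", [round(p, 4) for p in probs_low])
```

With these example logits, the second token still has a noticeable chance of being sampled at temperature 0.15, so repeated runs can produce different outputs.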

fix: update `generation_config.json` to default to stochastic sampling (temp 0.15) bf57b434

juliendenize

Mistral AI_ org 7 days ago

Thanks!

juliendenize changed pull request status to merged 7 days ago
