Discrepancy between recommended temperatures, and greedy decoding by default with `transformers`

#9
by casinca - opened

Hello @patrickvonplaten ,

(Posting here, but the same applies to Devstral Small.)
In the blog post, the recommended temperature is 0.2, while the model card here says 0.15. It's a detail, but in any case it seems you recommend low temperatures.

Also, the `transformers` code snippet in the model card will use greedy decoding, which may lead to subpar performance if people copy/paste it for local tests.
Additionally, you could add a temperature parameter to `generation_config.json`, so that people get your recommended sampling hyperparameters by default with `do_sample=True`.
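As a minimal sketch of what that suggestion could look like (the 0.15 value is the one from the model card; the exact set of fields shipped in the repo's `generation_config.json` is up to you):

```python
import json

# Sketch: contents a generation_config.json could carry so that
# model.generate() samples with the recommended temperature by default.
# Without "do_sample": true, transformers falls back to greedy decoding.
generation_config = {
    "do_sample": True,   # enable sampling instead of greedy decoding
    "temperature": 0.15, # recommended low temperature from the model card
}

print(json.dumps(generation_config, indent=2))
```

With such a file in the model repo, users who copy/paste the snippet would get the recommended sampling behavior without passing any hparams themselves, and could still override it per call.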

I can open a PR if it's relevant

Mistral AI_ org

Hey @casinca, thanks for your feedback. Feel free to open PRs!

Mistral AI_ org

Thanks @casinca for the contributions, very nice for the community!

Gonna close this discussion now

juliendenize changed discussion status to closed
