MMS_10langs_simultane

This model is a fine-tuned version of facebook/mms-1b-all on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3003
  • WER: 0.3716
  • BLEU: 0.4691
  • ROUGE-1: 0.6871
  • ROUGE-2: 0.5597
  • ROUGE-L: 0.6859
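
No usage example is included in the card. Below is a minimal inference sketch, assuming the checkpoint keeps the Wav2Vec2/MMS CTC architecture of the facebook/mms-1b-all base model and expects 16 kHz mono audio; the audio path is a placeholder.

```python
# A minimal inference sketch (not from the card), assuming this checkpoint
# keeps the Wav2Vec2/MMS CTC architecture of facebook/mms-1b-all and expects
# 16 kHz mono audio. "audio.wav" is a placeholder path. If the checkpoint
# still uses MMS language adapters, a target language may additionally need
# to be selected with processor.tokenizer.set_target_lang(...) and
# model.load_adapter(...).
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "ilyes25/MMS_10langs_simultane"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)
model.eval()

# Load and resample the audio to the 16 kHz rate expected by MMS models.
speech, _ = librosa.load("audio.wav", sr=16_000)
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: take the argmax over the vocabulary at each frame.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```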

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 10
  • total_train_batch_size: 40
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 100
  • mixed_precision_training: Native AMP
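
The training script itself is not part of the card. As a rough guide, the hyperparameters above could map onto transformers.TrainingArguments as in the sketch below; output_dir and the fp16 flag (one common way to enable the "Native AMP" behaviour listed above) are assumptions.

```python
# A sketch of how the listed hyperparameters could be expressed with
# transformers.TrainingArguments. The actual training script is not part of
# this card; output_dir is an assumption, and fp16=True is one common way to
# obtain the "Native AMP" mixed-precision training listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="MMS_10langs_simultane",   # assumed
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=10,       # 4 x 10 = 40 effective train batch size
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=100,
    fp16=True,
)
```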

Training results

Training Loss  Epoch  Step  Validation Loss  WER  BLEU  ROUGE-1  ROUGE-2  ROUGE-L
2.2952 1.0 450 0.3890 0.4483 0.3755 0.6177 0.4654 0.6160
0.631 2.0 900 0.3822 0.4494 0.3774 0.6166 0.4639 0.6149
0.6019 3.0 1350 0.3704 0.4405 0.3805 0.6263 0.4729 0.6248
0.5863 4.0 1800 0.3545 0.4255 0.4021 0.6419 0.4948 0.6399
0.5718 5.0 2250 0.3480 0.4241 0.4044 0.6414 0.4948 0.6397
0.561 6.0 2700 0.3464 0.4195 0.4081 0.6465 0.5013 0.6448
0.5568 7.0 3150 0.3487 0.4214 0.4055 0.6433 0.4976 0.6414
0.547 8.0 3600 0.3442 0.4180 0.4101 0.6443 0.5009 0.6427
0.5462 9.0 4050 0.3395 0.4210 0.4101 0.6478 0.5042 0.6462
0.5354 10.0 4500 0.3377 0.4136 0.4141 0.6510 0.5083 0.6493
0.5358 11.0 4950 0.3405 0.4141 0.4107 0.6495 0.5059 0.6481
0.5295 12.0 5400 0.3399 0.4138 0.4167 0.6501 0.5070 0.6485
0.5261 13.0 5850 0.3351 0.4084 0.4222 0.6557 0.5144 0.6540
0.5238 14.0 6300 0.3345 0.4110 0.4199 0.6513 0.5084 0.6494
0.52 15.0 6750 0.3346 0.4104 0.4198 0.6511 0.5083 0.6498
0.5126 16.0 7200 0.3332 0.4127 0.4181 0.6515 0.5087 0.6498
0.5128 17.0 7650 0.3331 0.4043 0.4263 0.6597 0.5199 0.6578
0.5069 18.0 8100 0.3284 0.4024 0.4290 0.6613 0.5231 0.6597
0.5074 19.0 8550 0.3351 0.4090 0.4207 0.6565 0.5168 0.6547
0.4996 20.0 9000 0.3342 0.4035 0.4308 0.6565 0.5177 0.6544
0.4991 21.0 9450 0.3281 0.4030 0.4275 0.6678 0.5312 0.6664
0.4931 22.0 9900 0.3268 0.4075 0.4276 0.6526 0.5140 0.6508
0.4959 23.0 10350 0.3290 0.4043 0.4282 0.6594 0.5211 0.6580
0.4937 24.0 10800 0.3304 0.4115 0.4215 0.6516 0.5108 0.6495
0.4889 25.0 11250 0.3226 0.3999 0.4333 0.6605 0.5221 0.6587
0.4871 26.0 11700 0.3221 0.3974 0.4357 0.6618 0.5251 0.6604
0.4828 27.0 12150 0.3338 0.4048 0.4310 0.6510 0.5130 0.6499
0.4843 28.0 12600 0.3206 0.3974 0.4343 0.6661 0.5283 0.6642
0.4782 29.0 13050 0.3215 0.3994 0.4357 0.6603 0.5234 0.6587
0.4738 30.0 13500 0.3218 0.3964 0.4375 0.6637 0.5254 0.6618
0.4735 31.0 13950 0.3231 0.4000 0.4317 0.6632 0.5260 0.6610
0.4707 32.0 14400 0.3183 0.3917 0.4421 0.6705 0.5355 0.6688
0.4692 33.0 14850 0.3198 0.3985 0.4351 0.6663 0.5312 0.6651
0.4672 34.0 15300 0.3137 0.3932 0.4395 0.6717 0.5394 0.6699
0.4668 35.0 15750 0.3135 0.3947 0.4391 0.6676 0.5321 0.6657
0.4645 36.0 16200 0.3169 0.3958 0.4397 0.6672 0.5324 0.6652
0.4645 37.0 16650 0.3147 0.3923 0.4402 0.6694 0.5338 0.6678
0.4617 38.0 17100 0.3160 0.3924 0.4448 0.6668 0.5301 0.6647
0.4592 39.0 17550 0.3132 0.3883 0.4477 0.6736 0.5399 0.6718
0.456 40.0 18000 0.3108 0.3888 0.4474 0.6729 0.5391 0.6710
0.4562 41.0 18450 0.3138 0.3921 0.4435 0.6680 0.5340 0.6662
0.4507 42.0 18900 0.3137 0.3918 0.4426 0.6723 0.5385 0.6707
0.4521 43.0 19350 0.3147 0.3899 0.4479 0.6687 0.5335 0.6671
0.4492 44.0 19800 0.3121 0.3892 0.4473 0.6693 0.5353 0.6679
0.4481 45.0 20250 0.3109 0.3903 0.4474 0.6696 0.5353 0.6682
0.4458 46.0 20700 0.3146 0.3861 0.4505 0.6733 0.5397 0.6720
0.4469 47.0 21150 0.3107 0.3877 0.4495 0.6731 0.5407 0.6717
0.446 48.0 21600 0.3100 0.3877 0.4500 0.6742 0.5426 0.6728
0.4453 49.0 22050 0.3099 0.3885 0.4506 0.6732 0.5410 0.6715
0.4412 50.0 22500 0.3136 0.3860 0.4485 0.6779 0.5459 0.6763
0.4396 51.0 22950 0.3181 0.3879 0.4488 0.6701 0.5377 0.6688
0.4371 52.0 23400 0.3102 0.3860 0.4499 0.6772 0.5446 0.6757
0.4376 53.0 23850 0.3098 0.3884 0.4489 0.6727 0.5391 0.6704
0.4356 54.0 24300 0.3096 0.3837 0.4552 0.6731 0.5426 0.6716
0.4324 55.0 24750 0.3115 0.3832 0.4548 0.6801 0.5497 0.6787
0.4331 56.0 25200 0.3089 0.3869 0.4527 0.6756 0.5458 0.6740
0.4301 57.0 25650 0.3084 0.3848 0.4541 0.6778 0.5467 0.6763
0.4307 58.0 26100 0.3128 0.3823 0.4553 0.6759 0.5460 0.6741
0.43 59.0 26550 0.3070 0.3813 0.4559 0.6799 0.5502 0.6782
0.4244 60.0 27000 0.3076 0.3833 0.4539 0.6781 0.5458 0.6767
0.4236 61.0 27450 0.3109 0.3846 0.4554 0.6748 0.5449 0.6735
0.4257 62.0 27900 0.3085 0.3814 0.4544 0.6808 0.5496 0.6793
0.4226 63.0 28350 0.3068 0.3837 0.4537 0.6776 0.5454 0.6760
0.4239 64.0 28800 0.3052 0.3821 0.4561 0.6798 0.5491 0.6783
0.4206 65.0 29250 0.3095 0.3820 0.4548 0.6762 0.5457 0.6749
0.4212 66.0 29700 0.3055 0.3822 0.4541 0.6771 0.5457 0.6756
0.4191 67.0 30150 0.3063 0.3787 0.4605 0.6809 0.5520 0.6797
0.4137 68.0 30600 0.3056 0.3792 0.4577 0.6818 0.5536 0.6804
0.4156 69.0 31050 0.3023 0.3783 0.4602 0.6808 0.5507 0.6793
0.413 70.0 31500 0.3034 0.3785 0.4597 0.6821 0.5530 0.6803
0.4112 71.0 31950 0.3022 0.3805 0.4577 0.6804 0.5509 0.6790
0.4116 72.0 32400 0.3031 0.3793 0.4586 0.6794 0.5496 0.6782
0.4101 73.0 32850 0.3021 0.3766 0.4632 0.6819 0.5540 0.6804
0.4073 74.0 33300 0.3039 0.3788 0.4608 0.6816 0.5526 0.6805
0.4071 75.0 33750 0.3076 0.3776 0.4622 0.6823 0.5529 0.6809
0.4063 76.0 34200 0.3034 0.3776 0.4624 0.6794 0.5496 0.6783
0.407 77.0 34650 0.3058 0.3755 0.4637 0.6816 0.5524 0.6800
0.4039 78.0 35100 0.3048 0.3760 0.4620 0.6813 0.5510 0.6800
0.4052 79.0 35550 0.3063 0.3777 0.4620 0.6822 0.5526 0.6811
0.4066 80.0 36000 0.3029 0.3782 0.4612 0.6804 0.5489 0.6792
0.4036 81.0 36450 0.3041 0.3781 0.4603 0.6829 0.5520 0.6815
0.3987 82.0 36900 0.3048 0.3760 0.4625 0.6838 0.5549 0.6824
0.4007 83.0 37350 0.3008 0.3736 0.4659 0.6863 0.5573 0.6849
0.4016 84.0 37800 0.3011 0.3739 0.4653 0.6865 0.5586 0.6849
0.3981 85.0 38250 0.3007 0.3731 0.4666 0.6864 0.5588 0.6845
0.3986 86.0 38700 0.3005 0.3719 0.4670 0.6860 0.5583 0.6846
0.3955 87.0 39150 0.3002 0.3737 0.4656 0.6857 0.5576 0.6844
0.3942 88.0 39600 0.2999 0.3729 0.4672 0.6860 0.5596 0.6848
0.3951 89.0 40050 0.3021 0.3756 0.4646 0.6826 0.5536 0.6811
0.3963 90.0 40500 0.3000 0.3713 0.4683 0.6880 0.5624 0.6867
0.3941 91.0 40950 0.2998 0.3716 0.4677 0.6884 0.5622 0.6872
0.3913 92.0 41400 0.3018 0.3722 0.4687 0.6871 0.5600 0.6859
0.3938 93.0 41850 0.3009 0.3726 0.4687 0.6857 0.5583 0.6845
0.3912 94.0 42300 0.3003 0.3717 0.4679 0.6879 0.5617 0.6868
0.3921 95.0 42750 0.2997 0.3712 0.4693 0.6876 0.5621 0.6863
0.392 96.0 43200 0.3012 0.3710 0.4700 0.6884 0.5617 0.6870
0.3885 97.0 43650 0.3015 0.3713 0.4694 0.6875 0.5602 0.6861
0.3893 98.0 44100 0.3006 0.3718 0.4692 0.6868 0.5598 0.6856
0.3872 99.0 44550 0.3002 0.3712 0.4692 0.6874 0.5602 0.6859
0.3903 100.0 45000 0.3003 0.3716 0.4691 0.6871 0.5597 0.6859
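
The WER, BLEU and ROUGE columns are standard ASR and text-similarity metrics. The sketch below shows one way such scores can be computed with the Hugging Face evaluate library; the predictions and references are made-up placeholders, not the actual evaluation data used for this model.

```python
# A sketch of how WER/BLEU/ROUGE scores like those in the table can be
# computed with the Hugging Face `evaluate` library. The predictions and
# references below are placeholders, not the evaluation data of this model.
import evaluate

wer = evaluate.load("wer")      # requires the jiwer package
bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")

predictions = ["a placeholder transcription"]
references = ["a placeholder reference transcription"]

print("WER:  ", wer.compute(predictions=predictions, references=references))
print("BLEU: ", bleu.compute(predictions=predictions,
                             references=[[r] for r in references])["bleu"])
print("ROUGE:", rouge.compute(predictions=predictions, references=references))
```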

Framework versions

  • Transformers 4.49.0
  • PyTorch 2.6.0+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0

Model size: 1.0B parameters (F32 tensors, Safetensors format)