# MMS_10langs_simultane

This model is a fine-tuned version of facebook/mms-1b-all on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.3003
- Wer: 0.3716
- Bleu: 0.4691
- Rouge1: 0.6871
- Rouge2: 0.5597
- Rougel: 0.6859
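A minimal inference sketch for a checkpoint in the MMS family, assuming it exposes the standard Wav2Vec2ForCTC interface and is published under the repo id named in this card; the `transcribe` helper and the 16 kHz input assumption are illustrative, not the author's evaluation code:

```python
import torch
from transformers import AutoProcessor, Wav2Vec2ForCTC

MODEL_ID = "ilyes25/MMS_10langs_simultane"  # repo id from this card


def transcribe(audio, sampling_rate=16_000, model_id=MODEL_ID):
    """Transcribe a 1-D float waveform (assumed sampled at 16 kHz)."""
    processor = AutoProcessor.from_pretrained(model_id)
    model = Wav2Vec2ForCTC.from_pretrained(model_id)
    inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    predicted_ids = torch.argmax(logits, dim=-1)  # greedy CTC decoding
    return processor.batch_decode(predicted_ids)[0]
```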
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 10
- total_train_batch_size: 40
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 100
- mixed_precision_training: Native AMP
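The total_train_batch_size above follows from the per-device batch size and gradient accumulation; a quick arithmetic check, assuming a single device (consistent with 4 × 10 = 40):

```python
# Effective train batch size = per-device batch size
# × gradient accumulation steps × number of devices.
train_batch_size = 4
gradient_accumulation_steps = 10
num_devices = 1  # assumption: single GPU

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 40, matching the value reported above
```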
### Training results
| Training Loss | Epoch | Step | Validation Loss | Wer | Bleu | Rouge1 | Rouge2 | Rougel |
|---|---|---|---|---|---|---|---|---|
| 2.2952 | 1.0 | 450 | 0.3890 | 0.4483 | 0.3755 | 0.6177 | 0.4654 | 0.6160 |
| 0.631 | 2.0 | 900 | 0.3822 | 0.4494 | 0.3774 | 0.6166 | 0.4639 | 0.6149 |
| 0.6019 | 3.0 | 1350 | 0.3704 | 0.4405 | 0.3805 | 0.6263 | 0.4729 | 0.6248 |
| 0.5863 | 4.0 | 1800 | 0.3545 | 0.4255 | 0.4021 | 0.6419 | 0.4948 | 0.6399 |
| 0.5718 | 5.0 | 2250 | 0.3480 | 0.4241 | 0.4044 | 0.6414 | 0.4948 | 0.6397 |
| 0.561 | 6.0 | 2700 | 0.3464 | 0.4195 | 0.4081 | 0.6465 | 0.5013 | 0.6448 |
| 0.5568 | 7.0 | 3150 | 0.3487 | 0.4214 | 0.4055 | 0.6433 | 0.4976 | 0.6414 |
| 0.547 | 8.0 | 3600 | 0.3442 | 0.4180 | 0.4101 | 0.6443 | 0.5009 | 0.6427 |
| 0.5462 | 9.0 | 4050 | 0.3395 | 0.4210 | 0.4101 | 0.6478 | 0.5042 | 0.6462 |
| 0.5354 | 10.0 | 4500 | 0.3377 | 0.4136 | 0.4141 | 0.6510 | 0.5083 | 0.6493 |
| 0.5358 | 11.0 | 4950 | 0.3405 | 0.4141 | 0.4107 | 0.6495 | 0.5059 | 0.6481 |
| 0.5295 | 12.0 | 5400 | 0.3399 | 0.4138 | 0.4167 | 0.6501 | 0.5070 | 0.6485 |
| 0.5261 | 13.0 | 5850 | 0.3351 | 0.4084 | 0.4222 | 0.6557 | 0.5144 | 0.6540 |
| 0.5238 | 14.0 | 6300 | 0.3345 | 0.4110 | 0.4199 | 0.6513 | 0.5084 | 0.6494 |
| 0.52 | 15.0 | 6750 | 0.3346 | 0.4104 | 0.4198 | 0.6511 | 0.5083 | 0.6498 |
| 0.5126 | 16.0 | 7200 | 0.3332 | 0.4127 | 0.4181 | 0.6515 | 0.5087 | 0.6498 |
| 0.5128 | 17.0 | 7650 | 0.3331 | 0.4043 | 0.4263 | 0.6597 | 0.5199 | 0.6578 |
| 0.5069 | 18.0 | 8100 | 0.3284 | 0.4024 | 0.4290 | 0.6613 | 0.5231 | 0.6597 |
| 0.5074 | 19.0 | 8550 | 0.3351 | 0.4090 | 0.4207 | 0.6565 | 0.5168 | 0.6547 |
| 0.4996 | 20.0 | 9000 | 0.3342 | 0.4035 | 0.4308 | 0.6565 | 0.5177 | 0.6544 |
| 0.4991 | 21.0 | 9450 | 0.3281 | 0.4030 | 0.4275 | 0.6678 | 0.5312 | 0.6664 |
| 0.4931 | 22.0 | 9900 | 0.3268 | 0.4075 | 0.4276 | 0.6526 | 0.5140 | 0.6508 |
| 0.4959 | 23.0 | 10350 | 0.3290 | 0.4043 | 0.4282 | 0.6594 | 0.5211 | 0.6580 |
| 0.4937 | 24.0 | 10800 | 0.3304 | 0.4115 | 0.4215 | 0.6516 | 0.5108 | 0.6495 |
| 0.4889 | 25.0 | 11250 | 0.3226 | 0.3999 | 0.4333 | 0.6605 | 0.5221 | 0.6587 |
| 0.4871 | 26.0 | 11700 | 0.3221 | 0.3974 | 0.4357 | 0.6618 | 0.5251 | 0.6604 |
| 0.4828 | 27.0 | 12150 | 0.3338 | 0.4048 | 0.4310 | 0.6510 | 0.5130 | 0.6499 |
| 0.4843 | 28.0 | 12600 | 0.3206 | 0.3974 | 0.4343 | 0.6661 | 0.5283 | 0.6642 |
| 0.4782 | 29.0 | 13050 | 0.3215 | 0.3994 | 0.4357 | 0.6603 | 0.5234 | 0.6587 |
| 0.4738 | 30.0 | 13500 | 0.3218 | 0.3964 | 0.4375 | 0.6637 | 0.5254 | 0.6618 |
| 0.4735 | 31.0 | 13950 | 0.3231 | 0.4000 | 0.4317 | 0.6632 | 0.5260 | 0.6610 |
| 0.4707 | 32.0 | 14400 | 0.3183 | 0.3917 | 0.4421 | 0.6705 | 0.5355 | 0.6688 |
| 0.4692 | 33.0 | 14850 | 0.3198 | 0.3985 | 0.4351 | 0.6663 | 0.5312 | 0.6651 |
| 0.4672 | 34.0 | 15300 | 0.3137 | 0.3932 | 0.4395 | 0.6717 | 0.5394 | 0.6699 |
| 0.4668 | 35.0 | 15750 | 0.3135 | 0.3947 | 0.4391 | 0.6676 | 0.5321 | 0.6657 |
| 0.4645 | 36.0 | 16200 | 0.3169 | 0.3958 | 0.4397 | 0.6672 | 0.5324 | 0.6652 |
| 0.4645 | 37.0 | 16650 | 0.3147 | 0.3923 | 0.4402 | 0.6694 | 0.5338 | 0.6678 |
| 0.4617 | 38.0 | 17100 | 0.3160 | 0.3924 | 0.4448 | 0.6668 | 0.5301 | 0.6647 |
| 0.4592 | 39.0 | 17550 | 0.3132 | 0.3883 | 0.4477 | 0.6736 | 0.5399 | 0.6718 |
| 0.456 | 40.0 | 18000 | 0.3108 | 0.3888 | 0.4474 | 0.6729 | 0.5391 | 0.6710 |
| 0.4562 | 41.0 | 18450 | 0.3138 | 0.3921 | 0.4435 | 0.6680 | 0.5340 | 0.6662 |
| 0.4507 | 42.0 | 18900 | 0.3137 | 0.3918 | 0.4426 | 0.6723 | 0.5385 | 0.6707 |
| 0.4521 | 43.0 | 19350 | 0.3147 | 0.3899 | 0.4479 | 0.6687 | 0.5335 | 0.6671 |
| 0.4492 | 44.0 | 19800 | 0.3121 | 0.3892 | 0.4473 | 0.6693 | 0.5353 | 0.6679 |
| 0.4481 | 45.0 | 20250 | 0.3109 | 0.3903 | 0.4474 | 0.6696 | 0.5353 | 0.6682 |
| 0.4458 | 46.0 | 20700 | 0.3146 | 0.3861 | 0.4505 | 0.6733 | 0.5397 | 0.6720 |
| 0.4469 | 47.0 | 21150 | 0.3107 | 0.3877 | 0.4495 | 0.6731 | 0.5407 | 0.6717 |
| 0.446 | 48.0 | 21600 | 0.3100 | 0.3877 | 0.4500 | 0.6742 | 0.5426 | 0.6728 |
| 0.4453 | 49.0 | 22050 | 0.3099 | 0.3885 | 0.4506 | 0.6732 | 0.5410 | 0.6715 |
| 0.4412 | 50.0 | 22500 | 0.3136 | 0.3860 | 0.4485 | 0.6779 | 0.5459 | 0.6763 |
| 0.4396 | 51.0 | 22950 | 0.3181 | 0.3879 | 0.4488 | 0.6701 | 0.5377 | 0.6688 |
| 0.4371 | 52.0 | 23400 | 0.3102 | 0.3860 | 0.4499 | 0.6772 | 0.5446 | 0.6757 |
| 0.4376 | 53.0 | 23850 | 0.3098 | 0.3884 | 0.4489 | 0.6727 | 0.5391 | 0.6704 |
| 0.4356 | 54.0 | 24300 | 0.3096 | 0.3837 | 0.4552 | 0.6731 | 0.5426 | 0.6716 |
| 0.4324 | 55.0 | 24750 | 0.3115 | 0.3832 | 0.4548 | 0.6801 | 0.5497 | 0.6787 |
| 0.4331 | 56.0 | 25200 | 0.3089 | 0.3869 | 0.4527 | 0.6756 | 0.5458 | 0.6740 |
| 0.4301 | 57.0 | 25650 | 0.3084 | 0.3848 | 0.4541 | 0.6778 | 0.5467 | 0.6763 |
| 0.4307 | 58.0 | 26100 | 0.3128 | 0.3823 | 0.4553 | 0.6759 | 0.5460 | 0.6741 |
| 0.43 | 59.0 | 26550 | 0.3070 | 0.3813 | 0.4559 | 0.6799 | 0.5502 | 0.6782 |
| 0.4244 | 60.0 | 27000 | 0.3076 | 0.3833 | 0.4539 | 0.6781 | 0.5458 | 0.6767 |
| 0.4236 | 61.0 | 27450 | 0.3109 | 0.3846 | 0.4554 | 0.6748 | 0.5449 | 0.6735 |
| 0.4257 | 62.0 | 27900 | 0.3085 | 0.3814 | 0.4544 | 0.6808 | 0.5496 | 0.6793 |
| 0.4226 | 63.0 | 28350 | 0.3068 | 0.3837 | 0.4537 | 0.6776 | 0.5454 | 0.6760 |
| 0.4239 | 64.0 | 28800 | 0.3052 | 0.3821 | 0.4561 | 0.6798 | 0.5491 | 0.6783 |
| 0.4206 | 65.0 | 29250 | 0.3095 | 0.3820 | 0.4548 | 0.6762 | 0.5457 | 0.6749 |
| 0.4212 | 66.0 | 29700 | 0.3055 | 0.3822 | 0.4541 | 0.6771 | 0.5457 | 0.6756 |
| 0.4191 | 67.0 | 30150 | 0.3063 | 0.3787 | 0.4605 | 0.6809 | 0.5520 | 0.6797 |
| 0.4137 | 68.0 | 30600 | 0.3056 | 0.3792 | 0.4577 | 0.6818 | 0.5536 | 0.6804 |
| 0.4156 | 69.0 | 31050 | 0.3023 | 0.3783 | 0.4602 | 0.6808 | 0.5507 | 0.6793 |
| 0.413 | 70.0 | 31500 | 0.3034 | 0.3785 | 0.4597 | 0.6821 | 0.5530 | 0.6803 |
| 0.4112 | 71.0 | 31950 | 0.3022 | 0.3805 | 0.4577 | 0.6804 | 0.5509 | 0.6790 |
| 0.4116 | 72.0 | 32400 | 0.3031 | 0.3793 | 0.4586 | 0.6794 | 0.5496 | 0.6782 |
| 0.4101 | 73.0 | 32850 | 0.3021 | 0.3766 | 0.4632 | 0.6819 | 0.5540 | 0.6804 |
| 0.4073 | 74.0 | 33300 | 0.3039 | 0.3788 | 0.4608 | 0.6816 | 0.5526 | 0.6805 |
| 0.4071 | 75.0 | 33750 | 0.3076 | 0.3776 | 0.4622 | 0.6823 | 0.5529 | 0.6809 |
| 0.4063 | 76.0 | 34200 | 0.3034 | 0.3776 | 0.4624 | 0.6794 | 0.5496 | 0.6783 |
| 0.407 | 77.0 | 34650 | 0.3058 | 0.3755 | 0.4637 | 0.6816 | 0.5524 | 0.6800 |
| 0.4039 | 78.0 | 35100 | 0.3048 | 0.3760 | 0.4620 | 0.6813 | 0.5510 | 0.6800 |
| 0.4052 | 79.0 | 35550 | 0.3063 | 0.3777 | 0.4620 | 0.6822 | 0.5526 | 0.6811 |
| 0.4066 | 80.0 | 36000 | 0.3029 | 0.3782 | 0.4612 | 0.6804 | 0.5489 | 0.6792 |
| 0.4036 | 81.0 | 36450 | 0.3041 | 0.3781 | 0.4603 | 0.6829 | 0.5520 | 0.6815 |
| 0.3987 | 82.0 | 36900 | 0.3048 | 0.3760 | 0.4625 | 0.6838 | 0.5549 | 0.6824 |
| 0.4007 | 83.0 | 37350 | 0.3008 | 0.3736 | 0.4659 | 0.6863 | 0.5573 | 0.6849 |
| 0.4016 | 84.0 | 37800 | 0.3011 | 0.3739 | 0.4653 | 0.6865 | 0.5586 | 0.6849 |
| 0.3981 | 85.0 | 38250 | 0.3007 | 0.3731 | 0.4666 | 0.6864 | 0.5588 | 0.6845 |
| 0.3986 | 86.0 | 38700 | 0.3005 | 0.3719 | 0.4670 | 0.6860 | 0.5583 | 0.6846 |
| 0.3955 | 87.0 | 39150 | 0.3002 | 0.3737 | 0.4656 | 0.6857 | 0.5576 | 0.6844 |
| 0.3942 | 88.0 | 39600 | 0.2999 | 0.3729 | 0.4672 | 0.6860 | 0.5596 | 0.6848 |
| 0.3951 | 89.0 | 40050 | 0.3021 | 0.3756 | 0.4646 | 0.6826 | 0.5536 | 0.6811 |
| 0.3963 | 90.0 | 40500 | 0.3000 | 0.3713 | 0.4683 | 0.6880 | 0.5624 | 0.6867 |
| 0.3941 | 91.0 | 40950 | 0.2998 | 0.3716 | 0.4677 | 0.6884 | 0.5622 | 0.6872 |
| 0.3913 | 92.0 | 41400 | 0.3018 | 0.3722 | 0.4687 | 0.6871 | 0.5600 | 0.6859 |
| 0.3938 | 93.0 | 41850 | 0.3009 | 0.3726 | 0.4687 | 0.6857 | 0.5583 | 0.6845 |
| 0.3912 | 94.0 | 42300 | 0.3003 | 0.3717 | 0.4679 | 0.6879 | 0.5617 | 0.6868 |
| 0.3921 | 95.0 | 42750 | 0.2997 | 0.3712 | 0.4693 | 0.6876 | 0.5621 | 0.6863 |
| 0.392 | 96.0 | 43200 | 0.3012 | 0.3710 | 0.4700 | 0.6884 | 0.5617 | 0.6870 |
| 0.3885 | 97.0 | 43650 | 0.3015 | 0.3713 | 0.4694 | 0.6875 | 0.5602 | 0.6861 |
| 0.3893 | 98.0 | 44100 | 0.3006 | 0.3718 | 0.4692 | 0.6868 | 0.5598 | 0.6856 |
| 0.3872 | 99.0 | 44550 | 0.3002 | 0.3712 | 0.4692 | 0.6874 | 0.5602 | 0.6859 |
| 0.3903 | 100.0 | 45000 | 0.3003 | 0.3716 | 0.4691 | 0.6871 | 0.5597 | 0.6859 |
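The Wer column above is the word error rate: word-level edit distance (substitutions, insertions, deletions) divided by the reference length. A self-contained sketch of that metric (not the exact evaluation code used for this card):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate via Levenshtein distance over whitespace-split words."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)


print(wer("a b c d", "a x c d"))  # 0.25: one substitution over four words
```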
### Framework versions
- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0
## Model tree for ilyes25/MMS_10langs_simultane

Base model: facebook/mms-1b-all