Part of the Natural Order LMs collection: all the models trained in the paper 'Natural Order: Cross-lingual Limits of Transformer Language Acquisition' (35 items).
This model is a fine-tuned version of a base model that is not named here, trained on an unknown dataset. On the evaluation set it reaches a final validation loss of 4.3357.
More information needed
The hyperparameters used during training are not listed. Training and validation loss by epoch:
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 22.1846 | 1.0 | 87 | 7.3020 |
| 17.3675 | 2.0 | 174 | 5.8002 |
| 16.8631 | 3.0 | 261 | 5.5977 |
| 16.4538 | 4.0 | 348 | 5.4003 |
| 15.6803 | 5.0 | 435 | 5.1926 |
| 15.1404 | 6.0 | 522 | 5.0324 |
| 14.7478 | 7.0 | 609 | 4.9024 |
| 14.5035 | 8.0 | 696 | 4.7975 |
| 14.2350 | 9.0 | 783 | 4.7188 |
| 13.9852 | 10.0 | 870 | 4.6540 |
| 13.8877 | 11.0 | 957 | 4.5981 |
| 13.6711 | 12.0 | 1044 | 4.5553 |
| 13.5173 | 13.0 | 1131 | 4.5190 |
| 13.4679 | 14.0 | 1218 | 4.4875 |
| 13.2962 | 15.0 | 1305 | 4.4638 |
| 13.2535 | 16.0 | 1392 | 4.4402 |
| 13.1309 | 17.0 | 1479 | 4.4224 |
| 13.0784 | 18.0 | 1566 | 4.4073 |
| 12.8672 | 19.0 | 1653 | 4.3936 |
| 12.9824 | 20.0 | 1740 | 4.3828 |
| 12.9498 | 21.0 | 1827 | 4.3739 |
| 12.7717 | 22.0 | 1914 | 4.3650 |
| 12.8094 | 23.0 | 2001 | 4.3585 |
| 12.7595 | 24.0 | 2088 | 4.3541 |
| 12.7992 | 25.0 | 2175 | 4.3483 |
| 12.5838 | 26.0 | 2262 | 4.3455 |
| 12.6073 | 27.0 | 2349 | 4.3425 |
| 12.7835 | 28.0 | 2436 | 4.3396 |
| 12.5611 | 29.0 | 2523 | 4.3382 |
| 12.5633 | 30.0 | 2610 | 4.3376 |
| 12.6450 | 31.0 | 2697 | 4.3367 |
| 12.4761 | 32.0 | 2784 | 4.3362 |
| 12.4169 | 33.0 | 2871 | 4.3359 |
| 12.5719 | 34.0 | 2958 | 4.3357 |
| 33.6696 | 34.4863 | 3000 | 4.3357 |
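
Validation loss is often easier to interpret as perplexity. The sketch below converts a few rows of the table above, assuming the reported loss is a mean per-token cross-entropy in nats; the card does not confirm this. The helper `perplexity` and the dictionary `val_loss` are illustrative names, not part of the model's code.

```python
import math

# (epoch -> validation loss), copied from selected rows of the table above.
val_loss = {1: 7.3020, 10: 4.6540, 20: 4.3828, 30: 4.3376, 34: 4.3357}


def perplexity(loss_nats: float) -> float:
    """Convert a loss to perplexity.

    Assumption: the loss is a mean per-token cross-entropy measured in nats.
    """
    return math.exp(loss_nats)


for epoch, loss in val_loss.items():
    print(f"epoch {epoch:>2}: val loss {loss:.4f} -> perplexity {perplexity(loss):.1f}")

# Change in validation loss over the last four full epochs (30 -> 34).
delta_last = val_loss[30] - val_loss[34]
print(f"improvement from epoch 30 to 34: {delta_last:.4f}")
```

Under that assumption, the small change between epochs 30 and 34 suggests validation loss had largely plateaued by the end of training.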