Merge branch 'main' of https://huggingface.co/bertin-project/bertin-roberta-base-spanish into main
README.md CHANGED
@@ -10,8 +10,8 @@ widget:
 ---
 
 - [Version beta](https://huggingface.co/bertin-project/bertin-roberta-base-spanish/tree/beta): July 15th, 2021
-- [Version 
-
+- [Version v1](https://huggingface.co/bertin-project/bertin-roberta-base-spanish/tree/v1): July 26th, 2021
+- [Version v1-512](https://huggingface.co/bertin-project/bertin-roberta-base-spanish/tree/v1-512): July 26th, 2021
 
 # BERTIN
 
@@ -252,7 +252,7 @@ In addition to the tasks above, we also trained the [`beta`](https://huggingface
 
 Results for PAWS-X seem surprising given the large differences in performance. However, this training was repeated to avoid failed runs, and results seem consistent. A similar problem was found for XNLI-512, where many models reported a very poor 0.3333 accuracy on a first run (and even a second, in the case of BSC-BNE). This suggests training is somewhat unstable for some datasets under these conditions. Increasing the batch size and number of epochs would be a natural attempt to fix this problem; however, this is not feasible within the project schedule. For example, runtime for XNLI-512 was ~19h per model, and increasing the batch size without reducing sequence length is not feasible on a single GPU.
 
-We are also releasing the fine-tuned models for `Gaussian`-512 and making it our version 
+We are also releasing the fine-tuned models for `Gaussian`-512, making version v1 (128 sequence length) our default since it experimentally shows better performance on the fill-mask task, while also releasing the 512 sequence length version (v1-512) for fine-tuning.
 
 - POS: [`bertin-project/bertin-base-pos-conll2002-es`](https://huggingface.co/bertin-project/bertin-base-pos-conll2002-es/)
 - NER: [`bertin-project/bertin-base-ner-conll2002-es`](https://huggingface.co/bertin-project/bertin-base-ner-conll2002-es/)
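Since the two new versions live as tags of the model repo (linked above), either checkpoint can be pinned when loading. A minimal sketch, assuming the standard `transformers` `from_pretrained` API and the `v1`/`v1-512` tag names from the diff:

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL_ID = "bertin-project/bertin-roberta-base-spanish"

# Default release: v1, trained with 128 sequence length.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, revision="v1")
model = AutoModelForMaskedLM.from_pretrained(MODEL_ID, revision="v1")

# 512 sequence length release (v1-512), intended for fine-tuning on longer inputs.
model_512 = AutoModelForMaskedLM.from_pretrained(MODEL_ID, revision="v1-512")
```

Omitting `revision` fetches the repo's default branch, which after this commit corresponds to v1.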
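The fine-tuned POS and NER checkpoints listed in the diff can likewise be used directly through a token-classification pipeline. A sketch, assuming only the model IDs above; the example sentence is illustrative:

```python
from transformers import pipeline

# Token-classification pipelines over the fine-tuned checkpoints named in the diff.
pos = pipeline("token-classification", model="bertin-project/bertin-base-pos-conll2002-es")
ner = pipeline("token-classification", model="bertin-project/bertin-base-ner-conll2002-es")

print(pos("El Museo del Prado está en Madrid."))
print(ner("El Museo del Prado está en Madrid."))
```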

