Commit 06fb61a
Parent: 61db6a0
Update README.md
README.md CHANGED
@@ -14,7 +14,7 @@ You can use the raw model for masked language modeling, but it's mostly intended
Note that this model is primarily aimed at being fine-tuned on tasks such as visuo-linguistic sequence classification or visual question answering. We fine-tuned this model on a multi-translated version of the visual question answering task, [VQA v2](https://visualqa.org/challenge.html). Since Conceptual-12M is a dataset scraped from the internet, it contains biases that will also affect all fine-tuned versions of this model.

### How to use❓

- You can use this model directly with a pipeline for masked language modeling:
+ You can use this model directly with a pipeline for masked language modeling. You will need to clone the model repository from [here](https://github.com/gchhablani/multilingual-vqa). An example of usage is shown below:
```python
>>> from torchvision.io import read_image
>>> import numpy as np
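# NOTE: the diff hunk ends here, so the rest of the example is not shown.
# Below is a minimal, hedged sketch of how the snippet might continue. The
# model class, its module path, the tokenizer checkpoint, and the pretrained
# checkpoint name are assumptions (not confirmed by this diff); adapt them to
# the cloned multilingual-vqa repository's actual README.
>>> from transformers import CLIPProcessor, BertTokenizerFast
>>> from models.flax_clip_vision_bert.modeling_clip_vision_bert import FlaxCLIPVisionBertForMaskedLM  # import path assumed
>>> img = read_image("path/to/image.jpg")  # CHW uint8 tensor
>>> clip_processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
>>> pixel_values = clip_processor(images=img, return_tensors="np")["pixel_values"]
>>> pixel_values = pixel_values.transpose(0, 2, 3, 1)  # NCHW -> NHWC, if the Flax model expects channels-last input
>>> tokenizer = BertTokenizerFast.from_pretrained("bert-base-multilingual-uncased")
>>> tokens = tokenizer(["¿Qué hay en la [MASK]?"], return_tensors="np")  # multilingual masked prompt
>>> model = FlaxCLIPVisionBertForMaskedLM.from_pretrained("flax-community/multilingual-vqa")  # checkpoint name assumed
>>> outputs = model(pixel_values=pixel_values, **tokens)  # assumes a standard MaskedLM output with .logits
>>> mask_pos = int(np.where(tokens["input_ids"][0] == tokenizer.mask_token_id)[0][0])
>>> print(tokenizer.decode([int(outputs.logits[0, mask_pos].argmax(-1))]))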