Improve model card: Add transformers, image-text-to-text tags, paper, project page, and usage example

by nielsr HF Staff - opened Oct 11

This PR significantly improves the model card for NaViL by:

- Adding the library_name: transformers metadata tag to enable the automated "how to use" widget, as the model's usage example explicitly uses the transformers library.
- Adding the pipeline_tag: image-text-to-text metadata tag to properly categorize the model, reflecting its multimodal question-answering capabilities.
- Including the full paper abstract and linking to the official paper, project page, and GitHub repository.
- Incorporating comprehensive details from the GitHub README, including "Core Insights", "NaViL Architecture", "Main Results", "Qualitative Analysis", and "Getting Started" sections.
- Providing a detailed Python code snippet for inference, directly from the GitHub README, to guide users on how to use the model.
- Ensuring image assets from the GitHub README render correctly on the Hugging Face Hub by updating their URLs.
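
For reference, the two metadata tags named above live in the YAML front matter at the top of the model card's README.md. The following is a minimal sketch showing only the two keys this PR describes; the actual card may carry other keys (for example a license) that are not shown here, and the comments are explanatory additions rather than part of the PR.

```yaml
---
# Tells the Hub which library loads the model (drives the "how to use" widget)
library_name: transformers
# Task category under which the model is listed on the Hub
pipeline_tag: image-text-to-text
---
```

The front matter has to be the first thing in the file, delimited by the two `---` lines, with the rest of the model card following it.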

These additions will make the NaViL model more discoverable, understandable, and easier to use for the Hugging Face community.

Ready to merge

This branch is ready to get merged automatically.