Update pipeline tag to image-text-to-text, add ReSum paper link and citation, and enhance content

Updating the pipeline_tag from text-generation to image-text-to-text. This change reflects the model's multimodal capabilities, as indicated by its use as a web agent that processes visual environments and by the vision-related tokens in its tokenizer configuration (tokenizer_config.json). It should also make the model discoverable under the correct pipeline on the Hugging Face Hub (e.g., at https://huggingface.co/models?pipeline_tag=image-text-to-text).
Adding a direct link to the associated paper, "ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization", at the top of the model card for better visibility and context.
Updating the BibTeX citation section to include the specific citation for the ReSum paper, in addition to the existing project citation.
Integrating additional information from the GitHub repository's README (badges, "News", "Deep Research Benchmark Results", "Deep Research Agent Family", "Misc", "Talent Recruitment", and "Contact Information") to give a more complete overview of the model and its ecosystem. Image links from the GitHub README have been converted to raw URLs so that they render correctly on the Hub.
Renaming the "Download" section to "Download and Usage" and pointing users to the GitHub repository for detailed setup and inference instructions, since the GitHub README contains no standalone Python snippet for direct inference via the transformers library.
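
For reference, the pipeline_tag change listed above amounts to a one-line edit in the YAML front matter at the top of the model card (README.md). The following is a minimal sketch of that edit using only the Python standard library. It assumes the front matter is delimited by `---` lines with Unix line endings; `set_pipeline_tag` is a hypothetical helper written for illustration, not a Hugging Face API, and the example card contents (including the license value) are made up.

```python
import re

# Matches a YAML front-matter block at the very start of the card:
# an opening "---" line, any content, and a closing "---" line.
FRONT_MATTER = re.compile(r"\A---\n(.*?)^---\n", re.DOTALL | re.MULTILINE)


def set_pipeline_tag(card_text: str, new_tag: str) -> str:
    """Return card_text with pipeline_tag set to new_tag (hypothetical helper)."""
    match = FRONT_MATTER.match(card_text)
    if match is None:
        # No front matter found: prepend a minimal block.
        return f"---\npipeline_tag: {new_tag}\n---\n{card_text}"

    lines = match.group(1).splitlines()
    for i, line in enumerate(lines):
        if line.startswith("pipeline_tag:"):
            # Replace the existing value in place.
            lines[i] = f"pipeline_tag: {new_tag}"
            break
    else:
        # Key not present yet: append it to the front matter.
        lines.append(f"pipeline_tag: {new_tag}")

    body = card_text[match.end():]
    return "---\n" + "".join(line + "\n" for line in lines) + "---\n" + body


# Illustrative card text only; not the actual model card.
example_card = (
    "---\n"
    "license: apache-2.0\n"
    "pipeline_tag: text-generation\n"
    "---\n"
    "# Model card\n"
)
updated_card = set_pipeline_tag(example_card, "image-text-to-text")
print(updated_card)
```

The sketch edits the text line by line instead of parsing the YAML, so the other metadata keys and their formatting are left untouched.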

These enhancements aim to provide a more accurate, informative, and user-friendly model card, aligning it with Hugging Face Hub best practices.


