---
language:
- en
base_model:
- google-t5/t5-large
pipeline_tag: text-classification
tags:
- gen-ir
- information-retrieval
- ir
---

This repository contains one of the models analyzed in our paper [Reverse-Engineering the Retrieval Process in GenIR Models](https://dl.acm.org/doi/abs/10.1145/3726302.3730076).

### Training

The model is based on T5-large and was trained on the TriviaQA dataset (with only generated questions as input) as an atomic GenIR model using the setup of [DSI](https://arxiv.org/abs/2202.06991).

### Model Overview

| Model        | Hugging Face URL                                                            |
| ------------ | --------------------------------------------------------------------------- |
| NQ10k        | [DSI-large-NQ10k](https://huggingface.co/AnReu/DSI-large-NQ10k)             |
| NQ100k       | [DSI-large-NQ100k](https://huggingface.co/AnReu/DSI-large-NQ100k)           |
| NQ320k       | [DSI-large-NQ320k](https://huggingface.co/AnReu/DSI-large-NQ320k)           |
| Trivia-QA    | [DSI-large-TriviaQA](https://huggingface.co/AnReu/DSI-large-TriviaQA)       |
| Trivia-QA QG | [DSI-large-TriviaQA-QG](https://huggingface.co/AnReu/DSI-large-TriviaQA-QG) |

### Citation

```
@inproceedings{Reusch2025Reverse,
  author = {Reusch, Anja and Belinkov, Yonatan},
  title = {Reverse-Engineering the Retrieval Process in GenIR Models},
  year = {2025},
  isbn = {9798400715921},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3726302.3730076},
  doi = {10.1145/3726302.3730076},
  booktitle = {Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval},
  pages = {668–677},
  numpages = {10},
  location = {Padua, Italy},
  series = {SIGIR '25}
}
```
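
### Sketch: ranking with atomic document IDs

In the atomic DSI setup referenced under Training, each document is represented by a single dedicated token, so retrieval reduces to one decoding step followed by a ranking over those document-ID tokens. The snippet below is a minimal, illustrative sketch of that ranking step only, not the code used in the paper. It works on a plain list of logits so that it runs without downloading weights. The function name `rank_atomic_docids`, the `docid_offset` parameter, and the assumption that document-ID tokens form a contiguous block at the end of the vocabulary are all hypothetical; check the checkpoint's actual vocabulary layout before relying on them.

```python
import math
from typing import List, Tuple


def rank_atomic_docids(
    first_step_logits: List[float],
    docid_offset: int,
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Rank documents from the logits of a single decoding step.

    Assumption (hypothetical layout): token IDs >= docid_offset are the
    atomic document-ID tokens, and document i maps to token docid_offset + i.
    """
    docid_logits = first_step_logits[docid_offset:]
    # Softmax restricted to the document-ID tokens (numerically stable form).
    peak = max(docid_logits)
    exps = [math.exp(x - peak) for x in docid_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort document indices by descending probability and keep the top k.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


# Toy example: 4 ordinary vocabulary tokens followed by 3 document-ID tokens.
toy_logits = [0.1, 5.0, -1.0, 0.3, 2.0, 0.5, 3.0]
top = rank_atomic_docids(toy_logits, docid_offset=4, k=2)
print(top)  # document 2 ranks first, then document 0
```

With a real checkpoint, `first_step_logits` would be the decoder's output distribution at the first generated position for a given query; how those logits are obtained depends on the loading code and is left out here.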