GENRE: Autoregressive Entity Retrieval

Category: Natural Language Processing

License: Creative Commons

Model Type: Generative AI

GENRE (Generative ENtity REtrieval) is a sequence-to-sequence system designed for entity retrieval tasks such as entity disambiguation, end-to-end entity linking, and document retrieval. Unlike traditional approaches that rely on dense vector similarity, GENRE generates entity names token by token in an autoregressive manner, conditioned on the input context. This method enables efficient retrieval with a reduced memory footprint and eliminates the need for negative sampling during training. GENRE achieves state-of-the-art or competitive results across more than 20 benchmark datasets.
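To make the token-by-token idea concrete, here is a minimal sketch of how a candidate entity name can be scored as the sum of its per-token log-probabilities. It is an illustration only, not GENRE's implementation: TOY_MODEL, sequence_log_prob, rank_candidates, and all probabilities are hypothetical, and a real system would obtain the next-token distribution from a trained sequence-to-sequence network conditioned on the input context.

```python
import math

# Hypothetical next-token table standing in for a trained seq2seq decoder.
# Keys are the tokens generated so far; values are next-token probabilities
# (already conditioned on some fixed input context).
TOY_MODEL = {
    (): {"Paris": 1.0},
    ("Paris",): {"<eos>": 0.8, "Hilton": 0.2},
    ("Paris", "Hilton"): {"<eos>": 1.0},
}

def sequence_log_prob(tokens, model):
    """Sum log p(token_i | tokens_<i) left to right, including end-of-sequence."""
    total = 0.0
    prefix = ()
    for tok in list(tokens) + ["<eos>"]:
        p = model.get(prefix, {}).get(tok, 0.0)
        if p == 0.0:
            return float("-inf")  # the toy model cannot generate this name
        total += math.log(p)
        prefix = prefix + (tok,)
    return total

def rank_candidates(candidates, model):
    """Order candidate entity names from most to least likely."""
    return sorted(candidates, key=lambda c: sequence_log_prob(c, model), reverse=True)

candidates = [("Paris", "Hilton"), ("Paris",)]
ranked = rank_candidates(candidates, TOY_MODEL)
print(ranked[0])
```

Because each score is built from next-token distributions, every step is an exact softmax over the vocabulary, and no separate index of entity vectors is consulted in this sketch.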

Key Features

Autoregressive Generation: Generates entity names left-to-right, capturing fine-grained interactions between context and entity.
Memory Efficiency: The parameter count scales with vocabulary size rather than with the number of entities, which reduces memory usage.
No Negative Sampling Required: Utilizes exact softmax loss computation without the need for negative data subsampling.
Multilingual Support (mGENRE): Extends GENRE to support over 100 languages, treating language as a latent variable and marginalizing over it during prediction.
Flexible Integration: Compatible with both Fairseq and Hugging Face Transformers frameworks.
Pretrained Models and Datasets: Provides pretrained models and scripts for downloading relevant datasets and resources.
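The multilingual feature above describes marginalizing over language at prediction time. A minimal sketch of that step follows, assuming a model has already produced log-probabilities for each entity's name in each language. The entity identifiers, language codes, numbers, and function names are all hypothetical and are not taken from mGENRE's code.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))

# Hypothetical scores: log p(entity name in language l, l | context).
lang_log_probs = {
    "entity_A": {"en": math.log(0.30), "de": math.log(0.25), "it": math.log(0.05)},
    "entity_B": {"en": math.log(0.35), "fr": math.log(0.02)},
}

def marginal_scores(per_language):
    """Treat language as a latent variable and sum it out for each entity."""
    return {
        entity: logsumexp(list(by_lang.values()))
        for entity, by_lang in per_language.items()
    }

scores = marginal_scores(lang_log_probs)
best = max(scores, key=scores.get)
print(best)
```

In this toy data the single highest-scoring name belongs to entity_B (0.35 in English), but summing over languages favors entity_A (0.60 in total against 0.37), which shows how marginalizing can change the prediction.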

GitHub Arxiv

Similar Projects

GPT‑4o Language Translator

AI Minimalist Translation Assistant (Chrome Extension)

AI‑Translator Gemini API Chrome Extension

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

Ren’Py Translator

GPT-Neo: Open Source GPT-3 Style Language Model with TPU & GPU Support