T5 (Text-to-Text Transfer Transformer) is a unified framework for natural language processing (NLP) developed by Google Research. It casts every NLP problem in a text-to-text format: both input and output are plain text strings. This makes it possible to apply a single model, unchanged, to tasks as different as translation, summarization, question answering, and classification, as illustrated in the sketch below. Pretrained on a large, cleaned web corpus called C4 (Colossal Clean Crawled Corpus), T5 achieved state-of-the-art results on many NLP benchmarks at the time of its release.
Paper: https://arxiv.org/abs/1910.10683
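The snippet below is a minimal sketch of the text-to-text interface. It is written against the Hugging Face transformers library rather than the original T5 codebase (an assumption about tooling; the paper itself is framework-agnostic) and uses the public t5-small checkpoint with task prefixes from the paper.

```python
# Minimal sketch of T5's text-to-text interface via Hugging Face transformers
# (assumed tooling, not the original T5 codebase).
# Requires: transformers, torch, sentencepiece.
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is expressed as plain text; a short prefix selects the task.
examples = [
    "translate English to German: The house is wonderful.",
    "summarize: T5 casts every NLP problem as text-to-text, so a single "
    "model and objective cover translation, summarization, and more.",
]

for text in examples:
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    output_ids = model.generate(input_ids, max_new_tokens=40)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Note that the task is chosen purely by the text prefix; there are no task-specific heads or output layers.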
Key Features
Unified Text-to-Text Format: All NLP tasks are represented as text input and text output, giving the model one consistent, flexible interface.
Transformer-Based Architecture: Uses an encoder-decoder Transformer that closely follows the original architecture of Vaswani et al. (2017).
Multi-Task Learning: A single model, training objective, and decoding procedure serve all tasks, with no task-specific heads.
Pretrained Checkpoints: Released in five sizes, from Small (about 60M parameters) up to 11B (called XXL in the later T5 v1.1 naming), allowing trade-offs between quality and computational cost; see the sketch after this list.
State-of-the-Art Performance: At release, achieved state-of-the-art results on GLUE, SuperGLUE, SQuAD, and CNN/Daily Mail summarization.
Trained on C4 Dataset: Pretrained on C4, roughly 750 GB of cleaned, deduplicated English web text filtered from Common Crawl.
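As a sketch of the checkpoint trade-off, the helper below (a hypothetical convenience function, not part of any library) loads one of the five original checkpoints by its public Hugging Face identifier and reports its parameter count and encoder/decoder depth.

```python
# Sketch of choosing a checkpoint size; the identifiers are the public
# Hugging Face names for the original T5 release.
from transformers import T5ForConditionalGeneration

CHECKPOINTS = ("t5-small", "t5-base", "t5-large", "t5-3b", "t5-11b")

def load_t5(name: str = "t5-small") -> T5ForConditionalGeneration:
    """Hypothetical helper: load a pretrained T5 checkpoint. Larger sizes
    score higher on benchmarks but need far more memory and compute."""
    if name not in CHECKPOINTS:
        raise ValueError(f"unknown checkpoint: {name!r}")
    return T5ForConditionalGeneration.from_pretrained(name)

model = load_t5("t5-small")
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params:,} parameters")
print(f"encoder layers: {model.config.num_layers}, "
      f"decoder layers: {model.config.num_decoder_layers}")
```

The later v1.1 family follows the same pattern under identifiers like google/t5-v1_1-xxl.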