ELECTRA is a compute- and sample-efficient pretraining method for transformer-based language models. Instead of masking tokens and predicting them as BERT does, ELECTRA uses a discriminative objective: a small generator (itself a masked language model) proposes replacements for masked positions, and a larger discriminator learns to classify every token in the input as original or replaced. Because the discriminator receives a learning signal from all tokens rather than only the ~15% that are masked, ELECTRA reaches stronger downstream performance for the same amount of pretraining compute.
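The training step below is a minimal sketch of this objective in PyTorch, not the repository's reference implementation: the module names, model sizes, masking rate, and stand-in `[MASK]` id are illustrative assumptions, and the discriminator loss weight follows the value reported in the ELECTRA paper.

```python
# Minimal sketch of ELECTRA-style pretraining: a small generator is trained as a
# masked LM, its sampled predictions corrupt the input, and the discriminator is
# trained to flag which tokens were replaced. All names and sizes are illustrative.
import torch
import torch.nn as nn

vocab_size, hidden, batch, seq_len = 1000, 64, 8, 16
MASK_ID = 0  # stand-in for the [MASK] token id

class TinyGenerator(nn.Module):
    """Small masked LM that proposes replacement tokens."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, hidden)
        self.head = nn.Linear(hidden, vocab_size)
    def forward(self, ids):
        return self.head(self.emb(ids))              # (batch, seq_len, vocab_size)

class TinyDiscriminator(nn.Module):
    """Larger main model that scores each token as original vs. replaced."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, hidden)
        self.head = nn.Linear(hidden, 1)
    def forward(self, ids):
        return self.head(self.emb(ids)).squeeze(-1)  # (batch, seq_len)

gen, disc = TinyGenerator(), TinyDiscriminator()

ids = torch.randint(1, vocab_size, (batch, seq_len))   # original token ids
mask = torch.rand(batch, seq_len) < 0.15               # positions to corrupt
gen_logits = gen(ids.masked_fill(mask, MASK_ID))

# Generator objective: ordinary masked-token prediction on the masked positions.
loss_gen = nn.functional.cross_entropy(gen_logits[mask], ids[mask])

# Sample replacements; .detach() means the generator is trained only by its MLM
# loss, not adversarially through the discriminator.
samples = torch.distributions.Categorical(logits=gen_logits.detach()).sample()
corrupted = torch.where(mask, samples, ids)
labels = (corrupted != ids).float()                    # 1 = replaced, 0 = original

# Discriminator objective: replaced-token detection over every position.
loss_disc = nn.functional.binary_cross_entropy_with_logits(disc(corrupted), labels)
loss = loss_gen + 50.0 * loss_disc                     # lambda = 50 as in the paper
loss.backward()
```

Note that the generator is trained with maximum likelihood rather than adversarially; after pretraining, the generator is discarded and only the discriminator is fine-tuned on downstream tasks.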
## Key Features

- More compute-efficient than traditional masked language models
- Trains with a replaced-token detection objective instead of masked-token prediction
- Jointly trains a small generator with a larger discriminator
- Outperforms BERT on multiple downstream NLP tasks given comparable compute
- Open-source implementation with pretrained models and training scripts (see the usage sketch after this list)
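As a quick way to try a released checkpoint, the snippet below loads the small discriminator through the Hugging Face `transformers` port and scores each token as original or replaced. This is an assumption of convenience: the checkpoint name `google/electra-small-discriminator` and the `transformers` API are separate from this repository's own training scripts.

```python
# Hedged usage sketch: inspect a pretrained ELECTRA discriminator via the
# Hugging Face `transformers` port (assumed available; not this repo's scripts).
import torch
from transformers import ElectraForPreTraining, ElectraTokenizerFast

name = "google/electra-small-discriminator"   # assumed public checkpoint name
tokenizer = ElectraTokenizerFast.from_pretrained(name)
model = ElectraForPreTraining.from_pretrained(name)

text = "the chef cooked a delicious meal"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits           # one replaced/original score per token

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
flags = (torch.sigmoid(logits[0]) > 0.5).long().tolist()  # 1 = predicted replaced
print(list(zip(tokens, flags)))
```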