ELECTRA
ELECTRA: A highly compute-efficient NLP pre-training model that uses a discriminator to detect replaced tokens, substantially outperforming masked language models (MLMs) at a given compute budget.
ELECTRA (Efficiently Learning an Encoder that Classifies Token Replacements Accurately) is a Google Research NLP model that shifts pre-training from a generative objective (like BERT's masked language modeling) to a discriminative one: it trains a model to distinguish 'real' input tokens from plausible 'fake' replacements, a method called Replaced Token Detection (RTD). This approach is highly compute-efficient because the discriminator learns from all input tokens, not just the 15% masked subset used in MLMs. For example, ELECTRA-Small, trained on a single GPU for just four days, surpasses GPT, a model that required over 30x more compute. At scale, ELECTRA achieves state-of-the-art results on benchmarks like SQuAD 2.0 while using less than a quarter of the compute required by comparable models such as RoBERTa and XLNet.
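The RTD objective described above can be sketched in a few lines. The following is a minimal NumPy illustration, not ELECTRA's implementation: the generator and discriminator networks are replaced by random stand-ins, and the sequence and vocabulary are toy-sized. It shows the key mechanics: ~15% of positions are masked and filled with proposed replacements, every position is labelled real or replaced, and the discriminator loss is computed over all tokens rather than only the masked subset.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size = 20
tokens = rng.integers(0, vocab_size, size=16)  # toy input sequence

# 1) Mask ~15% of positions, as in MLM pre-training.
mask = rng.random(tokens.shape) < 0.15

# 2) A small "generator" proposes replacements for the masked
#    positions (uniform sampling here as a stand-in for the network).
proposals = rng.integers(0, vocab_size, size=tokens.shape)
corrupted = np.where(mask, proposals, tokens)

# 3) RTD labels: 1 = token was replaced, 0 = original. A sampled
#    proposal that happens to equal the original token is labelled
#    "original", matching the paper's formulation.
labels = (corrupted != tokens).astype(np.float64)

# 4) The discriminator scores EVERY position (not just the masked
#    ~15%), which is the source of ELECTRA's sample efficiency.
logits = rng.normal(size=tokens.shape)  # stand-in discriminator scores
probs = 1.0 / (1.0 + np.exp(-logits))
bce = -(labels * np.log(probs) + (1.0 - labels) * np.log(1.0 - probs))
loss = bce.mean()  # averaged over all tokens, masked and unmasked

print(f"replaced positions: {int(labels.sum())} / {len(tokens)}")
print(f"RTD loss: {loss:.4f}")
```

In the real model the generator is a small MLM trained jointly with the discriminator, and after pre-training the generator is discarded; only the discriminator is fine-tuned on downstream tasks.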