
XLNet

A generalized autoregressive pretraining method that outperforms BERT by using permutation language modeling to capture bidirectional context without the noise introduced by masked tokens.

Developed by researchers at Google Brain and Carnegie Mellon University, XLNet integrates the best features of autoregressive (AR) and autoencoding (AE) models. By utilizing Permutation Language Modeling (PLM), it predicts tokens across all possible factorizations of a sequence order, effectively bypassing the artificial [MASK] tokens that hinder BERT during fine-tuning. This architecture incorporates Transformer-XL mechanisms (segment-level recurrence and relative positional encoding) to handle long-range dependencies. In benchmarks, XLNet outperformed BERT on 20 tasks, achieving state-of-the-art results in SQuAD 2.0 and GLUE by modeling bidirectional context while maintaining the mathematical consistency of traditional language models.
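The core idea of Permutation Language Modeling is that, for a sampled factorization order of the sequence, each token is predicted from the tokens that precede it *in that order*, which may lie on either side of it in the original sequence. The sketch below illustrates this with a hypothetical helper, `plm_context_mask` (not XLNet's actual implementation), that computes which positions are visible to each target position under one sampled order:

```python
import random

def plm_context_mask(seq_len, rng=None):
    """For one sampled factorization order, return which positions each
    position may attend to under permutation language modeling.
    Illustrative sketch only, not XLNet's actual masking code."""
    rng = rng or random.Random()
    order = list(range(seq_len))
    rng.shuffle(order)  # a random factorization order z
    visible = {}
    for step, pos in enumerate(order):
        # Position z_t may attend only to z_<t in the sampled order,
        # regardless of whether those tokens sit to its left or right
        # in the original sequence -- this is how bidirectional context
        # arises without any [MASK] tokens.
        visible[pos] = set(order[:step])
    return order, visible

order, visible = plm_context_mask(5, random.Random(0))
```

Averaged over many sampled orders, every position ends up conditioned on context from both directions, while each individual factorization remains a valid autoregressive product of conditionals, which is the "mathematical consistency" the paragraph above refers to.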

https://arxiv.org/abs/1906.08237