
XLNet

A generalized autoregressive pretraining method that outperforms BERT by using permutation language modeling to capture bidirectional context without the noise introduced by masked tokens.

Developed by researchers at Google Brain and Carnegie Mellon University, XLNet integrates the best features of autoregressive (AR) and autoencoding (AE) models. By utilizing Permutation Language Modeling (PLM), it predicts tokens across all possible factorizations of a sequence order, effectively bypassing the artificial [MASK] tokens that hinder BERT during fine-tuning. This architecture incorporates Transformer-XL mechanisms (segment-level recurrence and relative positional encoding) to handle long-range dependencies. In benchmarks, XLNet outperformed BERT on 20 tasks, achieving state-of-the-art results in SQuAD 2.0 and GLUE by modeling bidirectional context while maintaining the mathematical consistency of traditional language models.
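The core idea of Permutation Language Modeling is that, for a sampled factorization order of the sequence, each token is predicted from the tokens that precede it *in that order*, which may lie on either side of it in the original sequence. The sketch below illustrates this with a hypothetical helper, `plm_context_mask` (not XLNet's actual implementation), that computes which positions are visible to each target position under one sampled order:

```python
import random

def plm_context_mask(seq_len, rng=None):
    """For one sampled factorization order, return which positions each
    position may attend to under permutation language modeling.
    Illustrative sketch only, not XLNet's actual masking code."""
    rng = rng or random.Random()
    order = list(range(seq_len))
    rng.shuffle(order)  # a random factorization order z
    visible = {}
    for step, pos in enumerate(order):
        # Position z_t may attend only to z_<t in the sampled order,
        # regardless of whether those tokens sit to its left or right
        # in the original sequence -- this is how bidirectional context
        # arises without any [MASK] tokens.
        visible[pos] = set(order[:step])
    return order, visible

order, visible = plm_context_mask(5, random.Random(0))
```

Averaged over many sampled orders, every position ends up conditioned on context from both directions, while each individual factorization remains a valid autoregressive product of conditionals, which is the "mathematical consistency" the paragraph above refers to.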

https://arxiv.org/abs/1906.08237