
TinyBERT

TinyBERT uses a two-stage transformer distillation process to compress BERT models by 7.5x while retaining over 96% of the teacher's performance on the GLUE benchmark.

Developed by Huawei Noah's Ark Lab, TinyBERT bridges the gap between massive language models and edge-device deployment. It employs a transformer-distillation method that transfers knowledge at the embedding, hidden-state, and self-attention layers. By reducing parameters from 110 million to 14.5 million, the model achieves inference speeds up to 9.4x faster than BERT-base. This efficiency makes it a common choice for real-time NLP tasks on mobile hardware (ARM/Android) while remaining competitive on the GLUE benchmark.
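The layer-wise distillation described above can be sketched as a sum of mean-squared-error terms: one between the teacher's and student's attention matrices, and one between their hidden states, with a learnable linear projection bridging the dimension gap (768 for BERT-base vs. 312 for TinyBERT4). The sketch below uses random tensors in place of real transformer activations; all shapes and the projection `W` are illustrative assumptions, not TinyBERT's actual training code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: teacher (BERT-base) hidden size 768,
# student (TinyBERT4) hidden size 312, 12 attention heads.
seq_len, d_teacher, d_student, heads = 8, 768, 312, 12

# Random stand-ins for real transformer activations (illustration only).
teacher_hidden = rng.standard_normal((seq_len, d_teacher))
student_hidden = rng.standard_normal((seq_len, d_student))
teacher_attn = rng.standard_normal((heads, seq_len, seq_len))
student_attn = rng.standard_normal((heads, seq_len, seq_len))

# Learnable projection mapping student hidden states into the teacher's
# dimension so the two tensors are comparable (random init here).
W = rng.standard_normal((d_student, d_teacher)) * 0.01

def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))

# Attention-based loss: MSE between per-head attention score matrices.
attn_loss = mse(student_attn, teacher_attn)

# Hidden-state loss: MSE after projecting the student into teacher space.
hidden_loss = mse(student_hidden @ W, teacher_hidden)

# Per-layer distillation objective is the sum of the two terms.
layer_loss = attn_loss + hidden_loss
```

In training, this per-layer loss would be summed over a fixed mapping from student layers to teacher layers and minimized by gradient descent; the embedding and prediction layers are handled with analogous MSE and soft-label terms.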

https://github.com/huawei-noah/Pretrained-Language-Model/tree/master/TinyBERT