TinyBERT
TinyBERT uses a two-stage Transformer distillation process to compress BERT models by 7.5x while retaining roughly 96% of the teacher's performance on GLUE.
Developed by Huawei Noah's Ark Lab, TinyBERT bridges the gap between massive language models and edge-device deployment. It employs a Transformer distillation method that transfers knowledge at three levels: the embedding layer, the hidden states, and the self-attention matrices. By reducing parameters from 110 million to 14.5 million, the model achieves inference speeds up to 9.4x faster than BERT-base. This efficiency makes it a standard choice for real-time NLP tasks on mobile hardware (ARM/Android) without sacrificing the accuracy expected on the GLUE benchmark.
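The layer-wise objective described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: states are plain Python lists, the learned projection that maps the student's narrower hidden states to the teacher's width is reduced to a single scalar `w`, and all function names are hypothetical.

```python
def mse(a, b):
    """Mean-squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def tinybert_layer_loss(student_attn, teacher_attn,
                        student_hidden, teacher_hidden, w=1.0):
    """Per-layer distillation loss in the spirit of TinyBERT (sketch).

    Attention loss: compare the student's attention matrices
    (flattened to vectors here) directly against the teacher's.
    Hidden-state loss: project the student's hidden state with a
    learned transform -- simplified to the scalar w -- before
    comparing to the teacher's hidden state.
    """
    attn_loss = mse(student_attn, teacher_attn)
    hidden_loss = mse([w * h for h in student_hidden], teacher_hidden)
    return attn_loss + hidden_loss
```

In the real method these losses are summed over mapped layer pairs (the 4-layer student mimics selected layers of the 12-layer teacher), and an analogous MSE term is applied once at the embedding layer.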