Technology
ProtBERT
A self-supervised transformer language model pretrained on 217 million protein sequences, whose embeddings support downstream prediction of secondary structure and subcellular localization.
Developed by researchers at the Technical University of Munich, ProtBERT adapts the BERT architecture to the "language of life" by pretraining on the UniRef100 dataset (217 million sequences). It bypasses the need for costly Multiple Sequence Alignments (MSAs) by learning intrinsic biophysical properties directly from raw amino acid strings. Its embeddings achieve performance competitive with MSA-based methods such as NetSurfP-2.0 for secondary structure prediction and perform strongly on the DeepLoc benchmark for subcellular localization, providing a high-throughput alternative to traditional bioinformatics pipelines.
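As a concrete illustration, the minimal sketch below extracts per-residue embeddings, assuming the public Rostlab/prot_bert checkpoint on Hugging Face and the transformers and torch packages. Per the model card, the tokenizer expects space-separated single-letter amino acids, with rare residues (U, Z, O, B) mapped to X.

```python
# Minimal sketch: extracting per-residue ProtBERT embeddings.
# Assumes the public Hugging Face checkpoint "Rostlab/prot_bert"
# and the transformers + torch packages are installed.
import re

import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("Rostlab/prot_bert", do_lower_case=False)
model = BertModel.from_pretrained("Rostlab/prot_bert")
model.eval()

# ProtBERT's vocabulary is individual amino acids, so residues must be
# space-separated; rare amino acids (U, Z, O, B) are mapped to X.
sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # example sequence
sequence = " ".join(re.sub(r"[UZOB]", "X", sequence))

inputs = tokenizer(sequence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state has shape (batch, seq_len + 2, 1024); the first and
# last positions are the [CLS] and [SEP] special tokens, so drop them.
per_residue = outputs.last_hidden_state[0, 1:-1]
print(per_residue.shape)  # torch.Size([33, 1024])
```

In the downstream setup described above, these per-residue vectors feed supervised heads for secondary structure prediction, while a pooled (e.g., mean) embedding can serve per-protein tasks such as localization.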