Technology
ProtT5
A protein language model based on the T5 architecture that converts amino acid sequences into high-dimensional embeddings reflecting structural and functional properties.
ProtT5 applies the Transformer-based T5 encoder-decoder framework to protein sequences. It was pretrained on the roughly 2.1 billion sequences in BFD, and the ProtT5-XL-UniRef50 variant was then fine-tuned on UniRef50. Developed by the Rost Lab, it treats a protein as a sentence and each amino acid as a word, and its embeddings have been used to predict secondary structure, subcellular localization, and conservation scores with accuracy reported as state of the art. Trained with a self-supervised masked language modeling objective, ProtT5-XL-UniRef50 produces a 1024-dimensional embedding for each residue that captures evolutionary information without requiring expensive multiple sequence alignments (MSAs).
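As a rough illustration of how such embeddings might be extracted, the sketch below assumes the Hugging Face `transformers` library (with `torch` and `sentencepiece` installed) and the publicly hosted checkpoint `Rostlab/prot_t5_xl_uniref50`; neither is named in the description above. The helper names `prepare_sequence` and `embed_sequences` are hypothetical, and the mapping of the rare residues U, Z, O, and B to X follows the convention commonly used with ProtT5 rather than anything stated here.

```python
import re


def prepare_sequence(seq: str) -> str:
    """Format one protein for the tokenizer: upper-case it, map the rare or
    ambiguous residues U, Z, O, B to X, and separate residues with spaces so
    that each amino acid becomes one token ("word")."""
    cleaned = re.sub(r"[UZOB]", "X", seq.strip().upper())
    return " ".join(cleaned)


def embed_sequences(seqs, model_name="Rostlab/prot_t5_xl_uniref50"):
    """Return (per_residue, per_protein) embeddings for a list of sequences.

    Assumption: `model_name` is a ProtT5 checkpoint on the Hugging Face Hub.
    The heavy imports stay inside the function so the preprocessing helper
    can be used without torch or transformers installed.
    """
    import torch
    from transformers import T5EncoderModel, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(model_name, do_lower_case=False)
    model = T5EncoderModel.from_pretrained(model_name).eval()

    prepared = [prepare_sequence(s) for s in seqs]
    lengths = [len(p.split()) for p in prepared]  # residues per sequence
    batch = tokenizer(prepared, padding="longest", return_tensors="pt")

    with torch.no_grad():
        hidden = model(
            input_ids=batch["input_ids"],
            attention_mask=batch["attention_mask"],
        ).last_hidden_state  # shape: (batch, tokens, hidden_size)

    # Keep only the residue positions, dropping the end-of-sequence token
    # and any padding that follows it.
    per_residue = [hidden[i, :n] for i, n in enumerate(lengths)]
    # Mean-pool over residues for one fixed-size vector per protein.
    per_protein = [r.mean(dim=0) for r in per_residue]
    return per_residue, per_protein
```

Calling `embed_sequences(["PRTEINO", "SEQWENCE"])` downloads the checkpoint on first use, which is several gigabytes, so a GPU or a half-precision variant may be preferable in practice. Mean-pooling is only one simple way to obtain a per-protein vector; tasks such as secondary structure prediction would use the per-residue embeddings directly.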