Technology
ProtGPT2
ProtGPT2 is an autoregressive transformer model trained on the entire UniRef50 database to generate functional, de novo protein sequences.
Developed by the Ferruz lab, ProtGPT2 uses a GPT-2 architecture with 738 million parameters to explore the vast protein fitness landscape. It was trained on 45 million sequences from UniRef50, spanning the entire protein space, to learn the underlying grammar of biological polymers. The model generates 400-residue sequences in seconds that retain natural-like secondary structure and folding patterns. Researchers use it to design novel enzymes and binders that bypass the limitations of evolution-bound templates, providing a high-throughput alternative to traditional directed evolution.
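The generation process described above is standard autoregressive sampling: each new residue is drawn from a distribution conditioned on everything generated so far. The following is a minimal pure-Python sketch of that loop; `next_residue_probs` is a hypothetical stand-in for the trained transformer (which in reality operates on BPE subword tokens and returns learned probabilities), using a uniform distribution so the sketch runs without model weights.

```python
import random

# The 20 standard amino acids that ProtGPT2's output ultimately decodes to.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def next_residue_probs(prefix):
    # Placeholder for the transformer forward pass: the real model
    # conditions on the full prefix and returns learned probabilities.
    # A uniform distribution keeps this sketch self-contained.
    return {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}

def generate(length=400, seed=0):
    """Autoregressive sampling loop: draw one residue at a time,
    each conditioned on the sequence generated so far."""
    rng = random.Random(seed)
    seq = []
    for _ in range(length):
        probs = next_residue_probs("".join(seq))
        residues, weights = zip(*probs.items())
        seq.append(rng.choices(residues, weights=weights, k=1)[0])
    return "".join(seq)

protein = generate(400)
print(len(protein))  # 400
```

In practice, one would sample from the published model checkpoint with a library such as Hugging Face Transformers rather than hand-rolling this loop; the sketch only illustrates the conditioning structure that lets the model learn sequence "grammar."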
Related technologies
Recent Talks & Demos