Technology
Pre-training
Pre-training is the foundational phase of deep learning: a model learns general features and world knowledge from massive, unlabeled datasets before being fine-tuned for a specific downstream task.
Pre-training is the critical first step in the transfer learning paradigm: the model (e.g., a Transformer architecture) is trained on a vast, general corpus, often terabytes of text drawn from sources such as Common Crawl or Wikipedia, to establish foundational parameters. This initial phase uses self-supervised objectives, such as Masked Language Modeling (MLM) in BERT or next-token prediction in GPT models, so the model acquires a deep, generalized understanding of language without manual labels. The resulting 'base model' then requires only a second, comparatively cheap fine-tuning step on a smaller, labeled dataset to achieve state-of-the-art results on specialized tasks, drastically reducing development time and computational cost.
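The next-token prediction objective mentioned above can be sketched in a few lines. This is a minimal illustration, not a real model: the logits are random stand-ins for a network's output, and the function name `next_token_loss` is a hypothetical helper introduced here.

```python
import numpy as np

def next_token_loss(logits, token_ids):
    """Average cross-entropy of predicting token t+1 from position t.

    logits:    (seq_len, vocab_size) unnormalized scores from the model
    token_ids: (seq_len,) integer token ids of the training text
    """
    # Shift by one: predictions at positions 0..n-2 are scored
    # against the actual tokens at positions 1..n-1.
    preds, targets = logits[:-1], token_ids[1:]
    # Log-softmax computed in a numerically stable way.
    shifted = preds - preds.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Negative log-likelihood of each true next token, averaged.
    return -log_probs[np.arange(len(targets)), targets].mean()

# Illustrative data: random "model outputs" over a toy vocabulary.
rng = np.random.default_rng(0)
vocab_size, seq_len = 50, 8
logits = rng.normal(size=(seq_len, vocab_size))
tokens = rng.integers(0, vocab_size, size=seq_len)
loss = next_token_loss(logits, tokens)  # scalar cross-entropy in nats
```

During pre-training, this loss is minimized over the entire corpus; with random logits it sits near ln(vocab_size), and it falls as the model learns which tokens actually follow which contexts.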
Related technologies