Technology
Pre-training
Pre-training is the foundational phase of deep learning: a model learns general features and world knowledge from massive, unlabeled datasets before being fine-tuned for a specific downstream task.
Pre-training is the critical first step in the transfer learning paradigm: the model (e.g., a Transformer architecture) is trained on a vast, general corpus, often terabytes of text drawn from sources such as Common Crawl or Wikipedia, to establish foundational parameters. This initial phase uses self-supervised objectives, such as Masked Language Modeling (MLM) in BERT or next-token prediction in GPT models, so the model acquires a deep, generalized understanding of language without manual labels. The resulting 'base model' then requires only a second, comparatively cheap fine-tuning step on a smaller, labeled dataset to achieve state-of-the-art results on specialized tasks, drastically reducing development time and computational cost.
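The next-token prediction objective mentioned above can be sketched in a few lines. This is a minimal illustration, not a real model: the logits are random stand-ins for a network's output, and the function name `next_token_loss` is a hypothetical helper introduced here.

```python
import numpy as np

def next_token_loss(logits, token_ids):
    """Average cross-entropy of predicting token t+1 from position t.

    logits:    (seq_len, vocab_size) unnormalized scores from the model
    token_ids: (seq_len,) integer token ids of the training text
    """
    # Shift by one: predictions at positions 0..n-2 are scored
    # against the actual tokens at positions 1..n-1.
    preds, targets = logits[:-1], token_ids[1:]
    # Log-softmax computed in a numerically stable way.
    shifted = preds - preds.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Negative log-likelihood of each true next token, averaged.
    return -log_probs[np.arange(len(targets)), targets].mean()

# Illustrative data: random "model outputs" over a toy vocabulary.
rng = np.random.default_rng(0)
vocab_size, seq_len = 50, 8
logits = rng.normal(size=(seq_len, vocab_size))
tokens = rng.integers(0, vocab_size, size=seq_len)
loss = next_token_loss(logits, tokens)  # scalar cross-entropy in nats
```

During pre-training, this loss is minimized over the entire corpus; with random logits it sits near ln(vocab_size), and it falls as the model learns which tokens actually follow which contexts.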
Related technologies