RoBERTa

RoBERTa (Robustly Optimized BERT Pretraining Approach) is a high-performance language model from Facebook AI that significantly outperforms BERT by optimizing the pretraining strategy, not the core architecture.

RoBERTa was developed by researchers at Facebook AI in 2019. Their replication study found that BERT was significantly undertrained and could reach state-of-the-art results with a refined recipe: they removed the Next Sentence Prediction (NSP) objective, applied dynamic masking (sampling a new mask pattern each time a sequence is fed to the model), and scaled up training substantially. Specifically, RoBERTa trained for up to 500K steps with batches of 8K sequences on about 160GB of text, roughly ten times BERT’s 16GB; BERT, by comparison, trained for 1M steps at a batch size of 256. The resulting model reported leading results on GLUE, RACE, and SQuAD at the time of publication and became a standard baseline for subsequent language model development.
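To make the dynamic masking idea concrete, here is a minimal sketch in plain Python. It is an illustration only, not the authors' implementation: the function name `dynamic_mask`, the toy vocabulary, and the whitespace tokens are all assumptions, and the 15% masking rate with an 80/10/10 replacement split follows the original BERT recipe.

```python
import random

MASK = "<mask>"  # placeholder mask symbol, chosen for illustration


def dynamic_mask(tokens, vocab, mask_prob=0.15, rng=random):
    """Return (masked_tokens, labels) with a freshly sampled mask pattern.

    Calling this every time a sequence is served to the model, instead of
    once during preprocessing, is what makes the masking "dynamic".
    """
    masked = list(tokens)
    labels = [None] * len(tokens)  # None marks positions that are not predicted
    for i, tok in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = tok  # the model must recover the original token here
        roll = rng.random()
        if roll < 0.8:
            masked[i] = MASK  # 80%: replace with the mask symbol
        elif roll < 0.9:
            masked[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: leave the original token in place
    return masked, labels


if __name__ == "__main__":
    sentence = "the quick brown fox jumps over the lazy dog".split()
    toy_vocab = ["cat", "tree", "runs", "blue"]  # hypothetical vocabulary
    # The same sentence gets a newly sampled mask pattern on every pass.
    for epoch in range(3):
        print(epoch, dynamic_mask(sentence, toy_vocab))
```

With static masking, the pattern would be sampled once during preprocessing and reused in every epoch; here the sampling happens at load time, so the model sees varied prediction targets for the same text.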

https://arxiv.org/abs/1907.11692

Recent Talks & Demos

LLMs Generate Knowledge Graphs
Boston Jun 10
Knowledge Graphs Structured Networks
Pick-Em's Bot
Seattle Jun 6
GPT-4 Chain-of-Thought
LLM Knowledge Graph Generation
New York City Jun 4
GPT-4 Knowledge Graphs
Reliable AI Agents
New York City May 22
GPT-4 RAG
Twitter '95 AI Social Simulation
New York City May 22
GPT-4 AI tools
Synthasaizer: LLMs and Time Travel
Los Angeles May 21
GPT-4 synthasaizer
Obsidian: Non-Linear AI Chat
Denver May 13
ChatGPT Obsidian
LLMs Analyze Qualitative Data
Zürich May 8
GPT-4 Topic Modeling
Vocal Docs: Talkable Document Editor
Seattle Apr 25
GPT-4 Transcription
RAG Copilot for Big Data extraction
Medellín Apr 25
AWS MapReduce
CrustyCrab: LLM C-to-Rust Translator
New York City Apr 24
C Rust
LLM Language Learning App
Boston Mar 25
GPT-4 GPT-3
Libra: AI Legal Argument Analysis
Berlin Mar 21
GPT-4 Prompt Engineering
Viewpoint.AI: Accelerating Group Decisions
Eastside Entrepreneurs Mar 7
GPT-4 Neural networks
LLM Causal Modeling for Social Science
Boston Feb 26
GPT-4 GPT-3
Pokemon LLM Battle Calculations
Chicago Feb 20
GPT-4 Fine-tuning
AI Filter: Local LLM Filtering
Chicago Feb 20
Chrome extension Twitter
ClaimValidator
Los Angeles Feb 8
GPT-4 Generative AI
OpenAI Tools and Extraction
Los Angeles Feb 8
Pydantic OpenAI API
EidOS: Agent Operating System
Seattle Jan 24
EidOS Kubernetes
nutritionGPT: LLM Nutrition Pipeline
Chicago Jan 23
nutritionGPT GPT-4
dstack: GPU Workloads on Any Cloud
Munich Jan 18
dstack GPU
midpage: Fixing Vector Search Context
Munich Jan 18
Vector Search Embeddings
Akkadian Oracle
Los Angeles Jan 10
RAG GPT-4