
Technology

RoBERTa

RoBERTa (Robustly Optimized BERT Pretraining Approach) is a high-performance language model from Facebook AI that significantly outperforms BERT by optimizing the pretraining strategy, not the core architecture.

RoBERTa is a robustly optimized version of the BERT model, developed by researchers at Facebook AI in 2019. Their replication study showed that BERT was significantly undertrained and could achieve state-of-the-art results with a refined recipe: they removed the Next Sentence Prediction (NSP) objective, implemented dynamic masking (re-sampling the masked positions each time a sequence is seen, rather than fixing them once during preprocessing), and scaled up training dramatically. Specifically, RoBERTa trained for up to 500K steps (up from an initial 100K) on 160GB of text data (ten times BERT's data) using much larger batch sizes (8K sequences, versus BERT's 256). This optimized approach yielded superior performance on major benchmarks including GLUE, RACE, and SQuAD, establishing RoBERTa as a standard baseline for subsequent language model development.
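
The dynamic-masking idea can be sketched in a few lines of Python. This is a toy illustration, not the actual fairseq implementation: the mask token id, the 15% masking rate, and the omission of BERT's 80/10/10 replace-keep-corrupt scheme are simplifying assumptions.

```python
import random

MASK_ID = 0  # hypothetical [MASK] token id for this toy example


def dynamic_mask(token_ids, mask_prob=0.15, rng=None):
    """Return (masked_ids, labels), sampling a fresh mask pattern each call.

    BERT's original pipeline masked each sequence once during preprocessing
    (static masking), so the model saw the same pattern every epoch. RoBERTa
    re-samples the mask every time a sequence is fed to the model, so across
    epochs it trains on many different masked variants of the same text.
    """
    rng = rng or random.Random()
    masked = list(token_ids)
    labels = [-100] * len(token_ids)  # -100 = position ignored by the MLM loss
    for i, tok in enumerate(token_ids):
        if rng.random() < mask_prob:
            labels[i] = tok       # predict the original token here
            masked[i] = MASK_ID   # (80/10/10 replacement omitted for brevity)
    return masked, labels


seq = list(range(1, 21))  # toy token ids 1..20
epoch1, labels1 = dynamic_mask(seq, rng=random.Random(1))
epoch2, labels2 = dynamic_mask(seq, rng=random.Random(2))
```

Two passes over the same sequence produce different mask patterns, which is the point of dynamic masking: the effective training signal grows with the number of epochs instead of being fixed at preprocessing time.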

https://arxiv.org/abs/1907.11692
118 projects · 40 cities


Recent Talks & Demos



- Beyond Presence: Hyper-Realistic Avatars · Munich, Nov 21 · GPT-4, RAG
- LLMs for Branching Storytelling · Singapore, Nov 19 · Anthropic, Gemini
- React Compiler for LLMs · Amsterdam, Nov 12 · React, GPT-4
- Automated Video Editing with LLMs · Amsterdam, Nov 12 · Whisper, FFmpeg
- Evaluating AI Agents in Finance · London, Oct 31 · Generative models, Agentic Pipelines
- Pieces: Long-Term Developer Memory · Cincinnati, Oct 30 · RAG, GPT-4
- GraphRAG: Improving RAG Accuracy · Bogotá, Oct 30 · RAG, GraphRAG
- Agentic LLMs for Data Enrichment · Montreal, Oct 29 · Weaviate, SQL
- Cooktok · Montreal, Oct 29 · GPT-4, JSON
- LLM Evaluation Labeling Workflow · Seattle, Oct 24 · OpenPipe, GPT-4
- Listen Labs · San Francisco, Oct 22 · Listen, PowerPoint
- Hypothesis Sage: Agentic RAG Statistics · Chicago, Oct 22 · RAG, GPT-4
- Lidar Processing for Machine Learning · Fort Wayne, Oct 15 · LiDAR, Point Cloud
- GenAI Firewall: Enterprise Security · Prague, Oct 15 · GPT-4, Regular expressions
- Amazon Bedrock: Serverless AI Orchestration · Dubai, Oct 5 · Amazon Bedrock, AWS Step Functions
- LLM Data Classification · Amsterdam, Sep 25 · Generative AI, GPT-4
- LLM Prompting for Semantic Routing · Toronto, Sep 20 · GPT-4, Prompt Engineering
- YT shorts finder · Austin, Sep 12 · Vision models, Whisper
- Repeated Inference Improves LLM Output · Hamburg, Sep 12 · GPT-4, Inference
- Mirascope — V1 · Los Angeles, Aug 28 · Mirascope, GPT-4
- Skipper · New York City, Aug 28 · GPT-4, Document scraping
- Large Scale GPU Inference · Boston, Aug 26 · ChatGPT, GPT-4
- Cone AI: LLMs for YC Course · Boston, Aug 26 · ChatGPT, Generative AI
- Deepgram Voice AI Platform · Seattle, Aug 15 · Deepgram, Speech-to-Text