
Technology

RoBERTa

RoBERTa (Robustly Optimized BERT Pretraining Approach) is a high-performance language model from Facebook AI that significantly outperforms BERT by optimizing the pretraining strategy, not the core architecture.

RoBERTa is a robustly optimized version of the BERT model, developed by researchers at Facebook AI in 2019. Their replication study showed that BERT was significantly undertrained and could achieve state-of-the-art results with a refined recipe: they removed the Next Sentence Prediction (NSP) objective, implemented dynamic masking (re-sampling the masked positions each time a sequence is seen, rather than fixing them once during preprocessing), and scaled up training dramatically. Specifically, RoBERTa trained for up to 500K steps (up from an initial 100K) on 160GB of text data (ten times BERT's data) using much larger batch sizes (8K sequences, versus BERT's 256). This optimized approach yielded superior performance on major benchmarks including GLUE, RACE, and SQuAD, establishing RoBERTa as a standard baseline for subsequent language model development.
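
The dynamic-masking idea can be sketched in a few lines of Python. This is a toy illustration, not the actual fairseq implementation: the mask token id, the 15% masking rate, and the omission of BERT's 80/10/10 replace-keep-corrupt scheme are simplifying assumptions.

```python
import random

MASK_ID = 0  # hypothetical [MASK] token id for this toy example


def dynamic_mask(token_ids, mask_prob=0.15, rng=None):
    """Return (masked_ids, labels), sampling a fresh mask pattern each call.

    BERT's original pipeline masked each sequence once during preprocessing
    (static masking), so the model saw the same pattern every epoch. RoBERTa
    re-samples the mask every time a sequence is fed to the model, so across
    epochs it trains on many different masked variants of the same text.
    """
    rng = rng or random.Random()
    masked = list(token_ids)
    labels = [-100] * len(token_ids)  # -100 = position ignored by the MLM loss
    for i, tok in enumerate(token_ids):
        if rng.random() < mask_prob:
            labels[i] = tok       # predict the original token here
            masked[i] = MASK_ID   # (80/10/10 replacement omitted for brevity)
    return masked, labels


seq = list(range(1, 21))  # toy token ids 1..20
epoch1, labels1 = dynamic_mask(seq, rng=random.Random(1))
epoch2, labels2 = dynamic_mask(seq, rng=random.Random(2))
```

Two passes over the same sequence produce different mask patterns, which is the point of dynamic masking: the effective training signal grows with the number of epochs instead of being fixed at preprocessing time.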

https://arxiv.org/abs/1907.11692
118 projects · 40 cities


Recent Talks & Demos



- Beyond Presence: Hyper-Realistic Avatars · Munich, Nov 21 · GPT-4, RAG
- LLMs for Branching Storytelling · Singapore, Nov 19 · Anthropic, Gemini
- React Compiler for LLMs · Amsterdam, Nov 12 · React, GPT-4
- Automated Video Editing with LLMs · Amsterdam, Nov 12 · Whisper, FFmpeg
- Evaluating AI Agents in Finance · London, Oct 31 · Generative models, Agentic Pipelines
- Pieces: Long-Term Developer Memory · Cincinnati, Oct 30 · RAG, GPT-4
- GraphRAG: Improving RAG Accuracy · Bogotá, Oct 30 · RAG, GraphRAG
- Agentic LLMs for Data Enrichment · Montreal, Oct 29 · Weaviate, SQL
- Cooktok · Montreal, Oct 29 · GPT-4, JSON
- LLM Evaluation Labeling Workflow · Seattle, Oct 24 · OpenPipe, GPT-4
- Listen Labs · San Francisco, Oct 22 · Listen, PowerPoint
- Hypothesis Sage: Agentic RAG Statistics · Chicago, Oct 22 · RAG, GPT-4
- Lidar Processing for Machine Learning · Fort Wayne, Oct 15 · LiDAR, Point Cloud
- GenAI Firewall: Enterprise Security · Prague, Oct 15 · GPT-4, Regular expressions
- Amazon Bedrock: Serverless AI Orchestration · Dubai, Oct 5 · Amazon Bedrock, AWS Step Functions
- LLM Data Classification · Amsterdam, Sep 25 · Generative AI, GPT-4
- LLM Prompting for Semantic Routing · Toronto, Sep 20 · GPT-4, Prompt Engineering
- YT shorts finder · Austin, Sep 12 · Vision models, Whisper
- Repeated Inference Improves LLM Output · Hamburg, Sep 12 · GPT-4, Inference
- Mirascope — V1 · Los Angeles, Aug 28 · Mirascope, GPT-4
- Skipper · New York City, Aug 28 · GPT-4, Document scraping
- Large Scale GPU Inference · Boston, Aug 26 · ChatGPT, GPT-4
- Cone AI: LLMs for YC Course · Boston, Aug 26 · ChatGPT, Generative AI
- Deepgram Voice AI Platform · Seattle, Aug 15 · Deepgram, Speech-to-Text