RoBERTa

RoBERTa (Robustly Optimized BERT Pretraining Approach) is a language model from Facebook AI that outperforms BERT by changing the pretraining recipe rather than the core architecture.

RoBERTa was developed by researchers at Facebook AI in 2019. The team ran a replication study of BERT pretraining and found that BERT was significantly undertrained: with a refined recipe, the same architecture could match or exceed the models published after it. The recipe removes the Next Sentence Prediction (NSP) objective, replaces static masking with dynamic masking (a new masking pattern is sampled each time a sequence is fed to the model), and scales up training. Specifically, the longest RoBERTa run trained for 500K steps (extended from an initial 100K-step run) on 160GB of text (roughly ten times BERT’s 16GB) with batches of 8K sequences, compared with 256 for BERT. This approach yielded state-of-the-art results at the time on GLUE, RACE, and SQuAD, and made RoBERTa a standard baseline for subsequent language model development.
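Dynamic masking can be illustrated with a minimal sketch. It assumes the BERT-style scheme that RoBERTa keeps: about 15% of tokens are selected for prediction, and of those 80% become the mask token, 10% a random token, and 10% stay unchanged. The names `dynamic_mask`, `MASK_ID`, `VOCAB_SIZE`, and `IGNORE` are hypothetical and chosen for illustration, and selecting each token independently is a simplification rather than the paper's actual implementation.

```python
import random

MASK_ID = 4        # hypothetical id of the mask token
VOCAB_SIZE = 1000  # hypothetical vocabulary size
IGNORE = -100      # label for positions the model is not asked to predict


def dynamic_mask(token_ids, rng, mask_prob=0.15):
    """Return (inputs, labels) with a freshly sampled masking pattern.

    Calling this every time a sequence is served gives each pass over the
    data a different pattern, instead of one pattern fixed at preprocessing.
    """
    inputs = list(token_ids)
    labels = [IGNORE] * len(token_ids)
    for i, tok in enumerate(token_ids):
        if rng.random() >= mask_prob:
            continue  # token not selected for prediction
        labels[i] = tok  # the model must recover the original token
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = MASK_ID  # 80%: replace with the mask token
        elif roll < 0.9:
            inputs[i] = rng.randrange(VOCAB_SIZE)  # 10%: random token
        # remaining 10%: leave the input token unchanged
    return inputs, labels


rng = random.Random(0)
sentence = list(range(10, 60))  # stand-in token ids for one sequence

# Two passes over the same sequence each draw their own masking pattern.
epoch_1 = dynamic_mask(sentence, rng)
epoch_2 = dynamic_mask(sentence, rng)
```

With static masking, by contrast, `dynamic_mask` would be called once during preprocessing and its output reused for every epoch.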

https://arxiv.org/abs/1907.11692

118 projects · 40 cities

Related technologies

BERT (186), GPT-3 (390), GPT-4 (678), BLOOM (116), Llama-2 (337), PaLM 2 (117), RAG (253), scikit-learn (86), TensorFlow (97), Keras (76), ONNX (87), PyTorch (264), Python (739), Prompt Engineering (42), Generative AI (52), Fine-tuning (40), AI agents (44), ChatGPT (96)

Recent Talks & Demos

New York City Aug 28

GPT-4 Document scraping

Large Scale GPU Inference

Cone AI: LLMs for YC Course

ChatGPT Generative AI

Deepgram Voice AI Platform

Deepgram Speech-to-Text

Fixie: Real-Time Multi-Modal AI

Fixie TensorFlow

Milton: Economic Data Agent

Singapore Aug 2

Comment Analysis

LangChain GPT-4

Dream Daimon: AI Subconscious

New York City Jul 24

Dream Daimon TensorFlow

CodeDD: AI Code Due Diligence

Anthropic LLM Neo4j

Extreme Multi-Label ICL

GPU-Powered LLMs and GNNs

Ollama: Testing LLM Non-Determinism

Testcontainers Ollama

Generative AI API Testing

San Francisco Jul 11

Postman Generative AI

LearnQuantum: AI Physics Engine

San Francisco Jul 11

LearnQuantum Chinchilla

Momentum: Git Push Code Auditor

San Francisco Jul 11

LLMs: Tabular Data to SQL Insights

OpenSesame Demo

AI Legal Contract Analysis

Kuala Lumpur Jun 27

Cortex: Hardware LLM Runtime

Los Angeles Jun 25

Scripter Studio: AI Script Analysis

Los Angeles Jun 25

LLMs Generate Knowledge Graphs

Knowledge Graphs Structured Networks

GPT-4 Chain-of-Thought

LLM Knowledge Graph Generation

New York City Jun 4

GPT-4 Knowledge Graphs

Reliable AI Agents

New York City May 22