

Transformer

The Transformer is a neural network architecture that uses a multi-head self-attention mechanism to process sequences in parallel, replacing slower recurrent (RNN) and convolutional (CNN) layers.

The Transformer architecture, introduced in the landmark 2017 paper "Attention Is All You Need" by Vaswani et al. (Google), revolutionized sequence-to-sequence modeling. It relies entirely on multi-head self-attention, eliminating the sequential processing required by Recurrent Neural Networks (RNNs). Because every position in a sequence can attend to every other position simultaneously, training parallelizes massively, drastically reducing training time and enabling models to scale to billions of parameters. It is the foundational technology for virtually all modern Large Language Models (LLMs), including BERT and the Generative Pre-trained Transformer (GPT) series, and it drives state-of-the-art performance across Natural Language Processing (NLP) and computer vision tasks.

https://arxiv.org/abs/1706.03762
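The multi-head self-attention operation described above can be sketched in a few lines of NumPy. This is an illustrative toy, not code from any particular library: the function name, shapes, and the random projection matrices (which stand in for learned weight matrices) are all assumptions for the sketch.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, num_heads, rng):
    """x: (seq_len, d_model). Returns (output, attention_weights)."""
    seq_len, d_model = x.shape
    assert d_model % num_heads == 0
    d_head = d_model // num_heads

    # Random projections stand in for the learned W_Q, W_K, W_V, W_O matrices.
    w_q, w_k, w_v, w_o = (
        rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
        for _ in range(4)
    )
    q, k, v = x @ w_q, x @ w_k, x @ w_v

    # Split into heads: (num_heads, seq_len, d_head).
    def split(t):
        return t.reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    q, k, v = split(q), split(k), split(v)

    # Scaled dot-product scores: every position attends to every
    # other position at once — this is what parallelizes so well.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)   # (heads, seq, seq)
    weights = softmax(scores, axis=-1)
    out = weights @ v                                     # (heads, seq, d_head)

    # Merge heads back together and apply the output projection.
    out = out.transpose(1, 0, 2).reshape(seq_len, d_model)
    return out @ w_o, weights

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 16))          # a toy sequence of 5 tokens
y, attn = multi_head_self_attention(x, num_heads=4, rng=rng)
```

Each row of `attn` is a probability distribution over the sequence, so every row sums to 1, and the output keeps the input's `(seq_len, d_model)` shape.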
64 projects · 40 cities


Recent Talks & Demos



- Constrained Decoding: LLM Pixel Art · Montreal, Nov 20 · Modal, Transformers
- Sentence Transformers: Content Categorization · Nürnberg, Nov 20 · GPT-4, LangChain
- CityPulse: Multi-Modal Video Understanding · New York City, Nov 17 · Ollama, FastAPI
- Finetuning SLMs for Agents · Amsterdam, Nov 11 · Distill Labs, Transformers
- Scalable Production RAG Architecture · Toronto, Nov 10 · FAISS, OpenAI API
- This Is So You! Event Newsletter · Toronto, Nov 10 · FastAPI, PostgreSQL
- Instruct Lab LLM Evaluation Playbook · Toronto, Nov 10 · Merlinite-7B-Lab, Mistral, Mixtral
- CityPulseNYC: Multi-Modal RAG · New York City, Nov 6 · Ollama, LLaMA 3B
- RapidFire AI: Parallel LLM Experimentation · San Diego, Oct 29 · PyTorch, Transformers
- BERT Fine-tuning on MultiNLI · Houston, Oct 14 · Claude Code, Transformers
- Full-Precision LLMs on Small Machines · London, Oct 7 · Transformers, Apple MLX
- FiftyOne Visual Similarity Search · Raleigh, Sep 30 · FiftyOne, CLIP
- Hexagone: Anonymize Data for AI · Paris, Sep 18 · vLLM, Transformers
- Hexagone AI: Multimodal Anonymization · Paris, Sep 18 · React, Next
- AutoRAG: Specialized AI Datasets · Brisbane, Sep 11 · Llama-3-8B-Instruct, FAISS
- fastWorkflow: Deterministic Conversational AI · Houston, Sep 9 · LiteLLM, Transformers
- LLM Safety: Model vs Prompt · Dubai, Aug 23 · GPT-4, GPT-3
- Production LLM Cost Optimization · Orange County, Jul 31 · Transformers, vLLM
- University RAG System Deployment · Nürnberg, Jul 3 · FastAPI, sentence-transformers
- Couchbase Semantic Recipe Search · Dubai, Jun 28 · Transformers, Ollama
- Phi-4 + FastViT-HD VLM · Seattle, Jun 27 · Phi-4-mini, PyTorch
- CLOP: Omics Pretraining · Lausanne, Jun 16 · PyTorch, ONNX