
Transformer

The Transformer is a neural network architecture that uses a multi-head self-attention mechanism to process sequences in parallel, replacing slower recurrent (RNN) and convolutional (CNN) layers.

The Transformer architecture, introduced in the landmark 2017 paper "Attention Is All You Need" by Vaswani et al. (Google), revolutionized sequence-to-sequence modeling. It relies entirely on attention, specifically multi-head self-attention, eliminating the sequential processing required by Recurrent Neural Networks (RNNs). This design allows massive parallelization, drastically reducing training time and enabling models to scale to billions of parameters. It is the foundational technology behind modern Large Language Models (LLMs), including BERT and the Generative Pre-trained Transformer (GPT) series, and drives state-of-the-art performance across Natural Language Processing (NLP) and computer vision tasks.

https://arxiv.org/abs/1706.03762
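
The scaled dot-product attention and multi-head mechanism described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the paper's implementation: the random projection weights, head count, and function names are assumptions for demonstration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  (Vaswani et al., Eq. 1)
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)
    return softmax(scores) @ V

def multi_head_self_attention(X, num_heads, rng):
    # Hypothetical randomly initialized projections, for shape illustration only;
    # in a trained model W_q, W_k, W_v, W_o are learned parameters.
    seq_len, d_model = X.shape
    d_head = d_model // num_heads
    heads = []
    for _ in range(num_heads):
        Wq, Wk, Wv = (rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
                      for _ in range(3))
        heads.append(scaled_dot_product_attention(X @ Wq, X @ Wk, X @ Wv))
    # Concatenate heads and apply the output projection
    Wo = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
    return np.concatenate(heads, axis=-1) @ Wo

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))          # 5 tokens, d_model = 8
out = multi_head_self_attention(X, num_heads=2, rng=rng)
print(out.shape)                          # (5, 8): same shape as the input
```

Because every token attends to every other token in a single matrix product, the whole sequence is processed in parallel, which is precisely what lets Transformers dispense with the step-by-step recurrence of RNNs.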
64 projects · 40 cities


Recent Talks & Demos



University RAG System Deployment
Nürnberg Jul 3
FastAPI sentence-transformers
Couchbase Semantic Recipe Search
Dubai Jun 28
Transformers Ollama
Phi-4 + FastViT-HD VLM
Seattle Jun 27
Phi-4-mini PyTorch
CLOP: Omics Pretraining
Lausanne Jun 16
PyTorch ONNX
Transformers: Latent Space Attractors
Milan Jun 10
PyTorch Transformers
DSPy: Self-Programming Meta-Agents
New York City Jun 3
DSPy vLLM
ML for Government Transparency
New York City Jun 3
Llama-4 Gemini
Entropix: Compute for Molecular Structure
Milan May 8
decoder-only Transformer
VIT Attention: Finding Image Features
Mumbai Apr 26
GPT-4 LangChain
Transformers for Seismic Damage
Quito Apr 24
PyTorch Python
AI's Future and Ethics
Dublin Apr 24
GPT-4 OpenAI API
Single Embedding LLM Control
Dublin Apr 24
GPT-4 LangChain
Music Box
Lausanne Apr 1
GPT-4 LangChain
Unstructured data visualization
Atlanta Feb 27
Transformers datasets
nanoDiffusion
Zürich Feb 6
NanoGPT nanoDiffusion
Ollama Groq Local Inference
Manizales Jan 22
Llama-2 Mistral
Multi-task Audio Transformer Model
Bengaluru Nov 12
Transformer Text-to-Speech
toby
New York City Aug 28
Speech Translation Google Cloud Translation API
Attention: 4 Lines Explained
Berlin Aug 22
Transformer Attention
Diffusion Transformer Avatar Navigation
San Francisco Jul 11
Diffusion model Transformer
Go Transformer: llm.c Implementation
Toronto Jun 27
LLM Go
Chimnie: Training Small Address Models
Amsterdam May 23
Transformer Python
Diffusion Style Transfer on Single GPU
Los Angeles May 21
Stable Diffusion Style Transfer
Spreadsheet GPT-2 Forward Pass
Seattle Oct 25
GPT-2 Excel