
Transformer

The Transformer is a neural network architecture that uses a multi-head self-attention mechanism to process sequences in parallel, replacing slower recurrent (RNN) and convolutional (CNN) layers.

The Transformer architecture, introduced in the landmark 2017 paper "Attention Is All You Need" by Vaswani et al. (Google), revolutionized sequence-to-sequence modeling. It relies entirely on attention, specifically multi-head self-attention, eliminating the sequential processing required by Recurrent Neural Networks (RNNs). This design allows massive parallelization, drastically reducing training time and enabling models to scale to billions of parameters. It is the foundational technology behind modern Large Language Models (LLMs), including BERT and the Generative Pre-trained Transformer (GPT) series, and drives state-of-the-art performance across Natural Language Processing (NLP) and computer vision tasks.

https://arxiv.org/abs/1706.03762
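
The scaled dot-product attention and multi-head mechanism described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the paper's implementation: the random projection weights, head count, and function names are assumptions for demonstration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  (Vaswani et al., Eq. 1)
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)
    return softmax(scores) @ V

def multi_head_self_attention(X, num_heads, rng):
    # Hypothetical randomly initialized projections, for shape illustration only;
    # in a trained model W_q, W_k, W_v, W_o are learned parameters.
    seq_len, d_model = X.shape
    d_head = d_model // num_heads
    heads = []
    for _ in range(num_heads):
        Wq, Wk, Wv = (rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
                      for _ in range(3))
        heads.append(scaled_dot_product_attention(X @ Wq, X @ Wk, X @ Wv))
    # Concatenate heads and apply the output projection
    Wo = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
    return np.concatenate(heads, axis=-1) @ Wo

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))          # 5 tokens, d_model = 8
out = multi_head_self_attention(X, num_heads=2, rng=rng)
print(out.shape)                          # (5, 8): same shape as the input
```

Because every token attends to every other token in a single matrix product, the whole sequence is processed in parallel, which is precisely what lets Transformers dispense with the step-by-step recurrence of RNNs.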
64 projects · 40 cities


Recent Talks & Demos



University RAG System Deployment
Nürnberg Jul 3
FastAPI sentence-transformers
Couchbase Semantic Recipe Search
Dubai Jun 28
Transformers Ollama
Phi-4 + FastViT-HD VLM
Seattle Jun 27
Phi-4-mini PyTorch
CLOP: Omics Pretraining
Lausanne Jun 16
PyTorch ONNX
Transformers: Latent Space Attractors
Milan Jun 10
PyTorch Transformers
DSPy: Self-Programming Meta-Agents
New York City Jun 3
DSPy vLLM
ML for Government Transparency
New York City Jun 3
Llama-4 Gemini
Entropix: Compute for Molecular Structure
Milan May 8
decoder-only Transformer
VIT Attention: Finding Image Features
Mumbai Apr 26
GPT-4 LangChain
Transformers for Seismic Damage
Quito Apr 24
PyTorch Python
AI's Future and Ethics
Dublin Apr 24
GPT-4 OpenAI API
Single Embedding LLM Control
Dublin Apr 24
GPT-4 LangChain
Music Box
Lausanne Apr 1
GPT-4 LangChain
Unstructured data visualization
Atlanta Feb 27
Transformers datasets
nanoDiffusion
Zürich Feb 6
NanoGPT nanoDiffusion
Ollama Groq Local Inference
Manizales Jan 22
Llama-2 Mistral
Multi-task Audio Transformer Model
Bengaluru Nov 12
Transformer Text-to-Speech
toby
New York City Aug 28
Speech Translation Google Cloud Translation API
Attention: 4 Lines Explained
Berlin Aug 22
Transformer Attention
Diffusion Transformer Avatar Navigation
San Francisco Jul 11
Diffusion model Transformer
Go Transformer: llm.c Implementation
Toronto Jun 27
LLM Go
Chimnie: Training Small Address Models
Amsterdam May 23
Transformer Python
Diffusion Style Transfer on Single GPU
Los Angeles May 21
Stable Diffusion Style Transfer
Spreadsheet GPT-2 Forward Pass
Seattle Oct 25
GPT-2 Excel