
Technology

Transformer

The Transformer is a neural network architecture that uses a multi-head self-attention mechanism to process all positions of a sequence in parallel, replacing the recurrent (RNN) and convolutional (CNN) layers used in earlier sequence models.

The Transformer architecture, introduced in the landmark 2017 paper "Attention Is All You Need" by Vaswani et al. (Google), revolutionized sequence-to-sequence modeling. It relies entirely on multi-head self-attention, eliminating the sequential processing required by Recurrent Neural Networks (RNNs). Because every position in a sequence can be attended to at once, training parallelizes massively, drastically reducing training time and enabling models to scale to billions of parameters. The Transformer is the foundational architecture for modern Large Language Models (LLMs), including BERT and the Generative Pre-trained Transformer (GPT) series, and drives state-of-the-art performance across Natural Language Processing (NLP) and computer vision tasks.
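For readers who want to see the mechanism concretely, here is a minimal PyTorch sketch of scaled dot-product attention and a multi-head self-attention layer. The class and helper names are illustrative, not from the paper; d_model=512 and n_heads=8 mirror the paper's base configuration.

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, computed for every
    # query position in a single matrix multiply; this is the source of the
    # parallelism that RNNs lack.
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5  # (batch, heads, seq, seq)
    return F.softmax(scores, dim=-1) @ v           # weighted sum of values

class MultiHeadSelfAttention(torch.nn.Module):
    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.qkv = torch.nn.Linear(d_model, 3 * d_model)  # fused Q, K, V projections
        self.out = torch.nn.Linear(d_model, d_model)

    def forward(self, x):  # x: (batch, seq, d_model)
        b, s, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Split d_model into n_heads independent heads, attend in each head
        # in parallel, then concatenate the heads and re-project.
        split = lambda t: t.view(b, s, self.n_heads, self.d_head).transpose(1, 2)
        out = scaled_dot_product_attention(split(q), split(k), split(v))
        return self.out(out.transpose(1, 2).reshape(b, s, -1))

x = torch.randn(2, 10, 512)               # 2 sequences of 10 tokens each
print(MultiHeadSelfAttention()(x).shape)  # torch.Size([2, 10, 512])
```

Note that nothing in the forward pass iterates over sequence positions, which is what allows the parallel training described above.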

https://arxiv.org/abs/1706.03762
64 projects · 40 cities

Recent Talks & Demos

Showing 1-24 of 64

Project · Location, Date · Technologies

Resource Optimization for LLMs · Bogotá · Transformers PEFT
WhichBox: Multi-modal Vision Search · Raleigh, May 6 · OpenAI Azure Foundry
SAM: Portable ONNX/C++ Implementation · Lausanne, Apr 30 · SAM2 ONNX Runtime
Eric Chat: Local Mac AI · Ottawa, Apr 25 · Eric Transformer MLX-LM
Nanochat: Train LLMs from Scratch · Brussels, Apr 1 · Python Torch
LogAnalyzer: LLM Log Anomaly Detection · Manizales, Mar 25 · FastAPI Vue 3
UofT: AI Job Discovery Engine · Toronto, Mar 25 · FastAPI PostgreSQL
Niuwn AI: Personal AI Twins · Bremen, Mar 25 · Python FastAPI
Words to World: AI Models · San Diego, Feb 26 · Unreal Engine 5 PyTorch
Reality Check: Personal Fact-Checking · Tokyo, Feb 19 · Claude Code OpenAI Codex
Hugging Face RAG: Reduce Hallucinations · Tiruchirappalli, Jan 31 · Transformers RAG
JobsYo: Multi-Model Job Search AI · Toronto, Jan 29 · GPT-5 Gemini 3
Transformers Detect Netflow Anomalies · Toronto, Jan 29 · Python Transformers
Diagnosable ColBERT: Debugging Vector Search · Brussels, Dec 17 · GPT-4 Claude-3
SLM Fine-tuning on 16GB CPU · Waterloo, Dec 15 · LangChain Transformers
AI-First Clinical Trials EDC · Chicago, Dec 9 · React Spring Boot
fastworkflow: SOTA with Small Models · Houston, Dec 9 · GPT-4 Claude-3
Attention Context Memory Unlocking · Portland, Dec 3 · Transformer SSM
Number Theory: AI, Crypto, Optimization · Boston, Dec 2 · Python Apache Kafka
Arbiter: Zero-Instrumentation LLM Costs · San Francisco, Nov 20 · OpenAI SDK Gemini
Constrained Decoding: LLM Pixel Art · Montreal, Nov 20 · Modal Transformers
Sentence Transformers: Content Categorization · Nürnberg, Nov 20 · GPT-4 LangChain
CityPulse: Multi-Modal Video Understanding · New York City, Nov 17 · Ollama FastAPI
Finetuning SLMs for Agents · Amsterdam, Nov 11 · Distill Labs Transformers