NanoGPT

NanoGPT is Andrej Karpathy's minimalist, PyTorch-based implementation of the GPT-2 transformer: a streamlined codebase of roughly 300 lines each for the model and training loop, designed for rapid training and educational clarity.

NanoGPT, developed by Andrej Karpathy, is a minimalist PyTorch implementation of the GPT-2 transformer architecture. As its README puts it, the project "prioritizes teeth over education": the codebase is streamlined and efficient, with `model.py` and `train.py` each roughly 300 lines of Python. This simplicity lets users quickly train or fine-tune medium-sized GPTs; for example, `train.py` can reproduce the 124M-parameter GPT-2 model on OpenWebText. The project is a popular resource for researchers and practitioners who want clarity, speed, and a highly hackable foundation for language-model experimentation.
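The core of any GPT-style model like the one in `model.py` is masked (causal) self-attention, where each token can attend only to itself and earlier positions. The sketch below illustrates that mechanism for a single attention head in plain NumPy; it is an illustrative simplification, not nanoGPT's actual code, and the function name and weight matrices are hypothetical stand-ins for the learned projections.

```python
import numpy as np

def causal_self_attention(x, Wq, Wk, Wv):
    """Single-head causal self-attention (illustrative sketch).

    x          : (T, C) sequence of T token embeddings of width C
    Wq, Wk, Wv : (C, C) hypothetical learned projection matrices
    """
    T, C = x.shape
    q, k, v = x @ Wq, x @ Wk, x @ Wv          # project to queries/keys/values
    att = (q @ k.T) / np.sqrt(C)              # scaled dot-product scores
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)
    att[mask] = -np.inf                       # block attention to future tokens
    att = np.exp(att - att.max(axis=-1, keepdims=True))
    att /= att.sum(axis=-1, keepdims=True)    # softmax over allowed positions
    return att @ v                            # weighted sum of value vectors

# Tiny usage example
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = causal_self_attention(x, Wq, Wk, Wv)   # shape (4, 8)
```

Because of the causal mask, the output at position 0 depends only on token 0; a real transformer block would add multiple heads, residual connections, layer norm, and an MLP around this core.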

https://github.com/karpathy/nanoGPT
4 projects · 4 cities
