NanoGPT
NanoGPT is Andrej Karpathy's minimalist, PyTorch-based implementation of the GPT-2 transformer: two ~300-line files designed for rapid training and educational clarity.
NanoGPT, developed by Andrej Karpathy, is a minimalist PyTorch implementation of the GPT-2 transformer architecture. In Karpathy's words, it prioritizes "teeth over education," delivering a streamlined, efficient codebase: `model.py` and `train.py` are each approximately 300 lines of Python. This simplicity lets users quickly train or fine-tune medium-sized GPTs; for example, it can reproduce the 124M-parameter GPT-2 model on OpenWebText. The project is a core resource for researchers and practitioners seeking clarity, speed, and a highly hackable foundation for large language model experimentation.
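To see where the "124M parameters" figure comes from, the count can be sketched directly from GPT-2 small's published dimensions (12 layers, 768-wide embeddings, 50257-token vocabulary, 1024-token context). This is an illustrative back-of-envelope calculation, not code from the repository; it assumes GPT-2's standard layout of learned position embeddings, biased linear layers, a 4x MLP expansion, and an output head tied to the token embedding so it adds no extra weights:

```python
# Hypothetical parameter-count sketch for the GPT-2 "small" configuration
# that nanoGPT reproduces on OpenWebText. Assumes the standard GPT-2
# layout: learned position embeddings, biased linears, 4x MLP width,
# and lm_head weight-tied to the token embedding.

def gpt2_param_count(n_layer=12, n_embd=768, vocab_size=50257, block_size=1024):
    wte = vocab_size * n_embd                    # token embedding (shared with lm_head)
    wpe = block_size * n_embd                    # learned position embedding
    per_block = (
        2 * n_embd                               # ln_1: weight + bias
        + n_embd * 3 * n_embd + 3 * n_embd       # attention q,k,v projection
        + n_embd * n_embd + n_embd               # attention output projection
        + 2 * n_embd                             # ln_2: weight + bias
        + n_embd * 4 * n_embd + 4 * n_embd       # MLP up-projection (4x width)
        + 4 * n_embd * n_embd + n_embd           # MLP down-projection
    )
    final_ln = 2 * n_embd                        # final layer norm
    return wte + wpe + n_layer * per_block + final_ln

print(gpt2_param_count())  # 124439808, i.e. the "124M" GPT-2
```

About 85M of those parameters sit in the 12 transformer blocks and ~39M in the embeddings, which is why weight-tying the output head to the token embedding matters at this scale.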