bitsandbytes
A PyTorch library for k-bit quantization and 8-bit optimizers that dramatically reduces memory consumption for large language models (LLMs).
bitsandbytes is a go-to toolkit for GPU memory efficiency in PyTorch. The library provides 8-bit optimizers (e.g., Adam8bit), which cut optimizer state memory by up to 75%, and k-bit quantization (8-bit and 4-bit) backed by custom CUDA kernels. This lets you run or finetune massive LLMs, like Llama-13b, on consumer-grade hardware (e.g., a 16 GB NVIDIA T4 GPU). Key techniques include LLM.int8() for high-performance 8-bit inference and 4-bit quantization for QLoRA-style memory-efficient finetuning. Use it to scale your model size without upgrading your hardware.
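The core idea behind 8-bit quantization can be sketched in a few lines: each block of values is scaled by its absolute maximum so the largest magnitude maps to 127, then rounded to int8, quartering storage relative to float32. The sketch below is illustrative only (the function names and `block_size` are assumptions); bitsandbytes implements this blockwise absmax scheme in fused CUDA kernels and, for LLM.int8(), adds outlier handling on top.

```python
def quantize_absmax_int8(values, block_size=4):
    """Blockwise absmax int8 quantization (illustrative sketch,
    not the bitsandbytes implementation)."""
    quants, scales = [], []
    for i in range(0, len(values), block_size):
        block = values[i:i + block_size]
        scale = max(abs(v) for v in block)  # one scale per block
        scales.append(scale)
        # Map the block into [-127, 127] and round to integers.
        quants.append([round(v / scale * 127) for v in block])
    return quants, scales

def dequantize_absmax_int8(quants, scales):
    # Invert the scaling to recover approximate float values.
    return [q / 127 * s for blk, s in zip(quants, scales) for q in blk]

weights = [0.31, -1.84, 0.92, -0.07, 2.5, -0.44, 1.1, -3.2]
q, s = quantize_absmax_int8(weights, block_size=4)
restored = dequantize_absmax_int8(q, s)
# Reconstruction error stays within half a quantization step per block.
err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, round(err, 4))
```

Blockwise scaling is what keeps the error small: a single global scale would let one large outlier crush the resolution of every other weight, which is also why LLM.int8() treats outlier columns separately.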