DeepSpeed
DeepSpeed is Microsoft's open-source deep learning optimization library: it enables training and inference for models with trillions of parameters, maximizing scale and speed on existing hardware.
Built by Microsoft Research, DeepSpeed integrates with PyTorch to democratize large-scale AI, supporting models with up to a trillion parameters through its core technology, the Zero Redundancy Optimizer (ZeRO). Combined with 3D parallelism and ZeRO-Offload, ZeRO delivers substantial efficiency gains; models such as Megatron-Turing NLG 530B were trained with this stack. DeepSpeed reports up to a 5x system performance improvement and 1.8x higher throughput compared to state-of-the-art alternatives, making distributed training efficient, effective, and accessible.
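As a sketch of how these pieces fit together in practice, the fragment below shows a minimal DeepSpeed configuration enabling ZeRO stage 3 with optimizer-state offload to CPU (the ZeRO-Offload technique). The specific batch size and precision settings are illustrative assumptions, not values from this page.

```python
# Minimal DeepSpeed config: ZeRO stage 3 partitions optimizer state,
# gradients, and parameters across data-parallel workers; the
# offload_optimizer entry moves optimizer state to CPU memory
# (ZeRO-Offload). Batch size and fp16 settings are illustrative.
ds_config = {
    "train_batch_size": 32,
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,  # partition optimizer state, gradients, and parameters
        "offload_optimizer": {"device": "cpu"},  # ZeRO-Offload to CPU RAM
    },
}

# With a PyTorch model in hand, training would typically begin with:
#   import deepspeed
#   engine, optimizer, _, _ = deepspeed.initialize(model=model, config=ds_config)
```

Raising the ZeRO stage trades communication for memory: stage 1 partitions only optimizer state, stage 2 adds gradients, and stage 3 adds the parameters themselves, which is what allows model size to scale with the number of workers.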