Quantizing SDXL for Inference
This talk explains quantization principles and demonstrates SVD quantization on Stable Diffusion XL, showing how to reduce GPU VRAM usage effectively for inference.
This presentation offers a concise and accessible introduction to the principles of quantization, a technique for reducing the memory and compute cost of model inference. It walks through a basic, straightforward implementation of Singular Value Decomposition (SVD) quantization applied to the Stable Diffusion XL (SDXL) model, demonstrating a practical way to cut GPU VRAM usage from 6.5 GB to 3.5 GB with minimal code. Designed for professionals and enthusiasts alike, the talk highlights the potential for resource optimization in machine learning workflows.
An accompanying Jupyter notebook demonstrates the quantization workflow in Python.
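The core idea behind SVD-based compression can be sketched in a few lines. The snippet below is a minimal illustration, not the talk's actual implementation: it factors a linear layer's weight matrix into two low-rank factors via NumPy's SVD, which is the building block that SVD-style quantization schemes combine with low-bit quantization. The matrix shape, rank, and function name are illustrative choices.

```python
# Minimal sketch of low-rank SVD weight compression (illustrative,
# not the talk's actual code). A weight matrix W (m x n) is replaced
# by two factors A (m x r) and B (r x n) with r << min(m, n).
import numpy as np

def svd_compress(W: np.ndarray, rank: int):
    """Return low-rank factors A, B such that A @ B approximates W."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # absorb singular values into A's columns
    B = Vt[:rank, :]
    return A, B

rng = np.random.default_rng(0)
# Toy stand-in for an SDXL linear layer's weights.
W = rng.standard_normal((512, 512)).astype(np.float32)

A, B = svd_compress(W, rank=64)
W_hat = A @ B

# Storage drops from m*n values to r*(m+n); the reconstruction error
# depends on how fast the layer's singular values decay.
ratio = W.nbytes / (A.nbytes + B.nbytes)
rel_err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
print(f"{ratio:.1f}x smaller, relative error {rel_err:.3f}")
```

On real model weights, whose singular values typically decay much faster than this random toy matrix's, the same factorization retains far more of the signal; schemes like the one presented pair such a low-rank branch with aggressive quantization of the residual.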
Related projects
AI in compliance
Pune
Learn how AI agents automatically analyze documents, identify compliance gaps, and provide real‑time monitoring for ISO‑27001/SOC, using LLMs,…
Building AI Workflows
Delhi
This talk covers building complex AI workflows using Julep, an open-source platform that simplifies creating agentic AI applications…
Quantization for Edge AI
Nairobi
A live walkthrough of building an Android AI app for community health workers, covering model quantization, edge deployment,…
AI Agents for Enterprises: Automating Workflows at Scale
Delhi
This talk demonstrates AI Agents automating complex enterprise workflows, showcasing real-world task automation and management within a dedicated…
Mobile Use Agent - Operator for apps
Delhi
This talk demonstrates an AI agent that uses computer vision and touch emulation to operate mobile apps, automating…
Big Models, Small Machines: Run Full-Precision LLMs on Low Memory
London
Learn how to run full‑precision LLMs on low‑memory devices using a custom inference strategy, demonstrated with a 1.7B…