April 24, 2025 · San Francisco

TensorRT-LLM: High-Throughput Embeddings

This talk demonstrates an optimized TensorRT-LLM embedding runtime that achieves up to twice the performance of alternatives, and shares code, benchmarks, and architecture insights.
