Bhumi: Outpacing Native AI Inference
Learn how Bhumi's Rust-powered client delivers faster AI inference than native libraries and HTTP calls, supporting OpenAI, Anthropic, and Gemini with parallel processing.
Bhumi is a high-performance AI inference client designed to be faster than any other library, including native implementations and direct HTTP calls. Built in Rust with Python bindings, it optimizes request handling, reduces latency, and significantly improves throughput. Supporting OpenAI, Anthropic, and Gemini, Bhumi provides seamless multi-model switching while being 2-3x faster than LiteLLM and other alternatives.
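The abstract emphasizes parallel request handling and seamless switching between OpenAI, Anthropic, and Gemini. A minimal sketch of what that might look like from Python follows; note that the bhumi.base_client module path, the BaseLLMClient and LLMConfig names, the completion method, and the response shape are illustrative assumptions based on the talk description, not the confirmed API. Consult the project's README for the actual interface.

import asyncio
import os

# Hypothetical import path; Bhumi's real module layout may differ.
from bhumi.base_client import BaseLLMClient, LLMConfig

async def main():
    # One config per provider, mirroring the multi-model switching
    # described in the abstract. Model identifiers are examples only.
    configs = [
        LLMConfig(api_key=os.environ["OPENAI_API_KEY"], model="openai/gpt-4o-mini"),
        LLMConfig(api_key=os.environ["ANTHROPIC_API_KEY"], model="anthropic/claude-3-5-sonnet"),
        LLMConfig(api_key=os.environ["GEMINI_API_KEY"], model="gemini/gemini-1.5-flash"),
    ]
    prompt = [{"role": "user", "content": "Summarize the benefits of Rust for AI clients."}]

    # Fan the same prompt out to all three providers concurrently;
    # asyncio.gather stands in for the parallel processing the abstract claims.
    clients = [BaseLLMClient(cfg) for cfg in configs]
    responses = await asyncio.gather(*(c.completion(prompt) for c in clients))
    for cfg, resp in zip(configs, responses):
        print(cfg.model, "->", resp["text"][:80])  # assumed response shape

asyncio.run(main())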
Related projects
Introduction: Groq - the world's fastest AI inference
Singapore
Learn how Groq's LPU hardware and software platform provides high‑speed, energy‑efficient AI inference, offers free cloud compute for…
Run local, open-source AI
Singapore
Learn how to run open-source models like Llama3, Mistral, and Gemma locally using Jan.ai and Cortex.so, with practical…
Artecon - A hotspot for AI
Seattle
Learn how to run CPU‑based ML models with low latency, using small public models and post‑processing, then bundle…
Big Models, Small Machines: Run Full-Precision LLMs on Low Memory
London
Learn how to run full‑precision LLMs on low‑memory devices using a custom inference strategy, demonstrated with a 1.7B…
How we built one of the most accurate computer use agents, and how we are scaling it
Singapore
The talk covers Iris, a computer use agent capable of browsing, reading files, and connecting to MCP servers,…
Ho Jiak Bo: a food recommender app based on top food blogs
Singapore
Build an AI food recommender by crawling Singapore blogs, extracting JSON via LLMs, adding map data, and blending…