Technology
Cactus Compute Runtime (on‑device inference engine for FunctionGemma)
High-performance on-device inference engine built to execute FunctionGemma models with sub-150ms latency for local tool calling.
Cactus Compute Runtime serves as the dedicated execution layer for FunctionGemma: a model family optimized for precise API orchestration. It leverages hardware acceleration (specifically Vulkan and Metal) to run 2B and 7B parameter models directly on mobile chipsets. By utilizing 4-bit quantization and tight memory management, the runtime enables complex intent recognition and local device actions (like smart home control or calendar scheduling) without cloud round-trips.
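The local tool-calling flow described above can be sketched as follows. This is a minimal illustration, not the actual Cactus or FunctionGemma SDK: the tool names, the JSON tool-call shape, and the `dispatch` helper are all assumptions chosen for the example. The point is the shape of the loop: the on-device model emits a structured tool call, and the app executes the matching local action with no cloud round-trip.

```python
import json

# Hypothetical local actions (smart home control, calendar scheduling).
# These names and signatures are illustrative, not part of any real SDK.
TOOLS = {
    "set_thermostat": lambda temp_c: f"thermostat set to {temp_c}C",
    "create_event": lambda title, when: f"event '{title}' at {when}",
}

def dispatch(tool_call_json: str) -> str:
    """Parse a JSON tool call emitted by the model and run the matching
    local action entirely on-device."""
    call = json.loads(tool_call_json)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# An example model output in a FunctionGemma-style JSON format (assumed):
model_output = '{"name": "set_thermostat", "arguments": {"temp_c": 21}}'
print(dispatch(model_output))  # -> thermostat set to 21C
```

In a real integration, `model_output` would come from the runtime's inference call rather than a hard-coded string; everything after that point stays local.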
Recent Talks & Demos
No public projects found for this technology yet.