Agents on edge
Examining deployment of TinyLlama on a 4 GB Jetson Nano, measuring memory, CPU, and GPU usage while assessing feasibility of LLM agents and their workload performance.
What if we deploy TinyLlama on the 4 GB variant of the Nvidia Jetson Nano and track what system resource usage looks like under agent workloads?
Rationale behind the project:
Edge devices are a compelling target for LLM agents because they run inference locally, which matters where data privacy, limited connectivity, or ecosystem boundaries are significant concerns. The primary question in this project is whether such devices can handle the workload of locally hosted small language models, given their memory and hardware constraints. The analysis uses the Nvidia Jetson Nano developer kit board as a representative edge device. That choice imposes specific limits on memory capacity and software compatibility, because the Jetson Nano is an older model than the Jetson Orin Nano, which Nvidia currently supports.
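To make the memory constraint concrete, here is a rough back-of-the-envelope sketch of how much of the board's 4 GB the model weights alone would occupy. The bit widths are illustrative assumptions, not settings taken from the talk, and the estimate ignores the KV cache, activations, runtime overhead, and the per-block metadata that real quantized formats carry.

```python
# Weights-only memory estimate. Ignores KV cache, activations, and runtime overhead.
PARAMS = 1.1e9          # TinyLlama 1.1B parameter count (from the model name)
DEVICE_RAM_GIB = 4.0    # Jetson Nano 4 GB variant; memory is shared by CPU and GPU


def weights_gib(params: float, bits_per_weight: float) -> float:
    """Return the approximate size of the model weights in GiB."""
    return params * bits_per_weight / 8 / 2**30


# Illustrative precisions only (assumed, not the project's actual configuration).
for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    size = weights_gib(PARAMS, bits)
    share = size / DEVICE_RAM_GIB
    print(f"{label:>5}: {size:.2f} GiB ({share:.0%} of device RAM)")
```

Because the operating system and any agent tooling share the same memory pool, the headroom left after loading the weights is what matters in practice, which is why measured usage is more informative than this estimate.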
Analyzes TinyLlama 1.1B workloads on the Jetson Nano using llama.cpp.
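As a minimal sketch of the kind of memory tracking described here, the snippet below parses Linux `/proc/meminfo` text and reports memory in use. It is an assumption about how such a measurement could be taken, not the project's actual tooling; the function names are hypothetical, and the sample text is made up so the code runs on any machine. CPU and GPU usage would need other sources, such as the `tegrastats` utility on Jetson boards.

```python
from pathlib import Path
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into a mapping of field name to its number."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])  # most fields are in kB
    return fields


def used_mib(fields: Dict[str, int]) -> float:
    """Memory in use, taken as MemTotal minus MemAvailable, in MiB."""
    return (fields["MemTotal"] - fields["MemAvailable"]) / 1024


def sample_used_mib(path: str = "/proc/meminfo") -> float:
    """Take one reading from a live Linux system."""
    return used_mib(parse_meminfo(Path(path).read_text()))


# Made-up sample text for illustration; these values are not measurements.
SAMPLE = """MemTotal:        4051048 kB
MemFree:          512000 kB
MemAvailable:    1951048 kB
"""
print(f"{used_mib(parse_meminfo(SAMPLE)):.0f} MiB in use")
```

Calling `sample_used_mib()` in a loop before, during, and after an inference request would give a simple time series of memory pressure for a workload.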