LLM Fingerprinting: Model Classification
This talk demonstrates a system to identify and classify large language models by analyzing their responses to benchmark prompts, using live API classification and code walkthroughs.
The presentation walks through a system that identifies and classifies large language models (LLMs) based on how they respond to prompts from diverse disciplines. The project evaluates performance on specific benchmarks (math, logic, self-identification, etc.) and scores the LLMs at various temperatures, then uses that data to build a classifier.
The implementation combines benchmarking and classification to distinguish LLMs from different families, such as GPT, LLaMA, Claude, and Gemini.
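As a rough illustration of the idea (not the presenters' actual code), benchmark scores can be treated as a fingerprint vector and matched against per-family averages. The sketch below assumes a nearest-centroid classifier; the names `TRAINING`, `centroids`, and `classify` are hypothetical, and the scores are made-up placeholder values, not measured results.

```python
from math import dist
from statistics import mean

# Placeholder data (invented for illustration): each fingerprint is a vector of
# benchmark scores, e.g. [math, logic, self-identification], from one run.
TRAINING = {
    "GPT": [[0.90, 0.80, 1.00], [0.85, 0.75, 0.95]],
    "LLaMA": [[0.60, 0.55, 0.70], [0.55, 0.50, 0.60]],
}

def centroids(training):
    """Average each family's fingerprints into a single centroid vector."""
    return {
        family: [mean(column) for column in zip(*vectors)]
        for family, vectors in training.items()
    }

def classify(fingerprint, family_centroids):
    """Return the family whose centroid is nearest in Euclidean distance."""
    return min(
        family_centroids,
        key=lambda family: dist(fingerprint, family_centroids[family]),
    )

family_centroids = centroids(TRAINING)
predicted = classify([0.88, 0.79, 0.97], family_centroids)
print(predicted)
```

A real system would likely use many more benchmarks, several temperatures per model, and a trained classifier rather than a distance rule; the sketch only shows how scores can become features.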
The demo will include:
- A live classification run to determine which LLM is accessed through an API key.
- A code walkthrough of the frontend, evaluation process, and classification model.
The demo compares first-token responses from LLMs served through Deepinfra, analyzing how those responses are distributed across models and temperatures.
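One way such a first-token comparison could be sketched, assuming the responses have already been collected as strings: count how often each first token appears per model, then measure how far apart two distributions are. The helper names here are hypothetical, the first whitespace-delimited word stands in for a real tokenizer's first token, and total variation distance is just one possible metric; the talk does not state which one it uses.

```python
from collections import Counter

def first_token_distribution(responses):
    """Map each first whitespace-delimited word to its relative frequency."""
    tokens = [r.split()[0] for r in responses if r.strip()]
    counts = Counter(tokens)
    total = sum(counts.values())
    return {token: n / total for token, n in counts.items()}

def total_variation(p, q):
    """Total variation distance: 0.0 for identical distributions, 1.0 for disjoint ones."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in support)

# Invented sample responses, used only to exercise the functions.
model_a = first_token_distribution(["Yes it is", "Yes indeed", "No", "Yes ok"])
model_b = first_token_distribution(["No way", "No", "Yes", "No"])
distance = total_variation(model_a, model_b)
print(distance)
```

Repeating the comparison at several temperatures would show how quickly each model's first-token distribution spreads out, which is the kind of signal a fingerprinting classifier could use.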
Related projects
LLM.f90 - Minimal Large Language Model Inference Framework
Toronto
A low‑dependency Fortran framework for LLM inference, showing zero‑dependency implementation, matrix operations, and support for Llama, Phi, and…
Unlocking Insights from Tabular Data with LLMs
Toronto
Learn how to convert natural‑language questions into SQL, retrieve data, and get concise summaries, enabling product managers to…
Fine Tune and Evaluate 10 LLMs in 10 minutes
Toronto
Demonstrate rapid synthetic data creation, fine‑tuning ten open LLMs, evaluating with LLM‑as‑Judge and G‑Eval, and discuss dataset collaboration,…
All the Trainingz, No Codez
Toronto
This talk demonstrates how to train, finetune, and preference tune large language models on a home computer using…
Browsing the web with AI
Toronto
Explore building reliable web‑scraping agents using a vision-language model, Claude reasoning, Selenium automation, and prompt engineering, demonstrated with…
Educational AI
Toronto
Learn how we built a code‑annotation model and a documentation generator, the challenges faced, and insights for improving…