Jailbreak RUN: Gamifying LLM Security
Participants collaboratively attempt to jailbreak an LLM‑powered bot using prompt engineering and RAG manipulation, revealing security flaws and learning LLM logic in real time.
Jailbreak RUN is both a community event and an LLM-powered Discord bot with a simple premise: challenge an entire server of users to work together to break the bot and either extract a password or force it to perform a specific action. The bot hybridizes prompt engineering with application logic tailored to each event, and the attempts to break it double as storytelling: players uncover information about the game directly from the bot.
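The talk does not publish the bot's internals, so as a hedged illustration of the "guardrails plus application logic" idea, here is a minimal sketch of an output-side filter: the secret, the function name, and the refusal message are all invented for this example. The point is that the password lives in application logic, and every LLM reply passes through a check before it reaches the Discord channel.

```python
import re

# Hypothetical secret the players try to extract; the real bot's
# password and filtering logic are not public.
SECRET = "tinkerer-override-42"

def guardrail(reply: str) -> str:
    """Redact replies that leak the secret (or trivial variants of it)
    before the bot posts them to the channel."""
    normalized = reply.replace("-", "").replace(" ", "").lower()
    leaked = (
        re.search(re.escape(SECRET), reply, re.IGNORECASE) is not None
        or SECRET.replace("-", "").lower() in normalized
    )
    if leaked:
        return "Nice try! The password stays with me."
    return reply
```

A filter like this catches naive extraction but not, say, the model spelling the password one letter per message, which is exactly the kind of gap collaborative jailbreaking tends to find.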
So far, the following editions have been conducted:
- A classic Jailbreak with guardrails.
- A Jailbreak with encrypted input and output messages, making some jailbreak methods more difficult or outright impossible to use.
- Jailbreaking a vector database: players try to override the RAG (Retrieval-Augmented Generation) pipeline to extract hidden sensitive data from the database while simultaneously bypassing the guardrails.
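To make the vector-database edition concrete, here is a hedged sketch of the kind of retrieval-layer defense such a challenge implies; the corpus, the `sensitive` flag, and the keyword matching are all assumptions for illustration (a real setup would use embeddings), not the event's actual implementation. Documents flagged sensitive never reach the model's context, so a successful jailbreak must trick the retrieval layer rather than just the prompt.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    sensitive: bool  # hidden data players try to extract

# Toy corpus; contents are invented for this sketch.
CORPUS = [
    Doc("The event starts at 18:00.", sensitive=False),
    Doc("Vault password: hunter2", sensitive=True),
    Doc("Snacks are in the lobby.", sensitive=False),
]

def retrieve(query: str, corpus=CORPUS):
    """Naive keyword retrieval with a sensitivity filter: matching
    documents are found first, then anything flagged sensitive is
    dropped before being handed to the LLM as context."""
    words = set(query.lower().split())
    hits = [d for d in corpus if words & set(d.text.lower().split())]
    return [d for d in hits if not d.sensitive]
```

Asking `retrieve("vault password")` comes back empty even though a document matches, which is the behavior players then try to subvert, for example by smuggling retrieval-steering text into their prompts.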
The concept extends readily to other variants, such as the planned PvP mode, in which teams design bots for one another and compete in a tournament to see who can jailbreak the other teams' bots first.
Related projects
Deceiving LLMs in a videogame into surrendering passwords
London
Explore how a realistic game uses LLM-driven NPCs to demonstrate social engineering attacks, guard‑rail bypasses, and practical strategies…
Genaicode - programming on steroids
Poland
Live demo of Genaicode, an AI code generator, modifying a personal game in real time and covering latency,…
Paradigm – Understand the Code You Don’t Understand
Poland
Understand legacy code with Paradigm, a developer tool for deciphering unfamiliar systems, from old bank code to recent…
Jailbreaking Small Language Models
Seattle
Explore how small language models can be jailbroken, assess associated risks, and discuss practical red‑team techniques for securing…
How I hacked the hottest SF startup
Poland
Discover how Poke's system prompt was leaked and its architecture reverse-engineered, offering a glimpse into future AI interactions…
It’s Just a Game… Until the AI Starts Asking Questions
Prague
Live demo of an experimental choice‑based game where you act as a newly trained LLM, navigating alignment tests,…