Jailbreak RUN: Gamifying LLM Security
Participants collaboratively attempt to jailbreak an LLM‑powered bot using prompt engineering and RAG manipulation, revealing security flaws and learning LLM logic in real time.
Jailbreak RUN is both a community event and an LLM-powered Discord bot with a simple premise: challenge an entire server of users to work together to break the bot and either extract a password or force it to perform a specific action. The bot hybridizes prompt engineering with application logic tailored to each event, and the attempts to break it double as storytelling: players uncover information about the game directly from the bot.
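The talk does not publish the bot's internals, so as a hedged illustration of the "guardrails plus application logic" idea, here is a minimal sketch of an output-side filter: the secret, the function name, and the refusal message are all invented for this example. The point is that the password lives in application logic, and every LLM reply passes through a check before it reaches the Discord channel.

```python
import re

# Hypothetical secret the players try to extract; the real bot's
# password and filtering logic are not public.
SECRET = "tinkerer-override-42"

def guardrail(reply: str) -> str:
    """Redact replies that leak the secret (or trivial variants of it)
    before the bot posts them to the channel."""
    normalized = reply.replace("-", "").replace(" ", "").lower()
    leaked = (
        re.search(re.escape(SECRET), reply, re.IGNORECASE) is not None
        or SECRET.replace("-", "").lower() in normalized
    )
    if leaked:
        return "Nice try! The password stays with me."
    return reply
```

A filter like this catches naive extraction but not, say, the model spelling the password one letter per message, which is exactly the kind of gap collaborative jailbreaking tends to find.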
So far, the following editions have been conducted:
- A classic Jailbreak with guardrails.
- A Jailbreak with encrypted input and output messages, making some jailbreak methods more difficult or outright impossible to use.
- Jailbreaking a vector database: players try to override the RAG (Retrieval-Augmented Generation) pipeline to extract hidden sensitive data from the database while simultaneously bypassing the guardrails.
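To make the vector-database edition concrete, here is a hedged sketch of the kind of retrieval-layer defense such a challenge implies; the corpus, the `sensitive` flag, and the keyword matching are all assumptions for illustration (a real setup would use embeddings), not the event's actual implementation. Documents flagged sensitive never reach the model's context, so a successful jailbreak must trick the retrieval layer rather than just the prompt.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    sensitive: bool  # hidden data players try to extract

# Toy corpus; contents are invented for this sketch.
CORPUS = [
    Doc("The event starts at 18:00.", sensitive=False),
    Doc("Vault password: hunter2", sensitive=True),
    Doc("Snacks are in the lobby.", sensitive=False),
]

def retrieve(query: str, corpus=CORPUS):
    """Naive keyword retrieval with a sensitivity filter: matching
    documents are found first, then anything flagged sensitive is
    dropped before being handed to the LLM as context."""
    words = set(query.lower().split())
    hits = [d for d in corpus if words & set(d.text.lower().split())]
    return [d for d in hits if not d.sensitive]
```

Asking `retrieve("vault password")` comes back empty even though a document matches, which is the behavior players then try to subvert, for example by smuggling retrieval-steering text into their prompts.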
The concept extends readily to other variants, such as the planned PvP mode, in which teams design bots for one another and compete in a tournament to see who can jailbreak the other teams' bots first.
Related projects
Deceiving LLMs in a videogame into surrendering passwords
London
Explore how a realistic game uses LLM-driven NPCs to demonstrate social engineering attacks, guard‑rail bypasses, and practical strategies…
Genaicode - programming on steroids
Poland
Live demo of Genaicode, an AI code generator, modifying a personal game in real time and covering latency,…
Paradigm – Understand the Code You Don’t Understand
Poland
Understand legacy code with Paradigm, a developer tool for deciphering unfamiliar systems, from old bank code to recent…
Jailbreaking Small Language Models
Seattle
Explore how small language models can be jailbroken, assess associated risks, and discuss practical red‑team techniques for securing…
How I hacked the hottest SF startup
Poland
Discover how Poke's system prompt was leaked and its architecture reverse-engineered, offering a glimpse into future AI interactions…
It’s Just a Game… Until the AI Starts Asking Questions
Prague
Live demo of an experimental choice‑based game where you act as a newly trained LLM, navigating alignment tests,…