Wrapperone
Explore how Wrapperone handles parallel, rate‑limited LLM requests, immutable versioned conversation state, automatic tool integration, and flexible multi‑step, schema-based response formats.
A technical library designed for power users who need:
Parallel, rate-limited request processing to LLM providers (OpenAI, Anthropic, vLLM, LiteLLM, and OpenRouter).
A system to store conversation states (chat threads) with built-in immutability and versioning.
Automatic tool integration (both function-based and schema-based).
Flexible response formats: text, JSON, structured schemas, or multi-step workflows.
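The parallel, rate-limited request processing above can be sketched with a standard asyncio semaphore pattern. This is a minimal illustration of the technique, not Wrapperone's actual API; all names here (`call_llm`, `run_batch`, `MAX_CONCURRENT`) are hypothetical, and the provider call is simulated.

```python
import asyncio

# Hypothetical sketch, not Wrapperone's API: a semaphore caps the number
# of in-flight requests, approximating a provider-side rate limit.
MAX_CONCURRENT = 4

async def call_llm(prompt: str, sem: asyncio.Semaphore) -> str:
    async with sem:
        # A real client would await an HTTP call to a provider such as
        # OpenAI or Anthropic here; we just simulate latency.
        await asyncio.sleep(0.01)
        return f"response to: {prompt}"

async def run_batch(prompts: list) -> list:
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    # gather() runs all calls concurrently; the semaphore ensures at most
    # MAX_CONCURRENT are actually executing at any moment.
    return await asyncio.gather(*(call_llm(p, sem) for p in prompts))

results = asyncio.run(run_batch([f"prompt {i}" for i in range(10)]))
```

A production version would add per-provider limits and retry/backoff on 429 responses, but the semaphore is the core of bounded concurrency.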
The core design revolves around Entities. Each conversation, message, or piece of configuration is an Entity that undergoes forking and versioning whenever modified. This ensures you have a complete lineage of conversation states without accidental in-place mutations.
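The fork-on-modify versioning described above can be sketched with frozen dataclasses: each change produces a new immutable Entity that records its parent, so the full lineage survives. The class and field names below are illustrative assumptions, not Wrapperone's actual types.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical sketch of fork-on-write versioning; frozen=True makes
# in-place mutation raise an error, enforcing immutability.
@dataclass(frozen=True)
class Conversation:
    version: int = 0
    parent_version: Optional[int] = None
    messages: Tuple[str, ...] = ()

    def fork_with(self, message: str) -> "Conversation":
        # Every modification returns a new Entity pointing back at its
        # parent, preserving the complete conversation lineage.
        return Conversation(
            version=self.version + 1,
            parent_version=self.version,
            messages=self.messages + (message,),
        )

root = Conversation()
v1 = root.fork_with("user: hello")
v2 = v1.fork_with("assistant: hi")
```

Because `root` and `v1` are untouched by later forks, any earlier version can be branched again, giving a tree of conversation states rather than a single mutable thread.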
This Python library orchestrates concurrent, immutable LLM inference with integrated tools.
Related projects
AI Call Analyst
Medellín
Learn how to convert recruiter‑candidate call audio to text, assess pronunciation, apply business criteria with an LLM, score…
Lessons from building an LLM-first framework
London
Explore how Tonk enables non‑coders to quickly update multiplayer applets by limiting context to the frontend and using…
One Embedding to Rule Them All: Bending LLMs to Your Will
Dublin
Live demonstration of altering a single embedding vector to direct a language model's output without retraining, showing how…
Bidimensional testing in the LLM era
Milan
Examine how test pyramids and parameter‑exploration apply to GenAI workflows, using counterfeit Rolex detection to reveal practical and…
LLM.f90 - Minimal Large Language Model Inference Framework
Toronto
A low‑dependency Fortran framework for LLM inference, showing zero‑dependency implementation, matrix operations, and support for Llama, Phi, and…
PaperBench: Evaluating AI’s Ability to Replicate AI Research
Rome
This talk presents PaperBench, a benchmark for evaluating AI agents’ ability to replicate state-of-the-art AI research through code…