Q-Learning for Inventory Optimization
Live code walkthrough of Q‑Learning applied to inventory optimization, covering reward design, Q‑table mechanics, state transitions, and practical deployment considerations.
This hands-on technical session demonstrates the implementation and practical application of Q-Learning in solving real-world inventory management challenges. The presentation will feature:
- Live code walkthrough of a complete Q-Learning implementation for inventory optimization, examining the mathematical foundations and state-action-reward framework that drives the system
- Interactive demonstration showing how reinforcement learning outperforms traditional rule-based inventory systems across various scenarios (seasonal demand, supply chain disruptions, capacity constraints)
- Deep dive into the Q-table mechanics, demonstrating how the algorithm evaluates different states and actions, including visualization of the learning process over training epochs
- Step-by-step examination of the reward function design and its critical impact on model behavior, with live modification of parameters to demonstrate their effects
- Technical dissection of state transition functions, balancing exploration vs. exploitation, and hyperparameter optimization techniques specific to inventory applications
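The core loop described above can be sketched as a minimal tabular Q-learning agent for a toy inventory problem. All names, costs, and the uniform demand model here are illustrative assumptions for exposition, not the presenter's actual implementation:

```python
import random

ALPHA = 0.1       # learning rate
GAMMA = 0.95      # discount factor
EPSILON = 0.1     # exploration rate (epsilon-greedy)
MAX_STOCK = 20    # warehouse capacity
ACTIONS = range(0, 11)   # units to reorder each day
HOLDING_COST = 1.0       # cost per unit held overnight
STOCKOUT_COST = 5.0      # penalty per unit of unmet demand
ORDER_COST = 1.0         # cost per unit ordered
UNIT_PRICE = 3.0         # revenue per unit sold

random.seed(0)
# Q-table: one value per (stock level, reorder quantity) pair
Q = {(s, a): 0.0 for s in range(MAX_STOCK + 1) for a in ACTIONS}

def step(stock, order):
    """Apply an order, sample demand, and return (reward, next_stock)."""
    stock = min(stock + order, MAX_STOCK)
    demand = random.randint(0, 10)   # illustrative uniform demand
    sold = min(stock, demand)
    reward = (UNIT_PRICE * sold
              - HOLDING_COST * (stock - sold)
              - STOCKOUT_COST * max(demand - stock, 0)
              - ORDER_COST * order)
    return reward, stock - sold

def choose(stock):
    """Epsilon-greedy action selection over the Q-table."""
    if random.random() < EPSILON:
        return random.choice(list(ACTIONS))
    return max(ACTIONS, key=lambda a: Q[(stock, a)])

stock = 0
for _ in range(50_000):   # training steps
    action = choose(stock)
    reward, next_stock = step(stock, action)
    best_next = max(Q[(next_stock, a)] for a in ACTIONS)
    # Standard Q-learning update: move toward the TD target
    Q[(stock, action)] += ALPHA * (reward + GAMMA * best_next
                                   - Q[(stock, action)])
    stock = next_stock

# After training, the greedy policy maps stock level -> reorder quantity
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)])
          for s in range(MAX_STOCK + 1)}
```

Changing `STOCKOUT_COST` relative to `HOLDING_COST` shifts the learned policy between leaner and more conservative reorder levels, which is exactly the reward-design effect the session explores live.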
The session will demonstrate how to implement several advanced RL techniques within the inventory context:
- Temporal difference learning with multi-step lookahead
- Decay-based exploration strategies
- State discretization for continuous inventory spaces
- Lead time handling in RL decision systems
- Deployment considerations in production environments
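Two of the techniques listed above can be sketched compactly. The constants and function names below are illustrative assumptions, chosen only to show the shape of each idea:

```python
import math

# --- Decay-based exploration: epsilon shrinks over training episodes ---
EPS_START, EPS_END, DECAY_RATE = 1.0, 0.05, 1e-3

def epsilon(episode):
    """Exponentially decay exploration from EPS_START toward EPS_END."""
    return EPS_END + (EPS_START - EPS_END) * math.exp(-DECAY_RATE * episode)

# --- State discretization: bucket a continuous inventory level ---
BIN_WIDTH = 5.0   # units of stock per Q-table bucket

def discretize(stock_level, max_stock=100.0):
    """Map a continuous stock level to an integer bucket index."""
    clipped = min(max(stock_level, 0.0), max_stock)
    return int(clipped // BIN_WIDTH)

print(discretize(12.7))   # bucket 2
```

Early episodes explore almost uniformly (`epsilon(0)` is close to 1.0), while late episodes mostly exploit the learned Q-values; the bucket width trades off Q-table size against policy resolution for continuous inventory spaces.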
I’ll conclude with a live demonstration of the model adapting to audience-suggested challenging scenarios, showing how it responds to unexpected demand patterns, capacity constraints, and supply chain disruptions.
Related projects
From LLMs to Reasoning Models
Quito
This talk covers implementing reasoning models by scaling inference-time computation on open source LLMs using techniques like Monte…
Building High-Performance Search Agents: Local Inference with DuckDuckGo and Google Search Integration
Quito
This talk demonstrates building local high-performance search agents using llama-cpp-agents, integrating DuckDuckGo and Google, optimizing memory for large…
Building a Unified AI Interface: Live Demo of Dolphin MCP's Cross-Provider Tool Orchestration
Quito
This talk demonstrates building a unified AI interface using Dolphin MCP to orchestrate tools across multiple LLM providers…
SomosNPL
Quito
The talk covers efforts to advance Spanish natural language processing by creating open resources, highlighting SomosNPL's role in…
Q&A Session with a Google DeepMind Research Engineer
Medellín
An open Q&A covering DeepMind research methods, current projects, technical challenges, and career insights, allowing attendees to ask…
Deep RL for User Experience
Chicago
Learn how to use Ray’s distributed tuning and parallel processing to scale reinforcement learning predictions and training, including…