February 25, 2025 · Denver

ComfyStream: Real-Time Video AI

This talk demonstrates Livepeer’s AI application Day Dream and explores ComfyStream, the open-source tool used to build real-time video AI workflows.

Tech stack
  • GPT-4
    GPT-4 is OpenAI’s large multimodal model: it accepts both text and image inputs and performs at a human level on many professional and academic benchmarks.
    OpenAI presented GPT-4 as a milestone in scaling deep learning. It shows a clear capability jump over its predecessor: it scored around the top 10% of test takers on a simulated bar exam, where GPT-3.5 scored around the bottom 10%. The model follows nuanced instructions and handles long-form content, with context windows of up to 32,768 tokens in the 32K variant, which is roughly 25,000 words in a single prompt. OpenAI also positions GPT-4 as more reliable, more steerable, and better at advanced reasoning than earlier models.
  • OpenAI API
    The OpenAI API gives applications programmatic access to OpenAI models such as GPT-4o, DALL-E 3, and Whisper.
    The OpenAI API provides authenticated, programmatic access to a suite of generative AI models. Developers call REST endpoints directly or through the official libraries (Python, Node.js) to add text generation (GPT-4o), image creation (DALL-E 3), and speech-to-text transcription (Whisper) to their applications. The platform is built to operate at scale, serving workloads that range from complex reasoning tasks to real-time customer support agents.
  • Livepeer
    Livepeer is a decentralized video infrastructure network for developers, providing cost-efficient video transcoding and real-time AI processing.
    Livepeer is a decentralized video infrastructure network, built on Ethereum, for developers of streaming applications. It is a backend service rather than a consumer-facing platform (think AWS, not YouTube), and it reportedly cuts video transcoding and distribution costs by up to 50x. The network runs on a cryptoeconomic protocol: the Livepeer Token (LPT) incentivizes a global set of 'Orchestrators' who contribute GPU power for processing. Livepeer has recently expanded to support real-time AI video pipelines, extending the network from media delivery to decentralized compute workflows.
  • ComfyUI
    ComfyUI is an open-source, node-based interface for building and executing customized generative AI workflows (Stable Diffusion, video, 3D).
    ComfyUI is an open-source engine for visual generative AI with a node-based interface. Users build pipelines visually by connecting components such as samplers, models, and custom nodes (e.g., AnimateDiff, IPAdapter), which makes results precise and reproducible. Compared with simpler UIs, ComfyUI is better suited to advanced Stable Diffusion workflows, video generation, and 3D asset creation: it uses hardware efficiently and exposes low-level parameters for adjustment.
  • ComfyStream
    ComfyStream is an open-source extension that enables real-time video processing by running ComfyUI workflows on live WebRTC streams.
    ComfyStream targets low-latency AI video. It works as an extension for ComfyUI (the backend inference engine): it takes a video stream as input, runs each frame through a specified workflow, and outputs the result as a new stream (Source: Livepeer Blog). The architecture has three main parts: a ComfyUI backend, a WebRTC streaming layer, and an optimization layer. Real-time frame rates come from techniques such as a one-step distilled UNet and TensorRT-accelerated inference. Developed in collaboration with Livepeer, ComfyStream is aimed at interactive entertainment, AI-enhanced live performances, and real-time video effects.
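
The per-frame flow described for ComfyStream (take a stream in, run each frame through a workflow, emit a new stream) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not ComfyStream's actual code: the names `source_stream`, `process_stream`, and `invert_workflow` are hypothetical, a frame is modeled as a list of 8-bit pixel values, and a simple pixel inversion stands in for a ComfyUI workflow. A real pipeline would receive frames over WebRTC and run model inference in place of the stand-in function.

```python
import asyncio
from typing import AsyncIterator, Callable, Iterable, List

# Assumption: a frame is modeled as a flat list of 8-bit grayscale pixels.
Frame = List[int]


def invert_workflow(frame: Frame) -> Frame:
    """Hypothetical stand-in for a ComfyUI workflow: invert each pixel."""
    return [255 - pixel for pixel in frame]


async def source_stream(frames: Iterable[Frame]) -> AsyncIterator[Frame]:
    """Simulate an incoming live stream by yielding frames one at a time."""
    for frame in frames:
        await asyncio.sleep(0)  # hand control back to the event loop between frames
        yield frame


async def process_stream(
    source: AsyncIterator[Frame],
    workflow: Callable[[Frame], Frame],
) -> AsyncIterator[Frame]:
    """Run every incoming frame through the workflow and emit a new stream."""
    async for frame in source:
        yield workflow(frame)


async def collect(stream: AsyncIterator[Frame]) -> List[Frame]:
    """Drain a stream into a list so the result can be inspected."""
    return [frame async for frame in stream]


def run_demo() -> List[Frame]:
    """Push two tiny frames through the pipeline and return the output frames."""
    frames = [[0, 128, 255], [10, 20, 30]]
    pipeline = process_stream(source_stream(frames), invert_workflow)
    return asyncio.run(collect(pipeline))


if __name__ == "__main__":
    print(run_demo())
```

Modeling both input and output as async iterators keeps the workflow step swappable: any function from frame to frame can be passed to `process_stream` without changing the streaming code around it.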