GPT-4o
GPT-4o (omni) is OpenAI's flagship multimodal model: it delivers GPT-4-level intelligence with native, real-time processing across text, audio, and vision.
GPT-4o is a single neural network that natively handles text, audio, and image inputs and outputs. It matches GPT-4 Turbo on English text and code and surpasses it on non-English language, vision, and audio benchmarks. Speed is the major upgrade: it responds to voice input in 0.32 seconds on average, close to human conversational response time, compared with the 5.4-second average latency of the earlier GPT-4 voice mode. Developers get a 128K-token context window and a model that is more cost-efficient than its predecessor, making high-intelligence, real-time applications viable.
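To make the context-window figure concrete, here is a minimal sketch of a budget check a developer might run before sending a prompt. It is not OpenAI's API: the helper names `estimate_tokens` and `fits_in_context` are hypothetical, the four-characters-per-token ratio is only a rough rule of thumb for English text, and the reserved output budget is an arbitrary illustrative value.

```python
# Context window size taken from the description above (128K tokens).
CONTEXT_WINDOW_TOKENS = 128_000

# Assumption: roughly 4 characters per token for English text.
# This is a heuristic, not the model's real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Roughly estimate the token count of `text` (rounds up)."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(prompt: str, reserved_output_tokens: int = 4_000) -> bool:
    """Check whether the prompt plus a reserved reply budget fits the window.

    `reserved_output_tokens` is a hypothetical allowance for the model's
    reply; adjust it to the application's needs.
    """
    needed = estimate_tokens(prompt) + reserved_output_tokens
    return needed <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    short_prompt = "Summarize this meeting transcript in three bullet points."
    print(fits_in_context(short_prompt))
```

For exact counts, replace the character heuristic with the tokenizer that matches the model; the estimate here is only suitable for a quick sanity check.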