Members-Only
Recent Talks & Demos are for members only
You must be an AI Tinkerers active member to view these talks and demos.
Nebius AI: Substitute ChatGPT
Learn how to replace ChatGPT with third‑party inference services, compare benefits, integrate Nebius AI Studio, and build a simple RAG and evaluation pipeline.
- Shortly talk about advantages of 3rd party inference providers.
- I will show easily can be substituted any inference provider.
- Show example with Nebius AI Studio
- (if we will have enough time) Show a primitive RAG and eval pipeline
Nebius offers NVIDIA H100/H200/GB200 GPU clusters via InfiniBand and Kubernetes orchestration.
Scalable LLM inference service offers ultra-low latency via OpenAI-compatible API.
- ChatGPTOpenAI's Generative Pre-trained Transformer (GPT) model: a conversational AI chatbot for instant text generation, coding assistance, and complex problem-solving.Launched by OpenAI in November 2022, ChatGPT is a state-of-the-art conversational AI, built on the Generative Pre-trained Transformer (GPT) architecture (e.g., GPT-4). The system is fine-tuned using Reinforcement Learning from Human Feedback (RLHF) to produce human-like dialogue, admit mistakes, and reject inappropriate requests. Users leverage the chatbot to execute diverse tasks: generating code snippets, drafting professional emails, summarizing technical documents, and even creating original images via DALL-E integration. It functions as a powerful, multi-purpose tool for rapid content creation and information retrieval.
- Nebius AI StudioNebius AI Studio is a high-performing Inference-as-a-Service platform for deploying, fine-tuning, and scaling leading open-source LLMs and text-to-image models.This is your end-to-end platform for AI inference: deploy models like Llama 3.1 and Mistral with zero MLOps overhead. Nebius AI Studio provides an OpenAI-compatible API and a user-friendly Playground for testing, comparing, and fine-tuning models against your domain-specific data. Leverage its proprietary infrastructure for ultra-low latency and cost-efficient, per-token pricing, a factor recognized by Artificial Analysis. The platform supports high-volume workloads, offering a standard capacity of 100M+ tokens per minute for text models. Beyond LLMs, it integrates text-to-image capabilities using models like Flux Schnell and SDXL, ensuring you can scale both language and visual generation at an enterprise level.
- RAGRAG (Retrieval-Augmented Generation) is the GenAI framework that grounds LLMs (like GPT-4) on external, verified data, drastically reducing model hallucinations and providing verifiable sources.RAG is a critical GenAI architecture: it solves the LLM 'hallucination' problem by inserting a retrieval step before generation. A user query is vectorized, then used to query an external knowledge base (e.g., a Pinecone vector database) for relevant document chunks (typically 512-token segments). These retrieved facts augment the original prompt, providing the LLM (e.g., Gemini or Llama 3) the specific, current, or proprietary context required. This process ensures the final response is accurate and grounded in domain-specific data, avoiding the high cost and latency of full model retraining.
Related projects
Building a full-stack AI business co-founder
Amsterdam
We’ll show how Starnus uses Gemini‑supervised integration of third‑party AI agents for automated investor discovery and lead generation,…
Artecon - A hotspot for AI
Seattle
Learn how to run CPU‑based ML models with low latency, using small public models and post‑processing, then bundle…
Building AI experiences that businesses pay for
Prague
Learn how to build AI features that deliver real business value, not just cool tech. This talk shares…
AI science agent
Amsterdam
An experiment automating scientific research using GPT-5 Codex CLI resulted in a complete, non-trivial research paper, presented on…
Demo: Manus — The AI Agent That Thinks Before It Acts
Cincinnati
A live demonstration of Manis, an AI agent that visualizes its reasoning steps, showing how it plans and…
Vital AI Agent Ecosystem + Chat.ai
New York City
Demonstration of an open standard and open-source Vital AI Agent Ecosystem for deploying agents, enabling inter-agent communication, collaboration,…