.

Members-Only

Recent Talks & Demos are for members only

Exclusive feed

You must be an AI Tinkerers active member to view these talks and demos.

April 24, 2025 · Pereira

PGVector: RAG in Postgres

Learn how to store, update, and delete embeddings in real-time using PostgreSQL with the PGVector extension, demonstrating user-specific RAG with FastAPI, LangChain, and OpenAI.

Overview
Tech stack
  • GPT-4
    GPT-4 is OpenAI’s large multimodal model: it processes both text and image inputs, delivering human-level performance on complex professional and academic benchmarks.
    This is OpenAI’s latest milestone in scaling deep learning: a large multimodal model accepting both text and image inputs. It demonstrates a significant capability leap over its predecessor, scoring in the top 10% on a simulated bar exam (GPT-3.5 scored in the bottom 10%). The model handles nuanced instructions and long-form content, supporting context windows up to 32,768 tokens (32K model). This capacity allows processing up to 25,000 words in a single, complex prompt. GPT-4 is engineered for enhanced reliability, steerability, and advanced reasoning across diverse tasks.
  • LangChain
    The open-source framework for building and deploying reliable, data-aware Large Language Model (LLM) applications.
    LangChain is the essential framework for engineering LLM-powered applications: it simplifies connecting models (like GPT-4 or Claude) to external data, computation, and APIs. The platform provides a modular set of components—Chains, Agents, Tools, and Memory—allowing developers to quickly build complex workflows like Retrieval-Augmented Generation (RAG) pipelines and sophisticated conversational agents. Its Python and JavaScript libraries, combined with LangChain Expression Language (LCEL), offer a standardized interface for rapid prototyping and moving applications to production with confidence.
  • FastAPI
    FastAPI is a modern, high-performance Python web framework for building APIs with automatic OpenAPI documentation.
    FastAPI is a robust, high-speed Python web framework: it is built on Starlette (for async capabilities) and Pydantic (for data validation and serialization). Leveraging standard Python 3.8+ type hints, the framework automatically generates interactive API documentation (Swagger UI/ReDoc) and enforces data validation, effectively reducing developer-induced errors by an estimated 40%. This architecture delivers performance on par with Node.js and Go, significantly increasing feature development speed (up to 300% faster). It is production-ready, fully supporting OpenAPI and JSON Schema standards for all API specifications.
  • Python
    Python: The high-level, general-purpose language built for readability, powering everything from web backends to advanced machine learning models.
    Python is the high-level, general-purpose language prioritizing clear, readable syntax (via significant indentation), ensuring rapid development for any team . Its ecosystem is massive: use it for robust web development with frameworks like Django and Flask, or leverage its power in data science with libraries such as Pandas and NumPy . The Python Package Index (PyPI) provides thousands of community-contributed modules, offering immediate solutions for tasks from network programming to GUI creation . The language is actively maintained by the Python Software Foundation (PSF), with the stable release currently at Python 3.14.0 (as of November 2025) .
  • OpenAI API
    OpenAI API: Your direct gateway to cutting-edge AI models (GPT-4o, DALL-E 3, Whisper), enabling scalable, multimodal intelligence integration into any application.
    The OpenAI API provides authenticated, programmatic access to a powerful suite of generative AI models. Developers leverage REST endpoints and official libraries (Python, Node.js) to integrate capabilities like advanced text generation (GPT-4o), image creation (DALL-E 3), and speech-to-text transcription (Whisper). This platform is engineered for scale, supporting millions of daily requests for tasks from complex reasoning to real-time customer support agents, ensuring your application gets reliable, state-of-the-art intelligence.

Related projects