February 21, 2025 · Singapore

NanoBrowser: Open Source Automation

This talk demonstrates NanoBrowser, an open-source AI-powered browser extension for web automation, highlighting community contributions, practical use cases, and development opportunities.

Tech stack
  • NanoBrowser
    NanoBrowser is an open-source, multi-agent AI web automation tool. It executes complex browser workflows from a single prompt and runs locally in the browser, which keeps data private and avoids subscription costs.
    NanoBrowser is a free, open-source Chrome and Edge extension that uses a collaborative multi-agent system (Planner, Navigator, Validator) to execute web tasks. Users enter a single prompt, and the agents handle complex actions such as data validation and form filling. All core operations run locally in the browser, which keeps sensitive data secure. It supports multiple LLM providers, including OpenAI, Anthropic, and Ollama, so users can bring their own API keys instead of paying recurring subscription fees.
  • OpenAI Operator
    Autonomous AI agent: Operator uses the Computer-Using Agent (CUA) model to execute complex web tasks (e.g., form filling, ordering) by interacting directly with graphical user interfaces (GUIs).
    Operator was OpenAI's first autonomous AI agent, designed to manage repetitive digital workflows. It was built on the Computer-Using Agent (CUA) model, which combined GPT-4o's vision capabilities with advanced reasoning trained through reinforcement learning. The agent operated a remote browser, mimicking human actions such as clicking and typing on any website, with no custom APIs required. It scored 58.1% on the WebArena benchmark for web interactions. Pro-tier users could use it to automate tasks such as ordering groceries, scheduling appointments, and filling out multi-step forms. The goal was to turn AI from a passive tool into an active participant in the digital ecosystem.
  • Browser extension
    Small, focused software modules that directly enhance and customize the functionality of a web browser (Chrome, Firefox, Edge) using standard web technologies.
    Browser extensions are compact software packages that add features to, or modify the behavior of, a host browser such as Google Chrome, Mozilla Firefox, or Microsoft Edge. They are typically built with standard web technologies (HTML, CSS, JavaScript) and run with specific permissions to perform tasks such as content modification or data handling. Examples include productivity tools (e.g., Grammarly, Todoist), security layers (e.g., uBlock Origin for ad blocking), and utility managers (e.g., Bitwarden for password management), all of which customize the web experience beyond the browser's default capabilities.
  • Web automation
    Software bots that mimic human actions (e.g., clicks, data entry) directly in a web browser, taking over repetitive, high-volume manual tasks.
    Web automation is the use of software robots to execute repetitive, browser-based tasks such as form filling, data extraction, and navigation. It is a component of Robotic Process Automation (RPA) that handles front-end web interactions. For example, a bot can complete a multi-step data migration into a web-based CRM in minutes, a task that could take a human operator hours. This frees teams to focus on higher-value work rather than manual data input or repetitive testing cycles, which are commonly automated with frameworks such as Selenium WebDriver.
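
The NanoBrowser entry above names three cooperating agents (Planner, Navigator, Validator) but does not describe how they hand work to each other. The sketch below shows one plausible shape for such a loop, not NanoBrowser's actual implementation: the `Step` and `RunResult` types, the `run_task` function, the `max_rounds` limit, and the stub agents are all hypothetical names introduced for illustration. In a real tool, each agent would be backed by an LLM call and the navigator would drive the browser.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Step:
    """One browser action proposed by the planner (hypothetical shape)."""
    action: str  # e.g. "open", "type", "click"
    target: str  # a URL or a description of a page element


@dataclass
class RunResult:
    done: bool
    log: list[str] = field(default_factory=list)


def run_task(
    prompt: str,
    planner: Callable[[str, list[str]], list[Step]],
    navigator: Callable[[Step], str],
    validator: Callable[[str, list[str]], bool],
    max_rounds: int = 3,
) -> RunResult:
    """Plan, act, and validate until the validator accepts or rounds run out."""
    log: list[str] = []
    for _ in range(max_rounds):
        # The planner sees the prompt plus everything done so far.
        for step in planner(prompt, log):
            # The navigator performs the step and reports what happened.
            log.append(navigator(step))
        # The validator decides whether the task is complete.
        if validator(prompt, log):
            return RunResult(done=True, log=log)
    return RunResult(done=False, log=log)


# Stub agents standing in for LLM-backed ones (assumptions, for demonstration).
def stub_planner(prompt: str, log: list[str]) -> list[Step]:
    if not log:
        return [Step("open", "https://example.com/form"), Step("type", "name field")]
    return [Step("click", "submit button")]


def stub_navigator(step: Step) -> str:
    return f"{step.action}:{step.target}"


def stub_validator(prompt: str, log: list[str]) -> bool:
    return any(entry.startswith("click:submit") for entry in log)


result = run_task(
    "Fill in and submit the form", stub_planner, stub_navigator, stub_validator
)
print(result.done, len(result.log))
```

Passing the agents in as plain callables keeps the loop independent of any particular LLM provider, which fits the bring-your-own-key setup described above. The `max_rounds` cap is a design choice of this sketch to stop a task the validator never accepts.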
