GRPO: Rust LLM Training
This talk walks through a training pipeline that uses Cargo feedback as a reinforcement-learning signal to improve large language models on low-resource languages such as Rust.
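The core idea, turning compiler feedback into a reward signal, can be sketched as follows. This is a hypothetical illustration, not the speaker's pipeline: the error-line pattern and the scoring weights are assumptions, and in practice the stderr string would come from running `cargo check` on the model's generated code.

```python
import re

def compiler_reward(cargo_stderr: str) -> float:
    """Score a generated Rust snippet from captured `cargo check` output.

    Hypothetical reward shaping: a clean build earns 1.0; each compiler
    error line (e.g. `error[E0502]: ...` or `error: ...`) subtracts 0.2,
    floored at 0.0, nudging the policy toward code that compiles.
    """
    errors = len(re.findall(r"^error(\[E\d+\])?:", cargo_stderr, re.MULTILINE))
    return max(0.0, 1.0 - 0.2 * errors)

# Example: stderr from a borrow-checker failure vs. a clean check.
failing = (
    "error[E0502]: cannot borrow `v` as mutable\n"
    "error: aborting due to 1 previous error"
)
print(round(compiler_reward(failing), 2))  # 0.6
print(compiler_reward(""))                 # 1.0
```

A real pipeline would also reward passing `cargo test`, but a binary compile/no-compile signal is already a dense, cheap source of feedback that Rust's strict compiler makes unusually informative.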
- Rust: A high-performance, statically typed systems programming language engineered for performance and reliability, directly challenging C and C++ in speed. Its core innovation is the ownership model and borrow checker, which enforce memory and thread safety at compile time, eliminating data races and null-pointer dereferences without a conventional garbage collector. High-level features compile to highly optimized code through zero-cost abstractions. Major industry players, including Microsoft and Cloudflare, use Rust for critical infrastructure, and it is now officially supported for development in the Linux kernel.
- Cargo: The official build system and package manager for the Rust programming language. Cargo handles dependency resolution, downloading a project's required libraries (crates) from the community registry, crates.io, and manages the entire build lifecycle: `cargo new` creates a project, `cargo build` compiles it, and `cargo test` runs its tests. Project configuration lives in the `Cargo.toml` manifest file, which defines metadata, dependencies, and build profiles. It is the standard tool for any serious Rust development.
- GPT-4: OpenAI's large multimodal model, accepting both text and image inputs and delivering human-level performance on complex professional and academic benchmarks. It demonstrates a significant capability leap over its predecessor, scoring in the top 10% on a simulated bar exam (GPT-3.5 scored in the bottom 10%). The model handles nuanced instructions and long-form content, with context windows up to 32,768 tokens in the 32K variant, enough to process roughly 25,000 words in a single prompt, and is engineered for improved reliability, steerability, and advanced reasoning.
- Training pipeline: The codified, automated workflow that transforms raw data into a production-ready machine learning model. It begins with data ingestion and validation, ensuring data quality and consistency, then moves to feature engineering and preprocessing. The core training step iteratively optimizes the model's parameters using frameworks like TensorFlow or PyTorch. Post-training, the pipeline runs rigorous evaluation, comparing metrics (e.g., F1-score, AUC) against a defined baseline before registering the final artifact for deployment. Orchestrators like Kubeflow Pipelines or MLflow guarantee reproducibility, versioning, and scalability across environments, minimizing manual error and shortening the model iteration cycle from months to days.
- GPT-3: A 175-billion-parameter autoregressive language model, debuted by OpenAI in 2020, that masters diverse tasks through few-shot learning. Trained on 570 GB of filtered text, this transformer-based model performs functions ranging from Python scripting to logical reasoning using only natural language prompts. Its architecture removed the requirement for task-specific fine-tuning, establishing the foundation for tools like GitHub Copilot and the initial ChatGPT release.
- Llama 2: Meta AI's openly accessible family of large language models, released for free research and commercial use. The collection includes both pre-trained foundation models and instruction-tuned "Chat" variants, scaling from 7 billion (7B) to 70 billion (70B) parameters. Key upgrades over Llama 1 include training on 2 trillion tokens (40% more data) and a doubled context length of 4,096 tokens. The Llama-2-chat models were aligned using Reinforcement Learning from Human Feedback (RLHF), positioning them as a top-tier, openly available option for building advanced generative AI solutions.
- PaLM 2: Google's versatile large language model, optimized for advanced reasoning, multilingual translation, and coding. PaLM 2 powers 25+ Google products (including Bard and Workspace) using a Transformer-based architecture trained on a massive corpus spanning 100+ languages. It excels at specialized tasks: solving complex math problems, generating high-quality code, and passing professional-level exams. Developers access the model via the PaLM API in four sizes, Gecko, Otter, Bison, and Unicorn: Gecko is lightweight enough to run offline on mobile devices, while Unicorn handles the most complex, data-heavy reasoning tasks at scale.
- BLOOM: A 176-billion-parameter open-access multilingual language model built by the BigScience research collective, a year-long collaboration of 1,000+ researchers from 70+ countries. It supports 46 natural languages and 13 programming languages, providing a high-performance alternative to proprietary models. The model was trained on the Jean Zay supercomputer in France using the 1.6-terabyte ROOTS dataset, a massive collection of diverse text sources. By providing full access to its weights and training process, BLOOM lets developers worldwide build and audit AI tools without the restrictions of closed-door APIs.
- BERT: Bidirectional Encoder Representations from Transformers, a foundational pre-trained NLP model introduced by Google AI Language in 2018. Built on the Transformer encoder, it is deeply bidirectional: it processes a word's left and right context simultaneously, unlike previous unidirectional models, a capability achieved through its Masked Language Model (MLM) pre-training objective. Released in two sizes, BERT-Base (110 million parameters) and BERT-Large (340 million parameters), it dramatically improved the state of the art across 11+ natural language processing tasks, including question answering (SQuAD) and sentiment analysis, establishing a new baseline for the field.
- RoBERTa: A Robustly Optimized BERT Pretraining Approach from Facebook AI (2019) that significantly outperforms BERT by refining the pretraining strategy rather than the core architecture. A replication study showed BERT was undertrained: the team removed the Next Sentence Prediction (NSP) objective, implemented dynamic masking, and scaled up training to 500K steps (up from 100K) on 160 GB of text (ten times BERT's data) with batch sizes up to 8K. The optimized model achieved superior results on GLUE, RACE, and SQuAD, establishing RoBERTa as a benchmark for subsequent language model development.
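Since the talk's title names GRPO (Group Relative Policy Optimization), here is a minimal sketch of its central step: sample a group of completions per prompt, then standardize each completion's reward against the rest of the group, replacing the learned value network that PPO relies on. The function name and the example group size are illustrative assumptions, not the speaker's code.

```python
from statistics import mean, stdev

def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """GRPO's core trick: each sampled completion's advantage is its reward
    standardized against the other completions for the same prompt, so no
    separate critic/value model is needed."""
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four completions for one prompt, scored e.g. by compiler feedback:
# above-average completions get positive advantages, below-average negative.
advs = group_relative_advantages([1.0, 0.6, 0.6, 0.0])
print([round(a, 2) for a in advs])
```

These per-completion advantages then weight the policy-gradient update, so samples that compile (or otherwise score well) relative to their group are reinforced.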
Related projects
Real-Time AI Infrastructure for the Multi-Agent Era: Building AG-UI Protocol in Pure Rust
Los Angeles
Learn how the AG‑UI protocol enables real‑time AI agent‑to‑UI communication using Rust and WebAssembly, providing sub‑10 ms cold starts,…
CorpusKeeper - Talk To Data
Seattle
Demonstrating how to integrate LLMs, RAG, and function calls to create an agency‑style interface that designs, scripts, and…
CrustyCrab: An Experimental LLM-based C-to-Rust Translator
New York City
This talk explores using large language models to translate legacy C code into safe, idiomatic Rust, improving memory…
Event concierge
Los Angeles
The talk demonstrates how to parse emails with LLMs to extract event details, generate structured outputs, and preview…
Long term and short term memory
Los Angeles
A demo of a Gradio interface using ChromaDB to store and retrieve conversation embeddings, enabling long‑term memory with…
Better Insights into Team Activity
Los Angeles
The session demonstrates a Jupyter Notebook proof‑of‑concept that combines usage‑tracking dashboards, forecasting models, and a retrieval‑augmented LLM to…