July 21, 2026 · Orange County
Build Your Own Agentic Harness
Learn to build an agentic harness for bounded tasks. This talk provides a "compiler" to create your own agentic systems for specific, well-scoped projects.
Overview
A “compiler” for building your own agentic harness for a bounded task – https://www.youtube.com/watch?v=75V0ZbCE9_0
Links
This harness engineering framework structures LLM workflows using deterministic state-machines.
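As a rough illustration of what "structuring an LLM workflow with a deterministic state machine" can mean, here is a minimal sketch. It is not the framework from the talk: the state names, `run_harness`, and `stub_model` are hypothetical, and the model is a stand-in function rather than a real LLM call. The point is that the transitions live in ordinary code, so the model only fills in content inside a state and never decides where control goes next.

```python
from enum import Enum, auto
from typing import Callable


class State(Enum):
    PLAN = auto()
    ACT = auto()
    VERIFY = auto()
    DONE = auto()
    FAILED = auto()


def run_harness(
    task: str,
    model: Callable[[str], str],
    verify: Callable[[str], bool],
    max_attempts: int = 3,
) -> tuple[State, list[str]]:
    """Drive a model through a fixed PLAN -> ACT -> VERIFY cycle.

    Returns the final state and the ordered list of visited state names.
    """
    state = State.PLAN
    trace: list[str] = []
    plan = output = ""
    attempts = 0
    while state not in (State.DONE, State.FAILED):
        trace.append(state.name)
        if state is State.PLAN:
            plan = model(f"Plan: {task}")
            state = State.ACT
        elif state is State.ACT:
            attempts += 1
            output = model(f"Execute (attempt {attempts}): {plan}")
            state = State.VERIFY
        elif state is State.VERIFY:
            # The check is plain code, not a model judgment.
            if verify(output):
                state = State.DONE
            elif attempts < max_attempts:
                state = State.ACT  # retry within the attempt budget
            else:
                state = State.FAILED
    trace.append(state.name)
    return state, trace


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: gets it right on the second attempt."""
    if prompt.startswith("Plan"):
        return "write hello"
    return "hello" if "attempt 2" in prompt else "helo"


if __name__ == "__main__":
    final, trace = run_harness("say hello", stub_model, lambda out: out == "hello")
    print(final.name, trace)
```

Because the retry budget and the verification step are enforced by the loop itself, a model that keeps producing bad output ends in `FAILED` instead of running indefinitely.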
Tech stack
- Agentic harness: The software infrastructure that wraps raw AI models with state, tool execution, feedback loops, and safety boundaries to make them functional, reliable agents. A raw foundation model cannot act autonomously: it needs a harness to interact with the real world. An agentic harness serves as the execution environment, providing the model with memory persistence, tool integration (like Model Context Protocol), and structured feedback loops. By handling state management and enforcing strict safety constraints, the harness transforms a static LLM into a reliable, goal-driven agent capable of executing complex workflows without human intervention.
- Enchiridion Labs: Enchiridion Labs builds the infrastructure layer for the AI agent ecosystem to turn complex workflows into reliable, executable processes. Reliable AI execution requires structure over improvisation. Enchiridion Labs delivers this through a composable infrastructure stack built on three core pillars: data access (Model Context Protocol), procedural knowledge (Skills), and dynamic interfaces (Generative UI). By developing open-source developer tools like Proceda (a terminal-first Python SDK that converts markdown-based Standard Operating Procedures into step-by-step agents) and mcpknife (a utility for generating and transforming MCP servers), the lab equips engineers to deploy constrained, task-specific execution environments with built-in human-in-the-loop approval gates.
- Specialized harness engineering: Specialized harness engineering wraps raw language models in deterministic execution loops, sandboxes, and validation rails to build highly autonomous software agents. A raw model is not an agent: it requires a structured environment to execute complex tasks reliably. Specialized harness engineering builds this outer scaffolding, combining system prompts, tool-access protocols, and deterministic validation loops (such as running test suites or checking linters) to keep models on track. By offloading context management, sandboxed execution, and multi-agent orchestration to code rather than the model itself, this approach catches hallucinated completions before they are accepted and helps maintain architectural integrity. The result is a system where autonomous agents can run multi-hour development loops, execute code safely, and self-correct many errors before human review.
- YouTube: YouTube is a global video platform and Google subsidiary, consistently ranked as the world's second most visited website and often described as the second-largest search engine (after Google). It has over 2.5 billion monthly active users, and viewers consume more than 1 billion hours of content daily. The platform drives discovery across diverse formats: long-form videos, Shorts (seeing billions of daily views), and live streams. Top channels, such as T-Series (with over 257 million subscribers), illustrate the scale of its creator economy and its role as a hub for entertainment, education, and commerce worldwide.
- LLM: Large Language Models (LLMs) are deep learning models, built on the Transformer architecture, that process and generate human-quality text and code at scale. LLMs are a class of foundation models: massive, pre-trained neural networks (often with billions to trillions of parameters) that leverage the self-attention mechanism of the Transformer architecture (introduced in 2017) to predict the next token in a sequence. Trained on vast datasets (e.g., Common Crawl's 50 billion+ web pages), these models (such as GPT-4, Gemini, and Claude) acquire predictive power over syntax and semantics. They function as general-purpose sequence models, enabling critical applications such as complex content generation, language translation, and automated code completion (e.g., GitHub Copilot). Their core value: generalizing across diverse tasks with minimal task-specific fine-tuning.
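Several entries above mention tool execution, safety boundaries, and human-in-the-loop approval gates. The sketch below shows one simple way those pieces could fit together. It is an assumption-laden illustration, not the API of Proceda, mcpknife, or the Model Context Protocol: `ToolGate`, `Tool`, and the `approve` callback are hypothetical names invented here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    func: Callable[[str], str]
    needs_approval: bool = False


class ToolGate:
    """Hypothetical allowlist that mediates every tool call a model requests."""

    def __init__(self, approve: Callable[[str, str], bool]) -> None:
        self._tools: dict[str, Tool] = {}
        self._approve = approve      # stands in for a human reviewer
        self.log: list[str] = []     # audit trail of every decision

    def register(self, name: str, func: Callable[[str], str],
                 needs_approval: bool = False) -> None:
        self._tools[name] = Tool(func, needs_approval)

    def call(self, name: str, arg: str) -> str:
        tool = self._tools.get(name)
        if tool is None:
            # Unregistered tools are never executed, whatever the model asks for.
            self.log.append(f"rejected:{name}")
            return f"error: unknown tool '{name}'"
        if tool.needs_approval and not self._approve(name, arg):
            self.log.append(f"denied:{name}")
            return f"error: '{name}' was not approved"
        self.log.append(f"ran:{name}")
        return tool.func(arg)


if __name__ == "__main__":
    # In a real harness, approve() would prompt a person; here it is a simple rule.
    gate = ToolGate(approve=lambda name, arg: arg != "/")
    gate.register("echo", lambda s: s)
    gate.register("delete", lambda p: f"deleted {p}", needs_approval=True)
    print(gate.call("echo", "hi"))
    print(gate.call("delete", "/"))
    print(gate.log)
```

Errors are returned as strings rather than raised so that they can be fed back to the model as feedback on its next turn; raising exceptions would be an equally reasonable design.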