The Loop Bookshelf · 4 volumes · 290 pages

Read the entire system in full.

Every methodology Loop runs is published as a book. The operating model, the frameworks, the assessments, the adoption path. We publish in full so serious buyers can evaluate how Loop thinks before any sales conversation happens.

Get the complete bundle (PDF) · 4 books · all assessments · 1 email
Vol. 01 · QE-2026 · Edition 4.2 · 84pp

AI-Native Quality Engineering

Operating model for the modern QA function.

If you already have QA people, this is what the function should become. Layered tests, named owners, leverage metrics, and the 5-bucket maturity framework that tells you where you are and where to go next.

  • 5-bucket maturity framework
  • Leverage measurement system
  • Adoption path for legacy teams
Vol. 02 · TDD-2026 · Edition 3.1 · 62pp

AI-Driven Test-Driven Development

Tests define intent. AI accelerates implementation.

If you want development to own both speed and quality, this is where to start: the AI-native TDD model, where tests define intent, AI accelerates implementation, and engineers own verification.

  • AI-assisted TDD workflow
  • Velocity & defect-escape analysis
  • Test layer architecture
Vol. 03 · AGT-2026 · Edition 2.4 · 96pp

Bespoke Agentic Pipelines

The multiplier behind every other Loop methodology.

Leverage doesn't come from a generic agent with a giant prompt. It comes from purpose-built pipelines with roles, permissions, observability, and project-specific operating rules. Teams running bespoke pipelines see up to 17× automation output.

  • Bespoke pipeline design
  • Rediscovery overhead analysis
  • Pipeline starter kit
Vol. 04 · POL-2026 · Edition 1.0 · 48pp

Policy as Code

The deterministic floor beneath agentic development.

AI rules alone top out around 60–75% accuracy on the invariants teams actually care about. Generic static analysis tops out around 40–55%. Policy as Code reliably reaches 80–90%, with drift as a first-class measurement concern.

  • The deterministic floor
  • Drift measurement framework
  • Rule authoring template

Who reads these

QA leaders evaluating the system before signing anything.

The books are how Loop earns the right to talk. They're free in full so you can decide whether the methodology lands before any sales conversation happens.

01
“I forwarded chapter 3 to my CTO. We rebuilt our QA org chart on Monday.”
— Sarah · QA Director at a Series-B fintech · ~50 engineers

Read: AI-Native Quality Engineering

02
“Read it on a flight. Walked into the AI vendor meeting the next day with a different set of questions.”
— Marcus · Head of Quality at a healthcare SaaS · 120 engineers

Read: Bespoke Agentic Pipelines

03
“We stopped writing the kind of tests this book argues against. Suite is 40% smaller. Confidence is up.”
— Priya · QA Director at an e-commerce platform · 200+ engineers

Read: AI-Driven Test-Driven Development

Names + companies anonymized at the speakers' request.

Watch · Companion videos

Talks behind the books

Subscribe on YouTube · @benfellows-dev
Set Up Policy as Code in 1 Hour (Control AI Code Fast)

Apr 28, 2026

If you want to start controlling AI-generated code today, this is the simplest way I’ve found to do it. In the previous videos, I talked about why agentic development breaks at scale and introduced the concept of policy as code as a way to fix it. In this video, I’m showing how to actually get started.

The idea is straightforward. Instead of relying only on prompts, rules, or memory to guide AI, you introduce a deterministic layer that scans your codebase and flags violations. Think of it as a much more comprehensive, fully customizable linting system that works alongside tools like Claude.

What surprised me is how easy it is to get a first version working. In this walkthrough, I show how you can go from zero to a basic policy-as-code setup in a very short amount of time. We start by generating a small set of rules, wiring up a simple scanner, and immediately running it against a real codebase. Even with a basic setup, you’ll start catching issues and inconsistencies right away.

This is not the full system I use in production. At scale, this turns into hundreds or even thousands of rules, with more advanced concepts like evidence layers, caching, and reporting. But the goal of this video is to show that you don’t need any of that to begin.

If you’re using AI to write code and you’re starting to see drift, inconsistency, or quality issues over time, this is a practical way to start putting guardrails in place. Over time, what I’ve found is that as you add more rules, the amount of drift drops significantly, and the system becomes more reliable without slowing development down.

If you haven’t watched the earlier videos in this series, I’d recommend starting with those for more context on why this approach exists and how it fits into a larger agentic workflow. If you try this yourself, I’d be interested to hear what kinds of rules you end up writing and what it catches in your codebase.
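The walkthrough itself is in the video, not on this page; as a rough illustration of the shape it describes (a small set of rules defined as data, plus a deterministic scanner that walks a codebase and flags violations), a minimal sketch in Python might look like the following. The rule names and patterns here are invented for illustration, not taken from the video.

```python
import re
from pathlib import Path

# Hypothetical starter rules: each rule names an invariant, a file glob,
# and a regex that flags a violation when it matches a line. A real
# setup would grow to hundreds of rules with richer matching than regexes.
RULES = [
    {"name": "no-print-statements", "glob": "*.py",
     "pattern": re.compile(r"^\s*print\(")},
    {"name": "no-wildcard-imports", "glob": "*.py",
     "pattern": re.compile(r"^\s*from\s+\S+\s+import\s+\*")},
]

def scan(root: str) -> list[tuple[str, int, str]]:
    """Walk the tree under `root`; return (file, line_no, rule_name) violations."""
    violations = []
    for rule in RULES:
        for path in Path(root).rglob(rule["glob"]):
            for i, line in enumerate(path.read_text().splitlines(), start=1):
                if rule["pattern"].search(line):
                    violations.append((str(path), i, rule["name"]))
    return violations
```

Wired into a pre-commit hook or an agent loop, a non-empty result from `scan(".")` becomes a hard failure the AI has to fix, which is the deterministic layer the video is about.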

Watch on YouTube →
I Tried Building with Agentic Factories. They Failed. Here’s What Worked Instead.

Apr 27, 2026

I spent time building with “agentic factories”: multi-agent pipelines that promise fully autonomous workflows. On paper, they look like the future. In practice, they broke down in ways that matter: reliability, coordination, and real-world constraints. In this video, I break down where these systems failed, why they fail structurally, and what actually worked instead in production. If you're building with AI agents, this will save you time (and probably some pain).

Watch on YouTube →
How We Use Policy as Code to Control Claude and AI Agents

Apr 24, 2026

Claude and other AI agents are incredibly good at writing code. The problem is they don’t stay consistent over time. In the first few iterations, everything looks great. Output is fast, patterns are mostly correct, and it feels like you’ve unlocked a new level of development speed. But as the codebase grows, small inconsistencies start to compound. Patterns drift, structure degrades, and eventually the system becomes harder to maintain than it was before. That’s the problem this video is about.

In this walkthrough, I break down how we use a concept called policy as code to control AI-generated code in real systems. Instead of relying only on prompts, rules files, or memory, we introduce a deterministic layer that enforces how code is allowed to be written. Every time an agent makes changes, those changes are checked against a large set of rules. If something doesn’t match the expected patterns, it fails. The agent has to fix it before moving forward.

This ends up acting like a much more comprehensive version of linting, but tailored specifically to your architecture, your patterns, and your codebase. The result is that we’re able to keep the speed benefits of AI while dramatically reducing drift and long-term degradation.

This video focuses on how the system works in practice. What kinds of rules we write, how they’re structured, and how they integrate into an agentic workflow using tools like Claude. If you’re experimenting with AI coding and running into issues with inconsistency or quality over time, this is one approach that has worked well for us.

I’ll also be doing follow-up videos on how to implement this from scratch and how it fits into larger agentic pipeline systems. If you’ve tried something similar or have different approaches to controlling AI-generated code, I’d be interested to hear about it.
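The rule system itself isn't reproduced on this page; as a hedged sketch of the loop the video describes (an agent's proposed changes checked against a policy set, with any failure blocking the change until the agent fixes it), something like the following could serve as a starting point. The policy names, patterns, and the layer names (`app.ui`) are hypothetical.

```python
import re

# Hypothetical policies, keyed by a human-readable statement of the
# invariant. Real systems would encode architecture-specific rules.
POLICIES = {
    "services must not import from the UI layer":
        re.compile(r"^\s*(from|import)\s+app\.ui\b", re.MULTILINE),
    "no TODOs left in committed code":
        re.compile(r"#\s*TODO", re.IGNORECASE),
}

def check_change(changed_files: dict[str, str]) -> list[str]:
    """Check an agent's proposed change (path -> new source); return failures."""
    failures = []
    for path, source in changed_files.items():
        for rule, pattern in POLICIES.items():
            if pattern.search(source):
                failures.append(f"{path}: {rule}")
    return failures

def gate(changed_files: dict[str, str]) -> None:
    """Deterministic gate: every failure must be fixed before the change lands."""
    failures = check_change(changed_files)
    if failures:
        raise SystemExit("policy violations:\n" + "\n".join(failures))
```

Running `gate` on every agent iteration is what makes the layer deterministic: the same change always produces the same pass/fail result, independent of the prompt or the model.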

Watch on YouTube →

Bundle

All 4 books, one PDF.

290 pages of methodology, all assessments included. Free with email. No drip funnel.

Template

90-Day QA Leverage Plan