The Loop Bookshelf · 4 volumes · 290 pages

Read the entire system.

Every methodology Loop runs is published as a book. The operating model, the frameworks, the assessments, the adoption path. We publish in full so serious buyers can evaluate how Loop thinks before any sales conversation happens.

Browse the full bookshelf
4 books · all assessments · read in browser

Vol. 01 · QE-2026 · Edition 4.2 · 84pp

AI-Native Quality Engineering

Operating model for the modern QA function.

If you already have QA people, this is what the function should become. Layered tests, named owners, leverage metrics, and the 5-bucket maturity framework that tells you where you are and where to go next.

  • 5-bucket maturity framework
  • Leverage measurement system
  • Adoption path for legacy teams

Vol. 02 · TDD-2026 · Edition 3.1 · 62pp

AI-Driven Test-Driven Development

Tests define intent. AI accelerates implementation.

If you want development to own both speed and quality, this is where to start. This is the AI-native TDD model, where tests define intent, AI accelerates implementation, and engineers own verification.

  • AI-assisted TDD workflow
  • Velocity & defect-escape analysis
  • Test layer architecture

Vol. 03 · AGT-2026 · Edition 2.4 · 96pp

Bespoke Agentic Pipelines

The multiplier behind every other Loop methodology.

Leverage doesn't come from a generic agent with a giant prompt. It comes from purpose-built pipelines with roles, permissions, observability, and project-specific operating rules. Teams running bespoke pipelines see up to 17× automation output.

  • Bespoke pipeline design
  • Rediscovery overhead analysis
  • Pipeline starter kit

Vol. 04 · POL-2026 · Edition 1.0 · 48pp

Policy as Code

The deterministic floor beneath agentic development.

AI rules alone top out around 60–75% accuracy on the invariants teams actually care about. Generic static analysis tops out around 40–55%. Policy as Code reliably reaches 80–90%, with drift as a first-class measurement concern.

  • The deterministic floor
  • Drift measurement framework
  • Rule authoring template

Who reads these

QA leaders evaluating the system before signing anything.

The books are how Loop earns the right to talk. They're free in full so you can decide whether the methodology lands before any sales conversation happens.

01
“I forwarded chapter 3 to my CTO. We rebuilt our QA org chart on Monday.”
Sarah, QA Director at a Series-B fintech · ~50 engineers

Read: AI-Native Quality Engineering

02
“Read it on a flight. Walked into the AI vendor meeting the next day with a different set of questions.”
Marcus, Head of Quality at a healthcare SaaS company · 120 engineers

Read: Bespoke Agentic Pipelines

03
“We stopped writing the kind of tests this book argues against. Suite is 40% smaller. Confidence is up.”
Priya, QA Director at an e-commerce platform · 200+ engineers

Read: AI-Driven TDD

Names + companies anonymized at the speakers' request.

Watch · Companion videos

Talks behind the books

Subscribe on YouTube · @benfellows-dev
Agentic development wasn't working for my large codebase. Then I implemented anchor tags

May 26, 2026

Agentic development works great until your codebase gets big. As the repo grows, the AI starts missing context. Grep gets unreliable. Planning looks detailed but quietly skips important files. Validation becomes harder. And after enough agentic coding, you end up with random orphaned code scattered throughout the codebase.

I spent months trying to solve this problem, and the thing that finally made a major difference was adding anchor tags throughout my codebase. In this video, I walk through what anchor tags are, why they help, and how I use them to make large-scale agentic development more reliable.

The basic idea: anchor tags are metadata inside the codebase that give AI a deterministic, queryable system for understanding where things live, how features connect, and what needs to be included during planning and validation. Instead of asking AI to “go research the codebase,” we can point it toward a manifest, have it query relevant anchor surfaces, and then use normal grep/search on top of a much better starting point.

This has helped me:

  • Improve planning accuracy in large codebases
  • Reduce orphaned and leftover legacy code
  • Validate refactors with more confidence
  • Link related code across services
  • Connect test coverage back to product surfaces
  • Give AI a better map of the repo without pretending it understands everything

I also talk through how we pair anchor tags with policy-as-code rules, why the tag system needs to stay boring, and why this only works if the metadata is enforced consistently.

This is not a perfect system, and I’m not claiming anchor tags magically solve agentic development. But for large codebases, they’ve been one of the most useful changes I’ve made. If you’re using AI coding agents on a large repo and running into context, planning, or validation issues, this is worth trying.

Topics covered:

  • Why agentic development breaks down in large codebases
  • What anchor tags are
  • How anchor tags create deterministic codebase context
  • Why AI misses things even with large context windows
  • Using manifests and custom queries for planning
  • Validating deprecated features and refactors
  • Reducing orphan code
  • Pairing anchor tags with policy-as-code
  • Mapping tests to code surfaces
  • Practical rules for keeping anchor tags useful

If you want the presentation or have questions about implementing this in your own codebase, drop a comment or reach out. Like and subscribe if you want more videos on agentic development, AI coding workflows, QA, automation, and building software with large language models.
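
As a rough illustration of the manifest-and-query idea described above, here is a minimal Python sketch. It is not the scheme shown in the video: the `@anchor:` comment syntax, the dotted surface names, and the `build_manifest` and `query` helpers are all assumptions made up for this example, and the files are in-memory strings rather than a real repo.

```python
import re
from collections import defaultdict

# Hypothetical tag syntax (an assumption, not taken from the video):
# any comment containing "@anchor:" followed by a dotted surface name.
ANCHOR_RE = re.compile(r"@anchor:\s*([A-Za-z0-9_.-]+)")


def build_manifest(files):
    """Map each anchor name to the sorted (path, line_number) pairs where it appears."""
    manifest = defaultdict(list)
    for path, text in files.items():
        for line_no, line in enumerate(text.splitlines(), start=1):
            for match in ANCHOR_RE.finditer(line):
                manifest[match.group(1)].append((path, line_no))
    return {name: sorted(locs) for name, locs in manifest.items()}


def query(manifest, surface):
    """Return every location tagged with `surface` or one of its dotted children."""
    hits = []
    for name, locs in manifest.items():
        if name == surface or name.startswith(surface + "."):
            hits.extend(locs)
    return sorted(hits)


# Invented example files standing in for a real codebase.
EXAMPLE_FILES = {
    "billing/charge.py": "# @anchor: checkout.payment\ndef charge(): ...\n",
    "web/cart.ts": "// @anchor: checkout.cart\nexport function cart() {}\n",
    "tests/test_charge.py": "# @anchor: checkout.payment\ndef test_charge(): ...\n",
}

manifest = build_manifest(EXAMPLE_FILES)
payment_hits = query(manifest, "checkout.payment")
checkout_hits = query(manifest, "checkout")
```

Because the lookup is a plain dictionary scan rather than a model's guess, the same query always returns the same files, including the test file tagged with the same surface. An agent could be handed that list as its starting point before running ordinary grep on top.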

Watch on YouTube →
Inside a Real Agentic Pipeline (Step-by-Step Breakdown)

May 4, 2026

Agentic pipelines sound great in clean demos, but what do they actually look like in production?

In this video, I break down one of the real AI development pipelines I use almost every day: how it starts from a prompt, creates its own branch and worktree, runs research, builds a plan, gets reviewed by a second agent, writes failing tests, implements until green, runs policy checks, and produces receipts at the end.

I also cover what’s worked, what’s been over-engineered, where deterministic checks matter, and why “just run more agents in parallel” is not always the right answer.

Sorry for the lower-energy video, I hadn’t eaten all day before recording this one 😅

Links:

  • Newsletter: https://tinyideas.ai/#newsletters
  • QA work at Loop: https://www.workwithloop.com/
  • LinkedIn: https://www.linkedin.com/in/ben-f-44778426/
  • X: https://x.com/FellowsBen

Watch on YouTube →
Are Agentic Pipelines Actually Worth It?

May 1, 2026

Are agentic pipelines actually worth the extra time, tokens, and complexity? My honest answer: it depends.

Agentic pipelines can improve accuracy, visibility, governance, and control, but they also add real cost. They often take longer to run, use more tokens, introduce more orchestration, and create another layer of abstraction around your development process. So the question is not “do pipelines work?” The better question is: did this pipeline earn its cost?

In this video, I walk through the framework I’m using to evaluate whether an agentic pipeline is actually worth running. That includes measuring the pipeline tax, tracking run receipts, comparing quality improvements, and using a ledger system to understand whether a pipeline is making the work better or just making it more complicated.

I also share an example of a pipeline that looked good on paper but probably wasn’t worth it in practice. That’s an important part of the lesson: not every task needs a pipeline. Sometimes a single Claude Code or Codex session, guided by a strong engineer, is enough.

The goal is to use pipelines surgically. Start simple. Measure what happens. Add complexity only when the pipeline is solving a real problem. And when a pipeline gets too large, use the data to make it smaller.

If you’re experimenting with agentic development, this video is about how to think about ROI, accuracy, governance, and cost before building complex AI workflows everywhere.
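
One possible way to make "did this pipeline earn its cost?" concrete is to compare a pipeline run's receipt against a single-session run of the same task. The Python sketch below is only an illustration under stated assumptions: the video does not define "pipeline tax" numerically, so the `RunReceipt` fields, the ratio-based definition, the `max_token_ratio` threshold, and every number in the example are invented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunReceipt:
    """One ledger entry. The field names are hypothetical, not from the video."""
    task: str
    mode: str                 # "pipeline" or "single_session"
    minutes: float
    tokens: int
    defects_found_later: int


def pipeline_tax(pipeline, baseline):
    """Express the pipeline's extra cost as ratios over the baseline run."""
    return {
        "time_ratio": pipeline.minutes / baseline.minutes,
        "token_ratio": pipeline.tokens / baseline.tokens,
        "defects_avoided": baseline.defects_found_later - pipeline.defects_found_later,
    }


def earned_its_cost(tax, max_token_ratio=3.0):
    """Assumed rule of thumb: fewer defects, within a token budget."""
    return tax["defects_avoided"] > 0 and tax["token_ratio"] <= max_token_ratio


# Invented numbers, for illustration only.
baseline = RunReceipt("refactor-auth", "single_session", 20.0, 100_000, 2)
pipeline = RunReceipt("refactor-auth", "pipeline", 50.0, 250_000, 0)

tax = pipeline_tax(pipeline, baseline)
verdict = earned_its_cost(tax)
```

With these made-up receipts the pipeline costs 2.5 times the time and tokens and avoids two later defects, so the assumed rule counts it as worth running. A run that cost more and avoided nothing would fail the same check, which is the kind of evidence that could justify shrinking a pipeline.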

Watch on YouTube →

Bundle

All 4 books, free to read.

290 pages of methodology, all assessments included. Read every word in the browser. No drip funnel.

Template

90-Day QA Leverage Plan

Coming soon