Build an AI Agent Harness for Large Codebases | Anthropic's Guide


TL;DR / Key Takeaways

Anthropic revealed that the tools around an AI matter more than the model itself for coding in large codebases. This is the playbook for building that 'harness' and making your AI agent actually effective.

The Harness Is the New Hype

Standard AI coding agents consistently fail when confronted with the sprawling complexity of real-world codebases. These systems, often touted for their prowess, falter dramatically in environments with tens or hundreds of thousands of lines of code, lacking the crucial situational awareness required to navigate intricate architectures and legacy systems. Strategies effective in simple projects quickly prove inadequate, exposing a fundamental limitation in their autonomous operation.

Anthropic recently delivered a MasterClass on this exact challenge, asserting a powerful central thesis: the harness surrounding an AI agent is more critical than the raw power of the underlying large language model (LLM) itself. This ecosystem of tooling, context, and configuration—not just benchmark scores—dictates an agent's success. It’s about curating the right environment to guide the agent, enabling it to operate effectively across multi-million line monorepos or distributed systems.

This indispensable harness now constitutes a new, essential third component of a modern codebase, aptly termed the AI Layer. It exists alongside the traditional application code and its associated tests, serving as the explicit guide for agentic systems. The AI Layer comprises elements like global rules, path-scoped skills, self-improving hooks, and a Model Context Protocol (MCP) server, all designed to provide the structured context an agent needs to perform complex tasks reliably.

Architecting Your AI Layer

Architecting an effective AI layer begins with a lean, layered rules system built from `CLAUDE.md` files. A root-level `CLAUDE.md` establishes global context: the codebase's core purpose and its overarching conventions. Subdirectory `CLAUDE.md` files then add progressively disclosed, scoped rules, giving agents the local conventions for a specific module or feature without burying them in unrelated detail. This hierarchy keeps the context an agent sees precise and manageable.
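To make the layering concrete, here is a minimal sketch of how a harness might work out which rule files apply to a given file: it walks from the repository root down to the file's directory and collects every rule file on the way. The function name `applicable_rule_files` and the example paths are hypothetical, and the file name is a parameter that defaults to `CLAUDE.md`. This illustrates the idea only; it is not how Claude Code itself is implemented.

```python
from pathlib import Path


def applicable_rule_files(repo_root: Path, target: Path, name: str = "CLAUDE.md") -> list[Path]:
    """Return the rule files that apply to `target`, ordered from the root downward."""
    repo_root = repo_root.resolve()
    target = target.resolve()

    # Start from the directory that contains the target file.
    start = target if target.is_dir() else target.parent

    # Raises ValueError if the target lies outside the repository.
    relative = start.relative_to(repo_root)

    # Build the chain of directories: root, root/a, root/a/b, ...
    directories = [repo_root]
    for part in relative.parts:
        directories.append(directories[-1] / part)

    # Keep only the directories that actually contain a rule file.
    return [d / name for d in directories if (d / name).is_file()]
```

Called with a repository root and a path such as `services/billing/invoice.py`, this returns the root rule file first and the most specific one last, so later (more local) rules can refine earlier (more global) ones.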

Beyond static rules, dynamic capabilities are crucial. Path-scoped skills equip agents with specialized tools for targeted actions within specific areas of the codebase. Complementing these is a Model Context Protocol (MCP) server that provides efficient symbol search. MCP itself is an open protocol for connecting agents to external tools and data sources; a server that exposes symbol lookup lets agents quickly locate definitions, usages, and relationships across a vast codebase, much as an engineer navigates a complex project with an IDE.
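As a rough illustration of what a symbol-search tool hands back to an agent, the sketch below builds a small definition index for Python sources using the standard `ast` module. It does not use any MCP SDK, and `SymbolHit`, `index_symbols`, and `find_definition` are hypothetical names; a real server would more likely wrap a language server or an existing code index.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True)
class SymbolHit:
    name: str
    kind: str  # "function" or "class"
    path: str
    line: int


def index_symbols(sources: dict[str, str]) -> dict[str, list[SymbolHit]]:
    """Map each function or class name to every place it is defined."""
    index: dict[str, list[SymbolHit]] = {}
    for path, code in sources.items():
        for node in ast.walk(ast.parse(code, filename=path)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                kind = "function"
            elif isinstance(node, ast.ClassDef):
                kind = "class"
            else:
                continue
            hit = SymbolHit(node.name, kind, path, node.lineno)
            index.setdefault(node.name, []).append(hit)
    return index


def find_definition(index: dict[str, list[SymbolHit]], name: str) -> list[SymbolHit]:
    """Return all known definitions of `name`, or an empty list."""
    return index.get(name, [])
```

The point of the design is that the agent asks a narrow question ("where is `Invoice` defined?") and gets back a path and line number instead of reading whole files into its context.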

Contrast this intelligent layering with a common anti-pattern: a single, massive prompt file. This approach attempts to dump all possible context into one document, often thousands of lines long. Such monolithic prompts overwhelm even the most capable LLMs, degrading performance, increasing inference costs, and making agents less effective than a human engineer. Anthropic's MasterClass emphasizes that curated, layered context, not sheer volume, dictates an agent's success in large codebases.

From Static Rules to a Living System

Beyond static `CLAUDE.md` files, an effective AI layer needs a dynamic, self-improving architecture. Self-improving hooks turn static guidelines into a living system. Specifically, `stop hooks` can review an agent's session when it ends, identify inefficiencies or recurring errors, and automatically propose updates to the project's rule files, refining the agent's future behavior.
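The review step of such a hook could look something like the sketch below. It assumes a simplified, hypothetical event format (a list of dictionaries with `type`, `command`, and `ok` keys) rather than any real transcript schema, and it only produces suggestions for a human to review instead of editing rule files directly.

```python
from collections import Counter


def propose_rule_updates(events: list[dict], min_failures: int = 2) -> list[str]:
    """Suggest rule-file notes for commands that failed repeatedly in one session."""
    # Count failed commands only; other event types are ignored.
    failures = Counter(
        event["command"]
        for event in events
        if event.get("type") == "command" and not event.get("ok", True)
    )

    suggestions = []
    for command, count in failures.most_common():
        if count >= min_failures:
            suggestions.append(
                f"- `{command}` failed {count} times this session; "
                "document the working alternative."
            )
    return suggestions
```

A repeated failure is a cheap signal that the agent lacked a convention (for example, the project's actual test command), which is exactly the kind of gap a rule file can close.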

Complementing this, `start hooks` provide crucial dynamic context. Before an agent begins a task, a `start hook` can fetch relevant documentation from Confluence based on the developer's team or the specific module being edited. This pre-populates the agent's context, ensuring it starts with the most pertinent, real-time information. Anthropic's insights into building these sophisticated agent harnesses are detailed in their guide, How Claude Code works in large codebases.
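One way to picture the selection logic in a start hook is the sketch below. It maps path prefixes to documentation page identifiers and assembles a context string before the session begins. The mapping, the page identifiers, and the `fetch` callable are all hypothetical; in practice `fetch` would wrap whatever documentation system the team uses, and no real Confluence API is shown here.

```python
from typing import Callable


def select_docs(edited_path: str, doc_map: dict[str, list[str]]) -> list[str]:
    """Return doc IDs for every prefix that matches the path, most general first."""
    # Prefixes are expected to end with "/" so "services/bill" cannot match "services/billing/".
    matches = [prefix for prefix in doc_map if edited_path.startswith(prefix)]
    doc_ids: list[str] = []
    for prefix in sorted(matches, key=len):
        for doc_id in doc_map[prefix]:
            if doc_id not in doc_ids:  # avoid loading the same page twice
                doc_ids.append(doc_id)
    return doc_ids


def build_start_context(
    edited_path: str,
    doc_map: dict[str, list[str]],
    fetch: Callable[[str], str],
    max_chars: int = 4000,
) -> str:
    """Fetch each selected page, truncate it, and join the results into one context block."""
    sections = []
    for doc_id in select_docs(edited_path, doc_map):
        body = fetch(doc_id)[:max_chars]
        sections.append(f"## {doc_id}\n{body}")
    return "\n\n".join(sections)
```

The per-page character cap is a deliberate choice: pre-loaded context should stay small enough that it helps the agent start well without crowding out the task itself.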

For complex tasks, subagents offer a powerful strategy for focused execution. Instead of overwhelming the primary coding agent with broad exploration or specialized analysis, subagents can be dispatched to handle specific, intricate problems. These specialized entities might:

- Deeply analyze legacy code architecture.
- Explore new API documentation.
- Generate comprehensive unit test suites.

This compartmentalization allows the main agent to concentrate on its core implementation, significantly boosting efficiency and accuracy in large, real-world codebases. The result is a more robust, adaptable, and performant AI coding assistant, consistently learning and optimizing its approach across diverse projects.
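The compartmentalization described above can be sketched as a small dispatch loop: each subagent works with its own task and files, and only a bounded summary flows back to the main agent. All names here (`SubagentTask`, `SubagentResult`, `dispatch`, `merge_into_main_context`) are hypothetical, and the `run` callable stands in for whatever actually executes a subagent.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SubagentTask:
    name: str
    instructions: str
    context_files: list[str]


@dataclass
class SubagentResult:
    name: str
    summary: str


def dispatch(
    tasks: list[SubagentTask],
    run: Callable[[SubagentTask], str],
    summary_limit: int = 500,
) -> list[SubagentResult]:
    """Run each task in isolation and keep only a bounded summary of its output."""
    results = []
    for task in tasks:
        raw_output = run(task)  # the subagent sees only its own task and files
        results.append(SubagentResult(task.name, raw_output[:summary_limit]))
    return results


def merge_into_main_context(main_context: list[str], results: list[SubagentResult]) -> list[str]:
    """Append labeled summaries to the main agent's context without mutating it."""
    return main_context + [f"[{r.name}] {r.summary}" for r in results]
```

Truncating to `summary_limit` is a crude stand-in for real summarization, but it shows the key property: the main agent's context grows by a few lines per subagent, not by everything the subagent read.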

Stop Prompting, Start Engineering

Stop approaching AI coding with "prompt whispering" or "vibe coding." The era of simply hoping for the best from an LLM is over. Instead, adopt a deliberate mindset of harness engineering, building robust systems for predictable, scalable results. Anthropic's recent MasterClass confirmed the critical insight: the harness around the model, the AI context and tooling within your repo, matters more than the model itself.

This engineering approach unlocks significant advantages. Projects gain greater AI autonomy and achieve more reliable code generation, moving beyond trivial tasks. Such a structured AI Layer empowers agents to effectively navigate and contribute to complex environments, including multi-million line monorepos, decades-old legacy systems, and distributed architectures spanning dozens of repositories. Internally, Anthropic engineers using Claude Code ship three times more code and merge 31% more pull requests, demonstrating tangible productivity gains.

Begin your journey into agentic engineering today. Create a simple `CLAUDE.md` file in your repository's root to establish initial global context. Then expand this foundation incrementally by adding layered rules in subdirectories and implementing self-improving stop hooks. This iterative process gradually builds out your project's bespoke AI Layer, transforming your development workflow.

Frequently Asked Questions

What is an AI agent harness?

An AI agent harness is the collection of context, tools, and configurations surrounding an AI model to help it operate effectively in a specific environment, like a large codebase. It's the ecosystem built around the model.

Why is a harness more important than the model?

In complex codebases, raw model intelligence is insufficient. The harness provides crucial, scoped context, defines rules, and offers specialized tools that guide the model, preventing it from getting lost or making critical mistakes.

What is agentic search?

It's how Claude Code explores a repository. Instead of using a pre-built index (like RAG), it uses command-line tools like `grep` to navigate the file system and understand the code's structure, much like a human developer would.
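For a feel of what index-free exploration involves, here is a minimal Python stand-in for a recursive `grep -rn`: it walks the file tree on demand and returns matching lines with their locations. The function name `grep_repo` and its parameters are hypothetical, and this is only an approximation of the idea, not the tooling Claude Code actually invokes.

```python
import os
import re


def grep_repo(
    root: str,
    pattern: str,
    extensions: tuple[str, ...] = (".py",),
    max_hits: int = 50,
) -> list[tuple[str, int, str]]:
    """Return (relative_path, line_number, line) for lines matching `pattern`."""
    regex = re.compile(pattern)
    hits: list[tuple[str, int, str]] = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip hidden directories such as .git, and walk in a stable order.
        dirnames[:] = sorted(d for d in dirnames if not d.startswith("."))
        for filename in sorted(filenames):
            if not filename.endswith(extensions):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if regex.search(line):
                        hits.append((os.path.relpath(path, root), lineno, line.rstrip()))
                        if len(hits) >= max_hits:
                            return hits  # stop early to keep results small
    return hits
```

Because nothing is indexed ahead of time, the results always reflect the files as they are right now; the cost is that each query scans the tree, which is why capping hits and filtering by extension matter.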

How do self-improving hooks work?

They are scripts that run at the start or end of an AI session. A 'stop hook', for example, can analyze the session's actions and suggest improvements to the project's rule files (`CLAUDE.md`), making the system smarter over time.
