The Model Is Only 10%: The Real Lesson of the New SDLC


TL;DR

A recent Google whitepaper argues that in AI-driven software development, the model itself accounts for only about 10% of system behavior. The harness, verification, and context engineering dominate system performance and cost, so that is where the focus belongs.

A new Google whitepaper, titled ‘The New SDLC With Vibe Coding,’ states that the model accounts for only about 10% of system behavior in AI-driven software development. This shifts the conventional focus from improving models to improving harnesses, verification, and context engineering, which the paper credits with most of a system’s effectiveness.

The whitepaper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, argues that the most significant shift in software engineering is the move from writing code to expressing intent and trusting machines to interpret it. It reports that as of early 2026, approximately 85% of professional developers use AI coding agents regularly, 51% use them daily, and about 41% of new code is AI-generated. The core insight, however, is that the model’s influence is small compared to the configuration and scaffolding around it.

The paper emphasizes that the ‘harness’ (prompts, tools, rules, and observability) accounts for roughly 90% of system behavior. It supports this with experiments in which changing only the harness or context improved performance significantly while the underlying model stayed the same. For example, a coding agent moved from outside the top 30 to the top 5 on a benchmark by changing only its harness.

The authors also introduce the concept of ‘context engineering,’ which involves carefully managing instructions, knowledge, memory, examples, tools, and guardrails. They highlight that the choice between static and dynamic context loading has a large impact on cost and scalability. The paper advocates ‘agent skills,’ packaged procedural knowledge loaded on demand, to improve efficiency and adaptability.

Economically, the whitepaper warns that vibe coding (quick prompts and minimal review) appears cheap but incurs high ongoing costs from token consumption, maintenance, and security vulnerabilities. In contrast, disciplined ‘agentic engineering’ requires upfront investment in schemas, testing, and context structuring, which lowers costs over time, especially at scale.
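The difference between static and dynamic context loading described above can be sketched in a few lines of Python. This is an illustration only, not the whitepaper’s implementation: the Skill structure, the keyword matching, and the word-count stand-in for a tokenizer are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """Packaged procedural knowledge (hypothetical structure for illustration)."""
    name: str
    keywords: tuple[str, ...]
    body: str


def rough_tokens(text: str) -> int:
    # Crude stand-in for a tokenizer: one token per whitespace-separated word.
    return len(text.split())


def static_context(base: str, skills: list[Skill]) -> str:
    # Static loading: every skill is sent with every request.
    return "\n\n".join([base] + [s.body for s in skills])


def dynamic_context(base: str, skills: list[Skill], task: str) -> str:
    # Dynamic loading: include only skills whose keywords appear in the task.
    words = set(task.lower().split())
    chosen = [s.body for s in skills if words & set(s.keywords)]
    return "\n\n".join([base] + chosen)


BASE = "You are a coding agent. Follow the repository rules."
SKILLS = [
    Skill("migrations", ("migration", "schema"), "Migration steps: " + "detail " * 200),
    Skill("release", ("release", "changelog"), "Release steps: " + "detail " * 200),
    Skill("testing", ("test", "pytest"), "Testing steps: " + "detail " * 200),
]

task = "add a pytest test for the login handler"
static_size = rough_tokens(static_context(BASE, SKILLS))
dynamic_size = rough_tokens(dynamic_context(BASE, SKILLS, task))
print(f"static: {static_size} tokens, dynamic: {dynamic_size} tokens")
```

In this toy setup only the testing skill is loaded for the task, so the dynamic prompt is roughly a third the size of the static one. A real harness would likely select skills with something more robust than keyword overlap.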
At a glance
Report · When: published early 2026
The development: Google’s new whitepaper argues that the core of effective AI software development is not the model but the surrounding harness and verification processes.
The Model Is Only 10% — The New SDLC With Vibe Coding
AI Dispatch · Field Notes
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary: the differentiator is how outputs get verified
Vibe Coding: casual prompts · “does it seem to work?” · disposable code · high risk
Structured AI-Assisted: detailed prompts + constraints · manual testing · features in real codebases
Agentic Engineering: formal specs · automated tests + evals + CI gates · production scale · low risk
Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding, however clever the prompt.
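The split between tests and evals can be sketched as a single gate that requires both. The example below is hypothetical: slugify, the eval cases, the 0.8 pass-rate threshold, and the echo_model stub are placeholders chosen for illustration, and a real eval would usually score output with a richer rubric than a required term.

```python
from typing import Callable

Generator = Callable[[str], str]


def slugify(title: str) -> str:
    """Deterministic function under test: lowercase words joined by hyphens."""
    return "-".join(title.lower().split())


def run_tests() -> bool:
    # Tests: exact assertions, valid because slugify is deterministic.
    return (
        slugify("The New SDLC") == "the-new-sdlc"
        and slugify("  Agent   Skills ") == "agent-skills"
    )


# Hypothetical eval cases: a prompt plus a term an acceptable answer must mention.
EVAL_CASES = [
    ("Explain what a harness is", "harness"),
    ("Explain context engineering", "context"),
    ("Explain why evals matter", "eval"),
]


def run_evals(generate: Generator, threshold: float = 0.8) -> bool:
    # Evals: score free-form output against a rubric and require a minimum pass rate.
    passed = sum(term in generate(prompt).lower() for prompt, term in EVAL_CASES)
    return passed / len(EVAL_CASES) >= threshold


def ci_gate(generate: Generator) -> bool:
    # The gate opens only when both kinds of verification pass.
    return run_tests() and run_evals(generate)


def echo_model(prompt: str) -> str:
    # Stand-in for a real model call; it simply restates the prompt.
    return f"Here is an answer about: {prompt}"


print(ci_gate(echo_model))
```

The point of the design is that neither check can substitute for the other: exact assertions cannot judge free-form output, and a pass-rate threshold is too loose for deterministic code.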
The idea worth building your strategy around
Agent = Model + Harness
Model: ~10%
Harness: ~90% (prompts · tools · context · hooks · sandboxes · observability). This is your surface area, not the provider’s.
Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness, with the same model.
“Most agent failures, examined honestly, are configuration failures”: a missing tool, a vague rule, a noisy context.
The economics: it’s a token-cost problem (CapEx vs OpEx)
Vibe Coding: low CapEx · high OpEx
Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.
Agentic Engineering: high CapEx · low OpEx
Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success + intelligent model routing (cheap models for the easy work).
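The model-routing lever can be illustrated with a minimal sketch. Everything here is an assumption made for the example: the tier names, the prices, the keyword hints, and the 40-word cutoff are invented, and a production router would more plausibly use a classifier or past success rates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # hypothetical price, for illustration only


CHEAP = ModelTier("small-model", 0.1)
STRONG = ModelTier("large-model", 2.0)

# Hypothetical signals that a task needs the stronger model.
HARD_HINTS = ("refactor", "architecture", "concurrency", "security")


def route(task: str) -> ModelTier:
    # Send long or risky-looking tasks to the strong tier, everything else to the cheap one.
    text = task.lower()
    if len(text.split()) > 40 or any(hint in text for hint in HARD_HINTS):
        return STRONG
    return CHEAP


for task in ("rename a local variable", "refactor the auth module for security"):
    print(f"{task!r} -> {route(task).name}")
```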
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR): verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.
thorstenmeyerai.com

Why the Focus on Harnesses and Verification Matters

This shift in understanding redefines how organizations should invest in AI development. Instead of chasing ever-better models, the emphasis should be on building robust harnesses, verification processes, and effective context management. This approach can lead to better system reliability, lower costs, and faster iteration cycles. For decision-makers, recognizing that the model is only 10% of the equation means reallocating resources toward configuration, testing, and security, which are critical for scalable and secure AI deployment.

Background of the AI Development Paradigm Shift

Historically, efforts to improve AI performance focused on developing larger, more capable models. However, recent developments, including the rise of AI coding agents, suggest that the surrounding infrastructure (prompts, tools, and verification) has a greater impact on system behavior. The whitepaper builds on early 2026 data indicating widespread adoption of AI in coding workflows, with approximately 85% of developers using AI coding agents regularly. The authors argue that the industry is moving toward a new software development lifecycle (SDLC) centered on expressing intent and trusting configuration rather than on raw model improvements. This reflects a broader trend toward modular, configurable AI systems that prioritize verification and context management over model size.

“The model is only 10% of what determines behavior; the harness is 90%. The behavior you experience is dominated by scaffolding you can build, own, and improve.”

— Addy Osmani

Unresolved Questions About Implementation and Costs

It remains unclear how organizations will transition from vibe coding to disciplined agentic engineering at scale, and what the precise cost-benefit trade-offs will be for different industries and use cases. The long-term impact on security and maintenance costs also requires further observation as adoption accelerates.

Next Steps for AI Development and Adoption Strategies

Organizations should focus on developing and testing robust harnesses, verification frameworks, and context management practices. Future research will likely explore standardized tools for context engineering and metrics for measuring configuration effectiveness. Industry adoption of these principles could reshape AI development workflows, emphasizing configuration over model size, with ongoing evaluation of cost savings and security improvements.

Key Questions

Why is the model only 10% of system behavior?

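According to the whitepaper, the surrounding harness, verification, and context management shape most of the system’s performance and reliability, leaving the model itself a small share. Its evidence includes a coding agent that moved from outside the top 30 to the top 5 on a benchmark by changing only its harness while keeping the same model.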

What is meant by ‘harness’ in AI systems?

The harness includes prompts, tools, rules, observability, and configuration settings that control how an AI model behaves in a specific application.
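A minimal sketch can make the “Agent = Model + Harness” idea concrete. The Harness and Agent classes, the “CALL tool arg” convention, and the stub model below are hypothetical and exist only for this example; they show how the same model can behave differently depending on what the harness provides.

```python
from dataclasses import dataclass, field
from typing import Callable

Model = Callable[[str], str]


@dataclass
class Harness:
    system_prompt: str
    rules: list[str] = field(default_factory=list)
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)
    log: list[str] = field(default_factory=list)  # minimal observability

    def build_prompt(self, task: str) -> str:
        tool_names = ", ".join(sorted(self.tools)) or "none"
        return "\n".join([self.system_prompt, *self.rules, f"Tools: {tool_names}", f"Task: {task}"])


@dataclass
class Agent:
    model: Model
    harness: Harness

    def run(self, task: str) -> str:
        prompt = self.harness.build_prompt(task)
        self.harness.log.append(prompt)
        reply = self.model(prompt)
        parts = reply.split(" ", 2)
        # Hypothetical convention: the model requests a tool as "CALL <tool> <arg>".
        if len(parts) == 3 and parts[0] == "CALL":
            _, tool, arg = parts
            if tool in self.harness.tools:
                return self.harness.tools[tool](arg)
            return f"error: tool '{tool}' is not configured"
        return reply


def stub_model(prompt: str) -> str:
    # Stand-in for a real model: it always asks to read a file.
    return "CALL read_file README.md"


bare = Agent(stub_model, Harness("You are a coding agent."))
equipped = Agent(
    stub_model,
    Harness("You are a coding agent.", tools={"read_file": lambda path: f"<contents of {path}>"}),
)
print(bare.run("summarize the README"))
print(equipped.run("summarize the README"))
```

The model is identical in both agents. Only the second harness supplies the tool the model asks for, so the first run fails for a configuration reason rather than a model reason.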

How does this shift affect AI development costs?

While vibe coding appears cheaper upfront, it incurs higher ongoing costs due to token usage, maintenance, and security risks. Disciplined engineering requires higher initial investment but reduces long-term expenses.
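The trade-off can be made concrete with simple break-even arithmetic. The figures below are invented for illustration and do not come from the whitepaper; the only assumption is that total cost is upfront investment plus a constant cost per feature.

```python
import math


def total_cost(capex: float, opex_per_feature: float, features: int) -> float:
    # Upfront investment plus a constant running cost per shipped feature.
    return capex + opex_per_feature * features


def break_even_features(
    capex_low: float, opex_high: float, capex_high: float, opex_low: float
) -> int:
    """Smallest feature count at which the high-upfront approach is strictly cheaper."""
    if opex_high <= opex_low:
        raise ValueError("the high-upfront approach never catches up")
    crossover = (capex_high - capex_low) / (opex_high - opex_low)
    return max(0, math.floor(crossover) + 1)


# Hypothetical numbers: vibe coding costs nothing upfront but 5x more per feature.
VIBE = {"capex": 0, "opex": 500}
AGENTIC = {"capex": 20_000, "opex": 100}

n = break_even_features(VIBE["capex"], VIBE["opex"], AGENTIC["capex"], AGENTIC["opex"])
print(f"agentic engineering is cheaper from feature {n} onward")
print(total_cost(VIBE["capex"], VIBE["opex"], n), total_cost(AGENTIC["capex"], AGENTIC["opex"], n))
```

With these made-up inputs the two approaches cost the same at 50 features, and the disciplined approach is cheaper from the 51st feature onward. The break-even point moves earlier as the per-feature gap widens.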

What skills should developers focus on?

Developers should prioritize skills in configuration, verification, context engineering, and designing effective harnesses to optimize AI system performance.

Will this change how AI is integrated into products?

Yes, the emphasis on configuration and verification suggests a move toward more modular, controllable AI systems that are easier to maintain, secure, and scale.

Source: ThorstenMeyerAI.com
