TL;DR

A recent Google whitepaper argues that in AI-assisted software development, the AI model accounts for only about 10% of system behavior. The focus should be on harness design and context engineering, which dominate performance and costs.

A new Google whitepaper, titled The New SDLC With Vibe Coding, states that the AI model accounts for only about 10% of system behavior in AI-driven development. This challenges common assumptions and highlights the importance of harness design and context engineering in managing AI systems effectively, with significant implications for development costs and strategies.

The whitepaper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, emphasizes that the biggest shift in software engineering is moving from writing code to expressing intent and trusting machines to interpret that intent. It reports that as of early 2026, 85% of professional developers use AI coding agents regularly, with 51% using them daily and roughly 41% of all new code generated by AI.

The core insight is that the AI model itself is only a small part of the system. The paper states that the behavior of an AI agent is predominantly determined by its harness: the prompts, tools, rules, and context around it. As evidence, it cites a coding agent that moved from outside the Top 30 to the Top 5 on Terminal Bench 2.0 by changing only the harness, with the same model.

Furthermore, the whitepaper advocates for a focus on verification, judgment, and configuration rather than just model advancement. It argues that most failures are due to configuration issues, missing tools, or vague rules, which can be addressed through better harness design and context engineering.
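
To make the Agent = Model + Harness split concrete, here is a minimal Python sketch. Every name in it (Harness, Agent, stub_model) is hypothetical and not taken from the whitepaper or any SDK, and the "model" is a stub function so the example runs without an API. The point is only that two agents can share one model and still behave differently because the harness differs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types for illustration; not from the whitepaper or any SDK.
Model = Callable[[str], str]


@dataclass
class Harness:
    """Everything around the model: prompt, rules, tools, context."""
    system_prompt: str
    rules: List[str] = field(default_factory=list)
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    context: List[str] = field(default_factory=list)

    def build_prompt(self, task: str) -> str:
        # Assemble the full text the model actually sees.
        parts = [self.system_prompt]
        if self.rules:
            parts.append("Rules:\n" + "\n".join(f"- {r}" for r in self.rules))
        if self.tools:
            parts.append("Tools: " + ", ".join(sorted(self.tools)))
        if self.context:
            parts.append("Context:\n" + "\n".join(self.context))
        parts.append("Task: " + task)
        return "\n\n".join(parts)


@dataclass
class Agent:
    model: Model
    harness: Harness

    def run(self, task: str) -> str:
        return self.model(self.harness.build_prompt(task))


def stub_model(prompt: str) -> str:
    # Stand-in for a real model call: reports how much guidance it received.
    return f"received {len(prompt.splitlines())} lines of guidance"


# Same model, two harnesses.
bare = Agent(stub_model, Harness(system_prompt="You are a coding agent."))
tuned = Agent(
    stub_model,
    Harness(
        system_prompt="You are a coding agent.",
        rules=["Run the test suite before reporting success."],
        tools={"run_tests": lambda path: "ok"},
        context=["Repo uses pytest; tests live in tests/."],
    ),
)

print(bare.run("Fix the failing build"))
print(tuned.run("Fix the failing build"))
```

Keeping the harness as plain data makes it something a team can version, review, and test independently of whichever model sits behind it.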

At a glance
When: announced May 2026
The development: The new SDLC framework shifts emphasis from the AI model to the surrounding harness and context engineering as the primary drivers of system behavior and cost efficiency.
The Model Is Only 10% — The New SDLC With Vibe Coding
AI Dispatch · Field Notes
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary: the differentiator is how outputs get verified

Vibe Coding: casual prompts · “does it seem to work?” · disposable code · high risk
Structured AI-Assisted: detailed prompts + constraints · manual testing · features in real codebases
Agentic Engineering: formal specs · automated tests + evals + CI gates · production scale · low risk

Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding, however clever the prompt.
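
The "tests plus evals" rule can be sketched as a release gate. This is an illustrative assumption about how such a gate might look, not the paper's implementation: EvalResult, release_gate, the 0-to-1 score scale, and the 0.9 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EvalResult:
    name: str
    score: float  # assumed scale: 0.0 to 1.0, e.g. from a rubric or grader


def release_gate(
    tests_passed: List[bool],
    evals: List[EvalResult],
    min_eval_score: float = 0.9,  # hypothetical threshold
) -> bool:
    """Pass only if every deterministic test passes AND the mean eval score clears the bar."""
    if not tests_passed or not evals:
        # Missing either kind of verification counts as a failure, not a pass.
        return False
    if not all(tests_passed):
        return False
    mean_score = sum(e.score for e in evals) / len(evals)
    return mean_score >= min_eval_score


# Both tests and evals present and healthy.
print(release_gate([True, True], [EvalResult("tone", 0.95), EvalResult("accuracy", 0.92)]))  # True
# Tests pass but there are no evals: the gate stays closed.
print(release_gate([True, True], []))  # False
```

Treating "no evals" as a failure is the design choice that encodes the rule above: passing tests alone does not open the gate.
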

The idea worth building your strategy around

Agent = Model + Harness
Model: ~10%
Harness: ~90% (prompts · tools · context · hooks · sandboxes · observability). This is your surface area, not the provider’s.

Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness, with the same model.
“Most agent failures, examined honestly, are configuration failures”: a missing tool, a vague rule, a noisy context.

The economics: it’s a token-cost problem (CapEx vs OpEx)

Vibe Coding: low CapEx, high OpEx. Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.
Agentic Engineering: high CapEx, low OpEx. Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success, plus intelligent model routing (cheap models for the easy work).
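
The CapEx versus OpEx trade-off is simple arithmetic: a fixed upfront cost plus a per-feature cost. The sketch below finds the feature count at which the high-CapEx approach becomes cheaper. All numbers are made-up units chosen for illustration (a 5× per-feature gap, inside the paper's 3–10× range); they are not figures from the whitepaper, and the function names are hypothetical.

```python
import math


def cumulative_cost(capex: float, opex_per_feature: float, features: int) -> float:
    """Total spend after shipping `features` features."""
    return capex + opex_per_feature * features


def break_even_features(
    vibe_capex: float, vibe_opex: float, agentic_capex: float, agentic_opex: float
) -> int:
    """Smallest feature count at which the agentic approach is strictly cheaper."""
    if vibe_opex <= agentic_opex:
        raise ValueError("agentic approach never catches up with these inputs")
    # Solve agentic_capex + n * agentic_opex < vibe_capex + n * vibe_opex for n.
    n = (agentic_capex - vibe_capex) / (vibe_opex - agentic_opex)
    return max(0, math.floor(n) + 1)


# Hypothetical units: vibe coding costs nothing upfront but 5 per feature;
# agentic engineering costs 40 upfront (specs, evals, context) and 1 per feature.
VIBE_CAPEX, VIBE_OPEX = 0.0, 5.0
AGENTIC_CAPEX, AGENTIC_OPEX = 40.0, 1.0

n = break_even_features(VIBE_CAPEX, VIBE_OPEX, AGENTIC_CAPEX, AGENTIC_OPEX)
print(f"agentic is cheaper from feature {n} onward")
for features in (5, 10, n, 20):
    vibe = cumulative_cost(VIBE_CAPEX, VIBE_OPEX, features)
    agentic = cumulative_cost(AGENTIC_CAPEX, AGENTIC_OPEX, features)
    print(f"{features:>3} features: vibe={vibe:.0f} agentic={agentic:.0f}")
```
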

85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR): verification is the cost

The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.
thorstenmeyerai.com

Why Harness and Context Engineering Matter More Than the Model

This shift in focus has profound implications for AI development strategies. By understanding that the model is only 10% of the system, organizations can prioritize building robust harnesses and context management, leading to more cost-effective and reliable AI systems. It also suggests that competitive advantage lies in configuration mastery rather than chasing the latest model upgrades alone.

For engineering leaders, this redefines investment priorities, emphasizing system design, tooling, and testing over solely model procurement. It also highlights that cost efficiency in AI development depends heavily on how well the surrounding infrastructure is engineered.
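
One cost lever the paper names is intelligent model routing: cheap models for the easy work. A minimal sketch of that idea follows. The tier names, prices, task categories, and the two-file cutoff are all assumptions made for illustration, not values from the paper or from any provider's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    cost_per_1k_tokens: float  # made-up prices, arbitrary units


CHEAP = ModelTier("small-model", 0.1)
STRONG = ModelTier("large-model", 2.0)

# Hypothetical list of task types considered safe for the cheap tier.
EASY_TASKS = {"rename", "format", "docstring", "boilerplate"}


def route(task_type: str, files_touched: int) -> ModelTier:
    """Send small, mechanical tasks to the cheap tier; everything else to the strong tier."""
    if task_type in EASY_TASKS and files_touched <= 2:
        return CHEAP
    return STRONG


def estimated_cost(tier: ModelTier, tokens: int) -> float:
    return tier.cost_per_1k_tokens * tokens / 1000


for task_type, files in [("format", 1), ("rename", 10), ("refactor", 1)]:
    tier = route(task_type, files)
    print(f"{task_type} ({files} files) -> {tier.name}, "
          f"cost for 4000 tokens: {estimated_cost(tier, 4000):.2f}")
```

A real router would likely use richer signals than task type and file count, but the shape is the same: the routing rule lives in the harness, so it can be tuned without changing models.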

Background on the Evolving AI Development Paradigm

Prior to this whitepaper, the common narrative centered on acquiring larger, more powerful AI models as the key to better performance. However, recent experiments and industry reports indicate that the surrounding system, including prompts, tools, and configuration, has a greater impact on outcomes. The widespread adoption of AI coding agents, with 85% of professional developers using them regularly, makes this shift matter in practice. ‘Vibe coding’ has been criticized for its lack of discipline, and the new framework instead describes a spectrum from casual prompting to disciplined, verified engineering practices.

This development builds on earlier insights that system design, not just model size, determines AI effectiveness. It aligns with ongoing industry efforts to improve AI reliability through better tooling and configuration, rather than solely focusing on model improvements.

“The behavior of an AI agent is predominantly determined by its harness — the prompts, tools, rules, and context around it.”

— Addy Osmani

Remaining Questions About Implementation and Impact

While the whitepaper provides strong evidence that harness design dominates AI system behavior, it does not specify precise methodologies for optimizing harnesses across different domains. The long-term impact on AI development costs and the best practices for large-scale adoption are still being explored. Additionally, the relative importance of model improvements versus harness engineering in emerging AI architectures remains an open question.

Next Steps for Organizations Adopting the New SDLC Framework

Organizations should evaluate their current AI workflows, focusing on harness design, context management, and verification processes. Developing standardized practices for system configuration and testing can lead to more predictable and cost-effective AI deployment. Industry leaders are likely to invest in tooling that simplifies harness management and promotes best practices in context engineering. Future research and case studies will clarify how these principles scale across different AI applications and industries.
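
A first-pass workflow audit could be as simple as a lint over the harness configuration. The sketch below checks for the three failure types the paper names: a missing tool, a vague rule, a noisy context. The config shape, the vague-word list, and the context budget are assumptions invented for this example; real checks would need to be tuned per team.

```python
from typing import Dict, List

# Assumed heuristic: words that often signal a rule with no checkable criterion.
VAGUE_WORDS = {"appropriately", "properly", "good", "clean", "reasonable"}


def audit_harness(
    config: Dict[str, List[str]],
    required_tools: List[str],
    max_context_items: int = 20,  # hypothetical budget
) -> List[str]:
    """Return human-readable findings; an empty list means nothing was flagged."""
    findings = []
    available = set(config.get("tools", []))
    for tool in required_tools:
        if tool not in available:
            findings.append(f"missing tool: {tool}")
    for rule in config.get("rules", []):
        words = {w.strip(".,").lower() for w in rule.split()}
        if words & VAGUE_WORDS:
            findings.append(f"vague rule: {rule!r}")
    if len(config.get("context", [])) > max_context_items:
        findings.append("noisy context: too many items")
    return findings


config = {
    "tools": ["read_file"],
    "rules": ["Write clean code.", "Run pytest before committing."],
    "context": ["README.md", "CONTRIBUTING.md", "tests/README.md"],
}
for finding in audit_harness(config, required_tools=["read_file", "run_tests"]):
    print(finding)
```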

Key Questions

Why is the model only 10% of the system’s behavior?

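The whitepaper argues that the surrounding harness (prompts, tools, rules, and context) has a much larger influence on how an AI agent behaves than the model itself, which it puts at about 10% of the overall system. Its headline evidence is a coding agent that moved from outside the Top 30 to the Top 5 on Terminal Bench 2.0 with only harness changes.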

How can organizations improve AI performance according to the new framework?

By focusing on harness design, context engineering, and verification processes, organizations can significantly enhance AI reliability and efficiency without always upgrading the model.

Does this mean AI model development is less important?

Not necessarily less important, but the whitepaper emphasizes that system configuration and harness engineering have a greater impact on outcomes and costs, shifting the strategic focus.

What are the practical steps for implementing these insights?

Organizations should audit their current AI workflows, invest in tooling for harness management, and establish best practices for context and verification to optimize system behavior.

Will this approach reduce AI development costs?

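The paper argues that it can, but not for free. Agentic engineering costs more upfront (specs, evals, context) and less per feature afterward, while vibe coding looks free at first and then crosses over to 3–10× more per feature through fix-it loops, maintenance, and security remediation.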

Source: ThorstenMeyerAI.com
