30 May 2026

Securing the Agentic Development Lifecycle

Amir Kavousian

Rami McCarthy

Appsec

Threat Modeling

This article is co-authored with Rami McCarthy, Principal Security Researcher and prolific author of security blogs (ramimac.me). You can also find him on LinkedIn (linkedin.com/in/ramimac/).

AI coding agents are great at generating code and executing what is in front of them, but they lack a persistent understanding (an “organizational memory”) of your architecture, your threat landscape, and the decisions your team has already made on risk.

In application security, we have a name for that organizational memory. It’s called a threat model. As the agentic development lifecycle becomes the de facto way to ship software, the threat model will become the most important artifact in the entire security program.

Most AppSec tooling was architected for the DevOps era. The assumptions it's built on (human-written code, predictable release cycles, scan-then-fix workflows) are breaking down faster than vendors can adapt. This post lays out a thesis for what the AppSec stack looks like in the era of the Agentic Development Lifecycle (ADLC), where code generation is autonomous, development velocity is 10–100x what it was, and security must fundamentally rethink when, where, and how it operates.

A Brief History of AppSec in the SDLC Era

For the past two decades, application security followed a predictable arc. In the early 2000s, security was an afterthought, most often done as penetration testing at the end of the cycle, if at all. The mid-2010s brought the “shift left” movement: integrate SAST, SCA, and secrets scanning into CI/CD pipelines. Give developers the findings, and hope (and “pray”) they fix issues before production.

On paper, it worked. In practice, teams didn’t actually shift left. Instead, they “took all the existing work, handed it to developers, and said, ‘this is your problem now.’” This resulted in alert fatigue, friction between security and engineering, and an ever-growing backlog of findings that nobody had time to fix. CISA research even demonstrated that the foundational claim behind shift-left (that fixing vulnerabilities early is cheaper) was never empirically validated; instead, it “spread like a fairy tale.”

The “Shift-Left” stack (SAST + SCA + DAST + secrets scanning, orchestrated through ASPM) served a specific era: one where humans wrote code at human speed, and security tools had time to catch up. That era is over.

In the SDLC era, threat modeling was largely optional because human developers carried architectural context in their heads. They knew which services were sensitive, asked questions when requirements were ambiguous, and applied institutional judgment that no document could fully capture. In the ADLC era, agents don't carry context between sessions, don't know your architecture unless told, and don't ask clarifying questions about trust boundaries. As a result, threat modeling transforms from a best practice into critical infrastructure, not because threat modeling changed, but because everything around it did.

Enter the ADLC: What Is Actually Changing

The Agentic Development Lifecycle is a structural transformation in how software is produced. As with any transformation, there are levels to how much (and how fast) teams are embracing AI (see ref. 1):

AI-assisted coding (humans write and review, AI enhances),
vibe coding (developers describe intent, AI generates), and
agentic coding (AI agents plan, execute, and iterate autonomously with limited human direction).

Most organizations are now operating across all three modes simultaneously. And security has to account for each.

IDC research from March 2026 shows developers now attribute 41% of their application code to AI generation on average, and report accepting nearly 40% of AI-generated code without revision. Vibe coding is becoming the default development paradigm for a growing share of the industry, and that is exacerbating a serious scaling problem for security teams. When most companies have roughly 300 developers per security engineer, and AI triples code output per developer, scan-first security becomes a losing proposition.

What It Takes to Secure the ADLC

Securing the Agentic Development Lifecycle requires rethinking security across three fundamental dimensions: when security happens, who (or what) enforces it, and what we are actually protecting against.

(A) When: Security must move to where humans still have control.

In the ADLC, humans control two things: the design decisions that shape what gets built, and the governance policies that constrain how agents operate. Everything in between (code generation, testing, iteration) is increasingly autonomous. Security that only operates at the code layer is operating on the output of a process it can’t influence.

From an organizational standpoint, this leads to a fundamental shift. Security has always operated at three levels:

program (organizational risk appetite, compliance posture, security culture),
system (architecture decisions, trust boundaries, data classification), and
tactical (individual code changes, vulnerability fixes, configuration tweaks).

Application security tools have spent two decades siloed into the tactical level: scanners, linters, SCA, DAST. But AI agents are about to commoditize that entire layer. When frontier models can self-scan and self-fix, tactical tooling hits a ceiling. The organizations that pull ahead will be the ones that invested in what tactical tools cannot reach: program-level and system-level security context. This has a compounding effect: when you invest in program-level and system-level security, every tactical decision gets better. Without that investment, each agent session starts from zero.

(B) Who: The agents can be the enforcer, and the new attack vector.

Agents can be powerful security enforcers. Ramp's autonomous patching pipeline and Slack's multi-agent investigation architecture prove that. But they are also supply chain components that your organization depends on implicitly. The agents generating your code are subject to prompt injection, context poisoning, and training data manipulation, and most organizations have no visibility into those risks. Your threat model is the artifact that should be cataloging these agents, their permissions, and their attack surfaces.

(C) What: Go beyond code-level vulnerabilities.

Traditional AppSec protects against code-level flaws: injection, XSS, broken authentication. The ADLC introduces a different class of risk entirely: business logic flaws that emerge in the drift between imprecise specification and agentic implementation. This matters because the agents that implement the code need org-level context to avoid logical mistakes.

For example, an agent may turn a refund flow specified as “process eligible returns” into an implementation that defines “eligible” in ways the product owner never sanctioned. Or, an agent may fail to deactivate an authentication provider even after registration was removed from the frontend, because it does not have organizational context. These are not code-level bugs that a SAST tool would catch; they are architectural decisions that need to be threat-modeled. They are semantic gaps between what the business meant and what the agent built, and they only surface as behavioral failures in production.

A Layered Approach to ADLC Security

Securing the ADLC requires a four-layer stack: it starts by formalizing the threat architecture, is deployed via agent governance, is enforced via generation-time guardrails, and is finally validated via adversarial testing. We’ll go through each layer in this section.

Securing the agentic development lifecycle requires a new four-layer AppSec stack anchored by a living threat model.

Layer 1: Threat Architecture

Threat architecture is the living context engine that serves your system's architectural decisions and risk context to every downstream control. Without it, governance is reduced to generic policy, guardrails catch generic bugs, and validation tests a generic checklist. With it, each layer becomes precise, architecture-aware, and compounding in value.

Once established, the threat model becomes the organizational memory: a living store of your architecture’s trust boundaries, your team’s risk decisions, and the security patterns your system requires. Every other layer in this stack should consume it:

guardrails enforce the boundaries that the threat model defines.
generation-time controls embed the patterns it requires.
adversarial tests validate the assumptions it made.

A threat model that acts as a living context engine rather than a compliance checkbox makes the entire stack smarter with compounding returns at every iteration.

Threat architecture is the living context engine that serves your system's context to each layer to make it precise, architecture-aware, and compounding in value.

Layer 2: Agent Governance

Organizations need deterministic guardrails around non-deterministic systems. This means input sanitization (redacting secrets from prompts), processing-layer checks (blocking hardcoded credentials during generation), and output validation (verifying generated code against security policies before commit).
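As a minimal sketch of the input-sanitization step, the Python below masks a few recognizable secret shapes before a prompt is sent to a model. The pattern list and the `redact_prompt` name are assumptions made for illustration, not the API of any particular product; a real deployment would more likely rely on a maintained secret-detection ruleset than on three regular expressions.

```python
import re

# Hypothetical, deliberately short pattern list for illustration only.
SECRET_PATTERNS = [
    ("aws_access_key_id", re.compile(r"\bAKIA[0-9A-Z]{16}\b")),
    (
        "private_key_block",
        re.compile(
            r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
            re.DOTALL,
        ),
    ),
    (
        "assigned_secret",
        re.compile(r"(?i)\b(password|secret|token|api_key)\s*[=:]\s*\S+"),
    ),
]


def redact_prompt(prompt: str) -> tuple[str, list[str]]:
    """Return the prompt with secrets masked, plus the names of rules that fired."""
    fired = []
    for name, pattern in SECRET_PATTERNS:
        prompt, count = pattern.subn(f"[REDACTED:{name}]", prompt)
        if count:
            fired.append(name)
    return prompt, fired


if __name__ == "__main__":
    clean, fired = redact_prompt("deploy with password=hunter2 please")
    print(clean)
    print(fired)
```

Because the check is a plain function over text, the same patterns could also be run against generated code before commit, which is the output-validation half of the same idea.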

The agent governance layer controls how AI coding tools are used across the environment: which assistants, agents, models, extensions, and MCP-connected services are sanctioned, and what policies govern their use. The goal is to make happy-path deployments easy in order to curb shadow deployments.

Elevating security beyond a checkbox activity and into an enabler is only possible when governance uses the threat model to define what “secure and auditable” means for your specific application. For example, instead of the generic governance statement “don’t use unapproved tools,” a design-informed governance policy says “agents working on the payment service can only access the payments schema and must route all external API calls through the approved gateway.”
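The payment-service policy above can be sketched as a small, deterministic check. Everything named here (`AgentPolicy`, `AgentAction`, `evaluate`, and the `gateway.internal.example` host) is hypothetical and exists only to show the shape of a design-informed rule; it is not a description of how any specific governance tool works.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class AgentPolicy:
    """Scope for agents working on one service (illustrative fields)."""
    service: str
    allowed_schemas: frozenset[str]
    approved_gateway_host: str


@dataclass(frozen=True)
class AgentAction:
    kind: str    # "db_access" or "external_call"
    target: str  # schema name for db_access, URL for external_call


def evaluate(policy: AgentPolicy, action: AgentAction) -> tuple[bool, str]:
    """Return (allowed, reason). Unknown action kinds are denied by default."""
    if action.kind == "db_access":
        if action.target in policy.allowed_schemas:
            return True, "schema is in scope"
        return False, f"schema '{action.target}' is outside the {policy.service} scope"
    if action.kind == "external_call":
        host = urlparse(action.target).hostname
        if host == policy.approved_gateway_host:
            return True, "call is routed through the approved gateway"
        return False, f"external call to '{host}' bypasses the approved gateway"
    return False, f"unknown action kind '{action.kind}' is denied by default"


# Assumed values; in practice these would be derived from the threat model.
PAYMENTS_POLICY = AgentPolicy(
    service="payment-service",
    allowed_schemas=frozenset({"payments"}),
    approved_gateway_host="gateway.internal.example",
)

if __name__ == "__main__":
    print(evaluate(PAYMENTS_POLICY, AgentAction("db_access", "users")))
```

Denying unknown action kinds by default keeps the check deterministic even when the agent attempts something the policy author did not anticipate.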

Layer 3: Generation-Time Guardrails

This layer operates through rules files, prompt controls, hooks, skills, harnesses, and MCP-connected security services that steer agents toward safer outputs by embedding secure coding rules, organizational policies, and architectural context right at the time when code is produced.

Claude Code, Codex, and Cursor already embed pattern-based security analysis in real time during code generation. What remains is the control plane that orchestrates, prioritizes, and enforces org-level policies across the pipeline. The goal is to shift the focus from “here are 20,000 vulnerabilities to triage, prioritize, and remediate” to “these 100 code changes will reduce your AppSec risk by 98%.” This is decidedly outcome-oriented, not alert-count-oriented.

Threat modeling can amplify the impact of generation-time guardrails by focusing on the authorization gaps, trust boundary violations, and data flow risks that matter for your application.
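One way to picture this is a small selector that picks the threat-model-derived rules relevant to the files an agent is about to touch and renders them as context for the task. The rule list, path layout, and function names below are assumptions for illustration; real rules files and hooks differ by tool.

```python
from fnmatch import fnmatchcase

# Hypothetical rules derived from a threat model: (path pattern, instruction).
THREAT_MODEL_RULES = [
    ("services/payments/*", "Route every external API call through the approved gateway."),
    ("services/payments/*", "Do not write cardholder data to logs."),
    ("services/*/auth/*", "Enforce an authorization check on every new endpoint."),
]


def rules_for(paths: list[str]) -> list[str]:
    """Return the rules whose pattern matches any touched path, in rule order."""
    selected = []
    for pattern, rule in THREAT_MODEL_RULES:
        if any(fnmatchcase(p, pattern) for p in paths) and rule not in selected:
            selected.append(rule)
    return selected


def render_context(paths: list[str]) -> str:
    """Format the matching rules as a block to prepend to the agent's task."""
    rules = rules_for(paths)
    if not rules:
        return ""
    return "Security requirements for this change:\n" + "\n".join(
        f"- {rule}" for rule in rules
    )


if __name__ == "__main__":
    print(render_context(["services/payments/refunds/handler.py"]))
```

The point of the sketch is the selection step: an agent editing the payments service sees the payments rules, and an agent editing documentation sees none, so the guardrail stays specific to the application rather than generic.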

Layer 4: Adversarial Validation

The non-deterministic nature of the new stack necessitates an active validation plan, including penetration testing, red teaming, and agent behavior analysis, that is driven by the threat model.

Threat-model-informed pen testing not only improves the quality of the output, it can also create a feedback loop that continuously improves the ADLC: adversarial validation tests the assumptions made at the design phase, and findings flow back to update the threat model for the next iteration. Design informs testing, while testing refines design. The stack becomes a loop, with compounding returns.
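The loop described above can be sketched as a function that folds test findings back into the assumptions recorded at design time. The `Assumption` and `Finding` types and their status values are hypothetical, chosen only to make the feedback step concrete.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Assumption:
    id: str
    statement: str
    status: str = "untested"  # "untested", "validated", or "invalidated"
    evidence: list[str] = field(default_factory=list)


@dataclass(frozen=True)
class Finding:
    assumption_id: Optional[str]  # None when the test hit something unmodeled
    held: bool                    # True if the assumption survived the test
    note: str


def apply_findings(assumptions: dict[str, Assumption], findings: list[Finding]):
    """Fold findings into the model.

    Returns (invalidated assumptions, findings with no matching assumption).
    """
    unmodeled = []
    for finding in findings:
        assumption = assumptions.get(finding.assumption_id)
        if assumption is None:
            # Nothing in the model covers this; it is a candidate new threat.
            unmodeled.append(finding)
            continue
        assumption.evidence.append(finding.note)
        if not finding.held:
            assumption.status = "invalidated"
        elif assumption.status == "untested":
            assumption.status = "validated"
    invalidated = [a for a in assumptions.values() if a.status == "invalidated"]
    return invalidated, unmodeled
```

Invalidated assumptions point at design decisions to revisit, while unmodeled findings point at gaps in the threat model itself; both feed the next iteration.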

Organizational Memory Is the Gap and the Bottleneck

Every layer in this stack matters, but design-phase security has the highest leverage and the least tooling. In the hyper-fast and non-deterministic environment of the ADLC, context is king. And in AppSec, the threat model has always been the gold standard for security context.

In practice, this remains the most manual, inconsistent, and under-tooled phase of the entire stack. We have maturing tooling for point-of-generation controls (code scanners), growing investment in governance (risk and compliance management tools), and emerging solutions for validation (bug bounty, pen-testing platforms). Design-phase security remains largely manual: whiteboards, spreadsheets, and inconsistent threat modeling exercises that happen once and are never updated. In a world where code is generated in minutes, manual design-phase security is the bottleneck.

Why the Threat Model Matters in the Age of AI

The key insight is that treating threat models as documents is a category error. A threat model is a structured knowledge graph of the application and its functional and security requirements. It is the living input that drives the context downstream, including policies, point-of-generation guardrails, and adversarial testing scope. When the threat model acts as organizational memory, it becomes the security context fabric of the ADLC, making the entire stack smarter.

When you treat the threat model as organizational memory that acts as the security context fabric of the ADLC, the entire stack gets smarter.

The transition from a document to a model (structured knowledge graph) puts the threat model at a uniquely impactful position in the AI-driven development era. Specifically, it provides AI tools with AppSec context that they cannot simply glean from the codebase or the product specs. It formalizes secure design patterns and architectural building blocks that AI tools are expected to use during implementation.
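To make the document-versus-model distinction concrete, here is a minimal sketch of a threat model held as a graph of components and data flows that can be queried for trust-boundary crossings. The class names, trust zones, and example services are all assumptions for illustration; a real model would carry far more detail (threat scenarios, controls, ownership).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    name: str
    trust_zone: str  # illustrative zones: "internet", "internal", "pci"


@dataclass(frozen=True)
class DataFlow:
    source: str
    destination: str
    data_class: str  # illustrative classes: "public", "pii", "cardholder"


@dataclass
class ThreatModel:
    components: dict[str, Component] = field(default_factory=dict)
    flows: list[DataFlow] = field(default_factory=list)
    risk_decisions: dict[str, list[str]] = field(default_factory=dict)

    def add_component(self, name: str, trust_zone: str) -> None:
        self.components[name] = Component(name, trust_zone)

    def add_flow(self, source: str, destination: str, data_class: str) -> None:
        # Reject flows that reference components the model does not know about.
        for endpoint in (source, destination):
            if endpoint not in self.components:
                raise KeyError(f"unknown component: {endpoint}")
        self.flows.append(DataFlow(source, destination, data_class))

    def boundary_crossings(self) -> list[DataFlow]:
        """Flows whose endpoints sit in different trust zones."""
        return [
            f for f in self.flows
            if self.components[f.source].trust_zone
            != self.components[f.destination].trust_zone
        ]

    def context_for(self, name: str) -> dict:
        """Security context an agent would need before changing `name`."""
        return {
            "trust_zone": self.components[name].trust_zone,
            "boundary_crossings": [
                f for f in self.boundary_crossings()
                if name in (f.source, f.destination)
            ],
            "risk_decisions": self.risk_decisions.get(name, []),
        }


def build_example_model() -> ThreatModel:
    """A made-up four-service system used only to exercise the queries."""
    model = ThreatModel()
    model.add_component("web", "internet")
    model.add_component("checkout", "internal")
    model.add_component("orders", "internal")
    model.add_component("payments", "pci")
    model.add_flow("web", "checkout", "pii")
    model.add_flow("checkout", "orders", "pii")
    model.add_flow("checkout", "payments", "cardholder")
    model.risk_decisions["payments"] = ["Cardholder data is never logged."]
    return model
```

Calling `build_example_model().context_for("payments")` returns the component's trust zone, the flows that cross into it, and the recorded risk decisions. That is the kind of context a static document cannot answer on demand but a structured model can.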

The New Landscape: What Emerges, What Gets Absorbed, What Gets Vibe-Coded

The security market follows a predictable pattern: point solutions emerge, then consolidate into platforms. But AI introduces a challenge: the underlying infrastructure is still shifting, forcing founders to place bets on models and architectures that may not exist in 18 months. It feels like “by the time the analyst PDF is published, the category has already shifted.”

Here’s how we see the tool landscape shaking out:

Absorbed into Frontier Models and IDEs

Basic pattern-matching SAST, common vulnerability detection, secrets scanning, and simple SCA checks are being absorbed directly into AI coding assistants. The exception is passive scanning for zero-days and legacy codebases that aren't being actively touched by AI, which still need standalone coverage. But for the growing share of code that is being generated agentically, the standalone scanner's value proposition is eroding fast.

Anthropic’s Claude Code security scanner and OpenAI’s Codex already reason contextually through code, in ways that surpass traditional pattern-matching scanners. The counterpoint of “separation of code generation from validation” is still valid, but in a world where security teams have tool sprawl and fatigue, the arc of the industry curves toward more consolidation.

Emerging as Distinct Categories

Several new tool categories are crystallizing. Design-phase security platforms that review architecture and business logic before code generation. Agent governance and guardrail systems that constrain what AI agents can do. Non-human identity management for agent credentials and permissions. And Cloud Application Detection & Response (CADR) for runtime behavioral validation. These categories address threats that didn’t exist in the traditional SDLC and can’t be solved by extending existing tools.

Vibe-Coded by Internal Teams

Internal security teams are already using agentic tools to build bespoke security automation: custom guardrail policies, internal vulnerability triage agents, compliance documentation generators, and pipeline enforcement hooks. This isn't new; security teams have always built internal tooling. AI just makes it faster to start. Ramp’s autonomous patching pipeline is a leading example: a multi-agent architecture that patched 100 vulnerabilities in six days with zero human involvement.

But starting is the easy part. The maintenance burden is significant: models change, APIs shift, and internal tools built in a sprint tend to bitrot within a quarter. There's also an ROI question: is the differentiation you get from building it yourself worth the ongoing cost of maintaining it, especially when focused vendors are iterating on the same problem full-time?

There are deeper reasons teams reach for vendors beyond just build cost. A vendor compartmentalizes liability: when your threat model tooling produces an output that a compliance audit relies on, you want a vendor behind it, not a side project. Vendors also provide auditability (consistent, documented outputs that satisfy auditors and regulators) and reliability (SLAs, support, and continuity that an internal tool maintained by one senior engineer can't guarantee).

Last but not least, not every team can build Ramp-quality infrastructure, since building high-quality security tooling requires knowing “what good looks like.” The market bifurcation is accelerating: security-mature teams will vibe-code their operational glue, but the foundational layers of the stack will increasingly be vendor-supplied.

The Thesis, Simply

The ADLC doesn't change the primitives of application security; you still need governance, guardrails, and validation. What changes is where those controls operate and what form they take. The old stack applied them to human-written code after the fact. The new stack maps them to where decisions are actually made: before and during generation, not after.

Traditional layers (controls, scanners, testing) assume the design is sound and operate on code that has already been generated and outputs that have already been produced. In short, they react to the specification rather than shaping it. The ADLC moves too fast for reactive tools to be effective.

The highest-leverage layer in ADLC security is the one most organizations still do manually: threat architecture, and the living threat model that encodes your system's trust boundaries, data flows, and risk decisions. It's where humans still have direct control, where a single architectural decision propagates across every agent session, and where the tooling has not yet caught up to the rest of the stack.

This is also the layer where context matters most, and where a static file falls short. Rules files and agent skills can formalize coding standards. CI hooks can enforce governance policies. But threat architecture requires something more: a persistent, evolving understanding of how your system's components relate to each other, where trust boundaries exist, and what risk decisions your team has already made. An AGENT.md file captures a moment in time; it doesn't evolve when services get decomposed, new data flows emerge, or the architecture shifts underneath it.

As the development pipeline becomes agentic, the threat model needs to be tightly coupled with that pipeline. It goes beyond a document referenced occasionally, becoming a living graph of architectural context that every layer of the stack can consume in real time. That's a product, not a vibe-coded tool, prompt, skill, or rule file.

The organizational memory problem is inherently a graph problem: relationships between components, data flows, trust boundaries, and threat scenarios that shift with every architectural change. That's a product, not a prompt or rule file.

Organizations that invest in encoding program-level and system-level security context before the first line of code is generated will see compounding returns. Once that context is treated as a structured knowledge graph, every agent session gets smarter, every governance policy gets more precise, and every generated module inherits institutional knowledge that would otherwise evaporate between sessions.

The organizations that don’t will keep buying tactical tools to scan AI-generated code faster. They’ll optimize the wrong layer and spend their time chasing machine-generated symptoms of human-made design flaws.

The ADLC is already here for many teams, and it’s growing fast. The question is whether your security stack was built for the world that’s emerging, or the one that is quickly disappearing.

With thanks to Katie Norton, whose article (Trusting the Vibe: Understanding and Managing the Security Risks of AI-Assisted Development, Doc #US53869725, March 2026) and our conversations helped shape the thinking here; and to Frank Wang and Andrew Peterson for their feedback and insights.

References:

1. Katie Norton, Trusting the Vibe: Understanding and Managing the Security Risks of AI-Assisted Development, Doc #US53869725, March 2026.
