Securing the AI Agent Pipeline: A Practitioner's Guide to MCP Server Governance

The Extension Problem

AI coding agents do not operate in isolation. Their power comes from extensions — MCP servers, plugins, and tool integrations that give agents access to databases, cloud APIs, file systems, container runtimes, and shell execution. An AI agent without extensions is a text generator. An AI agent with extensions is an autonomous system that can read, write, build, deploy, and destroy.

These extensions are installed from the same supply chains that security teams have spent years trying to secure: npm, PyPI, GitHub, Docker Hub. The difference is the consumer. When a developer installs a package, they read the README, check the maintainer, maybe scan the lock file. When an AI agent requests an extension, it evaluates a tool description written in natural language, and the extension is installed and granted capabilities in a single step.

The Model Context Protocol (MCP) — an open standard for connecting AI agents to external tools — has accelerated this pattern. MCP servers can be installed from npm packages, cloned from GitHub repositories, pulled as container images, or connected as remote endpoints. Each installation vector carries its own supply chain risks, and none of them are covered by the scanning pipelines that organizations run on their own code.

In 2025 and 2026, security researchers at Prompt Security and SentinelOne independently demonstrated how this gap can be exploited. They showed that MCP server plugins distributed through marketplaces can hijack transitive dependencies, execute arbitrary code during installation via post-clone hooks and submodule payloads, and exfiltrate credentials from the developer’s environment — all triggered by a single install action. The attack surface is not the agent itself. It is the extension ecosystem.

The numbers tell the same story from a different angle. In May 2025, security researchers discovered the first confirmed malicious MCP server: postmark-mcp, impersonating a legitimate email service. By September 2025, compromised npm packages including nx, coa, and rc prompted a formal CISA advisory. In October 2025, a typosquatted package called @chatgptclaude_club/claude-code appeared on npm — published by an unverified account, containing a command-and-control server and credential harvester targeting API keys and SSH keys. By December 2025, researchers were discovering over 1,200 malicious packages per month, with more than 121,000 downloads of typosquatted packages across npm and PyPI.

Your SAST pipeline scans your code. Your SCA tooling checks your dependencies. Your secrets scanner watches your repositories. None of them scan the MCP servers your AI agents are installing. That is the extension problem.

Where Existing DevSecOps Tooling Falls Short

DevSecOps teams have built scanning and gating into every stage of the software delivery lifecycle. Code is linted, analyzed, tested, scanned for vulnerabilities, checked for secrets, and gated before merge. Container images are scanned before deployment. Infrastructure-as-code is validated before apply. The discipline is mature, the tooling is good, and the coverage is — for your own code — comprehensive.

The problem is that AI agent extensions are not your code. They arrive from external sources, install into your environment, and operate with your credentials. And they bypass every gate you have built.

SAST and SCA tools — Semgrep, Snyk, Grype, CodeQL — are designed to analyze your codebase and your dependency tree. They run in CI against your repositories. Nobody configures them to run against an MCP server that an AI agent wants to install. The scanning pipeline does not know the installation is happening.

Secrets scanners — TruffleHog, Gitleaks, detect-secrets — monitor your repositories for accidentally committed credentials. They do not scan the source code of third-party packages your agents pull at runtime. A malicious MCP server with a hardcoded exfiltration endpoint passes through because the scanner never sees it.

Container scanning — Trivy, Grype, Snyk Container — checks base images for known CVEs in OS packages and language dependencies. It catches a vulnerable version of OpenSSL in an Alpine base image. It does not catch an MCP server that embeds hidden instructions in its tool schema, or one that exfiltrates environment variables on first connection. Application-layer risks in MCP server code are invisible to image scanners.

Policy engines — Open Policy Agent, Cedar, Kyverno — are authorization frameworks. They evaluate whether a request conforms to a policy. They are good at answering “is this Kubernetes pod allowed to mount this volume?” They have no mechanism for intercepting the commands that AI agents execute in a developer’s shell. A policy engine without an enforcement point is a rule without a referee.

CI/CD gating works for artifacts that flow through your pipeline. MCP servers installed by AI agents at development time do not flow through your pipeline. There is no PR to gate, no build to scan, no deployment to approve. The agent installs the extension directly into the local environment, and the first time your security tooling sees it is when something goes wrong.

The gap is structural. Existing DevSecOps tooling is designed for a workflow where code moves through defined stages: write, commit, build, test, scan, deploy. AI agent extensions skip most of those stages. They are fetched, installed, and activated in a single step, outside the pipeline, at development time.

Closing this gap requires three capabilities that do not exist in the current toolchain:

Pre-install artifact analysis — scanning extension source code before it enters the environment, using the same categories of analysis (SAST, SCA, secrets, schema) that teams already apply to their own code
Runtime command-level enforcement — applying policy to agent-initiated commands at execution time, not just at deployment boundaries
Continuous drift detection — re-evaluating the risk profile of installed extensions when they update, because a safe extension today may be compromised tomorrow

A Lifecycle Framework: Scan, Enforce, Monitor

The three capabilities missing from the current toolchain — pre-install analysis, runtime enforcement, and continuous monitoring — form a lifecycle. Each phase addresses a different stage of the AI agent extension lifecycle, and each builds on the previous one.

Scan — Artifact Analysis Before Installation

Before an MCP server, plugin, or extension enters the environment, its source code and dependencies should be analyzed. Not its README. Not its npm description. The actual code.

Four analysis types should run in parallel:

Schema validation. MCP servers expose tool definitions — JSON schemas that describe what the server can do, what parameters it accepts, and how the agent should interact with it. These schemas can contain hidden instructions that manipulate agent behavior, overly broad permission requests that grant unnecessary capabilities, or prompt injection vectors embedded in tool descriptions. Schema validation is specific to the MCP ecosystem and is not covered by any general-purpose scanner.

Static analysis (SAST). The extension’s source code should be analyzed for dangerous patterns: calls to child_process.exec, network requests to hardcoded external endpoints, file system operations outside expected paths, dynamic code evaluation via eval or Function(), and obfuscated logic designed to evade review. These are the same patterns SAST tools catch in your own code — but nobody is running them against the code your agents install.

Secrets detection. Embedded API keys, tokens, credentials, and high-entropy strings in extension source code indicate either malicious intent (a hardcoded exfiltration endpoint) or negligent development practices (leaked credentials in published code). Either way, it is a risk signal.

Dependency analysis (SCA). Every transitive dependency should be checked against known vulnerability databases — OSV.dev, the National Vulnerability Database, GitHub Security Advisories. A clean extension with a single compromised dependency deep in its tree is still a compromised extension.

The output of all four scanners should produce a composite risk score — start at 100, deduct by severity. Critical findings block installation. High findings warn. The output format should be SARIF v2.1.0 for native integration with GitHub Code Scanning, GitLab SAST reports, and Azure DevOps pipelines. If your security dashboard already consumes SARIF, AI agent extension findings should appear alongside your own code findings — not in a separate tool.

Consider what this would have caught. The @chatgptclaude_club/claude-code package: SAST would have flagged the reverse shell. Secrets detection would have flagged the credential harvester. SCA would have flagged known-vulnerable dependencies. Schema validation would have flagged hidden tool behaviors. None of these scanners ran, because no scanning pipeline existed for agent extensions.

Enforce — Policy-as-Code at the Command Level

Scanning governs what gets in. Enforcement governs what happens next.

AI agents execute commands. They run npm install, docker build, git push, psql, curl, rm. Enforcement means applying policy to these commands before they execute — and doing it differently for agents than for humans.

Agent-aware enforcement. The enforcement mechanism must distinguish between a command initiated by an AI agent and the same command typed by a human developer. Developers should experience zero friction — no prompts, no approval flows, no degraded workflow. Agents should be governed by policy. This distinction is fundamental. Any enforcement mechanism that slows down human developers will be disabled within a week.

Subcommand-level granularity. Blocking an entire binary is too coarse. Developers need docker. What they don’t need is their AI agent running docker push to an external registry or docker run --privileged. Policy should operate at the subcommand and flag level: allow docker build, allow docker run, deny docker run --privileged, deny docker push. The same applies to git (allow commit, deny push --force), kubectl (allow get, deny delete), and every other tool in the developer environment.

AI-assisted policy generation. Writing command-level policies by hand is tedious and error-prone. A better approach: observe what agents actually do, analyze the command patterns, and recommend least-privilege policies with confidence scores. Engineers review the recommendations, adjust, and approve. The policy engine learns. Manual policy authoring becomes the exception, not the rule.

Policy as code. Policies should live in version control. Changes should go through pull requests. Policy drift should trigger alerts. If the engineering team treats infrastructure as code and configuration as code, security policy for AI agents should be code too.

Monitor — Drift Detection and Audit

Extensions change. An MCP server that scored 95/100 at installation may score 40/100 after its next update — new dependencies, changed permissions, a modified tool schema that introduces hidden instructions. A monthly scan is not enough. Continuous monitoring means re-evaluating risk profiles when extensions change, not on a schedule.

Supply chain drift detection. When an installed extension publishes a new version, automatically compare the risk profile: new dependencies, changed tool schemas, modified code patterns, altered permission requests. Flag regressions. Alert on critical changes. Re-scan automatically.

Audit trail. Every command an agent attempts, every policy evaluation, every enforcement decision — allow, deny, warn — should be logged with full context. What was the command? Which agent initiated it? Which policy matched? What was the outcome? This is the foundation for incident response, and it is the artifact compliance auditors will ask for.

Fleet-wide dashboards. Which MCP servers are installed across the organization? What is the aggregate risk score? Which agents are most active? Which policies are triggered most frequently? What has been blocked in the last 24 hours? A per-machine, per-developer view is not enough. Security and platform engineering teams need a fleet-level view.

Framework mapping. Every finding — from scanning, enforcement, or monitoring — should map to established threat frameworks: MITRE ATLAS techniques, OWASP LLM Top 10 categories, OWASP Agentic Security risks. This makes AI agent security findings reportable in the same language and taxonomy the organization already uses.

Evaluation Criteria

When evaluating tools for AI agent extension security, test against the full lifecycle. A scanner without enforcement leaves a gap at runtime. An enforcement engine without scanning leaves a gap at install time. Monitoring without either is just a more expensive way to discover breaches.

Scanning

Analyzes source code, not just package metadata or manifests
Runs SAST, SCA, secrets detection, and schema validation in parallel
Produces composite risk scores with severity-weighted deductions
Outputs SARIF v2.1.0 for CI/CD pipeline integration
Supports npm, GitHub, container, and remote endpoint sources

Enforcement

Agent-aware — applies to AI agent commands, not human developers
Subcommand-level granularity (not just binary allow/deny)
AI-assisted policy recommendations from observed behavior
Policy-as-code — versionable, reviewable, auditable
Works offline — no cloud dependency for core enforcement

Monitoring

Continuous re-scanning on extension updates
Full audit trail with command, agent, policy, and outcome context
Fleet-wide visibility across all agents and extensions
Findings mapped to MITRE ATLAS and OWASP frameworks

Deployment

Single binary, no runtime dependencies
Works without modifying the AI agent or its host IDE
No dependency on agent vendor cooperation or API access
Deploys in minutes, not sprints

About Truvant

Truvant was built by a security practitioner who needed this for his own engineering team.

Michael Chomicz — CCIE #36817, CISO at Elisity, with 30+ years in security and infrastructure leadership — faced this problem firsthand when AI coding agents arrived in his organization’s developer workflows. The tools he had did not cover it. The tools on the market did not exist. So he built one.

Truvant implements the full Scan, Enforce, Monitor lifecycle. Pre-installation scanning runs SAST, SCA, secrets detection, and MCP schema validation in parallel with composite risk scoring and SARIF output. Agent-aware behavioral enforcement applies policy at the command and subcommand level, with AI-assisted policy recommendations. Continuous monitoring tracks supply chain drift, maintains a full audit trail, and provides fleet-wide visibility with MITRE ATLAS and OWASP framework mapping.

It ships as a single binary. Install it, scan your first MCP server, and get a risk score — all in under two minutes.