Fresh Snow Outside, Fresh Threats Inside: Practical Agentic AI Hardening

Saturday morning, blue skies, snow on the mountain and I’m sitting in this chair hacking away on Entra, HashiCorp Vault, Docker and Claude Code and OpenCode.

What the is wrong with me? I thought after decade in cybersecurity, a masters degree and bunch of certifications that I’d be raking in the cash, out riding my MTB and charging down mountain on fresh powder on my snowboard.

Unlike some jobs, every year in Tech is like going back to school… And the attackers don’t care what you learned or did last year…

What am I doing today?

https://github.com/secSandman/agent_playground

Today, I hacked together a demonstration playground that tries to illustrate a development workflow for containerized AI Agents, secrets injection and Agentic AI host isolation. I deployed a HashiCorp Cloud Vault, integrated it into Microsoft Entra via OIDC, built hardened AI Agents container built some playground utilities that fetch creds from the Vault.

I also explore integrating Entra and AWS into an SSO enabled human workflow for making automated service calls to AWS inference AI endpoints. Additionally, I explore conceptually some other emerging agentic access control solutions how how that might play into developer life cycle.

I’ll discuss risk theory, touch on threats, technical hardening, developer workflows and lofty security architecture ideas.

Message to the Haters,”What Abouters” and “Nay Sayers”

Firstly, this article doesn’t focus on the IDE UX. And you might feel like that sucks. Too bad. Yes, if you put your AI agent in a docker container, then native IDE publisher extensions, likely wont work with them out of the box. So the “undo” and the “diff” and pretty “highlighting” will break for you ….

Yes, I wish I could fix that have an IDE solution too, but I can’t due that on a Saturday because on Sunday, I’m heading to the mountain to shred some powder.

I also think if we’re going to build AI Agents that can run autonomously and semi-autonomously, then the idea of building solutions around the IDE is nothing more than human crutch, maybe a stepping stone to full automation.

With that in mind, I have a few goals for this research project …

Goal #1

  • Isolate AI Agent (Claude Code/OpenCode) from underlying operating system to protect all my important things that I don’t want a rogue development AI agent messing with. ALL your passwords, sensitive emails, private browsing history… Stealing them, destroying them, you name it.

Goal #2

  • Exploring a way to support the development and release of AI agents. Meaning, bridging a gap between using them, realizing you can host them like any other program, packaging them up then releasing them. I’m not so much interested in just installing them on my OS, using them like a GUI, b/c the point is that they can run and hosted and automate human tasks… not sit there and be a crutch for human tasks…

Finally, I’m doing this on my weekend, for free when, other’s are charging $100/hr in consulting fees, so if you’re going to bring your low vibration negative energy, nay-sayer attitude my way then you can politely …. …

Emerging Threats, Attacks and Vulns

AI agent risk gets real the moment the system can act instead of just chat. This is not just about a model saying something weird. It is about an agent with access to your repo, browser, extensions, package manager, tokens, email session, and cloud tools.

In July 2025, Replit’s AI coding agent was reported to have deleted production data during a code freeze, ignored instructions, and then concealed or misrepresented what happened. Exactly the kind of thing that turns “automation” into an operational disaster for actual people and businesses (Business Insider, 2025; Fortune, 2025).

At the same time, the broader software ecosystem has shown how fragile that trust chain really is. CISA warned that a 2025 npm supply-chain compromise spread through compromised developer accounts and malicious code inserted into trusted packages, demonstrating how one poisoned dependency path can ripple across real environments people depend on every day (Cybersecurity and Infrastructure Security Agency [CISA], 2025).

And the damage is not limited to code. When AI agents are paired with extensions, browser access, memory, and autonomy, the risk affect you peronsally and your business.

Microsoft reported in March 2026 that malicious AI assistant browser extensions were harvesting full URLs and chat histories from platforms such as ChatGPT and DeepSeek, exposing sensitive prompts, workflows and private context (Microsoft, 2026).

Other researchers documented malicious extensions stealing ChatGPT session tokens and conversation data, creating a path to account takeover and silent surveillance of what users ask, read and reveal online (Malwarebytes, 2026; OX Security, 2025). It is an emerging AI-connected attack surface that can help expose your inbox, your finances, your social accounts, your family details and the rest of your digital life (Microsoft, 2025).

  • Delete your work: an AI coding agent with repo or database access can destroy production data instead of helping (Business Insider, 2025; Fortune, 2025).
  • Steal your access: malicious packages and extensions can grab tokens, sessions, and credentials that open the door to email, banking, and social accounts (Microsoft, 2025; Malwarebytes, 2026).
  • Leak your private life: chat histories, browser activity, and sensitive prompts can be harvested and exposed without the user realizing it (Microsoft, 2026; OX Security, 2025).
  • Scale through trust: marketplaces, plugins, npm packages, and deeply nested dependencies let one compromise cascade across thousands of environments (CISA, 2025; Microsoft, 2025).

At the same time, I think the technology is awesome and I want to use it to automate as much as possible. So somewhere between my paranoid, tinfoil hat I wear for work and my nature to want to hack things together, I want to create solutions that allow me to deploy it into production “securely”.

The Playground Basic Concepts

This repo is a practical security playground for running emerging agentic coding tools (Claude Code/OpenCode) with layered containment controls instead of trusting the agent by default.

The core idea is defense-in-depth:

  • secrets are fetched on the host (not persisted in container files)
  • containers run as non-root with dropped capabilities
  • outbound traffic is constrained through a proxy allowlist
  • runtime behavior is narrowed with tool deny policies

It’s designed to let humans keep productivity while reducing blast radius if a prompt, plugin or tool call turns hostile.

A key concept is isolation by mode.

In normal mode, a local workspace can be mounted for development convenience. In stricter modes, the host filesystem mount is removed (–isolated-fs / –isolated), the container root can be read-only, writable paths are limited to tmpfs, and network can be proxied or fully disabled (–network none).

Access Control and Secrets Fetching

A lot of people are focusing on MCP, remote access and things like openid-a for agents, scim-a for agents and entra Agent ID. This problem space is going to be so much bigger and more complicated as we merge human behavior, legacy applications and new technology together. Humans using oidc flow as themselves, agents accessing ticketing systems, jira, servicenow, kube, cloud, databases, snowflake and more.

Take a breathe, take step back and let’s consider the basics…

Local Access

If you’re a developer seeking the IDE UX then your running node.js or bun.js AI agent, and likely downloading them and running them as yourself. Local filesystem access, critical config files, .aws file .ssh and more are a serious concern.

Maybe you solve this problem with service accounts in Active directory or maybe you solve this problem with hyper-v or virtual machines.

My repo playground runs the agent as an isolated container with local identity, not as the human developer account. The runtime user is a dedicated non-root UID (1001), with Linux capabilities dropped, privilege escalation blocked, AppArmor policy controls, and bounded CPU/memory/PID resources. Secrets are fetched on the host and injected at runtime, then cleared from environment memory after execution.


Agent runs as its own user

# build/claudecode/Dockerfile
RUN useradd -m -s /bin/bash -u 1001 claudeuser && \
usermod -L claudeuser
USER claudeuser
# build/opencode/Dockerfile
RUN useradd -m -s /bin/bash -u 1001 opencodeuser && \
usermod -L opencodeuser
USER opencodeuser
# .docker-compose/docker-compose.base.yml
user: "1001:1001"
security_opt:
- no-new-privileges:true
- apparmor=${CLAUDECODE_APPARMOR_PROFILE:-unconfined}
cap_drop:
- ALL
mem_limit: 2g
cpus: 2
pids_limit: 100

Permissions and runtime boundaries

# cmd/claudecode/run.py
command = [
'docker', 'run',
'--rm',
'--read-only',
'--cap-drop', 'ALL',
'--security-opt', 'no-new-privileges:true',
'--security-opt', f'apparmor={apparmor_profile}',
'--pids-limit', '100',
'--memory', '2g',
'--cpus', '2',
]
deny /home/**/.config/**/settings.conf w,
deny /home/**/.config/**/policy.json w,
deny /etc/passwd rw,
deny /etc/group rw,
deny /etc/shadow rw,
deny /etc/sudoers rw,
deny /etc/sudoers.d/** rw,
deny /usr/sbin/useradd x,
deny /usr/sbin/usermod x,
deny /usr/sbin/groupadd x,
deny /usr/bin/passwd x,
deny /usr/bin/sudo x,
deny /bin/su x,

Sensitive files this setup is trying to protect

// build/opencode/Dockerfile -> /home/opencodeuser/.config/opencode/opencode.json
"read": {
"**/.env*": "deny",
"**/*.key": "deny",
"**/*.pem": "deny",
"**/.ssh/**": "deny",
"**/.aws/**": "deny",
"**/auth.json": "deny",
"**/user/*.vault-token": "deny",
},
"edit": {
"**/opencode.json": "deny"
}

Depending the user requirements we may need to solve this in different ways, it’s not cut and dry. To illustrate these concepts, here’s what I’ve done.

Current State (What you already have)

Optionally, allow user to create a local Workspace mount: Compose mounts host workspace as rw:

${WORKSPACE_PATH:-../workspace}:/home/opencodeuser/workspace:rw

But provide flexibility, if the Agent only needs access to remote endpoints and not local FS or local tools, then the harness doesn’t give it access to those things.


Isolation modes in launcher (run.py):
--isolated => --network none, --read-only, tmpfs only
--isolated-fs => no host mount, proxy network allowed, tmpfs writable dirs


Remote Acess

Currently, I integrated HashiCorp Vault with OIDC into Microsoft Entra. This is specifically, a very developer flow, which assumes you will ultimately have secrets that need to be fetched and injected into a packaged Agentic runtime.

The approach I took is critically different then purely stringing together human based OIDC authN/Authz workflow to a Cloud inference AI endpoint or MCP servers alone. Meaning human user logs in, gets oAuth token and sends them to your external endpoint …

let’s talk about the difference between talking about the demo or repo ….

Cloud Access / AD Groups to Inference Endpoint

One Pattern (OIDC + AWS CLI) might help human based flows to specifically talk to a cloud hosted open-ai inference endpoint. And this is a great practice but it may breakdown at some point…

  • User signs in with enterprise IdP (Entra/AD) via AWS IAM Identity Center
  • User gets a base CLI profile (corp-sso) as themselves
  • A second profile (infer-prod) assumes a target role for inference access
  • Access is controlled by group membership (e.g., InferenceEngineUsers) and audited in CloudTrail

Role trust policy (target inference role)


Example InferenceInvokeRole trusting your SSO role and requiring a group tag:

{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AllowAssumeFromSSOWithGroup",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::111122223333:role/AWSReservedSSO_DeveloperAccess_abc1234567890def"
},
"Action": "sts:AssumeRole",
"Condition": {
"StringLike": {
"aws:PrincipalTag/groups": "*InferenceEngineUsers*"
}
}
}
]
}

Permission policy on InferenceInvokeRole
(Example for Bedrock runtime)

{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "AllowAssumeFromSSOWithGroup",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::111122223333:role/AWSReservedSSO_DeveloperAccess_abc1234567890def"
},
"Action": "sts:AssumeRole",
"Condition": {
"StringLike": {
"aws:PrincipalTag/groups": "*InferenceEngineUsers*"
}
}
}
]
}

Permission policy on InferenceInvokeRole
(Example for Bedrock runtime)

{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "InvokeApprovedModels",
"Effect": "Allow",
"Action": [
"bedrock:InvokeModel",
"bedrock:InvokeModelWithResponseStream"
],
"Resource": [
"arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-haiku-20240307-v1:0"
]
}
]
}

AWS CLI config (~/.aws/config)
Two profiles: base OIDC login + assumed-role profile.

[sso-session corp]
sso_start_url = https://your-company.awsapps.com/start
sso_region = us-east-1
sso_registration_scopes = sso:account:access
[profile corp-sso]
sso_session = corp
sso_account_id = 111122223333
sso_role_name = DeveloperAccess
region = us-east-1
output = json
[profile infer-prod]
role_arn = arn:aws:iam::444455556666:role/InferenceInvokeRole
source_profile = corp-sso
region = us-east-1
output = json
aws sso login --profile corp-sso
aws sts get-caller-identity --profile infer-prod
aws bedrock-runtime invoke-model `
--profile infer-prod `
--region us-east-1 `
--model-id anthropic.claude-3-haiku-20240307-v1:0 `
--content-type application/json `
--accept application/json `
--body fileb://request.json `
response.json
Get-Content .\response.json

Using an IdP-backed OIDC flow with AWS role assumption is a strong fit for human-driven workflows because it centralizes identity, access control and auditability through group-based policy instead of long-lived credentials.

It’s certainly better then just minting api keys everywhere …

That said, it introduces operational tradeoffs like users who switch across multiple AWS accounts/profiles can experience frequent session churn and re-auth prompts and not every required MCP or SaaS integration maps cleanly to the same AD-group-to-AWS-role .

In practice, enterprise Agentic workflows are often calling heterogeneous endpoint. I’m not saying I like this, i’m saying it’s a real struggle.

Fetching AI Secrets – Microsoft Entra + HashiCorp Vault

So teams still need a mixed credential strategy—combining IdP/role-based access for supported inference endpoints with dynamic and scoped static secrets or service credentials for systems like GitLab PATs, Jira API tokens, Snowflake credentials, RDBMS and other APIs that sit outside the shared identity boundary.

And if we’re going to eventually package and deploy these agents then we need to think about how to bridge the gap between build and release…

Integrating Microsoft Entra ID with HCP Vault Using OIDC and Namespace-Scoped Authorization

(Skip if you think Infra is boring and only want AI stuff)

I recently completed a Microsoft Entra ID integration with HCP Vault using OpenID Connect, with authentication handled at the parent namespace and authorization delegated into a child namespace.

In this case, Microsoft Entra ID acted as the OpenID Provider, issuing identity tokens after user authentication. HCP Vault acted as the Relying Party, validating the OIDC response, establishing the authenticated identity and then translating external claims into Vault-native authorization constructs such as identity groups, aliases, policies and namespace-scoped access boundaries.

While most AI developer might not care about these details, essentially I’m saying we can extend RBAC human control flows for variety of secret access use cases including static and dynamic credentials to decoupled service that offer richer logical for secrets automation.

When your done with the developer human flows, you can configure a secretless authentication method so K8, GitLab, or Cloud based federated identities frameworks can fetch the same Agentic creds needed or your Agentic AI container to operate.

Authentication vs. Authorization

A useful way to think about this pattern is to separate the responsibilities clearly.

Microsoft Entra ID is responsible for:

  • Authenticating the user
  • Issuing the OIDC authorization code and ID token
  • Emitting identity claims such as the user object ID and group membership

Vault, as the consuming application, is responsible for:

  • validating the OIDC flow and token
  • matching Entra group claims to Vault identity group aliases
  • mapping authenticated identities into internal authorization objects
  • enforcing authorization within the correct namespace

That distinction matters because many integration issues come from assuming the identity provider is also enforcing the target application’s authorization model. It is not. Entra provides trustworthy identity data, but Vault still has to decide what that identity is allowed to do.

Azure-Vault Specific Implementation details

For the Entra application registration, the most important values were:

  • Application (client) ID → used as Vault’s oidc_client_id
  • Client secret Value → used as Vault’s oidc_client_secret
  • Tenant-specific issuer/discovery URL → used as Vault’s oidc_discovery_url

A key point here is that Entra exposes several identifiers, but they are not interchangeable. The Application (client) ID identifies the OAuth/OIDC client. The secret Value is the confidential credential used by the client. The Secret ID is only a management identifier for the secret object and is not valid as a client secret or audience value.

Example OIDC configuration in Vault

vault write -namespace=admin auth/oidc/config ^
oidc_discovery_url="https://login.microsoftonline.com/<TENANT_ID>/v2.0" ^
oidc_client_id="<APPLICATION_CLIENT_ID>" ^
oidc_client_secret="<CLIENT_SECRET_VALUE>" ^
default_role="entra"

From an OIDC perspective, this establishes the client trust relationship between Vault and Microsoft Entra ID. Vault uses the discovery document to learn the issuer metadata, authorization endpoint, token endpoint, and signing keys. It then uses the configured client credentials during the authorization code exchange.

Role design and claim processing

The Vault OIDC role defined how claims from Entra would be interpreted:

vault write -namespace=admin auth/oidc/role/entra ^
role_type="oidc" ^
user_claim="oid" ^
groups_claim="groups" ^
allowed_redirect_uris="https://<VAULT_HOST>:8200/ui/vault/auth/oidc/oidc/callback" ^
allowed_redirect_uris="http://localhost:8250/oidc/callback" ^
oidc_scopes="openid,profile,email" ^
token_policies="default"

A few OpenID concepts are worth calling out here:

  • openid is what makes the request an OpenID Connect authentication request rather than just OAuth
  • user_claim="oid" uses the Entra user object ID as the stable subject identifier within Vault
  • groups_claim="groups" tells Vault to consume Entra group membership from the token claims and use that for identity mapping

This is where the consuming application begins translating federated identity into internal authorization decisions.

Azure group claims and Vault alias mapping

In Entra, I configured the application to emit group claims using ID, which causes Microsoft Entra ID to send the group object ID rather than the display name.

That meant the Vault group alias could not simply use the friendly Entra group name. It had to use the actual Entra Object ID value emitted in the groups claim.

vault write -namespace=admin identity/group-alias ^
name="<ENTRA_GROUP_OBJECT_ID>" ^
mount_accessor="<OIDC_MOUNT_ACCESSOR>" ^
canonical_id="<VAULT_EXTERNAL_GROUP_ID>"

Fetching and Injecting the Agent’s Secrets

All of that was pre-requisite foundational infra stuff that most people don’t need to worry about. You may solve that problem with a variety of other products such as cloud based solutions …

Next we need to fetch and inject the secrets into the Agentic runtime…

My wrapper scripts create a secrets config map that instructs the wrapper..

provider: static
static:
anthropic_api_key: ${ANTHROPIC_API_KEY}
openai_api_key: ${OPENAI_API_KEY}
vault:
addr: https://vault-cluster-public-vault-6bdb9cb5.d4c0296b.z1.hashicorp.cloud:8200
auth_method: oidc
namespace: admin/secsandman
oidc:
role: entra
mount_path: oidc
auth_namespace: "admin"
secrets:
- path: kv/opencode/anthropic
key: api_key
env_var: ANTHROPIC_API_KEY
- path: kv/opencode/openai
key: api_key

OIDC login + host-side Vault auth (vault_client.py)

def login_oidc(self) -> bool:
vault_env = os.environ.copy()
vault_env['VAULT_ADDR'] = self.vault_addr
vault_env['VAULT_TOKEN'] = ''
vault_env.pop('VAULT_NAMESPACE', None)
cmd = [self.vault_cli_path, 'login', '-method=oidc', '-format=json']
if self.oidc_auth_namespace:
cmd.append(f'-namespace={self.oidc_auth_namespace}')
if self.oidc_mount and self.oidc_mount != 'oidc':
cmd.append(f'-path={self.oidc_mount}')
cmd.append(f'role={self.oidc_role}')
result = subprocess.run(cmd, env=vault_env, check=False, timeout=120,
capture_output=True, text=True)
if result.stdout.strip():
data = json.loads(result.stdout)
token = data.get('auth', {}).get('client_token', '').strip()
if token:
self.vault_token = token
os.environ['VAULT_TOKEN'] = token
return True
return False

Launcher reading config + invoking login (run.py)

vault_cfg = raw_cfg.get('vault', {})
vault_addr = vault_cfg.get('addr', vault_addr)
vault_namespace = vault_cfg.get('namespace', '')
oidc_role = vault_cfg.get('oidc', {}).get('role', 'entra')
oidc_mount = vault_cfg.get('oidc', {}).get('mount_path', 'oidc')
oidc_auth_namespace = vault_cfg.get('oidc', {}).get('auth_namespace', '')
vault = VaultClient(vault_addr, vault_token, mode, vault_cli_path=vault_cli,
vault_namespace=vault_namespace, oidc_role=oidc_role, oidc_mount=oidc_mount,
oidc_auth_namespace=oidc_auth_namespace)
if not args.dev_mode:
if not vault.login_oidc():
sys.exit(1)

Injecting secrets into local env (host) (run.py)

def set_environment_variables(secrets: Dict[str, str]):
for var_name, var_value in secrets.items():
os.environ[var_name] = var_value

Passing injected secrets into container env (run.py)

env = os.environ.copy()
env.update(secrets)
env['WORKSPACE_PATH'] = workspace_path
...
result = subprocess.run(command, env=env, check=False)

Host cleanup after run (run.py)

def clear_environment_variables(var_names: list):
for var_name in var_names:
if var_name in os.environ:
del os.environ[var_name]
...
finally:
clear_environment_variables(['OPENAI_API_KEY', 'ANTHROPIC_API_KEY', 'GITHUB_TOKEN', 'VAULT_TOKEN'])
if not args.dev_mode:
docker_manager.down()

What about when hosting Agent outside of workstation?

For more complex workflows where we string together heterogeneous data sources and systems, then most decoupled secrets vaults can integrate natively with workload identity authentication (Kubernetes service accounts, cloud IAM roles, OIDC/JWT federation, PKI SPIRE/SPIFFE based identities, Entra Agent ID), so auth becomes environment-specific while secret read/write paths stay consistent.

In practice, that means you can build a developer framework that uses OIDC + MFA in early development, then move to workload identities in CI/CD or production without rewriting your secret access logic.

The key pattern is to keep a stable get_secret() / put_secret() interface and swap only the /login strategy (for example: login_oidc, login_kubernetes, login_aws_iam, login_jwt) via configuration or pluggable auth modules.

This gives you portability, least-privilege access, and a cleaner path from local dev to enterprise runtime security.

Disclaimer: Other Agentic Access Control (Not In Scope)

I want to acknowledge that simply integrating AI Agent logic to fetch static or dynamic JIT creds may not solve the broader access control problem that looms ahead, in some circumstance, it may inflame it.

As a security SME who sees all sorts of bad access control patterns, I feel a responsibility to digress and get off-topic for a second. It’s important to help educate folks on why and why-not a vaulting solution might play into your work.

Cloud Native Workload Identity

For example, if you planned on running an Agentic workload specifically in the cloud you may decide use federated workload identities and native cloud roles with cloud secrets a managers. In simplest terms, this use case may be perfect if you only call a cloud hosted inference endpoint or if you need to access cloud hosted knowledge based or database systems for grounding.

For complex uses with many nuanced products implementing access control differently, you may run into a few problems with this. Firstly, your team may manage secrets for a variety of clouds, runtimes, build pipelines located in various hosting environments and operationally supporting multiple de centralized vaults might not scale.

Dynamic / Rotated Secrets Engines

For autonomous and semi-autonomous distributed Agentic AI systems, dynamic secrets are valuable because they replace long-lived credentials with short-lived, on-demand leases that are tied to workload identity and revocable at TTL expiry, which materially reduces blast radius if an agent session, tool call or log stream is compromised.

HashiCorp Vault has dynamic-secret patterns (e.g., database/cloud leases with revocation and TTL semantics), but many teams compare it against newer SaaS-first platforms for operational simplicity and integration UX.

Infisical and Doppler both document dynamic secret leasing models (including TTL-bound, one-time retrieval behavior and automated revocation), though these capabilities are often anchored to provider-specific integrations (for example AWS IAM-style leases).

Akeyless positions itself around JIT and short-lived access with broad integration coverage—its public integration lists items like Snowflake JIT/Rotated access, GitLab auth, Kubernetes auth and customized patterns.

All this to say, in my weekend demo, I could simply change the API call and when I need a credential dynamically fetch a lime limited JIOT based cred or static cred as needed, then inject them into my Agentic runtime. After processing, they will likely expire limitting the blast radius.

JIT Human Approved MCP Workflows

A human-driven local sandbox (like my current setup) and a hosted runtime agentic platform solve different layers of the same problem. My model emphasizes developer-controlled execution for heterogeneous secrets and what will eventually packaged and externally hosted Agentic workloads.

The agent runs in a constrained local/containerized environment, secrets are fetched just-in-time on the host, and the human initiates and supervises actions directly.

By contrast, a agentic IAM solution like Oasis with AAM-style controls treats agents as first-class non-human identities and emphasizes centralized identity governance and ephemeral human credentials, approval gates for high-risk actions, policy enforcement and full cross-tool audit trails. But, this is human workflows… e.g. human instructs local agent to act, that action calls a tool or mcp to take action, then Oasis intercepts and requires approval based on policy.

In my opinion, these approaches are complementary because human will be guiding local desktop agents for awhile but there will still be a need for bridging the gap between heterogeneous secrets and releasing agents to autonomous workloads.

Authorization For Human Delegation to Distributed Agentic AI workloads

Although well outside the scope of this work, it is worth noting that even if we can build agentic workflows that retrieve static credentials or generate short-lived dynamic API keys, that only addresses one layer of the problem. It does not address the higher-level application or controller logic that accepts human prompts, interprets user intent and ultimately triggers those hosted agentic workloads. Say, you build a Chat-App that has ability to invoke 10, 20, 100 different agents with access to various tools, data, workflow tools etc…

There is a meaningful difference between accepting a federated request from a human user through a trusted identity provider + building an authorization control framework that ensures neither the user nor the agent can exceed or escalate the user’s intended access. Gold star if you fetch secrets from a vault or use dynamic secrets, but if that same Agent fetches high privileged dominating entitlements that allow the agent to impersonate any user, then we’re heading in the wrong direction.

For those interested in this area, there are several emerging white papers and proposals within the OpenID ecosystem, including concepts such as OpenID for Agents, SCIM for Agents, and broader “on behalf of” models. These efforts explore new extensions, claims, and trust schemes supported by cryptographically signed attestations that may help address this problem space.

That said, those topics were not the focus of this weekend’s work but are important to understand that both concepts are complimentary, and for now, necessary to each other.

Hardening the Network (Demo Only)

In my AI Agent playground, network controls are illustrated with a dedicated squid-proxy and container-level proxy env vars. This is a practical demo pattern to force egress through an inspectable choke point, while still letting agent workflows call model APIs.

Yes this is a bearer to most developer workflows but current security research has proved time-and-time again that remote prompt injection, supply chain and agents with access to local FS and tools, can and will install new binaries, packages and utilities and even rename them to bypass security controls. This illustrate the last line of defense that some type of out-of-band layer 7 egress filter should exist which limits the agents ability to create a command and control connection or download tools to elevate its permissions.


The demo also supports stronger modes: –isolated (no network) and –isolated-fs (no host FS mount, proxied network). That gives you a clear risk dial during testing of untrusted repos.

# .docker-compose/docker-compose.base.yml
environment:
- HTTP_PROXY=http://squid-proxy:3128
- HTTPS_PROXY=http://squid-proxy:3128
- NO_PROXY=localhost,127.0.0.1
networks:
- opencode-net
# cmd/claudecode/run.py
if isolation_mode == 'full':
command.extend(['--network', 'none'])
else:
command.extend(['--network', 'opencode-network'])
if isolation_mode == 'fs':
command.extend(['-e', 'HTTP_PROXY=http://squid-proxy:3128'])
command.extend(['-e', 'HTTPS_PROXY=http://squid-proxy:3128'])
command.extend(['-e', 'NO_PROXY=localhost,127.0.0.1'])

For example, a developer may need access local loopback and limited network egress to some destinations. This is not intended to be “safe” list but instead illustrate another layer that may help once a container attempts to “talk back” to the attackers environment. Additionally in enterprise environment you would more than likely want to use cloud based virtual; endpoints and internal DNS for your inference endpoints and developer tools to limit network egress. In this illustration board wildcard subdomains to azure or AWS can also be utilized by an attacker.

# build/squid/config/squid.conf
# ============================================================
# Squid Proxy Configuration for OpenCode
# FQDN-based allowlist for secure outbound access
# ============================================================
# Localhost and loopback access
acl localnet dst 127.0.0.0/8 ::1
acl localhost_names dstdomain localhost
# Private network access (10.x.x.x for internal resources)
acl private_networks dst 10.0.0.0/8
acl private_networks dst 172.16.0.0/12
acl private_networks dst 192.168.0.0/16
# SSL/HTTPS ports
acl SSL_ports port 443
acl Safe_ports port 80 # HTTP
acl Safe_ports port 443 # HTTPS
acl CONNECT method CONNECT
# ==== ALLOWED EXTERNAL SERVICES ====
# OpenAI API endpoints
# acl allowed_domains dstdomain .openai.com
# Anthropic API endpoints
# acl allowed_domains dstdomain .anthropic.com
# Google (for authentication/APIs)
# acl allowed_domains dstdomain .google.com
# acl allowed_domains dstdomain .googleapis.com
# Azure (for Azure OpenAI)
# acl allowed_domains dstdomain .azure.com
# acl allowed_domains dstdomain .microsoftonline.com
# GitHub (code repository/auth)
# acl allowed_domains dstdomain .github.com
# acl allowed_domains dstdomain .githubusercontent.com
# AWS (if using AWS Bedrock or other services)
acl allowed_domains dstdomain .amazonaws.com
# Common CDNs and package repositories
# acl allowed_domains dstdomain .cloudflare.com
# acl allowed_domains dstdomain .unpkg.com
# acl allowed_domains dstdomain .jsdelivr.net
# acl allowed_domains dstdomain .npmjs.com
# ========================================
# Deny requests to unsafe ports
http_access deny !Safe_ports
# Deny CONNECT to non-SSL ports
http_access deny CONNECT !SSL_ports
# Allow localhost access (critical for OpenCode)
http_access allow localnet
http_access allow localhost_names
# Allow private network access (10.x internal resources)
http_access allow private_networks
# Allow approved external domains (OpenAI, Anthropic, etc.)
http_access allow allowed_domains
# Deny all other access
http_access deny all
# Proxy listens on port 3128
http_port 3128
# PID file location
pid_filename /var/run/squid/squid.pid
# Cache settings (minimal caching for API responses)
cache deny all
# Logging for audit trail
access_log /var/log/squid/access.log squid
cache_log /var/log/squid/cache.log
# DNS configuration
dns_nameservers 8.8.8.8 1.1.1.1
# Performance tuning
maximum_object_size 10 MB

Hardening Agentic AI Container

The core hardening model is to run as non-root, drop Linux capabilities, block privilege escalation, apply AppArmor, and enforce CPU/memory/PID limits. In isolated mode, the root filesystem is read-only and writable paths are explicitly scoped to tmpfs, reducing persistence and limiting payload drop-and-exec patterns.

If your running in Windows, then running as a low privileged service account with ACLS bound to specific mount would be a similar comparison. Unfortunately, most people, including myself are running these agents as themselves, and ass filesystem and tool access loosens the agent can and will increase it’s blast radius.

# .docker-compose/docker-compose.base.yml
user: "1001:1001"
security_opt:
- no-new-privileges:true
- apparmor=${CLAUDECODE_APPARMOR_PROFILE:-unconfined}
cap_drop:
- ALL
mem_limit: 2g
cpus: 2
pids_limit: 100
# cmd/claudecode/run.py (isolated mode)
command = [
'docker', 'run',
'--rm',
'--read-only',
'--tmpfs', '/tmp:rw,noexec,nosuid,size=100m',
'--tmpfs', '/home/claudeuser/.local:rw,nosuid,size=64m',
'--tmpfs', '/home/claudeuser/.config:rw,nosuid,size=64m',
'--tmpfs', '/home/claudeuser/.cache:rw,nosuid,size=64m',
'--cap-drop', 'ALL',
'--security-opt', 'no-new-privileges:true',
'--security-opt', f'apparmor={apparmor_profile}',
'--pids-limit', '100',
'--memory', '2g',
'--cpus', '2'
]
# build/claudecode/Dockerfile
RUN useradd -m -s /bin/bash -u 1001 claudeuser && \
usermod -L claudeuser
USER claudeuser

CrackArmor exploits were released during this writing, so update AppArmor… It goes to show, don’t rely on a singular security control when engineering your solutions, assume things will fail.

https://blog.qualys.com/vulnerabilities-threat-research/2026/03/12/crackarmor-critical-apparmor-flaws-enable-local-privilege-escalation-to-root

Hardening the Claude and OpenCode Settings

Beyond container boundaries, agent policy is tightened with deny-first tool permissions. This reduces risky command classes (curl, wget, nc, inline eval patterns, sudo) and protects sensitive files. Ultimately, whatever tool access you provide, a compromised agent will “live off the land”. If it can WGET, then it might wget a new binaries, if it can npm or npx install then it might install it’s own agent in /tmpfs.

This layer is still important because prompt injection often targets agent tool behavior, not just OS-level controls. So the less files, folder, directories and tools it can access the better chance we can prevent or at least detect an unusual sequence of events.

# build/claudecode/Dockerfile -> /home/claudeuser/.claude/settings.json
{
"autoupdate": false,
"permissions": {
"allow": [],
"deny": [
"Bash(curl:*)",
"Bash(wget:*)",
"Bash(nc:*)",
"Bash(base64:*)",
"Bash(python -c:*)",
"Bash(node -e:*)",
"Bash(bash -c:*)",
"Bash(eval:*)",
"Bash(sudo:*)"
]
}
}
// build/opencode/Dockerfile -> opencode.json
{
"permission": {
"webfetch": "deny",
"external_directory": { "*": "deny" },
"bash": {
"curl*": "deny",
"wget*": "deny",
"nc*": "deny",
"python -c*": "deny",
"node -e*": "deny",
"sudo*": "deny"
},
"read": {
"**/.env*": "deny",
"**/*.key": "deny",
"**/*.pem": "deny",
"**/.ssh/**": "deny",
"**/.aws/**": "deny",
"**/auth.json": "deny"
}
}
}
# cmd/claudecode/run.py
disallowed_tools = ','.join([
'Bash(curl:*)', 'Bash(wget:*)', 'Bash(nc:*)',
'Bash(base64:*)', 'Bash(python -c:*)', 'Bash(node -e:*)',
'Bash(bash -c:*)', 'Bash(eval:*)', 'Bash(sudo:*)'
])
command.extend(['--disallowedTools', disallowed_tools])

Marketplace and Plugin Hardening and PreTook Hoooks

In early agent ecosystems (Claude MCP marketplace patterns, OpenCode plugin/tool chains), the trust model is still immature: publisher identity can be weak, updates can drift after approval, and tool metadata can change in ways users don’t notice. That creates an attack surface where a “trusted” plugin can become malicious later (rug pull) or a malicious plugin can abuse broad tool permissions to read sensitive files, execute shell commands or exfiltrate data over allowed network paths.

Even if your settings.conf/policy file is strict, plugin/tool execution might happen in paths not fully governed by that same policy layer. In practice, compromised plugins can still leverage allowed operations (filesystem read, proxy egress, prompt injection through tool docs) unless there are independent enforcement layers (runtime sandbox, network allowlist, signed artifacts, and pre-exec policy hooks).

In previous article, I demonstrated a command-and-control using A pre-tool hook that runs before an agent tool/action executes (Bash, Read, Edit, WebFetch, MCP tool call, etc.)..

What is still missing in Marketplaces (Critical Supply Chain Issues)

  • No verified publishers
  • No cryptographic publisher/signer verification workflow for MCP/plugins
  • No enforced private marketplace allowlist with admission control
  • No automatic “freeze + diff” check for tool schema/description (rug pull)

In this playground, I try to warn you…

# docs/architecture/SECURITY-ARCHITECTURE-SUMMARY.md
### MCP Server Hygiene
- Load only servers you have manually inspected
- Never load an untrusted server alongside a server that handles sensitive data
- ... cross-server tool shadowing ...
- Pin MCP server versions; tool descriptions can change post-approval (rug pull)

It also applies defense-in-depth at runtime:

# cmd/claudecode/run.py
disallowed_tools = ','.join([
'Bash(curl:*)','Bash(wget:*)','Bash(nc:*)','Bash(base64:*)',
'Bash(python -c:*)','Bash(node -e:*)','Bash(bash -c:*)',
'Bash(eval:*)','Bash(sudo:*)'
])
command.extend(['--disallowedTools', disallowed_tools])
# build/squid/config/squid.conf
http_access allow allowed_domains
http_access deny all
// build/opencode/Dockerfile -> opencode.json
"read": {
"**/.env*": "deny",
"**/*.key": "deny",
"**/*.pem": "deny",
"**/.ssh/**": "deny",
"**/.aws/**": "deny",
"**/auth.json": "deny"
}

Versioned Controlled and Digitally Signing Claude.MD?

Something, I’ve come across recently is the idea of centrally managing, versioning and digitally signing claude.md files. While I didn’t test this in my playground, I wanted to touch on it briefly because I think it’s worth exploring later.

All the work I’ve done here help protect the local corporate or developer workstation and local creds, but it doesn’t help provide a rich developer experience when using the tool itself or help protect malicious overly permissions agents. Even if you can isolate cred access, lock down the local filesystem etc. you still want to provide solid skills and tools WITHIN the AI agent itself to the user.

I Think this concept is pretty awesome and wanted a shout out here for it…

Why treat CLAUDE.md like a security control?


CLAUDE.md can act as the policy and operating manual for an AI coding agent which security checks to run first, which skills and tools are approved, which workflows require extra validation and what the agent must never do. If you package security first in claude.md then you have a security assistant and engineer with you the entire way.

# CLAUDE.md
## Security-first operating rules
1. Treat this file as the highest-priority project instruction source.
2. Never exfiltrate secrets, tokens, private keys, or credential material.
3. Never paste secrets into commits, issues, logs, or test fixtures.
4. Prefer least-privilege actions and dry-run modes where available.
5. Ask for confirmation before destructive actions:
- deleting data
- rotating credentials
- changing IAM or RBAC
- modifying production infra
6. Run security checks before proposing changes:
- dependency review
- static analysis
- secret scanning
- policy checks
7. Prefer approved internal tools and skills before ad hoc shell commands.
## Approved tools
- ripgrep
- git
- pytest
- semgrep
- trivy
## Approved skills
- .claude/skills/secure-review/SKILL.md
- .claude/skills/dependency-audit/SKILL.md
- .claude/skills/threat-model/SKILL.md
## Change-control rules
- Do not modify authentication, authorization, secrets, or CI/CD policy files without explaining impact.
- For identity flows, preserve issuer, audience, redirect URI, and group/role mapping semantics.
- Flag any privilege-escalation risk before making changes.
## Repo-specific context
- Primary auth model: OIDC federation
- Secrets backend: Vault KV v2
- Production changes require human review

Keeping security instructions, security tools and approved skills at the top helps make safe defaults explicit at the moment the agent starts work. But, these can drift or be changes overtime, creating risk…

Version controlling the file, with approvals creates an auditable change history, and digitally signing releases of it can help prove the instructions were approved and not silently altered. Of course this takes an additional wrapper or runtime agents to validate the claude.md, assist with versioning and underlying PKI infrastructure as well.. So it’s not for the faint of heart …but a cool novel idea.

Pseudo Code Example

import base64
import hashlib
import os
import subprocess
import sys
import threading
import time
from pathlib import Path
from nacl.signing import VerifyKey
from nacl.exceptions import BadSignatureError
CLAUDE_MD = Path("CLAUDE.md")
SIG_FILE = Path("CLAUDE.md.sig")
# Store the trusted public key out-of-band:
# env var, OS keychain, config management, etc.
PUBLIC_KEY_B64 = os.environ.get("CLAUDE_MD_PUBKEY_B64", "")
POLL_SECONDS = 2.0
def sha256_file(path: Path) -> str:
h = hashlib.sha256()
with path.open("rb") as f:
for chunk in iter(lambda: f.read(8192), b""):
h.update(chunk)
return h.hexdigest()
def load_verify_key() -> VerifyKey:
if not PUBLIC_KEY_B64:
raise RuntimeError("Missing CLAUDE_MD_PUBKEY_B64 environment variable")
return VerifyKey(base64.b64decode(PUBLIC_KEY_B64))
def verify_signature(doc_path: Path, sig_path: Path) -> None:
if not doc_path.exists():
raise FileNotFoundError(f"Missing instruction file: {doc_path}")
if not sig_path.exists():
raise FileNotFoundError(f"Missing signature file: {sig_path}")
verify_key = load_verify_key()
message = doc_path.read_bytes()
signature = base64.b64decode(sig_path.read_text().strip())
try:
verify_key.verify(message, signature)
except BadSignatureError as e:
raise RuntimeError("CLAUDE.md signature verification failed") from e
def launch_claude() -> subprocess.Popen:
# Adjust command for your environment.
# Example assumes `claude` is on PATH.
return subprocess.Popen(["claude"], stdout=sys.stdout, stderr=sys.stderr)
def monitor_file(proc: subprocess.Popen) -> None:
last_hash = sha256_file(CLAUDE_MD)
while proc.poll() is None:
time.sleep(POLL_SECONDS)
if not CLAUDE_MD.exists() or not SIG_FILE.exists():
print("[SECURITY] CLAUDE.md or signature file missing. Terminating Claude Code.")
proc.terminate()
return
current_hash = sha256_file(CLAUDE_MD)
if current_hash != last_hash:
print("[SECURITY] CLAUDE.md changed. Re-verifying signature...")
try:
verify_signature(CLAUDE_MD, SIG_FILE)
print("[SECURITY] Updated CLAUDE.md has a valid signature.")
last_hash = current_hash
except Exception as e:
print(f"[SECURITY] Verification failed after change: {e}")
proc.terminate()
return
def main() -> int:
try:
verify_signature(CLAUDE_MD, SIG_FILE)
print("[SECURITY] CLAUDE.md signature valid.")
except Exception as e:
print(f"[SECURITY] Startup verification failed: {e}")
return 1
proc = launch_claude()
watcher = threading.Thread(target=monitor_file, args=(proc,), daemon=True)
watcher.start()
return proc.wait()
if __name__ == "__main__":
raise SystemExit(main())

While developers or local admins could still bypass this additional layer, hopefully it’s useful enough that you wouldn’t want to. Also supply chains threats like marketplace plugins can also bypass the claud.md file protection. Even so, I still think this is another important layer worth exploring in other work.

References

Leave a comment