The AI agent security checklist for production teams

You have AI agents in production, or you are about to. The model works. The tooling is solid. The demos are clean.

What is less clean: the governance layer. Before your agents touch production systems, here is the checklist that security-conscious teams should run through — before an incident forces the conversation.

1. Does every agent have a defined identity?

An agent that runs anonymously is an agent you cannot audit. In production, every agent must have a registered identity with a clear trust level.

This means:

  • A unique agent identifier, not a shared service account
  • An explicit trust tier (e.g., read-only, write-restricted, write-approved)
  • A record of what that agent is permitted to do — and what it is not

If your agents share credentials with other systems or run under a catch-all API key, this is the first thing to fix.

What to verify: Can you produce a list of all agents currently running in your environment, their trust levels, and their permitted tool categories?
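As a concrete sketch, the minimum viable registry is a small data structure plus a report you can produce on demand. Everything below — the TrustTier names and AgentRecord fields — is illustrative, not any particular product's schema:

```python
from dataclasses import dataclass, field
from enum import Enum

class TrustTier(Enum):
    READ_ONLY = "read-only"
    WRITE_RESTRICTED = "write-restricted"
    WRITE_APPROVED = "write-approved"

@dataclass
class AgentRecord:
    agent_id: str                       # unique per agent, never a shared service account
    trust_tier: TrustTier
    permitted_tools: set[str] = field(default_factory=set)

REGISTRY: dict[str, AgentRecord] = {}

def register(record: AgentRecord) -> None:
    if record.agent_id in REGISTRY:
        raise ValueError(f"duplicate agent id: {record.agent_id}")
    REGISTRY[record.agent_id] = record

def audit_report() -> list[tuple[str, str, list[str]]]:
    # The list you should be able to produce on demand: every agent,
    # its trust level, and its permitted tool categories.
    return [(r.agent_id, r.trust_tier.value, sorted(r.permitted_tools))
            for r in REGISTRY.values()]
```

If you cannot write the equivalent of audit_report for your environment today, that is the gap.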

2. Have you inventoried every tool the agent can call?

Agents are only as risky as the tools they have access to. A read-only agent that can also call send_email or delete_record is not read-only.

Before production, document every tool exposed to every agent. For each tool, answer:

  • What data can this tool read?
  • What state can this tool modify?
  • Does this tool call an external service?
  • Does this tool require authentication, and whose credentials does it use?

If you are using MCP servers, this becomes: every tool exposed by every server registered to this agent, across every session that server has open.

What to verify: Can you produce a complete tool inventory for each agent, including the permission scope of each tool call?

3. Are write-capable actions gated on human approval?

This is the check that most teams skip, and the one that matters most when something goes wrong.

Read-only agent actions (summarise, analyse, retrieve) can often run autonomously. Write-capable actions (send, create, update, delete, execute) should require a human in the loop for anything that touches production data or external systems — at least until you have enough run history to trust a specific action pattern.

Your approval mechanism needs to:

  • Intercept the action before execution, not after
  • Present the redacted action to a reviewer (do not expose raw credentials or sensitive arguments)
  • Log the reviewer’s decision and the reasoning
  • Resume or abort the agent session based on that decision

Without a pre-execution approval gate, you have post-incident review, not governance.

What to verify: Do you have a mechanism that pauses agent execution before write-capable actions and routes to a human reviewer?
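One way such a gate can look, assuming a naming convention where the tool's leading verb signals write capability; the reviewer callback stands in for whatever human review channel you use, and the key names are illustrative:

```python
from typing import Callable

WRITE_VERBS = {"send", "create", "update", "delete", "execute"}
SENSITIVE_KEYS = {"token", "password", "connection_string"}   # illustrative

audit_log: list[dict] = []

def redact(args: dict) -> dict:
    # Reviewers see the action, never raw credentials or sensitive arguments.
    return {k: ("***" if k in SENSITIVE_KEYS else v) for k, v in args.items()}

def gated_call(tool: str, args: dict, run: Callable[..., object],
               approve: Callable[[str, dict], tuple[bool, str]]):
    verb = tool.split("_", 1)[0]
    if verb in WRITE_VERBS:
        # Intercept BEFORE execution; log the decision and the reasoning.
        approved, reason = approve(tool, redact(args))
        audit_log.append({"tool": tool, "approved": approved, "reason": reason})
        if not approved:
            raise PermissionError(f"{tool} rejected: {reason}")   # abort, do not execute
    return run(**args)
```

The essential property: a rejected send_email never reaches the tool, and the reviewer never sees the raw token.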

4. Are you logging the right things — and not logging the wrong things?

Audit logging for AI agents is not the same as application logging. You need:

  • A canonical trace ID per request that links prompt → tool call → outcome
  • A record of what the agent decided to do, and why (policy decision, not just outcome)
  • What data was detected in the request and how it was handled
  • What approval was granted, by whom, and when

What you should not be logging:

  • Raw prompts containing PII, secrets, or client-confidential data
  • Full tool call arguments containing bearer tokens or connection strings
  • Unredacted user input in append-only audit records

Logging everything is not a compliance posture. Logging the right things — with PII and secrets handled before they hit the log — is.

What to verify: Can you produce a trace for any given agent action, from initial request to final outcome, without that trace containing raw sensitive data?

5. Can you kill an agent session in real time?

If an agent enters an unexpected state — looping, escalating permissions, making calls you did not expect — you need to stop it before it completes. Not after.

This means:

  • A kill switch that terminates an active agent session immediately
  • A mechanism that prevents the session from resuming automatically after termination
  • A record of why the session was terminated and who triggered it

“We can redeploy without that agent” is not a kill switch.

What to verify: Can you terminate a specific running agent session right now, within 30 seconds, without taking down adjacent services?
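In code, the essential properties are that the agent loop checks a stop signal between steps, and that a killed session is tombstoned so it cannot resume on its own. A sketch with illustrative names:

```python
import threading

terminated: dict[str, dict] = {}   # session_id -> why and by whom it was killed

class AgentSession:
    def __init__(self, session_id: str):
        self.session_id = session_id
        self._stop = threading.Event()
        self.steps_run = 0

    def kill(self, reason: str, by: str) -> None:
        # Immediate stop, plus a record of why and who triggered it.
        self._stop.set()
        terminated[self.session_id] = {"reason": reason, "by": by}

    def run(self, steps) -> str:
        for step in steps:
            if self._stop.is_set():
                return "terminated"   # stop before the next action, not after
            step()
            self.steps_run += 1
        return "completed"

    def resume(self, steps) -> str:
        if self.session_id in terminated:
            raise RuntimeError("session was killed; it must not resume automatically")
        return self.run(steps)
```

The granularity matters: the check happens per step, so at most one in-flight action completes after the kill is triggered.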

6. Are provider credentials scoped to the session — not the agent?

If your agents authenticate to AI providers or tools using long-lived API keys, those keys become a target the moment any part of your pipeline is compromised.

The safer architecture: issue short-lived, scoped tokens at the session boundary. The token is valid for one session, with one set of permitted actions. When the session ends, the token expires. No long-lived secrets sitting in your agent runtime.

What to verify: Are your agent-to-provider credentials session-scoped and auto-expiring, or are they long-lived keys that could be extracted from your deployment?

7. Do your agents run under a policy document — or under assumption?

Most agent deployments run under assumption: the developer assumed the agent would behave a certain way, but nothing enforces that assumption at runtime.

A policy document means:

  • Explicit allow/deny rules for which models the agent can call
  • Explicit allow/deny rules for which tools and actions the agent can take
  • Explicit rules for what triggers an approval requirement vs. automatic execution
  • Version-controlled policy — so you can see what changed, when, and why

Policy-as-code for agents is not a nice-to-have for regulated teams. It is the thing your auditor will ask for.

What to verify: Do you have a written, versioned policy document governing each production agent? Could you produce it during an audit?
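A sketch of what "policy document" can mean in practice, with an illustrative schema and placeholder model and tool names. The value is less the format than that the rules are explicit, versioned, and evaluated at runtime:

```python
POLICY = {
    "version": "2025-01-14",             # version-controlled: what changed, when, why
    "agent": "support-triage-01",
    "models": {"allow": ["approved-model-v1"]},
    "tools": {"allow": ["search_tickets", "send_email"]},
    "require_approval": ["send_email"],  # everything else on the allow list runs automatically
}

def decide(policy: dict, tool: str) -> str:
    # Deny by default; the approval gate sits between "allowed" and "automatic".
    if tool not in policy["tools"]["allow"]:
        return "deny"
    if tool in policy["require_approval"]:
        return "approve"
    return "allow"
```

The deny-by-default shape is the point: a tool absent from the policy is denied, rather than relying on a developer's assumption that it will never be called.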

8. Have you tested what happens when an agent fails?

Failure modes for AI agents are different from failure modes for deterministic code. An agent can:

  • Hallucinate a tool call that does not exist
  • Loop on a task it cannot complete
  • Escalate its own permissions if the policy layer is not enforced at the execution boundary
  • Silently succeed at the wrong action

Before production, run failure scenarios:

  • What happens if the approval gate is unreachable?
  • What happens if the agent calls a tool that returns an error?
  • What happens if the agent tries to call a tool it is not permitted to use?
  • What happens if the LLM returns a malformed tool call argument?

If the answer to any of these is “we are not sure,” that is a production readiness gap.

What to verify: Do you have documented failure scenarios and expected behaviour for each?
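These drills are cheap to automate. A sketch of a fail-closed tool dispatcher, with illustrative tool names, against which each of the scenarios above can be asserted:

```python
TOOLS = {"search_tickets": lambda query: f"results for {query}"}   # illustrative

def dispatch(tool: str, args) -> tuple[str, str]:
    if tool not in TOOLS:
        return ("rejected", "unknown tool")          # hallucinated tool call
    if not isinstance(args, dict):
        return ("rejected", "malformed arguments")   # bad LLM output, rejected pre-call
    try:
        return ("ok", TOOLS[tool](**args))
    except TypeError:
        return ("rejected", "bad argument names")    # wrong keys or arity from the model
    except Exception as exc:
        return ("error", str(exc))                   # the tool itself failed at runtime
```

Each "What happens if…" question above should map to one deterministic branch like these, with a documented expected outcome, rather than to "we are not sure".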

The production-ready threshold

An AI agent is production-ready from a governance perspective when you can answer yes to all eight checks above.

Most teams launching agents today can answer yes to two or three. The gaps are not a sign of carelessness — they are a sign of a market that moved faster than the tooling.

Qadar’s Shield Control closes these gaps: agent identity registry, tool inventory, pre-execution approval gates, session-scoped credentials, policy enforcement, and audit trail — deployed as a single governance layer that your operations team can configure without custom development.


Preparing agents for production? See how Qadar governs the full agent lifecycle.
