Audit-Ready AI Models
GenAI Model Risk Standards
5-point compliance framework with red-team testing built in
Download Free GuideFeatures List
7
Performance Thresholds
6
Red-Team Attack Vectors
Days
Not Months to Red-Team
Why Risk and Compliance Teams Use This Standard
Register Before You Deploy
Every GenAI model must be registered within 5 business days of deployment. This standard gives you a 13-field inventory spec, a unique model ID format, and a lifecycle status system — Draft, Pending, Active, Suspended, Retired. Your MRM team always knows what's running and who owns it.
Risk Tier Every Application
Not every AI tool carries the same risk. The Green/Amber/Red tiering framework tells your team exactly which controls apply to each deployment. Internal productivity tools go Green. Customer-facing financial outputs go Red. No grey areas, no guessing
Red-Team Before Go-Live
Amber and Red tier models must pass adversarial testing across 6 attack vectors: prompt injection, jailbreaking, data extraction, hallucination probing, bias testing, and toxicity. This standard gives you the test descriptions and pass criteria. Red-teaming takes days, not months, when you know what to test.
7 Thresholds, No Ambiguity
Accuracy must hit 85%. Hallucination rate must stay below 5%. PII leakage tolerance is zero. Prompt injection resistance must reach 95%. These thresholds are documented, measurable, and defensible. Auditors get numbers, not assertions.
Change Control That Holds
Seven change types trigger mandatory revalidation — foundation model changes, system prompt modifications, RAG knowledge base updates, API provider changes, and more. Each change type maps to a revalidation scope and approval authority. Your model stays compliant through every update.
What You Get
-
Model Registration & Inventory Spec
A 13-field mandatory registration template with model ID format (GENAI-[BU]-[YYYY]-[NNN]), owner assignment, use case description, foundation model, deployment type, data classification, risk tier, and lifecycle status fields.
-
3-Tier Risk Classification Framework
Green, Amber, and Red tier criteria with real-world examples for each tier. Includes a full controls matrix showing what's required vs. optional at each tier across 9 control dimensions.
-
Pre-Production Validation Requirements
Functional testing standard (minimum 100 test cases, 95% pass rate), prompt chain testing protocol, and a 6-vector red-team testing framework for Amber and Red tier models.
-
7 Quantitative Performance Thresholds
Documented pass/fail thresholds for accuracy (85%), bias score (0.10 max deviation), PII leakage (zero), prompt injection resistance (95%), hallucination rate (5% max), response consistency (90%), and latency P95 (5 seconds).
-
Change Control Requirements Matrix
Seven change types mapped to revalidation scope and approval authority. Covers foundation model changes, version updates, system prompt modifications, RAG updates, data source changes, API provider changes, and use case expansion.
-
Production Monitoring Schedule
Seven ongoing monitoring activities with frequency and responsible party. Covers output quality sampling, latency tracking, error rate analysis, user feedback review, drift detection, bias monitoring, and quarterly security scans.
-
10-Field Interaction Logging Specification
Mandatory log fields for all GenAI interactions: timestamp, user ID, session ID, input prompt, model output, model version, latency metrics, token usage, error codes, and HITL review outcome.
-
Incident Classification & Response
4-level severity matrix (Critical, High, Medium, Low) with response SLAs and escalation paths. Critical incidents require C-Suite notification within 1 hour. Includes a 7-step response procedure from detection to lessons learned.
-
Document Retention Schedule
Retention requirements by document type: model registration and risk assessments retained for model life plus 7 years, validation evidence for model life plus 5 years, interaction logs from 6 to 24 months by tier.
-
Exception Management Process
5-step formal exception process: submit request, document justification, propose compensating controls, obtain AI Oversight Committee approval, and renew annually. Exceptions do not auto-renew.
When to Use This Standard
Standing up a GenAI MRM program
Your organization is deploying AI across multiple business units and needs a formal model risk management standard. This document gives you the full framework — registration, risk tiering, validation, monitoring, change control, and incident management — in one structured standard you can adopt or adapt.
Preparing for an AI model audit
Regulators or internal auditors are asking how your GenAI models are validated, monitored, and controlled. This standard maps to EU AI Act, GDPR, PDPA, and industry-specific requirements. Section 11.2 lists exactly what auditors can request — and you'll have all of it documented.
Running red-team testing for the first time
Your team needs to adversarially test an AI model but doesn't know where to start. Section 6.1.3 gives you 6 attack vectors with test descriptions. Minimum 50 prompt injection patterns. Jailbreaking attempts using published techniques. Hallucination probing with edge cases. You can run structured red-team testing in days.
Managing an AI model change or upgrade
You're updating a foundation model, modifying a system prompt, or switching API providers. Section 7 tells you which changes trigger mandatory revalidation, what the revalidation scope covers, and who needs to approve. No change slips through without the right controls.
Building a cross-functional AI risk team
You're defining responsibilities across Model Owners, the AI Risk Officer, MRM Team, and Internal Audit. Section 3's Three Lines of Defense model gives each function specific GenAI responsibilities — from model registration through independent validation to audit reporting.
The 5-Point Compliance Framework
| Compliance Area | What It Covers | Key Threshold | Applies To |
|---|---|---|---|
| 1. Model Registration | Mandatory inventory registration within 5 business days; 13-field spec; lifecycle status tracking | 100% of models must be registered before production | All tiers |
| 2. Risk Tiering | Green/Amber/Red classification based on use case, data, and output type; full controls matrix by tier | Controls escalate from self-assessment (Green) to Committee approval (Red) | All tiers |
| 3. Validation & Testing | 100 test cases minimum; 95% pass rate; prompt chain testing; 6-vector red-team for Amber/Red | 95% functional pass rate; 7 quantitative thresholds met | Amber / Red mandatory |
| 4. Ongoing Monitoring | 7 production monitoring activities from daily output sampling to quarterly security scans; 10-field logging | Daily automated + weekly human output quality review | All active models |
| 5. Change Control | 7 change types trigger revalidation; emergency procedure with 10-day full validation window | Foundation model changes require full revalidation | All tiers |
Common Questions
Who is this standard written for?
How does this differ from the GenAI Governance Policy (AICFO-007)?
What is red-team testing and how long does it take?
What are the 7 quantitative performance thresholds?
Does this standard cover third-party AI tools like ChatGPT or Claude?
What happens if we can't meet one of the requirements?
Your AI Models Need Standards. Start Here.
A complete GenAI MRM standard for risk and compliance teams. Free. Updated for 2026.