GenAI Model Risk Standards

Audit-Ready AI Models

5-point compliance framework with red-team testing built in

Download Free Guide

Features List

7

Performance Thresholds

6

Red-Team Attack Vectors

Days

Not Months to Red-Team

Why Risk and Compliance Teams Use This Standard

Register Before You Deploy

Every GenAI model must be registered within 5 business days of deployment. This standard gives you a 13-field inventory spec, a unique model ID format, and a lifecycle status system — Draft, Pending, Active, Suspended, Retired. Your MRM team always knows what's running and who owns it.

Risk Tier Every Application

Not every AI tool carries the same risk. The Green/Amber/Red tiering framework tells your team exactly which controls apply to each deployment. Internal productivity tools go Green. Customer-facing financial outputs go Red. No grey areas, no guessing

Red-Team Before Go-Live

Amber and Red tier models must pass adversarial testing across 6 attack vectors: prompt injection, jailbreaking, data extraction, hallucination probing, bias testing, and toxicity. This standard gives you the test descriptions and pass criteria. Red-teaming takes days, not months, when you know what to test.

7 Thresholds, No Ambiguity

Accuracy must hit 85%. Hallucination rate must stay below 5%. PII leakage tolerance is zero. Prompt injection resistance must reach 95%. These thresholds are documented, measurable, and defensible. Auditors get numbers, not assertions.

Change Control That Holds

Seven change types trigger mandatory revalidation — foundation model changes, system prompt modifications, RAG knowledge base updates, API provider changes, and more. Each change type maps to a revalidation scope and approval authority. Your model stays compliant through every update.

What You Get

Model Registration & Inventory Spec

A 13-field mandatory registration template with model ID format (GENAI-[BU]-[YYYY]-[NNN]), owner assignment, use case description, foundation model, deployment type, data classification, risk tier, and lifecycle status fields.
3-Tier Risk Classification Framework

Green, Amber, and Red tier criteria with real-world examples for each tier. Includes a full controls matrix showing what's required vs. optional at each tier across 9 control dimensions.
Pre-Production Validation Requirements

Functional testing standard (minimum 100 test cases, 95% pass rate), prompt chain testing protocol, and a 6-vector red-team testing framework for Amber and Red tier models.
7 Quantitative Performance Thresholds

Documented pass/fail thresholds for accuracy (85%), bias score (0.10 max deviation), PII leakage (zero), prompt injection resistance (95%), hallucination rate (5% max), response consistency (90%), and latency P95 (5 seconds).
Change Control Requirements Matrix

Seven change types mapped to revalidation scope and approval authority. Covers foundation model changes, version updates, system prompt modifications, RAG updates, data source changes, API provider changes, and use case expansion.
Production Monitoring Schedule

Seven ongoing monitoring activities with frequency and responsible party. Covers output quality sampling, latency tracking, error rate analysis, user feedback review, drift detection, bias monitoring, and quarterly security scans.
10-Field Interaction Logging Specification

Mandatory log fields for all GenAI interactions: timestamp, user ID, session ID, input prompt, model output, model version, latency metrics, token usage, error codes, and HITL review outcome.
Incident Classification & Response

4-level severity matrix (Critical, High, Medium, Low) with response SLAs and escalation paths. Critical incidents require C-Suite notification within 1 hour. Includes a 7-step response procedure from detection to lessons learned.
Document Retention Schedule

Retention requirements by document type: model registration and risk assessments retained for model life plus 7 years, validation evidence for model life plus 5 years, interaction logs from 6 to 24 months by tier.
Exception Management Process

5-step formal exception process: submit request, document justification, propose compensating controls, obtain AI Oversight Committee approval, and renew annually. Exceptions do not auto-renew.

When to Use This Standard

Standing up a GenAI MRM program

Your organization is deploying AI across multiple business units and needs a formal model risk management standard. This document gives you the full framework — registration, risk tiering, validation, monitoring, change control, and incident management — in one structured standard you can adopt or adapt.

Preparing for an AI model audit

Regulators or internal auditors are asking how your GenAI models are validated, monitored, and controlled. This standard maps to EU AI Act, GDPR, PDPA, and industry-specific requirements. Section 11.2 lists exactly what auditors can request — and you'll have all of it documented.

Running red-team testing for the first time

Your team needs to adversarially test an AI model but doesn't know where to start. Section 6.1.3 gives you 6 attack vectors with test descriptions. Minimum 50 prompt injection patterns. Jailbreaking attempts using published techniques. Hallucination probing with edge cases. You can run structured red-team testing in days.

Managing an AI model change or upgrade

You're updating a foundation model, modifying a system prompt, or switching API providers. Section 7 tells you which changes trigger mandatory revalidation, what the revalidation scope covers, and who needs to approve. No change slips through without the right controls.

Building a cross-functional AI risk team

You're defining responsibilities across Model Owners, the AI Risk Officer, MRM Team, and Internal Audit. Section 3's Three Lines of Defense model gives each function specific GenAI responsibilities — from model registration through independent validation to audit reporting.

The 5-Point Compliance Framework

Compliance Area	What It Covers	Key Threshold	Applies To
1. Model Registration	Mandatory inventory registration within 5 business days; 13-field spec; lifecycle status tracking	100% of models must be registered before production	All tiers
2. Risk Tiering	Green/Amber/Red classification based on use case, data, and output type; full controls matrix by tier	Controls escalate from self-assessment (Green) to Committee approval (Red)	All tiers
3. Validation & Testing	100 test cases minimum; 95% pass rate; prompt chain testing; 6-vector red-team for Amber/Red	95% functional pass rate; 7 quantitative thresholds met	Amber / Red mandatory
4. Ongoing Monitoring	7 production monitoring activities from daily output sampling to quarterly security scans; 10-field logging	Daily automated + weekly human output quality review	All active models
5. Change Control	7 change types trigger revalidation; emergency procedure with 10-day full validation window	Foundation model changes require full revalidation	All tiers

Common Questions

Who is this standard written for?

How does this differ from the GenAI Governance Policy (AICFO-007)?

What is red-team testing and how long does it take?

What are the 7 quantitative performance thresholds?

Does this standard cover third-party AI tools like ChatGPT or Claude?

What happens if we can't meet one of the requirements?

Your AI Models Need Standards. Start Here.

A complete GenAI MRM standard for risk and compliance teams. Free. Updated for 2026.

Download Free