The MRM Thread
Regulatory compliance and model risk management for AI recommendation systems, from a GARP FRM practitioner perspective.
[MRM Thread] Ep 1 — Why MRM Belongs in the Architecture
The validation-first view of Model Risk Management breaks once the model becomes an LLM agent pipeline — multi-step attack surface, non-traditional failure modes, drift between validation cycles. Push MRM into the architecture itself, not the review calendar.
[MRM Thread] Ep 2 — Champion-Challenger as a Gate
A Monday-3am walkthrough of `_decide_promotion()`: a 4-step short-circuit ladder (force-promote, bootstrap, fidelity floor, competition) that replaces a 2-to-4-week MRM committee cycle with a decision made in seconds, where every outcome writes one HMAC-signed audit entry.
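As a rough illustration of what a short-circuit ladder of this shape could look like, here is a minimal sketch. It is not the episode's actual `_decide_promotion()`: the `Candidate` fields, the `fidelity_floor` default, the reason strings, and the `signed_audit_entry` helper are all hypothetical, and the exact semantics of each rung (for example, whether bootstrap skips the fidelity check) are assumptions made only to keep the example concrete.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Candidate:
    # Hypothetical fields; the real model record is not described in this summary.
    fidelity: float
    score: float


def decide_promotion(
    challenger: Candidate,
    champion: Optional[Candidate],
    force: bool = False,
    fidelity_floor: float = 0.9,  # assumed threshold, for illustration only
) -> Tuple[str, str]:
    """Return (decision, reason); the first rung that applies wins."""
    # Rung 1: force-promote, an explicit operator override.
    if force:
        return "promote", "force_promote"
    # Rung 2: bootstrap, there is no champion to compete against yet.
    if champion is None:
        return "promote", "bootstrap"
    # Rung 3: fidelity floor, reject before any head-to-head comparison.
    if challenger.fidelity < fidelity_floor:
        return "reject", "fidelity_floor"
    # Rung 4: competition, the challenger must beat the champion outright.
    if challenger.score > champion.score:
        return "promote", "competition_won"
    return "reject", "competition_lost"


def signed_audit_entry(decision: str, reason: str, key: bytes) -> dict:
    """Build one HMAC-SHA256-signed audit entry for a single outcome."""
    payload = json.dumps({"decision": decision, "reason": reason}, sort_keys=True)
    signature = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "hmac": signature}
```

Because every rung returns immediately, each call produces exactly one `(decision, reason)` pair, which maps naturally onto one signed entry per outcome.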
[MRM Thread] Ep 3 — Auditing the Auditors: Chain of Custody and Consensus Arbitration
Seven audit tables and an HMAC hash chain give you 'continuity of record'. But who verifies the record? The trap of the single-LLM auditor, multi-agent consensus with α/β/γ perspectives, a minority-report-that-never-gets-deleted design, and why AWS parallel voting and on-prem 2-round deliberation chose different paths.
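For readers unfamiliar with the idea, an HMAC hash chain can be sketched in a few lines. This is a generic illustration, not the series' schema: the entry layout, the `GENESIS` value, and the function names are assumptions. Each entry's MAC covers the previous entry's MAC plus its own body, so editing any earlier record invalidates everything after it.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed placeholder for the "previous MAC" of the first entry


def _mac(key: bytes, prev_mac: str, body: str) -> str:
    # The MAC binds this entry's body to the MAC of the entry before it.
    return hmac.new(key, (prev_mac + body).encode("utf-8"), hashlib.sha256).hexdigest()


def append_entry(chain: list, payload: dict, key: bytes) -> None:
    """Append a payload to the chain, linking it to the previous entry."""
    prev_mac = chain[-1]["mac"] if chain else GENESIS
    body = json.dumps(payload, sort_keys=True)
    chain.append({"prev": prev_mac, "body": body, "mac": _mac(key, prev_mac, body)})


def verify_chain(chain: list, key: bytes) -> bool:
    """Recompute every MAC in order; any edit or reordering breaks the chain."""
    prev_mac = GENESIS
    for entry in chain:
        expected = _mac(key, prev_mac, entry["body"])
        if entry["prev"] != prev_mac or not hmac.compare_digest(expected, entry["mac"]):
            return False
        prev_mac = entry["mac"]
    return True
```

Verification of this kind only shows that the record is continuous and unmodified under the key; it says nothing about whether the recorded decisions were sound, which is the question the episode raises about who audits the auditor.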
[MRM Thread] Ep 4 — When Explanation Is Architecture: Inherent XAI and FD-TVS Scoring
Post-hoc XAI (SHAP, LIME) wobbled on us when we ran the cost and stability numbers. We chose architectural XAI instead — the gate weights of seven heterogeneous PLE experts as the explanation, with CEH attribution and Mahalanobis OOD as the second and third layers. FD-TVS is the operational scoring philosophy that grew out of fixing the on-prem per-product weights model.
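To make "gate weights as the explanation" concrete, here is a minimal sketch of reading an explanation directly off a gating layer. It assumes a softmax gate over expert logits, which is common in PLE-style models but is not confirmed by this summary; the expert names, the `top_k` parameter, and both function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def explain_from_gate(
    gate_logits: Sequence[float],
    expert_names: Sequence[str],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank experts by gate weight and return the top_k as the explanation."""
    if len(gate_logits) != len(expert_names):
        raise ValueError("one gate logit is required per expert")
    weights = softmax(gate_logits)
    ranked = sorted(zip(expert_names, weights), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]
```

The appeal of this approach is that the explanation is a by-product of the forward pass: no extra sampling or surrogate model is needed, unlike post-hoc methods such as SHAP or LIME.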
[MRM Thread] Ep 5 — RAG + LanceDB: Why Audit Infrastructure Is a Retrieval Problem
We started with the idea that the audit log was a write-only archive. The first time a risk officer needed similar-case context inside a five-minute decision window, that idea broke. RAG over LanceDB is what came out of refusing to maintain two copies of the same source of truth — and what unlocked human oversight, fairness monitoring, and quarterly aggregation as queries on the same store.
[MRM Thread] Ep 6 — Modular Adaptability: When Regulations Evolve, Architecture Doesn't
The early design temptation was a single ComplianceReporter class parametrised on jurisdiction. We rejected it after walking through what amendment-driven change would look like. The five separate generators, sitting on a regulation-agnostic substrate of audit log + XAI + retrieval, are the structural bet that pays off only when regulations actually start to move.