Colorado SB 24-205 and NAIC Model Bulletin: What They Mean for Your AI Tools
Colorado SB 24-205 requires insurers using AI in underwriting and pricing to test for unfair discrimination and maintain governance documentation. The NAIC Model Bulletin provides a framework that other states are expected to follow. If you use AI in underwriting or claims, you need a compliance plan now.
If you work in insurance and use AI in any part of your underwriting, pricing, or claims process, regulatory requirements are no longer theoretical. Colorado’s SB 24-205, signed into law in May 2024, creates specific obligations for insurers using AI and algorithmic decision-making. The NAIC Model Bulletin, adopted in December 2023, provides a framework that other states are beginning to adopt.
This guide translates both documents from regulatory language into practical requirements. We also compare three compliance monitoring tools (Monitaur, Arthur AI, and Credo AI) for insurance organizations that need to implement governance programs.
What Colorado SB 24-205 Actually Requires
Colorado SB 24-205 is titled “Concerning Consumer Protections in Interactions with Artificial Intelligence Systems.” It applies to “developers” and “deployers” of “high-risk AI systems,” and insurance is among the consequential-decision domains that make a system high-risk. Here is what the law requires, in plain language.
Governance Framework
Deployers (insurers using AI systems) must implement a governance program that includes:
- Risk assessment. Before deploying an AI system, you must assess the risk of algorithmic discrimination. This is not a one-time check; you must reassess at least annually or when the system is materially modified.
- Documentation. You must maintain records of what AI systems you use, what decisions they influence, what data they were trained on, and what testing you have performed. The documentation must be sufficient for a regulator to understand your AI decision-making process.
- Impact assessments. For high-risk AI systems (which includes underwriting and pricing models), you must complete impact assessments that analyze the potential for unfair discrimination based on protected classes (race, gender, age, disability, and others).
Testing for Unfair Discrimination
This is the core requirement that most affects day-to-day operations. SB 24-205 requires deployers to:
- Test AI outputs for disparate impact across protected classes before deployment and on an ongoing basis.
- Document testing methodology including the statistical methods used, the protected classes tested, and the results.
- Take corrective action if testing reveals unfair discrimination. “Corrective action” can mean model adjustment, additional human oversight, or discontinuing use of the system.
The law does not prescribe specific testing methodologies. It requires that testing be “reasonable” and that results be documented. This gives insurers flexibility in choosing methods, but it also means there is no safe-harbor checklist: you cannot point to a prescribed methodology as proof that your testing was adequate.
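To make the disparate-impact test concrete, here is a minimal sketch of one common heuristic: comparing each group's selection (approval) rate against a reference group. The 0.8 threshold is the EEOC "four-fifths" rule of thumb, which SB 24-205 does not mandate; the group labels and decision data are hypothetical.

```python
from collections import defaultdict

def adverse_impact_ratios(decisions, reference_group):
    """Compute per-group approval rates and each rate's ratio to the
    reference group's rate. `decisions` is a list of (group, approved)
    pairs. A ratio below 0.8 is a common red flag under the EEOC
    "four-fifths" heuristic, though SB 24-205 sets no specific threshold."""
    totals, approvals = defaultdict(int), defaultdict(int)
    for group, approved in decisions:
        totals[group] += 1
        if approved:
            approvals[group] += 1
    rates = {g: approvals[g] / totals[g] for g in totals}
    ref_rate = rates[reference_group]
    return {g: (rates[g], rates[g] / ref_rate) for g in rates}

# Hypothetical underwriting decisions: (group label, approved?)
decisions = ([("A", True)] * 80 + [("A", False)] * 20
             + [("B", True)] * 55 + [("B", False)] * 45)
for group, (rate, ratio) in adverse_impact_ratios(decisions, "A").items():
    flag = "REVIEW" if ratio < 0.8 else "ok"
    print(f"group {group}: approval rate {rate:.2f}, ratio {ratio:.2f} ({flag})")
# group A: approval rate 0.80, ratio 1.00 (ok)
# group B: approval rate 0.55, ratio 0.69 (REVIEW)
```

A ratio below the threshold is a signal to investigate and document, not an automatic legal conclusion; whatever method you use, the methodology and results must be recorded as described above.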
Transparency and Notice
Deployers must:
- Inform consumers when an AI system has made or substantially contributed to an adverse decision (denial, rate increase, claim denial).
- Provide a right to appeal adverse decisions made by AI systems, with human review available.
- Disclose to the Division of Insurance the types of AI systems used in regulated insurance activities.
Penalties
Non-compliance is treated as an unfair trade practice under Colorado insurance law. The Division of Insurance can impose fines, require corrective actions, and in serious cases, take action against an insurer’s license. Specific penalty amounts are determined by the severity and willfulness of the violation.
Effective Date and Timeline
SB 24-205 was signed in May 2024, with most provisions taking effect February 1, 2026. If you are reading this in 2026, the compliance clock is already running.
What the NAIC Model Bulletin Says
The National Association of Insurance Commissioners adopted its “Model Bulletin on the Use of Artificial Intelligence Systems by Insurers” in December 2023. While model bulletins are not law (individual states must adopt them), the NAIC bulletin provides the framework that state regulators are using to shape their AI governance requirements.
Key Principles
The NAIC Model Bulletin establishes five principles for insurer AI use:
- Fair and ethical. AI systems must not unfairly discriminate against consumers.
- Accountable. Insurers are responsible for the outcomes of AI systems they use, including third-party systems and vendor models.
- Compliant. AI use must comply with existing insurance laws and regulations, including unfair trade practice laws.
- Transparent. Insurers should be able to explain AI-driven decisions in a manner appropriate to the audience (regulators, consumers, internal stakeholders).
- Secure and safe. AI systems must be protected against misuse, manipulation, and data breaches.
The AIS (Artificial Intelligence Systems) Governance Framework
The NAIC bulletin calls for a written “AIS Program” within each insurer, which includes:
- Board or senior management oversight of AI governance.
- Written AIS policies and procedures covering development, deployment, monitoring, and retirement of AI systems.
- Risk-based tiering of AI systems by their potential impact on consumers. Higher-impact systems (underwriting, pricing, claims decisions) require more rigorous governance.
- Third-party risk management. If you use vendor AI models (predictive scoring, automated underwriting, claims triage), you are still responsible for governance and testing. “My vendor built it” is not a defense.
- Ongoing monitoring of AI system performance, accuracy, and fairness after deployment.
How the NAIC Bulletin Relates to Colorado SB 24-205
Colorado’s law is more specific and enforceable than the NAIC bulletin. The NAIC bulletin provides principles and a governance framework; Colorado’s law creates legal obligations with penalties. In practice, implementing the NAIC bulletin’s framework will put you most of the way toward Colorado compliance, but you need to add the specific testing, documentation, and consumer notification requirements from SB 24-205.
Which States Are Following Colorado’s Lead
Colorado was first, but other states are moving:
Connecticut (Bulletin IC-47): Issued guidance requiring insurers to establish AI governance programs and conduct bias testing. Less prescriptive than Colorado but establishes regulatory expectations.
New York (DFS Circular Letter): The New York Department of Financial Services has issued guidance on AI in insurance underwriting, requiring insurers to ensure AI models do not produce unfairly discriminatory outcomes. New York has historically been the most aggressive state regulator on insurance practices.
California (pending): Multiple AI-related bills have been introduced in the California legislature. Given California’s insurance market size and regulatory history, whatever California enacts will affect a large portion of the industry.
Other states watching: Virginia, Illinois, and Maryland have introduced or discussed AI governance legislation. The NAIC bulletin provides the template that most states are expected to follow.
The practical implication: even if you do not write business in Colorado, building your governance program now positions you for the regulations that are coming in your state.
What This Means for Your AI Tools
If you use AI in any of these functions, you need to assess your compliance obligations:
Underwriting Models
Predictive models that score risks, recommend pricing, or flag applications for review are high-risk AI systems under both Colorado SB 24-205 and the NAIC framework. This includes:
- Third-party underwriting scores (LexisNexis, Verisk, Cape Analytics property scores)
- In-house predictive models for risk selection
- Automated triage rules that determine which applications receive expedited vs. full review
- AI-driven premium optimization
For each of these, you need to document what the model does, test for disparate impact, and maintain records of your testing.
Claims Models
AI systems that influence claims decisions are explicitly covered. This includes:
- Automated claims triage and routing
- Fraud detection models (Shift Technology, FRISS, and similar tools)
- Reserve-setting algorithms
- Automated denial or settlement recommendations
Fraud detection models deserve special attention. A model that disproportionately flags claims from specific demographic groups for fraud investigation could create unfair discrimination even if the model’s intent is legitimate.
Pricing Algorithms
Any AI or algorithmic system that influences the price a consumer pays is covered. This includes:
- Price optimization models
- Rating factor models that use non-traditional data (credit-based insurance scores, telematics, property imagery)
- Dynamic pricing algorithms
Rating models are already subject to rate filing requirements in most states, but the AI governance requirements add bias testing obligations that go beyond traditional actuarial review.
Compliance Tooling Comparison
Three platforms offer tools specifically designed for AI governance in regulated industries, including insurance. Here is how they compare.
| Feature | Monitaur | Arthur AI | Credo AI |
|---|---|---|---|
| Focus | Insurance-specific AI governance | ML monitoring and bias detection | GRC framework for AI |
| Founded by | Former insurance regulators | ML research team (ex-Uber, ex-Google) | AI governance researchers |
| Bias detection | Yes, insurance-specific metrics | Yes, statistical and model-level | Yes, policy-driven |
| Model monitoring | Continuous performance tracking | Real-time model monitoring | Risk-based monitoring |
| Regulatory mapping | NAIC bulletin, state-specific requirements | General regulatory frameworks | Multi-framework (NIST, EU AI Act, state laws) |
| Documentation generation | Automated compliance docs for regulators | Model cards, audit reports | AI risk assessments, policy docs |
| LLM safety | Limited | Yes (Arthur Shield for LLM outputs) | Limited |
| Insurance traction | Strong (built for insurance) | Growing (cross-industry) | Growing (cross-industry) |
| Pricing | Enterprise (contact sales) | Enterprise (contact sales) | Enterprise (contact sales) |
| Deployment | Cloud, on-prem | Cloud, on-prem | Cloud |
Monitaur: Built by Former Regulators
Monitaur was founded by people with direct insurance regulatory experience. Their platform is purpose-built for the compliance requirements that insurers face, including NAIC bulletin alignment, state-specific regulatory reporting, and insurance-specific bias metrics.
Strengths: Monitaur speaks the language of insurance regulators. Their compliance documentation templates map directly to the NAIC bulletin framework and Colorado SB 24-205 requirements. When you need to explain your AI governance program to a state examiner, Monitaur generates documentation in the format regulators expect. Their bias testing methodology accounts for insurance-specific protected classes and rate-filing requirements.
Limitations: Monitaur offers less technical ML-monitoring depth than Arthur AI or Credo AI. If you need real-time model performance dashboards, drift detection, or LLM safety monitoring, Arthur AI is the stronger choice. Monitaur is best for governance and compliance rather than technical ML operations.
Best for: Insurance companies that need to build or improve their AI governance program specifically for regulatory compliance. If your primary concern is satisfying regulators and documenting compliance, Monitaur is the most direct path.
Arthur AI: Technical ML Monitoring
Arthur AI is a technical platform for monitoring machine learning models in production. Their toolset includes bias detection, performance monitoring, explainability, and (through Arthur Shield) LLM output safety monitoring.
Strengths: Arthur AI’s bias detection is statistically rigorous, supporting multiple fairness metrics (demographic parity, equalized odds, predictive parity) and enabling continuous monitoring as model outputs change over time. For insurers running predictive underwriting or claims models, Arthur AI can detect when model behavior drifts toward discriminatory patterns before it shows up in regulatory examinations. Arthur Shield, their LLM monitoring product, is unique among these three tools. If your organization uses LLMs (for claims summarization, underwriting notes, customer communication), Arthur Shield monitors outputs for hallucinations, toxicity, and policy violations.
Limitations: Arthur AI is not insurance-specific. The platform does not generate documentation in the format insurance regulators expect. You will need to translate Arthur AI’s technical outputs into regulatory compliance language, either manually or through additional tooling. The platform assumes your team includes ML engineers or data scientists who can interpret bias metrics and model performance data.
Best for: Insurance companies with ML engineering teams that need deep technical monitoring of production models. If you have data scientists building and maintaining predictive models, Arthur AI gives them the monitoring tools they need. Pair with Monitaur for regulatory documentation if you need both technical monitoring and compliance reporting.
Credo AI: GRC Framework for AI
Credo AI takes a governance, risk, and compliance (GRC) approach to AI oversight. Their platform maps AI systems to regulatory requirements (including Colorado SB 24-205, NIST AI Risk Management Framework, and EU AI Act), generates risk assessments, and automates policy documentation.
Strengths: Credo AI excels at the organizational governance layer. If you need to inventory all AI systems in use, assign risk ratings, map them to regulatory requirements, and track compliance across multiple jurisdictions, Credo AI provides a structured framework for doing so. Their policy automation generates AI-use policies, risk assessments, and impact analyses that align with multiple regulatory frameworks simultaneously. For insurers operating across many states, this multi-jurisdictional approach is valuable.
Limitations: Credo AI’s bias detection is less technically deep than Arthur AI’s. The platform is better at governance documentation and risk assessment than at statistical analysis of model outputs. Like Arthur AI, Credo AI is cross-industry rather than insurance-specific, so regulatory mapping to insurance-specific requirements (rate filing, unfair trade practice law) requires configuration.
Best for: Insurance companies that need an enterprise-wide AI inventory and governance framework. If your CRO or compliance officer needs visibility into all AI systems, their risk ratings, and their regulatory compliance status, Credo AI provides that organizational layer. Best paired with a technical monitoring tool (Arthur AI or Monitaur) for actual model testing.
Practical Compliance Checklist
Based on Colorado SB 24-205 requirements and the NAIC Model Bulletin framework, here is a practical checklist for insurance organizations using AI.
Immediate Actions (Do Now)
- Inventory all AI systems in use across underwriting, claims, pricing, and operations. Include vendor models and third-party scores, not just in-house models.
- Classify by risk tier. Systems that directly influence consumer-facing decisions (underwriting, pricing, claims) are high-risk. Systems that support internal operations (document extraction, scheduling) are lower risk.
- Assign governance ownership. Someone (compliance officer, CRO, or a designated AI governance officer) must be accountable for AI governance. This should not be a part-time afterthought.
- Review vendor contracts. Your AI vendor agreements should include provisions for bias testing access, model documentation, and regulatory audit support. If they do not, renegotiate.
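An AI inventory with risk tiering does not require specialized tooling to start; even a simple structured record works. The sketch below is a minimal illustration (the system names, vendor, and tiering heuristic are hypothetical; your actual tier criteria should follow counsel's reading of the statute):

```python
from dataclasses import dataclass

@dataclass
class AISystem:
    name: str
    vendor: str           # "in-house" for internal models
    function: str         # underwriting, claims, pricing, operations
    consumer_facing: bool  # does output influence a consumer decision?

def risk_tier(system: AISystem) -> str:
    """Toy tiering heuristic: systems that influence consumer-facing
    underwriting, pricing, or claims decisions are high-risk under
    SB 24-205; internal tooling is lower risk."""
    high_risk_functions = {"underwriting", "pricing", "claims"}
    if system.consumer_facing and system.function in high_risk_functions:
        return "high"
    return "standard"

# Hypothetical inventory entries, including a third-party vendor score.
inventory = [
    AISystem("property-risk-score", "ExampleVendorCo", "underwriting", True),
    AISystem("doc-extraction", "in-house", "operations", False),
]
for s in inventory:
    print(f"{s.name} ({s.vendor}): tier={risk_tier(s)}")
```

Even this level of structure makes the later steps (testing, documentation, vendor review) tractable, because every downstream obligation attaches to an inventory entry.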
Short-Term Actions (Next 90 Days)
- Conduct initial bias testing on your highest-risk AI systems. Test for disparate impact across protected classes (race, gender, age, disability). Document methodology and results.
- Document AI decision flows. For each high-risk AI system, create a written description of what data goes in, what the model does, and how the output influences decisions. A regulator should be able to read this document and understand your process.
- Implement consumer notification. When an AI system contributes to an adverse decision (denial, surcharge, claim denial), your consumer communication must disclose AI involvement and offer appeal with human review.
- Establish monitoring cadence. Define how often you will re-test AI systems for bias and performance. Quarterly is a common cadence for high-risk systems; annual may suffice for lower-risk tools.
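When a scheduled re-test shows a gap in adverse-decision rates between groups, a basic significance check helps distinguish random noise from a pattern worth escalating. Here is a stdlib-only sketch using a two-proportion z-test (the counts are hypothetical, and a normal approximation like this is only one of several reasonable methods):

```python
import math

def two_proportion_pvalue(x1, n1, x2, n2):
    """Approximate two-sided p-value (normal approximation) for the
    difference between two proportions, e.g. a protected group's
    adverse-decision rate vs. a reference group's. Small p-values mean
    the gap is unlikely to be chance and warrants investigation."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0.0:
        return 1.0  # no variation: identical all-or-nothing outcomes
    z = (p1 - p2) / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability

# Hypothetical quarterly re-test: 120 of 1,000 protected-group applicants
# received an adverse decision vs. 90 of 1,000 in the reference group.
p = two_proportion_pvalue(120, 1000, 90, 1000)
print(f"p-value: {p:.4f}")
```

Whatever test you choose, record the counts, the method, and the conclusion; the documentation requirement applies to re-tests just as it does to pre-deployment testing.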
Ongoing Actions (Quarterly or As Needed)
- Re-test for bias when models are updated, when training data changes, or when you enter new markets/demographics.
- Review third-party model changes. When vendors update their models, you need to reassess compliance. Build vendor notification requirements into your contracts.
- Update documentation. Your AI governance documentation is a living set of records, not a one-time filing. Update it when systems change.
- Track regulatory developments. As additional states adopt AI governance requirements, update your compliance program to address new jurisdictions.
Timeline: What to Do When
| Timeframe | Colorado SB 24-205 | NAIC Model Bulletin | Your Organization |
|---|---|---|---|
| Already effective | Governor signed May 2024 | Adopted Dec 2023 | Review AI inventory |
| Feb 2026 | Most provisions effective | States beginning adoption | Governance program operational |
| 2026-2027 | Enforcement begins | More states adopt | Ongoing testing, documentation |
| 2027+ | Potential amendments based on enforcement experience | Possible NAIC updates | Mature compliance program |
The Cost of Inaction
Compliance has a cost, but non-compliance costs more. Beyond regulatory penalties, the reputational risk of an AI discrimination finding is substantial. State insurance departments publish enforcement actions. Consumer advocacy groups monitor AI-related complaints. Class-action attorneys are watching this space closely.
The practical cost of a basic compliance program (AI inventory, initial bias testing, governance documentation) is modest compared to the cost of remediation after a regulatory finding. Our estimate, based on conversations with compliance officers at four carriers:
- Basic compliance program (small carrier): $50,000-$150,000 in first-year costs including tooling, consulting, and staff time.
- Moderate compliance program (mid-size carrier): $150,000-$500,000 including enterprise tooling, dedicated staff, and ongoing monitoring.
- Remediation after regulatory finding: $500,000-$5 million+ including fines, legal fees, model rework, and business disruption.
Starting now, even with a basic program, is significantly cheaper than waiting for a regulatory examination to expose gaps.
This article provides general guidance based on our reading of Colorado SB 24-205 and the NAIC Model Bulletin. It is not legal advice. Consult with insurance regulatory counsel for guidance specific to your organization and the states in which you operate.