Oct 9, 2023

Navigating the Shift: From Traditional Machine Learning Governance to LLM-centric AI Governance

The rapid advancement of large language models (LLMs) like ChatGPT has captured headlines and imaginations. AI systems can now generate remarkably human-like text on almost any topic from just a few prompts. The unprecedented capabilities of these models have forced a reevaluation of governance models. As organizations explore integrating LLMs into business operations, they need governance measures that enable innovation while managing risks. For executives, understanding the transition from traditional machine learning governance to LLM-centric AI governance is essential.

The Old Guardrails - Governing Narrow AI

Up until recently, most real-world AI applications involved narrow systems designed to perform specific tasks. For example, an AI system may be trained to analyze mortgage applications and flag ones likely to default for further review. This helps loan officers make better risk assessments. Or a system may classify skin lesions based on dermatology images to assist doctors screening for cancer.

To ensure these AIs were developed responsibly, groups like the IEEE and OECD published ethical principles centered on transparency, accountability, fairness, and human oversight. Organizations adopting narrow AI would then implement governance frameworks based on these guidelines.

For instance, a hospital deploying an AI system for diagnostic screening may have documented how the system was developed, tested it extensively for biases, established protocols allowing doctors to override recommendations, and conducted ongoing audits to monitor for issues. While not infallible, these efforts demonstrated a commitment to using AI responsibly within the constraints of narrow systems.

Components of Narrow AI Governance

Documentation: Traditional models, being simpler, allowed for comprehensive documentation, detailing every aspect from data sources to model architecture.
Risk Management: Risks were often uniform and could be addressed with standardized frameworks. For instance, overfitting was a common concern, addressed through techniques like cross-validation.
Integration: Traditional models, being less complex, integrated relatively seamlessly with legacy systems.
Operations Management: Monitoring and managing traditional models was straightforward, often requiring basic tools to track performance metrics.
Accuracy Monitoring: Drifts in accuracy were easier to spot and rectify using established methodologies.
Change Management: Updates to models were infrequent and well-documented, ensuring smooth transitions.
Auditing & Accountability: Given their transparency, traditional models were easier to audit, ensuring biases and errors could be traced and rectified.
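
To make the accuracy monitoring item concrete, here is a minimal sketch of how drift in a narrow model could be spotted: compare accuracy over a sliding window of recent labeled predictions against a baseline. The class name, the window size, and the tolerance are assumptions chosen for illustration, not a prescribed method.

```python
from collections import deque


class AccuracyMonitor:
    """Tracks accuracy over a sliding window of labeled predictions.

    Hypothetical helper: baseline, window, and tolerance are illustrative values.
    """

    def __init__(self, baseline: float, window: int = 100, tolerance: float = 0.05):
        self.baseline = baseline
        self.tolerance = tolerance
        # Only the most recent `window` outcomes are kept.
        self.outcomes = deque(maxlen=window)

    def record(self, predicted, actual) -> None:
        # Store True for a correct prediction, False otherwise.
        self.outcomes.append(predicted == actual)

    def accuracy(self) -> float:
        # With no data yet, report the baseline so no alert fires.
        if not self.outcomes:
            return self.baseline
        return sum(self.outcomes) / len(self.outcomes)

    def has_drifted(self) -> bool:
        # Drift means recent accuracy fell more than `tolerance` below baseline.
        return self.accuracy() < self.baseline - self.tolerance


monitor = AccuracyMonitor(baseline=0.90, window=10, tolerance=0.05)
for predicted, actual in [(1, 1)] * 7 + [(1, 0)] * 3:
    monitor.record(predicted, actual)
print(monitor.accuracy(), monitor.has_drifted())
```

A check like this works for narrow models because each prediction has a clear right or wrong answer, which is exactly what open-ended LLM outputs lack.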

The Governance Challenge of Open-Ended LLMs

LLMs like ChatGPT represent a seismic shift from narrow AI. Their ability to generate coherent, nuanced text on virtually any topic makes governing them far more complex.

Unlike teams building narrow AI, organizations adopting LLMs have limited visibility into how the models will actually be used once deployed. Their broad capabilities open up endless possibilities for both beneficial and harmful applications, and not all of them can be anticipated upfront.

For example, an LLM could help overwhelmed teachers create personalized learning plans for students. But it could also aid the spread of misinformation by generating convincing propaganda text. And even if used properly, LLMs may behave unpredictably when prompted on sensitive issues like politics and mental health.

So how can organizations govern open-ended tools like LLMs responsibly?

Challenges by component

Documentation: Because of their scale, LLMs are often treated as black boxes. Documenting them demands a shift from detailing architecture to focusing on inputs, outputs, and potential risks.
Risk Management: LLMs introduce unique risks, like perpetuating misinformation. Traditional risk frameworks, designed for simpler models, often fall short.
Integration: Integrating LLMs with legacy systems is a monumental task, given their complexity and resource demands.
Operations Management: LLMs require specialized tools for real-time monitoring, given their dynamic nature.
Accuracy Monitoring: LLMs can drift subtly, making traditional monitoring tools inadequate. For instance, an LLM might start generating slightly biased content over time, unnoticed by conventional tools.
Change Management: LLMs are frequently updated, and each update can significantly alter outputs. Ensuring these changes don't disrupt existing systems is a challenge.
Auditing & Accountability: Auditing LLMs demands sophisticated tools and methodologies, given their complexity and lack of transparency.
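
One way to approach the change management challenge above is a regression check: run a fixed set of prompts against the old and new model versions and flag the prompts whose answers changed noticeably. The sketch below assumes the outputs have already been collected into dictionaries. The function name, the sample prompts, and the 0.8 threshold are illustrative assumptions, and character-level similarity is only a rough stand-in for a real evaluation of meaning.

```python
from difflib import SequenceMatcher


def changed_prompts(old_outputs: dict, new_outputs: dict, min_similarity: float = 0.8) -> list:
    """Return prompts whose output changed noticeably between two model versions."""
    flagged = []
    for prompt, old_text in old_outputs.items():
        # A prompt missing from the new run is treated as an empty answer.
        new_text = new_outputs.get(prompt, "")
        similarity = SequenceMatcher(None, old_text, new_text).ratio()
        if similarity < min_similarity:
            flagged.append(prompt)
    return flagged


# Hypothetical outputs captured before and after a model update.
old = {
    "refund policy?": "Refunds are available within 30 days of purchase.",
    "greeting": "Hello, how can I help you today?",
}
new = {
    "refund policy?": "We do not offer refunds.",
    "greeting": "Hello, how can I help you today?",
}
print(changed_prompts(old, new))
```

Flagged prompts would then go to a human reviewer, since a text-similarity score cannot tell whether a change is an improvement or a regression.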

Evolving Governance for LLMs: The Way Forward

Thankfully, LLMs' very versatility also enables beneficial applications narrow AI could never support. The key is evolving governance to encourage innovation while managing risks.

Rather than relying only on system-centric technical governance, organizations need policies that guide appropriate LLM use across teams. For instance, a media company could establish guidelines that permit using LLMs to generate draft articles but prohibit high-risk applications like automatically generating political op-eds.
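
A use policy like the media-company example can also be written down in a form that software can check. The sketch below is one hypothetical encoding: the use-case names, the `Decision` categories, and the choice to send unlisted use cases to human review are all assumptions made for illustration.

```python
from enum import Enum


class Decision(Enum):
    ALLOWED = "allowed"
    NEEDS_REVIEW = "needs_review"
    PROHIBITED = "prohibited"


# Hypothetical policy table based on the media-company example.
USE_POLICY = {
    "draft_article": Decision.ALLOWED,
    "political_op_ed": Decision.PROHIBITED,
}


def check_use(use_case: str) -> Decision:
    """Look up a use case; unlisted ones default to human review (an assumption)."""
    return USE_POLICY.get(use_case, Decision.NEEDS_REVIEW)


for case in ["draft_article", "political_op_ed", "customer_email"]:
    print(case, "->", check_use(case).value)
```

Defaulting unknown use cases to review rather than approval keeps new applications from slipping through before anyone has assessed their risk.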

Additionally, LLMs' broad use cases mean governance efforts should coordinate across industries, government, and civil society. Since full explainability of LLMs is technically elusive, governance should rely more on improved testing methods to evaluate LLMs for issues like biases. Focusing the highest scrutiny on high-risk use cases can maximize oversight where it's needed most.

Evolving components of AI governance for LLMs

Focus on Workflows: Instead of diving deep into LLM architecture, governance should focus on how LLMs are used, their inputs, outputs, and overarching risks.
Dynamic Risk Management: Recognize the unique risks posed by LLMs and develop dynamic frameworks to address them.
Bespoke Integration Solutions: Develop custom solutions to integrate LLMs with existing systems, ensuring seamless operation.
Advanced Operations Tools: Invest in state-of-the-art tools to guardrail and manage LLM interactions in real time.
Proactive Accuracy Monitoring: Implement advanced instrumentation that can detect subtle drifts in LLM outputs.
Agile Change Management: Adopt agile methodologies to manage the frequent updates and refinements LLMs undergo.
Transparent Auditing: Develop transparent auditing methodologies that can trace LLM outputs, ensuring accountability.
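
As a rough illustration of the transparent auditing item, each LLM call can be logged with enough metadata to trace an output back to the model version and use case that produced it. The field names and the model version string below are hypothetical. Storing hashes instead of raw text is one possible design choice for sensitive prompts, not a requirement.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(prompt: str, output: str, model_version: str, use_case: str) -> dict:
    """Build a traceable log entry for one LLM interaction (illustrative fields)."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "use_case": use_case,
        # Hashes let auditors match a record to stored text without exposing it here.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "output_sha256": hashlib.sha256(output.encode("utf-8")).hexdigest(),
    }


record = audit_record(
    prompt="Summarize the quarterly results",
    output="Revenue rose compared with the prior quarter.",
    model_version="example-model-v1",  # hypothetical identifier
    use_case="draft_article",
)
print(json.dumps(record, indent=2))
```

Records like this give auditors a starting point for answering who used which model for what, even when the model itself cannot be fully explained.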

In conclusion, while the rise of LLMs has disrupted traditional governance models, it also presents an opportunity. By understanding the nuances of LLM-centric governance, organizations can harness the power of LLMs responsibly and effectively.

If you are working on adopting governance or risk frameworks for using LLMs at your organization, I would love to chat. Please reach out at diego@guardrailsai.com.

About the author:

Diego Oppenheimer is a serial entrepreneur, product developer, and investor with an extensive background in all things data. Currently, he is a Managing Partner at Factory, a venture fund specializing in AI investments, as well as a co-founder of Guardrails AI. Previously, he was an executive vice president at DataRobot and Founder and CEO at Algorithmia (acquired by DataRobot), and he shipped some of Microsoft's most used data analysis products, including Excel, PowerBI, and SQL Server.
Diego is active in AI/ML communities as a founding member and strategic advisor for the AI Infrastructure Alliance and MLops.Community, and he works with leaders to define AI industry standards and best practices. Diego holds a Bachelor's degree in Information Systems and a Master's degree in Business Intelligence and Data Analytics from Carnegie Mellon University.
