On AI and Trust. Q&A with Karl Mehta
Interview by Ramesh Chitor
Q1. What inspired you to create TrustModel.ai, and what problem were you aiming to solve in the AI space?
The inspiration came from witnessing a dangerous disconnect between AI’s transformative potential and the infrastructure needed to deploy it responsibly. I’ve spent the past 25 years building AI models, starting in 1999, across three of my companies: MobileAria (now WirelessMatrix), PlaySpan (now Visa), and EdCast (now Cornerstone). In 2022, I wrote a book, “AI for DPI – Digital Public Infrastructure,” and I kept seeing the same pattern: powerful technology deployed without the foundational trust infrastructure it requires.
I had a realization: we’re making the same mistake with AI that we made with the early internet. In the 1990s, companies rushed to deploy web applications without security infrastructure, and we spent the next two decades dealing with breaches, vulnerabilities, and building cybersecurity as an afterthought. With AI, the stakes are even higher—these systems make autonomous decisions affecting millions of lives.
The specific problem I set out to solve was this: How do you quantify trust? Companies were asking “Is this AI model safe?” but had no objective way to answer. It was all subjective assessment, vendor claims, and compliance theater. We needed the equivalent of Moody’s for credit ratings or Verisign for digital certificates—an independent third party that could evaluate AI systems rigorously and say definitively: “This model meets these standards.”
I founded TrustModel.ai to create that infrastructure layer. Not as a nice-to-have, but as essential infrastructure—the cybersecurity equivalent for the AI era. Every company deploying AI needs this, just like every company needs firewalls and encryption.
Q2. How does TrustModel.ai ensure transparency and accountability in AI model deployment?
We’ve built transparency and accountability into our core architecture through three fundamental mechanisms:
First, independent third-party evaluation. We’re not the model vendor, we’re not the consulting firm getting paid for a favorable report—we’re the independent evaluator. Think of us like UL (Underwriters Laboratories) for electrical safety or the FDA for pharmaceuticals. Our business model depends on maintaining rigorous, objective standards. When we certify a model, organizations can trust that assessment because we have no incentive to inflate or minimize findings.
Second, quantifiable, reproducible metrics. We don’t issue vague statements like “this model is generally safe.” We provide specific, measurable data: P95 latency under load, hallucination detection rates across 10,000 test cases, reasoning consistency scores, bias metrics across demographic groups, compliance ratings for specific regulatory frameworks. Every metric is reproducible—we provide Docker-containerized evaluation scripts so organizations can verify our findings in their own infrastructure.
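To make that concrete, here is a rough sketch of what verifying a published report against a local re-run could look like; the metric names, values, and tolerances are purely illustrative, not our production report schema:

```python
import math

# Published evaluation results (hypothetical values, not an actual report).
published = {
    "p95_latency_ms": 412.0,
    "hallucination_rate": 0.0008,
    "reasoning_consistency": 0.97,
}

# Metrics re-computed by running the containerized evaluation scripts
# inside your own infrastructure.
reproduced = {
    "p95_latency_ms": 418.5,
    "hallucination_rate": 0.0008,
    "reasoning_consistency": 0.968,
}

# Tolerances allow for normal run-to-run variation (illustrative values).
tolerances = {
    "p95_latency_ms": 25.0,
    "hallucination_rate": 0.0002,
    "reasoning_consistency": 0.01,
}

for name, expected in published.items():
    actual = reproduced[name]
    ok = math.isclose(actual, expected, abs_tol=tolerances[name])
    print(f"{name}: published={expected} reproduced={actual} "
          f"{'VERIFIED' if ok else 'MISMATCH'}")
```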
Third, continuous monitoring, not point-in-time assessment. AI models drift, degrade, and develop unexpected behaviors over time. We track models across their entire lifecycle, flagging when performance changes, when safety characteristics shift, when compliance status is affected by regulatory updates. This creates an audit trail—a transparent record of how the model has behaved over time.
The accountability comes from making everything visible to stakeholders who need it. Technical teams get detailed performance metrics. Compliance teams get regulatory validation. Executives get dashboard views of trust scores and risk indicators. Board members get summary reports they can present to investors and regulators. Everyone sees the same ground truth, which eliminates finger-pointing and creates shared accountability.
Q3. Can you walk us through how your platform helps organizations maintain compliance with evolving AI regulations?
This is actually one of our core differentiators, because we recognized early that AI regulation wouldn’t be a single framework—it would be a constantly evolving patchwork across 20+ jurisdictions. The EU AI Act, NIST AI Risk Management Framework, state-level regulations in California and New York, emerging frameworks across APAC—each with different requirements, different timelines, different penalties.
Here’s how we help organizations navigate this complexity:
Automated multi-jurisdictional assessment. When a company deploys an AI system, we simultaneously evaluate it against every relevant regulatory framework for the markets they operate in. We’ve codified the requirements from 20+ jurisdictions into our evaluation engine. So instead of manual interpretation and documentation for each region, we provide instant compliance scoring: “This model meets EU AI Act requirements for high-risk systems, complies with NIST AI RMF Tier 3, satisfies California SB 1047 provisions…”
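As a toy illustration of what codifying jurisdictional requirements as data looks like, consider the sketch below; the rules, thresholds, and field names are simplified placeholders, not our actual evaluation engine or a legal interpretation of any framework:

```python
from typing import Callable, Dict

# Illustrative profile produced by an evaluation run (hypothetical fields).
profile = {
    "risk_class": "high",
    "has_decision_explanations": True,
    "bias_audit_completed": True,
    "max_demographic_bias": 0.015,
}

# Each jurisdiction maps to a set of named checks over the profile.
# These rules are simplified placeholders, not real legal requirements.
RULES: Dict[str, Dict[str, Callable[[dict], bool]]] = {
    "EU_AI_Act": {
        "explanations_for_high_risk": lambda p: p["risk_class"] != "high"
        or p["has_decision_explanations"],
        "bias_audit": lambda p: p["bias_audit_completed"],
    },
    "US_California": {
        "bias_below_threshold": lambda p: p["max_demographic_bias"] <= 0.02,
    },
}

for jurisdiction, checks in RULES.items():
    results = {name: check(profile) for name, check in checks.items()}
    status = "COMPLIANT" if all(results.values()) else "GAPS FOUND"
    print(jurisdiction, status, results)
```

Evaluating one profile against many rule sets at once is what turns per-region manual interpretation into an instant, repeatable scoring step.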
Regulatory change monitoring. AI regulations are evolving monthly. We track every proposed regulation, every amendment, every enforcement guidance globally. When the EU updates AI Act technical standards or when a US state proposes new requirements, we update our evaluation criteria and immediately re-assess affected models. Organizations get alerts: “Your customer service AI needs attention—new EU guidelines on emotional recognition just took effect.”
Compliance documentation automation. Regulators don’t just want claims of compliance—they want evidence. We generate the documentation automatically: detailed evaluation reports, test results, risk assessments, mitigation strategies. When a regulator asks “How do you know this model is compliant?”—the answer is ready, comprehensive, and defensible.
Pre-deployment compliance validation. Before launching in a new market, companies can submit their model for evaluation against that jurisdiction’s requirements. We identify gaps, suggest modifications, and re-certify once changes are made. This eliminates the “deploy first, deal with compliance later” approach that creates massive legal risk.
The key insight is that compliance isn’t a one-time checkbox—it’s continuous infrastructure. We make it automatic, comprehensive, and defensible.
Q4. What sets your trust and risk management framework apart from others in the market?
Three fundamental differentiators that no one else combines:
One: We’re infrastructure, not consulting. Most “AI governance” solutions are either consulting engagements—where firms do bespoke assessments that cost hundreds of thousands of dollars and take months—or they’re internal tools that require significant implementation effort. We’re neither. We’re scalable, automated infrastructure. Submit your model, get a comprehensive evaluation in hours or days, not months. It’s the difference between hiring accountants to manually audit your books versus using automated financial software.
Two: We evaluate ALL models—foundation and custom—with the same rigor. Other platforms might benchmark GPT vs. Claude on a few tasks, or they’ll do custom model evaluation if you pay for a consulting engagement. We do both, with the same comprehensive methodology. Whether you’re choosing between foundation models or you’ve built a proprietary medical diagnosis system, you get the same 12+ domain evaluation, the same compliance validation, the same continuous monitoring. This matters because enterprises use hybrid approaches—foundation models for some tasks, custom models for others—and they need unified visibility.
Three: We’re global from day one. Most solutions focus on one regulatory framework—maybe EU AI Act compliance, maybe NIST alignment. We cover 20+ jurisdictions simultaneously because that’s the reality for any enterprise operating globally. You can’t deploy AI in Europe only, or North America only—you need worldwide reach. We’re the only platform providing truly global compliance validation.
But here’s what really sets us apart: we’re building a network effect. Every model we evaluate strengthens our benchmarks. Every regulatory update we track improves our compliance engine. Every customer that submits a custom model contributes to our understanding of AI behavior patterns. We’re not just a tool—we’re becoming the definitive dataset on AI trust and safety globally.
Think of us as the Bloomberg Terminal for AI trust—comprehensive, authoritative, essential infrastructure that everyone in the ecosystem relies on.
Q5. How do you see the relationship between human oversight and AI governance evolving over the next few years?
This is fascinating because I think we’re heading toward a fundamental inversion in how we think about human oversight.
Right now, most organizations approach AI governance as “humans checking AI decisions.” A human reviews the AI’s loan approval, a human validates the AI’s medical diagnosis recommendation, a human approves the AI’s hiring decision. This doesn’t scale—you end up with human bottlenecks everywhere, and ironically, the humans often don’t have the expertise to meaningfully evaluate what the AI did.
I see us moving toward “AI systems operating within quantified trust boundaries, with humans governing those boundaries.” Here’s the difference:
Instead of reviewing individual AI decisions, humans will set the parameters: “This AI system must maintain a hallucination rate below 0.1%, must demonstrate reasoning consistency above 95%, must show no demographic bias above 2%, must comply with these specific regulatory requirements.” The AI operates autonomously within those boundaries, and automated infrastructure—platforms like TrustModel.ai—continuously monitors whether the boundaries are maintained.
When boundaries are approached or breached, that’s when humans intervene—not to review individual decisions, but to understand why the boundary was breached and whether the boundary itself needs adjustment.
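A minimal sketch of that boundary-governance pattern might look like the following; the boundary names, limits, and warning margins are illustrative, not a real TrustModel.ai interface:

```python
from dataclasses import dataclass

@dataclass
class Boundary:
    name: str
    limit: float           # hard limit set by the human governors
    warn_margin: float     # fraction of the limit at which humans are alerted early
    higher_is_worse: bool  # True for error-style metrics, False for score-style metrics

    def status(self, observed: float) -> str:
        if self.higher_is_worse:
            if observed > self.limit:
                return "BREACHED"
            if observed > self.limit * (1 - self.warn_margin):
                return "APPROACHING"
        else:
            if observed < self.limit:
                return "BREACHED"
            if observed < self.limit * (1 + self.warn_margin):
                return "APPROACHING"
        return "OK"

# Boundaries a governance team might set, paired with observed values
# from continuous monitoring (illustrative numbers matching the text above).
checks = [
    (Boundary("hallucination_rate", 0.001, 0.2, True), 0.0009),
    (Boundary("reasoning_consistency", 0.95, 0.02, False), 0.97),
    (Boundary("demographic_bias", 0.02, 0.2, True), 0.024),
]

for boundary, observed in checks:
    print(boundary.name, observed, boundary.status(observed))
```

Humans intervene only on APPROACHING or BREACHED, which is exactly the shift from reviewing every decision to governing the envelope the system operates in.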
This aligns with the Vedantic perspective that deeply influences my thinking. In Vedantic philosophy, there’s a concept of dharma—operating within one’s proper sphere of responsibility. AI systems have their dharma—operating within well-defined, measurable trust boundaries. Humans have their dharma—setting those boundaries thoughtfully, monitoring the systems, and adapting as we learn.
The evolution I’m most excited about is moving from “human-in-the-loop” to “human-on-the-loop”—where humans aren’t bottlenecks in every decision, but governors of the system maintaining strategic oversight.
Q6. From your experience, what are some common misconceptions companies have about AI trust and risk management?
I encounter three major misconceptions constantly:
Misconception One: “AI trust is a technical problem that technical teams can solve.”
The reality is that AI trust is a socio-technical problem that requires technical rigor, regulatory expertise, ethical frameworks, and stakeholder management all at once. I see brilliant ML engineers who can optimize model performance to incredible levels but have no framework for evaluating fairness. I see lawyers who understand regulatory requirements but can’t interpret what a 0.3% hallucination rate means in practice.
The truth is you need interdisciplinary infrastructure. That’s why we built TrustModel.ai to speak multiple languages—providing technical metrics for engineers, compliance validation for legal teams, risk scores for executives, and understandable summaries for boards. AI trust isn’t something any single department owns—it’s organizational infrastructure.
Misconception Two: “We’ll address AI trust and safety after we prove the technology works.”
This is the “we’ll add security later” thinking that plagued early internet companies. What companies don’t realize is that trust and safety constraints often change the technical approach. If you build an AI system without considering explainability requirements, you might use a black-box approach that performs brilliantly but can never be certified for regulated industries. Then you have to rebuild from scratch.
The companies getting this right are doing trust and safety from day one—using our platform during development, not just before deployment. They’re making architecture decisions informed by compliance requirements, designing with transparency in mind, building monitoring into the system from the start.
Misconception Three: “Point-in-time audits or certifications are sufficient.”
I call this the “annual physical” approach to AI governance. You get checked once a year, get a clean bill of health, then go about your business. But AI models aren’t static—they drift, they degrade, they develop unexpected behaviors as they interact with real-world data. A model certified as safe in January might exhibit concerning behaviors by March.
What companies need is continuous monitoring—the equivalent of continuous glucose monitoring for diabetics rather than annual blood tests. That’s why our platform operates 24/7, tracking model behavior, flagging anomalies, updating compliance status as regulations change. AI trust isn’t an annual event; it’s daily infrastructure.
These misconceptions are expensive. They lead companies to under-invest in trust infrastructure, then face regulatory fines, reputational damage, or catastrophic failures. My goal is to shift the conversation: AI trust isn’t overhead—it’s essential infrastructure that enables faster, safer deployment.
Q7. Could you share an example of how one of your customers has successfully used TrustModel.ai to mitigate model risks or improve outcomes?
I’ll share a composite example that represents a pattern we see frequently, while respecting confidentiality:
A large financial services company was deploying a custom AI model for credit decisioning—determining loan approvals, credit limits, interest rates. They had already built the model, it performed well in internal testing, and they were ready to launch across multiple markets including the EU and several US states.
Before launch, they submitted the model to TrustModel.ai for evaluation. Our platform immediately flagged three critical issues:
First, demographic bias. While their internal testing showed “acceptable” performance, our granular analysis revealed that the model was systematically offering worse terms to applicants in certain zip codes that correlated strongly with protected demographic characteristics. This wasn’t intentional—it was emergent from historical data patterns—but it would have violated fair lending regulations and exposed them to massive regulatory penalties.
Second, explainability gaps. The EU AI Act requires that high-risk AI systems provide meaningful explanations for their decisions. Their model could generate predictions but couldn’t provide the detailed reasoning chains required for compliance. This would have blocked their entire EU rollout.
Third, performance degradation patterns. Our continuous monitoring during their pilot phase detected that the model’s accuracy was declining for certain customer segments as it encountered real-world data that differed subtly from training data. Left unchecked, this would have led to increasingly poor decisions over time.
Here’s what happened next:
They used our detailed evaluation reports to guide model refinement—adjusting training data, adding explainability layers, implementing bias mitigation techniques. After three iterations, the model passed our comprehensive evaluation. They deployed with confidence, knowing it met all regulatory requirements.
But the real value came post-deployment. Six months later, our continuous monitoring detected early signs of model drift—performance metrics declining slightly but consistently. We alerted them before it became a problem, they investigated and discovered that economic conditions had shifted in ways that required model retraining. They updated the model, re-certified through our platform, and deployed the new version—all before customers or regulators noticed any issues.
The outcome: they successfully deployed in all target markets, maintained regulatory compliance, avoided what could have been millions in fines and reputational damage, and most importantly, they’re making fairer, more reliable credit decisions that benefit their customers.
This is the pattern we see again and again—organizations that integrate trust and safety infrastructure from the beginning move faster, deploy with confidence, and avoid the catastrophic failures that plague companies that treat governance as an afterthought.
Q8. How does TrustModel.ai integrate with existing machine learning operations platforms and workflows?
We designed TrustModel.ai specifically to fit into existing MLOps workflows rather than requiring companies to rip and replace their infrastructure. Think of us as the trust and safety layer that sits alongside your existing ML pipeline—complementing tools like MLflow, Kubeflow, SageMaker, or Azure ML rather than competing with them.
The integration works at multiple levels:
API-first architecture. We provide comprehensive REST APIs that let organizations submit models for evaluation, retrieve results, query compliance status, and access continuous monitoring data programmatically. This means you can integrate TrustModel.ai directly into your CI/CD pipeline—every model version gets automatically evaluated before deployment, just like it gets automatically tested and security-scanned.
Pre-deployment gates. Many of our customers use us as a deployment gate. Before a model moves from staging to production, it must pass TrustModel.ai evaluation. If it doesn’t meet trust thresholds or compliance requirements, the deployment is blocked automatically. This prevents “oops, we deployed an unsafe model” scenarios.
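As a rough illustration of how the API integration and deployment gate fit into a CI/CD pipeline, a gate script might look like the sketch below; the endpoint, payload, and response fields are hypothetical placeholders rather than our published API:

```python
import os
import sys
import time

import requests

API = "https://api.example-trust-platform.com/v1"  # placeholder base URL
HEADERS = {"Authorization": f"Bearer {os.environ.get('TRUST_API_TOKEN', '')}"}

def gate(model_id: str, version: str, min_score: float = 0.9) -> None:
    """Submit a model version for evaluation and block deployment on failure."""
    # Kick off an evaluation run (hypothetical endpoint and payload).
    run = requests.post(
        f"{API}/evaluations",
        json={"model_id": model_id, "version": version},
        headers=HEADERS,
        timeout=30,
    ).json()

    # Poll until the evaluation completes.
    while True:
        result = requests.get(
            f"{API}/evaluations/{run['id']}", headers=HEADERS, timeout=30
        ).json()
        if result["status"] in ("completed", "failed"):
            break
        time.sleep(30)

    score = result.get("trust_score", 0.0)
    compliant = result.get("compliant", False)
    print(f"trust_score={score} compliant={compliant}")

    # A non-zero exit code blocks the pipeline stage that calls this script.
    if result["status"] != "completed" or score < min_score or not compliant:
        sys.exit(1)

if __name__ == "__main__":
    gate(model_id="credit-risk-model", version=os.environ.get("GIT_SHA", "dev"))
```

In a typical setup, a script like this runs as a stage between staging and production promotion, so a failing evaluation stops the rollout automatically.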
Monitoring integration. We provide webhooks and streaming APIs that push real-time alerts into your existing monitoring infrastructure—Datadog, Splunk, PagerDuty, whatever you’re using. When we detect model drift, compliance violations, or safety anomalies, your team gets notified through their existing channels. No need to check another dashboard.
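For teams that want to see the shape of such an integration, here is a minimal, illustrative webhook receiver; the alert payload fields are hypothetical, and the forwarding functions are stubs you would replace with your own paging or logging integration:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class AlertWebhookHandler(BaseHTTPRequestHandler):
    """Receives hypothetical drift/compliance alerts and routes them to the
    notification channels the team already uses."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        alert = json.loads(self.rfile.read(length) or b"{}")

        # Route by severity; the payload fields are illustrative only.
        severity = alert.get("severity", "info")
        message = f"[{severity.upper()}] {alert.get('model_id')}: {alert.get('summary')}"
        if severity in ("high", "critical"):
            page_on_call(message)
        else:
            log_to_monitoring(message)

        self.send_response(204)
        self.end_headers()

def page_on_call(message: str) -> None:
    print("PAGE:", message)  # stub: replace with your incident-management integration

def log_to_monitoring(message: str) -> None:
    print("LOG:", message)   # stub: replace with your metrics/logging pipeline

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), AlertWebhookHandler).serve_forever()
```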
Custom model submission. For proprietary models, we provide Docker-based evaluation environments where you maintain full control of your model. You don’t send us your model weights or training data—you run our evaluation framework in your infrastructure and share only the results. This addresses IP and data privacy concerns that are non-negotiable for many enterprises.
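A simplified sketch of that workflow is below; the container image name, mount layout, and report format are placeholders, not our actual evaluation harness:

```python
import json
import subprocess
from pathlib import Path

# Hypothetical image name and mount layout; the real evaluation container,
# paths, and result format may differ.
EVAL_IMAGE = "example-trust-platform/evaluator:latest"
WORKDIR = Path("/opt/eval-run")

# Run the evaluation container entirely inside your own infrastructure.
# The model weights stay on your hosts; only /results is written by the run.
subprocess.run(
    [
        "docker", "run", "--rm",
        "--network", "none",                 # no outbound network access for the evaluator
        "-v", f"{WORKDIR}/model:/model:ro",  # model mounted read-only
        "-v", f"{WORKDIR}/results:/results",
        EVAL_IMAGE,
    ],
    check=True,
)

# Share only the aggregated metrics, never the weights or training data.
results = json.loads((WORKDIR / "results" / "report.json").read_text())
print(json.dumps({"metrics": results.get("metrics", {})}, indent=2))
```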
MLOps platform partnerships. We’re actively integrating with major MLOps platforms as native partners. The goal is that if you’re using Azure ML or Databricks, TrustModel.ai evaluation becomes a built-in option—one click to submit for comprehensive trust and safety assessment.
The philosophy here is simple: we don’t want to be another tool that creates workflow friction. We want to be invisible infrastructure—trust and safety that happens automatically as part of how you already work. The best integration is one developers don’t have to think about; it just works.
Q9. Looking ahead, what advancements or features are you most excited to bring to TrustModel.ai?
Three major areas I’m incredibly excited about:
First: Real-time safety guardrails during inference. Right now, we evaluate models before deployment and monitor them continuously post-deployment. But imagine if we could provide real-time safety validation during inference—analyzing each AI-generated response before it reaches the user, flagging potential issues, even preventing harmful outputs in milliseconds. This would transform trust from “monitor and alert” to “prevent and protect.” We’re building this capability now, and it has profound implications for high-stakes deployments.
Second: Collaborative safety intelligence. Here’s a vision that draws from my work with consciousness and collective intelligence: What if every organization using TrustModel.ai contributed anonymized learnings about AI safety patterns to a shared knowledge base? When one company discovers a novel failure mode or mitigation technique, everyone benefits. We’re building the infrastructure for this—think of it as “federated learning for AI safety.” No company shares proprietary models or data, but collectively we build a comprehensive understanding of AI risks and mitigations that no single organization could develop alone.
Third: Predictive risk assessment. Currently, we detect issues when they occur. But we’re developing predictive capabilities—using our massive dataset of model behaviors to forecast: “Based on patterns we’ve observed, this model has a 73% probability of exhibiting bias issues in this demographic segment within the next 30 days.” This shifts organizations from reactive to proactive risk management. You fix problems before they manifest.
Beyond features, I’m excited about something larger: establishing TrustModel.ai as the definitive standard for AI trust globally. I want a world where “TrustModel.ai certified” carries the same weight as “FDA approved” or “UL certified”—where regulators reference our standards, where enterprise procurement requires our certification, where consumers look for our seal of approval.
This isn’t about building a successful company—though I certainly aim to do that. It’s about creating infrastructure that enables the responsible AI future we all want. I’ve spent years studying how consciousness emerges, how wisdom traditions understand truth and trust, how technology transforms society. TrustModel.ai is my attempt to ensure that as AI becomes ubiquitous, trust and safety are not afterthoughts but foundational principles.
We’re building the immune system for the AI ecosystem—protecting it from internal threats while enabling it to reach its full transformative potential. That’s what gets me out of bed every morning.
Closing Thought
The question I ask myself constantly is: “Ten years from now, what will we wish we had built today for AI governance?” TrustModel.ai is my answer to that question. We’re not waiting for catastrophic failures to force us to build trust infrastructure—we’re building it now, proactively, with the rigor and comprehensiveness this technology revolution demands.
The future of AI isn’t just about what the technology can do—it’s about whether we can trust it to do it responsibly. That’s the future we’re building at TrustModel.ai.
………………………………………………………….

Karl Mehta is a serial entrepreneur, author, investor, engineer, and civil servant with over 20 years of experience in founding, building, and funding technology companies in the U.S. and international markets. He is currently Founder & CEO of EdCast Inc., an AI-powered knowledge-cloud platform company backed by Stanford University and SoftBank Capital. He is a former venture partner at Menlo Ventures, a leading Silicon Valley VC firm with over $4B under management.
Previously, he was the Founder & CEO of PlaySpan Inc., acquired by Visa Inc. (NYSE: V), the world’s largest payment network. Karl also served as a White House Presidential Innovation Fellow, selected by the Obama Administration during the inaugural 2012–13 term. In 2014 he was appointed by Governor Brown to the Workforce Investment Board of the State of California. In 2010, Karl won the “Entrepreneur of the Year” award from Ernst & Young for Northern California. Karl is on the board of Simpa Networks and on the advisory boards of Intel Capital and Chapman University’s Center of Entrepreneurship.
Karl is the founder of several non-profits, including Code For India (http://CodeforINDIA.org) and Grassroots Innovation (http://grassrootsinnovation.org). He is the author of “Financial Inclusion at the Bottom of the Pyramid” (http://www.openfininc.org).
………………………………………………………..

Ramesh Chitor
Ramesh Chitor is a seasoned business leader with over 20 years of experience in the high-tech industry, currently working for Mondee. Ramesh brings a wealth of expertise in strategic alliances, business development, and go-to-market strategies. His background includes senior roles at prominent companies such as IBM, Cisco, Western Digital, and Rubrik, where he served as Senior Director of Strategic Alliances. Ramesh also contributes as a Business Fellow for Perplexity.
Ramesh is a value and data-driven leader known for his ability to drive successful business outcomes by fostering strong relationships with clients, partners, and the broader ecosystem. His expertise in navigating complex partnerships and his forward-thinking approach to innovation will be invaluable assets to Perplexity’s growth and strategic direction.
Connect on LinkedIn
Sponsored by Chitor.