Agent Name Service (ANS)

DNS for AI Agents: Building Trust, Governance, and Security for AI-Agent Workflows

Akshay Mittal

PhD Scholar, University of the Cumberlands | IEEE Senior Member

GSDC AI in Action 2026 — Global Webinar

January 2026

Good morning, everyone. Thank you for joining this webinar. I'm Akshay Mittal, a PhD researcher at University of the Cumberlands and an IEEE Senior Member. Over the next 30 minutes, I'm going to take you on a journey from the fundamental trust problem we face with AI agents today, to a production-ready solution that's already changing how organizations deploy autonomous AI systems. What I want to show you is something I've been building for the past year—what I call "DNS for AI Agents." But this isn't just a naming system. It's a complete trust layer that gives autonomous AI the security foundation it desperately needs. By the end of this session, you'll understand not just what ANS is, but how it works, why it matters, and how you can implement it in your own environments. Let's dive in.

The Agentic AI Revolution

From supervised ML to autonomous agent orchestration

📊 Traditional ML Pipeline

Human-supervised at every step
Data → Train → Deploy → Monitor
Manual intervention required
Days to weeks for updates

🤖 Agentic AI Reality

Autonomous agent orchestration
Concept-drift detector → Auto-retrainer
Deployer → Monitor → Remediate
Minutes to hours for updates

Critical Question: Who are these agents? Can we trust them?

Let me start by setting the context. Raise your hand—virtually, of course—if you've deployed a machine learning model in the last year. Now, keep that hand up if that model is still performing exactly as it did on day one. The reality is, models drift. Data changes. Performance degrades. And in the agentic AI world, we're not just managing models—we're managing autonomous agents that make decisions, take actions, and interact with each other. Traditional ML was human-supervised at every step. You had data validation, then manual review. Model training, then human approval. Deployment, then manual verification. Monitoring, then reactive alerts. Agentic AI changes this completely. Now we have concept-drift detectors that automatically trigger retraining agents. Deployment agents that update production systems. Monitoring agents that remediate issues without human intervention. But here's the critical question that changed everything for me: Who are these agents? Can we trust them? This isn't just an academic question—it's a production reality that I learned the hard way. This is where our research begins, and where ANS comes in.

The Incident That Changed Everything

A real production failure that revealed the trust gap

💥 The Scenario

50-agent ML operations system
Multi-tenant environment
Each agent with hardcoded endpoints
No identity verification between agents

⚡ What Happened

One Tuesday morning, a single agent was compromised through a configuration error.

Within 6 minutes: System-wide collapse

🔍 Root Cause

Compromised agent impersonated deployment service
Downstream agents deployed corrupted models
Monitoring agent couldn't distinguish legitimate from malicious traffic
No way to verify agent identity

Let me share a story that crystallized this problem for me. We were running a production ML operations system with 50 agents. Each agent had its own responsibility—concept drift detection, automated model retraining, deployment, monitoring. Each agent had its own credentials and its own hardcoded endpoints for communicating with other agents. One Tuesday morning, a single agent was compromised through a configuration error. I'm not talking about a sophisticated attack—just a misconfigured credential that gave an attacker access. Within six minutes, the entire system collapsed. Why? Because agents had no way to verify each other's identity. The compromised agent impersonated our model deployment service. Downstream agents, thinking they were talking to the legitimate deployment service, deployed corrupted models. Our monitoring agent, unable to distinguish legitimate from malicious traffic, dutifully reported everything as normal. This wasn't just a technical failure—it was a trust failure. We had built an autonomous system without the fundamental mechanisms for agents to discover, authenticate, and verify each other. It was like building a global network without DNS, where every connection relies on hardcoded IP addresses and blind trust. That incident revealed four critical gaps that we need to address. Let me show you what those are.

The Trust Problem in Agent Ecosystems

Four critical gaps that make agent systems vulnerable

❌ Gap 1: No Uniform Discovery

Manual configuration and hardcoded endpoints. No standard way to discover agents by capability.

🔐 Gap 2: Missing Cryptographic Auth

Authentication between agents is virtually nonexistent. Systems rely on API keys or basic auth.

🛡️ Gap 3: No Capability Verification

Agents can't prove capabilities without exposing sensitive implementation details or credentials.

📋 Gap 4: No Governance Framework

Governance frameworks are nonexistent or impossible to enforce consistently across agent interactions.

⚠️ Impact: In our research, 1 compromised agent in a 50-agent system led to cascading failures within minutes.

That incident revealed four critical gaps in how we deploy AI agents today. Let me break these down for you. First, there's no uniform discovery mechanism. Agents rely on manual configuration and hardcoded endpoints. There's no standard way to say "I need an agent that can retrain models" and discover it automatically. You have to know the exact IP address or service name. Second, cryptographic authentication between agents is virtually nonexistent. Most systems rely on API keys stored in environment variables or basic authentication. These are easily compromised, hard to rotate, and provide no cryptographic proof of identity. Third, agents can't prove their capabilities without exposing sensitive implementation details. If an agent needs to prove it can access a production database, it typically has to share credentials or API keys. This is a security nightmare. Fourth, governance frameworks are either nonexistent or impossible to enforce consistently. You might have policies written down, but there's no automated way to ensure agents follow them. Configuration drift is inevitable. In our research environment, we simulated scenarios where one compromised agent in a 50-agent system led to cascading failures within minutes. The lack of proper authentication meant malicious agents could impersonate legitimate ones. This isn't theoretical—this is happening in production systems today. But here's the thing: we've solved similar problems before. Let me show you how.

We've Solved This Before: The DNS Analogy

🌐 DNS (1987)

google.com
↓
142.250.191.14

Just location mapping

🤖 ANS (2025)

agent.capability.provider.v1.prod
↓
Identity + Capability + Trust

Complete trust layer

Key Innovation: ANS adds cryptographic verification, capability attestation, and governance support—everything DNS doesn't provide.

Everyone here knows DNS. You type google.com, you get an IP address. It's been working since 1987. Simple. Elegant. Revolutionary. But DNS only gives you location. It doesn't tell you if that server is trustworthy or what it can do. It's just a mapping from a human-readable name to an IP address. ANS is like DNS, but for agents. Instead of just mapping names to IP addresses, ANS maps agent names to their cryptographic identity, their capabilities, and their trust level. Look at the naming convention: "agent.capability.provider.v1.prod" – this is self-describing. You can discover agents by what they do, not just by name. But more importantly, when you resolve that name, you don't just get an IP address. You get a cryptographically verified identity, proven capabilities, and a trust level. This is the foundation for trustworthy AI agent ecosystems. But how does it actually work? Let me show you the architecture.

ANS Protocol Design

Self-describing names that encode identity, capability, and context

📝 Naming Convention

protocol://AgentID.Capability.Provider.v[Version].Extension

💡 Real Examples

a2a://alerter.security-monitoring.research-lab.v2.prod
mcp://validator.concept-drift-detection.ml-platform.v1.hipaa
acp://remediator.helm-deployment-fix.devsecops-team.v3.staging

✅ Benefits

Self-describing capabilities
Version-aware routing
Provider trust verification
Environment-specific deployment
Protocol-agnostic design

Let's break down our naming convention. It's designed to be self-describing and hierarchical. The format is: protocol://AgentID.Capability.Provider.v[Version].Extension Let me walk through a real example. "a2a://alerter.security-monitoring.research-lab.v2.prod" tells you: - This is an Agent-to-Agent protocol agent - It's an alerter for security monitoring - It's from research-lab - It's version 2 - It's in production Another example: "mcp://validator.concept-drift-detection.ml-platform.v1.hipaa" tells you it's an MCP protocol agent, it validates concept drift, it's from the ML platform team, version 1, and it's HIPAA-compliant. The benefits are significant. Self-describing capabilities mean you can search for agents by what they do, not just by name. Version-aware routing means you can have multiple versions running simultaneously. Provider trust verification means you know who built it. Environment-specific deployment means production agents can't accidentally talk to staging agents. And here's the key: we support multiple protocols. This isn't about creating another standard—it's about making existing standards work together. We support Google's A2A, Anthropic's MCP, IBM's ACP, and we can extend to new protocols as they emerge. But naming is just the beginning. The real power comes from the cryptographic trust foundation.

Cryptographic Trust Foundation

Three foundational technologies working together

🔑 DIDs

Decentralized Identifiers
W3C standard for globally unique, verifiable agent identity

📜 VCs

Verifiable Credentials
Capability attestations that prove what agents can do

🏛️ CA + RA

Certificate Authority + Registration Authority
Automated certificate management and lifecycle

🔗 Trust Chain Flow

Root CA → Intermediate CA → Agent Certificate → Capability Proof

Like mTLS for microservices, but capability-aware

Now let's talk about the cryptographic foundation. ANS is built on three technologies that work together to create comprehensive trust. First, Decentralized Identifiers, or DIDs. These are W3C standards originally designed for human identity management, but they work perfectly for agents. Each agent gets a globally unique, verifiable identity. Unlike traditional certificates, DIDs are self-sovereign—the agent controls its own identity. Second, Verifiable Credentials, or VCs. These are capability attestations. An agent doesn't just prove "I am agent X"—it proves "I am agent X and I have the verified capability to retrain models." These credentials are cryptographically signed and can be verified without contacting a central authority. Third, Certificate Authority and Registration Authority infrastructure. We use Sigstore for automated certificate provisioning, 90-day key rotation cycles, and revocation list management. This gives us the operational benefits of traditional PKI with the flexibility of decentralized identity. Let me walk through the trust chain. Root CA establishes the foundation—this is your organization's root of trust. Intermediate CA provides operational flexibility. Agent certificates are issued to specific agents with specific capabilities. And finally, capability proofs allow agents to demonstrate their abilities without revealing secrets. This is like mTLS for microservices, but with a crucial difference. mTLS only proves identity—"I am service A talking to service B." ANS proves identity AND capability—"I am agent A with model retraining capability talking to agent B with data access capability." This capability-aware approach means we can implement fine-grained access control. An agent can prove it has the right to access production data without revealing its internal implementation details. But how do we prove capabilities without revealing secrets? That's where zero-knowledge proofs come in.

Zero-Knowledge Capability Proofs

Prove capabilities without exposing secrets

❌ Traditional Approach

Agent: "I can access database"
Verifier: "Show password"
→ Credentials revealed

✅ ANS Zero-Knowledge

Agent: "I can prove access without revealing credentials"
Verifier: "Prove it cryptographically"
→ Capability verified, secrets protected

💡 Real-World Use Case

Agent proves model retraining capability without exposing API keys, database credentials, or internal implementation details.

The verifier gets cryptographic proof of capability, but never sees the actual secrets.

Zero-knowledge proofs are a cryptographic technique that lets you prove you know something without revealing what you know. It's like proving you have a valid driver's license without showing the actual license number. In the traditional approach, an agent would have to share its API keys or credentials to prove it can access a database. This is a security nightmare—you're essentially giving away your secrets to anyone who needs to verify your capabilities. With ANS, an agent can prove it has database access capability without revealing how it accesses the database. The verifier gets cryptographic proof that the agent has the right permissions, but the actual credentials remain secret. Here's a real-world example. Imagine a model retraining agent needs to prove it can access the training data. Instead of sharing database credentials, it generates a zero-knowledge proof that demonstrates it has the right permissions. The system can verify this proof without ever seeing the actual access credentials. This is crucial for security. Even if an attacker intercepts the proof, they can't extract the underlying credentials. The proof is valid only for that specific capability claim, and it can't be reused or replayed. This capability-aware approach works across multiple protocols and standards. But it needs infrastructure to enforce it. That's where Kubernetes comes in.

Kubernetes-Native Architecture

Four core components working together

📋 ANS Registry

Kubernetes Custom Resource Definitions (CRDs) store agent metadata, identity, and capabilities

🚪 Admission Controller

Validates every agent deployment against security policies before it hits the cluster

🛡️ Service Mesh

Istio/Linkerd mTLS + capability verification in certificate extensions

📊 Policy Engine

Open Policy Agent (OPA) enforces governance rules continuously

🔄 Agent Lifecycle

Register → Validate → Deploy → Authenticate → Monitor

ANS is designed to be Kubernetes-native. This was crucial for enterprise adoption—it works with the tools organizations already use, rather than requiring a complete infrastructure overhaul. Let me walk you through the four core components. First, the ANS Registry uses Kubernetes Custom Resource Definitions to store agent metadata. This means your agents are first-class citizens in Kubernetes, just like pods and services. When you register an agent, it creates a CRD entry with its identity, capabilities, and trust level. This is version-controlled, auditable, and integrates with your existing Kubernetes tooling. Second, we have an Admission Controller that validates every agent deployment against your security policies before it even hits the cluster. This is like a gatekeeper—if an agent doesn't meet your security requirements, it never gets deployed. This prevents configuration drift and ensures policy compliance from day one. Third, we integrate with service meshes like Istio and Linkerd for mutual TLS. But unlike traditional service mesh, we also verify capabilities in the certificate extensions. An agent doesn't just prove "I am agent X"—it proves "I am agent X and I have the verified capability to retrain models." This happens at the network layer, transparently to the application. Fourth, Open Policy Agent enforces your governance rules continuously. Policies are written in Rego, version-controlled, and tested like application code. They can enforce access control, resource limits, network policies, and compliance requirements. Let me walk you through the agent lifecycle. First, an agent registers with the ANS registry—this creates a CRD entry. Then it gets validated against OPA policies. If it passes, it gets deployed with proper certificates. At runtime, it authenticates using its certificates and gets monitored for compliance. Each team can have their own namespace with their own agents, but they can still discover and communicate with agents in other namespaces through ANS—with proper authentication and authorization. This gives you multi-tenancy with security. This all comes together in a GitOps workflow that's declarative and auditable.

GitOps Integration Workflow

From code commit to production deployment in under 30 minutes

🔄 Complete Pipeline

Code Commit → Policy Validation → Certificate Provisioning → Auto-Deploy → Runtime Verification

2-3 days → 30 minutes

🔑 Automated Key Management

Sigstore integration
Automatic certificate provisioning
90-day key rotation cycles
Zero-trust handshake validation
Revocation list management

✅ Security Benefits

No hardcoded secrets
Automated compliance
Audit trail for all operations
Rollback capability
Complete reproducibility

This is where ANS really shines. Every step is declarative, version-controlled, and auditable. You commit agent code, it gets validated against policies, certificates are provisioned automatically, and deployment happens through your existing GitOps pipeline. Let me walk through the pipeline. Code commit triggers the workflow. Policy validation ensures the agent meets security requirements—this happens through OPA Gatekeeper. Certificate provisioning happens automatically through Sigstore—no manual certificate requests, no waiting for security teams. Deployment is handled by ArgoCD or Flux—your existing GitOps tools. Runtime verification ensures the agent is behaving correctly and complying with policies. The impact is dramatic. Traditional agent deployment takes 2 to 3 days—manual configuration, security reviews, certificate provisioning, network setup. With ANS, it's under 30 minutes. And here's the key: every deployment either succeeds completely or rolls back cleanly. No partial deployments, no configuration drift, no manual cleanup. Automated key management is crucial. We integrate with Sigstore for automatic certificate provisioning. Keys rotate every 90 days automatically. Zero-trust handshakes validate every connection. Revocation lists are managed automatically. You never have to manually manage certificates. The security benefits are significant. No hardcoded secrets—everything uses cryptographic proof. Automated compliance—policies are enforced at every step. Complete audit trail—you know exactly what happened, when, and why. Rollback capability—if something goes wrong, you can roll back to the previous version instantly. Complete reproducibility—every deployment is identical because it's declarative. This gives you the same benefits for agents that you get for applications—rollbacks, audit trails, compliance reporting. But with the added security of cryptographic verification and capability attestation. Now let me show you how this works in practice with a real example.

Policy-as-Code Governance

OPA policies that enforce security, compliance, and operational rules

📝 Example OPA Policy

package agent.policy

default allow = false

allow {
    input.agent.certificate.issuer == "research-lab-trusted-ca"
    input.agent.capabilities["data-access"] == true
    input.environment == "production"
    input.agent.security_clearance >= 3
}

📋 Policy Categories

Access Control: RBAC policies
Resource Limits: CPU/memory constraints
Network Policies: Micro-segmentation
Compliance: HIPAA, GDPR, SOC 2

✅ Benefits

Version-controlled policies
Tested like application code
Platform-level compliance
Dynamic policy adaptation

Let me show you how policy-as-code works in practice. This is a real OPA policy we use in production. It's written in Rego, which is a declarative policy language. The policy says: allow access only if the agent's certificate was issued by 'research-lab-trusted-ca', the agent has 'data-access' capability, the environment is production, and the agent has security clearance level 3 or higher. This is simple, but powerful. You can write policies that check certificate issuers, capabilities, environments, security clearances, time of day, resource availability—anything you can express in logic. We have policies for different categories. Access control policies enforce RBAC—who can do what. Resource limit policies constrain CPU and memory usage. Network policies implement micro-segmentation—agents can only talk to agents they're authorized to communicate with. Compliance policies enforce HIPAA, GDPR, SOC 2 requirements automatically. The benefits are significant. Policies are version-controlled—you can see exactly when policies changed and why. They're tested like application code—you can write unit tests for policies. They're enforced at the platform level—agents can't bypass them. And they can adapt dynamically—policies can change based on conditions like time of day or system load. In our test environments, we've seen 95% reduction in misconfigurations and 100% policy compliance. This isn't because we're perfect—it's because policies are enforced automatically, and agents can't bypass them. But policies are just one part of the security story. Let me show you how zero trust principles apply to agent interactions.

Zero Trust Principles for Agent Interactions

Every interaction is verified, nothing is trusted by default

🔐 Verify Explicitly

Every agent-to-agent interaction requires cryptographic proof of identity and capability

📊 Use Least Privilege

Agents only get the minimum capabilities needed for their specific function

🚨 Assume Breach

Continuous monitoring and verification, even after initial authentication

🛡️ Dynamic Identity Controls

Unlike human users, agents have dynamic identities that change based on:

Current capabilities (can be revoked instantly)
Operational context (time, location, workload)
Trust scores (based on behavior and compliance history)

Zero trust isn't just a buzzword—it's a fundamental security principle that applies perfectly to agent interactions. The core idea is simple: verify explicitly, use least privilege, assume breach. For agents, this means every agent-to-agent interaction requires cryptographic proof of identity and capability. There's no "trusted network" or "internal zone." Every connection is verified, every request is authenticated, every capability is proven. We use least privilege at the capability level. An agent doesn't get broad permissions—it gets exactly the capabilities it needs for its specific function. A concept drift detector doesn't need database write access. A notification agent doesn't need model training capabilities. This limits the blast radius if an agent is compromised. We assume breach from the start. This means continuous monitoring and verification, even after initial authentication. An agent's trust score can change based on its behavior. If it starts acting suspiciously, its capabilities can be revoked instantly. If it violates policies, it gets isolated automatically. Here's what's interesting about agents versus human users. Human identities are relatively static—you are who you are. Agent identities are dynamic. An agent's identity changes based on its current capabilities, which can be revoked instantly. It changes based on operational context—time of day, location, current workload. And it changes based on trust scores—agents that behave well get more capabilities, agents that misbehave get isolated. This dynamic identity control is crucial for security. A compromised agent can't maintain its privileges indefinitely—its behavior will trigger policy violations, which will reduce its trust score, which will revoke its capabilities. This all comes together in real-world workflows. Let me show you a complete example.

Real-World Workflow: Concept Drift Detection

End-to-end autonomous remediation in under 30 seconds

🎬 The Complete Workflow

Drift Detector Agent notices 15% performance degradation
ANS Discovery: Finds model retrainer by capability
Zero-Knowledge Auth: Proves capability to trigger retraining
Policy Enforcement: OPA validates the request
Automated Execution: Retrainer updates the model
Notification Agent: Alerts team via Slack

⏱️ Total Time: <30 seconds | 🔒 100% Secure | 📝 Fully Audited

🔍 What Happens Behind the Scenes

mTLS handshake with capability verification
Zero-knowledge proof generation and validation
OPA policy evaluation (3-5ms)
Certificate chain validation
Complete audit log generation

✅ Operational Impact

Zero manual intervention
Automatic rollback on failure
Complete audit trail
Compliance reporting
Real-time monitoring

Let me walk you through a real scenario that demonstrates the power of trusted agent communication. This is a concept drift detection and automated remediation workflow that we run in production. Here's what happens. Your drift detector agent notices a 15% performance degradation in a production model. This is a real problem that needs immediate action, but it's 2 AM and no one is awake to handle it. Using ANS, the drift detector discovers the model retrainer agent by capability—not by hardcoded IP address or service name, but by what it can do. It queries ANS: "I need an agent that can retrain models in production." ANS returns the retrainer's identity and capabilities. The drift detector then proves it has the capability to trigger retraining using a zero-knowledge proof. The retrainer can verify this proof without ever seeing the actual credentials. This happens through mTLS with capability verification in the certificate extensions. OPA validates the request against your policies. Is this allowed? Is it the right time? Does it meet compliance requirements? Policy evaluation takes 3-5 milliseconds, and it's completely automated. If everything checks out, the retrainer executes the update. It pulls the latest training data, retrains the model, validates the new model's performance, and deploys it to production. All of this happens automatically. Finally, a notification agent alerts your team via Slack. But here's the key: by the time you see the notification, the problem is already fixed. The model has been retrained and deployed, and performance is back to normal. This entire workflow—discovery, authentication, authorization, execution, and notification—happens in under 30 seconds. It's 100% secure, fully audited, and happens without any human intervention. Behind the scenes, there's a lot happening. mTLS handshake with capability verification. Zero-knowledge proof generation and validation. OPA policy evaluation. Certificate chain validation. Complete audit log generation. But all of this is transparent to the agents—they just work. The operational impact is significant. Zero manual intervention means problems get fixed faster. Automatic rollback on failure means you don't have to worry about broken deployments. Complete audit trail means you can prove compliance. Real-time monitoring means you can see what's happening as it happens. In a traditional system, this would have taken 2-3 days. Someone would have to notice the alert, investigate the problem, manually trigger retraining, wait for it to complete, validate the results, and deploy. With ANS, it's completely automated. This is what trustworthy AI agent ecosystems look like in practice.

Performance Results and Benchmarks

Real production metrics from our research environment

<10ms

Service Response

100%

Deployment Success

10,000+

Concurrent Agents

📊 Production Environment

3-node Kubernetes cluster (EKS)
50+ registered agents
Full GitOps pipeline (ArgoCD)
OPA Gatekeeper enforcement
Prometheus + Grafana monitoring
Sigstore certificate authority

💡 Key Benefits

90% reduction in deployment time
100% policy compliance
Zero hardcoded secrets
Complete audit trail
Automatic compliance reporting
Real-time threat detection

Let's talk about the real numbers from our production environment. These aren't theoretical—these are measurements from a system that's running in production today. Service response times average under 10 milliseconds. That's fast enough for real-time agent orchestration while maintaining cryptographic security. This includes authentication, capability verification, policy evaluation, and actual service execution. We've achieved 100% deployment success rate. Remember, traditional approaches have a 65% success rate, with 35% of deployments requiring manual intervention. With ANS, every deployment either succeeds completely or rolls back cleanly. No partial deployments, no configuration drift, no manual cleanup. We've successfully tested with over 10,000 concurrent agents. This scales far beyond typical enterprise needs. The system handles 1000+ daily agent interactions with sub-50ms authentication latency. Our production environment is a 3-node Kubernetes cluster on EKS, with 50+ registered agents, full GitOps pipeline using ArgoCD, OPA Gatekeeper enforcing policies, Prometheus and Grafana for monitoring, and Sigstore as our certificate authority. This is a real, production-ready setup. The key benefits are significant. 90% reduction in deployment time—from 2-3 days to under 30 minutes. 100% policy compliance—no more configuration drift. Zero hardcoded secrets—everything uses cryptographic proof. Complete audit trail—you know exactly what happened, when, and why. Automatic compliance reporting—HIPAA, GDPR, SOC 2, it's all automated. Real-time threat detection—we catch problems before they become incidents. The 50ms authentication latency is crucial. It's fast enough for real-time agent orchestration, but slow enough to be cryptographically secure. This is the sweet spot for agentic AI. Traditional approaches either sacrifice security for speed or sacrifice speed for security. ANS gives you both. We've seen 30% faster deployment times and 95% reduction in misconfigurations compared to manual processes. This isn't just about speed—it's about reliability, security, and compliance.

Key Takeaways

🔑 Core Benefits

Security: Cryptographic agent identity and capability verification
Scale: Handle 1000+ agent interactions with sub-second latency
Governance: Policy-as-code enforcement with complete audit trails
Future-proof: Protocol-agnostic design supports evolving standards

📚 Research Contributions

Formal trust model for agent ecosystems
Open-source reference implementation
Production benchmarks and performance analysis
Kubernetes-native architecture patterns

The future of AI is agentic. The future of agentic AI must be secure.
ANS provides the trust layer that makes both possible.

Let me summarize what we've covered today. ANS provides a standardized framework for AI-agent identity management. It enhances trust, governance, and security in agent workflows through cryptographic verification, capability attestation, and policy-as-code enforcement. The core benefits are clear. Security through cryptographic agent identity and capability verification. Scale with 1000+ agent interactions and sub-second latency. Governance through policy-as-code with complete audit trails. Future-proof design that supports evolving standards. Our research contributions include a formal trust model for agent ecosystems, an open-source reference implementation, production benchmarks with performance analysis, and Kubernetes-native architecture patterns that others can build on. But here's the key message: The future of AI is agentic. We're moving from supervised ML to autonomous agent orchestration. But the future of agentic AI must be secure. You can't build autonomous systems without proper trust infrastructure. ANS provides the trust layer that makes both possible. It's not just a research project—it's a production-ready system that's already changing how organizations deploy AI agents. The question isn't whether you'll need something like ANS—it's when you'll need it. And based on my experience, the answer is: sooner than you think.

Your ANS Implementation Journey

Three phases from proof-of-concept to production scale

🚀 Phase 1: Foundation

Weeks 1–2

Deploy ANS in dev environment
Configure basic OPA policies
Set up Sigstore CA
Register first test agent

Resources: 1 DevOps engineer, 1 K8s cluster

🔗 Phase 2: Integration

Weeks 3–4

Integrate with GitOps pipeline
Migrate first production workload
Implement monitoring
Train team on ANS operations

Resources: 2 engineers, production access

📈 Phase 3: Production Scale

Weeks 5–8

Deploy with canary rollout
Scale to additional agent types
Optimize performance
Establish governance processes

Resources: Full team, security review

Prerequisites: Kubernetes 1.24+, GitOps pipeline, OPA Gatekeeper, Sigstore access

If you're thinking about implementing ANS, let me give you a roadmap based on our experience with early adopters. Phase 1 is about getting the foundation right. Deploy ANS in a development environment. Configure basic OPA policies. Set up Sigstore as your certificate authority. Register your first test agent. This takes about 2 weeks with one DevOps engineer and one Kubernetes cluster. The goal here is to prove the concept works in your environment. Phase 2 is about integration. Integrate ANS with your existing GitOps pipeline—ArgoCD, Flux, whatever you're using. Migrate your first production workload. Implement monitoring and alerting. Train your team on ANS operations. This takes another 2 weeks with 2 engineers and production cluster access. The goal here is to prove it works in production. Phase 3 is about scaling. Deploy to production with canary rollout—start with a small percentage of agents, then gradually increase. Scale to additional agent types. Optimize performance and security policies. Establish governance processes. This takes 4 weeks with your full team and a security review. The goal here is to make ANS the standard for all agent deployments. Prerequisites are straightforward. You need Kubernetes 1.24 or higher, a GitOps pipeline, OPA Gatekeeper, and Sigstore access. Most teams already have these. If you don't, they're relatively easy to set up. The key is to start small and iterate. Don't try to migrate everything at once. Pick one agent type, prove it works, then expand.

Resources and Community

🚀 Get Started

📦 Clone the repo:
github.com/akshaymittal/ans-mlops-demo
🎥 Watch the demo:
Full walkthrough available
📖 Read the paper:
IEEE conference submission
💬 Join the community:
#ans-community on Slack

📞 Contact

Akshay Mittal

📧 akshay.mittal@ieee.org

💼 linkedin.com/in/akshaymittal143

🐙 github.com/akshaymittal143

🔬 ORCID: 0009-0008-5233-9248

🤝 Collaboration Welcome: Research partnerships, industry implementations, open source contributions

Everything is open source. The reference implementation is on GitHub at github.com/akshaymittal/ans-mlops-demo. You can clone it, run it locally, and see ANS in action in about 10 minutes. It includes the code, Kubernetes manifests, demo agents, monitoring stack, and step-by-step instructions. We have a full demo walkthrough video that shows everything I've talked about today in detail. The complete technical paper is available as an IEEE conference submission. Join the #ans-community channel in MLOps World Slack. We have monthly research collaboration calls where we discuss new features, share experiences, and plan future development. I'm actively looking for research partners and industry collaborators. If you're interested in implementing ANS in your environment, or if you want to contribute to the open source project, please reach out. My email is akshay.mittal@ieee.org. I'm on LinkedIn at linkedin.com/in/akshaymittal143. All my code is on GitHub at github.com/akshaymittal143. My ORCID is 0009-0008-5233-9248. The future of AI is agentic. And the future of agentic AI is secure. ANS provides the trust layer that makes this possible. Let's build it together.

Thank You!

Questions?

📧 akshay.mittal@ieee.org

🐙 github.com/akshaymittal/ans-mlops-demo

💼 linkedin.com/in/akshaymittal143

Let's build the trust layer for autonomous AI together!

Scan to connect on LinkedIn

Thank you for your attention. We've covered a lot in 30 minutes—from the fundamental trust problem to a production-ready solution. To recap: AI agents have a trust problem. ANS solves it with DNS-like naming, cryptographic identity, zero-knowledge proofs, and policy-as-code governance. It's production-ready, it's fast, and it's open source. I'm here for questions. Whether you want to know more about the technical architecture, the performance benchmarks, the security model, the implementation details, or potential collaborations—I'm all ears. Common questions I get: How does this compare to service mesh? What about the performance overhead? How do you handle agent updates? What about multi-cloud deployments? What about compliance requirements? I've got answers for all of these. Thank you again, and let's chat!