Agent Name Service (ANS)
DNS for AI Agents: Building Trust, Governance, and Security for AI-Agent Workflows
Akshay Mittal
PhD Scholar, University of the Cumberlands | IEEE Senior Member
GSDC AI in Action 2026 — Global Webinar
January 2026
The Agentic AI Revolution
From supervised ML to autonomous agent orchestration
📊 Traditional ML Pipeline
- Human-supervised at every step
- Data → Train → Deploy → Monitor
- Manual intervention required
- Days to weeks for updates
🤖 Agentic AI Reality
- Autonomous agent orchestration
- Concept-drift detector → Auto-retrainer
- Deployer → Monitor → Remediate
- Minutes to hours for updates
Critical Question: Who are these agents? Can we trust them?
The Incident That Changed Everything
A real production failure that revealed the trust gap
💥 The Scenario
- 50-agent ML operations system
- Multi-tenant environment
- Each agent with hardcoded endpoints
- No identity verification between agents
⚡ What Happened
One Tuesday morning, a single agent was compromised through a configuration error.
Within 6 minutes: System-wide collapse
🔍 Root Cause
- Compromised agent impersonated deployment service
- Downstream agents deployed corrupted models
- Monitoring agent couldn't distinguish legitimate from malicious traffic
- No way to verify agent identity
The Trust Problem in Agent Ecosystems
Four critical gaps that make agent systems vulnerable
❌ Gap 1: No Uniform Discovery
Manual configuration and hardcoded endpoints. No standard way to discover agents by capability.
🔐 Gap 2: Missing Cryptographic Auth
Cryptographic authentication between agents is virtually nonexistent. Most systems rely on static API keys or basic auth.
🛡️ Gap 3: No Capability Verification
Agents can't prove capabilities without exposing sensitive implementation details or credentials.
📋 Gap 4: No Governance Framework
Governance frameworks are nonexistent or impossible to enforce consistently across agent interactions.
⚠️ Impact: In our research, 1 compromised agent in a 50-agent system led to cascading failures within minutes.
We've Solved This Before: The DNS Analogy
🌐 DNS (1987)
google.com
↓
142.250.191.14
Just location mapping
🤖 ANS (2025)
agent.capability.provider.v1.prod
↓
Identity + Capability + Trust
Complete trust layer
Key Innovation: ANS adds cryptographic verification, capability attestation, and governance support—everything DNS doesn't provide.
ANS Protocol Design
Self-describing names that encode identity, capability, and context
📝 Naming Convention
protocol://AgentID.Capability.Provider.v[Version].Extension
💡 Real Examples
a2a://alerter.security-monitoring.research-lab.v2.prod
mcp://validator.concept-drift-detection.ml-platform.v1.hipaa
acp://remediator.helm-deployment-fix.devsecops-team.v3.staging
✅ Benefits
- Self-describing capabilities
- Version-aware routing
- Provider trust verification
- Environment-specific deployment
- Protocol-agnostic design
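The naming convention above can be parsed mechanically. The following is a minimal sketch, assuming each field uses lowercase letters, digits, and hyphens and that the version is a plain integer; the deck does not state the exact character rules, and `AnsName` and `parse_ans_name` are hypothetical names introduced for illustration.

```python
import re
from dataclasses import dataclass

# Pattern for protocol://AgentID.Capability.Provider.v[Version].Extension
# Assumption: fields are lowercase alphanumerics plus hyphens; version is an integer.
ANS_NAME = re.compile(
    r"^(?P<protocol>[a-z0-9]+)://"
    r"(?P<agent_id>[a-z0-9-]+)\."
    r"(?P<capability>[a-z0-9-]+)\."
    r"(?P<provider>[a-z0-9-]+)\."
    r"v(?P<version>\d+)\."
    r"(?P<extension>[a-z0-9-]+)$"
)


@dataclass(frozen=True)
class AnsName:
    protocol: str
    agent_id: str
    capability: str
    provider: str
    version: int
    extension: str


def parse_ans_name(name: str) -> AnsName:
    """Split an ANS name into its fields, or raise ValueError if it does not match."""
    match = ANS_NAME.match(name)
    if match is None:
        raise ValueError(f"not a valid ANS name: {name!r}")
    fields = match.groupdict()
    fields["version"] = int(fields["version"])
    return AnsName(**fields)


parsed = parse_ans_name("a2a://alerter.security-monitoring.research-lab.v2.prod")
```

Because the capability and version come out as separate fields, a router could select agents by capability and pick a version without any out-of-band lookup.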
Cryptographic Trust Foundation
Three foundational technologies working together
🔑 DIDs
Decentralized Identifiers
W3C standard for globally unique, verifiable agent identity
📜 VCs
Verifiable Credentials
Capability attestations that prove what agents can do
🏛️ CA + RA
Certificate Authority + Registration Authority
Automated certificate management and lifecycle
🔗 Trust Chain Flow
Root CA → Intermediate CA → Agent Certificate → Capability Proof
Like mTLS for microservices, but capability-aware
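The trust chain flow can be illustrated with a structural check. This is only a sketch under stated assumptions: `Cert` is a hypothetical stand-in for a real certificate, and signature verification is deliberately omitted. A real deployment would verify each signature with an X.509 library and then check the capability proof attached to the agent certificate.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Cert:
    """Hypothetical, simplified certificate: no keys or signatures."""
    subject: str
    issuer: str
    not_after: datetime


def chain_is_linked(chain: list[Cert], trusted_roots: set[str], now: datetime) -> bool:
    """Check a chain ordered agent-first, root-last.

    Verifies expiry, issuer/subject linkage, and that the chain ends at a
    trusted self-issued root. Signature checks are NOT performed here.
    """
    if not chain:
        return False
    if any(now > cert.not_after for cert in chain):
        return False
    for child, parent in zip(chain, chain[1:]):
        if child.issuer != parent.subject:
            return False
    root = chain[-1]
    return root.subject == root.issuer and root.subject in trusted_roots


expiry = datetime(2030, 1, 1, tzinfo=timezone.utc)
chain = [
    Cert("agent:alerter", "intermediate-ca", expiry),
    Cert("intermediate-ca", "root-ca", expiry),
    Cert("root-ca", "root-ca", expiry),
]
now = datetime(2026, 1, 15, tzinfo=timezone.utc)
linked = chain_is_linked(chain, {"root-ca"}, now)
```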
Zero-Knowledge Capability Proofs
Prove capabilities without exposing secrets
❌ Traditional Approach
Agent: "I can access database"
Verifier: "Show password"
→ Credentials revealed
✅ ANS Zero-Knowledge
Agent: "I can prove access without revealing credentials"
Verifier: "Prove it cryptographically"
→ Capability verified, secrets protected
💡 Real-World Use Case
Agent proves model retraining capability without exposing API keys, database credentials, or internal implementation details.
The verifier gets cryptographic proof of capability, but never sees the actual secrets.
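The deck does not say which zero-knowledge scheme ANS uses. As an illustration of the general idea, here is a minimal sketch of a non-interactive Schnorr proof (Fiat-Shamir style), in which an agent proves it knows the secret behind a public key without revealing it. The group parameters are toy values chosen for readability and are far too small for real use; `prove`, `verify`, and the `context` string are assumptions made for this example.

```python
import hashlib
import secrets

# Toy parameters (insecure): P = 2*Q + 1 with P, Q prime; G generates the order-Q subgroup.
P, Q, G = 2039, 1019, 4


def challenge(commitment: int, public_key: int, context: bytes) -> int:
    """Derive the challenge by hashing the commitment, public key, and context."""
    digest = hashlib.sha256(f"{commitment}|{public_key}|".encode() + context).digest()
    return int.from_bytes(digest, "big") % Q


def prove(secret: int, context: bytes) -> tuple[int, int]:
    """Return (commitment, response) proving knowledge of `secret`."""
    public_key = pow(G, secret, P)
    nonce = secrets.randbelow(Q - 1) + 1  # fresh random nonce in [1, Q-1]
    commitment = pow(G, nonce, P)
    c = challenge(commitment, public_key, context)
    response = (nonce + c * secret) % Q
    return commitment, response


def verify(public_key: int, context: bytes, proof: tuple[int, int]) -> bool:
    """Accept if G^response == commitment * public_key^c (mod P)."""
    commitment, response = proof
    c = challenge(commitment, public_key, context)
    return pow(G, response, P) == (commitment * pow(public_key, c, P)) % P


agent_secret = 123  # never leaves the agent
agent_public = pow(G, agent_secret, P)  # registered with the verifier
ctx = b"capability:model-retraining"
proof = prove(agent_secret, ctx)
accepted = verify(agent_public, ctx, proof)
```

The verifier only ever sees the public key, the commitment, and the response. Binding the challenge to a context string ties the proof to one capability request, so it cannot be replayed for a different one.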
Kubernetes-Native Architecture
Four core components working together
📋 ANS Registry
Kubernetes Custom Resource Definitions (CRDs) store agent metadata, identity, and capabilities
🚪 Admission Controller
Validates every agent deployment against security policies before it hits the cluster
🛡️ Service Mesh
Istio/Linkerd mTLS + capability verification in certificate extensions
📊 Policy Engine
Open Policy Agent (OPA) enforces governance rules continuously
🔄 Agent Lifecycle
Register → Validate → Deploy → Authenticate → Monitor
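The lifecycle above can be modelled as an ordered set of stages. This is a sketch only: the `Stage` enum and the rule that an agent may receive traffic only once authenticated are assumptions for illustration, not details taken from the ANS implementation.

```python
from enum import Enum


class Stage(Enum):
    REGISTERED = "register"
    VALIDATED = "validate"
    DEPLOYED = "deploy"
    AUTHENTICATED = "authenticate"
    MONITORED = "monitor"


# Enum iteration preserves definition order, which matches the lifecycle order.
ORDER = list(Stage)


def advance(current: Stage) -> Stage:
    """Move to the next stage; monitoring is the final, continuous stage."""
    index = ORDER.index(current)
    if index == len(ORDER) - 1:
        return current
    return ORDER[index + 1]


def can_receive_traffic(stage: Stage) -> bool:
    """Assumed rule: no agent-to-agent traffic before authentication."""
    return ORDER.index(stage) >= ORDER.index(Stage.AUTHENTICATED)


stage = Stage.REGISTERED
path = [stage]
while advance(stage) is not stage:
    stage = advance(stage)
    path.append(stage)
```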
GitOps Integration Workflow
From code commit to production deployment in under 30 minutes
🔄 Complete Pipeline
Code Commit → Policy Validation → Certificate Provisioning → Auto-Deploy → Runtime Verification
Deployment time: 2-3 days → 30 minutes
🔑 Automated Key Management
- Sigstore integration
- Automatic certificate provisioning
- 90-day key rotation cycles
- Zero-trust handshake validation
- Revocation list management
✅ Security Benefits
- No hardcoded secrets
- Automated compliance
- Audit trail for all operations
- Rollback capability
- Complete reproducibility
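The 90-day rotation cycle reduces to a date comparison. A minimal sketch, assuming rotation is triggered slightly before the deadline; the 7-day `renew_margin` and the function name `rotation_due` are hypothetical and not part of the deck.

```python
from datetime import datetime, timedelta, timezone

ROTATION_PERIOD = timedelta(days=90)  # from the 90-day rotation cycle above


def rotation_due(
    issued_at: datetime,
    now: datetime,
    renew_margin: timedelta = timedelta(days=7),  # assumed safety margin
) -> bool:
    """Return True once the key is within `renew_margin` of its 90-day limit."""
    return now >= issued_at + ROTATION_PERIOD - renew_margin


issued = datetime(2026, 1, 1, tzinfo=timezone.utc)
early = rotation_due(issued, datetime(2026, 3, 1, tzinfo=timezone.utc))
late = rotation_due(issued, datetime(2026, 3, 26, tzinfo=timezone.utc))
```

Rotating before the hard deadline leaves time for the new certificate to propagate before the old one is revoked.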
Policy-as-Code Governance
OPA policies that enforce security, compliance, and operational rules
📝 Example OPA Policy
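package agent.policy

# Deny unless every condition in the allow rule holds.
default allow = false

allow {
    # Certificate must come from the trusted CA.
    input.agent.certificate.issuer == "research-lab-trusted-ca"
    # Agent must hold the data-access capability.
    input.agent.capabilities["data-access"] == true
    input.environment == "production"
    input.agent.security_clearance >= 3
}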
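To see what input the example policy expects, it can help to mirror the rule outside OPA. The sketch below is a plain Python re-statement of the same four conditions for building test fixtures. It is not how OPA evaluates policies, and the sample request is invented for illustration.

```python
def allow(request: dict) -> bool:
    """Python mirror of the example rule: all four conditions must hold.

    Missing fields evaluate to a denial, matching `default allow = false`.
    """
    agent = request.get("agent", {})
    return (
        agent.get("certificate", {}).get("issuer") == "research-lab-trusted-ca"
        and agent.get("capabilities", {}).get("data-access") is True
        and request.get("environment") == "production"
        and agent.get("security_clearance", 0) >= 3
    )


# Hypothetical request document in the shape the policy reads from `input`.
sample_request = {
    "agent": {
        "certificate": {"issuer": "research-lab-trusted-ca"},
        "capabilities": {"data-access": True},
        "security_clearance": 3,
    },
    "environment": "production",
}

decision = allow(sample_request)
```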
📋 Policy Categories
- Access Control: RBAC policies
- Resource Limits: CPU/memory constraints
- Network Policies: Micro-segmentation
- Compliance: HIPAA, GDPR, SOC 2
✅ Benefits
- Version-controlled policies
- Tested like application code
- Platform-level compliance
- Dynamic policy adaptation
Zero Trust Principles for Agent Interactions
Every interaction is verified, nothing is trusted by default
🔐 Verify Explicitly
Every agent-to-agent interaction requires cryptographic proof of identity and capability
📊 Use Least Privilege
Agents only get the minimum capabilities needed for their specific function
🚨 Assume Breach
Continuous monitoring and verification, even after initial authentication
🛡️ Dynamic Identity Controls
Unlike human users, agents have dynamic identities that change based on:
- Current capabilities (can be revoked instantly)
- Operational context (time, location, workload)
- Trust scores (based on behavior and compliance history)
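One way to picture a dynamic identity is as a record whose capabilities and trust score change over time. The deck does not define how trust scores are computed, so the moving-average update, the `0.7` threshold, and all names below are assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class AgentIdentity:
    name: str
    capabilities: set[str] = field(default_factory=set)
    trust_score: float = 1.0  # assumed range: 0.0 (untrusted) to 1.0 (fully trusted)

    def record(self, compliant: bool, weight: float = 0.2) -> None:
        """Blend the latest observation into the score (exponential moving average)."""
        observation = 1.0 if compliant else 0.0
        self.trust_score = (1 - weight) * self.trust_score + weight * observation

    def revoke(self, capability: str) -> None:
        """Revocation takes effect on the next authorization check."""
        self.capabilities.discard(capability)


def authorize(agent: AgentIdentity, capability: str, min_trust: float = 0.7) -> bool:
    """Allow only if the capability is still held and the score is high enough."""
    return capability in agent.capabilities and agent.trust_score >= min_trust


agent = AgentIdentity("retrainer", {"model-retraining"})
before = authorize(agent, "model-retraining")
agent.record(compliant=False)
agent.record(compliant=False)
after_violations = authorize(agent, "model-retraining")
```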
Real-World Workflow: Concept Drift Detection
End-to-end autonomous remediation in under 30 seconds
🎬 The Complete Workflow
- Drift Detector Agent notices 15% performance degradation
- ANS Discovery: Finds model retrainer by capability
- Zero-Knowledge Auth: Proves capability to trigger retraining
- Policy Enforcement: OPA validates the request
- Automated Execution: Retrainer updates the model
- Notification Agent: Alerts team via Slack
⏱️ Total Time: <30 seconds | 🔒 Every request authenticated and policy-checked | 📝 Fully Audited
🔍 What Happens Behind the Scenes
- mTLS handshake with capability verification
- Zero-knowledge proof generation and validation
- OPA policy evaluation (3-5ms)
- Certificate chain validation
- Complete audit log generation
✅ Operational Impact
- Zero manual intervention
- Automatic rollback on failure
- Complete audit trail
- Compliance reporting
- Real-time monitoring
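The workflow steps can be sketched as a single function that records each step in an audit log. This is an in-memory illustration, not the ANS implementation: the registry entries, the reading of "15%" as a relative drop from baseline, and the `prove_capability` and `policy_allows` callbacks are all assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Agent:
    name: str
    capability: str


# Hypothetical registry entries; the names follow the ANS convention but are invented.
REGISTRY = [
    Agent("a2a://retrainer.model-retraining.ml-platform.v1.prod", "model-retraining"),
    Agent("a2a://notifier.team-notification.ml-platform.v1.prod", "team-notification"),
]

DRIFT_THRESHOLD = 0.15  # assumed to mean a 15% relative drop from baseline


def discover(capability: str) -> Optional[Agent]:
    """Find the first registered agent offering the capability."""
    return next((a for a in REGISTRY if a.capability == capability), None)


def handle_metrics(
    baseline: float,
    current: float,
    prove_capability: Callable[[], bool],  # stands in for the zero-knowledge step
    policy_allows: Callable[[str], bool],  # stands in for the OPA decision
) -> list[str]:
    """Run the drift workflow and return the audit log of steps taken."""
    audit: list[str] = []
    degradation = (baseline - current) / baseline
    if degradation < DRIFT_THRESHOLD:
        audit.append("no-drift")
        return audit
    audit.append("drift-detected")
    retrainer = discover("model-retraining")
    if retrainer is None:
        audit.append("discovery-failed")
        return audit
    audit.append(f"discovered:{retrainer.name}")
    if not prove_capability():
        audit.append("auth-failed")
        return audit
    audit.append("authenticated")
    if not policy_allows(retrainer.name):
        audit.append("policy-denied")
        return audit
    audit.append("policy-allowed")
    audit.append("retraining-triggered")
    notifier = discover("team-notification")
    if notifier is not None:
        audit.append(f"notified-via:{notifier.name}")
    return audit


log = handle_metrics(0.90, 0.72, lambda: True, lambda name: True)
```

Every branch that stops the workflow still writes an audit entry, which is what makes a denied request as traceable as a successful one.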
Performance Results and Benchmarks
Real production metrics from our research environment
10,000+
Concurrent Agents
📊 Production Environment
- 3-node Kubernetes cluster (EKS)
- 50+ registered agents
- Full GitOps pipeline (ArgoCD)
- OPA Gatekeeper enforcement
- Prometheus + Grafana monitoring
- Sigstore certificate authority
💡 Key Benefits
- 90% reduction in deployment time
- 100% policy compliance
- Zero hardcoded secrets
- Complete audit trail
- Automatic compliance reporting
- Real-time threat detection
Key Takeaways
🔑 Core Benefits
- Security: Cryptographic agent identity and capability verification
- Scale: Handle 1,000+ agent interactions with sub-second latency
- Governance: Policy-as-code enforcement with complete audit trails
- Future-proof: Protocol-agnostic design supports evolving standards
📚 Research Contributions
- Formal trust model for agent ecosystems
- Open-source reference implementation
- Production benchmarks and performance analysis
- Kubernetes-native architecture patterns
The future of AI is agentic. The future of agentic AI must be secure.
ANS provides the trust layer that makes both possible.
Your ANS Implementation Journey
Three phases from proof-of-concept to production scale
🚀 Phase 1: Foundation
Weeks 1–2
- Deploy ANS in dev environment
- Configure basic OPA policies
- Set up Sigstore CA
- Register first test agent
Resources: 1 DevOps engineer, 1 K8s cluster
🔗 Phase 2: Integration
Weeks 3–4
- Integrate with GitOps pipeline
- Migrate first production workload
- Implement monitoring
- Train team on ANS operations
Resources: 2 engineers, production access
📈 Phase 3: Production Scale
Weeks 5–8
- Deploy with canary rollout
- Scale to additional agent types
- Optimize performance
- Establish governance processes
Resources: Full team, security review
Prerequisites: Kubernetes 1.24+, GitOps pipeline, OPA Gatekeeper, Sigstore access
Resources and Community
🚀 Get Started
- 📦 Clone the repo:
github.com/akshaymittal/ans-mlops-demo
- 🎥 Watch the demo:
Full walkthrough available
- 📖 Read the paper:
IEEE conference submission
- 💬 Join the community:
#ans-community on Slack
🤝 Collaboration Welcome: Research partnerships, industry implementations, open source contributions
Thank You!
Questions?