Operations Hub

One place for all your SRE tools — live agents, topology maps, AI analysis, cost monitoring, and more.

Agents
Events (24h)
Actions
Errors

🤖 AI & Automation

🤖
Agent Console
Multi-agent orchestration dashboard. Monitor the HealthCheck, ErrorMonitor, Cost, Scan, and Analyze agents in real time. Trigger runs, view the event feed, and follow agent-to-agent handoffs.
● live · 5 agents · real-time
🧠
AIOps Console
Live AIOps analysis feed. Every Azure alert is analyzed against Elasticsearch logs and worker health checks, and an automated verdict is posted back to Slack.
● live · ES analysis · auto-verdict

🗺️ Topology & Monitoring

📡
Smartscape
Dynatrace-style service topology. Live worker health, error rates, sparklines, app-to-plan relationships. Filter by region, click any node for deep metrics.
● live · worker health · metrics
🌐
Azure Topology Map
Interactive Azure resource graph. Visualize App Service Plans, worker instances, regions, and their relationships as a live force-directed network map.
● live · force graph · azure
🔭
SnitchOps Scanner
On-demand worker scan across all regions. Detects unhealthy workers, stopped instances, and misconfigured App Service Plans with actionable restart recommendations.
multi-region · on-demand

🔧 Utilities

❤️
Health Check
Bot health endpoint. Returns the current status, uptime, and a connectivity check; useful for monitoring and container liveness probes.
liveness · uptime
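
As a rough illustration of what a health endpoint of this kind might look like, here is a minimal sketch using only the Python standard library. The `/health` path, the JSON field names, port 8080, and the `check_connectivity` placeholder are all assumptions made for the example; they are not taken from the bot's actual implementation.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START_TIME = time.monotonic()  # recorded at import time, used to compute uptime


def format_uptime(seconds: float) -> str:
    """Render a duration in seconds as hours and minutes, e.g. '1h 10m'."""
    total_minutes = int(seconds) // 60
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h {minutes}m"


def check_connectivity() -> bool:
    """Hypothetical placeholder: a real check would probe downstream services."""
    return True


def health_payload(uptime_seconds: float, connected: bool) -> dict:
    """Build the response body (field names are assumed for this sketch)."""
    return {
        "status": "ok" if connected else "degraded",
        "uptime": format_uptime(uptime_seconds),
        "connectivity": connected,
    }


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":  # assumed route
            self.send_error(404)
            return
        payload = health_payload(time.monotonic() - START_TIME, check_connectivity())
        body = json.dumps(payload).encode("utf-8")
        # A non-2xx code lets a liveness probe treat the container as unhealthy.
        self.send_response(200 if payload["connectivity"] else 503)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Start the sketch server; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Calling `serve()` starts the server, and a probe would then issue a GET against `/health`. Returning 503 when the connectivity check fails is one possible design choice; some setups prefer to always return 200 from the liveness route and report dependency failures separately.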
💉
PD Alert Injector
Inject a synthetic PagerDuty alert to test the end-to-end flow: orchestrator routing → HealthCheckAgent → Slack reply. Replace the app name in the URL to test any app.
testing · e2e
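
The "replace the app name in the URL" step could be scripted along these lines. This is only a sketch: the base URL, the `inject-alert` route, and the use of a plain GET request are hypothetical stand-ins, since the page does not show the injector's real URL or HTTP method.

```python
import urllib.request
from urllib.parse import quote


def build_inject_url(base_url: str, app_name: str) -> str:
    """Build the injector URL for one app.

    The 'inject-alert/<app>' route is an assumption made for this example.
    The app name is percent-encoded so spaces or slashes cannot alter the path.
    """
    return f"{base_url.rstrip('/')}/inject-alert/{quote(app_name, safe='')}"


def inject_alert(base_url: str, app_name: str, timeout: float = 10.0) -> int:
    """Request the injector URL and return the HTTP status code.

    Assumes the injector is triggered by a GET request.
    """
    url = build_inject_url(base_url, app_name)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status
```

For example, `inject_alert("https://bot.example.com", "my-app")` would request the assumed injector route for `my-app`; the result of the run would then show up as the Slack reply described above.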
Bot running · Uptime 1h 10m · SRE Grogu · Azure Container Apps · southcentralus