September 1, 2025 – On this date, Microsoft introduced a new flexible, predictable billing model for Azure SRE Agent, ending the free phase of the service’s public preview. Azure SRE Agent (first announced at Microsoft Build 2025) is now billed based on usage measured in Azure Agent Units (AAUs). Costs combine a fixed baseline charge with pay-as-you-go variable charges, so organizations can see what drives their bill. In summary, as of early September 2025, Azure SRE Agent remains in public preview, but usage is now billed after a period of zero-cost trial usage.
What is Azure SRE Agent
Azure SRE Agent is an AI-powered cloud reliability assistant. SRE stands for Site Reliability Engineering, a field dedicated to keeping services highly reliable and available. This agent was created to help automate and streamline incident response in Azure environments. By analyzing telemetry (logs, metrics, etc.) with advanced AI, it can detect problems quickly, diagnose root causes, and even resolve issues automatically. The goal is to improve service uptime and reduce the manual work (or “toil”) that human engineers have to do during outages or performance incidents.
Key things to know about Azure SRE Agent:
- Announcement and Purpose: It was unveiled at Microsoft Build 2025 as part of Microsoft’s “Agentic DevOps” initiative (which embeds intelligent agents into the software lifecycle). Azure SRE Agent is designed to watch over cloud applications and infrastructure, so your site reliability engineers can focus on higher-value tasks rather than constantly putting out fires.
- How It Works: Once deployed, the agent runs continuously and operates in two modes:
  - In the “always-on” mode, it quietly observes your systems 24/7, learning normal behavior patterns and monitoring for anomalies, so it is always learning and ready.
  - In the “active” mode, when the agent detects an issue (like a failure or an abnormal metric spike) or when a human triggers a request, it springs into action. It uses AI to troubleshoot the problem and can execute predefined remediation steps (for example, restarting a service, scaling out a resource, or applying a known fix from a runbook).
- Benefits: By using machine learning for analysis and automated actions for common problems, Azure SRE Agent helps reduce downtime. It responds to issues faster than a human typically can, which can significantly lower the MTTR (Mean Time to Resolution) for incidents. Meanwhile, your team gets to spend more time on proactive improvements and new features, rather than reacting to operational issues.
- Scale and Integration: Azure SRE Agent is built to work for a wide range of applications – from small test environments to large mission-critical services. It integrates with Azure’s ecosystem (monitoring tools, incident management systems, etc.), fitting into your current DevOps workflows. This means it can post alerts, update incident tickets, or follow custom automation scripts as it works, ensuring that the right information is shared with your team.
In short, Azure SRE Agent acts like a virtual site reliability engineer that never sleeps: always monitoring, always learning, and ready to assist in keeping your Azure services healthy.
Azure SRE Agent’s key features include:
- Proactive 24/7 Monitoring: The agent provides continuous oversight of your cloud environment. It monitors metrics and logs in real time and understands what “normal” looks like for your applications. This always-on vigilance means it can immediately flag anything out of the ordinary, helping you catch issues early.
- Automated Incident Response: When an incident or anomaly occurs, Azure SRE Agent automatically engages in incident response. Thanks to built-in AI, it can diagnose the issue (identifying, for example, which microservice or component is failing) and then take action to remediate. Common actions include scaling up a resource, cycling a service, or applying a known workaround. This reduces downtime since the agent can often resolve problems in seconds.
- AI-Driven Analysis: The agent uses machine learning to perform deep analysis of system data. During an outage or performance dip, it correlates diverse data points (like recent deployments, CPU usage spikes, error logs) to find the root cause. These AI-driven insights can surface problems that might be missed by manual troubleshooting. The agent essentially encapsulates Azure’s best practices for diagnostics, helping your team understand why an incident happened.
- Dual-Mode Operation (Learning and Action): Azure SRE Agent combines the two modes described above: in always-on mode it keeps learning (continuously refining its knowledge of your system’s baseline behavior), and in active mode it is ready to act (immediately executing a fix when needed). Together, these support high reliability. The always-on learning means fewer false alarms and better preparedness, while the active mode means quicker recoveries from failures.
- Scalability and Flexibility: Whether you have one small app or many large systems, you can deploy Azure SRE Agent as needed. You might start with a single agent in a test environment or use multiple agents for different services in production. The agent’s design allows it to handle numerous incidents in parallel if needed, and its cost model (explained below) scales with usage, so you only pay for the value you get. This makes it a fit for both startups and enterprises looking to improve reliability without a huge upfront investment in operations tools.
Licensing
With the end of the free preview, Azure SRE Agent now has a clear licensing and billing model. Microsoft introduced Azure Agent Units (AAUs) as the metric for all Azure “agent” services, including Azure SRE Agent. This unified unit makes it easier to understand and predict costs across different Azure agents. Here’s how the Azure SRE Agent billing works:
- Always-On Baseline (Fixed Component): Each running Azure SRE Agent has a fixed consumption of 4 AAUs per hour. This is the cost of the agent’s continuous monitoring and learning, its “always-on” presence. Think of this as the base charge that keeps the intelligent agent service active for you 24/7. This fixed rate provides a predictable cost floor. For example, if you run one agent continuously, you know it will consume 4 AAUs every hour, regardless of whether it actually needs to resolve any incidents.
- Active Usage (Variable Component): When the agent actively works on a task – for instance, mitigating an incident or performing a user-requested action – it incurs an additional usage charge of 0.25 AAUs per second during that activity. In simple terms, you pay per second of work the agent does when it’s solving problems or executing tasks (not just watching). If the agent isn’t handling any issues, you don’t pay this part. This usage-based component means costs remain flexible and proportional to how much you rely on the agent’s help in a given period.
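To make the two components concrete, here is a minimal Python sketch of the arithmetic. The two rates come from the description above; the function name and the example scenario (one agent, a 30-day month, 20 incidents of 5 minutes each) are illustrative assumptions, not Microsoft figures. The result is in AAUs only, because the price per AAU is not stated in this article.

```python
# Rates as described in this article.
BASELINE_AAU_PER_HOUR = 4.0    # always-on component, per running agent
ACTIVE_AAU_PER_SECOND = 0.25   # active component, per second of agent work


def estimate_aau(agents: int, hours: float, active_seconds: float) -> float:
    """Estimate total AAUs: fixed baseline plus variable active usage.

    `hours` is the running time per agent; `active_seconds` is the total
    time all agents spent actively working on tasks (hypothetical helper).
    """
    baseline = agents * hours * BASELINE_AAU_PER_HOUR
    active = active_seconds * ACTIVE_AAU_PER_SECOND
    return baseline + active


# Assumed scenario: 1 agent, 30 days, 20 incidents of 5 minutes each.
hours_in_month = 30 * 24        # 720 hours
active_seconds = 20 * 5 * 60    # 6,000 seconds of active work
print(estimate_aau(1, hours_in_month, active_seconds))
# 2,880 baseline + 1,500 active = 4380.0 AAUs
```

To turn this into a currency amount, multiply the total by the current per-AAU price for your region from Microsoft’s pricing information.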
New model vs. previous model: Under the previous preview model (before September 1, 2025), Azure SRE Agent was free of charge; Microsoft did not bill customers while the service was in early testing. With the new model in effect, any organization using the agent accumulates AAUs and is billed accordingly. The fixed baseline plus variable usage is designed to balance predictability and fairness: you get a consistent baseline expense (so you can budget a known minimum for reliability coverage), and you pay more only when the agent actually works on incidents, which is when it delivers clear value by resolving issues. This is a shift from “no cost” to “pay-as-you-use”, but with a transparent structure:
- If you use the agent sparingly or if your systems run smoothly (few incidents), you’ll incur mostly the baseline cost with minimal extra charges.
- If you heavily utilize the agent (many incidents or tasks), you’ll see higher AAU usage, but that is directly tied to the tangible work the agent performed to keep your services up.
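The contrast between the two cases can be illustrated by looking at how much of the total the baseline represents. This is a rough sketch: the rates are the ones given in this article, while the helper name and both scenarios (10 versus 600 active minutes for one agent over a 30-day month) are assumptions chosen only for illustration.

```python
BASELINE_AAU_PER_HOUR = 4.0    # always-on rate from this article
ACTIVE_AAU_PER_SECOND = 0.25   # active rate from this article
HOURS = 30 * 24                # assumption: one agent running a 30-day month


def baseline_share(active_minutes: float) -> float:
    """Fraction of total AAUs that comes from the fixed baseline."""
    baseline = HOURS * BASELINE_AAU_PER_HOUR
    active = active_minutes * 60 * ACTIVE_AAU_PER_SECOND
    return baseline / (baseline + active)


quiet = baseline_share(10)     # few incidents: 150 active AAUs
busy = baseline_share(600)     # many incidents: 9,000 active AAUs
print(round(quiet, 2), round(busy, 2))  # 0.95 0.24
```

In the quiet month the baseline accounts for about 95% of consumption, while in the busy month the variable component dominates.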
Expert advice: Navigating Azure’s licensing can be complex, especially with new models like AAU-based billing. The cloud licensing specialists at SCHNEIDER IT MANAGEMENT recommend evaluating your expected usage of Azure SRE Agent. Consider how many agents you might deploy and how often incidents occur in your environment. This analysis will help you estimate AAU consumption under the new model. If you need guidance, SCHNEIDER IT MANAGEMENT’s experts are available to help interpret this billing model for your specific case. We can provide advice on optimizing the number of agents, understanding AAU forecasts, and integrating Azure SRE Agent into your existing Azure agreement in the most cost-effective way. Don’t hesitate to reach out for personalized support in planning your Azure SRE Agent adoption.
More Information
For more details, please refer to Microsoft’s official announcement on this change:
- Microsoft Tech Community – “Announcing a flexible, predictable billing model for Azure SRE Agent” (published August 2, 2025): https://techcommunity.microsoft.com/blog/appsonazureblog/announcing-a-flexible-predictable-billing-model-for-azure-sre-agent/4427270
This source provides the full announcement and examples of how the billing model works, as well as additional context about Azure SRE Agent’s capabilities and roadmap.
For our Microsoft page, please visit: https://www.schneider.im/software/microsoft.
Please contact us for expert services on your specific Microsoft software and online services requirements and to request a quote today.

