Hi Ben Sherrod, thank you for your question on the Microsoft Q&A forum.
To help you identify the specific error that caused your Azure Monitor Alert, please follow the solution approach recommended below.
Identify the type of alert:
- Navigate to Azure Monitor → Alerts → [Your Alert] → Alert Details.
- Check the Condition: determine whether it's a Metric alert or a Log alert.
  - Metric alert: tracks counts or thresholds (e.g., request failures).
  - Log alert: triggered based on a Log Analytics query.
Review the source logs:
If it’s a metric alert, determine which table or service generates the metric (e.g., Application Insights, Azure Functions).
- Go to Application Insights (if your API or app is instrumented) or the relevant Log Analytics workspace.
- Query for exceptions or failed requests. For example, in Application Insights:

Failed requests:
requests
| where success == false
| order by timestamp desc
| take 50

Exceptions:
exceptions
| order by timestamp desc
| take 50
- This will provide the actual error message, stack trace, or API response code that caused the alert to trigger.
However, based on the provided screenshot, we can conclude that the error shown in your Azure alert comes from a metric: the alert summary states that it was triggered because the total errors crossed the threshold.
Check Diagnostic Settings
- Go to your Azure OpenAI resource → Monitoring → Diagnostic settings.
- Ensure logs are being sent to Log Analytics, Storage, or Event Hub and enable all log categories related to requests and responses.
Refer to this document: https://learn.microsoft.com/en-us/azure/azure-monitor/platform/diagnostic-settings?tabs=portal
Run Queries in Log Analytics:
Here's a sample query you can use to find the errors:
union isfuzzy=true AzureOpenAIServiceLogs, AzureDiagnostics
| where _ResourceId has "/providers/Microsoft.CognitiveServices/accounts/your_resource_name"
| where TimeGenerated > ago(24h)
| where tostring(ResponseCode) startswith "4" or tostring(ResponseCode) startswith "5"
| project TimeGenerated, OperationName, ApiName, DeploymentName, ResponseCode, ErrorCode, ErrorMessage, RequestId, ClientIp
| order by TimeGenerated desc
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| where Category == "RequestLogs"
| where statusCode >= 400
| project TimeGenerated, operationName, statusCode, resultSignature, callerIpAddress, requestUri_s, message
| order by TimeGenerated desc
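If you want to reuse the first query above for different resources or lookback windows, it can be parameterized with a small helper. This is a minimal sketch: build_error_query is a hypothetical helper name, the resource name is a placeholder, and the function only does local string templating (it mirrors the union query above, it does not call Azure).

```python
from datetime import timedelta

def build_error_query(resource_name: str,
                      lookback: timedelta = timedelta(hours=24)) -> str:
    """Build the KQL error query above for a given Azure OpenAI resource.

    `resource_name` is your Cognitive Services account name. The returned
    string mirrors the union query in this answer; nothing is executed here.
    """
    hours = int(lookback.total_seconds() // 3600)
    return (
        "union isfuzzy=true AzureOpenAIServiceLogs, AzureDiagnostics\n"
        f'| where _ResourceId has "/providers/Microsoft.CognitiveServices/accounts/{resource_name}"\n'
        f"| where TimeGenerated > ago({hours}h)\n"
        '| where tostring(ResponseCode) startswith "4" or tostring(ResponseCode) startswith "5"\n'
        "| project TimeGenerated, OperationName, ApiName, DeploymentName, "
        "ResponseCode, ErrorCode, ErrorMessage, RequestId, ClientIp\n"
        "| order by TimeGenerated desc"
    )

# Paste the result into Log Analytics, or pass it to a query client such as
# azure.monitor.query's LogsQueryClient if you use the Python SDK.
print(build_error_query("my-openai-account"))
```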
Check Application Insights (if enabled)
- Go to Application Insights > Failures.
- Filter by optibot and the same time window.
- Look for exceptions or failed dependencies.
Review API Usage Metrics
- Go to Azure OpenAI > Metrics.
- Add a chart for Total Errors and filter by the affected resource.
- This can help correlate spikes with specific operations.
For example, to narrow the diagnostic logs to the alert window:
AzureDiagnostics
| where TimeGenerated between (datetime(2025-08-21 12:00) .. datetime(2025-08-21 12:10))
| where statusCode >= 400
Make sure your account has Monitoring Reader or Log Analytics Reader roles for the resource group or subscription.
Match logs to the alert
Use the alert timestamp to filter logs. Often the reason you see "nothing" is that the queried logs don't cover the exact time when the alert fired.
Example: filter logs for the alert timeframe
exceptions
| where timestamp between(datetime(2025-08-21 12:00) .. datetime(2025-08-21 12:10))
| order by timestamp desc
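To avoid missing the firing window, the between(...) filter can be derived directly from the alert's fired time. A minimal sketch (the helper name, the default 5-minute evaluation window, and the latency padding are illustrative assumptions, not Azure Monitor behavior):

```python
from datetime import datetime, timedelta

def alert_window_filter(fired_at: datetime,
                        evaluation_window: timedelta = timedelta(minutes=5),
                        padding: timedelta = timedelta(minutes=5)) -> str:
    """Return a KQL time filter covering the alert's evaluation window.

    A metric/log alert evaluates data *before* the fired time, so look back
    over the evaluation window and pad both sides for ingestion latency.
    """
    start = fired_at - evaluation_window - padding
    end = fired_at + padding
    fmt = "%Y-%m-%d %H:%M"
    return f"| where timestamp between (datetime({start:{fmt}}) .. datetime({end:{fmt}}))"

# Example: an alert fired at 2025-08-21 12:05 with a 5-minute window
print(alert_window_filter(datetime(2025, 8, 21, 12, 5)))
# -> | where timestamp between (datetime(2025-08-21 11:55) .. datetime(2025-08-21 12:10))
```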
Enable alert details in email
- Some alerts can include a link to the query results in the email. Enable this in the alert rule to quickly jump to the raw logs next time.
To know more about Alerts you can refer https://learn.microsoft.com/en-us/azure/azure-monitor/alerts/alerts-manage-alert-instances
I hope the provided steps help resolve your issue; please let me know if you have any further queries.