How to see the actual error that fired an Azure Monitor Alert

Ben Sherrod 0 Reputation points
2025-08-20T19:07:15.6833333+00:00

We keep getting an alert from Azure regarding our AI, and while it shows me that there were errors that fired the alert I cannot find out what the specific errors are. I think this is firing due to an API issue, but as I said I cannot actually view the error. How would I go about finding this info? I have queried the logs with queries that seem applicable, but they turn up nothing. 111111

Azure Monitor
Azure Monitor
An Azure service that is used to collect, analyze, and act on telemetry data from Azure and on-premises environments.
{count} votes

2 answers

Sort by: Most helpful
  1. Michele Ariis 4,590 Reputation points MVP
    2025-08-21T08:25:08.88+00:00

    Hi, What you’re looking at is a metric alert (TotalErrors) on your Azure OpenAI resource; metrics only tell you “how many”, not the actual failures. To see the real errors do this: 1) On the Azure OpenAI resource - Diagnostic settings - Send to Log Analytics (pick a workspace) and enable all log categories (request/response). 2) Wait a few minutes, then open Logs on the resource and run a query like:

    union isfuzzy=true AzureOpenAIServiceLogs, AzureDiagnostics
    | where _ResourceId has "/providers/Microsoft.CognitiveServices/accounts/optibot"
    | where TimeGenerated > ago(24h)
    | where tostring(ResponseCode) startswith "4" or tostring(ResponseCode) startswith "5"
    | project TimeGenerated, OperationName, ApiName, DeploymentName, ResponseCode, ErrorCode, ErrorMessage, RequestId, ClientIp
    | order by TimeGenerated desc
    

    (adjust table names using the Tables pane if your workspace uses the legacy AzureDiagnostics table). 3) In Metrics on the same resource, open the “Total Errors” chart and Split by ApiName and, if available, Response code class to see which API (e.g., ChatCompletions) and which code family (4xx vs 5xx) spiked at the alert time. 4) In your app, log the HTTP failures with the status code, response body error.code/error.message, and headers x-ms-request-id and (optionally set) x-ms-client-request-id so you can correlate app logs ↔ Azure logs. 5) If you front the API with APIM/Front Door, also enable their diagnostics—many “errors” are throttles (429) or auth (401/403). Once diagnostics are flowing, the query above will show the exact calls that fired the alert.

    1 person found this answer helpful.
    0 comments No comments

  2. Sandhya Kommineni 245 Reputation points Microsoft External Staff Moderator
    2025-08-21T10:31:24.55+00:00

    Hi Ben Sherrod, Thank you for your question on Microsoft Q&A forum.

    To help you identify the specific error that caused your Azure Monitor Alert, please follow the solution approach recommended below.

    Identify the type of alert:

    • Navigate to Azure Monitor → Alerts → [Your Alert] → Alert Details.
    • Check the Condition: Determine whether it’s a Metric alert or a Log alert.
    • Metric alert-Tracks counts or thresholds (e.g., request failures).
    • Log alert- Triggered based on a Log Analytics query.

    Review the source logs:

    If it’s a metric alert, determine which table or service generates the metric (e.g., Application Insights, Azure Functions).

    • Go to Application Insights (if your API or app is instrumented) or the relevant Log Analytics workspace.
    • Query for exceptions or failed requests. For example, in Application Insights Failed requests requests | where success == false | order by timestamp desc | take 50 Exceptions exceptions | order by timestamp desc | take 50
    • This will provide the actual error message, stack trace, or API response code that caused the alert to trigger.

    But as per the provided screenshot we can conclude the error shown in your Azure alert is from a metric, The alert summary specifies that it was triggered because the total errors crossed the threshold

    Check Diagnostic Settings

    • Go to your Azure OpenAI resource → Monitoring → Diagnostic settings.
    • Ensure logs are being sent to Log Analytics, Storage, or Event Hub and enable all log categories related to requests and responses.

    Refer document: https://learn.microsoft.com/en-us/azure/azure-monitor/platform/diagnostic-settings?tabs=portal

    Run Queries in Log Analytics:

    Here's a sample query you can use to find the errors:

    union isfuzzy=true AzureOpenAIServiceLogs, AzureDiagnostics
    | where _ResourceId has "/providers/Microsoft.CognitiveServices/accounts/your_resource_name"
    | where TimeGenerated > ago(24h)
    | where tostring(ResponseCode) startswith "4" or tostring(ResponseCode) startswith "5"
    | project TimeGenerated, OperationName, ApiName, DeploymentName, ResponseCode, ErrorCode, ErrorMessage, RequestId, ClientIp
    | order by TimeGenerated desc
    

    AzureDiagnostics

    | where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"

    | where Category == "RequestLogs"

    | where statusCode >= 400

    | project TimeGenerated, operationName, statusCode, resultSignature, callerIpAddress, requestUri_s, message

    | order by TimeGenerated desc

    Check Application Insights (if enabled)

    • Go to Application Insights > Failures.
    • Filter by optibot and the same time window.
    • Look for exceptions or failed dependencies.

    Review API Usage Metrics

    • Go to Azure OpenAI > Metrics.
    • Add a chart for Total Errors and filter by affected resource.
    • This can help correlate spikes with specific operations

    AzureDiagnostics

    | where TimeGenerated between (datetime(2025-08-21 12:00)..datetime(2025-08-21 12:10))

    | where statusCode >= 400

    Make sure your account has Monitoring Reader or Log Analytics Reader roles for the resource group or subscription.

    Match logs to the alert

    Use the alert timestamp to filter logs. Often the reason you see “nothing” is because the logs queried don’t include the exact time when the alert fired.

    Example: filter logs for the alert timeframe

    exceptions

    | where timestamp between(datetime(2025-08-21 12:00) .. datetime(2025-08-21 12:10))

    | order by timestamp desc

    Enable alert details in email

    • Some alerts can include a link to the query results in the email. Enable this in the alert rule to quickly jump to the raw logs next time

    To know more about Alerts you can refer https://learn.microsoft.com/en-us/azure/azure-monitor/alerts/alerts-manage-alert-instances

    I Hope provided steps help resolve your issue, please let me if you have any further queries.

    1 person found this answer helpful.
    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.