How to restart, redeploy, or otherwise fix Azure Cosmos DB for MongoDB (vCore) when it is non-responsive?

Jake Morgan 0 Reputation points
2025-08-15T14:33:33.5966667+00:00

Twice over the past few months this service of mine has become nearly unresponsive. The metrics graph in the Azure portal confirms that the instance was unresponsive.

Last time I scaled the instance up, even though the extra capacity was completely unnecessary, and that restored connectivity.

Now the next tier up would be quite expensive, and since it's completely unnecessary, I don't want to do that.

I just want the service to come back online. But there are zero options in the web UI to do anything to restart or repair the instance. And should I have to pay for tech support to just restart my stuff? Hence the question here.

Azure Cosmos DB

1 answer

  1. Sina Salam 23,931 Reputation points Volunteer Moderator
    2025-08-15T16:05:38.7633333+00:00

    Hello Jake Morgan,

    Welcome to Microsoft Q&A, and thank you for posting your question here.

    I understand that you would like to know how you can restart, redeploy, or otherwise fix Azure Cosmos DB for MongoDB (vCore) when it is non-responsive.

    First of all, the service is a fully managed PaaS offering with no exposed “restart” or “redeploy” function. Microsoft handles all restarts, failovers, and rolling upgrades internally. When the service appears unresponsive, the priority is to determine whether the cause is an Azure platform event, a network misconfiguration, a client-side limitation, or an actual capacity constraint, rather than trying to force a manual restart.

    Restarting the instance from the portal is therefore not possible, because that option does not exist. The correct approach is to first check Azure Resource Health to see whether the service is marked as Unavailable due to a platform issue, in which case you can open a support request even without a paid plan (see: Azure Resource Health). If no outage is reported, verify connectivity and DNS resolution next, especially if you use Private Endpoints, because an incorrect private DNS zone configuration can cause intermittent or total loss of connectivity (see: Private DNS for Private Endpoints).
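    As a minimal sketch of the connectivity and DNS check described above, the following Python function resolves a host name and then attempts a plain TCP connection, reporting each stage separately so that a DNS failure can be told apart from a blocked or refused connection. The function name `check_endpoint` and the result keys are illustrative, not part of any Azure SDK. The host and port are assumptions that should come from your own connection string.

```python
import socket
import time


def check_endpoint(host: str, port: int, timeout: float = 5.0) -> dict:
    """Resolve `host`, then try a TCP connection to `port`.

    Returns a dict with the resolved addresses, whether the TCP
    handshake succeeded, the connect latency, and any error text.
    This only tests reachability; it does not perform TLS or
    MongoDB authentication.
    """
    result = {
        "host": host,
        "port": port,
        "resolved": [],
        "connected": False,
        "latency_ms": None,
        "error": None,
    }

    # Stage 1: DNS resolution. With Private Endpoints, the addresses
    # returned here should be private IPs from your virtual network.
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        result["resolved"] = sorted({info[4][0] for info in infos})
    except socket.gaierror as exc:
        result["error"] = f"DNS resolution failed: {exc}"
        return result

    # Stage 2: TCP connection, timed from the client side.
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            result["connected"] = True
            result["latency_ms"] = round((time.monotonic() - start) * 1000, 1)
    except OSError as exc:
        result["error"] = f"TCP connect failed: {exc}"

    return result
```

    For example, `check_endpoint("<your-cluster-host>", 10260)` could be run from the same network as the application. The port 10260 is an assumption here, so confirm it against the connection string shown for your cluster. If resolution succeeds but the connection fails, firewall rules or the private endpoint configuration are the more likely suspects than the database itself.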

    If the network tests pass and the driver configuration is correct, review the Azure Monitor metrics for CPU, storage, latency, and failed connections to identify potential performance bottlenecks (see: Azure Monitor for Cosmos DB). Storage saturation, in particular, can cause nodes to refuse connections until space is freed. To protect against node-level failures, enabling High Availability (HA) provides in-region failover to a standby node (see: High Availability in Cosmos DB vCore). For disaster recovery, consider creating a cross-region replica so you can promote it if the primary region is unhealthy.
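    The metric review above can be sketched as a simple threshold check over values you have already exported from Azure Monitor. This is only an illustration: the metric keys (`cpu_percent`, `storage_percent`) and the threshold values are hypothetical and are not the metric names used by the service, so map them to whatever your metrics export actually contains.

```python
from typing import Dict, List, Optional

# Hypothetical thresholds in percent; tune them to your own workload.
DEFAULT_THRESHOLDS: Dict[str, float] = {
    "cpu_percent": 90.0,
    "storage_percent": 85.0,
}


def flag_bottlenecks(
    samples: Dict[str, List[float]],
    thresholds: Optional[Dict[str, float]] = None,
) -> List[str]:
    """Return one finding per metric whose peak sample reaches its threshold.

    `samples` maps a metric key to the values observed over the
    incident window. Metrics with no samples are skipped.
    """
    limits = DEFAULT_THRESHOLDS if thresholds is None else thresholds
    findings: List[str] = []
    for name, limit in limits.items():
        values = samples.get(name)
        if not values:
            continue
        peak = max(values)
        if peak >= limit:
            findings.append(f"{name}: peak {peak:.1f} >= threshold {limit:.1f}")
    return findings
```

    An empty result for the incident window suggests that capacity was not the cause, which would support the view that scaling up is unnecessary. A storage finding points toward freeing space rather than adding compute.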

    Finally, if you urgently need the service restored and none of the above resolves it, the last-resort approach is restoring from a backup to a new cluster and updating your applications to point to the new endpoint. This structured recovery process replaces the unavailable “restart” feature and provides a reliable, supportable path to resolution.

    I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.


    Please don't forget to close the thread by upvoting this and accepting it as the answer if it is helpful.

