Hello Jake Morgan,
Welcome to the Microsoft Q&A and thank you for posting your questions here.
I understand that you would like to know how you can restart, redeploy, or otherwise fix Azure Cosmos DB for MongoDB (vCore) when it is non-responsive.
First and foremost, the service is a fully managed PaaS with no exposed "restart" or "redeploy" function. Microsoft handles all restarts, failovers, and rolling upgrades internally. When the service appears unresponsive, the priority is to determine whether the problem comes from an Azure platform event, a network misconfiguration, client-side limitations, or actual capacity constraints, rather than trying to force a manual restart.
Restarting from the portal is therefore not possible, because that option does not exist. The correct approach is to first check Azure Resource Health to see whether the service is marked as Unavailable due to a platform issue, in which case you can open a support request even without a paid plan (see "Azure Resource Health"). If no outage is reported, verify connectivity and DNS resolution next, especially if you use Private Endpoints, because an incorrect private DNS zone configuration can cause intermittent or total loss of connectivity (see "Private DNS for Private Endpoints").
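As a rough illustration of the connectivity and DNS check, here is a minimal Python sketch that uses only the standard library. The function names `check_endpoint` and `is_private_address` are hypothetical helpers written for this answer, not part of any Azure SDK, and the host and port must come from your own connection string. If your connection string uses `mongodb+srv`, the driver looks up SRV records first, which this sketch does not do, so pass the host name that actually accepts connections.

```python
import ipaddress
import socket


def is_private_address(address: str) -> bool:
    """Return True if the IP address is in a private (non-public) range."""
    # Drop an IPv6 scope suffix such as "%eth0" before parsing.
    return ipaddress.ip_address(address.split("%")[0]).is_private


def check_endpoint(host: str, port: int, timeout: float = 5.0) -> dict:
    """Resolve host, then try a plain TCP connection to host:port.

    This only tests name resolution and TCP reachability. It does not
    perform a TLS handshake or authenticate against the database.
    """
    result = {"resolved": [], "private": {}, "tcp_ok": False, "error": None}

    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        result["error"] = f"DNS resolution failed: {exc}"
        return result

    for info in infos:
        address = info[4][0]
        if address not in result["resolved"]:
            result["resolved"].append(address)
            result["private"][address] = is_private_address(address)

    try:
        with socket.create_connection((host, port), timeout=timeout):
            result["tcp_ok"] = True
    except OSError as exc:
        result["error"] = f"TCP connection failed: {exc}"

    return result
```

When you run it from a machine inside the virtual network, a Private Endpoint setup would normally be expected to resolve to a private address. A public address in `resolved`, or an empty list, is a hint that the private DNS zone is worth a closer look.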
If the network tests and driver configuration check out, review the Azure Monitor metrics for CPU, storage, latency, and failed connections to identify potential performance bottlenecks (see "Azure Monitor for Cosmos DB"). Storage saturation, in particular, can cause nodes to refuse connections until space is freed. Separately, enabling High Availability (HA) gives you in-region failover to a standby node (see "High Availability in Cosmos DB vCore"). For disaster recovery, consider creating a cross-region replica so you can promote it if the primary region is unhealthy.
Finally, if you urgently need the service restored and none of the above resolves it, the last-resort approach is restoring from a backup to a new cluster and updating your applications to point to the new endpoint. This structured recovery process replaces the unavailable “restart” feature and provides a reliable, supportable path to resolution.
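For the step of pointing applications at the restored cluster, the sketch below swaps only the host in an existing connection string and leaves the credentials and query options untouched. `repoint_connection_string` is a hypothetical helper written for this answer, and the host names in the usage comment are placeholders, not real endpoints.

```python
from urllib.parse import urlsplit, urlunsplit


def repoint_connection_string(uri: str, new_host: str) -> str:
    """Return uri with its host replaced by new_host.

    The scheme, credentials, path, and query options are kept as they are.
    """
    parts = urlsplit(uri)
    # Everything before the last "@" is the user info (may be empty).
    userinfo, at, _old_host = parts.netloc.rpartition("@")
    new_netloc = f"{userinfo}{at}{new_host}"
    return urlunsplit(
        (parts.scheme, new_netloc, parts.path, parts.query, parts.fragment)
    )


# Example with placeholder values:
# repoint_connection_string(
#     "mongodb+srv://user:secret@old-cluster.example.invalid/?tls=true",
#     "new-cluster.example.invalid",
# )
```

Keeping the connection string in one configuration setting, rather than scattered through the code, makes this kind of switch a single change.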
I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.
Please don't forget to close the thread by upvoting and accepting this as the answer if it was helpful.