How to restart, redeploy, or otherwise fix Azure Cosmos DB for MongoDB (vCore) when it is non-responsive?

Question

Jake Morgan 0

Twice over the past months this service of mine has become nearly unresponsive. The metrics graph in Azure portal substantiates the instance being unresponsive.

Last time I went in to upscale the instance, even though that was completely unnecessary, and that repaired the connectivity.

Now the next step up would be quite expensive, and seeing as it's completely unnecessary, I don't want to do it.

I just want the service to come back online. But there are zero options in the web UI to do anything to restart or repair the instance. And should I have to pay for tech support to just restart my stuff? Hence the question here.

1 answer

Answer 1

Sina Salam 23,931 Volunteer Moderator

Hello Jake Morgan,

Welcome to the Microsoft Q&A and thank you for posting your questions here.

I understand that you would like to know how you can restart, redeploy, or otherwise fix Azure Cosmos DB for MongoDB (vCore) when it is non-responsive.

First of all, the service is a fully managed PaaS with no exposed "restart" or "redeploy" function. Instead, Microsoft handles all restarts, failovers, and rolling upgrades internally. When the service appears unresponsive, the priority is to confirm whether the problem is due to an Azure platform event, network misconfiguration, client-side limitations, or actual capacity constraints, rather than trying to force a manual restart.

Therefore, fixing the issue by restarting from the portal will not work, because that option does not exist. The correct approach is to first check Azure Resource Health to see if the service is marked as Unavailable due to a platform issue, in which case you can open a support request even without a paid plan (see "Azure Resource Health"). If no outage is reported, you should then verify connectivity and DNS resolution, especially if using Private Endpoints, as an incorrect private DNS zone configuration can cause intermittent or total loss of connectivity (see "Private DNS for Private Endpoints").
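As a rough illustration of the connectivity and DNS check described above, here is a minimal Python sketch that uses only the standard library. The function name `check_endpoint` and the host and port in the usage comment are hypothetical; take the real values from whatever your driver actually connects to. It only tests name resolution and a plain TCP connect, so it says nothing about TLS or authentication, and it does not resolve the SRV records behind a `mongodb+srv://` connection string.

```python
import socket
import time


def check_endpoint(host: str, port: int, timeout: float = 5.0) -> dict:
    """Resolve `host`, then attempt a plain TCP connection to `port`.

    Returns a dict describing what worked, so the result can be logged
    from several locations (desktop, other Azure services) and compared.
    """
    result = {
        "host": host,
        "port": port,
        "resolved": [],      # IP addresses returned by DNS
        "connected": False,  # True if the TCP handshake completed
        "error": None,       # short description of the first failure
    }

    # Step 1: DNS resolution. A failure here points at DNS, not the cluster.
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        result["resolved"] = sorted({info[4][0] for info in infos})
    except socket.gaierror as exc:
        result["error"] = f"DNS resolution failed: {exc}"
        return result

    # Step 2: TCP connect. A timeout or refusal here points at the network
    # path or at the service itself.
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            result["connected"] = True
            result["connect_seconds"] = round(time.monotonic() - start, 3)
    except OSError as exc:
        result["error"] = f"TCP connect failed: {exc}"

    return result


# Usage (hypothetical host and port, replace with your own):
#   print(check_endpoint("my-cluster.example.invalid", 10260))
```

If DNS resolves but the TCP connect times out from every location you try, that is useful evidence to attach to a support request.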

If the network tests pass and the driver configuration is correct, review Azure Monitor metrics for CPU, storage, latency, and failed connections to identify potential performance bottlenecks (see "Azure Monitor for Cosmos DB"). Storage saturation, in particular, can cause nodes to refuse connections until space is freed. Separately, enabling High Availability (HA) provides in-region failover to a standby node (see "High Availability in Cosmos DB vCore"). For disaster recovery, consider creating a cross-region replica so you can promote it if the primary region is unhealthy.
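To make the metrics review a bit more concrete, here is a small Python sketch that flags saturated metrics in samples you have exported yourself. Everything in it is an assumption for illustration: the metric names, the input shape (metric name mapped to a list of percentages), and the thresholds are hypothetical and do not come from Azure Monitor.

```python
from typing import Mapping, Sequence

# Hypothetical thresholds, chosen for illustration only.
THRESHOLDS = {"cpu_percent": 90.0, "storage_percent": 85.0}


def flag_saturation(
    samples: Mapping[str, Sequence[float]],
    thresholds: Mapping[str, float] = THRESHOLDS,
) -> dict:
    """Return {metric_name: peak_value} for metrics whose peak meets the limit."""
    flagged = {}
    for name, limit in thresholds.items():
        values = samples.get(name, ())
        # Skip metrics with no samples instead of treating them as healthy or not.
        if values and max(values) >= limit:
            flagged[name] = max(values)
    return flagged
```

An empty result means none of the checked metrics crossed its threshold, which would point away from a capacity problem and back toward networking or the platform.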

Finally, if you urgently need the service restored and none of the above resolves it, the last-resort approach is restoring from a backup to a new cluster and updating your applications to point to the new endpoint. This structured recovery process replaces the unavailable “restart” feature and provides a reliable, supportable path to resolution.
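If you do end up pointing applications at a new endpoint, it helps when the connection string is not hard-coded. The following is only a sketch under that assumption: the environment variable name `MONGO_CONNECTION_STRING` and both helper functions are hypothetical, and the example host is a placeholder.

```python
import os
from urllib.parse import urlsplit

ENV_VAR = "MONGO_CONNECTION_STRING"  # hypothetical variable name


def load_connection_string(env=os.environ) -> str:
    """Read the connection string from the environment and sanity-check it."""
    value = env.get(ENV_VAR, "").strip()
    if not value:
        raise RuntimeError(f"{ENV_VAR} is not set")
    scheme = urlsplit(value).scheme
    if scheme not in ("mongodb", "mongodb+srv"):
        raise ValueError(f"unexpected scheme {scheme!r} in {ENV_VAR}")
    return value


def describe_target(connection_string: str) -> str:
    """Return only the host name, which is safe to log (no credentials)."""
    return urlsplit(connection_string).hostname or ""


# Usage: after a restore, change the environment variable and restart the app.
#   conn = load_connection_string()
#   print("connecting to", describe_target(conn))
```

With this layout, switching to a restored cluster is a configuration change rather than a code change.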

I hope this is helpful! Do not hesitate to let me know if you have any other questions or clarifications.

Please don't forget to close the thread by upvoting this and accepting it as an answer if it is helpful.

Jake Morgan 0 Reputation points

2025-08-15T16:27:26.99+00:00

Your link "Restore Azure Cosmos DB for MongoDB vCore" goes to a 404.

Following your advice for Azure Resource Health, I end up at "Service Health | Resource health", and can select my subscription, and can select "Cosmos DB", but it only shows a non-vCore old instance of Cosmos I have, and nothing in your "Resource type" selection even covers the "Azure Cosmos DB for MongoDB (vCore)". So I don't understand how to make progress there.

I don't have any private DNS configured. I'm using the provided endpoints.

Both Metrics and Insights when navigated to the resource show normal CPU, memory, storage. However, the query activity against the cluster abruptly stopped around 7AM when I lost connectivity from the 5+ places I was using to access the database.

For Azure Monitor, I end up at a page on the portal "Monitor | Azure Cosmos DB", but it's the same as with Resource Health-- only my old non-vCore instance is listed there. And the pages you referenced seem to be tailored toward non-vCore.

I'm not particularly interested in migrating my data to another instance in Azure just to fix an intermittent connectivity issue. If I migrate, it will be away from this service entirely.

I feel like I'm the only one on earth using this Cosmos/Mongo/vCore thing because I can't imagine this kind of situation being tolerable for anyone in production.
Sina Salam 23,931 Reputation points Volunteer Moderator

2025-08-15T16:50:08.4033333+00:00
Hi,

Thank you for your feedback.

These are the key docs I used; read them for step-by-step commands and detailed screenshots:

Availability & HA for Cosmos DB for MongoDB (vCore) — automatic in-region failover / HA behavior - https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/availability-disaster-recovery-under-hood

Monitor metrics & Metrics blade for vCore clusters — where to view Metrics in the cluster resource - https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/monitor-metrics

Monitor diagnostic logs & sample Kusto queries (VCoreMongoRequests table + sample queries) - https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/how-to-monitor-diagnostics-logs

Troubleshooting connectivity for vCore: recommended network checks and recommended port check examples - https://learn.microsoft.com/en-us/azure/azure-portal/supportability/how-to-create-azure-support-request

Azure support & how to create a support request (portal flow) — how to open a ticket and what to expect; note special handling for service disruptions - https://learn.microsoft.com/en-us/azure/azure-portal/supportability/how-to-create-azure-support-request and Azure Documentation - https://docs.azure.cn/en-us/reliability/incident-response

Jake Morgan 0 Reputation points

2025-08-15T19:11:07.96+00:00

I've looked at the top 3 of your items, but going through the steps tells me what I already knew: service health seems wonderful, but all activity to the service abruptly stopped at 7am. BTW this resource is unreachable from my desktop, from my other Azure services, from anywhere I've tried.

The 4th item seems relevant: "Troubleshooting connectivity for vCore" but you just posted a link to creating a support request.

And the 5th item pretty much covers that too.

I was able to submit a support request through a link in my subscription sponsorship, so I'm working with them now. ticket 2508150040004119
Sina Salam 23,931 Reputation points Volunteer Moderator

2025-08-16T11:56:03.3666667+00:00
Hello Jake Morgan,

Thank you for your feedback.

All the links provided open with a single click. You can read more documentation about your issue here - https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/

My advice for you:

Open a support request via the Azure portal.

You can do this without a paid support plan if it's a platform issue (see "Create Azure Support Request").

You can also read more about Priority Community Support (PCS) and contact them via: https://learn.microsoft.com/en-us/azure/azure-portal/supportability/priority-community-support

That way you should be able to get assistance from Microsoft's internal teams.

Okay, I have read your comment:

I was able to submit a support request through a link in my subscription sponsorship, so I'm working with them now. ticket 2508150040004119

Also, after reviewing all the steps and the incident, my advice to Microsoft is to:

Improve visibility of vCore clusters in Azure Monitor and Resource Health.

Provide restart/redeploy simulation or soft reset options for vCore clusters.

Offer free tier support escalation for critical service outages.

Cheers
Mahesh Kurva 7,570 Reputation points Microsoft External Staff Moderator

2025-08-18T05:30:32.3366667+00:00

Hi Jake Morgan,

Thanks for sharing the support ticket number. I will follow up with them and update you.

Thank you.
