This article provides a detailed list of default metrics monitored by the Sitecore Cloud Operations team for all Managed Cloud customers. More metrics might be included in future releases.
For more information about Managed Cloud Standard monitoring, refer to the service aspects article.
The Sitecore Managed Cloud team communicates with the designated Customer Technical Contact on all matters related to alerts, availability, and maintenance. By default, this contact is defined in the "Sitecore Customer Order" section of the Managed Cloud contract:
Customer Technical Contact Name: | {name of contact} |
Customer Technical Contact Email: | {email of contact} |
The contact list can contain one or more recipients and is managed through the Sitecore account team. Customers are discouraged from fully delegating the Technical Contact responsibility to a partner or third party because doing so can limit visibility into important system-wide notifications.
Sitecore Managed Cloud includes a monitoring package for all new environments activated after May 2019. The full list of monitored metrics is given in the following tables.
Application Insights:
Rules | Threshold | Breaches | Period | Frequency (Minutes) | Monitoring Plan |
Daily cap Reached | - | > 0 | - | - | Basic / Advanced |
Azure Web Apps:
Rules | Threshold | Breaches | Period | Frequency (Minutes) | Monitoring Plan |
HTTP 5XX response | > 10 count | > 0 | Last 60 mins | 5 | Basic / Advanced |
Platform Availability KeepAlive.aspx | > 3 failed regions | > 0 | 5 mins | 5 | Basic / Advanced |
Average page response time | > 30 secs | > 0 | Last 30 mins | 5 | Advanced |
CD and CM backup issue | - | > 0 | - | - | Advanced |
Health Check health (for 9.3.0 and higher releases) | > 30 secs | > 0 | Last 30 mins | 5 | Advanced |
App Service Plan:
Rules | Threshold | Breaches | Period | Frequency (Minutes) | Monitoring Plan |
CPU average | > 95% | > 30 | Last 60 mins | 10 | Basic / Advanced |
Memory average | > 95% | > 30 | Last 60 mins | 10 | Basic / Advanced |
File storage usage | > 80% | > 0 | Last 1 day | 1440 | Advanced |
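The tables do not spell out how the Threshold, Breaches, and Period columns combine, so the following is only a sketch of one plausible reading: a sample counts as a breach when it exceeds the threshold, and an alert fires when the number of breaching samples inside the period exceeds the Breaches value. The `Rule` class and `should_alert` function are hypothetical names used for illustration; they are not part of any Sitecore or Azure API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Rule:
    """One row of a monitoring table (hypothetical representation)."""
    name: str
    threshold: float      # a sample above this value counts as a breach
    max_breaches: int     # alert when the breach count is strictly greater
    period: timedelta     # look-back window from the "Period" column


def should_alert(rule, samples, now):
    """Return True if breaches inside the look-back window exceed the limit.

    `samples` is an iterable of (timestamp, value) pairs.
    ASSUMPTION: "Breaches > N" means more than N breaching samples
    within the period; the source article does not define this.
    """
    window_start = now - rule.period
    breaches = sum(
        1
        for timestamp, value in samples
        if window_start <= timestamp <= now and value > rule.threshold
    )
    return breaches > rule.max_breaches


# "CPU average | > 95% | > 30 | Last 60 mins" from the App Service Plan table.
cpu_rule = Rule("CPU average", 95.0, 30, timedelta(minutes=60))

now = datetime(2024, 1, 1, 12, 0)
# One sample per minute, all at 99% CPU, for the most recent 31 minutes.
hot_samples = [(now - timedelta(minutes=i), 99.0) for i in range(31)]

print(should_alert(cpu_rule, hot_samples, now))       # 31 breaches
print(should_alert(cpu_rule, hot_samples[:30], now))  # 30 breaches
```

Under this reading, 31 breaching samples trigger the CPU rule while exactly 30 do not, because the Breaches column uses a strict "greater than" comparison.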
Azure SQL Database:
Rules | Threshold | Breaches | Period | Frequency (Minutes) | Monitoring Plan |
DTU average | > 95% | > 30 | Last 60 mins | 5 | Basic / Advanced |
CPU average | > 95% | > 30 | Last 60 mins | 10 | Basic / Advanced |
Storage utilization | > 75% | > 30 | Last 60 mins | 15 | Basic / Advanced |
DATA IO average | > 95% | > 30 | Last 60 mins | 15 | Basic / Advanced |
LOG IO average | > 95% | > 30 | Last 60 mins | 15 | Basic / Advanced |
Concurrent Workers (requests) | > 95% | > 30 | Last 60 mins | 15 | Basic / Advanced |
Concurrent sessions supported by the DB tier | > 95% | > 30 | Last 60 mins | 10 | Advanced |
Number of failed database connections | > 5 count | > 14 | Last 60 mins | 10 | Advanced |
Average In-Memory OLTP storage | > 95% | > 14 | Last 60 mins | 15 | Advanced |
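To make the Monitoring Plan column concrete, the sketch below encodes the Azure SQL Database rules as data and lists which rules apply to a given plan. It assumes that "Basic / Advanced" means the rule is included in both plans and that "Advanced" means it is included only in the Advanced plan; the `SQL_RULES` mapping and `rules_for_plan` function are hypothetical names for illustration.

```python
# Rule name -> set of monitoring plans that include it.
# ASSUMPTION: "Basic / Advanced" = both plans, "Advanced" = Advanced only.
SQL_RULES = {
    "DTU average": {"Basic", "Advanced"},
    "CPU average": {"Basic", "Advanced"},
    "Storage utilization": {"Basic", "Advanced"},
    "DATA IO average": {"Basic", "Advanced"},
    "LOG IO average": {"Basic", "Advanced"},
    "Concurrent Workers (requests)": {"Basic", "Advanced"},
    "Concurrent sessions supported by the DB tier": {"Advanced"},
    "Number of failed database connections": {"Advanced"},
    "Average In-Memory OLTP storage": {"Advanced"},
}


def rules_for_plan(rules, plan):
    """Return the names of all rules included in the given monitoring plan."""
    return [name for name, plans in rules.items() if plan in plans]


for plan in ("Basic", "Advanced"):
    names = rules_for_plan(SQL_RULES, plan)
    print(f"{plan}: {len(names)} rules")
    for name in names:
        print(f"  - {name}")
```

Under that assumption, the Basic plan covers six of the nine Azure SQL Database rules and the Advanced plan covers all nine.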
Azure Search Service:
Rules | Threshold | Breaches | Period | Frequency (Minutes) | Monitoring Plan |
Throttled queries percent | > 30% | > 30 | Last 60 mins | 5 | Basic / Advanced |
Average Search Query latency | > 10 secs | > 0 | Last 30 mins | 10 | Advanced |
Search service response HTTP status code of 503 | >= 50 | > 0 | Last 15 mins | 10 | Advanced |
Azure Redis Cache:
Rules | Threshold | Breaches | Period | Frequency (Minutes) | Monitoring Plan |
Server load | > 95% | > 30 | Last 60 mins | 15 | Basic / Advanced |
Average number of clients connected | > 80% | > 0 | Last 30 mins | 10 | Advanced |
Average CPU Percent Processor Time | >= 95% | > 0 | Last 30 mins | 15 | Advanced |
Average Used Memory | > 70% | > 30 | Last 60 mins | 10 | Advanced |
SearchStax (SOLR) Server:
Alert | Threshold |
CPU Usage | > 80% |
JVM Heap Memory | > 80% |
Disk space | > 80% |
Search metrics: | |
Average time/request | > 1 min |
Timeouts | > 10 |
Errors | > 10 |
Indexing metrics: | |
Average time/request | > 1 min |
Timeouts | > 10 |
Errors | > 10 |
Mongo Server:
Alert | Threshold |
Availability status | - |
CPU Usage | > 90% |
Storage space used | > 90% |
Page Faults | > 10 |
Replication set rollback on failover | - |
IMPORTANT NOTE: The monitoring package does not yet support the following deployment types:
Sitecore single topologies: xP0, xM0, xDB0.
Monitoring rules are set up for all Sitecore Managed Cloud Standard environments. For production environments, the Sitecore Managed Cloud team actively triages each alert and escalates incidents as appropriate. For non-production environments, customers can respond as they prefer, but the Sitecore Managed Cloud team does not actively triage alerts. Customers who want to receive non-production alerts for Solr hosted via SearchStax can request them in a Support Case; otherwise, there is no alerting for non-production Solr components.
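The handling rules in the preceding paragraph can be summarized as a small decision function. This is a sketch of the policy as described in this article, not an actual Sitecore implementation; the function name, parameter names, and returned labels are all hypothetical.

```python
def alert_handling(environment, component, nonprod_solr_requested=False):
    """Describe how an alert is handled, following the policy described above.

    environment: "production" or any other value for non-production.
    component: for example "solr", "sql", or "webapp" (illustrative labels).
    nonprod_solr_requested: True if non-production Solr alerts were
        requested through a Support Case.
    """
    if environment == "production":
        # Production alerts are actively triaged and escalated as appropriate.
        return "triaged by Sitecore Managed Cloud team"
    if component == "solr":
        # Non-production Solr (SearchStax) alerting is opt-in only.
        if nonprod_solr_requested:
            return "alert sent to customer, not triaged by Sitecore"
        return "no alerting"
    # All other non-production alerts are left to the customer.
    return "alert sent to customer, not triaged by Sitecore"


print(alert_handling("production", "sql"))
print(alert_handling("staging", "solr"))
print(alert_handling("staging", "solr", nonprod_solr_requested=True))
```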