VRouterConnectionDown |
major |
VRouter <name>
<connection_type> connection to
<connection_id> is down. |
VRouterNonFunctional |
major |
VRouter <name> is non-functional. |
ControllerNonFunctional |
major |
Controller <name> is non-functional. |
ControllerConnectionDown |
major |
Controller <name>
<connection_type> connection to
<connection_id> is down. |
ControllerDBConnectionDown |
major |
Controller <name> connection to database is
down. |
AlertmanagerFailedReload |
critical |
Reloading an Alertmanager configuration has failed. |
AlertmanagerMembersInconsistent |
critical |
A member of an Alertmanager cluster has not found all other cluster
members. |
AlertmanagerFailedToSendAlerts |
warning |
An Alertmanager instance failed to send notifications. |
AlertmanagerClusterFailedToSendAlerts |
critical |
All Alertmanager instances in a cluster failed to send notifications
to a critical integration. |
AlertmanagerClusterFailedToSendAlerts |
warning |
All Alertmanager instances in a cluster failed to send notifications
to a non-critical integration. |
AlertmanagerConfigInconsistent |
critical |
Alertmanager instances within the same cluster have different
configurations. |
AlertmanagerClusterDown |
critical |
Half or more of the Alertmanager instances within the same cluster
are down. |
AlertmanagerClusterCrashlooping |
critical |
Half or more of the Alertmanager instances within the same cluster
are crashlooping. |
ConfigReloaderSidecarErrors |
warning |
config-reloader sidecar has not had a successful
reload for 10m. |
etcdInsufficientMembers |
critical |
etcd cluster "<name> ":
insufficient members (<value> ). |
etcdNoLeader |
critical |
etcd cluster "<name> ": member
<instance> has no leader. |
etcdHighNumberOfLeaderChanges |
warning |
etcd cluster "<name> ":
instance <instance> has seen
<value> leader changes within the last
hour. |
etcdHighNumberOfFailedGRPCRequests |
warning |
etcd cluster "<name> ":
<value> % of requests for
<grpc_method> failed on etcd instance
<instance> . |
etcdHighNumberOfFailedGRPCRequests |
critical |
etcd cluster "<name> ":
<value> % of requests for
<grpc_method> failed on etcd instance
<instance> . |
etcdGRPCRequestsSlow |
critical |
etcd cluster "<name> ": gRPC
requests to <grpc_method> are taking
<value> s on etcd instance
<instance> . |
etcdMemberCommunicationSlow |
warning |
etcd cluster "<name> ": member
communication with <name> is taking
<value> s on etcd instance
<instance> . |
etcdHighNumberOfFailedProposals |
warning |
etcd cluster "<name> ":
<value> proposal failures within the last
hour on etcd instance <instance> . |
etcdHighFsyncDurations |
warning |
etcd cluster "<name> ": 99th
percentile fsync durations are <value> s on etcd
instance <instance> . |
etcdHighCommitDurations |
warning |
etcd cluster "<name> ": 99th
percentile commit durations <value> s on etcd
instance <instance> . |
etcdHighNumberOfFailedHTTPRequests |
warning |
<value> % of requests for
<method> failed on etcd instance
<instance> . |
etcdHighNumberOfFailedHTTPRequests |
critical |
<value> % of requests for
<method> failed on etcd instance
<instance> . |
etcdHTTPRequestsSlow |
warning |
etcd instance <instance> HTTP
requests to <method> are slow. |
TargetDown |
warning |
One or more targets are unreachable. |
KubeAPIErrorBudgetBurn |
critical |
The API server is burning too much error budget. |
KubeAPIErrorBudgetBurn |
warning |
The API server is burning too much error budget. |
KubeStateMetricsListErrors |
critical |
kube-state-metrics is experiencing errors in list
operations. |
KubeStateMetricsWatchErrors |
critical |
kube-state-metrics is experiencing errors in watch
operations. |
KubeStateMetricsShardingMismatch |
critical |
kube-state-metrics sharding is
misconfigured. |
KubeStateMetricsShardsMissing |
critical |
kube-state-metrics shards are missing. |
KubePodCrashLooping |
warning |
Pod is crash looping. |
KubePodNotReady |
warning |
Pod has been in a non-ready state for more than 15 minutes. |
KubeDeploymentGenerationMismatch |
warning |
Deployment generation mismatch due to possible roll-back. |
KubeDeploymentReplicasMismatch |
warning |
Deployment has not matched the expected number of replicas. |
KubeStatefulSetReplicasMismatch |
warning |
Deployment has not matched the expected number of replicas. |
KubeStatefulSetGenerationMismatch |
warning |
StatefulSet generation mismatch due to possible
roll-back. |
KubeStatefulSetUpdateNotRolledOut |
warning |
StatefulSet update has not been rolled out. |
KubeDaemonSetRolloutStuck |
warning |
DaemonSet rollout is stuck. |
KubeContainerWaiting |
warning |
Pod container waiting longer than 1 hour. |
KubeDaemonSetNotScheduled |
warning |
DaemonSet pods are not scheduled. |
KubeDaemonSetMisScheduled |
warning |
DaemonSet pods are misscheduled. |
KubeJobCompletion |
warning |
Job was not complete in time. |
KubeJobFailed |
warning |
Job failed (was not completed). |
KubeHpaReplicasMismatch |
warning |
HPA has not matched desired number of replicas. |
KubeHpaMaxedOut |
warning |
HPA is running at max replicas. |
KubeCPUOvercommit |
warning |
Cluster has overcommitted CPU resource requests. |
KubeMemoryOvercommit |
warning |
Cluster has overcommitted CPU resource requests. |
KubeCPUQuotaOvercommit |
warning |
Cluster has overcommitted CPU resource requests. |
KubeMemoryQuotaOvercommit |
warning |
Cluster has overcommitted memory resource requests. |
KubeQuotaAlmostFull |
info |
Namespace quota is going to be full. |
KubeQuotaFullyUsed |
info |
Namespace quota is fully used. |
KubeQuotaExceeded |
warning |
Namespace quota has exceeded the limits. |
CPUThrottlingHigh |
info |
Processes experience elevated CPU throttling. |
KubePersistentVolumeFillingUp |
critical |
PersistentVolume is filling up. |
KubePersistentVolumeFillingUp |
warning |
PersistentVolume is filling up. |
KubePersistentVolumeErrors |
critical |
PersistentVolume is having issues with
provisioning. |
KubeVersionMismatch |
warning |
Different semantic versions of Kubernetes components are
running. |
KubeClientErrors |
warning |
Kubernetes API server client is experiencing errors. |
KubeClientCertificateExpiration |
warning |
Client certificate is about to expire. |
KubeClientCertificateExpiration |
critical |
Client certificate is about to expire. |
KubeAggregatedAPIErrors |
warning |
Kubernetes aggregated API has reported errors. |
KubeAggregatedAPIDown |
warning |
Kubernetes aggregated API is down. |
KubeAPIDown |
critical |
Target disappeared from Prometheus target discovery. |
KubeAPITerminatedRequests |
warning |
The Kubernetes apiserver has terminated
<value> of its incoming requests. |
KubeControllerManagerDown |
critical |
Target disappeared from Prometheus target discovery. |
KubeProxyDown |
critical |
Target disappeared from Prometheus target discovery. |
KubeNodeNotReady |
warning |
Node is not ready. |
KubeNodeUnreachable |
warning |
Node is unreachable. |
KubeletTooManyPods |
info |
Kubelet is running at capacity. |
KubeNodeReadinessFlapping |
warning |
Node readiness status is flapping. |
KubeletPlegDurationHigh |
warning |
Kubelet Pod Lifecycle Event Generator is taking too long to
relist. |
KubeletPodStartUpLatencyHigh |
warning |
Kubelet Pod startup latency is too high. |
KubeletClientCertificateExpiration |
warning |
Kubelet client certificate is about to expire. |
KubeletClientCertificateExpiration |
critical |
Kubelet client certificate is about to expire. |
KubeletServerCertificateExpiration |
warning |
Kubelet server certificate is about to expire. |
KubeletServerCertificateExpiration |
critical |
Kubelet server certificate is about to expire. |
KubeletClientCertificateRenewalErrors |
warning |
Kubelet has failed to renew its client certificate. |
KubeletServerCertificateRenewalErrors |
warning |
Kubelet has failed to renew its server certificate. |
KubeletDown |
critical |
Target disappeared from Prometheus target discovery. |
KubeSchedulerDown |
critical |
Target disappeared from Prometheus target discovery. |
NodeFilesystemSpaceFillingUp |
warning |
Filesystem is predicted to run out of space within the next 24
hours. |
NodeFilesystemSpaceFillingUp |
critical |
Filesystem is predicted to run out of space within the next 4
hours. |
NodeFilesystemAlmostOutOfSpace |
warning |
Filesystem has less than 5% space left. |
NodeFilesystemAlmostOutOfSpace |
critical |
Filesystem has less than 3% space left. |
NodeFilesystemFilesFillingUp |
warning |
Filesystem is predicted to run out of inodes within the next 24
hours. |
NodeFilesystemFilesFillingUp |
critical |
Filesystem is predicted to run out of inodes within the next 4
hours. |
NodeFilesystemAlmostOutOfFiles |
warning |
Filesystem has less than 5% inodes left. |
NodeFilesystemAlmostOutOfFiles |
critical |
Filesystem has less than 3% inodes left. |
NodeNetworkReceiveErrs |
warning |
Network interface is reporting many receive errors. |
NodeNetworkTransmitErrs |
warning |
Network interface is reporting many transmit errors. |
NodeHighNumberConntrackEntriesUsed |
warning |
Number of conntrack are getting close to the
limit. |
NodeTextFileCollectorScrapeError |
warning |
Node Exporter text file collector failed to scrape. |
NodeClockSkewDetected |
warning |
Clock skew detected. |
NodeClockNotSynchronising |
warning |
Clock not synchronizing. |
NodeRAIDDegraded |
critical |
RAID array is degraded. |
NodeRAIDDiskFailure |
warning |
Failed device in RAID array. |
NodeFileDescriptorLimit |
warning |
Kernel is predicted to exhaust file descriptors limit soon. |
NodeFileDescriptorLimit |
critical |
Kernel is predicted to exhaust file descriptors limit soon. |
NodeNetworkInterfaceFlapping |
warning |
Network interface is often changing its status. |
PrometheusBadConfig |
critical |
Failed Prometheus configuration reload. |
PrometheusNotificationQueueRunningFull |
warning |
Prometheus alert notification queue predicted to run full in less
than 30m. |
PrometheusErrorSendingAlertsToSomeAlertmanagers |
warning |
Prometheus has encountered more than 1% errors sending alerts to a
specific Alertmanager. |
PrometheusNotConnectedToAlertmanagers |
warning |
Prometheus is not connected to any Alertmanagers. |
PrometheusTSDBReloadsFailing |
warning |
Prometheus has issues reloading blocks from disk. |
PrometheusTSDBCompactionsFailing |
warning |
Prometheus has issues compacting blocks. |
PrometheusNotIngestingSamples |
warning |
Prometheus is not ingesting samples. |
PrometheusDuplicateTimestamps |
warning |
Prometheus is dropping samples with duplicate timestamps. |
PrometheusOutOfOrderTimestamps |
warning |
Prometheus drops samples with out-of-order timestamps. |
PrometheusRemoteStorageFailures |
critical |
Prometheus fails to send samples to remote storage. |
PrometheusRemoteWriteBehind |
critical |
Prometheus remote write is behind. |
PrometheusRemoteWriteDesiredShards |
warning |
Prometheus remote write desired shards calculation wants to run more
than configured max shards. |
PrometheusRuleFailures |
critical |
Prometheus is failing rule evaluations. |
PrometheusMissingRuleEvaluations |
warning |
Prometheus is missing rule evaluations due to slow rule group
evaluation. |
PrometheusTargetLimitHit |
warning |
Prometheus has dropped targets because some scrape configs have
exceeded the target limit. |
PrometheusLabelLimitHit |
warning |
Prometheus has dropped targets because some scrape configs have
exceeded the label limit. |
PrometheusTargetSyncFailure |
critical |
Prometheus has failed to sync targets. |
PrometheusErrorSendingAlertsToAnyAlertmanager |
critical |
Prometheus encounters more than 3% errors sending alerts to any
Alertmanager. |
PrometheusOperatorListErrors |
warning |
Errors while performing list operations in controller. |
PrometheusOperatorWatchErrors |
warning |
Errors while performing list operations in controller. |
PrometheusOperatorSyncFailed |
warning |
Last controller reconciliation failed. |
PrometheusOperatorReconcileErrors |
warning |
Errors while reconciling controller. |
PrometheusOperatorNodeLookupErrors |
warning |
Errors while reconciling Prometheus. |
PrometheusOperatorNotReady |
warning |
Prometheus operator not ready. |
PrometheusOperatorRejectedResources |
warning |
Resources rejected by Prometheus operator. |