Contrail Networking Alert List

14-Oct-22
Table 1: Contrail Networking Alert List
Alert Name Severity Description
VRouterConnectionDown major VRouter <name> <connection_type> connection to <connection_id> is down.
VRouterNonFunctional major VRouter <name> is non-functional.
ControllerNonFunctional major Controller <name> is non-functional.
ControllerConnectionDown major Controller <name> <connection_type> connection to <connection_id> is down.
ControllerDBConnectionDown major Controller <name> connection to database is down.
AlertmanagerFailedReload critical Reloading an Alertmanager configuration has failed.
AlertmanagerMembersInconsistent critical A member of an Alertmanager cluster has not found all other cluster members.
AlertmanagerFailedToSendAlerts warning An Alertmanager instance failed to send notifications.
AlertmanagerClusterFailedToSendAlerts critical All Alertmanager instances in a cluster failed to send notifications to a critical integration.
AlertmanagerClusterFailedToSendAlerts warning All Alertmanager instances in a cluster failed to send notifications to a non-critical integration.
AlertmanagerConfigInconsistent critical Alertmanager instances within the same cluster have different configurations.
AlertmanagerClusterDown critical Half or more of the Alertmanager instances within the same cluster are down.
AlertmanagerClusterCrashlooping critical Half or more of the Alertmanager instances within the same cluster are crashlooping.
ConfigReloaderSidecarErrors warning config-reloader sidecar has not had a successful reload for 10m.
etcdInsufficientMembers critical etcd cluster "<name>": insufficient members (<value>).
etcdNoLeader critical etcd cluster "<name>": member <instance> has no leader.
etcdHighNumberOfLeaderChanges warning etcd cluster "<name>": instance <instance> has seen <value> leader changes within the last hour.
etcdHighNumberOfFailedGRPCRequests warning etcd cluster "<name>": <value>% of requests for <grpc_method> failed on etcd instance <instance>.
etcdHighNumberOfFailedGRPCRequests critical etcd cluster "<name>": <value>% of requests for <grpc_method> failed on etcd instance <instance>.
etcdGRPCRequestsSlow critical etcd cluster "<name>": gRPC requests to <grpc_method> are taking <value>s on etcd instance <instance>.
etcdMemberCommunicationSlow warning etcd cluster "<name>": member communication with <name> is taking <value>s on etcd instance <instance>.
etcdHighNumberOfFailedProposals warning etcd cluster "<name>": <value> proposal failures within the last hour on etcd instance <instance>.
etcdHighFsyncDurations warning etcd cluster "<name>": 99th percentile fsync durations are <value>s on etcd instance <instance>.
etcdHighCommitDurations warning etcd cluster "<name>": 99th percentile commit durations are <value>s on etcd instance <instance>.
etcdHighNumberOfFailedHTTPRequests warning <value>% of requests for <method> failed on etcd instance <instance>.
etcdHighNumberOfFailedHTTPRequests critical <value>% of requests for <method> failed on etcd instance <instance>.
etcdHTTPRequestsSlow warning etcd instance <instance> HTTP requests to <method> are slow.
TargetDown warning One or more targets are unreachable.
KubeAPIErrorBudgetBurn critical The API server is burning too much error budget.
KubeAPIErrorBudgetBurn warning The API server is burning too much error budget.
KubeStateMetricsListErrors critical kube-state-metrics is experiencing errors in list operations.
KubeStateMetricsWatchErrors critical kube-state-metrics is experiencing errors in watch operations.
KubeStateMetricsShardingMismatch critical kube-state-metrics sharding is misconfigured.
KubeStateMetricsShardsMissing critical kube-state-metrics shards are missing.
KubePodCrashLooping warning Pod is crash looping.
KubePodNotReady warning Pod has been in a non-ready state for more than 15 minutes.
KubeDeploymentGenerationMismatch warning Deployment generation mismatch due to possible roll-back.
KubeDeploymentReplicasMismatch warning Deployment has not matched the expected number of replicas.
KubeStatefulSetReplicasMismatch warning StatefulSet has not matched the expected number of replicas.
KubeStatefulSetGenerationMismatch warning StatefulSet generation mismatch due to possible roll-back.
KubeStatefulSetUpdateNotRolledOut warning StatefulSet update has not been rolled out.
KubeDaemonSetRolloutStuck warning DaemonSet rollout is stuck.
KubeContainerWaiting warning Pod container waiting longer than 1 hour.
KubeDaemonSetNotScheduled warning DaemonSet pods are not scheduled.
KubeDaemonSetMisScheduled warning DaemonSet pods are misscheduled.
KubeJobCompletion warning Job did not complete in time.
KubeJobFailed warning Job failed to complete.
KubeHpaReplicasMismatch warning HPA has not matched desired number of replicas.
KubeHpaMaxedOut warning HPA is running at max replicas.
KubeCPUOvercommit warning Cluster has overcommitted CPU resource requests.
KubeMemoryOvercommit warning Cluster has overcommitted memory resource requests.
KubeCPUQuotaOvercommit warning Cluster has overcommitted CPU resource requests.
KubeMemoryQuotaOvercommit warning Cluster has overcommitted memory resource requests.
KubeQuotaAlmostFull info Namespace quota is going to be full.
KubeQuotaFullyUsed info Namespace quota is fully used.
KubeQuotaExceeded warning Namespace quota has exceeded the limits.
CPUThrottlingHigh info Processes experience elevated CPU throttling.
KubePersistentVolumeFillingUp critical PersistentVolume is filling up.
KubePersistentVolumeFillingUp warning PersistentVolume is filling up.
KubePersistentVolumeErrors critical PersistentVolume is having issues with provisioning.
KubeVersionMismatch warning Different semantic versions of Kubernetes components running.
KubeClientErrors warning Kubernetes API server client is experiencing errors.
KubeClientCertificateExpiration warning Client certificate is about to expire.
KubeClientCertificateExpiration critical Client certificate is about to expire.
KubeAggregatedAPIErrors warning Kubernetes aggregated API has reported errors.
KubeAggregatedAPIDown warning Kubernetes aggregated API is down.
KubeAPIDown critical Target disappeared from Prometheus target discovery.
KubeAPITerminatedRequests warning The Kubernetes apiserver has terminated <value> of its incoming requests.
KubeControllerManagerDown critical Target disappeared from Prometheus target discovery.
KubeProxyDown critical Target disappeared from Prometheus target discovery.
KubeNodeNotReady warning Node is not ready.
KubeNodeUnreachable warning Node is unreachable.
KubeletTooManyPods info Kubelet is running at capacity.
KubeNodeReadinessFlapping warning Node readiness status is flapping.
KubeletPlegDurationHigh warning Kubelet Pod Lifecycle Event Generator is taking too long to relist.
KubeletPodStartUpLatencyHigh warning Kubelet Pod startup latency is too high.
KubeletClientCertificateExpiration warning Kubelet client certificate is about to expire.
KubeletClientCertificateExpiration critical Kubelet client certificate is about to expire.
KubeletServerCertificateExpiration warning Kubelet server certificate is about to expire.
KubeletServerCertificateExpiration critical Kubelet server certificate is about to expire.
KubeletClientCertificateRenewalErrors warning Kubelet has failed to renew its client certificate.
KubeletServerCertificateRenewalErrors warning Kubelet has failed to renew its server certificate.
KubeletDown critical Target disappeared from Prometheus target discovery.
KubeSchedulerDown critical Target disappeared from Prometheus target discovery.
NodeFilesystemSpaceFillingUp warning Filesystem is predicted to run out of space within the next 24 hours.
NodeFilesystemSpaceFillingUp critical Filesystem is predicted to run out of space within the next 4 hours.
NodeFilesystemAlmostOutOfSpace warning Filesystem has less than 5% space left.
NodeFilesystemAlmostOutOfSpace critical Filesystem has less than 3% space left.
NodeFilesystemFilesFillingUp warning Filesystem is predicted to run out of inodes within the next 24 hours.
NodeFilesystemFilesFillingUp critical Filesystem is predicted to run out of inodes within the next 4 hours.
NodeFilesystemAlmostOutOfFiles warning Filesystem has less than 5% inodes left.
NodeFilesystemAlmostOutOfFiles critical Filesystem has less than 3% inodes left.
NodeNetworkReceiveErrs warning Network interface is reporting many receive errors.
NodeNetworkTransmitErrs warning Network interface is reporting many transmit errors.
NodeHighNumberConntrackEntriesUsed warning Number of conntrack entries is getting close to the limit.
NodeTextFileCollectorScrapeError warning Node Exporter text file collector failed to scrape.
NodeClockSkewDetected warning Clock skew detected.
NodeClockNotSynchronising warning Clock not synchronising.
NodeRAIDDegraded critical RAID Array is degraded.
NodeRAIDDiskFailure warning Failed device in RAID array.
NodeFileDescriptorLimit warning Kernel is predicted to exhaust file descriptors limit soon.
NodeFileDescriptorLimit critical Kernel is predicted to exhaust file descriptors limit soon.
NodeNetworkInterfaceFlapping warning Network interface is often changing its status.
PrometheusBadConfig critical Failed Prometheus configuration reload.
PrometheusNotificationQueueRunningFull warning Prometheus alert notification queue predicted to run full in less than 30m.
PrometheusErrorSendingAlertsToSomeAlertmanagers warning Prometheus has encountered more than 1% errors sending alerts to a specific Alertmanager.
PrometheusNotConnectedToAlertmanagers warning Prometheus is not connected to any Alertmanagers.
PrometheusTSDBReloadsFailing warning Prometheus has issues reloading blocks from disk.
PrometheusTSDBCompactionsFailing warning Prometheus has issues compacting blocks.
PrometheusNotIngestingSamples warning Prometheus is not ingesting samples.
PrometheusDuplicateTimestamps warning Prometheus is dropping samples with duplicate timestamps.
PrometheusOutOfOrderTimestamps warning Prometheus drops samples with out-of-order timestamps.
PrometheusRemoteStorageFailures critical Prometheus fails to send samples to remote storage.
PrometheusRemoteWriteBehind critical Prometheus remote write is behind.
PrometheusRemoteWriteDesiredShards warning Prometheus remote write desired shards calculation wants to run more than configured max shards.
PrometheusRuleFailures critical Prometheus is failing rule evaluations.
PrometheusMissingRuleEvaluations warning Prometheus is missing rule evaluations due to slow rule group evaluation.
PrometheusTargetLimitHit warning Prometheus has dropped targets because some scrape configs have exceeded the targets limit.
PrometheusLabelLimitHit warning Prometheus has dropped targets because some scrape configs have exceeded the labels limit.
PrometheusTargetSyncFailure critical Prometheus has failed to sync targets.
PrometheusErrorSendingAlertsToAnyAlertmanager critical Prometheus encounters more than 3% errors sending alerts to any Alertmanager.
PrometheusOperatorListErrors warning Errors while performing list operations in controller.
PrometheusOperatorWatchErrors warning Errors while performing watch operations in controller.
PrometheusOperatorSyncFailed warning Last controller reconciliation failed.
PrometheusOperatorReconcileErrors warning Errors while reconciling controller.
PrometheusOperatorNodeLookupErrors warning Errors while reconciling Prometheus.
PrometheusOperatorNotReady warning Prometheus operator not ready.
PrometheusOperatorRejectedResources warning Resources rejected by Prometheus operator.
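
Several alert names in Table 1 appear twice with different severities (for example, NodeFilesystemAlmostOutOfSpace is listed as both warning and critical), so a lookup keyed on the alert name alone would lose rows. The following Python sketch shows one possible way to read rows in the "Alert Name Severity Description" layout and index them by name and severity. It is an illustration only, not part of Contrail Networking: the names AlertRow, parse_row, and index_by_name are hypothetical, and the sketch assumes that the alert name and severity never contain spaces and that the only severities are the four that appear in the table.

```python
from collections import defaultdict
from typing import NamedTuple

# Severity values that appear in Table 1 (assumed to be the complete set).
SEVERITIES = {"critical", "major", "warning", "info"}


class AlertRow(NamedTuple):
    """One row of the table (hypothetical helper type)."""
    name: str
    severity: str
    description: str


def parse_row(line: str) -> AlertRow:
    """Split one table row into alert name, severity, and description."""
    # The first two columns are single tokens; the rest is the description.
    name, severity, description = line.strip().split(maxsplit=2)
    if severity not in SEVERITIES:
        raise ValueError(f"unexpected severity {severity!r} in row: {line!r}")
    return AlertRow(name, severity, description)


def index_by_name(rows):
    """Map alert name -> {severity: description}.

    A nested mapping is used because one alert name can be listed
    with more than one severity.
    """
    index = defaultdict(dict)
    for row in rows:
        index[row.name][row.severity] = row.description
    return dict(index)


# A few rows copied from Table 1.
SAMPLE = """\
VRouterNonFunctional major VRouter <name> is non-functional.
NodeFilesystemAlmostOutOfSpace warning Filesystem has less than 5% space left.
NodeFilesystemAlmostOutOfSpace critical Filesystem has less than 3% space left.
KubeQuotaFullyUsed info Namespace quota is fully used.
"""

rows = [parse_row(line) for line in SAMPLE.splitlines() if line.strip()]
index = index_by_name(rows)
print(sorted(index["NodeFilesystemAlmostOutOfSpace"]))  # ['critical', 'warning']
```

Placeholders such as <name> and <instance> are kept verbatim in the description text; the sketch does not attempt to substitute them.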