Alerts code reference

Learn about the alerts ThoughtSpot may generate.

This reference identifies the messages that can appear in the TS Stats: System Information and Usage Critical Alerts panel and in the Alerts dashboard.
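
The Msg entries in this reference contain placeholders such as {{.Machine}}. These resemble Go text/template fields, although this reference does not state which template engine ThoughtSpot uses. Assuming that syntax, the following minimal sketch shows how a message template and a set of field values might combine into the final alert text. The function name renderAlert and the field values are hypothetical and for illustration only.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// renderAlert fills an alert message template with the given field values.
// Assumption: the Msg strings follow Go text/template syntax.
func renderAlert(msg string, fields map[string]string) (string, error) {
	// missingkey=error makes rendering fail if a placeholder has no value,
	// instead of silently printing "<no value>".
	tmpl, err := template.New("alert").Option("missingkey=error").Parse(msg)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, fields); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	// Hypothetical field values; real values come from the cluster.
	msg, err := renderAlert(
		"Task {{.Service}}.{{.Task}} terminated on machine {{.Machine}}",
		map[string]string{"Service": "svc", "Task": "task1", "Machine": "node-1"},
	)
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(msg)
}
```

When you search logs or build alert filters, match on the fixed text around the placeholders, because the placeholder values differ for every alert instance.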

Informational alerts

APPLICATION_INVALID_STATE

Raised when the application raises an invalid state alert.

Msg

{{.Service}}.{{.Task}} on {{.Machine}} at location {{.Location}}

Type

INFO

DISK_ERROR

Raised when a machine has disk errors.

Msg

Machine {{.Machine}} has disk errors

Type

INFO

HDFS_CORRUPTION

Raised when the HDFS root directory is corrupted.

Msg

HDFS root directory is in a corrupted state.

Type

INFO

MASTER_ELECTION

Raised when a new Orion Master is elected.

Msg

{{.Machine}} elected as Orion Master

Type

INFO

PERIODIC_BACKUP

Raised when a periodic backup fails.

Msg

{{.Process}} periodic backup for policy {{.Name}} failed.

Type

INFO

PERIODIC_SNAPSHOT

Raised when a periodic snapshot fails.

Msg

{{.Process}} periodic snapshot {{.Name}} failed.

Type

INFO

TASK_TERMINATED

Raised when a task terminates.

Msg

Task {{.Service}}.{{.Task}} terminated on machine {{.Machine}}

Type

INFO

UPDATE_END

Raised when an update completes.

Msg

Finished update of ThoughtSpot cluster {{.Cluster}} to release {{.Release}}

Type

INFO

UPDATE_START

Raised when an update starts.

Msg

Starting update of ThoughtSpot cluster {{.Cluster}}

Type

INFO

ZK_AVG_LATENCY

Raised when the average Zookeeper latency is above a threshold.

Msg

Average Zookeeper latency is more than {{.Num}} msec

Type

INFO

ZK_MAX_LATENCY

Raised when the maximum Zookeeper latency is above a threshold.

Msg

Max Zookeeper latency is more than {{.Num}} msec

Type

INFO

ZK_MIN_LATENCY

Raised when the minimum Zookeeper latency is above a threshold.

Msg

Min Zookeeper latency is more than {{.Num}} msec

Type

INFO

ZK_NUM_WATCHERS

Raised when there are too many Zookeeper watchers.

Msg

Number of Zookeeper watchers exceeds {{.Num}}

Type

INFO

ZK_OUTSTANDING_REQUESTS

Raised when there are too many outstanding Zookeeper requests.

Msg

Number of outstanding Zookeeper requests exceeds {{.Num}}

Type

INFO

Errors

TIMELY_ERROR

Raised when a job manager runs into an inconsistent state.

Msg

Job manager {{.Message}}

Type

ERROR

TIMELY_JOB_RUN_ERROR

Raised when a job run fails.

Msg

Job run {{.Message}}

Type

ERROR

Warnings

BOOT_DISK_SPACE

Raised when a machine is low on available disk space on the boot partition.

Msg

Machine {{.Machine}} has less than {{.Perc}}% disk space free on boot partition

Type

WARNING

DISK_ERROR_EXTERNAL

Raised when more than two disk errors occur in a day.

Msg

Machine {{.Machine}} has disk errors

Type

WARNING

DISK_SPACE

Raised when a disk is low on available disk space. Valid only in the 3.2 version of ThoughtSpot.

Msg

Machine {{.Machine}} has less than {{.Perc}}% disk space free

Type

WARNING

EXPORT_DISK_SPACE

Raised when a machine is low on available disk space on the export partition.

Msg

Machine {{.Machine}} has less than {{.Perc}}% disk space free on export partition

Type

WARNING

HDFS_NAMENODE_DISK_SPACE

Raised when a machine is low on available disk space on the HDFS namenode drive.

Msg

Machine {{.Machine}} has less than {{.Perc}}% disk space free on HDFS namenode drive

Type

WARNING

HOST_DOWN

Raised when a host is down.

Msg

{{.Machine}} is down

Type

WARNING

MEMORY

Raised when a machine is low on free memory.

Msg

Machine {{.Machine}} has less than {{.Perc}}% memory free

Type

WARNING

OS_PROCS

Raised when a machine has too many processes.

Msg

Machine {{.Machine}} has more than {{.Num}} processes

Type

WARNING

OS_USERS

Raised when a machine has too many users logged in.

Msg

Machine {{.Machine}} has more than {{.Num}} logged in users

Type

WARNING

ROOT_DISK_SPACE

Raised when a machine is low on available disk space on the root partition.

Msg

Machine {{.Machine}} has less than {{.Perc}}% disk space free on root partition

Type

WARNING

SSH

Raised when a machine does not have an active SSH server.

Msg

Machine {{.Machine}} doesn’t have an active SSH server

Type

WARNING

TASK_NOT_RUNNING

Raised when a service task is not running on any machine in the cluster.

Msg

{{.ServiceDesc}} is not running

Type

WARNING

TASK_UNREACHABLE

Raised when a task is unreachable over HTTP.

Msg

{{.ServiceDesc}} on {{.Machine}} is unreachable over HTTP

Type

WARNING

UPDATE_DISK_SPACE

Raised when a machine is low on available disk space on the update partition.

Msg

Machine {{.Machine}} has less than {{.Perc}}% disk space free on update partition

Type

WARNING

ZK_EPHEMERAL_COUNT

Raised when there are too many Zookeeper ephemeral files.

Msg

Zookeeper has more than {{.Num}} ephemeral files

Type

WARNING

ZK_FD_COUNT

Raised when Zookeeper has too many open file descriptors.

Msg

Zookeeper has more than {{.Num}} open file descriptors

Type

WARNING

Critical alerts

APPLICATION_INVALID_STATE_EXTERNAL

Raised when the application raises an invalid state alert.

Msg

{{.Service}}.{{.Task}} on {{.Machine}} at location {{.Location}}

Type

CRITICAL

HDFS_DISK_SPACE

Raised when an HDFS cluster is low on total available disk space.

Msg

HDFS has less than {{.Perc}}% space free

Type

CRITICAL

OREO_TERMINATED

Raised when the Oreo daemon on a machine terminates due to an error. This typically happens due to an error accessing Zookeeper, HDFS, or a hardware issue.

Msg

Oreo terminated on machine {{.Machine}}

Type

CRITICAL

PERIODIC_BACKUP_FLAPPING

Raised when a periodic backup fails repeatedly.

Msg

Periodic backup failed {{._actual_num_occurrences}} times in last {{._earliest_duration_str}}

Type

CRITICAL

PERIODIC_SNAPSHOT_FLAPPING

Raised when a periodic snapshot fails repeatedly.

Msg

Periodic snapshot failed {{._actual_num_occurrences}} times in last {{._earliest_duration_str}}

Type

CRITICAL

TASK_FLAPPING

Raised when a task crashes repeatedly. The service is evaluated across the whole cluster, so if a service crashes 5 times in a day across all nodes in the cluster, this alert is generated.

Msg

Task {{.Service}}.{{.Task}} terminated {{._actual_num_occurrences}} times in last {{._earliest_duration_str}}

Type

CRITICAL

ZK_INACCESSIBLE

Raised when Zookeeper is inaccessible.

Msg

Zookeeper is not accessible

Type

CRITICAL