High Availability (HA) and resilience

Consider these guidelines to ensure HA of ThoughtSpot app, and node resilience.

Requirements for node resilience

The cluster must have at least 3 nodes.
The cluster must have spare capacity; if one node fails, the remaining nodes must be able to host and serve all loaded data.

What happens during node failure

When a node loses connection with the main service manager process, it becomes unhealthy.
ThoughtSpot migrates all migratable services that run on the failed node to other (healthy) nodes. For all practical purposes, ThoughtSpot ignores the failed node until it reports itself as healthy.
ThoughtSpot rebalances and redistributes the data served from the failed node onto healthy nodes. Healthy nodes read the data from the the HDFS storage layer into the in-memory database processes.

Disruption: impact on users

The process of redistributing and loading the data in the affected tables on HDFS layer from a failed node to the remaining healthy nodes is not instantaneous. The failover may impact the user experience.

Related information

Disk failure

Node failure

Cluster replacement

Mount a NAS file system

Configure disaster recovery

Was this page helpful?Give us feedback!

High Availability (HA) and resilience

Requirements for node resilience

What happens during node failure

Disruption: impact on users

Product

By Role

By Department

Solutions

About Us

Connect