High Availability (HA) and resilience
Follow these guidelines to ensure high availability of the ThoughtSpot application and resilience to node failure.
Requirements for node resilience
The cluster must have at least 3 nodes.
The cluster must have spare capacity; if one node fails, the remaining nodes must be able to host and serve all loaded data.
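The two requirements above can be expressed as a simple capacity check. The following is a minimal sketch, not a ThoughtSpot tool: the function name, its parameters, and the use of gigabytes as the unit are assumptions made for illustration. It treats the loss of the largest node as the worst case.

```python
def can_tolerate_single_node_failure(node_capacities_gb, loaded_data_gb):
    """Return True if the cluster meets both node-resilience requirements.

    node_capacities_gb: hypothetical list of per-node data capacity, in GB.
    loaded_data_gb: hypothetical total size of loaded data, in GB.
    """
    # Requirement 1: the cluster must have at least 3 nodes.
    if len(node_capacities_gb) < 3:
        return False
    # Requirement 2: after losing any one node (worst case: the largest),
    # the remaining nodes must still be able to hold all loaded data.
    remaining_capacity_gb = sum(node_capacities_gb) - max(node_capacities_gb)
    return loaded_data_gb <= remaining_capacity_gb


# Three 256 GB nodes leave 512 GB after one failure.
print(can_tolerate_single_node_failure([256, 256, 256], 500))
print(can_tolerate_single_node_failure([256, 256, 256], 600))
```

In this sketch, 500 GB of loaded data fits in the remaining 512 GB, while 600 GB does not, so the second cluster would lack the required spare capacity.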
What happens during node failure
When a node loses connection with the main service manager process, it becomes unhealthy.
ThoughtSpot migrates all migratable services that run on the failed node to other, healthy nodes. For all practical purposes, ThoughtSpot ignores the failed node until it reports itself as healthy again.
ThoughtSpot rebalances and redistributes the data served from the failed node onto healthy nodes. Healthy nodes read the data from the HDFS storage layer into the in-memory database processes.
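The redistribution step can be pictured with a toy model. The sketch below is illustrative only and does not reflect ThoughtSpot's actual placement algorithm: the function name, the shard and node identifiers, and the "least-loaded node first" policy are all assumptions.

```python
def rebalance_after_failure(assignments, failed_node):
    """Reassign the failed node's shards to the healthy nodes.

    assignments: hypothetical dict mapping node name -> list of shard IDs.
    Returns a new dict that contains only the healthy nodes.
    """
    # Copy the healthy nodes' shard lists so the input is not modified.
    healthy = {
        node: list(shards)
        for node, shards in assignments.items()
        if node != failed_node
    }
    if not healthy:
        raise ValueError("no healthy nodes left to serve the data")
    # Assumed policy: give each orphaned shard to the node that currently
    # serves the fewest shards, breaking ties by node name.
    for shard in assignments.get(failed_node, []):
        target = min(healthy, key=lambda node: (len(healthy[node]), node))
        healthy[target].append(shard)
    return healthy


cluster = {
    "node1": ["a", "b"],
    "node2": ["c", "d"],
    "node3": ["e", "f"],
}
print(rebalance_after_failure(cluster, "node3"))
# {'node1': ['a', 'b', 'e'], 'node2': ['c', 'd', 'f']}
```

In the real system, a healthy node that takes over a shard would then read that shard's data from the HDFS storage layer into its in-memory database process; the sketch models only the assignment decision, not the data load.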