ThoughtSpot engineering has performed extensive testing of the ThoughtSpot appliance on various Amazon Elastic Compute Cloud (EC2) and Amazon Elastic Block Store (EBS) configurations for best performance, load balancing, scalability, and reliability.
This page describes which configuration of memory, CPU, storage, and networking capacity to run for your instances, and how to configure your placement groups.
ThoughtSpot AWS instance types
| Data shape | Total cluster data size | Per VM user data capacity | Recommended instance type | vCPU/RAM (GB) | Boot volume for each node | Data volumes |
|---|---|---|---|---|---|---|
| Standard (1KB/row) | Up to 2 TB | 250 GB | r4.16xlarge (a) | 64/488 | 200 GB for each node | 2 x 1 TB |
| Standard (1KB/row) | >2 TB | 384 GB (Large) | r5.24xlarge | 96/768 | 200 GB for each node | 2 x 1.5 TB |
| | Up to 100 GB | 100 GB | r4.8xlarge (b) | 32/244 | 200 GB for each node | 2 x 400 GB |
| | Up to 20 GB | 20 GB | r4.4xlarge (b) | 16/122 | 200 GB for each node | 2 x 400 GB |
| Thin rows (<300 bytes/row) | Any | 192 GB | m5.24xlarge | 96/384 | 200 GB for each node | 2 x 1 TB |

(a) Use each cloud provider's sizing calculator, plugging in expected customer discounts, to arrive at the proper recommended cloud instance type.
(b) Use the small and medium instance-type configuration. Refer to: Use small and medium instance types.
For most instances, the recommended per-VM user data capacity is set at 50% of the available RAM on the instance. However, for the r4.4xlarge (16 vCPU/122 GB RAM) and r4.8xlarge (32 vCPU/244 GB RAM) instances, we support user data sizes below that 50% mark to budget for application overhead.
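As a rough illustration of how the figures above can be used, the following Python sketch looks up an instance type and estimates how many nodes a given amount of user data would need. The RAM and capacity values are copied from the table; the function names and the node-count formula (total data divided by per-VM capacity, rounded up) are assumptions for illustration, not an official ThoughtSpot sizing method. The sketch also ignores the total cluster size limits listed for each row.

```python
import math

# Per-VM figures copied from the table above:
# instance type -> (RAM in GB, recommended per-VM user data capacity in GB).
INSTANCE_SPECS = {
    "r4.16xlarge": (488, 250),
    "r5.24xlarge": (768, 384),
    "r4.8xlarge": (244, 100),
    "r4.4xlarge": (122, 20),
    "m5.24xlarge": (384, 192),
}


def half_ram_gb(instance_type: str) -> float:
    """Return 50% of the instance's RAM, the general rule described above."""
    ram_gb, _capacity_gb = INSTANCE_SPECS[instance_type]
    return ram_gb / 2


def nodes_needed(total_data_gb: float, instance_type: str) -> int:
    """Estimate the node count for a data size (hypothetical helper).

    Assumption: nodes = total data / per-VM capacity, rounded up,
    with a minimum of one node.
    """
    _ram_gb, capacity_gb = INSTANCE_SPECS[instance_type]
    return max(1, math.ceil(total_data_gb / capacity_gb))
```

For example, `nodes_needed(1000, "r4.16xlarge")` divides 1000 GB by the 250 GB per-VM capacity. Comparing `half_ram_gb` with the table's capacity column also shows where the small and medium instance types are budgeted below the 50% mark.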
Regions, availability zones, and placement groups
Each AWS instance is assigned to a location that reflects where its computing resources physically reside. You must specify a region, an availability zone within that region, and, below that, a placement group.
AWS nodes in a ThoughtSpot cluster must be in the same availability zone (and, therefore, also in the same region).
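The same-availability-zone rule can be checked before a cluster is built. The sketch below is a minimal example, assuming each planned node is described by a plain dictionary with hypothetical `name` and `availability_zone` keys; it is not part of any ThoughtSpot or AWS tooling.

```python
def shared_availability_zone(nodes):
    """Return the single availability zone shared by all nodes.

    Raises ValueError if the list is empty or the nodes span more
    than one availability zone.
    """
    zones = {node["availability_zone"] for node in nodes}
    if len(zones) != 1:
        raise ValueError(
            f"cluster nodes must share one availability zone, found: {sorted(zones)}"
        )
    return zones.pop()
```

Because an availability zone belongs to exactly one region, passing this check also means the nodes are in the same region.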
A placement group is a logical grouping of instances within a single availability zone. Placement groups are recommended for applications that benefit from low network latency, high network throughput, or both.
ThoughtSpot relies on high connectivity between the nodes of a cluster, which is why we recommend creating a placement group. Placing all nodes in the same placement group gives you the best chance of achieving the highest bandwidth and the lowest latency between AWS EC2 instances, bringing node-to-node network performance as close as possible to the specifications AWS advertises. A placement group is our default recommendation for a multi-instance setup because it gives the best application performance. AWS also supports jumbo frames (9000 MTU) in this configuration, and does not charge extra for instances being in the same placement group.
That said, ThoughtSpot still works when the EC2 instances of a cluster are spread across placement groups within one availability zone.
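One way to set this up programmatically is with the AWS SDK for Python (boto3). The sketch below is a minimal example under several assumptions: it uses the `cluster` placement strategy, which this page does not name explicitly, and the helper names, group name, AMI ID, and node count are hypothetical placeholders. The request dictionaries are built by plain functions so they can be inspected without AWS credentials; only `create_cluster` calls AWS, and it requires boto3 to be installed.

```python
def placement_group_request(group_name):
    """Build the parameters for the EC2 CreatePlacementGroup call.

    Assumption: the "cluster" strategy, which packs instances close
    together inside a single availability zone.
    """
    return {"GroupName": group_name, "Strategy": "cluster"}


def launch_request(ami_id, instance_type, node_count, zone, group_name):
    """Build RunInstances parameters that place every node in one
    availability zone and one placement group."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": node_count,
        "MaxCount": node_count,
        "Placement": {"AvailabilityZone": zone, "GroupName": group_name},
    }


def create_cluster(region, ami_id, instance_type, node_count, zone, group_name):
    """Create the placement group, then launch the nodes into it."""
    import boto3  # third-party dependency, imported only when AWS is called

    ec2 = boto3.client("ec2", region_name=region)
    ec2.create_placement_group(**placement_group_request(group_name))
    return ec2.run_instances(
        **launch_request(ami_id, instance_type, node_count, zone, group_name)
    )
```

Setting `MinCount` equal to `MaxCount` asks AWS to launch all nodes in a single request or none at all, which avoids ending up with a partially launched group.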