Overview of ThoughtSpot setup in AWS
The high-level process for setting up ThoughtSpot in AWS involves these steps:
- Gain access to ThoughtSpot AMIs.
- Choose a VM instance configuration recommended by ThoughtSpot.
- Set up your Amazon S3 bucket (optional).
- Set up your ThoughtSpot cluster in AWS.
- Contact ThoughtSpot to finish setting up your cluster.
- Open the required network ports for communication between the nodes in your cluster and with end users.
About the ThoughtSpot AMI
An Amazon Machine Image (AMI) is a preconfigured template that provides the information required to launch an instance. You must specify an AMI when you launch an instance in AWS.
To make deployment easy, the ThoughtSpot AMI includes a custom ThoughtSpot image, with the following components:
- A template for the root volume for the instance, such as an operating system, an appliance server, and applications.
- Launch permissions that control which AWS accounts can use the AMI to launch instances.
- A block device mapping that specifies the volumes to attach to the instance when it launches.
The ThoughtSpot AMI has specific applications on a CentOS base image. The AMI includes the EBS volumes necessary to install ThoughtSpot in AWS. When you launch an EC2 instance from this image, it automatically sizes and provisions the EBS volumes. The base AMI includes 200 GB (xvda) and 2 x 400 GB (xvdb) volumes, all SSD (gp2). It contains the maximum number of disks to handle a fully loaded VM.
To install and launch ThoughtSpot, you must have the following:
- Familiarity with Linux administration, and a general understanding of cloud deployment models.
- The necessary AWS Identity and Access Management (IAM) users and roles assigned to you to access and deploy the various AWS resources and services as defined in the Required AWS components section that follows.
For more information about IAM, see: What Is IAM? in Amazon’s AWS documentation.
Required AWS components
- An AWS VPC. For details, see VPC and Subnets in Amazon’s AWS documentation.
- A ThoughtSpot AMI. For details, see the next section.
- AWS security groups. For required open ports, see network policies.
- AWS VM instances. For instance type recommendations, see ThoughtSpot AWS instance types.
- EBS volumes.
- (Optional) If deploying with S3 persistent storage, one S3 bucket dedicated to each ThoughtSpot cluster.
Guidelines for setting up your EC2 instances
- Sign in to your AWS account.
- Copy the following ThoughtSpot public AMI, which is available in the N. California region, to your AWS region:
AMI Name: thoughtspot-image-20190718-dda1cc60a58-prod
AMI ID: ami-0b23846e4761375f1
Region: N. California
Note: The AMI is backward-compatible with ThoughtSpot releases 5.1.x - 5.2.x.
- Choose the appropriate EC2 instance type: See ThoughtSpot AWS instance types for supported instance types.
- Networking requirements: 10 GbE network bandwidth is needed between the VMs. This is the default for the VM type recommended by ThoughtSpot.
- Security: The VMs that are part of a cluster need to be accessible by each other, which means they need to be on the same Amazon Virtual Private Cloud (VPC) and subnetwork. Additional external access may be required to bring data in/out of the VMs to your network.
- Number of EC2 instances needed: This number varies with the size of your datasets. See ThoughtSpot AWS instance types for the recommended number of nodes for a given data size.
- Staging larger datasets (> 50 GB per VM) may require provisioning additional attached EBS volumes that are SSD (gp2).
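The AMI copy step in the guidelines above can also be done from the command line. The following is a minimal sketch that only builds the `aws ec2 copy-image` command and does not run it. It assumes the AWS CLI is installed and configured; `copy_ami_command` is a hypothetical helper name, and `us-east-1` is only an example destination region. `us-west-1` is the AWS region code for N. California.

```python
import shlex

# Values taken from the AMI details listed in the guidelines above.
SOURCE_AMI_ID = "ami-0b23846e4761375f1"
SOURCE_AMI_NAME = "thoughtspot-image-20190718-dda1cc60a58-prod"
SOURCE_REGION = "us-west-1"  # AWS region code for N. California


def copy_ami_command(dest_region: str) -> list[str]:
    """Build (but do not run) an `aws ec2 copy-image` command line."""
    if dest_region == SOURCE_REGION:
        raise ValueError("The AMI is already in this region; no copy is needed")
    return [
        "aws", "ec2", "copy-image",
        "--source-region", SOURCE_REGION,
        "--source-image-id", SOURCE_AMI_ID,
        "--region", dest_region,   # destination region for the copy
        "--name", SOURCE_AMI_NAME,  # name given to the copied AMI
    ]


if __name__ == "__main__":
    # "us-east-1" is an example destination, not a recommendation.
    print(shlex.join(copy_ami_command("us-east-1")))
```

Printing the command instead of executing it lets you review it before running it in your own shell. The copied AMI receives a new AMI ID in the destination region.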
Setting up your Amazon S3 bucket
If you are going to deploy your cluster using the S3-storage option, you must set up that bucket before you set up your cluster. Contact ThoughtSpot Support to find out if your specific cluster size will benefit from the S3 storage option.
To set up an Amazon S3 bucket in AWS, do the following:
In AWS, navigate to the S3 service dashboard by clicking Services, then S3.
Make sure the selected region in the top-right corner of the dashboard is the same region in which you plan to set up your cluster.
Click Create bucket.
On the Name and region page, enter a name for your bucket, select the region where you plan to set up the cluster, and click Next.
On the Properties page, click Next.
On the Configure options page, make sure Block all public access is selected and click Next.
On the Set permissions page, click Create bucket.
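As an alternative to the console steps above, the same bucket can be created with the AWS CLI. The sketch below only builds the commands; it assumes the AWS CLI is installed, and the bucket name and region in the usage line are hypothetical placeholders. `create_bucket_commands` is an illustrative helper, not part of any ThoughtSpot tooling.

```python
import shlex


def create_bucket_commands(bucket: str, region: str) -> list[list[str]]:
    """Build commands that create a bucket and block all public access."""
    create = [
        "aws", "s3api", "create-bucket",
        "--bucket", bucket,
        "--region", region,
    ]
    # us-east-1 is the one region where a LocationConstraint must be omitted.
    if region != "us-east-1":
        create += [
            "--create-bucket-configuration",
            f"LocationConstraint={region}",
        ]
    # Equivalent of selecting "Block all public access" in the console.
    block_public = [
        "aws", "s3api", "put-public-access-block",
        "--bucket", bucket,
        "--public-access-block-configuration",
        "BlockPublicAcls=true,IgnorePublicAcls=true,"
        "BlockPublicPolicy=true,RestrictPublicBuckets=true",
    ]
    return [create, block_public]


if __name__ == "__main__":
    # Hypothetical bucket name and region; use the region of your cluster.
    for command in create_bucket_commands("example-thoughtspot-bucket", "us-west-2"):
        print(shlex.join(command))
```

Create the bucket in the same region as the cluster, as the console steps require.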
Encrypting your data at rest on Amazon EBS or S3 in AWS
ThoughtSpot uses EBS for the data volumes that store persistent data (in the EBS deployment model) and for the boot volume (in both the EBS and S3 deployment models). ThoughtSpot recommends that you encrypt your data volumes before setting up your ThoughtSpot cluster. If you are using the S3 persistent storage model, you can encrypt the S3 buckets using SSE-S3. ThoughtSpot does not currently support AWS KMS encryption for AWS S3.
For more information on encryption supported with AWS:
- For EBS, see Amazon EBS Encryption in Amazon’s AWS documentation.
- For S3, see Amazon S3 Default Encryption for S3 Buckets in Amazon’s AWS documentation.
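For the S3 case, default bucket encryption with SSE-S3 corresponds to the `AES256` algorithm in the S3 API. The sketch below builds the encryption configuration and the matching `aws s3api put-bucket-encryption` command without running it. The helper names and the bucket name are hypothetical, and the sketch assumes the AWS CLI is installed.

```python
import json
import shlex


def sse_s3_encryption_config() -> dict:
    """Default-encryption rule for SSE-S3 (S3-managed keys, not KMS)."""
    return {
        "Rules": [
            {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
        ]
    }


def put_bucket_encryption_command(bucket: str) -> list[str]:
    """Build (but do not run) the CLI call that applies the rule."""
    return [
        "aws", "s3api", "put-bucket-encryption",
        "--bucket", bucket,
        "--server-side-encryption-configuration",
        json.dumps(sse_s3_encryption_config()),
    ]


if __name__ == "__main__":
    # Hypothetical bucket name.
    print(shlex.join(put_bucket_encryption_command("example-thoughtspot-bucket")))
```

The configuration deliberately uses `AES256` rather than `aws:kms`, because KMS encryption for S3 is stated above as unsupported.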
Setting up your ThoughtSpot cluster in AWS
To set up a ThoughtSpot cluster in AWS, do the following:
In AWS, navigate to the EC2 service dashboard by clicking Services, then EC2.
Make sure the region selected in the top-right corner of the dashboard is correct. If it is not, select the region in which you want to launch your instances. Let ThoughtSpot Support know if you change your region.
Start the process of launching a VM by clicking Launch Instance.
Click the My AMIs tab, find the ThoughtSpot AMI from the list, and click Select.
On the Choose an Instance Type page, select a ThoughtSpot-supported instance type. (See ThoughtSpot AWS instance types.)
Click Next: Configure Instance Details.
Configure the instances by choosing the number of EC2 instances you need. The instances must be on the same VPC and subnetwork. ThoughtSpot will set up the instances to be in the same ThoughtSpot cluster.
S3 storage setting: If you are going to use the S3 storage option, you must go to the IAM role menu and select ec2rolewithfulls3access. This setting gives your instance access to all S3 buckets in your account’s region. If you want to restrict the access to a specific bucket, you must create a new IAM role that provides access to the specific bucket, and select it instead. For details on that, click Create new IAM role.
Click Next: Add Storage. Add the required storage based on the storage requirements of the instance type you have selected, and the amount of data you are deploying. For specific storage requirements, refer to ThoughtSpot AWS instance types.
When you are done modifying the storage size, click Next: Add Tags.
Set a name for tagging your instances and click Next: Configure Security Group.
Select an existing security group to attach to your instances, making sure that it meets the security requirements for ThoughtSpot.
Tip: Security settings for ThoughtSpot
- The VMs need intragroup security; that is, every VM in a cluster must be accessible from every other VM in the cluster. For easier configuration, ThoughtSpot recommends that you enable full access between VMs in a cluster.
- Additional ports must be opened on the VMs to provide data staging capabilities to your network. Check Network policies to determine the minimum required ports that must be opened for your ThoughtSpot appliance.
Click Review and Launch. After you have reviewed your instance launch details, click Launch.
Choose a key pair. A key pair consists of a public and private key used to encrypt and decrypt login information. If you don’t have a key pair, you must create one; otherwise, you won’t be able to SSH into the AWS instance later on.
Click Launch Instances. Wait a few minutes for the instances to fully start up. After they start, they appear on the EC2 console.
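For the S3 storage setting in the steps above, a role restricted to a single bucket needs two JSON documents: a permissions policy scoped to that bucket, and a trust policy that lets EC2 assume the role. The sketch below generates both. The bucket name is a hypothetical placeholder, and granting `s3:*` on the bucket is an assumption that mirrors "full S3 access" narrowed to one bucket; the exact set of S3 actions ThoughtSpot requires is not stated here, so confirm it with ThoughtSpot Support before tightening further.

```python
import json


def bucket_scoped_policy(bucket: str) -> dict:
    """Permissions policy limited to one bucket and the objects in it."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # ASSUMPTION: all S3 actions, but only on this one bucket.
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",    # the bucket itself
                    f"arn:aws:s3:::{bucket}/*",  # every object in the bucket
                ],
            }
        ],
    }


def ec2_trust_policy() -> dict:
    """Trust policy that allows EC2 instances to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical bucket name.
    print(json.dumps(bucket_scoped_policy("example-thoughtspot-bucket"), indent=2))
    print(json.dumps(ec2_trust_policy(), indent=2))
```

Both ARNs are needed because bucket-level actions (such as listing) apply to the bucket ARN, while object-level actions apply to the `/*` ARN.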
Prepare the VMs (ThoughtSpot Systems Reliability Team)
Before we can install a ThoughtSpot cluster, an administrator must log in to each VM through SSH as user “admin” and complete the following preparation steps:
Run sudo /usr/local/scaligent/bin/prepare_disks.sh on every machine.
Configure each VM based on the site-survey.
When complete, your storage is mounted and ready for use with your cluster.
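When a cluster has several VMs, the disk preparation step can be scripted rather than typed on each machine. The following sketch builds one `ssh` command per VM and prints it for review. The host addresses and key file path are hypothetical placeholders, and `prepare_disk_commands` is an illustrative helper. Depending on the sudo configuration on the VMs, an interactive terminal (`ssh -t`) may be needed; that is an assumption to verify in your environment.

```python
import shlex

# Command from the preparation steps above.
PREPARE_DISKS = "sudo /usr/local/scaligent/bin/prepare_disks.sh"


def prepare_disk_commands(hosts: list[str], key_file: str) -> list[list[str]]:
    """Build one ssh command per VM, logging in as the "admin" user."""
    return [
        ["ssh", "-i", key_file, f"admin@{host}", PREPARE_DISKS]
        for host in hosts
    ]


if __name__ == "__main__":
    # Hypothetical private IP addresses and key file path.
    vm_hosts = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
    for command in prepare_disk_commands(vm_hosts, "~/.ssh/example-key.pem"):
        print(shlex.join(command))
```

The key file is the private half of the key pair chosen when the instances were launched.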
Launch the cluster
Upload the ThoughtSpot (TS) tarball to one of the VMs and proceed with the normal cluster creation process, using tscli cluster create.
If you are going to use S3 as your persistent storage, you must enable it when running this command, using the enable_cloud_storage flag. Example:
tscli cluster create 6.0-167.tar.gz --enable_cloud_storage=s3a
When the setup is complete, you can load data into ThoughtSpot for search analytics.
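If cluster creation is driven from a script, the choice between EBS and S3 storage can be expressed as a single switch. The sketch below builds the `tscli cluster create` command shown above and uses only the flag and value that appear in that example. `cluster_create_command` is a hypothetical helper, and the tarball name is the one from the example.

```python
import shlex


def cluster_create_command(tarball: str, use_s3: bool = False) -> list[str]:
    """Build (but do not run) the tscli cluster creation command."""
    command = ["tscli", "cluster", "create", tarball]
    if use_s3:
        # Flag and value exactly as shown in the example above.
        command.append("--enable_cloud_storage=s3a")
    return command


if __name__ == "__main__":
    print(shlex.join(cluster_create_command("6.0-167.tar.gz", use_s3=True)))
    # prints: tscli cluster create 6.0-167.tar.gz --enable_cloud_storage=s3a
```

Leaving `use_s3` at its default produces the command without the cloud storage flag, which matches the EBS deployment model.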
Open the required network ports
To determine which network ports to open for a functional ThoughtSpot cluster, see Network policies.
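The recommendation to enable full access between the VMs in a cluster can be implemented as a security group rule whose source is the security group itself. The sketch below builds that rule in the EC2 API's `IpPermissions` shape, along with the matching `aws ec2 authorize-security-group-ingress` command, without running it. The security group ID is a hypothetical placeholder, and the helper names are illustrative. This covers intragroup traffic only; the ports for end users still come from Network policies.

```python
import json
import shlex


def intragroup_ingress(group_id: str) -> list[dict]:
    """Allow all protocols and ports from members of the same group."""
    return [
        {
            "IpProtocol": "-1",  # -1 means all protocols and all ports
            "UserIdGroupPairs": [{"GroupId": group_id}],
        }
    ]


def authorize_ingress_command(group_id: str) -> list[str]:
    """Build (but do not run) the CLI call that adds the rule."""
    return [
        "aws", "ec2", "authorize-security-group-ingress",
        "--group-id", group_id,
        "--ip-permissions", json.dumps(intragroup_ingress(group_id)),
    ]


if __name__ == "__main__":
    # Hypothetical security group ID.
    print(shlex.join(authorize_ingress_command("sg-0123456789abcdef0")))
```

Because the rule references the group rather than IP ranges, VMs added to the group later are covered automatically.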