Set up ThoughtSpot in GCP

Set up your GCP virtual machines.


ThoughtSpot Training

  • For best results in setting up ThoughtSpot on GCP, we recommend that you take the following ThoughtSpot U course: Node Configuration: GCP.

  • See other training resources at ThoughtSpot U.


After you determine your configuration options, set up your virtual machines (VMs). ThoughtSpot shares its base image for booting the VMs, along with some other system setup components, with you on GCP.

ThoughtSpot uses a custom image to populate VMs in GCP. To find the ThoughtSpot custom image, refer to step 13 in the create an instance section.

Ask your ThoughtSpot contact for access to this image. We need the Google account/email ID of the individual who will be signed into your organization’s GCP console. We will share ThoughtSpot’s GCP project with them so they can use the contained boot disk image to create ThoughtSpot VMs.

This guide explains how to deploy ThoughtSpot on GCP, using ThoughtSpot’s CentOS-based image. You can also deploy ThoughtSpot on GCP using Red Hat Enterprise Linux (RHEL) or Oracle Enterprise Linux (OEL), allowing you to run ThoughtSpot on an RHEL 7.8 or 7.9 or OEL 7.9 image that your organization manages internally. To install ThoughtSpot using RHEL or OEL, refer to the RHEL and OEL deployment guide.

Before you can create a ThoughtSpot cluster, you must set up your VMs. Use Google Compute Engine (GCE) on GCP to create and run VMs.

The following topics walk you through this process.

Prerequisites

Before you begin setting up ThoughtSpot, complete the following tasks:

  • Ensure that your Network Service Tier on the Google Cloud Console is set to Premium for the best performance of all your VMs.

  • A ThoughtSpot cluster requires 10 Gb/s bandwidth (or better) between any two nodes. You must ensure this before creating a new cluster.

  • Download and fill out the ThoughtSpot Site survey to have a quick reference for your networking information. Ask your network administrator if you need help filling out the site survey.

Setting up your Google Cloud Storage (GCS) bucket

If you are going to deploy your cluster using the GCS-storage option, you must set up that bucket before you set up your cluster. Contact ThoughtSpot Support to find out if your specific cluster size will benefit from the GCS storage option. If you are not using GCS, skip this step and create an instance.

  1. Sign in to the Google Cloud Console.

  2. Go to the Storage dashboard from the navigation bar on the side of your screen.

  3. Click CREATE BUCKET on the top menu bar.

  4. Enter a name for your bucket, and click CONTINUE.

  5. For location type, select Region.

  6. Use the Location drop-down menu to select the region where you are going to set up your instance.

  7. Click CONTINUE.

  8. For default storage class, select Standard.

  9. Click CONTINUE.

  10. Under Access Control, select Uniform to ensure uniform access to all objects in the storage bucket.

  11. Click CONTINUE.

  12. Do not edit the advanced settings.

    Leave Encryption set to Google-managed key and do not set a retention policy.

  13. Click CREATE.

When you create your instance, make sure you set Storage to Read Write access.
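
If you prefer the command line, the console steps above can be approximated with the gcloud CLI. This is a minimal sketch rather than an official ThoughtSpot procedure. It assumes the gcloud CLI is installed and authenticated against your project, and the bucket name and region shown are hypothetical placeholders. The function only prints the command so that you can review it before running it.

```shell
#!/bin/sh
# Hypothetical defaults; replace them with your own bucket name and region.
BUCKET_NAME="${BUCKET_NAME:-my-thoughtspot-bucket}"
REGION="${REGION:-us-central1}"

# Print a bucket-creation command that mirrors the console choices:
# a single-region location, the Standard storage class, and uniform access.
# Encryption and retention are not set, so the Google defaults apply.
bucket_create_cmd() {
  # $1 = bucket name, $2 = region
  echo gcloud storage buckets create "gs://$1" \
    --location="$2" \
    --default-storage-class=STANDARD \
    --uniform-bucket-level-access
}

bucket_create_cmd "$BUCKET_NAME" "$REGION"
```

After you review the printed command, you can paste it into your terminal to create the bucket.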

Create an instance

  1. Sign in to the Google Cloud Console.

  2. Click Select a Project from the top bar.

  3. Under Select From, pick your company’s project.

  4. Go to the Compute Engine dashboard.

  5. Select VM instances on the left panel.

  6. Click the + icon from the top menu bar to create an instance.

  7. Provide a name for the instance.

  8. Select the region you are creating the instance in.

  9. Select the zone you are creating the instance in.

  10. Under Machine type, select custom.

  11. Select the number of vCPU cores you need.

    Refer to ThoughtSpot GCP instance types to determine the number of vCPU cores your cluster needs.

  12. Specify your memory requirements and CPU platform.

    Refer to ThoughtSpot GCP instance types to determine the memory your cluster needs.

    Your configuration may look something like the following, but with your specific information.

    Cores: 64 vCPU
    Memory: 416 GB
    CPU platform: Automatic, or select either one of the preferred CPU platforms, Intel Skylake or Intel Broadwell, if available.

  13. Configure the Boot disk.

    1. Scroll down to the Boot disk section and click Change.

    2. Click Custom Images from the options under Boot disk.

    3. Select ThoughtSpot-images under Show images from.

    4. Select one of the ThoughtSpot base images.

      Under the name of the image, you can see when it was created. ThoughtSpot should have directly sent you an image to use. If so, use that image.

      The image you should use depends on your release number.

      Release number 7.1: thoughtspot-image-20210402-71f6832a800-prod

      ThoughtSpot updates the base images with patches and enhancements. If more than one image is available, select the latest one by looking at the dates of creation. Each image works; however, we recommend using the latest image because it typically contains the most recent security and maintenance patches. Contact ThoughtSpot Support if you are unsure which image to use.

    5. Configure the boot disk as follows:

      Image: ThoughtSpot
      Boot disk type: SSD
      Size (GB): 250

    6. Click Select to save the boot disk configuration.

  14. Back on the main configuration page, click to expand the advanced configuration options (Management, security, disks, networking, sole tenancy).

  15. Attach two 1 TB SSD drives for data storage. Refer to SSD-only persistent storage. If you are using GCS, attach only 1 SSD drive, with 500 GB instead of 1 TB. Refer to GCS and SSD persistent storage.

    1. Click the Disks tab, and click Add new disk.

      Deselect the Deletion rule to prevent potential loss of data if your instance is deleted accidentally.

    2. Configure the following settings for each disk. Refer to ThoughtSpot GCP instance types to determine the size in GB when you have GCS. Ensure the disks have read/write access.

      Type: SSD persistent disk
      Source type: Blank disk
      Size (GB): 1024
      Deletion rule: Select Keep disk to prevent potential loss of data if your instance is deleted accidentally.

  16. (For use with GCS only) In the Identity and API access section, make sure Service account is set to Compute Engine default service account. Under Access scopes, select Set access for each API.

  17. (For use with GCS only) After you click Set access for each API, scroll down to the Storage dropdown menu in the Identity and API access section. Set it to one of the following options:

    • To use Google Cloud Storage (GCS) as persistent storage for your instance, select Read Write.

    • To only use GCS to load data into ThoughtSpot, select Read Only.

  18. Under Networking, customize the network settings as needed. Use your default VPC settings, if you know them. Ask your network administrator if you do not know your default VPC settings.

    Update the network interface with your specific information or create a new one.

    1. Add an existing VPC network, or create a new one by clicking VPC network from the main menu. Ensure that this network has a firewall rule attached, with the minimum ports required for ThoughtSpot operation open. Refer to the minimum port requirements. See Google’s using firewalls and using VPCs documentation for assistance creating a firewall rule and a VPC network.

    2. Set the external IP as either ephemeral or static, depending on your preference.

    3. Ensure that Network Service Tier is set to Premium.

  19. Repeat these steps to create the necessary number of VMs for your cluster.
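
The console steps above can also be expressed as a single gcloud command, which may be convenient when you create several identical VMs. This is a sketch under stated assumptions, not an official ThoughtSpot procedure. It uses the SSD-only layout with the example sizing from this guide (64 vCPU, 416 GB memory, a 250 GB SSD boot disk, and two 1 TB SSD data disks that are kept when the instance is deleted). The instance name, zone, and image project are hypothetical placeholders. Use the project and image that ThoughtSpot shared with you. The function only prints the command so that you can review it first.

```shell
#!/bin/sh
# Print a gcloud command that approximates the console settings in this guide.
# The image name below is the release 7.1 image listed above; substitute the
# image that ThoughtSpot sent you.
instance_create_cmd() {
  # $1 = instance name (placeholder)
  # $2 = zone (placeholder)
  # $3 = project that hosts the shared ThoughtSpot image (placeholder)
  echo gcloud compute instances create "$1" \
    --zone="$2" \
    --custom-cpu=64 --custom-memory=416GB \
    --image=thoughtspot-image-20210402-71f6832a800-prod \
    --image-project="$3" \
    --boot-disk-type=pd-ssd --boot-disk-size=250GB \
    --create-disk=size=1024GB,type=pd-ssd,auto-delete=no \
    --create-disk=size=1024GB,type=pd-ssd,auto-delete=no \
    --network-tier=PREMIUM
}

# Example: print the command for one node with hypothetical values.
instance_create_cmd ts-node-1 us-central1-a your-image-project
```

For the GCS option, you would instead attach a single 500 GB data disk and add `--scopes=storage-rw` (or `--scopes=storage-ro` if you only load data from GCS), which corresponds to the Storage access setting in steps 16 and 17.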

Minimum required ports

Open the following ports between the User/ETL server and ThoughtSpot nodes. This ensures that the ThoughtSpot processes do not get blocked. Refer to Network ports for more information on what ports to open for intracluster operation, so that your clusters can communicate.

The minimum ports needed are:

Port     Protocol   Service
22       SSH        Secure Shell access
443      HTTPS      Secure Web access
12345    TCP        ODBC and JDBC drivers access
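
As an illustration, the three minimum ports listed above could be opened with one ingress firewall rule. This is a minimal sketch that assumes the gcloud CLI is installed and authenticated. The rule name, VPC network name, and source range are hypothetical placeholders that your network administrator should replace. The function only prints the command for review.

```shell
#!/bin/sh
# Print a firewall-rule command that allows inbound TCP traffic on
# ports 22 (SSH), 443 (HTTPS), and 12345 (ODBC and JDBC drivers).
firewall_rule_cmd() {
  # $1 = rule name (placeholder)
  # $2 = VPC network name (placeholder)
  # $3 = source CIDR range allowed to connect (placeholder)
  echo gcloud compute firewall-rules create "$1" \
    --network="$2" \
    --direction=INGRESS \
    --allow=tcp:22,tcp:443,tcp:12345 \
    --source-ranges="$3"
}

# Example with hypothetical values; 10.0.0.0/8 stands in for your user/ETL network.
firewall_rule_cmd thoughtspot-min-ports default 10.0.0.0/8
```

Restricting the source range to the networks that actually need access is usually preferable to opening these ports to all addresses.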

Prepare the VMs

Before you can install your ThoughtSpot cluster, an administrator must log in to each VM through SSH as user "admin", and complete the following preparation steps:

  1. Open a terminal application on your machine and ssh into one of your VMs.

    ssh admin@<VM-IP>
  2. Run sudo /usr/local/scaligent/bin/prepare_disks.sh.

    $ sudo /usr/local/scaligent/bin/prepare_disks.sh
  3. Configure the VM based on your completed site survey.

  4. Repeat this process for each of your VMs.
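
When you have several VMs, the SSH and disk preparation steps above can be scripted. The following is a sketch only. The IP addresses are hypothetical placeholders, and it assumes your workstation can reach each VM as the admin user. It prints one command per VM so that you can review the list before running anything. The site-survey configuration in step 3 is specific to each deployment and is not covered here.

```shell
#!/bin/sh
# Print the disk-preparation command for each VM address passed as an argument.
# The -t option requests a terminal, which sudo may need in order to prompt.
prepare_cmds() {
  for ip in "$@"; do
    echo ssh -t "admin@$ip" sudo /usr/local/scaligent/bin/prepare_disks.sh
  done
}

# Example with hypothetical node addresses.
prepare_cmds 10.0.0.11 10.0.0.12 10.0.0.13
```

Running the printed commands one at a time makes it easier to spot a failure on an individual VM.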

Install cluster

To install your ThoughtSpot cluster, complete the installation process outlined in Installing ThoughtSpot in GCP.