Set up ThoughtSpot in Azure

After you determine your configuration options, you must set up your virtual machines using a ThoughtSpot image for Azure.


ThoughtSpot Training

  • For best results in setting up ThoughtSpot on Azure, we recommend that you take the following ThoughtSpot U course: Node Configuration: Azure.

  • See other training resources at ThoughtSpot U.


About the ThoughtSpot image

To provision ThoughtSpot in the Azure portal, access the ThoughtSpot Virtual Machine in the Azure Marketplace.

The ThoughtSpot Virtual Machine comes provisioned with the custom ThoughtSpot image to make hosting simple. A virtual machine is a preconfigured template that provides the information required to launch an instance of ThoughtSpot. It includes a root disk for the instance, which contains an operating system, application server, and other necessary software.

The ThoughtSpot Virtual Machine has the ThoughtSpot software installed and configured, on a CentOS-based image. Check with your ThoughtSpot contact to learn about the latest version of the ThoughtSpot Virtual Machine.

Due to security restrictions, the ThoughtSpot Virtual Machine does not have default passwords for the administrator users. When you are ready to obtain the password, contact ThoughtSpot Support.

This guide explains how to deploy ThoughtSpot on Microsoft Azure, using ThoughtSpot’s CentOS-based image. You can also deploy ThoughtSpot on Azure using Red Hat Enterprise Linux (RHEL), allowing you to run ThoughtSpot on an RHEL 7.8 or 7.9 image that your organization manages internally. To install ThoughtSpot using RHEL, refer to the RHEL and OEL deployment guide after you launch your virtual machines.

Set up ThoughtSpot in Azure

Follow these steps to provision and set up the VMs and launch ThoughtSpot.

Prerequisites

Complete these steps before launching your ThoughtSpot Virtual Machine:

  1. Obtain an Azure login account.

  2. Set up usage payment details with Microsoft Azure.

  3. Find your company’s Resource Group. (You can also choose to create resource groups when creating your virtual machines.)

  4. Download and fill out the ThoughtSpot site survey to have a quick reference for any networking information you may need. Ask your network administrator if you need help filling out the site survey.

A ThoughtSpot cluster requires 10 Gb/s bandwidth (or better) between any two nodes. You must ensure this before creating a new cluster.

Create an instance

Create your virtual machines based on the ThoughtSpot Virtual Machine.

  1. Log in to the Azure portal.

    In a browser, go to Azure home, and log in to your Azure account.

  2. On the Azure portal homepage, hover over Virtual machines, and click Create.

  3. Specify information under Basics.

    Subscription

    Choose a subscription type from the dropdown menu.

    Resource group

    If your company already has a resource group, select existing. If not, create new.

    Virtual machine name

    Specify a name for your virtual machine.

    Region

    Specify the region in which you are creating the VM.

    Image

    Click Browse all public and private images, and search for the ThoughtSpot image. Choose the ThoughtSpot Search & AI-driven Analytics (BYOL) image.

    Size

    Specify the size for your VM that works for your cluster needs.

    Authentication type

    Select SSH public key.

    Username

    Specify a username.

    This user is necessary for Azure VM creation, but does not exist in ThoughtSpot. You cannot log in to ThoughtSpot, or ssh into the command line, with this user.

    SSH public key

    Enter an SSH public key.

    You can choose Use existing public key or Generate new key pair, or contact ThoughtSpot Support to obtain a public key.

    This SSH public key is different from the SSH private key that you use later to ssh into your VM from the command line. This public key is necessary for Azure VM creation, but is not needed at any later point.

    Public inbound ports

    Choose allow selected ports.

    Select inbound ports

    Open the necessary Inbound and Outbound ports to ensure that the ThoughtSpot processes do not get blocked.

  4. Specify information under Disks.

    OS disk type

    Choose a disk type from the dropdown menu. ThoughtSpot recommends the Premium SSD disks.

    Data disks

    Click Create and attach a new disk. Add two data disks. Refer to Azure configuration options to see what size they should be.

    Advanced

    Under Advanced, click yes to use managed disks.

    The new Standard SSD disk types are only available for particular regions. Make sure this disk type is supported in the region you chose for your VM before selecting it.

    See Standard SSD Disks for Virtual Machine workloads for more on SSD disks. ThoughtSpot recommends the Premium SSD disks.

  5. Specify information under Networking.

    Virtual network

    Find your company’s virtual network and select it, or create new.

    Public IP

    Find your company’s public IP, or create new.

    NIC network security group

    Select Advanced for NIC network security group.

    Configure network security group

    After you select Advanced, the Configure network security group option appears. Find your company’s security group, or create new. When creating your security group, ensure that the required ports are open. Refer to the Network ports article.

  6. Under Management, configure your monitoring and management preferences. If you have no preferences, you can leave them at their default settings.

  7. Under Advanced, configure your advanced settings preferences. If you have no preferences, you can leave them at their default settings.

  8. Under Tags, tag your virtual machine with a human-readable string to help you identify it.

  9. Click Review + create in the bottom left corner of your screen.

  10. Review your changes, and click Create. Azure runs a final validation check.
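
If you prefer the command line to the portal, the same choices can be expressed with the Azure CLI. The following is a minimal sketch, not an official ThoughtSpot procedure. It assumes the Azure CLI (az) is installed and logged in. Every value in it is a hypothetical placeholder, including the image URN, VM size, region, and the 1024 GB data disk size; take the real values from your ThoughtSpot contact and the Azure configuration options. Marketplace images may also require you to accept their terms first (az vm image terms accept). By default the script only prints the command so that you can review it.

```shell
#!/bin/sh
# All values below are hypothetical placeholders; replace them with your own.
RESOURCE_GROUP="my-resource-group"
VM_NAME="ts-node-1"
LOCATION="westus2"
IMAGE_URN="<thoughtspot_image_urn>"      # not specified in this guide
VM_SIZE="<vm_size>"                      # choose per Azure configuration options
ADMIN_USER="azureuser"                   # Azure-only user, not a ThoughtSpot user
SSH_PUBLIC_KEY="$HOME/.ssh/id_rsa.pub"

# create_vm: print the az command (default), or run it when DRY_RUN=0.
create_vm() {
  set -- vm create \
    --resource-group "$RESOURCE_GROUP" \
    --name "$VM_NAME" \
    --location "$LOCATION" \
    --image "$IMAGE_URN" \
    --size "$VM_SIZE" \
    --authentication-type ssh \
    --admin-username "$ADMIN_USER" \
    --ssh-key-values "$SSH_PUBLIC_KEY" \
    --storage-sku Premium_LRS \
    --data-disk-sizes-gb 1024 1024 \
    --tags purpose=thoughtspot
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "az $*"
  else
    az "$@"
  fi
}

create_vm
```

Run the script with DRY_RUN=0 only after you have checked the printed command against your site survey. Networking options such as the virtual network and the network security group are left out here and would still need to be set, either in the portal or with additional flags.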

Minimum required ports

Open the following ports between the User/ETL server and ThoughtSpot nodes. This ensures that the ThoughtSpot processes do not get blocked.

The minimum port requirements are:

Port     Protocol   Service
22       SSH        Secure Shell access
443      HTTPS      Secure Web access
12345    TCP        ODBC and JDBC drivers access

Nodes purchased from Azure must be able to reach each other so that they can communicate and form a distributed environment. ThoughtSpot requires that these ports be accessible between nodes within a cluster. Use your discretion about whether to restrict public access for all nodes and all ports.

Refer to Network ports for more information.
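
To confirm that the minimum ports are reachable from the User/ETL server, a quick connectivity check can help. The following is a rough sketch that assumes the nc (netcat) utility is installed on the machine you run it from. TS_HOST is a hypothetical variable that holds the IP address of one ThoughtSpot node. The check only shows whether a TCP connection succeeds; it does not validate the service behind the port.

```shell
#!/bin/sh
# Minimum ports listed in this section.
REQUIRED_PORTS="22 443 12345"

# check_ports HOST: try a TCP connection to each required port on HOST.
check_ports() {
  host=$1
  for port in $REQUIRED_PORTS; do
    # -z: connect without sending data; -w 5: give up after 5 seconds.
    if nc -z -w 5 "$host" "$port" 2>/dev/null; then
      echo "port $port: open"
    else
      echo "port $port: blocked or closed"
    fi
  done
}

# TS_HOST is a hypothetical variable; set it to a node IP address.
if [ -n "${TS_HOST:-}" ]; then
  check_ports "$TS_HOST"
else
  echo "Set TS_HOST to a node IP address to run the check."
fi
```

For example, TS_HOST=<public_VM_IP> sh check_ports.sh reports one line per port. A port reported as blocked usually points at the network security group rules configured during VM creation.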

Prepare for starting up ThoughtSpot

Prerequisite: To log in to the VM, you need the private key that is available in the image. You can obtain this from your ThoughtSpot contact.

  1. Obtain the VM’s public and private IP addresses.

    Public IP

    To see the public IP, click the VM name link. This will show the public IP of the VM.

    Private IP

    To see the private IP, select more services from the Microsoft Azure homepage. Select Networking from the list on the left side of the screen.

  2. In a terminal application, connect to the VM through SSH.

    Log in as the admin user, using the private key that your ThoughtSpot contact sent you.

    $ ssh -i <path_to_private_key> admin@<public_VM_IP>

  3. Update the password for both the admin and the thoughtspot users.

    The command prompts you to type in a new password, and then to confirm the password.

    $ sudo passwd admin
    Changing password for user admin
    $ sudo passwd thoughtspot
    Changing password for user thoughtspot

    WARNING: If you do not change the password, you cannot log back in to your Azure VMs. Your private key does not work after initial installation.

  4. Update the file /etc/hosts with the IP addresses of all the other VMs that will be part of the ThoughtSpot cluster.
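
The /etc/hosts update can be scripted so that it is repeatable on every node. The following sketch assumes the usual "IP address, then hostname" line format. The IP addresses and the ts-node-N hostnames are hypothetical examples; use the private IPs you collected in step 1 and whatever names your site survey specifies. The script edits a copy of the file so that you can review the result before replacing /etc/hosts.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME: append "IP NAME" to FILE unless IP is already listed.
add_host_entry() {
  hosts_file=$1
  ip=$2
  name=$3
  # Compare against the first field only, so 10.0.0.4 does not match 10.0.0.40.
  if awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }' "$hosts_file"; then
    echo "skip: $ip already in $hosts_file"
  else
    printf '%s\t%s\n' "$ip" "$name" >> "$hosts_file"
    echo "added: $ip $name"
  fi
}

# Work on a copy so the result can be reviewed before it replaces /etc/hosts.
work_copy=$(mktemp)
if [ -r /etc/hosts ]; then
  cat /etc/hosts > "$work_copy"
fi

# Hypothetical node addresses and names.
add_host_entry "$work_copy" 10.0.0.4 ts-node-1
add_host_entry "$work_copy" 10.0.0.5 ts-node-2
add_host_entry "$work_copy" 10.0.0.6 ts-node-3

echo "Review $work_copy, then install it with: sudo cp $work_copy /etc/hosts"
```

Because the function skips addresses that are already present, running the script a second time does not create duplicate entries.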

Verify storage disks

Verify that the data disks you created in Step 4 of Create an instance exist by issuing lsblk in your terminal application:

$ lsblk

Your result may look something like the following listing:

   NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
   fd0       2:0    1    4K  0 disk
   sda       8:0    0  200G  0 disk
   ├─sda1    8:1    0    1G  0 part /mntboot
   ├─sda2    8:2    0   20G  0 part /
   ├─sda3    8:3    0   20G  0 part /update
   └─sda4    8:4    0  159G  0 part /export
   sdb       8:16   0    1T  0 disk
   └─sdb1    8:17   0    1T  0 part /mnt/resource
   sdc       8:32   0    1T  0 disk
   sdd       8:48   0    1T  0 disk
   sr0      11:0    1  628K  0 rom

  1. Unmount the temporary disk by issuing the following command:

    $ sudo umount /mnt/resource

    The /dev/sdb disk, which is mounted at /mnt/resource, is temporary. Any data on it is wiped if the VM is shut down, so you must unmount it.

  2. Prepare the disks /dev/sdc and /dev/sdd for ThoughtSpot by issuing the following command:

    Do not use the disk /dev/sdb (the ephemeral disk). Any data on it is wiped if the VM is shut down.

    $ sudo /usr/local/scaligent/bin/prepare_disks.sh /dev/sdc /dev/sdd

  3. Check the status of the disks by issuing the following command:

    $ df -h

  4. Repeat the steps in this section for each node in your cluster.
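
Device names can differ from node to node, so it is worth confirming which disks are the empty data disks before you pass them to prepare_disks.sh. The following sketch filters lsblk output for disks that have no partitions and no mount point. The function name list_candidate_disks is hypothetical, and the filter assumes sdX-style device names as in the listing above. Treat its output as a hint to check by eye, not as a guarantee.

```shell
#!/bin/sh
# list_candidate_disks: read "NAME TYPE [MOUNTPOINT]" lines on stdin and print
# /dev paths of sdX disks that have no partitions and are not mounted.
list_candidate_disks() {
  awk '
    # Remember every sdX disk in the order it appears.
    $2 == "disk" && $1 ~ /^sd[a-z]+$/ { order[++n] = $1 }
    # A disk that is mounted directly is in use.
    $2 == "disk" && NF >= 3 { used[$1] = 1 }
    # A partition marks its parent disk (name without trailing digits) as in use.
    $2 == "part" { parent = $1; sub(/[0-9]+$/, "", parent); used[parent] = 1 }
    END {
      for (i = 1; i <= n; i++)
        if (!(order[i] in used)) print "/dev/" order[i]
    }
  '
}

# -r: raw output, -n: no header line, -o: select columns.
if command -v lsblk >/dev/null 2>&1; then
  lsblk -rn -o NAME,TYPE,MOUNTPOINT | list_candidate_disks
fi
```

With the sample listing shown earlier in this section, the sda and sdb disks are excluded because they carry partitions, which leaves /dev/sdc and /dev/sdd as the candidates.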