Install the ThoughtSpot application on offline clusters that use Amazon Linux 2
Install ThoughtSpot on Amazon Linux 2 offline clusters. Before starting the install, complete the pre-installation steps.
If you are using the AWS SSM agent as an alternative to SSH, you must run the Ansible playbook and all commands on the SSM console. Ensure that your nodes have the right IAM roles for executing SSM scripts, and that they have access to your S3 bucket, if you are using S3.
In an offline cluster, the hosts cannot connect to the public repositories to download the required packages. Instead, you must download the packages from your organization’s mirror repository to each host. Otherwise, the steps for installing on offline clusters are practically the same as the steps for installing on online clusters.
Before you build the ThoughtSpot cluster and install the ThoughtSpot application on the hosts, you must run the Ansible playbook. The ThoughtSpot Ansible playbook prepares your clusters in the following manner:
- Ansible installs the required packages: YAML, Python, and R packages; see Packages installed with ThoughtSpot for Amazon Linux 2.
- It creates and configures local user accounts for ThoughtSpot:
  - The admin user has full administrative functionality.
  - The thoughtspot user can load data in the application.
- It installs the ThoughtSpot CLI, tscli.
- It configures the ThoughtSpot host nodes:
  - It checks that customization scripts can execute on the nodes.
  - It checks that the partitions meet minimum size requirements.
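Here are the general steps for installing offline:
- Configure the Ansible Playbook
- Redirect the mirror repository
- Run the Ansible Playbook
- Prepare disks
- Install the ThoughtSpot cluster and the application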
Configure the Ansible Playbook
To set up Ansible, follow these steps:
- Download the Ansible tarball that you obtained from ThoughtSpot Support to your local machine. Note that you need a specific tarball for an Amazon Linux 2 installation.
  You can download it by running the cp command. For example, if the tarball is in your S3 bucket, run the following command:
  aws s3 cp s3://bucket_name/path/to/the/tarball ./
  Note that you only need to copy the tarball to one node.
- Extract the Ansible tarball to see the following files and directories on your local machine:
  - ansible_pkgs: This contains the Ansible Yum packages.
  - customize.sh: This runs as the last step in the preparation process. You can use it to inject deployment-specific customizations, such as enabling or disabling a corporate proxy, configuring extra SSH keys, installing extra services, and so on. By default, this script does nothing.
  - hosts.sample: The Ansible inventory file.
  - prod_image: This directory contains the ThoughtSpot tools and tscli, the ThoughtSpot CLI binary.
  - python3_pkgs: This contains all the Python 3 packages.
  - python_pkgs: This contains all the Python 2 packages.
  - README.md: Basic information for the unzipped file.
  - r_pkgs: This contains all the R packages.
  - rpm_gpg: This directory contains the GPG keys that authenticate the public repository.
  - run_offline.sh: This script installs the Yum, R, and Python packages.
  - toolchain: The tools that are necessary to compile the instructions you define in the Ansible Playbook (the source code) into executables that can run on your device. The toolchain includes a compiler, a linker, and run-time libraries.
  - ts-new.yaml: The Ansible Playbook for new installations.
  - ts-update.yaml: The Ansible Playbook for updates.
  - ts.yaml
  - yum_pkgs: This contains all the Yum packages.
  - yum.repos.d: This directory contains information about the Yum repo used by the cluster.
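- Run the following script to install packages:
  # Comment out the localpkg_gpgcheck setting in /etc/yum.conf.
  sudo sed -i "s/localpkg_gpgcheck/# localpkg_gpgcheck/g" /etc/yum.conf
  # Create a log file that any user can write to.
  sudo touch /tmp/run_offline_log
  sudo chmod 777 /tmp/run_offline_log
  # Remove any Ansible version previously installed with pip.
  sudo pip uninstall ansible -y
  # Install the packages, appending all output to the log file.
  ./run_offline.sh >> /tmp/run_offline_log 2>&1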
- Copy the Ansible inventory file hosts.sample to hosts.yaml, and using a text editor of your choice, update the file to include your host configuration. Copy the file by running this command:
  cp hosts.sample hosts.yaml
  If you are using SSM, you must additionally run commands to replace the ts_partition_name and to create a single partition on the disk mounted under /export. Run the following commands to replace the ts_partition_name:
  TS_DISK=disk_name_for_export_partition
  TS_PARTITION_NAME=${TS_DISK}1
  sed -i "s/xvda9/$TS_PARTITION_NAME/g" hosts.yaml
  Then run these commands to create a single partition on the disk mounted under /export:
  sudo parted -s /dev/$TS_DISK mklabel gpt
  sudo parted -s /dev/$TS_DISK mkpart primary xfs 0% 100%
  - hosts: Add the IP addresses or hostnames of all hosts in the ThoughtSpot cluster.
  - user_uid: Specify the user ID for the user who will set up the node. If you are using ssh instead of AWS SSM, use the default values. If you are using SSM, the ssm_user uses the default value, 1001, so you must choose a new value. Note that the thoughtspot user uses 1002, so you cannot use 1001 or 1002. If you do not use the default, add values that are not currently in use. To determine what values your system uses already, run the following command:
    cat /etc/passwd | cut -d ":" -f3-4| sort
  - user_gid: Specify the user group ID for the user who will set up the node. If you are using ssh instead of AWS SSM, use the default values. If you are using SSM, the ssm_user uses the default value, 1001, so you must choose a new value. Note that the thoughtspot user uses 1002, so you cannot use 1001 or 1002. If you do not use the default, add values that are not currently in use. To determine what values your system uses already, run the following command:
    cat /etc/passwd | cut -d ":" -f3-4| sort
  - ssh_user: The ssh_user must exist on the ThoughtSpot host, and it must have sudo privileges. This user is the same as the ec2_user. If you are using AWS SSM instead of ssh, there is no need to fill out this parameter.
  - ssh_private_key: Add the private key for ssh access to the hosts.yaml file. You can use an existing key pair, or generate a new key pair in the Ansible Control Server. Run the following command to verify that the Ansible Control Server can connect to the hosts over ssh:
    ansible -m ping -i hosts.yaml all
    If you are using AWS SSM instead of ssh, there is no need to fill out this parameter or run the above command.
  - is_user_wheel_group: Specifies if the administrator user should be added to the wheel group. The default is true. If you specify false, the administrator user is not added to the wheel group.
  - ssh_public_key: Add the public key to the ssh authorized_keys file for each host, and add the private key to the hosts.yaml file. You can use an existing key pair, or generate a new key pair in the Ansible Control Server. Run the following command to verify that the Ansible Control Server can connect to the hosts over ssh:
    ansible -m ping -i hosts.yaml all
    If you are using AWS SSM instead of ssh, there is no need to fill out this parameter or run the above command.
  - extra_admin_ssh_key: [Optional] An additional key may be required by your security application, such as Qualys, to connect to the hosts. If you are using AWS SSM instead of ssh, there is no need to fill out this parameter.
  - http(s)_proxy: If the hosts must access public repositories through an internal proxy service, provide the proxy information. This release of ThoughtSpot does not support proxy credentials to authenticate to the proxy service.
  - minimal_sudo_install: When this is defined, ThoughtSpot disables certain functionality to avoid making additional sudo calls. This functionality includes the email notification management system, some cluster statistics reporting, and logging of connectivity status between nodes. The default is undefined.
  - external_sudo_manager: When this is configured, ThoughtSpot does not make any changes to the sudoers file, such as adding the administrator user. The user is then responsible for ensuring that the administrator user has the ability to run certain elevated privilege commands. The default is undefined.
  - skip_sshd_config: When this is configured, ThoughtSpot does not make any changes to the sshd configuration of the node. The user must ensure that the MaxSessions value for the administrator user is at least 10. The default is undefined.
  - offline: When this is set, the Ansible playbook proceeds with an offline installation.
  - skip_yum_update: When this is defined, the Ansible playbook does not attempt to run a blanket yum update to pull the latest packages. The default is undefined.
  - no_mail_packages: When this is defined, ThoughtSpot does not install the mail packages mutt and postfix. This only applies to online installations. The default is undefined.
  - skip_time_sync_setup: When this is defined, ThoughtSpot does not configure time synchronization between nodes using ntp. The user must configure time synchronization themselves, using either ntp or chronyd. The default is undefined.
  - ts_partition_name: The extended name of the ThoughtSpot export partition, such as /dev/sdb1.
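The check for unused user_uid and user_gid values can also be scripted. The following is a minimal sketch and not part of the ThoughtSpot tooling: the function names id_in_use and first_free_id are hypothetical, and the sketch assumes the standard passwd and group file formats, where the third colon-separated field holds the numeric ID. Unlike the command above, it also lets you check /etc/group for group IDs.

```shell
#!/bin/sh
# Hypothetical helpers (not ThoughtSpot tooling) for choosing user_uid and user_gid.
# Both passwd and group files keep the numeric ID in the third colon-separated field.

# id_in_use ID FILE: succeeds if ID appears as the third field of any line in FILE.
id_in_use() {
  awk -F: -v id="$1" '$3 == id { found = 1 } END { exit !found }' "$2"
}

# first_free_id START FILE: prints the lowest ID >= START that FILE does not use.
first_free_id() {
  candidate="$1"
  while id_in_use "$candidate" "$2"; do
    candidate=$((candidate + 1))
  done
  echo "$candidate"
}

# Example usage: 1001 (ssm_user) and 1002 (thoughtspot) are reserved, so start at 1003.
# first_free_id 1003 /etc/passwd   # candidate value for user_uid
# first_free_id 1003 /etc/group    # candidate value for user_gid
```

Starting the search at 1003 reflects the note above that 1001 and 1002 are unavailable when you use SSM; adjust the starting value to match your organization's ID allocation policy.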
Redirect the mirror repository
For the cluster hosts to connect to your organization’s mirror repository, you must redirect the hosts’ requests to the mirror repository through DNS.
Alternatively, you can manually update the repository URLs in the files in the yum.repos.d directory.
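If you choose the manual route, the URL change can be scripted. The sketch below rests on several assumptions: the repo files use baseurl= lines, the function name rewrite_repo_prefix is hypothetical, and the URLs in the usage comment are placeholders. Inspect your own .repo files first, because they may use mirrorlist= entries or different paths, and back them up before changing them.

```shell
#!/bin/sh
# Hypothetical helper (not ThoughtSpot tooling): point baseurl entries at a mirror.

# rewrite_repo_prefix OLD NEW FILE...
# On every line that starts with "baseurl=", replace the first match of OLD with NEW.
# OLD is treated as a basic regular expression; neither value may contain "|" or "&".
rewrite_repo_prefix() {
  old="$1"
  new="$2"
  shift 2
  for repo_file in "$@"; do
    tmp_file=$(mktemp) || return 1
    if ! sed "/^baseurl=/ s|$old|$new|" "$repo_file" > "$tmp_file"; then
      rm -f "$tmp_file"
      return 1
    fi
    # Write back through the original path so ownership and permissions are kept.
    cat "$tmp_file" > "$repo_file"
    rm -f "$tmp_file"
  done
}

# Example usage with placeholder URLs:
# rewrite_repo_prefix 'https://public.repo.example' 'https://mirror.example.com' yum.repos.d/*.repo
```

Restricting the substitution to baseurl= lines leaves section names, GPG settings, and other options in the repo files untouched.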
Run the Ansible Playbook
First, to allow installation of the Yum, Python, and R packages, you must run the run_offline script on your local machine or from the SSM console. Run the following command on all nodes:
run_offline.sh
Now you can run the Ansible Playbook from your local machine or from the SSM console by entering the following command. You must run this command on all nodes.
ansible-playbook -i hosts.yaml ts.yaml
As the Ansible Playbook runs, it performs these tasks:
- Trigger the installation of Yum, Python, and R packages.
- Configure the local user accounts that the ThoughtSpot application uses.
- Install the ThoughtSpot CLI.
- Configure all the nodes in the ThoughtSpot cluster:
  - Format and create export partitions, if they do not exist.
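Because the playbook must run after the package installation succeeds, it can help to chain the two commands and keep their output. The following is a minimal sketch, assuming a hypothetical wrapper function named run_logged and a hypothetical log path; it follows the same redirect pattern as the package installation script earlier in this article.

```shell
#!/bin/sh
# Hypothetical wrapper (not ThoughtSpot tooling): run a command and keep its output.

# run_logged LOG_FILE COMMAND [ARGS...]
# Appends a header line plus the command's stdout and stderr to LOG_FILE,
# and returns the command's own exit status.
run_logged() {
  log_file="$1"
  shift
  echo "== $* ==" >> "$log_file"
  "$@" >> "$log_file" 2>&1
}

# Example usage: the playbook runs only if run_offline.sh succeeds.
# /tmp/ansible_playbook_log is a placeholder path.
# run_logged /tmp/run_offline_log ./run_offline.sh &&
#   run_logged /tmp/ansible_playbook_log ansible-playbook -i hosts.yaml ts.yaml
```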
Prepare disks
After the Ansible Playbook finishes, run the prepare_disks script on every node. You must run this script with elevated privileges, either with sudo or as the root user. Specify the data drives by adding the full device path of each data drive, such as /dev/sdc, after the script name. Separate data drives with a space.
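The exact location of the prepare_disks script is not given here, so the following sketch treats it as an assumption: PREPARE_DISKS is a hypothetical variable that you set to the script's path on your nodes, and prepare_disks_cmd is a hypothetical helper. The helper only checks that each data drive is given as a full device path and prints the resulting command line so that you can review it before running it.

```shell
#!/bin/sh
# Hypothetical path: set PREPARE_DISKS to wherever the prepare_disks script lives on your nodes.
PREPARE_DISKS="${PREPARE_DISKS:-./prepare_disks.sh}"

# prepare_disks_cmd DEVICE...
# Prints the sudo command line, or fails if no drive is given
# or if an argument is not a full /dev/ path.
prepare_disks_cmd() {
  if [ "$#" -eq 0 ]; then
    echo "error: specify at least one data drive" >&2
    return 1
  fi
  for device in "$@"; do
    case "$device" in
      /dev/?*) ;;
      *)
        echo "error: '$device' is not a full device path" >&2
        return 1
        ;;
    esac
  done
  echo "sudo $PREPARE_DISKS $*"
}

# Example usage with two data drives:
# prepare_disks_cmd /dev/sdc /dev/sdd
```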
Install the ThoughtSpot cluster and the application
Refer to Install ThoughtSpot clusters in AWS for more detailed information on installing the ThoughtSpot cluster.
Follow these general steps to install ThoughtSpot on the prepared hosts:
- Connect to the host as an admin user.
- Download the release artifact from the ThoughtSpot file sharing system.
- Upload the release artifact to your organization’s mirror repository.
- Run the tscli cluster create command. This script prompts for user input.
- [Optional, 7.2.1 and later] Upgrade Python version. ThoughtSpot’s default Python version for Amazon Linux 2 is 3.7; you can upgrade it to 3.9. Refer to Upgrade your Python version.
- Check the cluster health by running health checks and logging into the application.