Configure disaster recovery

Use this procedure to set up a disaster recovery configuration with a primary and a mirror instance.

Disaster recovery setup configures periodic backups from the primary to a shared, mirrored storage volume. If the primary cluster fails, a secondary cluster can take over its operations after a small manual intervention.

Should the production cluster be destroyed, monitoring and alerting notifies the administrator. The administrator can then make the secondary appliance into the new primary by starting it and recovering from the backups generated by the primary.

This system makes it possible for you to restore the last backed-up state from the primary to the secondary server. If you configure daily backups, any metadata or data loaded or created after the last backup is not included in the restore.

Prerequisites

Both the primary and secondary appliances must use a shared storage volume, and both must have an active, running ThoughtSpot cluster. You can use an NFS or Samba volume for your share. If you choose NFS, keep in mind that a volume that is too slow can break backups or significantly slow restore performance. The following are good guidelines for choosing storage:

  • Provision a dedicated storage volume for periodic backups.

  • Do not use the backup volume for loading data or any other purposes. If backups fill up this storage, other components will suffer.

  • To ensure better supportability and continuity in case local hard disks go bad, the shared storage volume should be network based.
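
Before committing to a volume, you can sanity-check its free space against the sizing guidance in the next section. A minimal sketch, assuming the volume is already mounted at a hypothetical /mnt/dr_backups:

$ df -h /mnt/dr_backups    # Avail should comfortably exceed the backup sizing below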

ThoughtSpot supports shared storage by mounting NFS or CIFS/Samba based volumes. Before you begin, make sure you know whether the shared volume is an NFS or a Samba volume. To find out, use the telnet command.

Telnet confirms NFS
$ telnet <IP_address> 2049
Trying <IP_address>...
Connected to <IP_address>.
Escape character is '^]'.
Telnet confirms Samba
$ telnet <IP_address> 445
Trying <IP_address>...
Connected to <IP_address>.
Escape character is '^]'.
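
If telnet is not installed on the appliance, a port check with netcat answers the same question (assuming nc is available):

$ nc -zv <IP_address> 2049    # succeeds if the server exposes NFS
$ nc -zv <IP_address> 445     # succeeds if the server exposes CIFS/Samba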

Configure and test your shared volume

Your shared volume should have a minimum of 15GB free; a full backup requires at least 20GB. To configure and mount the shared volume on the primary and mirror appliances, complete the following steps:

  1. SSH into the primary appliance.

  2. Ensure that the primary appliance has a ThoughtSpot cluster up and running.

    The primary appliance contains the cluster you are protecting with the recovery plan.

  3. Create a directory to act as your mount_point.

    sudo mkdir <mount_point>
  4. Set the directory owner to admin.

    sudo chown -R admin:admin <mount_point>
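
    To confirm the change took effect before mounting anything, list the directory:

    ls -ld <mount_point>    # owner and group should both show as admin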
  5. Use the tscli nas subcommand to create a NAS mount on all of the cluster nodes. Run tscli nas mount-nfs or tscli nas mount-cifs.

    Use the command-line help (tscli nas -h) or the documentation to view all the nas subcommand options. Below are some samples to help you:

    Samba share
    tscli nas mount-cifs
      --server <IP_address>
      --path_on_server /bigstore_share
      --mount_point /mnt
      --username <your_admin_username>
      --password <your_password>
      --uid 1001
      --gid 1001
    Samba share with Windows AD authentication
    tscli nas mount-cifs
      --server <IP_address>
      --path_on_server /elc
      --mount_point /home/admin/etl/external_datadir
      --username <your_admin_username>
      --password <your_password>
      --uid 1001
      --gid 1001
    NFS
    tscli nas mount-nfs
      --server <IP_address>
      --path_on_server /data/toolchain
      --mount_point /mnt
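
    After any of these commands succeeds, you can confirm the share is actually mounted. A quick check (substitute your own mount point):

    mount | grep <mount_point>    # the share should be listed with its nfs or cifs type
    df -h <mount_point>           # free space should meet the sizing guidance above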
  6. Log in to the target machine (the secondary appliance).

  7. Ensure that the target machine is running a ThoughtSpot cluster. Note that the clusters on the primary and target machines do not need to be on the same ThoughtSpot version.

    If a cluster is not running on the target machine, contact ThoughtSpot Support to create a cluster.

  8. Add the NAS mount entry to the /etc/fstab file. For example:

    $ cat /etc/fstab
    ...
    ...
    ...
    <IP_address>:/elc /home/admin/etl/external_datadir nfs rw,noexec,nofail 0 0
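
    If your share is Samba/CIFS rather than NFS, the equivalent fstab entry uses the cifs type and passes credentials and ownership as mount options. A sketch with placeholder values matching the earlier samples:

    //<IP_address>/elc /home/admin/etl/external_datadir cifs rw,noexec,nofail,username=<your_admin_username>,password=<your_password>,uid=1001,gid=1001 0 0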
  9. Run the sudo mount -a command on each node to create the NAS mount on all nodes.

    $ sudo mount -a

    You must create the NAS mount under /export. You cannot create it under the root partition.

    The mount point and the parameters should be identical for both the primary and secondary machines.

  10. Test the configuration by creating a file as the admin user.

    touch <mount_point>/testfile
  11. Return to the primary server and make sure you can modify the file; running touch again updates its timestamp, which confirms write access.

    touch <mount_point>/testfile
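
As a stronger end-to-end test, write content on one appliance and read it back from the other. A minimal sketch, run as the admin user (testfile is the same test file from the steps above):

# On the secondary appliance: write a line to the shared volume
echo "dr test" > <mount_point>/testfile

# On the primary appliance: the same line should come back
cat <mount_point>/testfile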

Configure the backup and start the mirror

  1. If you haven’t already done so, SSH into the primary server.

  2. Use the tscli backup-policy create command.

    The command opens a vi editor for you to configure the backup policy. Make sure your policy points to the NAS mount in the primary appliance.

    When choosing times and frequencies for periodic backups, you should choose a reasonable frequency. Do not schedule backups too close together, since a backup cannot start when another backup is still running. Avoid backing up when the system is experiencing a heavy load, such as peak usage or a large data load.

    If you are unfamiliar with the policy format, see Configure periodic backups.

  3. Write and save the file to store your configuration.

    By default, newly created policies are automatically enabled.

  4. Verify the policy using the tscli backup-policy show <name> command.

    Use the <name> from the policy you created in the previous step.

  5. SSH into the secondary recovery appliance.

  6. Use the tscli dr-mirror subcommand to start the mirror cluster.

    tscli dr-mirror start <mount point> <comma separated ip addresses of secondary cluster> <cluster name> <cluster id>
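
    For example, with hypothetical values (backups on /mnt/dr_backups, three secondary nodes, and a cluster named mycluster with ID 1234):

    tscli dr-mirror start /mnt/dr_backups 192.0.2.11,192.0.2.12,192.0.2.13 mycluster 1234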
  7. Verify that the cluster has started running in mirror mode.

    tscli dr-mirror status

It may take some time for the cluster to begin acting as a mirror.

Recovery operations

If the primary cluster fails, the secondary cluster can take over its operations after a small manual intervention. The manual procedure makes the secondary instance into the primary.

We recommend that you engage with ThoughtSpot Support to help you with this task.

  1. Contact ThoughtSpot customer support.

  2. If the primary ThoughtSpot cluster is still running, stop it and disconnect it from the network.

  3. SSH into the secondary cluster.

  4. Stop the mirror cluster.

    tscli dr-mirror stop
  5. Verify the mirror has stopped.

    tscli dr-mirror status
  6. Start the new primary cluster.

    tscli cluster start
  7. Deploy a new mirror.

  8. Set up a backup policy on your new primary cluster.
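
Taken together, the failover commands you run on the secondary are the three from the steps above, in order:

tscli dr-mirror stop      # take the cluster out of mirror mode
tscli dr-mirror status    # confirm the mirror has stopped
tscli cluster start       # bring the cluster up as the new primary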