Sync data from ThoughtSpot to Snowflake

Sync to Snowflake from an Answer

To create a sync to Snowflake from an Answer, follow these steps:

  1. Select the desired Answer from the Answers tab or the ThoughtSpot homepage. To create a sync, you must have the Can manage sync permission and view access to the Answer.

  2. In the upper-right corner of the Answer, click the more options menu icon. From the dropdown menu, select Sync to other apps, then choose Snowflake.

  3. If this is the first sync you have created for Snowflake, a pop-up authorization window appears. To give ThoughtSpot permission to send data to your Snowflake account, select your account from the pop-up window.

  4. Within ThoughtSpot, fill in the following parameters:

    • Edit the Pipeline name if needed. By default, this field populates with PL-[Answer Name].

    • If you have more than one Snowflake destination set up, the Destination field appears; select a Snowflake destination from its dropdown menu. If you have only one Snowflake destination, or have not set up any destinations yet, the Destination field does not appear.

    • Select your Snowflake Object from the dropdown menu.

    • Select Operation from the dropdown menu. You can choose between Insert and Upsert.

    • Map the Source and Destination columns from the dropdown menus provided. Note that the Source column refers to the column in ThoughtSpot, while the Destination column refers to the column in Snowflake.

      If you select Upsert as the operation, the External ID option also appears, to the left of the Source and Destination columns. You can select this option only if the Destination column is unique (for example, ID). For the external ID column, ThoughtSpot looks up each source column value against the destination column values. When a match is found, the existing record in Snowflake is updated with the new source values; when no match is found, a new record is created and populated from the source data.

  5. By default, “Save and sync” is selected. Select Save to send your data to Snowflake. Your data immediately appears in Snowflake.

  6. [Optional] To set up a repeated sync, click Schedule your sync and select your timezone. From the options provided, choose whether the sync will occur every:

    • n minutes. You can choose to schedule a sync every 5, 10, 15, 20, 30, or 45 minutes.

    • n hours.

    • n days at a selected time. Note that you can choose not to send an update on weekends.

    • week at a selected time and day.

    • n months at a selected time and date.
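The difference between the Insert and Upsert operations described in step 4 can be illustrated with a minimal Python sketch. This is not ThoughtSpot's or Snowflake's implementation; the function `sync_rows` and its parameters are hypothetical, and the sketch assumes that source rows have already been renamed to their mapped destination column names.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def sync_rows(
    destination: List[Row],
    source: List[Row],
    operation: str,
    external_id: Optional[str] = None,
) -> List[Row]:
    """Return a new destination table after a simulated Insert or Upsert.

    Hypothetical helper for illustration only. `external_id` names the
    unique destination column used for matching in an Upsert.
    """
    result = [dict(row) for row in destination]  # copy; leave the input untouched
    if operation == "insert":
        # Insert: every source row becomes a new destination row.
        result.extend(dict(row) for row in source)
        return result
    if operation != "upsert" or external_id is None:
        raise ValueError("use 'insert', or 'upsert' with an external_id column")

    # Upsert: index existing rows by the unique external ID column.
    position = {row[external_id]: i for i, row in enumerate(result)}
    for row in source:
        key = row[external_id]
        if key in position:
            # Match found: update the existing record with the source values.
            result[position[key]].update(row)
        else:
            # No match: create a new record from the source data.
            position[key] = len(result)
            result.append(dict(row))
    return result


existing = [{"ID": 1, "NAME": "Acme"}, {"ID": 2, "NAME": "Globex"}]
incoming = [{"ID": 2, "NAME": "Globex Corp"}, {"ID": 3, "NAME": "Initech"}]
upserted = sync_rows(existing, incoming, "upsert", external_id="ID")
inserted = sync_rows(existing, incoming, "insert")
```

In this example, the Upsert leaves three rows (ID 2 is updated in place, ID 3 is added), while the Insert leaves four rows, including two with ID 2. This is why the External ID column needs to be unique in the destination.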

Any sync over 50,000 rows may result in an execution timeout. For optimal performance, keep each sync below 50,000 rows. If a sync with a large number of rows fails, apply filters, such as date filters, to make the dataset smaller, and then sync again.

Sync to Snowflake from a Custom SQL View

To sync to Snowflake from a custom SQL view, follow these steps:

  1. Navigate to your SQL view by selecting the Data tab and searching from the Data Workspace home page. Select the SQL view name.

  2. In the upper-right corner, click the more options menu icon and select Sync to Snowflake.

  3. If this is the first sync you have created for the selected app, an authorization page appears. To give ThoughtSpot permission to send data to your Snowflake account, click Sign in with Snowflake, select your account, and click Allow.

  4. Fill in the following parameters:

    • Edit the Pipeline name if needed. By default, this field populates with PL-[Answer Name].

    • If you have more than one Snowflake destination set up, the Destination field appears; select a Snowflake destination from its dropdown menu. If you have only one Snowflake destination, or have not set up any destinations yet, the Destination field does not appear.

    • Select Object from the dropdown menu.

    • Select Operation from the dropdown menu. You can choose between Insert and Upsert.

    • Map the Source and Destination columns from the dropdown menus provided.

      If you select Upsert as the operation, the External ID option also appears, to the left of the Source and Destination columns. You can select this option only if the Destination column is unique (for example, ID). For the external ID column, ThoughtSpot looks up each source column value against the destination column values. When a match is found, the existing record in Snowflake is updated with the new source values; when no match is found, a new record is created and populated from the source data.

  5. By default, “Sync and save” is selected. Select Save to send your data to Snowflake. Your data immediately appears in Snowflake.

  6. [Optional] To set up a repeated sync, click Schedule your sync and select your timezone. From the options provided, choose whether the sync will occur every:

    • n minutes. You can choose to schedule a sync every 5, 10, 15, 20, 30, or 45 minutes.

    • n hours.

    • n days at a selected time. Note that you can choose not to send an update on weekends.

    • week at a selected time and day.

    • n months at a selected time and date.

Any sync over 50,000 rows may result in an execution timeout. For optimal performance, keep each sync below 50,000 rows. If a sync with a large number of rows fails, apply filters, such as date filters, to make the dataset smaller, and then sync again.
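One way to choose a date filter that keeps a sync under the row limit is sketched below in Python. It is only an illustration of the reasoning, not a ThoughtSpot feature: the function `earliest_cutoff` is hypothetical, and it assumes you can obtain the date value of each row (or a per-day row count) before setting up the filter.

```python
from collections import Counter
from datetime import date
from typing import Iterable, Optional

ROW_LIMIT = 50_000  # threshold quoted in the note above


def earliest_cutoff(row_dates: Iterable[date], limit: int = ROW_LIMIT) -> Optional[date]:
    """Return the earliest date d such that the rows dated on or after d
    number fewer than `limit`.

    Returns None when there are no rows, or when the most recent day alone
    already has `limit` or more rows (a date filter is not enough then).
    """
    per_day = Counter(row_dates)
    cutoff = None
    total = 0
    # Walk from the newest day backwards, accumulating row counts.
    for day in sorted(per_day, reverse=True):
        if total + per_day[day] >= limit:
            break
        total += per_day[day]
        cutoff = day
    return cutoff


# Small illustrative data set: 3 + 2 + 2 rows across three days.
sample = [date(2024, 1, 1)] * 3 + [date(2024, 1, 2)] * 2 + [date(2024, 1, 3)] * 2
cutoff = earliest_cutoff(sample, limit=5)
```

With a limit of 5, the sketch returns 2024-01-02: filtering to rows on or after that date leaves four rows, whereas including 2024-01-01 would bring the total to seven.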

Failure to sync

A sync to Snowflake can fail for several reasons. If you experience a sync failure, consider the following causes:

  • The underlying ThoughtSpot object was deleted.

  • The underlying Snowflake object was deleted.

  • A column name was changed in either ThoughtSpot or Snowflake, so it no longer matches the column name set up in the mapping.

  • Data validation rules in Snowflake allow only certain data types in the destination fields, and the ThoughtSpot columns mapped to those fields do not have the same or an allowable data type.

  • A mandatory field in Snowflake was not mapped as a destination column when the mapping was set up in ThoughtSpot.
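The mapping-related causes in the list above can be expressed as a simple pre-sync check. The following Python sketch is hypothetical: `DestColumn` and `check_mapping` are not part of ThoughtSpot or Snowflake, the column metadata is supplied by hand, and data types are compared by exact name, which is stricter than the conversions Snowflake may actually allow.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DestColumn:
    """Hypothetical description of one destination (Snowflake) column."""
    data_type: str
    required: bool = False


def check_mapping(
    source: Dict[str, str],              # source column name -> data type
    destination: Dict[str, DestColumn],  # destination column name -> metadata
    mapping: Dict[str, str],             # source column name -> destination column name
) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems: List[str] = []
    for src, dst in mapping.items():
        src_ok = src in source
        dst_ok = dst in destination
        if not src_ok:
            problems.append(f"source column '{src}' not found (renamed or deleted?)")
        if not dst_ok:
            problems.append(f"destination column '{dst}' not found (renamed or deleted?)")
        # Simplification: require identical type names.
        if src_ok and dst_ok and source[src] != destination[dst].data_type:
            problems.append(
                f"type mismatch: '{src}' is {source[src]}, "
                f"'{dst}' expects {destination[dst].data_type}"
            )
    mapped = set(mapping.values())
    for name, column in destination.items():
        if column.required and name not in mapped:
            problems.append(f"mandatory destination column '{name}' is not mapped")
    return problems


source_cols = {"Customer ID": "NUMBER", "Revenue": "VARCHAR"}
dest_cols = {
    "ID": DestColumn("NUMBER", required=True),
    "REVENUE": DestColumn("NUMBER"),
    "REGION": DestColumn("VARCHAR", required=True),
}
issues = check_mapping(
    source_cols,
    dest_cols,
    {"Customer ID": "ID", "Revenue": "REVENUE", "Old Name": "NAME"},
)
```

For this sample mapping, the check reports four problems: a type mismatch on Revenue, a missing source column, a missing destination column, and the unmapped mandatory column REGION.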

Manage pipelines

While you can also manage a pipeline from the Pipelines tab in the Data Workspace, accessing the Manage pipelines option from an Answer or view displays all pipelines local to that specific data object. To manage a pipeline from an Answer or view, follow these steps:

  1. Click the more options menu icon and select Manage pipelines.

  2. Scroll to the name of your pipeline in the list that appears. Next to the pipeline name, select the more options menu icon. From the list that appears, select:

    • Edit to edit the pipeline’s properties. For example, for a pipeline to Google Sheets, you can edit the pipeline name, file name, sheet name, or cell number. Note that you cannot edit the source or destination of a pipeline.

    • Delete to permanently delete the pipeline.

    • Sync now to sync your Answer or view to the designated destination.

    • View run history to see the pipeline’s Activity log in the Data Workspace.
