tsload connector reference

The tsload connector APIs enable you to load data into ThoughtSpot.

The tsload connector supports the following APIs:

Login

Use this API to authenticate and log in a user. Login establishes a session with the ThoughtSpot ETL HTTP server. The authentication requires a username and password.

Request Parameters

username

ThoughtSpot username

Data type

string

password

ThoughtSpot password

Data type

string

Request

POST /ts_dataservice/v1/public/session HTTP/1.1
Host: client.mydomain.com
Accept: application/json
Content-type: application/json

{
"username":"<thoughtspot user name>",
"password":"<thoughtspot password>"
}

Response

Status: 200 OK
Set-cookie: token

Example failure responses

Status: 401 UNAUTHORIZED
Unable to verify user. Please login again.
Status: 500 INTERNAL SERVER ERROR
error code = INTERNAL, message = Couldn't resolve the authentication service.

Subsequent calls must pass the cookie returned by a successful login. Requests that do not pass the cookie fail.
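
As a rough illustration of the login call described above, the following Python sketch builds the request without sending it and extracts the cookie pair from a Set-Cookie header. The base URL, the helper names, and the sample cookie are assumptions made for the example; only the path, headers, and body fields come from this reference.

```python
import json
import urllib.request

LOGIN_PATH = "/ts_dataservice/v1/public/session"  # path from the Login request above


def build_login_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the login POST request. Hypothetical helper."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + LOGIN_PATH,
        data=body,
        method="POST",
        headers={"Accept": "application/json", "Content-Type": "application/json"},
    )


def session_cookie(set_cookie_header: str) -> str:
    """Return the name=value pair of a Set-Cookie header, dropping its attributes."""
    return set_cookie_header.split(";", 1)[0].strip()
```

To send the request, pass it to urllib.request.urlopen and read the Set-Cookie header from the response. The resulting pair is what later calls would send in their Cookie header.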

StartLoad

After login, use this API to start the data load operation. The endpoint for this call is /ts_dataservice/v1/public/loads. If the load is initiated successfully, the response returns the cycle ID and the load balancer IP. After this call completes, use Load to start the actual data load.

Request Parameters

Target

Specification of the target. This DB/Schema/Table must exist on the destination ThoughtSpot system.

database

Database in ThoughtSpot

Data type

string

schema

(optional) Schema in ThoughtSpot

Data type

string

Default value

falcon_default_schema

table

Table in ThoughtSpot

Data type

string

Format

Format specifiers for parsing the input data.

type

(optional) Input format: either csv, delimited, or parquet.

Data type

string

Default

csv

field_separator

(optional) Field separator character in source data.

Data type

string

Default

"," (comma)

trailing_field_separator

(optional) True if input data has trailing field separator, false otherwise.

Data type

boolean

Default

false

enclosing_character

(optional) The enclosing character in csv source format. This option applies only to csv format.

Data type

string

Default

"\"" (double quote)

escape_character

(optional) Escape character in source data. This applies only to delimited data format. This option is ignored for other data sources.

Data type

string

Default

"" (null)

null_value

(optional) String that represents null values in the source data.

Data type

string

Default

"" (null)

has_header_row

(optional) True if input data file has header row, false otherwise.

Data type

boolean

Default

false

flexible

(optional) Whether to load input data that does not exactly match the target schema.

When true, the system attempts the load as follows:

  1. If there are extra columns in the input file, the system discards them.

  2. If there are fewer columns in the input file, the system fills the missing columns using null values.

When false, the load proceeds only if the input data file exactly matches the target schema.

Data type

boolean

Default value

false
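
The two flexible rules above can be pictured as a per-row trim-or-pad step. The following Python sketch only illustrates that described behavior; it is not ThoughtSpot's implementation, and the function name fit_row is hypothetical.

```python
from typing import List, Optional


def fit_row(row: List[str], target_columns: int) -> List[Optional[str]]:
    """Trim or pad one input row to the target table's column count."""
    if len(row) >= target_columns:
        # Rule 1: extra input columns are discarded.
        return list(row[:target_columns])
    # Rule 2: missing columns are filled with nulls (None here).
    return list(row) + [None] * (target_columns - len(row))


print(fit_row(["a", "b", "c"], 2))  # ['a', 'b']
print(fit_row(["a"], 3))            # ['a', None, None]
```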

date_time

converted_to_epoch

(optional) Whether date or datetime fields are already converted to epoch in source CSV. This option is ignored for other source types.

Data type

boolean

Default

true

date_time_format

(optional) Format string for datetime values. Default is yearmonthday hour:minute:second. System accepts datetime format specifications supported in strptime datetime library.

Data type

string

Default

"%Y%m%d %H:%M:%S" (yearmonthday hour:minute:second)

Example

December 30, 2001 1:15:12 is 20011230 01:15:12

date_format

(optional) Format string for date values. System accepts date format specifications supported in strptime datetime library.

Data type

string

Default

"%Y%m%d" (yearmonthday)

Example

December 30, 2001 is 20011230

time_format

(optional) Format string for time values. Default is hour:minute:second. System accepts time format specifications supported in strptime datetime library.

Data type

string

Default

"%H:%M:%S" (hour:minute:second)

Example

1:15:12 is 01:15:12
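
To check a format string against sample data before loading, one option is Python's datetime.strptime, which uses strptime-style directives. This sketch parses the three example values above with the default formats. Python's parser may differ in detail from the strptime library that the server uses.

```python
from datetime import datetime

# Default format strings from this reference.
DATE_TIME_FORMAT = "%Y%m%d %H:%M:%S"
DATE_FORMAT = "%Y%m%d"
TIME_FORMAT = "%H:%M:%S"

parsed_datetime = datetime.strptime("20011230 01:15:12", DATE_TIME_FORMAT)
parsed_date = datetime.strptime("20011230", DATE_FORMAT).date()
parsed_time = datetime.strptime("01:15:12", TIME_FORMAT).time()

print(parsed_datetime.isoformat())  # 2001-12-30T01:15:12
print(parsed_date.isoformat())      # 2001-12-30
print(parsed_time.isoformat())      # 01:15:12
```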

skip_second_fraction

(optional) When true, skip fractional part of seconds: milliseconds, microseconds, or nanoseconds from either datetime or time values if that level of granularity is present in the source data.

This option is ignored for other source types.

Skipping fractional components from input data can have a negative impact when upserting data: rows whose time or datetime values differ only in their fractional part become non-unique and can incorrectly replace valid rows.

Data type

boolean

Default

false
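
The upsert risk described for skip_second_fraction can be reproduced in a few lines. This Python sketch uses made-up rows and assumes the timestamp acts as the key of the target table. A plain dictionary stands in for the upsert.

```python
from datetime import datetime

# Hypothetical source rows that differ only in fractional seconds.
rows = [
    ("2001-12-30 01:15:12.100", "first reading"),
    ("2001-12-30 01:15:12.900", "second reading"),
]

table = {}
for raw_timestamp, value in rows:
    parsed = datetime.strptime(raw_timestamp, "%Y-%m-%d %H:%M:%S.%f")
    key = parsed.replace(microsecond=0)  # drop the fractional seconds
    table[key] = value                   # upsert: an existing key is overwritten

print(len(table))  # 1
```

Both rows collapse to the same key, so the second row replaces the first even though the source values were distinct.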

boolean

use_bit_values

(optional) If true, the source csv uses one bit for boolean values.

If false, boolean values are interpreted using the flag boolean_representation.

This option is valid for csv only, and is ignored for other types.

When true, the bit values are interpreted as follows:

  • False is represented as 0x0.

  • True is represented as 0x1.

Data type

boolean

Default

false

true_format

(optional) Represents True for boolean values in input.

Data type

string

Default

T

false_format

(optional) Represents False for boolean values in input.

Data type

string

Default

F

load_options

empty_target

(optional) If true, current rows in the target table or file are dropped before loading new data. If false, new rows are appended to the current rows in the target table or file.

Data type

boolean

Default

false

max_ignored_rows

(optional) Maximum number of rows that can be ignored for a successful load. If the number of ignored rows exceeds this limit, the load is aborted.

Data type

integer

Default

0

advanced_options

max_reported_parsing_errors

(optional) Maximum number of parsing errors to report back along with the status.

Data type

integer

Default

100

Example: using parameters

{
  "target": {
    "database": "<DB_NAME>",
    "schema": "falcon_default_schema",
    "table": "<TABLE_NAME>"
  },
  "format": {
    "type": "CSV",
    "field_separator": ",",
    "trailing_field_separator": false,
    "enclosing_character": "\"",
    "escape_character": "",
    "null_value": "(null)",
    "date_time": {
      "converted_to_epoch": false,
      "date_time_format": "%Y%m%d %H:%M:%S",
      "date_format": "%Y%m%d",
      "time_format": "%H:%M:%S",
      "skip_second_fraction": false
    },
    "boolean": {
      "use_bit_values": false,
      "true_format": "T",
      "false_format": "F"
    },
    "has_header_row": false,
    "flexible": false
  },
  "load_options": {
    "empty_target": false,
    "max_ignored_rows": 0
  },
  "advanced_options": {
    "max_reported_parsing_errors": 100
  }
}
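
The parameter structure above can also be assembled programmatically. The following Python sketch is one possible way to do so; the helper name build_load_params and its defaults are assumptions. Note that the curl request in this section uses flat keys such as target_database, and this reference does not settle which of the two layouts the server expects.

```python
import json


def build_load_params(database: str, table: str,
                      schema: str = "falcon_default_schema",
                      **format_options) -> dict:
    """Assemble a nested parameter dictionary mirroring the example above."""
    params = {
        "target": {"database": database, "schema": schema, "table": table},
        "format": {"type": "CSV", "field_separator": ",", "has_header_row": False},
        "load_options": {"empty_target": False, "max_ignored_rows": 0},
    }
    # Caller-supplied format options override the values set above.
    params["format"].update(format_options)
    return params


payload = json.dumps(build_load_params("<DB_NAME>", "<TABLE_NAME>", has_header_row=True))
```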

Request

curl -i -X POST -b 'JSESSIONID=<GUID-XYZ>' -d '{"target_database": "<DB1>", "target_schema": "<SCHEMA1>", "target_table": "<TABLE1>", "field_separator": ",", "empty_target": false}' https://<TS_CLUSTER>:8442/ts_dataservice/v1/public/loads

Response

Status: 202 Accepted
Content-Type: text/plain
Content-Length: xx
{
  "node_address": {
    "host": "host",
    "port": port
  },
  "cycle_id": "cycle_id"
}
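
The Load call needs the cycle ID from this response. A minimal Python sketch for reading the fields shown above follows; the LoadSession type and the sample values are hypothetical, and the sketch assumes the body is valid JSON with a numeric port.

```python
import json
from typing import NamedTuple


class LoadSession(NamedTuple):
    host: str
    port: int
    cycle_id: str


def parse_start_load_response(body: str) -> LoadSession:
    """Extract node address and cycle ID from a StartLoad response body."""
    data = json.loads(body)
    node = data["node_address"]
    return LoadSession(node["host"], int(node["port"]), data["cycle_id"])


sample = '{"node_address": {"host": "10.0.0.5", "port": 8442}, "cycle_id": "abc-123"}'
session = parse_start_load_response(sample)
print(session.cycle_id)  # abc-123
```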

Example failure responses

Status: 401 UNAUTHORIZED
Unable to verify user. Please login again.
Status: 403 FORBIDDEN
User does not have required privileges. Please contact your administrator.
Status: 500 INTERNAL SERVER ERROR
error code = INTERNAL, message = Couldn't resolve the authentication service.

Load

Use this API to load your data.

You can load data in multiple chunks for the same cycle ID. Uploaded data is held on the ThoughtSpot cluster and is not committed until you issue a commit load.

Request

POST /ts_dataservice/v1/public/loads/<cycle_id>
Cookie: <token>
Content-Type: multipart/form-data; boundary=bndry

--bndry
Content-Disposition: form-data; name="file"; filename="sample.csv"

<CSV Data>
--bndry--

Only multipart/form-data is supported.
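
For clients without a multipart helper, the request body shown above can be assembled by hand. This Python sketch encodes a single file part; the function name is hypothetical, and it assumes the boundary string does not occur inside the data.

```python
from typing import Tuple


def encode_multipart_file(filename: str, data: bytes,
                          boundary: str = "bndry") -> Tuple[str, bytes]:
    """Return (Content-Type header value, request body) for one file part."""
    lines = [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"",                       # blank line separates part headers from data
        data,
        f"--{boundary}--".encode(),
        b"",                       # body ends with a trailing CRLF
    ]
    return f"multipart/form-data; boundary={boundary}", b"\r\n".join(lines)


content_type, body = encode_multipart_file("sample.csv", b"1,a\n2,b\n")
```

Because chunks can be loaded separately for the same cycle ID, each chunk would be encoded and posted as its own request.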

Response

Status: 202 Accepted
Content-Type: text/plain
Content-Length: xx
Connection: Close
Upload Complete.

Example failure responses

Status: 401 UNAUTHORIZED
Unable to verify user. Please login again.
Status: 403 FORBIDDEN
User does not have required privileges. Please contact your administrator.
Status: 400 BAD REQUEST
Unable to find table in Falcon. Cannot load data.
Status: 400 BAD REQUEST
Cycle_id=[cycle_id] does not exist.
Status: 400 BAD REQUEST
Cannot not connect to falcon_manager.
Status: 500 INTERNAL SERVER ERROR
error code = INTERNAL, message = Couldn't resolve the authentication service.

CommitLoad

After the data load completes, use the CommitLoad API to commit the loaded data to the Falcon database.

Request

POST /ts_dataservice/v1/public/loads/<cycle_id>/commit
Cookie: <token>

Response

Status: 202 Accepted
Content-Type: text/plain
Content-Length: xx
Commit load cycle request made.

Example failure responses

Status: 401 UNAUTHORIZED
Unable to verify user. Please login again.
Status: 403 FORBIDDEN
User does not have required privileges. Please contact your administrator.
Status: 500 INTERNAL SERVER ERROR
Commit load cycle failed. Error ending load. Unknown cycle_id 'cycle_id'

AbortLoad

Use this API to stop loading data.

Request

POST /ts_dataservice/v1/public/loads/<cycle_id>/cancel
Cookie: <token>

Response

Status: 200 OK
Content-Type: text/plain
Content-Length: xx

Example failure responses

Status: 401 UNAUTHORIZED
Unable to verify user. Please login again.
Status: 403 FORBIDDEN
User does not have required privileges. Please contact your administrator.
Status: 500 INTERNAL SERVER ERROR
error code = INTERNAL, message = Couldn't resolve the authentication service.

Status of load

Use this API to get the current status of a load.

Request

GET /ts_dataservice/v1/public/loads/<cycle_id>
Cookie: <token>

Response

Status: 200 OK
Content-Type: text/plain
Content-Length: xx

Example failure responses

Status: 401 UNAUTHORIZED
Unable to verify user. Please login again.
Status: 403 FORBIDDEN
User does not have required privileges. Please contact your administrator.
Status: 500 INTERNAL SERVER ERROR
error code = INTERNAL, message = Couldn't resolve the authentication service.

Data load status check logic

The following pseudocode shows how to poll until the data load is complete:

while (true) {
  if (status != OK) {
    // Print status.message() as the error and stop polling.
    break;
  } else if (internal_stage == DONE) {
    // Data load is successful; stop polling.
    break;
  } else {
    // Poll again for the data load status.
  }
}
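
The polling logic above could look like this in Python. This is a sketch under stated assumptions: the layout of the status response body is not shown in this reference, so the dictionary keys status, internal_stage, and message simply mirror the names used in the pseudocode, and fetch_status is a caller-supplied function that performs the status request.

```python
import time
from typing import Callable, Dict


def wait_for_load(fetch_status: Callable[[], Dict[str, str]],
                  interval_seconds: float = 5.0,
                  max_polls: int = 120) -> None:
    """Poll until the load is done, fails, or the polling budget runs out."""
    for _ in range(max_polls):
        report = fetch_status()
        if report["status"] != "OK":
            # The load failed: surface the reported message.
            raise RuntimeError(report.get("message", "data load failed"))
        if report["internal_stage"] == "DONE":
            return  # data load is successful
        time.sleep(interval_seconds)  # not finished yet: poll again
    raise TimeoutError("data load did not finish within the polling budget")
```

Unlike the bare loop, this version caps the number of polls so that a stalled load does not block the caller indefinitely.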

Bad records

Use this API to view the bad records file data.

Request

GET /ts_dataservice/v1/public/loads/<cycle_id>/bad_records_file
Cookie: <token>
Content-range: xxx-xxxx
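
The per-cycle calls in this reference (Load, CommitLoad, AbortLoad, status of load, and bad records) share one base path and differ only in HTTP method and path suffix. The following Python sketch captures that mapping in one hypothetical helper. The base URL, the cookie value, and the call names used as dictionary keys are assumptions for illustration.

```python
import urllib.request
from typing import Optional

LOADS_PATH = "/ts_dataservice/v1/public/loads"

# HTTP method and path suffix for each per-cycle call listed in this reference.
CYCLE_CALLS = {
    "load": ("POST", ""),
    "commit": ("POST", "/commit"),
    "cancel": ("POST", "/cancel"),
    "status": ("GET", ""),
    "bad_records": ("GET", "/bad_records_file"),
}


def build_cycle_request(base_url: str, cycle_id: str, call: str, cookie: str,
                        data: Optional[bytes] = None) -> urllib.request.Request:
    """Build (but do not send) a request for one of the per-cycle calls."""
    method, suffix = CYCLE_CALLS[call]
    url = f"{base_url.rstrip('/')}{LOADS_PATH}/{cycle_id}{suffix}"
    return urllib.request.Request(url, data=data, method=method,
                                  headers={"Cookie": cookie})
```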

Response

Status: 200 OK
Content-Type: text/plain
Content-Length: xx
Bad Records file data

Example failure responses

Status: 401 UNAUTHORIZED
Unable to verify user. Please login again.
Status: 403 FORBIDDEN
User does not have required privileges. Please contact your administrator.
Status: 500 INTERNAL SERVER ERROR
Node does not exist: /tmp/cycle_id.bad_record
Status: 500 INTERNAL SERVER ERROR
error code = INTERNAL, message = Couldn't resolve the authentication service.