Amazon S3 connection reference
Learn about the fields used to create an Amazon S3 connection with ThoughtSpot DataFlow.
The following fields define an Amazon S3 connection in ThoughtSpot DataFlow. You need this information to establish a secure connection.
Connection properties
- Connection name
-
Name your connection. Mandatory field.
- Example:
-
AmazonS3Connection
- Connection type
-
Choose the Amazon S3 connection type. Mandatory field.
- Example:
-
Amazon S3
- Amazon S3 URL
-
Specify the Amazon S3 endpoint URL. Mandatory field.
- Example:
-
'https://s3.eu-central-1.amazonaws.com'
- Other S3-compatible object store
-
Enable this option to support non-AWS, S3-compatible storage. When selected, the Region field replaces the Authentication type field. Optional field.
- Region
-
Specifies the geographic region of the object store. The available regions may vary depending on the cloud platform. This field appears only when you select "Other S3-compatible object store." Mandatory field.
- Example:
-
US-West
- Bucket
-
Specify the bucket.
An Amazon S3 bucket name is globally unique, and the namespace is shared by all AWS accounts. Mandatory field.
- Example:
-
bucket
- Folder
-
Specify the storage folder within the bucket. Optional field.
- Authentication type
-
Specifies the type of security protocol to connect to the instance. Based on the type of security, select the authentication type and provide details. Mandatory field.
- Valid Values:
-
EC2 attached IAM role, Access key and secret key
- Access key
-
Specify the access key ID generated when creating AWS security credentials. Mandatory field.
- Example:
-
access key
- Other notes:
-
Displayed only when "Authentication type" is "Access key and Secret key".
- Secret key
-
Specify the secret access key generated when creating AWS security credentials. Mandatory field.
- Example:
-
ABCDEFGH245HIJK
- Other notes:
-
Displayed only when "Authentication type" is "Access key and Secret key".
Sync properties
For DataFlow S3 sync to work properly, it needs list permission on the entire bucket. For the objects themselves, 'get' permission is sufficient. An example IAM policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetBucketLocation"
      ],
      "Resource": "arn:aws:s3:::datalake"
    },
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject"
      ],
      "Resource": "arn:aws:s3:::datalake/david/*"
    }
  ]
}
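The example policy can also be generated for a different bucket and folder. The following Python sketch is only an illustration: the helper name `build_dataflow_policy` and its parameters are hypothetical, and the statements simply mirror the example policy (bucket-level list permissions, object-level permissions under one folder).

```python
import json


def build_dataflow_policy(bucket: str, folder: str) -> dict:
    """Return an IAM policy document shaped like the example policy.

    Hypothetical helper: `bucket` and `folder` are placeholders for your
    own bucket name and the folder that DataFlow reads from.
    """
    folder = folder.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-wide list permission
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object-level permissions, limited to one folder
                "Sid": "VisualEditor1",
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{folder}/*",
            },
        ],
    }


policy = build_dataflow_policy("datalake", "david")
print(json.dumps(policy, indent=2))
```

Note that the list permissions apply to the bucket ARN, while the object permissions apply to an object ARN pattern ending in `/*`.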
- File name
-
Specify the name of the file. Mandatory field.
- Example:
-
PRODUCT.csv
- Valid Values:
-
Any string literal
- Default:
-
File name for sync.
- Other notes:
-
To specify a wildcard pattern, use *.
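To preview which files a wildcard pattern might select, a local check like the following can help. This is a sketch that assumes `*` behaves like a shell-style glob (matching any run of characters, including none); the exact matching rules DataFlow applies are not stated here, and the file names are made up for illustration.

```python
from fnmatch import fnmatchcase


def matching_files(pattern, names):
    """Return the names that match a shell-style wildcard pattern.

    Assumption: `*` matches any sequence of characters, as in a glob.
    Matching is case-sensitive.
    """
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical object names in the configured folder
names = ["PRODUCT.csv", "PRODUCT_2023.csv", "ORDERS.csv", "PRODUCT.json"]
selected = matching_files("PRODUCT*.csv", names)
print(selected)
```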
- Enable archive on success
-
Specify whether DataFlow must archive the file after the operation succeeds. Optional field.
- Example:
-
No
- Valid Values:
-
Yes, No
- Default:
-
No
- Delete file on success
-
Specify whether DataFlow must delete the file after the operation succeeds. Optional field.
- Example:
-
No
- Valid Values:
-
Yes, No
- Default:
-
No
- Column delimiter
-
Specify the column delimiter character. Mandatory field.
- Example:
-
,
- Valid Values:
-
Any printable ASCII character, or the decimal value of an ASCII character
- Default:
-
1
- Skip header rows
-
Skip the specified number of header rows when loading data. Optional field.
- Example:
-
5
- Valid Values:
-
Any numeric value
- Default:
-
0
- Compression
-
Specify whether the file is compressed and, if so, the type of compression. Mandatory field.
- Example:
-
gzip
- Valid Values:
-
None, gzip
- Default:
-
None
- Row delimiter
-
Specify the character that marks the end of a row in the extracted data. Optional field.
- Example:
-
\n
- Valid Values:
-
Any printable ASCII character
- Default:
-
\n (newline character)
- Enclosing character
-
Specify if the text columns in the source data need to be enclosed in quotes. Optional field.
- Example:
-
Single
- Valid Values:
-
Single, Double, Empty
- Default:
-
None
- Escape character
-
Specify the escape character if using a text qualifier in the source data. Optional field.
- Example:
-
\\
- Valid Values:
-
Any ASCII character
- Default:
-
None
- Null value
-
Specifies the string literal that indicates a null value for a column. During the data load, any column value that matches this string is loaded as null in the target. Optional field.
- Example:
-
NULL
- Valid Values:
-
Any string literal
- Default:
-
NULL
- Date style
-
Specifies how to interpret the date format. Optional field.
- Example:
-
YMD
- Valid Values:
-
YMD, MDY, DMY, DMONY, MONDY, Y2MD, MDY2, DMY2, DMONY2, and MONDY2
- Default:
-
YMD
- Other notes:
-
MDY: 2-digit month, 2-digit day, 4-digit year
DMY: 2-digit day, 2-digit month, 4-digit year
DMONY: 2-digit day, 3-character month name, 4-digit year
MONDY: 3-character month name, 2-digit day, 4-digit year
Y2MD: 2-digit year, 2-digit month, 2-digit day
MDY2: 2-digit month, 2-digit day, 2-digit year
DMY2: 2-digit day, 2-digit month, 2-digit year
DMONY2: 2-digit day, 3-character month name, 2-digit year
MONDY2: 3-character month name, 2-digit day, 2-digit year
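One way to reason about a date style is to map it to a `strptime` format string and test it against sample values. The sketch below is an illustration, not DataFlow's implementation. It assumes that each style lists its components in the order its name suggests (for example, DMY is day, month, year), that YMD uses a 4-digit year, and that a 3-character month name corresponds to an abbreviated English name such as `Dec`. The function names are hypothetical.

```python
from datetime import date, datetime

# Assumed mapping from date style to strptime directives, in component order.
# %Y = 4-digit year, %y = 2-digit year, %m = month number,
# %b = abbreviated month name, %d = day of month.
_STYLE_PARTS = {
    "YMD": ("%Y", "%m", "%d"),
    "MDY": ("%m", "%d", "%Y"),
    "DMY": ("%d", "%m", "%Y"),
    "DMONY": ("%d", "%b", "%Y"),
    "MONDY": ("%b", "%d", "%Y"),
    "Y2MD": ("%y", "%m", "%d"),
    "MDY2": ("%m", "%d", "%y"),
    "DMY2": ("%d", "%m", "%y"),
    "DMONY2": ("%d", "%b", "%y"),
    "MONDY2": ("%b", "%d", "%y"),
}


def strptime_format(date_style, date_delimiter="-"):
    """Build a strptime format string from a date style and delimiter."""
    separator = date_delimiter.replace("%", "%%")  # keep a literal percent sign
    return separator.join(_STYLE_PARTS[date_style])


def parse_date(text, date_style="YMD", date_delimiter="-"):
    """Parse a date string according to the given style and delimiter."""
    fmt = strptime_format(date_style, date_delimiter)
    return datetime.strptime(text, fmt).date()


print(strptime_format("DMONY", "/"))
print(parse_date("25/12/2023", "DMY", "/"))
```

Running a few representative values from the source file through `parse_date` before configuring the sync can catch a mismatched style or delimiter early.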
- Date delimiter
-
Specifies the separator used in the date format. Optional field.
- Example:
-
-
- Valid Values:
-
Any printable ASCII character
- Default:
-
-
- Time style
-
Specifies the format of the time portion in the data. Optional field.
- Example:
-
24 Hour
- Valid Values:
-
12 Hour, 24 Hour
- Default:
-
24 Hour
- Time delimiter
-
Specifies the character used to separate the time components. Optional field.
- Example:
-
:
- Valid Values:
-
Any printable ASCII character
- Default:
-
:
- Skip trailer rows
-
Skip the specified number of trailer rows when loading data. Optional field.
- Example:
-
5
- Valid Values:
-
Any numeric value
- Default:
-
0
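To check how the column delimiter, enclosing character, null value, and header and trailer row settings interact on a sample file, a small local parser can be useful. The following sketch uses Python's `csv` module and is only an approximation of the behavior described above; the function `load_rows`, its parameter names, and the sample data are hypothetical.

```python
import csv
import io


def load_rows(text, column_delimiter=",", enclosing_character='"',
              null_value="NULL", skip_header_rows=0, skip_trailer_rows=0):
    """Parse delimited text and apply the skip and null-value settings."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=column_delimiter,
        quotechar=enclosing_character,
    )
    rows = list(reader)
    # Drop the leading header rows and the trailing trailer rows.
    end = max(len(rows) - skip_trailer_rows, skip_header_rows)
    body = rows[skip_header_rows:end]
    # Replace values equal to the null string with None.
    return [[None if value == null_value else value for value in row]
            for row in body]


sample = (
    "id|name|price\n"
    "1|'Widget, large'|9.99\n"
    "2|NULL|4.50\n"
    "TOTAL|2|\n"
)
rows = load_rows(sample, column_delimiter="|", enclosing_character="'",
                 skip_header_rows=1, skip_trailer_rows=1)
print(rows)
```

In this sample, one header row and one trailer row are skipped, the single-quoted name keeps its embedded comma, and the `NULL` string becomes a null value.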
- tsload options
-
Specifies the parameters passed with the tsload command, in addition to the commands already included by the application. The format for these parameters is: <param_1_name> = <param_1_value>
- Example:
-
date_time_format = %Y-%m-%d; date_format = %Y-%m-%d; time_format = %H:%M:%S
- Valid Values:
-
null_value = NULL max_ignored_rows = 0
- Default:
-
max_ignored_rows = 0
- Boolean representation
-
Specifies the representation of data in the boolean field. Optional field.
- Example:
-
true_false
- Valid Values:
-
true_false, T_F, 1_0, T_NULL
- Default:
-
true_false
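The boolean representation can be thought of as a pair of tokens, one for true and one for false. The sketch below illustrates that reading in Python. It assumes each value name is of the form `<true token>_<false token>`, and that `T_NULL` means `T` for true and an empty or missing value for false; neither assumption is confirmed by this reference, and `parse_boolean` is a hypothetical helper.

```python
# Assumed (true token, false token) pairs for each representation.
# For T_NULL, None stands for "no value" (empty or missing).
BOOLEAN_TOKENS = {
    "true_false": ("true", "false"),
    "T_F": ("T", "F"),
    "1_0": ("1", "0"),
    "T_NULL": ("T", None),
}


def parse_boolean(value, representation="true_false"):
    """Convert a source value to a bool under the given representation."""
    true_token, false_token = BOOLEAN_TOKENS[representation]
    if value == true_token:
        return True
    if false_token is None:
        # T_NULL: treat an empty or missing value as false (assumption).
        if value is None or value == "":
            return False
    elif value == false_token:
        return False
    raise ValueError(f"{value!r} is not valid for {representation}")


print(parse_boolean("T", "T_F"), parse_boolean("0", "1_0"))
```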