Add a Hive database connection
You can add a connection to a Hive database using ThoughtSpot DataFlow.
Follow these steps:
-
Select Connections in the top navigation bar.
-
In the Connections interface, select Add connection in the upper-right corner.
-
In the Create Connection interface, select the Connection type.
-
After you select the Hive Connection type, the rest of the connection properties appear.
See Connection properties for details, defaults, and examples.
- Connection name
-
Name your connection.
- Connection type
-
Choose the Hive connection type.
- HiveServer2 HA configured
-
Select this option if you use HiveServer2 High Availability (HA).
- HiveServer2 zookeeper namespace
-
Specify the ZooKeeper namespace. The default value is hiveserver2. Applies only when using HiveServer2 HA.
- Host
-
Specify the hostname or the IP address of the Hadoop system. Applies only when not using HiveServer2 HA.
- Port
-
Specify the port. Applies only when not using HiveServer2 HA.
- Hive security authentication
-
Specify the type of security protocol used to connect to the instance. Based on the security type, select the authentication type and provide the required details.
- User
-
Specify the user to connect to Hive. This user must have data access privileges.
- Password
-
Specify the password.
- Trust store
-
Specify the trust store name for authentication. For SSL and Kerberos authentication only.
- Trust store password
-
Specify the password for the trust store. For SSL and Kerberos authentication only.
- Hive transport mode
-
Specify the network protocol used for communication between Hive nodes. Applies only to the Hive process engine.
- HTTP path
-
Specify the HTTP path. Applies only when the HTTP transport mode is selected.
- Hadoop distribution
-
Provide the Hadoop distribution of the connection.
- Distribution version
-
Provide the version of the Hadoop distribution.
- Hadoop conf path
-
By default, the system picks the Hadoop configuration files from the HDFS. To override, specify an alternate location. Applies only when using configuration settings that are different from global Hadoop instance settings.
- DFS HA configured
-
Specify whether High Availability is used for DFS.
- DFS name service
-
Specify the logical name of the HDFS nameservice.
- DFS name node IDs
-
Specify a comma-separated list of NameNode IDs. The system uses this property to determine all NameNodes in the cluster. The XML property name is dfs.ha.namenodes.dfs.nameservices, where dfs.nameservices stands for the name service value.
- RPC address for namenode1
-
Specify the fully-qualified RPC address of the first listed NameNode. Defined as dfs.namenode.rpc-address.dfs.nameservices.name node ID 1, where dfs.nameservices and name node ID 1 stand for the name service and the first NameNode ID. For DFS HA and Hadoop Extract only.
- RPC address for namenode2
-
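Specify the fully-qualified RPC address of the second listed NameNode. Defined as dfs.namenode.rpc-address.dfs.nameservices.name node ID 2, where dfs.nameservices and name node ID 2 stand for the name service and the second NameNode ID.
- DFS host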
-
Specify the DFS hostname or the IP address.
- DFS port
-
Specify the associated DFS port.
- Default DFS location
-
Specify the default source/target location.
- Temp DFS location
-
Specify the location for creating the temporary directory.
- DFS security authentication
-
Select the type of security being enabled.
- Hadoop RPC protection
-
Hadoop cluster administrators control the quality of protection using the configuration parameter hadoop.rpc.protection.
- Hive principal
-
Specify the principal for authenticating Hive services.
- User principal
-
Specify the user principal associated with the keytab (configured while enabling Kerberos). To authenticate with a keytab, you need both the keytab file generated by the Kerberos administrator and this user principal.
- User keytab
-
Specify the keytab file generated by the Kerberos administrator. Authenticating with a keytab also requires the user principal associated with that keytab (configured while enabling Kerberos).
- KDC host
-
Specify the KDC host name. The KDC (Kerberos Key Distribution Center) is a service that runs on a domain controller server role. Configured in the Kerberos configuration file, /etc/krb5.conf.
- Default realm
-
Specify the default realm. A Kerberos realm is the domain over which a Kerberos authentication server has the authority to authenticate a user, host, or service. Configured in the Kerberos configuration file, /etc/krb5.conf.
- Queue name
-
Specify the queue name, as listed in the comma-separated value of yarn.scheduler.capacity.root.queues. For Hadoop Extract only.
- YARN web UI port
-
Specify the port of the web UI for the YARN ResourceManager. The default is 8088. For Hadoop Extract only.
- Zookeeper quorum host
-
Specify the value of hadoop.registry.zk.quorum from yarn-site.xml. Applies only when using HiveServer2 HA.
- Yarn timeline webapp host
-
Specify the IP address of the YARN timeline service web application.
- Yarn timeline webapp port
-
Specify the port associated with the YARN timeline service web application.
- Yarn timeline webapp version
-
Specify the version associated with the YARN timeline service web application.
- JDBC options
-
Specify the options associated with the JDBC URL.
-
Select Create connection.
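To check your values before entering them, it can help to see how several of the properties above relate to standard Hive and HDFS settings. The following Python sketch is illustrative only: this page does not describe how DataFlow assembles its connection internally, so the mapping is an assumption based on the standard Apache Hive JDBC URL conventions and the standard HDFS HA property names. The function names `hive_jdbc_url` and `hdfs_ha_properties`, and all hostnames and IDs in the usage lines, are hypothetical.

```python
from typing import Dict, Optional


def hive_jdbc_url(
    host: Optional[str] = None,
    port: int = 10000,  # common HiveServer2 binary-mode default (assumption for your cluster)
    zookeeper_quorum: Optional[str] = None,
    zookeeper_namespace: str = "hiveserver2",
    transport_mode: str = "binary",
    http_path: Optional[str] = None,
    hive_principal: Optional[str] = None,
) -> str:
    """Build a HiveServer2-style JDBC URL from connection-style properties."""
    params = []
    if zookeeper_quorum:
        # HiveServer2 HA: the client discovers a live server through ZooKeeper,
        # so Host and Port are not used.
        authority = zookeeper_quorum
        params.append("serviceDiscoveryMode=zooKeeper")
        params.append(f"zooKeeperNamespace={zookeeper_namespace}")
    else:
        if not host:
            raise ValueError("host is required when HiveServer2 HA is not used")
        authority = f"{host}:{port}"
    if transport_mode == "http":
        # HTTP path only matters in HTTP transport mode.
        params.append("transportMode=http")
        if http_path:
            params.append(f"httpPath={http_path}")
    if hive_principal:
        # Kerberos: the Hive service principal.
        params.append(f"principal={hive_principal}")
    return f"jdbc:hive2://{authority}/" + "".join(f";{p}" for p in params)


def hdfs_ha_properties(name_service: str, rpc_addresses: Dict[str, str]) -> Dict[str, str]:
    """Map a name service and {NameNode ID: RPC address} to HDFS HA property names."""
    props = {
        "dfs.nameservices": name_service,
        # Comma-separated NameNode IDs, in the order given.
        f"dfs.ha.namenodes.{name_service}": ",".join(rpc_addresses),
    }
    for nn_id, address in rpc_addresses.items():
        props[f"dfs.namenode.rpc-address.{name_service}.{nn_id}"] = address
    return props


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(hive_jdbc_url(host="hive.example.com"))
    print(hive_jdbc_url(zookeeper_quorum="zk1.example.com:2181,zk2.example.com:2181"))
    print(hdfs_ha_properties("mycluster", {"nn1": "nn1.example.com:8020",
                                           "nn2": "nn2.example.com:8020"}))
```

The sketch keeps the HA and non-HA cases separate because Host and Port apply only when HiveServer2 HA is not used, while the ZooKeeper quorum and namespace apply only when it is.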