Constraints

Constraints allow you to build relationships and join tables.

Constraints include primary keys, foreign keys, and relationships. Relationships allow you to create a generic relationship for use when you want to join tables that don’t have a primary key/foreign key relationship.

Defining a generic relationship in the UI rather than using a primary key/ foreign key join through TQL has no impact on performance. However, when creating relationships in the UI, you must ensure that you create it in the right direction: many to one. To create many-to-many joins, or to create joins using >, <, >=, or <=, use TQL.

Primary keys

When a primary key is selected for a table, it impacts data loading behavior. When a new row is added:

  • If another row already exists with the same primary key, it is updated with the values in the new row.

  • If a row with the same primary key does not exist already, the new row is inserted into the table.

This behavior is referred to as “upsert” because it does an INSERT or an UPDATE, depending on whether a row with the same primary key already exists.

Note that ThoughtSpot does not check for primary key violations across different shards of the table. Therefore, you need to shard the table on the primary key columns if you require this “upsert” behavior.

Permitted joins and necessary permissions

See this matrix for information about which joins you can create, and what permissions these joins require.

These are the allowable joins and permissions within ThoughtSpots:

Worksheets

Must have Can edit permission on the source Worksheet. Can create joins with:

Imported tables (UI)

Must have Can manage data permission to load the table.

Views

Must have Can edit permission on the source View. Can create joins with:

Views
Materialized views
Imported tables (UI)

Must have Can manage data permission to load the table.

Tables uploaded from backend (tsload) or through DataFlow

Must have Admin privileges to access tsload.

Materialized views

Must have Can edit permission on the source Materialized view. Can create joins with:

Views
Materialized views
Imported tables (UI)

Must have Can manage data permission to load the table.

Tables uploaded from backend (tsload) or through DataFlow

Must have Admin privileges to access tsload. ThoughtSpot recommends creating this join through the UI, instead of using TQL.

Imported tables (UI)

Must have Can edit permission on the source table. Can create joins with:

Worksheets
Views
Materialized views
Imported tables (UI)

Must have Can manage data permission to load the table.

Tables uploaded from backend (tsload) or through DataFlow

Must have Admin privileges to access tsload.

Tables uploaded from backend (tsload) or through DataFlow

Must have Can edit permission on the source table. Can create joins with:

Views
Materialized views

ThoughtSpot recommends creating this join through the UI instead of using TQL.

Imported tables (UI)

Must have Can manage data permission to load the table.

Table uploaded from backend (tsload) or through DataFlow

Must have Admin privileges to access tsload. This join is often used to create Worksheets.

Tables uploaded through Embrace

Must have Can edit permission on the source table, and Can manage data permission. Can create joins with:

Table uploaded through Embrace

The two tables must be from the same connection.

View on top of table uploaded through Embrace

The View and the table must be from the same connection.

Views on top of table uploaded through Embrace

Must have Can edit permission on the source Views. Can create joins with:

Table uploaded through Embrace

The View and the table must be from the same connection.

View on top of table uploaded through Embrace

The two Views must be from the same connection.

Foreign key relationships

Foreign key relationships tell ThoughtSpot how two tables can be joined. These relationships are only used for joining the tables, and not for referential integrity constraint checking.

The directionality of primary key - foreign key relationships is important. The foreign key relationship is defined on the fact table and references the primary key(s) in the dimension table. So you can think of the fact table as the source and the dimension table as the target. In the schema viewer, notice that the arrow that represents a PK/FK join points to the dimension table.

If you use primary and foreign keys, when users search the data from the search bar, tables are automatically joined. For example, assume there are two tables:

  • revenue, which is a fact table

  • region, which is a dimension table

There is a foreign key on the fact table on regionid which points to the id in the region dimension table. When a user types in "revenue by region", the two tables will be joined automatically.

Foreign keys have to match the primary key of the target table they refer to. So if there are multiple columns that make up the primary key in the target table, the foreign key must include all of them, and in the same order.

Generic relationships (many-to-many)

You may have a schema where there is a fact table that you want to join with another fact table. If there isn’t a primary key/foreign key relationship between the tables, you can use many-to-many to enable this. You can do this by using the RELATIONSHIP syntax to add a link between them, that works similarly to the WHERE clause in a SQL join clause.

A many-to-many implementation may lead to overcounting. We recommend that you avoid using aggregation or count formulas in your search; They may count some rows multiple times, because the data satisfies the join condition for multiple rows.
We recommend that you avoid using many-to-many joins. When you have a use case that depends on joining two fact tables, we recommend that you create a bridge table between them, as a chasm trap. If the fact tables share common data, use it to create the dimension table that acts as a bridge. For example, a date or product dimension could join an inventory fact table to a sales fact table.

The generic join is a special kind of relationship that applies to specific data models and use cases. For example, suppose you have a table that shows wholesale purchases of fruits, and another table that shows retail fruit sales made, but no inventory information. In this case, it would be of some use to see the wholesale purchases that led to sales, but you don’t have the data to track a single apple from wholesale purchase through to sale to a customer.

In a many-to-many relationship, the value(s) in a table can be used to join to a second table, using an equality condition (required) and one or more range conditions (optional). These conditions act like the WHERE clause in a SQL JOIN clause. They are applied using AND logic, such that all conditions must be met for a row to be included.

To use a many-to-many relationship, you need to follow a few rules:

  • There must be one equality condition defined between the two tables.

  • Each table must be sharded on the same key as the equality condition.

  • There can optionally be one or more range conditions defined.

This example shows the TQL statements that create the two fact tables and the relationship between them.

TQL> CREATE TABLE "wholesale_buys" (
  "order_number" VARCHAR(255),
  "date_ordered" DATE,
  "expiration_date" DATE,
  "supplier" VARCHAR(255),
  "fruit" VARCHAR(255),
  "quantity" VARCHAR(255),
  "unit_price" DOUBLE
  )  PARTITION BY HASH (96) KEY ("fruit");

TQL> CREATE TABLE "retail_sales" (
  "date_sold" DATE,
  "location" VARCHAR(255),
  "vendor" VARCHAR(255),
  "fruit" VARCHAR(255),
  "quantity" VARCHAR(255),
  "sell_price" DOUBLE
  )  PARTITION BY HASH (96) KEY ("fruit");

TQL> ALTER TABLE "wholesale_buys" ADD RELATIONSHIP WITH "retail_sales"
     AS "wholesale_buys"."fruit" = "retail_sales"."fruit"
     AND ("wholesale_buys"."date_ordered" < "retail_sales"."date_sold"
          AND "retail_sales"."date_sold" < "wholesale_buys"."expiration_date");

Related information