Passthrough functions for Redshift

Passthrough functions allow you to send SQL expressions directly to Redshift without being interpreted by ThoughtSpot. If you have custom database functions that ThoughtSpot doesn’t support, you can use these new passthrough functions in the ThoughtSpot Formula Assistant to call your custom functions.

A passthrough function serves as a wrapper around the specific Redshift SQL function. To use a specific Redshift SQL function, match the data type that the function returns with the Connections passthrough function that returns that data type.

For details on all supported passthrough functions, see Connections passthrough functions.

For details on Redshift functions, see Redshift’s SQL Reference.

Passthrough functions and Row-Level Security (RLS)

ThoughtSpot does not validate, interpret or limit the statements that are included in the passthrough function. There is a risk malicious users may generate SQL injection attacks. This vulnerability is limited to:

  • the ability of these users to write select statements that bypass row- and column-level security that is defined in ThoughtSpot.

  • Note that if the data security is defined in the underlying data platform, this risk is mitigated.

  • Note that this does require an advanced understanding of SQL and also how ThoughtSpot generates SQL statements.

For example, a user may only have access to the Australian sales data. However, using a passthrough function, they could request data from other regions.

Limitations

Passthrough functions are subject to the following limitations:

ThoughtSpot does not support passthrough functions for queries with chasm or fan traps.

If the resulting query includes an aggregation from a table different to the one defined in the passthrough function, it generates an error. The query should be valid if a) the underlying query is defined upon a single table or b) the underlying query involves a single fact table and multiple dimension tables.

ThoughtSpot does not support passthrough functions for queries with aggregate formulas.

Aggregate formulas are not directly available in the formula editor. It is possible to wrap an aggregate formula in a group_aggregate function. This will allow for the passthrough function to be saved. However, this will likely generate an error.

ThoughtSpot does not validate the correctness of selected passthrough functions.

The passthrough function name is based upon the expected return data type. For example, if the expected result is a STRING value, then the correct passthrough function is sql_string_*. However, a user could write the ThoughtSpot formula referencing a function with a different return data type, for example, sql_number_*. This would likely result in an error.
For each data type, the user can select an aggregate or non-aggregate passthrough function. The selection is dependent upon the underlying function the user is referencing, for example, rank() is an aggregate function while concat() is a non-aggregate function.

ThoughtSpot does not validate the correctness of the SQL statement.

The user must ensure that the strings are correctly encapsulated. In the scenario where the statement includes double quotes, then the passthrough function should be wrapped in single quotes.
ThoughtSpot generates GROUP BY statements for all queries. The user must ensure that aggregate columns include the correct aggregation.
For example, the first definition below will return a result because the parameter is wrapped in a sum(). However, the second definition below will result in an error.

correct_formula = sql_double_aggregate_op ("rank() over (order by sum({0}) desc)",Amount )

incorrect_formula = sql_double_aggregate_op ("rank() over (order by {0} desc)",Amount )

ThoughtSpot does not support passthrough functions defined with a window aggregate.

If a filter is applied to a passthrough function defined with a window aggregate, then the query will result in a HAVING clause, which generates an error. Filtering window aggregates require that the query is written with either a QUALIFY clause (if supported by the underlying data platform), or the entire query is rewritten in the form of a common table expression (CTE). Note that this is the format generated by ThoughtSpot’s native window functions.

ThoughtSpot does not validate that columns included in window functions partition clauses are included in the search.

If a column is defined as a PARTITION, then this column must be included in search data. However, this requirement is not explicitly stated.
If a column is defined as a PARTITION and the user drills down on the partition column, an error results. This is because during drill down, the column is removed from the search and replaced with a filter.

Window function partitioning over date parts often results in a query generation error.

Partitioning requires that the partition date matches that in the search. In the formula example below, the definition assumes that WEEKLY is defined in the search. If the search defines the date as MONTHLY, an error results.
sql_int_aggregate_op ( "sum(count(distinct {0})) over (order by date_trunc('week',{1} ) rows between unbounded preceding and current row)", Stations ID, Dim My Date )

The definition window function’s aggregates are dependent upon how the aggregate column is referenced in search.

In ThoughtSpot, it is possible to change the behavior of a column that is defined as a measure in search. [Region][Sales] returns an aggregate of sales for each region. The passthrough function for a rank statement for this search would be defined as: sql_double_aggregate_op ("rank() over (order by sum({0}) desc)",Amount ).
However, [Region][by Sales] modifies the behavior of the Sales column to return a non-aggregated row. Therefore in this scenario, you could define the passthrough function for a rank statement as: sql_double_aggregate_op ("rank() over (order by {0} desc)",Amount ).

Cumulative sum window functions often generate an error for columns defined as measures.

If the column defined in the window aggregation is a measure, it must be modified to act as an attribute. This is because ThoughtSpot includes GROUP BY statements in the underlying queries. If the search is defined as [Region][State][Sales], the formula results in an error. If the search is defined as [Region][State][by Sales], the formula returns a result.
sql_double_aggregate_op ("sum({0}) over (partition by {1} order by {2} rows between unbounded preceding and current row)",sales,region,state)

Error handling and relevant messaging are limited.

If the resulting query is invalid, ThoughtSpot returns a generic message, "Error Loading Data, Query execution failed on Redshift database." If the same SQL is executed in the underlying database, then a more meaningful error is often displayed.

Passthrough functions may generate columns that are not included in group by clauses.

Assume that a column stores car sales information as a variant. The following formula will return the string value for the dealership.
fxDealership = sql_string_op ("src:dealership::text")
Assuming that a user wants to count the number of records for each dealership, they could ask a question such as: [count src] by [fxDealership]. The resulting query will fail as fxDealership is not included in the GROUP BY clause. This is because no ThoughtSpot attribute column is being referenced directly in the search. To resolve this and force the attribute column’s alias into the GROUP BY clause, you must wrap the formula in a concat statement. Note that the LEFT function does not return any characters.
fxDealership = concat (sql_string_op ("src:dealership::text"),left(src ,0)).

By default, ThoughtSpot modifies all text in formulas to lowercase.

This may cause issues if the underlying data platform expects the function to respect the string case.