Flexible aggregation functions
How aggregation formulas work
Typically, the groupings and filters used in a formula use the same fields as columns returned in the search results. The concept of a grouping equates to an attribute column.
For example, in the search revenue ship mode
, revenue
is the measure, and ship mode
is the attribute, or grouping.
The result of this search shows total revenue for each ship mode:
revenue | ship mode |
---|---|
$ a |
air |
$ r |
rail |
$ t |
truck |
$ s |
sea transport |
The aggregation formulas are described in Overview of aggregate formulas.
About flexible aggregation
ThoughtSpot provides flexible aggregation with the group_*
functions.
You can use group_*
formulas when you want to specify columns and filters to include or ignore in your query.
The group_*
formulas use a sub-query to perform these custom aggregations.
If the sub-query is at a different level of detail than the original query, ThoughtSpot adds the result column to the result of the original query.
To use the groups and filters, specify them using the query_groups
and query_filters
keywords, respectively.
You can also add or exclude groups or filters.
Best practices for flexible aggregations
The group_aggregate
function enables you to calculate a result at a specific aggregation level, and then returns it at a different aggregation level.
For this reaggregation result to return correctly, follow these syntax guidelines:
-
Wrap
group_aggregate
in an aggregate function, such assum
oraverage
-
The wrapping function must be the immediate preceding function, such as
sum(group_aggregate(...))
Examples
For a search on revenue monthly ship mode
, you can add a formula to calculate yearly revenue by ship mode:
group_aggregate(sum(revenue), {ship mode, year(commit date)}, {})
The same formula can also be written using query_groups()
and query_filters()
as following:
group_aggregate(sum(revenue), query_groups() - {commit date} + {year(commit date)}, {})
This is helpful to include the main query groups that are not known at formula creation time.
You can use +/-
to modify the set of groups included from the query.
Query filters
With query_filters()+{filter_condition}
or query_filters()-{filter_condition}
, users will be able to aggregate the results while including/excluding a filter condition.
Filter condition: Ship Mode='car'
For a search on Category Customer ID sales by customer id and category Ship Mode='car’
, you can add a formula to calculate sales by category for each customer as:
sales by Customer ID and Category= group_aggregate (sum(Sales), {Category, Customer ID }, query_filters()+{Ship Mode='air'})
In this case, the results will be aggregated based on the dimensions: ‘Category’
and ‘Customer ID’
and filters: ‘air’
and ‘car’
.
With query_filters()-{column}
, users will be able to aggregate the results while removing any expression related to a column.
Filter condition: Ship Mode='car'
For a search on Customer ID sales by customer id and category Ship Mode='car'
, you can add a formula to calculate sales for each customer while ignoring the filter on a column as:
sales by Customer ID and Category= group_aggregate (sum(Sales), {Customer ID, Category }, query_filters()-{Ship Mode})
In this case, the results will be aggregated based on the dimensions in the search; Customer ID and any filter related to Ship Mode will not be considered while aggregating the results.
Group aggregation filters enhancement
Beginning in the 9.0.0.cl release, you can specify a column in the third argument of a group_aggregate function in order to include all filters defined in the search that pertain to that column, and ignore unrelated filters. This enhancement allows the analyst to specify which fields to accept filters from, improving readability of formulas, and reducing operational change and related errors.
For example, say that you would like to determine how many stores a certain product was sold in during a given time period, expressed as a ratio of total available stores. In order to calculate the total available stores you require a group_aggregate function that will accept the filters associated with transaction date (for example, this year, last six months, august), and ignore filters from other fields.
Before 9.0.0.cl, this formula would look like: group_aggregate(count(storeid),query_groups(),query_filters()-{field1,field2,fieldN})
, where users would have to use query_filters() - {}
to manually remove every field in the search bar that did not relate to transaction date. When adding a new unrelated filter to the search bar, users would then have to edit all existing formulas to ignore the filter.
Now, users can simply specify the formula as group_aggregate(count(storeid),query_groups, {date})
, and the formula will automatically ignore all unrelated filters in the search bar.
Related information
For more examples of flexible aggregation, see the group_aggregate function in the Formula function reference.
To learn about aggregation formulas in general, see Overview of aggregate formulas and Group aggregation functions.
To understand group aggregate query filters, see Aggregate filters.
To learn about how the
group-aggregate
function can be used within your business practice, we encourage you to see Reaggregation scenarios in practice.