Filters in group_aggregate()
In this article, we walk through examples of group aggregation using filters. To test for yourself, use the data set we provide in prerequisites.
Prerequisites
Before attempting the following practice examples, follow these steps to upload the sample dataset:
Click the dropdown to see the steps to upload the dataset.
-
Click here to download the sample data set.
-
Sign in to your ThoughtSpot account, and click the Data tab.
-
Click the more menu icon in the upper-right of the screen. From the menu, select Upload data.
-
Upload the data file, either by dragging and dropping it into the field provided, or by clicking Browse your files and selecting Meetings_tiny.csv.
-
To answer Are the column names already defined in the file header?, select Yes.
-
To answer Are the fields separated by?, select Comma (,).
-
-
Select Next at the lower right of the screen to proceed.
-
Select Next to proceed without changing the column names. For the sake of the formulas used in the examples, column names should not be changed.
-
Review the column types in Step 3 and select Upload to finish.
-
Click the search icon on the final screen of the upload process to begin a search based on the sample data set.
Now let’s see what happens when you use filters in your search. The last argument of group_aggregate() allows you to specify whether ThoughtSpot should apply the filter from the search in the formula.
Adding filters through the search bar
In this example, we apply a filter in the search bar that filters for activity classification = "call"
.
-
First, let’s look at what our table looks like with no filter. We use the following search:
-
Search for:
account
activity classification
hours
sort by account
. -
The following table results:
Account Activity Classification Total Hours Amazon
meeting
11
Oracle
meeting
18
Redshift
meeting
5
Redshift
call
1
Snowflake
meeting
2
Snowflake
call
5
-
-
Now let’s add a filter for
activity classification = "call"
.-
Search for:
account
activity classification
hours
activity classification = call
sort by account
. -
The following table results:
Account Activity Classification Total Hours Redshift
call
1
Snowflake
call
5
-
-
Finally, let’s add a formula that aggregates hours to the account level of detail.
-
Click the more options icon and select Add formula.
-
In the formula pop-up window, name your formula. Here, we name the formula hours per account.
-
Enter the following formula and select Save:
average( group_aggregate( sum(hours), {account}, {} ) )
-
-
Finally, add the
hours per account
formula to our search bar.-
Search for:
account
activity classification
hours
hours per account
activity classification = call
sort by account
. -
The following table results:
Table 1. Table 4.1.1 Account Activity classification Total Hours Hours per account Redshift
call
1
6.00
Snowflake
call
5
7.00
This gives us a result which filters our rows to only rows that include "call", but note that
Hours per account
does not matchTotal Hours
.Hours per account
is aggregated without the filter, because the last argument of group_aggregate is {}.If you want your formula to take into account the filters specified in the search, you can use the query_filters() argument.
-
-
Select Formulas from the left sidebar and click the edit icon next to "hours per account".
-
In the formula window, enter the following formula and select Save:
average( group_aggregate( sum(hours), {account}, query_filters() ) )
Now your resulting table looks like this:
Table 2. Table 4.1.2 Account Activity Classification Total Hours Hours per account Redshift
call
1
1.00
Snowflake
call
5
5.00
Adding filters to your formula
You can also include filters in your group_aggregate formula which are not in the search bar.
-
Create the following formula if you want your formula to explicitly filter for
activity classification = "call"
, regardless of what you add in the search bar:- Hours per account
-
average( group_aggregate( sum(hours), {account}, {activity classification = "call"} ) )
-
Search for:
account
activity classification
hours
hours per account
sort by account
sort by activity classification
The resulting table’s
Hours per account
column results in values that are filtered by "call", even though the table itself contains rows for both calls and meetings.Table 3. Example 4.2 Account Activity Classification Total Hours Hours per account Amazon
meeting
11
{null}
Oracle
meeting
18
{null}
Redshift
call
1
1
Redshift
meeting
5
1
Snowflake
call
5
5
Snowflake
meeting
2
5
Using multiple filters
You can put multiple filters in the filters argument of the group_aggregate function. For example, to filter on both activity classification and activity date:
-
Create the following formula:
- Hours per account
-
average( group_aggregate( sum(hours), {account}, {activity classification = "call", activity date >= 04/01/2020 and activity date <= 04/14/2020 } ) )
In the preceding example, ThoughtSpot adds the activity classification and activity date filters in an
and
condition together.
-
Search for:
account
activity classification
hours
hours per account
sort by account
sort by activity classification
.Table 4. Example 4.3 Account Activity Classification Total Hours Hours per account Amazon
meeting
11
{null}
Oracle
meeting
18
{null}
Redshift
call
1
1
Redshift
meeting
5
1
Snowflake
call
5
4
Snowflake
meeting
2
4
You can also use AND, OR, and parens, to flexibly combine filters as shown in following formula:
- Hours per account
-
average( group_aggregate( sum(hours), {account}, {activity classification = "call" OR ( activity date >= 04/01/2020 and activity date <= 04/14/2020 ) } ) )
Related information: