groupByTable
Groups table rows with matching values in one or more columns. For each group, the function aggregates the data for one or more columns. Sum aggregation applies by default, but you can pick a different type.
The function returns a list of lists, with each inner list displaying the grouping values and aggregated values for a unique group. The lists are sorted by the first aggregated value in descending numerical order.
Note
Available in "Describe the Table" and "Describe Row in Context" projects only.
For JSON projects, use groupByJson or groupByJsonObject instead.
Parameters
INPUT TABLE (table region)
The whole table or a selected table region.
GROUPING COLUMNS (column or list)
One or more columns to group by. For multiple columns, give a list of column variables. The examples show how to do this using ATL. If using the Function Builder, see here.
AGGREGATION COLUMNS (column or list)
One or more columns with numeric values. For multiple columns, give a list of column variables. The examples show how to do this using ATL. If using the Function Builder, see here.
AGGREGATION FUNCTIONS (function or list)
Optional. The functions for aggregating the values in the aggregation columns.
The number of aggregation functions should match the number of aggregation columns. The functions and columns should be in matching order, working left to right through the ATL.
Default: sum()
Aggregation functions
Function | Purpose | Note |
---|---|---|
| Counts the number of rows. | Counts null or empty values. |
| Calculates the maximum value. | Ignores null or empty values. |
| Calculates the mean value. | Ignores null or empty values. |
| Calculates the minimum value. | Ignores null or empty values. |
| Calculates the sum total. | Ignores null or empty values. |
Examples
Assume a "Describe the Table" project with this data:
ID | BranchID | Region | Type | Sales | Profit | |
---|---|---|---|---|---|---|
Row 1 | 1 | 1111 | California | Groceries | 50,234.45 | 3,014.07 |
Row 2 | 2 | 1112 | California | Groceries | 47,221.90 | 3,069.48 |
Row 3 | 3 | 2121 | Texas | Clothing | 44,476.12 | 2,668.57 |
Row 4 | 4 | 2111 | Texas | Groceries | 55,713.66 | 3,285.32 |
Row 5 | 5 | 1113 | California | Groceries | 67,315.85 | 4,712.11 |
Row 6 | 6 | 3121 | New York | Clothing | 66,391.40 | 4,647.27 |
Row 7 | 7 | 1121 | California | Clothing | 24,059.78 | 1,684.18 |
Row 8 | 8 | 2122 | Texas | Clothing | 40,727.24 | 2,565.63 |
Row 9 | 9 | 3111 | New York | Groceries | 42,457.60 | 2,547.45 |
Row 10 | 10 | 2112 | Texas | Groceries | 61,616.90 | 3,697.01 |
Row 11 | 11 | 1122 | California | Clothing | 19,393.42 | 1,376.93 |
Row 12 | 12 | 3122 | New York | Clothing | 70,781.55 | 4,961.67 |
Important
You cannot group or aggregate using the first table column.
Example 1 — Single grouping column with single aggregation
Suppose you want to know how each region performed for total profit. You can group the rows by Region and then sum-aggregate the Profit values for each group.
The ATL is: [[groupByTable(WholeTable, Region, Profit, sum())]]
.
The output is this list of lists:
|
Each inner list contains values for a distinct group. For each inner list, the first value is the region name and the second is the sum-total profit. The list is sorted by the first aggregated value in descending order.
The printed output is:
California and 13,856.77, Texas and 12,216.53 and New York and 12,156.39
Here's how you might extract values from the result to produce narrative text:
[[global.regions = groupByTable(WholeTable,Region,Profit,sum()); global.topOne = top(regions,1);'']] The best performing region was [[topOne[0][0]]], with profits of [[currencyFormat(topOne[0][1])]]. [[sorted = sort(regions, (x,y) -> sign(x[1] - y[1])); global.bottomOne = top(sorted,1);'']] The worst performing region was [[bottomOne[0][0]]], with profits of [[currencyFormat(bottomOne[0][1])]].
The printed result is:
The best performing region was California, with profits of $13,856.77.
The worst performing region was New York, with profits of $12,156.39.
Note
In a "Describe Row in Context" project, the ATL would be slightly different:
[[groupByTable(WholeTable, RegionColumn, ProfitColumn, sum())]]
.
Example 2 — Single grouping column with multiple aggregations
Sometimes you need to specify multiple aggregation columns and multiple aggregation types. Suppose you want to produce a narrative that identifies (1) the regions that achieved average sales over $50K, and (2) the number of branches in each successful region.
To calculate average sales, you must group by Region and mean-aggregate the Sales values for each group. To calculate the number of branches in each region, you must group by Region and then apply count() aggregation to the BranchID column.
The ATL is [[groupByTable(WholeTable, Region, (Sales,BranchID), (mean(),count()))]]
.
The output is this list or lists:
|
The inner list contains values for a distinct group. For each inner list, the first value is the region name, the second is the average sales figure, and the third is the number of unique branches in the group.
The printed output is:
New York, 59,876.85 and 3, Texas, 50,633.48 and 4 and California, 41,645.08 and 5
Here's how you might extract values from the result to produce narrative text:
[[global.over = filter(groupByTable(WholeTable,Region,(Sales,BranchID),(mean(),count())), x -> x[1] >= 50000);'']] [[global.regionsOver = map(over, x -> x[0]);'']] [[regionsOver]] were the only regions with average sales over $50K. [[over[0][0]]] averaged [[currencyFormat(over[0][1])]] from [[numToWords(over[0][2],'',10)]] branches. [[over[1][0]]] averaged [[currencyFormat(over[1][1])]] from [[numToWords(over[1][2],'',10)]] branches.
The printed result is:
New York and Texas were the only regions with average sales over $50K.
New York averaged $59,876.85 from three branches.
Texas averaged $50,633.48 from four branches.
Note
In a "Describe Row in Context" project, the ATL would be slightly different:
[[groupByTable(WholeTable, RegionColumn, (SalesColumn,BranchIDColumn), (mean(),count()))]]
.
Example 3 — Multiple grouping columns with multiple aggregations
The previous examples use a single grouping column. This example demonstrates the benefits of using two or more columns (second parameter) to make your grouping criteria more precise.
Suppose you want to group by Region and Type. Also, suppose that you wish to discover (1) the Region + Type group with the highest sales, and (2) the Region + Type group with the highest average profit.
The ATL is [[groupByTable(WholeTable, (Region,Type), (Sales,Profit), (sum(),mean()))]]
.
The output is this list of lists:
|
Each inner list contains values for a distinct group. For each inner list, the first value is the region name, the second is the store type, the third is the sum-total for sales, and the fourth is the mean profit.
The printed output is:
California, Groceries, 164,772.2 and 3,598.55, New York, Clothing, 137,172.95 and 4,804.47, Texas, Groceries, 117,330.56 and 3,491.17, Texas, Clothing, 85,203.36 and 2,617.1, California, Clothing, 43,453.2 and 1,530.56 and New York, Groceries, 42,457.6 and 2,547.45
Here's how you might extract values from the result to produce narrative text:
[[global.results = groupByTable(WholeTable,(Region,Type),(Sales,Profit),(sum(),mean())); global.topSales = top(results,1);'']] [[topSales[0][1]]] stores in [[topSales[0][0]]] achieved the best total sales ([[abbreviateNumber(currencyFormat(topSales[0][2]))]]). [[sortedByProfit = sort(results, (x,y) -> sign(y[3] - x[3])); global.topProfit = top(sortedByProfit,1);'']] [[topProfit[0][1]]] stores in [[topProfit[0][0]]] achieved the best average profit ([[abbreviateNumber(currencyFormat(topProfit[0][3]))]]).
The printed result is:
Groceries stores in California achieved the best total sales ($164.8K).
Clothing stores in New York achieved the best average profit ($4.8K).
Note
In a "Describe Row in Context" project, the ATL would be slightly different:
[[groupByTable(WholeTable, (RegionColumn,TypeColumn), (SalesColumn,ProfitColumn), (sum(),mean()))]]
.