Skip to main content

groupByTable

Groups table rows with matching values in one or more columns. For each group, the function aggregates the data for one or more columns. Sum aggregation applies by default, but you can pick a different type.

The function returns a list of lists, with each inner list displaying the grouping values and aggregated values for a unique group. The lists are sorted by the first aggregated value in descending numerical order.

Note

Available in "Describe the Table" and "Describe Row in Context" projects only.

For JSON projects, use groupByJson or groupByJsonObject instead.

Parameters

  • INPUT TABLE (table region)

    The whole table or a selected table region.

  • GROUPING COLUMNS (column or list)

    One or more columns to group by. For multiple columns, give a list of column variables. The examples show how to do this using ATL. If using the Function Builder, see here.

  • AGGREGATION COLUMNS (column or list)

    One or more columns with numeric values. For multiple columns, give a list of column variables. The examples show how to do this using ATL. If using the Function Builder, see here.

  • AGGREGATION FUNCTIONS (function or list)

    Optional. The functions for aggregating the values in the aggregation columns.

    The number of aggregation functions should match the number of aggregation columns. The functions and columns should be in matching order, working left to right through the ATL.

    Default: sum()

Aggregation functions

Function

Purpose

Note

count()

Counts the number of rows.

Counts null or empty values.

maxVal()

Calculates the maximum value.

Ignores null or empty values.

mean()

Calculates the mean value.

Ignores null or empty values.

minVal()

Calculates the minimum value.

Ignores null or empty values.

sum()

Calculates the sum total.

Ignores null or empty values.

Examples

Assume a "Describe the Table" project with this data:

ID

BranchID

Region

Type

Sales

Profit

Row 1

1

1111

California

Groceries

50,234.45

3,014.07

Row 2

2

1112

California

Groceries

47,221.90

3,069.48

Row 3

3

2121

Texas

Clothing

44,476.12

2,668.57

Row 4

4

2111

Texas

Groceries

55,713.66

3,285.32

Row 5

5

1113

California

Groceries

67,315.85

4,712.11

Row 6

6

3121

New York

Clothing

66,391.40

4,647.27

Row 7

7

1121

California

Clothing

24,059.78

1,684.18

Row 8

8

2122

Texas

Clothing

40,727.24

2,565.63

Row 9

9

3111

New York

Groceries

42,457.60

2,547.45

Row 10

10

2112

Texas

Groceries

61,616.90

3,697.01

Row 11

11

1122

California

Clothing

19,393.42

1,376.93

Row 12

12

3122

New York

Clothing

70,781.55

4,961.67

Important

You cannot group or aggregate using the first table column.

Example 1 — Single grouping column with single aggregation

Suppose you want to know how each region performed for total profit. You can group the rows by Region and then sum-aggregate the Profit values for each group.

The ATL is: [[groupByTable(WholeTable, Region, Profit, sum())]].

The output is this list of lists:

( ("California", 13856.77), ("Texas", 12216.53), ("New York", 12156.39) )

Each inner list contains values for a distinct group. For each inner list, the first value is the region name and the second is the sum-total profit. The list is sorted by the first aggregated value in descending order.

The printed output is:

California and 13,856.77, Texas and 12,216.53 and New York and 12,156.39

Here's how you might extract values from the result to produce narrative text:

[[global.regions = groupByTable(WholeTable,Region,Profit,sum()); global.topOne = top(regions,1);'']]

The best performing region was [[topOne[0][0]]], with profits of [[currencyFormat(topOne[0][1])]].

[[sorted = sort(regions, (x,y) -> sign(x[1] - y[1])); global.bottomOne = top(sorted,1);'']]

The worst performing region was [[bottomOne[0][0]]], with profits of [[currencyFormat(bottomOne[0][1])]].

The printed result is:

The best performing region was California, with profits of $13,856.77.

The worst performing region was New York, with profits of $12,156.39.

Note

In a "Describe Row in Context" project, the ATL would be slightly different:

[[groupByTable(WholeTable, RegionColumn, ProfitColumn, sum())]].

Example 2 — Single grouping column with multiple aggregations

Sometimes you need to specify multiple aggregation columns and multiple aggregation types. Suppose you want to produce a narrative that identifies (1) the regions that achieved average sales over $50K, and (2) the number of branches in each successful region.

To calculate average sales, you must group by Region and mean-aggregate the Sales values for each group. To calculate the number of branches in each region, you must group by Region and then apply count() aggregation to the BranchID column.

The ATL is [[groupByTable(WholeTable, Region, (Sales,BranchID), (mean(),count()))]].

The output is this list or lists:

( ("New York", 59876.85, 3), ("Texas", 50633.48, 4), ("California", 41645.08, 5) )

The inner list contains values for a distinct group. For each inner list, the first value is the region name, the second is the average sales figure, and the third is the number of unique branches in the group.

The printed output is:

New York, 59,876.85 and 3, Texas, 50,633.48 and 4 and California, 41,645.08 and 5

Here's how you might extract values from the result to produce narrative text:

[[global.over = filter(groupByTable(WholeTable,Region,(Sales,BranchID),(mean(),count())), x -> x[1] >= 50000);'']]

[[global.regionsOver = map(over, x -> x[0]);'']]

[[regionsOver]] were the only regions with average sales over $50K.

[[over[0][0]]] averaged [[currencyFormat(over[0][1])]] from [[numToWords(over[0][2],'',10)]] branches.

[[over[1][0]]] averaged [[currencyFormat(over[1][1])]] from [[numToWords(over[1][2],'',10)]] branches.

The printed result is:

New York and Texas were the only regions with average sales over $50K.

New York averaged $59,876.85 from three branches.

Texas averaged $50,633.48 from four branches.

Note

In a "Describe Row in Context" project, the ATL would be slightly different:

[[groupByTable(WholeTable, RegionColumn, (SalesColumn,BranchIDColumn), (mean(),count()))]].

Example 3 — Multiple grouping columns with multiple aggregations

The previous examples use a single grouping column. This example demonstrates the benefits of using two or more columns (second parameter) to make your grouping criteria more precise.

Suppose you want to group by Region and Type. Also, suppose that you wish to discover (1) the Region + Type group with the highest sales, and (2) the Region + Type group with the highest average profit.

The ATL is [[groupByTable(WholeTable, (Region,Type), (Sales,Profit), (sum(),mean()))]].

The output is this list of lists:

( ("California", "Groceries", 164772.2, 3598.55), ("New York", "Clothing", 137172.95, 4804.47), ("Texas", "Groceries", 117330.56, 3491.17), ("Texas", "Clothing", 85203.36, 2617.1), ("California", "Clothing", 43453.2, 1530.56), ("New York, Groceries", 42457.6, 2547.45) )

Each inner list contains values for a distinct group. For each inner list, the first value is the region name, the second is the store type, the third is the sum-total for sales, and the fourth is the mean profit.

The printed output is:

California, Groceries, 164,772.2 and 3,598.55, New York, Clothing, 137,172.95 and 4,804.47, Texas, Groceries, 117,330.56 and 3,491.17, Texas, Clothing, 85,203.36 and 2,617.1, California, Clothing, 43,453.2 and 1,530.56 and New York, Groceries, 42,457.6 and 2,547.45

Here's how you might extract values from the result to produce narrative text:

[[global.results = groupByTable(WholeTable,(Region,Type),(Sales,Profit),(sum(),mean())); global.topSales = top(results,1);'']]

[[topSales[0][1]]] stores in [[topSales[0][0]]] achieved the best total sales ([[abbreviateNumber(currencyFormat(topSales[0][2]))]]).

[[sortedByProfit = sort(results, (x,y) -> sign(y[3] - x[3])); global.topProfit = top(sortedByProfit,1);'']]

[[topProfit[0][1]]] stores in [[topProfit[0][0]]] achieved the best average profit ([[abbreviateNumber(currencyFormat(topProfit[0][3]))]]).

The printed result is:

Groceries stores in California achieved the best total sales ($164.8K).

Clothing stores in New York achieved the best average profit ($4.8K).

Note

In a "Describe Row in Context" project, the ATL would be slightly different:

[[groupByTable(WholeTable, (RegionColumn,TypeColumn), (SalesColumn,ProfitColumn), (sum(),mean()))]].