Skip to main content

groupByJsonObject

Groups JSON objects with matching values in one or more fields (hereafter referred to as keys). For each group, the function also aggregates the data for one or more keys. Sum aggregation applies by default, but you can pick a different type if required.

The function returns a list of lists, with each inner list displaying the grouping values and aggregated values for a unique group. The lists are sorted by the first aggregated value in descending numerical order.

Note

Use this function if your JSON data is structured as an array of objects. Use groupByJson if your data is structured as a two-dimensional array — i.e. an array of arrays.

Parameters

  • INPUT DATA (JSON array)

    An array of JSON objects. Fields within each object are represented by key–value pairs.

  • GROUPING KEYS (string or list)

    One or more keys to group by. For multiple keys, give a list of key names.

  • AGGREGATION KEYS (string or list)

    One or more keys with numeric values. For multiple keys, give a list of key names.

  • AGGREGATION FUNCTIONS (function or list)

    Optional. The functions for aggregating the values in the aggregation fields. The number of aggregation functions should match the number of aggregation fields. The functions and fields should be in matching order, working left to right through the ATL.

Aggregation functions

Function

Purpose

Note

count()

Counts the number of objects

Counts null or empty values.

maxVal()

Calculates the maximum value.

Ignores null or empty values.

mean()

Calculates the mean value.

Ignores null or empty values.

minVal()

Calculates the minimum value.

Ignores null or empty values.

sum()

Calculates the sum total.

Ignores null or empty values.

Examples

Assume a "Describe a JSON Object" project with this data:

{
    "finances": [
	{"ID": 1111, "Region": "California", "Type": "Groceries", "Sales": 50235.45, "Profit": 3014.07},
	{"ID": 1112, "Region": "California", "Type": "Groceries", "Sales": 47221.90, "Profit": 3069.48},
	{"ID": 2121, "Region": "Texas", "Type": "Clothing", "Sales": 44476.12, "Profit": 2668.57},
	{"ID": 2111, "Region": "Texas", "Type": "Groceries", "Sales": 55713.66, "Profit": 3285.32},
	{"ID": 1113, "Region": "California", "Type": "Groceries", "Sales": 67315.85, "Profit": 4712.11},
	{"ID": 3121, "Region": "New York", "Type": "Clothing", "Sales": 66391.40, "Profit": 4647.27},
	{"ID": 1121, "Region": "California", "Type": "Clothing", "Sales": 24059.78, "Profit": 1684.18},
	{"ID": 2122, "Region": "Texas", "Type": "Clothing", "Sales": 40727.24, "Profit": 2565.63},
	{"ID": 3111, "Region": "New York", "Type": "Groceries", "Sales": 42457.60, "Profit": 2547.45},
	{"ID": 2112, "Region": "Texas", "Type": "Groceries", "Sales": 61616.90, "Profit": 3697.01},
	{"ID": 1122, "Region": "California", "Type": "Clothing", "Sales": 19393.42, "Profit": 1376.93},
	{"ID": 3122, "Region": "New York", "Type": "Clothing", "Sales": 70781.55, "Profit": 4961.67}
    ]
}

The main array is "finances". This contains 12 objects, each containing five fields. Each field has a key name (e.g. "ID"). You must use these names to specify the grouping and aggregation keys.

Example 1 — Single grouping key with single aggregation

Suppose you want to know how each region performed for total profit. You can group the objects by Region and then sum-aggregate the Profit values for each group.

The ATL is: [[groupByJsonObject(WholeJSON.finances, 'Region', 'Profit', sum())]].

The output is this list of lists:

( ("California", 13856.77), ("Texas", 12216.53), ("New York", 12156.39) )

Each inner list contains values for a distinct group. For each inner list, the first item is the region name, and the second item is the sum-total profit. The list is sorted by the first aggregated value in descending order.

The printed output is:

California and 13,856.77, Texas and 12,216.53 and New York and 12,156.39

Here's how you might extract values from the result to produce narrative text:

[[global.regions = groupByJsonObject(WholeJSON.finances, 'Region', 'Profit', sum()); global.topOne = top(regions,1);'']]

The best performing region was [[topOne[0][0]]], with profits of [[currencyFormat(topOne[0][1])]].

[[sorted = sort(regions, (x,y) -> sign(x[1] - y[1])); global.bottomOne = top(sorted,1);'']]

The worst performing region was [[bottomOne[0][0]]], with profits of [[currencyFormat(bottomOne[0][1])]].

The printed result is:

The best performing region was California, with profits of $13,856.77.

The worst performing region was New York, with profits of $12,156.39.

Example 2 — Single grouping key with multiple aggregations

Sometimes you need to specify multiple aggregation keys and multiple aggregation types. Suppose you want to produce a narrative that identifies (1) the regions that achieved average sales over $50K, and (2) the number of branches in each successful region.

To calculate average sales, you must group by Region and mean-aggregate the Sales totals for each group. To calculate the number of branches in each region, you must group by Region and then apply count() aggregation to the ID field.

The ATL is [[groupByJsonObject(WholeJSON.finances, 'Region', ('Sales','ID'), (mean(),count()))]].

The output is this list or lists:

( ("New York", 59876.85, 3), ("Texas", 50633.48, 4), ("California", 41645.08, 5) )

The inner list contains values for a distinct group. For each inner list, the first item is the region name, the second item is the average sales figure, and the third item is the number of unique branches in the group. The printed output for the ATL is:

New York, 59,876.85 and 3, Texas, 50,633.48 and 4 and California, 41,645.08 and 5

Here's how you might extract values from the result to produce narrative text:

[[global.over = filter(groupByJsonObject(WholeJSON.finances, 'Region', ('Sales','ID'), (mean(),count())), x -> x[1] >= 50000);'']]

[[global.regionsOver = map(over, x -> x[0]);'']]

[[regionsOver]] were the only regions with average sales over $50K.

[[over[0][0]]] averaged [[currencyFormat(over[0][1])]] from [[numToWords(over[0][2],'',10)]] branches.

[[over[1][0]]] averaged [[currencyFormat(over[1][1])]] from [[numToWords(over[1][2],'',10)]] branches.

The printed result is:

New York and Texas were the only regions with average sales over $50K.

New York averaged $59,876.85 from three branches.

Texas averaged $50,633.48 from four branches.

Example 3 — Multiple grouping keys with multiple aggregations

The previous examples use a single grouping key. This example demonstrates the benefits of using two or more keys (second parameter) to make your grouping criteria more precise.

Suppose you want to group by Region and Type. Also, suppose that you wish to discover (1) the Region + Type group with the highest sales, and (2) the Region + Type group with the highest average profit.

The ATL is [[groupByJsonObject(WholeJSON.finances, ('Region','Type'), ('Sales','Profit'), (sum(),mean()))]].

The output is this list of lists:

( ("California", "Groceries", 164772.2, 3598.55), ("New York", "Clothing", 137172.95, 4804.47), ("Texas", "Groceries", 117330.56, 3491.17), ("Texas", "Clothing", 85203.36, 2617.1), ("California", "Clothing", 43453.2, 1530.56), ("New York", "Groceries", 42457.6, 2547.45) )

Each inner list contains values for a distinct group. For each inner list, the first item is the region name, the second item is the store type, the third item is the sum-total for sales, and the fourth item is the mean profit.

The printed output is:

California, Groceries, 164,772.2 and 3,598.55, New York, Clothing, 137,172.95 and 4,804.47, Texas, Groceries, 117,330.56 and 3,491.17, Texas, Clothing, 85,203.36 and 2,617.1, California, Clothing, 43,453.2 and 1,530.56 and New York, Groceries, 42,457.6 and 2,547.45

Here's how you might extract values from the result to produce narrative text:

[[global.results = groupByJsonObject(WholeJSON.finances, ('Region','Type'), ('Sales','Profit'), (sum(),mean()));'']]

[[global.topSales = top(results,1);'']]

[[topSales[0][1]]] stores in [[topSales[0][0]]] achieved the best total sales ([[abbreviateNumber(currencyFormat(topSales[0][2]))]]).

[[sortedByProfit = sort(results, (x,y) -> sign(y[3] - x[3])); global.topProfit = top(sortedByProfit,1);'']]

[[topProfit[0][1]]] stores in [[topProfit[0][0]]] achieved the best average profit ([[abbreviateNumber(currencyFormat(topProfit[0][3]))]]).

The printed result is:

Groceries stores in California achieved the best total sales ($164.8K).

Clothing stores in New York achieved the best average profit ($4.8K).