groupByVarianceJson
This function calculates the variance between the sum-aggregated values for two measures (base and comparison), then breaks the variance down by one or more grouping dimensions to identify drivers and offsets of that variance.
The function works with JSON data arranged in a two-dimensional array (i.e. an array of arrays). As there are no field names to reference, such as Sales or Sales Target, you must use zero-based indexing to identify the measures and dimensions in the data (i.e. their positions in the inner arrays). This is explained further in the Examples below.
There are two main scenarios for variance, and this function covers both. First, you might want to do a target-based variance to compare two measures, such as comparing Sales against Sales Target. Secondly, you might want to do a time-based variance to compare the same measure from two sets of data, such as Sales for Q2 against Sales for Q1. In the second scenario, you should use the filter function to create two datasets — one for Q1 data and another for Q2 data. Both approaches are demonstrated below.
You can also use this function to break down the variance to identify drivers (a variance in the same direction as its parent) or an offset (a variation in the opposite direction); for example, a comparison of Sales against Sales Targets, broken down by Country. If total sales for all countries fall short of the sales target, France will be a driver if its sales are also less than target. If sales in France are above target, France will be an offset. If more than one dimension is given, each child variance (driver or offset) is broken down by the next dimension; for example, each Country variance is broken down by City.
The function returns an ATL object with fields such as the base measure value (e.g. Sales), the comparison measure value (e.g. Sales Target), the variation amount, the variation as a percentage, and the drivers and offsets. The drivers and offsets fields each hold a list of variance objects for the next dimension breakdown (including the same fields). For further detail about the fields, see Result object fields.
Occasionally, you might not have figures for one of the measures you are comparing. For example, you might not have the Q1 sales data for Lyon to compare with the Q2 sales data for Lyon. In this case, the measure value is zero and the variance object does not appear in the list of drivers or offsets; instead, it goes into either the zeroBaseEntries or zeroComparisonEntries list, depending on whether it was the base or comparison measure value that was zero. This is because you may want to describe this variance object differently in your narrative — e.g. "The Lyon branch was not open in Q1, so no comparison is possible".
You can download an example project and results ATL object file for the Examples below.
Note
Available in "Describe a JSON Object" projects only. For table projects, use groupByVarianceTable.
Parameters
BASE DATA (JSON array)
This is the first set of input data to be aggregated for base measure. This is a two-dimensional array and can be the whole data array or a filtered set of data, such as for a specific time period.
COMPARISON DATA (JSON array)
This is the second set of input data. This is a two-dimensional array and can be the whole data array or a filtered set of data, such as for a specific time period. If comparing two different measures (e.g. Sales and Sales Target), use the same array that is given in the BASE DATA parameter. If comparing the same measure for two time periods, this should be the data for the comparison time period.
BASE MEASURE INDEX (number)
The base measure is the data to aggregate (sum) and compare to the comparison measure. The parameter value is the zero-based index number for the position of this field in the inner array; for example, if it is the third field in the array, the index value given is 2. The field must contain numbers. Any null values are counted as a zero.
COMPARISON MEASURE INDEX (number)
The comparison measure is the field to aggregate (sum) and compare against the base measure. The parameter value is the zero-based index number for the position of this field in the inner array. The variance value will be the base measure minus the comparison measure. The field must contain numbers. Any null values are counted as a zero.
DIMENSION INDEXES (number or list)
One or more dimensions by which to break down the total. For multiple dimensions, input a list of indexes up to a maximum of 10. When you give multiple dimensions, the function works through the list from left to right.
DIRECTION PERCENTAGE THRESHOLD (number)
When the absolute variance percentage is below this value, the variance direction is returned as NOCHANGE. This does not affect whether a child variance is regarded as a driver of offset of its parent.
Result object fields
The function returns a multi-level ATL object containing the following fields:
Field | Description |
---|---|
dimensionIndex | The name of the dimension (e.g. Country or City). For the parent variance object, this is an empty string. |
dimensionInstance | The value of the dimension (e.g. France or Germany if the dimension is Country). For the parent variance object, this is an empty string. |
level | The top object is level 1. Its drivers and offsets, the variances broken down by the first dimension (e.g. Country), are level 2. Their drivers and offsets, which are the variances broken down by the second dimension (e.g. City) are level 3, and so on. |
baseValue | The aggregated total for the base measure. |
comparisonValue | The aggregated total for the comparison measure. |
varianceValue | The variance value = (baseValue – comparisonValue). |
variancePercentage | The variance value as a percentage of the comparison value. When the comparison value is zero, the variance percentage value is set to 100%. |
baseMeasureIndex | The index value for the base measure (e.g. 4 for Sales in the examples below). |
comparisonMeasureIndex | The index value for the comparison measure (e.g. 5 for Sales Target in the examples below). |
drivers | A list of variance objects broken down by the next dimension, where the movement (variance) is in the same direction (positive or negative) to its parent variance. The list is given in descending order of absolute variance amount. This is an empty list for the last dimension (bottom level). |
offsets | A list of variance objects broken down by the next dimension, where the movement (variance) is in the opposite direction (positive or negative) to its parent variance. The list is given in descending order of absolute variance amount. This is an empty list for the last dimension (bottom level). |
zeroBaseEntries | A list of variance objects for the next dimension breakdown, where the base measure value is zero and the comparison measure value is non-zero. |
zeroComparisonEntries | A list of variance objects for the next dimension breakdown, where the comparison measure value is zero. |
direction | The variance direction. Either INCREASE, DECREASE, or NOCHANGE. |
Tip
Printing the ATL object while you're developing the narrative can produce a large output that is difficult to read. We recommend you just print the part(s) you're checking at that moment. See Printing parts of the result in groupByVarianceTable
for tips.
Examples
Note
The two examples below are derived from this DOWNLOADABLE EXAMPLE PROJECT. The READ ME script contains a full explanation of the project and narrative.
For reasons of space, we've not included the full resulting ATL objects for the examples, but to make it easier to understand them, we recommend you download the file using this link, DOWNLOADABLE RESULTS FILE, and look at the content using a text editor.
The downloadable project uses the same data and global variables as used in the examples below. However, it also uses multiple script and conditional statements to cover possible data scenarios not included in the examples, such as a variance having no offsets. It therefore provides a more detailed example of how this function might be used. Further guidance on Using conditionals is given at the end of the groupByVarianceTable
topic.
Assume a "Describe a JSON object" project with the following two-dimensional data:
{ "columnNames": ["Code", "Quarter", "Country", "City", "Sales", "Sales Target", "Profit"], "data": [ [1001, "Q1", "France", "Paris", 423248.15, 400000, 38092.33], [2001, "Q1", "France", "Paris", 342263.44, 350000, 25669.76], [3001, "Q1", "France", "Lyon", 215863.83, 250000, 19427.74], [4001, "Q1", "France", "Strasbourg", 328174.47, 350000, 34999.19], [1002, "Q2", "France", "Paris", 422246.33, 400000, 37057.89], [2002, "Q2", "France", "Paris", 331782.65, 350000, 20296.45], [3002, "Q2", "France", "Lyon", 235183.78, 250000, 22545.78], [4002, "Q2", "France", "Strasbourg", 346166.18, 350000, 38112.34], [5001, "Q1", "Germany", "Berlin", 377242.65, 350000, 33951.83], [6001, "Q1", "Germany", "Munich", 291370.72, 200000, 23309.65], [7001, "Q1", "Germany", "Hamburg", 290455.48, 300000, 32531.01], [5002, "Q2", "Germany", "Berlin", 369248.72, 350000, 30017.89], [6002, "Q2", "Germany", "Munich", 297370.72, 200000, 27318.25], [7002, "Q2", "Germany", "Hamburg", 275853.45, 300000, 26928.36] ] }
With table data, you don't use the first column as this is usually the row ID. However, in two-dimensional JSON data, you can use zero-based indexing for all fields, including the first one. In the example above, the first array gives the name for each of the seven fields in the inner arrays in the main "data" array. Indexing is zero-based, so the first field - Code - is index 0, the second - Quarter - is index 1 and so on, with the last field - Profit - index 6.
To avoid confusion as to which index number represents which column, we highly recommend assigning meaningful global variables for each index number, such as those below:
[[global.codeIdx = 0; global.quarterIdx = 1; global.countryIdx = 2; global.cityIdx = 3; global.salesIdx = 4; global.salesTargetIdx = 5; global.profitIdx = 6; '']]
Example 1 - Two measures from the same data
This example shows how to calculate variance using two measures (columns) from the same table. This is often called target-based variance. It uses Sales (global.salesIdx
) as the base measure and Sales Target (global.salesTarget.Idx
) as the comparison measure. The grouping dimensions are Country (global.countryIdx) and City (global.cityIdx). The direction percentage threshold is 1%. Since the two measures are from the same dataset, you pass the same dataset (in this case WholeJSON.data
) in to both the base data and the comparison data parameters.
[[global.codeIdx = 0; global.quarterIdx = 1; global.countryIdx = 2; global.cityIdx = 3; global.salesIdx = 4; global.salesTargetIdx = 5; global.profitIdx = 6; '']] [[groupByVarianceJson(WholeJSON.data,WholeJSON.data,global.salesIdx,global.salesTargetIdx,(global.countryIdx,global.cityIdx),1)]]
The resulting ATL object can be seen in the first section of this DOWNLOADABLE RESULTS FILE.
You can extract key data insights from the ATL object using dot notation to access specific key values and bracket notation with index values to access specific values in lists. This is much easier if you first assign the result of the groupByVarianceJson
call to a global script variable, as shown below:
[[global.results = groupByVarianceJson(WholeJSON.data,WholeJSON.data,global.salesIdx,global.salesTargetIdx,(global.countryIdx,global.cityIdx),1); '']]
Once you have your results variable, you can use dot/bracket notation (and other ATL functions) to extract the key data insights for your narrative:
[[global.codeIdx = 0; global.quarterIdx = 1; global.countryIdx = 2; global.cityIdx = 3; global.salesIdx = 4; global.salesTargetIdx = 5; global.profitIdx = 6; '']] [[global.results = groupByVarianceJson(WholeJSON.data,WholeJSON.data,global.salesIdx,global.salesTargetIdx,(global.countryIdx,global.cityIdx),1); '']] The combined sales target for this period was [[currencyFormat(results.comparisonValue,'EUR')]]. Total sales were [[currencyFormat(results.baseValue,'EUR')]], which exceeded the target by [[abs(results.variancePercentage)]]%. This was driven by sales in [[map(results.drivers, x -> joinStrings(x.dimensionInstance," (",abs(x.variancePercentage),"% over target)"))]], and offset by sales in [[map(results.offsets, x -> joinStrings(x.dimensionInstance," (",abs(x.variancePercentage),"% below target)"))]]. Total sales in [[results.drivers[0].dimensionInstance]] were [[currencyFormat(results.drivers[0].baseValue,'EUR')]], which exceeded the country target ([[currencyFormat(results.drivers[0].comparisonValue,'EUR')]]) by [[abs(results.drivers[0].variancePercentage)]]%. This was driven by sales in [[map(results.drivers[0].drivers, x -> joinStrings(x.dimensionInstance," (",abs(x.variancePercentage),"% above target)"))]], and offset by sales in [[map(results.drivers[0].offsets, x -> joinStrings(x.dimensionInstance," (",abs(x.variancePercentage),"% below target)"))]]. Total sales in [[results.offsets[0].dimensionInstance]] were [[currencyFormat(results.offsets[0].baseValue,'EUR')]], which fell short of the country target ([[currencyFormat(results.offsets[0].comparisonValue,'EUR')]]) by [[abs(results.offsets[0].variancePercentage)]]%. This was driven by sales in [[map(results.offsets[0].drivers, x -> joinStrings(x.dimensionInstance," (",abs(x.variancePercentage),"% below target)"))]], and offset by sales in [[map(results.offsets[0].offsets, x -> joinStrings(x.dimensionInstance," (",abs(x.variancePercentage),"% above target)"))]].
The text output is:
The combined sales target for this period was €4,400,000. Total sales were €4,546,471, which exceeded the target by 3.33%. This was driven by sales in Germany (11.86% over target), and offset by sales in France (2.04% below target).
Total sales in Germany were €1,901,542, which exceeded the country target (€1,700,000) by 11.86%. This was driven by sales in Munich (47.19% above target) and Berlin (6.64% above target), and offset by sales in Hamburg (5.62% below target).
Total sales in France were €2,644,929, which fell short of the country target (€2,700,000) by 2.04%. This was driven by sales in Lyon (9.79% below target) and Strasbourg (3.67% below target), and offset by sales in Paris (1.3% above target).
Example 2 - Same measure from different datasets
Instead of comparing two different measures (target-based variance), you can compare the same measure (field) aggregated from two different datasets — that is, two datasets created from the two-dimensional JSON array using the filter function — to create a time-based variance.
For example, you could use this method to compare sales figures for Q2 against Q1. You would start by filtering the array for Q2 data and Q1 data and then input these to the base data
and comparison data
parameters. You would then input the Sales field to both the base measure
and comparison measure
parameters.
To simplify the ATL, use the filter function to create two sets of data, assigning them to global script variables that you can then input into the groupByVarianceJson
function:
[[global.codeIdx = 0; global.quarterIdx = 1; global.countryIdx = 2; global.cityIdx = 3; global.salesIdx = 4; global.salesTargetIdx = 5; global.profitIdx = 6; '']] [[ global.Q1data = filter(WholeJSON.data, x -> x[quarterIdx] == 'Q1'); global.Q2data = filter(WholeJSON.data, x -> x[quarterIdx] == 'Q2'); global.results = groupByVarianceJson(Q2data,Q1data,salesIdx,salesIdx,(countryIdx, cityIdx),1) ]]
The ATL above uses Sales for Q2 arrays (Q2data
) as the base measure
and Sales for Q1 arrays (Q1data
) as the comparison measure
. The grouping dimensions are Country (countryIdx
) and City (cityIdx
). The direction percentage threshold is 1%.
The result is a multi-level ATL object similar to that produced in Example 1; you can see the result in the second section of the downloadable results ATL object file. Again, you can extract the key data insights from the object by using a mixture of dot and bracket notation and ATL functions:
[[global.codeIdx = 0; global.quarterIdx = 1; global.countryIdx = 2; global.cityIdx = 3; global.salesIdx = 4; global.salesTargetIdx = 5; global.profitIdx = 6; '']] [[ global.Q1data = filter(WholeJSON.data, x -> x[quarterIdx] == 'Q1'); global.Q2data = filter(WholeJSON.data, x -> x[quarterIdx] == 'Q2'); global.results = groupByVarianceJson(Q2data,Q1data,salesIdx,salesIdx,(countryIdx, cityIdx),1), '']] Total sales for Q2 were [[currencyFormat(results.baseValue,'EUR')]]. Compared to the Q1 total of [[currencyFormat(results.comparisonValue,'EUR')]], this represents an increase of [[abs(results.variancePercentage)]]%. This was driven by sales in [[map(results.drivers, x -> joinStrings(x.dimensionInstance," (up ",abs(x.variancePercentage),"%)"))]] and offset by sales in [[map(results.offsets, x -> joinStrings(x.dimensionInstance," (down ",abs(x.variancePercentage),"%)"))]]. Q2 sales in [[results.drivers[0].dimensionInstance]] were [[currencyFormat(results.drivers[0].baseValue,'EUR')]], which exceeded Q1's sales ([[currencyFormat(results.drivers[0].comparisonValue)]]) by [[abs(results.drivers[0].variancePercentage)]]%. This was driven by sales in [[map(results.drivers[0].drivers, x -> joinStrings(x.dimensionInstance," (up ",abs(x.variancePercentage),"%)"))]], and offset by sales in [[map(results.drivers[0].offsets, x -> joinStrings(x.dimensionInstance," (down ",abs(x.variancePercentage),"%)"))]]. Q2 sales in [[results.offsets[0].dimensionInstance]] were [[currencyFormat(results.offsets[0].baseValue,'EUR')]], which were down on Q1's sales ([[currencyFormat(results.offsets[0].comparisonValue,'EUR')]]) by [[abs(results.offsets[0].variancePercentage)]]%. This was driven by sales in [[map(results.offsets[0].drivers, x -> joinStrings(x.dimensionInstance," (down ",abs(x.variancePercentage),"%)"))]], and offset by sales in [[map(results.offsets[0].offsets, x -> joinStrings(x.dimensionInstance," (up ",abs(x.variancePercentage),"%)"))]].
The output text is:
Total sales for Q2 were €2,277,852. Compared to the Q1 total of €2,268,619, this represents an increase of 0.41%. This was driven by sales in France (up 1.97%) and offset by sales in Germany (down 1.73%).
Q2 sales in France were €1,335,379, which exceeded Q1's sales ($1,309,550) by 1.97%. This was driven by sales in Lyon (up 8.95%) and Strasbourg (up 5.48%), and offset by sales in Paris (down 1.5%).
Q2 sales in Germany were €942,473, which were down on Q1's sales (€959,069) by 1.73%. This was driven by sales in Hamburg (down 5.03%) and Berlin (down 2.12%), and offset by sales in Munich (up 2.06%).
Printing parts of the result
Printing the whole result object while developing your narrative produces a large output that is difficult to read. Printing select parts of the object is useful because it allows you to see only the part you are interested in at a particular point in development.
For guidance and tips on printing parts of the result, see groupByVarianceTable.