rollingCorrelation
Performs a rolling correlation analysis for two series of numbers. A window is rolled over both series, position by position, and each segment under the window is analyzed to calculate its Pearson correlation coefficient.
The result is a list of ATL objects, one for each labeled position. An example object is:
|
The returned coefficient (corrValue) in each object is a number between 1 and -1. A coefficient close to 1 indicates very strong positive correlation between the two series with the second series increasing with the first. A coefficient close to -1 indicates very strong negative correlation with the second series decreasing when the first increases. A coefficient close to zero indicates very little correlation.
Parameters
SERIES 1 (list)
The first series of numbers. This must be a list or list-like object with three or more numbers.
SERIES 2 (list)
The second series of numbers. This must be a list or list-like object with three or more numbers.
LABELS (list)
A series of labels. This must be a list or list-like object containing at least three labels.
WINDOW SIZE (number)
The size of the rolling window. This must be an integer value >= 3.
SHIFT INDEX (number)
Optional. The number of places to shift the second series before running the analysis. For example, if SHIFT INDEX = 2, the SERIES 1 window ending at position X will be correlated with the SERIES 2 window ending at position X+2.
Default: 0
Notes
If SERIES 1 is longer than LABELS, the function automatically truncates SERIES 1 to match the length of LABELS.
If SERIES 2 is longer than LABELS, the function automatically truncates SERIES 2 to match the length of LABELS. The function factors in any requested shift (SHIFT INDEX) before truncating.
If LABELS contains a null value, the function uses 'NoLabel' for that label in the output object.
If the rolling window captures a null value in SERIES 1 or SERIES 2, the function cannot calculate a coefficient value for that segment. This is indicated by 'NaN' in the output object.
Examples
Assume a "Describe the Table" project with this data:
Code | Month | Sales | COGS | grossProfit | otherRev | otherExp | netProfit | |
---|---|---|---|---|---|---|---|---|
Row 1 | 1001 | Jan | 12,563.35 | 5,000.00 | 7,563.35 | 1,000.00 | 2,000.00 | 6,563.35 |
Row 2 | 1002 | Feb | 12,867.41 | 5,500.00 | 7,367.41 | 1,200.00 | 2,000.00 | 6,567.41 |
Row 3 | 1003 | Mar | 13,582.93 | 5,750.00 | 7,832.93 | 1,200.00 | 2,000.00 | 7,032.93 |
Row 4 | 1004 | Apr | 13,558.15 | 6,000.00 | 7,558.15 | 2,250.00 | 4,000.00 | 5,808.15 |
Row 5 | 1005 | May | 13,892.68 | 7,000.00 | 6,792.68 | 5,450.00 | 4,000.00 | 8,342.68 |
Row 6 | 1006 | Jun | 14,847.74 | 7,000.00 | 7,847.74 | 3,500.00 | 4,500.00 | 6,847.74 |
You can find the correlation between Sales and Net Profit (from Jan through Jun) using pearsonCorrelation:
ATL in Script | Result | Notes |
---|---|---|
| 0.2582927 | The result indicates very weak positive correlation between Sales and Net Profit. |
This is a useful insight, but you can use rollingCorrelation
to analyze the same trend in greater detail. For example, you can use a three-month rolling window to take a snapshot of the data at each labeled position — that is, at every month.
The ATL is [[rollingCorrelation(Sales,netProfit,Month,3)]]
and the output is this list of ATL objects:
The function's rolling window takes a snapshot at each labeled position. The window ends at each position, so it captures the requested values for each month AND the two months that precede it. If the window fails to capture three values in both input series, the function cannot calculate a coefficient value. This is why NaN (not a number) is the returned corrValue in the first two objects (Jan and Feb).
The result includes four coefficient values, two of which indicate very strong positive correlation between Sales and Net Profit. These two results are contrary to the broader trend (very weak positive correlation between Sales and Net Profit) that we identified previously using the pearsonCorrelation function to analyze correlation for Jan through Jun.
Here's how you might extract values from the result and produce narrative text:
[[global.completeResults = rollingCorrelation(Sales,netProfit,Month,3); global.filteredResults = filter(completeResults, x -> isNumber(x.corrValue)); global.strongCorr = filter(filteredResults, x -> x.corrValue >= 0.9); global.weakerCorr = filter(filteredResults, x -> x.corrValue < 0.9);'']] There was very strong positive correlation between Sales and Net Profit during [[map(strongCorr, x -> "[[x.labels[0]]]–[[x.labels[2]]]")]]. The correlation was weaker during [[map(weakerCorr, x -> "[[x.labels[0]]]–[[x.labels[2]]]")]]
The output text is:
There was very strong positive correlation between Sales and Net Profit during Jan–Mar and Mar–May.
The correlation was weaker during Feb–Apr and Apr–Jun.
Note
Remember, you can input any list of numbers to SERIES 1 and SERIES 2. You don't need to use column variables. In a JSON project, you could create variables that return a list of numbers, then input those variables to the function.
Using the SHIFT INDEX parameter
Assume a "Describe the Table" project with this data:
Code | Month | Orders | salesRev | otherRev | totalRev | |
---|---|---|---|---|---|---|
Row 1 | 1001 | Jan | 24 | 14,000.00 | 1,000.00 | 15,000.00 |
Row 2 | 1002 | Feb | 48 | 10,000.00 | 1,000.00 | 11,000.00 |
Row 3 | 1003 | Mar | 44 | 18,000.00 | 1,000.00 | 19,000.00 |
Row 4 | 1004 | Apr | 54 | 16,500.00 | 1,000.00 | 17,500.00 |
Row 5 | 1005 | May | 66 | 19,500.00 | 1,200.00 | 20,700.00 |
Row 6 | 1006 | Jun | 48 | 21,000.00 | 1,250.00 | 22,250.00 |
Suppose this data is for a business that delivers its product 30 days after order and only receives payment upon delivery. Given this, any correlation analysis for orders and sales revenue should factor in a month's delay to account for the gap between order and payment. You can run this analysis using function's optional SHIFT INDEX parameter.
First, let's run the analysis without using the SHIFT INDEX parameter.
The ATL is [[rollingCorrelation(Orders,salesRev,Month,3)]]
and the result is this list of ATL objects:
The correlation values are significantly lower than you would expect for orders and sales revenue, which typically correlate very strongly. This is because the analysis does not account for the month's delay between order and payment. For example, it compares Jan–Mar orders to Jan–Mar sales revenue. See the highlighted values in the third object above.
To fix this, use the SHIFT INDEX parameter to shift the second series one place.
The ATL is [[rollingCorrelation(Orders,salesRev,Month,3,1)]]
and the result is this list of ATL objects:
The effect of the shift can be seen by comparing the values2 results in this image to those in the previous image. This time, the analysis does factor in the required month delay. For example, it compares Jan–Mar orders to Feb–Apr sales revenue. See the highlighted values in the third object above.
Now, each returned corrValue exceeds 0.9, indicating very strong positive correlation.
Important
All examples in this topic use 3 for the WINDOW SIZE parameter. This is to keep the examples simple. Remember, you can use a much larger window size if required.