Skip to main content

rollingCorrelation

Performs a rolling correlation analysis for two series of numbers. A window is rolled over both series, position by position, and each segment under the window is analyzed to calculate its Pearson correlation coefficient.

The result is a list of ATL objects, one for each labeled position. An example object is:

(labels = Jan, Feb and March, values1 = 14, 22 and 36, values2 = 204, 268 and 283, corrValue = 0.8775575)

The returned coefficient (corrValue) in each object is a number between 1 and -1. A coefficient close to 1 indicates very strong positive correlation between the two series with the second series increasing with the first. A coefficient close to -1 indicates very strong negative correlation with the second series decreasing when the first increases. A coefficient close to zero indicates very little correlation.

Parameters

  • SERIES 1 (list)

    The first series of numbers. This must be a list or list-like object with three or more numbers.

  • SERIES 2 (list)

    The second series of numbers. This must be a list or list-like object with three or more numbers.

  • LABELS (list)

    A series of labels. This must be a list or list-like object containing at least three labels.

  • WINDOW SIZE (number)

    The size of the rolling window. This must be an integer value >= 3.

  • SHIFT INDEX (number)

    Optional. The number of places to shift the second series before running the analysis. For example, if SHIFT INDEX = 2, the SERIES 1 window ending at position X will be correlated with the SERIES 2 window ending at position X+2.

    Default: 0

Notes

  • If SERIES 1 is longer than LABELS, the function automatically truncates SERIES 1 to match the length of LABELS.

  • If SERIES 2 is longer than LABELS, the function automatically truncates SERIES 2 to match the length of LABELS. The function factors in any requested shift (SHIFT INDEX) before truncating.

  • If LABELS contains a null value, the function uses 'NoLabel' for that label in the output object.

  • If the rolling window captures a null value in SERIES 1 or SERIES 2, the function cannot calculate a coefficient value for that segment. This is indicated by 'NaN' in the output object.

Examples

Assume a "Describe the Table" project with this data:

Code

Month

Sales

COGS

grossProfit

otherRev

otherExp

netProfit

Row 1

1001

Jan

12,563.35

5,000.00

7,563.35

1,000.00

2,000.00

6,563.35

Row 2

1002

Feb

12,867.41

5,500.00

7,367.41

1,200.00

2,000.00

6,567.41

Row 3

1003

Mar

13,582.93

5,750.00

7,832.93

1,200.00

2,000.00

7,032.93

Row 4

1004

Apr

13,558.15

6,000.00

7,558.15

2,250.00

4,000.00

5,808.15

Row 5

1005

May

13,892.68

7,000.00

6,792.68

5,450.00

4,000.00

8,342.68

Row 6

1006

Jun

14,847.74

7,000.00

7,847.74

3,500.00

4,500.00

6,847.74

You can find the correlation between Sales and Net Profit (from Jan through Jun) using pearsonCorrelation:

ATL in Script

Result

Notes

[[pearsonCorrelation(Sales,netProfit)]]

0.2582927

The result indicates very weak positive correlation between Sales and Net Profit.

This is a useful insight, but you can use rollingCorrelation to analyze the same trend in greater detail. For example, you can use a three-month rolling window to take a snapshot of the data at each labeled position — that is, at every month.

The ATL is [[rollingCorrelation(Sales,netProfit,Month,3)]] and the output is this list of ATL objects:

rollCorr1.png

The function's rolling window takes a snapshot at each labeled position. The window ends at each position, so it captures the requested values for each month AND the two months that precede it. If the window fails to capture three values in both input series, the function cannot calculate a coefficient value. This is why NaN (not a number) is the returned corrValue in the first two objects (Jan and Feb).

The result includes four coefficient values, two of which indicate very strong positive correlation between Sales and Net Profit. These two results are contrary to the broader trend (very weak positive correlation between Sales and Net Profit) that we identified previously using the pearsonCorrelation function to analyze correlation for Jan through Jun.

Here's how you might extract values from the result and produce narrative text:

[[global.completeResults = rollingCorrelation(Sales,netProfit,Month,3);
global.filteredResults = filter(completeResults, x -> isNumber(x.corrValue));
global.strongCorr = filter(filteredResults, x -> x.corrValue >= 0.9);
global.weakerCorr = filter(filteredResults, x -> x.corrValue < 0.9);'']]

There was very strong positive correlation between Sales and Net Profit during [[map(strongCorr, x -> "[[x.labels[0]]]–[[x.labels[2]]]")]].

The correlation was weaker during [[map(weakerCorr, x -> "[[x.labels[0]]]–[[x.labels[2]]]")]]

The output text is:

There was very strong positive correlation between Sales and Net Profit during Jan–Mar and Mar–May.

The correlation was weaker during Feb–Apr and Apr–Jun.

Note

Remember, you can input any list of numbers to SERIES 1 and SERIES 2. You don't need to use column variables. In a JSON project, you could create variables that return a list of numbers, then input those variables to the function.

Using the SHIFT INDEX parameter

Assume a "Describe the Table" project with this data:

Code

Month

Orders

salesRev

otherRev

totalRev

Row 1

1001

Jan

24

14,000.00

1,000.00

15,000.00

Row 2

1002

Feb

48

10,000.00

1,000.00

11,000.00

Row 3

1003

Mar

44

18,000.00

1,000.00

19,000.00

Row 4

1004

Apr

54

16,500.00

1,000.00

17,500.00

Row 5

1005

May

66

19,500.00

1,200.00

20,700.00

Row 6

1006

Jun

48

21,000.00

1,250.00

22,250.00

Suppose this data is for a business that delivers its product 30 days after order and only receives payment upon delivery. Given this, any correlation analysis for orders and sales revenue should factor in a month's delay to account for the gap between order and payment. You can run this analysis using function's optional SHIFT INDEX parameter.

First, let's run the analysis without using the SHIFT INDEX parameter.

The ATL is [[rollingCorrelation(Orders,salesRev,Month,3)]] and the result is this list of ATL objects:

rollCorr2.png

The correlation values are significantly lower than you would expect for orders and sales revenue, which typically correlate very strongly. This is because the analysis does not account for the month's delay between order and payment. For example, it compares Jan–Mar orders to Jan–Mar sales revenue. See the highlighted values in the third object above.

To fix this, use the SHIFT INDEX parameter to shift the second series one place.

The ATL is [[rollingCorrelation(Orders,salesRev,Month,3,1)]] and the result is this list of ATL objects:

rollCorr3.png

The effect of the shift can be seen by comparing the values2 results in this image to those in the previous image. This time, the analysis does factor in the required month delay. For example, it compares Jan–Mar orders to Feb–Apr sales revenue. See the highlighted values in the third object above.

Now, each returned corrValue exceeds 0.9, indicating very strong positive correlation.

Important

All examples in this topic use 3 for the WINDOW SIZE parameter. This is to keep the examples simple. Remember, you can use a much larger window size if required.