Skip to main content

Filtering a JSON array

To filter a JSON array, use the filter function.

The best primer on filter is the function topic, which describes the function's purpose, explains the parameter requirements, and provides examples. Read that topic first to gain a basic understanding.

This topic is a supplement to the function topic. It covers common use cases, including some not covered by the function topic. It also has more complex examples.

Index

Example dataset

The examples below use a dataset too large to display fully here. The topic is easier to follow if you download and open THIS EXAMPLE PROJECT, paste the example ATL into a script, and click Preview to see results.

The full dataset is an array — "yr2021" — of 96 JSON objects. For reference, the first three are:

{
    "yr2021": [
        {
            "ID": 1,
            "city": "Atlanta",
            "region": "Southeast",
            "month": "Jan",
            "quarter": "Q1",
            "sales": 56000,
            "target": 50000
        },
        {
            "ID": 2,
            "city": "Boston",
            "region": "Northeast",
            "month": "Jan",
            "quarter": "Q1",
            "sales": 32000,
            "target": 40000
        },
        {
            "ID": 3,
            "city": "Chicago",
            "region": "Midwest",
            "month": "Jan",
            "quarter": "Q1",
            "sales": 44000,
            "target": 45000
        }
    ]
}

Important

Each object has seven key–value pairs. Note that each object has a unique ID value.

Applying a single-field filter

Suppose you want to calculate the total sales value for Q2. The first step is to get all objects for which the quarter value is Q2. You can achieve this with a single-field filter.

ATL in Script

Filter Condition

[[filter(WholeJSON.yr2021, x -> x.quarter == 'Q2')]]

Retain the object only if the quarter value is equal to Q2.

Paste this ATL into the example project and click Preview. The result is a filtered array containing Objects 25–48 (inclusive). Once you have a filtered array, you can use other ATL functions — in this case, map and totalVal — to calculate the Q2 sales total.

ATL in Script

Result

[[Q2data = filter(WholeJSON.yr2021, x -> x.quarter == 'Q2');
totalVal(map(Q2data, x -> x.sales))]]

1,265,000

Tip

All examples in this topic use dot notation. If preferred, use bracket notation instead.

[[Q2data = filter(WholeJSON['yr2021'], x -> x['quarter'] == 'Q2');
totalVal(map(Q2data, x -> x['sales']))]]

Now suppose you want the combined sales total for Q2 and Q3. This requires a single-field filter that applies two conditions. Specifically, you want all objects for which the quarter value is Q2 or Q3. Here is the ATL:

ATL in Script

Filter Condition

[[filter(WholeJSON.yr2021, x -> x.quarter == 'Q2' || x.quarter == 'Q3')]]

Retain object if the quarter value is equal to Q2 or Q3.

The result is a filtered array containing Objects 24–72. Again, you could interrogate the filtered array further with other ATL functions to calculate, for example, the total sales value for Q2–Q3.

Note that the ATL uses the OR operator (||) to add an extra condition. The other option is to add a condition with the AND operator (&&). This is less common in single-field filters, but here's one example:

ATL in Script

Filter Condition

[[filter(WholeJSON.yr2021, x -> x.sales > 30000 && x.sales < 40000)]]

Retain object if the sales value is greater than 30,000 AND less than 40,000.

The filter applies two conditions. Since the conditions are separated by the AND operator (&&), the object must meet both conditions to pass the filter. When two conditions are separated by the OR operator (||), the object passes by meeting either condition.

Applying a multiple-field filter

Suppose that you want the average Q2 sales value for Boston. This requires a multiple-field filter.

ATL in Script

Filter Condition

[[filter(WholeJSON.yr2021, x -> x.city == 'Boston' && x.quarter == 'Q2')]]

Retain if the city value is Boston AND the quarter value is Q2.

You can then interrogate the filtered array to calculate the average sales value.

ATL in Script

Result

[[Q2Boston = filter(WholeJSON.yr2021, x -> x.city == 'Boston' && x.quarter == 'Q2'); mean(map(Q2Boston, x -> x.sales))]]

43,666.67

The following filter checks data in four fields:

ATL in Script

Result

[[filter(WholeJSON.yr2021, x -> x.region == 'South' && x.quarter == 'Q4' && (x.sales > 65000 || percentageChange(x.sales,x.target) > 5))]]

Objects 92 and 93.

The above ATL retains all objects for which:

  • The region value is 'South' AND

  • The quarter value is 'Q4' AND

  • The sales value exceeds 65,000 OR exceeds the target value by 5% or more.

All examples above use a lambda expression to define the filter conditions; and in each case, x represents the input object. Use a different symbol for the object if it makes your ATL easier to follow. For example:

[[filter(WholeJSON.yr2021,  inputObject -> inputObject.region == 'South' && inputObject.quarter == 'Q4' && (inputObject.sales > 65000 || percentageChange(inputObject.sales,inputObject.target) > 5))]]

Applying a case-insensitive filter

For a case-insensitive filter, you must define a condition that is case-insensitive.

Suppose you want all objects for December (Objects 89–96). Most have have a month value of 'Dec', but several have a different value such as 'DEC' or 'dec'. A filter that is case-sensitive won't return each desired object.

ATL in Script

Result

[[filter(WholeJSON.yr2021, x -> x.month == 'Dec')]]

Objects 89, 91, 93, 95 and 96.

Here is one way to account for the variations in casing:

ATL in Script

Result

[[filter(WholeJSON.yr2021, x -> lower(x.month) == 'dec')]] 

Objects 89–96.

Now, it doesn't matter if the object's month value is 'dec', 'Dec', 'DEC', or 'DEc'. All pass the filter because the condition converts the value to lowercase before testing if that value is equal to 'dec' (the constant).

Note

The lowercase conversion is done with the lower function.

Filtering with the negation operator

Suppose that you want the total sales value for Q2–Q4 (inclusive). First, you need all objects for which the quarter value is not Q1. The simplest way to achieve this is to use the negation operator.

ATL in Script

Result

[[filter(WholeJSON.yr2021, object -> object.quarter != 'Q1')]]

Objects 25–96.

Note that this ATL uses the NOT EQUAL TO operator (!=). This is formed by adding the negation operator (!) before an equals sign. You can then interrogate the filtered array to get the total sales value:

ATL in Script

Result

[[notQ1data = filter(WholeJSON.yr2021, object -> object.quarter != 'Q1'); totalVal(map(notQ1data, object -> object.sales))]]

3,663,000

You can also use the negation operator before any ATL function that returns a Boolean value of true or false. The negation operator flips true to false, and vice versa. For example:

ATL in Script

Result

[[filter(WholeJSON.yr2021, x -> startsWith(x.month,'J'))]]

Objects 1–8 and 41–56.

[[filter(WholeJSON.yr2021, x -> !startsWith(x.month,'J'))]]

Objects 9–40 and 57–96.

The first example retains objects with a month value that starts with the character 'J'. The second does the opposite — that is, it retains objects with a month value that does not start with 'J'.

Note

The ATL functions that return a Boolean true or false:

contains, endsWith, hasKey, isCountry, isDateTime, isEmpty, isNotEmpty, isNumber, startsWith

Filtering out objects with null values

A common requirement is to filter OUT objects with a null value in a specific field.

You can do this by using the NOT EQUAL TO operator (!=).

For example, to remove all objects with a null target value:

ATL in Script

Result

[[filter(WholeJSON.yr2021, x -> x.target != null)]]

Objects 1–93.

Note that Objects 94–96 have a null target value; therefore, they don't pass the filter.

Sometimes, you must test for nulls before you apply another filter condition. For example, suppose you want all objects with a target value greater than 55,000. You might try a single-condition filter:

ATL in Script

Result

[[filter(WholeJSON.yr2021, x -> x.target > 55000)]]

MAJOR WARNING: Comparison of null

Studio returns an error here because when the function evaluates Objects 94–96, it finds a null value in the target field, and it cannot assess a null value against the given constant (55000).

You can avoid this by adding a condition that tests for nulls.

[[filter(WholeJSON.yr2021, x -> x.target != null && x.target > 55000)]]

Tip

The null-testing condition must be first; otherwise, you still get the error.

Filtering with the contains function

You can combine filter with contains to apply less stringent filters. Rather than filtering by looking for an exact match, you can filter by checking whether the relevant field value merely contains the given string value.

To understand this better, compare and contrast these examples:

ATL in Script

Result

[[filter(WholeJSON.yr2021, x -> x.region == 'South')]]

Objects 4, 5, 20, 21, 28, 29, 36, 37, 45, 52, 53, 60, 61, 68, 69, 76, 77, 84, 85, 92 and 93.

This first example tests for exact matches. It retains all objects for which the region value is 'South'.

ATL in Script

Result

[[filter(WholeJSON.yr2021, x -> contains(x.region,'South'))]]

Objects 1, 4, 5, 9, 17, 20, 21, 25, 28, 29, 33, 36, 37, 41, 45, 49, 52, 53, 57, 60, 61, 65, 68, 69, 73, 76, 77, 81, 84, 85, 89, 92 and 93.

This second example tests for objects with a region value that contains the string 'South'. The result is a larger filtered array because it includes objects with a region value of 'South' or 'Southeast'.

Note

Objects with a region value of 'Southern' or 'Southwest' would also pass.

A more powerful way to use contains is to test if the field value is one of a range of values.

For example, to retain all objects for April–July:

ATL in Script

Result

[[

wantedMonths = makeList('Apr', 'May', 'Jun', 'Jul');

filter(WholeJSON.yr2021, x -> contains(wantedMonths, x.month))

]]

Objects 25–56.

The first line defines a list of wanted values. The second line filters the array by searching the wanted list for each object's month value. If the object's value is found in the list, the object passes the filter.

Use the negation operator (!) before contains to apply a does-not-contain filter.

ATL in Script

Result

[[

unwantedMonths = makeList('Apr', 'May', 'Jun', 'Jul');

filter(WholeJSON.yr2021, x -> !contains(unwantedMonths, x.month))

]]

Objects 1–24 and 57–96..

Filtering a two-dimensional array

All previous examples show how to filter an array of JSON objects; however, you can also have your data in a two-dimensional array — that is, an array or arrays. Here's a small example dataset:

{
    "Q1data": [
	[1, "Austin", "Jan", 56000, 50000],
	[2, "Boston", "Jan", 66000, 70000],
	[3, "Dallas", "Jan", 62000, 60000],
	[4, "Austin", "Feb", 51000, 45000],
	[5, "Boston", "Feb", 63000, 62000],
	[6, "Dallas", "Feb", 54000, 57000],
	[7, "Austin", "Mar", 52000, 50000],
	[8, "Boston", "Mar", 74000, 70000],
	[9, "Dallas", "Mar", 59000, 60000]
    ]
}

When your JSON data is structured this way, there are no fields — e.g. month or sales — to reference for the purpose of data retrieval. Instead, you must use bracket notation with zero-based indexing.

Note that the second value in each inner array is the name of a city. Zero-based indexing applies, so you can apply a by-city filter by referencing the values at Index 1. For example, to get the arrays for Boston:

ATL in Script

Result

[[filter(WholeJSON.Q1data, x -> x[1] == 'Boston')]]

Inner arrays 2, 5 and 8

To filter by month and sales, reference the values at Indexes 2 and 3.

ATL in Script

Result

[[filter(WholeJSON.Q1data, x -> x[2] == 'Jan' && x[3] > 60000)]]

Inner arrays 2 and 3

It might help to clarify your ATL if you create variables for the required index numbers. For example:

ATL in Script

Result

[[

month = 2; sales = 3;

filter(WholeJSON.Q1data, x -> x[month] == 'Jan' && x[sales] > 60000)

]] 

Inner arrays 2 and 3

Once you have filtered the top-level array, you can interrogate it further with other ATL functions.

To get the Jan–Feb data for Dallas and calculate the sales total, the ATL is:

[[

city = 1; month = 2; sales = 3;

filteredData = filter(WholeJSON.Q1data, x -> x[city] == 'Dallas' && x[month] != 'Mar');

allSales = map(filteredData, x -> x[sales]);

totalSales = currencyFormat(sum(allSales));

joinStrings('Excluding March, the Q1 sales total for Dallas is ', totalSales, '.')

]]

The output text:

Excluding March, the Q1 sales total for Dallas is $116,000.