Skip to main content

Filtering a table region

The key ATL function for filtering tables is filterRows. This function is available in "Describe the Table" and "Describe Row in Context" projects because these allow access to multiple rows.

The best primer on filterRows is the function topic, which describes the function's purpose, explains the parameter requirements, and provides examples. Read that topic first to gain a basic understanding.

This topic is a supplement to the function topic. It covers common use cases, including some not covered by the function topic. It also has more complex examples.

Index

Example dataset

The examples below use a dataset too large to display fully here. The topic is easier to follow if you download and open THIS EXAMPLE PROJECT, paste the example ATL into a script, and click Preview to see results.

The full dataset has 96 rows. For reference, the first four are:

ID

City

Region

Month

Quarter

Year

Sales

Target

1

Atlanta

Southeast

Jan

Q1

2021

56,000.00

50,000.00

2

Boston

Northeast

Jan

Q1

2021

32,000.00

40,000.00

3

Chicago

Midwest

Jan

Q1

2021

44,000.00

45,000.00

4

Dallas

South

Jan

Q1

2021

57,000.00

50,000.00

Important

You cannot filter a table by the values in the first column. Each value in the first column must be unique. The best approach is to add an ID column — like the one in the example project — before uploading the data in Studio.

filterRows and different project types

The filterRows function works in two project types:

  • Describe the Table

  • Describe Row in Context

Each ATL solution given below works in a project of either type. This is because the ATL references specify columns (second parameter) by giving each column name in string form.

If you give column variables instead, the ATL differs slightly by project type. Remember, in "Describe Row in Context" projects, column variables end in 'Column' — for example, SalesColumn instead of Sales.

Applying a single-column filter

Suppose you want to calculate the total Sales value for Q2. The first step is to get all rows for which the Quarter value is Q2. You can achieve this with a single-column filter.

ATL in Script

Filter Condition

[[filterRows(WholeTable, 'Quarter', x -> x == 'Q2')]]

Retain row if the Quarter value is Q2.

Paste this ATL into the example project and click Preview. The result is a filtered table region containing Rows 25–48 (inclusive). Once you have a filtered region, you can use other ATL functions — in this case, columnsInRegion and totalVal — to calculate the Q2 sales total.

ATL in Script

Result

[[Q2data = filterRows(WholeTable, 'Quarter', x -> x == 'Q2');
totalVal(columnsInRegion(Q2data, Sales))]]

1,265,000

Tip

In a "Describe Row in Context" project, the column variable is SalesColumn, not Sales.

Now suppose you want the combined sales total for Q2 and Q3. This requires a single-column filter that applies two conditions. Specifically, you want all rows for which the Quarter value is Q2 or Q3. Here is the ATL:

ATL in Script

Filter Condition

[[filterRows(WholeTable, 'Quarter', x -> x == 'Q2' || x == 'Q3')]]

Retain row if the Quarter value is equal to Q2 or Q3.

The result is a filtered table region containing Rows 24–72. Again, you could interrogate this filtered region further with other ATL functions to calculate, for example, the combined sales total for Q2 and Q3.

Note that the ATL uses the OR operator (||) to add an extra condition. The other option is to add a condition with the AND operator (&&). This is less common in single-column filters, but here's one example:

ATL in Script

Filter Condition

[[filterRows(WholeTable, 'Sales', x -> x > 30000 && x < 40000)]]

Retain row if the Sales value is greater than 30,000 AND less than 40,000.

The filter applies two conditions. Since these conditions are separated by the AND operator (&&), the row must meet both conditions to pass the filter. When two conditions are separated by the OR operator (||), the row passes by meeting either condition.

Applying a multiple-column filter

Suppose that you want the average Q2 sales value for Boston. This requires a multiple-column filter.

[[filterRows(WholeTable, ('City','Quarter'), (x,y) -> x == 'Boston' && y == 'Q2')]]

You can then interrogate the filtered region to calculate the average sales value.

ATL in Script

Result

[[Q2Boston = filterRows(WholeTable, ('City','Quarter'), (x,y) -> x == 'Boston' && y == 'Q2'); mean(columnsInRegion(Q2Boston, Sales))]]

43,666.67

Tip

In a "Describe Row in Context" project, the column variable is SalesColumn, not Sales.

The following filter uses the data in four columns:

[[filterRows(WholeTable, ('Region','Quarter','Sales','Target'), (a,b,c,d) -> a == 'South' && b == 'Q4' && (c > 65000 || percentageChange(c,d) > 5))]]

The above ATL retains all rows for which:

  • The Region value is 'South' AND

  • The Quarter value is 'Q4' AND

  • The Sales value exceeds 65,000 OR exceeds the Target value by 5% or more.

The last example filters the table by using four columns: Region, Quarter, Sales, and Target . These columns are listed in the function's second parameter and are represented by a, b, c, and d in the lambda expression.

Note that you can use full words to represent each column if preferred. For example, this ATL performs the same operation and is probably easier for your colleagues to follow:

[[filterRows(WholeTable, ('Region','Quarter','Sales','Target'), (region,quarter,sales,target) -> region == 'South' && quarter == 'Q4' && (sales > 65000 || percentageChange(sales,target) > 5))]]

Applying a case-insensitive filter

For a case-insensitive filter, you must define a condition that is case-insensitive.

Suppose you want all rows for December (Rows 89–96). Most have a Month value of 'Dec', but several have a different value such as 'DEC' or 'dec' . A filter that is case-sensitive is insufficient.

ATL in Script

Result

[[filterRows(WholeTable, 'Month', x -> x == 'Dec')]]

Rows 89, 91, 93, 95 and 96.

Here is one way to account for the variations in casing:

ATL in Script

Result

[[filterRows(WholeTable, 'Month', x -> lower(x) == 'dec')]]

Rows 89–96.

Now, it doesn't matter if the row's Month value is 'dec', 'Dec', 'DEC', or 'DEc'. All pass the filter because the condition converts the value to lowercase before testing if that value is equal to 'dec' (the constant).

Note

The lowercase conversion is done with the lower function.

Filtering with the negation operator

Suppose that you want the total sales value for Q2–Q4 (inclusive). First, you need all rows for which the Quarter value is not Q1. The simplest way to achieve this is to use the negation operator.

ATL in Script

Result

[[filterRows(WholeTable,'Quarter', x -> x != 'Q1')]]

Rows 25–96.

Note that this ATL uses the NOT EQUAL TO operator (!=). This is formed by adding the negation operator (!) before an equals sign. You can then interrogate the filtered table to get the total sales value:

ATL in Script

Result

[[notQ1data = filterRows(WholeTable,'Quarter', x -> x != 'Q1');
totalVal(columnsInRegion(notQ1data,Sales))]]

3,663,000

You can also use the negation operator before any ATL function that returns a Boolean value of true or false. The negation operator flips true to false, and vice versa. For example:

ATL in Script

Result

[[filterRows(WholeTable, 'Month', x -> startsWith(x,'J'))]]

Rows 1–8 and 41–56.

[[filterRows(WholeTable, 'Month', x -> !startsWith(x,'J'))]]

Rows 9–40 and 57–96.

The first example retains rows with a Month value that starts with the character 'J'. The second does the opposite — that is, it retains rows with a Month value that does not start with 'J'.

Note

The ATL functions that return a Boolean true or false:

contains, endsWith, hasKey, isCountry, isDateTime, isEmpty, isNotEmpty, isNumber, startsWith

Filtering out rows with missing values

A common requirement is to filter OUT rows that are missing a value in a specific column.

You can do this by using the isNotEmpty function in your filter.

For example, suppose you want to remove all rows with no Target value:

ATL in Script

Result

[[filterRows(WholeTable,'Target', x -> isNotEmpty(x))]]

Rows 1–94.

Note that Rows 95 and 96 have no Target value; therefore, they don't pass the filter.

Sometimes, you must use isNotEmpty before you can apply another filter condition. For example, suppose you want all rows with a Target value greater than 55,000. You might try a single-condition filter:

ATL in Script

Result

[[filterRows(WholeTable,'Target', x -> x > 55000)]]

MAJOR WARNING: Comparison of null

Studio returns an error here because when the function evaluates Rows 95 and 96, it finds no value in the Target column, and it cannot assess a non-existent value against the given constant (55000).

You can avoid this by adding an isNotEmpty condition:

[[filterRows(WholeTable,'Target', x -> isNotEmpty(x) && x > 55000)]]

Tip

The isNotEmpty condition must be first; otherwise, you'll still get the error.

Filtering with the contains function

You can combine filterRows with contains to apply less stringent filters. Rather than filtering by looking for an exact match, you can filter by checking whether a column value merely contains the given string value.

To understand this better, compare and contrast these examples:

ATL in Script

Result

[[filterRows(WholeTable, 'Region', x -> x == 'South')]]

Rows 4, 5, 12, 13, 20, 21, 28, 29, 36, 37, 44, 45, 52, 53, 60, 61, 68, 69, 76, 77, 84, 85, 92 and 93.

This first example tests for exact matches. It retains all rows for which the Region value is 'South'.

ATL in Script

Result

[[filterRows(WholeTable, 'Region' , x -> contains(x,'South'))]]

Rows 1, 4, 5, 9, 12, 13, 17, 20, 21, 25, 28, 29, 33, 36, 37, 41, 45, 49, 52, 53, 57, 60, 61, 65, 68, 69, 73, 76, 77, 81, 84, 85, 89, 92 and 93.

This second example tests for rows with a Region value that contains the string 'South'. The result is a larger table region because it includes rows with a Region value of 'South' or 'Southeast'.

Note

Rows with a Region value of 'Southern' or 'Southwest' would also pass the filter.

A more powerful way to use contains is to test if the column value is one of a range of values.

For example, to retain all rows for April–July:

ATL in Script

Result

[[

wantedMonths = makeList('Apr', 'May', 'Jun', 'Jul');

filterRows(WholeTable, 'Month', x -> contains(wantedMonths, x))

]]

Rows 25–56.

The first line defines a list of wanted values. The second line filters the array by searching the wanted list for each row's Month value. If the row's value is found in the list, the row passes the filter.

Use the negation operator (!) before contains to apply a does-not-contain filter.

ATL in Script

Result

[[

unwantedMonths = makeList('Apr', 'May', 'Jun', 'Jul');

filterRows(WholeTable, 'Month', x -> !contains(unwantedMonths, x))

]]

Rows 1–24 and 57–96.