Filtering a table region
The key ATL function for filtering tables is filterRows. This function is available in "Describe the Table" and "Describe Row in Context" projects because these allow access to multiple rows.
The best primer on filterRows
is the function topic, which describes the function's purpose, explains the parameter requirements, and provides examples. Read that topic first to gain a basic understanding.
This topic is a supplement to the function topic. It covers common use cases, including some not covered by the function topic. It also has more complex examples.
Index
Example dataset
The examples below use a dataset too large to display fully here. The topic is easier to follow if you download and open THIS EXAMPLE PROJECT, paste the example ATL into a script, and click Preview to see results.
The full dataset has 96 rows. For reference, the first four are:
ID | City | Region | Month | Quarter | Year | Sales | Target |
---|---|---|---|---|---|---|---|
1 | Atlanta | Southeast | Jan | Q1 | 2021 | 56,000.00 | 50,000.00 |
2 | Boston | Northeast | Jan | Q1 | 2021 | 32,000.00 | 40,000.00 |
3 | Chicago | Midwest | Jan | Q1 | 2021 | 44,000.00 | 45,000.00 |
4 | Dallas | South | Jan | Q1 | 2021 | 57,000.00 | 50,000.00 |
Important
You cannot filter a table by the values in the first column. Each value in the first column must be unique. The best approach is to add an ID column — like the one in the example project — before uploading the data in Studio.
filterRows and different project types
The filterRows
function works in two project types:
Describe the Table
Describe Row in Context
Each ATL solution given below works in a project of either type. This is because the ATL references specify columns (second parameter) by giving each column name in string form.
If you give column variables instead, the ATL differs slightly by project type. Remember, in "Describe Row in Context" projects, column variables end in 'Column' — for example, SalesColumn
instead of Sales
.
Applying a single-column filter
Suppose you want to calculate the total Sales value for Q2. The first step is to get all rows for which the Quarter value is Q2. You can achieve this with a single-column filter.
ATL in Script | Filter Condition |
---|---|
| Retain row if the Quarter value is Q2. |
Paste this ATL into the example project and click Preview. The result is a filtered table region containing Rows 25–48 (inclusive). Once you have a filtered region, you can use other ATL functions — in this case, columnsInRegion and totalVal — to calculate the Q2 sales total.
ATL in Script | Result |
---|---|
[[Q2data = filterRows(WholeTable, 'Quarter', x -> x == 'Q2'); totalVal(columnsInRegion(Q2data, Sales))]] | 1,265,000 |
Tip
In a "Describe Row in Context" project, the column variable is SalesColumn
, not Sales
.
Now suppose you want the combined sales total for Q2 and Q3. This requires a single-column filter that applies two conditions. Specifically, you want all rows for which the Quarter value is Q2 or Q3. Here is the ATL:
ATL in Script | Filter Condition |
---|---|
| Retain row if the Quarter value is equal to Q2 or Q3. |
The result is a filtered table region containing Rows 24–72. Again, you could interrogate this filtered region further with other ATL functions to calculate, for example, the combined sales total for Q2 and Q3.
Note that the ATL uses the OR operator (||
) to add an extra condition. The other option is to add a condition with the AND operator (&&
). This is less common in single-column filters, but here's one example:
ATL in Script | Filter Condition |
---|---|
| Retain row if the Sales value is greater than 30,000 AND less than 40,000. |
The filter applies two conditions. Since these conditions are separated by the AND operator (&&
), the row must meet both conditions to pass the filter. When two conditions are separated by the OR operator (||
), the row passes by meeting either condition.
Applying a multiple-column filter
Suppose that you want the average Q2 sales value for Boston. This requires a multiple-column filter.
[[filterRows(WholeTable, ('City','Quarter'), (x,y) -> x == 'Boston' && y == 'Q2')]]
You can then interrogate the filtered region to calculate the average sales value.
ATL in Script | Result |
---|---|
[[Q2Boston = filterRows(WholeTable, ('City','Quarter'), (x,y) -> x == 'Boston' && y == 'Q2'); mean(columnsInRegion(Q2Boston, Sales))]] | 43,666.67 |
Tip
In a "Describe Row in Context" project, the column variable is SalesColumn
, not Sales
.
The following filter uses the data in four columns:
[[filterRows(WholeTable, ('Region','Quarter','Sales','Target'), (a,b,c,d) -> a == 'South' && b == 'Q4' && (c > 65000 || percentageChange(c,d) > 5))]]
The above ATL retains all rows for which:
The Region value is 'South' AND
The Quarter value is 'Q4' AND
The Sales value exceeds 65,000 OR exceeds the Target value by 5% or more.
The last example filters the table by using four columns: Region, Quarter, Sales, and Target . These columns are listed in the function's second parameter and are represented by a, b, c, and d in the lambda expression.
Note that you can use full words to represent each column if preferred. For example, this ATL performs the same operation and is probably easier for your colleagues to follow:
[[filterRows(WholeTable, ('Region','Quarter','Sales','Target'), (region,quarter,sales,target) -> region == 'South' && quarter == 'Q4' && (sales > 65000 || percentageChange(sales,target) > 5))]]
Applying a case-insensitive filter
For a case-insensitive filter, you must define a condition that is case-insensitive.
Suppose you want all rows for December (Rows 89–96). Most have a Month value of 'Dec', but several have a different value such as 'DEC' or 'dec' . A filter that is case-sensitive is insufficient.
ATL in Script | Result |
---|---|
| Rows 89, 91, 93, 95 and 96. |
Here is one way to account for the variations in casing:
ATL in Script | Result |
---|---|
| Rows 89–96. |
Now, it doesn't matter if the row's Month value is 'dec', 'Dec', 'DEC', or 'DEc'. All pass the filter because the condition converts the value to lowercase before testing if that value is equal to 'dec' (the constant).
Note
The lowercase conversion is done with the lower function.
Filtering with the negation operator
Suppose that you want the total sales value for Q2–Q4 (inclusive). First, you need all rows for which the Quarter value is not Q1. The simplest way to achieve this is to use the negation operator.
ATL in Script | Result |
---|---|
| Rows 25–96. |
Note that this ATL uses the NOT EQUAL TO operator (!=
). This is formed by adding the negation operator (!
) before an equals sign. You can then interrogate the filtered table to get the total sales value:
ATL in Script | Result |
---|---|
[[notQ1data = filterRows(WholeTable,'Quarter', x -> x != 'Q1'); totalVal(columnsInRegion(notQ1data,Sales))]] | 3,663,000 |
You can also use the negation operator before any ATL function that returns a Boolean value of true or false. The negation operator flips true to false, and vice versa. For example:
ATL in Script | Result |
---|---|
| Rows 1–8 and 41–56. |
| Rows 9–40 and 57–96. |
The first example retains rows with a Month value that starts with the character 'J'. The second does the opposite — that is, it retains rows with a Month value that does not start with 'J'.
Note
The ATL functions that return a Boolean true or false:
contains, endsWith, hasKey, isCountry, isDateTime, isEmpty, isNotEmpty, isNumber, startsWith
Filtering out rows with missing values
A common requirement is to filter OUT rows that are missing a value in a specific column.
You can do this by using the isNotEmpty function in your filter.
For example, suppose you want to remove all rows with no Target value:
ATL in Script | Result |
---|---|
| Rows 1–94. |
Note that Rows 95 and 96 have no Target value; therefore, they don't pass the filter.
Sometimes, you must use isNotEmpty
before you can apply another filter condition. For example, suppose you want all rows with a Target value greater than 55,000. You might try a single-condition filter:
ATL in Script | Result |
---|---|
| MAJOR WARNING: Comparison of null |
Studio returns an error here because when the function evaluates Rows 95 and 96, it finds no value in the Target column, and it cannot assess a non-existent value against the given constant (55000).
You can avoid this by adding an isNotEmpty
condition:
[[filterRows(WholeTable,'Target', x -> isNotEmpty(x) && x > 55000)]]
Tip
The isNotEmpty
condition must be first; otherwise, you'll still get the error.
Filtering with the contains function
You can combine filterRows
with contains to apply less stringent filters. Rather than filtering by looking for an exact match, you can filter by checking whether a column value merely contains the given string value.
To understand this better, compare and contrast these examples:
ATL in Script | Result |
---|---|
| Rows 4, 5, 12, 13, 20, 21, 28, 29, 36, 37, 44, 45, 52, 53, 60, 61, 68, 69, 76, 77, 84, 85, 92 and 93. |
This first example tests for exact matches. It retains all rows for which the Region value is 'South'.
ATL in Script | Result |
---|---|
| Rows 1, 4, 5, 9, 12, 13, 17, 20, 21, 25, 28, 29, 33, 36, 37, 41, 45, 49, 52, 53, 57, 60, 61, 65, 68, 69, 73, 76, 77, 81, 84, 85, 89, 92 and 93. |
This second example tests for rows with a Region value that contains the string 'South'. The result is a larger table region because it includes rows with a Region value of 'South' or 'Southeast'.
Note
Rows with a Region value of 'Southern' or 'Southwest' would also pass the filter.
A more powerful way to use contains is to test if the column value is one of a range of values.
For example, to retain all rows for April–July:
ATL in Script | Result |
---|---|
[[ wantedMonths = makeList('Apr', 'May', 'Jun', 'Jul'); filterRows(WholeTable, 'Month', x -> contains(wantedMonths, x)) ]] | Rows 25–56. |
The first line defines a list of wanted values. The second line filters the array by searching the wanted list for each row's Month value. If the row's value is found in the list, the row passes the filter.
Use the negation operator (!
) before contains
to apply a does-not-contain filter.
ATL in Script | Result |
---|---|
[[ unwantedMonths = makeList('Apr', 'May', 'Jun', 'Jul'); filterRows(WholeTable, 'Month', x -> !contains(unwantedMonths, x)) ]] | Rows 1–24 and 57–96. |