Data access in table projects

When you upload data, Studio automatically creates variables that give you access to it. In table projects, the variable names correspond to the column headers in your table data; however, they differ slightly depending on project type.

Tip

For a primer on the different project types, see Key Concepts > Projects.

Data access variables — Describe Each Row

This project type uses what we call one-dimensional table access, meaning you can only access a single row of data at any one time.

The variables that Studio automatically creates for this project type refer directly to the cell values in the focus row, and they take their names from the column headers in your table data.

Suppose you create a Describe Each Row project and upload this data:

	City	Population 2010	Population 2014	Population 2015
Row 1	Sydney	4,183,471	4,448,914	4,526,479
Row 2	Melbourne	3,953,939	4,266,718	4,353,514
Row 3	Brisbane	2,019,074	2,175,751	2,209,453
Row 4	Perth	1,723,218	1,932,749	1,958,912
Row 5	Adelaide	1,225,668	1,276,711	1,288,681

Studio creates four variables for accessing the data:

City
Population_2010
Population_2014
Population_2015
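
In this example, the variable names are the column headers with spaces replaced by underscores. The Python sketch below illustrates that mapping. It assumes this is the only transformation applied; Studio's handling of other characters is not covered here, and header_to_variable is a hypothetical helper, not part of Studio.

```python
def header_to_variable(header: str) -> str:
    """Illustrative only: mimic the header-to-variable mapping seen in the example."""
    # Assumption: spaces become underscores and nothing else changes.
    return header.strip().replace(" ", "_")


headers = ["City", "Population 2010", "Population 2014", "Population 2015"]
variables = [header_to_variable(h) for h in headers]
print(variables)
```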

You can use these variables in a script:

[[City]] has a population of [[Population_2015]]. That's [[percentageChange(Population_2015,Population_2014)]]% up on last year's count of [[Population_2014]], and [[percentageChange(Population_2015,Population_2010)]]% up on 2010's figure of [[Population_2010]].

The output for Row 1 is:

Sydney has a population of 4,526,479. That’s 1.7% up on last year’s count of 4,448,914, and 8.2% up on 2010’s figure of 4,183,471.

The output for Row 2 is:

Melbourne has a population of 4,353,514. That’s 2.0% up on last year’s count of 4,266,718, and 10.1% up on 2010’s figure of 3,953,939.

KEY POINT: The variables created by "Describe Each Row" projects return column values for the focus row.
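
To make the arithmetic behind these outputs concrete, here is a rough Python model of one-dimensional access: the script is evaluated once per row, and each variable resolves to that row's cell value. The percentage_change and describe functions are stand-ins written for this illustration, on the assumption that percentageChange computes (new - old) / old * 100; they are not Studio's implementation.

```python
rows = [
    {"City": "Sydney", "Population_2010": 4183471,
     "Population_2014": 4448914, "Population_2015": 4526479},
    {"City": "Melbourne", "Population_2010": 3953939,
     "Population_2014": 4266718, "Population_2015": 4353514},
]


def percentage_change(new: float, old: float) -> float:
    # Assumed definition: relative change from old to new, as a percentage.
    return (new - old) / old * 100


def describe(row: dict) -> str:
    # Every lookup uses the focus row only; no other row is visible here.
    vs_2014 = percentage_change(row["Population_2015"], row["Population_2014"])
    vs_2010 = percentage_change(row["Population_2015"], row["Population_2010"])
    return (
        f"{row['City']} has a population of {row['Population_2015']:,}. "
        f"That's {vs_2014:.1f}% up on last year's count of {row['Population_2014']:,}, "
        f"and {vs_2010:.1f}% up on 2010's figure of {row['Population_2010']:,}."
    )


for row in rows:
    print(describe(row))
```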

Data access variables — Describe the Table

	City	Population 2010	Population 2014	Population 2015
Row 1	Sydney	4,183,471	4,448,914	4,526,479
Row 2	Melbourne	3,953,939	4,266,718	4,353,514
Row 3	Brisbane	2,019,074	2,175,751	2,209,453
Row 4	Perth	1,723,218	1,932,749	1,958,912
Row 5	Adelaide	1,225,668	1,276,711	1,288,681

This project type uses what we call two-dimensional data access. When you generate just one narrative describing the whole table, there is no focus row, which means Studio only provides variables that refer to whole columns, not to individual cells.

To reference the value of a particular cell, you must use a data access function. To enable this, the table rows need names too. Studio assumes that the first column in your data table contains the row names. The special variable used to access this column is called RowNames. In the table above, this is the City column, making each city name the key for accessing a row. The items in the key column must be unique or row access is not possible.
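
Because row access depends on unique keys, it can be worth checking the first column before upload. The following Python sketch is one way to do that outside Studio; duplicate_row_names is a hypothetical helper, not a Studio function.

```python
from collections import Counter


def duplicate_row_names(row_names):
    """Return the row names that occur more than once, sorted alphabetically."""
    counts = Counter(row_names)
    return sorted(name for name, n in counts.items() if n > 1)


cities = ["Sydney", "Melbourne", "Brisbane", "Perth", "Adelaide"]
print(duplicate_row_names(cities))                        # no duplicates: safe as keys
print(duplicate_row_names(["Sydney", "Perth", "Perth"]))  # "Perth" is ambiguous
```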

For the remaining three table columns, the same three variables are created as in the one-dimensional case, but each refers to the entire column. So a reference to the Population_2010 variable returns all data in the Population 2010 column. If column or row name variables are called in a script, Studio prints all values in the column or row as a punctuated list.

Tip

In addition, Studio provides a variable called WholeTable, which gives access to the entire table. If this variable is referenced in a script, the output is printed as an HTML table.

Using the data given above, you could write the following ATL script:

The five most populous cities in Australia are [[RowNames]]. In 2015, their respective population figures were [[Population_2015]], averaging at [[precision(mean(Population_2015),0)]]. This average was [[percentageChange(mean(Population_2015), mean(Population_2014))]]% higher than in 2014. The most populous city was [[rowNames(max(Population_2015))]].

The output narrative would be:

The five most populous cities in Australia are Sydney, Melbourne, Brisbane, Perth and Adelaide. In 2015, their respective population figures were 4,526,479, 4,353,514, 2,209,453, 1,958,912 and 1,288,681, averaging at 2,867,408. This average was 1.68% higher than in 2014. The most populous city was Sydney.
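
The figures in this narrative can be checked with a short Python sketch that treats each column variable as a list. This is only a model of the calculation, not of how Studio evaluates ATL; punctuated_list is a hypothetical helper that imitates the list formatting shown in the output.

```python
from statistics import mean

row_names = ["Sydney", "Melbourne", "Brisbane", "Perth", "Adelaide"]
population_2014 = [4448914, 4266718, 2175751, 1932749, 1276711]
population_2015 = [4526479, 4353514, 2209453, 1958912, 1288681]


def punctuated_list(items):
    # Join with commas and a final "and", as in the narrative output.
    items = [str(item) for item in items]
    if len(items) <= 1:
        return "".join(items)
    return ", ".join(items[:-1]) + " and " + items[-1]


average_2015 = round(mean(population_2015))
change = round(
    (mean(population_2015) - mean(population_2014)) / mean(population_2014) * 100, 2
)
# The row name at the position of the largest 2015 value.
most_populous = row_names[population_2015.index(max(population_2015))]

print(punctuated_list(row_names))
print(f"{average_2015:,}", change, most_populous)
```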

If you know which row names will appear in the script, you can refer to them by their exact string when using one of our Data Access Functions. For example:

In 2015, Sydney had [[cell("Sydney",Population_2015)]] inhabitants.

produces the following narrative output.

In 2015, Sydney had 4,526,479 inhabitants.
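
One way to picture this kind of lookup is as a table keyed first by row name and then by column. The Python sketch below models that idea. The cell and column functions here are written for this illustration only, and they take column names as strings, whereas the ATL call above passes the column variable itself.

```python
# Row names (the first column) act as keys; each row maps column name to value.
table = {
    "Sydney":    {"Population_2010": 4183471, "Population_2015": 4526479},
    "Melbourne": {"Population_2010": 3953939, "Population_2015": 4353514},
    "Brisbane":  {"Population_2010": 2019074, "Population_2015": 2209453},
    "Perth":     {"Population_2010": 1723218, "Population_2015": 1958912},
    "Adelaide":  {"Population_2010": 1225668, "Population_2015": 1288681},
}


def cell(row_name: str, column_name: str) -> int:
    # A single value is identified by its row key and its column.
    return table[row_name][column_name]


def column(column_name: str) -> list:
    # A whole column is the same field taken from every row, in table order.
    return [row[column_name] for row in table.values()]


print(f"In 2015, Sydney had {cell('Sydney', 'Population_2015'):,} inhabitants.")
```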

Data access variables — Describe Row in Context

"Describe Row in Context" projects combine aspects of the two previous project types. Like the one-dimensional project type, this type has the concept of a focus row, but it also allows two-dimensional data access. It generates one narrative per row, but the values for the row can be compared to the values of other rows, table-wide averages, maximums, and so on. For example, you can say how the city currently being described compares to the average city.

To make this possible, Studio creates both cell variables and column variables, in addition to the WholeTable variable. Similar to the "Describe Each Row" project type, variables referring to cells in the focus row take their names from the column headers. The names of variables referring to entire columns are formed by appending the word "Column" to the header names.

Assume a "Describe Row in Context" project with this data:

	City	Population 2010	Population 2014	Population 2015
Row 1	Sydney	4,183,471	4,448,914	4,526,479
Row 2	Melbourne	3,953,939	4,266,718	4,353,514
Row 3	Brisbane	2,019,074	2,175,751	2,209,453
Row 4	Perth	1,723,218	1,932,749	1,958,912
Row 5	Adelaide	1,225,668	1,276,711	1,288,681

For this dataset, Studio creates these cell and column variables:

FocusRowName
RowNames
Population_2010
Population_2010Column
Population_2014
Population_2014Column
Population_2015
Population_2015Column

Important

FocusRowName and RowNames are the cell and column variables for the City column. Studio doesn't create City and CityColumn variables because City is the first column, that is, the RowNames column.
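
The naming pattern for this project type can be summarized in a few lines of Python. This is a sketch inferred from the example list, assuming spaces become underscores and the first column is always the row-name column; context_variables is a hypothetical helper, not part of Studio.

```python
def context_variables(headers):
    """Illustrative only: list the variable names implied by the column headers."""
    # The first column is exposed through the two special row-name variables.
    names = ["FocusRowName", "RowNames"]
    for header in headers[1:]:
        base = header.strip().replace(" ", "_")  # assumed naming rule
        names.append(base)             # cell variable for the focus row
        names.append(base + "Column")  # column variable for the whole column
    return names


headers = ["City", "Population 2010", "Population 2014", "Population 2015"]
print(context_variables(headers))
```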

You might use these variables in the following ATL script:

In 2015, [[FocusRowName]] had [[Population_2015]] inhabitants, meaning its population was [[if(Population_2015 > mean(Population_2015Column)){larger}else{smaller}]] than the average across the 5 most populous cities in Australia ([[RowNames]]).

The narrative output for Row 1 would be:

In 2015, Sydney had 4,526,479 inhabitants, meaning its population was larger than the average across the 5 most populous cities in Australia (Sydney, Melbourne, Brisbane, Perth and Adelaide).

Note how the script references both the cell variable Population_2015 and the corresponding column variable Population_2015Column. It uses both variable types to make a comparison between each city's 2015 population and the average across the whole 2015 column.
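
The comparison can be modeled in Python as a cell value checked against the mean of its column. The sketch below is an illustration of the logic only; compare_to_average is a hypothetical function, and the variable names simply echo the Studio ones.

```python
from statistics import mean

row_names = ["Sydney", "Melbourne", "Brisbane", "Perth", "Adelaide"]
population_2015_column = [4526479, 4353514, 2209453, 1958912, 1288681]


def compare_to_average(focus_row_name: str) -> str:
    # Cell variable: the focus row's value in the 2015 column.
    population_2015 = population_2015_column[row_names.index(focus_row_name)]
    # Column variable: every value in the 2015 column.
    if population_2015 > mean(population_2015_column):
        return "larger"
    return "smaller"


for name in row_names:
    print(name, compare_to_average(name))
```

With this data the column mean is about 2,867,408, so Sydney and Melbourne come out as larger and the other three cities as smaller.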

You can do much more with two-dimensional data. See Data access functions.