Blog

Blog

Ready for an Update (columns)?

November 28, 2017 | Kevin Crooks

Imagine a table of data; full of potential insights; but with poorly formatted columns (or at least, not as we would like). How might we tidy these, and prepare the table for analysis?

This is an issue when dealing with date type columns, with the ambiguously-defined "1/2/12" - is this the Day/Month/Year (DMY) style; the MDY style, or even beginning with the year first (each the standard in a specific context). Wishing to apply formatting for our own preference however, we would need to manually fix each column - and ensure we're being consistent!

Schematiq v4.0 introduces a new tbl.UpdateColumns function. We will show here how this can be used to create a custom formatting function, and to apply this function simultaneously to multiple columns of your data table.

As an example, we will use some London temperature data: Our aim will be to convert all of these values from Celsius to Fahrenheit, using only a few easily-editable Schematiq functions. We will show how these functions may be easily modified and automatically applied to all of our selected columns.

Throughout this example, we will be using the built-in Schematiq functions, and the Schematiq table object, which will be displayed in the Viewer. The first time that these functions are used, we will state the function arguments:

C18: = tbl.Create([data], [columnNames])

= tbl.Create(B4:E15, B3:E3)

TemperatureData

This function loads the data in to a Schematiq table, which is displayed in the Viewer to the right-hand-side.

If we only want to calculate a single column, we can use the tbl.CalculateColumn function, where the same operation is applied to each row. As an example of this, let's create a function that will remove the text formatting "°C" from a specific entry.

We do this by specifying a snippet function to apply to any value "x", and use fn.Call to call this function using the cell C4 as the input:

D20: A snippet function has the same effect when called as the Excel function =LEFT([cell], LEN([cell]) - 2)

x => LEFT(x, LEN(x) - 2)

C20: = fn.Call(function, [inputValue1])

= fn.Call(D20, C4)

Snippet

We can see from the result that this has worked for cell C4, so let's use tbl.CalculateColumn to apply this function to the entire column at once:

C22: = tbl.CalculateColumn(table, [inputColumnNames], function)

= tbl.CalculateColumn(C18, "Normal", D20)

CalcColumn

We see in the Viewer that this function has now updated the column "Normal", using the formula that we constructed in cell D20.

We could apply this formula twice more to the remaining columns ("Warmest" and "Coldest"), to complete our table. This works for cases where we only wish to update a small number of columns. But for date or text formatting functions, our table could easily have tens of columns to fix, and this quickly turns into a large manual task.

So we could do this - but fortunately tbl.UpdateColumns gives us a better way, by updating all selected columns using the same function.

C24: = tbl.UpdateColumns(table, [inputColumnNames], function)

= tbl.UpdateColumns(C18, C3:E3, D20)

UpdateColumns

We see in the Viewer that the function in cell D20 has been applied to all the columns C3:E3 with a single function call. (We could manually input a list of columns here, or even use a function to select which columns we want - see the PRO-tip section below).

To complete our formatting, let's create a final snippet function to convert these values from Celsius to Fahrenheit. This function will perform the conversion and then truncate to one decimal place using Excel's TRUNC function:

D26:

x => TRUNC(32 + x * 9 / 5,1)

C26:

= tbl.UpdateColumns(C24, C3:E3, D26)

We've completed the conversion, but looking in the Viewer, we'd quite like to include the new units "°F". This can be achieved with a single edit to our snippet function:

D26:

x => TRUNC(32 + x * 9 / 5,1) & "°F"

UpdateSnippet

The Viewer shows that our result is correct - with a single edit to our update function, our columns have been updated.

This completes our example data-set, and there is a downloadable copy of this worksheet here. This sheet requires an active Schematiq subscription - please contact us for a trial licence.

The tbl.UpdateColumns can be applied in far more advanced settings, such as custom date-formatting or text-formatting functions, or even when the specific columns to update may be unknown at the point of creating the Schematiq table. See the PRO-tip section, for a more advanced Schematiq formatting tip.

 

*PRO-tip: Selecting columns to update based on criteria:

In our above formula we have chosen to update every column name specified by the range C3:E3:

C24:

= tbl.UpdateColumns(C18, C3:E3, D20)

We could equally have provided the column names directly:

C24:

= tbl.UpdateColumns(C18, { "Normal", "Warmest", "Coldest" }, D20)

Alternatively, we can use the Schematiq tbl.ColumnNames function to select which columns we'd like to update. To select every column of the table we can use the function

= tbl.ColumnNames(table)

and use this as the [inputColumnNames] argument to tbl.UpdateColumns:

C24:

= tbl.UpdateColumns(C18, tbl.ColumnNames(C18), D20)

In most cases though, we want to select which column names we want to use, based on some criteria function that returns either TRUE or FALSE.

For example, to apply our function to only the columns "Warmest" and "Coldest", we could use the snippet function

E24:

x => RIGHT(x, 3) = "est"

which will return TRUE for "Warmest" and "Coldest" but FALSE for "Months" and "Normal". This is particularly useful with a DATE-formatting function, where we could select columns based on column headers containing the text "Date" ("Start Date", "End Date", etc.).

We can use our snippet function in tbl.ColumnNames to return a range containing the two matching column names "Warmest" and "Coldest".

D24: = tbl.ColumnNames(table, [predicate])

= tbl.ColumnNames(C18, E24)

Finally, we can edit the following formulas apply the update function D20 to these two columns only, meaning that our previous column updates will only be applied to the final two columns.

C24:

= tbl.UpdateColumns(C18, D24, D20)

C26:

= tbl.UpdateColumns(C24, D24, D26)

ColumnNames

Tags: Schematiq UpdateColumns Simplify