Alteryx has convinced me that data is infinitely malleable. You can flip it, spin it, twist it, shape it, break it into a million pieces and then rebuild it into the precise configuration you need. So long as you know what you want the output to look like, Alteryx will let you get there. The exact steps may not be immediately obvious (especially to people starting out), but there are enough ways to combine tools that you will ultimately get the output you want.
I recently got a request to take a bunch of disparate data files and produce a single output file. This is bread and butter stuff for Alteryx. A few joins, unions, filters and formulas allowed me to take half a dozen individual files and produce (within a few seconds) a single, consolidated Excel file. Awesome. Job done…or so I thought.
After I produced the output file, I was given the spec required to upload the consolidated data into a customer’s system. They needed the data pivoted so that the header dates became columns, with the dates sorted in ascending order.
Okay…that’s what the Cross Tab tool is for. So I pivoted on Forecast Date (I.e. made Forecast Date the new column header) which did create a column for each unique date, but the field format was wrong. Whereas the customer spec indicated “MM/DD/YY” for the date headers, the Cross Tab tool was outputting “YYYY_MM_DD”.
However, if I reformatted the date to be in the correct format, then the Cross Tab tool would sort the fields in ascending order based on Month and Day, meaning that January would always come first even if it was in the following year. This is because the Cross Tab tool sets the column header as a string and sorts ascending with no obvious way to change that sort. It also adds those underscores, which I don’t particularly like.
In pre-Alteryx days this would have necessitated some recurring tweaking in Excel to change the format of the column headers. But I knew there had to be a way to make this happen within the existing workflow so that the file generated was immediately ready for upload. And there was, via 4 pretty easy steps.
Step 1. Create a separate data stream that uses the Summarize tool to pull in the original Forecast Date and a new field called “Header Date” that is simply a string version of Forecast Date, but in the right format. I used String vs Date since the Cross Tab tool converts all column headers to string anyway.
I then used a Sort tool to sort ascending on Forecast Date and a Record ID tool to add a record ID to every row.
What this step produces is a simple 3-column data set that looks like this:
Step 2. Use the Join tool to combine this new stream with the main data stream, joined on Forecast Date.
Step 3. Use the Cross Tab tool to pivot the data as before, but this time using the new Record ID field as the header.
This produced the output in the required structure and format, adding at most another second or so to the overall workflow time. And my love affair with this software continues…
Thanks for reading.