When you’re fighting ghosts, crossing streams might be something you want to avoid. But when dealing with data inside of Alteryx, you should embrace creating and crossing as many streams as necessary to get the results you want.
A co-worker was recently confronted by an Excel spreadsheet that didn’t want to conform to the workflow he was building in Alteryx. It was one of those spreadsheets that are all too common in corporations – blank rows, hidden columns, stacked headers, sub-total rows sprinkled throughout, etc. The only corporate “norm” it didn’t possess was a rainbow color palette.
In other words:
By adding a Record ID tool to the Input tool, I could then use a Filter tool to select Row 2 as one stream and Rows >= 5 as a second stream, getting rid of rows 1, 3 and 4 which were blank.
At this point in the workflow all of the column headers were the default “F#” format you get when you import an Excel file without explicit headers. I would need to add actual header names eventually but those numbers were what I would use to generate the column ID field. All I needed to do was strip out the “F”. Easy enough with the RegEx tool.
Once I had the column ID in both streams, I could use that field in a Join tool to add the calendar date to the main data stream as an additional field. Now I had a complete data set with every calendar date associated to every product. Only problem was that the date was showing as a number vs an actual date. This happened because a column can only have a single data type and each column in the spreadsheet contained strings, numbers and dates, which could only be handled by a String data type, a conversion that reverted the formatted date value into the Excel date number value.
Step 4 – Name the columns
As I mentioned above, many of the column names still had useless names like F1, F2, etc. The columns I had added to the workflow, via the Join or a Formula tool, already had good names but the original fields from my primary data stream did not. Those field names existed as data values in the first row.
A Dynamic Rename tool allowed me to use those first row values to rename the headers.
Once I had the totals, I used a RegEx tool to strip out the word “Total” from the product name since then I could cleanly join this data with other product data (the data in this particular Excel file was merely one of several inputs in a larger workflow).