Exploring Australian Open Tennis data with Tableau — Part 3

Well, now you know.

You just have to Google ‘Tableau Software | Academic Programs’.

There are a few housekeeping items to register with the Tableau team: you will have to identify as an academic or a student and verify your identity with your student card or an academic transcript.

Once the vetting process is completed you will be issued with a product key.

Getting started with Tableau

My objective today is to visualise the data in Tableau to understand it before I conduct any predictive modelling.

Take the pre-processed data file ‘aus_open.csv’ and upload it into Tableau Desktop 2018 v3 by selecting ‘Text file’.

I prepared something earlier by creating a workbook, which I saved as a ‘Tennis_viz.twbx’ file and not a *.twb file.

I used Tableau Desktop at Allianz insurance, and you can only share insights by saving your workbook as a *.twbx file; your stakeholder will then need to install Tableau Reader, which is free, to view your visualisations.

Data Source

Once your data is uploaded into Tableau, you will have to inspect it to ensure that Tableau has correctly identified each attribute as a date, number (measure), categorical variable (dimension) or string (text).
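If you want to cross-check Tableau's guesses outside the tool, a quick pandas sketch like the one below can help. The sample rows are made up, and every column name other than ‘Year’ and ‘L1’ is an assumption for illustration; in practice you would point `pd.read_csv` at ‘aus_open.csv’ instead.

```python
import io
import pandas as pd

# Made-up sample standing in for 'aus_open.csv'; column names other than
# 'Year' and 'L1' are assumptions for illustration only.
sample_csv = io.StringIO(
    "Date,Year,Winner,Loser,W1,L1\n"
    "2018-01-15,2018,Player A,Player B,6,4\n"
    "2018-01-16,2018,Player C,Player D,7,5\n"
    "2019-01-14,2019,Player A,Player D,6,3\n"
)

df = pd.read_csv(sample_csv, parse_dates=["Date"])

# Rough analogue of Tableau's split: numeric columns behave like measures,
# everything else (text, dates) behaves like dimensions.
measures = df.select_dtypes(include="number").columns.tolist()
dimensions = [c for c in df.columns if c not in measures]

print(df.dtypes)
print("Measures:", measures)
print("Dimensions:", dimensions)
```

Note that ‘Year’ comes through as a number here, so you may still want to treat it as a dimension when plotting.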

Once you are satisfied with the data types, go to Connection and save your connection as an extract and not a live connection, because reading live data can use more memory and slow down your computer's processing time.

I would recommend the extract version especially if you are working on a laptop.

Create your first worksheet

When you access your first worksheet in Tableau, rename the sheet as you would for an Excel spreadsheet and don’t forget to save the file.

I inspected the data upload and ensured that Tableau correctly allocated the variables as a Measure or a Dimension.

I was interested in plotting the loser of the first set across the years.

I dragged ‘Year’ onto Columns and ‘L1’ onto Rows, aggregated by Sum.
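For anyone who wants to sanity-check the numbers behind the chart, the same aggregation can be sketched in pandas. This is only an illustration with made-up rows; the real values would come from ‘aus_open.csv’.

```python
import pandas as pd

# Made-up rows for illustration; only the column names 'Year' and 'L1'
# come from the data described in this post.
df = pd.DataFrame({
    "Year": [2017, 2017, 2018, 2018, 2018],
    "L1": [4, 6, 3, 5, 7],
})

# Same idea as 'Year' on Columns and SUM(L1) on Rows in Tableau.
l1_by_year = df.groupby("Year")["L1"].sum()
print(l1_by_year)
```

Each row of the result corresponds to one mark on the Tableau chart.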

Progress in the 1st set

I wanted to further explore the data by comparing Winners and Losers in the first set.

From the data and notes provided by tennis-data.co.uk, we can see missing data for the years 2000 to 2003, and this was shown in R previously.
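A quick way to confirm gaps like these outside Tableau is to count missing values per year. The sketch below uses made-up rows, and ‘W1’ is an assumed column name for the winner's first-set score; which columns are actually affected in the real file is not shown here.

```python
import pandas as pd

# Made-up rows for illustration; 'W1' is an assumed column name and the
# pattern of missing values is invented.
df = pd.DataFrame({
    "Year": [2001, 2001, 2004, 2004],
    "W1": [None, None, 6, 7],
    "L1": [None, 4, 3, 6],
})

# Flag missing first-set scores, then count them per year.
missing_by_year = df[["W1", "L1"]].isna().groupby(df["Year"]).sum()
print(missing_by_year)
```

Years with non-zero counts are the ones to treat with caution in any chart or model.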

How are they tracking in the 4th set?

How are they tracking by the 5th set?

According to Tableau, it looks like the Australian Open Tennis Men’s final will be either Rafael Nadal or Roger Federer.

However, we will look at building a predictive model in our next post to check the accuracy of our modelling and feature engineering.

The Tableau workbook is shared here on GitHub.

Happy coding and enjoy the tennis!

Originally published at wendywong.org on January 16, 2019.
