FINA has since before the Shanghai 2011 World Championships provided us with an interesting tool, demanding all World and Olympic qualification times to be submitted to their Swimming World Rankings. The London 2012 data set is of course not complete yet, with the qualification period spanning from March 1st 2011 to June 18th 2012 – but we still get an exciting overview over what is to come, when we put these data through pivot tables etc.
(Minor error here, the Date radio button should be selected to be Date Range in the figure above, not Named Date Range. Doesn’t affect the results right now, though)
Extracting the olympic qualification times is easy, but it has been a bit cumbersome to make these lists ready for Google Spreadsheet. I used filter options as shown in the figure above, and then only found the option to extract these data as PDF or HTML, which is not optimal for spreadsheet import. But with a bit of geeky importing into first a text editor, a twist of grep magic, and then a few hours of data cleanup in old trusty Microsoft Excel, we get a nice dataset to import into Google Spreadsheet.
After importing the dataset, I added a few additional informative columns, like merging ‘swimmer’ and ‘team’ into ‘swimmer (team)’, adding a column that extracted the date as only the month of qualification, and adding a ‘team rank’ (‘team’ is here the data label for ‘country’). And then I could finally start playing with the data, summarizing this and that by using pivot tables, and visualizing it with graphs.
So, we can for instance make a ‘medal table prediction’, that says that USA is going to clearly win the (individual) medal race, with 12 gold medals out of 26 possible, and 20 out of 78 in total. China is next with 5 gold medals out of 26 (and 8 out of the total 78), and Sweden third with 2 individual gold medals. The host nation Great Britain is down in 6th place alongside Brazil, with 1 gold medal, 2 silver and 1 bronze.
Bear in mind that this prediction is only based on those registered olympic qualification times that I could extract from the FINA world rankings now in January 2012, ranging from the Indianapolis Grand Prix meet in March 2011 to the Italian Open Winter Nationals in December 2011. There might be meets missing, and there might of course be errors in my procedure or handling of data. And it will change in the coming months.
Google Spreadsheet is a fantastic tool, with its ability to visualize data on for instance world maps . Here is for instance a ‘heat map’, where we get an overview over which countries have the most swimmers managing the Olympic qualification time (OQT), never mind that each country can only send two swimmers for each event. A sort of indicator of the depth of strength of each swimming nation.
We could also visualize which meets produced these Olympic qualification times
Or when in the year (2011) they were produced
The possibilities are almost endless, I might add other visualizations here or in another post these next few days, and if you have special requests (or critique) then please comment here or send me a message. Tomorrow I’ll show some work that Bartal has prepared with his mastery of the even more fantastic Google Fusion Tables, so please revisit :-)
Now back to the data ! …