As the world goes into football overdrive, football fanatics and the 16 teams in the upcoming round hit crunch time. The FIFA World Cup 2018 in Russia has just cleared 20% of the tournament, and some finicky predictions have already gone out the window.
There has been no shortage of predictive analytics when it comes to the most popular sport in the world.
Data scientists, analysts and psychics have blustered in cyberspace over the World Cup to get their two minutes of fame on mainstream and social media platforms.
Even financial institutions have jumped into the scrimmage.
UBS, for example, tapped the predictive applications it commonly uses for investment analysis and claimed that its analytics model gave Germany a high probability of taking the trophy.
Folks who parked money with them might get a little sweaty in the palms, considering they dug themselves out of a hole not too long ago.
Sport is probably one of the toughest fields in which to apply analytics. As a favourite modern-day saying in Mandarin goes, (球是圆的) ‘the ball is round’, meaning ‘anything can happen’.
Now that the tournament is one-fifth complete, the shocking results, inspirational underdogs and, in some cases, strange ball-play from once dominant football nations are the opposite of what some big data and predictive analytics experts put together.
Let’s take a quick look at some case studies…
Below is a snapshot from a practitioner who ranked Germany second by winning probability. The report actually sells for $19.
We now know that its analytics on Germany were dead wrong. It also gives Japan a 0.5% chance of winning.
What do you think?
Another practitioner, Analytics Vidhya, led by researcher Andreas Groll from the University of Dortmund, uses an analytical tool with a slightly fancier name, a random forest model with machine learning capability, and predicts not just a probability but an all-the-way victory for Germany.
This, again, is dead wrong.
[source] Cornell University Library, https://arxiv.org/pdf/1806.03208.pdf
Groll and his fellow researchers dived in further, analyzing the data to give a breakdown for each nation of its probability, in percentage terms, of reaching each stage of the tournament.
As you can see from the predictions, Germany is at the top of the probability table.
Japan, however, is ranked at 28th place.
The probabilities of Japan reaching the final and of eventually winning it are put at 0.2% and 0% respectively, lower than the 0.5% in the previous prediction.
One interesting prediction is that Japan is ranked ahead of South Korea.
As many know by now, the country that invented kimchi is the one that sent Germany packing.
In some sense, that win demonstrates superior technical skill, which in theory should earn South Korea the higher ranking.
But the analytics do not take footballers’ technical skills or emotions into account. It is fair to say that ranking Japan above South Korea has turned out to be accurate, now that the kimchi nation has been eliminated.
It would be interesting to see how the remaining probabilities pan out.
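For readers curious how stage-by-stage percentages like the ones above can be produced, here is a minimal sketch in Python. It is not the researchers' model: the eight teams, their strength ratings and the simple "strength ratio" rule for deciding a match are all invented for illustration. The idea is just to play a knockout bracket many times and count how often each team survives each round.

```python
import random
from collections import Counter

# Hypothetical team strengths, invented purely for illustration.
STRENGTHS = {"A": 1.8, "B": 1.5, "C": 1.2, "D": 1.0,
             "E": 0.9, "F": 0.8, "G": 0.6, "H": 0.4}


def play(team_a, team_b, rng):
    """Pick a winner: the stronger side wins more often, but never for sure."""
    p_a = STRENGTHS[team_a] / (STRENGTHS[team_a] + STRENGTHS[team_b])
    return team_a if rng.random() < p_a else team_b


def simulate_knockout(teams, rng):
    """Play one single-elimination bracket; return the survivors of each round."""
    rounds = []
    current = list(teams)
    while len(current) > 1:
        current = [play(current[i], current[i + 1], rng)
                   for i in range(0, len(current), 2)]
        rounds.append(current)
    return rounds


def stage_probabilities(teams, n_runs=10_000, seed=2018):
    """Estimate, for each round, how often each team is still alive.

    Keys of the result are the number of teams left after a round
    (4 = semi-finalists, 2 = finalists, 1 = champion for eight teams).
    """
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_runs):
        for survivors in simulate_knockout(teams, rng):
            counts.setdefault(len(survivors), Counter()).update(survivors)
    return {size: {team: tally[team] / n_runs for team in teams}
            for size, tally in counts.items()}


table = stage_probabilities(list(STRENGTHS))
for team in STRENGTHS:
    row = ", ".join(f"last {size}: {table[size][team]:.1%}" for size in sorted(table, reverse=True))
    print(f"{team}: {row}")
```

Even in this toy version the strongest team wins the title in well under half of the simulated tournaments, which is worth keeping in mind when reading any table that puts one nation "at the top".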
As we can see from the above examples, all of them got it wrong about Germany. And going by how Argentina has played so far, the probabilities from these analytics experts leave much to be desired.
Maybe WACA could consider developing a curriculum, approaching the analytics practitioners at FIFA and offering them a skills upgrade.
Jokes aside, can Japan defy the odds from these so-called experts?! Can Japan beat the table of predictive analytical results?
‘Slice and dice’ is, ironically, already a cliché in the still nascent field of data science and analytics.
Many a time, such analytics are better applied to the buying and selling of goods and services: learning where prospective customers are, uncovering buying habits and patterns in advance, or even taking a stab at the lottery.
Sports analytics, even with machine learning capabilities, is probably harder to apply to win, lose or draw outcomes, because the event has a certain non-cognitive element that is difficult to build into the science of algorithms, at least for now.
Think of things like an individual’s health and mood, or a team’s effort and spirit.
Another point that warrants emphasis is that analytics are not meant to predict the result but rather to provide a probability for each result.
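A little arithmetic makes the difference between a probability and a prediction concrete. The short Python sketch below is only an illustration: the 17% figure for a favourite and the ten long shots are made-up numbers, while 0.5% is the chance quoted for Japan earlier in this post.

```python
def prob_not_winning(p_win):
    """Chance that a team with title probability p_win does NOT win."""
    return 1.0 - p_win


def prob_any_of_them_wins(title_probs):
    """Chance that one team from a group lifts the trophy.

    Only one team can be champion, so the events are mutually
    exclusive and their probabilities simply add up.
    """
    return sum(title_probs)


# Hypothetical favourite with a 17% title chance (invented figure).
favourite = 0.17
print(f"Favourite fails to win: {prob_not_winning(favourite):.0%}")

# Ten hypothetical long shots, each at the 0.5% quoted for Japan above.
long_shots = [0.005] * 10
print(f"One of ten long shots wins: {prob_any_of_them_wins(long_shots):.1%}")
```

In other words, a model can name a favourite and still expect that favourite to go home empty-handed most of the time, and a team with a tiny chance is unlikely to win but not ruled out.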
Feel free to leave your comments.
Thanks for reading.
Posted by Gary Tan Kar Quan