Friday, 24 April 2015

Analytics for the People

Richard Whittall has recently written an excellent column for 21st Club in which he explores the gap between analytics and its functional implementation within football clubs. He finishes up with this:
The current field in football analytics is very good at many things, but not so good sometimes in identifying specific problems for which analysts may provide a partial or whole solution. Work on the latter will help further bridge the gap between analyst and club. Sometimes, it’s important for analysts to step out of R and Tableau and start to breakdown if and how clubs can actually move the needle on some of these predictive metrics. Otherwise, they are like doctors who are only able to offer a diagnosis, but not a cure. 

Wise words, and certainly advice that should be heeded if you are one of the many people in the market offering analytic solutions for football clubs. With such work being necessarily proprietary, and the retention of a competitive edge encouraging secrecy within clubs, it is not always obvious how analytically switched on the industry is. Leading data providers such as Opta and Prozone occasionally offer a window into the products they provide to clubs. In particular, recent Prozone videos from Hector Ruiz and Paul Power showed great skill and clarity, alongside the benefits and economies of scale afforded by full data access and a dedicated and skilful workforce. One presumes that by this point most clubs will have at least a small analytics department, and probably a lot more. Whether such a department is fully integrated with coaching or the first team will likely vary from club to club, but the point remains: analytics does not exist in a bubble. It is already in place, there are professional companies that can offer a full package, and access to the market from the outside is difficult. Plenty of people want to work in football and, whilst smart in principle, co-opting a few models from other industries and creating a brand is unlikely to improve on what is already available. But, and I feel this is important, a desire to work in football is not the only reason people are interested in or learning about analytics.

Indeed, none of this has stopped a vibrant online amateur community from sprouting up in recent years. The advent of data sites such as Whoscored and Squawka has offered easy access to data at a level that far exceeds what was previously available. Now, anyone with curiosity can collate data from numerous competitions at a player and team level and play with it. It can be analysed and truths, whole or partial, can be uncovered. These truths have variable applications. For some, with good technical skills, predictive modelling can inform betting; for others, fantasy football. I choose to tell stories about what I've deduced, and I've sunk many hundreds of hours into it because I find it interesting and intellectually rewarding. As with any subject, there is a learning curve that never ends, not everything I do hits the mark and there are few shortcuts to knowledge but, as others who've done this before me have noted, you do it because it's fun.

The current situation in analytics has created different viewpoints. Firstly, there is a great drive for predictability, repeatability and application. These are entirely logical and commendable aims, but the arms race to maximise these effects has led to a shroud being laid over the details involved. In particular, and with one notable exception from Michael Caley, the multiple black box Expected Goals models and advanced derivatives regularly cited have obfuscated analysis due to their non-standardisation. This is not a criticism of any specific model: many hours of hard work and theoretical analysis will have gone into each, by people with far more advanced skills than mine, and those with such access have doubtless found multiple uses for the insight gained. My concerns lie around accessibility and interpretation, and this is where I feel some parts of the analytics community have missed the point.

Barriers to entry may have reduced over time, but barriers to understanding have not. There is no such thing as an "Expected Goal"; it is entirely theoretical. A layman interested in football statistics may not yet understand the value of a shot or a shot on target, but he quickly encounters hypothetical versions of the things he does understand: goals. That is a tough sell. Shot counts are real and easily understandable. They aren't "outdated" metrics; they are the building blocks of all that comes after, and if the analytics community has any interest in popularising its method of thinking and transcending a niche corner of football, the stories told by our fundamental metrics are integral.

And there are many stories to be told. Variance in league seasons of 34 to 46 games is huge. Half and whole seasons can go by where the measurable statistical reality of a team is skewed vastly in either direction. Liverpool's huge overachievement of 2013-14, followed by an almost inevitable regression this season, is just one obvious case. It isn't just the board that need to understand the wider implications of such matters; the fans can benefit too. Interpretations may differ, but we can pull apart possible reasons in the numbers and disseminate the knowledge. Oh for a day where the average pre-game conversation involves an understanding that a team's save percentage has been unsustainably high, or a striker gets cheers rather than murmurs or abuse because fans understand he's been unfortunate rather than inept.

It's probably a long way off but, as each year passes, we collect more data and can test more outcomes. Our knowledge can grow and with it our expertise and ability to inform. It is important to encourage people new to the movement and support their efforts. We may strive for professionalism but we all start as amateurs. Guide rather than chastise, and realise that the more people who are interested in football statistics and analytics, the greater the likelihood of resulting success for everyone, whatever your desired end-game.

This may seem somewhat utopian, but elitism will get us nowhere; accessibility and inclusiveness just might.