Outcome Probability Calculator (Updated)
Ford Bohrmann
I've updated the site's Outcome Probability Calculator: I added more games of data, changed the methodology somewhat, and built a new online app. The first iteration of the app was featured on the Wall Street Journal's website in a blog post called "Arsenal Beats Reading and Math."
Data
The model gives the odds of a win, draw, or loss for a club depending on the venue of the game (home/away), the minute, and the goal difference at that point. To build it, I used over 4,000 EPL games stretching back to 2001. This is an improvement over the older Outcome Probability Calculator, which used only 1,000 games.
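To make the idea concrete, here is a minimal sketch of how raw outcome probabilities could be tabulated from historical game states. It is not the code behind the calculator: the record layout (venue, minute, goal difference, final outcome), the function name outcome_probabilities, and the handful of sample rows are all assumptions invented for illustration.

```python
from collections import Counter

OUTCOMES = ("win", "draw", "loss")


def outcome_probabilities(records, venue, minute, goal_diff):
    """Share of past games that ended in a win/draw/loss from a given state.

    `records` is a hypothetical list of (venue, minute, goal_diff, outcome)
    tuples, one per observed game state. Returns None when the state never
    occurred in the data.
    """
    counts = Counter(
        outcome
        for (v, m, gd, outcome) in records
        if v == venue and m == minute and gd == goal_diff
    )
    total = sum(counts.values())
    if total == 0:
        return None  # no historical games in this exact state
    return {o: counts[o] / total for o in OUTCOMES}


# Made-up sample rows, purely to show the shape of the data.
sample = [
    ("home", 60, 1, "win"),
    ("home", 60, 1, "win"),
    ("home", 60, 1, "draw"),
    ("home", 60, 1, "loss"),
    ("away", 60, 1, "win"),
]

print(outcome_probabilities(sample, "home", 60, 1))
```

With only a few matching rows per state, these fractions jump around a lot from one minute to the next, and for a state that never occurred the function has nothing to return at all.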
Methodology
Even though there are over 4,000 games to base the model on, there are inevitably still some game scenarios that do not occur often enough to give reliable results. For example, there are not many games in the past data where the away team went up by 3 in the first 5 minutes. Because of this, when I plotted the outcome probability against the minute of the game, the line was not exactly smooth. To make the results a little more reliable and hopefully more consistent with actual games, I applied LOESS regression to smooth the lines. This method has positives and negatives: it does a good job of smoothing the data, but it is non-parametric, so the fitted curve cannot be written as a single equation the way an ordinary regression can. Below are two plots, one using the LOESS and one that is just the raw data.
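For readers curious about what the smoothing step does, here is a rough pure-Python sketch of the LOESS idea: at each minute, fit a weighted straight line to the nearest points, with closer points weighted more heavily, and take the fitted value at that minute. This is a simplified stand-in, not the calculator's actual code. The function name loess_smooth, the frac parameter, the use of local linear fits, and the sample numbers are all assumptions for illustration.

```python
def loess_smooth(xs, ys, frac=0.5):
    """Smooth ys against xs with local linear fits and tricube weights.

    `frac` is the share of points used in each local fit (an assumed
    parameter name for this sketch).
    """
    n = len(xs)
    k = max(2, int(round(frac * n)))  # number of neighbours per local fit
    fitted = []
    for x0 in xs:
        # Bandwidth: distance to the k-th nearest point (x0 itself included).
        dists = sorted(abs(x - x0) for x in xs)
        h = dists[k - 1] or 1e-12
        # Tricube weights: 1 at x0, falling to 0 at distance h and beyond.
        ws = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(ws)
        swx = sum(w * x for w, x in zip(ws, xs))
        swy = sum(w * y for w, y in zip(ws, ys))
        swxx = sum(w * x * x for w, x in zip(ws, xs))
        swxy = sum(w * x * y for w, x, y in zip(ws, xs, ys))
        denom = sw * swxx - swx * swx
        if abs(denom) < 1e-12:
            # Too few distinct points for a line: use the weighted mean.
            fitted.append(swy / sw)
        else:
            slope = (sw * swxy - swx * swy) / denom
            intercept = (swy - slope * swx) / sw
            fitted.append(intercept + slope * x0)
    return fitted


# Made-up, jagged win probabilities by minute, purely for illustration.
minutes = [10, 20, 30, 40, 50, 60, 70, 80, 90]
raw_win_prob = [0.52, 0.61, 0.55, 0.66, 0.63, 0.74, 0.71, 0.85, 0.93]

for minute, p in zip(minutes, loess_smooth(minutes, raw_win_prob)):
    print(minute, round(p, 3))
```

A larger frac uses more neighbours in each fit and gives a smoother line, at the cost of flattening real changes in the data. Because each point gets its own local fit, there is no single equation for the whole curve, which is the trade-off mentioned above.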
Shiny
Best of all, I updated the application interface. RStudio Shiny lets you build simple web applications directly from R, and the Shiny interface makes the application a little more functional. The Outcome Probability Calculator now displays a graph similar to the one above, with a marker for the specific minute chosen. It's interesting to mess around and explore different game scenarios.
Next Steps
Next I am planning on updating the Expected Points Added application with the new data. I also hope to calculate Expected Points Added for every player in the past 10 years of English Premier League football. It should be interesting to see how this has fluctuated from year to year for specific players.