I’ve been quite busy this past week with real-world commitments, so I decided to post some Elo rankings for this past week. Enjoy, tennis fans! If enough of you like this kind of stuff, I wouldn’t mind putting it in the sidebar (or Glicko-2 ratings, whatever people seem to prefer). Ratings are sorted by serve (ratingS) and then by return (ratingR).
Last time we examined the Glicko-2 model and understood how it worked mathematically. This time we shall focus on implementing the Glicko-2 system for tennis.
Let’s establish a few ground rules: 1) We will be using the constants mentioned last week (the base rating as 1500 and the initial deviation as 400 / ln(10)). Other constants remain the same. 2) Despite Glicko-2’s ability to do batch updates (i.e. for a tournament), after trying both methods I found that there is no significant benefit in batch updates and, in fact, updating each match individually proves to be more accurate. 3) We need to have two ratings per player, one on serve and one on return. (more…)
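As a rough illustration of ground rules 1 and 3, here is a minimal sketch of how the two ratings per player could be stored. The class and field names (`Rating`, `Player`, `serve`, `ret`, `get_player`) are hypothetical, and the starting volatility of 0.06 is an assumption taken from Glickman's Glicko-2 example rather than from this post; only the base rating and initial deviation come from the text above.

```python
import math
from dataclasses import dataclass, field

BASE_RATING = 1500.0                    # base rating stated in the post
INITIAL_DEVIATION = 400 / math.log(10)  # 400 / ln(10), roughly 173.72
INITIAL_VOLATILITY = 0.06               # assumption: not given in the post


@dataclass
class Rating:
    """One Glicko-2 style rating: value, deviation, and volatility."""
    rating: float = BASE_RATING
    deviation: float = INITIAL_DEVIATION
    volatility: float = INITIAL_VOLATILITY


@dataclass
class Player:
    """A player carries two independent ratings: serve and return."""
    name: str
    serve: Rating = field(default_factory=Rating)
    ret: Rating = field(default_factory=Rating)


players = {}


def get_player(name):
    """Return the stored player, creating one with default ratings if new."""
    if name not in players:
        players[name] = Player(name)
    return players[name]
```

With a structure like this, updating after each individual match (ground rule 2) would mean looking up both players with `get_player` and adjusting their `serve` and `ret` ratings once per match instead of once per tournament.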
The Glicko-2 rating system is the second generation of rating systems developed by Mark Glickman to estimate a player’s skill in chess. Glicko itself is, in my opinion, a more sophisticated version of Elo. I truly love the rating system both for its simplicity and the information it provides. Unlike Elo, Glicko-2 provides both the rating deviation and the volatility a player has. That being said, let’s get right into the math and then explain how to implement it for tennis.
The Elo rating system originated in the mid-1900s and has since been predominantly used in chess rankings. On occasion Elo has been used for other sports, or for video games (e.g. League of Legends used the system until just recently). The system itself is very basic; it’s entirely based on wins and losses against other players. Simply put, your new skill will be measured based on the skill of your opponent, your expected performance given your opponent’s skill and your own, and your actual result (win, loss, draw, or anywhere in between) in the game.
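To make that concrete, here is a minimal sketch of the standard Elo update. The function names and the K-factor of 32 are assumptions chosen for illustration; they are not necessarily the values used for the rankings on this blog.

```python
def expected_score(rating, opponent_rating):
    """Expected result (between 0 and 1) against the given opponent."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def update_elo(rating, opponent_rating, score, k=32.0):
    """Return the new rating after one game.

    score is the actual result: 1 for a win, 0 for a loss, 0.5 for a draw,
    or anything in between. k (assumed 32 here) controls how far a single
    result can move the rating.
    """
    return rating + k * (score - expected_score(rating, opponent_rating))


# Two equally rated players: each is expected to score 0.5,
# so a win moves the winner up by half of k.
new_rating = update_elo(1500.0, 1500.0, 1.0)  # 1516.0
```

The bigger the gap between the actual result and the expected one, the bigger the adjustment, which is why beating a much stronger opponent pays off more than beating a weaker one.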
Before we begin, it’s important to remember that I’m only going to talk about a few different ways to approach this problem; there are alternative routes you can take to do predictive modeling. Be warned, long post ahead! That being said, let’s break this down into two parts. First, determining an individual player’s skill at a point in time. Second, given player one’s skill and player two’s skill at a point in time, determining the probability that a player will win the match. Basically, we’re trying to determine the skill of a player and, given different players’ skills, calculate their probability of winning a match. (more…)
My name is Nikola Peric, thank you for joining me. I am fascinated by big data, statistics, and machine learning. Over the course of 2013 I created a fairly accurate predictive modeling system for tennis (well, several to be honest: a neural net, an Elo system, and a Glicko-2 system). That being said, I am still very much learning all the time, trying to improve my intuition in data analysis.
The purpose of Delta Data is to provide a record of my learning experience that may be useful to other individuals. There will be some heavy math and statistics here and there, but, ideally, even if there are equations splashed everywhere, I would like to explain the summary in an easy-to-comprehend form for those who may be interested. Sections with advanced math or coding will be labeled as such so you can feel free to skip them if you so wish. (more…)