Snooker Ranking Analysis
In this post, I team up with my friend Michael to explore the relationship between the world snooker rankings and the percentage of frames each player won. I practice using Plotly to trace out players’ careers over time. This is my third sports post, after my Parkrun analysis write-up and last year’s piece on Duckworth-Lewis.
Snooker
I started playing snooker during my first year of uni. I enjoyed playing a few frames at the weekend or in the evening after a heavy day of maths. When the snooker room was moved to central campus, my housemate Michael and I would play several times a week. Michael is a much stronger player than me, and he has the BUCS medals to prove it. For the record, I did beat him once in the freshers tournament, but haven’t come close since!
In this project, we teamed up to work on a piece of snooker analysis, doubling the number of snooker datasets on Kaggle in the process. Michael provided the crucial domain knowledge to understand the analysis, and I wrote the code. Our thanks to Ron Florax for maintaining cuetracker.net, an unrivalled source of snooker data.
The Brief
We were interested in the Snooker World Rankings, and in how players rise and fall over time. Prior to starting, I thought the world ranking would be calculated from the % of professional frames won in a given year. This is not the case (which led to a good exploratory comparison), and I have invited Michael to explain the true logic behind the rankings.
How are world rankings calculated?
Snooker introduced a world ranking system (also known as the ‘Order of Merit’ at certain points for no particularly good reason) in the 1975/76 season. This system allocated a certain number of points for each tournament with the rankings updated at the end of the season after the World Championship. This meant the governing body, World Snooker, would set the ranking points available in each tournament according to the perceived prestige.
For example, the World and UK Championships would carry a lot more points than the Dubai Classic or the Malta Cup. This system was seen as unfair to many players on the cusp of the top 16, though, so Barry Hearn decided to introduce a new rolling system, updated after each tournament, for the 2010/11 season.
The way the rankings were calculated changed too, as the number of ranking points you had was equal to the amount of prize money you had won. This has meant sponsors have been able to give tournaments a much greater weight – ranking wise – than their prestige should allow.
Take the China Open, which has a current prize pot of £1,000,000, and the new Saudi Arabia Snooker Masters, which has a staggering £2,500,000 prize pot. Compare this to the two most prestigious ranking events of all time, the World Championship (£2,395,000) and the UK Championship (£1,009,000), and you can see that the ranking system has been hijacked by the wealthy sponsors who want an artificially big tournament.
It’s also worth noting that these are some of the biggest tournaments – money wise – on the calendar and that the regular tour events have much less money in them. Take for example the four-legged Home Nations Series (£405,000 per tournament) or the German Masters (£400,000) – which is widely considered to have one of the best venues in snooker. This means a player’s ranking in a season can be easily skewed by a good run in one of the big money events, even if they have struggled at regular tour events.
Snooker Players Rise and Fall
We initially set out to plot the ranking trajectory of players over their career. Having scraped the World Snooker Rankings from cuetracker.net, we uploaded the dataset to Kaggle and did a final bit of cleaning before plotting. We used Plotly to produce a graph for each player ranked in the top 64 at some point between 1982 and 2019. Objective complete, we now had further questions to answer.
Winning Frames
As Michael explained above, the player who wins the most frames isn’t necessarily ranked number 1 in the world. It’s a question of winning the right frames (those at tournaments where prize money is greatest).
We used another Kaggle dataset containing snooker match data to obtain, for each player, the percentage of professional ranking frames they won each year out of all those they played. We then produced a second ranking list based on this metric.
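The calculation described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not our actual code: the record layout `(year, player_a, player_b, frames_a, frames_b)`, the function name `frame_win_ranks` and the toy matches are all made up for the example, and ties are simply left in sort order.

```python
from collections import defaultdict


def frame_win_ranks(matches):
    """Return {(year, player): (rank, win_pct)} from match records.

    `matches` is an iterable of (year, player_a, player_b, frames_a, frames_b).
    This layout is an assumption for the sketch, not the Kaggle schema.
    """
    won = defaultdict(int)
    played = defaultdict(int)
    for year, a, b, frames_a, frames_b in matches:
        total = frames_a + frames_b
        won[(year, a)] += frames_a
        won[(year, b)] += frames_b
        played[(year, a)] += total
        played[(year, b)] += total

    # Group each player's yearly frame win % by year.
    by_year = defaultdict(list)
    for (year, player), n_played in played.items():
        if n_played:
            by_year[year].append((player, 100 * won[(year, player)] / n_played))

    # Rank 1 is the highest frame win % in that year.
    ranks = {}
    for year, rows in by_year.items():
        rows.sort(key=lambda row: row[1], reverse=True)
        for position, (player, pct) in enumerate(rows, start=1):
            ranks[(year, player)] = (position, pct)
    return ranks


# Invented toy data, purely to show the shape of the input.
toy_matches = [
    (2000, "Player A", "Player B", 5, 3),
    (2000, "Player B", "Player C", 5, 1),
    (2000, "Player A", "Player C", 4, 5),
]
toy_ranks = frame_win_ranks(toy_matches)
```

In the toy data, Player B wins 8 of 14 frames and so ranks first on frame win %, even though Player A won more frames in total.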
We plotted this rank against the world rank for each player. With a bit of formatting, we arrived at a set of 2D plots, which track snooker careers over time. At first these plots looked hopelessly erratic. If the World Rank was closely based upon the frame win %, then all points would be near the diagonal line out of the origin, y = x. We have shaded this region golden. Most trajectories are not even close to conforming to this region, but on further interrogation, we noticed some explainable trends. Let’s pull these out in a couple of examples.
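Before the examples, a quick aside on how a plot like this can be put together. The sketch below builds a Plotly figure description as a plain dictionary, with the golden region drawn as a filled `path` shape around y = x. The function name, the band half-width of 8 places, the choice of which rank goes on which axis and the sample career are all assumptions made for illustration.

```python
def rank_vs_rank_figure(player, years, world_ranks, win_pct_ranks,
                        max_rank=64, band=8):
    """Build a Plotly figure spec (a plain dict) for one player's career.

    `band` is the half-width of the shaded region around y = x, in rank
    places. The value 8 is an arbitrary choice for this sketch.
    """
    lo, hi = 1, max_rank
    # Hexagon hugging the diagonal, in data coordinates.
    band_path = (
        f"M {lo},{lo} L {lo},{lo + band} L {hi - band},{hi} "
        f"L {hi},{hi} L {hi},{hi - band} L {lo + band},{lo} Z"
    )
    return {
        "data": [{
            "type": "scatter",
            "mode": "lines+markers+text",
            "x": list(win_pct_ranks),          # assumed: win % rank on x
            "y": list(world_ranks),            # assumed: world rank on y
            "text": [str(year) for year in years],
            "textposition": "top center",
            "name": player,
        }],
        "layout": {
            "title": {"text": f"{player}: world rank vs frame win % rank"},
            "xaxis": {"title": {"text": "Frame win % rank"}, "range": [lo, hi]},
            "yaxis": {"title": {"text": "World rank"}, "range": [lo, hi]},
            "shapes": [{
                "type": "path",
                "path": band_path,
                "fillcolor": "gold",
                "opacity": 0.3,
                "line": {"width": 0},
                "layer": "below",              # keep the band behind the trace
            }],
        },
    }


# Invented career, not real ranking data.
spec = rank_vs_rank_figure(
    "Example Player",
    years=[2001, 2002, 2003],
    world_ranks=[60, 45, 30],
    win_pct_ranks=[20, 25, 28],
)
```

With Plotly installed, `plotly.graph_objects.Figure(spec).show()` should render the dictionary, and `write_image` on the resulting figure can export a PNG provided the kaleido package is available.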
Neal Foulds – A Stable Trajectory
Players tend to enter the world rankings with a strong Frame Win % Rank. Neal Foulds broke into the top 32 in 1984 while ranked 5th for Frame Win %. One way to read this is that players improve their world rank after a period of excellent cueing.
Mid career, the two ranks are better aligned; the world rank has caught up with a player’s ability and the player sits closer to the golden stripe.
As a player reaches the end of their career and starts to win fewer frames, we see their Frame Win % rank drop off before their World Rank. There is a lag in the rank, and experienced players seem to be better at rising to matches when it matters in order to protect their World Rank. Eventually, however, their world rank once again catches up with their current ability, and they either choose to retire gracefully or drop outside of the top 64.
Piecing this together, we can view a player’s career as an arc, instead of a straight line. Excuse the sketchy drawing, Neal. We’re calling this the Jenkins-Wilson Snooker Arc (not yet a protected term!).
Robert Milkins – A Remarkable Return
A slightly more complicated case study of this effect is Robert Milkins, who, in essence, has followed the career trajectory twice!
Milkins first enters the top 64 in 2001 and improves his World Rank down into the 30s within a few years. However, by 2007 it looks like his time is up. He has traced out the Jenkins-Wilson Snooker Arc and is set to fall out of the rankings entirely.
But Milkins, who played few professional matches between 2007 and 2009, went away, trained hard and played well in minor tournaments, increasing his Frame Win % in the process. In 2009, he ranked in the top 10 for Frame Win % with a World Ranking outside of the top 50 – a position where many players start their professional career. Milkins was ready for part 2.
Over the last 10 years, Milkins has traced out another Jenkins-Wilson Snooker Arc. The question is, what’s next for Milkins – will he retire in the next few years, or is he set to make another remarkable emergence?!
Once again, please excuse the sketchy drawings!
Ding Junhui – Round in Circles
Let’s look at one more example with the career so far of Ding Junhui.
Ding entered the top 64 in 2005 and was tipped to be the first Chinese player to win the World Championship, a feat which has still eluded him. Looking closely at the graph, do you see half of an arc, or do you see four smaller arcs?
With a bit of contextual knowledge, I’m inclined to say the latter. Ding seems to be doing just enough to stay within the top 16, but isn’t as hungry for major tournament wins as other top players. In an article published by the Independent, Ding says:
‘I know a lot of people especially home in China look at me and wonder why I haven’t won a world title, or I don’t win even more tournaments. But I have my own life plan, and only a certain amount of time in a year. I won’t put myself in all the events, and for example I wanted some time away from snooker this season.’
Ding does enough to stay competitive but isn’t focused on being number 1.
Further Questions:
It looks like the players who eventually reach the best rankings over-index on win % rank early in their career – a potential indicator for future success.
- Can we predict a player’s eventual best World rank based upon their first year or two of Win % ranks and World Ranks? What about the length of their career? What about their entire trajectory?
We’ve seen that Ding Junhui is able to maintain a strong World Ranking whilst letting his Frame Win % slip. He plays well when it counts at large championships.
- Does win % change through the season, with the best players really upping their game as the World Championship nears?
- Does splitting win % into frames played against top 16 players and frames played against others reveal a stronger relationship with world rank?
Aside from World Rankings, I’m interested in the probability of winning a match from any given scoreline. Given two players of near equal ability, there is, theoretically, a 50-50 chance of either winning from 0-0 in a best of n frames. If each frame were independent, we could apply a binomial distribution to establish the likelihood of any result – my hypothesis is that, once sports psychology is taken into account, this assumption fails.
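As a baseline to test that hypothesis against, here is a minimal sketch of the independent-frames model. It assumes every frame is won by the first player with the same fixed probability `p`, regardless of the score, and that the match is a best of an odd number of frames. The function name and parameters are made up for the example; the sum is the "first to win the remaining frames" form of the binomial calculation.

```python
from math import comb


def match_win_probability(frames_a, frames_b, best_of, p=0.5):
    """Probability that player A wins the match from frames_a-frames_b.

    Assumes independent frames, each won by A with probability p, and an
    odd `best_of`. This is the idealised model, not a fitted one.
    """
    target = best_of // 2 + 1          # frames needed to win the match
    need_a = target - frames_a
    need_b = target - frames_b
    if need_a <= 0:
        return 1.0                     # A has already won
    if need_b <= 0:
        return 0.0                     # B has already won
    # A wins their last needed frame after losing k more frames, k < need_b.
    return sum(
        comb(need_a - 1 + k, k) * p**need_a * (1 - p)**k
        for k in range(need_b)
    )


level_start = match_win_probability(0, 0, best_of=9)   # 0.5 by symmetry
four_nil_up = match_win_probability(4, 0, best_of=9)   # 1 - 0.5**5 = 0.96875
```

Comparing these model probabilities with the observed share of matches won from each scoreline would be one way to see where, if anywhere, the independence assumption breaks down.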
Plenty of questions for sure, but for now, we’ll go back to baulk and have a rest.
Technical Learnings
To close, I’d like to reflect on what I learnt from this analysis aside from a better understanding of the World Snooker Rankings.
Scraping multi-page xframes. The rankings on cuetracker.net show 50 players per page. I used selenium to navigate the pagination, clicking through to the next page in a loop whose number of iterations was the floor of the number of players divided by 50. In the future, I would find the HTML element by link text (‘Next’) instead.
Loading the dataset into Kaggle. Giving back to the community, and appreciating the importance of good metadata when sharing a source.
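For what it’s worth, here is a rough sketch of both approaches. It is not the code I actually ran: the function names are hypothetical, the URL is left as a parameter, and it assumes the ‘Next’ link disappears on the last page (if the site only disables it, the stopping condition would need changing). Counting clicks from the first page, the number needed is ceil(n / 50) - 1, which matches the floor except when the number of players is an exact multiple of 50.

```python
import math

PAGE_SIZE = 50  # players per page, as described above


def next_clicks_needed(n_players, page_size=PAGE_SIZE):
    """Clicks on 'Next' needed to reach the last page, starting on page 1."""
    if n_players <= 0:
        return 0
    return math.ceil(n_players / page_size) - 1


def scrape_all_pages(url, max_pages=1000):
    """Collect the HTML of every page by following the 'Next' link.

    Selenium is imported here so the helper above works without it.
    """
    from selenium import webdriver
    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    pages = []
    try:
        driver.get(url)
        for _ in range(max_pages):     # hard cap as a safety net
            pages.append(driver.page_source)
            try:
                # Assumes the link text is exactly 'Next'.
                driver.find_element(By.LINK_TEXT, "Next").click()
            except NoSuchElementException:
                break                  # no 'Next' link: last page reached
    finally:
        driver.quit()
    return pages
```

Following the link until it is gone avoids needing to know the number of players up front, though a real run would probably also want an explicit wait for each page to load.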
More advanced use of the Plotly library. Exporting Plotly Graphs to PNG for later viewing, and adding shapes and annotations to charts. A case of getting stuck into the documentation to find what I was looking for.
Until next time,
Scott and Michael