A look at ratings #2 – the Elo system
March 5, 2008
In a previous post, I took a look at some of the statistical correlations between a player’s scoring averages and their Scrabble® rating. Over this past year, I’ve seen my own rating jump by over 400 points in tournament play and by about 200 points in club play. It seems like everyone likes to talk about ratings, whether theirs are going up or down.
In this post, I’m taking a closer look at the rating system. For those of you who don’t know, here in North America, the National Scrabble® Association (NSA) uses the Elo system, which was originally developed for chess and is still used there. I won’t go into too much detail, but the system estimates a player’s "true skill" through a statistical interpretation of wins and losses.
The Elo rating system
Basically, each player has a rating. Your rating is compared with your opponent’s rating, and an "expected" win % is determined based on how far apart the two ratings are. After each game is played, the difference between your expected % of wins and your actual performance is calculated, and your rating is adjusted up or down accordingly. If two players are equal in skill, then in theory they should each win 50% of the time against one another. A higher rating thus suggests higher skill and a higher likelihood of winning any particular game against lower rated players. The same works in reverse when playing against higher rated players. The theory is that your "good and bad" playing should even out over time, so your rating should come to reflect your true skill.
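To make the arithmetic concrete, here is a minimal Python sketch of the idea. It assumes the logistic curve usually quoted for chess Elo (a 400-point scale) and a made-up multiplier k of 20. The NSA’s actual curve and multipliers may well differ, so the function names and numbers below are illustrations only, not the official formula.

```python
def expected_score(rating, opponent_rating, scale=400.0):
    """Expected fraction of wins under the logistic Elo curve.

    The 400-point scale is the usual chess convention (an assumption here).
    """
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / scale))


def updated_rating(rating, opponent_rating, actual_score, k=20.0):
    """Rating after one game.

    actual_score is 1 for a win, 0.5 for a tie, 0 for a loss.
    k is a hypothetical multiplier controlling how fast ratings move.
    """
    expected = expected_score(rating, opponent_rating)
    return rating + k * (actual_score - expected)


# Two equally rated players: each is expected to win half the time.
print(expected_score(1200, 1200))

# A 1200 player who beats another 1200 player gains half of k.
print(updated_rating(1200, 1200, actual_score=1))
```

Under these assumptions, beating a much higher rated opponent earns close to the full k points, while beating a much lower rated opponent earns almost nothing.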
For a more detailed explanation of the Elo system, please view the Wikipedia entry titled "Elo rating system".
Complaints about the rating system(s) in Scrabble
In a recent issue of Scrabble News (#218, I believe), there was a note about the formation of a ratings committee. This group is going to suggest an alternative ratings formula and come up with something for use by 2009. I’m not sure why people feel a need to change the current system, but I’ve heard the following complaints:
- Ratings have been decreasing/"deflating" (read below for some more info about this)
- Point "spreads" should be acknowledged – for example a player winning by 500 points should be recognized
- Similar to the point above, "all wins aren’t equal"
- It’s too easy to "lose" rating points and too hard to gain them
Personally, I’m not sure what all the complaining and clamoring is about. No system is perfect. In science, we call this "measurement error": every measurement we take has some sort of error involved. For things like measuring the height of something, we can be pretty accurate, but with things like measuring someone’s intelligence, it can be hit or miss.
Ratings deflation in Scrabble®
As for people complaining about ratings "going down", the Wikipedia entry has a very interesting example of ratings decreasing even though a player’s skill level remains the same (it’s about halfway through the section titled "ratings inflation and deflation"). Because of the way the system is set up, if only a few players improve their skill while everyone else stays the same, the players who got better get a big jump in ratings while the others who stayed the same decrease in ratings – thus the "deflation".
I believe this is what is happening in Scrabble today. Players who used to be in the 1500-1600 range have had their ratings drop a good 200-300 points in the past few years. What I’ve seen as a new player is that more players are getting better, and at a quicker rate. Some of this comes from the younger players getting involved, but most of it comes from the better study tools available. Computer simulation software allows players to analyze games and become better. We also see the effect in the decreasing number of (active) players in the 1900s and 2000s. What we don’t see, however, is a change in the relative rankings of players. The top players are still at the top in roughly the same order as they were before. It’s those players who haven’t improved much over the past few years who have seen their ratings plummet. They aren’t any worse than they were before. It’s just that everyone else is getting better, and so gaining and/or maintaining a rating is much more difficult.
In my own case, based on my play at my local club, I’ve felt that I was about a 1150-1250 player, and yet my tournament/NSA rating was about 800 for over a year. Once I figured out a few things about playing in tournaments, particularly about playing against lower rated players, I saw my rating jump 400+ points to about 1205 (estimated). I’ve jumped up so much because my true level of play is higher than 800. Now that I’m at the 1200 level, I don’t expect to see many gains in my rating until I start improving again. Those people whom I beat would see their ratings go down, not because they played poorly, but because they lost to someone who was rated lower than they were. Similarly, I would expect it to be very difficult to keep my rating if I start to play against opponents who are getting better. We’re all on this treadmill of needing to improve or else see our ratings go down.
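Here is a rough Python sketch of that treadmill. It pits an underrated player (rated 800) against an established one (rated 1200) who is, by assumption, exactly as skilled, so they split their games evenly. The logistic curve, the multiplier k of 20, and the rule that the loser gives up exactly what the winner gains are all simplifying assumptions for illustration, not the NSA’s actual formula.

```python
def expected_score(rating, opponent_rating, scale=400.0):
    """Expected fraction of wins under the logistic Elo curve (assumed)."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / scale))


def play(rating_a, rating_b, score_a, k=20.0):
    """Return both ratings after one game.

    score_a is 1 if player A wins and 0 if player B wins.
    Whatever A gains, B loses (a simplifying zero-sum assumption).
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


underrated, established = 800.0, 1200.0

# Equal true skill: the two players alternate wins over ten games.
for game in range(10):
    score = 1.0 if game % 2 == 0 else 0.0
    underrated, established = play(underrated, established, score)

print(round(underrated), round(established))
```

Even though both players win five games each, the underrated player ends above 800 and the established player ends below 1200. Each win over the higher rated player is worth more than each loss costs, so points flow from the player whose skill never changed to the one whose rating was too low.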
People also complain that, unlike chess, Scrabble® involves quite a bit of randomness and luck. Sometimes you get some crappy tiles and it’s almost impossible to win. Unfortunately, there’s really not much that can be done about it. In the card game Bridge, the solution was to create "duplicate bridge", where every team plays the exact same hands/cards. The theory is that if everyone plays the same cards/hands, then those with better skill should do better. It’s like having everyone take a standardized test and get a score. Unfortunately, I don’t think this is likely to happen in Scrabble.
So after a brief look at the Elo system, I’m not sure that any new system that may (or may not) be introduced will fare any better. Personally, I think the secret is not to worry about the rating and to focus on just becoming a better player. The ratings will take care of themselves.