A closer look at Scrabble ratings
April 8, 2007
I’ve often wondered how I can improve my Scrabble rating (both at the club and my “official” NSA tournament rating). Apart from “study words” or “do well at tournaments”, there really isn’t much else to say. I’ve asked a few expert players at my club and they all seem to advocate an “open style” (i.e., keep the board open). Our club director often reminds us lower rated players to “not play scared” and to keep the board open (aside from specific situations). When I first started playing Scrabble at the club, my first instinct was to try and make as many overlapping plays as possible, making use of the two and three letter words. What inevitably happens is a “step-ladder” pattern where neither player can score much as the board gets very closed. Now that I’m improving, I’m finding that I really dislike closed boards. I wondered if this change in attitude was a result of my improved play. I’ll be honest – when I first started playing a more open style, I was very nervous about getting beaten by huge margins. In any case, I decided to do some analysis and explore why expert players suggest a more open style.
Each week our club director circulates a listing of each player’s statistics: club rating, rating change, games won/lost (total for season), winning %, average points scored, average points against, high-3 game total, high game, and # of times 3–0. I took this information and did some analysis (see bottom for details on the methods used for this analysis).
After plotting the data points, a straight line can be added to suggest a trend – as the club rating increases, so does the average points scored (Figure 1). Many of the data points are clustered close to the trend line with a few points far away. R² is a statistical measure of how well a particular line “fits” the data: a value of 1.0 means a perfect fit (i.e., all the data points are on the line), while a value of 0 means the line explains none of the variation (i.e., it does no better than a horizontal line drawn through the average). The R² statistic for this trend line (points scored vs. club rating) is 0.7529, indicating a very good “fit” of the data. For more information regarding R², you can read the wikipedia.org entry titled “Coefficient of Determination”.
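For readers who want the formula behind that number, the usual textbook definition of the coefficient of determination for a fitted line is sketched below. The symbols are notation I am assuming for illustration (they do not appear in the club's statistics): y_i is a player's average points scored, ŷ_i is the value the trend line predicts at that player's club rating, and ȳ is the average of all the y_i.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

The numerator measures how far the points sit from the trend line, and the denominator measures how far they sit from a flat line at the average. If every point lies on the trend line the numerator is 0 and R² is 1; if the trend line does no better than the flat line, the two sums are equal and R² is 0.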
Figure 1: Average points scored vs. club rating
Let’s take a look at points scored against (i.e., opponent scoring).
In this case, you can see that a straight line can also be fitted to the data (Figure 2), as with points scored (Figure 1). Unfortunately, more of the data points fall far from the trend line. The R² statistic for average points against vs. club rating is 0.3474 (a weak value), suggesting there is more variability in this data.
What does all of this information mean?
Well, from the two graphs, we can see that there seems to be a strong positive correlation between club rating and the average points scored. The higher your rating, the more points you are likely to score in any particular game. When we examine the points scored against, the positive correlation is noticeably weaker, with considerable variability in the data. Points scored seems to be a pretty good predictor of one’s rating (based on this data).
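To put rough numbers on "strong" and "weaker": for a straight-line fit with a single predictor, the correlation coefficient r is the square root of R², with the sign of the slope. The short sketch below applies that to the two R² values reported above. The function name and the dictionary are my own illustration, and the positive sign is an assumption based on both trend lines being described as positive correlations.

```python
import math

# R^2 values reported in this post for the two trend lines.
r_squared = {"points scored": 0.7529, "points against": 0.3474}


def correlation_from_r_squared(r2, slope_sign=1):
    """For a one-predictor linear fit, |r| = sqrt(R^2); the sign follows the slope."""
    return math.copysign(math.sqrt(r2), slope_sign)


# Both trend lines are described as positive, so slope_sign stays at +1.
correlations = {
    name: correlation_from_r_squared(r2) for name, r2 in r_squared.items()
}

for name, r in correlations.items():
    print(f"{name} vs. club rating: r = {r:.3f}")
```

On these figures r comes out at roughly 0.87 for points scored and roughly 0.59 for points against, which matches the visual impression that the second plot is much more scattered.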
I was a bit surprised that points against wasn’t as good a fit as points scored. If we assume club/tournament directors try to match players of equal skill together, then I would expect games to be closely contested. Based on my personal experience, I’m finding that actual scores of games can vary greatly based on tile distributions, so maybe my original expectation wasn’t correct.
I think that the reason higher-rated players want to play a more open style of Scrabble is that they can take better advantage of their opportunities and score more. A closed board leaves fewer opportunities to make high scoring words/plays and leaves more of the score to chance (i.e., whoever gets better tiles at specific moments). Given the same tiles, I suspect a higher rated player can generate more points per turn than a lower rated player. This makes sense because higher rated players, due to their better word knowledge and also improved strategic play, make fewer sub-optimal plays (i.e., exchanges, passes, or poor plays). Also, when higher rated players/experts do get good tiles, they can maximize their score, whereas a lower rated player is less likely to do so.
To use a sports analogy, we often hear in hockey and in soccer that “better teams” (i.e., more talented and skilled) tend to play a more attacking and open style of game. Less talented/skilled teams tend to play more defensively and try to limit scoring opportunities. The theory is that the more skilled/talented teams don’t mind allowing an opponent to score because they feel confident that they can out-score an opponent or capitalize more given the same number of opportunities. Because the less talented teams can’t score as easily, they rely more on limiting their opponents’ scoring. In Scrabble, the expert players know they can make plays with tiles like U, V, J, X, Q, and so forth, whereas the lower rated players struggle to play words with those tiles (or can’t generate as many points). I would bet that this is a result of word knowledge rather than skill (the expert players know more words).
So, getting back to my original point, when experts suggest playing a more open style, they do so because an open game plays to their advantage: they can score more. That’s why they can get scores in the high 400s, 500s, and even 600s while lower rated players struggle to reach 400. To get better, the best way would be to improve your ability to score rather than trying to limit your opponent’s scoring.
Assumptions & Limitations
This analysis is all fine and dandy, but we need to be aware of some assumptions and limitations:
- I assumed the club rating is a reasonable measure of a player’s Scrabble ability. Of course for newer players, this rating may not be a good indicator because there may not be enough “data” (i.e., games played) to get a reliable measure.
- I’m not sure if this analysis can be generalized to NSA tournament ratings as I don’t have the specific data to test if there are similar or different correlations.
- I used only a limited data set (41 players) from a convenience sample based on one club. If we expanded this analysis to include other clubs and players over a longer period of time, we may find different results.
- This analysis only suggests that a relationship between rating and scoring exists and thus I report only correlations between points scored/against versus ratings. While tempting, we cannot infer a causal relationship until we test this out. A good test would be to examine results of duplicate Scrabble games, where the only variable is a player’s ability since everyone plays the exact same tiles/game. Could be interesting to test out. Future analysis may want to investigate scoring differentials (i.e., points scored minus points against) compared with ratings. Personally, I’m not sure that point differentials would provide better results.
Methods
For those interested, I took the weekly player statistics (club rating, rating change, games won/lost (total for season), winning %, average points scored, average points against, high-3 game total, high game, and # of times 3–0) and visually scanned the data to identify a few variables of interest. Just to clarify, “weekly player statistics” is probably a bit of a misnomer as the statistics reflect performance since the beginning of the Mississauga Scrabble Club season (September 2006) until the end of March 2007. I could have used a full season’s data, but this data was handy.
In total, there are 41 players who have played between 15 and 78 games each. After examining the statistics, I identified points scored and points against as interesting. Players are grouped by division (A, B, and C) according to ability. Divisions are grouped as <1000 (Division C), 1000–1500 (Division B), and >1500 (Division A). For reporting purposes, the weekly player statistics file is presented in this order: Division A, Division C, and Division B.
I manually entered club rating, points scored, and points against into Microsoft Excel 2002 and created graphs (see Figure 1 & Figure 2). Next, I used the “add trend line” feature to add a straight line to the data (i.e., linear regression) and report the R² statistic.
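For anyone who would rather reproduce this outside of Excel, here is a minimal Python sketch of the same kind of calculation: an ordinary least-squares straight line plus its R². It is only an illustration of the technique and not the spreadsheet I actually used. The function names are my own, and the ratings and scores in the example are made-up placeholder values, not the club's data.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    y_bar = mean(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical placeholder values for illustration only (NOT the club's data).
ratings = [850, 1000, 1150, 1300, 1450, 1600, 1750]
avg_scored = [318, 335, 352, 349, 371, 388, 402]

slope, intercept = fit_line(ratings, avg_scored)
r2 = r_squared(ratings, avg_scored, slope, intercept)
print(f"trend line: points = {slope:.4f} * rating + {intercept:.1f}")
print(f"R^2 = {r2:.4f}")
```

Swapping in the real club ratings and average points scored (or points against) for the two placeholder lists should give the same kind of trend line and R² that a spreadsheet linear trend line reports.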