I analyzed the full leaderboard at Scavenger (http://dominion.lauxnet.com/leaderboard/) and noticed that the initial phi=2 seems to be set too high. Compared to chess, luck makes it harder to beat opponents consistently, and there are fewer pros. Currently 95% of players (with at least 20 games) have a mu in the range [-1.95, 1.68], whereas for a new player the system implicitly assumes a 95% range of [-4, 4] (mu=0 plus or minus 2*phi).
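As a rough check on these numbers: if the mu values were roughly normally distributed (my assumption, I haven't tested it), a central 95% range spans about 2*1.96 standard deviations, so the spread implied by each range can be backed out like this. The function name is just for illustration.

```python
# Back out the standard deviation implied by a central 95% range,
# assuming (untested) that mu is roughly normally distributed.
Z95 = 1.96  # two-sided 95% quantile of the standard normal


def implied_sd(low, high, z=Z95):
    """Standard deviation of a normal whose central 95% range is [low, high]."""
    return (high - low) / (2 * z)


observed_sd = implied_sd(-1.95, 1.68)  # leaderboard, players with 20+ games
assumed_sd = implied_sd(-4.0, 4.0)     # range implied for a new player

print(f"observed spread: {observed_sd:.2f}, assumed spread: {assumed_sd:.2f}")
```

The observed spread comes out a little under 1 and the assumed one at about 2. The observed mu values also contain estimation noise, so this is only a ballpark figure.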
This produces odd results for some players with few games, who end up with a very high or very low mu. Currently, if someone new beats a mu>2 player just once, they'll end up with a mu around 2.5, which doesn't sound right given the luck involved in Dominion. (The change in mu is approximately phi^2*(wins - expected wins).)
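To make the size of that jump concrete, here is a minimal sketch of a one-game Glicko-2 update. It is not Scavenger's actual code: the volatility sigma is held fixed at 0.06 instead of being re-estimated, and the opponent's phi of 0.5 is a value I picked for illustration.

```python
import math


def g(phi):
    """Glicko-2 weighting that discounts an opponent's uncertain rating."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)


def expected_score(mu, mu_opp, phi_opp):
    """Expected score against one opponent (the Glicko-2 E function)."""
    return 1.0 / (1.0 + math.exp(-g(phi_opp) * (mu - mu_opp)))


def update_one_game(mu, phi, mu_opp, phi_opp, score, sigma=0.06):
    """One-game Glicko-2 update with sigma held fixed (a simplification).

    The full algorithm also re-estimates sigma every rating period.
    """
    e = expected_score(mu, mu_opp, phi_opp)
    v = 1.0 / (g(phi_opp) ** 2 * e * (1.0 - e))   # estimated variance
    phi_star = math.sqrt(phi ** 2 + sigma ** 2)    # deviation before the game
    phi_new = 1.0 / math.sqrt(1.0 / phi_star ** 2 + 1.0 / v)
    mu_new = mu + phi_new ** 2 * g(phi_opp) * (score - e)
    return mu_new, phi_new


# Hypothetical case: a new player (mu=0) beats an established mu=2 player once.
for start_phi in (2.0, 1.0):
    mu_new, phi_new = update_one_game(0.0, start_phi, 2.0, 0.5, 1.0)
    print(f"initial phi={start_phi}: mu -> {mu_new:.2f}, phi -> {phi_new:.2f}")
```

With these assumed inputs the new mu lands a little above 2 when starting from phi=2, and well under 1 when starting from phi=1.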
Therefore, I'd suggest lowering the parameter to phi=1 for new players and also capping phi there. (Capping was suggested in Glicko-1, and it seems reasonable to me that someone who has played shouldn't have a higher rating deviation than someone new.)
After a couple of months, it would actually be possible to estimate the values of initial phi, sigma, and tau that give the best results in predicting game outcomes.
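One way this estimation could look, as a rough sketch only: replay the game history with each candidate value, score the predictions made before each game with log loss, and keep the candidate with the lowest loss. Everything here is hypothetical (the function names, the toy game list, and the simplified update with sigma held fixed). It only searches over the initial phi; doing the same for sigma and tau would need the full volatility update.

```python
import math


def g(phi):
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)


def win_prob(mu_a, phi_a, mu_b, phi_b):
    """Predicted probability that A beats B, using both deviations."""
    return 1.0 / (1.0 + math.exp(-g(math.hypot(phi_a, phi_b)) * (mu_a - mu_b)))


def update(mu, phi, mu_opp, phi_opp, score, sigma):
    """One-game Glicko-2 style update with sigma held fixed (simplified)."""
    e = 1.0 / (1.0 + math.exp(-g(phi_opp) * (mu - mu_opp)))
    v = 1.0 / (g(phi_opp) ** 2 * e * (1.0 - e))
    phi_new = 1.0 / math.sqrt(1.0 / (phi ** 2 + sigma ** 2) + 1.0 / v)
    return mu + phi_new ** 2 * g(phi_opp) * (score - e), phi_new


def replay_log_loss(games, initial_phi, sigma=0.06):
    """Replay games in order; return the mean log loss of the predictions.

    games: list of (player_a, player_b, score_a), score_a = 1 for a win
    by player_a and 0 for a loss. Each prediction uses only earlier games.
    """
    ratings = {}  # player -> (mu, phi)
    total = 0.0
    for a, b, score_a in games:
        mu_a, phi_a = ratings.get(a, (0.0, initial_phi))
        mu_b, phi_b = ratings.get(b, (0.0, initial_phi))
        p = min(max(win_prob(mu_a, phi_a, mu_b, phi_b), 1e-12), 1 - 1e-12)
        total -= score_a * math.log(p) + (1 - score_a) * math.log(1 - p)
        ratings[a] = update(mu_a, phi_a, mu_b, phi_b, score_a, sigma)
        ratings[b] = update(mu_b, phi_b, mu_a, phi_a, 1 - score_a, sigma)
    return total / len(games)


def best_initial_phi(games, candidates):
    """Candidate initial phi with the lowest replayed log loss."""
    return min(candidates, key=lambda phi: replay_log_loss(games, phi))


# Made-up results, only to show the call; real use needs the full game log.
toy_games = [("ann", "bob", 1), ("bob", "cy", 1), ("ann", "cy", 1),
             ("cy", "ann", 0), ("bob", "ann", 0), ("cy", "bob", 1)]
print(best_initial_phi(toy_games, [0.5, 1.0, 1.5, 2.0]))
```

A handful of games like this says nothing about the real optimum; the point is only the shape of the procedure.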
I also have some thoughts on the definition of a good match. I think comparing levels is bad, especially at the lower end of the leaderboard. There are some people with a very low mu and a high phi, which results in a low level and makes them a supposedly bad match for many opponents, when actually we are very uncertain that their mu is that low.
My preferred way would be to use expected win probabilities and let players set a range. The advantage is that a win probability is easier for a layman to understand than some level difference.
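A sketch of what I mean, with hypothetical names and a hypothetical default range of 35% to 65%. For the win probability I borrow the combined-deviation form of the Glicko expected-score formula, assuming it carries over to the Glicko-2 scale; it pulls the estimate toward 50% when either rating is uncertain.

```python
import math


def g(phi):
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)


def win_probability(mu, phi, mu_opp, phi_opp):
    """Estimated chance of beating the opponent.

    Uses the combined deviation of both players, so the estimate moves
    toward 50% when either rating is uncertain.
    """
    combined = math.sqrt(phi ** 2 + phi_opp ** 2)
    return 1.0 / (1.0 + math.exp(-g(combined) * (mu - mu_opp)))


def is_acceptable(mu, phi, mu_opp, phi_opp, low=0.35, high=0.65):
    """True if the win probability falls inside the player's chosen range."""
    return low <= win_probability(mu, phi, mu_opp, phi_opp) <= high


# Same mu gap of 2, but one opponent's rating is far less certain.
print(win_probability(0.5, 0.4, -1.5, 0.4))  # well-established opponent
print(win_probability(0.5, 0.4, -1.5, 1.8))  # opponent with a high phi
print(is_acceptable(0.0, 1.0, 0.2, 1.0))
```

The second probability is closer to 50% than the first, which is exactly the effect the level comparison misses for low-mu, high-phi players.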
If you want to keep a criterion closer to the current system, I would define the range of suitable opponents as [mu-phi-x, mu+phi+x] with some cutoff x (x=0.5 seems reasonable to me). That would mean there are more possible opponents for a player with a high phi (the system doesn't know their skill well) than for a player with a low phi (a good estimate of their skill).
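In code the range criterion is only a few lines. The function names are made up, and the check is done from one player's side only; whether both players' ranges have to accept the match is a design choice I leave open.

```python
def suitable_range(mu, phi, x=0.5):
    """Range of opponent mu values considered a suitable match."""
    return mu - phi - x, mu + phi + x


def is_suitable(mu, phi, mu_opp, x=0.5):
    """True if the opponent's mu lies inside the player's range."""
    low, high = suitable_range(mu, phi, x)
    return low <= mu_opp <= high


print(suitable_range(0.0, 1.5))   # high phi: (-2.0, 2.0)
print(suitable_range(0.0, 0.25))  # low phi:  (-0.75, 0.75)
print(is_suitable(0.0, 1.5, 1.2), is_suitable(0.0, 0.25, 1.2))
```

The same opponent at mu=1.2 is inside the range of the high-phi player but outside that of the low-phi player.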
A more sophisticated matching algorithm could also look at the distribution of players who started a match in, say, the last 30 minutes to determine a good cutoff. (When there are more players online, or a player is nearer the middle of the distribution, an evenly matched opponent can be found sooner than for players in the tails.)
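One possible version of this, purely as a sketch: widen the cutoff x in small steps until the range [mu-phi-x, mu+phi+x] contains at least some minimum number of recently active players. The names, the step size, the minimum of 5 candidates and the upper limit on x are all assumptions of mine.

```python
def adaptive_cutoff(mu, phi, recent_mus, min_candidates=5, step=0.1, max_x=3.0):
    """Smallest cutoff x (a multiple of step) that yields enough candidates.

    recent_mus: mu values of players who started a match recently,
    e.g. within the last 30 minutes. Returns max_x if even the widest
    range does not contain min_candidates players.
    """
    n_steps = int(round(max_x / step))
    for i in range(n_steps + 1):
        x = i * step
        low, high = mu - phi - x, mu + phi + x
        count = sum(1 for m in recent_mus if low <= m <= high)
        if count >= min_candidates:
            return x
    return max_x


# Hypothetical snapshot of recently active players, clustered around mu=0.
recent = [-0.4, -0.2, -0.1, 0.0, 0.1, 0.3, 0.5]
print(adaptive_cutoff(0.0, 0.3, recent))  # player in the middle
print(adaptive_cutoff(2.0, 0.3, recent))  # player in the upper tail
```

The player in the middle gets by with a much smaller cutoff than the player in the tail, and with more players online both cutoffs would shrink.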