First of all, I think you guys are doing a great job, and I cheer any implementation and integration of a serious scoring system into the site, whether it's TrueSkill, another Elo variant or Elo itself. Diplomacy is a great game, and this site makes it possible for me to enjoy it and play against some top competition from around the world - and even get ranked!
I haven't read the thread, just ATC's post, so I'm sorry if some of my points have already been discussed. A couple of things I'd like to add:
1. "In Elo, you're supposed to get bigger ratings boost for beating players who are better than you - not for beating players who are equally rated. It's always impressive to solo over 7 players who are as good as you are - and I don't think it gets more impressive as those players get better."
I do not agree with this statement. I think soloing in a top game is much, much more difficult for a top player than soloing is for a low/medium-skilled player in a low/medium-skilled game. This has to do with the fact that the game is designed such that you actually should not win. Most games are designed such that the most likely outcome is a win; Diplomacy is designed (with all its stalemate lines, and to model reality, I guess) not to be won if all players play well. Take the Diplomacy 2012 World Cup, for example. It comes down to the reversibility of the game: Diplomacy positions can always be pushed back to equilibrium if played well, and the limit (t -> large) of a well-played game should be an immovable draw. Chess, for example, has draws as well, but it is designed to become skewed and to end (as much as a logical game can be), since you can't rebuild pieces. Summarizing: the better the players are, the less likely a win is. Hence I don't think it's strange that a top player gets more points for a solo in a high-quality game, or even for getting into a draw, than a lesser player gets in an easier game overall. I actually like this feature.
2. "If the top player on the site gets beaten by a bunch of low ranked players, they should take a big rating hit. GR treats that loss as equivalent to being beaten by other top players."
I also don't think this is a problem. Because Diplomacy is a seven-player, free-for-all king-of-the-hill game that is actually designed to end in a draw, I think top players should not take a huge hit for losing to a bunch of beginners. Beginners don't recognize all the tactical implications of their actions, hence a game with six beginners and one top player is very likely to end in a solo. If the top player wins, he won't win much, as the beginners' 6% isn't worth much. If a beginner wins, he gets a relatively big pot, as there is 6% of the top player in it. The top player in that game has probably lost because some beginners made a series of mistakes; should the top player then take a big hit? I don't think so.
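To make the pot argument concrete, here is a rough sketch in Python. It is not the real GR formula: I'm assuming every player stakes a flat 6% of their rating and that a soloist takes the whole pot. The function name `settle_solo` and the ratings (500 for the top player, 100 for each beginner) are made up for illustration.

```python
def settle_solo(ratings, winner, stake_percent=6):
    """Hypothetical pot model (an assumption, not the actual GR code):
    each player stakes stake_percent of their rating and the soloist
    collects the whole pot."""
    stakes = [r * stake_percent / 100 for r in ratings]
    pot = sum(stakes)
    # Everyone pays their stake; the winner then receives the pot.
    new_ratings = [r - s for r, s in zip(ratings, stakes)]
    new_ratings[winner] += pot
    return new_ratings


# Made-up ratings: one top player (index 0) and six beginners.
ratings = [500, 100, 100, 100, 100, 100, 100]

top_wins = settle_solo(ratings, winner=0)
beginner_wins = settle_solo(ratings, winner=1)

top_gain = top_wins[0] - ratings[0]                 # top player solos
beginner_gain = beginner_wins[1] - ratings[1]       # a beginner solos
top_loss = ratings[0] - beginner_wins[0]            # top player is beaten

print("top player gains", top_gain)
print("beginner gains", beginner_gain)
print("top player loses", top_loss)
```

Under these assumptions the top player gains less from a solo than a beginner does, and the top player's loss is just his own stake, whoever beats him. That is the behaviour ATC objects to and that I don't mind.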
I agree with you, ATC, that the mechanisms you identify as faulty would be strange if we were dealing with a regular game, say Magic or poker or something. But (whether by accident or by genius design on Ghost's part) these faults are actually very suitable for this game and this website, I feel.
From my own experience: my two stints of Diplomacy on this site have been a steady climb to a relatively higher GR. I soloed in my earlier games, and I still only solo in games of medium skill. The higher my GR got, the more games I played against good players. In games with really top-ranked players (top 50 or something) it is so incredibly hard to solo that it should be rewarded more. If there is a solo in such a game, it is usually because there is one very low-GR player involved. Most losses in my games against less skilled competition (now and in the past) are due to some pretty foolish decision-making or lack of dedication (NMR) by the lowest-ranked player. These losses should not become much more harmful to my GR as I climb the ladder (they are already relatively scaled!), as they are increasingly less my own fault as I get better.
I think if you take out these two 'faulty' mechanisms, we will see that top players really won't play against beginners or medium players any more. This is already a very small problem, but I think you'll make it huge if you introduce a big GR hit for top players in lesser games. Climbing the ladder, I have experienced first-hand how much better the players get, and I find GR remarkably accurate (if GB and variants are taken away or separated). I'm quite sure I won't solo against the current top 6.
I don't know if my contribution is useful; I tried to summarize why I think GR works so remarkably well (even though theoretically it shouldn't).