All Diplomacy Scoring Stinks

RoganJosh
Silver Donator
Posts: 556
Joined: Sun Dec 31, 2017 1:02 am
Location: Stockholm
Karma: 464

Re: All Diplomacy Scoring Stinks

#21 Post by RoganJosh » Sat Oct 24, 2020 12:04 am

Yes, any probabilistic model has a variance somewhere. The expected value is still not a prediction.

BunnyGo
Posts: 13635
Joined: Thu Jul 18, 2019 12:21 am
Karma: 4457

#22 Post by BunnyGo » Sat Oct 24, 2020 12:48 am

RoganJosh wrote:
Fri Oct 23, 2020 6:37 pm
Yonni wrote:
Fri Oct 23, 2020 4:33 pm
A player that was eliminated early probably did worse than a player that lasts until the end...
Yeah, I misread your previous statement. Correlation is all that's needed to make a prediction.
BunnyGo wrote:
Fri Oct 23, 2020 5:16 pm
You don’t view Elo in chess or Dan level in Go as predictive?
No? But I'm not sure what you mean by 'predictive'. Elo approximates the skills of players conditioned on the outcomes of games. Shouldn't a predictive variable approximate the outcomes of games conditioned on the skills of the players?
Maybe. In the Dan ranking in Go, you can make a pretty good prediction on how much of a handicap is needed in order to have a close match.

A_Tin_Can
Lifetime Site Contributor
Posts: 283
Joined: Fri Sep 29, 2017 9:18 pm
Karma: 451

#23 Post by A_Tin_Can » Sat Oct 24, 2020 4:35 pm

GR isn't a very good predictor of future performance. You can beat it if you make the system a bit more like Elo. I've complained about this before - in GR, high ranked players still get a big boost from beating high GR players, even if they're peers. Elo doesn't work that way - in Elo beating peers gives only a small boost (for the scoring system nerds, the problem is that the k-factor scales).
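
To make the k-factor point concrete, here is a minimal sketch of a two-player Elo update next to a made-up variant whose k-factor scales with the opponent's rating. The function names, the K value of 32, the 1500 reference rating and the `scaled_k_gain` formula are all illustrative assumptions. This is not GR's actual formula, and a real Diplomacy rating has to handle seven players rather than two.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B (logistic curve, 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_gain(rating_a: float, rating_b: float, score_a: float, k: float = 32.0) -> float:
    """Rating change for A with a fixed K. score_a is 1 for a win, 0.5 draw, 0 loss."""
    return k * (score_a - expected_score(rating_a, rating_b))


def scaled_k_gain(rating_a: float, rating_b: float, score_a: float,
                  base_k: float = 32.0, reference: float = 1500.0) -> float:
    """Hypothetical variant (NOT the real GR formula): K grows with the opponent's rating."""
    k = base_k * (rating_b / reference)
    return k * (score_a - expected_score(rating_a, rating_b))


if __name__ == "__main__":
    for rating in (1200, 2000):
        print(rating,
              elo_gain(rating, rating, 1.0),
              round(scaled_k_gain(rating, rating, 1.0), 2))
```

With a fixed K, beating an equally rated opponent is worth K/2 whether both players are rated 1200 or 2000. In the scaled variant the same result is worth more at the top, which is the kind of behaviour complained about above.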

As discussed above, messages per player is a fairly good predictor of performance overall. So is not missing turns. Largely, the more active you are in games, the better you do. Timezones are another possible signal: in general, players in North American timezones do better, because those are the timezones where most of the player base is.

If you want to *really* look at player skill, you could treat all of those signals as the background noise and try to remove them. Who are the players who do well despite sending only a few messages? Etc.

--

There's an interesting social factor in that the high GR players have a rep for being good - because they have high GR they get known for being good. I don't think there are any bad players near the top - I'd say that all high GR players are better than average. However, I think the ranking isn't very solid.

The social factor comes in when you propose a different system - it opens up to "but player X is ranked below player Y and that is clearly silly".

I ran a few alternative ratings a while back, based on more Elo-like systems.
I don't remember which of the systems performs best, though - I got disheartened when I realised "this player will score zero in this game" was the best predictor of them all.

e.m.c^42
Posts: 6320
Joined: Thu Jan 11, 2018 7:00 pm
Location: Rated 0/5 Stars; ☆☆☆☆☆
Karma: 1726

#24 Post by e.m.c^42 » Sat Oct 24, 2020 11:12 pm

Are messages counted per-send? If so, then it'll also skew depending on whether people like to send their press sentence by sentence, or all at once with paragraphs.

A_Tin_Can
Lifetime Site Contributor
Posts: 283
Joined: Fri Sep 29, 2017 9:18 pm
Karma: 451

#25 Post by A_Tin_Can » Sun Oct 25, 2020 12:53 am

They are. When I had database access, I also did the analysis per message character - as far as I remember, it didn't give different results in terms of predictive power.

I really think both are just models of game engagement, though. I don't think that there's an important difference in messaging style that will give you a general edge. But being engaged in the game will.

Wusti
Posts: 399
Joined: Thu Oct 19, 2017 10:12 pm
Karma: 232

#26 Post by Wusti » Sun Oct 25, 2020 3:43 am

Timezone is a definite factor against players outside the majority US/EU zones. When most people are active in a game, there is a bias toward swift responses to facilitate more "conversational" engagement which is simply impossible if you are asleep when most of the action happens.

A_Tin_Can
Lifetime Site Contributor
Posts: 283
Joined: Fri Sep 29, 2017 9:18 pm
Karma: 451

#27 Post by A_Tin_Can » Sun Oct 25, 2020 4:51 am

Being asleep when the action happens is a problem, but judging by my recent performance, I have also been asleep while writing orders.

Macchiavelli
Posts: 9
Joined: Wed Apr 11, 2018 6:18 pm
Karma: 9

#28 Post by Macchiavelli » Mon Oct 26, 2020 2:07 am

[quote]
It sounds like you might favor "skill points" awarded by a committee of expert judges at the end of a game, with the judges considering "tactical play", "general strategy", "correctly going for/stopping a solo", etc. Something like gymnastics or whatnot, where the skill itself is an art form.
[/quote]

Yes, but only if the judges had a set of hard and fast rules that were made public before any scoring happens.

But the best measure of skill, and the best predictor of future behaviour, is still your record.

Look at what percentage of total games people win/draw. That's the biggest stat by far for value.

My win/draw is just below 50% I think. So I either win or draw half my games, in a 7 or 10 or 17 player game.

I just got beat by a guy with around a 65% win/draw. He's better than me, obvi

RoganJosh
Silver Donator
Posts: 556
Joined: Sun Dec 31, 2017 1:02 am
Location: Stockholm
Karma: 464

#29 Post by RoganJosh » Mon Oct 26, 2020 3:47 pm

To return to BG's article, here is my criticism.

When you enter a game, you are trying to win some competition. The competition might be the single game, or it might be a tournament, or it might be that you are trying to get to the top of some ranking. Or something else, but those are the most common goals in Diplomacy. Points are the means to achieve your goal. In that sense, the points belong to the competition.

Let's ask ourselves, is DSS utility, skill, or tournament points? In fact, this question has no answer. That's because DSS is a scoring system. A scoring system is a method to (re)distribute points; it is agnostic to the purpose of the points. No, the choice of scoring system does not determine the purpose of the points.

The effects of a scoring system can align with, or be counterproductive to, the purpose of the points. However, never forget that the house rules of the competition interfere with the scoring system! It is not possible to discuss the effects of a scoring system without putting it in context.

tr1285
Posts: 59
Joined: Tue Oct 23, 2018 8:25 pm
Location: NJ, USA
Karma: 79

#30 Post by tr1285 » Tue Oct 27, 2020 1:39 pm

I also had some criticism of the original article. It might be related to RJ's but probably coming from a different angle.

Maybe I don't really understand how you could separate utility points that motivate in-game goals and skill points that gauge the relative ability of players in Diplomacy. Assuming that players play logically to maximize their points within whatever scoring system they are using, there should be no difference between the two, and I think it is a very good thing that GR calculates the skill rating according to the scoring system of the games played. Otherwise there is an inconsistency about goals, such as: in this game I get more points from eliminating players, but if I care more about my skill ranking I want to increase my center count instead. Tournament play is a completely different beast.
Draw Sized Scoring rewards me managing to happy Care Bear draw in 5-way draws, and score as well over time as compared to nearly the same number of 3-way draws.
I don't think so. In a 5-way draw you get 20% of the winnings. In a 3-way draw you get 33 1/3%. That means someone who averages a 3-way draw score does about 67% better ((1/3)/(1/5) - 1 = 2/3) than someone who averages 5-way draws. I would not call that "nearly the same." It seems a pretty significant improvement.

I would like to see a scoring system that takes into consideration the draw size as well as the center count, though it seems mathematically complicated to decide how to do that. I'll just put a simple version out there. If there is a board leader with at least 3 more centers than anyone else, their score is #centers divided by total centers. The rest of the pot gets split equally. Or if there is no dominant board leader, just split the pot equally. The advantage over SoS and Tribute is the board leader in a Draw never gets more than 50% of the pot. The advantage over DSS is while you are getting closer to a solo, you shouldn't have to worry about cutting out small players in case your solo fails, but if you are trying to stop a solo, there is incentive both to limit the board leader's growth as well as cut out smaller players.
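
Here is a minimal sketch of that simple version, to make the rule easier to poke at. It assumes the classic map's 34 supply centers, treats every player with at least one center as part of the draw, and splits the rest of the pot equally among the non-leaders. The function name `leader_share_scores` and its parameters are made up for illustration, and the pot is normalised to 1.0.

```python
from typing import Dict


def leader_share_scores(
    centers: Dict[str, int],
    total_centers: int = 34,  # assumption: classic map
    lead_margin: int = 3,     # "at least 3 more centers than anyone else"
) -> Dict[str, float]:
    """Split a drawn pot (normalised to 1.0) among the surviving players."""
    survivors = {player: n for player, n in centers.items() if n > 0}
    if not survivors:
        raise ValueError("a draw needs at least one surviving player")

    # Rank survivors by center count, largest first.
    ranked = sorted(survivors, key=survivors.get, reverse=True)
    leader = ranked[0]
    has_dominant_leader = (
        len(ranked) > 1
        and survivors[leader] - survivors[ranked[1]] >= lead_margin
    )

    if not has_dominant_leader:
        # No dominant board leader: plain equal split.
        equal = 1.0 / len(survivors)
        return {player: equal for player in survivors}

    # Leader takes centers / total centers; everyone else splits the remainder.
    leader_share = survivors[leader] / total_centers
    rest_share = (1.0 - leader_share) / (len(survivors) - 1)
    return {
        player: leader_share if player == leader else rest_share
        for player in survivors
    }


if __name__ == "__main__":
    print(leader_share_scores({"France": 16, "Turkey": 10, "England": 8}))
```

In the example, France takes 16/34 of the pot and Turkey and England get 9/34 each. Since a leader in a draw holds at most 17 of 34 centers, the leader's share stays at or below 50%.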

FleetYeet
Posts: 11
Joined: Wed Oct 16, 2019 12:54 pm
Karma: 9

#31 Post by FleetYeet » Tue Oct 27, 2020 3:12 pm

I don't think rewarding large center counts in draws is really fitting for Diplomacy, nor does it really reward skill. Diplomacy often puts you into situations where surviving into the draw with just a few centers is the best you can hope for, often through basically no fault of the player. This is especially obvious in gunboat - sometimes literally every single neighbor will move to kill you in S01, and you just have to scramble and play your attackers off against each other to even stand a chance of surviving.

Surviving in a situation like that requires plenty of skill. Throwing strategically to other players to force a draw is similarly a difficult maneuver to pull off, and rewarding when it occurs.

Not only does surviving into draws on small center counts take skill, but players that are attempting to survive on small center counts make the game more interesting, so that skill should be rewarded. Players notice incentives; if surviving on 1-2 centers isn't rewarded, they won't try as hard once it's clear that that's their best result, and the game will get more boring for everyone else.

Wusti
Posts: 399
Joined: Thu Oct 19, 2017 10:12 pm
Karma: 232

#32 Post by Wusti » Wed Oct 28, 2020 1:11 am

FleetYeet wrote:
Tue Oct 27, 2020 3:12 pm
Diplomacy often puts you into situations where surviving into the draw with just a few centers is the best you can hope for, often through basically no fault of the player.
That statement right there is the source of your problem - and it's bullshit.

My 2c worth.

BunnyGo
Posts: 13635
Joined: Thu Jul 18, 2019 12:21 am
Karma: 4457

#33 Post by BunnyGo » Wed Oct 28, 2020 3:34 am

RoganJosh wrote:
Mon Oct 26, 2020 3:47 pm
To return to BG's article, here is my criticism.

When you enter a game, you are trying to win some competition. The competition might be the single game, or it might be a tournament, or it might be that you are trying to get to the top of some ranking. Or something else, but those are the most common goals in Diplomacy. Points are the means to achieve your goal. In that sense, the points belong to the competition.

Let's ask ourselves, is DSS utility, skill, or tournament points? In fact, this question has no answer. That's because DSS is a scoring system. A scoring system is a method to (re)distribute points; it is agnostic to the purpose of the points. No, the choice of scoring system does not determine the purpose of the points.

The effects of a scoring system can align with, or be counterproductive to, the purpose of the points. However, never forget that the house rules of the competition interfere with the scoring system! It is not possible to discuss the effects of a scoring system without putting it in context.
That was exactly the point of my article. Yes, I'm glad we agree.

Choosing a scoring system without a goal in mind is pointless (pun intended).

FleetYeet
Posts: 11
Joined: Wed Oct 16, 2019 12:54 pm
Karma: 9

#34 Post by FleetYeet » Wed Oct 28, 2020 4:32 pm

Wusti wrote:
Wed Oct 28, 2020 1:11 am
FleetYeet wrote:
Tue Oct 27, 2020 3:12 pm
Diplomacy often puts you into situations where surviving into the draw with just a few centers is the best you can hope for, often through basically no fault of the player.
That statement right there is the source of your problem - and it's bullshit.

My 2c worth.
Nuanced statements and the internet. Is there anything that doesn't mix better?

Would you agree that a weaker version of my original statement is true? Say, that the effect that a player's skill has on a game's outcome will vary game-to-game? Because if so, you end up with the same conclusion I think.

President Eden
Posts: 6907
Joined: Fri Oct 20, 2017 2:11 pm
Location: possibly Britain
Karma: 9609

#35 Post by President Eden » Sat Oct 31, 2020 7:18 pm

What about a negotiating system for draws? All parties not eliminated from the game have to agree to a split of the pot. If they do, the game is drawn and the survivors receive their apportionment. Otherwise, the game continues.

I could see there being issues with this holding up draws because the parties don’t agree to a split, which would probably lead to some very dumb and annoying endgames with people squabbling over penny-ante differences in the outcome. But there are a couple of neat advantages:
• Flexibility — in theory, there is an “optimal” split of the pot for any group of players based on their preferences; this system would allow that split to be achieved every time, and not ruled out by a fixed system.
• Value-agnosticism — this system imposes no value judgments about the reason to play the game; the players in the game get to decide.

You could call this the “Treaty” scoring system.
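
A minimal sketch of how the unanimity check might work, assuming each surviving player submits a proposed split of the pot and the draw only passes when every proposal is identical. The function name `resolve_treaty` and the data layout are invented for illustration; nothing like this exists on the site as far as this thread says.

```python
from fractions import Fraction
from typing import Dict, Optional

Split = Dict[str, Fraction]


def resolve_treaty(proposals: Dict[str, Split]) -> Optional[Split]:
    """Return the agreed split if all survivors proposed the same valid split, else None.

    proposals maps each surviving player to the split that player is voting for.
    Returning None means no agreement, so the game continues.
    """
    if not proposals:
        return None

    survivors = set(proposals)
    splits = list(proposals.values())
    first = splits[0]

    # A valid split names exactly the survivors, has no negative shares, and sums to 1.
    if set(first) != survivors:
        return None
    if any(share < 0 for share in first.values()) or sum(first.values()) != 1:
        return None

    # Unanimous consent: every survivor must have proposed the same split.
    if any(split != first for split in splits[1:]):
        return None
    return dict(first)


if __name__ == "__main__":
    deal = {"France": Fraction(1, 2), "Turkey": Fraction(1, 4), "England": Fraction(1, 4)}
    print(resolve_treaty({player: deal for player in deal}))
```

`Fraction` is used so that shares like 1/3 add up to exactly 1 without floating-point rounding. A single dissenting proposal makes the function return `None`, which is where the holding-up-draws problem would show up.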

Swede03
Posts: 52
Joined: Fri Apr 26, 2019 4:27 pm
Karma: 21

#36 Post by Swede03 » Sat Oct 31, 2020 7:50 pm

President Eden wrote:
Sat Oct 31, 2020 7:18 pm
What about a negotiating system for draws? All parties not eliminated from the game have to agree to a split of the pot. If they do, the game is drawn and the survivors receive their apportionment. Otherwise, the game continues.

I could see there being issues with this holding up draws because the parties don’t agree to a split, which would probably lead to some very dumb and annoying endgames with people squabbling over penny-ante differences in the outcome. But there are a couple of neat advantages:
• Flexibility — in theory, there is an “optimal” split of the pot for any group of players based on their preferences; this system would allow that split to be achieved every time, and not ruled out by a fixed system.
• Value-agnosticism — this system imposes no value judgments about the reason to play the game; the players in the game get to decide.

You could call this the “Treaty” scoring system.
How is this different from WTA?

President Eden
Posts: 6907
Joined: Fri Oct 20, 2017 2:11 pm
Location: possibly Britain
Karma: 9609

#37 Post by President Eden » Sat Oct 31, 2020 8:01 pm

Swede03 wrote:
Sat Oct 31, 2020 7:50 pm
President Eden wrote:
Sat Oct 31, 2020 7:18 pm
What about a negotiating system for draws? All parties not eliminated from the game have to agree to a split of the pot. If they do, the game is drawn and the survivors receive their apportionment. Otherwise, the game continues.

I could see there being issues with this holding up draws because the parties don’t agree to a split, which would probably lead to some very dumb and annoying endgames with people squabbling over penny-ante differences in the outcome. But there are a couple of neat advantages:
• Flexibility — in theory, there is an “optimal” split of the pot for any group of players based on their preferences; this system would allow that split to be achieved every time, and not ruled out by a fixed system.
• Value-agnosticism — this system imposes no value judgments about the reason to play the game; the players in the game get to decide.

You could call this the “Treaty” scoring system.
How is this different from WTA?
The WTA model on here (Draw-Size Scoring) forces an even split of the pot. I suspect that a lot of games played under this type of system would result in even splits, because the unanimous consent gives the smaller powers equal footing to bargain (presuming they aren't at risk of elimination). And it would probably get there much more acrimoniously than DSS, so maybe that's a cost that makes it not worthwhile if it reaches the same result.
But Diplomacy players are typically good sports (suppress your laughter, you know it's true!) and respect when a player has played a good game but not quite managed to solo. I could see them being willing to reward the solo threat with a larger share, which the Treaty system would allow.

jmo1121109
Lifetime Site Contributor
Posts: 1099
Joined: Fri Sep 29, 2017 4:20 pm
Karma: 2944

#38 Post by jmo1121109 » Sat Oct 31, 2020 9:50 pm

What I think is interesting is that most of the models I've ever seen proposed have not accounted for one of the biggest skill based factors in a game.

Country.

There are some top players who are amazing at 5 or 6 countries, but tend to lose every game as 1 specific country. If for example, I enter a game as France or Germany, my odds of solo'ing go up regardless of who else is in that game. I just have gotten very good at the meta around those countries (and France already has the highest % meta last time I checked)

When you're talking about chess this obviously isn't a factor, because both players are playing with the same board, so an Elo-based system doesn't account for it. But the country effect would be hard to prove, because most people haven't played each country all that many times.

That said, one of the things I am aiming to do sometime in the mid to near future is build out a player stats page that'll show your success in each country for classic. Combined with GR and points you should be able to get a pretty good understanding of someone's skill level at a glance.
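
As a rough idea of what a per-country breakdown could compute, here is a small sketch. The record format (a country name plus a result string) and the function name are assumptions made up for the example; this is not the site's schema or the planned stats page.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def win_draw_rate_by_country(games: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Fraction of games per country that ended in a solo or a draw.

    games is an iterable of (country, result) pairs, where result is a
    hypothetical label such as "solo", "draw", "survived" or "eliminated".
    """
    played = defaultdict(int)
    good = defaultdict(int)
    for country, result in games:
        played[country] += 1
        if result in ("solo", "draw"):
            good[country] += 1
    return {country: good[country] / played[country] for country in played}


if __name__ == "__main__":
    history = [
        ("France", "solo"),
        ("France", "draw"),
        ("France", "eliminated"),
        ("Italy", "eliminated"),
    ]
    print(win_draw_rate_by_country(history))
```

With only a handful of games per country, these rates are very noisy, which is the small-sample problem mentioned above.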

MadMarx
Gold Donator
Posts: 293
Joined: Sun Dec 31, 2017 12:01 am
Karma: 148

#39 Post by MadMarx » Sat Oct 31, 2020 11:05 pm

jmo1121109 wrote:
Sat Oct 31, 2020 9:50 pm
What I think is interesting is that most of the models I've ever seen proposed have not accounted for one of the biggest skill based factors in a game.

Country.
That is an excellent point jmo, I had never thought about that...

I have never won a classic game as Italy, for example. Also, I was once in a game where my opponent later confessed to seeing a relatively high win/draw rate for me as the great power I had drawn, compared to my typical win/draw rate, so I agree it would be interesting to see an analysis based on country.

Peregrine Falcon
Site Contributor
Posts: 245
Joined: Tue Mar 14, 2017 8:44 pm
Karma: 310

#40 Post by Peregrine Falcon » Sat Oct 31, 2020 11:51 pm

jmo1121109 wrote:
Sat Oct 31, 2020 9:50 pm
That said, one of the things I am aiming to do sometime in the mid to near future is build out a player stats page that'll show your success in each country for classic. Combined with GR and points you should be able to get a pretty good understanding of someone's skill level at a glance.
This would be amazing. I actually track all my stats on my own, (yes I'm a data geek, how could you tell?) but it would be amazing if webDip did it automatically.

Would you be willing to do that in the aggregate as well? (I.e. stats by country / press type / live vs non-live?) I've been considering asking you for game data tables to replicate some of the statistical analyses out there with an updated, larger sample.

jmo1121109 wrote:
Sat Oct 31, 2020 9:50 pm
What I think is interesting is that most of the models I've ever seen proposed have not accounted for one of the biggest skill based factors in a game.

Country.

There are some top players who are amazing at 5 or 6 countries, but tend to lose every game as 1 specific country. If for example, I enter a game as France or Germany, my odds of solo'ing go up regardless of who else is in that game. I just have gotten very good at the meta around those countries (and France already has the highest % meta last time I checked)
This is a really good point. I have an 80% win/draw-rate as France and a 71% draw rate as Turkey (0% solo rate lol), but countries like Germany (46% elimination rate) and Austria (53% elimination rate) bring my general scores way down. (Also yes, you can now laugh at how much I suck at the Anschluss partners.)

Having the by-country stats would help people see these, but are you also considering making some sort of generalised rating for each country beyond win/draw for players?
