Why Players Prefer Draw-Size Scoring

nopunin10did
Posts: 8
Joined: Sat Jan 12, 2019 6:17 pm
Karma: 16

Re: Why Players Prefer Draw-Size Scoring

#41 Post by nopunin10did » Thu Feb 07, 2019 4:20 am

Amateur Statistician Moment

For the sake of argument, I took a sample of 179 games of Classic that I'd previously collected from PlayDip. It's not a huge sample, but it's enough to run a preliminary experiment. These games were full-press, no variants, et cetera.

Keep in mind that all rated games on PlayDip currently use Draw-Size Scoring (DSS).

I used the SC counts at end-of-game to come up with these variables:
  1. Did the game end in a solo?
  2. What was the board-topper's Sum-of-Squares score for the game?
  3. How many powers were eliminated by end of game?
  4. What would a surviving player's DSS score be, assuming the game ended as a DIAS draw (Draws Include All Survivors)?
For number 2, I ignored the rule in SOS whereby a soloist gets 100% of the points, as I'm using SOS more as a metric for the time being.
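
As a rough illustration of how variables 2 through 4 can be derived from a final SC count, here is a minimal R sketch. The SC counts are made up for the example, and the formulas assume the usual definitions (SOS share = the board-topper's squared SC count over the sum of all squared counts; DSS = an equal split among survivors). It is not necessarily the exact script used to build the data set.

```r
# Hypothetical end-of-game supply-center counts, one entry per power.
# Zeros are eliminated powers. These numbers are illustrative only.
sc <- c(19, 5, 4, 3, 3, 0, 0)

survivors <- sum(sc > 0)   # powers still holding at least one SC
elims     <- sum(sc == 0)  # variable 3: powers eliminated

# Variable 2: board-topper's Sum-of-Squares share, deliberately ignoring
# the SOS rule that a soloist takes 100% of the points.
sosr <- max(sc)^2 / sum(sc^2)

# Variable 4: each survivor's share if the draw includes all survivors.
dss <- 1 / survivors

round(c(Elims = elims, SOSR = sosr, DSS.Score = dss), 3)
```

With these counts the board-topper's share is 361/420, or about 0.86, while the DSS share is 0.2 regardless of how the centers are distributed among the five survivors.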

I fed these values into R and produced a few different logistic regressions.

In a logistic regression, the dependent variable is assumed to be a binary or boolean value. A game that actually ended in a solo has a Solo value of 1, and a game that actually ended in a draw has a Solo value of 0.

I found the following:
  1. There was no evidence of a correlation between hypothetical DSS score and the game ending in a solo. The coefficient was actually negative, but the p-value was very high (0.890).
  2. Likewise, there was no evidence of a correlation between powers eliminated and the game ending in a solo (p-value 0.313). This is effectively the same test as (1), just with the independent variable scaling linearly.
  3. There was significant evidence of a positive correlation between the hypothetical SOS score and the game ending in a solo. I tried two different regressions for this value, and the p-values for the SOS coefficient were 3.11e-11 and 0.00692 for the regressions with and without an intercept, respectively.
In layman's terms, this means that a Sum-of-Squares score was a very good measure for closeness-to-the-solo, while draw size was not, even in an environment where draw size is the only system used.

This doesn't necessarily mean that Sum-of-Squares is a perfect scoring system, but it does provide evidence for some of what has already been discussed in this thread.
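
To make the fitted model a little more concrete, the sketch below plugs the coefficients from the `Solo ~ SOSR` summary output in this post into the inverse logit. The two SOS shares being evaluated are arbitrary illustrative values, and because the shares in the data were measured at game end, the results describe how well the metric separates finished solos from finished draws rather than a forecast made mid-game.

```r
# Coefficients copied from the summary() output of Solo ~ SOSR in this post
b0 <- -16.399   # (Intercept)
b1 <-  26.108   # SOSR

# Hypothetical board-topper SOS shares to evaluate (illustrative values)
sosr <- c(0.5, 0.7)

# plogis() is the inverse logit: 1 / (1 + exp(-x))
p_solo <- plogis(b0 + b1 * sosr)
round(p_solo, 2)   # about 0.03 and 0.87

# SOS share at which the fitted solo probability crosses 50%
-b0 / b1           # about 0.63
```

The zero-intercept model cannot express this kind of threshold: with no intercept, the fitted probability at an SOS share of 0 is pinned to `plogis(0)`, which is 0.5. That may be part of why its residual deviance (239.28) is so much higher than the 82.99 of the model with an intercept.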

Code:

> head(games,10)
   Solo Elims    GPI SOS.p.GPI Delta.SC  SOSR Delta.SOSR DSS.Score
1     0     4 131.33      1.29        1 0.429      0.063 0.3333333
2     1     4 150.67      2.15       10 0.717      0.575 0.3333333
3     0     2  61.60      1.96        0 0.393      0.000 0.2000000
4     1     3 115.00      3.14       12 0.785      0.678 0.2500000
5     0     2  62.00      1.61        0 0.323      0.000 0.2000000
6     1     2  84.00      4.30       14 0.860      0.800 0.2000000
7     0     1  38.33      2.61        3 0.435      0.222 0.1666667
8     0     4 136.67      1.65        4 0.549      0.254 0.3333333
9     1     2  91.60      4.37       14 0.873      0.795 0.2000000
10    0     4 132.67      1.48        3 0.492      0.188 0.3333333
> games.dss.glm <- glm(Solo ~ DSS.Score, data = games, family = binomial)
> summary(games.dss.glm)

Call:
glm(formula = Solo ~ DSS.Score, family = binomial, data = games)

Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-1.202  -1.190   1.156   1.165   1.192  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.09439    0.54201   0.174    0.862
DSS.Score   -0.25661    1.85836  -0.138    0.890

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 246.74  on 177  degrees of freedom
Residual deviance: 246.72  on 176  degrees of freedom
AIC: 250.72

Number of Fisher Scoring iterations: 3

> games.sos.glm <- glm(Solo ~ SOSR, data = games, family = binomial)
> summary(games.sos.glm)

Call:
glm(formula = Solo ~ SOSR, family = binomial, data = games)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-3.3274  -0.2509   0.0368   0.3240   1.9783  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  -16.399      2.517  -6.516 7.23e-11 ***
SOSR          26.108      3.931   6.641 3.11e-11 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 246.74  on 177  degrees of freedom
Residual deviance:  82.99  on 176  degrees of freedom
AIC: 86.99

Number of Fisher Scoring iterations: 6

> games.elim.glm <- glm(Solo ~ Elims, data = games, family = binomial)
> summary(games.elim.glm)

Call:
glm(formula = Solo ~ Elims, family = binomial, data = games)

Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-1.300  -1.179   1.060   1.176   1.297  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -0.4173     0.4613  -0.905    0.366
Elims         0.1400     0.1388   1.009    0.313

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 246.74  on 177  degrees of freedom
Residual deviance: 245.71  on 176  degrees of freedom
AIC: 249.71

Number of Fisher Scoring iterations: 3

> games.sos.zero.glm <- glm(Solo ~ SOSR + 0, data = games, family = binomial)
> summary(games.sos.zero.glm)

Call:
glm(formula = Solo ~ SOSR + 0, family = binomial, data = games)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.4166  -1.3173   0.9383   0.9809   1.0273  

Coefficients:
     Estimate Std. Error z value Pr(>|z|)   
SOSR   0.6507     0.2409   2.701  0.00692 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 246.76  on 178  degrees of freedom
Residual deviance: 239.28  on 177  degrees of freedom
AIC: 241.28

Number of Fisher Scoring iterations: 4

Claesar
Site Moderator
Posts: 691
Joined: Tue Oct 03, 2017 10:34 am
Karma: 239

#42 Post by Claesar » Thu Feb 07, 2019 12:31 pm

nopunin10did wrote:
Thu Feb 07, 2019 4:20 am
...
I found the following:
  1. There was no evidence of a correlation between hypothetical DSS score and the game ending in a solo. The coefficient was actually negative, but the p-value was very high (0.890).
  2. Likewise, there was no evidence of a correlation between powers eliminated and the game ending in a solo (p-value 0.313). This is effectively the same test as (1), just with the independent variable scaling linearly.
Interesting, thanks!
  3. There was significant evidence of a positive correlation between the hypothetical SOS score and the game ending in a solo. I tried two different regressions for this value, and the p-values for the SOS coefficient were 3.11e-11 and 0.00692 for the regressions with and without an intercept, respectively.
In layman's terms, this means that a Sum-of-Squares score was a very good measure for closeness-to-the-solo, while draw size was not, even in an environment where draw size is the only system used.
I don't fully understand what you're saying here. Are you suggesting that a large difference between the biggest power and the second-biggest increases the chance of a solo?

#43 Post by nopunin10did » Thu Feb 07, 2019 4:20 pm

Claesar wrote:
Thu Feb 07, 2019 12:31 pm
  3. There was significant evidence of a positive correlation between the hypothetical SOS score and the game ending in a solo. I tried two different regressions for this value, and the p-values for the SOS coefficient were 3.11e-11 and 0.00692 for the regressions with and without an intercept, respectively.
In layman's terms, this means that a Sum-of-Squares score was a very good measure for closeness-to-the-solo, while draw size was not, even in an environment where draw size is the only system used.
I don't fully understand what you're saying here. Are you suggesting that a large difference between the biggest power and the second-biggest increases the chance of a solo?
Not exactly. I don't want to overstate my claims here, especially since, by definition, SOS for a board-topper in a solo has a higher possible range than for a board-topper in a draw.

A follow-up experiment would take snapshots of game states mid-game to get the SOS of the board-topping player, then see whether that's a good predictor of an eventual solo.

But I don't think it will be all that controversial to say that you're much more likely to clinch the solo the higher the lead in SC count you have over other players.

More important is my finding that there was no evidence of correlation between eliminations and achieving the solo.

ghug
Posts: 2035
Joined: Mon Mar 20, 2017 3:51 pm
Location: Seattle
Karma: 525

#44 Post by ghug » Thu Feb 07, 2019 5:20 pm

nopunin10did wrote:
Wed Feb 06, 2019 7:20 pm
ghug wrote:
Wed Jan 30, 2019 12:40 am
Carnage also encourages allowing a draw to happen without oneself over throwing a solo.
In Carnage, draws must include all survivors. That's the case for SOS too, though I'm less familiar with specifically how your site implements it. Do you allow people to vote themselves out of a draw in an SOS game?

This is one small point where Carnage & SOS tend to adhere to the printed rules better than typical DSS systems, since the concept of approving a draw that you're not a party to isn't supported in the rules.

(Not that I particularly think that the rulebook should be treated as gospel when it comes to draws, but that's a bigger issue.)
All webDip draws include all survivors. What I was getting at is that it's better to be eliminated and have the game end in a draw without you than it is to allow a solo in Carnage, which is weird.

#45 Post by nopunin10did » Thu Feb 07, 2019 5:59 pm

ghug wrote:
Thu Feb 07, 2019 5:20 pm
All webDip draws include all survivors. What I was getting at is that it's better to be eliminated and have the game end in a draw without you than it is to allow a solo in Carnage, which is weird.
Gotcha. It is a little strange, but it's in line with one goal of Carnage's designer: that the solo should be the most important part of the system. Regardless of what you do, you only win the most points if you solo, and you only win the least points if someone else gets the solo.

Restitution
Posts: 1
Joined: Thu Jan 31, 2019 7:00 am

#46 Post by Restitution » Thu Feb 07, 2019 6:24 pm

Is there any reason Carnage scoring wouldn't work in webdip?

#47 Post by nopunin10did » Thu Feb 07, 2019 10:52 pm

Restitution wrote:
Thu Feb 07, 2019 6:24 pm
Is there any reason Carnage scoring wouldn't work in webdip?
Funny you should ask:
https://www.playdiplomacy.com/forum/vie ... =6&t=57975

The short answer is yes, it can work, but with modifications. Standard Carnage doesn’t scale well with variants, and it’s not great in this sort of zero-sum environment where players can pick a scoring system game by game.

I developed a formula that alters Carnage to scale with any game of 5 players or more and keeps its payouts in a draw comparable with those of DSS. We’re currently using a version of this Fibonacci scoring system for an online forum/Discord tournament.

swordsman3003
Posts: 66
Joined: Tue Jan 02, 2018 2:51 pm
Karma: 76

#48 Post by swordsman3003 » Thu Feb 21, 2019 12:02 am

As promised, I posted captainmeme's response essay to my blog. It's absolutely awesome to get a guest post, and I love that the blog now has an argument against my perspective.

Thank you captainmeme!!

https://brotherbored.com/guest-post-dra ... s-scoring/