Since the release of eXtreme Gammon, there have been several independent studies of the program's strength. We'd like to thank everyone who did this long, hard work, particularly Michael Depreli.
Michael Depreli Study 2012
Michael Depreli Study 2010
Mike Corbett Study
Michael Depreli Study 2012: published on BGonline.org (on January 28th 2012)
The study compares the top programs at multiple strength levels. Using 500 money games, any difference of opinion between the programs is analyzed very deeply using a rollout, and the mistakes each program/level makes are accumulated. This is a long process (more than 5000 moves or cube decisions needed to be rolled out). Rollouts were made using eXtreme Gammon 2 (rollout parameters: 3-ply for checker play, XGRoller for cube decisions, rolling until the 95% confidence interval of the equity is less than 0.005, with a minimum of 1296 trials). All numbers are normalized equity.
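The stopping rule in the rollout parameters above can be sketched as follows. This is only an illustration, not eXtreme Gammon's implementation: it assumes the "95% confidence" figure is the half-width of a normal confidence interval around the mean equity (z = 1.96), and the names `half_width_95` and `rollout_can_stop` are hypothetical.

```python
import math
import statistics

Z_95 = 1.96        # two-sided 95% normal quantile (assumption, see lead-in)
TARGET = 0.005     # confidence threshold from the study, in normalized equity
MIN_TRIALS = 1296  # minimum number of trials from the study


def half_width_95(equities):
    """95% confidence half-width of the mean of the sampled trial equities."""
    std_err = statistics.stdev(equities) / math.sqrt(len(equities))
    return Z_95 * std_err


def rollout_can_stop(equities):
    """True once the minimum trial count and the confidence target are both met."""
    if len(equities) < MIN_TRIALS:
        return False
    return half_width_95(equities) < TARGET
```

With this reading, a position whose trial results vary widely keeps rolling well past 1296 trials, while a quiet position stops as soon as the minimum is reached.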
The process does not take into account search interval. Each candidate was analyzed in the level requested regardless of the search interval used.
From the table one can see that:
- eXtreme Gammon 2 at 3-ply makes about half as much error as the previous version at the same level (and is faster).
- eXtreme Gammon 2 XGRoller+, which is faster than Snowie 3-ply, makes about 6 times less error than Snowie.
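These two observations can be checked with simple arithmetic on the Total columns. The sketch below assumes that "the previous version at the same level" refers to the eXtreme Gammon 1 3-ply total from the 2010 study; because the two studies used different rollout references, the ratios are only rough indications. The helper name `error_ratio` is hypothetical.

```python
# Total equity lost, copied from the Total columns of the 2012 and 2010 tables.
xg2_3ply = 16.231         # eXtreme Gammon 2, 3-ply (2012 study)
xg1_3ply = 35.689         # eXtreme Gammon 1, 3-ply (2010 study)
xg2_roller_plus = 6.487   # eXtreme Gammon 2, XGRoller+ (2012 study)
snowie_3ply = 45.003      # Snowie 4, 3-ply (2010 data)


def error_ratio(weaker_total, stronger_total):
    """How many times more equity the weaker program lost."""
    return weaker_total / stronger_total


print(round(error_ratio(xg1_3ply, xg2_3ply), 2))            # 2.2
print(round(error_ratio(snowie_3ply, xg2_roller_plus), 2))  # 6.94
```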
| Program | Level | Checker play | Missed Double | Wrong Double | Wrong Take | Wrong Pass | Total | PR |
|---|---|---|---|---|---|---|---|---|
| eXtreme Gammon 2 | XGRoller++ | 3.538 | 0.113 | 0.223 | 0.196 | 0.028 | 4.097 | 0.11 |
| eXtreme Gammon 2 | XGRoller+ | 5.341 | 0.477 | 0.260 | 0.389 | 0.020 | 6.487 | 0.18 |
| eXtreme Gammon 2 | 5-Ply | 6.891 | 0.967 | 0.392 | 0.615 | 0.104 | 8.969 | 0.25 |
| eXtreme Gammon 2 | 4-Ply | 9.701 | 0.355 | 1.158 | 0.562 | 0.160 | 11.936 | 0.33 |
| GnuBg 1.00 | 4-ply | 9.107 | 1.132 | 0.555 | 1.435 | 0.308 | 12.536 | 0.35 |
| eXtreme Gammon 2 | XGRoller | 12.843 | 0.495 | 0.989 | 0.401 | 0.160 | 14.887 | 0.41 |
| eXtreme Gammon 2 | 3-Ply | 13.128 | 1.208 | 1.000 | 0.700 | 0.196 | 16.231 | 0.45 |
| GnuBg 1.00 | 3-ply | 12.865 | 0.778 | 1.814 | 0.898 | 0.421 | 16.775 | 0.46 |
| XG mobile | Champion | 14.331 | 0.907 | 0.741 | 0.822 | 0.196 | 16.996 | 0.47 |
| GnuBg 1.00 | 2-ply | 16.619 | 1.063 | 1.270 | 1.579 | 0.420 | 20.951 | 0.58 |
| eXtreme Gammon 2 | 3-Ply Red | 19.689 | 1.424 | 1.160 | 0.704 | 0.196 | 23.173 | 0.64 |
| BgBlitz 2.8.0 | 4-ply | 23.586 | 2.698 | 1.640 | 3.860 | 0.701 | 32.485 | 0.90 |
| BgBlitz 2.8.0 | 3-ply | 29.747 | 2.050 | 1.760 | 3.464 | 0.466 | 37.487 | 1.04 |
| Snowie 4* | 3-ply | 37.424 | 1.922 | 1.139 | 3.651 | 0.867 | 45.003 | 1.24 |
| Program | Level | Total | PR |
|---|---|---|---|
| eXtreme Gammon 2 | XGRoller++ | 4.097 | 0.11 |
| eXtreme Gammon 2 | XGRoller+ | 6.487 | 0.18 |
| eXtreme Gammon 2 | 5-Ply | 8.969 | 0.25 |
| eXtreme Gammon 2 | 4-Ply | 11.936 | 0.33 |
| GnuBg 1.00 | 4-ply | 12.536 | 0.35 |
| eXtreme Gammon 2 | XGRoller | 14.887 | 0.41 |
| eXtreme Gammon 2 | 3-Ply | 16.231 | 0.45 |
| GnuBg 1.00 | 3-ply | 16.775 | 0.46 |
| XG mobile | Champion | 16.996 | 0.47 |
| GnuBg 1.00 | 2-ply | 20.951 | 0.58 |
| eXtreme Gammon 2 | 3-Ply Red | 23.173 | 0.64 |
| BgBlitz 2.8.0 | 4-ply | 32.485 | 0.90 |
| BgBlitz 2.8.0 | 3-ply | 37.487 | 1.04 |
| Snowie 4* | 3-ply | 45.003 | 1.24 |
Legend:
- (*) Data for the marked programs are not yet available; the figures presented here are from the 2010 study
- Ply: Search depth as defined by each program (GnuBG 2-ply is equivalent to 3-ply in the other bots)
- Total: total equity lost
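As a quick consistency check on the legend, the Total column appears to be the sum of the five error categories (checker play, missed double, wrong double, wrong take, wrong pass), up to rounding in the last digit. The sketch below verifies this for three rows copied from the 2012 table; the tolerance and the helper name `total_matches` are assumptions made for illustration.

```python
# (program, level) -> ([checker, missed double, wrong double, wrong take, wrong pass], total)
rows = {
    ("eXtreme Gammon 2", "XGRoller++"): ([3.538, 0.113, 0.223, 0.196, 0.028], 4.097),
    ("GnuBg 1.00", "4-ply"): ([9.107, 1.132, 0.555, 1.435, 0.308], 12.536),
    ("Snowie 4", "3-ply"): ([37.424, 1.922, 1.139, 3.651, 0.867], 45.003),
}

TOLERANCE = 0.002  # allows for rounding of the published three-decimal figures


def total_matches(errors, total, tolerance=TOLERANCE):
    """True if the category errors add up to the published total."""
    return abs(sum(errors) - total) <= tolerance


for (program, level), (errors, total) in rows.items():
    print(program, level, round(sum(errors), 3), total, total_matches(errors, total))
```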
Here is a chart that shows the relative strength based on this study (in Elo, compared to XG 3-ply). The speed tests were performed by GameSite 2000 Ltd and are not from an independent source.
Speed was evaluated on a Core i7 computer analyzing a money session and a match. Speed tests were made using a search interval where the last ply looks at up to 4 moves within 0.080 equity (eXtreme Gammon: Huge for 3-ply; GnuBG 3-ply and 4-ply: Large).
Note about BGBlitz: because it cannot analyze a full match, its speed was determined using rollout speed.
Michael Depreli Study 2010: published on BGonline.org (finished on April 25th 2010)
Important: the 2010 study was made using eXtreme Gammon 1. The newest version is noticeably stronger. Michael Depreli is in the process of rolling out the positions again using eXtreme Gammon 2 with much stronger settings than the ones used in the 2010 study. The new results will also include eXtreme Gammon 2 and will be available soon.
The study compares the top programs at multiple strength levels. Using 500 money games, any difference of opinion between the programs is analyzed very deeply using a rollout, and the mistakes each program/level makes are accumulated. This is a long process (more than 4500 moves or cube decisions needed to be rolled out). The project was completed after 6 months of intense analysis. Rollouts were made using GnuBG (rollout parameters: GnuBG 2-ply world class, 1296 trials or 2.33 JSD (98% confidence) if sooner). All numbers are normalized equity.
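The "1296 trials or 2.33 JSD if sooner" rule can be sketched as below. This is an illustration under stated assumptions, not GnuBG's code: it takes JSD to mean the joint standard deviation of two candidates' equity estimates (the square root of the sum of their squared standard errors), and the function names are hypothetical.

```python
import math

MAX_TRIALS = 1296  # trial cap from the study settings
JSD_LIMIT = 2.33   # separation threshold from the study settings


def jsd_separation(eq_a, se_a, eq_b, se_b):
    """Equity gap between two candidates, measured in joint standard deviations."""
    joint_sd = math.sqrt(se_a ** 2 + se_b ** 2)
    if joint_sd == 0:
        return math.inf if eq_a != eq_b else 0.0
    return abs(eq_a - eq_b) / joint_sd


def rollout_can_stop(trials, eq_best, se_best, eq_second, se_second):
    """Stop at the trial cap, or earlier once the top two plays are clearly separated."""
    if trials >= MAX_TRIALS:
        return True
    return jsd_separation(eq_best, se_best, eq_second, se_second) >= JSD_LIMIT
```

For example, two plays at 0.100 and 0.080 equity with standard errors of 0.003 and 0.004 are about 4 JSD apart, so under this reading the rollout could stop before reaching 1296 trials.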
We'd like to commend Michael for his extraordinary dedication and all his hard work to get that project completed.
The process does not take into account search interval. Each candidate was analyzed in the level requested regardless of the search interval used.
| Program | Level | Checker play | Missed Double | Wrong Double | Wrong Take | Wrong Pass | Total |
|---|---|---|---|---|---|---|---|
| eXtreme Gammon 1 | XGR+ | 13.397 | 1.088 | 0.658 | 0.970 | 0.241 | 16.354 |
| eXtreme Gammon 1 | XGR | 22.269 | 1.661 | 0.783 | 2.173 | 0.264 | 27.150 |
| eXtreme Gammon 1 | 5-ply | 17.169 | 1.507 | 0.789 | 2.859 | 0.450 | 22.774 |
| GnuBG | 4-ply | 21.599 | 2.663 | 0.644 | 4.061 | 0.127 | 29.094 |
| eXtreme Gammon 1 | 4-ply | 22.967 | 0.426 | 1.647 | 0.818 | 0.555 | 26.413 |
| GnuBG | 3-ply | 29.313 | 0.903 | 10.276 | 0.775 | 5.880 | 47.147 |
| eXtreme Gammon 1 | 3-ply | 27.814 | 1.831 | 1.528 | 3.996 | 0.520 | 35.689 |
| GnuBg | 2-ply | 33.247 | 2.763 | 1.670 | 4.261 | 0.476 | 42.417 |
| Snowie 4 | 3-ply | 37.424 | 1.922 | 1.139 | 3.651 | 0.867 | 45.003 |
| BgBlitz | 3-ply | 41.286 | 1.692 | 10.864 | 4.168 | 2.159 | 60.169 |
| Program | Level | Total |
|---|---|---|
| eXtreme Gammon 1 | XGR+ | 16.354 |
| eXtreme Gammon 1 | XGR | 27.150 |
| eXtreme Gammon 1 | 5-ply | 22.774 |
| GnuBG | 4-ply | 29.094 |
| eXtreme Gammon 1 | 4-ply | 26.413 |
| GnuBG | 3-ply | 47.147 |
| eXtreme Gammon 1 | 3-ply | 35.689 |
| GnuBg | 2-ply | 42.417 |
| Snowie 4 | 3-ply | 45.003 |
| BgBlitz | 3-ply | 60.169 |
Legend:
- Ply: Search depth as defined by each program (GnuBG 2-ply is equivalent to 3-ply in the other bots)
- Total: total equity lost
Here is a chart that shows the relative strength based on this study (in Elo, compared to XG 3-ply). The speed tests were performed by GameSite 2000 Ltd and are not from an independent source.
Speed was evaluated on a Core i7 computer analyzing a money session and a match. Speed tests were made using a search interval where the last ply looks at up to 4 moves within 0.080 equity (eXtreme Gammon: Huge for 3-ply; GnuBG 3-ply and 4-ply: Large).
Note about BGBlitz: because it cannot analyze a full match, its speed was determined using rollout speed.
Mike Corbett Study:
Phil Simborg ran a test, at Mike Corbett's request, on 10 positions from his book.
On 9 positions eXtreme Gammon version 1 did better than Snowie; GnuBG did better on 6 (all moves at 3-ply, 2-ply for GnuBG).
As the positions picked were the ones Snowie gets wrong, this test does not reflect the difference between eXtreme Gammon and Snowie. It does, however, show the difference between eXtreme Gammon and GnuBG.
| Page Number | eXtreme did better than Snowie | Gnu did better than Snowie | Error avoided by eXtreme Gammon |
|---|---|---|---|
| 1 | | | 0.059 |
| 2 | | | 0.076 |
| 15 | | | 0.024 |
| 23 | | | 0.001 |
| 61 | | | 0.012 |
| 68 | | | None |
| 70 | | | 0.054 |
| 83 | | | 0.051 |
| 87 | | | 0.166 |
| 133 | | | 0.023 |