GammOnLine Project Part II. Snowie Parameter Settings

by Chuck Bower
INTRODUCTION.

Snowie is not really one robot, but many. The reason is that there are several optional parameters which can be set by the user. Jellyfish also has parameters, but few compared to Snowie. Before showing the time impact of these various parameters, let's take a look at them and explain their meaning and implementation. (Much of this information can be found in Snowie's extensive help file.)

DEFINITIONS.

1) Ply. As with Jellyfish, there are three ways to set Snowie's lookahead ability. 1-ply means Snowie uses just the position available and its neural net (brain). It does NOT compare the positions after various diceroll sequences. At a 2-ply setting, Snowie looks at the (opponent's) 36 reply rolls, records the way it would play each of these 36 numbers, and then combines the results (1/36 * subsequent equity for a given roll, summed over the 36 possible rolls) and makes its decision. Naively this should take 21 times as long to accomplish as 1-ply, since only 21 of the 36 rolls are distinct (a non-double such as 6-2 is the same roll as 2-6). Full 3-ply means Snowie looks at roller's 36 responses to each of opponent's 36 rolls and weighs them accordingly. Again, this should take considerably longer than 2-ply, and it does.
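The 2-ply averaging just described can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Snowie's actual code: `evaluate_after_roll` is a hypothetical stand-in for the neural-net evaluation of the position reached after a given reply roll. The sketch also shows why only 21 evaluations are needed even though there are 36 equally likely dice outcomes.

```python
from itertools import combinations_with_replacement


def distinct_rolls():
    """Return the 21 distinct rolls mapped to their probabilities.

    A double (e.g. 3-3) occurs 1 way in 36; a non-double (e.g. 6-2)
    occurs 2 ways in 36, because 6-2 and 2-6 are the same roll.
    """
    rolls = {}
    for a, b in combinations_with_replacement(range(1, 7), 2):
        rolls[(a, b)] = (1 if a == b else 2) / 36
    return rolls


def two_ply_equity(evaluate_after_roll):
    """Probability-weighted average of the equity after each reply roll.

    evaluate_after_roll is a hypothetical callable: roll -> equity.
    """
    return sum(p * evaluate_after_roll(roll)
               for roll, p in distinct_rolls().items())


print(len(distinct_rolls()))  # number of distinct rolls to evaluate
```

Summing the probabilities over the 21 distinct rolls gives 1, so the weighted sum is the same quantity as the 1/36-weighted sum over all 36 ordered rolls.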

If Snowie takes longer at deeper plies, why bother wasting time? The bottom line is that Snowie is 'smarter' the deeper it looks ahead. How much smarter are the higher plies? That is what part III of this project will hopefully tell us.

2) Truncation. To shorten the time of a rollout, you can tell Snowie how many dicerolls to play per trial before it evaluates the position and goes on to the next trial. Basically you get random dice for the number of rolls indicated (truncation depth) and then an evaluation. Although this kind of rollout in general will give more reliable results (that is, come closer to the 'true' equity of the original position being rolled out) than mere evaluations, it's not as robust as FULL (untruncated) rollouts. It's still vulnerable to Snowie's imperfect evaluation function. For checker play rollouts (where you are trying to find the best checker play among several candidates), truncated rollouts probably lead to reasonable results, based on the concept that the evaluation function error (which occurs at the end of the number of specified rolls) has a similar magnitude and direction for each of the candidates. For cube decision rollouts, especially in complicated positions (for example, backgames), the absolute value of the result determines the proper cube action, so untruncated rollouts are likely to be the most reliable.

3) Bearoff database usage. Snowie has a built-in bearoff database. This file contains bearoff positions, and when both sides' positions are put together, an accurate estimate of each player's game-winning chances results. By stopping and evaluating a rollout once positions in the database are reached, considerable time savings result. I know of no reason NOT to use this option for any rollout to be performed.

4) Live cube. Snowie has the capability to use a doubling cube during the rollouts. This is true for both checker-play and cube-play rollouts. When performing this kind of rollout for a money game, the user can tell Snowie a maximum cube level. The purpose is to prevent one or even a few games with very large cubes dominating the results. At the maximum level the game is stopped when a certain equity threshold (also user specified) is reached. For matchplay this maximum cube level option isn't available since the finite matchlength effectively limits the cube's highest value.

When performing a live-cube rollout, Snowie goes through all the evaluation of cubeless rollouts, but in addition it also keeps track of doubles, takes, and passes. As a result, live-cube rollouts take longer. As will be seen later, though, very little time is added if a smaller ply is specified for the doubling decisions than for the checker play decisions.

5) Search space. Typically Snowie must evaluate many different candidate plays. At 1-ply this doesn't take long, but at higher plies the number of positions to be considered grows precipitously. By specifying a 'search space' value (there are six choices), the user has Snowie limit the number of candidates in two ways. First it looks to see how much difference in equity there is between plays at the lower ply. If that difference is small it will continue to higher plies for the competing (close) candidates. In addition, even if many plays are close, there is a maximum number it will continue looking at to higher plies. At 2-ply, the maximum number of candidates to be considered can be set between 3 and 13. For a 3-ply rollout, the number of candidates considered at the third ply is additionally limited to somewhere between 2 and 7, depending on the user's choice of search space. Snowie names these spaces 'supertiny' through 'huge'. The six titles are sufficiently descriptive.
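The two-stage filter described above (keep only plays that are close in equity, then cap how many survive) might look like the following Python sketch. It is an illustration only: the function name, the `equity_window` parameter and the sample equities are all invented for the example, and the actual thresholds Snowie uses for each search space are not given here.

```python
def prune_candidates(equities, max_candidates, equity_window):
    """Select the plays to pass on to the next, deeper ply.

    equities:       dict of play -> equity from the shallower ply
    max_candidates: cap on surviving plays (hypothetical parameter)
    equity_window:  largest allowed gap to the best play
                    (hypothetical parameter)
    """
    # Rank plays from best to worst by their shallow-ply equity.
    ranked = sorted(equities.items(), key=lambda kv: kv[1], reverse=True)
    best_equity = ranked[0][1]
    # First filter: keep only plays close to the best one.
    close = [play for play, eq in ranked
             if best_equity - eq <= equity_window]
    # Second filter: never keep more than the allowed maximum.
    return close[:max_candidates]


# Made-up 1-ply equities for the three 62 candidates used in this study.
sample = {"24/18 13/11": 0.010, "24/16": -0.005, "13/5": -0.040}
print(prune_candidates(sample, max_candidates=3, equity_window=0.02))
```

With these made-up numbers the third play falls outside the window and is dropped before the deeper ply is ever computed, which is where the time savings come from.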

6) Speed. To me this parameter could have been better named. I would prefer "chainsaw size" or something equally descriptive! This setting only applies at the 3-ply level. The structure of candidate plays at 3-ply grows like a tree. For Snowie to look at all of the leaves of this tree would take a considerable amount of time. The 'speed' parameter tells Snowie where to trim the branches of this tree so as to speed up the rollout. There are five speed settings (20%, 25%, 33%, 50%, 100%) but the speed of the rollout is INVERSELY related to these numbers. That is, 20% trims the tree closer to the trunk and runs fastest. A 100% speed setting takes the most time.

7) "Checker play according to score". Although the name of this parameter implies matchplay rollouts, this CAN be checked for money games. The reason is that, when playing with a cube, the best play may be different than when cube turns aren't possible. In order for this option to be used, a live cube must be active, even if (based upon the matchscore) the cube is dead. In principle, when making its evaluation, Snowie adjusts for the different cube locations and gammon prices that the matchscore imposes. For example, the best play at double-match-point (when neither player can gain from a gammon) is sometimes different from the top choice at money play. The Snowie help file warns that selecting this parameter increases rollout time by a factor of 3! We'll see below if our experiments back up that claim.

STRATEGY OF DATA TAKING.

If our goal were to measure times for EVERY POSSIBLE combination of rollout settings, the task would be formidable indeed. For cubeless money play alone and a fixed number of trials (for example, 36, as was used in most of this study), there are 200 1-ply, 1200 2-ply, and 6000 3-ply possible setups! The data presented here in graphic form represent only 190 rollouts. Because most computers used today are fast enough to run the more reliable 2- and 3-ply rollouts, I concentrated on these levels. In addition, because the 'speed' parameter is applicable only for 3-ply, most of my tests were done at that level.

I used a Pentium II-233 exclusively for this study. Part I (last month) showed how runtimes relate to CPU speed. At the end I will give parameterized equations for various conditions (including CPU speed), but the results below should be proportional to the performance on other CPUs. In particular, if a setting change in the data for this study produces a given factor increase in time, the same factor should also occur on a different CPU. For example, for a cubeless 3-ply rollout with speed setting of 100%, going from a supertiny search space to a small search space caused the rollout time to double in this study. That same factor of two increase in time should occur for similar rollout settings on a faster computer.

In all 190 rollouts, the same position and initial diceroll were used: the standard opening game setup, a 62 opening roll, and three candidate plays (24/18, 13/11; 24/16; and 13/5). The bearoff database was activated. A random seed (chosen by Snowie) was also used for each rollout. You will notice in the graphs presented below that there is some 'jitter' in the data which is mostly a result of statistical fluctuation in runtimes due to the small sample size (36 trials for each candidate) used to keep the experiment manageable.

RESULTS OF PARAMETER VARIATION.

In all plots, the vertical axis shows the runtime (in seconds) for the rollouts presented. The horizontal axis (and multiple curves, when shown) represent the different parameter settings.

Figure 1 shows the effect of truncation depth. The data represented here were taken at the 3-ply level without cube. The search space and speed parameters were set at their minimum values (supertiny, 20%). The truncation value was varied from 5 to 15 (rolls) in increments of 1. Two more truncation settings -- 20 and UNtruncated -- were also tested. On the plot, the rightmost (high) point is the UNtruncated result. A nice linear relationship can be seen. This makes sense. If you require Snowie to toss twice as many dicerolls before truncating (evaluating), it takes twice as long. It can also be seen that the UNtruncated datapoint represents about 35 dicerolls (BEFORE reaching the bearoff database).

All of the 2-ply (truncated at 10 rolls) results of this study can be seen in figure 2. Two parameters were varied here: cube setting and search space. (For all figures from here on, I have converted search space to numeric values as follows: supertiny = 1, tiny = 2, small = 3, medium = 4, large = 5, and huge = 6.) Note that the difference in runtime between two different cube levels is independent of search space. There is overhead for adding a doubling cube to the rollout, and the amount of overhead depends upon the ply level used in determining the cube decisions, but not on the choice of search space. You will also see that there is a small overhead going from no cube to 1-ply cube, but very little difference between a 1-ply cube and a 2-ply cube. Finally there is a large increase in runtime when cube decisions are based upon 3-ply analysis.

Figure 3 shows the runtime dependence on search space and speed for 3-ply cubeless rollouts truncated at 5. This is the 'bread-and-butter' plot of this study, since 3-ply rollouts are the power of Snowie, and search space and speed settings make a significant difference in 3-ply rollout runtimes. Here it can be seen that the first three speed levels give almost the same runtime. This probably means that, at least for early game decisions, the tree pruning algorithm isn't much different for 20%, 25% and 33% speed. Therefore, one might as well run at 33%, since it doesn't result in an increase in CPU time.

Figure 4 illustrates the effect of adding the cube to 3-ply (20% speed) rollouts truncated after 10 dicerolls. As was seen with 2-ply play, the cube only adds overhead when it is used at 3-ply. Even cubeless runs no faster than a 2-ply cube when 3-ply play is implemented. As was also noted for 2-ply, the introduction of a 3-ply cube adds a constant time increase, independent of searchspace.

Although most of the emphasis of this study has been on the 3-ply setting, it is interesting to compare the slowest 2-ply play settings (with a 3-ply cube) to cubeless 3-ply play (speed setting 20%). Figure 5 shows such a comparison. Interestingly, for a supertiny setting, 3-ply cubeless is faster. For a tiny search space, the two different rollouts take about the same amount of time. But at higher search spaces, 3-ply cubeless burns up more clock. This is another illustration of the fact mentioned earlier that the cube adds a constant time overhead, independent of search space.

The longest (but likely the most reliable) settings are 3-ply play with a 3-ply cube. Figure 6 shows the effect of varying both the search space and the speed for these deepest levels. Note that once again, there is little difference between the time required for rollouts with the lowest three speed settings (20%, 25%, and 33%). After that, though, both the speed setting and the search space setting have a major impact on runtimes.

The ultimate in rollouts involves the "checker play according to score" (and cube position) setting. Figure 7 shows the runtimes when this parameter is used. By comparing figure 7 with figure 6, an increase of runtime by approximately a factor of 2 is evident. Note that this is not quite as large a runtime penalty as predicted in the Snowie help file -- a factor of 3 there. (Actually, for some positions, a factor of 3 may be the accurate number.) Still, doubling the runtime is a costly decision, and one should make sure this extra complication is warranted.

EQUATIONS FOR PREDICTING RUNTIMES

The data presented above have been combined to produce equations which express runtimes for the various parameter settings. For 2-ply lookahead, the formula is

T2 = (ca/3)*(N/36)*(233/ps)*(tr/10)*(216*ss + ct) seconds

with the meaning of the variables: 'ca' is the number of candidate plays being rolled out, 'N' is the number of trials, 'ps' is processor speed (in MHz), 'tr' is truncation depth, 'ss' is search space (1 thru 6) and 'ct' is an overhead term which includes standard overhead and cube computation time. ct takes on one of the values [246, 441, 503, 1587] for [none, 1-ply, 2-ply, 3-ply] cube respectively.
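For readers who would rather compute than multiply by hand, the 2-ply formula transcribes directly into Python. The function and dictionary names below are my own choices for the sketch; the coefficients are exactly those of the formula above.

```python
# Overhead term 'ct' (seconds) for each cube setting, from the text above.
CUBE_OVERHEAD = {"none": 246, "1-ply": 441, "2-ply": 503, "3-ply": 1587}


def t2_seconds(ca, n, ps, tr, ss, cube="none"):
    """Estimated runtime in seconds of a 2-ply rollout.

    ca: number of candidate plays    n:  number of trials
    ps: processor speed in MHz       tr: truncation depth in rolls
    ss: search space, 1 (supertiny) through 6 (huge)
    cube: one of the keys of CUBE_OVERHEAD
    """
    ct = CUBE_OVERHEAD[cube]
    return (ca / 3) * (n / 36) * (233 / ps) * (tr / 10) * (216 * ss + ct)


# Reference conditions of this study: 3 candidates, 36 trials,
# 233 MHz, truncation at 10 rolls, supertiny search space, no cube.
print(t2_seconds(3, 36, 233, 10, ss=1))  # 216*1 + 246 = 462 seconds
```

At the reference conditions every scaling factor equals 1, so the estimate reduces to 216*ss + ct. A processor twice as fast (466 MHz) halves the estimate to 231 seconds.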

For 3-ply rollouts, the expression is:

T3 = (ca/3)*(N/36)*(233/ps)*(tr/5)*sf*(78.4 + 171*ss + 13.1*s + 10.1*s*ss + 500*cf) seconds.

Here 'ca', 'N', 'ps', 'tr', and 'ss' are defined above. 'sf' is the score factor and has a value of 2 when "checker play according to score" is used, 1 otherwise. 's' is Snowie's 'speed' setting (33, 50, or 100); 'cf' is the 'cube factor' and is 1 when a 3-ply cube is in use, and 0 otherwise. Note that although Snowie's 'speed' can be set at 20% and 25%, the data shown earlier indicate little difference between 20%, 25%, and 33%, so in the parameterization use s = 33 when running at either 20% or 25%.
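The 3-ply expression can be transcribed the same way. Again this is just a sketch with names of my own choosing; the only logic beyond the formula is the substitution of 33 for the 20% and 25% speed settings, as recommended above.

```python
def t3_seconds(ca, n, ps, tr, ss, speed, cube_3ply=False, by_score=False):
    """Estimated runtime in seconds of a 3-ply rollout.

    ca, n, ps, tr, ss: as in the 2-ply formula
    speed:     Snowie 'speed' setting in percent (20, 25, 33, 50 or 100)
    cube_3ply: True when a 3-ply cube is in use (cf = 1)
    by_score:  True when "checker play according to score" is on (sf = 2)
    """
    s = max(speed, 33)  # 20% and 25% are treated as 33%
    sf = 2 if by_score else 1
    cf = 1 if cube_3ply else 0
    base = 78.4 + 171 * ss + 13.1 * s + 10.1 * s * ss + 500 * cf
    return (ca / 3) * (n / 36) * (233 / ps) * (tr / 5) * sf * base


# Cubeless, 100% speed, truncation 5, on the 233 MHz reference machine:
supertiny = t3_seconds(3, 36, 233, 5, ss=1, speed=100)
small = t3_seconds(3, 36, 233, 5, ss=3, speed=100)
print(round(supertiny), round(small), round(small / supertiny, 2))
```

The ratio printed by the last line comes out at about 1.9, in line with the doubling noted earlier for going from a supertiny to a small search space at 100% speed.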

These arithmetic expressions are meant as a guideline to help select parameters which will allow a rollout to be completed in an allotted time. Even for early game positions (for which they were derived), I wouldn't expect the predictions to agree with actual runtimes to better than 10-20%. For complicated positions, longer runtimes are probably to be expected, while for simpler positions, shorter runtimes than the predictions are possible.

PART III -- A PREVIEW AND CALL FOR ASSISTANCE.

When I proposed this group project in January, I outlined three parts. The first was a study of the effect of different CPUs, and this work, thanks to the participation of several GoL members, was quite successful. (See the February issue of GoL.) Due to the many parameters which required systematic variation, I chose to perform the rollouts for part II myself. We now know the time impact of the various Snowie parameter settings. That, alone, is insufficient for the serious Snowie user, however. We don't yet know the impact, equitywise, of the various settings. For example, running the huge search space can be seen to increase the runtimes by a large factor. Is this extra effort reflected in more reliable results? If not, then we shouldn't waste the CPU time, especially since we may not want to wait so long to get answers.

I definitely can use (or, more strongly stated, NEED) help in this last part. I only ran 36 trials for each of 3 candidate plays for each rollout. We now want to run longer rollouts, especially for higher truncation levels. We need at least 10,000 (equivalent) games per position, and that is a bare minimum. More would be better.

In addition, this 3rd part is being opened to ALL robots, not just Snowie. Although I'd like to limit the position to the one used previously (standard opening, 62 roll, three most common candidate plays), I think it will be interesting to see the results arrived at by different robots. I've already received some gnu-backgammon data from Gary Wong, but more is needed, not only for that bot but for Jellyfish, Expert Backgammon, and ANY homegrown programs, for that matter.

Initially I'm going to ask contributors to choose their own (favorite) settings from the wide variety of possibilities and send the results to me. I'd like to see as much output (in terms of rollout results) as is made available. Please e-mail all results to me at bower@astro.indiana.edu. I have already received a few sets of Snowie results, and I'll thank all contributors by name in the final writeup, but I need more, More, MORE! Please contribute. Hopefully all of us can reap the collective rewards.
