Testing the civ3 random number generator

vulture · August 29, 2002, 11:11

Since Hurricane mentioned in another thread what the basic pseudo-random number generator used by civ3 was, I decided to test it for 'streakiness'.

First off, the exact algorithm wasn't revealed, but linear congruential generators all work in the same way AFAIK. My implementation was as follows (in C):

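unsigned int seed; /* global variable: current generator state */

unsigned int prng(void)
{
    unsigned long long dummy;

    /* 64-bit multiply so the product cannot overflow before masking */
    dummy = ((unsigned long long) seed * 1103515245) + 12345;
    seed = dummy & 0xffffffff; /* keep the bottom 32 bits as the next seed */
    return seed >> 22;         /* top 10 bits: a value in 0..1023 */
}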

(on my machine, int is 32 bits, long long is 64 bits).

The bottom 32 bits of 'dummy' are kept as the next seed, and the highest 10 bits of seed are returned to generate a random number between 0 and 1023 inclusive. As I understand it, this is how the civ3 PRNG works. If any of the programmers wish to correct me...

This PRNG generates a nice flat distribution, with all values being equally likely. The question for combat in civ3 is whether the generator is streaky - whether it has a tendency to produce say 3 or 4 low numbers in a row. Of course there are various tests that can be done on this, but I set it up to simulate 100,000,000 combats in civ3 between a 4 hp, 6.0 attack unit attacking a 4 hp 2.75 defense unit (i.e. an infantry attacking a fortified spearman across a river).
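As a rough illustration of the flatness check, the sketch below re-implements the generator with fixed-width types and counts how often each value 0-1023 turns up. The names lcg_state, lcg_next10 and lcg_histogram are made up for this example, and the constants are simply the ones from the post above, which may or may not match what civ3 really uses.

```c
#include <stdint.h>
#include <stddef.h>

/* Generator state; plays the role of the global 'seed' in the post. */
static uint32_t lcg_state = 1;

/* One LCG step; returns the top 10 bits of the new state (0..1023). */
static unsigned lcg_next10(void)
{
    /* 64-bit product, then truncate to 32 bits (i.e. reduce mod 2^32) */
    lcg_state = (uint32_t)((uint64_t)lcg_state * 1103515245u + 12345u);
    return (unsigned)(lcg_state >> 22);
}

/* Fill counts[0..1023] with how often each value appears in n draws,
   starting from the given seed. */
static void lcg_histogram(uint32_t seed, unsigned long n,
                          unsigned long counts[1024])
{
    for (size_t i = 0; i < 1024; i++)
        counts[i] = 0;
    lcg_state = seed;
    for (unsigned long k = 0; k < n; k++)
        counts[lcg_next10()]++;
}
```

With an odd increment and a multiplier of the form 4k+1, the 32-bit state should visit every value once per cycle, so over a full 2^32 draws each bucket would be hit exactly 2^22 times; shorter runs just show ordinary sampling scatter around n/1024.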

The test is to compare the final hp's left for the winner vs. the distribution expected if the numbers were perfectly random. If there is any streakiness in the random numbers, then we expect to see an excess of results where one of the units is left on 4 hp, and a deficit of results where the surviving unit is left on 1 hp (a fact which I confirmed by running 10,000,000 combats with a PRNG with streakiness deliberately inserted).
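A minimal sketch of that kind of simulation is given here, not the code actually used for the numbers below. It assumes each round draws one number in 0-1023 and that the defender wins the round when the draw falls below an integer threshold (321 for this matchup, following the rounding described later in the thread); the function names and the tally layout are invented for the example.

```c
#include <stdint.h>

static uint32_t sim_state = 1;

/* One LCG step, returning the top 10 bits (0..1023). */
static unsigned sim_next10(void)
{
    sim_state = (uint32_t)((uint64_t)sim_state * 1103515245u + 12345u);
    return (unsigned)(sim_state >> 22);
}

/* Fight one combat between two 4 hp units.
   ASSUMPTION: the defender wins a round when the draw is < def_threshold.
   Returns +k if the attacker survives on k hp, -k if the defender does. */
static int fight(unsigned def_threshold)
{
    int att_hp = 4, def_hp = 4;
    while (att_hp > 0 && def_hp > 0) {
        if (sim_next10() < def_threshold)
            att_hp--;   /* defender won the round */
        else
            def_hp--;   /* attacker won the round */
    }
    return att_hp > 0 ? att_hp : -def_hp;
}

/* Run n combats from the given seed.
   tally[0..3]: defender left on 1..4 hp; tally[4..7]: attacker on 1..4 hp. */
static void run_combats(uint32_t seed, unsigned long n,
                        unsigned def_threshold, unsigned long tally[8])
{
    for (int i = 0; i < 8; i++)
        tally[i] = 0;
    sim_state = seed;
    for (unsigned long k = 0; k < n; k++) {
        int r = fight(def_threshold);
        if (r > 0)
            tally[4 + r - 1]++;
        else
            tally[-r - 1]++;
    }
}
```

Dividing each tally entry by n gives the observed probabilities to set against the expected ones; a streaky generator would push weight towards the "left on 4 hp" entries.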

The results for 100,000,000 battles fought (numbers given are the percentage chance of a given outcome).

                    expected prob    observed prob
spearman hp left
1                   6.249            6.249 +/- 0.003
2                   4.551            4.550 +/- 0.002
3                   2.652            2.650 +/- 0.002
4                   0.966            0.967 +/- 0.001

infantry hp left
1                   13.686           13.691 +/- 0.004
2                   21.829           21.824 +/- 0.005
3                   27.854           27.855 +/- 0.005
4                   22.214           22.215 +/- 0.005

The uncertainties quoted in the observed probabilities are 1 sigma errors (meaning that you'd expect the 'expected' and 'observed' values to match within this error in roughly 2/3 of the cases).

Pretty obviously, there is no significant deviation away from the prediction of purely random numbers (i.e. no streakiness). I also did a simulation with the spearman fighting 2 attackers in a row, which also showed no significant deviation from the expected distribution.

So, assuming my version of the civ3 PRNG is accurate, it is pretty safe to say that there is no undue streakiness causing strange combat results in civ3; whatever streakiness there is in the generator doesn't show up in 100,000,000 combats (probably 20,000-200,000 games worth, depending on how often you go to war).

So, if you are convinced that you are getting unusual results too often, then you'll have to chalk it up to either a bug or a deliberate cheat on the part of the programmers (or more plausibly, the fact that our 'natural' understanding of randomness *very* badly underestimates the number of long streaks of odd results you can get).

SpencerH · August 29, 2002, 12:11

Very interesting. As I expected, the random number generator provides random numbers over a huge number of combats. The question is whether the generator is "streaky" over a much smaller number. Is there a tendency in a 50-50 proposition for the generator to assign clusters of hits to one side or another? With a very large n the numbers may appear random, but that doesn't mean there isn't clustering of the numbers.

Gen.Dragolen · August 29, 2002, 12:22

Vulture,

In the example you used, the results look correct.

Have you tried with more equally matched attack and defensive strengths? And I see that you did not include the 10% terrain bonus the defender gets for just standing there.

In the 8 months or so of game play, I have noticed there is a definite pattern to the combat results. There will be strings where the defending unit is down to its last hp, while the attacking unit has 3 or 4 hp, and the attacking unit will die, so your next attacking unit will certainly kill the defender.

In naval combat the results are totally predictable: if you attack a unit of equal defensive strength, 7 out of 8 times your unit dies, often without doing even 1 hp of damage. The defender always has that 10% defensive bonus, which appears to make all the difference in the world.

I am going to start recording the combat results as evidence of this, since it occurs at all difficulty levels. I think 10 games played to conclusion should generate about 200+ combat results per game and that should make for a significant sample.

And you could very well be right that there is no bias in the combat when all factors are taken into account, but only when you take into account all of the combat between AI Civ's as well. Only seeing your own results may be the source of our bias on the randomness of the results.

D.

Solver · August 29, 2002, 12:28

Recording combat results seems a good thing. I'm sometimes awed by the results... how could my Longbowman have survived an attack by a Swordsman, for instance?

vulture · August 29, 2002, 12:50

Quote:

Originally posted by SpencerH
Very interesting. As I expected, the random number generator provides random numbers over a huge number of combats. The question is whether the generator is "streaky" over a much smaller number. Is there a tendency in a 50-50 proposition for the generator to assign clusters of hits to one side or another? With a very large n the numbers may appear random, but that doesn't mean there isn't clustering of the numbers.

Au contraire, the point of this test was exactly to look at clustering on small scales. The fact that I used a lot of results may obscure this, but it's still true (unless I'm hopelessly wrong). The point is that I am simulating combat between 4 hp units. Streakiness in the range of 2-4 numbers will show up most strongly in this region.

The first test I did, checking that each of the integers 0-1023 was generated with equal probability, is the test that your complaint would be valid for. More subtle defects in PRNGs produce e.g. a tendency to produce pairs of similar numbers, or a set of 3 numbers might be preferentially followed by a particular set of another 3 numbers (the sequence 576 431 972 might be followed by 543 1010 966 more often than expected by chance - and actually with a linear congruential generator this is likely to be the case (although obviously not necessarily with the numbers I picked here)). The combat test here is sensitive to those kinds of defects to some degree, at least to the kind where one unit will lose several hp in a row more often than you would expect.
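One simple way to look for a "pairs of similar numbers" defect is a lag-1 serial correlation of the outputs. The sketch below is only an illustration of that idea and was not one of the tests run for this thread; lag1_correlation is a made-up name, and the estimator uses the overall mean for both members of each pair, which is a small approximation for large n.

```c
#include <stdint.h>

/* Approximate lag-1 serial correlation of n successive 10-bit outputs
   (n must be at least 2). Values near 0 mean neighbouring draws are not
   linearly related; a clearly positive value would indicate runs of
   similar numbers. */
static double lag1_correlation(uint32_t seed, unsigned long n)
{
    uint32_t state = seed;
    double sum = 0.0, sum_sq = 0.0, sum_prod = 0.0;
    double prev = 0.0;

    for (unsigned long i = 0; i < n; i++) {
        state = (uint32_t)((uint64_t)state * 1103515245u + 12345u);
        double x = (double)(state >> 22);
        sum += x;
        sum_sq += x * x;
        if (i > 0)
            sum_prod += prev * x;   /* product of neighbouring draws */
        prev = x;
    }

    double mean = sum / (double)n;
    double var = sum_sq / (double)n - mean * mean;
    double cov = sum_prod / (double)(n - 1) - mean * mean;
    return cov / var;
}
```

For truly independent draws the result scatters around zero with a spread of roughly 1/sqrt(n), so a value many times larger than that would be worth a closer look. It only detects linear dependence between neighbours, which is why the combat-based test is a useful complement.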

As I mentioned, I did a second test where the spearman fought infantry in two consecutive battles (to test the proposition that a spearman that wins one implausible battle (taking minimal damage) is quite likely to win the next one as well - something which in my games certainly feels as though it is true sometimes). The test showed that a spearman won against two consecutive attacks no more often than would be expected, and that the hp-distribution of the spearman in the battles it survived matched the predictions. If there was any streakiness that showed up at the level of a spearman winning two fights in a row, this would have been noticeable (it would increase the number of times the spearman survived with 3 or 4 hp f'rinstance).

The point is that any kind of clustering (except in high dimensions, where linear congruential generators are known to perform very badly) would show up in these tests. The higher dimensions don't show up exactly because I am testing clustering on shorter scales, by essentially studying the properties of small groups of random numbers (i.e. what I'm testing is what you suggest I should be testing).

vulture · August 29, 2002, 13:21

Quote:

Originally posted by Gen.Dragolen
Vulture,

In the example you used, the results look correct.

Have you tried with more equally matched attack and defensive strengths? And I see that you did not include the 10% terrain bonus the defender gets for just standing there.

I used unbalanced stats because I wanted to test the proposition that streaks of unlikely results are over-represented; it seemed easiest to do that by having a scenario where one of the combat results was obviously more unlikely. And yes, I forgot one of the defensive bonuses, and gave 10% for terrain and 25% for fortifying, and that was it. Mea culpa.

Quote:

In the 8 months or so of game play, I have noticed there is a definite pattern to the combat results. There will be strings where the defending unit is down to its last hp, while the attacking unit has 3 or 4 hp, and the attacking unit will die, so your next attacking unit will certainly kill the defender.

There are bound to be some kinds of oddities in the combat system. The seed is 32 bit. This means that there are only 2^32 possible sequences of numbers (or fewer if the PRNG is not well chosen). I am told that the civ3 combat system generates numbers in the 0-1023 range. Now consider a sequence of 4 numbers in the range 0-1023 - there should be 2^40 possibilities. But our PRNG only has 2^32 possible states. What does this mean in practice? It means that if you look at the first three numbers in the sequence, there are on average only 4 possible values for the last number rather than 1024. For a typical fight, say involving 6 rounds of combat, there are only 2^32 different outcomes from the PRNG rather than the 2^60 that there ought to be if it was truly random. In practice for civ3, this doesn't make a huge difference it seems, since there are only 8 observably different outcomes to the infantry-spearman fight, and only about 2^6 sequences of hp loss. So as long as the 2^32 PRNG states map more or less equally between these observable states, there is not a problem. Problems can arise when the limited number of strings that can actually be generated are not 'evenly' distributed amongst the theoretically possible strings (in the example above, with only 1/256 of the possible 4 number sequences actually being produced by the PRNG, it is possible that the strings generated are clustered in certain parts of the phase space; hopefully the constants used in the PRNG are chosen to avoid this kind of clustering as much as possible).
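The "only about 4 possibilities for the fourth number" point can be checked by brute force. The sketch below assumes the generator from the first post; step and count_consistent are names invented here. The first observed output fixes the top 10 bits of the state, so only the 2^22 possible low-bit patterns need to be tried.

```c
#include <stdint.h>

/* One LCG step on a 32-bit state. */
static uint32_t step(uint32_t s)
{
    return (uint32_t)((uint64_t)s * 1103515245u + 12345u);
}

/* Count the internal states consistent with three successive observed
   outputs o1, o2, o3 (each 0..1023). If fourth_seen is non-NULL, the
   caller must zero it beforehand; fourth_seen[v] is set to 1 for every
   value v that could follow as the fourth output. */
static unsigned long count_consistent(unsigned o1, unsigned o2, unsigned o3,
                                      unsigned char fourth_seen[1024])
{
    unsigned long hits = 0;
    for (uint32_t low = 0; low < (UINT32_C(1) << 22); low++) {
        uint32_t s1 = ((uint32_t)o1 << 22) | low;  /* candidate state */
        uint32_t s2 = step(s1);
        if ((s2 >> 22) != o2)
            continue;
        uint32_t s3 = step(s2);
        if ((s3 >> 22) != o3)
            continue;
        hits++;
        if (fourth_seen)
            fourth_seen[step(s3) >> 22] = 1;
    }
    return hits;
}
```

Feeding it three real outputs from the generator should return a small count (the argument above suggests about 4 on average), and the entries set in fourth_seen are the only values the next draw can take.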

Determining just where the limitations of the PRNG start to show up isn't easy, which is why I did the simulation. I expect that if you look at the distribution of results of 4 or 5 successive combats you will start to see breakdowns occur, with some sequences of results being impossible. But I doubt that this is going to be anywhere near the kind of effect to produce the subjective impression of streakiness that many people seem to have (and as I mentioned elsewhere, I get it too, but I'm more inclined to consider it more a problem of perception than of the PRNG).

Keeping notes on the combat results in games would be a good test, as long as you are very careful to decide beforehand what criteria you are going to use for what to record. When I tried it I ended up only recording my attacks on the AI, 'cos I knew I could take the time to record what happened, while when you are the defender you don't always have time, and this probably introduces a bias towards certain types of result.

BTW only 200 combats in a game? Are you some kind of raving pacifist or what!?

SpencerH · August 29, 2002, 14:09

Quote:

Originally posted by vulture

by essentially studying the properties of small groups of random numbers (i.e. what I'm testing is what you suggest I should be testing).

100,000,000 combats is a small number? Since my small experience with stats is in small numbers (say 5-10 subjects/group with 3-4 groups), maybe I have an advantage in looking at anomalies.

I'm thinking of clustering in this way: we're familiar with the concept that you can compare two groups and come up with a probability of whether or not they are the same. In some cases they can appear different but that difference may disappear as n becomes larger (and obviously the reverse occurs). So what I'm suggesting (and I may be out to lunch here) is that you run this test with 1000, 100K, 10M, 100M combats and look at the results in this way.

Gen.Dragolen · August 29, 2002, 15:26

Vulture,

I should have defined my terms better: to me "one combat" is one battle involving a lot of units where I am on the offensive. And usually they involve sacking an AI Civ's city...

You described exactly what I suspect happens with the random number generation function:

"Problems can arise when the limited number of strings that can actually be generated are not 'evenly' distributed amongst the theoretically possible strings (in the example above, with only 1/256 of the possible 4 number sequences actually being produced by the PRNG, it is possible that the strings generated are clustered in certain parts of the phase space"

What you have described corresponds exactly to the sequences I see repeated throughout the game. Not being a programmer by trade or being able to remember much from the statistical analysis course at university, I don't know much about how they coded this RNGFn but there are two ways that I know of: first, start with the same seed each time, or second, use the previous outcome as the seed for the next.

Your code above uses the second method and since there is some severe truncation of the outcome, and the results match what we have experienced, what can we do to fix it?

In game play, I tend to use obsolete units to soak up the first bad result or two and then commit the regular troops to the assault. Saves me the frustration of losing a cavalry regiment attacking a division of longbowmen in open ground...

Do you have any suggestions about which test results would be useful so you could compare them to yours? I was looking at recording the terrain, the units attack vs defense, number of hp of each unit and the number of turns the combat takes to end. From that I should be able to calculate the percentage chance for the observed outcome using a combat results calculator I have from another thread, and should be able to write a small batch file to do the work for me.

BTW, Vulture, what do you do for a living? You write like a career engineer or mathematician.

D.

Gen.Dragolen · August 29, 2002, 15:43

Vulture,

To explain the whining over this thing: the perception is driven by the timing of the obscene results.

These results usually happen at the very start of a game, where despite a promising starting location, you lose all of your archers attacking warriors in open ground, and your enemy's warrior takes your capital despite it being defended by a spearman on the walls and behind a river, or when the AI Civ benefits when you are trying to take out that one last infantryman in a target city, and you are down to your last unit while the rest of your forces are destroyed or have minimal hit points.

Too many times you see the defender down to his last hit point and your unit still has full hit points, only to have your unit destroyed before combat ends, and the next unit to attack will destroy the defender, even if the defender has healed to full strength again. Hence my impression of the streak.

Our passions and comprehension of reality are too easily offended when the improbable occurs in a game like this one. I just have to keep reminding myself all the time that it's just a game.

D.

vulture · August 29, 2002, 16:48

Quote:

Originally posted by Gen.Dragolen
Vulture,

I should have defined my terms better: to me "one combat" is one battle involving a lot of units where I am on the offensive. And usually they involve sacking an AI Civ's city...

Ah, the best kind of combat

Quote:

Do you have any suggestions about which test results would be useful so you could compare them to yours? I was looking at recording the terrain, the units attack vs defense, number of hp of each unit and the number of turns the combat takes to end. From that I should be able to calculate the percentage chance for the observed outcome using a combat results calculator I have from another thread, and should be able to write a small batch file to do the work for me.

BTW, Vulture, what do you do for a living? You write like a career engineer or mathematician.

Actually I'm an astrophysicist... (but it's not as bad as it sounds).

After pondering this on the way back from the restaurant on the bus, a possibility has occurred to me. As I said, I do see what seem to be certain patterns in the combat results. One pattern seems to be that if a defender wins an unlikely combat taking minimal damage, it does seem very likely to win the next one as well. Also, sequences of units losing hps alternately seem to happen a lot (although you'd expect them to under a lot of circumstances).

So one interesting test would be to look at sequences of the order in which hps are lost (e.g. veteran tank attacks veteran rifleman, results are ADDAAA - A means the attacker won a round, D means the defender, and at the end of the combat the tank is alive and on 2 hp). If there is anything screwy going on, some of these strings (either for one combat or N consecutive combats) should be over-represented compared to the random probabilities. (Aside: SpencerH's point seems to be that some of the strings can be over-represented but that the overall distribution of strings is such that the probability distribution I presented earlier is unaffected - I'm not convinced that this can be the case, but since it's not obviously false we can't dismiss the possibility yet; I was working earlier in the belief that any bias in the string distribution would show up in the prob. tables, 'cos that seems intuitive to me.)
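Recording the round-by-round string is a small change to a combat loop. The sketch below is a guess at how such a recorder might look, not the test that was actually run: it assumes two 4 hp units, the generator from the first post, and that the defender wins a round when the draw is below an integer threshold. The names g_state, next10 and combat_string are invented for the example.

```c
#include <stdint.h>
#include <stddef.h>

static uint32_t g_state = 1;

/* One LCG step, returning the top 10 bits (0..1023). */
static unsigned next10(void)
{
    g_state = (uint32_t)((uint64_t)g_state * 1103515245u + 12345u);
    return (unsigned)(g_state >> 22);
}

/* Fight one 4 hp vs 4 hp combat and write the round record into buf:
   'A' for a round won by the attacker, 'D' for one won by the defender.
   A fight lasts at most 7 rounds, so buf needs 8 bytes including the
   terminating NUL. Returns the number of rounds fought.
   ASSUMPTION: the defender wins a round when the draw is < def_threshold. */
static size_t combat_string(unsigned def_threshold, char buf[8])
{
    int att_hp = 4, def_hp = 4;
    size_t n = 0;
    while (att_hp > 0 && def_hp > 0) {
        if (next10() < def_threshold) {
            att_hp--;
            buf[n++] = 'D';
        } else {
            def_hp--;
            buf[n++] = 'A';
        }
    }
    buf[n] = '\0';
    return n;
}
```

Counting how often each distinct string occurs over many combats, and comparing with the product of the per-round probabilities, would show whether particular orderings are over-represented.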

I'll go away and implement this test with the PRNG that I gave earlier, and see what turns up.

Myrddin · August 29, 2002, 17:12

One pattern that I've convinced myself that I see (whether or not it exists) is that the first defender in a city has some extra bonus, i.e. the first defender seems more likely to cause damage to attackers than subsequent defenders, even if the defending units appear identical - has anyone else seen this?

WarpStorm · August 29, 2002, 20:52

deleted

Zachriel · August 29, 2002, 20:59

Quote:

Originally posted by vulture
One pattern seems to be that if a defender wins an unlikely combat taking minimal damage, it does seem very likely to win the next one as well.

That would require the PRNG to "know" that the call was for such a specific game event. The PRNG "knows" nothing about the purpose to which the result will be used. Only if the programmer made the result of the first round affect succeeding rounds would that be relevant. In other words, even if the PRNG were streaky, the streaks would be randomly applied to each round. So sometimes the streak would start at the beginning of the unit's combat, other times not.

People see patterns where none exist. They see faces in the clouds and animal shapes in the stars. That's just the way the mind works.

Zachriel · August 29, 2002, 21:04

Quote:

Originally posted by vulture
So one interesting test would be to look at sequences of the order in which hps are lost (e.g. veteran tank attacks veteran rifleman, results are ADDAAA - A means the attacker won a round, D means the defender, and at the end of the combat the tank is alive and on 2 hp). If there is anything screwy going on, some of these strings (either for one combat or N consecutive combats) should be over-represented compared to the random probabilities.

There is a small rounding error in the attack modifiers, factored in after the PRNG. This could slightly change the ratio of A/D. However, it would not increase streakiness, just the ratio.

(By the way, good job on the analysis.)

Switch · August 29, 2002, 21:43

Quote:

Originally posted by vulture
Actually I'm an astrophysicist... (but it's not as bad as it sounds).

What do you mean? Astrophysics sounds bad?
Are you a practical or theoretical physicist? I just love physics, especially astro-.

BTW, nice work with the RNG. It's nice to know that it's just the limits of programming and not a bug or AI cheat causing these streaks.

Hurricane · August 30, 2002, 01:37

Quote:

Originally posted by Myrddin
One pattern that I've convinced myself that I see (whether or not it exists) is that the first defender in a city has some extra bonus, i.e. the first defender seems more likely to cause damage to attackers than subsequent defenders, even if the defending units appear identical - has anyone else seen this?

Hey, this is a good thread where someone for a change tries to bring some facts into this never-ending debate. I would suggest we keep the superstitious posts out of here.

Blake · August 30, 2002, 03:32

Quote:

Originally posted by SwitchMoO
BTW, nice work with the RNG. It's nice to know that it's just the limits of programming and not a bug or AI cheat causing these streaks.

"It's nice to know it's just sheer programmer incompetence in choosing a RNG and not a bug or AI cheat causing these streaks"

Okay okay... that's not exactly a constructive post, btw where is the original thread mentioned in the first post?

vulture · August 30, 2002, 03:47

Quote:

Originally posted by Zachriel

That would require the PRNG to "know" that the call was for such a specific game event. The PRNG "knows" nothing about the purpose to which the result will be used. Only if the programmer made the result of the first round affect succeeding rounds would that be relevant. In other words, even if the PRNG were streaky, the streaks would be randomly applied to each round. So sometimes the streak would start at the beginning of the unit's combat, other times not.

People see patterns where none exist. They see faces in the clouds and animal shapes in the stars. That's just the way the mind works.

I know. Even though I don't think that there is anything obviously wrong with the PRNG, I just mentioned that to point out that it's still very easy to think you see a pattern (and maybe that pattern is real, but just due to chance rather than systematic effects in the PRNG). Even if I reverse engineered the whole civ3 code, tested it to exhaustion and convinced myself that the numbers were entirely indistinguishable from random to any test, I'm sure I'd still be finding patterns in the combat results (and I say this as someone who plays a lot of backgammon with some very high quality PRNGs).

Quote:

There is a small rounding error in the attack modifiers, factored in after the PRNG. This could slightly change the ratio of A/D. However, it would not increase streakiness, just the ratio.

I used that info in my test. Anyone paying very close attention might have noticed that the results I quote are consistent with an attack of 6.0 vs. a defence of 2.739687... rather than 2.75. The probability of the defender winning a round (def_str / (def_str + att_str)) was multiplied by 1024 and converted to an integer, creating this small rounding error. The numbers for a strength of exactly 2.75 would be noticeably different at the precision I quoted.
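The arithmetic behind that remark can be sketched as follows. The function names are made up for this example, and truncation (rather than rounding to nearest) is assumed for the integer conversion, since that is what reproduces the 2.739687 figure. The last function gives the chance that the winner of a 4 hp vs 4 hp fight finishes on a given number of hit points, which is one way to arrive at an "expected prob" column.

```c
#include <stdint.h>

/* Defender's per-round win threshold out of 1024.
   ASSUMPTION: the conversion to integer truncates. */
static unsigned def_threshold(double att, double def)
{
    return (unsigned)(1024.0 * def / (def + att));
}

/* Defence strength that would give exactly threshold/1024 per round
   if no rounding took place. */
static double effective_defence(double att, unsigned threshold)
{
    return att * (double)threshold / (1024.0 - (double)threshold);
}

/* Probability that the winner of a 4 hp vs 4 hp fight is left on
   hp_left (1..4) hit points, where p_win is the winner's chance of
   taking any single round. The winner takes 4 rounds and loses
   k = 4 - hp_left of them, with the final round always a win, giving
   C(3 + k, k) orderings. */
static double prob_hp_left(double p_win, int hp_left)
{
    static const double ways[4] = { 1.0, 4.0, 10.0, 20.0 };
    int k = 4 - hp_left;
    double p = ways[k];
    for (int i = 0; i < 4; i++)
        p *= p_win;
    for (int i = 0; i < k; i++)
        p *= 1.0 - p_win;
    return p;
}
```

For attack 6.0 against defence 2.75 the threshold comes out as 321, so the defender takes a round with probability 321/1024 and the infantry finishes on 4 hp about 22.21% of the time. With an unrounded 2.75 the same outcome would come out near 22.11%, a gap much larger than the quoted uncertainties.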

vulture · August 30, 2002, 04:08

Quote:

Originally posted by Blake

"It's nice to know it's just sheer programmer incompetence in choosing a RNG and not a bug or AI cheat causing these streaks"

I read a good story about an on-line poker site that used a 32 bit PRNG (which is bad enough) and then did something silly with seeding it with the time of day in milliseconds in some manner. So they had 86,400,000 possible seeds (and therefore that many possible permutations of the deck of cards - that's about 2^26 rather than the 2^32 the PRNG is capable of, and noticeably less than the 52! (about 2^226) possible states of a deck of cards). Since they published the PRNG code to show everyone how good it was, it meant that once you'd seen three cards, you could search the possible permutations in real time and know exactly what cards were where within a few seconds. Now *that* is a serious ****-up.

Quote:

Okay okay... that's not exactly a constructive post, btw where is the original thread mentioned in the first post?

The post mentioning the PRNG code is somewhere near post 40-50 of the "The dangerous sea..." thread, which references a thread on the other civ site which can be found here, somewhere near the bottom of the page. The info comes from somebody called hwinkels (is that Zachriel's name on the other site by any chance?)

Kampus majore · August 30, 2002, 06:48

Quote:

Originally posted by Gen.Dragolen
These results usually happen at the very start of a game, where despite a promising starting location, you lose all of your archers attacking warriors in open ground, and your enemy's warrior takes your capital despite it being defended by a spearman on the walls and behind a river, or when the AI Civ benefits when you are trying to take out that one last infantryman in a target city, and you are down to your last unit while the rest of your forces are destroyed or have minimal hit points.

Too many times you see the defender down to his last hit point and your unit still has full hit points, only to have your unit destroyed before combat ends, and the next unit to attack will destroy the defender, even if the defender has healed to full strength again. Hence my impression of the streak.

Our passions and comprehension of reality are too easily offended when the improbable occurs in a game like this one. I just have to keep reminding myself all the time that it's just a game.

D.

I fully agree with this statement. It is just frustrating to see your units be defeated, but when the same funny effect happens to your opponents, you're happy with it!

vulture · August 30, 2002, 07:08

Quote:

Originally posted by SwitchMoO

What do you mean? Astrophysics sounds bad?
Are you a practical or theoretical physicist? I just love physics, especially astro-.

Mostly theoretical high energy astrophysics (active galaxies), but also some experimental stuff - I'm involved in gamma-ray astronomy, and also some optical astronomy. Endless minutes of fun.

WarpStorm · August 30, 2002, 07:27

From experience at work on PRNGs, we decided that the Mersenne Twister algorithm was the best for our purposes (Monte Carlo simulations involving millions of events, a minor flaw in the stability of a PRNG shows up in that many samples). It is faster than the standard C rand() function and has a period of 2**19937-1. I'd recommend it for just about any PRNG use (except cryptography).

Mike B, y'all ought to consider it, that would put an end to most any complaints if you did.

Zachriel · August 30, 2002, 07:36

Quote:

Originally posted by vulture
I know.

I know you know.

Anyway, as one who sees faces in the clouds and animal shapes in the stars, I try to keep an open mind. However, the scientific method has brought us very far since Galileo asserted that the Earth does indeed move.

Thanks for the data.

vulture · August 30, 2002, 07:43

Quote:

Originally posted by SpencerH

100,000,000 combats is a small number? Since my small experience with stats is in small numbers (say 5-10 subjects/group with 3-4 groups), maybe I have an advantage in looking at anomalies.

Shoot me for a cynic, but I'd have thought your chances of finding anything statistically significant in groups that small have to be pretty slim (says the guy who just gave a talk on the significance of the detection of 4 photons...). 100M combats isn't a small number, but any real effect that would show up in 1000 combats would show up much more strongly in 100M combats; all you get at 1000 combats is more statistical noise, and any anomalies you see are very probably random (especially the ones that disappear as you do more trials).

Some kinds of bias would show up by doing many groups of 1000 combats that wouldn't show up in the kind of analysis I did; namely the possibility that e.g. the 2^1000 possible sequences of battle results may not be randomly distributed (the probability of the defender winning 530 combats (rather than the expected 390 (or whatever)) might be too high, but this has to be balanced quite precisely against other biases to keep the overall probability of winning (as tested by the first set of tests I did) the same). I hope I've got my brackets sorted out there.

To test for that kind of bias I've run a few tests, looking at much shorter sequences than 1000 fights. The first test was to look at the 70 possible sequences that make up one of the combats in the original post (e.g. there is only one sequence of hits that leaves the defender dead and the attacker on 4 hp - WWWW (W is a win for the attacker in that round, L is a win for the defender), but there are 4 that leave the attacker on 3 hp - WWWLW, WWLWW, WLWWW, LWWWW). As it happens, the distribution of such sequences for 1, 2 or 3 successive combats is indistinguishable from what would be expected.

To test longer sequences of combats, it is no longer possible to track individual blows in the combat, so I just look at the combat results themselves, and count the sequences of attacker wins fight (W) vs. defender wins fight (L). The only anomaly that has shown up was the chance for the attacking side to win 4 battles in a row (in 4 separate battles of infantry attacking fortified spearman on plains). I think that this was enough to rate as statistically significant, but no-one is ever going to notice it IMHO. If you fought 10,000 sets of those 4 attacks, you would expect the attackers to win all 4 in 5,366 of them, and instead they would only manage it 5,365 times out of 10,000. Given random fluctuations (and the fact that combats in civ3 cover a wide range of attack/defense strength combinations) there is no way anyone is ever going to notice that just by playing the game. I had to fight 1 billion battles just to get it to show up with marginal significance in a test that was looking for that kind of effect.

So, the most significant failure of the PRNG (assuming it wasn't a random oddity, which is still possible) is something that would make absolutely no discernible difference to your games if you played civ3 from now until the end of your life without pause. I'm pretty confident that whatever is causing dodgy results (or the perception of dodgy results IMHO) is not caused by any problems with the PRNG. It is not producing any noticeably streaky results.

To quote HHGTTG "We have achieved normality. Anything you can still not cope with is therefore your own problem" (paraphrase, from memory).

Zachriel · August 30, 2002, 07:43

Quote:

Originally posted by WarpStorm
From experience at work on PRNGs, we decided that the Mersenne Twister algorithm . . .

Mike B, y'all ought to consider it, that would put an end to most any complaints if you did.

I am certain that you are incorrect. People will still see patterns. Why? Because that is the way people are designed (God bless them!). The "patterns" people are detecting in the current PRNG are not real patterns caused by some sort of deficiency in the algorithm.

Use a quantum-RNG and people will still say that the fix is in.

vulture · August 30, 2002, 07:50

Quote:

Originally posted by WarpStorm
From experience at work on PRNGs, we decided that the Mersenne Twister algorithm was the best for our purposes (Monte Carlo simulations involving millions of events, a minor flaw in the stability of a PRNG shows up in that many samples). It is faster than the standard C rand() function and has a period of 2**19937-1. I'd recommend it for just about any PRNG use (except cryptography).

That's the one I use as well for my Monte Carlo runs, mostly 'cos its period is nice and long (and it passes just about every test of randomness AFAIK). In one of my PRNG tests I cycled around the period of the generator 10 times, which doubtless has screwed up the results somewhat (since that was the test that showed up the discrepancy I mentioned above), although it isn't as bad as it could have been since each individual result was a sequence of 22 or so PRNG calls, so the chances of all the sets of 22 starting at the same point in the PRNG cycle are pretty small. But still, it will have introduced some effect.

SpencerH · August 30, 2002, 09:06

Quote:

Originally posted by vulture

Shoot me for a cynic, but I'd have thought your chances of finding anything statistically significant in groups that small have to be pretty slim.

What I'm describing is the kind of groups you can see published in virtually any biological journal (and they'll often use a t-test to calculate p values). Of course better labs will repeat the experiment, but not always. HIV experiments have been published in Nature with results from 2 chimps.

Two thoughts:

Could the random seed number for that turn (that's used to generate the random numbers for combat) cause clustering? Could human starting-stopping-starting of games alter the "randomness" of the random seed and therefore cause clustering?

vulture · August 30, 2002, 09:26

Quote:

Originally posted by SpencerH

What I'm describing is the kind of groups you can see published in virtually any biological journal (and they'll often use a t-test to calculate p values). Of course better labs will repeat the experiment, but not always. HIV experiments have been published in Nature with results from 2 chimps.

I generally remain rather cynical of some of the studies I've seen in biology and sociology/psychology reports. I guess we usually have the luxury of being able to gather better stats under better conditions on the whole in physics.

Quote:

Two thoughts:

Could the random seed number for that turn (that's used to generate the random numbers for combat) cause clustering? Could human starting-stopping-starting of games alter the "randomness" of the random seed and therefore cause clustering?

Good point, which hadn't occurred to me. For those of us who use the 'save random seed' option, the seed is saved in the save file, and the pauses in the program are entirely transparent to the PRNG, so it makes no difference (proviso - there may be issues coming from the choice of seed at the beginning of the game; I would hope that the PRNG state at the end of one game is saved and stored for the start of the next game).

BUT

For those who don't save the random seed, an awful lot depends on how the PRNG is re-seeded. One way that ought to work is to have a second PRNG that generates seeds for the first one, with the second one saving its state between sessions. Other (hopefully) reasonable ways would be to use a selection of bits of system information to generate the seed in some way (one thing I like about Linux is that there is always a source of 'genuine' randomness available which is produced by a complex procedure involving user activities etc.). A very bad method would be to use something like the system time, which, depending on the implementation, could limit the seed to a very small fraction of the possible values, and possibly lead to some strings of results being over-represented at a noticeable level (I have no idea how likely that is).
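
The options above could be sketched roughly as follows in C. This is only an illustration: how civ3 actually seeds its generator is not known here, the function names are made up, and /dev/urandom assumes a Linux or other Unix-like system. The seeder uses the Numerical Recipes LCG constants, picked only so that it differs from the combat generator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Option 1: a second generator whose only job is to hand out seeds.
 * Its state would be saved between sessions (the saving is not shown). */
uint32_t seeder_state = 1;

uint32_t next_seed(void)
{
    seeder_state = seeder_state * 1664525u + 1013904223u;
    return seeder_state;
}

/* Option 2: ask the operating system for 32 random bits.
 * ASSUMPTION: /dev/urandom exists. Returns 1 on success, 0 on failure. */
int seed_from_os(uint32_t *out)
{
    FILE *f;
    size_t got;

    assert(out != NULL);
    f = fopen("/dev/urandom", "rb");
    if (f == NULL)
        return 0;
    got = fread(out, sizeof *out, 1, f);
    fclose(f);
    return got == 1;
}

/* Prefer the OS source; fall back to the clock (the weak option). */
uint32_t pick_seed(void)
{
    uint32_t s;

    if (seed_from_os(&s))
        return s;
    return (uint32_t)time(NULL);       /* one-second resolution only */
}
```

The time(NULL) fallback shows the weakness described above: two games started in the same second get the same seed, and all the seeds handed out in a year cover only about 31.5 million of the roughly 4.29 billion possible 32-bit values.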

Anyone from Firaxis want to enlighten us on how the seed is determined for the start of games / saves without the seed saved?

SpencerH · August 30, 2002, 10:03

Quote:

Originally posted by vulture

I generally remain rather cynical of some of the studies I've seen in biology and sociology/psychology reports. I guess we usually have the luxury of being able to gather better stats under better conditions on the whole in physics.

Well I'm a cellular microbiologist and I'm pretty cynical about them too.

Theseus · August 30, 2002, 10:35

I had a discussion with Carver re the 'nature' of the PRNG in a game like Civ3.

Is it not the case that within your 100M set, there would, in fact, be streaks, and that they would be balanced by counter-streaks?

Think blackjack and card-counting.