Analysing eqemu RNG

Discussion in 'General Discussion' started by surron, Jun 30, 2016.

  1. surron

    surron People Like Me

    Messages:
    552
    Per this article on c++'s random_device and mt19937 random number engine

    http://www.pcg-random.org/posts/cpps-random_device.html

    the author states the impact of seeding mt19937 with random_device

    EQEMU code does exactly what he says not to do.

    If you look at https://github.com/EQMacEmu/Server/blob/54f2877ff4f42d57057f1b210f0318062399f311/common/random.h

    which is random.h in its constructor Reseed() is called.

    Reseed() looks like...

    Code:
    [ Only registered users can see the bbcode. Click Here To Register... ]
    and Random is initialized as a global variable inside Zone. That means for each zone the RNG is only seeded once, and it's with a static UINT. This static UINT changes per each zone object, but that's irrelevant since each zone is its own entity.

    You can tell by the comments that the person who wrote this code knows better seeding can be done with seed_seq which, is what the article above states should be done.

    http://www.pcg-random.org/posts/cpp-seeding-surprises.html


    I wrote a program that uses EQEMU's Random.h class and ran tests.

    Here's a slightly modified Random.h

    Code:
    [ Only registered users can see the bbcode. Click Here To Register... ]
    and here's my main class that implements this and does some logging

    Code:
    [ Only registered users can see the bbcode. Click Here To Register... ]
    What this program does is run each test (4 modes) and rolls 1 million times each, gathering info on the distribution of numbers, how many times the number rolled the same as the previous number (streaks) and how many times the number rolled the same as the previous number +/- 1 (close streaks).
     
    Last edited: Jun 30, 2016
  2. surron

    surron People Like Me

    Messages:
    552
     
  3. surron

    surron People Like Me

    Messages:
    552
     
  4. surron

    surron People Like Me

    Messages:
    552

    I wish I had a way to visualize this data... Anyways you can see that both random engines tried (mt19937 and srand) both distribute their numbers fairly evenly. You can see that reseeding before each roll doesn't really affect the distribution, however reseeding before each roll on each engine added 8x the overhead. You can see that seeding with system time before each roll is impossible with this test because the loop happens so quickly, the seed is being set as the same thing repeatedly, thus the rand pattern is the same.

    Now for analyzing the streaks...
    Code:
    [ Only registered users can see the bbcode. Click Here To Register... ]
    In conclusion, I was wrong for stating in a previous thread that seeding with system time would be better than random_device (all in all they are just UINTs.) I was also wrong about how seeding more than once could be beneficial.

    **Going forward**

    I would like to edit my program when I get more time and do the following...

    - Setting the close streak threshold to something greater than 1. For example, some checks on tradeskills are 5%... so maybe set the threshold to match that.

    - Maybe since our Int roll function according to EQEMU is (0, 99) which makes sense because we are just converting it to %, but would it be a difference if we did (0, 999) or (0, 9999)?

    - Implementing the seed_seq function. I think this is the next best approach


    There's got to be some reason why our RNG is so streaky, it is beyond noticeable in game, and hopefully this data / test can spark some ideas.
     
    Last edited: Jun 30, 2016
    Mambo and kicnlag like this.
  5. Haynar

    Haynar Administrator

    Messages:
    3,637
    Spark what ideas? How to make it streakier like EQ really was?

    I think its streaky enough. I am against trying to make it have bigger streaks, regardless of how EQ was.
     
  6. surron

    surron People Like Me

    Messages:
    552
    the opposite, less streaky... EQ's code wouldn't have mt19937 or random_device... they were probably using c++98
     
    Last edited: Jun 30, 2016
  7. Bum

    Bum I Feel Loved

    Messages:
    2,647
    My kids underwear have lots of streaks
     
  8. Lenas

    Lenas I Feel Loved

    Messages:
    2,968
    Yeah but Surron's whole thing is about EQ being too streaky as is
     
  9. Haynar

    Haynar Administrator

    Messages:
    3,637
    EQ was streaky.
     
  10. Torven

    Torven I Feel Loved

    Messages:
    2,742
    Changing how the pseudo RNG algorithm is seeded won't change the results in-game what so ever. You see streaks in-game because true randomness is streaky. Play Diablo and you'll get the same legendaries over and over.
     
    kicnlag likes this.
  11. surron

    surron People Like Me

    Messages:
    552
    I'd have to argue that EQ live, EQ mac, and even p99 wasn't as streaky as takp

    Haynar is p99's random class similar / the same as here?


    I'm unfamiliar with diablo 1, but if its like diablo 2 and the stats are random on legends then I'd have to argue that getting the same legendary isn't that uncommon in a small world like diablo (just like the small world of 1 eq zone), it would be getting similar stats on the same legendary that would conclude streakyness.
     
    Last edited: Jun 30, 2016
  12. kicnlag

    kicnlag Active Member

    Messages:
    65
    Elroz and Pithy like this.
  13. Pithy

    Pithy I Feel Loved

    Messages:
    2,620
    Or the clustering illusion, yeah. Torven's right: Real random processes are streaky! Go flip a coin 100 times; you'll see streaks of 5+ heads pretty regularly.
     
  14. surron

    surron People Like Me

    Messages:
    552
    i'd agree with that statement but 2 outcomes is a lot different than 100
     
  15. Lenas

    Lenas I Feel Loved

    Messages:
    2,968
    Aren't there only two outcomes? Skill up / no skill up. That's the streakiness you are talking about.
     
  16. surron

    surron People Like Me

    Messages:
    552
    im talking about everything that uses the RNG. for example kick is 33% chance to land in our code, for it to land it goes to the RNG rolls 0-99, if the number rolled is 33 or under it succeeds.

    everything is based off that concept, skill ups too. i am specifically interested in the part where the RNG is rolling 0-99
     
  17. Haynar

    Haynar Administrator

    Messages:
    3,637
    We have no plans to change the RNG.

    Still.
     
  18. surron

    surron People Like Me

    Messages:
    552
    I know math may be difficult sometimes, and I'm not asking you to change it... I'm just analyzing it.

    Does p99's random.h class look the same as TAKP? I've asked that question a couple times don't know if you've seen it.

    There are other RNGs out there, here's a custom one that is built using a 64 bit int distribution.

    Code:
    [ Only registered users can see the bbcode. Click Here To Register... ]
    You'll notice right away that there are 0 streaks in 1 million rolls, but double the "Close Streaks" as our current RNG.

    Now in 1 million rolls I would expect some streaks in a range of 100 numbers, but in this case it doesn't happen. To roll the same number twice with 1/100 odds is a 0.0001% chance (if we are not counting in a row.) So out of a million rolls that 0.0001% chance never happened. Our current RNG is saying that 0.0001% chance is happening ~1000 times per 1 million rolls, which = 0.0001%.

    While our RNG may be statistically accurate and matches 1/100 * 1/100 chance to roll the same number, it does not account for rolling that same number in a row in a given finite set. Also my program does not account for rolling a number 3 or more times in a row. If our TAKP RNG is rolling the same number 3 times in a row that's 0.000001% chance. Out of a million rolls that could happen 1 time statistically speaking.


    edit* so I modded my program to remember up to 3 consecutive rolls.

    We realize that rolling 1-100 1million times and getting the same number 3 times in a row is a 0.000001% chance. Aka it should happen once.

    It happened 82 times in 1 million rolls (a chance of 1.2x10^-8) or 0.000000012%! That screams streaky.
     
    Last edited: Jul 1, 2016
  19. Pithy

    Pithy I Feel Loved

    Messages:
    2,620
    RL is streaky! Whether they're drawn from continuous or discrete distributions, uniform random numbers will tend to clump and cluster. That's the whole reason people use variance reduction techniques like stratification and Latin hypercube sampling in Monte Carlo simulation. The animation on the wiki clustering illusion page illustrates this nicely.

    Is Knuth's LCG outputting ints or floats? If floats, that'd explain why you see no 'streaks' but lots of 'close streaks'.
     
  20. Haynar

    Haynar Administrator

    Messages:
    3,637
    Math is tough for me.

    I cant understand all these silly statistic things. They are beyond my uneducated, ability to understand.

    So i am going to just call you a nerd, point and laugh, instead.
     
    Slayzz likes this.
  21. Pithy

    Pithy I Feel Loved

    Messages:
    2,620
    BTW, surron, I for one think your analysis is super cool! RNGs are fascinating, and the blog post in the OP makes a good point about the importance of the initial seed. The Mersenne twister wiki page also mentions that the output stream takes a long time to start looking random if it's seeded badly.

    If you really want to geek out here, I'd suggest working with uniform random reals between 0 and 1, rather than integers between 0 and 99. Uni(0,1) RNGs are fundamental; all other distributions are based on Uni(0,1). There are a bunch of standard tests for Uni(0,1) RNGs that check uniformity, independence, randomness, etc. The TestU01 C library implements a bunch of these tests.
     
  22. Bum

    Bum I Feel Loved

    Messages:
    2,647
    Lawls @ HayNerd lolzing @ nerds
     
  23. surron

    surron People Like Me

    Messages:
    552
  24. lurari

    lurari People Like Me

    Messages:
    646
    If an EQEmu had a better working RNG than the real EQ, is that a success or failure by the developers?
     
  25. Torven

    Torven I Feel Loved

    Messages:
    2,742
    Any difference people think they see is in their heads
     
    surron and Kagatob like this.
  26. surron

    surron People Like Me

    Messages:
    552
    rolling the same number 3 times in a row is a 0.000001% chance (1/100 * 1/100 * 1/100)

    so in 1 million rolls that is expected to happen once. (0.000001% * 1000000) = 1

    our RNG lets that happen 82 times. i dont think that proof is "in my head"
     
  27. Haynar

    Haynar Administrator

    Messages:
    3,637
    We can conclude that the RNG is not perfect.

    And noone is working on a replacement.

    And surron thinks we should drop everything and fix this now.

    3 of same value in a row is meh.

    We look at pass/fails for the most part in tradeskills for example. If < 30 is a fail. Guess what? Three 5's in a row is equivalent to a 1, 6, 8 in the overall response.

    It averages out fairly well. So my suggestion is this:

    surron: quit pretending like you are smarter than everyone else. Analyzing 3 points of a data set to make conclusions is about as bad as u can get in the world of data analysis.
     
    Last edited: Jul 2, 2016
    Mambo likes this.
  28. Pythagoras

    Pythagoras New Member

    Messages:
    11
    The chance of rolling the same number (0-99) 3 times in a row is actually 1 in 10000.
    The chance of rolling a GIVEN number 3 times in a row is 1 in 1 million.
    When looking for a streak of 3, whatever the first number in the streak is doesn't matter (100% chance of being satisfied), as long as the two following it are the same (1 in 100 chance for each number).
     
    Mambo, Lenas and Pithy like this.
  29. Bum

    Bum I Feel Loved

    Messages:
    2,647
    RNG needs to be replaced with BNG

    Bum's Number Generator
     
  30. Haynar

    Haynar Administrator

    Messages:
    3,637
    I hope you goof balls realize for doing a rand 0-99 it doesnt pick the same number 3 times. There are many numbers picked in between. But after converting to an int, it looks like you got same number.

    This discussion is beyond stupid. With idiot after idiot, now being all knowing about programming and statistics.