PRNG, simple math, and an idea
3 years ago

I've recently watched a few D2 speedruns and noticed the reset strategy to get the "right" A1 town exit; I've since found the thread explaining it here: https://www.speedrun.com/d2lod/thread/wi5mn

I've put some thought into why this works and how this can be exploited further....

  1. How does a pseudo-random number generator (PRNG) work?

A PRNG has some internal state and when you ask it for a random number this state changes in a deterministic way. An example for a simple PRNG would be calling this function on some state x and returning the state as random result for each number generated.

f(x) = x * 7 mod 13

Let's say x starts with 1 (called the seed) and we want some random numbers, here's what we get:

1, 7, 10, 5, 9, 11, 12, 6, 3, 8, 4, 2, 1

That's clearly a pretty bad PRNG because it is very predictable. Seeing just one result allows us to calculate the next one (because we give out the complete state as result, easy to fix). But imagine using very large prime numbers instead of 7 and 13. And a more complex operation than a multiplication.

And imagine not being able to observe the full result but just a coin flip, e.g., mapping 1-6 to head and 7-12 to tail in this function for the output. Then it's much harder to predict.

Now there are two properties of a RNG that we are interested in:

i) how predictable is it, i.e., can we get the internal state by observing the output? ii) how big is the state, i.e., in our case x can only take 12 different values

The thing we are interested here is only the size of the state. You might think that i) is the more interesting thing, but it turns out the size is the limiting factor for Diablo II map generation.

Our example function above only has 12 different states (values 1 to 12). That means two things:

i) it repeats itself after at most 12 different outputs, see example above, it gets back to 1 in the 13th output, since that's the only state it has it'll just start over ii) there are only 12 different seeds that can be used, i.e., 12 different combinations for the first 12 outputs

We care about ii) here. Assume that this function would be used to generate a map and our map just consists of 4 random numbers between 1 and 12

How many maps that consist of 4 numbers between 1 and 12 are there? 12 * 12 * 12 * 12 = 20736 How many different maps can that PRNG generate? Only 12. Each seed (initial value) will generate a unique map.

So for example we'll never get the map 1, 10, 2, 3 because there is no such seed that generates this sequence. Or we never get a map where the same number occurs twice because it only repeats after 12 outputs, but we only look at 4.

So even tough it looks like there are only very few maps our PRNG is already insufficient.

  1. So you build a bad PRNG, a real game would surely use a better one with more than 12 possible seeds?

Diablo 2 seems to use a RNG with a seed of 64 bits: https://github.com/OpenDiablo2/OpenDiablo2/blob/8a547ebf0e23daedbaf0c5717b9f81b72a74ecb0/d2core/d2gamestate/d2gamestate.go#L22

That means there are a total of 2^64 (that's 18,446,744,073,709,551,616) different possible maps. That ought to be enough for a true randomization of maps, right? Nope.

Diablo 2 maps have lots of coin flip (or 3-way or 4-way) "decisions" that are being made during generation. Examples are:

  • Town exits in A1 and A2
  • Exit directions on maps and caves
  • Is there a river in map X
  • Which direction is the barracks in?
  • Is the <whatever> dungeon on the left or right part of the map?
  • Are there portals/weird escher-style paths/nothing in direction X in the Arcane Sanctuary?
  • Which tomb is the right one?
  • Is the path between the different Kurast areas to the left or the right?
  • Are there 2 chests or 3 chests in X?
  • ....

You can probably find tons of these examples. We are looking for ones that should not be correlated. For example, there is some correlation between area exits as some configurations are impossible (they'd run into each other).

Let's look at each of these decisions as a coin flip decision. For most it's clearly more than a single coin flip (4 town exits in A1, 3 exits for many areas), let's ignore that and say there's one coin flip on average in each decision.

How many of these coin flips in maps can you find? 100? Probably! And these are just the decisions that you can easily spot in the game. There's way way more than that internally.

How many different maps are there theoretically if they only consist if 100 coin flips? 2^100 maps. How many different maps can possibly exist with a 64 bit seed? Only 2^64.

2^100 is way more than 2^64. It's a factor of 2^36 more, that's around a factor of 68 billion more. In this example only 0.00000000145519% out of the maps could exist are actually being generated if maps consist of only 100 coin flips. But maps are way more complex than that so it's even worse (or better, for us)!

Diablo 2 can only make 64 coin flips for map generation without becoming predictable. There are more than 64 coin flips in maps, so we can gain some information by observing maps.

A lot of things that could be possible are simply never generated due to this limitation and there are unexpected correlations that can be exploited.

  1. Some assumptions about map generation in Diablo 2

There's one big assumption to make this useful that seems to be true: The whole map is generated at once for all acts. It is clearly generated at least the starting act at the same time (otherwise map hacks that just read local data wouldn't work on battle.net). It looks like it generates everything for all acts in the beginning as https://www.speedrun.com/d2lod/thread/wi5mn indicates that are correlations between A1 town layout and A3 and A4.

This is important: the whole problem could be avoided if D2 generated maps as you first enter the area. Because the PRNG state would then be influenced by whatever you did before you entered the area (such as kills, spell damage calculations, whatever else the seed is being used for).

This can be tested: start the game with --seed X and check in new games whether all acts look the same in different play-throughs. I guess they do?

  1. Finding correlations (in an automated way)

There are probably way more correlations than have been found. There's probably stuff like "real tomb was circle, therefore exit in area X faces direction Y". These could be figured out:

i) make a list with easily identifiable "markers" like the examples I gave in 2) ii) create a new map on a character that has all acts revealed iii) find all markers/results of these coin flips or X-way decisions iv) repeat several times

Once you got enough data you can write a simple tool to find correlations. Note that these can be more complex than just one finding implies one other. It could be that three different decisions are tails imply that another two can never be heads.

Gathering that data can be automated. Inject some code in the game, make it generate maps (in the simplest case by just starting a game). Look at what it generated in memory. Example code from a map hack looks easy enough: https://github.com/planqi/slashdiablo-maphack/blob/master/BH/Modules/Maphack/Maphack.cpp#L802

Stuff it finds has a position and an ID. Something like finding "tower in A1 is on the left part of the map" or "exit in arcane sanctuary is top right" is easy to figure out based on this.

No, I didn't do any of this. I didn't even play the game in 15 years... I'm just procrastinating here...

  1. Fun fact: more serious PRNG problems

This "limitation" isn't really a problem for a game like D2 when played normally. But there has been at least one case of an online poker site using a seed of only 64 bits, i.e., they had only 2^64 different possibilities the deck could be shuffled. However, there are 52! ways a deck could be shuffled. 52! is way way way bigger than 2^64 (it's bigger by factor of 4 followed by 48 zeroes). Clever people did exploit that to earn serious money. (Can't find the story right now, was ~10 years ago)

Problems with the PRNG in slot machines are also not unheard of, there's a fun story of people feeding the output of slot machines to a phone app which then told them when to bet a lot of money because the next move is gonna be a winner: https://www.wired.com/2017/02/russians-engineer-brilliant-slot-machine-cheat-casinos-no-fix/

Edit 1: removing accidental smileys Edit 2: typos

Edited by the author 3 years ago
Switzerland

Some people in MrLlamaSC discord are trying to do just that, mainly aimed at figuring out the arcane ( https://discordapp.com/invite/0cdE0TjHEACCSO2L if you want to join them). They have gathered data from like 20k seeds

Problem is there is an insane amount of stuff you can combine with each other, so if you just want to bruteforce check for correlations, you probably have to ask NASA or google if they are willing to borrow you a supercomputer for a few month to get it done.

Then looking at speedrunning, the relations need to be something thats actually useable for us. Lets say something in radament-level would tell you the arcane direction, "just get lucky" would still be the fastest strat.

The correlation itself for the relevant parts can be calculated with no problems. No brute force needed because we're not interested in all combinations and outcomes.

Idea is to define m markers that are easily accessible (obviously would use something like waypoint position, exit directions that you take anyways) and run that for n seeds.

Then calculate probabilities for the outcome of all these markers, e.g. it might be 50/50 if town exit in A2 is left or right. Call this the baseline.

Next look at a marker you are interested in such as "Arcane Sanctuary is top left" and look again at the probabilities for all other markers for the subset of samples that have the marker you are interested in. Calculate the difference to the baseline. Or the difference to the remaining samples (i.e., the ones without the marker you are looking for). If that tells you "town exit A2 is top left 70%, only 30% top right" then you've found a useful correlation.

That's a O(m * n) operation for each marker you care about, easy!

Key is you need lots of samples and lots of markers (because you don't know which ones end up being useful), so better automate this.

South Yorkshire, England

Wouldn't it be possible to run D2's code externally and observe the output? That is, you could repeatedly call the map generation code and observe the results by maphacking (maybe even that step could be automated).

The most elegant way would indeed be to use the server component of the game to generate maps and gather the necessary information. No UI needed and easy to run hundreds of instances in parallel. However, there's probably more custom code needed to run the scan on the map data.

Running in normal single-player also makes initial debugging and finding "markers" easier

Edited by the author 3 years ago
Russia

Diablo 2 seems to use a RNG with a seed of 64 bits: https:/​/​github.​com/​OpenDiablo2/​OpenDiablo2/​blob/​8a547ebf0e23daedbaf0c5717b9f81b72a74ecb0/​d2core/​d2gamestate/​d2gamestate.​go#L22

This link is going to OpenDiablo2, I'm not sure it works the same way as the original engine.

About markers - this is the first idea that comes to mind when you think about speedrun Diablo 2. But thats not how RNG in Diablo 2 works. The only 1 marker I can remember is this one - .

The thing is RNG works in different ways. I don`t think it works like "How many different maps are there theoretically if they only consist if 100 coin flips? 2^100 maps. How many different maps can possibly exist with a 64 bit seed? Only 2^64."

Because item drops are even more random than cards. Every time you receive an item, RNG work as the first time. So it looks like every map ( https://diablo.gamepedia.com/Area_Size_(Diablo_II) ) generated separately. There is no connection betwen tower level 1 and tower level 2 at all, diablo 2 speedrun is not a new thing, if diablo 2 had markers, then they were already found.

Edited by the author 3 years ago
Russia

Yeah, and the main thing - maps are not the biggest problem for speedrunners, finding a path is not a difficult part of speedrun.

64 bit seed from OpenDiablo2 was an example, actual code seems to use 32 bit per act for map generation: https://github.com/noah-/d2bs/blob/328463364b8128339e00c59e40413b2242e2a10e/D2Structs.h#L361

That means there are only 4 billion possible maps for each act. Next question would be where these seeds come from (i.e., are Acts correlated? data between A1 exits and A4/5 stuff seems to be evidence of this). But that doesn't really matter. The point is that only a very small subset of the maps that you can imagine can actually be generated by the algorithm.

You are misunderstanding what I mean by markers/map generation, sorry if I wasn't clear. Map generation doesn't work like "taking a coin flip to decide if tower goes on left side or right side of the map", yes. But, fortunately, that doesn't matter.

There are only so many maps, apparently 4 billion variations per act. These maps do have markers that are an accidental result of the generation and they can be correlated to useful facts.

With only 4 billion per act: that's very much in the realm of a brute force solution generating all the possible maps and figuring out what correlates with useful things.

The markers are just a simple way to gain information about the map. Hard to see "this is room with ID X in position Y in this dungeon", easy to see "exit faces left".

Edited by the author 3 years ago
Estonia

"Because item drops are even more random than cards. Every time you receive an item" -327

Good discussion, just want to add that there are item drops that can be manipulated. It's possible to almost always get the same item from weapon/armor racks based on the direction you approach them.

Russia

ekgm many dungeons have layouts ... again maps are not problem for speedrunners for today, only catacombs, arcanes, sewers (not at all) and worldstone are random for runners

British Columbia, Canada

The direction that exits face for most maps has always been known based on the starting tile of that map - tower will always be a left turn, no matter what, for example, and is therefore a bit irrelevant. The real benefit to the right right act 1 map is a fast act 1 layout, knowing which way kurast is, act 4 layout afaik. The only thing that would really benefit is if we could find correlation for arcane that does not lose more time than simply being lucky, most other maps we have a way of telling already, and time loss is simply weird generation ("3 lefts make a right", which as far as the way the maps are generated goes, isn't very predictable at all). Only exceptions to having a tell is catacombs 1 and 3 and worldstone keep 1 and 3 (both level 2 only if you find wp)

Game stats
Followers
758
Runs
2,308
Players
291
Latest threads
Posted 3 years ago
2 replies
Posted 1 month ago
10 replies
Posted 1 month ago
Posted 11 months ago
3 replies
Posted 1 year ago
2 replies