Comments
random-username4 years ago

64 bit seed from OpenDiablo2 was an example, actual code seems to use 32 bit per act for map generation: https://github.com/noah-/d2bs/blob/328463364b8128339e00c59e40413b2242e2a10e/D2Structs.h#L361

That means there are only 4 billion possible maps for each act. Next question would be where these seeds come from (i.e., are Acts correlated? data between A1 exits and A4/5 stuff seems to be evidence of this). But that doesn't really matter. The point is that only a very small subset of the maps that you can imagine can actually be generated by the algorithm.

You are misunderstanding what I mean by markers/map generation, sorry if I wasn't clear. Map generation doesn't work like "taking a coin flip to decide if tower goes on left side or right side of the map", yes. But, fortunately, that doesn't matter.

There are only so many maps, apparently 4 billion variations per act. These maps do have markers that are an accidental result of the generation and they can be correlated to useful facts.

With only 4 billion per act: that's very much in the realm of a brute force solution generating all the possible maps and figuring out what correlates with useful things.

The markers are just a simple way to gain information about the map. Hard to see "this is room with ID X in position Y in this dungeon", easy to see "exit faces left".

random-username4 years ago

The most elegant way would indeed be to use the server component of the game to generate maps and gather the necessary information. No UI needed and easy to run hundreds of instances in parallel. However, there's probably more custom code needed to run the scan on the map data.

Running in normal single-player also makes initial debugging and finding "markers" easier

random-username4 years ago

The correlation itself for the relevant parts can be calculated with no problems. No brute force needed because we're not interested in all combinations and outcomes.

Idea is to define m markers that are easily accessible (obviously would use something like waypoint position, exit directions that you take anyways) and run that for n seeds.

Then calculate probabilities for the outcome of all these markers, e.g. it might be 50/50 if town exit in A2 is left or right. Call this the baseline.

Next look at a marker you are interested in such as "Arcane Sanctuary is top left" and look again at the probabilities for all other markers for the subset of samples that have the marker you are interested in. Calculate the difference to the baseline. Or the difference to the remaining samples (i.e., the ones without the marker you are looking for). If that tells you "town exit A2 is top left 70%, only 30% top right" then you've found a useful correlation.

That's a O(m * n) operation for each marker you care about, easy!

Key is you need lots of samples and lots of markers (because you don't know which ones end up being useful), so better automate this.

random-username4 years ago

I've recently watched a few D2 speedruns and noticed the reset strategy to get the "right" A1 town exit; I've since found the thread explaining it here: https://www.speedrun.com/d2lod/thread/wi5mn

I've put some thought into why this works and how this can be exploited further....

  1. How does a pseudo-random number generator (PRNG) work?

A PRNG has some internal state and when you ask it for a random number this state changes in a deterministic way. An example for a simple PRNG would be calling this function on some state x and returning the state as random result for each number generated.

f(x) = x * 7 mod 13

Let's say x starts with 1 (called the seed) and we want some random numbers, here's what we get:

1, 7, 10, 5, 9, 11, 12, 6, 3, 8, 4, 2, 1

That's clearly a pretty bad PRNG because it is very predictable. Seeing just one result allows us to calculate the next one (because we give out the complete state as result, easy to fix). But imagine using very large prime numbers instead of 7 and 13. And a more complex operation than a multiplication.

And imagine not being able to observe the full result but just a coin flip, e.g., mapping 1-6 to head and 7-12 to tail in this function for the output. Then it's much harder to predict.

Now there are two properties of a RNG that we are interested in:

i) how predictable is it, i.e., can we get the internal state by observing the output? ii) how big is the state, i.e., in our case x can only take 12 different values

The thing we are interested here is only the size of the state. You might think that i) is the more interesting thing, but it turns out the size is the limiting factor for Diablo II map generation.

Our example function above only has 12 different states (values 1 to 12). That means two things:

i) it repeats itself after at most 12 different outputs, see example above, it gets back to 1 in the 13th output, since that's the only state it has it'll just start over ii) there are only 12 different seeds that can be used, i.e., 12 different combinations for the first 12 outputs

We care about ii) here. Assume that this function would be used to generate a map and our map just consists of 4 random numbers between 1 and 12

How many maps that consist of 4 numbers between 1 and 12 are there? 12 * 12 * 12 * 12 = 20736 How many different maps can that PRNG generate? Only 12. Each seed (initial value) will generate a unique map.

So for example we'll never get the map 1, 10, 2, 3 because there is no such seed that generates this sequence. Or we never get a map where the same number occurs twice because it only repeats after 12 outputs, but we only look at 4.

So even tough it looks like there are only very few maps our PRNG is already insufficient.

  1. So you build a bad PRNG, a real game would surely use a better one with more than 12 possible seeds?

Diablo 2 seems to use a RNG with a seed of 64 bits: https://github.com/OpenDiablo2/OpenDiablo2/blob/8a547ebf0e23daedbaf0c5717b9f81b72a74ecb0/d2core/d2gamestate/d2gamestate.go#L22

That means there are a total of 2^64 (that's 18,446,744,073,709,551,616) different possible maps. That ought to be enough for a true randomization of maps, right? Nope.

Diablo 2 maps have lots of coin flip (or 3-way or 4-way) "decisions" that are being made during generation. Examples are:

  • Town exits in A1 and A2
  • Exit directions on maps and caves
  • Is there a river in map X
  • Which direction is the barracks in?
  • Is the <whatever> dungeon on the left or right part of the map?
  • Are there portals/weird escher-style paths/nothing in direction X in the Arcane Sanctuary?
  • Which tomb is the right one?
  • Is the path between the different Kurast areas to the left or the right?
  • Are there 2 chests or 3 chests in X?
  • ....

You can probably find tons of these examples. We are looking for ones that should not be correlated. For example, there is some correlation between area exits as some configurations are impossible (they'd run into each other).

Let's look at each of these decisions as a coin flip decision. For most it's clearly more than a single coin flip (4 town exits in A1, 3 exits for many areas), let's ignore that and say there's one coin flip on average in each decision.

How many of these coin flips in maps can you find? 100? Probably! And these are just the decisions that you can easily spot in the game. There's way way more than that internally.

How many different maps are there theoretically if they only consist if 100 coin flips? 2^100 maps. How many different maps can possibly exist with a 64 bit seed? Only 2^64.

2^100 is way more than 2^64. It's a factor of 2^36 more, that's around a factor of 68 billion more. In this example only 0.00000000145519% out of the maps could exist are actually being generated if maps consist of only 100 coin flips. But maps are way more complex than that so it's even worse (or better, for us)!

Diablo 2 can only make 64 coin flips for map generation without becoming predictable. There are more than 64 coin flips in maps, so we can gain some information by observing maps.

A lot of things that could be possible are simply never generated due to this limitation and there are unexpected correlations that can be exploited.

  1. Some assumptions about map generation in Diablo 2

There's one big assumption to make this useful that seems to be true: The whole map is generated at once for all acts. It is clearly generated at least the starting act at the same time (otherwise map hacks that just read local data wouldn't work on battle.net). It looks like it generates everything for all acts in the beginning as https://www.speedrun.com/d2lod/thread/wi5mn indicates that are correlations between A1 town layout and A3 and A4.

This is important: the whole problem could be avoided if D2 generated maps as you first enter the area. Because the PRNG state would then be influenced by whatever you did before you entered the area (such as kills, spell damage calculations, whatever else the seed is being used for).

This can be tested: start the game with --seed X and check in new games whether all acts look the same in different play-throughs. I guess they do?

  1. Finding correlations (in an automated way)

There are probably way more correlations than have been found. There's probably stuff like "real tomb was circle, therefore exit in area X faces direction Y". These could be figured out:

i) make a list with easily identifiable "markers" like the examples I gave in 2) ii) create a new map on a character that has all acts revealed iii) find all markers/results of these coin flips or X-way decisions iv) repeat several times

Once you got enough data you can write a simple tool to find correlations. Note that these can be more complex than just one finding implies one other. It could be that three different decisions are tails imply that another two can never be heads.

Gathering that data can be automated. Inject some code in the game, make it generate maps (in the simplest case by just starting a game). Look at what it generated in memory. Example code from a map hack looks easy enough: https://github.com/planqi/slashdiablo-maphack/blob/master/BH/Modules/Maphack/Maphack.cpp#L802

Stuff it finds has a position and an ID. Something like finding "tower in A1 is on the left part of the map" or "exit in arcane sanctuary is top right" is easy to figure out based on this.

No, I didn't do any of this. I didn't even play the game in 15 years... I'm just procrastinating here...

  1. Fun fact: more serious PRNG problems

This "limitation" isn't really a problem for a game like D2 when played normally. But there has been at least one case of an online poker site using a seed of only 64 bits, i.e., they had only 2^64 different possibilities the deck could be shuffled. However, there are 52! ways a deck could be shuffled. 52! is way way way bigger than 2^64 (it's bigger by factor of 4 followed by 48 zeroes). Clever people did exploit that to earn serious money. (Can't find the story right now, was ~10 years ago)

Problems with the PRNG in slot machines are also not unheard of, there's a fun story of people feeding the output of slot machines to a phone app which then told them when to bet a lot of money because the next move is gonna be a winner: https://www.wired.com/2017/02/russians-engineer-brilliant-slot-machine-cheat-casinos-no-fix/

Edit 1: removing accidental smileys Edit 2: typos

About random-username
Joined
4 years ago
Online
4 years ago
Runs
0