Optimizing a program for analyzing a word game

I am part of the speedrunning community for a PC game called Bookworm Adventures. I have written some utilities to help people learn/practice the game. At the moment I'm trying to write something that can analyze a game state in order to determine the best play, similar to utilities that exist for Scrabble players.

The game's board is a 4x4 grid of tiles. Playing a word removes those letter tiles from the board and replaces them with randomly generated ones, with a maximum of 4 copies of the same letter. "Qu" is a single tile (I replaced every instance of Qu with Q in the word list to simplify this). I don't know the official probability of each letter, but I've approximated the probabilities experimentally.
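In case it's relevant, my randomBoard helper works roughly like the sketch below; the randomBoardSketch name and the uniform weights are placeholders (my real version uses the measured frequencies and enforces the 4-copies cap):

// Sketch of filling the '?' blanks with letters drawn from a weighted distribution.
// The uniform weights are placeholders, and the 4-copies-per-letter cap is omitted.
#include <random>
#include <string>
#include <vector>

std::string randomBoardSketch(std::string leave) {
    static const std::string LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    static const std::vector<double> WEIGHTS(LETTERS.size(), 1.0); // placeholder weights
    static std::mt19937 rng{std::random_device{}()};
    static std::discrete_distribution<int> pick(WEIGHTS.begin(), WEIGHTS.end());

    for (char &c : leave)
        if (c == '?')
            c = LETTERS[pick(rng)]; // replace each blank with a weighted random letter
    return leave;
}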

The game's dictionary is loaded into memory as a tree of linked nodes. With this, I can quickly assemble a list of valid words that are good enough to kill the current enemy. But out of those valid words, I want to assess which one leaves behind the best letters.
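For reference, each node in that tree looks roughly like this minimal sketch (it's essentially a trie; the field names here are illustrative):

// Minimal sketch of a dictionary tree node (a trie over A-Z, with Qu folded into Q).
struct Node {
    bool isWord = false;      // the path from the root down to here spells a valid word
    Node* children[26] = {};  // one link per letter, null if no word continues that way
};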

The way I have it now, it generates many random board states from those leftover letters, sums the score of the best play on each of those boards, and returns the average.

So my current flow is something like this:

struct word {
    string name;  // The word itself
    string leave; // The letters left behind ('?' for blank spaces)
};

vector<word> words; // Dump all valid plays into words
int size = words.size();

for (int i = 0; i < size; i++) {
    word curWord = words[i];
    cout << curWord.name << ": " << leaveValue(curWord.leave) << '\n';
}

With the function looking like:

float leaveValue(string leave) {
    float sum = 0;
    /* Here I used multithreading to split the job among all available
       threads. It helps a little bit but my computer is pretty wimpy. */
    for (int i = 0; i < MAX_SIMS; i++) {
        string simBoard = randomBoard(leave); // Fills in blanks w/ random letters based on letter probability
        sum += findBest(simBoard);            // Traverses the dictionary tree to find the best scoring word
    }
    return sum / MAX_SIMS;
}

However, if I run this with MAX_SIMS = 1000, my results vary quite widely. It seems like I need at least 100k sims per word (maybe even 1M) to get a consistent average.
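Put differently, how many sims I need depends on how much findBest varies from board to board, so one idea is to track the standard error of the running mean and stop once it's small enough instead of guessing at MAX_SIMS. This is only a rough sketch using Welford's online variance; the tolerance and the min/max sim counts are arbitrary examples, and the randomBoard/findBest declarations just mirror my code above:

#include <cmath>
#include <string>

std::string randomBoard(const std::string& leave); // as in my existing code
float findBest(const std::string& board);          // as in my existing code

float leaveValueAdaptive(const std::string& leave) {
    double mean = 0.0, m2 = 0.0;          // running mean and sum of squared deviations
    long n = 0;
    const double tolerance = 0.5;         // arbitrary example: stop when the mean is known to ~0.5 points
    const long minSims = 1000, maxSims = 1000000;

    while (n < maxSims) {
        double x = findBest(randomBoard(leave));
        ++n;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);         // Welford's variance update

        if (n >= minSims) {
            double se = std::sqrt(m2 / (n - 1) / n); // standard error of the mean
            if (se < tolerance) break;               // mean is stable enough, stop early
        }
    }
    return static_cast<float>(mean);
}

A nice side effect is that leaves whose best score barely varies would terminate early, while the swingy ones get more sims.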

One thing I thought to do is to assemble a list of the randomized boards, only keeping unique boards but keeping track of how many times they appear. That way I'm not assessing the value of the same boards multiple times. Is there anything else obvious I'm missing to speed this up?
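That deduplication idea would look roughly like this (illustrative sketch only; it presumably only pays off if randomBoard returns the letters in a canonical order, e.g. sorted, so that equivalent boards compare equal):

#include <string>
#include <unordered_map>

std::string randomBoard(const std::string& leave); // as in my existing code
float findBest(const std::string& board);          // as in my existing code

float leaveValueDedup(const std::string& leave, int sims) {
    // Count how many times each distinct board shows up...
    std::unordered_map<std::string, int> counts;
    for (int i = 0; i < sims; i++)
        ++counts[randomBoard(leave)];

    // ...then score each unique board only once, weighted by its count.
    float sum = 0;
    for (const auto& [board, count] : counts)
        sum += findBest(board) * count;
    return sum / sims;
}

How much this helps presumably depends on how many blanks are being refilled: with only a few blanks the same boards repeat often, but with many blanks most generated boards will be unique anyway.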


I've played Bookworm Adventures before, though I've never speedrun it...

May I ask what/how you are doing this? Are you running a Python script or something?