# Generating Unique Level IDs

I built a game using Unity. It is now available on iOS & Android. You can download it here.

The purpose of this game (for product managers and other observers) is to demonstrate some of the problems faced in a production application, and take you along the journey of fixing and running experiments. I hope this serves well for younger PMs.

Note: the app updates and plans here will take time as this is being developed by me. And I’m not good at this. I learned Unity a couple of months ago. But I do play a lot of games.

# The Problem

When I generate levels, and I generate a lot of them, I store the level as a string. To recap, a level is defined as a matrix, as shown below:

This matrix is stored as a string. Every time I generate a new level, I have to make sure it is unique. However, string comparisons take time, and are expensive. Add to that, my level ID is an auto-increment, making things even more tough for me to identify if a level is unique.

In short, I want a simple, easy and cheap (wall-clock) to be able to identify each level uniquely.

# The Solution

At first, I thought simply hashing the string would be enough. And while it might be true, I don’t like the way I have to do a string comparison. I’m sure that’s expensive (relative to what I’m proposing), even in C.

## The Idea

Generate a unique ID by:

1. Generate N prime numbers, where N = matrix size (a x b matrix has a*b cells) and store in an array (PRIME_ARRAY)
2. For each cell, calculate cell_val: PRIME_ARRAY[cell index] * 2 ^ [value in cell]
3. Sum the cell_val

Now, I can’t really test the theory — I guess I’ll know this in production. I can, however, do some logical tests:

1. In the case of a 2 x 2 matrix, the primary numbers are [2, 3, 5, 7]. Therefore, the minimum value is 2 + 3 + 5 + 7 = 17 and the max value is
2 * 2⁴ + 3 * 2⁴ + 5 * 2⁴ + 7 * 2⁴ = 272 (given that the max value is 4)
2. Given that the max value is 4, I have 5 ways to fill each cell, giving us a total of 625 permutations available. Which is 2x more than the available range of IDs .

OOPS! something is wrong.

I went ahead and coded it anyway, just to see how many actually fit:

`#DEMO for a 2x2 matrix#Generates a 2x2 matrix that sequentially increases cell valuematrix_arr = []matrix_uid = []for a in range(5):    for b in range(5):        for c in range(5):            for d in range(5):                #Our matrix                matrix_arr.append(f"{a} {b} ; {c} {d}")                #Our Matrix ID                matrix_uid.append( 2 * pow(2, a) + 3 * pow(2, b) + 5 * pow(2, c) + 7 * pow(5, d) )cf = pd.DataFrame.from_dict({ "matrix" : matrix_arr, "matrixUID" : matrix_uid }, orient="columns")# Lets see how many are actually duplicatedcf.duplicated('matrixUID').describe()Result:count       625unique        2top       Falsefreq        355`

Only 355 were uniquely identified, everything else were duplicates.

That’s not good.

Maybe, I can try with a different function?

`#raise every cell value to the corresponding prime number.matrix_uid.append(  pow(2, a) +  pow(3, b) + pow(5, c) +  pow(7, d) )Result:count       625unique        2top       Falsefreq        393`

That doesn’t work either.

# The Final Solution

After spending 2 hours looking every possible combination available, I succumbed to using hash. Which turned out to be super simple:

`matrix_uid.append( hash("".join(str(i) for i in [a, b, c, d])))Result:count       625unique        1top       Falsefreq        625`

Adopting this to my actual code for generating:

`# Hashing function - Adding sys.maxsize + 1 for avoiding negative numbers.def getHash(matrix):    return hash("".join(str(i) for i in matrix.tolist())) + sys.maxsize + 1# Generatives Matrixdef genMatrix():    activeCells = 0    cell_index = 0    matrix = np.zeros((maxY,maxX))    for i in range(0,maxY):        for j in range(0, maxX):            val = random.randint(0, 4)            if val != 0:                activeCells += 1            cell_index += 1            matrix[i,j] = val    return [matrix, activeCells, getHash(matrix) ]# Weird naming choice, but this is to control the density of the matrix# This is where the hash lookup will take placedef properlyGenMatrix(maxPopulationPC, minPopulationPC):    activeCells = 0    totalCells = maxX * maxY    while( activeCells/totalCells < minPopulationPC or activeCells/totalCells > maxPopulationPC):        respo = genMatrix()        activeCells = respo       return [respo, respo]# Function for storage - the string is directly interpreted by the appdef matrixToString(matrix):    xx = matrix.tolist();    strToReturn = ''    for array in xx:        for x in array:            strToReturn += str(x) + ' '        strToReturn += '; '    return strToReturn;`

Running it for 500k generations, gives us this result:

`count     500000unique         1top        Falsefreq      500000`

Interestingly, even with a random number generator and no check for uniqueness, the generated matrices are all unique. Which is not surprising, after a bit of sense check (there are 5³⁰ (9.31 × 10²⁰) permutation available for a 5x6 matrix, out of which we’re only picking 5 x 10⁵).

My worry on using a string comparison has somehow left me — seeing that the hash is returning a number and not a random string, it will be easy for me to load that entire list in memory as an array and test for uniqueness. I’m sure there are better approaches, but there is only one way to find out!

# Final Thoughts

It’s best to use inbuilt libraries and worry about the consequences, later on. (I’m sure I’ll regret saying this, but you’ll be there to witness it!) However, my greater learning is that it’s always fun to test out your hypothesis and get proven wrong.

My next step is to build a pipeline for uniquely generating these levels in the background and solving them to assign levels.

But that is a topic for another time!

If you enjoyed this and made it so far, I guess you can say HA! in the comments.