STACKED — THE PUZZLE GAME
Performance comparison of lookups in Python vs. JavaScript
[After publishing this, I asked senior Python devs about it, and they told me I messed up in two areas: first, time.time() is not accurate for micro-benchmarks, and second, Python has specialized storage options for these kinds of lookups. I have since amended the code and updated the findings at the bottom of this article.]
Following the exploration of the best way to identify UIDs for the generated matrices, I wanted to see which was actually faster: an int lookup or a str lookup, and if so, by how much. While doing that, I got side-tracked and wanted to see if JS (Node) does it better (SPOILER: it does not!)
The Results
There is a significant performance difference between JS and Python, especially for integer lookups in an array. The rest of the data makes it clear that JS is faster.
The Code
The entire code is available on GitHub (src).
Python
import random
import time

# We want to build lists of 16-digit integers and strings
min_range_start = 1000000000000000
max_range_start = min_range_start * 10 - 1

# Generate the number and string lists
num_list = [random.randint(min_range_start, max_range_start) for _ in range(100000)]
str_list = [str(random.randint(min_range_start, max_range_start)) for _ in range(100000)]

# Look for a random number in the int list 100k times.
num_time_list = []
for _ in range(100000):
    x = random.randint(min_range_start, max_range_start)
    t1 = time.time()
    _ = x in num_list
    t2 = time.time()
    num_time_list.append(int((t2 - t1) * 1000 * 1000))  # micro-seconds

## similarly for the str list:
str_time_list = []
for _ in range(100000):
    x = str(random.randint(min_range_start, max_range_start))
    t1 = time.time()
    _ = x in str_list
    t2 = time.time()
    str_time_list.append(int((t2 - t1) * 1000 * 1000))
We’re using the time.time() function and multiplying by 10⁶ to get the elapsed time in microseconds. We’re also casting to int to get round numbers.
Javascript
We’re doing the exact same thing as in Python, but splitting things up into functions to enforce serial execution order (using async/await).
Please find the code on GitHub — it is heavily commented and self-explanatory.
One thing to note is browser performance — I found that the in-browser timing function was not suitable at all, and performance was marginally slower. That code has been deleted from the main repo, but you can find it in the init0 commit of the main branch.
Benchmarking System
I used a MacBook Air (M1) for this. I ran the tests both in battery-saver and full-performance mode, just to see if there was any difference. There was no reportable difference.
Why does this happen?
In more technical terms, Node’s V8 engine, which runs the JS code, is a virtual machine that compiles the code to machine code at run time. Python, on the other hand, is an interpreted language (with a Global Interpreter Lock, the GIL); it isn’t compiled to machine code ahead of execution.
Meaning:
Instead of translating source code to machine code like C++, Python translates code to bytecode. This bytecode is a low-level set of instructions that can be executed by an interpreter. On most PCs, the Python interpreter is installed at /usr/local/bin/python3.8. Instead of executing the instructions on the CPU, bytecode instructions are executed on a Virtual Machine.
Source: “How does Python work?” by Dhruvil Karani, Towards Data Science
Ultimately, in my simple/non-technical understanding: with Node (JS), code gets converted to machine code, which the machine can understand directly. Python gets converted to something else (bytecode), which is then executed by something else (the interpreter’s virtual machine) to produce the machine instructions. That one additional step costs the language 0.5 ms of processing time.
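You can actually see this intermediate step for yourself: Python’s built-in dis module prints the bytecode a function compiles to. A small illustration (not part of the original benchmark):

```python
import dis

def lookup(x, items):
    # A membership test compiles to a handful of bytecode
    # instructions, each dispatched by the interpreter loop.
    return x in items

# Prints instructions such as LOAD_FAST for the two arguments,
# followed by the containment test and a RETURN instruction.
dis.dis(lookup)
```

Every one of those instructions goes through the interpreter’s dispatch loop at run time, whereas V8’s JIT would have emitted machine code for the equivalent JS.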
Marton Trencseni notes:
In general, one possible explanation for a Python list to be slower than a JS array is that Python will always implement a list as a linear structure (array or linked list), so searching will be O(n). JS at the language level supports sparse arrays where you can set an arbitrary index to some value, so internally it chooses between a linear structure or a dict-like structure (eg. hashmap). If it chose the latter, “lookups” would be super-fast (O(1) for hashmap, O(logn) for trees)
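His point is easy to verify in Python itself: converting the list to a set (a hash table) turns the O(n) scan into an O(1) probe. A quick sketch, reusing the ranges from above (the probe count is kept small here for illustration):

```python
import random
import timeit

min_range_start = 1000000000000000
max_range_start = min_range_start * 10 - 1

num_list = [random.randint(min_range_start, max_range_start) for _ in range(100000)]
num_set = set(num_list)  # hash-based, so membership tests are O(1) on average

# A fixed probe value, picked once so generation isn't timed
probe = random.randint(min_range_start, max_range_start)

# 100 lookups each: the list scans linearly, the set hashes once per probe
list_time = timeit.timeit(lambda: probe in num_list, number=100)
set_time = timeit.timeit(lambda: probe in num_set, number=100)
print(f"list: {list_time * 1e6 / 100:.1f} µs/lookup, set: {set_time * 1e6 / 100:.1f} µs/lookup")
```

On any recent CPython, the set lookups come out orders of magnitude faster than the list scan.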
Next Steps
[This part flows from the previous few articles on the next version of the Stacked App. You can find the list here.]
I’m really tempted to set up Node as a server and do the unique-ID lookups through it, but I’m guessing the microseconds I save would be lost in the inter-process communication. Still, I got the answer I was looking for — int lookups are marginally faster than str lookups in large lists.
I have a very tempting itch to rewrite this in C/C++/C# and see if there are any more performance gains. But that diverts me from my main goal → launch V2 of the app. I may do it in my spare time. Or one of my extra-special data-scientist friends/readers can do it for me and post it as a comment (:wink: :wink:)
Updates
Well, obviously, I was wrong — Python can be faster than Node.js.
Here are the results of the subsequent coding & testing:
What went wrong?
First, my timing approach was not proper — in Python, I would take a timestamp before starting the comparison, do the lookup, and then take another timestamp. The difference was the time taken to complete the lookup. The ‘wall-clock’ time, if you will…
But that’s not really accurate — apparently, the time.time() call itself carries overhead and has limited resolution. The better way to do this is to use the timeit module.
I found no significant difference, though:
The only difference in time capture shows up in the NumPy int lookup, where the timeit function consistently returns single-digit processing times while time.time() returns double digits.
Second, it’s not that the approach was wrong so much as I wasn’t using the right tools for the job. NumPy arrays are implemented in C, so lookup performance is on par with a C/C++ implementation. And it shows:
Surprisingly, str lookups are faster (at least according to the time.time() method). This is obviously faster than Node.js (which averaged 64 µs for an int lookup).
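For reference, here is a minimal sketch of the kind of NumPy membership test involved — the array size and probe value are illustrative; the actual benchmark code is in the repo:

```python
import numpy as np

rng = np.random.default_rng(42)
min_range_start = 1000000000000000

# 100k random 16-digit integers, stored as a contiguous int64 array
arr = rng.integers(min_range_start, min_range_start * 10, size=100000, dtype=np.int64)

probe = np.int64(5237463704375238)
# The comparison and .any() run as vectorised C loops over the
# contiguous buffer, which is where the speedup over a
# pure-Python list scan comes from.
found = bool((arr == probe).any())
```

Because the whole scan happens inside compiled code, the per-lookup interpreter overhead that dominated the plain-list version disappears.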
Overall histogram comparison:
On the code side:
import timeit

# This is the setup statement: it runs once per repeat and is not timed
stp = '''
import random
min_range_start = 1000000000000000
max_range_start = min_range_start * 10 - 1
tt_num_list = [random.randint(min_range_start, max_range_start) for i in range(100000)]
'''

# This is the statement to time, run `number` times per repeat
stm = '''
1 if 5237463704375238 in tt_num_list else 0
'''

## The reason we're using a fixed, pre-selected number
## and not generating one on the fly is that generation would
## factor into the performance measurement.
## Even pre-generating numbers in a list and indexing into it
## still eats into the test.
## (Our time.time() test had the number picked before t1 was stored.)

# Function params: stmt = statement to run repeatedly,
# repeat = number of runs, number = statement executions per run.
# i.e. the line below performs 1000 runs, and each run executes
# the statement 10 times.
times_int = timeit.repeat(stmt=stm, setup=stp, repeat=1000, number=10)

# timeit.repeat records the total time spent running the statement
# 10 times per run; the output is a list of 1000 per-run totals, in seconds.
## Convert to micro-seconds and divide by `number` for a per-statement average:
times_int = [i * 1000 * 1000 / 10 for i in times_int]
# So if the nth raw value was 0.4s, the nth run of 10 statement
# executions took 0.4s, i.e. 0.04s (40 ms) per statement.
# Also note that we only have 1000 records from this.
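Summarising those 1000 per-run averages is then straightforward with the standard library. The values below are dummy stand-ins so the snippet is self-contained; the article’s actual results are reported as histograms:

```python
import statistics

# times_int would hold 1000 per-statement averages in micro-seconds;
# dummy values stand in here so the snippet runs on its own.
times_int = [42.0, 40.5, 41.2, 43.8, 39.9]

print(f"mean:   {statistics.mean(times_int):.2f} µs")
print(f"median: {statistics.median(times_int):.2f} µs")
print(f"stdev:  {statistics.stdev(times_int):.2f} µs")
```

The median is usually the more robust number to quote for micro-benchmarks, since occasional scheduler hiccups skew the mean upwards.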
What have I learned from this?
NumPy is a beast! Also: ask for help from industry experts.
Thank you for reading!