Benchmarking Python 3, Flask, and Heroku

More semantics...

Recently, I have been picking up Python with renewed interest now that a lot of the libraries I'm interested in have stable Python 3 releases. Perhaps, again, it is conditioning, but I really love a good IDE when using a language, especially one that gets out of my way and lets me be productive. For me, the biggest advantage of a good IDE is object/class/library browsing. For this example, I used Visual Studio 2015 Community and the Python Tools for Visual Studio to set up my Flask project, configure my virtual environment, and manage my requirements.txt, all without a hitch.

You can find the source code for this project on GitHub.

In my last post, I did some benchmarking of Go, Gin, and Heroku. I chose Heroku because it is fairly trivial to spin up a new Heroku app: just install the toolbelt, and inside your git repo, run:

$ heroku create
$ git push heroku master

Another interesting aspect of Heroku is that we are deploying our software to a shared PaaS environment on top of some sophisticated virtualization, containerization, and service orchestration layers. Your compute resources are shared, and this can seriously affect the performance characteristics of your software.

(I understand I'm not saying anything revolutionary about Heroku, but these comments are for readers who might be entirely new to web development and to using a Platform as a Service like Heroku or Azure.)

Keep doing one thing well

Again, I am a huge proponent of the first axiom of UNIX philosophy:

Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new "features".

Therefore, from a business perspective, we can roughly state that our Python3/Flask example fulfills the same business objectives as our previous Go/Gin example. They both produce a JSON object containing an "rng" key and a random number generated by a Mersenne Twister PRNG.

As stated previously, the Mersenne Twister is often used in regulated gaming (casinos/Class III, Class II, etc.) as an acceptable software pseudorandom number generator. Gaming has different P/RNG needs than, say, cryptography, so a deterministic algorithm like the Mersenne Twister is perfectly acceptable: labs can verify the randomness of the PRNG over a set number of iterations.

In our Go, Gin, and Heroku example, we used an implementation of the Mersenne Twister written by GitHub user @CasualSuperman. In Python 3's standard library, the Mersenne Twister is the default PRNG in random.random():

Almost all module functions depend on the basic function random(), which generates a random float uniformly in the semi-open range [0.0, 1.0). Python uses the Mersenne Twister as the core generator. It produces 53-bit precision floats and has a period of 2**19937-1. The underlying implementation in C is both fast and threadsafe. The Mersenne Twister is one of the most extensively tested random number generators in existence. However, being completely deterministic, it is not suitable for all purposes, and is completely unsuitable for cryptographic purposes.
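To make the "completely deterministic" part concrete, here is a minimal sketch using only the standard library. The helper name sample and the seed values are made up for illustration; random.Random is the standard library's Mersenne Twister generator class.

```python
import random


def sample(seed, count=5):
    """Return `count` floats from a Mersenne Twister seeded with `seed`."""
    # A private generator instance, so the module-level state used by
    # random.random() is left untouched.
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]


first = sample(2015)
second = sample(2015)

# The same seed reproduces the same sequence, which is what lets a lab
# re-run and verify a generator's output.
print(first == second)
```

This reproducibility is exactly why the generator is a poor fit for cryptography. For that use case the standard library offers random.SystemRandom and the secrets module, which draw from the operating system's entropy source instead.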

Interpreted vs. Compiled

The standard, default implementation of Python, CPython, provides a relatively fast execution environment for our Python code. Since the Mersenne Twister underneath Lib\random.py is implemented in C (as noted above), I was certain that we would see some impressive results for an interpreted language. Python did not disappoint. In the future, I'd like to try Jython due to the amazing optimization characteristics of long-running software on top of the JVM.

Go, on the other hand, is interesting due to its garbage collection in addition to being compiled to machine code.

So, while from a technical perspective, comparing an interpreted Python application to a compiled Go application may seem unfair, I tend to look at tooling comparisons, at a high level, from a business and operations perspective.

At the end of the day, I really like creating durable software that doesn't require a high level of operational maintenance and babysitting. As noted above, from a business perspective, the Python3/Flask and Go/Gin demo micro services fulfill the same business function.

So, the question becomes, is Flask better than Gin at reducing operating expenses (OpEx) due to reduced compute resources needed to serve our application at scale? And, do those OpEx savings come with hidden costs in engineering time?

Some code

Here is the small snippet that handles the default route serving up our JSON:

import random

from flask import jsonify
from demoservice_flask import app


@app.route('/')
def generate_rng():
    """Renders a random number using Python's built-in Mersenne Twister."""
    # jsonify builds a JSON response body of the form {"rng": <float>}.
    return jsonify(rng=random.random())
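If you want to poke at the route without deploying anything, something like the following should work. It is a self-contained sketch rather than the project's actual layout: it creates its own app object as a stand-in for the one imported from demoservice_flask, and fetch_rng is a hypothetical helper added only for this illustration. It relies on Flask's built-in test client, which calls the route in-process.

```python
import random

from flask import Flask, jsonify

# Stand-in for the `app` object that the post imports from demoservice_flask.
app = Flask(__name__)


@app.route('/')
def generate_rng():
    """Renders a random number using Python's built-in Mersenne Twister."""
    return jsonify(rng=random.random())


def fetch_rng():
    """Call the route in-process and return (status code, parsed JSON body)."""
    # Flask's test client needs no running server or open port.
    client = app.test_client()
    response = client.get('/')
    return response.status_code, response.get_json()


if __name__ == '__main__':
    status, body = fetch_rng()
    print(status, body)
```

The body is a single-key JSON object, with "rng" mapped to a float in the range [0.0, 1.0).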

Meat and Potatoes

We ran the same 5 test conditions, using Loader.io, as we did for our Go/Gin test.

Test 1

View on loader.io

Test 2

View on loader.io

Test 3

View on loader.io

Test 4

View on loader.io

Test 5

View on loader.io

Not even close

Not surprisingly, the Go/Gin performance (over 300,000 responses without a single HTTP 4xx/5xx error) completely dwarfed the best Python3/Flask result of 49,322 responses in 1 min.
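For a rough sense of scale, the figures above can be turned into per-second rates. This is back-of-the-envelope arithmetic only. It assumes the Go/Gin run used the same 1-minute window as the Flask run, and it treats "over 300,000" as a lower bound.

```python
# Figures quoted in the text above.
FLASK_BEST_RESPONSES = 49_322        # best Python3/Flask run
GIN_RESPONSES_LOWER_BOUND = 300_000  # "over 300,000" for Go/Gin, so a lower bound
TEST_DURATION_SECONDS = 60           # assumption: both runs lasted 1 minute


def per_second(responses, seconds=TEST_DURATION_SECONDS):
    """Average responses per second over the test window."""
    return responses / seconds


flask_rps = per_second(FLASK_BEST_RESPONSES)
gin_rps = per_second(GIN_RESPONSES_LOWER_BOUND)
ratio = GIN_RESPONSES_LOWER_BOUND / FLASK_BEST_RESPONSES

print(f"Flask: ~{flask_rps:.0f} responses/s")
print(f"Gin:   at least ~{gin_rps:.0f} responses/s")
print(f"Gin served at least {ratio:.1f}x as many responses")
```

Under those assumptions, that is roughly 822 responses per second for Flask against at least 5,000 for Gin, or about a sixfold gap.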

These tests could be more accurate, but they serve as a high-level, exploratory look at how each stack performs on a single task. Properly load testing a web service requires testing every interaction between endpoints in the system.

Should I use Flask or Gin?

Like most aspects of software development and engineering, the answer is:

It depends.

Performance isn't the only useful metric by which to choose your tools. Computation time is usually far less expensive than developers' time.

There are pros and cons to each language and framework. Compared to Python, Go is rather young, and while there are many libraries, answers on StackOverflow, and blogs, the simple fact of the matter is that, today, Python has a much larger community.

And community matters when scouting for talent.

However, it may be that Go's performance characteristics are a huge benefit in your problem domain, or give you an edge over your competitors, especially at scale.

Then again, if a team can't get features to market fast enough, it doesn't really matter how fast their code runs.

The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming.

Donald Knuth, "Computer Programming as an Art" (1974)

The performance characteristics of Go may be a premature optimization for your project. Then again, maybe not.

It depends.