How does a Python web server overcome the GIL?


You usually have many workers (e.g. with Gunicorn), each being dispatched independent requests. Everything else that is concurrency-related is handled by the database, so it is abstracted away from you.

You don’t need IPC; you just need a “single source of truth”, which will be the RDBMS, a cache server (Redis, Memcached), etc.
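Why a single source of truth rather than in-process state? Because each worker is a separate OS process with its own memory. A minimal sketch, using `os.fork` (POSIX-only) as a stand-in for pre-forked workers; the `fork_demo` name and the counter are illustrative, not from any particular server:

```python
import os

def fork_demo():
    counter = {"hits": 0}     # in-process state, copied into each worker
    pid = os.fork()           # each worker is a separate process
    if pid == 0:              # child "worker" handles a request
        counter["hits"] += 1  # mutates only the child's private copy
        os._exit(0)
    os.waitpid(pid, 0)
    return counter["hits"]    # parent's copy is untouched: still 0
```

Since the increment is invisible outside the worker that made it, any state that must be shared across requests has to live in an external store such as the RDBMS or Redis.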


First of all, requests can be handled independently. However, servers want to handle them simultaneously in order to maximize the number of requests served per unit of time.

How this concurrency is implemented depends on the web server.

Some implementations use a fixed number of threads or processes for handling requests. If all of them are busy, additional requests have to wait until one becomes free.
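A fixed-size pool can be sketched with the standard library's `ThreadPoolExecutor`; the `handle_request` and `serve_pool` names are illustrative, not from any particular server:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handle_request(req_id):
    time.sleep(0.01)  # simulated I/O; the GIL is released while waiting
    return "response %d" % req_id

def serve_pool(requests, pool_size=2):
    # With pool_size workers, requests beyond the first pool_size wait
    # in the executor's internal queue until a worker becomes free.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        return list(pool.map(handle_request, requests))
```

With `pool_size=2`, only two requests are ever in flight; the rest queue up, which bounds memory use at the cost of latency under load.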

Another possibility is to spawn a process or thread for each request. Spawning a process per request incurs enormous memory and CPU overhead; spawning lightweight threads is better. That way, you can serve hundreds of clients per second. However, threads also bring management overhead, which manifests itself in high memory and CPU consumption.
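The thread-per-request variant can be sketched as follows (again, `serve_thread_per_request` is a made-up name for illustration):

```python
import threading

def serve_thread_per_request(n_requests):
    # One thread per request: cheap to start compared with a process,
    # but each thread still costs stack memory and scheduler time, so
    # this approach does not scale to tens of thousands of connections.
    results, lock = [], threading.Lock()

    def handle(req_id):
        with lock:  # protect the shared results list
            results.append(req_id)

    threads = [threading.Thread(target=handle, args=(i,))
               for i in range(n_requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)
```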

For serving thousands of clients per second, an event-driven architecture based on asynchronous coroutines is the state-of-the-art solution. It enables the server to serve clients at a high rate without spawning zillions of threads. On the Wikipedia page of the so-called C10k problem you will find a list of web servers, many of which make use of this architecture.
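The same idea in standard-library terms: `asyncio` multiplexes many coroutines onto a single thread. The `handle_async` body below is a placeholder for real socket I/O, not an actual server:

```python
import asyncio

async def handle_async(req_id):
    await asyncio.sleep(0)  # yields to the event loop, as socket I/O would
    return req_id * 2       # placeholder "response"

async def serve_async(n_clients):
    # One coroutine per client: thousands fit in a single thread, since
    # each coroutine is a small Python object, not an OS thread.
    return await asyncio.gather(*(handle_async(i) for i in range(n_clients)))
```

Because coroutines only switch at `await` points, there is no per-connection stack and far less scheduling overhead than with threads.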

Coroutines are available for Python, too. Have a look at http://www.gevent.org/. That’s why a Python WSGI app based on, e.g., uWSGI + gevent is an extremely performant solution.
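Note that the WSGI application itself stays trivially simple; the concurrency comes from the server (uWSGI + gevent) around it. A minimal sketch:

```python
def app(environ, start_response):
    # Minimal WSGI application. Under uWSGI + gevent, each invocation
    # runs in its own greenlet; the application code is unchanged.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello, world!\n"]
```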


It runs as normal. Web serving is mostly I/O-bound, and CPython releases the GIL during I/O operations. So either threading is used without any special accommodation, or an event loop (such as Twisted) is used.
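That the GIL is released during blocking calls can be observed directly. A hypothetical timing sketch, using `time.sleep` as a stand-in for a blocking socket read:

```python
import threading
import time

def blocking_io():
    time.sleep(0.1)  # C-level wait: CPython releases the GIL here

def timed_threads(n=4):
    # If the GIL were held during sleep, n threads would take n * 0.1 s;
    # because it is released, they overlap and finish in about 0.1 s.
    threads = [threading.Thread(target=blocking_io) for _ in range(n)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start
```

The same overlap happens with real socket reads and database queries, which is why plain threading works well for I/O-bound web workloads despite the GIL.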
