How can gunicorn handle hundreds of thousands of requests per second for django?


Now the question is: what number of threads and workers can serve hundreds or thousands of requests per second? Let’s say I have a dual-core machine and I set 5 workers and 8 threads. Can I then serve 40 concurrent requests?

Yes, with 5 worker processes, each with 8 threads, 40 concurrent requests can be served. How quickly they’ll be served on a dual-core box is another question.
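For reference, that setup could be written as a gunicorn config file roughly like this. It is only a sketch that mirrors the numbers from the question, not a recommendation for a dual-core box; `workers`, `threads` and `worker_class` are regular gunicorn settings, and the values are the asker's.

```python
# gunicorn.conf.py -- sketch using the numbers from the question
workers = 5                 # worker processes
threads = 8                 # threads per worker process
worker_class = "gthread"    # threaded worker type, used when threads > 1

# Upper bound on requests in flight at once: workers * threads = 5 * 8 = 40
```

The same settings can be passed on the command line as `--workers 5 --threads 8`.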

If I am going to serve hundreds or thousands of requests, will I need a hundred cores?

Not quite. Requests per second is not the same as "concurrent requests".

If each request takes exactly 1 millisecond to handle, then a single worker can serve 1000 RPS. If each request takes 10 milliseconds, a single worker dishes out 100 RPS.
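The arithmetic behind those numbers, as a small sketch. The function name `max_rps` is made up for illustration, and it assumes every request takes the same time and that a worker handles requests strictly one after another.

```python
def max_rps(latency_ms: float, workers: int = 1) -> float:
    """Best-case requests per second when each worker handles one
    request at a time and every request takes latency_ms milliseconds."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    # One worker finishes 1000 / latency_ms requests in one second.
    return workers * 1000 / latency_ms


one_ms = max_rps(1)     # 1 ms per request, single worker
ten_ms = max_rps(10)    # 10 ms per request, single worker
```

Real throughput will be lower than this, since it ignores network overhead, the GIL, and contention for the CPU.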

If some requests take 10 milliseconds and others take, say, up to 5 seconds, then you’ll need more than one concurrent worker, so that the one request that takes 5 seconds does not "hog" all of your serving capability.
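A rough way to estimate how much concurrency a mix like that needs is Little's law: average requests in flight = arrival rate × average time per request. The sketch below applies it to made-up traffic (100 RPS, 99% of requests at 10 ms, 1% at 5 s); those figures and the function name `required_concurrency` are assumptions for illustration only.

```python
import math


def required_concurrency(rps: float, mix: list[tuple[float, float]]) -> int:
    """Estimate the average number of busy worker slots via Little's law.

    mix is a list of (fraction_of_requests, latency_seconds) pairs
    whose fractions are expected to sum to 1.
    """
    mean_latency = sum(fraction * latency for fraction, latency in mix)
    # Little's law: in-flight requests = arrival rate * mean latency.
    return math.ceil(rps * mean_latency)


# Hypothetical traffic: mostly fast requests, a few very slow ones.
slots = required_concurrency(100, [(0.99, 0.010), (0.01, 5.0)])
```

This gives an average, so it is a floor rather than a target: bursts of slow requests need headroom on top of it.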


