Sep 11, 2009

How database response time could slow down web application responses

Ours is a large-scale n-tier web application suite with a very busy database behind it. Recently, our tech boss gave a nice lecture to a colleague about how database response time relates to HTTP request/response handling in a typical web application. I loved the writeup!

It explains how slow database responses could cause increased load on web servers.

Look at the whole system as a black box. A user sends an HTTP
request to it, and then receives a response. Each request has to
generate a response. Users are independent of each other, i.e. if one
sees his browser not responding, the others will not be affected by it.

Before the response is generated, the application server has to do
some work and then send the response back. Not all requests will take
the same amount of time to process. But here is the crucial thing:
over a given time period, the number of requests (input) has to be
equal to the number of responses (output). Otherwise, you will see a
build-up of requests.

Now, during the peak hours of a high usage day, we have at least 1200
active users. They send 50 requests per second. So we need to
process 50 requests per second. On average, we are taking 20ms to
process each request. Now let's see what happens when the system
slows down. Assume the system becomes 5x slower. Each request now
takes 100ms, so we process only 10 requests per second.
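
The arithmetic above can be sketched in a few lines of Python. This is only an illustration, and it assumes the simplest possible model: the system's capacity is one request per service time, so throughput is just the inverse of the average processing time. The function name is made up for this example.

```python
def throughput_per_second(service_time_ms: float) -> float:
    """Requests completed per second, assuming capacity = 1 / service time."""
    return 1000.0 / service_time_ms

normal_rate = throughput_per_second(20)       # 20ms per request, from the text
slowed_rate = throughput_per_second(20 * 5)   # 5x slower: 100ms per request

print(f"normal: {normal_rate:.0f} req/s, slowed: {slowed_rate:.0f} req/s")
```

With 20ms per request this gives the 50 requests per second we need, and with 100ms it drops to 10.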

After 1 second, we will have 40 in-process requests (50 came in, only
10 went out). After 5 seconds we will have 200 in-process requests.
Users who sent those requests
will see their browser "hung up". For simplicity, suppose they just
sit tight and do nothing. Remember, we have 1200 users. 200 of them
are now stuck. But the rest of the users are not aware of it and they
will continue their work.

After 10 seconds, we will end up with 400 in-process requests.
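
The build-up is just the gap between the input rate and the output rate, multiplied by time. Here is a rough sketch of that calculation, assuming both rates stay constant (50 in, 10 out, as in the text); the names are invented for the example.

```python
ARRIVAL_RATE = 50  # requests per second sent by users (from the text)
SLOW_RATE = 10     # requests per second completed after the 5x slowdown

def backlog(seconds: int,
            arrival_rate: int = ARRIVAL_RATE,
            service_rate: int = SLOW_RATE) -> int:
    """In-process requests after `seconds`, assuming constant rates."""
    # The backlog grows by (input - output) every second and never goes negative.
    return max(0, (arrival_rate - service_rate) * seconds)

for t in (1, 5, 10):
    print(f"after {t:>2}s: {backlog(t)} in-process requests")
```

When the service rate matches the arrival rate, the function returns 0: no build-up, which is the healthy case described earlier.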

Now let's dig a little deeper into the black box. Here is the app
server and the DB server. For each request, the app server needs a
connection to the DB server. But there is an upper limit on the
number of connections between the app and DB servers. Let's assume
the upper limit is 300. We have accumulated 400 requests so far. 300
of them are being worked on in the DB. The remaining 100 are "held up" by
the app server, waiting for a free ticket (connection) to the DB.

After 20 seconds, we have 800 pending requests. Of them, 300 are being
worked on in the DB and 500 are held up by the app server.

After 30 seconds, we have 1200 pending requests, one from each user.
The app server is holding up 900 requests.
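
The split between requests inside the DB and requests waiting at the app server can be sketched like this. It assumes the connection limit of 300 used in the text and that every pending request wants exactly one connection; the function name is hypothetical.

```python
MAX_DB_CONNECTIONS = 300  # upper limit assumed in the text

def split_pending(pending: int, max_connections: int = MAX_DB_CONNECTIONS):
    """Return (in_db, held_by_app_server) for a number of pending requests."""
    in_db = min(pending, max_connections)   # at most one request per connection
    held = pending - in_db                  # the rest wait for a free connection
    return in_db, held

for pending in (400, 800, 1200):
    in_db, held = split_pending(pending)
    print(f"{pending} pending: {in_db} in the DB, {held} held by the app server")
```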

How long can the app server hold up requests? Not for too long.
Within about 2 minutes, the browsers will start reporting errors to
the user saying something like "The server is not responding ...".

You see, the situation is already not good, even if users don't do
anything at all except their normal work.

In reality, users will not sit tight. After 10-20 seconds of
unresponsiveness, they will start hitting reload. Each reload will
send another new request.
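
To get a feel for how much worse reloads make things, here is a very rough sketch. It assumes, purely for illustration, that the server keeps working on the original request after the user reloads, so every reload adds one more pending request on top of it. The function name and the reload counts are made up.

```python
def pending_with_reloads(stuck_users: int, reloads_per_user: int) -> int:
    """Pending requests if each stuck user reloads `reloads_per_user` times.

    Assumption: abandoned requests are not cancelled on the server side.
    """
    # One original request per stuck user, plus one per reload.
    return stuck_users * (1 + reloads_per_user)

# 400 stuck users (the 10-second mark in the text), each reloading once or twice.
print(pending_with_reloads(400, 1))
print(pending_with_reloads(400, 2))
```

Under that assumption, a single reload per stuck user doubles the pending requests without adding a single real user.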

We enjoyed reading it :-)
