Your web application is likely rendering responses for requests whose client has already disconnected. Eric Wong helped us devise a patch for the Unicorn web server that tests the client connection before calling the application, effectively dropping disconnected requests before any app server rendering time is wasted on them.
The Flash Sale
A common traffic pattern we see at Shopify is the flash sale, where a product is discounted heavily or only available for a very short period of time. Our customers' flash sales can cause traffic spikes an order of magnitude above our typical traffic rate.
This blog post highlights one of the problems in dealing with these traffic surges, which we solved during our preparation for the holiday shopping season.
In a flash sale scenario, with our app servers under high load, response time grows. As our response time increases, customers attempting to buy items will hit refresh in frustration. This was causing a snowball effect that would contribute to reduced availability.
Connection Queues
Each of our application servers runs Nginx in front of many Unicorn workers running our Rails application. When Nginx receives a request, it opens a queued connection on the shared socket that is used to communicate with Unicorn. The Unicorn workers work off requests in the order they're placed on the socket’s connection backlog.
The worker process looks something like:
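As a rough sketch (not Unicorn's actual code), the three steps of such a loop can be written in Ruby as follows. The `render` method is a hypothetical stand-in for the Rails application, and `serve_one` handles a single connection from the backlog.

```ruby
require "socket"

# Hypothetical stand-in for the Rails application: builds a complete
# HTTP response for the given request line.
def render(request_line)
  path = request_line.to_s.split[1] || "/"
  body = "You asked for #{path}\n"
  "HTTP/1.1 200 OK\r\nContent-Length: #{body.bytesize}\r\n" \
    "Connection: close\r\n\r\n#{body}"
end

# Serve exactly one connection from the listening socket's backlog.
def serve_one(listener)
  client = listener.accept           # 1. take the next queued connection
  request_line = client.gets         #    e.g. "GET / HTTP/1.1\r\n"
  loop do                            #    skip headers up to the blank line
    line = client.gets
    break if line.nil? || line == "\r\n"
  end
  response = render(request_line)    # 2. call the application (the slow part)
  client.write(response)             # 3. write the response to the socket
ensure
  client&.close
end
```

A real worker would call `serve_one` in an endless loop. Note that nothing in this loop checks whether the client is still there before step 2.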
The second step takes the bulk of the time spent processing a request. Under load, the queue of pending requests sitting on the UNIX socket from Nginx grows until it reaches maximum capacity (SOMAXCONN). When the queue reaches capacity, Nginx will immediately return a 502 to incoming requests as it has nowhere to queue the connection.
Pending Requests
While the app worker is busy rendering a request, the pending requests in the socket backlog represent users waiting for a result. If a user hits refresh, their browser closes the current connection and their new connection enters the end of the queue (or Nginx returns a 502 if the queue is full). So what happens when the application server gets to the user's original request in the queue?
Nginx and HTTP 499
The HTTP 499 response code is not part of the HTTP standard. Nginx logs this response code when a user disconnects before the application has returned a result. Check your logs: an abundance of 499s is a good indication that your application is too slow or over capacity, as people are disconnecting instead of waiting for a response. Your Nginx logs will always have some 499s, because some clients disconnect before even a quick request finishes.
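One way to check is to tally status codes from the access log. The sketch below assumes Nginx's default "combined" log format, where the status code is the first field after the quoted request line. The method name `status_counts` is made up for this example.

```ruby
# Count responses by status code in Nginx access log lines.
# Assumption: the "combined" log format, i.e.
#   ... "GET /path HTTP/1.1" 499 0 "referer" "user agent"
def status_counts(lines)
  counts = Hash.new(0)
  lines.each do |line|
    match = line.match(/" (\d{3}) /)   # first 3-digit field after a closing quote
    counts[match[1]] += 1 if match     # lines that don't match are ignored
  end
  counts
end

# Usage (the log path is an assumption; adjust it to your setup):
#   counts = status_counts(File.foreach("/var/log/nginx/access.log"))
#   puts "499s: #{counts["499"]} of #{counts.values.sum} requests"
```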
[Figure: HTTP 200 vs HTTP 499 responses during a flash sale]
When Nginx logs an HTTP 499 it also closes its upstream connection to the application, but it is up to the application to detect the closed connection before wasting time rendering a page for a client who has already disconnected.
Detecting Closed Sockets
With the asynchronous nature of sockets, detecting a closed connection isn't straightforward. Your options are:
- Call select() on the socket. If the connection is closed, the socket is reported as readable ("data available"), but a subsequent read() call returns end-of-file or fails.
- Attempt to write to the socket.
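For illustration, the first option might look like this in Ruby. This is a sketch with a made-up method name, `closed_by_peer?`. It polls with a zero timeout and peeks at the socket without consuming data. One caveat: it can only report a close once any unread request bytes have been consumed, because pending data also makes the socket readable.

```ruby
require "socket"

# Returns true if the peer has closed the connection and no unread data
# remains. Uses a zero-timeout select followed by a non-destructive peek.
def closed_by_peer?(sock)
  readable, = IO.select([sock], nil, nil, 0)
  return false unless readable            # nothing pending: assume still open
  data = sock.recv_nonblock(1, Socket::MSG_PEEK)
  # At end-of-file, older Rubies return "" and newer ones return nil.
  data.nil? || data.empty?
rescue IO::WaitReadable
  false                                   # spurious wakeup, nothing to read
rescue EOFError, Errno::ECONNRESET
  true
end
```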
Unfortunately, it is typical for web applications to find out that the client socket is closed only after spending the time and resources rendering the page, when they attempt to write the response. This is what our Rails application was doing. The net effect was that every time a user pressed refresh, we would render that page, even if the user had already disconnected. This would cause a snowball effect until eventually our app workers were doing little but rendering pages and throwing them away, and our service was effectively down.
What we wanted to do was test the connection before calling the application, so we could filter out closed sockets and avoid wasting time. The first detection option above is not great: select() requires a timeout, and generally select() with even the shortest timeout will take a fraction of a millisecond to complete. So we went with the second solution: write something to the socket to test it, before calling the application. This is typically the best way to deal with resources anyway: just attempt to use them and there will be an error if there’s something in the way. Unicorn was already acting that way, just not until after wasting time rendering the page.
Just write an 'H'
Thankfully all HTTP responses start with "HTTP/1.1", so (rather cheekily) our patch to Unicorn writes this string to test the connection before calling the application. If writing to the socket fails, Unicorn moves on to process the next request and only a trivial amount of time is spent dealing with the closed connection.
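The idea can be sketched in a few lines of Ruby. This is an illustration, not the code from the Unicorn patch: `client_connected?`, `handle`, and the `app` callable are hypothetical names, and `app` is assumed to return everything that follows the "HTTP/1.1 " prefix. The sketch works on local UNIX sockets, where a write to a closed peer fails immediately. Over TCP, the first write after a disconnect often still succeeds.

```ruby
require "socket"

RESPONSE_START = "HTTP/1.1 ".freeze

# Probe the connection by writing the first bytes of every HTTP response.
# Returns false if the client has already gone away.
def client_connected?(client)
  client.write(RESPONSE_START)
  true
rescue Errno::EPIPE, Errno::ECONNRESET
  false
end

# Call the (expensive) application only if the probe write succeeded.
# `app` is a hypothetical callable returning the rest of the response,
# e.g. "200 OK\r\nContent-Length: 0\r\n\r\n".
def handle(client, app)
  return :skipped unless client_connected?(client)
  client.write(app.call)
  :served
ensure
  client.close
end
```

Because the probe is the real start of the response, nothing has to be undone when the client turns out to be connected: the application's output simply continues where the probe left off.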
Eric Wong merged this change into Unicorn master and soon after released Unicorn V4.5.0. To use this feature you must add 'check_client_connection true' to your Unicorn configuration.
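A minimal configuration fragment might look like the following. The file name, worker count, socket path, and backlog value are placeholders chosen for this example; only the last line is the setting described above.

```ruby
# config/unicorn.rb (placeholder file name)
worker_processes 4                           # placeholder value
listen "/tmp/unicorn.sock", backlog: 1024    # placeholder path and backlog

# Test the client connection before calling the application.
check_client_connection true
```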