Wednesday, August 01, 2018

Optimizing Wireshark for HTTP Analysis/Troubleshooting

I spend quite a bit of time troubleshooting various web applications, so I've done a lot of work with Wireshark's HTTP display filters. The ones I use most frequently are:

  • http.response.code - In an HTTP response, the numeric response code (200, 404, 500, etc.)
  • http.content_length - In an HTTP response (and in PUT and POST requests), the payload size declared in the Content-Length header
  • http.request.method - The type of request made by the client (e.g. GET, PUT, POST, CONNECT, etc.)
  • http.request_in - For an HTTP response, the packet # of the corresponding request
  • http.response_in - For an HTTP request, the packet # of the corresponding response
  • http.time - For an HTTP response, the elapsed time since the corresponding request

That last filter needs a bit of explanation, because it can be computed in two different ways.  If the TCP preference "Allow subdissector to reassemble TCP streams" is enabled, http.time reflects the time between the request and the last packet of the response (i.e. the end of any data returned); if that preference is disabled, http.time reflects the time between the request and the first packet of the response (i.e. the HTTP response code).  I almost always have that TCP preference enabled, because the client/browser usually can't do anything with the response until it receives all of it!
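
To make the two readings of http.time concrete, here is a minimal Python sketch.  It does not call Wireshark; the Packet class, the http_time function, and the timestamps are all made up for illustration, and the sketch assumes a single request whose response spans three packets.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    number: int        # frame number in the capture (hypothetical)
    timestamp: float   # capture time in seconds (hypothetical)
    kind: str          # "request" or "response"; a response may span packets

def http_time(packets, reassemble):
    """Elapsed time from the request to the response packet being reported on."""
    request = next(p for p in packets if p.kind == "request")
    response = [p for p in packets if p.kind == "response"]
    # With reassembly enabled, the response is attributed to its last segment;
    # with it disabled, to the first segment (the one carrying the status line).
    marker = response[-1] if reassemble else response[0]
    return marker.timestamp - request.timestamp

capture = [
    Packet(10, 0.000, "request"),
    Packet(12, 0.120, "response"),  # status line and headers
    Packet(13, 0.450, "response"),
    Packet(14, 1.500, "response"),  # last segment of the body
]

print(round(http_time(capture, reassemble=True), 3))   # 1.5
print(round(http_time(capture, reassemble=False), 3))  # 0.12
```

With these invented numbers, the same transaction shows up as 1.5s or 0.12s depending on the preference, which is why it matters to know which setting produced a capture's http.time values.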

So, one could work through a Wireshark session, using display filters like http.response.code==404, http.content_length > 4096, or http.time > 2.0 to display various packets...but, to be honest, I'd rather not do that much typing.  So, I set out to optimize Wireshark's performance and tweak its display for HTTP analysis.

The most significant performance optimization one can implement in Wireshark is to disable dissection of irrelevant protocols.  By default, Wireshark tries to dissect every protocol it can identify...and there are hundreds of protocols in its dissection engine.  Keep in mind, though, that one needs visibility at all layers of the network stack; for most work in my environment, I need Wireshark to dissect Ethernet, IPv4, TCP, UDP, ICMP, SSL/TLS, HTTP, and a handful of other protocols.  I disabled dissection of everything else.
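
As a rough sketch of the "keep a short list, disable the rest" idea, the Python below works out which protocols would be disabled.  It assumes the tab-separated layout printed by `tshark -G protocols` (name, short name, filter name); the KEEP set, the function name, and the sample text are illustrative, and the filter names in KEEP are guesses at the usual ones ("ssl" on older versions, "tls" on newer ones).

```python
# Filter names for the protocols named in the text. These are assumptions;
# check them against the protocol list of your own Wireshark version.
KEEP = {"eth", "ip", "tcp", "udp", "icmp", "ssl", "tls", "http"}

def protocols_to_disable(protocol_dump, keep=KEEP):
    """Return filter names found in the dump that are not in `keep`."""
    disable = []
    for line in protocol_dump.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        filter_name = fields[2]  # assumed layout: name, short name, filter name
        if filter_name not in keep:
            disable.append(filter_name)
    return sorted(disable)

# Hand-written sample in the assumed layout, not real tshark output.
sample = "\n".join([
    "Hypertext Transfer Protocol\tHTTP\thttp",
    "Transmission Control Protocol\tTCP\ttcp",
    "Server Message Block Protocol\tSMB\tsmb",
    "Network File System\tNFS\tnfs",
])
print(protocols_to_disable(sample))  # ['nfs', 'smb']
```

The resulting names are only a worklist; the Enabled Protocols dialog in Wireshark is the safe place to apply them, since the on-disk format of a profile's protocol settings may differ between versions.
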

Now, to the GUI tweaking.  First, I created filter buttons for general display filters, so that I can apply them with a single click:

  • http - apply http - display all packets identified as HTTP
  • 1xx - apply http.response.code < 200 - display responses with informational codes
  • 2xx - apply http.response.code > 199 && http.response.code < 300 - display responses with success codes
  • 3xx - apply http.response.code > 299 && http.response.code < 400 - display responses with redirection codes
  • 4xx - apply http.response.code > 399 && http.response.code < 500 - display responses with client error codes
  • 5xx - apply http.response.code > 499 - display responses with server error codes
  • >2s - apply http.time > 2.0 - display all responses that required more than 2s to complete
  • GETs - apply http.request.method=="GET" - display all GET requests
  • POSTs - apply http.request.method=="POST" - display all POST requests
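
The button list above can also be written down as data, which is handy for applying the same filters from the command line.  This is a sketch under assumptions: BUTTONS, status_class_filter, and tshark_command are hypothetical names, the capture file name is invented, and the status-class filters use >= with both bounds (equivalent to the > 199 / < 300 style above for 2xx through 4xx, slightly stricter for 1xx and 5xx).  The command is only built, not run.

```python
def status_class_filter(hundreds):
    """Display filter matching one class of response codes, e.g. 4 -> 4xx."""
    low, high = hundreds * 100, hundreds * 100 + 100
    return f"http.response.code >= {low} && http.response.code < {high}"

# Button label -> display filter, mirroring the list above.
BUTTONS = {
    "http": "http",
    ">2s": "http.time > 2.0",
    "GETs": 'http.request.method=="GET"',
    "POSTs": 'http.request.method=="POST"',
}
BUTTONS.update({f"{n}xx": status_class_filter(n) for n in range(1, 6)})

def tshark_command(capture_file, label):
    """Build (not run) a tshark command that applies one button's filter."""
    return ["tshark", "-r", capture_file, "-Y", BUTTONS[label]]

print(tshark_command("capture.pcapng", "4xx"))
# ['tshark', '-r', 'capture.pcapng', '-Y',
#  'http.response.code >= 400 && http.response.code < 500']
```

Passing the returned list to subprocess.run would apply the filter to a real capture, assuming tshark is installed and on the PATH.
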

Then, I added columns for use in "eyeballing" HTTP traffic and doing quick sorting (you can sort on any column in Wireshark's display):

  • Stream - display tcp.stream, the connection identifier generated by Wireshark as the file is read
  • Req # - display http.request_number, the request's sequence number within its connection (useful for HTTP-pipelined connections)
  • HTTP Req - display http.request_in for HTTP responses
  • HTTP Res - display http.response_in for HTTP requests
  • HTTP Time - display http.time
  • HTTP RC - display http.response.code for all responses
  • Payload - display http.content_length for all requests/responses with data payloads
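
The same columns can be pulled out of a capture as text and sorted outside the GUI.  The sketch below assumes tshark's tab-separated field export (-T fields with a header row); export_command and slowest_responses are hypothetical helpers, frame.number is added only so rows can be identified, and the sample rows are invented rather than taken from a real capture.

```python
import csv
import io

FIELDS = ["frame.number", "tcp.stream", "http.request_number", "http.request_in",
          "http.response_in", "http.time", "http.response.code", "http.content_length"]

def export_command(capture_file):
    """Build (not run) a tshark command printing the columns as tab-separated text."""
    cmd = ["tshark", "-r", capture_file, "-Y", "http", "-T", "fields", "-E", "header=y"]
    for field in FIELDS:
        cmd += ["-e", field]
    return cmd

def slowest_responses(tsv_text, limit=3):
    """Return rows that carry an http.time value, slowest first."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    responses = [r for r in rows if r["http.time"]]  # request rows have it empty
    responses.sort(key=lambda r: float(r["http.time"]), reverse=True)
    return responses[:limit]

# Invented sample: two request/response pairs on two TCP streams.
sample = (
    "\t".join(FIELDS) + "\n"
    "10\t0\t1\t\t12\t\t\t\n"
    "12\t0\t\t10\t\t1.500\t200\t286720\n"
    "15\t1\t1\t\t16\t\t\t\n"
    "16\t1\t\t15\t\t0.040\t404\t512\n"
)
for row in slowest_responses(sample):
    print(row["frame.number"], row["tcp.stream"], row["http.time"], row["http.response.code"])
# 12 0 1.500 200
# 16 1 0.040 404
```

Sorting by http.time here mirrors clicking the HTTP Time column header, and the http.request_in value on each response row points back at the frame number of its request.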

Here's a sample of the results:


(You can see the "one click" filter buttons in the top right.)

From here, I can sort on any column (like, oh, HTTP Time?), match requests and responses easily, and identify "red flags" at a glance (why on earth did it take 1.5s to pull down PNG files of only ~280 KB?!).  I can also examine how those "red flags" affected subsequent requests on the same connection (that 1.5s delay affected the browser's processing of the next GET as well, since they were on the same HTTP-pipelined stream).  All of that comes from a single view...and Wireshark's processing time is greatly improved, to boot!

Now, this isn't perfect.  If any problem occurs in IP/TCP/SSL-TLS/HTTP reassembly (for instance, a missing or corrupted packet), you won't get full information on the affected HTTP transactions.  It's easy, though, to right-click on a packet with missing information and use Wireshark's Follow TCP Stream command to zero in on that transaction and determine what happened.

(If you'd like to try this profile, you can download it as a ZIP file.  You'll want to unzip it in your Wireshark profiles directory; it will create a profile directory named 'HTTP'.  Within Wireshark, you can switch profiles by clicking on "Profile: Default" in the bottom-right corner.  Remember, though, that any configuration changes you make are automatically saved to the current profile; be careful!)

So, that's the Wireshark environment I'm using for HTTP analysis.  Did I miss something you consider important?  Is there a change you would make in your environment?  Let me know in the comments...
