Monday, July 23, 2012

Web Load Testing Basics

Looking through a number of forums, I noticed that a lot of people try to get familiar with JMeter (so that they can load test web servers) without understanding the basics of load testing HTTP servers.
Everything described here is the common, minimal knowledge you need to start load testing with JMeter.

What's the goal

You may want to load your server for several purposes (I would point out two of them). People often attach their own meaning to these terms, but I propose to understand them in the following way.

Web load testing with JMeter
  • Load testing. This is the regular kind of testing, where you track how your server reacts to injected load. The main goal of this type of testing is to learn how resource consumption depends on the amount of load. Starting from a small load, you make the load rate bigger and bigger while checking how the corresponding components consume resources. For example, you may set up performance counters for processor time utilization and disk queue length on the machine where your database is hosted. By increasing the number of users trying to log in to your system, you can learn the trend in how your database responds. You can then predict resource consumption by extrapolating the trend.
  • Stress testing. The main goal of this type of testing is to check whether the system is ready for real load spikes. You inject an unusually huge load and check mostly whether the system is still alive and no critical data is corrupted.
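
To illustrate the "extrapolating the trend" idea from the load testing bullet, here is a minimal sketch in Python. It assumes the dependency is roughly linear in the measured range, which usually stops being true once a resource gets close to saturation. The numbers are made up for illustration, and the helper names fit_line and predict are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Extrapolate the fitted line to a load level that was not measured."""
    return slope * x + intercept


# Hypothetical measurements: concurrent users vs. average CPU utilization (%)
# on the database host, collected while stepping the load up.
users = [10, 20, 30, 40, 50]
cpu_percent = [12.0, 21.0, 33.0, 41.0, 52.0]

slope, intercept = fit_line(users, cpu_percent)
estimate = predict(slope, intercept, 80)
print(f"about {slope:.2f}% CPU per user, estimated {estimate:.1f}% at 80 users")
```

Treat such an estimate as a rough guide only: it tells you where to look next, not where the server will actually fall over.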

What makes the server consume resources

So we have an HTTP server. What makes it consume more or fewer resources, and which components of your system are relevant to your goal? That depends heavily on the architecture and on what your server actually does. Any calculation usually consumes a lot of CPU time, while working with data loads the hard drive and RAM. So for newbies I would recommend setting up these three counters (CPU, disk, RAM) on each node where you host your system's components. Most likely those nodes are at least the HTTP server and the DB server (plus probably an authorization server).

Should I completely emulate the user's flow?

Some people start using JMeter with the simplest stuff, like recording user interaction with the JMeter proxy component. They usually get trash-looking results, as the proxy records the requests to every resource (however, you can configure it to capture only the requests that match certain patterns). So should we keep all those requests in the load scenario? Are they relevant to the goals we set for our testing? Not really.

Let's first recall the HTTP concept. HTTP is a protocol originally intended to deliver hypertext (essentially just formatted text) to clients (aka browsers). Today an HTTP server can provide not only hypertext but also media, archived data, etc. All of that can be described with the umbrella term "resource". So when you request an HTML page, you are really requesting a resource. Each resource on the server has its own URL (Uniform Resource Locator). That URL is basically the unique identifier that tells the server where it should take the resource from.

The server gets the URL requested by the client and then decides how to process it. Depending on the web container you're using, the server passes the request to a dedicated process which returns the resource. This is where the magic starts.

The requested resource may or may not physically exist on the server. For example, you may have a static HTML page on the server. If you request that page, the server usually just returns it to you (probably after authorizing your request). This type of request does not require a lot of CPU time. It requires disk time and probably some system-specific resources (like a free file descriptor). The most interesting things happen when the data you're requesting requires pre-evaluation.

Assume you have dynamic content on your page. Before the server returns the page to you, it has to parse the HTML and evaluate the dynamic parts (which will probably require executing an SQL query). This is what matters for load testing.

The curse of the record/playback approach

How do testers usually build a scenario? They somehow capture real requests: they interact with the server through the web UI and either capture the scenario with the JMeter proxy or capture the requests with browser-embedded tools. This approach has certain downsides:

  1. You capture a lot of requests that aren't required.
  2. It is hard to understand the logic of such a scenario after it's been built.
  3. In real life, a lot of requests carry parameters that are evaluated differently for each session, so the recorded values will be obsolete in a new one.
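
The third point is easier to see in code. Below is a minimal Python sketch of the idea (not JMeter itself): instead of replaying a recorded value, you extract the session-specific parameter from the previous response and put it into the next request. In JMeter this role is typically played by a post-processor such as the Regular Expression Extractor. The page fragment, the field name csrf_token and the helper names are assumptions made up for this example.

```python
import re

# Hypothetical fragment of a login page; the hidden field name is an assumption.
LOGIN_PAGE = (
    '<form action="/login" method="post">'
    '<input type="hidden" name="csrf_token" value="a1b2c3d4">'
    '</form>'
)

# Captures whatever value the server issued for this particular session.
TOKEN_RE = re.compile(r'name="csrf_token"\s+value="([^"]+)"')


def extract_token(html):
    """Return the session-specific token found in the response body."""
    match = TOKEN_RE.search(html)
    if match is None:
        raise ValueError("token not found in response")
    return match.group(1)


def build_login_params(html, username, password):
    """Build the parameters of the next request from the previous response."""
    return {
        "username": username,
        "password": password,
        "csrf_token": extract_token(html),
    }


params = build_login_params(LOGIN_PAGE, "alice", "secret")
print(params)
```

A recorded scenario would send the token captured at recording time, which the server would most likely reject in any later session.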

What pattern should I use to reproduce the user interaction?

Follow the rules below until you're skilled enough:

  1. Find out whether any resources are stored in third-party storage. Requests for such resources should not be included in your scenario, as we cannot affect the performance of third-party components.
  2. Avoid requesting static resources if you're not doing stress testing.
    1. Find out whether any files are generated on the fly. Such files should be requested by your scenario, as they consume processor time during generation. All other files shouldn't.
    2. Tend to avoid requesting completely static HTML resources. They won't give you a picture of your server's performance.
    3. Tend to avoid requesting JavaScript files. JMeter does not execute JavaScript anyway, so requesting them won't make your scenario any easier.
  3. Avoid proxies. A proxy will leave you confused about your scenario, as recorded samplers are not easy to read at all. Without a proxy it takes a bit more time to build your scenario, but you can use the power of the flow control components and parse the samplers' output to provide the correct parameters that are specific to a particular session.
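
Rules 1 and 2 can be sketched as a simple filter over a list of candidate URLs. This is only an illustration in Python under stated assumptions: the host names, the list of static extensions and the set of generated files are all hypothetical, and you would have to fill them in from knowledge of your own system.

```python
import posixpath
from urllib.parse import urlsplit

# All three sets are assumptions for this example.
OWN_HOSTS = {"shop.example.com"}                # hosts whose performance we can affect
STATIC_EXTENSIONS = {".js", ".css", ".png", ".jpg", ".gif", ".ico", ".html"}
GENERATED_PATHS = {"/reports/summary.png"}      # files the server builds on the fly


def keep_request(url):
    """Decide whether a request belongs in a load (not stress) scenario."""
    parts = urlsplit(url)
    if parts.hostname not in OWN_HOSTS:
        return False                            # rule 1: third-party resource
    if parts.path in GENERATED_PATHS:
        return True                             # rule 2.1: generated on the fly
    extension = posixpath.splitext(parts.path)[1].lower()
    return extension not in STATIC_EXTENSIONS   # rules 2.2 and 2.3: skip static files


candidates = [
    "http://shop.example.com/login",
    "http://shop.example.com/static/app.js",
    "http://cdn.example.net/lib/jquery.js",
    "http://shop.example.com/reports/summary.png",
    "http://shop.example.com/about.html",
    "http://shop.example.com/cart?item=42",
]

scenario = [url for url in candidates if keep_request(url)]
for url in scenario:
    print(url)
```

An extension check is a crude heuristic: a URL ending in .html may still be rendered dynamically, so the list of exceptions has to come from someone who knows how the application is built.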

Next time I'll touch on the specifics of error identification in your scenario.

Read also some advanced stuff about an approach to load testing Comet long-polling applications (which some people call AJAX push).