Friday, March 12th, 2010

Real-world Load Testing and the AHA! Moment

Author: Imad Mouline

Web site: http://www.gomez.com

About: I'm Gomez's CTO.

<cliché> One of my favorite parts of the job is talking with our customers. </cliché> The most fulfilling part of those conversations is witnessing the “AHA!” moment: the twinkle in their eye and the expression on their face when they intuitively “get” a new concept.

I’m honored to have witnessed quite a few of those moments lately. The vast majority of them have involved a discussion about load testing. These have typically fallen into two categories: conversations with people who are familiar with internal load testing and are being exposed to outside-in load testing for the first time, and conversations with people who are familiar with outside-in load testing but haven’t thought it through to its logical conclusion: real-world load testing.

Given that most of you reading this blog undoubtedly understand and appreciate the need to test and monitor Web applications from beyond your firewall, I’ll talk mostly about the second category of conversations that I’ve had.

One of the surprising trends I uncovered was that, while these customers understood the need to monitor the entire application from the end-users’ perspective, they didn’t always carry that belief over to the load testing world. Their load tests, although conducted from beyond their firewalls, would often target only the components of the Web application that lived within the firewall. Our conversation would then go something like this:

So, you’re telling me that you only load test the components that come from behind your firewall?

Yes, of course.

What about the other components that make up the Web application? For example, your CDN, your ad vendors, your analytics, your other third parties?

Well, I’m certainly not going to load test my CDN! That would cost too much, and there’s no need! It’s supposed to handle any load I throw at it! Same with the other third parties. That’s what the SLAs cover.

Aren’t you interested in how all these various components interact, and how they impact the end-user response time when they finally come together?

That’s a silly question. Of course we are. We do some spot testing of the complete application in our QA and staging environments.

And isn’t one of your requirements getting to a test scenario that is as close to the real world as possible before you go live?

It absolutely is!

So, when’s the first time that you get to put all these components together and see how the entire application works under a real-world load?

Well, that would be… uh… when we go into production… oh… OH!

At this point, I typically do get the pleasure of seeing that twinkle, that expression. After a suitably long moment of reflection, we carry on and discuss the finer points of understanding how end-user response times vary as a function of the load put on the entire Web application delivery chain.

We then turn the focus to how that load is generated. Does the Web application use any JavaScript? If so, is your load generator running a full browser or merely spitting out canned, pre-recorded HTTP requests? If it’s the latter, will it truly be able to deal with the dynamic nature of your Web application? Do you know that the order in which components are called is governed by many parameters and might change with each execution of the JavaScript engine, based on the browser, the performance of individual third-party objects, and countless other subtleties that can greatly change the end-to-end response time? And if your load generator runs a large number of browsers on the same machine or cloud instance, do you know that you’ll be invalidating any response times the system collects? Those browsers put an unrealistically heavy load on the machine or cloud instance they’re running on, and that load distorts the end-to-end response time.

This is typically when the next “oh!” can be expected.

Then another moment of reflection.

Then: “So, tell me again, what’s the right way of doing this?”

Well, the right way is to test the entire Web application, from the end-users’ perspective, from wherever they are going to be. Yes, large machines or cloud computing locations can certainly be used to generate high load volumes and put stress on the entire infrastructure, third parties and all. However, response times should be collected from a true end-user network, where each testing location runs a full browser and runs only one test at a time (just like in real life!), so as to provide an accurate end-user response time. That means the network has to be large enough and diverse enough to reflect the real world, YOUR real world.

“Aha!”
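To make that separation a little more concrete, here is a minimal Python sketch of the bookkeeping side only. It assumes you already have response-time samples tagged with where they came from; the `Sample` fields, the source labels, and every number in the example are hypothetical and invented purely for illustration, not taken from any real test or any particular product. The idea is simply that load generators create the stress, while only samples from end-user agents running one test at a time are trusted as response times.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Sample:
    source: str            # hypothetical labels: "load_generator" or "end_user_agent"
    concurrent_tests: int  # tests running on that machine when the sample was taken
    load_level: int        # total simulated users hitting the site at that moment
    response_ms: float     # end-to-end page response time in milliseconds


def trusted_response_times(samples):
    """Median response time per load level, using only single-test end-user agents."""
    by_load = defaultdict(list)
    for s in samples:
        # Discard timings from load generators and from any agent that was
        # running more than one test: their own machine load skews the result.
        if s.source == "end_user_agent" and s.concurrent_tests == 1:
            by_load[s.load_level].append(s.response_ms)
    return {load: median(times) for load, times in sorted(by_load.items())}


# Made-up samples for illustration only.
samples = [
    Sample("load_generator", 200, 1000, 9500.0),  # ignored: load generator
    Sample("end_user_agent", 1, 1000, 2100.0),
    Sample("end_user_agent", 1, 1000, 2300.0),
    Sample("end_user_agent", 1, 5000, 4800.0),
    Sample("end_user_agent", 3, 5000, 7000.0),    # ignored: three tests at once
]

result = trusted_response_times(samples)
print(result)
```

The result maps each load level to a median end-user response time, which is one simple way to look at how response time varies as a function of load without letting the load generators’ own timings leak into the picture.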

How are you injecting reality into your load testing?
