Performance testing is more important than ever. As we increasingly expose our IT systems to our customers we intertwine the performance of those systems with our brand reputation, our customer experience, and ultimately our revenue.
Good performance testing is only part of the equation. Where we run our tests dictates whether the results tell us anything real about how the system will perform in the real world.
In an ideal world we would performance test in production. I’ll talk a bit more about where this can actually work later on, but in general this is not possible due to the following risks:
- Compromising the security of customer data
- Load tests impacting real customers
- An inability to create or update records (they would be real customer data) – so our tests do not reflect the real workload
- The time between development of code and deployment to production being too long (it costs too much to go back and fix problems efficiently)
- Integrations between the system under test and other systems (unexpectedly impacting/bringing down another part of your organisation)
The obvious solution when faced with these risks is to build an environment which is identical to production – or as ‘production-like’ as possible.
So what does it mean when I say ‘production-like’?
- Hardware: The hardware should have the same spec (or VM configuration) as production wherever possible (particularly CPU, memory, and disks). This also includes the networks between the components, and the number of servers.
- The application configuration should match production. This is everything from Java heap sizes and garbage collection algorithms, to database connection pools and web server caching.
- The application version should also match production or future production (whatever you are testing).
- The database should contain a realistic volume of data. Empty databases perform faster than full ones. How much data you need depends on what you want to understand – you may need to think about how performance will change in the future as the database fills up. The test data should also have realistic variations.
- The external integrations are less black and white. It depends on the scope of your testing, and whether a production-like environment exists for those integrations. It may be acceptable to stub them, provided everyone is aware of the impact this has on the results.
- Any background jobs that occur in production should also occur in your performance testing environment. These jobs could impact the user experience and change the overall behaviour of your solution.
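The checklist above can be made concrete as a simple parity check that diffs a test environment's spec against production. This is a minimal sketch – the field names and values are hypothetical, not taken from any real system.

```python
# Sketch: compare a test environment's spec against production to spot
# gaps in "production-likeness". Field names and values are hypothetical.

PROD = {
    "cpu_cores": 16,
    "memory_gb": 64,
    "server_count": 4,
    "java_heap_gb": 8,
    "gc_algorithm": "G1GC",
    "db_row_count": 50_000_000,
}

def parity_gaps(test_env: dict, prod: dict = PROD) -> list:
    """Return the settings where the test environment differs from production."""
    gaps = []
    for key, prod_value in prod.items():
        test_value = test_env.get(key)
        if test_value != prod_value:
            gaps.append(f"{key}: test={test_value!r} vs prod={prod_value!r}")
    return gaps

# Example: a scaled-back test environment with fewer cores, fewer servers,
# and a much emptier database than production.
test_env = {**PROD, "cpu_cores": 8, "server_count": 2, "db_row_count": 1_000_000}
for gap in parity_gaps(test_env):
    print(gap)
```

A report like this won't fix the gaps, but it makes every deviation from production explicit, so nobody is surprised when the results are qualified.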
Building and maintaining such an environment is not only important for performance testing; it can also serve security testing, functional testing, and disaster recovery.
Not having a production-like environment is not necessarily the end of the road. For example, if the hardware is lower spec in your performance testing environment but the other conditions are identical it would be accurate to say that production will perform “as good or better” if you run your tests at the full workload expected in production.
The challenge of Shift Left
So what about the move towards performance testing earlier in the life-cycle? It’s no secret I am not sold on the concept of “continuous performance testing” – it has potential, but there are a lot of caveats and considerations which often mean it’s not worth the effort. The environments we use for this kind of testing are part of the challenge.
Say we run a component (or even integrated) test each time we deploy. What environment are we deploying to? Is it production-like? Is it integrated within itself and to external systems?
Most of the time the answer is no – we are testing in a scaled back and isolated environment. So what does our load testing really tell us? What it won’t tell us is how the system will perform in the real world. The best we can hope for is to track performance over time relative to previous builds (e.g. response time, server resource consumption).
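That relative tracking can be as simple as diffing each build's metrics against the previous build and flagging anything that degraded beyond a tolerance. A minimal sketch – the metric names and the 10% threshold are illustrative, not a recommendation:

```python
# Sketch: flag performance regressions between builds. The metric names
# and the 10% tolerance are illustrative only.

def regressions(previous: dict, current: dict, tolerance: float = 0.10) -> dict:
    """Return metrics that degraded by more than `tolerance` vs the previous build."""
    flagged = {}
    for metric, prev_value in previous.items():
        curr_value = current.get(metric)
        if curr_value is None:
            continue  # metric not collected for this build
        if curr_value > prev_value * (1 + tolerance):
            flagged[metric] = (prev_value, curr_value)
    return flagged

build_41 = {"p95_response_ms": 420, "cpu_percent": 55, "mem_mb": 900}
build_42 = {"p95_response_ms": 510, "cpu_percent": 56, "mem_mb": 905}

# p95 response time rose ~21%, so it is flagged; CPU and memory stayed
# within the tolerance.
print(regressions(build_41, build_42))
```

Checks like this say nothing about real-world capacity, but they do catch a build that suddenly gets slower relative to its predecessor – which is the most a scaled-back environment can honestly tell you.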
The challenge of Shift Right
If you haven’t heard about it yet “Shift Right” is the opposite of Shift Left. The idea is to continually and rapidly deploy to production. We couple this with detailed application and server monitoring (e.g. APM tools) and the ability to roll back quickly. This means we are always measuring performance metrics in a real production environment so we get better value out of all our testing and monitoring.
In many ways Shift Right solves the environment issues I mentioned earlier because we are using production as a test environment. There are still plenty of considerations:
- Can we apply synthetic load?
- If we just use real users to load the system, can we account for peak load or future load?
- Can we deploy new code to a sub-set of our users to minimise the impact if things go wrong?
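That last point is usually implemented as a canary release. A minimal sketch of one common approach: hash the user ID into a stable bucket so that a fixed percentage of users consistently sees the new version. The 5% rollout figure is arbitrary, and real routing would normally live in a load balancer or feature-flag service rather than application code.

```python
import hashlib

# Sketch: route a fixed percentage of users to a canary deployment using a
# stable hash of the user ID, so each user always sees the same version.
# The 5% rollout figure is arbitrary.

def serves_canary(user_id: str, rollout_percent: int = 5) -> bool:
    """Deterministically decide whether this user gets the new version."""
    digest = hashlib.sha256(user_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in 0-99 per user
    return bucket < rollout_percent

users = [f"user-{i}" for i in range(1000)]
canary_users = [u for u in users if serves_canary(u)]
print(f"{len(canary_users)} of {len(users)} users routed to canary")
```

Because the hash is deterministic, a user never flips between versions mid-session, and rolling back is just a matter of setting the percentage to zero.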
And then there’s the fundamental issue – if you are building a new system with a big-bang go-live, you still need to understand its performance before you go live. That requires a more traditional performance testing approach. Shift Right works best with incremental change to an existing solution.
Utopia: Infrastructure as Code
I’ve read about being able to spin up and collapse full production-like test environments at will. I’m yet to see this in practice, but it sounds promising.
I would love to hear your experiences in implementing IaC and whether it helped you implement more accurate performance testing with less effort.
I think there will always be challenges with production-like environments, particularly given the increasingly distributed nature of our systems which often rely on components all over the world.
As always, it’s about thinking pragmatically about your situation. The bottom line is – make sure your test environments facilitate accurate performance tests which provide meaningful insight into the performance of your systems.