Legacy Codebase: Testing

For some reason, most legacy codebases don't have a lot of tests, and if there are tests, they are often end-to-end tests that don't clearly specify what they are actually testing. In most cases the tests are there to reach a coverage number rather than to actually cover the different cases. Since this is important code, you have to make sure that there are no regressions, so you have to add tests, but how?

Adding tests to a legacy codebase depends heavily on the codebase itself. If the code is reasonably well organized, you can simply add unit tests one class at a time. While writing these tests you will have to investigate the code and figure out all the paths. This can be a very tedious job, but it will make you understand what the code does, what it can handle and what it cannot. You might even find edge cases where the code throws an exception. For all of this you should write a test, since tests are not only there to test what works, but also what doesn't work.
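
To make that concrete, here is a minimal sketch of what such a pair of tests could look like. The InvoiceParser class and its behaviour are hypothetical, invented purely for illustration; the point is that the exception path gets a test of its own, so the current behaviour is pinned down before any refactoring starts.

```python
import pytest

# Hypothetical legacy class, only here to illustrate the idea: a parser
# with a happy path and an edge case that raises.
class InvoiceParser:
    def parse_total(self, line: str) -> float:
        label, _, amount = line.partition(":")
        if label.strip() != "total" or not amount.strip():
            raise ValueError(f"unexpected line: {line!r}")
        return float(amount)


def test_parse_total_happy_path():
    assert InvoiceParser().parse_total("total: 12.50") == 12.50


def test_parse_total_rejects_unknown_label():
    # The edge case is documented as a test too: the exception is part of
    # the observed behaviour we don't want to lose in a refactor.
    with pytest.raises(ValueError):
        InvoiceParser().parse_total("subtotal: 12.50")
```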

If the codebase, however, is not that nicely split and has large classes and methods, lots of dependencies, does too much, or anything else that discourages you from writing tests, getting started can be complicated. The idea here is not just to start writing tests and be done with it. This code clearly needs refactoring, and it would be inefficient if your brand new tests had to be refactored alongside the code. There are, however, different types of refactoring, and not all of them have the same impact on how you approach writing tests.

I am a big fan of unit tests per class. If, however, there are too many dependencies between two classes and the separation is not cleanly done (which you intend to refactor), I would suggest broadening your 'unit tests' to temporarily include the class it depends on as well. This means that when testing class A, which heavily depends on class B, I would not mock out class B but instead use the real deal and mock from there on. The reason for this is that your tests will give you more certainty when you start moving logic between class A and class B. I would also not start writing unit tests for class B until you have a clear API defined for it.
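
A small sketch of that setup, with hypothetical classes standing in for A and B: OrderService (A) leans heavily on PriceCalculator (B), which in turn talks to an external rate provider. The real B stays inside the test boundary, and only the dependency behind it is mocked.

```python
from unittest.mock import Mock

class PriceCalculator:              # "class B"
    def __init__(self, rate_provider):
        self.rate_provider = rate_provider

    def total(self, amount: float) -> float:
        return amount * self.rate_provider.current_rate()


class OrderService:                 # "class A", heavily coupled to B
    def __init__(self, calculator: PriceCalculator):
        self.calculator = calculator

    def checkout(self, amount: float) -> float:
        return round(self.calculator.total(amount), 2)


def test_checkout_uses_real_calculator():
    rate_provider = Mock()
    rate_provider.current_rate.return_value = 1.21  # mock from B's boundary on

    service = OrderService(PriceCalculator(rate_provider))

    # The assertion covers A and B together, so moving logic between
    # them later should not break this test.
    assert service.checkout(100.0) == 121.0
```

Because the test only knows about the combined behaviour of A and B, you are free to shuffle responsibilities between the two classes and still have it tell you whether the observable result changed.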

Whenever you mock something, it is vital that you know what the possible responses are and which input will yield which output. If you don't understand the mocked class, you have a few options: spend time inspecting the logic in that class to see what is possible, maybe even trying a couple of scenarios; explore it by writing tests for it; or, as I suggested earlier, simply don't mock it but include it in your scope.
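
As a sketch of what "knowing the possible responses" buys you, assume (hypothetically) that exploring a payment gateway showed that charge() returns a dict on success and raises TimeoutError when the provider is slow. A useful mock then reproduces both observed behaviours rather than an idealised happy path.

```python
import pytest
from unittest.mock import Mock

def make_gateway_mock(succeed: bool = True) -> Mock:
    gateway = Mock()
    if succeed:
        # Response shape copied from observed behaviour, not guessed.
        gateway.charge.return_value = {"status": "ok", "id": "ch_123"}
    else:
        # The failure mode seen while exploring the real class.
        gateway.charge.side_effect = TimeoutError("provider did not respond")
    return gateway


def test_mock_mirrors_observed_timeout():
    gateway = make_gateway_mock(succeed=False)
    with pytest.raises(TimeoutError):
        gateway.charge(amount=10)
```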

In the next blog post, I will take a more practical approach to dealing with such code. The starting point, however, is always that there should be tests to avoid regressions.