Scalability of Unit Tests

Started by CoffeeMug
16 comments, last by SiCrane 16 years, 2 months ago
I am trying to understand how to properly do test-driven development. I know I'm doing it wrong, but I can't figure out how to do it right. Here's a simple example that illustrates my problem.

Suppose I wrote a function, sqrt, that estimates the square root of a number with Newton's method. I also wrote a test for it, sqrt_test, that checks it for some input. Now suppose I wrote a solve_quadratic function that takes the coefficients of a quadratic equation and solves it using the well-known formula, utilizing my sqrt function above. I also wrote a test for it, solve_quadratic_test, that checks it for some input. So far so good.

Now, there is an obscure bug in my sqrt function. I have to fix the bug. Now I've broken two unit tests - the one for the sqrt function and the one for solve_quadratic - even though I only changed the code in sqrt. You could make a number of arguments about this example (that both tests are supposed to fail because my change in sqrt affects other parts of the program, that my tests aren't actually unit tests but some other kind of tests, etc.), but the fact of the matter is that this type of testing does not scale. If a small change in a lower layer propagates all throughout my software and almost every minor change requires me to change tests that are responsible for other layers, very soon I'll spend most of my time fixing tests rather than adding features.

It seems that the fundamental problem here is that solve_quadratic_test doesn't just test the algorithm in solve_quadratic, but also its dependencies. I can think of a few ways to get around this problem (pass a function pointer for sqrt into solve_quadratic so that the test can substitute a mock, start abstracting dependencies away purely for the sake of the tests, etc.), but none of these seem really workable. So, how am I supposed to go about solving this problem?
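For concreteness, here's roughly what I mean by the function-pointer workaround (the names, the stub value, and the equation are made up purely for illustration):

#include <cassert>
#include <cmath>

// solve_quadratic takes the square-root routine as a parameter, so the
// test can substitute a stub for my real Newton's-method sqrt
typedef float (*sqrt_fn)(float);

// returns the larger root of a*x^2 + b*x + c = 0, using whichever
// square-root function it is given
float solve_quadratic(float a, float b, float c, sqrt_fn my_sqrt)
{
    return (-b + my_sqrt(b * b - 4.0f * a * c)) / (2.0f * a);
}

// stub that returns a fixed, known value regardless of its input
float stub_sqrt(float) { return 1.0f; }

void solve_quadratic_test()
{
    // with sqrt stubbed to return 1, the larger root of x^2 - 3x + 2
    // must be (3 + 1) / 2 = 2, no matter how my real sqrt behaves
    assert(std::fabs(solve_quadratic(1.0f, -3.0f, 2.0f, stub_sqrt) - 2.0f) < 0.001f);
}

This isolates solve_quadratic_test from sqrt, but threading function pointers through every layer just for the sake of the tests is exactly the kind of thing that doesn't seem workable to me.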
Quote:Original post by CoffeeMug
Now, there is an obscure bug in my sqrt function. I have to fix the bug. Now I've broken two unit tests - the one for the sqrt function and the one for solve_quadratic - even though I only changed the code in sqrt.

What's this about breaking unit tests? You add unit tests for bugs you fix. Add a unit test for the bug to the sqrt test suite and you're done.
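For instance, the regression test you add might look something like this (the input, the tolerance, and the name sqrt_newton are placeholders for whatever case actually exposed the bug and whatever your function is called):

#include <cassert>
#include <cmath>

float sqrt_newton(float x);  // your Newton's-method sqrt, defined elsewhere

void sqrt_bug_regression_test()
{
    // pins down the case that used to be wrong: sqrt(2) is about 1.41421
    assert(std::fabs(sqrt_newton(2.0f) - 1.41421f) < 0.0001f);
}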

Quote:If a small change in a lower layer propagates all throughout my software and almost every minor change requires me to change tests that are responsible for other layers, very soon I'll spend most of my time fixing tests rather than adding features.

Except that they don't propagate.

Quote:
It seems that the fundamental problem here is that solve_quadratic_test doesn't just test the algorithm in solve_quadratic, but also its dependencies.

Then you're writing your tests wrong. Write your quadratic test to test the quadratic solver. Let the sqrt unit tests test the sqrt function.
It seems to me that having both tests fail is a GOOD thing, and part of the point of unit tests. That is, to have automated code that helps you track down all the dependencies and verify that all code dependent on code A still works after you change code A.

The problem you're describing seems to be more a problem with the unit-test reporting mechanism itself. That is, you still want to know that all of those unit tests are failing, but you also want to see the most likely candidates for a common/shared piece of code that's causing the problem: perhaps an analyzer that goes through all the failing tests, determines which functions they have in common, and checks whether those are failing too. You could help the analyzer by explicitly specifying dependencies in your test data structures (e.g., a UnitTest class would take as input the UnitTests it depends on).
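Very roughly, the dependency idea could look something like this (just a sketch of the concept, not any real framework's API; the names are made up):

#include <cstddef>
#include <cstdio>
#include <vector>

struct UnitTest
{
    const char* name;
    bool (*run)();
    std::vector<const UnitTest*> dependencies;  // tests covering code this test relies on
};

void report_failure(const UnitTest& test, const std::vector<const UnitTest*>& failed_tests)
{
    // if a dependency also failed, blame the shared code first instead of
    // listing this test as an independent failure
    for (std::size_t i = 0; i < test.dependencies.size(); ++i)
        for (std::size_t j = 0; j < failed_tests.size(); ++j)
            if (test.dependencies[i] == failed_tests[j])
            {
                std::printf("%s failed - probably because %s failed\n",
                            test.name, test.dependencies[i]->name);
                return;
            }
    std::printf("%s failed\n", test.name);
}

Then if solve_quadratic_test lists sqrt_test as a dependency, a sqrt regression shows up in the report as one root cause instead of a wall of seemingly unrelated failures.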

Passing a function pointer in solve_quadratic_test to get around using the sqrt you use in practice seems bad, because then you're not actually testing the real code. What if the bug in sqrt is only exposed by the way that function uses it?

All that said, I'm not a test-driven development junkie or anything... all I know is that when I change a piece of code that's shared and used throughout many layers, I want as many ways of catching side effects as early as possible. The more tests that run that piece of code the better, IMHO.
Quote:Original post by SiCrane
What's this about breaking unit tests?

Well, the fix for my bug broke my existing unit tests. Suppose originally sqrt returned 1.415, and my test asserted that sqrt(2) returns this value. Now I fixed it to return the correct value 1.414 - my existing test is now broken. Furthermore, my test of the quadratic solver is broken because that test was a simple assertion that the quadratic solver returns certain values for certain coefficients.
Quote:Original post by SiCrane
Except that they don't propagate.

Well, I just showed you that they do. I understand this means I am writing the tests wrong, but I'd like to figure out how to write them right.
Quote:Original post by SiCrane
Then you're writing your tests wrong. Write your quadratic test to test the quadratic solver. Let the sqrt unit tests test the sqrt function.

But how? This is what I mean: how can I structure the code in such a way that solve_quadratic_test tests the quadratic solution algorithm but not the square root? If I do a simple assertion that solve_quadratic returns a correct value for some coefficients, I can't ignore the fact that this value depends on the square root. And if I don't do that, how can I structure the code otherwise? I could have solve_quadratic return a high-level description of the equation, test that the *equation* is right, and then write an interpreter for it, but that's clearly not a workable solution.
Quote:Original post by CoffeeMug
Well, the fix for my bug broke my existing unit tests. Suppose originally sqrt returned 1.415, and my test asserted that sqrt(2) returns this value. Now I fixed it to return the correct value 1.414 - my existing test is now broken.

That's not a bug fix breaking a unit test. That's an improperly written unit test, which is a different animal. You fix the unit test, check it to see if it works, and move on.
Quote:
Furthermore, my test of the quadratic solver is broken because that test was a simple assertion that the quadratic solver returns certain values for certain coefficients.

Why? What unit test broke?

Quote:Well, I just showed you that they do.

No, you didn't. You stated it without demonstrating it.

Quote:
But how? This is what I mean, how can I structure the code in such a way that solve_quadratic_test tests the quadratic solution algorithm but not the square root?

By writing tests that test the quadratic solution. If the test fails, then you determine why it fails. If the failure is because your sqrt function is borked, you fix the sqrt function, add a unit test for the bug, and rerun your quadratic equation solver test.
Quote:Original post by SiCrane
By writing tests that test the quadratic solution. If the test fails, then you determine why it fails. If the failure is because your sqrt function is borked, you fix the sqrt function, add a unit test for the bug, and rerun your quadratic equation solver test.

I don't understand how that would work. Here's some pseudocode:
float sqrt(float x)
{
    // Newton's method, with a bug
}

bool sqrt_test()
{
    assert(sqrt(2) == 1.415);
}

float solve_quadratic(float a, float b, float c)
{
    // calculate via the quadratic formula
}

bool solve_quadratic_test()
{
    assert(solve_quadratic(1, 2, 3) == some_value);
}

I now fix the bug in sqrt so that it returns 1.414 for 2. Now the sqrt_test assertion is broken, yes? I need to fix the test. Also solve_quadratic_test is broken, because now that I fixed sqrt, solve_quadratic returns a different value.

How should I be writing the tests correctly?
Quote:Original post by emeyex
It seems to me that having both tests fail is a GOOD thing, and part of the point of unit tests.

Maybe. But in practice what happens in my project is that I change lower-level layers - I *want* them to be changed - and then I just have to go through dozens of unrelated unit tests that demonstrate dependencies and fix them. It is very, very rare that this actually catches an undesirable situation. The reason I ask the question is that the process hinders me so often, and helps me so rarely, that I've started questioning its value.
Well, the very first step would be to never write a unit test that compares floating point values with an exact equality check. Use an epsilon value. The second step is to write tests for things you actually know the answer to.

Seriously, why is your test
assert(solve_quadratic(1, 2, 3) == some_value);

if some_value isn't the answer to the quadratic in the first place? Where did some_value come from? If you pulled it out of nowhere and dumped it in the test randomly, then of course you're writing your test wrong.
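Put together, something like this (I'm calling the function my_sqrt to keep it distinct from the standard library's sqrt, the Newton loop is just a stand-in for your implementation, and I'm assuming solve_quadratic returns the larger root):

#include <cassert>
#include <cmath>

const float EPSILON = 0.0001f;

// stand-in for your Newton's-method sqrt
float my_sqrt(float x)
{
    float guess = x > 1.0f ? x : 1.0f;
    for (int i = 0; i < 25; ++i)
        guess = 0.5f * (guess + x / guess);
    return guess;
}

// returns the larger root of a*x^2 + b*x + c = 0 (real roots assumed)
float solve_quadratic(float a, float b, float c)
{
    return (-b + my_sqrt(b * b - 4.0f * a * c)) / (2.0f * a);
}

void sqrt_test()
{
    // the expected value comes from outside the code under test
    assert(std::fabs(my_sqrt(2.0f) - 1.41421f) < EPSILON);
}

void solve_quadratic_test()
{
    // x^2 - 5x + 6 = (x - 2)(x - 3), so the larger root is 3,
    // worked out by hand rather than copied from the solver's own output
    assert(std::fabs(solve_quadratic(1.0f, -5.0f, 6.0f) - 3.0f) < EPSILON);
}

Now fixing a bug in my_sqrt doesn't force either expected value to change, because neither of them was ever derived from the buggy code in the first place.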
Quote:Original post by SiCrane
if some_value isn't the answer to the quadratic in the first place? Where did some_value come from? If you pulled it out of nowhere and dumped it in the test randomly, then of course you're writing your test wrong.

Well, presumably I took the existing output of my solver and used it as the expected value in the test. In this example I can look the value up on the internet, but suppose I couldn't? In most cases what you're testing isn't as solid as the result of a quadratic equation - you take the values that you think are right, write the code to produce them, and then find out that what you thought was right isn't.
You're writing your unit tests wrong, and now you know how. Seriously, why would you consider it a reasonable test to take the output of a function you don't even know works and write it down as the expected value? The test comes first and you get the function to conform to the test, not the other way around.

