Code inspections vs unit tests

StefanHoutzager · March 4, 2020, 4:33pm

Our objective with Inspections is to reduce the Cost of Quality by finding and removing defects earlier and at a lower cost.
While some testing will always be necessary, we can reduce the costs of test by reducing the volume of defects propagated to test. We want to use test to verify and validate functional correctness, not for defect removal and associated rework costs. We want to use test to prove the correctness of the product without the high cost of defect removal normally seen in test. Additionally we want to use test to avoid impacting the users with defective products. […]
I know of no study that has been repeated where unit test has been demonstrated to be as effective as Inspections in removing defects.
If it is accepted that Inspections have value, the next challenge voiced about Inspections is that unit test in combination with code Inspections will lead to better results. Again, every trial I know of has failed in this regard. Russell [RUS91] gives two other reasons for not testing before Inspections:

As Inspections require a motivated team, testing first may lead to a view that the code is reasonably stable and the team will be less motivated to perform the best Inspection.

With the investment of test the Producer may be less receptive to major rework on an “already-stable program image” which will also require retesting.

Ackerman found that the savings from defect detection costs in Inspections was 2.2 hours compared to 4.5 hours in unit test. A two to one savings is a good place to bank. In another organization he states a 1.4 to 8.5 staff hour relationship in finding defects with Inspections versus testing. [ACK89]

Weller [WEL93] states that there are disadvantages of inspecting after unit test:

Unit test leads programmers to have false confidence that the product works, so why inspect

It is a hard decision to inspect a large batch that has been unit tested and there may be the view that there is no longer time to inspect

He also gives reasons to perform Inspections first:

You may actually be able to bypass unit test if the Inspection results are good

You can recover earlier and at lower cost from serious design defects found in Inspections versus unit test

lucaong · March 4, 2020, 4:43pm

Personal opinion here: I don’t think that code inspection and automated tests are mutually exclusive. We write automated tests primarily to reduce the cost of change, not so much to assert correctness.

One cannot really assert correctness with unit tests: as the creator of some code, if you knew that something was incorrect, you would just fix it, and if you don’t know that there is a bug lurking, you cannot write a test for it. We write tests to make sure that our code still does what it is meant to do in face of change. In other words, we write tests for our colleagues (or our future self) that will change our code in the future, so that they can do so knowing if/what they break. A well-tested project is easier to evolve.

Correctness can be inferred from manual tests and code inspections, preferably performed by someone other than the code’s author. Code reviews and manual tests are about assuring quality.

I think the article is conflating these two aspects, and therefore sees a dichotomy where there is none.

I personally consider writing tests a worthwhile investment of time, and one of the most valuable skills in coding. I also think that code reviews and a good quality assurance process are necessary.

StefanHoutzager · March 5, 2020, 5:50pm

I do not see that in the article. Qualities of inspections and testing were compared.
Another study with comparable results:

The study is from 2001, maybe right before the agile manifesto and the writings + thorough empirical studies of PhD Uncle Bob and other agile professors at diverse universities?

lucaong · March 5, 2020, 6:31pm

You may actually be able to bypass unit test if the Inspection results are good

This implies that the study treats inspections and unit tests as interchangeable alternatives. I think they are not alternatives; they serve different purposes.

We want to use test to verify and validate functional correctness, not for defect removal and associated rework costs.

This also indicates that the article evaluates testing as a way to assert correctness. I think automated tests are primarily a way to keep the cost of change low.

I think this is unnecessarily sardonic. We can discuss merits and demerits of agile or testing without mocking people.

StefanHoutzager · March 6, 2020, 6:34am

The purposes have an overlap in that both serve the purpose of checking correctness.

I don’t get this. Automated tests are full of assertions. If you mean the safety net that you construct:

you do inspections after changes also. Unit tests have a cost when changing code. You have to maintain them. Functionality changes, functions are removed and created, names of functions change, their signature changes.

I like to work fact-based, empirical research like

shows reviews to be more effective than testing. See 3.5 Review versus testing and 4.1 Cost and benefits

Claims by Uncle Bob and other agilistas about unit testing are not scientifically proven; for me they belong to the realm of pseudoscience. Moral imperatives (e.g. from Uncle Bob “if you don’t do TDD, you are unprofessional”) are unnecessary and harmful. I cannot take them seriously.

An example of a highly ranked scientist and programmer who is not enthusiastic about safety nets:

For more see the interview https://www.informit.com/articles/article.aspx?p=1193856

By the way, have you read the thread “BDD / TDD criticized”?

dimitarvp · March 6, 2020, 10:08am

Well that’s exactly the problem – they shouldn’t be compared because they are two different tools serving different purposes.

It’s like being forced to choose between a wrench and a hammer, you know?

dimitarvp · March 6, 2020, 10:17am

Nothing that relates to management of people is scientifically proven, as far as I know. Management involves psychology and working with different kinds of character traits in people – and that simply cannot be generalised and put in a comprehensive theory by science in its current form. Nor are there any large-scale experiments that prove anything beyond any doubt.

Bringing science to discussions about management is fruitless. When we discuss it, we are basically all alchemists arguing which exact ingredients will yield gold bars when mixed in the right pot.

I think it’s important we all recognise that. These “papers” you quote have a number of omissions when contrasted with the pure scientific method (like having large sample sizes, asking people from many different areas, analysing responses based on multiple factors, actively working to remove bias etc.). At best, they are from people who claim to have an epiphany after a long career. And they have a bunch of pals in a few universities that pat them on the back and sign their work without much scrutiny. Boom, the paper is now “credible”. That’s all there is to it and let’s please stop pretending otherwise.

So in this regard, your approach of citing very sketchy sources is ill-placed and won’t help with any discussion you might be willing to ignite.

StefanHoutzager · March 6, 2020, 10:39am

As I said: “The purposes have an overlap in that both serve the purpose of checking correctness.”

StefanHoutzager · March 6, 2020, 10:43am

This is about unit testing, not about management.

lucaong · March 6, 2020, 10:44am

I respectfully disagree.

As I wrote, I think the main purpose of automated tests is not, in fact, checking correctness, but rather minimizing the cost of changes. A poorly-tested project is hard to evolve, as everything you change can break something else.

Code inspections and quality assurance are for checking correctness. A project where no code inspections or quality assurance is performed is more prone to contain defects.

I therefore think that comparing code inspections with tests is comparing apples with oranges: both are necessary, for different purposes, and they are not interchangeable.

Much of the paper you quoted is based on anecdotal evidence and the author’s subjective interpretation. To me it sounds a bit as if someone were saying: “Seat belts and headlights were compared as a means to reduce car accidents in low visibility. Keeping headlights on proved more effective, so they can replace the need for seat belts. Also, seat belts give users a false sense of security.”

The testing practices described in the article are quite far from agile: in many passages, it sounds like the subject of the article performs testing as an afterthought, possibly executed by different people, as opposed to as part of the development process. For example, it advocates testing after the code inspection is performed. I see testing as an integral part of development, performed by the developers themselves (no matter if one practices TDD or not, TDD is not the point here).

I do think that agile practitioners sometimes go overboard, that sometimes agile is deployed as a means of control rather than as a good software practice, and that in some contexts it takes on the traits of a cult rather than a practice. I also believe that the principles of iterative development and testing as part of the development process are still valuable.

lucaong · March 6, 2020, 11:38am

If it was just about asserting correctness and detecting bugs, a round of manual testing would do just as well. You might happen to catch some bugs by running your automated tests, but you won’t catch what you didn’t design the tests for. Simplifying, you cannot reveal a bug that you don’t know of, and if you knew about it you could just fix it.

The value of automated tests is that, once written, they become part of your test suite, pinning down the invariants that should remain even when the implementation details are changed. Every time someone changes some code, the test suite informs them if one of those invariants was inadvertently broken.

The biggest cost of writing tests is paid only once, but you can run the same test as many times as you want during the development process. Therefore, they are an efficient safeguard against breaking functionality while evolving your project.

Inspections and manual testing have a cost every time they are performed. They are a better way to assert correctness and quality of newly added features, but re-inspecting or manually testing everything upon every change would be inefficient.

StefanHoutzager · March 6, 2020, 11:54am

I called it the main goal for many. When you repeat the tests that goal stays the same. For the rest: I know how it works and what the beliefs are.

lucaong · March 6, 2020, 12:17pm

But it is different. Let me make a silly example. Let’s say I am tasked with writing a function that adds two numbers. I end up writing this:

def sum(a, 0) do
  a
end

def sum(a, b) do
  sum(a + 1, b - 1)
end

I then write a test for it:

test "it adds two numbers" do
  assert sum(2, 3) == 5
end

I am not writing this test thinking “let’s see if my code is correct or not”. I know (or at least I am positive) that this test will pass. I write it to record in the test suite what my function should do, no matter the implementation. This test is not about correctness, it is there to prevent regressions.

A code review or quality assurance check then finds that my solution breaks if one tries to sum negative numbers. It reveals that my code is not correct, and also points out that it is cumbersome and inefficient. My test did not reveal this bug, nor the fact that the code is overly complex: does it mean it is useless?

I then discover the existence of + and change the implementation to:

def sum(a, b) do
  a + b
end

I can run the previous test to make sure that it still passes. I am addressing a different concern (support for negative numbers), but I still want the previous behavior to be supported (positive numbers). The previous test is useful, precisely because I can re-run it at no cost, to make sure my new code still does well what the old code was doing successfully. I will in fact also write another test to protect from regressions regarding negative numbers:

test "it adds negative numbers" do
  assert sum(2, -3) == -1
  assert sum(-2, 3) == 1
  assert sum(-2, -3) == -5
end
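For anyone who wants to run the end state of this example, here is a minimal sketch as a single script. The module name `Calc` and the file name `sum_test.exs` are made up for illustration; the function body and the assertions are the ones from the post above.

```elixir
# Run with: elixir sum_test.exs  (hypothetical file name)
# ExUnit.start/0 registers the tests below to run when the script exits.
ExUnit.start()

defmodule Calc do
  # Hypothetical module wrapping the final implementation from the post,
  # which simply delegates to the built-in + operator.
  def sum(a, b) do
    a + b
  end
end

defmodule CalcTest do
  use ExUnit.Case, async: true

  # The original test: pins down the behaviour for positive numbers.
  test "it adds two numbers" do
    assert Calc.sum(2, 3) == 5
  end

  # Added after the review found the negative-number bug.
  test "it adds negative numbers" do
    assert Calc.sum(2, -3) == -1
    assert Calc.sum(-2, 3) == 1
    assert Calc.sum(-2, -3) == -5
  end
end
```

Swapping the recursive version back into `Calc` would leave the first test green, while `Calc.sum(2, -3)` would never terminate, because `b` moves away from 0 on every call. That is the kind of gap the review caught and the second test now guards against.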

Ninigi · March 6, 2020, 1:49pm

I really appreciate your views on tests and agile and pretty much the software culture as it is right now, but I absolutely disagree with almost every point you make.
This is probably a great example of how there is no absolute right way to build a house, but you can disagree in the details.

It sounds like you are suggesting that unit tests are worthless, and should always be disregarded in favor of people using the software. I think this is wrong, and I will make my case further down. I have written software with and without unit testing, and I know the pain tight unit testing can inflict on the overall development process, so I do not agree with a dogmatic TDD approach. But I also don’t believe that is how people practice TDD.
Furthermore, you mentioned Uncle Bob - he is a person I used to look up to, because what he said made sense to me, and following some of the stuff he preached made me feel better as a developer. Still, I agree, he is a dogmatic “do never change my gospel” kind of person, and some of the more recent things he said are even more disgusting. But you cannot disprove things by quoting people who might have done wrong or not; that is an “argumentum ad hominem”, a well-known fallacy.
You also quote passages of studies that support your argument, but disregard other aspects of the argument that are way outside of the study, which is another fallacious way of arguing your point you keep using.
I realize this might come off as a personal attack, but it’s not. I just think you are wrong, and you will probably think I am wrong. That’s healthy, and that’s how it should be! You may correct me in any of the points I want to make, if I misrepresent your opinion.

So now to my actual points regarding Code Inspection VS Unit Tests.

1. Unit tests constrict development, and make changes harder
I enjoy writing unit tests, because I like watching red turn to green as feedback for my efforts. I agree that unit tests can distract you from the actual goal, and in a test suite that was not done with unit tests in mind, the test setup can be far too complicated, just to prove that - indeed - plus(1, 3) returns 4 (if you have unit tests that basically just test usage of basic implementations like this, or implementations that have been tested in a library, delete them.)
Maybe you could say that I am an advocate of deleting tests after writing them. If you want to make sure 1 + 3 makes 4 in your setup… Sure go ahead! But don’t expect anyone (including yourself) to maintain that test, at some point someone will raise the question whether or not you should even have such a function.

I use different approaches to software development: I work with 0% test coverage, I work with close to 100% test coverage, I work with code that had tests written after the code, and I work with tests written before. I almost certainly do not have your level of experience (sorry for making deductions based on your avatar), but I can certainly say that whenever I have to dive into unknown code, I am thankful for everyone who bothered to write tests, even if the tests are crap.
Because their tests describe their thought process. I make a change, suddenly 50 tests are red… Oh oh, I did an oopsie? No, they just wrote all their tests assuming that the arguments will always be exactly X. Ok. Bad assumption, and I have to rewrite the tests anyways to fit my change. And suddenly there is only 1 red test, and I realize that I would break an existing feature with what I did.
I lost some time, but I gained a lot of insight into how the person who wrote this code thought about the process. It can be super frustrating - WHY WOULD YOU - mea culpa - we are all very guilty of writing crappy code from time to time.

However, working my way down from the top-level function to the more nitty-gritty ones, and having tests supporting my assumptions, helps me think about the overall design, and it helps me rethink! This is my opinion, and it certainly works differently for different people, but whenever I break down a function into its components, and I realize that the setup for even one of those unit tests is extremely difficult, then I might have gone in the wrong direction…
This is imo what doctests in Elixir promote. If you cannot test it in a doctest, do you really want to do it like this? Should this not be a top-level function? I cannot count the number of times I realized that a function should be top level, and thus tested more as an integration test than as a unit test.
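For readers who have not used doctests, a minimal sketch of the idea, assuming a Mix project. The module `MyMath`, its `sum/2` function, and the file paths are hypothetical and only serve as illustration. The `iex>` lines inside `@doc` are the examples that ExUnit's `doctest` macro executes and compares against the line that follows them.

```elixir
# lib/my_math.ex (hypothetical path inside a Mix project)
defmodule MyMath do
  @doc """
  Adds two numbers.

      iex> MyMath.sum(2, 3)
      5

      iex> MyMath.sum(2, -3)
      -1
  """
  def sum(a, b), do: a + b
end

# test/my_math_test.exs (separate file, run with `mix test`).
# Shown as a comment because `doctest` reads the documentation of the
# compiled module, so it is meant to live in a project's test directory
# rather than in the same script as the module it checks.
#
#   defmodule MyMathTest do
#     use ExUnit.Case, async: true
#     doctest MyMath
#   end
```

If an example is awkward to write in this form because it needs a lot of setup, that is the design signal described above.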

I said that I would advocate for deleting unit tests after writing them, and I think that is a very valid approach. If you are like me, you would want some very small scale verification on parts of your code working. If you are like me, the process of writing

test "it does exactly X if given Y"

helps you think about what your function should and should NOT be doing, and improve code quality. But it might not be beneficial to make code future proof, so by all means… Get rid of development tests! Tests are lines of code that have to be maintained, less code often means less maintenance - yes!

But by no means do I think unit tests are a waste of time or could be replaced by pure integration testing, be it by humans or a machine. What you are suggesting is that libraries should be tested by their users, contributors be damned.

2. Code Inspection makes Unit Tests obsolete

Disclaimer: I assume we are talking about code inspection as defined by Wikipedia. If I am wrong here, I am looking forward to reading what your definition is.

Not only does this disregard my point about how unit tests can help you think about code, it also suggests that everyone can afford code inspection.

What if you are a startup, and you think your code could be improved by the masses! You make a library public, and boom, there is your code inspection, right!!!
No, that is not how it works… People will hesitate to contribute to libraries they don’t fully understand, tests will give them security, etc. But I understand you are talking more about specific projects, where everyone is working for a single goal to achieve! Like, that company will never have people leaving, people joining!
Think about a developer who has to do a simple task in a big project. Would it make them feel more confident about their code if you told them “we will know if it worked or not once we have the Code Inspection results”, or if they could run a test suite that either tells them “all green” or “10 red”?
If you think just doing things in a project that is completely foreign to you, without a tight test frame is completely fine, then you are a hypocrite. I have never met a junior developer, or a senior, who was fine with just changing stuff to what they thought was right, without getting feedback from anyone - or in the case I am trying to make - something.

3. Developers save time by not writing unit tests
This is a very, very controversial point. I have lost hours and hours just fixing tests that did not meet the initial expectations anymore, but I - as in, me personally, people are different - prefer delving into how things USED to be and potentially find bugs before they happen, over months of calm sea and then facing the ultimate storm.
Maybe you are not involved with these kinds of things, but web developers have to face the ugly truth that no version can last forever. Upgrading an Ubuntu server to the latest version might feel great, but how long until the app has to be updated? Are you running an Elixir 1.6 app? That is so outdated… Have fun updating to 1.10 (disclaimer, I found Elixir apps to be super easy to update, since I prefer to pull in the least amount of dependencies, and Elixir tends to give very descriptive warnings, but boy… there are some tricky bastards to tackle already!)

A good test suite can give great developers great peace of mind - (at least I did not break anything.)

4. TDD takes the fun out of development

Now this one is entirely me, putting words into @StefanHoutzager 's mouth. It always sounds like you are saying writing tests first takes away from the developer experience and freedom, basically just testing it manually and throwing the result at users is test enough. Sorry if I misinterpreted you, feel free to correct me.

I confess, I do not always write tests first. I confess I think testing certain template responses in Phoenix (or whatever framework) is a waste of time. I confess I think Uncle Bob is full of ***. I confess I think some of the projects I know like the back of my hand should quit bothering me with implementation details and badly written tests, and just let me deploy what I (at that certain time) think is the fix for a bug!

But I generally enjoy the red -> green cycle of tests. I enjoy writing stuff that I know others will look at, my test descriptions, and I enjoy their feedback. I am guilty of writing tests like “it works with X”. I am guilty of overtesting. I am guilty of not testing enough.
I usually am more thankful for my over-testing past self though. At least it forces me to understand why a test was failing!

5. What are you proposing then…?
My approach would raise eyebrows in every community. Me, my next year me, would probably scold me for what I am doing right now.
I write tests… I do your approach from time to time (without the million dollar company you seem to have behind you, or whatever it is that lends you your ego) and just go for it.
I sometimes prefer writing tests after I figured out the top level API.
I sometimes hate myself for the unit tests I wrote.
I sometimes love myself for the unit tests I wrote.
I sometimes omit top level tests.
I write code without any tests!
I will not merge any code without tests into this project!

This is how programmers work, we are all human, we all prefer different approaches. But I think you are wrong @StefanHoutzager

Ninigi · March 6, 2020, 3:39pm

I want to work evidence based, not on what one enjoys or not.

Excuse me? Seriously?

“Oh man, that company drained me, 12 hours a day, tight schedules, always asking for more!”
“Ah, I want to work evidence based, not on what one enjoys or not.”

This is what we are dealing with, not some studies.

Ninigi · March 6, 2020, 3:41pm

I gave my own experience, based on my own work, and I think I made it quite clear that everything is my opinion.

If I can’t be proof for my own experience… wtf can.

Ninigi · March 6, 2020, 3:50pm

Point taken.

So should I tell my employer that we need to hire an expert team to inspect our code before we do the next deployment? Or should I tell them that it will take longer because we are trying to write reasonable, test-based code?

EDIT:

What exactly are you proposing?

Ninigi · March 6, 2020, 4:07pm

This question is very serious. If you think my own-opinion-based development is wrong, please enlighten me.

And please answer my other replies, even though they were not in the same thread. I can compile them into one if need be.

dimitarvp · March 6, 2020, 4:32pm

Dude, if you want to talk to yourself, please just start a blog and get this over with?

It’s the same with you every time:

You express a strong opinion.
You claim it’s supported by science.
You cite a very sketchy paper (which, as I pointed out, doesn’t even qualify as a “paper” in purely scientific terms).
You are passive-aggressive when criticised:

Q.E.D.

You start dancing on the edge of the forum rules by subtly telling people their opinion is bad:

There exists no evidence at all when we’re talking about ways to approach creating and maintaining a software project. There’s no OneAndOnly® solution; it all depends on the project. I don’t know why this is so hard to accept but I am giving up.

Seriously, at this point it reflects badly on me that I am even responding to you. I am taking it as a note for future self-improvement.

Again, some friendly advice: if you like talking to yourself and don’t want your opinion discussed, just start a blog and disable comments.

StefanHoutzager · March 6, 2020, 5:33pm

There is a lot of research on evidence-based software engineering, and there are outcomes that you can profit from. As for the rest, I will ignore your hostile reaction. That means a flag.