Write a Test that Prints PASSED

From Programmer 97-things


Some years ago I was writing programs to test circuit boards on an assembly line to prepare them for final assembly. The boss was very brief and clear: Write a program that tests the new product and prints "PASSED." It seemed obvious (to me, at least) that if I actually followed his instructions to the letter, the production yield would be 100%, we would realize a major corporate goal, and my boss would receive high praise. I also knew from experience that the boss did not comprehend the complexity of the problem, had not allocated enough time to properly deal with production failures, and that I (not he) would be the fool when the charade was exposed.

Practical test programs normally print "FAIL" if the product does not pass the test, sometimes in large pink letters. The test operator separates the units based on this simple display, which would be OK if the work ended there. Since no one wants to throw away something after making an investment, failed units are consigned to a test technician who, with soldering iron in hand, locates bad connections and produces a "factory refurbished" unit. The technician starts with the same test program, and yes indeed, it does print "FAIL," but he is no closer to knowing which connection is bad, or even what area of the board has the problem. It could be a hidden problem beneath a component, or just that the LED was installed backwards.

This illustrates the need for good error messages. The sanity of the test technician depends on rapidly finding and repairing the problem, so the program must guide him to that end. A production test is normally a sequence of ever more focused tests, so at least indicate which test failed, and then provide documentation that allows the technician to look up possible causes. A good technician will very quickly memorize the list, becoming quite adept at correcting faults. A better approach is to build that information into the test program, actually making suggestions on the operator's screen. As production ramps up (because your test program works so well), new technicians can be trained very quickly.
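
As a rough sketch of that idea, the test sequence can be kept as a table in which each step carries its own name and a suggestion for the technician. Everything named here (TestStep, run_sequence, report, the three stand-in checks, and the suggestion texts) is hypothetical and only for illustration; a real fixture would measure actual hardware instead of returning fixed values.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* One step in the production test sequence (hypothetical layout). */
typedef struct {
    const char *name;        /* what the step checks */
    bool (*run)(void);       /* returns true when the step passes */
    const char *suggestion;  /* where the technician should look on failure */
} TestStep;

/* Stand-in measurements so the sketch compiles without hardware. */
static bool check_power(void)      { return true; }
static bool check_oscillator(void) { return false; }  /* simulated fault */
static bool check_led(void)        { return true; }

/* Ordered from broad to focused, as in a typical production test. */
static const TestStep steps[] = {
    { "Power",      check_power,      "Inspect the connections on the power input." },
    { "Oscillator", check_oscillator, "Inspect the connections around the oscillator." },
    { "LED",        check_led,        "Check whether the LED was installed backwards." },
};

/* Runs the steps in order and stops at the first failure.
   Returns the failing step, or NULL if every step passed. */
const TestStep *run_sequence(void)
{
    for (size_t i = 0; i < sizeof steps / sizeof steps[0]; i++) {
        if (!steps[i].run()) {
            return &steps[i];
        }
    }
    return NULL;
}

/* Shows the operator which test failed and what to try next. */
void report(const TestStep *failed)
{
    if (failed == NULL) {
        printf("PASSED\n");
    } else {
        printf("FAIL: %s\nSuggestion: %s\n", failed->name, failed->suggestion);
    }
}
```

Calling report(run_sequence()) with the simulated fault above names the Oscillator step and its suggestion instead of a bare "FAIL." Keeping the suggestion next to the test that triggers it also makes it harder for the two to drift apart as the product changes.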

The idea can be applied within code as well. Many academic code examples simply return 0 or 1 at the end of a function, and then propagate this up the call stack, leaving the top level to print vague "access failed" or "RPC error" messages. A better plan is to use small integers as error indicators, with 0 as the no-error code because there is only one way of stating that all is well. Beware though, because messages like "Error 7 in PutMsgStrgInLog" can leave non-programmers bewildered. The message should use words that the operator can relate to, and you must know your audience to decide if technical jargon or embedded names are OK. Unclear or confusing error messages can get you a call from an irate third-shift manager who "needs you at the plant immediately to sort out this stupid program."
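
A minimal sketch of small integer error codes, with 0 reserved for success and a single place that turns each code into words an operator can relate to, might look like this. The enum, its members, and error_text are assumed names for illustration, and the fallback wording is invented.

```c
#include <stddef.h>

/* Small integer error codes; 0 is the single "all is well" value.
   The names are hypothetical. */
typedef enum {
    ERR_NONE = 0,
    ERR_SERVER_DOWN = 1,
    ERR_NO_OSCILLATOR = 2
} ErrorCode;

/* Translates a code into operator-facing wording.
   Unknown codes still produce something readable. */
const char *error_text(int code)
{
    switch (code) {
    case ERR_NONE:
        return "No error";
    case ERR_SERVER_DOWN:
        return "The data server connection is not working";
    case ERR_NO_OSCILLATOR:
        return "Oscillator signal is not present";
    default:
        return "Unrecognized error code";
    }
}
```

Lower-level functions return the integer, and only the top level calls error_text, so the wording can be tuned for the audience in one place.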

Specific error messages for every issue make debugging go much faster. On many platforms integers and pointers take a similar amount of storage, so instead of 0 or 1, return a pointer to a string, with a null pointer meaning success. It requires no additional code to test if an integer is zero or a pointer is null, but the benefit can be dramatic. When an error is reported, you can show meaningful messages like "The data server connection is not working" or "Oscillator signal is not present." The exception mechanisms in languages like C# and Java allow for passing strings, but not every embedded coder has that luxury, and someone working on a system written in C has already had their language choice made for them.
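
In C, the pointer-to-string convention can be sketched roughly as follows, with NULL meaning success. The function names and their int parameters are hypothetical stand-ins for real checks; the messages are string literals, so the returned pointers stay valid for the life of the program and nothing needs to be freed.

```c
#include <stddef.h>

/* Each check returns NULL on success, or a pointer to a constant
   message describing the failure. Names are hypothetical. */
const char *connect_data_server(int link_up)
{
    if (!link_up) {
        return "The data server connection is not working";
    }
    return NULL;
}

const char *check_oscillator_signal(int signal_present)
{
    if (!signal_present) {
        return "Oscillator signal is not present";
    }
    return NULL;
}

/* Propagates the first failure message up the call stack unchanged.
   Testing the pointer against NULL costs no more than testing an
   integer against zero. */
const char *run_startup_checks(int link_up, int signal_present)
{
    const char *err;

    err = connect_data_server(link_up);
    if (err != NULL) {
        return err;
    }
    err = check_oscillator_signal(signal_present);
    if (err != NULL) {
        return err;
    }
    return NULL;
}
```

The top level can then print whatever run_startup_checks returns, and the operator sees the specific message produced where the fault was detected.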

By Kevin Kilzer

This work is licensed under a Creative Commons Attribution 3
