8 Tips for Debugging Using Top-Down Techniques and Divide and Conquer

Here we go.  Debug.  People love to write about software methodologies but rarely about debug.  Yet, it’s one of the most important skills any developer needs to have.  I was once asked about my typical debug strategy in an interview, so we know it’s important and says a lot about where we are on our programming journey.

First let me say, before you get involved with doing debug, make sure you’ve reviewed your code.  I don’t know how many times I’ve taken a shortcut and jumped right into debug, only to find out that the error would have popped out at me if only I had reviewed my code first.

So, now, assuming you’ve reviewed your code, what are some good tactics to employ when doing debug?

Well, of course, a lot of that depends, right?  I mean, what kind of software are we talking about?  Is there a server or database involved?  Are we dealing with a mobile app, or are we talking about an old-fashioned desktop program?

Let’s speak in generalities and keep this post fairly high level rather than tie it to one particular kind of software.

Do you remember when you first learned to program?  Maybe it was in BASIC or C?  Probably Python, nowadays.  Anyway, my first language was BASIC in high school.  My first “real” language (compiled language) was C in college.  I remember my great dependence on the printf statement whenever problems occurred, and they always did.

Of course, we were told that more sophisticated methods of debug would later be taught.  Turns out the old print statement still is very useful though, and I have no shame in admitting it or using it.

Well, of course, now you can put in fancy try/catches in many languages, which can be helpful too.  Did your teachers ever really explain how to use a debugger?  That’s the tool that is often integrated within the IDE or, in the case of GNU, comes in the form of GDB (the companion to GCC).  That little tool, albeit a bit cryptic, is so powerful, especially for tracking down pointer issues and other memory-related crashes.  (For memory leaks specifically, a dedicated checker such as Valgrind is usually the better fit.)

Let’s first review some of the more common errors that we might find before we get to some debug tips.

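  1. Segmentation fault (memory or pointer issues).  This usually occurs when you step on memory that doesn’t belong to you (a misuse of pointers), such as dereferencing a null or dangling pointer.  Running out of memory altogether can lead here too: if you haven’t released unused memory, or you try to acquire more memory than exists, the allocation fails, and using the result without checking it crashes the program.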
  2. Stack overflow.  This usually results when your function calls go too deep.  For example, when you call a function recursively and it takes too many recursive calls before the exit criterion is met.
  3. Overflow, underflow related errors.  Overflow means you are trying to represent a number bigger in magnitude than your data type allows.  Dividing by a number close to zero can cause it (dividing by exactly zero is usually reported as its own error).  Underflow is the counterpart at the small end.  Imagine dividing a number by something close to infinity.  The result is so close to zero that it can’t be represented by your data type.
  4. Allowing your algorithm to blow up.  Common symptoms are underflow, overflow, and stack overflow, as well as exceeding array bounds.  (The last one is related to 1 above.)
  5. Precision errors.  Say you use floats instead of doubles.  The accuracy of your answer will be compromised, or the algorithm may become unstable and cause an underflow or overflow error.  This tends to be more of a problem in math and numerical-type programs.
  6. Input related faults.  Faults related to not checking for the proper input data types or bounds on the data, etc.  These can cause all sorts of different kinds of errors or faults.
  7. Logic errors in your program.  You implemented an algorithm incorrectly, for example.
  8. Communication errors.  Say you are trying to communicate with a server that is offline and didn’t write code to deal with this condition.
  9. Errors in a library, plugin or API, etc.  These can be hard because you often aren’t expecting them.  OK, if it’s a WordPress plugin, you probably expect it.
  10. Less likely, but sometimes the OS or browser can even cause problems.  Say a system call has a bug in it, or a bug exists in the browser where you run your webapp.  These are even harder because you aren’t expecting them.  This happens more often on custom operating systems or an OS that doesn’t have a large market share (same with browsers).
  11. Output error.  Say you are trying to print to or control a device that is experiencing an error.  (This is similar to 8, except that the device can communicate.  Your program just doesn’t know what to do when the device reports a fault.)
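To make items 3 and 5 a little more concrete, here is a minimal sketch in Python of what overflow, underflow and a precision error look like with ordinary double-precision floats.  The variable names are my own, and other languages or data types will behave differently (some raise an error instead of quietly returning a value).

```python
import math
import sys

# Overflow: the result is too large in magnitude for a double.
largest = sys.float_info.max       # largest finite double, about 1.8e308
overflowed = largest * 10          # Python quietly returns inf here
print(overflowed)

# Underflow: the result is too close to zero to be represented.
tiny = sys.float_info.min          # smallest normal double, about 2.2e-308
underflowed = tiny / 1e20          # rounds all the way down to 0.0
print(underflowed)

# Precision: 0.1 has no exact binary representation, so errors accumulate.
total = sum([0.1] * 10)
print(total == 1.0)                # exact comparison fails
print(math.isclose(total, 1.0))    # comparing with a tolerance succeeds
```

Notice that none of these raise an error on their own.  That is exactly why they can be nasty to track down: the bad value just keeps flowing through the rest of the program.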

Of course, with some languages, like Java and C#, you’ll rarely have raw pointer issues as you don’t normally have that kind of control (null references can still bite you, though).  Memory on the heap is a similar story.  You still acquire memory from the heap every time you create an object, but you don’t have to worry about releasing it back to the heap (garbage collection does it for you).

Java and C# have some nice abilities using the try/catch related constructs as well.  These can often allow a programmer to catch overflow issues (as long as the arithmetic is set up to raise an exception rather than wrap around silently), as well as deal better with unexpected or “bad” input, etc.
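Here is a rough sketch of the same idea using Python’s try/except, which plays the role of try/catch.  The function names (parse_age, safe_exp) and the 0 to 150 range are made up for illustration; the point is only to show bad input and an overflow being caught instead of crashing the program.

```python
import math

def parse_age(text):
    """Convert user input to an age, rejecting bad types and out-of-range values."""
    try:
        age = int(text)
    except (TypeError, ValueError):
        return None                 # not a whole number at all
    if not 0 <= age <= 150:         # hypothetical bounds check
        return None
    return age

def safe_exp(x):
    """Return e**x, or infinity when the result overflows a double."""
    try:
        return math.exp(x)
    except OverflowError:           # math.exp raises this for very large x
        return math.inf

print(parse_age("42"))      # a valid age
print(parse_age("abc"))     # rejected: not a number
print(parse_age("-5"))      # rejected: out of bounds
print(safe_exp(1000.0))     # too large for a double
```

Whether returning None or infinity is the right reaction depends entirely on your program.  Sometimes logging the problem and re-raising is the better choice.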

Well, that’s a nice review of some of the common errors one might have to deal with.  So what can we do if we have one of these errors?  Sometimes the program (and/or operating system) will tell us which error occurred, but often we don’t really know.  What if it’s a bug in a library, the browser or even the operating system itself?

So here are the options I usually employ — first assuming I’ve reviewed my code and don’t see anything wrong with it.  And when I say review my code, I use a two-step approach.

First, I do a quick review, and then, if that doesn’t help, I scrutinize every line as if I’m back in grad school and analyzing a math proof for correctness.  (I had a rule of thumb that if the professor was over 70, I would assume he had one or more bugs in his proof.  Actually, they were usually just omissions.)

  1. I revert back to dropping a couple of print statements in areas of the program where I have some suspicion.  Here, I use the divide and conquer approach.  The essence of divide and conquer is to continue to break down the problem into smaller pieces until you find the root cause.  If you are familiar with binary search, you are familiar with divide and conquer.  In binary search the data set is continually cut in half until you find your search item or determine it’s not in your data set.  I’ll write more about this general approach in another post.  What do I print?  Sometimes just statements that say I am on line 333.  At other times, I print out variables that I suspect are related to the problem.
  2. If the language has try/catch constructs, I add a few in areas where I think the problem may reside.  Again using divide and conquer.
  3. If steps 1 and 2 didn’t work, I pull out the bigger guns and invoke the debugger.  I set breakpoints and inspect variables where I think the problem may be occurring.
  4. In this step, I may create several print statements that print to a file.  In other words, the program writes a log file while it is running.  Often the bug has already been discovered using earlier steps and I never even have to use this approach.
  5. I decouple the program as much as possible.  I test each module/class/file as much as possible outside of the program using a small test-suite driver program.
  6. I further decouple the program by doing unit tests on each method or function outside of the program using a small test-suite driver program.
  7. I ask a coworker to review my code and the issue I’m having.
  8. I set up a code inspection with the team.  I consider this an all-hands approach.  Sometimes this can work when you have a really tough one that you and your code buddy can’t figure out.
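To show how tips 1 and 4 can work together, here is a small sketch in Python.  It assumes the program can be treated as a list of stages run in order, and it uses binary search to find the first stage that breaks the data, logging each probe to a file.  Everything here (the stages, the validity check, the log file name) is hypothetical and only meant to illustrate the divide and conquer idea.

```python
import logging
import os
import tempfile

# Hypothetical log location; any writable path works.
LOG_PATH = os.path.join(tempfile.gettempdir(), "debug_bisect.log")
logging.basicConfig(filename=LOG_PATH, level=logging.DEBUG,
                    format="%(asctime)s %(message)s")
log = logging.getLogger("bisect")

def run_prefix(stages, data, count):
    """Run only the first `count` stages on a fresh copy of the data."""
    result = list(data)
    for stage in stages[:count]:
        result = stage(result)
    return result

def first_bad_stage(stages, data, is_valid):
    """Binary-search for the first stage after which is_valid fails.

    Assumes the input starts out valid and that once the data goes bad
    it stays bad.  Returns the stage index, or None if every stage passes.
    """
    if is_valid(run_prefix(stages, data, len(stages))):
        return None
    lo, hi = 0, len(stages)    # valid after lo stages, invalid after hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        ok = is_valid(run_prefix(stages, data, mid))
        log.debug("after %d stages: %s", mid, "ok" if ok else "BAD")
        if ok:
            lo = mid
        else:
            hi = mid
    return hi - 1

# Hypothetical pipeline: the third stage introduces negative values.
stages = [
    lambda xs: [x + 1 for x in xs],
    lambda xs: [x * 2 for x in xs],
    lambda xs: [x - 100 for x in xs],    # the bug
    lambda xs: sorted(xs),
]
bad = first_bad_stage(stages, [1, 2, 3], lambda xs: all(x >= 0 for x in xs))
print(bad)    # 2, i.e. the third stage
```

In real life I rarely automate this.  I do the same halving by hand, moving print statements or breakpoints around, but the logic is the same: each check rules out half of the remaining code.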

Generally the tips come down to top-down strategies and employing divide and conquer, decoupling the problem, and putting your ego on the back burner and asking for help.
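As a sketch of the decoupling in tips 5 and 6, here is what a small test-suite driver might look like in Python using the standard unittest module.  The function under test, moving_average, is a made-up stand-in for whatever piece you have pulled out of your own program.

```python
import unittest

def moving_average(values, window):
    """Hypothetical function under test, pulled out of a larger program."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

class MovingAverageTest(unittest.TestCase):
    def test_typical_input(self):
        self.assertEqual(moving_average([1, 2, 3, 4], 2), [1.5, 2.5, 3.5])

    def test_window_larger_than_data(self):
        self.assertEqual(moving_average([1, 2], 3), [])

    def test_bad_window_is_rejected(self):
        with self.assertRaises(ValueError):
            moving_average([1, 2], 0)

# Small driver: run just this suite, outside of the main program.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(MovingAverageTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```

Once a function passes on its own like this, you can cross it off the suspect list and spend your time on what’s left.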

Other things to consider before enlisting help:

  • Take a break from the problem or bug.  If you can, work on something else for a while.  Give your brain a break.
  • Maybe even tackle it the next day after you’re more rested and have cleared your head.

Of course, a no-brainer is to Google any error message(s) you get.  I figured that was too simple to add to the list, but let’s face it, it’s often a smart idea before going too far down other paths.

Well, there you have it.  These are my usual go-to strategies.  Of course, these are meant to be rather generic.  If I were doing embedded software, for example, I would probably set up an emulator.  And if I were doing some web stuff, it would be MUCH different, like hanging out in the browser, forever.  But, nonetheless, many of these debug principles should carry over across various areas of programming.