Other People’s Code

I don’t know if this is just me being useless, but one of the things I’ve always found difficult is debugging or rewriting computer programs written by other people. This is not a complaint about people who fail to document their code well enough for others to see what’s going on; it’s that even when the code is documented, it seems much more difficult to spot errors in code written by other people than in a program you’ve written yourself.

I’ve been thinking about this a lot since I’ve been teaching Computational Physics here at Maynooth University. One of the standard elements of the assessment for this module is a task wherein the students are given a Python script intended to perform a given task (e.g. a numerical integral) but which contains a number of errors, and are asked to identify and correct them. This is actually a pretty tough challenge, though it is likely to be one that a graduate might have to meet if they get a job in any environment that involves programming.

Another context in which this arises is our twice-weekly computing laboratory sessions. Twice in the last couple of weeks I’ve been asked for a bit of help by students with code that wasn’t working, only to stare at the offending script for ages and fiddle with a number of things that made no difference, without seeing what turned out to be an obvious mistake. Last week it was an incorrect indent in Python (always a hazard if you’ve been brought up on Fortran). This week it was even simpler: a sign error in a line that was just supposed to calculate the mid-point of an interval. I should have been able to spot these very quickly, but I couldn’t.
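
To make the second slip concrete, here is a hypothetical reconstruction (the student’s actual script isn’t reproduced here, and the function names are invented for illustration), together with the sort of one-line sanity check that would have exposed it: a mid-point has to lie between the two end-points.

```python
def midpoint_buggy(a, b):
    # Hypothetical version of the slip: a minus sign where a plus was intended.
    return (a - b) / 2


def midpoint(a, b):
    # The mid-point of [a, b] is the average of the end-points.
    return (a + b) / 2


a, b = 1.0, 3.0
for f in (midpoint_buggy, midpoint):
    m = f(a, b)
    # A mid-point must lie inside the interval it came from.
    print(f.__name__, m, "inside interval:", a <= m <= b)
```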

What makes this so difficult? When given a mathematical calculation to mark I can usually spot errors reasonably easily (unless the working is illegible), but with code it’s different (at least for me). If I’d been given it on a piece of paper as part of a formula, I reckon I would have spotted that minus sign almost immediately.

One possibility is just that I’m getting old. While that may well be true, it doesn’t explain why I found debugging other people’s code difficult even when I was working on software at British Gas when I was 18. In that context I quite often gave up trying to edit and correct software, and instead just deleted it all and wrote my own version from scratch. That’s fine if the task is quite small, but not practicable for large suites written by teams of programmers.

I think one problem is that other people rarely approach a programming task exactly the same way as one would oneself. I have written programs myself to do the tasks given to students in the computing lab, and I’m always conscious of the method I’ve used. That may make it harder to follow what others have tried to do. Perhaps I’d be better off not prejudicing my mind by doing the exercises myself?

Anyway, I’d be interested to know if anyone else has the same problem with other people’s code and if they have any tips that might improve my ability to deal with it. The comments box is at your disposal…

9 Responses to “Other People’s Code”

  1. There is a lot of work by e.g. Mark Guzdial on why teaching CS is hard, and why at present there are only mitigation strategies (but they’re doing a lot of work on improving them…).

    A few strategies that help when I’m teaching Python:

    Use a common IDE with good support until the students have some confidence. I use spyder as it’s free with the Anaconda distribution. This catches errors earlier than coding in Jupyter notebooks, for all the latter’s other advantages.

    Encapsulate tasks into functions. Then think of the simplest possible tests of those functions. For an integration problem, make the integrand be zero, then a constant, then linear. This gives simple cases that can be hand-calculated.
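
    As a minimal sketch of that strategy (the trapezoidal rule and the names `trapezoid` and `check` are just assumptions for illustration, not anyone’s actual course code), the three simplest integrands can each be checked against a hand calculation:

```python
def trapezoid(f, a, b, n=100):
    """Composite trapezoidal rule for f on [a, b] with n equal subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return h * total


def check(value, expected, tol=1e-12):
    """Fail loudly if a computed value differs from the hand-calculated one."""
    assert abs(value - expected) < tol, (value, expected)


# Simplest possible cases, each with a hand-calculated answer on [0, 2].
check(trapezoid(lambda x: 0.0, 0.0, 2.0), 0.0)  # zero integrand: area 0
check(trapezoid(lambda x: 3.0, 0.0, 2.0), 6.0)  # constant: 3 * 2
check(trapezoid(lambda x: x, 0.0, 2.0), 2.0)    # linear: 2**2 / 2
print("all simple cases pass")
```

    The trapezoidal rule is exact for all three cases apart from rounding, so a tight tolerance is reasonable here; a different rule might need a looser one.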

    Next, print out all steps in the function, when run on simple cases where you know (or have intuition) about what the answers in the substeps are. If there’s too much output, use a “bisection” strategy: print out the input to the function (which should be right), and the output (which should be wrong), and then add/remove prints from the midpoints until you find the step that doesn’t do what you expect.
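
    One way to sketch the “bisection” idea, assuming a made-up multi-step function and a hypothetical `trace` helper that prints only the labels you ask for:

```python
def trace(label, value, show):
    """Print value if its label was requested, then pass it through unchanged."""
    if label in show:
        print(f"{label} = {value!r}")
    return value


def mean_square_deviation(data, show=()):
    # Hypothetical multi-step calculation, used only to illustrate the idea.
    trace("input", data, show)
    n = len(data)
    mean = trace("mean", sum(data) / n, show)
    deviations = trace("deviations", [x - mean for x in data], show)
    squares = trace("squares", [d * d for d in deviations], show)
    return trace("result", sum(squares) / n, show)


# Step 1: look only at the two ends of the calculation.
mean_square_deviation([1.0, 2.0, 3.0], show={"input", "result"})

# Step 2: if the result looks wrong, add a print near the middle and repeat,
# halving the suspect region each time.
mean_square_deviation([1.0, 2.0, 3.0], show={"input", "deviations", "result"})
```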

    If you’ve gone beyond beginners, and are working in an environment that can use it (like spyder), use a debugger instead of a print. It’s more powerful and much faster with compiled languages, so a good habit to get into. Often using one breakpoint, plus stepping and variable-value inspection, is enough to spot the worst problems.

    Finally, test-driven debugging doesn’t require exact knowledge of the answers. You can check the symmetry of the answers, or the direction of a force term, or… A simple example would be a molecular dynamics (or gravitational N-body) calculation: if you have a function calculating the force correctly, put in just two bodies, take one step, and check that they got closer or further apart as appropriate. Then try three in a line, or an equilateral triangle. This gives information as to the symmetries of the internal calculation.
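
    A rough sketch of the two-body and three-in-a-line checks, assuming Newtonian gravity in 2-D with G = 1 and a simple update scheme (all function names here are illustrative):

```python
import math


def accelerations(positions, masses, G=1.0):
    """Pairwise Newtonian gravitational accelerations in 2-D."""
    acc = [[0.0, 0.0] for _ in positions]
    for i, (xi, yi) in enumerate(positions):
        for j, (xj, yj) in enumerate(positions):
            if i == j:
                continue
            dx, dy = xj - xi, yj - yi
            r = math.hypot(dx, dy)
            acc[i][0] += G * masses[j] * dx / r**3
            acc[i][1] += G * masses[j] * dy / r**3
    return acc


def step(positions, velocities, masses, dt):
    """One simple step: update the velocities, then the positions."""
    acc = accelerations(positions, masses)
    new_v = [[v[0] + a[0] * dt, v[1] + a[1] * dt] for v, a in zip(velocities, acc)]
    new_p = [[p[0] + v[0] * dt, p[1] + v[1] * dt] for p, v in zip(positions, new_v)]
    return new_p, new_v


def separation(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Two bodies released from rest: an attractive force must bring them closer.
pos = [[-1.0, 0.0], [1.0, 0.0]]
vel = [[0.0, 0.0], [0.0, 0.0]]
new_pos, _ = step(pos, vel, [1.0, 1.0], dt=0.01)
assert separation(new_pos[0], new_pos[1]) < separation(pos[0], pos[1])

# Three equal masses in a line: by symmetry the middle one feels no net force,
# and the outer two feel equal and opposite accelerations.
acc = accelerations([[-1.0, 0.0], [0.0, 0.0], [1.0, 0.0]], [1.0, 1.0, 1.0])
assert abs(acc[1][0]) < 1e-12 and abs(acc[1][1]) < 1e-12
assert abs(acc[0][0] + acc[2][0]) < 1e-12
print("symmetry checks pass")
```

    Neither check needs the exact trajectory; a sign error in the force would make the first assertion fail straight away.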

    The final point about test-driven debugging as a teaching strategy is about *performance*. Staring at code next to a puzzled student is frustrating for all concerned, but even when it works it’s still frustrating for students as they learn so little from the process. By trying lots of little tests you are not just finding what the problem is, you are also implicitly suggesting to the student that there are things they could try in the future (other than just asking you). By performing a test-driven strategy you can get (some of) the students to think about how they would test their later codes, which brings in the idea of unit testing and so on.

    • telescoper Says:

      Lots of good tips here. Just to mention that we do use Spyder here.

      I’ll also mention that what I do tend to do with the exercises is get them to do something where the answer is known. If you can’t reproduce by numerics, e.g. a definite integral whose value can be determined analytically, then you can’t trust the code to give you the right answer for an unsolved problem. Plus you can find out a bit about the accuracy of the result by comparing it with a known answer.
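
      For instance (a sketch only; the midpoint rule and the test integral are choices made for this illustration, not necessarily the ones used in the module), the integral of sin(x) from 0 to π is exactly 2, so the error of a numerical estimate can be watched as the number of steps is doubled:

```python
import math


def midpoint_rule(f, a, b, n):
    """Composite midpoint rule for f on [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


exact = 2.0  # integral of sin(x) from 0 to pi, known analytically
previous_error = None
for n in (10, 20, 40, 80):
    error = abs(midpoint_rule(math.sin, 0.0, math.pi, n) - exact)
    ratio = previous_error / error if previous_error else float("nan")
    print(f"n = {n:3d}  error = {error:.3e}  ratio to previous = {ratio:.2f}")
    previous_error = error
```

      For a second-order rule like this one, the error should fall by a factor of about four each time n doubles; a ratio that stays far from four suggests a bug even when the answer looks roughly right.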

      • Absolutely agree that known answers are great, not just for cross-checks and convergence tests, but also for highlighting algorithmic issues (integral of 1/sqrt(x) from 0 to 1 using standard trapezoidal rule vs Gauss quadrature being one I use frequently). Sometimes testing that a function fails in the way that you expect is as important as checking it works!
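
      A possible illustration of that example, using only the standard library; the composite two-point Gauss-Legendre rule below stands in for “Gauss quadrature” and is an assumption about what such a demonstration might look like:

```python
import math


def trapezoid(f, a, b, n=1000):
    """Composite trapezoidal rule; evaluates f at both end-points."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return h * total


def gauss2(f, a, b, n=1000):
    """Composite two-point Gauss-Legendre rule; never evaluates f at a panel edge."""
    h = (b - a) / n
    offset = h / (2.0 * math.sqrt(3.0))  # node distance from each panel centre
    total = 0.0
    for i in range(n):
        centre = a + (i + 0.5) * h
        total += f(centre - offset) + f(centre + offset)
    return 0.5 * h * total


def integrand(x):
    return 1.0 / math.sqrt(x)  # singular at x = 0; exact integral on [0, 1] is 2


# The trapezoidal rule should fail, and fail in the expected way.
try:
    trapezoid(integrand, 0.0, 1.0)
except ZeroDivisionError:
    print("trapezoid: integrand evaluated at the singular end-point, as expected")

# Gauss quadrature gives a finite estimate, though it converges slowly here.
print("gauss2:", gauss2(integrand, 0.0, 1.0), "exact: 2")
```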

      • Final plug I forgot: assuming you know the Software/Data Carpentry approaches, I would recommend both CarpentryCon (Dublin, end of May, http://www.carpentrycon.org/) and Greg Wilson’s book (http://third-bit.com/teaching/).

    • Lots of great advice here! Thanks!

  2. “I think one problem is that other people rarely approach a programming task exactly the same way as one would oneself. I have written programs myself to do the tasks given to students in the computing lab, and I’m always conscious of the method I’ve used. That may make it harder to follow what others have tried to do.”

    I think that that is it.

  3. I actually do porting of other people’s code for a living. Because of that I am permanently warped and snarky. I am also old, so not so flexible.

    My opinions with “snark on”:

    python is one of the worst languages ever. As you found, invisible white space totally changes a code, its answers, and is miserable to debug. Yes there are about 1 million open source python libraries that say they can do just about anything. But do you trust the 100K programmers who produced them for free? Using a language where an invisible character can change everything? I don’t.

    C++ is a language that seems to be designed for obfuscation. It can be good if it is programmed for a team, by a team with a manager that rejects code not written clearly. I have seen that once or twice. It should never be used for a simple task. C++ can be a problem when people will be moving in and out of a project regularly unless managing the project is done rigorously and constantly.

    C is a mature language that can be fairly easy to read. It is not object oriented so it is harder to obfuscate what you are doing. (But too many layers of nested include files can still make it nearly impossible to interpret.) C code can be blindingly fast. Probably the best language for simple tasks and 1-D science.

    Fortran90 and beyond is a language I can read as easily as English. Despite the sneers from “computer scientists” (an oxymoron), it is still the best language for doing 3-D physics. If you need multidimensional arrays and want to produce fast code, use it.

    Java: Yuk!

    R: ok, it does some statistics.

    IDEs: First spend two weeks figuring out how to put it on your computer, then one month learning the IDE, then throw it away when you find it does not support something you need to do.

    Snark off:

    The best language (or programming environment) is the one that (a) will do the job that needs to get done, (b) does not include a lot of unnecessary baggage or require a lot of extra time to get it done, (c) can produce code that runs fast on modern computers, (d) is easy to learn, (e) is easy to maintain.

    For some jobs that will be Python, Java, or C++. For some jobs that will be Fortran or C. It is fairly easy to combine separate C++, C, and Fortran source code into a single executable.

    For any project: find a debugger that works for you and invest some time learning to use it. Use print statements to quickly solve problems (and then remove them before someone notices you did it).

    • As an ageing programmer, I couldn’t be bothered to kick Fortran habit. Furthest I got was object oriented Fortran ( ! LS Fortran?) to make my Mac do graphs. Structures, pointers and whatever else experienced there enough to make me avoid C for life.

      Have also seen many Python trained PhD students start to write prog in Python then wonder why it took so long when their dataset increased by order of magnitude or 2. Python bit like the old BASIC as interpreted language. Of course, if you can find a compiled library then it goes faster (as Google does!) but in a research environment you usually want to do something new and that is where compiled language wins.

      Final point – the only thing worse than reading someone else’s prog is reading your own “legacy” code. You always think you’ve made a mistake and occasionally, unfortunately, you have!

  4. One of the biggest differences I found in industry compared to academia was the practices followed in software development. It’s not that industry development has fewer bugs. It’s that bugs are treated as inevitable, and strategies have been developed to minimise their introduction (or reintroduction) and to easily identify where the logic has an error.

    The biggest difference is tests. In principle, every method/function/object (language paradigm depending) is tested independently, as an entirely separate unit. Ideally you write the tests before you write the code! Tests are run (hopefully before…) every commit made to version control. And commits are kept small, and often.

    This serves a few roles. The first is obvious: it makes it much, much easier to identify mistakes as code is written. But less obviously, it makes it easier to identify bugs that almost inevitably occur when changes are made to the code at a later date.

    And any time a bug is found that wasn’t caught by tests, this case is added to the tests. This prevents the same mistake being reintroduced into the code at a later date.
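
    A small sketch of what that looks like in practice, with an invented function and an invented bug (the test functions follow the naming convention that pytest collects, but are also called directly so the file runs on its own):

```python
def bin_index(x, lo, hi, nbins):
    """Return the histogram bin (0 .. nbins - 1) containing x, for lo <= x <= hi."""
    if not lo <= x <= hi:
        raise ValueError("x is outside the histogram range")
    i = int((x - lo) / (hi - lo) * nbins)
    # Hypothetical fix: x == hi used to return nbins, one past the last bin.
    return min(i, nbins - 1)


def test_interior_value():
    # Original unit test, written alongside the function.
    assert bin_index(0.25, 0.0, 1.0, 4) == 1


def test_upper_edge_regression():
    # Added when the bug was found, so the same mistake cannot creep back in.
    assert bin_index(1.0, 0.0, 1.0, 4) == 3


test_interior_value()
test_upper_edge_regression()
print("all tests pass")
```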

    But maybe most crucially, and most overlooked: code that is hard to test in such a fashion is typically considered badly written. Forcing yourself to write easily tested code is almost synonymous with being forced to write better code.

    There are other aspects, but test-driven development is one of the biggest differences.

    I’m sure the best of academic code is written in such a fashion, following industrial best practices. I’m sure many academics know and make the conscious choice to save time by not following these practices (just like many do in industry: I also suspect most of the time it’s the wrong decision in the long-term). But I’m sure plenty of academics aren’t even aware of these practices. I would like to see knowledge of them more wide-spread.

    Peter, you wrote (on rewriting an offending bit of code):

    “That’s fine if the task is quite small, but not practicable for large suites written by teams of programmers.”

    Large suites of software are written by teams of programmers every day, and I’m sure a troublesome function that is failing a test is just re-written if the bug can’t easily be found (I’ve done it before!). The difference is that the code is (or should be) structured such that this is indeed practical. Ideally, when you rewrite the code, you need only run the tests to verify that you have indeed rewritten the code correctly…
