Tuesday 9 December 2008

Clean code

I've recently started reading a book called Clean Code by Bob Martin. I like this book's attitude: right up front you're left in no doubt that this is a book written by and for programmers. You're going to have to study real code examples hard to get the most out of it. So far, so good.

However, on page 29, we get the first example with more than ten lines. The book starts by presenting this listing:

private void printGuessStatistics(char candidate, int count) {
String number;
String verb;
String pluralModifier;
if (count == 0) {
number = "no";
verb = "are";
pluralModifier = "s";
} else if (count == 1) {
number = "1";
verb = "is";
pluralModifier = "";
} else {
number = Integer.toString(count);
vderb = "are";
pluralModifier = "s";
}
String guessMessage = String.format("There %s %s %s%s", verb, number, candidate, pluralModifier);
print(guessMessage);


The book suggests that this listing's local variables have unclear context at first: you have to read through to the end of the listing to work out what they are for. The book then suggests replacing the listing with this.
----------------------

public class GuessStatisticsMessage {
private String number;
private String verb;
private pluralModifier;

public String make(char candidate, int count) {
createPluralDependentMessageParts(count);
return String.format("There %s %s %s%s", verb, number, candidate, pluralModifier);
}

private void createPluralDependentMessageParts(int count) {
if (count == 0) {
thereAreNoLetters();
} else if (count == 1) {
thereIsOneLetter();
} else {
thereAreManyLetters(count);
}
}

private void thereAreManyLetters(int count) {
number = Integer.toString(count);
verb = "are";
pluralModifier = "s";
}

private void thereIsOneLetter() {
number = "1";
verb = "is";
pluralModifier = "";
}

private void thereAreNoLetters() {
number = "no";
verb = "are";
pluralModifier = "s";
}


This might be an improvement in quickly grasping the meaning of the variable names, but it's still fantastically complex. Adding a class always has a cost, because there's the overhead for understanding of wondering whether this class has other uses what it's scope is, how long it's supposed to live, and so on. But, more important, what about this solution:


String getGuessStatistics(char candidate, int count) {
switch (count) {
case : 0
return String.format("There are no %ss", candidate);
case : 1
return String.format("There is 1 %s", candidate);
default:
return String.format("There are %s %ss", Integer.toString(count), candidate);
}
}


To me, this is a better solution. This is to do with a property of code that I'm going to call "glanceability". The shorter a listing, then all else being easier, the quicker it is to understand. However, it's not just that my version is shorter: for me at least, once a piece of code gets down to a certain size and complexity, its overall thrust is capable of being understood at a glance. The last listing achieves this: one glance and I can see that it returns a new String which formats the arguments somehow. In part this is about size, but it's more about simple structure, and in particular having just one control structure: if there were six cases this code would still be glanceable. To get to that "glanceability" I'm prepared to break lots of (perhaps all) other rules. In this case, I have multiple return points, multiple calls to the same static function, and multiple places where I record that all my result strings start "There ..." string. All of which seems insignificant to me because of the glanceability.

The two listings from the book fail the glanceability test. Once they fail that, for me at least, then size and complexity aren't everything for me. If it's going to take a few seconds to understand a listing anyway, then being normalised and split into lots of small pieces might outweigh the gains from being simply shorter. I think I do prefer Bob Martin's second listing over his first, even though it's longer, because neither is glanceable, and the second doesn't have so much code that has to grasped in one go.

The second point I see here is about refactoring. The first listing from the book assumed that the best way to generate the statistics message was to generate a set of component parts according to the parameters, and then stitch those together in the same way in every case. The second listing from the book tried to fix the problems of the first while retaining the same algorithm. To get to my favourite listing, I had to stop refactoring and ditch the algorithm. It often seems to me when refactoring that I'm not radical enough. It's something to do with the refactoring tools found in IDEs, which mostly support code transformations that preserve the existing algorithm. This makes it much easier to improve the implementation of the existing algorithm than to replace it with a better algorithm: the latter requires pushing the keyboard away, sitting back and thinking hard for a while. Quite apart from being hard work, this isn't always the most sociable thing to do in a pair programming situation.

Finally, I didn't write this all in one go. In the meantime, I read some more of the book, and guess what? Just six pages further on, it tells me that "...functions should not be large enough to hold nested structures...". I think this might be a rule that'll guarantee glanceable functions.

And finally, finally, don't let this put you off the book. Everything else I've read so far strikes me as sensible and useful: and I'm confident I'll learn plenty more as I go on.

6 comments:

Unknown said...

Hi Dave,

Wouldn't another possible refactoring be to create 3 additional functions:

private String getNumber(int count) {
    return (count == 0) ? "no" : Integer.toString(count);
}

private String getVerb(int count) {
    return (count == 1) ? "is" : "are";
}

private String getPluralModifier(int count) {
    return (count == 1) ? "" : "s";
}

private void printGuessStatistics(char candidate, int count) {
    String guessMessage = String.format("There %s %s %s%s", getVerb(count), getNumber(count), candidate, getPluralModifier(count));
    print(msg);
}

Moreover those 3 functions might be candidates for a Utility class as they would be useful for other functions similar to printGuessStatistics....

-- Ron --

Unknown said...

Hey Dave, just like you (and many others if you read Uncle Bob's blog post and amazon reviews), I find the book to be very contradicting. I wouldn't recommend non-experienced (students) programmers to buy it. Originally I intended to buy "The Pragmatic Programmer", but "Clean Code" was available immediately. On my blog, I describe not only the readability issue, but also other dangers that come forth from over-exracting too eagerly (e.g. encapsulation). The sad fact is, after I described this problem, I found an example of it just a few pages later which I added in an update.

http://whathecode.wordpress.com/2010/12/07/function-hell/

At the moment I'm analysing another bad section of the book, "Data/Object Anti-Symmetry". Which brought me to a post linking to this post. ;p

http://jd-syntropy.blogspot.com/2009/01/dataobject-anti-symmetry.html

At first glance it gets the point across I'm thinking about.

The thing that strikes me most is that Uncle Bob is supposed to be an expert in this field, with my short 3 years of professional experience, I'm wondering whether I'm missing the point.

Thanks for your post, as it provides for some comfort that I'm not alone on this. :)

Dave Cleal said...

@Ron: I wouldn't do that: I might move my function to a utility class if I had other callers. But breaking it up like this I don't find helpful. A clue that this isn't going well is that your function names are a bit odd: "getNumber" returns a string?

Dave Cleal said...

@Steven: I have a much more positive view of the book than you, and I would recommend it to students. I would counsel them to be a bit sceptical about the examples though.

- Dave

Unknown said...

@Dave: By rereading my posts I can understand that my view of the book seems .. awful. ;p I do find the book very useful however. I'm only afraid, I wouldn't have been able to look at the examples sceptically without a few years of experience.

Sean said...

Dave, I find your solution easily superior to the one in Code Complete. Yours is succinct and reads well. Further, you've chosen to make it a pure function that returns a string, which makes testing easy. Kudos!

The class-based Code Complete version follows the age-old OO pattern where the interface of each method is unclear. I'm left with the same old annoying questions like: Does a method depend on just its arguments, or does it depend on member data? Does a method mutate member data or just return a value? They've opted for a needlessly complicated solution that I need to unravel. Code Complete really failed on this one.