Monthly Archives: January 2010
Software Design: Cognition and Design Principles

Software design is a subject fraught with disagreement among developers. The size and scope and importance of the subject demand serious attention from any developer considering himself more than just a hacker. What follows is my current understanding of software design as a subject.

Before we can properly deal with the subject of what good design is, we must first ask what is the purpose of design. Is it just to make the software work? Or is there something else. While the software must perform its basic function, I regard “working” as a second-order consideration. Functioning software is a necessary goal of design, but it is not sufficient to explain why we need design. Consider for a moment that many companies have working software that is poorly designed. Cearly, software does not necessarily have to be designed well, or even at all, in order for it to work—at least in the short term. The goal of design must be something else, something other than just the basic “does it work now?” question.

At this point one might be tempted to say that the purpose of design is “maintainability.” But what is meant by “maintainability?” What makes one software project maintainable, and another a disaster?

Consider that the computer does not care about the structure or organization of the code. The computer is not concerned with what language the software was written in, which patterns, techniques, processes, or tools were used in the construction of the software. To the computer, your program is just a sequential series of instructions: “Do x, Do y, If z Do a.” Why then should we concern ourselves with design principles, patterns, and practices? What are they for?

Software developers are often described as “writing code,” but we don’t normally think of ourselves as writers in the traditional sense. What if we did? When a writer sits down to write, the first questions s/he must answer is “who is my audience?” In deciding on an audience for his work, a writer constrains what is or is not acceptable in terms of the structure and content of his written work. These constraints are not arbitrary or optional: they are logically necessary implications of the choice of audience. If software developers are writers, then their work must also be constrained by their target audience. But if the target audience cannot be the computer, who is it?

It’s people. We do not write code for the computer. We write code for other people—for our co-workers, for ourselves, for any others that may have an interest in what the software is supposed to do at any time over the entire lifespan of the project. The purpose of software design is to communicate what the project does, and how it does it. Any set of software design principles and methods must be targeted at communicating to people. It must be constrained by the nature of the cognitive activity of the human mind.

Human Cognition

The human mind is only capable of dealing with 4 or 5 concretes at one time, and yet we are confronted with thousands of concrete objects, ideas, and choices that we must deal with on a daily basis. We must aggregate concretes into abstractions so that we can then treat the abstraction as a concrete for further thinking. For example, you may not be able to tell how many pipes this is must by glancing at it:

||||||||||||||||||||||||||||||

But if I broke it up into groups:

|||||    |||||    |||||    |||||    |||||    |||||

You should be able to tell that there are 6 groups of 5, ergo 30 pipes. The act of viewing the pipes in groups enables thinking. This is analogous to what we do when we perform abstraction. Abstraction is a process of integration and differentiation. We integrate like members of a class of existents, and differentiate them from the other members of the wider class to which they belong. For example, we can observe that certain objects are differentiated from other “media” by the fact that they have pages covered in text. “Book” as an abstract integrates all these objects under a common abstraction while distinguishing them from other kinds of media.

Well-designed software must be intelligible and discoverable and have an appropriate level of abstraction.

Intelligibility

Intelligibility means that even if the software does what it is supposed to do, it should make sense to the average software craftsman. I do NOT mean that we should “dumb-down” our code to the lowest common denominator. For any given field of endeavor there exists as general category of knowledge that can be expected of our target audience. Developers that do not bother to acquire that context have only themselves to blame for not being able to understand common software idioms. Suppose for example that your company has recently adopted an object-oriented language after years of writing procedural code, and the developers are not yet fully comfortable with concepts such as “classes” and “interfaces.” That does not mean that you should not introduce them to concepts such as design-patterns.

Just as we should avoid dumbing-down our code, we should also avoid “stupid code tricks.” A Stupid Code Trick is when a developer takes advantage of a subtle or little-known feature of a language to accomplish a goal that could be solved with a simpler code-construct. C is famous for this kind of code—some wise-guy developer will take advantage of the difference between “++i;” and “i++;,’” but in the context of a larger expression which consists primarily of pointer symbols, member signifiers, and parentheticals. Another Stupid Code Trick is to embed a long chain of function calls as arguments to a function. In this case, you end up with code that looks like:

// Stupid Code Trick
ProcessMyData(GetDataToProcess(GetMyIdentity(My.Session.User)), GetMethodToProcess(1, 12, 15), GetOutputFormat(OutputFormat.Default))

This should be re-written as:

var identity = GetMyIdentity(My.Session.User);
var method = GetMethodToProcess(1, 12, 15);
var format = GetOutputFormat(OutputFormat.Default);
var dataToProcess = GetDataToProcess(identity);
ProcessMyData(dataToProcess, method, format);

The temptation to perform a Stupid Code Trick seems to be rooted in the desire to reduce the number of lines of code. We must resist this temptation by remembering that our purpose is to be intelligible, not to reduce the number of lines of code.

Intelligible software should not confront the developer with too many concretes at once without a unifying abstraction. Let’s count the number of concretes in the above code:

var identity = GetMyIdentity(My.Session.User); // 3
var method = GetMethodToProcess(1, 12, 15); // 5 
var format = GetOutputFormat(OutputFormat.Default); // 3 
var dataToProcess = GetDataToProcess(identity); //4 
ProcessMyData(dataToProcess, method, format); // 4

I count one abstraction for the return result of the method, one for the method itself, and one more for each argument to the method. In the original version of this function, there were 20 concretes crammed on one line of code. By breaking up each function call into a separate call, we have a horizontal complexity of 3 to 4 on each line, and the entire algorithm is processed in 5 lines. Both horizontally and vertically the refactored code is within the ability of the mind to grasp at once.

This limit of 4 or 5 concretes means we should try to keep the number of arguments to functions as small as possible. the return result and method itself give us a concrete complexity of 2. anything over 3 arguments to a function and we are straining our ability to hold everything in our mind at once. The limit also means that we should keep methods as short as possible. It’s better to have many small and well-named functions than to have one massive function that does everything.

Discoverability

Another feature of the refactored code sample is its discoverability. Discoverability refers to the extent to which the code expresses its intent. I can make the code less discoverable by renaming its methods and arguments as follows:

var i = GetI(My.Session.User); // 3
var m = GetM(1, 12, 15); // 5
var f = GetOF(OutputFormat.Default); // 3
var d = GetData(i); //5
Process(d, m, f); // 4

I could make the code clearer by assigning named variables the three arguments to GetMethodToProcess();

const int numberOfCopies = 1;
const int leftMargin = 12;
const int topMargin = 15;

var identity = GetMyIdentity(My.Session.User); // 3
var method = GetMethodToProcess(numberOfCopies, leftMargin, topMargin); // 5
var format = GetOutputFormat(OutputFormat.Default); // 3
var dataToProcess = GetDataToProcess(identity, method, format); //5
ProcessMyData(dataToProcess, method, format); // 4

Discoverable functions describe what they do, accept only a few arguments, observe the Command-Query-Separation principle (CQS), and avoid temporal coupling.

// Temporal Coupling Example:
var myObject = this.GetBusinessObjectToBeValidated();
var validator = new Validator<MyBusinessObject>();
validator.Value = myObject;
var results = validator.GetValidationResults();

The 3rd line of the above example inhibits discoverability because nothing in the API suggests that you are required to set Value on the validator prior to calling GetValidationResults(). The above coupling could be resolved by either passing the object to be validated to the constructur, or (my preference) passing it as a parameter to the GetValidationResults() method.

// Temporal Coupling Example:
var myObject = this.GetBusinessObjectToBeValidated();
var validator = new Validator<MyBusinessObject>();
var results = validator.GetValidationResults(myObject);

It is now clear from the method signature of GetValidationResults that a business object is required in order to perform the operation.

Level of Abstraction

Good software design demands an appropriate level of abstraction. The cognitive principle is best expressed by Rand’s Razor: “[abstractions] are not to be multiplied beyond necessity—the corollary of which is: nor are they to be integrated in disregard of necessity.” Consider the following “Hello World” application:

var outputWriter = DiContainer.GetService(IOutputWriter);
var resourceService = DiContainer.GetService(IResourceServce);
var format = resourceService.GetDefaultResourceFormat();
var arguments = resourceService.GetDefaultResourceArguments();
var message = string.Format(format, arguments);
outputWriter.Write(message);

This application uses a Dependency Injection container to get an outputWriter and ResourceService. It then formats a message for display, then passes that message to the outputWriter. This application clearly requires a good bit of configuration in order to work properly. Gah! It violates Rand’s Razor in that it introduces an unnecessary level of abstraction into the application. This application would be better written as:

Console.WriteLine("Hello World!");

The above code might not be a Hello World application, and the level of abstraction expressed might be perfectly necessary in that context. Here is an example of the other side of the same problem. This interface integrates methods that are not cogent:

public interface ISuperClass
{
    void Login(User user);
    IEnumerable<FileInfo> GetFiles();
    TcpClient CreateTcpClient(string hostName, int port);
}

These methods have nothing to do with one another. Integrating them in an interface serves no purpose. The concomitant design principle of Rand’s Razor is the Single Responsibility Principle.