Tag Archives: Design
Onion Architecture: An Opinionated Approach Part 2, Anemic Data Models

Anemic Data Model is my term for a domain model indicating entities and their relationships to other entities, but having little or no business behavior. It basically serves as a contract for a data store. It’s purpose is structural, and often to provide ease of queryability.

In ShowPlanner, the domain is about planning and selling tickets to musical events called “shows.” Venues contain one or more Stages, and a Show can consist of one or more Performances on one or more Stages at a Venue. Tickets are sold to a Show.

DataModel

This data model represents the universe of entities, attributes, and relationships that the ShowPlanner application will interact with at this time. These models have no methods or functions, no behavior of any kind.

The Problem of Behavior

Why no behavior? Putting behavior in domain models raises some difficult design questions. How do the domain objects get their dependencies? Which domain object should the method or methods that control behavior go in?

Let’s take a simple case. Suppose you want to model what happens when a Ticket is purchased by a Customer. When a show is created, a number of tickets are created as well. When a Customer buys a Ticket, the ticket will be marked as sold.

Ownership

Generally speaking it’s very seldom that we write basic CRUD applications in an enterprise environment. Application behavior tends to live in the relationships and what happens when certain kinds of relationships are created and destroyed. This means that application operations often involve the state of multiple domain objects directly and indirectly involved in the relationship that is changing.

There are several possible ways to model this operation.

    public class Customer
    {
        public int? CustomerId { get; set; }

        public IList<Ticket> Tickets { get; set; }

        [Required]
        public string Name { get; set; }

        public void Purchase(Ticket ticket)
        {
            ticket.SoldTo = this;
            ticket.Sold = true; 
        }
    }

 

Or

    public class Ticket
    {
        public int? TicketId { get; set; }

        public int? ShowId { get; set; }
        [Required]
        public Show Show { get; set; }

        public int? CustomerId { get; set; }
        public Customer SoldTo { get; set; }

        public bool? Sold { get; set; }

        [Required]
        public decimal Price { get; set; }

        public void SellTo(Customer customer)
        {
            this.Sold = true;
            this.SoldTo = customer;
        }
    }

Or

    public class Show
    {
        public int? ShowId { get; set; }

        public IList<Ticket> Tickets { get; set; }

        [Required]
        public string Title { get; set; }

        public void SellTicket(Customer customer, Ticket ticket)
        {
            ticket.Sold = true;
            ticket.SoldTo = customer;
        }
    }

 

Which domain object should own the method for selling the ticket? It will depend on who you ask. When a given operation affects multiple entities in the Domain, there is no objective way to decide which entity gets the behavior. The method performing the operation has no natural home.

Dependencies

Let’s assume you secure agreement that Customer should own this operation. You’re happily coding along on something else when a few weeks later your customer tells you that when the ticket is sold a message needs to be sent to a billing system. The messaging system is represented by an interface called IMessageSender. What happens in the domain model now? How does the Customer class get a reference to the MessageSender implementation?

You could do this I guess:

        public void Purchase(Ticket ticket, IMessageSender messageSender)
        {
            ticket.SoldTo = this;
            ticket.Sold = true;

            messageSender.Enqueue(new BillForTicketMessage(this, ticket));
        }

But that’s an ugly road to walk down. You’ll start expanding dependencies when you need to add logging, authorization, validation, and a host of other concerns. Suddenly what started out as a very simple method on domain model full of complexity.

Constructor injection is not viable if I’m expecting to use an ORM to retrieve my Customer instance. Assuming you could make that work, you have to consider that with the addition of each new operation involving Customer, you’ll be adding still more dependencies which will turn Customer into more of a GOD class than will be supportable.

Wouldn’t’ design-by-composition work better?

Designing the Domain Models

I said earlier that “domain models as merely a means of persisting the state of the application.” To my way of thinking, this is their core purpose. To that end, they should effectively act as a contract to the underlying data store used by your application. They should support both queryability and data modification and be well-organized.

Queryability

To support queryability, data models should be fully connected. This just means that any relationships between entities should be referentially  expressed. Prefer Performance.Show over Performance.ShowId. Some ORM’s such as Entity Framework support using both the Id and the reference. For others, such as NHibernate, having both properties is an issue. When faced with a choice, prefer a reference over an Id.

Data Modification

To support data modification, your data models should contain all the fields used in the database. I was writing an integration test recently and I needed to create some test data in Sql Server. As I tried to insert one of my entities, I discovered that field required by the database was not actually on the model. The original developer had only populated the fields he needed for the specific operation he was writing which resulted in additional work and testing for me. It also exposed a lack of integration test coverage for any operation involving this entity as it was impossible to create new records using existing code.

Organization

Data models are the center of the onion. They should be in their own assembly or package both to make them easier to find and modify, and to prevent other developers from violating the Dependency Depth principle by referencing higher layers.

Do’s
  • Put your data models in their own project or assembly. This makes them easy to find. They are the innermost circle in your application, so this project should have next to no dependencies on anything else.
  • If you’re tools set supports it, maintain a class diagram of the models and their relationships. The diagram above was generated by Visual Studio.
  • Prefer referential relationships over using Id’s to identify related data.  Use both if your ORM supports it.
  • This should go without saying, but use an ORM.
  • In the anal-retentive list of things to consider
    • List the primary key first.
    • List single-entity references second.
    • List collection references third.
    • List the entity’s fields last.
    • List audit data (create date, create user, etc) dead last.
    • Alphabetize fields. Why? Because with larger data models it gets really hard to find the field you’re looking for in an unsorted list. This is an easy thing to habitualize and saves a good bit of headache down the road.
  • In .NET, use System.ComponentModel.DataAnnotations to attribute your models metadata when it is known.
    • Especially use StringLength because it produces a nicer error message than “String or binary data would be truncated” when using SQL Server as the database.
Don’ts
  • Don’t put behavior on the models.
  • Don’t accept dependencies in the models.
  • Don’t put ORM-specific attributes on the models unless you have no other choice.
  • Don’t put anything on the models that isn’t actually in your data store.
    • This is another way of saying “don’t put behavior on the models.” But I’ve seen developers put properties on models that they calculate and set in the course of one operation and that isn’t used in other operations. This breaks down when another operation wants to use the calculated value.
  • Don’t use data models in your UI. In web applications, this means don’t use data models in your pages and controllers. Use View Models and some sort of Request-Response api instead.

This is Part 2 of a series

Disclaimer: This series of articles is intentionally opinionated. From an architectural perspective I am much more interested in clearly defining each layer than I am in the choice of specific implementation pattern. However, an architecture doesn’t exist without the details that make it up, so I must choose some implementation pattern so that the architecture can be revealed. Since I’m the one writing the series, I’ll choose implementation patterns I like. Smile If you have a better idea, take a fork of ShowPlanner and share it!

API Design: Batch Operations

When designing a batch API, it is important that the return results be relatable to the individual arguments that were asked to be processed.

Consider an API that takes a list of objects to perform an operation on. It might have a method signature that looks like this:

        Response Process(Record[] records);

 

What is a good design for the Response object? I recently encountered a situation in which I was given a response object like this:

    public class Response
    {
        public bool Success { get; set; }
    }

 

Okay, actually it was more like this:

        bool ProcessTheRecords(int[] ids);

 

The problem with this approach is that when the operation fails there is nothing on the Response to indicate the nature of the failure. A better Response object might look like this:

    public class Response
    {
        public bool Success { get; set; }

        public OperationResult[] ErrorsAndWarnings { get; set; }

        public ProcessRecordResult[] Results { get; set; }
    }

    public class ProcessRecordResult
    {
        public string RecordIdentifier { get; set; }

        public OperationResult[] ErrorsAndWarnings { get; set; }
    }

    public class OperationResult
    {
        public string Message { get; set; }
        public Severity Severity { get; set; }
    }

    public enum Severity
    {
        Error = 1,
        Warning =2,
    }

Why is this better? Response.ErrorsAndWarnings lets me see if anything went wrong with the operation outside of any particular record. There should be a ProcessRecordResult for each record passed into the method call. Each ProcessRecordResult has its own list of ErrorsAndWarnings that indicates what may or may not be wrong with a specific record.

The purpose of the response object is to give enough information to the caller so that they can adjust their failed request into a successful one. Minimize the size and shape of the data required to perform the operation. Maximize the size and shape of the data required to understand what went wrong during the operation. These principles apply to non-batch operations as well, but they are especially frustrating when there’s an error in one of hundreds of records.

Practical advice for observing the LSP and DIP: Use the Most AbstractType Possible

This post is about concretizing the relationship between two abstract design principles and providing a piece of practical advice

The goal of the Liskov Substitution Principle is to make sure that subclasses behave as their superclass parents. The “Is-A” relationship is superceeded by the “Behaves-As-A” relationship. By observing the LSP we enable looser coupling by not making clients of the superclass catalog knowledge of the subclasses.

Implicit in the discussion around LSP is that you are actually consuming the abstract type instead of the concrete types. From the wikipedia entry on the Dependency Inversion Principle:

A. High-level modules should not depend on low-level modules. Both should depend on abstractions.

B. Abstractions should not depend upon details. Details should depend upon abstractions.

Basically, you want the clients of your code to depend on the most abstract possible type. So, as a simple heuristic when you’re writing methods, try using the most abstract type that the clients of your code can use without type casting or reflection. If you’re using ReSharper (which I highly recommend) you’ll get these suggestions automatically–but this is an easy heuristic to apply when you’re writing the method in the first place.

In general, favor interfaces over base classes and base classes over sub-classes (Interfaces are a much weaker coupling than inheritance).

Happy Coding!

Smart Enumerations, or Using Flyweight to Beat Enums Into Submission

System.Enum is a powerful type to use in the .NET Framework. It’s best used when the list of possible states are known and are defined by the system that uses them. Enumerations are not good in any situation that requires third party extensibility. While I love enumerations, they do present some problems.

First, the code that uses them often exists in the form of switch statements that get repeated about the code base. This violation of DRY is not good.

Second, if a new value is added to the enumeration, any code that relies on the enumeration must take the new value into account. While not technically a violation of the LSP, it’s close.

Thrid, Enums are limited in what they can express. They are basically a set of named integer constants treated as a separate type. Any other data or functionality you might wish them to have has to be added on through other objects. A common requirement is to present a human-readable version of the enum field to the user, which is most often accomplished through the use of the Description Attribute. A further problem is the fact that they are often serialized as integers, which means that the textual representation has to be reproduced in whatever database, reporting tool, or other system that consumes the serialized value.

Polymorphism is always an alternative to enumerations. Polymorphism brings a problem of its own—lack of discoverability. It would be nice if we could have the power of a polymorphic type coupled with the discoverability of an enumeration.

Fortunately, we can. It’s a variation on the Flyweight pattern. I don’t think it fits the technical definition of the Flyweight pattern because it has a different purpose. Flyweight is generally employed to minimize memory usage. We’re going to use some of the mechanics of the pattern employed to a different purpose.

Consider a simple domain model with two objects: Task, and TaskState. A Task has a Name, Description, and TaskState. A TaskState has a Name, Value, and boolean method that indicates if a task can move from one state to the other. The available states are Pending, InProgress, Completed, Deferred, and Cancelled.

If we implemented this polymorphically, we’d create an abstract class or interface to represent the TaskState. Then we’d provide the various property and method implementations for each subclass. We would have to search the type system to discover which subclasses are available to represent TaskStates.

Instead, let’s create a sealed TaskState class with a defined list of static TaskState instances and a private constructor. We seal the class because our system controls the available TaskStates. We privatize the constructor because we want clients of this class to be forced to use the pre-defined static instances. Finally, we initialize the static instances in a static constructor on the class. Here’s what the code looks like:

public sealed class TaskState
{
    public static TaskState Pending { get; private set; }
    public static TaskState InProgress { get; private set; }
    public static TaskState Completed { get; private set;}
    public static TaskState Deferred { get; private set; }
    public static TaskState Canceled { get; private set; }

    public string Name { get; private set; }
    public string Value { get; private set; }

    private readonly List<TaskState> _transitions = new List<TaskState>();

    private TaskState AddTransition(TaskState value)
    {
        this._transitions.Add(value);
        return this;
    }

    public bool CanTransitionTo(TaskState value)
    {
        return this._transitions.Contains(value);
    }

    private TaskState()
    {
        
    }

    static TaskState()
    {
        BuildStates();
        ConfigureTransitions();
    }

    private static void ConfigureTransitions()
    {
        Pending.AddTransition(InProgress).AddTransition(Canceled);
        InProgress.AddTransition(Completed).AddTransition(Deferred).AddTransition(Canceled);
        Deferred.AddTransition(InProgress).AddTransition(Canceled);
    }

    private static void BuildStates()
    {
        Pending = new TaskState()
                      {
                          Name = "Pending",
                          Value = "Pending",
                      };
        InProgress = new TaskState()
                         {
                             Name = "In Progress",
                             Value = "InProgress",
                         };
        Completed = new TaskState()
                        {
                            Name = "Completed",
                            Value = "Completed",
                        };
        Deferred = new TaskState()
                       {
                           Name = "Deferred",
                           Value = "Deferred",
                       };
        Canceled = new TaskState()
                       {
                           Name = "Canceled",
                           Value = "Canceled",
                       };
    }
}

 

This pattern allows us to consume the class as if it were an enumeration:

var task = new Task()
               {
                   State = TaskState.Pending,
               };

if (task.State.CanTransitionTo(TaskState.Completed))
{
    // do something
}

We still have the flexibility of polymorphic instances. I’ve used this pattern several times with great effect in my software. I hope it benefits you as much as it has me.

Sneaking Up on Reactive Extensions

I saw Matthew Podwysocki speak on Reactive Extensions at the most recent DC Alt .NET meeting. I’ve heard some buzz about Reactive Extensions (Rx) as Linq over Events. That sounded cool, so I put the sticky note in the back of my mind to look into it later. Matthew’s presentation blew my mind a bit. Rx provides so much functionality and is so different from traditional event programming that I thought it would be helpful for me to retrace a few of the first necessary steps that would go into creating something as powerful as Rx. To that end, I starting writing a DisposableEventObserver class.

This class has two goals at this point:

  1. Replace the traditional EventHandler += new EventHandler() syntax with an IDisposable syntax.
  2. Add conditions to the Observer that determine if it will handle the events.

This is learning code. What I mean by this is that it is doubtful that the code I’m writing here will ever be used an a production application since Rx will be far more capable than what I write here. The purpose of this code is to help me (and maybe you) to gain insight into how Rx works. There are two notable Rx features that I will not be handling in v1 of DisposableEventObserver:

  1. Wrangling Asynchronous Events.
  2. Composability.

The first test I wrote looked something like this:

  1. [TestFixture]
  2. public class DisposableEventObserverTests
  3. {
  4.     public event EventHandler<EventArgs> LocalEvent;
  5.  
  6.     [Test]
  7.     public void SingleSubscriber()
  8.     {
  9.         // Arrange: Setup the test context
  10.         var count = 0;
  11.  
  12.         // Act: Perform the action under test
  13.         using (var observer = new DisposableEventObserver<EventArgs>(this.LocalEvent,
  14.             (sender, e) => { count += 1; })
  15.         {
  16.             this.LocalEvent.Invoke(null, null);
  17.             this.LocalEvent.Invoke(null, null);
  18.             this.LocalEvent.Invoke(null, null);
  19.         }
  20.  
  21.         // Assert: Verify the results of the action.
  22.         Assert.That(count, Is.EqualTo(3));
  23.     }

The test fixture served as my eventing object. I’m passing the event handler as a lambda. There were two interesting things about this approach. The first is that type required to pass this.LocalEvent is the same delegate type as the that required to pass the handler. The second is that this code did not work.

I was a little confused as to why the test didn’t pass. The lines inside the using block blew up with a NullReferenceException when I tried to reference this.LocalEvent. This is odd because inside the Observer I was definitely adding the handler to the event delegate. What’s going on here? It turns out that although Events look for all intents and purposes like a standard delegate field of the same type, the .NET framework treats them differently. Events can only be invoked by the class that owns them. The event fields themselves cannot reliably be passed as parameters.

I backed up a bit and tried this syntax:

  1.  
  2. [Test]
  3. public void SingleSubscriber()
  4. {
  5.     // Arrange: Setup the test context
  6.     var count = 0;
  7.  
  8.     EventHandler<EventArgs> evt = delegate {count += 1; };
  9.  
  10.     // Act: Perform the action under test
  11.     using (var observer = new DisposableEventObserver<EventArgs>(this, "LocalEvent", evt))
  12.     {
  13.         this.LocalEvent.Invoke(null, null);
  14.         this.LocalEvent.Invoke(null, null);
  15.         this.LocalEvent.Invoke(null, null);
  16.     }
  17.  
  18.     // Assert: Verify the results of the action.
  19.     Assert.That(count, Is.EqualTo(3));
  20. }

This test implies an implementation that uses reflection to find the event and add the handler. This worked the first time at bat, however I don’t like that magic string “LocalEvent” sitting there. I thought back to Josh Smith’s PropertyObserver and wondered if I could do something similar. Here’s a test that takes an expression that resolves to the event:

  1. [Test]
  2. public void Subscribe_Using_Lambda()
  3. {
  4.     // Arrange: Setup the test context
  5.     var count = 0;
  6.  
  7.     EventHandler<EventArgs> evt = delegate { count += 1; };
  8.  
  9.     // Act: Perform the action under test
  10.     using (var observer = new DisposableEventObserver<EventArgs>(this, () => this.LocalEvent, evt))
  11.     {
  12.         this.LocalEvent.Invoke(null, null);
  13.         this.LocalEvent.Invoke(null, null);
  14.         this.LocalEvent.Invoke(null, null);
  15.     }
  16.  
  17.     // Assert: Verify the results of the action.
  18.     Assert.That(count, Is.EqualTo(3));
  19.  
  20. }

This looks much better to me. Now I’ll get a compile-error if the event name or signature changes.

The next step is to add some conditionality to the event handling. While this class will not be Queryable like Rx, I’m going to use a similar Where() syntax to add conditions. I added the following test:

  1. [Test]
  2. public void Where_ConditionNotMet_EventShouldNotFire()
  3. {
  4.     // Arrange: Setup the test context
  5.     var count = 0;
  6.  
  7.     EventHandler<EventArgs> evt = delegate { count += 1; };
  8.  
  9.     // Act: Perform the action under test
  10.     using (var observer = new DisposableEventObserver<EventArgs>(this,
  11.         () => this.LocalEvent,
  12.         evt).Where((sender, e) => e != null)
  13.         )
  14.     {
  15.         this.LocalEvent.Invoke(null, null);
  16.         this.LocalEvent.Invoke(null, null);
  17.         this.LocalEvent.Invoke(null, null);
  18.     }
  19.  
  20.     // Assert: Verify the results of the action.
  21.     Assert.That(count, Is.EqualTo(0));
  22. }

The Where condition specifies that the event args cannot be null. In this case count should never be incremented. I had to make several changes to the internals of the Observer class to make this work. Instead of registering “evt” with the event handler I had to create an interceptor method inside the Observer to test the criteria. If the criteria are met then “evt” will be called. I implemented Where as an “Add” function over a collection of Func<object, TEventArgs, bool>.

The full implementation can be found here, and the tests and be found here.

Martin Fowler on Why Agile Works

The intro is in French, but the talk is in English. Enjoy!

<!–

Pourquoi, pas comment
Pourquoi, pas comment

USI 2010 : conférence incontournable du l’IT en France
Rendez-vous annuel des Geeks et des Boss souhaitant une informatique qui transforme nos sociétés, USI est une conférence de 2 jours sur les sujets IT : Architecture de SI, Cloud Computing, iPhone, Agile, Lean management, Java, .net… USI 2010 a rassemblé 500 personnes autour d’un programme en 4 thèmes : Innovant, Durable, Ouvert et Valeur.
Plus d’informations sur www.universite-du-si.com

ObservableCollectionHandler & ChangeTracker

When working with the ObservableCollection or INotifyCollectionChanged interface, it is common to see code like the following:

void OnCollectionChanged(object sender, NotifyCollectionChangedEventArgs e)
{
    switch (e.Action)
    {
        case NotifyCollectionChangedAction.Add:
            HandleAddedItems(e);
            break;

        case NotifyCollectionChangedAction.Move:
            break;

        case NotifyCollectionChangedAction.Remove:
            HandleRemovedItems(e);
            break;

        case NotifyCollectionChangedAction.Replace:
            HandleReplacedItems(e);
            break;

        case NotifyCollectionChangedAction.Reset:
            HandleClearItems(e);
            break;
    }
}

There’s nothing particularly wrong with this code except that it’s kind of bulky, and that it starts to crop up all over your application. There’s another problem that’s not immediately obvious. The “Reset” action only gets fired when the ObservableCollection is cleared, but it’s eventargs does not contain the items that were removed from the collection. If your logic calls for processing removed items when they’re cleared, the built-in API offers you nothing. You have to do your own shadow copy of the collection so that you can respond to the Clear() correctly.

For that reason I wrote and added ObservableCollectionHandler to manage these events for you. It accepts three kinds of delegates for responding to changes in the source collection: ItemAdded, ItemRemoved, and ItemReplaced actions. (It would be easy to add ItemMoved as well, but I have seldom had a need for that so I coded the critical path first.) The handler maintains a shadow copy of the list so that the ItemRemoved delegates are called in response to the Clear() command.

[Test]
public void OnItemAdded_ShouldPerformAction()
{
    // Arrange: Setup the test context
    int i = 0;
    var collection = new ObservableCollection<Employee>();
    var handler = new ObservableCollectionHandler<Employee>(collection)
        .OnItemAdded(e => i++);

    // Act: Perform the action under test
    collection.Add(new Employee());

    // Assert: Verify the results of the action.
    Require.That(i).IsEqualTo(1);
}

Another common need with respect to ObservableCollections is the need to track which items were added, modified, and removed from the source collection. To facilitate this need I wrote the ChangeTracker class. ChangeTracker makes use of ObservableCollectionHandler to setup activities in response to changes in the source collection. ChangeTracker maintains a list of additions and removals from the source collection. It can also maintain a list of modified items assuming the items in the collection implement INotifyPropertyChanged.

Here is a sample unit test indicating it’s usage:

[Test]
public void GetChanges_AfterAdd_ShouldReturnAddedItems()
{
    // Arrange: Setup the test context
    var source = new ObservableCollection<Employee>();
    var tracker = new ChangeTracker<Employee>(source);

    // Act: Perform the action under test
    var added = new Employee();
    source.Add(added);

    // Assert: Verify the results of the action.
    var changes = tracker.GetChanges(ChangeType.All);
    Require.That(changes.Any()).IsTrue();
    Require.That(tracker.HasChanges).IsTrue();

    var change = changes.First();
    Require.That(change).IsNotNull();
    Require.That(change.Type).IsEqualTo(ChangeType.Add);
    Require.That(change.Value).IsTheSameAs(added);
}

The full source code and unit tests can be found here.

Yodelay Updated

Yodelay has been updated. I’ve modified the Regular Expression tester so that it supports n-number of test contexts with a single expression. In addition, you can search regexlib.com for a useful expression.

This screen is fully implemented using MVVM. The search window serves as an example of using MVVM with dialogs.

I’ve also pulled in the themes from WpfThemes and integrated the ThemeManager.

Check it out!

Neoshooter 1

Design Considerations of Public Functions

Writing a function is an act of design. You must choose the placement (both physical and logical), signature, and name of the function carefully. Most people understand vaguely that functions should have minimal scope, but I’d like to draw attention to a couple of design implications of public functions that I haven’t heard discussed much. These features are "permission" and “precedent.” Simply by existing, a function indicates that it is okay to call it, and establishes a precedent for the choices that led you to create it with the characteristics you chose.

Permission

When would you not want to publicly expose a function? Or what are the reasons you might wish to refuse calling permission to a function?

Here is a typical example: Suppose you have a WidgetListEditor class that manages a list of widgets bound to a list editing control on a screen. Suppose that part of the class’s task is to sort the list of widgets prior to binding to the grid. The sorting of the list is ancillary to the primary task of managing the binding to the  grid. Clients of the code should not instantiate an instance of WidgetListEditor to sort the contents of some arbitrary list of widgets elsewhere in the system. For this reason, the sorting methods of WidgetListEditor are internal concerns, and should not be exposed publicly. If the sorting methods are needed externally to WidgetListEditor, they should be extracted into first-class components of their own and provided to WidgetListEditor via some form of Dependency Injection.

Why would it be wrong to simply allow the clients of the code to instantiate WidgetListEditor at will and use the sorting method? It exposes the rest of the code to changes in WidgetListEditor. WidgetListEditor could be modified such that it loads an expensive resource when constructed. The maintainers of WidgetListEditor have no reason to believe that adding the expensive resource should cause any problems because as far as they know, this class is only instantiated once on a couple of screens. Suddenly the system is brought to a crawl because it is in fact being used all over the place as a means of “reusing” the sorting code. This is a problem of responsibility.

Or:

WidgetListEditor needs to be changed to allow arbitrary sorting by the user. This requires that the sorting methods be changed to accept a sorting criteria–but that change is made more difficult because the methods are being consumed application-wide instead of on the one or two screens the class was built for.

Or:

WidgetListEditor no longer needs to presort the contents of the list before binding, so the developer wishes to remove the sorting code–only s/he can’t because the methods are being consumed application-wide instead of on the one or two screens the class was built for.

There’s a recurring theme here: What was the class built-for? What is its exposure to change? A public method is a kind of contract with the rest of the code. Changes to public methods must be considered with greater care because of the potential effects with the rest of the code-base. When we make a method private, we reduce the potential of damage to the system if the method needs to change, or be removed altogether. This is the design advantage of encapsulation. For this reason, every function in the system (and every class in a library) needs to have the least significant scope it is possible to give it.

There is another kind of permissions problem related to public methods that is harder to pinpoint. Sometimes a method is created that does a narrow-purpose task that is needed within the system, but has the potential to corrupt the system. Posit a financial system in which every transaction is recorded as a series of debit-credit pairs. For the most part the system does not allow direct editing of entries. Instead it functions through a series of transaction templates. You submit a transaction, and the template tells it which accounts to debit, to credit, and how much, etc.. It is discovered that a rounding error creates a small reporting variance and your manager assigns you the task of creating a manual correction screen.

Since this is a transactional system, the right way to solve this problem is to create an AdjustmentTransactionTemplate and enter it into the system, but that would take a week, and you only have a day. So, you take a short-cut. You create a new method in the data access layer that lets you simply update the entry directly. You reason that since this is the only code that calls this method, it’s safe because you will be sure to update both the debit and credit together, and with the same amount. You do this, turn in the work, and everybody is happy.

What’s the problem here? There is now a method in the data access layer that allows you to manually update a specific entry without reference to a transaction template. It’s existence implies that it’s okay to use it. The methods that accept the transaction template can validate the template to verify that the transaction will balance on debits and credits. This manual entry edit method can do no such validation. This means that it is now possible to get bad data into the system which could have catastrophic consequences. At this point, it is a matter of "when, " not "if" that will happen. If you or your manager considers "it’s faster” to be a general justification for violating the design of a system, then your project will become less maintainable as time goes by. You may need to update your design to make it faster to develop against—but that’s a different problem than I’m describing here.

Precedent

Consider either of the above scenarios. What danger does the existence of either of the public functions pose to the rest of the code base? Isn’t the danger isolated to the particular components involved? What danger does the precedent pose to a code base?

In the legal system, the decision of a higher court on a relevant point is binding as legal-precedent for a lower court. Conversely, the decision of a lower-court on some legal matter is used as guidance to a higher court faced with the same decision. The reason for this is not that lawyers are crazy, but rather that the enormity of concrete circumstances possible in a legal case are too numerous for any one person or team to keep track of. When a new set of circumstances is considered by a court, it must make the best decision it can within the context of existing law and existing legal precedent. In most cases this keeps the range of legal issues organized in an intelligible manner. However, sometimes the courts make a decision that violates our sense of justice or propriety. In these cases, we can “refactor” by encouraging the legislature to explicitly change the law.

The key point is this: existing code is the best guideline for how to add new features to a code-base. You cannot expect future developers, to look at your code-base and infer the principles you built the application on if you are violating those principles yourself. Further, if you are willing to violate the design principles in one case, you will likely find it much easier to violate them again and again. This means that even though you “think” your application was developed according to a particular design, “with a couple of exceptions,” it will not be apparent to anyone else. Objectively the, the code was not designed according to any principles at all, but to your whims. To put the point another way, intent is communicated not by what’s in our heads, but what’s in our code.

If you use WidgetListEditor for sorting some place else in your code, other developers will think that that’s what you intended. If you bypass the transaction template logic in the financial system, other developers will think it’s safe to do that as well. The question of whether code is designed well is more than just a question of whether it works.  It’s a question of whether the intent of the design is consistently expressed by the code that implements it.

Conventions vs. Design

Conventions often function as a substitute for failure to grasp design principles. For example, I’ve seen teams get up in arms about using Hungarian Notation but then write meaningless function names like “DoIt()” or “Process()”, or meaningless variable names like “intVar” or “intX”. These same teams often have little or no understanding of design principles , patterns, or concepts. I have met so many developers that have never heard of YAGNI, LRM, SRP, LSP, Dependency Injection. How many developers do you know that have heard of patterns such as Singleton, Adapter, Decorator, Factory, Monostate, Proxy, Composite, or Command?

The use of conventions is often invoked as a means of aiding code clarity. And it does, but it is not the only or even the best means of attaining that goal. We need to learn to name classes, functions, and variables in meaningful ways; to write small functions that do one tiny little thing, and do it well; to write small single-purpose classes. All of the conventions in the world will not help you maintain a code-base if developers are free to write 3000-line methods with 27 layers of nested-conditionals and loops (and yes, I have seen this). Read “Clean Code” by Uncle Bob.

Design patterns, techniques, and processes are much harder to learn and master than a concrete list of do’s and don’ts (which is what most conventions amount to). Good software design is like playing a musical instrument–it requires dedication, repetition, and practice to learn and master.

Conventions do have a place. However, conventions are usually suggested by the provider of the tools you are using. For example, Microsoft has a conventions document for the .NET Framework which is applicable to all .NET Languages. (Thankfully, they eschew Hungarian Notation). It is true that this is worth learning simply in order to be a good developer citizen. My recommendation would be to simply use the conventions established by the tool-provider, or the dominant conventions within the community around the tool. Don’t waste company time and money adding your own tweaks and changes to the conventions document. Speaking personally, I would much rather maintain a code-base that is written cleanly and with good design and that violates every convention you could name.

Conventions are not a bad thing. The problem with conventions is that they are so often discussed in place of design principles. Conventions without design principles aid nothing. Without proper focus on good design, conventions can even hurt software quality because they give developers and managers the illusion of being thoughtful and disciplined about what they are doing.

Conventions can be useful in another way. There is a good bit of discussion going on right now about Convention over Configuration. CoC is a design short-cut, and consists of making decisions about how components of a software system interact by default, and only adding additional documentation when components deviate from the norm. CoC actually bolsters my point here, because it only tends to arise as a discussion point in systems that are already using good design patterns and practices.

Posted on April 3, 2010, 12:15 pm By
2 comments Categories: Design Tags:

Next Page