Diff – Part 2

In Part 1 of this series, I created a basic class that would find differences between two versions of the same list. I’ve made some improvements to the original implementation that I’d like to share.

To begin with, I read this post on the difference between the Any() and Count() extension methods. List<T> implements ICollection<T>, so Count is already cheap today, but I wanted to change the structure of DiffResult<T> so that the IsEmpty property stays efficient if I decide to change the datatype of Additions, Deletions, and Modifications. To that end, I changed IsEmpty as follows:

public bool IsEmpty
{
    get { return !this.HasChanges; }
}

public bool HasChanges
{
    get { return this.Additions.Any()
                 || this.Deletions.Any()
                 || this.Modifications.Any(); }
}
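To illustrate why Any() can be cheaper than Count() once the collections are no longer lists, here is a minimal sketch. CountingSequence is a hypothetical helper of my own for illustration, not part of the Diff solution; it is a lazy sequence that records how many elements it has produced.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the Diff solution: a lazy sequence
// that counts how many elements it has handed out.
public static class CountingSequence
{
    public static int Produced;

    public static IEnumerable<int> Numbers(int howMany)
    {
        for (var i = 0; i < howMany; i++)
        {
            Produced++;
            yield return i;
        }
    }

    // Returns how many elements Any() had to enumerate.
    public static int ElementsReadByAny(int howMany)
    {
        Produced = 0;
        Numbers(howMany).Any();
        return Produced;
    }

    // Returns how many elements Count() had to enumerate.
    public static int ElementsReadByCount(int howMany)
    {
        Produced = 0;
        Numbers(howMany).Count();
        return Produced;
    }
}
```

For a sequence of 1000 items, ElementsReadByAny(1000) reads a single element, while ElementsReadByCount(1000) has to walk all 1000, because an iterator does not implement ICollection<T>.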

The next thing I wanted to be able to do is automatically synchronize changes from the new list back to the old list. Here is the unit test:

[Test]
public void ResolveChangesBetweenTwoLists()
{
    // Arrange: Declare any variables or set up any conditions
    //          required by your test.
    var newVersion = new List<Entity>();
    var oldVersion = new List<Entity>();

    var identityComparer = new EntityIdentityComparer();
    var modificationComparer = new EntityEqualityComparer();

    var diff = new Diff<Entity>(identityComparer, modificationComparer);

    newVersion.Add(new Entity() { Id = 1, Name = "One" }); // addition
    newVersion.Add(new Entity() { Id = 2, Name = "2" }); // modification

    oldVersion.Add(new Entity() { Id = 2, Name = "Two" }); // modification
    oldVersion.Add(new Entity() { Id = 3, Name = "Three" }); // deletion

    // Act:     Perform the activity under test.
    diff.Resolve(newVersion, oldVersion);
    var result = diff.Execute(newVersion, oldVersion);

    // Assert:  Verify that the activity under test had the
    //          expected results
    Assert.IsTrue(result.IsEmpty); // all changes should have been resolved.
}

My first stab at the implementation is as follows:

public class Diff<T>
{
    public void Resolve(IList<T> newVersion, IList<T> oldVersion)
    {
        var results = this.Execute(newVersion, oldVersion);

        foreach (var element in results.Deletions)
            oldVersion.Remove(element);

        foreach (var element in results.Additions)
            oldVersion.Add(element);

        foreach (var element in results.Modifications)
        {
            var keyItem = element;
            var item = oldVersion.First(e => _identityComparer.Equals(e, keyItem));
            var index = oldVersion.IndexOf(item);
            oldVersion[index] = element;
        }
    }
    // …snipped
}

This implementation is fine as far as it goes, but there is a drawback that I don’t like. If the user of this code wishes to determine whether there are changes before calling Resolve, s/he ends up executing the diff twice: once to inspect the changes and again inside Resolve. I like that placing Resolve() on the Diff class provides a single-step execution, so I’m going to move the real work to the DiffResult class, but leave the Resolve method where it is. I changed the implementation of Diff.Resolve() to this:

public void Resolve(IList<T> newVersion, IList<T> oldVersion)
{
    var results = this.Execute(newVersion, oldVersion);
    results.Resolve(oldVersion, this._identityComparer);
}

I added DiffResult.Resolve() as follows:

public void Resolve(IList<T> oldVersion, IEqualityComparer<T> identityComparer)
{
    this.Deletions.ForEach(e => oldVersion.Remove(e));
    this.Additions.ForEach(oldVersion.Add);
    this.Modifications.ForEach( e =>
                                    {
                                        var item =
                                            oldVersion.First(element => identityComparer.Equals(element, e));
                                        var index = oldVersion.IndexOf(item);
                                        oldVersion[index] = e;
                                    }
        );

}

The updated source code for this solution can be found here.

Diff – Part 1

Have you ever needed to know if the contents of a list have changed from a given snapshot? I recently decided that I would write a quick-and-dirty Diff algorithm. I started with the following test:

[Test]
public void NullSet()
{
    // Arrange: Declare any variables or set up any conditions
    //          required by your test.
    var source = new List<Entity>();
    var target = new List<Entity>();

    // Act:     Perform the activity under test.
    var diff = new Diff<Entity>();
    
    // Assert:  Verify that the activity under test had the
    //          expected results
    var changeSet = diff.Execute(source, target);
    Assert.IsTrue(changeSet.IsEmpty);
}

I defined Entity as a class with an Id and a Name. This version of the diff algorithm relied on reference comparisons to find changes between the two lists. To extend its functionality a bit, I decided to create an EqualityComparer to replace the reference comparison.
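For reference, here is a minimal sketch of what Entity might look like, along with a small demonstration of why reference comparison falls short. The property types (int and string) are assumptions inferred from the tests, and ReferenceComparisonDemo is a hypothetical name used only for this illustration.

```csharp
using System.Collections.Generic;

// Assumed shape of Entity: the post only says it has an Id and a Name.
public class Entity
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical demo class, not part of the Diff solution.
public static class ReferenceComparisonDemo
{
    // Two separately created entities with identical values are still
    // different references.
    public static bool SameReference()
    {
        var a = new Entity { Id = 1, Name = "Test" };
        var b = new Entity { Id = 1, Name = "Test" };
        return object.ReferenceEquals(a, b);
    }

    // List<T>.Contains uses default equality, which for a plain class
    // is reference equality, so the look-alike entity is not found.
    public static bool DefaultContains()
    {
        var list = new List<Entity> { new Entity { Id = 1, Name = "Test" } };
        return list.Contains(new Entity { Id = 1, Name = "Test" });
    }
}
```

Both methods return false, which is the situation a snapshot comparison usually runs into: the two lists hold different instances that represent the same thing.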

Here is the EqualityComparer:

public class EntityIdentityComparer : IEqualityComparer<Entity>
{
    public bool Equals(Entity x, Entity y)
    {
        return x.Id == y.Id;
    }

    public int GetHashCode(Entity obj)
    {
        return obj.Id.GetHashCode();
    }
}

And here are the next tests:

[Test]
public void OneAddition()
{
    // Arrange: Declare any variables or set up any conditions
    //          required by your test.
    var source = new List<Entity>();
    var target = new List<Entity>();
    var identityComparer = new EntityIdentityComparer();

    var diff = new Diff<Entity>(identityComparer);

    source.Add(new Entity());

    // Act:     Perform the activity under test.
    var changeSet = diff.Execute(source, target);

    // Assert:  Verify that the activity under test had the
    //          expected results
    Assert.AreEqual(1, changeSet.Additions.Count);
    Assert.IsFalse(changeSet.IsEmpty);
}

[Test]
public void OneDeletion()
{
    // Arrange: Declare any variables or set up any conditions
    //          required by your test.
    var source = new List<Entity>();
    var target = new List<Entity>();
    var identityComparer = new EntityIdentityComparer();

    var diff = new Diff<Entity>(identityComparer);

    target.Add(new Entity());

    // Act:     Perform the activity under test.
    var changeSet = diff.Execute(source, target);

    // Assert:  Verify that the activity under test had the
    //          expected results
    Assert.AreEqual(1, changeSet.Deletions.Count);
    Assert.IsFalse(changeSet.IsEmpty);
}

[Test]
public void OneAdditionAndOneDeletion()
{
    // Arrange: Declare any variables or set up any conditions
    //          required by your test.
    var source = new List<Entity>();
    var target = new List<Entity>();
    var identityComparer = new EntityIdentityComparer();

    var diff = new Diff<Entity>(identityComparer);

    source.Add(new Entity() { Id = 1 });
    target.Add(new Entity() { Id = 2 });

    // Act:     Perform the activity under test.
    var changeSet = diff.Execute(source, target);

    // Assert:  Verify that the activity under test had the
    //          expected results
    Assert.AreEqual(1, changeSet.Additions.Count);
    Assert.AreEqual(1, changeSet.Deletions.Count);
    Assert.IsFalse(changeSet.IsEmpty);
}

Here is the test I wrote to detect modifications:

[Test]
public void OneModification()
{
    // Arrange: Declare any variables or set up any conditions
    //          required by your test.
    var source = new List<Entity> { new Entity() { Id = 1, Name = "Test" } };
    var target = new List<Entity> { new Entity() { Id = 1, Name = "Test 1" } };

    var identityComparer = new EntityIdentityComparer();

    var diff = new Diff<Entity>(identityComparer);

    // Act:     Perform the activity under test.
    var changeSet = diff.Execute(source, target);

    // Assert:  Verify that the activity under test had the
    //          expected results
    Assert.AreEqual(0, changeSet.Additions.Count);
    Assert.AreEqual(0, changeSet.Deletions.Count);
    Assert.AreEqual(1, changeSet.Modifications.Count);
    Assert.IsFalse(changeSet.IsEmpty);
}

Notice that my source and target lists each contain an entity that, according to the IdentityComparer, is the same entity. I need a new concept to tell the two versions apart: I need to know whether two entities with the same identity have the same values. In short, I need another EqualityComparer. This time, instead of testing for equality on the Id property, it should test for equality on all other properties. Here is the EqualityComparer:

public class EntityEqualityComparer : IEqualityComparer<Entity>
{
    public bool Equals(Entity x, Entity y)
    {
        return (x.Name == y.Name);
    }

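    public int GetHashCode(Entity obj)
    {
        // Hash on the same property Equals compares, so that equal entities share a hash code.
        return obj.Name == null ? 0 : obj.Name.GetHashCode();
    }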
}

And here is the refactored modification test:

[Test]
public void OneModification()
{
    // Arrange: Declare any variables or set up any conditions
    //          required by your test.
    var source = new List<Entity> {new Entity() {Id = 1, Name = "Test"}};
    var target = new List<Entity> {new Entity() {Id = 1, Name = "Test 1"}};

    var identityComparer = new EntityIdentityComparer();
    var modificationComparer = new EntityEqualityComparer();

    var diff = new Diff<Entity>(identityComparer
        , modificationComparer);

    // Act:     Perform the activity under test.
    var changeSet = diff.Execute(source, target);

    // Assert:  Verify that the activity under test had the
    //          expected results
    Assert.AreEqual(0, changeSet.Additions.Count);
    Assert.AreEqual(0, changeSet.Deletions.Count);
    Assert.AreEqual(1, changeSet.Modifications.Count);
    Assert.IsFalse(changeSet.IsEmpty);
}

Here is the definition of DiffResult:

public class DiffResult<T>
{
    public bool IsEmpty
    {
        get { return this.Additions.Count == 0 
                     && this.Deletions.Count == 0
                     && this.Modifications.Count == 0; }
    }

    private readonly List<T> _additions = new List<T>();
    public List<T> Additions
    {
        get { return this._additions; }
    }

    private readonly List<T> _deletions = new List<T>();
    public List<T> Deletions
    {
        get { return this._deletions; }
    }

    private readonly List<T> _modifications = new List<T>();
    public List<T> Modifications
    {
        get { return this._modifications; }
    }
}

And the implementation of Diff:

 

/// <summary>
/// Used to find changes between two snapshots of a list.
/// </summary>
/// <typeparam name="T"></typeparam>
public class Diff<T>
{
    /// <summary>
    /// Determines if two objects are supposed to be the same object.
    /// </summary>
    private readonly IEqualityComparer<T> _identityComparer;

    /// <summary>
    /// Determines if two objects have equivalent values.
    /// </summary>
    private readonly IEqualityComparer<T> _modificationComparer;

    /// <summary>
    /// Constructor
    /// </summary>
    /// <param name="identityComparer"></param>
    /// <param name="modificationComparer"></param>
    public Diff(IEqualityComparer<T> identityComparer
        , IEqualityComparer<T> modificationComparer)
    {
        Require.That
            .IsNotNull(identityComparer, "Identity comparer is required.")
            .IsNotNull(modificationComparer, "Modification comparer is required.");

        this._identityComparer = identityComparer;
        this._modificationComparer = modificationComparer;
    }

    /// <summary>
    /// Returns a changeset of the differences between the new and original versions of the list.
    /// </summary>
    /// <param name="newList">The new version of the list.</param>
    /// <param name="originalList">The original version of the list.</param>
    /// <returns></returns>
    public DiffResult<T> Execute(IEnumerable<T> newList
        , IEnumerable<T> originalList) 
    {
        var result = new DiffResult<T>();

        var addedItems = newList.Except(originalList, this._identityComparer)
            .ToList();

        var deletedItems = originalList.Except(newList, this._identityComparer)
            .ToList();

        var modifiedItems = from newElement in newList
                            from originalElement in originalList
                            where this._identityComparer.Equals(newElement, originalElement) 
                            && !this._modificationComparer.Equals(newElement, originalElement)
                            select newElement;

        result.Modifications.AddRange(modifiedItems);
        result.Additions.AddRange(addedItems);
        result.Deletions.AddRange(deletedItems);

        return result;
    }
}
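The nested query for modifications compares every new element with every original element. As a possible alternative (a sketch, not what the solution files contain), the original list can be indexed by identity once and each new element looked up in that index. ModificationFinder and FindModifications are hypothetical names, and the sketch assumes identities are unique within the original list; ToDictionary throws if they are not.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical alternative to the nested "from" query in Diff.Execute.
public static class ModificationFinder
{
    public static List<T> FindModifications<T>(
        IEnumerable<T> newList,
        IEnumerable<T> originalList,
        IEqualityComparer<T> identityComparer,
        IEqualityComparer<T> modificationComparer)
    {
        // The dictionary uses identityComparer for hashing and equality,
        // so each element can serve as its own key.
        // Assumption: no two original elements share an identity.
        var originalsByIdentity = originalList.ToDictionary(e => e, identityComparer);

        var modified = new List<T>();
        foreach (var newElement in newList)
        {
            T originalElement;
            if (originalsByIdentity.TryGetValue(newElement, out originalElement)
                && !modificationComparer.Equals(newElement, originalElement))
            {
                modified.Add(newElement);
            }
        }

        return modified;
    }
}
```

A quick way to try it without the Entity classes is to use strings, with StringComparer.OrdinalIgnoreCase standing in for the identity comparer and StringComparer.Ordinal for the modification comparer. Comparing { "A", "b", "c" } against { "a", "b", "d" } reports only "A" as modified.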

You can download the solution files here.

My First Speaking Engagement

I have accepted an invitation to speak at GSP Developers in March. My topic will be Test Driven Design. I will focus on how practicing TDD changes the overall design of your software.

I think my strategy should be:

Give a brief overview of TDD.

Demonstrate writing a simple function TDD style.

Discuss what is meant by “testable” code, meaning how testable code relates to the SOLID design principles.

Give a brief overview of the available TDD tools for .NET.

Any advice for a first-time speaker?

The Silver Bullet

No, I’m not talking about Coors Light. I’m talking about that phrase you hear bandied about in software development circles, “There is no silver bullet.”

Wikipedia describes the meaning of the phrase thusly:

The term has been adopted into a general metaphor, where "silver bullet" refers to any straightforward solution perceived to have extreme effectiveness. The phrase typically appears with an expectation that some new technology or practice will easily cure a major prevailing problem.

Historically it comes from a 1986 paper by Fred Brooks. Again from Wikipedia:

Brooks argues that "there is no single development, in either technology or management technique, which by itself promises even one order of magnitude [tenfold] improvement within a decade in productivity, in reliability, in simplicity." He also states that "we cannot expect ever to see two-fold gains every two years" in software development, like there is in hardware development.

Accidental complexity relates to problems that we create on our own and can be fixed — for example, the details of writing and optimizing assembly code or the delays caused by batch processing. Essential complexity is caused by the problem to be solved, and nothing can remove it — if users want a program to do 30 different things, then those 30 things are essential and the program must do those 30 different things.

Brooks’s distinction between accidental and essential complexity is not what most people have in mind when they invoke “no silver bullet.” Typically they are directing the statement toward some developer or manager who is in the throes of excitement over a new technology or practice. The statement seems to mean, “there’s no reason to get excited. Your new tool will have its own problems. It’s not really a step forward. You’re just naive.” It is this usage that I have a problem with.

It bothers me because I am often the one who is excited by some new technology or process. And every time it happens, someone tosses “there is no silver bullet” in my direction. Fred Brooks is right: we’re not going to see the kinds of gains in software productivity that we see in hardware productivity. However, to all those who use “no silver bullet” to kill excitement, I must disagree.

There is a silver bullet to solving software problems. Are you ready? Do you want to know what it is? What company distributes it? How much it costs?

It’s your brain. You distribute it, and the cost is the effort of constantly learning new and better ways to accomplish your tasks. It doesn’t matter what language you use, what platform you write for, how you set up your environment. All of that is just the car—it still needs a driver. And no matter how great the car, if the driver is only holding the gas at 25mph, you’re only going to go 25mph.

You didn’t learn a single percent of what you need to know in college. You have to think critically about everything you’ve ever been taught. You have to look at the software you find and ask archeological questions such as “Why did the original developer structure the code this way? What problems was s/he trying to solve? Was this a good solution? Can I think of alternatives? Is this the best solution?” You have to ask these questions about software written by everyone, including yourself, even if you just finished typing it in.

If you adopt as a principle the idea that you should see every line of code you write as an opportunity for improvement, you will start making small improvements to your code. At first, you’ll start renaming variables to more clearly communicate intent. Then you’ll start extracting functions into smaller and smaller functions until most of them are only a few lines long. After a while, these improvements will simply become second nature and your overall code-base will be clearer to you. In the clearer code-base, you’ll start to see other things that you can improve, things that were hidden from you before, much like a hole in the wall can be hidden by a pile of laundry. Historically, most software becomes a vicious cycle of layering hack on top of hack until the software atrophies and dies and has to be rewritten. Rinse and repeat. You can reverse the direction of that cycle and make it a virtuous one by layering small improvement on top of small improvement, so that the software will prosper and flourish.

On the surface, your job is to write software that solves a specific problem within a certain timeframe. But your job is also to make sure that that software will survive in the face of change. The first part is what they can teach in college. The second part is much harder. Learn it.

Read the blogs of other developers. Read books about the profession. Listen to podcasts about technology. Become active-minded about your profession. And next time some developer is excited because s/he’s found some better way of doing things, don’t say “there’s no silver bullet.” Say, “great job—how can I use that?”

Choosing Between Alternative Designs

As a software engineer, I often find my team faced with a choice between two or more options for how to solve a particular problem. What will ultimately end up being the correct choice depends on information we may not have at the time we have to choose an option.

The arguments for and against the various alternatives are usually made along the lines of complexity, flexibility, and time to develop. The simpler solutions take less time to develop, but will be less flexible in the face of potential changes. More flexible solutions are great, except that they take more time to develop. If the expected changes never happen, the extra time to develop the feature is wasted effort.

Since we are not omniscient, we cannot know a priori which design alternative will ultimately be correct. Because we need to make a choice right now, we need some kind of guiding principle to help make a decision.

Charlie Poole recently expressed such a principle in a Team Leadership Google Group I read, in this thread on pair programming:

If I saw a pair debating the best approach for more than three minutes I’d
probably ask them to decide which approach to try first. This leads to useful
considerations, like "Will it be easier to refactor from A to B, or from B to A, if we
change our minds later?" When reminded that the goal is to decide what to do in the
next 5 minutes, not what is "best" in some absolute sense, most pairs move on…

When I talked about this post with my girlfriend Emily, she said “When hanging a picture, it’s better to drill a pilot hole sized to the screw first. If you drill into a stud, no harm done. If you start with a hole sized for an anchor and drill into a stud, you have patching, sanding, and painting to do.” (A stud finder helps in these situations too, but you get the idea.) I’m sure you all have similar examples.

This idea integrates nicely with the principles of YAGNI and Last Responsible Moment. Software, by nature, will change. We are unable to predict how and in what way it will change. Well-designed software is not over-engineered software. If we tried to write software that would be malleable in the face of any kind of change, ever, we would never finish. There are simply too many possibilities. It would be an expensive waste of time. The alternative is not to write software that cannot change (otherwise it would no longer be soft-ware), but to make intelligent choices about which axes of change we will support.

In the face of two similar alternatives, we should choose the one that will be easier to change. This will usually be the one that commits the fewest resources, imposes the weakest restrictions, and takes the least time.

Agile and Concentration

I was reading last night on NoodleFood on the Complexity of the Conceptual Mind. Diana reposts a blog about problems of concentration. The original author starts by giving an example of how a seemingly simple task can spiral into a long sequence of related tasks. The end result of this process is often a failure to accomplish the original task. The ability to focus on long and short-term goals at the same time is a key component of project success.

Joel Spolsky has a recent post entitled Capstone projects and time management. His article begins by discussing the failure of universities to prepare students for real-world software projects, and ends with this comment:

I’ve been blaming students, here, for lacking the discipline to do term-length projects throughout the term, instead of procrastinating, but of course, the problem is endemic among non-students as well. It’s taken me a while, but I finally learned that long-term deadlines (or no deadlines at all) just don’t work with professional programmers, either: you need a schedule of regular, frequent deliverables to be productive over the long term. The only reason the real world gets this right where all-student college teams fail is because in the real world there are managers, who can set deadlines, which a team of students who are all peers can’t pull off.

Regular frequent deliverables? That sounds like the agile value of short iterations to me. What’s important here is that two separate needs of the human mind are met by small processes. The first is a need to see how what you’re doing fits into the larger project or task you’re trying to accomplish. The second is to aid your ability to concentrate on the task at hand.

TDD

I find this discussion of time-management to be very interesting. Since I’ve adopted agile programming methodologies, I’ve noticed an increase in my productivity. Some of the productivity gains stem from my increased mastery of good design practices. But I think the major factor is my focus on short deliverable iterations. Writing unit tests is a way of maintaining the context of what I need to get done. Often I’ll be writing a test for feature-A, which I’ll discover depends on feature-B that I haven’t written yet. I can stop working on test-A and go write feature-B, building tests for feature-B as I go. When I’m done with feature-B, I still have the failing test for feature-A to remind me what remains to be done. On a day to day basis, the tests allow me to branch into sub-tasks at will without too much mental strain. TDD, in addition to its other virtues, is an aid to concentration.

Refactoring

Disciplined refactoring can also be an aid to concentration. In Martin Fowler’s Refactoring: Improving the Design of Existing Code, he discusses in detail many of the ways code can be changed. He starts with examples as simple as renaming a variable. Over the course of the examples, it becomes clear that larger refactorings are composed of smaller ones. The main takeaway for me was that in order to be successful with large refactorings, I have to break the task into smaller ones and complete each small refactoring before moving on to the next one.

For example, let’s say I need to split a class into two smaller classes. I can introduce the new class and immediately move all members from class A that I think belong in class B. If I do this, I’ll immediately break large pieces of the system. I’ll have to deal with all the various compile-errors and redirection points all at once, and this before I have a chance to verify that this is really what I want to do. On the other hand, I could create class B and begin by exposing it as a property of A. I could then move one property or method at a time from A to B, correcting the compile-errors as I went. When I have A and B looking the way I want, I can simply remove the property from A, completing the separation.
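As a rough sketch of the intermediate state in the second approach, consider the following. The class and member names are hypothetical: Order plays the role of class A and ShippingDetails the role of class B.

```csharp
// Hypothetical example: Order is "class A", ShippingDetails is "class B".
public class ShippingDetails
{
    // Moved here from Order in an earlier small step.
    public string Street { get; set; }
}

public class Order
{
    private readonly ShippingDetails _shipping = new ShippingDetails();

    // Step 1: expose the new class as a property of the old one, so
    // callers can be redirected one member at a time.
    public ShippingDetails Shipping
    {
        get { return _shipping; }
    }

    // Not moved yet: stays on Order until its own small step.
    public string City { get; set; }
}
```

At this point callers that used order.Street have been updated to order.Shipping.Street, while order.City still compiles untouched. Each move breaks only the handful of call sites for one member.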

Again, at any time during the refactoring it is only necessary to be attentive to a few issues at once.

Long Term Tasks

Not every task can be finished in a few hours. For longer-term tasks I have to break my work into milestones that are meaningful to me. I’ve started using an online tool called AgileZen to keep track of these things. I discovered AgileZen in a recent .NET Rocks episode. I’ve been using it for a while now and find it to be exactly what I need to keep track of the larger details of my work.

Conclusion

To run a successful project you must break long-term goals into shorter-term goals. This is for the same reason that a successful architecture requires that you break your project into smaller components, components into smaller classes, and classes into smaller functions. Software development is like any other task in life. You have to adapt your development and task-tracking methods to facilitate success. There’s only so much you can hold in your mind at once, but agile development methods allow you to tend the trees as you grow the forest.

Ninite

While waiting on VM installations of Ubuntu and Debian Linux, I discovered ninite.com via Gizmodo. Ninite will bulk-install a range of different free apps on Windows systems. As I’ve just upgraded my laptop to Windows 7, I’m happy to find this time saver!

The way it works

On the main page, you select the apps you wish to install, and Ninite builds a custom installer based on your selections. This is a file that you download, so it is transferable to other machines. If you have other free products you’d like to see as part of the Ninite installer, you can recommend them to the site.

When I first tried executing the custom installer on Windows 7, I got an unspecified error. Running the installer as Administrator seems to have resolved that issue. In addition, Ninite was smart enough to detect which software was already installed.

My choices

Chrome Browser

Windows Messenger and Google Talk

iTunes and Hulu Desktop

Paint.NET and Picasa

Flash Player for IE and other browsers and Silverlight

Google Earth

7-Zip

Notepad++

I took the opportunity to suggest Kantaris as another media player, and Virtual Clone Drive for DVD image software. I used to use Daemon Tools Lite but I haven’t been able to get it to work in 64-bit Windows 7.

Original Gizmodo link: http://gizmodo.com/5388680/ninite-helps-you-upgrade-to-windows-7-by-installing-up-to-58-great-apps-at-once

Starting Point

This is my first post on a new blog. My goal with this blog is to continuously improve my skills by sharing what I’ve learned, and inviting constructive criticism from the developer community.

I contract full time developing line of business applications for a local manufacturing company. I’m currently focused on developing desktop applications for machine operators in the plants which provide real-time process flow information to decision makers. I also do some part-time work on nights and weekends.

I mostly enjoy working with the Microsoft tool stack, including Visual Studio and SQL Server. Recently, I’ve fallen in love with WPF. Over the last year I’ve become competent in WPF, WCF, Silverlight, LINQ to SQL, and Entity Framework.
