Monthly Archives: December 2009
Diff – Part 2

In Part 1 of this series, I created a basic class that would find differences between two versions of the same list. I’ve made some improvements to the original implementation that I’d like to share.

To begin with, I read this post on the difference between the Any() and Count() extension methods. While List<T> does implement ICollection, I wanted to change the structure of DiffResult<T> so that if I decide to change the datatype of Additions, Deletions, and Modifications, the IsEmpty property will be more efficient. To that end, I changed IsEmpty as follows:

public bool IsEmpty
{
    get { return !this.HasChanges;  }   
}

public bool HasChanges
{
    get { return this.Additions.Any()
                 || this.Deletions.Any()
                 || this.Modifications.Any(); }
}

The next thing I wanted to be able to do is automatically synchronize changes from the newlist back to the oldlist. Here is the unit test:

[Test]
public void ResolveChangesBetweenTwoLists()
{
    // Arrange: Declare any variables or set up any conditions
    //          required by your test.
    var newVersion = new List<Entity>();
    var oldVersion = new List<Entity>();

    var identityComparer = new EntityIdentityComparer();
    var modificationComparer = new EntityEqualityComparer();

    var diff = new Diff<Entity>(identityComparer, modificationComparer);

    newVersion.Add(new Entity() { Id = 1, Name = "One"}); // addition
    newVersion.Add(new Entity() {Id =2, Name="2"}); // modification

    oldVersion.Add(new Entity() { Id = 2, Name = "Two"}); // modification
    oldVersion.Add(new Entity() { Id=3, Name="Three"}); // deletion

    // Act:     Perform the activity under test.
    diff.Resolve(newVersion, oldVersion);
    var result = diff.Execute(newVersion, oldVersion);

    // Assert:  Verify that the activity under test had the
    //          expected results
    Assert.IsTrue(result.IsEmpty); // all changes should have been resolved.
}

My first stab at the implementation is as follows:

public class Diff<T>
{
    public void Resolve(IList<T> newVersion, IList<T> oldVersion)
    {
        var results = this.Execute(newVersion, oldVersion);

        foreach (var element in results.Deletions)
            oldVersion.Remove(element);

        foreach (var element in results.Additions)
            oldVersion.Add(element);

        foreach (var element in results.Modifications)
        {
            var keyItem = element;
            var item = oldVersion.First(e => _identityComparer.Equals(e, keyItem));
            var index = oldVersion.IndexOf(item);
            oldVersion[index] = element;
        }
    }
…snipped
}

This implementation is fine as far as it goes, but there is a drawback that I don’t like. If the user of this codes wishes to determine if there are changes prior to executing the Resolve method, s/he is required to execute it a second time during the Resolve step. I like that placing Resolve() on the Diff class provides a single-step execution, so I’m going to move the real work to the DiffResult class, but leave the Resolve method where it is. I changed the implementation of Diff.Resolve() to this:

public void Resolve(IList<T> newVersion, IList<T> oldVersion)
{
    var results = this.Execute(newVersion, oldVersion);
    results.Resolve(oldVersion, this._identityComparer);
}

I added DiffResult.Resolve() as follows:

public void Resolve(IList<T> oldVersion, IEqualityComparer<T> identityComparer)
{
    this.Deletions.ForEach(e => oldVersion.Remove(e));
    this.Additions.ForEach(oldVersion.Add);
    this.Modifications.ForEach( e =>
                                    {
                                        var item =
                                            oldVersion.First(element => identityComparer.Equals(element, e));
                                        var index = oldVersion.IndexOf(item);
                                        oldVersion[index] = e;
                                    }
        );

}

The updated source code for this solution can be found here.

Diff – Part 1

Have you ever needed to know if the contents of a list has changed from a given snapshot? I recently decided that I would write a quick-and-dirty Diff algorithm. I started with the following test:

[Test]
public void NullSet()
{
    // Arrange: Declare any variables or set up any conditions
    //          required by your test.
    var source = new List<Entity>();
    var target = new List<Entity>();

    // Act:     Perform the activity under test.
    var diff = new Diff<Entity>();
    
    // Assert:  Verify that the activity under test had the
    //          expected results
    var changeSet = diff.Execute(source, target);
    Assert.IsTrue(changeSet.IsEmpty);
}

I defined <Entity> as a class with an Id and a Name. This version of the diff algorithm relied on reference comparisons to find changes between the two lists. To extend it’s functionality a bit, I decided to create an EqualityComparer to replace the reference comparison.

Here is the EqualityComparer:

public class EntityIdentityComparer : IEqualityComparer<Entity>
{
    public bool Equals(Entity x, Entity y)
    {
        return x.Id == y.Id;
    }

    public int GetHashCode(Entity obj)
    {
        return obj.Id.GetHashCode();
    }
}

And here are the next tests:

[Test]
public void OneAddition()
{
    // Arrange: Declare any variables or set up any conditions
    //          required by your test.
    var source = new List<Entity>();
    var target = new List<Entity>();
    var identityComparer = new EntityIdentityComparer();

    var diff = new Diff<Entity>(identityComparer);

    source.Add(new Entity());

    // Act:     Perform the activity under test.
    var changeSet = diff.Execute(source, target);

    // Assert:  Verify that the activity under test had the
    //          expected results
    Assert.AreEqual(1, changeSet.Additions.Count);
    Assert.IsFalse(changeSet.IsEmpty);
}
[Test]
public void OneDeletion()
{
    // Arrange: Declare any variables or set up any conditions
    //          required by your test.
    var source = new List<Entity>();
    var target = new List<Entity>();
    var identityComparer = new EntityIdentityComparer();

    var diff = new Diff<Entity>(identityComparer);

    target.Add(new Entity());

    // Act:     Perform the activity under test.
    var changeSet = diff.Execute(source, target);

    // Assert:  Verify that the activity under test had the
    //          expected results
    Assert.AreEqual(1, changeSet.Deletions.Count);
    Assert.IsFalse(changeSet.IsEmpty);
}

[Test]
public void OneAdditionAndOneDeletion()
{
    // Arrange: Declare any variables or set up any conditions
    //          required by your test.
    var source = new List<Entity>();
    var target = new List<Entity>();
    var identityComparer = new EntityIdentityComparer();

    var diff = new Diff<Entity>(identityComparer);

    source.Add(new Entity() { Id = 1 });
    target.Add(new Entity() { Id = 2 });

    // Act:     Perform the activity under test.
    var changeSet = diff.Execute(source, target);

    // Assert:  Verify that the activity under test had the
    //          expected results
    Assert.AreEqual(1, changeSet.Additions.Count);
    Assert.AreEqual(1, changeSet.Deletions.Count);
    Assert.IsFalse(changeSet.IsEmpty);
}

Here is the test I wrote to detect modifications:

[Test]
public void OneModification()
{
    // Arrange: Declare any variables or set up any conditions
    //          required by your test.
    var source = new List<Entity> { new Entity() { Id = 1, Name = "Test" } };
    var target = new List<Entity> { new Entity() { Id = 1, Name = "Test 1" } };

    var identityComparer = new EntityIdentityComparer();

    var diff = new Diff<Entity>(identityComparer);

    // Act:     Perform the activity under test.
    var changeSet = diff.Execute(source, target);

    // Assert:  Verify that the activity under test had the
    //          expected results
    Assert.AreEqual(0, changeSet.Additions.Count);
    Assert.AreEqual(0, changeSet.Deletions.Count);
    Assert.AreEqual(1, changeSet.Modifications.Count);
    Assert.IsFalse(changeSet.IsEmpty);
}

Notice that my source and target lists have an entity that, according to the IdentityComparer, are the same entity. I need a new concept to be able to tell them apart. I need to know if two entities with the same Identity have the same values. In short, I need another EqualityComparer. This time, instead of testing for equality on the Id property, it should test for equality on all other properties. Here is the EqualityComparer:

public class EntityEqualityComparer : IEqualityComparer<Entity>
{
    public bool Equals(Entity x, Entity y)
    {
        return (x.Name == y.Name);
    }

    public int GetHashCode(Entity obj)
    {
        return obj.GetHashCode();
    }
}

And here is the refactored modification test:

[Test]
public void OneModification()
{
    // Arrange: Declare any variables or set up any conditions
    //          required by your test.
    var source = new List<Entity> {new Entity() {Id = 1, Name = "Test"}};
    var target = new List<Entity> {new Entity() {Id = 1, Name = "Test 1"}};

    var identityComparer = new EntityIdentityComparer();
    var modificationComparer = new EntityEqualityComparer();

    var diff = new Diff<Entity>(identityComparer
        , modificationComparer);

    // Act:     Perform the activity under test.
    var changeSet = diff.Execute(source, target);

    // Assert:  Verify that the activity under test had the
    //          expected results
    Assert.AreEqual(0, changeSet.Additions.Count);
    Assert.AreEqual(0, changeSet.Deletions.Count);
    Assert.AreEqual(1, changeSet.Modifications.Count);
    Assert.IsFalse(changeSet.IsEmpty);
}

Here is the definition of DiffResult:

public class DiffResult<T>
{
    public bool IsEmpty
    {
        get { return this.Additions.Count == 0 
                     && this.Deletions.Count == 0
                     && this.Modifications.Count == 0; }
    }

    private readonly List<T> _additions = new List<T>();
    public List<T> Additions
    {
        get { return this._additions; }
    }

    private readonly List<T> _deletions = new List<T>();
    public List<T> Deletions
    {
        get { return this._deletions; }
    }

    private readonly List<T> _modifications = new List<T>();
    public List<T> Modifications
    {
        get { return this._modifications; }
    }
}

And the implementation of Diff:

 

/// <summary>
/// Used to find changes between two snapshots of a list.
/// </summary>
/// <typeparam name="T"></typeparam>
public class Diff<T>
{
    /// <summary>
    /// Determines if two objects are supposed to be the same object.
    /// </summary>
    private readonly IEqualityComparer<T> _identityComparer;

    /// <summary>
    /// Determines if two objects have equivalent values.
    /// </summary>
    private readonly IEqualityComparer<T> _modificationComparer;

    /// <summary>
    /// Constructor
    /// </summary>
    /// <param name="identityComparer"></param>
    /// <param name="modificationComparer"></param>
    public Diff(IEqualityComparer<T> identityComparer
        , IEqualityComparer<T> modificationComparer)
    {
        Require.That
            .IsNotNull(identityComparer, "Identity comparer is required.")
            .IsNotNull(modificationComparer, "Modification comparer is required.");

        this._identityComparer = identityComparer;
        this._modificationComparer = modificationComparer;
    }

    /// <summary>
    /// Returns a changeset of the differences between the new and original versions of the list.
    /// </summary>
    /// <param name="newList">The new version of the list.</param>
    /// <param name="originalList">The original version of the list.</param>
    /// <returns></returns>
    public DiffResult<T> Execute(IEnumerable<T> newList
        , IEnumerable<T> originalList) 
    {
        var result = new DiffResult<T>();

        var addedItems = newList.Except(originalList, this._identityComparer)
            .ToList();

        var deletedItems = originalList.Except(newList, this._identityComparer)
            .ToList();

        var modifiedItems = from newElement in newList
                            from originalElement in originalList
                            where this._identityComparer.Equals(newElement, originalElement) 
                            && !this._modificationComparer.Equals(newElement, originalElement)
                            select newElement;

        result.Modifications.AddRange(modifiedItems);
        result.Additions.AddRange(addedItems);
        result.Deletions.AddRange(deletedItems);

        return result;
    }
}

You can download the solution files here.