Onion Architecture: An Opinionated Approach Part 2, Anemic Data Models

“Anemic Data Model” is my term for a domain model that describes entities and their relationships to other entities but has little or no business behavior. It basically serves as a contract for a data store. Its purpose is structural, and often to make querying easier.

In ShowPlanner, the domain is about planning and selling tickets to musical events called “shows.” Venues contain one or more Stages, and a Show can consist of one or more Performances on one or more Stages at a Venue. Tickets are sold to a Show.

[Figure: DataModel, a class diagram of the ShowPlanner entities]

This data model represents the universe of entities, attributes, and relationships that the ShowPlanner application will interact with at this time. These models have no methods or functions, no behavior of any kind.

The Problem of Behavior

Why no behavior? Putting behavior in domain models raises some difficult design questions. How do the domain objects get their dependencies? Which domain object should own the method or methods that implement a given behavior?

Let’s take a simple case. Suppose you want to model what happens when a Ticket is purchased by a Customer. When a show is created, a number of tickets are created as well. When a Customer buys a Ticket, the ticket will be marked as sold.

Ownership

Generally speaking, we seldom write basic CRUD applications in an enterprise environment. Application behavior tends to live in the relationships, and in what happens when certain kinds of relationships are created and destroyed. This means that application operations often involve the state of multiple domain objects that are directly or indirectly involved in the relationship that is changing.

There are several possible ways to model this operation.

    public class Customer
    {
        public int? CustomerId { get; set; }

        public IList<Ticket> Tickets { get; set; }

        [Required]
        public string Name { get; set; }

        public void Purchase(Ticket ticket)
        {
            ticket.SoldTo = this;
            ticket.Sold = true; 
        }
    }

 

Or

    public class Ticket
    {
        public int? TicketId { get; set; }

        public int? ShowId { get; set; }
        [Required]
        public Show Show { get; set; }

        public int? CustomerId { get; set; }
        public Customer SoldTo { get; set; }

        public bool? Sold { get; set; }

        [Required]
        public decimal Price { get; set; }

        public void SellTo(Customer customer)
        {
            this.Sold = true;
            this.SoldTo = customer;
        }
    }

Or

    public class Show
    {
        public int? ShowId { get; set; }

        public IList<Ticket> Tickets { get; set; }

        [Required]
        public string Title { get; set; }

        public void SellTicket(Customer customer, Ticket ticket)
        {
            ticket.Sold = true;
            ticket.SoldTo = customer;
        }
    }

 

Which domain object should own the method for selling the ticket? It will depend on who you ask. When a given operation affects multiple entities in the Domain, there is no objective way to decide which entity gets the behavior. The method performing the operation has no natural home.

Dependencies

Let’s assume you secure agreement that Customer should own this operation. You’re happily coding along on something else when, a few weeks later, your customer tells you that when the ticket is sold, a message needs to be sent to a billing system. The messaging system is represented by an interface called IMessageSender. What happens in the domain model now? How does the Customer class get a reference to the IMessageSender implementation?

You could do this, I guess:

        public void Purchase(Ticket ticket, IMessageSender messageSender)
        {
            ticket.SoldTo = this;
            ticket.Sold = true;

            messageSender.Enqueue(new BillForTicketMessage(this, ticket));
        }

But that’s an ugly road to walk down. You’ll start expanding dependencies when you need to add logging, authorization, validation, and a host of other concerns. Suddenly, what started out as a very simple method on a domain model is full of complexity.

Constructor injection is not viable if you’re expecting to use an ORM to retrieve your Customer instance. Even assuming you could make that work, you have to consider that with each new operation involving Customer, you’ll be adding still more dependencies, which will turn Customer into more of a God class than will be supportable.

Wouldn’t design-by-composition work better?
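As a rough illustration of what composition could look like here, the sketch below moves the ticket sale out of Customer and into a separate class that receives its dependencies through its constructor. TicketSalesService and InMemoryMessageSender are hypothetical names of mine, not part of ShowPlanner, and the Enqueue signature on IMessageSender is assumed from the call shown above. The models are trimmed to the properties this example needs.

```csharp
using System;
using System.Collections.Generic;

// Anemic models: state only, no behavior and no dependencies.
public class Customer
{
    public int? CustomerId { get; set; }
    public string Name { get; set; }
}

public class Ticket
{
    public int? TicketId { get; set; }
    public Customer SoldTo { get; set; }
    public bool? Sold { get; set; }
    public decimal Price { get; set; }
}

// Message type named in the article; its shape here is an assumption.
public class BillForTicketMessage
{
    public BillForTicketMessage(Customer customer, Ticket ticket)
    {
        Customer = customer;
        Ticket = ticket;
    }

    public Customer Customer { get; }
    public Ticket Ticket { get; }
}

// Interface named in the article; the Enqueue signature is an assumption.
public interface IMessageSender
{
    void Enqueue(object message);
}

// Hypothetical service: the operation lives here, so neither Customer
// nor Ticket has to "own" it or know about IMessageSender.
public class TicketSalesService
{
    private readonly IMessageSender _messageSender;

    public TicketSalesService(IMessageSender messageSender)
    {
        _messageSender = messageSender ?? throw new ArgumentNullException(nameof(messageSender));
    }

    public void SellTicket(Customer customer, Ticket ticket)
    {
        if (customer == null) throw new ArgumentNullException(nameof(customer));
        if (ticket == null) throw new ArgumentNullException(nameof(ticket));

        ticket.SoldTo = customer;
        ticket.Sold = true;

        _messageSender.Enqueue(new BillForTicketMessage(customer, ticket));
    }
}

// Hypothetical in-memory stand-in, useful for trying the sketch out.
public class InMemoryMessageSender : IMessageSender
{
    public List<object> Sent { get; } = new List<object>();

    public void Enqueue(object message) => Sent.Add(message);
}
```

New concerns such as logging or authorization would become additional constructor parameters on the service (or on other services), while the models stay untouched and remain easy for an ORM to materialize.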

Designing the Domain Models

I said earlier that I see domain models “as merely a means of persisting the state of the application.” To my way of thinking, this is their core purpose. To that end, they should effectively act as a contract to the underlying data store used by your application. They should support both queryability and data modification and be well-organized.

Queryability

To support queryability, data models should be fully connected. This just means that any relationship between entities should be expressed as a reference. Prefer Performance.Show over Performance.ShowId. Some ORMs, such as Entity Framework, support using both the Id and the reference. For others, such as NHibernate, having both properties is an issue. When faced with a choice, prefer a reference over an Id.
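To make the difference concrete, here is a small sketch of a query written against a referential model. The Performance class and its Artist property are assumptions for illustration (only Show.Title appears in the models shown earlier), and the query runs over an in-memory collection rather than through any particular ORM.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Show
{
    public int? ShowId { get; set; }
    public string Title { get; set; }
}

// Hypothetical shape for Performance; the real ShowPlanner model may differ.
public class Performance
{
    public int? PerformanceId { get; set; }

    // Keep the Id only if your ORM supports both the Id and the reference.
    public int? ShowId { get; set; }
    public Show Show { get; set; }

    public string Artist { get; set; }
}

public static class PerformanceQueries
{
    // Because Performance.Show is a reference, the filter can navigate
    // straight to the related entity; no separate lookup by ShowId is needed.
    public static IEnumerable<Performance> ForShowTitled(
        IEnumerable<Performance> performances, string title)
    {
        return performances.Where(p => p.Show != null && p.Show.Title == title);
    }
}
```

With only Performance.ShowId available, the same question would require first finding the matching Show and then filtering by its Id, or writing an explicit join.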

Data Modification

To support data modification, your data models should contain all the fields used in the database. I was writing an integration test recently and needed to create some test data in SQL Server. As I tried to insert one of my entities, I discovered that a field required by the database was not actually on the model. The original developer had only populated the fields he needed for the specific operation he was writing, which resulted in additional work and testing for me. It also exposed a lack of integration test coverage for any operation involving this entity, as it was impossible to create new records using existing code.

Organization

Data models are the center of the onion. They should be in their own assembly or package both to make them easier to find and modify, and to prevent other developers from violating the Dependency Depth principle by referencing higher layers.

Do’s
  • Put your data models in their own project or assembly. This makes them easy to find. They are the innermost circle in your application, so this project should have next to no dependencies on anything else.
  • If your toolset supports it, maintain a class diagram of the models and their relationships. The diagram above was generated by Visual Studio.
  • Prefer referential relationships over using Ids to identify related data. Use both if your ORM supports it.
  • This should go without saying, but use an ORM.
  • In the anal-retentive list of things to consider:
    • List the primary key first.
    • List single-entity references second.
    • List collection references third.
    • List the entity’s fields last.
    • List audit data (create date, create user, etc.) dead last.
    • Alphabetize fields. Why? Because with larger data models it gets really hard to find the field you’re looking for in an unsorted list. This is an easy thing to habitualize and saves a good bit of headache down the road.
  • In .NET, use System.ComponentModel.DataAnnotations attributes to describe your models’ metadata when it is known.
    • Especially use StringLength because it produces a nicer error message than “String or binary data would be truncated” when using SQL Server as the database.
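
The ordering and annotation suggestions in the Do’s list can be sketched on the Customer model from earlier. The maximum length of 100, the CreateDate audit field, and the ModelValidation helper are assumptions added for illustration. The helper uses Validator.TryValidateObject from System.ComponentModel.DataAnnotations to show that a StringLength violation can be caught before the data ever reaches the database.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

public class Ticket
{
    public int? TicketId { get; set; }
}

public class Customer
{
    // Primary key first.
    public int? CustomerId { get; set; }

    // Collection references next (this model has no single-entity references).
    public IList<Ticket> Tickets { get; set; }

    // The entity's own fields, alphabetized.
    [Required]
    [StringLength(100)] // assumed column length
    public string Name { get; set; }

    // Audit data dead last (hypothetical field).
    public DateTime? CreateDate { get; set; }
}

// Hypothetical helper; it lives outside the model so the model stays behavior-free.
public static class ModelValidation
{
    public static IList<ValidationResult> Validate(object model)
    {
        var results = new List<ValidationResult>();

        // validateAllProperties: true checks every attribute, not just [Required].
        Validator.TryValidateObject(
            model, new ValidationContext(model), results, validateAllProperties: true);

        return results;
    }
}
```

Validating a Customer whose Name is longer than 100 characters returns a single ValidationResult that names the offending member, which is easier to act on than a truncation error raised by the database.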
Don’ts
  • Don’t put behavior on the models.
  • Don’t accept dependencies in the models.
  • Don’t put ORM-specific attributes on the models unless you have no other choice.
  • Don’t put anything on the models that isn’t actually in your data store.
    • This is another way of saying “don’t put behavior on the models.” But I’ve seen developers put properties on models that they calculate and set in the course of one operation and that aren’t used in other operations. This breaks down when another operation wants to use the calculated value.
  • Don’t use data models in your UI. In web applications, this means don’t use data models in your pages and controllers. Use View Models and some sort of Request-Response API instead.
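
As one possible reading of the last two Don’ts, the sketch below keeps a calculated value off the data model and hands the UI a view model instead. ShowSummaryViewModel, ShowSummaryMapper, and the TicketsRemaining value are hypothetical names invented for this example; the Show and Ticket classes are trimmed versions of the models shown earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Data models: only what is actually in the data store.
public class Ticket
{
    public int? TicketId { get; set; }
    public bool? Sold { get; set; }
}

public class Show
{
    public int? ShowId { get; set; }
    public IList<Ticket> Tickets { get; set; }
    public string Title { get; set; }
}

// Hypothetical view model: shaped for one screen, free to hold calculated values.
public class ShowSummaryViewModel
{
    public string Title { get; set; }
    public int TicketsRemaining { get; set; }
}

// Hypothetical mapper: the calculation happens here, not on the Show model.
public static class ShowSummaryMapper
{
    public static ShowSummaryViewModel ToViewModel(Show show)
    {
        if (show == null) throw new ArgumentNullException(nameof(show));

        var tickets = show.Tickets ?? new List<Ticket>();

        return new ShowSummaryViewModel
        {
            Title = show.Title,
            // A ticket counts as remaining unless it is explicitly marked sold.
            TicketsRemaining = tickets.Count(t => t.Sold != true)
        };
    }
}
```

A page or controller would then bind to ShowSummaryViewModel, so changes to the Show data model do not ripple directly into the UI.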

This is Part 2 of a series

Disclaimer: This series of articles is intentionally opinionated. From an architectural perspective I am much more interested in clearly defining each layer than I am in the choice of specific implementation pattern. However, an architecture doesn’t exist without the details that make it up, so I must choose some implementation pattern so that the architecture can be revealed. Since I’m the one writing the series, I’ll choose implementation patterns I like. :) If you have a better idea, take a fork of ShowPlanner and share it!
