Design Tips: Allow Consumers To Handle Exceptions

I ran into a piece of code where it implemented a kind of retry pattern. It will retry a call if an exception is thrown from the first call. A simple version looks like (not a production or real code).

One of a problem with the above code is that when the first timeout exception occurs, the consumer cannot see it anywhere. At least, the ServiceCallHelper should give consumers a chance to deal with exceptions.

How do we do that with less impact on the design? We should strive for a solution which does not introduce a new dependency. I have seen a tendency of injecting a Logger. I might, and will, work. However, now you have dependency on the logger. And the consumer has to know about that logger.

Now take a look at the new version that I propose

With that simple change, we have introduced an extension point that consumers will be happy. And we ensure that our API does not hide away exceptions.

If you are designing an API, consider your extension points, and whether the API allows consumers to have a chance to be aware of exceptions.

.NET framework comes with powerful built classes Func and Action. Take advantages of the two might help you simplified your design.

Leaky Abstraction – Linq Usage

I am not sure how many percents of developers are thinking about Leaky Abstraction when coding, especially coding in OOP umbrella. Me? Not much since recently. I do not know why. I just simply did not think about it. Common trends are that we, as developers, focus on the new technologies, design pattern, best practices, … all those cool and fancy stuff. Many developers build great things without knowing that concept. I understood that. However, what if we know it, we might build a better product with fewer bugs and easy to maintain. When I actually understand it, I feel smarter. Let’s see what I am talking about.

First, check it out from these trusted resources. If you are a developer, there is a high chance that you know those sources.

Wikipedia

Joel on Software

Coding Horror

Recent years, I improved my skill via Pluralsight The courses from Zoran Horvat have changed the way I think about programming, the way I think about OOP. At the same time, I have many code base that I have been working on for years. When I looked at the code and compared to what I have learned, there is a big gap. The problem was that I did not know how to close that gap. I was kind of stuck in my own thinking.

With time, I started to understand them deeply. And then I started to make small changes in a stable, safe way. Let’s start the journey.

 

Let’s assume that we have a School and many Teachers. And at the end of an education year, the school has to give statistic about

  1. How many teachers does it have?
  2. How many mathematics teachers?
  3. How many of them have been in the school for more than 10 years?

They are very basic requirements. Without any hesitation, one can come up with this code. It works perfectly as required.

   public class School
    {
        public string Name { get; set; }
        public IList<Teacher> Teachers { get; set; }
    }

    public class Teacher
    {
        public string Name { get; set; }
        public string Specialty { get; set; }
        public DateTime StartedOn { get; set; }
        public bool IsStillAtWork { get; set; }
    }
    class Program
    {
        static void Main(string[] args)
        {
            var school = new School
            {
                Name = "Gotham City"
            };
            school.Teachers.Add(new Teacher
            {
                Name = "Batman",
                Specialty = "Mathematics",
                StartedOn = DateTime.Now.AddYears(-11),
                IsStillAtWork = true
            });
            school.Teachers.Add(new Teacher
            {
                Name = "Joker",
                Specialty = "Chemical",
                StartedOn = DateTime.Now.AddYears(-6),
                IsStillAtWork = false
            });
            Console.WriteLine("Total teachers: {0}", school.Teachers.Count(x => x.IsStillAtWork));
            Console.WriteLine("Mathematics teachers are: {0}", 
                string.Join("; ", school.Teachers
                                        .Where(x => x.Specialty== "Mathematics")
                                        .Select(x => x.Name)));
            Console.WriteLine("> 10 years teachers are: {0}",
                string.Join("; ", school.Teachers
                                        .Where(x => DateTime.Now >= x.StartedOn.AddYears(10))
                                        .Select(x => x.Name)));
        }
    }

How many issues can you find in the above code? The total teacher count only counts the IsStillAtWork. However, the next 2 statements do not. Once identified, a developer can go in and fix the code easily: by adding one more condition for each where statement. A short-revised version

            Console.WriteLine("Total teachers: {0}", school.Teachers.Count(x => x.IsStillAtWork));
            Console.WriteLine("Mathematics teachers are: {0}", 
                string.Join("; ", school.Teachers
                                        .Where(x => x.Specialty== "Mathematics" && x.IsStillAtWork)
                                        .Select(x => x.Name)));
            Console.WriteLine("> 10 years teachers are: {0}",
                string.Join("; ", school.Teachers
                                        .Where(x => DateTime.Now >= x.StartedOn.AddYears(10) && x.IsStillAtWork)
                                        .Select(x => x.Name)));

So far so good! Where is the problem? where is the “Leaky Abstraction”?

 

Let’s distinguish the consumer and the domain. The “Program” class is the consumer. The School and Teacher are the domain. For those simple requirements, the consumer has to know too much about the domain knowledge which should be captured by the domain itself.

  • The consumer has to know how to filter teachers that are at work.
  • The consumer has to know how to decide a teacher is a mathematics.
  • The consumer has to know how to decide a teacher is at work for a long time.

What if we have many consumers over the School class? Then each consumer has to know that knowledge and makes its own implementation. Here we have a real problem of Leaky Abstraction at the simplest level of using Linq to filter data. We also have the duplication issue. The logic is duplicated. If the domain has only one consumer, that code is fine. It does what it is expected to do. In many applications, it is not the case, unfortunately.

It is really hard to have a code without leaky abstraction. It is kind of an impossible mission. What we should do is to aware of the situation, weigh the pros and cons of fixing them.

 

So what are possible solutions? The goal is to capture the logic inside the domain.

Solution 1: We could move the logic into the School class as below

   public class School
    {
        public string Name { get; set; }
        public IList<Teacher> Teachers { get; set; }

        public int CountTeacherAtWork()
        {
            return Teachers.Count(x => x.IsStillAtWork);
        }

        public IEnumerable<Teacher> MathematicsTeachers()
        {
            return Teachers.Where(x => x.Specialty == "Mathematics" && x.IsStillAtWork);
        }

        public IEnumerable<Teacher> ExperiencedTeachers()
        {
           return Teachers.Where(x => DateTime.Now >= x.StartedOn.AddYears(10) && x.IsStillAtWork);
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var school = new School
            {
                Name = "Gotham City"
            };
            school.Teachers.Add(new Teacher
            {
                Name = "Batman",
                Specialty = "Mathematics",
                StartedOn = DateTime.Now.AddYears(-11),
                IsStillAtWork = true
            });
            school.Teachers.Add(new Teacher
            {
                Name = "Joker",
                Specialty = "Chemical",
                StartedOn = DateTime.Now.AddYears(-6),
                IsStillAtWork = false
            });
            Console.WriteLine("Total teachers: {0}", school.CountTeacherAtWork());
            Console.WriteLine("Mathematics teachers are: {0}", 
                string.Join("; ", school.MathematicsTeachers()
                                        .Select(x => x.Name)));
            Console.WriteLine("> 10 years teachers are: {0}",
                string.Join("; ", school.ExperiencedTeachers()
                                        .Select(x => x.Name)));
        }
    }

The logic is captured in 3 methods: CountTeacherAtWork, MathematicsTeachers, and ExperiencedTeachers. So far so good! Any consumer can consume the API without worrying the logic. And we also solve the duplication issue.

But that solution has some issues

  1. The number of methods in School class will explode.
  2. Do we forget to check if Teachers list is null? Are we sure that we have a valid Teacher list?
  3. When adding new methods operating on the Teacher list, some might forget the add the IsStillAtWork condition.

Just name a few. In my opinion, the second issue is the worst.

Solution 2: Capture logic in a Collection class

It is a better solution. Instead of thinking about school and teacher, what if we think of “Collection of Teachers“? So at any point in time, instead of working with a single Teacher, we work with a collection of teachers. Sometimes, the collection might be empty or 1 item.

In OOP, when there is logic, they should be captured inside objects. Let’s another version, where the logic is captured in an object.

    public class TeacherCollection
    {
        private readonly IList<Teacher> _teachers = new List<Teacher>();

        public TeacherCollection(IEnumerable<Teacher> teachers)
        {
            if (teachers != null)
                _teachers = teachers.Where(x => x.IsStillAtWork).ToList();
        }

        public TeacherCollection WhereTeachMathematics()
        {
            return new TeacherCollection(_teachers.Where(x => x.Specialty == "Mathematics"));
        }

        public TeacherCollection WhereExperienced()
        {
            return new TeacherCollection(_teachers.Where(x => DateTime.Now >= x.StartedOn.AddYears(10)));
        }

        public int Count
        {
            get { return _teachers.Count; }
        }

        public IEnumerable<Teacher> AsEnumerable
        {
            get { return _teachers.AsEnumerable(); }
        }
    }

    public class School
    {
     
        public string Name { get; set; }
        public IList<Teacher> Teachers { get; set; }

        public TeacherCollection TeacherCollection
        {
            get
            {
                return new TeacherCollection(Teachers);
            }
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var school = new School
            {
                Name = "Gotham City"
            };
            school.Teachers.Add(new Teacher
            {
                Name = "Batman",
                Specialty = "Mathematics",
                StartedOn = DateTime.Now.AddYears(-11),
                IsStillAtWork = true
            });
            school.Teachers.Add(new Teacher
            {
                Name = "Joker",
                Specialty = "Chemical",
                StartedOn = DateTime.Now.AddYears(-6),
                IsStillAtWork = false
            });
            Console.WriteLine("Total teachers: {0}", school.TeacherCollection.Count);
            Console.WriteLine("Mathematics teachers are: {0}", 
                string.Join("; ", school.TeacherCollection
                                        .WhereTeachMathematics()
                                        .AsEnumerable
                                        .Select(x => x.Name)));
            Console.WriteLine("> 10 years teachers are: {0}",
                string.Join("; ", school.TeacherCollection
                                        .WhereExperienced()
                                        .AsEnumerable
                                        .Select(x => x.Name)));
        }
    }

By introducing the TeacherCollection object, we can handle the query side of the object. As the requirement arises, the number of “Where” clauses also increase. That is a challenge and we must keep an eye on the design and modify when necessary. Regardless of the problems might arise with the new design, we gain these benefits

  1. The collection object is completed. We do not have to check for the state of the object. The inner collection is always not null.
  2. This is a perfect example of the “Map-Reduce” pattern. Where the “Where” is the reduce. By introducing a number of proper “Where”, we can capture the logic and chain the condition to the final collection that we want. It is also easy to create a Map function which allows us to transform to the collection object.
  3. Immutable. Each “Where” will result in a new collection object. Because the collection object is designed to support the query side of the operation, it is crucial to keep in mind that you must not add methods that will change the internal state of the collection object.
  4. Improve readability. The better naming the better readability!

Hmm, where is the command side? How to modify the teachers? The command operations should belong to the main domain object. That is the School class. I still keep the Teachers property for that purpose. I know it is not a good design when exposing the Teachers property for that purpose. But it is there to demonstrate the point. In production code, I would design a better solution to deal with command operations. It is out of the scope of this post.

 

Wow, that is cool! We should go in and fix all the places in our current codebase. No! No! and No! Blindly apply anything will cause more damages than benefits. Linq is a powerful tool and we use it everywhere in the code. Our main problem is that we have not asked the right question. “Is it the right place to use Linq? Should we wrap the logic somewhere?” those are wonderful questions to ask whenever we code.

Ok, then what do I take from here? what should I do? I have a few suggestions

  1. Take a moment and understand the leaky abstraction. Try to map it with a real-life example. The textbook is hard to understand.
  2. Look at your current codebase. Find a place where you think there is a possibility of the leaky abstraction. Hints: Find the Linq usage, or properties of type List, Collection or Enumerable.
  3. To find out if your classes have Leaky Abstraction issue, try to see from the Consumer point of view.
  4. Evaluate the pros and cons. Simply consider is It worth the effort or not?
  5. Make one small change at a time.

It is a journey. Good things take time. Using the same approach, we can find more leaky abstraction in other areas of our code.

Thank you for your time. I hope you can take some and apply to your daily job.

Unit Test from Pain to Joy

Recently I have made an improvement in a project regarding its unit test. The improvement has a huge impact on the system in a good way. It also has a huge impact on my thinking, my philosophy about unit test, again, in a good way.

The Story

I have been working on a huge, complex codebase. It is still written with .NET 4.0, about 6 years old. Part of the system is WCF service employed CQRS style. The code has its existing unit test. And we have added more. The tests have both integration tests and mocked tests.

Integration tests, in our context, means starting with the top-most layer (presentation layer at WCF service) down to the database, and then getting data back from the database using Query Handler or Repository. In short, it is a complex testing style. And to my opinion, it is not good. I would not have done that.

Mocked tests, in our context, means mocking all the dependencies. In short, all the interfaces are mocked. Even for the domain objects, we also create proxy objects and mock their properties, methods. It turns out a big mistake.

Most of the time, we are mocking instead of testing state. There are a few problems (pain) with our approach.

First, it is hard to write. To mock dependencies, you have to know all the dependencies. Which means developers have to read the implementation code, figure out how they are interacting with each other, figure out what methods call on which interfaces. Those are not really fun. And they do not bring any real value.

Second, it is fragile. Whenever we add code or refactor a piece of code, the unit test breaks. Because the unit test assumes that a number of certain methods are there and that they are called in a specific order. Which is, in our context, not suitable.

Third, it is damn hard to write tests to verify a bug fix.

Root Cause

How did the hell on earth I end up in that situation? Everything has its own reasons. And I want to figure out mine.

I have bought this excuse for so long. And I was happy with it, unfortunately

Because the codebase is complex. The design was wrong. We do not have time to redesign it.

The fingers were pointing to the other guys, other developers. No, it was not true.

What role have I played in that mess?“, I asked. Oh, turn out I play a very big role. After reflection, here are some reasons to me not doing well on my that area.

Wrong Mindset

A long time ago, when I started knowing and writing unit tests, I was sold the thought that we have to cut the dependencies, especially the database dependency. And with the raising of mocking, the promising of TDD, I mocked (in the hope of cutting dependencies) as much as I could. It works in many scenarios. However, because I believe it is the right thing, I forget to ask right questions.

Not Ask Right Questions

I just wrote tests without asking right questions.

What am I testing here?

Kind of a stupid question, but very powerful. Depending on types of application, on the architecture, answers are different. Because of my wrong mindset, I focused on how instead of what. Answering that question allows me to analyze further, allows me to actually look at the system in a systematical way, instead of a theoretical way.

Solution

First, I decided to throw away what I thought I know about Unit Test. Here are what I want my Unit Test should be

  1. Easy to write, easy to understand from code.
  2. Resilient to refactoring. Do not have to modify unit tests when using another interface in the implementation code. In short, the tests should be there to guarantee the code correctness at maintenance phase.

While writing this post, I created a github repo, welcome to DotConnect.

What are We Testing?

Such a simple but powerful question! However, I sometimes forgot to ask. We know that we should write unit test for our code. Have we ever considered to answer that question properly? Take an example, given that we will build a simple web service (WCF) to CRUD an Account into the SQL database. What are we going to test? Each will have a different answer, thus, drive their unit test implementation.

When asking that question, the important is to remove the term Unit. I find it is a trap. When that term presents, my mind is trapped in defining what unit is. Therefore, I forget the purpose of my testing.

From my own opinion, at the abstract level, I will categorize them into 2 categories 1) Functional Test and 2) Architecture Test

Functional Test

That are tests to govern the correctness of the system. For this kind of test, we have to define clearly what is the final state.

InputOutput
The simple diagram of all processes

To implement a proper test, one must clearly define the Output. Some common outputs are 1) Database, 2) Filesystem

To define a good output, we have to define the proper scope (which comes later in this post)

Architecture Test

For some systems, architecture is important. Let’s say all the call to the Database must go through a repository. Or that the Controller (MVC application) must delegate the work to the next layer (such as Service or Command/Query Handler).

Usually, we use Mock to accomplish the testing goal. Because we do not really care about the actual implementation. We care about the sequence of calls.

What are Dependencies?

Dependencies must be listed out explicitly. At the minimum, there should be a simple diagram like this

Dependencies
High-level dependencies of the system. Each box (such as Google) is a dependency

And do not go into the detail of those dependencies. Better keep the high-level view.

What is the Scope?

Without a scope, things get messy. A proper, explicit said scope will help to define the Input and Output. I made a mistake at this question so I defined a wrong scope. I, once, defined scope at the Project level. I had a unit test for Command Handler project, which will mock the dependency to the Repository project. Then I had another unit test for Repository project. They, first, looked logical and reasonable. However, with the tested of the time, it proves I was wrong.

Once I realized it (and that is why I write this post), I defined the scope at Command Handler level only, remove the concept of Repository test. Which allows me to define the Input is the Command, and the Output is the changes in the database.

This is a game changer step for me. For years, I have been focusing on the term Unit. The problem is that it is hard to define the unit. Will it be a function, a method? Will it be a class? or Will it be an assembly? Well, I do not know. Better I just choose to forget about them.

So what do I have so far in my toolbox regarding unit test? Here they are

  1. Ask the question: What am I testing?
  2. Explicitly list all dependencies at high-level
  3. Define testing scope

My Unit Test toolbox
My Unit Test toolbox

Applicants

Back to the story, the system I have been working with is a complex system, a data-driven system. The data is back by SQL Server. From the architecture point of view, it is WCF service with CQRS architecture. When a command is executed, there are a bunch of things involved, the domain, the domain service, the external services (AD FS, payment service, …), … eventually, the data is saved into SQL Server database.

From the command side:

Q: What am I testing here?

A: Save data correctly in the database

We should not care about what domain involved, what domain service called, … They are internal implementation. And they are changed frequently. We chased the wrong rabbit hole.

From the query side:

Q: What am I testing here?

A: Get data from the database correctly.

We should not care about how data is filtered, how data is combined, … They are internal implementation. And the same reasoning goes as in Command.

In both cases, the test will give an input and verify the output. However, we still have a problem with dependencies. A big change in my mindset is that I no longer see the database as a dependency. Rather it is a part of the internal system. Why? Because it is an essential part. It can be set up locally and in CI environment. Therefore, my definition of dependency is

Dependency is external systems that we do not control. That they are hard to set up. That we do not care about their implementation. Database should NOT be one of them.

How to mock those dependencies of not using a Mocking framework? A good practice is that for every dependency, there should be a Proxy (the Proxy design pattern). The proxy implementation is injected at the runtime with the help of an IoC framework such as Windsor Container. For Unit Test, I create a fake implementation and tweak as I want.

I took me a little while to set up all those things. But it works. It gives a lot of payoffs.

Implementation Detail

[PS: This section has not finished yet. However, this post has started for a while. I think I should publish it and save the implementation detail for another post.]

To implement this kind of test, I need to interact with the database to insert and get the data regardless of the system logic. This separation is very important. To accomplish this, I use Linq To SQL.

Due to the confidential contract, I am going to create a simple demo instead of using a real application. Let’s create a simple User form MVC application.

[Code]

Having a separated assembly allows me to isolate the changes. The Linq DBContext allows me to interact with the database as I need.

All the tests have a pattern

  1. Assumption: Prepare data. This step insert the data into the database for the command to execute
  2. Arrange: Prepare the command to verify.
  3. Act: Invoke the command handler correspondent to the command. Each command has its own Command Handler.
  4. Assert: Verify the result. Use Linq To SQL to get the data from the database and verify our expectations.

Instead of repeating the steps, I create a Strategy pattern

When Mock?

There are still scenarios where the Mocking is a perfect solution. Usually, it is the top layer of the system. Let’s take MVC (WebAPI) as an example. In my opinion, the Controller should be as light-weight as possible. What a controller should do are

  1. Validate input
  2. Prepare a request (can be a Command or a Query)
  3. Dispatch the request to the next layer. If the system employees CQRS, that layer is usually a Command Handler or Query Handler.
  4. Return the result

Which steps should be mocked? The step #3. What are we testing? We test to ensure that the Controller sends correct command/query to the next layer. We test the behavior of Controllers. The mock might be a perfect fit for Architecture Test.

[Code]

What’s Next?

The implementation detail for all the stuff I write here. Now it is time to let this post out so I get something DONE.

Code That Embraces Changes

Developers write code. They write code for specific functionalities, with requirements from clients (customers, bosses, leaders,…). When requirements come, they know what to do; know what code to write. In the world of business applications, the implementation is not a big deal. The story might be different if we write code for embedded systems using C/C++.

However, most of the problems come later, when the first draft of the function has done. The functions have deployed to the customers. They test and realize that they have to change here and there. If I have to pick one constant in the software development, I would pick CHANGE. The clients will change their mind. They know better when they actually see the product. You, as a developer, cannot avoid changes and cannot tell clients stop changing. No. It does not work that way.

Agile emerges with the spirit: Embrace changes. A wonderful mindset shift! We have to adapt that fact. However, the biggest question remains: How do you deal with changes?

How do you deal with changes in code?

I know that there are processes to deal with changes. But, ultimately, it all comes down to the code. You have to change the code, most of the time you have to.

I do not have a single answer or solution to that problem. However, I do have some experience in designing code that gives me the courage, the tool to embrace changes.

Here we go. But first, let’s define a simple feature. We have to work on a concrete example.

Requirement

As a business owner, I want to export orders (that are made today) into CVS file so that I can import into Excel and build some charts.

Assumption: There is a shopping system. They take orders from users. The orders are stored in the database. They are all basic stuff if you have ever worked with a business application system.

First Implementation

Without any hesitation, jump right into the VS2017 and implement the feature. Btw, how hard it is to implement such a simple feature, right?

    public class Order
    {
        public string Customer { get; set; }
        public DateTime OrderedAt { get; set; }
        public decimal TotalCost { get; set; }
        public int NumberOfProducts {get;set;}
    }
    class Program
    {
        static void Main(string[] args)
        {
            var ordersFromDatabase = new List<Order>
            {
                new Order{Customer = "Batman", TotalCost = 100000, OrderedAt = DateTime.Today.AddDays(-3)},
                new Order{Customer = "Superman", TotalCost = 1000, OrderedAt = DateTime.Today.AddDays(-2)},
                new Order{Customer = "Spiderman", TotalCost = 100, OrderedAt = DateTime.Today.AddDays(-1)}
            };
            const string headers = "Customer;Total Cost;Ordered At";
            var data = new List<string>{headers};
            foreach (var order in ordersFromDatabase)
            {
                data.Add(string.Join(";", order.Customer, order.TotalCost, order.OrderedAt));
            }
            var fullPathToCsvFile = @"d:\export\orders.csv";
            File.WriteAllLines(fullPathToCsvFile, data, Encoding.Unicode);
        }
    }

The actual implementation in a real project will be different. To demonstrate the point, I simplify the code.

What do we have above?

  1. An order has 3 information: Customer, Total cost, and Ordered At.
  2. There are 3 orders for Batman, Superman, and Spiderman. Assuming that they come from the database. Batman is a rich man so he ordered a lot 😛
  3. Export to CVS with “;” deliminator.
  4. The file is located at “d:\export\orders.csv” with Unicode encoding.

The code works. Mission accomplished.

Changes Come

Here are some changes that a customer might want to make

  1. Add NumberOfProducts into the export
  2. Change the location of exported file or even the encoding.
  3. Hey. I want to see reports of 2 days instead of just today.
  4. ….

Things go on in various ways. Finally, they all come to you, developers, to reflect the changes in code. If the application is just a simple stupid console app in this example, there is not any problem at all. Everything can be changed in a matter of minutes.

But we all know that it is not true in the real life.

Embrace Changes

I would prefer to do some logical analysis before actually writing code. If I start my code first, it will affect my thinking process. Sometimes it helps.

To embrace changes, we, first, must identify the possible changed areas. Isolate and Conquer approach

  1. Identify the possible changed areas
  2. Isolate them. Mean you can conquer them. Because you know where to look at, where to change the code.

Identify

Next and hardest question is

How to identify those areas?

Answer

Think in process and interaction. Capture concepts and patterns

Back to export order example. What does it mean? How do I apply in that example?

What is the process of exporting order?

  1. Get the orders
  2. Build data
  3. Write data into CVS file

Export orders process
Export orders process

Obviously, the process will not change. Because we control how we want to implement it. In other words, it does not depend on the customer. What changes? The 3 boxes: Get orders, Build data and Write to file.

Let’s go through each of them

Get Orders

  • Where to get them?
  • How to get them?
  • Criteria

Build data

  • How many columns do we need?
  • What is the format of currency, datetime, … ? the presentation of the data

Write to file

  • What deliminator to use? “;” or “,”? Maybe directly to Excel file?
  • Where to store the file? in which encoding?
  • What if the file fails to save due to permission?

Let’s say we have identified them. Now, let’s isolate them.

Isolate

I will modify the code as below

    class Program
    {
        static void Main(string[] args)
        {
            var orders = GetOrders();
            var data = BuildData(orders);
            WriteToFile(data);
        }

        private static IList<Order> GetOrders()
        {
            return new List<Order>
            {
                new Order{Customer = "Batman", TotalCost = 100000, OrderedAt = DateTime.Today.AddDays(-3)},
                new Order{Customer = "Superman", TotalCost = 1000, OrderedAt = DateTime.Today.AddDays(-2)},
                new Order{Customer = "Spiderman", TotalCost = 100, OrderedAt = DateTime.Today.AddDays(-1)}
            };
        }

        private static IList<IList<string>> BuildData(IList<Order> orders)
        {
            // Initialize rows with header
            var rows = new List<IList<string>>
            {
                new List<string> {"Customer", "Total Cost", "Ordered At"}
            };
            // Add rows data for each order
            foreach (var order in orders)
            {
                rows.Add(new List<string>
                {
                    order.Customer,
                    order.TotalCost.ToString(CultureInfo.CurrentCulture),
                    order.OrderedAt.ToString(CultureInfo.CurrentCulture)
                });
            }
            return rows;
        }

        private static void WriteToFile(IList<IList<string>> data)
        {
            const string deliminator = ";";
            var fullPathToCsvFile = @"d:\export\orders.csv";
            var csvData = data.Select(a => string.Join(deliminator, a))
                .ToList();
            File.WriteAllLines(fullPathToCsvFile, csvData, Encoding.Unicode);
        }
    }

Each method in the Main class encapsulates a concept, area that we have mentioned. Whether we should move them into separated classes is not that important. But yes, you should move them when in production code.

With that in place, when clients want to change something, you are more confident to embrace them. Or when there are bugs. In the beginning of the post, I mentioned that clients want to add NumberOfProducts into the report. Do you know where to change? Yeah! You got it – change in the BuildData method.

The above refactoring is good to embrace changes. However, we can make it better. I leave it to you if you wish to challenge yourself.

Takeaway

Take the “Embrace changes” seriously with you. Train your mindset into the concept. As a developer, we must be aware of that simple fact. Complaining about them will not solve the problem. You will have to deal with it anyway. Better, you should be in proactive position. Changes come! Not a problem!

When implementing a new feature, write new code, I would suggest that you stop for a moment, draw the process, the interaction in your head (better if you do it on paper) before you actually write code. Once you have done that, the possible changes areas emerge. Capture them immediately.

I have a simple process for you

  1. Think about the process, the interaction in normal language. Not the language of the code.
  2. Identify the possible changes areas.
  3. Capture them immediately. You should write them down in the paper.
  4. Write code.
  5. Design patterns, best practices come last.

I hope you enjoy the reading and use them for your next piece of code.

Have a nice day!

Implement Audit Log for Legacy Code

Given a legacy project, the code has been developed for 6 years, the requirement is to log all the changes in the system. For example, if a user changes his street name, that must be logged so that the administrator can know the old and new values.

The project is a typical codebase where there is domain model, SQL Database using NHibernate framework.

The question is how do we implement that requirement without polluted the domain model, without making breaking changes in the design and functionalities?

Assuming that we have a super simple domain model with User and Address.

    public class User
    {
        public Guid Id { get; set; }
        public string Name { get; set; }
        public ResidentAddress ResidentAddress { get; set; }
        public IList<VisitedAddress> VisitedAddresses { get; set; }
    }

    public class ResidentAddress
    {
        public string HouseNumber { get; set; }
        public string StreetName { get; set; }
    }

    public class VisitedAddress
    {
        public string HouseNumber { get; set; }
        public string StreetName { get; set; }
        public string Country { get; set; }
    }

    public class SuperCoolService
    {
        public void ChangeUserResidentStreet(User user, string streetName)
        {
            user.ResidentAddress.StreetName = streetName;
        }

        public void AddVisitedAddress(User user, VisitedAddress visitedAddress)
        {
            user.VisitedAddresses.Add(visitedAddress);
        }
    }

A user has a residential address and a number of visited addresses. Whenever a user changes his street name, audit log it. Whenever a user visits a new address, audit log it. You got the idea.

Capture Changes

We, unfortunately, cannot redesign the domain model. And we cannot go into every single method and detect if values change. No, that would pollute the domain model and destroy the design. Audit Log is a system concern, not a business domain concern.

It turned out that we can accomplish that goal with the beautiful design of NHibernate. In short, NHibernate allows us to hook in its event pipeline. It allows us to add custom logic for pre and post of Insert/Update/Delete an object. Check more detail from NHibernate document website.

When designing systems, I prefer the explicit approach instead of implicit. I want to have the power of deciding what objects I want to audit, at which properties. Life is so much easier if you are in the control. The same goes for code.

    public interface IAuditable
    {
    }

    [AttributeUsage(AttributeTargets.Property)]
    public class ExplicitAuditAttribute : Attribute
    {
 
    }

    public class User : IAuditable
    {
        public Guid Id { get; set; }
        [ExplicitAudit]
        public string Name { get; set; }
        public ResidentAddress ResidentAddress { get; set; }
        public IList<VisitedAddress> VisitedAddresses { get; set; }
    }

    public class ResidentAddress : IAuditable
    {
        [ExplicitAudit]
        public string HouseNumber { get; set; }
        [ExplicitAudit]
        public string StreetName { get; set; }
    }

    public class VisitedAddress : IAuditable
    {
        [ExplicitAudit]
        public string HouseNumber { get; set; }
        [ExplicitAudit]
        public string StreetName { get; set; }
        [ExplicitAudit]
        public string Country { get; set; }
    }

All need-audit-log classes are decorated with the IAuditable interface. Auditable properties are marked with ExplicitAudit.

    public class NhAuditUpdateDomainHandler : IPostUpdateEventListener
    {
        public void OnPostUpdate(PostUpdateEvent @event)
        {
            var auditable = @event.Entity as IAuditable;
            if (auditable == null)
                return;
            var dirtyFieldsIndex = @event.Persister.FindDirty(@event.State, @event.OldState, @event.Entity, @event.Session);
            if (dirtyFieldsIndex.Length == 0)
                return;
// Custom logic to handle audit logic
        }
    }

Depending on your domain, your audit logic, the detail is yours. By implementing an IPostUpdateEventListener, we can extract all the [ExplicitAudit] properties with their old and new values. Pretty neat design.

Give It Meaning

Soon we hit a problem. We knew the street name was changed. However, we could not know to whom it belongs to. We cannot know the parent of the address. The current design only allows us to know the address of a user, not the other way around.

// Reality
Street name change from "Batman" to "Superman"

// However, we expect this
User: Thai Anh Duc
    Street name change from "Batman" to "Superman"

Think about the situation where you have many changes. The information is useless.

Context. The address is lacking Context. At the time of audit log, it needs to know who it is a part of.

The question becomes how do we give that information for the Address class without introducing a hard reference back to the User class?

Explicit Tell

The goal is to minimize the impact on the existing codebase, to prevent changes in business logic. We cannot support the audit log feature without touching the existing code. I decided to categorize the domain objects into 2 categories: ProvideContext and DependableContext

    public interface IAuditContextProvider
    {
        AuditContext ProvideContext();
    }
	
    public interface IAuditDependableContext
    {
        AuditContext ProvidedContext();
    }
  1. Context Provider: Will tell the infrastructure its own context. It acts like a root. It is the User class in our example.
  2. Dependable Context: Will tell the infrastructure the context it is provided. This allows the infrastructure links the context. It is the Address class in our example.

I must write some code to build the infrastructure to link them. It is infrastructure code, not interfere with the domain code.

There are 2 scenarios we have to deal with: Component and Reference relationship.

Component Relationship

In NHibernate with SQL, a component relationship means 2 objects are saved in the same table. Here is the detail document from NHibernate site.

User_Address_Component
User and home address are merged into one table

Because address cannot stand alone. It cannot be added directly to NHibernate’s session. This gives me a chance to intercept its parent (the user object) when it is attached to NHibernate’s session. I just need to find a way to tell the NHibernate: Hey, whenever you save a user object (via its session), tell the user to populate the audit context to its components.

Thank the beautiful design of NHibernate, this code works beautifully

    public class ProvideAuditContextSaveOrUpdateEventListener : ISaveOrUpdateEventListener
    {
        public void OnSaveOrUpdate(SaveOrUpdateEvent @event)
        {
            var mustProvideContext = @event.Entity as IMustProvideAuditContext;
            if (mustProvideContext != null)
                mustProvideContext.ProvideAuditContext(NHibernateUtil.IsInitialized);
        }
    }

What the code says is that “Hey, if you are a ‘must provide audit context (IMustProvideAuditContext interface)‘, then do it”. When the entity comes to the later phases in the pipeline, it has the audit context provided.

Reference Relationship

However, if the Address object is a reference object (in which it is saved in a separated table), it will not work. Because the referenced objects are inserted first. Take an example, we want to allow a user having many addresses. In this situation, User has its own table. And Address has its own table. There is a UserId column in the Address table.

User has addresses
User has addresses

Take an example, we want to allow a user having many addresses. In this situation, User has its own table. And Address has its own table. When a new address is added to a user, that address is saved first. Then the user is updated.

Because Address knows its user, we can take an advantage by overriding the ProvidedContext method

    public override AuditContext ProvidedContext()
    {
        var context = base.ProvidedContext();
        return context.ContextId == Guid.Empty ? new AuditContext{ContextId = UserId} : context;
    }

If it has a context, then use it. Otherwise, create a new one with UserId

Takeaway

When starting a new project, audit log might not be mentioned, might not be seen as a feature. We, as a developer, should ask customers, managers if we should concern about the Audit Log. No matter the answer, you should be aware of its coming. You should consider Audit Log when designing the code.

Audit Log is an infrastructure concern. Make sure you treat it as is. A very bad approach is to go in every single method call in the domain and register an audit log (or any kind of audit). There are better ways to deal with infrastructure concern from Domain code, depending on the framework you used. In my project, I accomplish that goal by combining

  1. Attribute: By decorated a property with [ExplicitAudit] attribute, I control what properties I want to audit.
  2. NHibernate Events: A well-designed framework will allow you to hook into its pipeline and add your custom logic. I take advantages of NHibernate well-designed events. If you use EF, I am sure you will find many places to hook your custom logic in.

Have you seen the power of OCP (Open-Closed Principle) in NHibernate design? Such a powerful principle.

The real implementation is far more complex than in this post. However, the principles stand. It was really a challenge.

So far so good. It is not a perfect implementation. But I am happy with the result.