C# Sharpen Your Sword

Do you know C#? Yes. I do. How long have you been using it? 4-5 years. I am an experienced C# developer, sir.

While interviewing candidates for a C# developer position, I came across many CVs saying much the same thing. The candidates claim that they are experienced C# developers. I am sure many of them are. Unfortunately, some have misjudged themselves.

Why is that? Regardless of how C# has evolved, they keep writing the same code as if it were C# 1.0. Many use C# only for basic constructs such as classes and if statements. They stick to basic data types, not knowing or caring that there are better data structures that can do a better job.

Many experienced developers have problems writing code that accesses a database, manipulates XML, or reads files. The perennial argument is that they will google it when they need it. OK, that makes sense. But wait: experienced developers asking Google for basic operations? That does not make sense at all.

One day I realized that I am a C# developer, too (just kidding; I know I am a C# developer, writing C# code for a living). I started to ask myself:

How much do I know about C#?

I know some. That is all I can admit. Following that train of thought, I suspected that many C# developers have the same problem: they are not aware of how much they know about C#.

I decided to compose a list of fundamentals that every C# developer should know. Not everyone uses all of them in their daily job.

I had 2 purposes when I composed the list

  1. Self-improvement: I want to improve my C# skills regardless of where I am now. I want to have a strong foundation.
  2. Help others: I have developers at my workplace. I want to help whoever accepts my help. Remember that helping is growing, a perfect win-win situation.

 

Primitive Data Types and Keywords

What are the differences between integer and floating-point data types? Between float and double? And when should you use which?

The problem: many do not think twice before choosing a data type. What primitive data type do you use to hold a person’s age? Probably int, right?

What are the checked and unchecked keywords used for?
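
As a small illustration of the last question, here is a sketch of how checked and unchecked change integer overflow behavior. The class and method names are made up for this example. (As for the age question: a byte, which holds 0 to 255, is already large enough.)

```csharp
using System;

public static class OverflowDemo
{
    // Unchecked context: on overflow the value silently wraps around.
    public static int IncrementUnchecked(int value)
    {
        unchecked { return value + 1; }
    }

    // Checked context: on overflow an OverflowException is thrown instead.
    public static bool TryIncrementChecked(int value, out int result)
    {
        try
        {
            result = checked(value + 1);
            return true;
        }
        catch (OverflowException)
        {
            result = 0;
            return false;
        }
    }
}
```

Calling IncrementUnchecked(int.MaxValue) wraps around to int.MinValue, while TryIncrementChecked(int.MaxValue, out result) reports the overflow by returning false.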

String Manipulation and Regular Expression

Developers deal with strings more often than they think.

  1. Concatenate strings
  2. Format string with Join
  3. Format string with Format
  4. Find string with regular expression
  5. ToString(): when displaying a value (integer, date, currency, …), we convert it to a string.
  6. String and Culture

Regular expressions are an exciting, and sometimes headache-inducing, topic. How many ways are there to check if a string is an integer?
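
Two of those ways, sketched side by side. The names are invented for this example; the point is that the two checks do not agree on every input, because parsing also enforces the int range while the pattern only checks the shape.

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

public static class IntegerCheck
{
    // Option 1: let the framework parse it. Also rejects values outside the int range.
    public static bool IsIntByParsing(string text)
    {
        int ignored;
        return int.TryParse(text, NumberStyles.AllowLeadingSign,
                            CultureInfo.InvariantCulture, out ignored);
    }

    // Option 2: a regular expression. Checks the shape only, not the range.
    public static bool IsIntByRegex(string text)
    {
        return text != null && Regex.IsMatch(text, @"\A[+-]?[0-9]+\z");
    }
}
```

For "99999999999" the regular expression says yes and the parser says no, which is exactly the kind of difference worth knowing before picking one.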

Protection Levels

Hey, do you remember these: private, protected, internal, protected internal, and public?

The problem: not knowing how to take advantage of the language to protect data correctly.

I have seen developers habitually use public for methods and properties and private for fields. The public modifier is used because it is the easiest: it allows a function to be consumed everywhere.

When was the last time you thought of using internal or protected internal?

Collections and Concurrent Collections

How many types of collections do you know? And when should you use which? List, Queue, Stack, Dictionary, Array, ArrayList, … Oh, and how about the read-only and concurrent versions?

The problem: not using the right collection. Usually, developers take the easiest one instead of the proper one.

Using the right collection is very important because it protects the data from unexpected access and unexpected modification.

The most commonly used is the List<T> class, because it does not require any thought.

To choose a proper collection type, a developer should at least consider these questions

  • Is it ordered?
  • Access pattern? FIFO, FILO, Random, by index, by key, …
  • Is it read-only?
  • Do we need to keep it after processing all items?
  • What are the most common operations (business operations) on that collection?
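
A minimal sketch of how the answers to those questions map to different collection types. The class and method names are illustrative only.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class CollectionChoices
{
    // FIFO: the first item queued is the first one processed.
    public static string FirstInFirstOut()
    {
        var queue = new Queue<string>();
        queue.Enqueue("a");
        queue.Enqueue("b");
        return queue.Dequeue();
    }

    // FILO: the last item pushed is the first one popped.
    public static string FirstInLastOut()
    {
        var stack = new Stack<string>();
        stack.Push("a");
        stack.Push("b");
        return stack.Pop();
    }

    // Read-only: callers can enumerate and index, but cannot add or remove.
    public static ReadOnlyCollection<string> ReadOnlyView(IEnumerable<string> items)
    {
        return new List<string>(items).AsReadOnly();
    }
}
```

The queue returns "a", the stack returns "b", and any attempt to add to the read-only view through its ICollection<string> interface throws NotSupportedException.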

 

Threading, Task, Parallelism, Async and Await

This topic is not easy at all. Not every developer needs a deep understanding; however, a grasp of the fundamentals is crucial.

The problem: using these features without understanding how they work. This causes many nasty issues in production.
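
One classic example of such an issue is a shared counter updated from parallel code. The sketch below uses invented names and contrasts an atomic increment with an unsynchronized one.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class CounterDemo
{
    // Safe: Interlocked.Increment makes each read-modify-write atomic.
    public static int CountWithInterlocked(int iterations)
    {
        int counter = 0;
        Parallel.For(0, iterations, i => { Interlocked.Increment(ref counter); });
        return counter;
    }

    // Unsafe: counter++ is a read, an add, and a write. Parallel iterations
    // can overwrite each other, so the result may be lower than iterations.
    public static int CountWithoutSynchronization(int iterations)
    {
        int counter = 0;
        Parallel.For(0, iterations, i => { counter++; });
        return counter;
    }
}
```

The first method always returns the number of iterations. The second one may or may not, depending on timing, which is what makes this kind of bug so hard to reproduce.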

XML

Sometimes, we forget how to process an XML document. That’s sad. Direct XML processing might not be popular these days, but a proper understanding is still important.
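
As a refresher, a minimal sketch using LINQ to XML (System.Xml.Linq). The teacher element and the name attribute are made up for this example.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class TeacherXml
{
    // Reads the name attribute of every <teacher> element in the document.
    // Elements without a name attribute are skipped.
    public static List<string> ReadNames(string xml)
    {
        XDocument document = XDocument.Parse(xml);
        return document.Descendants("teacher")
                       .Select(element => (string)element.Attribute("name"))
                       .Where(name => name != null)
                       .ToList();
    }
}
```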

File and I/O

The problem: We do not know what we are doing.

When there is a need to read a file, developers use the first API available, File.Open, with just the required parameters. Many times, they fail to ask

  1. What is the file encoding?
  2. What if a file is being opened by another process?
  3. What if a file does not exist?
  4. What if a file is too big?
  5. Do we need to load its content at once? or should we process line by line?

Have you ever asked yourself those questions?
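
Here is one possible way to answer several of them in code: a sketch, not the only correct approach. The class name is invented, and returning an empty sequence for a missing file is an assumption; your application may prefer to throw.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class SafeFileReader
{
    // Streams lines one at a time instead of loading the whole file at once.
    // Returns an empty sequence when the file does not exist.
    public static IEnumerable<string> ReadLines(string path, Encoding encoding)
    {
        if (!File.Exists(path))
            yield break;

        // FileShare.ReadWrite allows reading a file that another process
        // has open, as long as that process also permits shared reads.
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        using (var reader = new StreamReader(stream, encoding))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
                yield return line;
        }
    }
}
```

The encoding is an explicit parameter, so the caller has to think about it rather than rely on a default.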

ADO.NET

The rise of ORM tools (EF, NHibernate, …) makes developers’ lives easier when dealing with databases. That comes at a cost: developers seem to forget how to interact with databases.

The problem: not knowing how to work with databases.

When doing interviews, I asked candidates about SQL transactions in an EF application. They said EF does it all for us. Yes, it does. The next question: when does it commit a transaction? Answer: Oh, EF does it for us. Period.

Relying on an ORM is a poor excuse for lacking database fundamentals.

LINQ

It is hard to say whether it is a fundamental thing. It came out with C# 3.0. Before that, you would write the same code with for loops. Proper use of LINQ will improve the readability of your code.
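
To make the comparison concrete, here is the same small task (square the even numbers) written both ways. The names are illustrative.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EvenSquares
{
    // Before LINQ: an explicit loop with a temporary list.
    public static List<int> WithLoop(IList<int> numbers)
    {
        var result = new List<int>();
        for (int i = 0; i < numbers.Count; i++)
        {
            if (numbers[i] % 2 == 0)
                result.Add(numbers[i] * numbers[i]);
        }
        return result;
    }

    // With LINQ: the same filter and projection, stated declaratively.
    public static List<int> WithLinq(IEnumerable<int> numbers)
    {
        return numbers.Where(n => n % 2 == 0).Select(n => n * n).ToList();
    }
}
```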

Covariance and Contravariance

Same as LINQ. If you understand them, you rock.
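
A short sketch of both ideas using types from the framework: IEnumerable<out T> is covariant and Action<in T> is contravariant. The class and method names are invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class VarianceDemo
{
    public static int CountObjects(IEnumerable<object> items)
    {
        int count = 0;
        foreach (object item in items) count++;
        return count;
    }

    // Covariance (out T): a sequence of strings can be used as a sequence of objects.
    public static int CovarianceExample()
    {
        IEnumerable<string> names = new List<string> { "Batman", "Joker" };
        return CountObjects(names);
    }

    // Contravariance (in T): a handler of any object can be used as a handler of strings.
    public static string ContravarianceExample()
    {
        string seen = null;
        Action<object> handleAnything = value => seen = value.ToString();
        Action<string> handleString = handleAnything;
        handleString("hello");
        return seen;
    }
}
```

Both conversions are checked by the compiler, so neither example needs a cast.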

Dynamic

Not used often, but listed here for reference.

 

Having the list is a first step in sharpening my C# sword. The list works as an agenda, a guiding star. I am good at some of the items; at others I am a beginner.

Stay calm, sit down, put your hands on the sword, move it slowly. That’s good! Game on.

Leaky Abstraction – Linq Usage

I am not sure what percentage of developers think about Leaky Abstraction when coding, especially under the OOP umbrella. Me? Not much until recently. I do not know why; I simply did not think about it. The common trend is that we, as developers, focus on new technologies, design patterns, best practices, … all the cool and fancy stuff. Many developers build great things without knowing the concept. I understand that. However, if we know it, we might build a better product with fewer bugs that is easier to maintain. Now that I actually understand it, I feel smarter. Let’s see what I am talking about.

First, check it out from these trusted resources. If you are a developer, there is a high chance that you know those sources.

Wikipedia

Joel on Software

Coding Horror

In recent years, I have improved my skills via Pluralsight. The courses from Zoran Horvat have changed the way I think about programming and the way I think about OOP. At the same time, I have many codebases that I have been working on for years. When I looked at the code and compared it to what I had learned, there was a big gap. The problem was that I did not know how to close that gap. I was kind of stuck in my own thinking.

With time, I started to understand those ideas deeply. Then I started to make small changes in a stable, safe way. Let’s start the journey.

 

Let’s assume that we have a School and many Teachers. At the end of a school year, the school has to report statistics about

  1. How many teachers does it have?
  2. How many mathematics teachers?
  3. How many of them have been in the school for more than 10 years?

They are very basic requirements. Without any hesitation, one can come up with this code. It works perfectly as required.

    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class School
    {
        public string Name { get; set; }
        public IList<Teacher> Teachers { get; set; }
    }

    public class Teacher
    {
        public string Name { get; set; }
        public string Specialty { get; set; }
        public DateTime StartedOn { get; set; }
        public bool IsStillAtWork { get; set; }
    }
    class Program
    {
        static void Main(string[] args)
        {
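            var school = new School
            {
                Name = "Gotham City",
                // Teachers has no default value; initialize it before calling Add.
                Teachers = new List<Teacher>()
            };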
            school.Teachers.Add(new Teacher
            {
                Name = "Batman",
                Specialty = "Mathematics",
                StartedOn = DateTime.Now.AddYears(-11),
                IsStillAtWork = true
            });
            school.Teachers.Add(new Teacher
            {
                Name = "Joker",
                Specialty = "Chemical",
                StartedOn = DateTime.Now.AddYears(-6),
                IsStillAtWork = false
            });
            Console.WriteLine("Total teachers: {0}", school.Teachers.Count(x => x.IsStillAtWork));
            Console.WriteLine("Mathematics teachers are: {0}", 
                string.Join("; ", school.Teachers
                                        .Where(x => x.Specialty== "Mathematics")
                                        .Select(x => x.Name)));
            Console.WriteLine("> 10 years teachers are: {0}",
                string.Join("; ", school.Teachers
                                        .Where(x => DateTime.Now >= x.StartedOn.AddYears(10))
                                        .Select(x => x.Name)));
        }
    }

How many issues can you find in the above code? The total teacher count only includes teachers with IsStillAtWork set. However, the next 2 statements do not apply that filter. Once identified, a developer can go in and fix the code easily by adding one more condition to each Where clause. A short revised version:

            Console.WriteLine("Total teachers: {0}", school.Teachers.Count(x => x.IsStillAtWork));
            Console.WriteLine("Mathematics teachers are: {0}", 
                string.Join("; ", school.Teachers
                                        .Where(x => x.Specialty== "Mathematics" && x.IsStillAtWork)
                                        .Select(x => x.Name)));
            Console.WriteLine("> 10 years teachers are: {0}",
                string.Join("; ", school.Teachers
                                        .Where(x => DateTime.Now >= x.StartedOn.AddYears(10) && x.IsStillAtWork)
                                        .Select(x => x.Name)));

So far so good! Where is the problem? Where is the “Leaky Abstraction”?

 

Let’s distinguish the consumer and the domain. The “Program” class is the consumer. The School and Teacher are the domain. For those simple requirements, the consumer has to know too much domain knowledge, which should be captured by the domain itself.

  • The consumer has to know how to filter teachers that are at work.
  • The consumer has to know how to decide whether a teacher teaches mathematics.
  • The consumer has to know how to decide whether a teacher has been at the school for a long time.

What if we have many consumers of the School class? Then each consumer has to know that knowledge and make its own implementation. Here we have a real problem of Leaky Abstraction at the simplest level of using Linq to filter data. We also have a duplication issue: the logic is duplicated. If the domain has only one consumer, that code is fine. It does what it is expected to do. Unfortunately, in many applications that is not the case.

It is really hard to write code without leaky abstractions. It is kind of an impossible mission. What we should do is be aware of the situation and weigh the pros and cons of fixing them.

 

So what are possible solutions? The goal is to capture the logic inside the domain.

Solution 1: We could move the logic into the School class as below

   public class School
    {
        public string Name { get; set; }
        public IList<Teacher> Teachers { get; set; }

        public int CountTeacherAtWork()
        {
            return Teachers.Count(x => x.IsStillAtWork);
        }

        public IEnumerable<Teacher> MathematicsTeachers()
        {
            return Teachers.Where(x => x.Specialty == "Mathematics" && x.IsStillAtWork);
        }

        public IEnumerable<Teacher> ExperiencedTeachers()
        {
           return Teachers.Where(x => DateTime.Now >= x.StartedOn.AddYears(10) && x.IsStillAtWork);
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
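            var school = new School
            {
                Name = "Gotham City",
                // Teachers has no default value; initialize it before calling Add.
                Teachers = new List<Teacher>()
            };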
            school.Teachers.Add(new Teacher
            {
                Name = "Batman",
                Specialty = "Mathematics",
                StartedOn = DateTime.Now.AddYears(-11),
                IsStillAtWork = true
            });
            school.Teachers.Add(new Teacher
            {
                Name = "Joker",
                Specialty = "Chemical",
                StartedOn = DateTime.Now.AddYears(-6),
                IsStillAtWork = false
            });
            Console.WriteLine("Total teachers: {0}", school.CountTeacherAtWork());
            Console.WriteLine("Mathematics teachers are: {0}", 
                string.Join("; ", school.MathematicsTeachers()
                                        .Select(x => x.Name)));
            Console.WriteLine("> 10 years teachers are: {0}",
                string.Join("; ", school.ExperiencedTeachers()
                                        .Select(x => x.Name)));
        }
    }

The logic is captured in 3 methods: CountTeacherAtWork, MathematicsTeachers, and ExperiencedTeachers. So far so good! Any consumer can consume the API without worrying about the logic. And we also solve the duplication issue.

But that solution has some issues

  1. The number of methods in the School class will explode.
  2. Did we forget to check whether the Teachers list is null? Are we sure that we have a valid Teacher list?
  3. When adding new methods operating on the Teacher list, some might forget to add the IsStillAtWork condition.

Just to name a few. In my opinion, the second issue is the worst.

Solution 2: Capture logic in a Collection class

It is a better solution. Instead of thinking about school and teacher, what if we think of a “Collection of Teachers”? So at any point in time, instead of working with a single Teacher, we work with a collection of teachers. Sometimes, the collection might be empty or have 1 item.

In OOP, when there is logic, it should be captured inside objects. Let’s look at another version, where the logic is captured in an object.

    public class TeacherCollection
    {
        private readonly IList<Teacher> _teachers = new List<Teacher>();

        public TeacherCollection(IEnumerable<Teacher> teachers)
        {
            if (teachers != null)
                _teachers = teachers.Where(x => x.IsStillAtWork).ToList();
        }

        public TeacherCollection WhereTeachMathematics()
        {
            return new TeacherCollection(_teachers.Where(x => x.Specialty == "Mathematics"));
        }

        public TeacherCollection WhereExperienced()
        {
            return new TeacherCollection(_teachers.Where(x => DateTime.Now >= x.StartedOn.AddYears(10)));
        }

        public int Count
        {
            get { return _teachers.Count; }
        }

        public IEnumerable<Teacher> AsEnumerable
        {
            get { return _teachers.AsEnumerable(); }
        }
    }

    public class School
    {
     
        public string Name { get; set; }
        public IList<Teacher> Teachers { get; set; }

        public TeacherCollection TeacherCollection
        {
            get
            {
                return new TeacherCollection(Teachers);
            }
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
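            var school = new School
            {
                Name = "Gotham City",
                // Teachers has no default value; initialize it before calling Add.
                Teachers = new List<Teacher>()
            };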
            school.Teachers.Add(new Teacher
            {
                Name = "Batman",
                Specialty = "Mathematics",
                StartedOn = DateTime.Now.AddYears(-11),
                IsStillAtWork = true
            });
            school.Teachers.Add(new Teacher
            {
                Name = "Joker",
                Specialty = "Chemical",
                StartedOn = DateTime.Now.AddYears(-6),
                IsStillAtWork = false
            });
            Console.WriteLine("Total teachers: {0}", school.TeacherCollection.Count);
            Console.WriteLine("Mathematics teachers are: {0}", 
                string.Join("; ", school.TeacherCollection
                                        .WhereTeachMathematics()
                                        .AsEnumerable
                                        .Select(x => x.Name)));
            Console.WriteLine("> 10 years teachers are: {0}",
                string.Join("; ", school.TeacherCollection
                                        .WhereExperienced()
                                        .AsEnumerable
                                        .Select(x => x.Name)));
        }
    }

By introducing the TeacherCollection object, we can handle the query side of the object. As requirements grow, the number of “Where” methods also increases. That is a challenge; we must keep an eye on the design and modify it when necessary. Regardless of the problems that might arise with the new design, we gain these benefits

  1. The collection object is complete. We do not have to check the state of the object. The inner collection is never null.
  2. This is close in spirit to the “Map-Reduce” pattern, where each “Where” is the step that reduces the collection to the items we want. By introducing a number of proper “Where” methods, we can capture the logic and chain the conditions to arrive at the final collection that we want. It is also easy to add a Map function that transforms the collection object into something else.
  3. Immutable. Each “Where” results in a new collection object. Because the collection object is designed to support the query side of the operation, it is crucial to keep in mind that you must not add methods that change the internal state of the collection object.
  4. Improved readability. The better the naming, the better the readability!

Hmm, where is the command side? How do we modify the teachers? The command operations should belong to the main domain object, that is, the School class. I still keep the Teachers property for that purpose. I know that exposing the Teachers property this way is not a good design, but it is there to demonstrate the point. In production code, I would design a better solution to deal with command operations. It is out of the scope of this post.

 

Wow, that is cool! Should we go in and fix all the places in our current codebase? No! No! and No! Blindly applying anything will cause more damage than benefit. Linq is a powerful tool and we use it everywhere in the code. Our main problem is that we have not asked the right questions. “Is it the right place to use Linq? Should we wrap the logic somewhere?” Those are wonderful questions to ask whenever we code.

OK, then what do I take from here? What should I do? I have a few suggestions

  1. Take a moment and understand the leaky abstraction. Try to map it to a real-life example. The textbook is hard to understand.
  2. Look at your current codebase. Find a place where you think there is a possibility of a leaky abstraction. Hints: find the Linq usage, or properties of type List, Collection or Enumerable.
  3. To find out if your classes have a Leaky Abstraction issue, try to look from the consumer’s point of view.
  4. Evaluate the pros and cons. Simply consider: is it worth the effort or not?
  5. Make one small change at a time.

It is a journey. Good things take time. Using the same approach, we can find more leaky abstractions in other areas of our code.

Thank you for your time. I hope you can take something from this and apply it to your daily job.

Unit Test from Pain to Joy

Recently I made an improvement to a project’s unit tests. The improvement has had a huge impact on the system, in a good way. It has also had a huge impact on my thinking and my philosophy about unit testing, again in a good way.

The Story

I have been working on a huge, complex codebase. It is about 6 years old and still written with .NET 4.0. Part of the system is a WCF service built in CQRS style. The code has its existing unit tests, and we have added more. The tests include both integration tests and mocked tests.

Integration tests, in our context, means starting with the top-most layer (presentation layer at WCF service) down to the database, and then getting data back from the database using a Query Handler or Repository. In short, it is a complex testing style. In my opinion, it is not good. I would not have done that.

Mocked tests, in our context, means mocking all the dependencies. In short, all the interfaces are mocked. Even for the domain objects, we create proxy objects and mock their properties and methods. It turned out to be a big mistake.

Most of the time, we are verifying mocks instead of testing state. There are a few problems (pains) with our approach.

First, it is hard to write. To mock dependencies, you have to know all the dependencies. That means developers have to read the implementation code, figure out how the pieces interact with each other, and figure out which methods are called on which interfaces. That is not really fun, and it does not bring any real value.

Second, it is fragile. Whenever we add code or refactor a piece of code, the unit tests break, because they assume that certain methods exist and that they are called in a specific order. In our context, that is not suitable.

Third, it is damn hard to write tests to verify a bug fix.

Root Cause

How on earth did I end up in that situation? Everything has its own reasons, and I want to figure out mine.

I had bought into this excuse for a long time and, unfortunately, I was happy with it:

Because the codebase is complex. The design was wrong. We do not have time to redesign it.

The fingers were pointing to the other guys, other developers. No, it was not true.

“What role have I played in that mess?”, I asked. Oh, it turns out I played a very big role. After reflection, here are some reasons why I was not doing well in that area.

Wrong Mindset

A long time ago, when I started learning about and writing unit tests, I was sold the thought that we have to cut the dependencies, especially the database dependency. And with the rise of mocking and the promise of TDD, I mocked (in the hope of cutting dependencies) as much as I could. It works in many scenarios. However, because I believed it was the right thing, I forgot to ask the right questions.

Not Asking the Right Questions

I just wrote tests without asking the right questions.

What am I testing here?

Kind of a stupid question, but very powerful. Depending on the type of application and the architecture, the answers are different. Because of my wrong mindset, I focused on how instead of what. Answering that question allows me to analyze further and to actually look at the system in a systematic way, instead of a theoretical way.

Solution

First, I decided to throw away what I thought I knew about unit tests. Here is what I want my unit tests to be

  1. Easy to write, easy to understand from code.
  2. Resilient to refactoring. I should not have to modify unit tests when the implementation code switches to another interface. In short, the tests should be there to guarantee code correctness in the maintenance phase.

While writing this post, I created a GitHub repo; welcome to DotConnect.

What are We Testing?

Such a simple but powerful question! However, I sometimes forgot to ask it. We know that we should write unit tests for our code. Have we ever considered answering that question properly? Take an example: suppose we build a simple web service (WCF) to CRUD an Account in a SQL database. What are we going to test? Each of us will have a different answer, and that answer drives our unit test implementation.

When asking that question, the important thing is to remove the term Unit. I find it is a trap. When that term is present, my mind gets trapped in defining what a unit is, and I forget the purpose of my testing.

In my opinion, at the abstract level, tests fall into 2 categories: 1) Functional Test and 2) Architecture Test

Functional Test

These are tests that govern the correctness of the system. For this kind of test, we have to define clearly what the final state is.

[Figure: Input and Output. The simple diagram of all processes.]

To implement a proper test, one must clearly define the Output. Some common outputs are 1) Database, 2) Filesystem

To define a good output, we have to define the proper scope (which comes later in this post).

Architecture Test

For some systems, architecture is important. Let’s say all calls to the database must go through a repository. Or that the Controller (MVC application) must delegate the work to the next layer (such as a Service or Command/Query Handler).

Usually, we use Mock to accomplish the testing goal. Because we do not really care about the actual implementation. We care about the sequence of calls.

What are Dependencies?

Dependencies must be listed out explicitly. At the minimum, there should be a simple diagram like this

[Figure: Dependencies. High-level dependencies of the system. Each box (such as Google) is a dependency.]

Do not go into the details of those dependencies. It is better to keep the high-level view.

What is the Scope?

Without a scope, things get messy. A proper, explicitly stated scope helps to define the Input and Output. I made a mistake on this question and defined the wrong scope. I once defined scope at the Project level. I had a unit test for the Command Handler project, which mocked the dependency on the Repository project. Then I had another unit test for the Repository project. At first, they looked logical and reasonable. However, the test of time proved me wrong.

Once I realized it (and that is why I am writing this post), I defined the scope at the Command Handler level only and removed the concept of a Repository test. That allows me to define the Input as the Command and the Output as the changes in the database.

This was a game-changing step for me. For years, I had been focusing on the term Unit. The problem is that it is hard to define the unit. Will it be a function, a method? A class? An assembly? Well, I do not know. Better that I just forget about it.

So what do I have so far in my toolbox regarding unit test? Here they are

  1. Ask the question: What am I testing?
  2. Explicitly list all dependencies at high-level
  3. Define testing scope

[Figure: My Unit Test toolbox]

Application

Back to the story: the system I have been working with is a complex, data-driven system. The data is backed by SQL Server. From the architecture point of view, it is a WCF service with CQRS architecture. When a command is executed, a bunch of things are involved: the domain, the domain services, the external services (AD FS, payment service, …), and so on. Eventually, the data is saved into the SQL Server database.

From the command side:

Q: What am I testing here?

A: Save data correctly in the database

We should not care about which domain objects are involved or which domain services are called. They are internal implementation details, and they change frequently. We were chasing the wrong rabbit.

From the query side:

Q: What am I testing here?

A: Get data from the database correctly.

We should not care about how data is filtered or how data is combined. They are internal implementation details, and the same reasoning applies as on the command side.

In both cases, the test gives an input and verifies the output. However, we still have a problem with dependencies. A big change in my mindset is that I no longer see the database as a dependency. Rather, it is a part of the internal system. Why? Because it is an essential part. It can be set up locally and in the CI environment. Therefore, my definition of dependency is

A dependency is an external system that we do not control, that is hard to set up, and whose implementation we do not care about. The database should NOT be one of them.

How do we mock those dependencies without using a mocking framework? A good practice is that for every dependency, there should be a Proxy (the Proxy design pattern). The proxy implementation is injected at runtime with the help of an IoC framework such as Windsor Container. For unit tests, I create a fake implementation and tweak it as I want.
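
A minimal sketch of that idea, with every name invented for illustration: an interface stands in front of the external payment service, the handler depends only on that interface, and the tests plug in a hand-written fake. The container wiring is left out.

```csharp
using System.Collections.Generic;

// Hypothetical proxy interface in front of an external payment service.
public interface IPaymentGateway
{
    bool Charge(string accountId, decimal amount);
}

// Fake used only by tests: no network calls, behavior is set by the test.
public class FakePaymentGateway : IPaymentGateway
{
    public bool NextResult = true;
    public readonly List<string> ChargedAccounts = new List<string>();

    public bool Charge(string accountId, decimal amount)
    {
        ChargedAccounts.Add(accountId);
        return NextResult;
    }
}

// Hypothetical handler that depends only on the proxy interface.
public class CheckoutHandler
{
    private readonly IPaymentGateway _gateway;

    public CheckoutHandler(IPaymentGateway gateway)
    {
        _gateway = gateway;
    }

    public string Execute(string accountId, decimal amount)
    {
        return _gateway.Charge(accountId, amount) ? "Paid" : "Rejected";
    }
}
```

In a test, setting NextResult to false is enough to exercise the rejection path, with no mocking framework and no knowledge of how the real service is called.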

It took me a little while to set up all those things. But it works, and it pays off a lot.

Implementation Detail

[PS: This section has not finished yet. However, this post has started for a while. I think I should publish it and save the implementation detail for another post.]

To implement this kind of test, I need to interact with the database to insert and get the data regardless of the system logic. This separation is very important. To accomplish this, I use Linq To SQL.

Due to a confidentiality agreement, I am going to create a simple demo instead of using a real application. Let’s create a simple User form MVC application.

[Code]

Having a separated assembly allows me to isolate the changes. The Linq DBContext allows me to interact with the database as I need.

All the tests have a pattern

  1. Assumption: Prepare data. This step inserts the data into the database for the command to execute.
  2. Arrange: Prepare the command to verify.
  3. Act: Invoke the command handler corresponding to the command. Each command has its own Command Handler.
  4. Assert: Verify the result. Use Linq To SQL to get the data from the database and verify our expectations.

Instead of repeating the steps, I create a Strategy pattern
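
The original implementation is not shown in this post, so the following is only a guess at the shape such a helper could take. It fixes the four steps in one place and lets each test pass in the step bodies as delegates. Everything here is hypothetical, and an in-memory list stands in for the real database so that the sketch can run anywhere.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the real database.
public class InMemoryDatabase
{
    public readonly List<string> Users = new List<string>();
}

// Hypothetical command and handler under test.
public class RenameUserCommand
{
    public string OldName;
    public string NewName;
}

public class RenameUserHandler
{
    private readonly InMemoryDatabase _database;
    public RenameUserHandler(InMemoryDatabase database) { _database = database; }

    public void Execute(RenameUserCommand command)
    {
        int index = _database.Users.IndexOf(command.OldName);
        if (index >= 0) _database.Users[index] = command.NewName;
    }
}

// The four steps are fixed once; each test only supplies the step bodies.
public static class CommandTest
{
    public static void Run<TCommand>(
        InMemoryDatabase database,
        Action<InMemoryDatabase> assumption,
        Func<TCommand> arrange,
        Action<TCommand> act,
        Action<InMemoryDatabase> assert)
    {
        assumption(database);          // 1. Assumption: seed the data
        TCommand command = arrange();  // 2. Arrange: build the command
        act(command);                  // 3. Act: invoke the handler
        assert(database);              // 4. Assert: verify the stored state
    }
}
```

A test then reads as four small lambdas in a fixed order, and the assertion looks only at what ended up in the database, not at which internal methods were called.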

When to Mock?

There are still scenarios where mocking is a perfect solution. Usually, it is the top layer of the system. Let’s take MVC (WebAPI) as an example. In my opinion, the Controller should be as lightweight as possible. What a controller should do is

  1. Validate input
  2. Prepare a request (can be a Command or a Query)
  3. Dispatch the request to the next layer. If the system employs CQRS, that layer is usually a Command Handler or Query Handler.
  4. Return the result

Which step should be mocked? Step #3. What are we testing? We test to ensure that the Controller sends the correct command/query to the next layer. We test the behavior of Controllers. The mock might be a perfect fit for an Architecture Test.
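
As a rough sketch of that kind of test, with all names hypothetical and a hand-written recording mock instead of a mocking framework: the controller below follows the four steps, and the test only checks what reached the dispatcher.

```csharp
using System.Collections.Generic;

// Hypothetical request and dispatcher abstractions.
public class CreateUserCommand
{
    public string Name;
}

public interface ICommandDispatcher
{
    void Dispatch(object command);
}

// Recording mock: it only remembers what was dispatched.
public class RecordingDispatcher : ICommandDispatcher
{
    public readonly List<object> Dispatched = new List<object>();
    public void Dispatch(object command) { Dispatched.Add(command); }
}

// Thin controller: validate, prepare the request, dispatch, return.
public class UserController
{
    private readonly ICommandDispatcher _dispatcher;
    public UserController(ICommandDispatcher dispatcher) { _dispatcher = dispatcher; }

    public bool Create(string name)
    {
        if (string.IsNullOrWhiteSpace(name)) return false;           // 1. validate input
        var command = new CreateUserCommand { Name = name.Trim() };  // 2. prepare the request
        _dispatcher.Dispatch(command);                               // 3. dispatch to the next layer
        return true;                                                 // 4. return the result
    }
}
```

The test asserts that exactly one CreateUserCommand with the expected Name was dispatched for valid input, and that nothing was dispatched for invalid input.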

[Code]

What’s Next?

What comes next is the implementation detail for all the stuff I write about here. Now it is time to let this post out so I get something DONE.

In the Mind of a Developer

Working with a 5-6 year old system gives me a good chance to reflect on code written by other developers. When looking at old code, even code written by me, I often think “hmm, what a mess! Who wrote that code? What were they thinking?” That kind of thing! The right attitude is that I should not think that way. However, that is what happens. To me, it is a good thing because I start to think “I can do it better”.

Each codebase has its own story. This time I got a chance to look and troubleshoot another codebase. I decided to take a deeper step with questions to myself

What were the developers thinking? How would they end up in that design/decision?

The main point is for me to learn something. I am not interested in spending my time criticising them.

The Story

Because of the contract and sensitive details, I will rename things and make up my own terms. The system is a service that many systems call to log data. It has the same notion as any logging framework in .NET, such as log4net, NLog, … except this is a custom WCF service.

    public class LogCommand
    {
        public string Product { get; set; }
        public string User { get; set; }
        public DateTime CreatedAt { get; set; }
        public string LogMessage { get; set; }
    }

    public class LogEntry
    {
        public Guid ProductId { get; set; }
        public Guid UserId { get; set; }
        public DateTime CreatedAt { get; set; }
        public string LogMessage { get; set; }
    }
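
    public class LogCommandHandler
    {
        // NHibernate session (ISession), assumed to be injected by the caller
        private readonly ISession _session;

        public LogCommandHandler(ISession session)
        {
            _session = session;
        }

        public void Execute(LogCommand command)
        {
            // Begin a transaction because we use NHibernate to talk with SQL Server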
            var productId = CreateOrGetProduct(command);
            var userId = CreateOrGetUser(command);
            var logEntry = new LogEntry
            {
                ProductId = productId,
                UserId = userId,
                CreatedAt = command.CreatedAt,
                LogMessage = command.LogMessage
            };
            _session.Save(logEntry);
        }

        private Guid CreateOrGetProduct(LogCommand command)
        {
            // if this is a new product, create.
            // otherwise return existing product id
            return Guid.NewGuid();
        }

        private Guid CreateOrGetUser(LogCommand command)
        {
            // if this is a new user, create.
            // otherwise return existing user id
            return Guid.NewGuid();
        }
    }

There is more to it than the above, but you get the idea. Here is the logic

  1. Get the right Product associated with the log. Insert a new Product record if a product is new.
  2. Get the right User associated with the log. Insert a new User record if a user is new.
  3. Insert a new LogEntry with the ProductId and UserId

And in the database, we have 3 tables

  1. Product: for products
  2. User: for users
  3. LogEntry: for log entries

The reason for having Product and User tables is to support searching from a WPF application (let’s call it LogViewer). At the LogViewer, users want to be able to choose Product from a dropdown. The same goes for users.

The system works as it is supposed to … until one day. When there are many concurrent calls to the service, it becomes unstable.

To cut the story short, the problem was caused by the way it managed Product and User. When a product is created (but has not been flushed to the database yet), it is put into a global cache to improve performance. When the next request comes in, the system checks the global cache first. Here is how the problem occurred

  1. Request 1: Creates a new product and puts it in the global cache, but does NOT commit its transaction yet.
  2. Request 2: Sees that product in the global cache, takes it, and assigns its ProductId to the LogEntry. Then it commits its transaction.
  3. The database foreign key constraint throws an exception because the ProductId has not been persisted in the database yet.
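
The sequence above can be replayed in a small, single-threaded sketch. It is not the real service: `FakeDatabase`, `CacheRaceDemo`, and the product name `"Billing"` are made up for illustration, and the in-memory `HashSet` only mimics the foreign key check that SQL Server performs.

```csharp
using System;
using System.Collections.Generic;

// In-memory stand-in for the database; every name here is hypothetical.
public class FakeDatabase
{
    private readonly HashSet<Guid> _committedProducts = new HashSet<Guid>();

    public void CommitProduct(Guid productId) => _committedProducts.Add(productId);

    // Mimics the foreign key from LogEntry.ProductId to Product.
    public void InsertLogEntry(Guid productId, string message)
    {
        if (!_committedProducts.Contains(productId))
        {
            throw new InvalidOperationException(
                "FK violation: product " + productId + " is not in the database");
        }
    }
}

public static class CacheRaceDemo
{
    // The global cache shared by all requests: product name -> product id.
    private static readonly Dictionary<string, Guid> GlobalCache = new Dictionary<string, Guid>();

    public static string Run()
    {
        var database = new FakeDatabase();
        string outcome;

        // Request 1: creates the product and caches it, but has NOT committed yet.
        var productId = Guid.NewGuid();
        GlobalCache["Billing"] = productId;

        // Request 2: finds the product in the cache and commits its log entry first.
        try
        {
            database.InsertLogEntry(GlobalCache["Billing"], "hello");
            outcome = "inserted";
        }
        catch (InvalidOperationException)
        {
            outcome = "foreign key violation";
        }

        // Request 1 finally commits, too late for Request 2.
        database.CommitProduct(productId);
        return outcome;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints "foreign key violation"
    }
}
```

The cache hands out an id that only exists inside another request's uncommitted transaction, so whichever request commits first decides whether the insert succeeds.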

Hmm? But Why?

The interesting part to me is the question I asked myself. Why? What brought the developer to that implementation?

Here is my theory.

Note: It is just me pondering my own thoughts as part of my learning practice.

Developers tend to focus on the technologies, on how to implement things. Many have been working with SQL Database for years. Thus, they have built a Relational Thinking Mindset. Let’s go back in time and try to imagine what they thought when they got the requirement.

We need to build a log service that allows us to store and search logs by product or user. A log entry will belong to a Product and be created by a User.

Oh cool! Got it. Then we will have 3 tables, one for Product, one for User, and one for Log. And we will have foreign keys between Log and Product, and between Log and User. Piece of cake! Job done!

If I were him years ago, I would have done the same. What was missing? – The usage analysis.

Ok! Analysis

Move forward to now, where I have years of development experience, where I have a problem at hand, where I have tracked down the root cause. I mean I have so much information at my disposal. What questions would I ask? What concerns would I raise?

What are the main operations?

  1. Write log
  2. Search log
  3. Get all products – for the search purpose
  4. Get all users – for the search purpose

What are the most frequently used operations?

  1. Write log

There is a good notion of CQRS in this service. The Write Log operation is a Command; the rest are Queries. Can we support the queries without maintaining the Product and User tables ourselves in code? Yes, we can. SQL Server comes with a powerful concept: the View.

By creating views for Product and User, the service can query on them. It gives the same result as maintaining tables.
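
To make the idea concrete, here is a small in-memory sketch of what such views compute. It rests on an assumption the post does not spell out: each log row stores the product and user names directly, so writing a log becomes a single insert. `FlatLogEntry`, `LogViews`, and the view names in the comments are made up; LINQ's `Distinct` stands in for a `SELECT DISTINCT` view in SQL Server.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumption: the log row keeps the product and user names directly,
// so the service does not maintain Product or User tables.
public class FlatLogEntry
{
    public string Product { get; set; }
    public string User { get; set; }
    public DateTime CreatedAt { get; set; }
    public string LogMessage { get; set; }
}

public static class LogViews
{
    // Roughly what a view such as
    //   CREATE VIEW ProductView AS SELECT DISTINCT Product FROM LogEntry
    // would return (view and column names are hypothetical).
    public static List<string> Products(IEnumerable<FlatLogEntry> entries) =>
        entries.Select(e => e.Product)
               .Distinct()
               .OrderBy(p => p, StringComparer.Ordinal)
               .ToList();

    // Same idea for the user dropdown in the LogViewer.
    public static List<string> Users(IEnumerable<FlatLogEntry> entries) =>
        entries.Select(e => e.User)
               .Distinct()
               .OrderBy(u => u, StringComparer.Ordinal)
               .ToList();
}

public static class LogViewsDemo
{
    public static void Main()
    {
        var entries = new List<FlatLogEntry>
        {
            new FlatLogEntry { Product = "Shipping", User = "bob", LogMessage = "started" },
            new FlatLogEntry { Product = "Billing", User = "alice", LogMessage = "paid" },
            new FlatLogEntry { Product = "Billing", User = "bob", LogMessage = "refunded" },
        };

        Console.WriteLine(string.Join(", ", LogViews.Products(entries))); // prints "Billing, Shipping"
        Console.WriteLine(string.Join(", ", LogViews.Users(entries)));    // prints "alice, bob"
    }
}
```

The trade-off is that the dropdown queries do more work at read time, which seems acceptable here because writing logs is by far the most frequent operation.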

Takeaway

What happens if we interrupt our default thinking process with

Hey! wait a minute!

That should be enough to allow us to think differently. It might help you start a questioning process.

When looking at a codebase, no matter how good/bad it is, try to reason about the question

What were they thinking? How would they end up with the design?

I have gained so much from this approach. I hope it works for you too.

Setup a Full Federation Scenario with Web Application, Web Service, Windows Client, and ADFS Server Development Environment – Part 1

As developers, we participate in many projects. Each project has a kind of Framework-Ready setup. With it in place, developers just need to focus on developing business functionality. It is a good setup, a good environment. Each person focuses on what they do best.

I have been working on federated business applications where the interaction is complicated and secured, and which need a lot of environment setup. Most of the time, a Framework-Ready setup already exists; I just use it.

So far so good, except that I have not built enough skill in those areas. What if I have to set up a full environment locally for my own testing and experimenting? I felt pain just thinking about it.

5-4-3-2-1 GO! I decided to give it a GO.

Scenario

The common scenario is that there are 4 components

  1. WCF Web Service: The central service taking care of business application/logic. This service is secured and not exposed to the outside world.
  2. WPF (Windows)/Console Client: A UI application that allows users to do their jobs internally. This client connects to the WCF service. Most of the time, users of this client have a lot of permissions.
  3. ASP.NET Web MVC Application: A public web application that allows public users to interact with the WCF Service. This application supports a limited subset of the functionality.
  4. ADFS Server: User management is done by AD FS Server.

The implementation of those applications is out of scope, and not that interesting either. The interesting part is the communication between them in a development environment.

I want to setup something like this

[Figure: Scenario. General overview of components]

I want to have

  1. A local AD FS server
  2. https communication between services
  3. Use Windows Identity Foundation (WIF) to manage login

Ask Google

I can explain the whole thing in words, in my mind, in logic. I thought I could just google it and get the job done. Reality? Google gives me so much information. All the information I need is out there. The problem comes when you actually start to read it and apply it in your job.

Why? Because Google can give you pieces, but you have to connect them. Google cannot help you connect the dots.

That said, I will use those pieces and write down the way I connect them. You might have your own way.

AD FS Server

Sounds like a trivial task. Sounds like I can google it and follow the instructions. But, hell NO. The problem? I do not have a System Administration background. Therefore, I have had a hard time understanding the relationship between the components. I could not draw a mental representation of them.

Googling around, I know that I have to setup things called: Domain Service (AD DS), Certificate Service (AD CS), and Federation Service (AD FS). Unfortunately, none of them knows me 🙁 I do not know them either 😛

So instead of following the instructions, I decided to make sense of them first. I have to draw a picture of them, AKA mental representation.

At the minimum, I need 3 things: Users, Certificates, and Login.

Active Directory Domain Services (AD DS)

In less than a second, I found this useful document in Microsoft Docs.

A directory is a hierarchical structure that stores information about objects on the network. A directory service, such as Active Directory Domain Services (AD DS), provides the methods for storing directory data and making this data available to network users and administrators. For example, AD DS stores information about user accounts, such as names, passwords, phone numbers, and so on, and enables other authorized users on the same network to access this information.

Active Directory stores information about objects on the network and makes this information easy for administrators and users to find and use. Active Directory uses a structured data store as the basis for a logical, hierarchical organization of directory information.

My take:

AD DS allows me to create and manage users. That’s it! That is all I need to know.

Active Directory Certificate Services (AD CS)

Now that I have users, I need certificates to set up https communication. AD CS allows me to generate certificates to use in my lab environment. It does so many other things. However, all I need is some valid certificates to use for development purposes.

Turn it on by following the instructions on the Microsoft site.

Active Directory Federation Services (AD FS)

And finally, I need to set up AD FS. There is a perfect set of instructions here. If you are a developer, you should check out the Microsoft Docs. At the highest level of abstraction (at least to my understanding), it gives you a nice login form and manages the users who consume your service.

[Figure: My AD FS local server. 3 services on one computer]

With very little knowledge about server administration, I managed to install just enough for my needs. Once I knew what I had to install, it was rather easy to do, because most of the information you need is already there, for free. The most important thing is to figure out what I need, and how to make sense of it.

In my development environment, I decided

  • Everything is in one single Virtual Machine (Hyper-V from Windows)
  • Computer name: DC01. Because I might want to have other servers later on.
  • Domain: tad.local
  • AD FS: adfs.tad.local
  • Windows Server 2016 Datacenter (180-day trial version)

The main purpose of this post is to document what I understood about them. I do not write about the details of the installation process or the other problems I had while doing it. I did that for 2 reasons

  1. Those instructions are already there, well-written, on the internet.
  2. After 6 months, when the trial is over, I have to reinstall everything again. That is a good test for my understanding. The more I do the more skill I get.

 

Next

I want to take advantage of the setup by exploring various scenarios

  1. A website uses AD FS for login.
  2. A WCF Service which serves the requests from the Website.
  3. How about a Windows Client application that consumes the service? Oh yes, that too.

Again, one can easily find those topics on the internet. Nothing here is new. I just try to write it in my own way, with my own understanding.

The more I write, the better I am.