Welcome to Python

Having worked with C# for a decade, it is time to learn a new language. A few options popped into my head; of course they must be different from C#, so Java is excluded immediately. That left Python, Ruby, Clojure, … mostly scripting languages. I already know some JavaScript, so I definitely do not want to dig deeper into it.

I remember from the Stack Overflow 2018 survey that Python ranked well. I have also heard from several places that Python is awesome. So here I come, Python.

If you want to know what Python is, ask Google. It gives you plenty of great resources, ranging from introductory to intermediate to advanced. The community is strong as well.

This post is my journey and my thoughts as I started to learn Python. It also serves as my documentation: if someone asks me how to get started with Python, I can give them the link. Well, at least that is what I thought.

So let’s get started!

Environment

Each developer has their own favorite environment. I come from the .NET world, so Visual Studio Code is my tool. Of course, I could use Visual Studio, but Visual Studio Code seems like the better choice.

  1. Visual Studio Code + Python Extension
  2. Install Python on Windows. Download and follow the instructions at python.org.

Installing both Visual Studio Code and Python is fast. Everything is ready in a matter of minutes.

Quick Overview

Some notes about Python:

  1. The Python file extension is .py, for example hello-python.py.
  2. You can take a single Python file and run it with the python command line.
  3. The interactive shell is powerful. From the console/terminal/PowerShell, type python and the Python shell is ready.
  4. Dynamic typing: types are resolved at execution time.
  5. You can write a "Hello World" program in a single line of code, print("Hello World"), directly from the shell.

I can write functional code or object-oriented code.

Everything in Python is an object. There are 2 functions that I find very useful – type() and id().

name = "Thai Anh Duc"

duplicatedname = "Thai Anh Duc"

# Want to know the type?
type(name)

# Want to know where it is stored in memory?
id(name)

# Are they equal? Are they the same?
name == duplicatedname

So name equals duplicatedname, but they are not the same object. By using the id() function, we can see that they are stored at different locations.

Materials

I learned from:

  1. Pluralsight courses: always my first and default option when learning new technologies.
  2. Tech Beamer: very detailed, with examples and explanations.

Conventions

Python uses indentation for code blocks instead of the curly brackets ({}) used in C#. There is a coding guideline standard, PEP 8. At the beginning, I do not worry too much about it; I just focus on some simple building blocks. Fortunately, Visual Studio Code checks my code and suggests corrections.

This block of code is enough for me to remember what is important:

if 1 > 0:
    print("Of course, it is.")
for index in [1, 2, 3, 4]:
    print("hello {0}. This is a loop, display by string format".format(index))
    print("Same code block")

def i_am_a_function(parameters):
    # Comment what the function does
    print(parameters)

Data Types

Data types are crucial to code, even in a dynamically typed language. As a developer, you have to know what kind of data you are dealing with. Are we talking about a string, an integer, a decimal, a date, …?

Refer to Tech Beamer for details of each type. A special note on the number types – there are three:

  1. int – for integer values. There is no limit beyond what the computer's memory can hold, unlike C#, where each integer type has its own limit (Int16, Int32, Int64, …).
  2. float – for floating-point values, with a precision of about 15 decimal digits.
  3. complex – for complex numbers, such as 3 + 4j, which has a real part of 3.0 and an imaginary part of 4.0.

Because typing is dynamic, sometimes it is a good idea to check the type first. Python has the isinstance function:

name = "Thai Anh Duc"
# It is a string
isinstance(name, str)

age = 36
# It is an int
isinstance(age, int)

bank_amount = 1.1
# It is a float
isinstance(bank_amount, float)

Function and Class

One might not need a class, just a function.

def append_lower(name):
    # Lower case the name and then append " python" to the end
    return name.lower() + " python"

class Python101:
    # User-defined class names should use CapWords (PascalCase)
    static_constant_field = "Static field, access at the class level"

    def __init__(self, comment):
        # Constructor
        # :param self: "this" in C# terms
        # :param comment: constructor parameter, which tells Python that a value is required
        self.comment = comment  # Declare an instance field and assign a value

    def say_hi(self):
        print(self.comment)

print(Python101.static_constant_field)

p101 = Python101("You are awesome")
p101.say_hi()

The code speaks for itself.

Summary

It is quite easy for a C# developer to get started with Python. With the above building blocks, I am ready to write some toy programs to play around with Python. There are many things I want to see how Python handles. The initial list is:

  1. IO
  2. Network
  3. Threading

The list will grow as I learn. I am ready for the next step.

Observation – Watch Out for Boundaries

When I first started my software development career, I thought writing software was hard. And that is true. However, my definition of writing software at that time was different. What I really meant was that my code met the functional requirements (and even that was not always true) and ran. Back then I never saw my code in production, so everything worked fine on my machine.

Over time, I have had chances to bring my code into production and see it running. No surprise: it did not always work well there. Every developer knows the famous "It works on my machine". The code might work well in test, but it often has problems in production.

Why? There is no single answer. Software is developed by developers with different levels of skill, experience, and intelligence. Even a team with the most talented developers in the world still ships products with bugs. So I am not trying to find a solution to that problem. Instead, I embrace and observe the fact. There is no silver bullet, but there are tips and tricks to prevent as many problems as possible.

From my own experience, from Pluralsight courses, from YouTube, … from any source I have touched, I want to document what I have observed. I do not intend to go into the details of each item. Rather, I want a list with some explanations and references. The details vary from project to project.

SQL

If you build applications with SQL Server as the data storage, watch out for two common unbounded patterns:

  1. Select N+1
  2. Select * without top n

Ask Google for "Select N+1" and you will know what it is immediately. There are detailed explanations with code examples. Developers who have worked with an ORM usually know it very well.
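
As a rough illustration, here is a hedged sketch of the trap, assuming Entity Framework Core with lazy loading enabled; the Customer, Order, and ShopContext types are made up for the example.

using System;
using System.Linq;
using Microsoft.EntityFrameworkCore;

public class Customer { public int Id { get; set; } public string Name { get; set; } }
public class Order { public int Id { get; set; } public virtual Customer Customer { get; set; } }

public class ShopContext : DbContext
{
    public DbSet<Order> Orders { get; set; }
    public DbSet<Customer> Customers { get; set; }
}

public static class OrderReport
{
    public static void Print(ShopContext ctx)
    {
        // 1 query to load the orders...
        var orders = ctx.Orders.ToList();
        foreach (var order in orders)
        {
            // ...plus 1 lazy-loading query per order to fetch its customer: N+1 queries in total.
            Console.WriteLine(order.Customer.Name);
        }

        // Eager loading issues a single joined query instead.
        var ordersWithCustomers = ctx.Orders.Include(o => o.Customer).ToList();
    }
}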

The second one is a bit trickier. That is when your application issues a query to SQL Server in this pattern:

SELECT * FROM dbo.Employee WHERE [Predicate]

In the test environment, there is no problem. However, in production the data is huge, and that query might return millions of records.

These days many applications do not talk to the database directly. Instead, there is an ORM/LINQ to SQL layer in between. And this piece of code is not uncommon:

var employeesByName = ctx.Employees.Where(x => x.Name.StartsWith("Smith")).ToList();

Some ORMs might put a limit on the generated queries. What developers need to do is review all the generated queries.

The rule of thumb: always control the number of returned records. Put a maximum on everything.
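
For illustration, a bounded version of the earlier query might look like this; MaxRecords is a made-up constant, and the exact behavior depends on your ORM, but LINQ's Take typically translates to a TOP/LIMIT clause in the generated SQL.

const int MaxRecords = 500;

var employeesByName = ctx.Employees
    .Where(x => x.Name.StartsWith("Smith"))
    .Take(MaxRecords) // typically becomes SELECT TOP (500) ... on SQL Server
    .ToList();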

Connections

There are many kinds of connections that applications make – to a database, to external services. When making such calls, there are a few things to watch out for:

  1. Timeout: Make sure a timeout value is set on everything. Modern frameworks usually have default values; just make sure they exist and that you are aware of them.
  2. Close connections properly. Just imagine what happens if you leave your door open: bad things happen. See the sketch below the list.
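
Here is a minimal sketch of both points, assuming System.Net.Http and System.Data.SqlClient.

using System;
using System.Data.SqlClient;
using System.Net.Http;

public static class ConnectionExamples
{
    // Set an explicit timeout instead of silently relying on the default (100 seconds for HttpClient).
    private static readonly HttpClient Client = new HttpClient
    {
        Timeout = TimeSpan.FromSeconds(10)
    };

    public static void QueryDatabase(string connectionString)
    {
        // The using block guarantees the connection is closed and disposed, even if an exception is thrown.
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            // ... execute commands ...
        } // the connection goes back to the pool here
    }
}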

JSON – Serialization and Deserialization

JSON is cool. Developers work with JSON every day in one form or another. Many take it for granted; we rarely pay attention to the size of the data. I once experienced such a problem – see my post on the hidden cost of an architecture.
So if you have to use JSON explicitly in your code, ask these questions:

  1. Do I have to use it? Are there other options?
  2. What size is it? Is the size under control?

Some might argue that RAM is cheap, so why should we care so much about size? Yes, RAM is cheap, but it has a limit. Once that limit is reached, your application will freeze or crash. And if your applications run in the cloud, everything they consume has a cost.
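
One way to keep the size under control is to avoid materializing the whole payload as one big string. A hedged sketch, assuming Newtonsoft.Json:

using System.IO;
using Newtonsoft.Json;

public static class JsonHelpers
{
    public static T DeserializeFromStream<T>(Stream stream)
    {
        using (var reader = new StreamReader(stream))
        using (var jsonReader = new JsonTextReader(reader))
        {
            // The serializer reads token by token, so the full payload never has to
            // sit in memory as a single string before being deserialized.
            return new JsonSerializer().Deserialize<T>(jsonReader);
        }
    }
}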

Enumerable, List, ForEach

Do developers ever write code without loops? Have we ever wondered how many items are in a list? When talking about a list, we should be aware that all of its items are stored in memory, so the size really matters even when it seems trivial.
Another trap is the Enumerable. An IEnumerable represents a sequence that is mostly expected to be iterated only once. By its nature, we do not know the size of a sequence (there is no Count property on the IEnumerable interface). Therefore, when calling .ToList(), be aware of all the nasty things that can happen, as the sketch below shows.
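
A small sketch of the difference between a lazy sequence and a materialized list:

using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerableExample
{
    // An unbounded lazy sequence: nothing is stored until someone iterates it.
    public static IEnumerable<int> Numbers()
    {
        var i = 0;
        while (true)
        {
            yield return i++;
        }
    }

    public static void Main()
    {
        // Safe: only the first 10 items are ever produced and stored.
        var firstTen = Numbers().Take(10).ToList();
        Console.WriteLine(firstTen.Count);

        // Dangerous: ToList() on the raw sequence would try to materialize
        // every item in memory and never return.
        // var everything = Numbers().ToList();
    }
}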

I bring these up here for reference. They might not be the things that take production down, but they are nice to be aware of as well.

There might be more boundaries to watch out for. These are the ones I have come up with so far. What are yours?

C# 7 Tuples for Better Test Assertions

Recently I read C# in Depth, 4th edition, where I met the new tuple design in C# 7. It is a really cool feature. Besides the syntactic sugar, it offers capabilities that developers can leverage.

While reading it, I was tasked with writing unit tests at my job, which triggered my memory of semantic comparison with Likeness. The main idea of semantic comparison is to compare two objects on certain properties; it allows developers to define what equality means. Tuples support equality by default, so maybe I could use a tuple to accomplish the same thing as Likeness.
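
A quick illustration of that built-in equality; Equals works from C# 7.0, and the == operator on tuples arrives in C# 7.3.

using System;

public static class TupleEqualityDemo
{
    public static void Main()
    {
        var left = (Name: "C#", Price: 10d);
        var right = (Name: "C#", Price: 10d);

        // ValueTuple compares element by element out of the box.
        Console.WriteLine(left.Equals(right)); // True
        // From C# 7.3 onward: Console.WriteLine(left == right); // True
    }
}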

In this post, I will write a simple unit test without Likeness or tuples, then refactor it with Likeness, and finally use a tuple. Let’s explore some code.

public class Product
{
    public Guid Id { get; set;}
    public string Name { get; set;}
    public double Price { get; set;}
    public string Description { get; set;}
}

[TestFixture]
public class ProductTests
{
    [Test]
    public void Test_Are_Products_Same()
    {
        var expectedProduct = new Product
        {
            Name = "C#",
            Price = 10,
            Description = "For the purpose of demoing test"
        };

        var reality = new Product
        {
            Name = "C#",
            Price = 10,
            Description = "For the purpose of demoing test"
        };

        // Assert that 2 products are the same. Id is ignored
    }
}

The task is simple. How are we going to assert that the two products match?

The Old-Fashioned Way

Very simple. We simply assert property by property.

[TestFixture]
public class ProductTests
{
    [Test]
    public void Test_Are_Products_Same()
    {
        var expectedProduct = new Product
        {
            Name = "C#",
            Price = 10,
            Description = "For the purpose of demoing test"
        };

        var reality = new Product
        {
            Name = "C#",
            Price = 10,
            Description = "For the purpose of demoing test"
        };

        Assert.AreEqual(expectedProduct.Name, reality.Name);
        Assert.AreEqual(expectedProduct.Price, reality.Price);
        Assert.AreEqual(expectedProduct.Description, reality.Description);
    }
}

Some might think that we should override the Equals method in the Product class. I do not think that is a good idea. The definitions of equality in production code and in unit tests can be tremendously different. Be careful before overriding equality.

The Product class has three properties (excluding the Id property), so the code still looks readable. Now think about a situation where there are ten properties.

Likeness – Semantic Comparison

There is a blog post explaining it in detail. For this demo, we can rewrite our simple test.

[TestFixture]
public class ProductTests
{
    [Test]
    public void Test_Are_Products_Same()
    {
        var expectedProduct = new Product
        {
            Name = "C#",
            Price = 10,
            Description = "For the purpose of demoing test"
        }.AsSource()
        .OfLikeness<Product>();

        var reality = new Product
        {
            Name = "C#",
            Price = 10,
            Description = "For the purpose of demoing test"
        };

        Assert.AreEqual(expectedProduct, reality);
    }
}

Likeness is a powerful tool in your testing toolbox. Check it out if you are interested.

Tuple – Customized

The idea is that we can produce a tuple containing the asserted properties and compare the tuples. This also lets us flatten the structure if we wish.

[TestFixture]
public class ProductTests
{
    [Test]
    public void Test_Are_Products_Same()
    {
        var expectedProduct = (Name: "C#", Price: 10d, Description: "For the purpose of demoing test");

        var reality = new Product
        {
            Name = "C#",
            Price = 10,
            Description = "For the purpose of demoing test"
        };

        Assert.AreEqual(expectedProduct, (reality.Name, reality.Price, reality.Description));
    }
}

It might not look much different from the Likeness approach, and I am not saying which approach is better. It is just another way of doing things.

Summary

So which approach is better? None of them. Each has its own advantages and disadvantages. They are options in your toolbox; how they are used depends on you, the developer. I will definitely take advantage of the new tuples in both production and unit test code.

The Process of Making Elegant Unit Tests

Unit tests are part of the job that developers do while building software. Some developers might not write unit tests, but, IMO, the majority does. If you are one of them, how do you treat your unit test code compared to your production code?

  1. Do you think about maintainability?
  2. Will you refactor the tests to make them better? Note: I use the term refactoring in the sense of the Refactoring book by Martin Fowler.
  3. Have you taken advantage of what the test framework offers?

To be honest, I had not thought about it much. At the beginning of my career, I wrote tests that followed the existing structure of the projects and did not question it much. Over time, I started to feel the pain, so I made changes – see Unit Test from Pain to Joy – which served me well on that project.

Write Tests

Recently, I had a chance to work on another project. I was tasked with writing unit tests (and integration tests) to get used to the system, and the tests had to run with multiple credentials, a.k.a. login users.

Here is the test – not a real one, of course, but the idea is the same. I want to run the test with different credentials. The username and password must be passed in as parameters. This is very useful when you look at a test report: by parameterizing, the report will show the values passed to the test.

[TestFixture]
internal class MultiCredentialsTest
{
    [Test]
    [TestCase("read_user", "P@ssword")]
    [TestCase("write_user", "P@ssword")]
    public void RunWithDifferentCredentials(string username, string password)
    {
        // The test body goes here.
    }
}

And there will be many of them. It worked as expected, but there are potential problems. Can you guess them?

  1. What happens when one of the test users changes, either the username or the password?
  2. What if we want to add more test users to the test suites?

Think about a situation where there are hundreds, even thousands, of them. It will be a pain. I needed a way to centralize the test data. My process had started.

Make Them Better – Manageable

It was time to look at what NUnit offers. NUnit supports TestCaseSource; you should check it out first if you do not know it. In a nutshell, it allows developers to centralize test data in a manageable manner. That was exactly what I was looking for.

I created a TestCredentialsSource to produce the same test data. I would prefer the name TestCredentialsFactory, but "source" seems to fit better in the unit test context.

internal class TestCredentialsSource
{
    public static IEnumerable<object[]> ReadWriteUsers = new List<object[]>
    {
        new object[] { "read_user", "P@ssword" },
        new object[] { "write_user", "P@ssword" }
    };
}

The test was rewritten as version V1. Both versions are kept below for comparison.

[TestFixture]
internal class MultiCredentialsTest
{
    [Test]
    [TestCase("read_user", "P@ssword")]
    [TestCase("write_user", "P@ssword")]
    public void RunWithDifferentCredentials(string username, string password)
    {
        // The test body goes here.
    }

    [Test]
    [TestCaseSource(typeof(TestCredentialsSource), "ReadWriteUsers")]
    public void RunWithDifferentCredentials_V1(string username, string password)
    {
        // The test body goes here.
    }
}

In the test, I no longer had to deal with the test values. The test data was encapsulated in TestCredentialsSource via the ReadWriteUsers static field.

Make Them Even Better – Reuse and Duplication

That was good for a known, specific set of users. But certain tests needed to run with a specific user. That should be fairly easy with another field in TestCredentialsSource:

internal class TestCredentialsSource
{
    public static IEnumerable<object[]> ReadWriteUsers = new List<object[]>
    {
        new object[] { "read_user", "P@ssword" },
        new object[] { "write_user", "P@ssword" }
    };

    public static IEnumerable<object[]> SpecificUser = new List<object[]>
    {
        new object[] { "special_user", "P@ss12345" }
    };
}

What if I wanted to test with only "read_user"? What if I wanted to combine "read_user" with "special_user" for another test? One option was to define each combination in TestCredentialsSource, which was still fine because everything remained manageable in a single file. But it felt awkward.

Was there any better alternative?

Yes, there was. Let’s encapsulate the data in a class. Welcome to the TestCredentials class.

internal class TestCredentials
{
    public string Username { get; }
    public string Password { get; }
    public TestCredentials(string username, string password)
    {
        Username = username;
        Password = password;
    }
    ///<summary>
    /// Convert the object into an array of property values that can be used by a TestCaseSource
    ///</summary>
    public object[] ToTestSource() => new object[] { Username, Password };

    public static TestCredentials ReadUser = new TestCredentials("read_user", "P@ssword");
    public static TestCredentials WriteUser = new TestCredentials("write_user", "P@ssword");
    public static TestCredentials SpecialUser = new TestCredentials("special_user", "P@ss12345");
}

The class supplies three predefined credentials as static fields. This is the single place where the data is defined, without any duplication.
TestCredentialsSource then becomes much cleaner:

internal class TestCredentialsSource
{
    public static IEnumerable<object[]> ReadWriteUsers = new List<object[]>
    {
        TestCredentials.ReadUser.ToTestSource(),
        TestCredentials.WriteUser.ToTestSource()
    };

    public static IEnumerable<object[]> SpecificUser = new List<object[]>
    {
        TestCredentials.SpecialUser.ToTestSource()
    };
}

Cool! The data is gone from the source definition. But there was still one thing I did not like much – the setup of "SpecificUser" in TestCredentialsSource. Having a source for a single value did not feel right to me.

There was a solution – turn TestCredentials itself into a source that NUnit can understand by implementing IEnumerable<TestCaseData>. TestCaseData is defined by the NUnit framework:

using System.Collections;
using System.Collections.Generic;
using NUnit.Framework;

internal class TestCredentials : IEnumerable<TestCaseData>
{
    public string Username { get; }
    public string Password { get; }
    public TestCredentials(string username, string password)
    {
        Username = username;
        Password = password;
    }
    ///<summary>
    /// Convert the object into an array of property values that can be used by a TestCaseSource
    ///</summary>
    public object[] ToTestSource() => new object[] { Username, Password };

    public static TestCredentials ReadUser = new TestCredentials("read_user", "P@ssword");
    public static TestCredentials WriteUser = new TestCredentials("write_user", "P@ssword");
    public static TestCredentials SpecialUser = new TestCredentials("special_user", "P@ss12345");

    public IEnumerator<TestCaseData> GetEnumerator()
    {
        return new List<TestCaseData>
            {
                new TestCaseData(Username, Password)
            }.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

With that in place, I could write the two tests below. There was no limit to the combinations of test credentials I could make.

[Test]
[TestCaseSource(typeof(TestCredentials), "SpecialUser")]
public void RunWithDifferentCredentials_SpecialUser(string username, string password)
{
    // The test body goes here.
}

[Test]
[TestCaseSource(typeof(TestCredentials), "SpecialUser")]
[TestCaseSource(typeof(TestCredentials), "WriteUser")]
public void RunWithDifferentCredentials_CombinedUsers(string username, string password)
{
    // The test body goes here.
}

The End

A good result does not come by accident. I could have stopped at any step in the process. By pushing a little further and asking the right questions, the end result was more than I had expected.
If you are writing test code, why not look at it again and ask questions? Give it a try and see how far it takes you.

Watch Out for Abstract Dependencies

The context is a C# .NET Core Web API system. The system employs the Onion Architecture. There is an API layer made up of ASP.NET MVC controllers, then an Application Service layer inside the API layer, and at the core of the onion is the domain. One can quickly google Onion Architecture; if you do not know it, I suggest you take a look at it first.

One of the key points of the Onion Architecture is that an inner layer must not know about its outer layers. In C# project terms, the Application Service layer must not reference the API project directly. The API consumes the Application Service, NOT the other way around.

There is a requirement that application configuration (note that this is not system configuration such as connection strings or certificates) be maintained in a JSON file. Let’s call it appConfig.json. The file is deployed with the API. The implementation takes advantage of .NET Core JSON configuration so the runtime can load the file.
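
The post does not show the wiring, but loading an extra JSON file with .NET Core configuration looks roughly like this; the DefaultCurrency key is a made-up example.

using Microsoft.Extensions.Configuration;

public static class AppConfigLoader
{
    public static IConfiguration Load()
    {
        // appConfig.json is deployed alongside the API and merged into IConfiguration.
        return new ConfigurationBuilder()
            .SetBasePath(System.AppContext.BaseDirectory)
            .AddJsonFile("appConfig.json", optional: false, reloadOnChange: true)
            .Build();
    }
}

// Usage: var defaultCurrency = AppConfigLoader.Load()["DefaultCurrency"];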

Let’s simplify the requirement to focus on the architecture. The appConfig defines the default currency. The value is displayed in the UI and used for other processing in the Application Service.

Now comes the tricky part. Where should we place the implementation: the API or the Application Service?

My answer: always in the API. What if the implementation were in the Application Service instead? Let’s look at that implementation and analyze it.

By using the IConfiguration interface, the Application Service takes a dependency on the Microsoft.Extensions.Configuration package, which seems fine because it is an abstraction.
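
The original implementation is not shown here, but the version under discussion might look roughly like this; CurrencyService and the DefaultCurrency key are hypothetical names.

using Microsoft.Extensions.Configuration;

// Lives in the Application Service layer and reads appConfig.json through IConfiguration.
public class CurrencyService
{
    private readonly IConfiguration _configuration;

    public CurrencyService(IConfiguration configuration)
    {
        _configuration = configuration;
    }

    public string GetDefaultCurrency()
    {
        // The service now assumes that someone (the API) has loaded appConfig.json for it.
        return _configuration["DefaultCurrency"];
    }
}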

Has the dependency between the API and the Application Service changed? Implementation-wise, NO, because there is no direct reference from the Application Service to the API.

But in terms of design, it has. The Application Service implementation is using infrastructure supplied by the API – the configuration is managed by the API layer.

Will it cause any problems? Well, that depends on what we care about most. If the architecture is the main concern, then yes, it is a problem: the architecture has been broken unintentionally. Think about a scenario where the Application Service is consumed by another client, such as a WPF application. Does it make sense to bring the Microsoft.Extensions.Configuration package into a WPF application?

The Application Service defines an interface, which says, “Hey! I need to know the default currency, but I cannot figure it out by myself.” The outer layer – a Web API, a WPF application – supplies the implementation, which says, “Not a problem! I know where to get the default currency for you.”
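
A hypothetical sketch of that design; IAppConfigProvider and the class names are made up, the point is the direction of the dependency.

// Defined in the Application Service layer: it only states what it needs.
public interface IAppConfigProvider
{
    string DefaultCurrency { get; }
}

// Also in the Application Service layer: depends only on the abstraction it owns.
public class CurrencyService
{
    private readonly IAppConfigProvider _appConfig;

    public CurrencyService(IAppConfigProvider appConfig)
    {
        _appConfig = appConfig;
    }

    public string GetDefaultCurrency() => _appConfig.DefaultCurrency;
}

// Implemented in the API layer, which owns the .NET Core configuration infrastructure.
public class AppConfigProvider : IAppConfigProvider
{
    private readonly Microsoft.Extensions.Configuration.IConfiguration _configuration;

    public AppConfigProvider(Microsoft.Extensions.Configuration.IConfiguration configuration)
    {
        _configuration = configuration;
    }

    public string DefaultCurrency => _configuration["DefaultCurrency"];
}

The API registers AppConfigProvider in its dependency injection container, so the Application Service stays free of any configuration package.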