Have Fun with Fibonacci

I barely remember the last time I implemented the famous Fibonacci. It was the textbook example of a recursive implementation. I am not sure whether it is still a thing these days.

Over the weekend, I read some recent stuff in C#, and Fibonacci came to my mind. I had never tried to implement it differently, nor tested how bad the recursive implementation really is. So it was fun to write some code.

Given that we need the result of F30 (or F40), how long would it take and how many calls would the recursive approach make? And how about the alternative implementations?

Recursive Implementation

using System;
using System.Diagnostics;

class Program
{
    private static int _recursiveCount = 0;
    private static long RecursiveFibonacci(int n)
    {
        _recursiveCount++;
        return n < 2 ? n : RecursiveFibonacci(n - 1) + RecursiveFibonacci(n - 2);
    }

    private static void DisplayFibonacci(Func<int, long> fibo, int number)
    {
        Console.WriteLine($"{number}: {fibo(number)}");
    }

    static void Main(string[] args)
    {
        const int number = 30;
        var stopwatch = new Stopwatch();
        stopwatch.Start();
        DisplayFibonacci(RecursiveFibonacci, number);
        stopwatch.Stop();

        Console.WriteLine($"Completed after: {stopwatch.Elapsed}; Recursive Counts: {_recursiveCount}");
    }
}

Run and observe the result

Completed after: 00:00:00.0457925; Recursive Counts: 2.692.537

The completion time depends on the machine the code runs on. But the recursive count is impressive: roughly 2.7 million calls. That is actually what theory predicts, since the naive recursion makes 2·F(n+1) − 1 calls, and 2·1,346,269 − 1 = 2,692,537 for n = 30. I increased the number to 50 but lost my patience waiting.

Recursive with Cache

We could improve the recursive solution with result caching. Again, this is a simple implementation for fun.

private static readonly List<long> _fiboCache = new List<long> { 0, 1 };
private static long FibonacciRecursiveWithCache(int n)
{
    _recursiveCount++;
    while (_fiboCache.Count <= n)
    {
        _fiboCache.Add(-1);
    }

    if (_fiboCache[n] < 0)
    {
        _fiboCache[n] = n < 2 ? n : FibonacciRecursiveWithCache(n - 1) + FibonacciRecursiveWithCache(n - 2);
    }

    return _fiboCache[n];
}

For Loop Implementation

And I gave the non-recursive approach a try. There are more lines of code, but it runs so much faster.

private static long FibonacciForLoop(int n)
{
    // Guard the small cases: F(0) = 0, F(1) = F(2) = 1
    if (n < 3)
    {
        return n == 0 ? 0 : 1;
    }

    long n_1 = 1;
    long n_2 = 1;
    long total = 0;
    for (int i = 3; i <= n; i++)
    {
        total = n_1 + n_2;
        n_1 = n_2;
        n_2 = total;
    }

    return total;
}

I do not need 3 variables (n_1, n_2, total); the solution only needs 2 (see the sketch below). However, 3 variables felt natural to me. It follows the way I calculate it manually.
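
A minimal 2-variable version, my own sketch rather than code from the original post, using C# tuple assignment and the same convention F(0) = 0, F(1) = 1:

private static long FibonacciTwoVariables(int n)
{
    long previous = 0; // F(i)
    long current = 1;  // F(i + 1)
    for (int i = 0; i < n; i++)
    {
        // Shift the window one step forward
        (previous, current) = (current, previous + current);
    }

    return previous; // F(n)
}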

So let's put them together and see the differences

static void Main(string[] args)
{
    const int number = 30;

    Console.WriteLine("Recursive");
    var stopwatch = new Stopwatch();
    stopwatch.Start();
    DisplayFibonacci(RecursiveFibonacci, number);
    stopwatch.Stop();

    Console.WriteLine($"Completed after: {stopwatch.Elapsed}; Recursive Counts: {_recursiveCount}");

    Console.WriteLine("Recursive with cache");
    _recursiveCount = 0;
    stopwatch.Reset();
    stopwatch.Start();
    DisplayFibonacci(FibonacciRecursiveWithCache, number);
    stopwatch.Stop();

    Console.WriteLine($"Completed after: {stopwatch.Elapsed}; Recursive Counts: {_recursiveCount}");

    Console.WriteLine("For loop");
    stopwatch.Reset();
    stopwatch.Start();
    DisplayFibonacci(FibonacciForLoop, number);
    stopwatch.Stop();

    Console.WriteLine($"Completed after: {stopwatch.Elapsed}");
}

Ladies and gentlemen, I present to you

Recursive
30: 832040
Completed after: 00:00:00.0277573; Recursive Counts: 2692537
Recursive with cache
30: 832040
Completed after: 00:00:00.0003827; Recursive Counts: 59
For loop
30: 832040
Completed after: 00:00:00.0001500

That’s it! I know how to implement the Fibonacci.

Technical Notes – CosmosDB Change Feed and Azure Function

Some notes from looking deeper into the integration between the Azure CosmosDB Change Feed and Azure Functions. Most of the time, we simply use the built-in trigger and it just works. That is the beauty of Azure.

// Azure Function code, CosmosDb trigger. Taken from the MS example
using System.Collections.Generic;
using Microsoft.Azure.Documents;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class CosmosTrigger
{
    [FunctionName("CosmosTrigger")]
    public static void Run([CosmosDBTrigger(
        databaseName: "ToDoItems",
        collectionName: "Items",
        ConnectionStringSetting = "CosmosDBConnection",
        LeaseCollectionName = "leases",
        CreateLeaseCollectionIfNotExists = true)]IReadOnlyList<Document> documents,
        ILogger log)
    {
        if (documents != null && documents.Count > 0)
        {
            log.LogInformation($"Documents modified: {documents.Count}");
            log.LogInformation($"First document Id: {documents[0].Id}");
        }
    }
}

The example is everywhere. There is nothing fancy about it.

In a project, we took advantage of that feature to migrate data from CosmosDB to Azure SQL Database for later processing. I wanted to make sure that we were well-prepared for production, so I did some reading. Here are the notes. None of them are mine or new; they are simply written the way I want to remember them, in the areas I am interested in.

Container, Logical Partition, and Physical Partition

Change Feed works per container. If a database has 3 containers, each has its own Change Feed. The Change Feed ensures that documents are sent in the order they were written (the ordering is guaranteed per logical partition key).

The Change Feed is as reliable as the database itself.

Under the hood, data is stored in many physical partitions. At that level, the Change Feed actually works per physical partition. Data might be shuffled from one physical partition to another, and when that happens, the Change Feed is moved as well. So how do we ensure document order across physical partitions, especially after such a move? The Change Feed Processor (CFP) manages all that complexity for us.

Change Feed Processor (CFP)

In theory, developers can write code to interact directly with the Change Feed. It is possible but not practical; in practice, not many (I cannot say none) want to. Instead, most depend on the Change Feed Processor (CFP). The MS Docs have sample code showing how to write your own consumer; a rough sketch follows.
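
To give a feel for it, here is a minimal sketch of hosting the CFP with the .NET SDK (Microsoft.Azure.Cosmos v3). It is my own illustration, not code taken from the docs; the database, container, and processor names are placeholders.

using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

public static class ChangeFeedProcessorSample
{
    public static async Task<ChangeFeedProcessor> StartAsync(CosmosClient client)
    {
        Container monitored = client.GetContainer("ToDoItems", "Items");
        Container leases = client.GetContainer("ToDoItems", "leases");

        // Build a processor that monitors "Items" and stores its checkpoints in "leases"
        ChangeFeedProcessor processor = monitored
            .GetChangeFeedProcessorBuilder<dynamic>("itemsProcessor", HandleChangesAsync)
            .WithInstanceName(Environment.MachineName)
            .WithLeaseContainer(leases)
            .Build();

        await processor.StartAsync();
        return processor;
    }

    private static Task HandleChangesAsync(IReadOnlyCollection<dynamic> changes, CancellationToken cancellationToken)
    {
        // The consumer logic goes here
        Console.WriteLine($"Documents modified: {changes.Count}");
        return Task.CompletedTask;
    }
}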

Azure Function with CosmosDb trigger

Azure CosmosDb trigger configuration

By default, the poll interval is 5 seconds (see the FeedPollDelay setting on the trigger attribute, expressed in milliseconds).
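
If the default does not fit, the delay can be overridden on the attribute. A small sketch of my own, reusing the same trigger shape and usings as the example above; 1000 ms is just an illustrative value:

[FunctionName("CosmosTriggerFasterPolling")]
public static void Run([CosmosDBTrigger(
    databaseName: "ToDoItems",
    collectionName: "Items",
    ConnectionStringSetting = "CosmosDBConnection",
    LeaseCollectionName = "leases",
    CreateLeaseCollectionIfNotExists = true,
    FeedPollDelay = 1000)]IReadOnlyList<Document> documents, // poll every second instead of every 5 seconds
    ILogger log)
{
    log.LogInformation($"Documents modified: {documents?.Count ?? 0}");
}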

Azure Function with the CosmosDb trigger is a layer on top of the CFP. It saves us from dealing with hosting, deployment, scaling, … with the power of Azure Function.

If the function execution fails by throwing an exception, the changed documents are sent again in the next run. So there is a risk that the flow gets stuck if failures are not designed to be handled properly. The Change Feed and Azure Functions ensure that your code will receive the changed documents at least once.
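
One common way to avoid a stuck flow is to handle failures per document instead of letting the exception bubble up. The sketch below is my own illustration, not the project's code; ProcessAsync and SendToDeadLetterAsync are hypothetical helpers (the dead-letter one could write to a queue or a poison container), and the usings are the same as in the trigger example above plus System and System.Threading.Tasks.

[FunctionName("CosmosTriggerWithErrorHandling")]
public static async Task Run([CosmosDBTrigger(
    databaseName: "ToDoItems",
    collectionName: "Items",
    ConnectionStringSetting = "CosmosDBConnection",
    LeaseCollectionName = "leases")]IReadOnlyList<Document> documents,
    ILogger log)
{
    foreach (var document in documents)
    {
        try
        {
            await ProcessAsync(document);
        }
        catch (Exception ex)
        {
            // Swallow the failure for this document so the batch is not retried forever,
            // and park the document somewhere for later inspection
            log.LogError(ex, "Failed to process document {Id}", document.Id);
            await SendToDeadLetterAsync(document);
        }
    }
}

// Hypothetical placeholders for the real business logic and the dead-letter sink
private static Task ProcessAsync(Document document) => Task.CompletedTask;
private static Task SendToDeadLetterAsync(Document document) => Task.CompletedTask;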

C# Extend with Extension Method

Many C# developers know about extension methods. They give developers the ability to extend the functionality of an existing API without modifying the original source code.

A simple example is checking whether a string is a numeric value. There is more than one way to solve that problem, and one of them is to be able to write code like this:

string houseNumber = "123";
bool isNumber = houseNumber.IsNumeric();

IsNumeric is not a built-in method on the string type. With an extension method, we can implement it easily.

using System.Text.RegularExpressions;
public static class StringExtensions
{
    public static bool IsNumeric(this string input)
    {
        return Regex.IsMatch(input, "^[0-9]+$");
    }
}

That is the basic concept. LINQ is the main consumer of this feature; we might even say LINQ was the reason extension methods were born. Of course, we do not have to know that to use LINQ. The snippet below illustrates the connection.
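
A quick illustration of my own, not from the post: LINQ operators such as Where are themselves extension methods on IEnumerable<T>, which is why they appear on every sequence once System.Linq is imported.

using System;
using System.Linq;

class LinqIsExtensionMethods
{
    static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5 };

        // Instance-call syntax, resolved to the static extension method Enumerable.Where
        var evens = numbers.Where(n => n % 2 == 0);

        // The equivalent explicit call to the static method
        var sameEvens = Enumerable.Where(numbers, n => n % 2 == 0);

        Console.WriteLine(string.Join(", ", evens));     // 2, 4
        Console.WriteLine(string.Join(", ", sameEvens)); // 2, 4
    }
}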

So what is this post about? We can take advantage of extension methods to improve our design and implementation even while the code is under development, that is, while we are still free to change it.

Take a look at this simple interface

public interface IDocumentRepository
{
    /// <summary>
    /// Get a document of given type <T> by id. Return null if not found
    /// </summary>
    T GetDocument<T>(Guid id);
}

A simple, generic repository. The implementation details do not matter here.

Later in development, many places end up with this kind of logic (code)

var document = _repository.GetDocument<Product>(id);

if(document == null)
    throw new ObjectNotFoundException();

// Do something with the document here

Just some checking. It is not a big deal, except for the fact that we need to do it in many places. So we want to be able to write this code

var document = _repository.GetOrThrow<Product>(id);

So what are the options?

Modify the interface and implementation

One could go with this approach

public interface IDocumentRepository
{
    /// <summary>
    /// Get a document of given type <T> by id. Return null if not found
    /// </summary>
    T GetDocument<T>(Guid id);

    T GetOrThrow<T>(Guid id);
}

And implement the method in all implementing classes. The approach works, but I do not like it much, for a couple of reasons:

  1. That logic does not fit there naturally. It is logic added on top of the existing functionality
  2. All the implementing classes have to be modified. In the best case, there is only one
  3. And what if later we need other functionality of that kind? Will we keep expanding the interface?
  4. Usually the interface lives in the infrastructure layer, but the extended functions belong to a service or a small bounded context. Some functions do not make any sense in other boundaries

Extend by extension methods

We can keep the interface clean as it was designed and extend the functionality with extension methods, as shown below

public static class DocumentRepositoryExtensions
{
    public static T GetOrThrow<T>(this IDocumentRepository repository, Guid id)
    {
        var document = repository.GetDocument<T>(id);

        if(document == null)
            throw new ObjectNotFoundException();

        return document;
    }
}

And we are free to place this code where it is used. If it is required in many layers or assemblies, consider putting it in a separate assembly.

However, be reasonable, otherwise you will end up with extension methods everywhere. It is about placing responsibilities in the right place.

I have been using this approach in my recent projects. And I am glad I have done it!

Event Sourcing – Generate Consecutive Ordering

Interesting things happen when a system is under load with many concurrent requests. I hit one of them while working on a big project. The problem showed up during development, when the system was handling many concurrent requests and many of them were inserting data.

Our system employed the Event Sourcing (ES) architecture. It had been chosen before I joined the project, and it had been working great until that moment.

We got many exceptions in the system, and we soon tracked down the root cause: some events were not projected, resulting in inconsistent data in the query model (the read side). But how could that happen? What was the root cause of that root cause?

Our approach depends on the order of events persisted in the Event Store, which is a SQL implementation. And some of the events were not in the order they were supposed to be.

Let's say there are 2 requests, R1 and R2

  1. R1 generates 3 events in order: A1, A2, A3
  2. R2 generates 3 events in order: B1, B2, B3

Events in a request are sent to the Event Store in a transaction, and we expect them to have consecutive order values. With the 2 requests above, we expected the order to be

A1, A2, A3, B1, B2, B3

Or

B1, B2, B3, A1, A2, A3

And it has been working like that for years.

We designed the Events table with the Order column as the primary key with Identity(1,1), so SQL handles the ordering for us.

One day!

When the system is under load with many concurrent requests

The event ordering is not consecutive.

What does it mean? Instead of having the orders as described earlier, we got this in the database:

A1, A2, B1, A3, B2, B3

Notice the order of B1 and A3

Because of the way our projection was designed and implemented, when the ordering was wrong (not consecutive), some events were not projected.

But how could that happen? According to the SQL Server documentation, there is no guarantee that the values generated by an Identity column are consecutive within a transaction. If you want consecutive values, you have to use the proper isolation level and locking. Unfortunately, we could not do that: it would hurt write performance, and we want to write events as fast as possible.

Consecutive values within a transaction – A transaction inserting multiple rows is not guaranteed to get consecutive values for the rows because other concurrent inserts might occur on the table. If values must be consecutive then the transaction should use an exclusive lock on the table or use the SERIALIZABLE isolation level.

So what could we do? There is always a solution.

If you need ordering, you have to be in charge

The first thing is that we have to manage the order values ourselves instead of delegating to the SQL Identity feature.

We know the number of events in a request/transaction. So if we know the current order number, we can reserve a range of values from Current + 1 to Current + NumberOfEvents, and then set the current order to the maximum of that range. If reserving a range and updating the current order is performed atomically, other requests will get correct, non-overlapping ranges. It is impossible for 2 requests to get the same range, and even if that somehow happened, SQL would throw a unique key constraint exception.

We need an EventOrdering table to store the current ordering.

We use the power of SQL to reserve a range. SQL allows returning values from an UPDATE statement. This query does all the work: it sets the new value in one atomic operation and returns both the old and the new values

const string sql = "UPDATE EventOrdering " +
                    "SET CurrentOrdering = CurrentOrdering + @numberOfEvents " +
                    "OUTPUT Deleted.CurrentOrdering + 1 as Lowest,Inserted.CurrentOrdering as Highest " +
                    "WHERE EventOrderingId=@eventOrderingId";

To access the old value, use the Deleted table; to access the new value, use the Inserted table.

/// <summary>
/// Get the current (latest) ordering and reserve a range of ordering from [Current + 1] to [Current + <see cref="numberOfEvents"/>]
///
/// Example: Current:        11
///          numberOfEvents: 5
///          reserved range: 12, 13, 14, 15, 16
///          Updated Current:16
///
/// The returned value: [12 - 16] where
///     12: Lowest
///     16: Highest
/// </summary>
/// <param name="numberOfEvents"></param>
/// <returns></returns>
private OrderingRange GetOrderingRange(int numberOfEvents)
{
    const string sql = "UPDATE EventOrdering " +
                    "SET CurrentOrdering = CurrentOrdering + @numberOfEvents " +
                    "OUTPUT Deleted.CurrentOrdering + 1 as Lowest,Inserted.CurrentOrdering as Highest " +
                    "WHERE EventOrderingId=@eventOrderingId";

    using (var conn = new SqlConnection(_connectionString).AsOpen())
    {
        var param = new
        {
            numberOfEvents = numberOfEvents,
            eventOrderingId = EventOrderingId
        };
        var ordering = conn.QueryFirstOrDefault<OrderingRange>(sql, param, commandTimeout: CommandTimeout);
        if (ordering == null)
        {
            throw new InvalidOperationException("Cannot read the global ordering record from the table EventOrdering. " +
                                                "Ensure that the table is in the database with one row");
        }

        return ordering;
    }
}

/// <summary>
/// We might just need the Lowest value and increase by 1 for each event.
/// However, having both values sounds right with the range, and might support the debugging/logging
/// </summary>
public class OrderingRange
{
    public long Lowest { get; set; }
    public long Highest { get; set; }
}

Once we have the range, we simply assign the order values to the events to be saved, along the lines of the sketch below.
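
A rough sketch of that assignment, my own illustration rather than the project's actual code; EventRecord and its Order property are assumed names:

public class EventRecord
{
    public long Order { get; set; }
    public string Payload { get; set; }
}

private void AssignOrdering(IReadOnlyList<EventRecord> events)
{
    // Reserve as many order values as there are events, e.g. Lowest = 12, Highest = 16 for 5 events
    var range = GetOrderingRange(events.Count);

    for (var i = 0; i < events.Count; i++)
    {
        // Consecutive, gap-free values within the request
        events[i].Order = range.Lowest + i;
    }

    // The events are then inserted with their explicit Order values in the same transaction
}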

Problem solved! And I learned new things about SQL.

Gowhere – Read CSV file

Continuing my Gowhere journey, I need to read data from a CSV file, a common task in programming: dealing with files.

Sample content with 3 columns: ID, Effort, and Title.

ID,Effort,Title
"94503","4","Bug: The line was not aligned correctly"
"97018","12","Implement a cool feature"
"97595","1","Document an incident"

The file size is small, so I do not have to worry too much about performance at this point. Let's see what it takes to perform the task in Go.

In general, here are the steps to read and parse a CSV file

  1. Open the file
  2. Read everything or line by line
  3. Parse the text into desired outcome format
  4. Close the file. When to close depends on how you read the file

Open a file

Go supplies the built-in package "os" to work with the file system. Opening a file returns a pointer to the file if there is no error, otherwise an error; the normal pattern in Go.

There are 3 required steps

  1. Open the file by os.Open(file)
  2. Check for error before usage
  3. Close the file by using a defer function. This is important, otherwise the file stays open
    // Need the os package to open a file
    // f is a pointer to File object (*File)
    f, err := os.Open(file)
    // Check for error before usage
    if err != nil {
        writeLog(err)
        return nil
    }
    // Close file by a defer function
    defer func() {
        er := f.Close()
        if er != nil {
            writeLog(er)
        }
    }()

Read file

Depending on the file size and the type of application, developers can choose to either read all the contents at once or read line by line. I always prefer line by line.

    // This is a cool concept. Given the pointer to the file opened via the os package,
    // a scanner is created and scans line by line
    b := bufio.NewScanner(f)

    stats := make(map[string]int64)
    for b.Scan() {
        line := b.Text()
        parts := strings.Split(line, ",")
        effortText := strings.Trim(parts[1], "\"")
        if effortText == "" {
            continue
        }

        effort, err := strconv.ParseInt(effortText, 0, 0)
        if err != nil {
            writeLog(err)
            continue
        }
        id := strings.Trim(parts[0], "\"")
        stats[id] = effort
        fmt.Printf("%s - %d\n", id, effort)
    }

Go supplies the bufio package to help with manipulating IO (files). It was the first time I had heard about the Scanner concept, and it clicked immediately; a kind of "wow, that makes sense and it is cool" moment.

After getting a pointer to a Scanner:

  1. Call Scan() to loop through the buffered data line by line
  2. Call Text() to access the current line. If you need the raw bytes instead, use Bytes()
  3. Parse the line to meet your logic

Process the data

For common operations on a string, the strings package is there. For conversion from a string to various data types (int, float, …), the strconv package is there. They are all built-in packages.

Close the file

This is handled by the defer function, which is called when the enclosing function exits.

    // Close file by a defer function
    defer func() {
        er := f.Close()
        if er != nil {
            writeLog(er)
        }
    }()