Just how big is Noda Time anyway?

Some years back, I posted a graph showing the growth of Subversion’s codebase over time, and I thought it might be fun to do the same with Noda Time. The Subversion graph shows the typical pattern of linear growth over time, so I was expecting to see the same thing with Noda Time. I didn’t¹.

Noda Time’s repository is a lot simpler than Subversion’s (it’s also at least an order of magnitude smaller), so it wasn’t that difficult to come up with a measure of code size: I just counted the lines in the .cs files under src/NodaTime/ (for the production code) and src/NodaTime.Test/ (for the test code).

I decided to exclude comments and blank lines this time round, because I wanted to know about the functional code, not whether we’d expanded our documentation. As it turns out, the proportion of comments has stayed about the same over time, but that ratio is very different for the production code and test code: comments and blank lines make up approximately 50% of the production code, but only about 20–25% of the test code.
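
For anyone who wants to try a similar measurement, here is a minimal sketch of one way to count non-blank, non-comment lines in C# source. It is not the script used to produce the graph: the CodeLineCounter class and its method names are invented for illustration, and it takes shortcuts (a /* inside a string literal would be misread as the start of a comment, and preprocessor directives count as code).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration only; not part of Noda Time.
public static class CodeLineCounter
{
    // Counts the lines that still contain something once comments
    // and whitespace have been removed.
    public static int CountCodeLines(IEnumerable<string> lines)
    {
        int count = 0;
        bool inBlockComment = false;
        foreach (string rawLine in lines)
        {
            string rest = rawLine;
            bool hasCode = false;
            while (rest.Length > 0)
            {
                if (inBlockComment)
                {
                    // Skip to the end of the block comment, if it ends on this line.
                    int end = rest.IndexOf("*/", StringComparison.Ordinal);
                    if (end < 0)
                    {
                        rest = "";
                    }
                    else
                    {
                        rest = rest.Substring(end + 2);
                        inBlockComment = false;
                    }
                }
                else
                {
                    int lineStart = rest.IndexOf("//", StringComparison.Ordinal);
                    int blockStart = rest.IndexOf("/*", StringComparison.Ordinal);
                    if (blockStart >= 0 && (lineStart < 0 || blockStart < lineStart))
                    {
                        // Anything before the block comment counts as code.
                        hasCode |= rest.Substring(0, blockStart).Trim().Length > 0;
                        rest = rest.Substring(blockStart + 2);
                        inBlockComment = true;
                    }
                    else
                    {
                        // A line comment (including /// doc comments) runs to end of line.
                        string code = lineStart >= 0 ? rest.Substring(0, lineStart) : rest;
                        hasCode |= code.Trim().Length > 0;
                        rest = "";
                    }
                }
            }
            if (hasCode)
            {
                count++;
            }
        }
        return count;
    }

    // Sums the count over every .cs file below the given directory.
    public static int CountCodeLinesUnder(string directory) =>
        Directory.EnumerateFiles(directory, "*.cs", SearchOption.AllDirectories)
                 .Sum(path => CountCodeLines(File.ReadLines(path)));

    public static void Main(string[] args)
    {
        foreach (string directory in args)
        {
            Console.WriteLine($"{directory}: {CountCodeLinesUnder(directory)}");
        }
    }
}
```

Running this with src/NodaTime and src/NodaTime.Test as arguments at each tagged release would give one pair of data points per release. The exact numbers depend on how comments are classified, so a different tool will likely produce slightly different totals.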

Here’s the graph. It’s not exactly up-and-to-the-right, more… wibbly-wobbly-timey-wimey.

[Figure: Noda Time’s code size in KLOC over time. Code and test size are both approximately flat over time, though with dips before some of the major releases.]

There are some things that aren’t surprising: during the pre-1.0 betas (the first two unlabelled points) we actively pruned code that we didn’t want to commit to for 1.x², so the codebase shrinks until we release 1.0. After that, we added a bunch of functionality that we’d been deferring, along with a new compiled TZDB file format for the PCL implementation. So the codebase grows again for 1.1.

But then with 1.2, it shrinks. From what I can see, this is mostly due to an internal rewrite that removed the concept of calendar ‘fields’ (which had come along with the original mechanical port from Joda Time). This seems to counterbalance the fact that at the same time we added support for serialization³ and did a bunch of work on parsing and formatting.

1.3 sees an increase brought on by more features (new calendars and APIs), but then 2.0 (at least so far) sees an initial drop, a steady increase due to new features, and (just last month) another significant drop.

The first decrease for 2.0 came about immediately, as we removed code that was deprecated in 1.x (particularly, the handling for 1.0’s non-PCL-compatible compiled TZDB format). Somewhat surprisingly, this doesn’t come with a corresponding decrease in our test code size, which has otherwise been (roughly speaking) proportional in size to the production code (itself no real surprise, as most of our tests are unit tests). It turns out that the majority of this code was only covered by an integration test, so there wasn’t much test code to remove.

The second drop is more interesting: it’s all down to new features in C# 6.

For example, in Noda Time 1.3, Instant has Equals() and GetHashCode() methods that are written as follows:

public override bool Equals(object obj)
{
    if (obj is Instant)
    {
        return Equals((Instant)obj);
    }
    return false;
}
public override int GetHashCode()
{
    return Ticks.GetHashCode();
}

In Noda Time 2.0, the same methods are written using expression-bodied members, in two lines (I’ve wrapped the first line here):

public override bool Equals(object obj) =>
    obj is Instant && Equals((Instant)obj);
public override int GetHashCode() => duration.GetHashCode();

That’s the same functionality, just written in a terser syntax. I think it’s also clearer: the former reads more like a procedural recipe to me; the latter, a definition.

Likewise, ZoneRecurrence.ToString() uses expression-bodied members and string interpolation to turn this:

public override string ToString()
{
    var builder = new StringBuilder();
    builder.Append(Name);
    builder.Append(" ").Append(Savings);
    builder.Append(" ").Append(YearOffset);
    builder.Append(" [").Append(fromYear).Append("-").Append(toYear).Append("]");
    return builder.ToString();
}

into this:

public override string ToString() =>
    $"{Name} {Savings} {YearOffset} [{FromYear}-{ToYear}]";

There’s no real decrease in test code size though: most of the C# 6 features are really only useful for production code.

All in all, Noda Time’s current production code is within 200 lines of where it was back in 1.0.0-beta1, which isn’t something I would have been able to predict. Also, while we don’t quite have more test code than production code yet, it’s interesting to note that we’re only about a hundred lines short.

Does any of this actually matter? Well, no, not really. Mostly, it was a fun little exercise in plotting some graphs.

It did remind me that we have certainly simplified the codebase along the way — removing undesirable APIs before 1.0 and removing concepts (like fields) that were an unnecessary abstraction — and those are definitely good things for the codebase.

And it’s also interesting to see how effective the syntactic sugar in C# 6 is at reducing line counts. The removal of unnecessary text also improves readability, though, and that’s the key part here, rather than the resulting number of lines of code.

But mostly I just like the graphs.

¹ Or, if you prefer BuzzFeed-style headlines, “You won’t believe what happened to this codebase!”.
² To get to 1.0, we removed at least: a verbose parsing API that tried to squish the Noda Time and BCL parsing models together, an in-code type-dependency graph checker, and a very confusingly-broken CultureInfo replacement.
³ I’m not counting the size of the NodaTime.Serialization.JsonNet package here at all (nor the NodaTime.Testing support package), so this serialization support just refers to the built-in XML and binary serialization.