farblog

by Malcolm Rowe

Just how big is Noda Time anyway?

Some years back, I posted a graph showing the growth of Subversion’s codebase over time, and I thought it might be fun to do the same for Noda Time. The Subversion graph shows the typical pattern of linear growth over time, so I was expecting to see much the same thing here. I didn’t¹.

Noda Time’s repository is a lot simpler than Subversion’s (it’s also at least an order-of-magnitude smaller), so it wasn’t that difficult to come up with a measure of code size: I just counted the lines in the .cs files under src/NodaTime/ (for the production code) and src/NodaTime.Test/ (for the test code).

I decided to exclude comments and blank lines this time round, because I wanted to know about the functional code, not whether we’d expanded our documentation. As it turns out, the proportion of comments has stayed about the same over time, but that ratio is very different for the production code and test code: comments and blank lines make up approximately 50% of the production code, but only about 20–25% of the test code.
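
Nothing clever is needed for the counting: something along these lines does the trick, run separately for each of the two directories (I don’t promise this is the exact incantation I used, and it ignores the occasional /* */ block comment):

find src/NodaTime -name '*.cs' -print0 \
    | xargs -0 cat \
    | grep -cvE '^[[:space:]]*(//|$)'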

Here’s the graph. It’s not exactly up-and-to-the-right, more… wibbly-wobbly-timey-wimey.

[Image: Noda Time’s code size in KLOC over time. Code and test size are both approximately flat, with dips before some of the major releases.]

There are some things that aren’t surprising: during the pre-1.0 betas (the first two unlabelled points) we actively pruned code that we didn’t want to commit to for 1.x², so the codebase shrinks until we release 1.0. After that, we added a bunch of functionality that we’d been deferring, along with a new compiled TZDB file format for the PCL implementation. So the codebase grows again for 1.1.

But then with 1.2, it shrinks. From what I can see, this is mostly due to an internal rewrite that removed the concept of calendar ‘fields’ (which had come along with the original mechanical port from Joda Time). This seems to counterbalance the fact that at the same time we added support for serialization³ and did a bunch of work on parsing and formatting.

1.3 sees an increase brought on by more features (new calendars and APIs), but then 2.0 (at least so far) sees an initial drop, a steady increase due to new features, and (just last month) another significant drop.

The first decrease for 2.0 came about immediately, as we removed code that was deprecated in 1.x (particularly, the handling for 1.0’s non-PCL-compatible compiled TZDB format). Somewhat surprisingly, this doesn’t come with a corresponding decrease in our test code size, which has otherwise been (roughly speaking) proportional in size to the production code (itself no real surprise, as most of our tests are unit tests). It turns out that the majority of this code was only covered by an integration test, so there wasn’t much test code to remove.

The second drop is more interesting: it’s all down to new features in C# 6.

For example, in Noda Time 1.3, Instant has Equals() and GetHashCode() methods that are written as follows:

public override bool Equals(object obj)
{
    if (obj is Instant)
    {
        return Equals((Instant)obj);
    }
    return false;
}
public override int GetHashCode()
{
    return Ticks.GetHashCode();
}

In Noda Time 2.0, the same methods are written using expression-bodied members, in two lines (I’ve wrapped the first line here):

public override bool Equals(object obj) =>
    obj is Instant && Equals((Instant)obj);
public override int GetHashCode() => duration.GetHashCode();

That’s the same functionality, just written in a terser syntax. I think it’s also clearer: the former reads more like a procedural recipe to me; the latter, a definition.

Likewise, ZoneRecurrence.ToString() uses expression-bodied members and string interpolation to turn this:

public override string ToString()
{
    var builder = new StringBuilder();
    builder.Append(Name);
    builder.Append(" ").Append(Savings);
    builder.Append(" ").Append(YearOffset);
    builder.Append(" [").Append(fromYear).Append("-").Append(toYear).Append("]");
    return builder.ToString();
}

into this:

public override string ToString() =>
    $"{Name} {Savings} {YearOffset} [{FromYear}-{ToYear}]";

There’s no real decrease in test code size though: most of the C# 6 features are really only useful for production code.

All in all, Noda Time’s current production code is within 200 lines of where it was back in 1.0.0-beta1, which isn’t something I would have been able to predict. Also, while we don’t quite have more test code than production code yet, it’s interesting to note that we’re only about a hundred lines short.

Does any of this actually matter? Well, no, not really. Mostly, it was a fun little exercise in plotting some graphs.

It did remind me that we have certainly simplified the codebase along the way — removing undesirable APIs before 1.0 and removing concepts (like fields) that were an unnecessary abstraction — and those are definitely good things for the codebase.

And it’s also interesting to see just how effective the syntactic sugar in C# 6 is at reducing line counts, though the removal of unnecessary text also improves readability, and it’s the readability, rather than the resulting line count, that’s the key part here.

But mostly I just like the graphs.


  1. Or, if you prefer BuzzFeed-style headlines, “You won’t believe what happened to this codebase!”. 

  2. To get to 1.0, we removed at least: a verbose parsing API that tried to squish the Noda Time and BCL parsing models together, an in-code type-dependency graph checker, and a very confusingly-broken CultureInfo replacement. 

  3. I’m not counting the size of the NodaTime.Serialization.JsonNet package here at all (nor the NodaTime.Testing support package), so this serialization support just refers to the built-in XML and binary serialization. 

Simple privilege separation using ssh

If you have a privileged process that needs to invoke a less-trusted child process, one easy way to reduce what the child is able to do is to run it under a separate user account and use ssh to handle the delegation.

This is pretty simple stuff, but as I’ve just wasted a day trying to achieve the same thing in a much more complicated way, I’m writing it up now to make sure that I don’t forget about it again.

(Note that this is about implementing privilege separation using ssh, not about how ssh itself implements privilege separation; if you came here for that, see the paper Preventing Privilege Escalation by Niels Provos et al.)

In my case, I’ve been migrating my home server to a new less unhappy machine, and one of the things I thought I’d clean up was how push-to-deploy works for this site, which is stored in a Mercurial repository.

What used to happen was that I’d push from wherever I was editing, over ssh, to a repository in my home directory on my home server; a changegroup hook would then update the working copy (hg up) to include whatever I’d just pushed, and run a script (from the repository) to deploy to my webserver. The hook script’s stdout gets sent back to me, so I also get to see what happened.

(This may sound a bit convoluted, but I’m not always able to deploy directly from where I’m editing to the webserver. This also has the nice property that I can’t accidentally push an old version live by running from the wrong place, since history is serialised through a single repository.)

The two main problems here are that pushing to the repository has the surprising side-effect of updating the working copy in my home directory (and so falls apart if I accidentally leave uncommitted changes lying around), and that the hook script runs as the user who owns the repository (i.e. me), which is largely unnecessary.

For entirely separate reasons, I’ve recently needed to set up shared Mercurial hosting (which I found to be fairly simple, using mercurial-server), so I now have various repositories owned by a single hg user.

I don’t want to run the (untrusted) push-to-deploy scripts directly as that shared user, because they’d then have write access to all repositories on the server. (This doesn’t matter so much for my repositories, since only I can write to them, and it’s my machine anyway, but it will for some of the others.)

In other words, I want a way to allow one privileged process (the Mercurial server-side process running as the hg user) to invoke another (a push-to-deploy script) in such a way that the child process doesn’t retain the first process’s privileges.

There are lots of ways to achieve this, but one of the simplest is to run the two processes under different user accounts, and then either find a way for two always-running processes to communicate (named pipes or shared memory, for example), or have one process invoke the other directly.

The latter is more appropriate in this case, and while the obvious way for a (non-root) user to run a process as another is via sudo, the policy specification for that (in /etc/sudoers) is… complicated. Happily, there’s a simpler way that only requires editing configuration files owned by the two users in question: ssh.

The setup is fairly easy: I’ve created a separate user that will run the push-to-deploy script (hg-blog), generated a password-less keypair for the calling (hg) user, and added the public key (with from= and command= options) to /home/hg-blog/.ssh/authorized_keys.
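
The authorized_keys entry ends up looking something like this, all on one line (the script path here is a placeholder, and I’ve elided the key itself):

from="127.0.0.1,::1",command="/home/hg-blog/bin/push-blog",no-pty,no-port-forwarding,no-X11-forwarding,no-agent-forwarding ssh-rsa AAAA... hg@home-server

The from= option limits where the key can be used from (localhost, in this case), and command= means that the push script is what runs, no matter what command the caller asks for.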

Now the Mercurial server-side process can trigger the push script simply by creating a $REPOS/.hg/hgrc containing:

[hooks]
changegroup.autopush = ssh hg-blog@localhost

This automatically runs the command I specified in the target user’s authorized_keys, so I don’t even have to worry about listing it here¹.

In conclusion, ssh is a pretty good tool for creating a simple privilege separation between two processes. It’s ubiquitous and doesn’t require root to do anything special, and while the case I’m using it for here involves two processes on the same machine, there’s actually no reason that they couldn’t be on different machines.

The ‘right’ answer may well be to run each of these as Docker containers, completely isolating them from each other. I’m not at that point yet, and in the meantime, hopefully by writing this up I won’t forget about it the next time I need to do something similar!


  1. In this case, adding a command restriction doesn’t protect against a malicious caller, since the command that’s run immediately turns around and fetches the next script to run from that same caller. It does protect against someone else obtaining the (password-less by necessity) keypair, I suppose, though the main reason is the one listed above: it means that ‘what to do when something changes’ is specified entirely in one place. 

Mobile friendly

[Image: Mobiles friendly: the Nexii are much happier now. A Nexus 7 and Nexus 5 side by side, showing a view of this blog post (including this image, of course).]

This month, I decided to do something about the way this site rendered on mobile devices. Now that it works reasonably well, I thought it might be interesting to talk about what I needed to change — which, as it turned out, wasn’t that much.

What’s the problem?

First off, here’s what things used to look like on a Nexus 6 (using my pangrammatic performance post as an example).

[Image: Desktop view: we’re not even using half the screen! The page is zoomed out so that the central column occupies about 40% of the screen width, with a large left margin and an even larger right margin. (Nexus frames from the Android device art generator, used under CC BY 2.5.)]

Double-tapping on a paragraph zooms to fit the text to the viewport, which produces something that’s fairly readable, but you can still scroll left and right into dead space.

As well as making it a pain to just scroll vertically, this also caused other problems, like the way that double-tapping on bulleted lists (which have indented left margins) would zoom the viewport such that it cropped the left edge of the main content area.

[Image: Zooming in to a bulleted list: what happened to the first paragraph? Although some text now fits to the viewport width, an earlier paragraph has been truncated at the left edge.]

This is all pretty terrible, of course, and about par for the course for mobile browsers.

So what’s going on here? Well, for legacy reasons, mobile browsers typically default to rendering a faux-desktop view, first by setting the viewport so that it can contain content with a fixed “fallback” width (usually around 1000px), and then by fiddling with text sizes to make things more readable.

meta viewport to the rescue

This behaviour can be overridden fairly easily using the (de facto standard, but not particularly well defined) meta viewport construct. For example, this is what I needed to include to revert to a more sensible behaviour:

<meta name=viewport
    content="width=device-width, initial-scale=1">

The two clauses have separate and complementary effects: width=device-width tells the browser to lay the page out at the device’s actual width (in CSS pixels) rather than the faux-desktop fallback width, and initial-scale=1 sets the initial zoom to 100%, so the page starts out neither zoomed in nor zoomed out.

(This is all explained in rather more detail in the Google Developers document that I linked to above¹.)

In practice, I’d recommend just taking the snippet above as a cargo-cultable incantation that switches off the weird faux-desktop rendering and fits the content to the screen.

So, after I’ve added the above, I’m done? Not quite.

overflow: visible

[Image: With a viewport, the initial view is no longer zoomed out, but it now feels cramped, and the page still needs to be scrolled to reach content that extends past the right edge.]

Our viewport still needs to be scrolled horizontally to reach some of the content, which is far from ideal, and we’ve no longer got any left-hand margin at all. All in all, it’s pretty hard to read our content even though it’s now zoomed in.

It’s probably worth taking a step back to look at the layout we’re using.

The overall page structure here is pretty trivial, roughly:

body {
  max-width: 600px;
  margin: 0 auto;
}

This centres the <body> in the viewport, allowing it to expand up to 600px wide.

We can fix the disappearing margins with body { padding: 0 1em; } (which only has an effect if the body would otherwise be flush to the viewport edges), and while we’re here, we might as well change that max-width: 600px to something based on ems (I went for max-width: 38em).
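
Putting those two tweaks together, the body rule ends up as something like this:

body {
  max-width: 38em;
  margin: 0 auto;
  padding: 0 1em;
}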

Most of the content of <body> is text in paragraphs; that’s fine. The two immediate problems are code snippets (in <pre> blocks), and images.

Right away we can see a problem: the images have a declared width and height, and aren’t going to adapt if the width of the <body> element changes.

The code snippets have a related problem: <pre> text won’t reflow, and the default CSS overflow behaviour allows block-level content to overflow its content box, expanding the viewport’s canvas and reintroducing horizontal scrolling².

We can fix the code snippets fairly easily by enabling horizontal scrollbars for the snippets where needed:

pre {
  overflow: auto;
  overflow-y: hidden;
}

This uses overflow, a CSS 2.1 property, to ensure that content is clipped to the content box, adding scrollbars if needed. It then uses overflow-y, a CSS3 property, to remove any vertical scrollbars, leaving us with only the horizontal scrollbars (or none). If the overflow-y property isn’t supported (and in practice it is), the browser will still render something reasonable.

Responsive images

That doesn’t help with the images, of course. The term you’ll want to search for is “responsive images”, but what we’re actually going to do is size the image so that it fits within the space available³.

One easy way to do this is to simply replace:

<img src="kittens" width="400" height="300">

with

<img src="myimage" style="width: 100%">

and, broadly speaking, that’s what I’m now doing⁴. Note that you do need to drop the height property (and so might as well drop width too), otherwise you’ll have an image with a variable width and fixed height (which doesn’t work so well, as you might imagine).

There are some caveats with older versions of Internet Explorer (aren’t there always?) but in my case I’ve decided that I’m only interested in supporting IE9 and above⁵, so these don’t apply.

But wait a sec: we declared the image’s dimensions in the first place so that the browser could reserve space for the image, rather than reflowing the page as it downloaded them. Does this mean that we need to abandon that property?

Maybe. Somewhat surprisingly, there isn’t any way (yet⁶) to declare the aspect ratio (or, equivalently, original size) of an image while also allowing it to be resized to fit a container. However, all’s not lost: for common image aspect ratios, we can adopt a technique documented by Anders Andersen where we prevent reflow by pre-sizing a container to a given aspect ratio.

The tl;dr is that we use something like the following markup instead:

<div class="ratio-16-9">
  <img src="myimage" style="width: 100%">
</div>

We then pre-size the containing div using the CSS rule padding-bottom: 56.25% (9/16 = 0.5625; CSS percentages refer to the container’s width), and position the image over the div using absolute positioning, taking it out of the flow.
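
Concretely, the CSS looks roughly like this (a sketch along the lines of Anders Andersen’s post; my actual rules differ slightly):

.ratio-16-9 {
  position: relative;      /* positioning context for the image */
  height: 0;
  padding-bottom: 56.25%;  /* 9/16 of the container's width */
}
.ratio-16-9 img {
  position: absolute;      /* take the image out of the flow */
  top: 0;
  left: 0;
}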

This works, but there are some caveats: it only works for images with common aspect ratios, of course (4:3 and 16:9 are pretty common, but existing images might have any aspect ratio), and, as written, it only works for images that are sized to 100% of the container’s width (though you could handle fixed smaller sizes as well, if desired).

In my case, I elected to make all images sized to 100% of the viewport width (which works well, mostly), and applied the reflow-avoidance workaround only to those images with 16:9 or 4:3 aspect ratios, leaving the others to size-on-demand.

I did notice some surprising rounding differences on Chrome that led me to reduce that 56.25% of padding to 56.2% (which may truncate the image by a pixel or two; better than allowing the background to show through, though). I suspect this may be because Chrome allows HTML elements in general to have fractional CSS sizes, while it appears to restrict images to integral pixel sizes.

Just a quality-of-implementation issue

This gave me pretty good results, but I also took the opportunity to make a few other changes to make things work a little better:

It’s worth noting that a lot of these changes also improved the site on desktop browsers. That’s not really surprising: “mobile-friendly” is more about adaptability than a particular class of device.

Resources

So there you have it: for a good mobile site, you may only have to a) add a meta viewport tag, and b) size your content (particularly images) to adapt to the changing viewport width.

Here are some resources (some of which I mentioned above) that I found useful:


  1. Somewhat surprisingly, this is the best reference I’ve found for what the meta viewport tag actually does. 

  2. In theory, the same problem can occur for other elements; for example, an unbreakable URL in running text can cause a <p> element to overflow. In practice, though, that’s not something that I’ve found worth handling. 

  3. There is more to responsive images than just resizing. For example, you can serve completely different images to different devices using media queries (so-called “art direction”). However, that’s way more complicated than what I needed. 

  4. You can alternatively use max-width if you only want to shrink images wider than their container; I also wanted to enlarge the smaller ones. 

  5. Why only IE9? It’s available on everything going back to Windows Vista, and it’s the first version to support SVG natively and a bunch of CSS properties that I’m using (::pseudo-elements, not(), box-shadow, to name a few). Windows XP users could well have trouble connecting to this server in the first place anyway, due to the SSL configuration I’m using, so requiring IE9/Vista doesn’t seem too unreasonable. 

  6. From what I’m led to believe, this is being actively worked on. 

Cloudy DNS

This is a machine running on the end of an ADSL line. It’s not a very happy machine:

$ uptime
 11:28:01 up 781 days,  1:39,  1 user,  load average: 2.01, 2.03, 2.05

It’s actually idle, so why is the load average above 2.0? Because there’s an unkillable mdadm process stuck in a D state, and a second mount process that’s permanently runnable.

So why haven’t I just rebooted it (and better still, upgraded it: obviously it’s running an old kernel)? Because I’m not entirely convinced it’ll start up again: the disks were acting a bit suspiciously, and lately the PSU fan has been making a bit of a racket as well.

Unfortunately, it’s also a machine that’s accumulated infrastructure that I care about: DNS, Apache, and so on. The data is safely backed up off-machine, but if I just tear it down, a bunch of things will be broken while I’m rebuilding it. So instead, I’ve been trying to decommission it piece-by-piece.

I’ve also got a bit bored running all my own infrastructure, so some of those moving parts have been put onto dedicated consumer hardware (getting the router to handle internal DNS and DHCP, getting a Synology NAS for Samba, etc), and I’ve moved some others onto a hosted VM, so that I don’t have to worry about the hardware: that copy of Apache has been (mostly) obsoleted by moving this site to Google Compute Engine last January, for example.

But there’s still a few things that I’m depending upon this machine for. Until recently, one was as the primary DNS server for farside.org.uk.

I was using a free secondary DNS service from BuddyNS: they provide replica nameservers that I listed as the domain’s public nameservers, and those did regular zone transfers from my server, which remained the source of truth.

That was pretty convenient, and BuddyNS have been pretty great (the free tier is good for up to 300K queries per month, of which I was using about 70-100K), but they only provide secondary DNS, so I went looking for another solution.

I’m sure that there are many other DNS providers around, but since I’m hosting www.farside.org.uk on Google Compute Engine, I decided to try out Google Cloud DNS, which provides a simple primary DNS service, available via anycast over both IPv4 and IPv6 (that arrangement seems to be fairly standard for DNS providers nowadays).

This one’s not free, but it is pretty cheap: US$0.20/month per domain, plus US$0.40/month per million queries. For me, that should work out to less than $3/year¹.
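
(Roughly: US$0.20 × 12 is US$2.40 for the zone, and the 70–100K queries a month I mentioned above comes to a bit over a million queries a year, so around another US$0.40 or so.)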

Otherwise, it seems to be broadly similar to other DNS providers. You can make updates via a JSON/REST API, and API client libraries and a basic command-line client are provided. They do only support a predefined set of resource record types, though I suspect that’s not a problem for most people².
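
For a flavour of it, here’s roughly what adding a record looks like via the gcloud command-line client (the zone name and IP address are made up, and I haven’t checked that this matches the exact syntax of the version I was using at the time):

gcloud dns record-sets transaction start --zone=farside-org-uk
gcloud dns record-sets transaction add --zone=farside-org-uk \
    --name=www.farside.org.uk. --type=A --ttl=300 203.0.113.10
gcloud dns record-sets transaction execute --zone=farside-org-uk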

I actually switched a few weeks ago, but until very recently the programmatic REST API was the only way to make changes, so this wasn’t really a product I’d want to recommend: technically, it worked, but editing a JSON document by hand to send via the command-line client was… suboptimal.

Fortunately, there’s now an editor embedded in the Google Developers Console, so you can also make changes interactively.

[Image: The new Cloud DNS editor in the Google Developers Console, listing individual resource records along with facilities for in-place editing, creation, and deletion.]

Overall, I’m happy enough with the switch: it seems to work well, and didn’t take much effort (once I’d remembered to quote my TXT strings properly, ahem).

I did make one or two changes to the domain at the same time, most notably removing the A record for farside.org.uk itself (which had originally been present for direct mail delivery, years ago). This does mean that http://farside.org.uk/ will no longer resolve³, but that hopefully shouldn’t cause any real problems.


  1. Full disclosure: I’m currently getting an employee discount, so I’ll be paying less than that. 

  2. I did have to drop an RP RR as a result of this, though I wasn’t actually using it for anything. 

  3. Previously, this would end up at the aforementioned machine and be redirected by that copy of Apache to www.farside.org.uk, which runs elsewhere.