I’ll freely admit that I don’t need to use a lot of maths in my day
job: pretty much the only things I really need to understand are basic
statistics and, occasionally, how to run a t-test to see whether an
optimisation has actually improved things (though I guess I don’t even need
to understand the maths for that one).
But sometimes maths ends up happening anyway, and it can be interesting
when that happens. Here’s one such story:
Let’s say that we have some frontend service that we’d like to understand
the performance of in a production configuration. Maybe we want to
understand its RAM usage under load, or the latency for some given types of
requests, that kind of thing.
In our example system, we’ll send requests to a single frontend task, but
there’s a wrinkle: that frontend uses a sharded or replicated backend RPC
service to service these requests, and the frontend will only
connect[2] to a given backend task when needed. That connection
setup will take a little extra time whenever it occurs, which we don’t want
to count in our test.
If we keep sending requests, then eventually the frontend task should
have connected to all the backend tasks, and we’ll have a steady-state we
can start analysing.
The question then is, how long will it take for our system to set up all of
these connections and reach that steady-state? Or, more precisely, how many
warm-up requests do we need to send before we can start our real
measurements?
Let’s say that our frontend has a hundred backend tasks that it can connect
to, and that we’ll model the selection of each backend task as uniformly
random for each request. Do we need to send a few hundred requests to our
frontend before all of those backends are connected, or a few thousand?
Or more?
Incidentally, when I did something like this for real, I used a pragmatic
solution instead: keep sending requests until the metric I was monitoring
looked like it had settled down, then start measuring. But it did make me
wonder how we might go about calculating this mathematically.
Back-of-the-envelope maths time: we know that it has to be at least 100
requests, since we can only make at most one connection to a backend task
for any given frontend request. We also know that doing it in exactly 100
requests should both be possible and extremely unlikely:
The first request we send will definitely trigger a connection to some
backend task, since we haven’t made any connections at all yet.
The second request will probably make another connection. Since we have
100 backend tasks, the chance is only $1 / 100$ that we pick the same
backend as we used for the first request, so in $99 / 100$ cases we’ll
connect to a new backend task.
The third request will connect to a third backend in $98 / 100$ cases,
and so on.
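After we’ve sent that 100th request, the chance that we’ve
connected to all 100 backends is therefore:
$$\frac{100}{100} \times \frac{99}{100} \times \frac{98}{100} \times \dots \times \frac{1}{100} = \frac{100!}{100^{100}}$$
… which is a very small number: about 1 in $10^{42}$.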
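As a quick sanity check on that figure, here’s a minimal Python sketch that computes the same product exactly, as $100! / 100^{100}$. The function name is my own invention, and it assumes the same uniformly random selection model as above.

```python
from fractions import Fraction
from math import factorial


def all_distinct_probability(n: int) -> Fraction:
    """Chance that n uniformly random picks from n backends are all different."""
    # n/n * (n-1)/n * ... * 1/n == n! / n**n
    return Fraction(factorial(n), n ** n)


if __name__ == "__main__":
    p = all_distinct_probability(100)
    print(f"p = {float(p):.3e}, or about 1 in {float(1 / p):.3e}")
```

Using `Fraction` keeps the arithmetic exact until the final conversion to a float, which avoids any worry about rounding in a product of a hundred small terms.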
So if we want to have a high chance of having connected to all of the
backend tasks, we’ll definitely need more than 100 requests, but it’s still
not very clear how many: in theory, it could take an unbounded number of
requests, since we might never pick all 100!
It turns out that this problem has a name: it’s the coupon collector’s
problem, based on the idea of collecting
all of $n$ different coupons by drawing randomly, one-by-one.
Calculating the expected number of requests we need is fairly
straightforward: we can just sum the expected number of requests we need to
make each new connection:
For the first connection, we always need exactly one request.
For the second connection, the probability of making a new connection is
$99 / 100$, so the expected number of requests we need is $100 / 99$.
For the third connection, the expected number of requests is $100 / 98$,
and so on.
To make the last connection, we expect that it’ll take $100 / 1$ or 100
requests, which makes sense: we have to pick the one unconnected backend
from 100 options (in the same way that on average you’d expect to need to
make six die rolls on a regular six-sided die in order to roll some
pre-chosen number).
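Summing these terms for $n$ backends gives:
$$\frac{n}{n} + \frac{n}{n-1} + \dots + \frac{n}{1} = n \sum_{k=1}^{n} \frac{1}{k}$$
For $n=100$, we get about 518.74, which is the expected number of requests
we’d need to send.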
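For anyone who wants to check that number, here’s a small Python sketch of the same sum. The function name is hypothetical, and it again assumes uniformly random backend selection.

```python
from fractions import Fraction


def expected_requests(n: int) -> float:
    """Expected number of uniformly random requests needed to hit all n backends."""
    # With `remaining` backends still unconnected, a request makes a new
    # connection with probability remaining / n, so we need n / remaining
    # requests on average. Sum that over every stage.
    return float(sum(Fraction(n, remaining) for remaining in range(1, n + 1)))


if __name__ == "__main__":
    print(f"{expected_requests(100):.2f}")  # prints 518.74
```

The six-sided die version of the same question, `expected_requests(6)`, comes out at 14.7 rolls to see every face at least once.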
But what does this number actually tell us? If we send 519 requests, does
that mean that we’ll probably have connected to all of the backends? Or
that we have exactly a 50% chance of having done so? How many requests do we
need if we want to have a 95% chance of having connected to all of the
backends, say?
What we’ve just calculated is the expected value: the long-run average of
the number of requests.
In other words, if we were to run this experiment a large number of times,
counting the number of requests until we successfully connected to all the
backends ($C_1, C_2, C_3, \dots$), then the expected value is simply what
the average of those counts would converge to.
Since some of these counts might be quite large, this expected value isn’t
actually that useful for thinking about the number of requests we’re going
to need to send. What we really want to know is what the actual
distribution looks like.
Since randomly picking those last few backends takes quite a few requests,
the distribution is heavily skewed to the right. To cut to the chase, it
looks like this:
[Figure: probability distribution (PMF) for the coupon collector’s problem, for $n=100$ coupons]
For every point $x$ on the x-axis, the height of the curve shows the
probability of collecting 100 coupons — or connecting to all 100 backends,
in our example — in exactly $x$ attempts, so for $x = 100$ we have the
very small (but non-zero) probability we calculated earlier, while for $x
\lt 100$ (the hatched area) the probability is zero, as those values are
impossible.
The expected value (518.74) we calculated above is marked, as are the median
(497) and 95th percentile (754). All of these are larger than the
mode (460), which is the most-likely number of attempts we’d need.
This graph also has non-zero values everywhere for $x \geq 100$, though it
drops off quickly: the probability that we’d take more than 1000 attempts is
only 0.4% or so, for example.
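One way to compute a distribution like this numerically, without any clever maths, is to track the probability of having connected to exactly $j$ backends after each request. The sketch below does that; it is not necessarily how the figures above were produced, the function names are hypothetical, and the distribution is truncated at `max_requests`.

```python
def coupon_pmf(n: int, max_requests: int) -> list:
    """pmf[k] is the chance that request k is the one that completes the set."""
    # state[j]: probability that exactly j backends are connected so far
    # (and we have not yet finished), just before sending the next request.
    state = [0.0] * n
    state[0] = 1.0
    pmf = [0.0] * (max_requests + 1)
    for k in range(1, max_requests + 1):
        # We finish on request k if n - 1 backends were already connected
        # and this request happens to pick the one remaining backend.
        pmf[k] = state[n - 1] / n
        nxt = [0.0] * n
        for j, p in enumerate(state):
            nxt[j] += p * j / n                # picked an already-connected backend
            if j + 1 < n:
                nxt[j + 1] += p * (n - j) / n  # picked a new backend
        state = nxt
    return pmf


def summarise(pmf: list, percentile: float = 0.95) -> dict:
    """Mean, mode, median, and one percentile of a (truncated) PMF."""
    def quantile(q: float) -> int:
        total = 0.0
        for k, p in enumerate(pmf):
            total += p
            if total >= q:
                return k
        raise ValueError("PMF was truncated too early for this quantile")

    return {
        "mean": sum(k * p for k, p in enumerate(pmf)),
        "mode": max(range(len(pmf)), key=pmf.__getitem__),
        "median": quantile(0.5),
        "percentile": quantile(percentile),
    }


if __name__ == "__main__":
    pmf = coupon_pmf(100, 3000)
    print(summarise(pmf))
    print("P(more than 1000 requests):", 1.0 - sum(pmf[:1001]))
```

Here a percentile is taken to be the smallest number of requests whose cumulative probability reaches the target, which is an assumption on my part; other definitions could shift the answer by one.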
Wikipedia goes into the details of how to calculate this distribution
exactly, though I don’t feel competent enough to explain it here. One
useful nugget, though, is that we can estimate some of the statistics if we
map this distribution to what is apparently called a Gumbel distribution:
The mode for any given $n$ coupons is approximately $n \ln{n}$.
For any percentile $p$, we can compute $k_p \approx n \ln{n} - n
\ln{\left(-\ln{p}\right)}$.
For example, for $n=100$ and $p=0.95$, this gives a mode of 460, which is
correct, and a 95th percentile of 758, which is pretty close to
the real value of 754.
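Those two approximations are easy to turn into code. A minimal sketch, with function names of my own choosing:

```python
import math


def gumbel_mode(n: int) -> float:
    """Approximate mode of the coupon collector distribution: n ln n."""
    return n * math.log(n)


def gumbel_percentile(n: int, p: float) -> float:
    """Approximate number of requests k such that P(done within k) is about p."""
    return n * math.log(n) - n * math.log(-math.log(p))


if __name__ == "__main__":
    print(f"mode   ~ {gumbel_mode(100):.1f}")
    print(f"median ~ {gumbel_percentile(100, 0.5):.1f}")
    print(f"95th   ~ {gumbel_percentile(100, 0.95):.1f}")
```

Note that `p` has to be strictly between 0 and 1, since the formula takes the logarithm of $-\ln{p}$.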
In this example, we could be very sure (~99.57%) that if we were to send
about a thousand requests, we’d have connected to all of our backends.
So, the next time you need to work out how many requests it takes to warm up
a cache or fill a connection pool, you could use this to work out some
magic numbers. But there’s an important caveat for this specific example:
all of the above relies on our initial assumption that picking a backend is
uniformly random.
In our non-spherical-cow world, load balancers don’t pick backends at
random, and instead use a round-robin or similar strategy that
would most likely end up actively picking the unconnected backends,
significantly reducing the number of requests we actually need to connect to
everything.
Maths gives us an interesting, if pessimistic, estimate, but it also suggests
why the pragmatic solution was a better idea: sometimes it’s just easier to
watch your latency metrics until they flatten out, and let the maths happen
in the background.
Perhaps you’re wondering whether our hypothetical frontend
could just pre-connect to all possible backend tasks upon startup? Sure,
it could, but let’s pretend that there are good reasons that it doesn’t.
In any case, the same principles here can also apply to other areas like
cache warming, for example, so let’s just go with the maths for now. ↩
We can actually compute a closed-form approximation of the
expected value by way of harmonic numbers. It
turns out that the exact value we want is $n H_n$, where $H_n$ is the
nth harmonic number. We can approximate $H_n$ as $\ln{n} +
\gamma + \frac{1}{2n}$, where $\gamma$ is Euler’s
constant, approximately $0.577$, and so
multiplying that approximation of $H_n$ by $n$ produces a reasonable
approximation of the expected value for the coupon collector’s problem. ↩
This last weekend, I spent a little time noodling around with the static
site generator that I wrote to generate this website. One thing I remembered
was that a few of its tests were slightly flaky, and this time I really
wanted to dig into what was going on.
One such flaky test did the following:
1. Get the current time, as start.
2. Do some processing that should update a file.
3. Get the current time again, as end.
4. Expand the range of start and end so that they fall
   on second boundaries, because “some filesystems can only store
   modification times to the second”.
5. Check that the target file’s modification time falls within
   [start, end).
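In Python, the shape of that test is roughly as follows. This is a simplified sketch rather than the real test: the function names are hypothetical, and a plain file write stands in for the actual processing.

```python
import os
import tempfile
import time

NS_PER_SECOND = 1_000_000_000


def expand_to_seconds(start_ns: int, end_ns: int) -> tuple:
    """Widen [start, end) so that both ends fall on whole-second boundaries."""
    lower = start_ns - start_ns % NS_PER_SECOND              # round down
    upper = end_ns - end_ns % NS_PER_SECOND + NS_PER_SECOND  # next second, exclusive
    return lower, upper


def mtime_within_window(path: str) -> bool:
    """Write to path, then check its mtime against the wall-clock window."""
    start_ns = time.time_ns()                   # get the current time, as start
    with open(path, "w") as f:                  # stand-in for the real processing
        f.write("updated\n")
    end_ns = time.time_ns()                     # get the current time again, as end
    lower, upper = expand_to_seconds(start_ns, end_ns)
    return lower <= os.stat(path).st_mtime_ns < upper


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        print(mtime_within_window(os.path.join(directory, "output.html")))
```

Most of the time this returns `True`, which is exactly what makes the failure so easy to miss: it only goes wrong when the start time lands just after a second boundary.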
Perhaps you already know where this is going, but while the test above
might work on a strictly POSIX-compliant system (more on that later), it
doesn’t work in practice on Linux, occasionally failing because the
modification time we see in step 2 is slightly before the start
time we obtain in step 1, albeit just by a few milliseconds!
Exactly why this happens is down to how Linux updates file modification
times when a file is written to.
I’m going to dig into this below, but to avoid making this any longer, I’m
also going to limit myself a bit, and make some simplifying assumptions:
I’m going to assume the real-time clock is monotonic. It isn’t, but it’s
trivially true that time can behave weirdly when the clock skips
backwards, so I don’t think it’s that interesting to discuss further here.
I’m going to assume that the operating system doesn’t crash (or the
machine lose power) at any point. There’s a lot of complexity around the
durability of filesystem metadata and data, and what we can (or can’t)
rely on being visible following a crash.
I’m going to ignore mmap(). It’s fairly well-known that writes made via
memory-mapped files are underspecified as to when file times are updated.
I’m pretty much going to ignore access times and focus only on
modification times. Access times also have a load of extra options in
practice on Linux (filesystems are often mounted with relatime now,
etc).
First, a little history
Let’s start our story back in 1979.
While earlier Unixen had creation and modification times, it was V7 Unix
that introduced the “atime”, “mtime”, and “ctime” names that
later made their way into the POSIX standard. The original[1]
POSIX standard defined these as:
time_t st_atime Time of last access.
time_t st_mtime Time of last data modification.
time_t st_ctime Time of last status change.
I think the first two are pretty clear, but ctime is a little confusing:
from the “ctime” name, you might have assumed it was “creation time”, but as
you can see from above, it’s a “status change” time. Effectively, mtime is
the time that the file contents changed, while ctime is the time the file
metadata changed (and since the metadata includes the mtime, any update to
mtime must also update ctime).
While POSIX doesn’t include any concept of file creation time, some
operating systems and filesystems do. For example, ext2 does, and Linux has
a statx() system call that returns an extra btime field with a file’s
creation time (“btime” from “birth time”, following prior usage in BSD).
What POSIX says about how file times are updated is
actually quite interesting. Rather than just specifying that modification
times are updated when a file is written to, the various times are instead
“marked for update” whenever certain specific operations complete, so (e.g.)
a successful write() will mark mtime and ctime as to-be-updated.
Implementations are then free to actually run that update later on (“At an
update point in time, any marked fields shall be set to the current time”,
says POSIX), subject to certain operations that must trigger an update
(calling stat(), for instance).
All that seems okay for our test, though? While we might not see exactly
the right modification time, we should still see a time that sits within the
range we’re expecting? Except we don’t.
As far as I can tell, the reason for that is that Linux doesn’t strictly
follow POSIX here, though I must admit that it’s not completely clear: POSIX
does leave quite a lot of room for ambiguity anyway[2].
What times can filesystems actually store?
Before we talk about Linux in general, it’s probably worth talking about the
difference between granularity (what the filesystem can store) and
accuracy (how close the stored time is to real time), since that’s
something that’s confused me in the past.
Different filesystems support different granularities for the times that
they can store. To pick a few examples:
ext2/3/4: nowadays, almost always nanosecond resolution
NTFS: 100ns resolution (as it represents times in 100ns increments)
FAT: has a mixture of resolutions
To be extremely specific about ext2/3/4 for a moment: the on-disk
representation supports nanosecond resolution (and dates after
2038) if the filesystem was created with an inode size of 256 bytes
rather than 128 bytes (as shown in tune2fs -l output).
If you have an ext2/3/4 filesystem, it’s almost certainly using 256-byte
inodes already: they became the default in 2008 when creating a filesystem
of 512MiB or larger, and nowadays are used by default regardless of size.
Modern Linux kernels[3] support whatever the filesystem
supports.
Incidentally, this does make me wonder whether I actually had access to an
older filesystem when I was writing the test I mentioned at the start of
this post, or whether I’d just misunderstood why the times I was seeing in
my test didn’t match up with what I expected[4].
So if some filesystems only support certain granularities, what happens if
we try to set a finer-grained value ourselves? In this case, the kernel will
truncate the value to what the filesystem supports first (see
s_time_gran in the kernel source, which is how
each filesystem declares its granularity). This truncation (rather than
round-to-nearest, say) is also what POSIX requires.
We can see this behaviour easily by creating a file with a fixed
modification time on a few different filesystems.
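If you’d like to try this kind of experiment from Python, a minimal sketch might look like the following. The helper name is hypothetical; it sets a nanosecond-precision mtime with `os.utime()` and reads back whatever the filesystem actually stored.

```python
import os
import tempfile


def stored_mtime_ns(directory: str, requested_ns: int) -> int:
    """Create a file in directory, set its mtime, and return what was stored."""
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        # ns=(atime_ns, mtime_ns): ask for exactly this time, to the nanosecond.
        os.utime(path, ns=(requested_ns, requested_ns))
        return os.stat(path).st_mtime_ns
    finally:
        os.unlink(path)


if __name__ == "__main__":
    requested = 1_600_000_000_123_456_789
    with tempfile.TemporaryDirectory() as directory:
        stored = stored_mtime_ns(directory, requested)
    print(f"requested {requested}")
    print(f"stored    {stored} (lost {requested - stored} ns)")
```

Pointing `directory` at mount points backed by different filesystems should show how much of the requested value each one keeps.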
First, ext4. As mentioned above, this supports nanosecond granularity, so
the value we specify is used directly for the access and modification times
(while the current time is recorded for the status change and file creation
times):
If we explicitly create the ext4 filesystem with 128-byte inodes, we can see
that we only have times stored to seconds resolution (and that we no longer
have a creation time):
On FAT filesystems, access times use day granularity (the value above is
midnight UTC for the time I gave), modification times use two-second
granularity (FAT doesn’t store a separate ctime, so ctime is always equal to
mtime), while creation times use 10ms granularity.
Back to ext4: if we create a file with the current time, we can also see
the weird behaviour I mentioned right at the start:
The recorded times are 1.5 milliseconds before the time printed by the
preceding date command! What’s going on?
What’s the time, Mister Linux?
So what about accuracy? How does Linux choose what mtime value to write when
a file is updated, and why does it seem like time is going backwards here?
On the face of it, this seems like an odd question: why wouldn’t the kernel
just set the modification time equal to the current time every time a file
is written? I think that’s what most people (myself included) would have
assumed was happening.
The main reason (and probably the only reason, as far as Linux is concerned)
is efficiency: files are written to a lot, and recording a new modification
time on every write would potentially mean doing a lot of extra work, even
if everything stays in cache. Not only that, but merely fetching the current
time can be surprisingly expensive[5], especially on very old hardware
without access to CPU cycle counters.
Instead, Linux (up until 6.13) maintains a ‘coarse’ current time that
updates every timer ‘tick’ (usually once every 4ms or 10ms[6]), and then
all writes made (to any file) during that interval will record exactly the
same modification time, even if the filesystem could store a finer-grained
value.
This is absolutely the explanation for my test failures above: the
modification time I’m seeing is the time of the last timer tick, which is
before the start time that I captured before writing the file.
Having tested this explicitly, it looks like in practice the mtime I see is
indeed always older than the real current time by somewhere between 0–4ms
plus a constant (~700µs), which is exactly what we’d expect.
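A rough way to measure that lag from Python is sketched below. This isn’t the exact test I ran: the function name is hypothetical, and the small sleep is only there to stop every sample landing at the same point within a timer tick.

```python
import os
import tempfile
import time


def mtime_lags_ns(directory: str, samples: int = 20) -> list:
    """For each sample, write to a new file and return (time after write) - mtime."""
    lags = []
    for _ in range(samples):
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            os.write(fd, b"x")
            now_ns = time.time_ns()  # fine-grained clock, read just after the write
            lags.append(now_ns - os.fstat(fd).st_mtime_ns)
        finally:
            os.close(fd)
            os.unlink(path)
        time.sleep(0.0013)
    return lags


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        lags = mtime_lags_ns(directory)
    print(f"min {min(lags) / 1e6:.3f} ms, max {max(lags) / 1e6:.3f} ms")
```

The results will depend on the kernel version, its tick rate, and the filesystem backing `directory`.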
Is Linux actually being POSIX-compliant here? Not that it matters too much,
but I’d say not? While POSIX allows file time updates to be delayed, I don’t
see that it allows the time used during an update to be anything other than
the current time at the point the update is run[7], which is
the same time printed by date, etc.
In other words, POSIX seems to require that the recorded time is on-or-after
the actual time, and Linux records a time that’s on-or-before the actual
time.
Multigrain timestamps!
While I was researching this, I ran across a new feature called multigrain
timestamps. This isn’t present in the version of Debian that I’m currently
using (it was added to Linux 6.13, so it’ll be in Debian 14), but it does
change kernel behaviour in an interesting way.
The kernel documentation is pretty easy to read,
but in summary this feature watches to see if a file’s timestamp has been
observed (with stat(), e.g.) since its times were last updated, and if so,
the next write will use a fine-grained timestamp (i.e. the ‘real’ time),
if using a coarse-grained timestamp would make it look like the modification
time hadn’t changed.
(There’s a small extra wrinkle in that, to maintain ordering, the act of
using a fine-grained timestamp for any file also has to drag along the
current ‘coarse’ timestamp, otherwise a file that only needs a
coarse-grained timestamp could appear to have been updated before an
earlier update that needed a fine-grained timestamp.)
It seems like this was primarily intended for NFSv3 exports, which want to
use timestamps to see if a file has changed since they were last read, but I
think it should also help in any other cases where you have something
watching a file for changes.
(It won’t help fix my tests, though, since the decision about whether to use
a fine-grained or coarse-grained timestamp is made when the file is written,
and the first write to a file will — I assume — always use a
coarse-grained timestamp.)
Fixing (‘fixing’) my tests
So how should I fix my flaky tests?
I could have fixed them by replacing “Get the current time” with “Create a
temporary file on the same filesystem as the target file and read its mtime”
(or alternatively I’m pretty sure I could read CLOCK_REALTIME_COARSE
directly, albeit that might not be easy from Python).
If I were to do that, the three times would then be monotonically
increasing. In practice, this test takes much less than 4ms to run, so it’s
actually extremely likely that all three times would be exactly the same.
(This should not be that much of a surprise.)
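The temporary-file approach could be sketched like this in Python. The helper name is hypothetical, and it assumes the temporary file really does end up on the same filesystem as the target.

```python
import os
import tempfile


def filesystem_now_ns(directory: str) -> int:
    """Return the time the filesystem would stamp on a file created right now."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        return os.fstat(fd).st_mtime_ns
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        target = os.path.join(directory, "output.html")
        start = filesystem_now_ns(directory)
        with open(target, "w") as f:
            f.write("updated\n")
        end = filesystem_now_ns(directory)
        mtime = os.stat(target).st_mtime_ns
        print(start <= mtime <= end)
```

Because all three values come from the same clock, the comparison no longer mixes coarse and fine-grained times, though it does need a closed range since the three can be identical.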
But taking a step back, it’s also worth considering why I had this test in
the first place.
In this case, the functionality was inherited from an even older incarnation
of this tool that relied upon Subversion’s use-commit-times feature to
record when each post was last updated, and a desire to have the (HTML)
output file have a last-modified timestamp close to that of the (Markdown)
source file.
I don’t store posts in Subversion any more, so this whole feature wasn’t
really achieving much. And by far the most-sensible thing to do here is
simply to delete the code (and tests) that was playing around with mtime,
and just let the kernel pick a reasonable time by itself.
And so that’s how I fixed my flaky test.
References
In the interests of citing my sources, here’s a few more that I used while
researching this post.
The Linux source and mailing lists are a great resource for
finding out why things work the way they do. In particular, the following
were particularly useful:
Avery Pennarun’s mtime comparison considered
harmful, which I found while writing
this (and which covers a lot of the same ground; I wish I’d found it
earlier!)
I’m quoting from the older (2004) POSIX standard here for
simplicity: later versions replace the time_t fields with struct
timespec fields with slightly different names, and redefine the
existing members as macros. ↩
Among the other things that POSIX leaves unspecified:
whether file time updates must run for all files at once, or whether
it’s technically compliant for two files updated one after another to
end up with mtimes in the opposite order. (Nobody does that last one in
practice — it’d break tools like make that need to see if an input
file has changed relative to an output file — but I don’t see how
POSIX disallows it.) ↩
If you happen to be looking at the Linux source code (as
I was when writing this post), note that only fs/ext4 supports
nanosecond resolution and years past 2038. In contrast, fs/ext2 does
not: even though it can use ext2 filesystems with 256-byte inodes, it
won’t do anything with the additional data. However, fs/ext4 is almost
certainly what any modern kernel will be configured to use for ext2 (and
ext3) filesystems, as fs/ext2 is essentially just kept around for
reference purposes now. ↩
That test dates from 2013, when I would have
been running it on a small Debian 7 box under my desk (whereas now I’m
running on a small Debian 13 box under someone else’s cloud), and I
don’t think that’s old enough for the filesystem to have been created
with 128-byte inodes. It’s just about possible that I was referring to
an even-older Slackware box that I might still have been using at the
time. ↩
Applications reading the current time repeatedly is also why
nowadays userspace calls like clock_gettime() and gettimeofday()
usually don’t involve a kernel syscall (and so won’t appear in strace
output); see man vdso for a detailed explanation. In that case the
returned time is always correct, though. ↩
The kernel tick rate is determined by CONFIG_HZ, which on Debian is
set to 250, giving an update every 1s/250 = 4 milliseconds. ↩
I do see the (non-normative) POSIX
rationale, which says “The accuracy of the time
update values is intentionally left unspecified so that systems can
control the bandwidth of a possible covert channel”, which is a) not
what I would have guessed for the primary justification! but also b)
reads more to me as not setting an upper-bound on the amount of time
that can pass between a write and an (unforced) file time update, rather
than allowing an arbitrary (and otherwise unmentioned) jitter to be
introduced into update time values. ↩
Noda Time 3.0.0 came out yesterday[1], bringing a shiny
new parcel of date- and time-related functionality.
What’s new in 3.0? Firstly, there’s a couple of things in 3.0 that just
plain make it easier to use Noda Time:
Nullable reference types. The API now correctly uses the
nullable reference types introduced in C# 8.0 to document when a
method or property may accept or return a null value.
Nullability was previously noted in our documentation, but now (with
appropriate compiler support) you can opt-in to warnings that indicate
where you might be accidentally passing a null somewhere you shouldn’t.
A plethora of API improvements. For example, we now have a
YearMonth type that can represent a value like “May 2020”;
TzdbDateTimeZoneSource now provides explicit
dictionaries mapping between TZDB and Windows time zone IDs; and
DateAdjusters.AddPeriod() creates a date
adjuster that can be used to add a Period to dates, along with many
other improvements. As always, see the version history and API
changes page for full details.
A single library version. Previous versions of Noda Time were slightly
fragmented when it came to supporting different framework versions. For
example, Noda Time 1.x was specific to the .NET Framework, and later added
a Portable Class Library version that was missing a few key functions,
while Noda Time 2.x again provided a separate .NET Standard version that
differed slightly from the ‘full’ version. As of Noda Time 3.0, we have
just one library version, providing the same functionality on all
platforms.
Better support for other frameworks. Most core types are now annotated
with TypeConverter and XmlSchemaProvider attributes. Type
converters are used in various frameworks to convert one type into another
(typically, to or from a string) — for example, ASP.NET will use type
converters to convert query string parameters into typed values — while
the XML schema attributes make it possible to build an XML schema
programmatically for web services that make use of Noda Time types.
Performance
Although not as significant as the changes from Noda Time 1.x to 2.x,
performance is still a key concern for Noda Time.
In 3.0.0, we’ve managed to eke out a little more performance for some common
operations: finding the earlier of two LocalDate values now takes
somewhere between 40–60% of the time it did in Noda Time 2.x, while parsing
text strings as LocalTime and LocalDate values using common (ISO-like)
patterns should also be a little faster, taking around 90% of the time it
did in Noda Time 2.x.
Caveats
The change from Noda Time 2.x to 3.0 is not as big a change as the one
from Noda Time 1.x to 2.0, but there are still some small incompatibilities
to watch out for.
The migration document details everything that we’re aware of, but there
are two points worth calling out explicitly:
Noda Time 3.x has (slightly) greater system requirements than Noda Time
2.x. While Noda Time 2.x required either .NET Framework 4.5+ or .NET Core
1.0+, Noda Time 3.x requires “netstandard2.0”; that is, .NET Framework
4.7.2+ or .NET Core 2.0+.
.NET binary serialization is no longer supported. While .NET Core 2.0
added some support for binary serialization, binary serialization has
many known deficiencies, and other serialization frameworks are now
generally preferred. Accordingly, we have
removed support for binary serialization entirely from Noda Time 3.x.
Noda Time still natively supports .NET XML serialization for all core
types, and we also provide official libraries for serializing using JSON
(1, 2) and Google’s
protobuf.
In general, though, we expect that most projects using Noda Time 2.x should
be able to replace it with Noda Time 3.0.0 transparently.
Availability
You can get Noda Time 3.0.0 from the NuGet repository as usual
(core and testing packages), or from the
links on the Noda Time home page.
Note that the serialization packages were decoupled from the main release
during the 2.x releases, and so (for example) there is no new version of
NodaTime.Serialization.JsonNet; the current version of that
library will work just fine with Noda Time 3.0.0.
What’s next?
Good question. While Noda Time is fairly mature as a library, we do have a
few areas we’d like to explore for the future: making use of Span<T> in
text parsing, and providing a little more information from CLDR sources
(stable timezone IDs, for example). If you’re interested in helping out,
come and talk to us on the mailing list.
And once again, I’m going to copy/paste this to produce the official Noda Time blog post. (The evidence suggests that this is the only way I’ll get any content on my personal site, after all.) ↩
[Insert obligatory “well, it’s been a while since I’ve written anything for
this blog” paragraph here.]
With 2018 finally complete, I thought it might be fun to take a quick look
at the books I read last year.
Goodreads has a “reading challenge” each year wherein you can set a target
number of books to read. In 2016, I hit my target of 34 books, albeit only
by cramming both the SRE book and The Calendar of the Roman
Republic (long story) on the last day of that year. Buoyed by
success, I increased it to 38 books for 2017… and then got distracted by
life and fell a bit short.
So, for 2018, I kept the same target as for 2017, and tried to not get
distracted. A few weeks ago, I’d got a little bit ahead of that — woohoo
me! — and decided it might be fun to put together a short review of each.
So here are all the books I read in 2018, in (roughly) chronological order.
Robots vs. Fairies, various authors
Starting off 2018, an anthology of short stories: some about robots, and
some about fairies. Definitely mixed, with a few really good ones, and
a few that are… not so good (John Scalzi’s comes to mind as one of the
latter, surprisingly).
The Fifth Season (The Broken Earth, #1), N.K.
Jemisin
So, this I definitely liked. It has a great premise, post-apocalyptic —
or maybe just apocalyptic, given the intro — fantasy, good characters,
and good worldbuilding, it won the 2016 best novel Hugo, and yet… I
haven’t picked up the series again.
I’m not sure exactly why: perhaps because of the writing style (it’s
present tense, partly in second person), perhaps because I was irritated
by the way the writer withheld some key information the characters knew,
or perhaps because of the incomplete ending. It’s possible that the
sequels are brilliant, but I haven’t got around to finding out yet.
Possibly in 2019.
Dark State (Empire Games, #2), Charles
Stross
Continuing Stross’ reboot of the Merchant Princes series, a
multiple-alternate-timeline spy/techno-thriller. Stross groks politics and
economics (and technology), so this is actually a pretty good alt-history
analysis as well as being a lot of fun. (Although if we could stop heading
towards the dystopian timeline in real life, that’d be great, thanks.)
The Night Masquerade (Binti, #3), Nnedi
Okorafor
Quoting from the one Goodreads review I did write: I was looking forward
to this offering a conclusion to the series. Well, in some ways it does do
that, and in some — quite important — ways, it doesn’t. I think I’d
have been better just appreciating the great world-building here rather
than the plot.
Beneath the Sugar Sky (Wayward Children, #3),
Seanan McGuire
So, what if fairy tales were real? What happens when they’re over? That’s
the premise of this series — in much the same way as Stross’
Equoid asks what it might be like if unicorns
were real (spoilers: sharp horns, so blood, mostly).
This book is almost-standalone, with some of the children from earlier
books going on portal-hopping adventures of their own. I liked this one a
lot more than the second book in the series, which had a different focus,
and was a bit more serious. Also, I’ve just realised that book #4
(In an Absent Dream) is out next week!
The Fox’s Tower and Other Tales, Yoon Ha
Lee
A collection of flash fiction from Yoon Ha Lee, who’s also written some
excellently weird science fiction and interactive fiction. Like
Robots vs. Fairies above, I thought this was somewhat
hit-and-miss.
The stories I enjoyed more tended to be those heavy on imagery and light
on ‘plot’ (such plot as is possible with flash fiction), though The
Stone-Hearted Soldier was an excellent inclusion, and an exception
to that rule (but also one of the longer stories).
An Unkindness of Ghosts, Rivers Solomon
A dystopian space opera set around a study of oppression and segregation
aboard a generation spaceship. The protagonists are incredibly varied and
interesting characters, though the bad guys are unfortunately cardboard.
I remember this being something I wanted to keep reading (if challenging
in parts), but I can’t actually remember any of the plot at this point.
Minor issues notwithstanding, I definitely enjoyed this.
The Arcadia Project series, #1–3
(Borderline,
Phantom Pains,
Impostor Syndrome),
Mishell Baker
From one set of neuroatypical characters to another. No spaceships here,
but an urban fantasy/mystery that posits a link between fey and Hollywood
celebrity. The whole series is great: the characters are believable and
well-rounded (and self-sabotaging and dysfunctional). I was worried that I
wouldn’t be that interested in a Los Angeles movie-town setting, but the
characters and story won me over.
This series ties with Smoke and Iron (below) as my favourite
read of 2018. Recommended.
The Gone World, Tom Sweterlitsch
So, apparently I liked this enough to give it 4/5 on Goodreads, but I
can’t actually remember anything about it. It looks like it’s a time
travel/murder mystery/apocalypse story? Perhaps I should re-read it.
Sleeping Giants (Themis Files, #1), Sylvain
Neuvel
Told via the medium of interviews and news clippings, in the style of
World War Z, this is the story of how the discovery of a
giant robot hand plays out politically. There is some sci-fi here, but
mostly it’s the politics from Arrival that takes centre
stage.
This was alright, but again, I’ve not picked up the next in the series.
The journal/interview format makes it hard to get much in the way of
interaction between characters, and the story seemed more interested in
the politics than in the sci-fi/mystery aspect (which is fine, just not
what I was looking for).
The Red Rising series, #1–4
(Red Rising,
Golden Son,
Morning Star,
Iron Gold),
Pierce Brown
Dystopian sci-fi. The blurb says “Ender’s Game meets The Hunger Games”,
and I suppose that’s about right: the protagonist takes on the elite by
infiltrating them and subverting them from within, only this time we’re
talking about Mars, and later an entire solar system.
I enjoyed the first few books in the series, but somewhere around the
third or fourth I started to get a bit tired of the diffusion of the story
to uninteresting point-of-view characters, and also of the continuous
faux-Roman melodramatics.
The first book is definitely good by itself, and maybe I’ll pick the
series up again at some point.
Kindred, Octavia E. Butler
This is also sci-fi, or maybe fantasy[1], but is probably simpler to
think of as historical fiction. A modern progressive black woman is
transported to early 19th century Maryland, deep in the
antebellum American South.
With the caveat that “modern” here means the 1970s (the book being
published in 1979), this is a fascinating story — if deeply unsettling
at times — about how culture shapes behaviour, and how social
hierarchies and systems can be justified and propagated by those within
the system.
The Pliocene Exile / Galactic Milieu series
(The Many-Coloured Land,
The Golden Torc,
The Nonborn King,
The Adversary;
Intervention;
Jack the Bodiless,
Diamond Mask,
Magnificat),
Julian May
An easy re-read. Julian May’s epic galaxy- and time-spanning series starts
with a fantastic premise: as Earth has joined a galactic federation of
sorts, and as humanity has begun to evolve psionic powers, a misfit group
of disaffected/adventurous travellers escapes into exile via a one-way
time wormhole that deposits them in France, in the Pliocene epoch, 6
million years ago[2].
Without spoiling too much, the story shifts very quickly from science
fiction to something closer to high fantasy (for the first series, at
least; the second is in a more contemporary time period, and is more
‘regular’ sci-fi). Weaving mythology and an epic story, this is well
worth the time to read.
A Rag, a Bone and a Hank of Hair, Nicholas
Fisk
This YA dystopia was fun to read when I was a lot younger (it was
published in 1980; I probably read it sometime in the mid-1980s, along
with a lot of other Nicholas Fisk), but it hasn’t really held up that
well. The motivation behind the plot falls apart a bit on any analysis,
and some of the technology is a bit dated now (explanations about
miniature tape recorders, that kind of thing).
However, I do still like how the protagonist learns to interact with the
other characters (both modern, and not-so-modern), and how their attitude
changes over the course of the story, and I do still appreciate the swerve
away from hard sci-fi that happens partway through. It’s flawed, but it’s
still a classic.
The Lady Astronaut series, #1–2, plus the initial novelette
(The Lady Astronaut of Mars,
The Calculating Stars,
The Fated Sky),
Mary Robinette Kowal
Alt-history in which the author bootstraps the space race a decade early
via a meteorite-shaped forcing function. Post-steampunk, but
pre-electronic-computer; the author describes it as “punchcard punk”.
This is Hidden Figures meets Apollo 13, with a
strong focus on the racial and gender discrimination of the
1950s[3].
(The novelette was published first — winning the 2014 Hugo for best
novelette — but is set some thirty or so years after the novels. I read
it first, but you could easily read it after: it’s not directly connected
to the novels.)
The novels suffer very slightly from telling two separate stories: one is
a humanity-against-the-elements story (Apollo 13 or The
Martian), while the other is a documentary about 1950s cultural
attitudes. Both are interesting stories, but I found it a little
frustrating when the story would focus tightly on the protagonist to the
exclusion of the wider global impact (pun most definitely intended).
However, overall this is definitely worth reading.
The Labyrinth Index (Laundry Files, #9),
Charles Stross
Well, we’re past the Lovecraftian singularity at this point, and it’s all
about surviving while the transhumans play. One of whom happens to be
inhabiting the Prime Minister at present, and who has opinions about
foreign policy.
Mhari, who we met in her current incarnation in The Rhesus
Chart a while back, is presently attempting
to stay alive while said elder god is playing eleven-dimensional chess
nearby. Meanwhile, the US appears to have collectively forgotten that
the executive branch exists…
I liked this a lot. Mhari was interesting without being annoying, as I
worried she might be (she was in some of the earlier books; deliberately
so in order to annoy Bob, I think). Otherwise, this was pretty much
exactly as I expected at this point in the series: a lot of fun.
Revenant Gun (The Machineries of Empire, #3),
Yoon Ha Lee
Yoon Ha Lee’s conclusion to the series about a 400-year-old immortal general and crazy
magic that works because of a shared consensual reality. It’s military
sci-fi, kinda?
I can’t really discuss this without spoilers, but while it did more
hand-holding than earlier books in the series, it still featured a lot of
creative worldbuilding.
Lies Sleeping (Rivers of London, #7), Ben
Aaronovitch
Like The Labyrinth Index above, by the time you get this far
into a series, you pretty much know what to expect: in this case, a fun
police procedural with magic and geeky in-jokes.
However, I did find it a bit hard to follow what was going on with the
plot here, which seemed to be both a bit muddled and to reach back over
the whole of the series. (I’ve also not read the associated graphic
novels, which might have helped, though they’re not supposed to be
necessary prerequisites.)
A Canticle For Leibowitz, Walter M. Miller
Jr.
A classic (1959) post-apocalyptic sci-fi tale published during a high
point in Cold War tensions. In the far aftermath of nuclear war, society
struggles to drag itself out of a new dark age, and to rediscover and
protect old knowledge. This is three distinct stories — originally
published as such — separated by time (centuries), and vaguely connected
by place.
This unapologetically puts forward a Christian (specifically, Catholic)
viewpoint, with the church to some extent a main character. It has some
ironic humour, but also serious comment about ethics and human nature.
With one exception near the end, I didn’t find it to be too preachy.
It made a big impact at the time, but is it actually a good story
nowadays? Well, meh. I found it thought-provoking (and somewhat
depressing) in turns, but I can’t actually say that the story exists as
much more than a framework for the author’s viewpoints. Largely
unsatisfying, and probably more important for the historical context now.
Smoke and Iron (The Great Library, #4), Rachel
Caine
Okay, this is just brilliant. Along with The Arcadia Project series
(above), this was easily one of my favourite reads of 2018.
So, why? Well, it’s got good worldbuilding, a fast-paced (and fun) plot,
great characters and character development, and good writing.
The plot itself starts immediately after Ash and
Quill, so talking about the plot directly would spoil
the earlier books. In general, though, this series is a YA
alt-history/fantasy in which the Great Library (of Alexandria) has become
a ruthless worldwide power, tightly controlling both the dissemination of
information and also the source for some of the magic/alchemy that’s
available in this world.
On the writing: one section in particular has the viewpoint character
magically hypnotised into believing that they’re someone else, and the
author shifts the (tight third-person) text to match that impersonated
character, having the viewpoint character not just act as another, but
having the prose notice (and the character comment on internally) an
entirely different set of things appropriate for the character they were
impersonating. Subtle, but I liked it.
I could have become a mass murderer after I hacked my governor module, but
then I realized I could access the combined feed of entertainment channels
carried on the company satellites. It had been well over 35,000 hours or
so since then, with still not much murdering, but probably, I don’t know,
a little under 35,000 hours of movies, serials, books, plays, and music
consumed. As a heartless killing machine, I was a terrible failure.
The opening lines of All Systems Red
Murderbot is a fairly apathetic and introverted humanform security droid
that just wants to be left alone to watch sci-fi soap operas, but stupid
humans keep doing stupid things that stop it from doing so, or worse, are
trying to interact with it rather than let it stand in a corner by itself
(to watch soap operas again, probably).
This is a series of four novellas written with Murderbot narrating, and
it’s delightful. They are short, so each has a fairly straightforward
plot, but it’s great fun nonetheless.
Ra, Sam Hughes
On the one hand, Ra is excellent: it’s a hard sci-fi novel (novella?)
with some really well thought-through worldbuilding. To some extent, it
puts me in mind of Snow Crash. (It also has some
really nice in-jokes, which I don’t think I can reference without being
spoilery.)
It was published in chapters on Sam Hughes’ blog (at
qntm.org/ra, where you can read it for free), and
there are also a few EPUB versions, some of which you can choose to pay
for.
So as a self-published story, it’s really rather good. Unfortunately, on
the other hand, I think it could also do with some quite significant
editing, as there seem to be two almost completely different stories here,
and while they’re linked, the story switches at one point from something
grounded (like Snow Crash) to something incomprehensible by
Greg Egan, and while both are good, I don’t think they fit well together.
To sum up: I managed to read 40 books last year, almost all of which were
fiction, mostly urban fantasy and sci-fi, to nobody’s surprise. (I also
started and failed to finish a bunch of non-fiction books.)
I think I did a better job of picking books with diverse protagonists this
time round, and while most of the books I read were published in the last
few years (40% were published in 2018), I managed to also seek out a few
older ones (Kindred, for example, I’m really glad I got round
to reading).
Onward to 2019!
[1] I’d have called it sci-fi purely because it has time-travel, but I ran across an interview with Butler in which she points out, “Kindred is fantasy. I mean literally, it is fantasy. There’s no science in Kindred.” She has a point.
[2] … though from what I can tell, 6 Ma is squarely in the Miocene epoch, not the Pliocene. In A Pliocene Companion, Word of God resolves this by stating that, in-universe, the Pliocene is considered to start around 11 Ma (not 5.6 or 5.33 Ma, as in our reality).
[3] And to a large extent, discrimination that’s still present today: there’s a line where our heroine says that “people would ignore what I said until [my husband] repeated it”, which sounds familiar enough.