Peer-to-peer ephemeral public communities

Aether Dev.13 released: up to 30x faster updates, live-updating ‘firehose’ feed

After a bit of quiet, a new release of Aether has been cut. The app should update itself on Windows and Mac; on Linux, back up your user profile and then download the new Snap to get this one.

This release comes with two major features, so I wanted to highlight them a little.

The new graph compiler

Context

As a quick recap, Aether is a P2P network that distributes a content graph. This distribution happens on a flood basis, so every node has every piece of content. (Upcoming feature: users can choose what to broadcast out)

We manage to keep the network at a manageable size for two reasons: a) all content on the network is text, and b) Aether is an ephemeral network: content that is too old is deleted from all nodes according to a deletion policy. Even if too-old content is kept available somewhere, the other nodes won’t ask for it, so its propagation dies off.

This is all raw data, though. While there is some nifty engineering in syncing content efficiently on a flood basis (we do some opportunistic caching and pre-baked responses), this is ultimately a few [thousand | hundred thousand] rows in a SQL table.

And it turns out, the derivation step that produces user profiles, votes (and vote counts!), moderation, elections and all the other niceties of Aether from those raw SQL rows is fairly complex. Complex enough that it is a little too slow to do in real time: if you opened a thread with 50 posts and rendered it live, you would need to fetch the community entity, the thread entity, the 50 posts, every vote those 50 posts ever received, and the user entities of everything mentioned above.

As you can imagine, even a relatively small thread like this fans out into a rapidly multiplying number of SQL queries if it has to be rendered at runtime: one query for the community, one for the thread, 50 for the posts, 50 more for their votes, and dozens more for the users involved.

What Aether does instead is run a frontend graph compiler. This compiler is incremental: it only compiles content as it comes in, and saves the result into a separate, compiled key-value store. This is the main store the user interacts with, and since these entities are ‘baked in’ per user settings, the calls are instant. Every time we receive new raw content from the network, we incrementally patch this ‘view of the universe’. This sidesteps the problem of being unable to render the whole network at runtime.

This is also what allows users to ultimately choose their own moderators. Every user is in full control of their own frontend graph: they can disable specific moderators they don’t like, or add ones they happen to like. Since this whole graph is compiled on the frontend, these kinds of changes can be taken into account at compile time, which happens about every 2 minutes.

Updates

Since we now have a decently sized corpus of content, frontend compile times have been increasing. They started at a few seconds when Aether first launched, and as of last month reached ~90 seconds on my computer, which isn’t great! Especially considering that the app attempts a refresh every 2 minutes, that means a lot of time spent in the ‘Refreshing…’ state. It does not prevent users from interacting with the app, nor does it stress the CPU or RAM, but it does create some disk use and general weirdness that manifests as latency spikes.

I have spent the past month or so trying a few approaches to bring this back to a reasonable duration (a few seconds for subsequent compiles). In other words, we want this time to scale with the number of new entries coming in, not with the total size of the database.

Results

Here’s what’s shipping with this update:

Our first-ever compile has improved 1.9x, and our subsequent compiles have improved a whopping 30.6x. 🙂

But how?

The frontend compiler was, up until now, ‘good enough’: it had not seen much attention in terms of speed or optimisation. This was the first time someone had the chance to dive in and figure out what could be better. In short, a bunch of low-hanging fruit!

(Heads up, the two below are largely about software engineering. Feel free to skip if that’s not your cup of tea.)

Optimisation 1

This was fairly straightforward: I ended up batching the compilation result writes into blocks, so that they happened as part of the same transaction. This is actually fairly standard, but in Aether it was a little more complex than usual because of Go’s concurrency. We use a lot of goroutines (lightweight threads that Go schedules across all available cores), and this is great because it fully utilises your CPU: even if you have a gigantic AMD Threadripper with an ungodly number of cores, Aether can still saturate it if needed.

However, that also meant that every goroutine, when done, would commit its own results to the key-value store. This kept things simple, at the cost of writing many tiny pieces of data to disk in a way that created disk contention.

In the end, we profiled the parts of the compiler that suffered most from this disk-write bottleneck, and I changed the code to buffer these writes in memory (the tradeoff being RAM) and flush them to disk together, in a thread-safe way. Doing this removed a major disk bottleneck at the cost of some increase in implementation complexity.

This is what bought us the drop in the initial compile from 92s to 47s.

Optimisation 2

The best way to make a program faster is to make it do less, and avoid work as much as humanly possible.

This second trick is effectively that. For every patch (a raw set of updates coming from the backend), the frontend now determines what I call, borrowing a term from physics, the observable universe: everything that could possibly be affected by the incoming set of changes.

The nice thing about having that map of everything that could possibly be involved is that it lets us shrink the scope the graph compiler operates on: instead of working on the whole graph, it can now operate only on the parts of the graph that have a nonzero chance of having been affected. This leads to a massive reduction in the amount of work needed, and it is how we got the subsequent compile time from 92s down to 3s.

In short, because we can now definitively prove that certain pieces of work are unnecessary, we can skip them and do nothing, and that is why this is so fast. Anything we cannot prove is unaffected, we still compile. We might be able to push this idea of the observable universe ever closer to its smallest possible form, but it has already provided massive gains even in the v1 that is shipping now.

In the end, though, this is a tricky one to assess the correctness of. We’ve been testing it for the past week or so, and it seems to be functional so far. Nevertheless, lots of things affect each other in surprising ways, and it is possible that this observable-universe determination fails to follow certain interactions, at least in its first iteration. Ultimately this is something we need to see in the real world, so it’s probably best that it is out for real user feedback now.

The ‘New’ feed

This has been one of the more requested features recently, so I’m happy to say it’s now in. It’s pretty much what it says on the tin: a feed of the most recent interactions on the network. Since the network is not huge yet, this feed still advances slowly enough to make sense, and it’s been fun watching it advance for the day or two I’ve been testing it. This is basically ‘The Firehose’.

This feed is also useful for people who want to build on Aether, since other apps can consume it as raw input.

It is also live-updating, meaning you will see new posts come in as they arrive, by the minute.

By the way, the feed will start collecting new content from the point you install the update. If you’d like it filled sooner, go to your frontend config folder (the exact location is shown in the app settings), fully shut down the app, and delete KvStore.kv. After reopening the app, it should build the new feed from scratch.

Other bug fixes

We also have a smaller set of bug fixes that don’t merit mentions of their own; they are largely cosmetic.

Availability

We’ve been testing this release for the past week or so, but the changes are large enough that we need to see them on the live network. The update will be pushed to the auto-update system about a day after this blog post: if all goes right, by the time you’re reading this, it should already be available.

This is it for now. Cheers!