Common Flows

Mim connections are stateless on the server side, so this connection 'lifecycle' is entirely under the control of the asking node. Your application is not required to follow this lifecycle. This is simply documentation of how Aether, the Mim reference client, does it. If you have different priorities and concerns, your lifecycle should be different too.

The local node is called L. The remote node is called R.

Bootstrap

Bootstrap happens when a node joins the network for the first time.

Bootstrapping is an expensive process for the entire network in terms of bandwidth usage. If you are a node with significant numbers of people bootstrapping off of you, you will be disproportionately affected. To prevent that, the bootstrapping process is designed to be gentle on any one node. Instead, it tries to spread the load across the network more evenly.

The bootstrap process does not give the new user the entire history of the network, because that would be too taxing for everyone in the network. By default, it provides only the last 7 days, with a minimum of 1000 objects and a maximum of 10,000 (for more information on the limits, see Standard Results Policy).
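One way to read these limits is sketched below. This is only an illustration and not the Standard Results Policy itself: the Entity type, its last_update field, and the rule for combining the three limits (reach further back when the 7-day window holds too few objects, then cap at the maximum) are all assumptions.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class Entity:
    # Hypothetical minimal object: a fingerprint and a Unix timestamp in seconds.
    fingerprint: str
    last_update: int


def select_bootstrap_window(entities, now, days=7, min_count=1000, max_count=10_000):
    """Return the objects a node might hand out, newest first (assumed reading)."""
    newest_first = sorted(entities, key=lambda e: e.last_update, reverse=True)
    cutoff = now - days * SECONDS_PER_DAY
    in_window = [e for e in newest_first if e.last_update >= cutoff]
    if len(in_window) < min_count:
        # Too few recent objects: reach further back until the minimum is met.
        in_window = newest_first[:min_count]
    # Never hand out more than the maximum; keep the newest objects.
    return in_window[:max_count]
```

With the defaults, a quiet network still yields up to 1000 objects even if they are older than 7 days, and a busy one is cut off at 10,000.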

The user gives L the Address of R1 as the bootstrapper, or a bootstrap Address is hardcoded into the installed client.

1) L does a POST on the status endpoint of R1 to determine if the node is available. If it receives an HTTP 200 and the node header, it saves the header as an Address and continues forward. If it receives an HTTP 429 (Too Many Requests, meaning the node is too busy), it waits 120 seconds before trying again.

2) L does a POST request with no filters on the Addresses endpoint of R1. This gives L all the Addresses R1 is willing to give to L within the Standard Results Policy.

If the POST request fails (e.g. because R1 is a static node), L does a GET request to the same endpoint.

3) L connects to R2 ... R8, the nodes it has received from the bootstrap node, and asks for the latest caches of the entities in the network for the last day, by doing a GET request with a cache filter (cache=0).

Each node is asked for one object type. For example, R2 is asked for the latest cache of boards, R3 for the latest cache of threads, and so on.

4) L connects to R9 ... R15 and asks for the latest caches of entities for the day before the last day (cache=1).

5) L repeats this with different nodes until it has 7 days of caches for all endpoints (cache=7). When L is done, the bootstrap is complete.

If there are fewer online addresses than caches that need to be requested, the process wraps around and R1 gets asked a second time.
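The assignment of cache requests to nodes in steps 3 to 5, including the wrap-around, can be sketched roughly as follows. The function name and the shape of the plan are hypothetical, and this is not Aether's actual code. The endpoint list and the cache indexes are left to the caller, since only boards and threads are named here.

```python
from itertools import product


def plan_bootstrap_requests(nodes, endpoints, cache_indexes):
    """Assign one (endpoint, cache index) request to each node in turn.

    nodes[0] is the bootstrap node R1. Assignment starts at nodes[1] (R2)
    and wraps around to R1 when the online nodes run out.
    """
    if not nodes:
        raise ValueError("at least the bootstrap node is required")
    plan = []
    # product() iterates endpoints fastest, so one day is fully assigned
    # before the plan moves on to the day before it.
    for i, (cache, endpoint) in enumerate(product(cache_indexes, endpoints)):
        node = nodes[(i + 1) % len(nodes)]
        plan.append((node, endpoint, cache))
    return plan
```

For example, with nodes R1 to R3, the two endpoints boards and threads, and cache indexes 0 and 1, R2 and R3 cover the last day, and the plan then wraps around so that R1 is asked for the boards cache of the day before.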

Introduction

Introduction happens when a node encounters a node that it has not encountered before.

1) L does a POST on the status endpoint of R to determine if the node is available. If it receives an HTTP 200 and the node header, it saves the header as an Address and continues forward. If it receives an HTTP 429 (Too Many Requests, meaning the node is too busy), it waits 120 seconds before trying again.

2) L requests R's index via a GET request.

3) L receives the last day's cache, paginated. It goes through the pages, and each cache points to the cache of the day before. L goes through all of the caches and their pages, by default until it has covered 7 days, or more if the end user has instructed the node to do so.

4) L saves the timestamp of the index of the most recent cache. This is the point L has synced up to. For L, all 7 endpoints of R now have the same timestamp.

5) L does a POST request to the index with the timestamp filter. Because this is a POST request, it does not hit the cache this time, and the result is the index from the timestamp up to the present.

6) L saves the new timestamp. L now knows of all posts that R has, from the oldest R is willing to give up to the present.

7) L goes through the list of fingerprints and creates a list of the fingerprints of objects it does not have and wants. The fingerprints are provided with just enough information for L to decide whether it wants each object.

8) For the things that L does not have, L hits the appropriate endpoints with POST requests, filtered by fingerprints. These endpoints will return information that L has requested.
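The cache walk in step 3 and the fingerprint selection in steps 7 and 8 can be sketched as below. Everything about the data shapes is an assumption made for illustration: IndexEntry, the "pages" and "previous" keys of a cache, and the fetch_cache callable that stands in for the GET requests are hypothetical and do not describe the Mim wire format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    # Hypothetical index row: a fingerprint plus the endpoint that serves it.
    fingerprint: str
    entity_type: str


def walk_caches(fetch_cache, newest_id, max_days=7):
    """Collect index entries from the newest cache back through older ones."""
    entries = []
    cache_id = newest_id
    for _ in range(max_days):
        if cache_id is None:
            break  # R has no older cache to offer
        cache = fetch_cache(cache_id)
        for page in cache["pages"]:
            entries.extend(page)
        cache_id = cache["previous"]  # link to the day before's cache
    return entries


def plan_fetches(index_entries, have, wanted=lambda entry: True):
    """Group the fingerprints L lacks and wants by the endpoint to POST to."""
    by_endpoint = defaultdict(list)
    seen = set()
    for entry in index_entries:
        if entry.fingerprint in have or entry.fingerprint in seen:
            continue  # already stored locally, or already queued
        if not wanted(entry):
            continue  # L decided it does not want this object
        seen.add(entry.fingerprint)
        by_endpoint[entry.entity_type].append(entry.fingerprint)
    return dict(by_endpoint)
```

Each key of the returned dictionary would correspond to one POST request with a fingerprints filter. The wanted callback is where a client with different priorities could apply its own selection rules.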

Sync

Sync happens when a node encounters a node that it has already encountered and completed an Introduction with.

1) L does a POST on the status endpoint of R to determine if the node is available. If it receives an HTTP 200 and the node header, it saves the header as an Address and continues forward. If it receives an HTTP 429 (Too Many Requests, meaning the node is too busy), it waits 120 seconds before trying again.

This is important because L needs to know R's Node Id to be able to fetch the timestamps associated with that Node Id from its own database.

2) L does a POST request on R's index endpoint with a timestamp filter.

3) R returns all new and updated entities since the timestamp.

4) L determines which fingerprints it wants to request from R. L goes to the appropriate endpoints, and requests the objects via a POST request with a fingerprints filter.
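A sync pass might look roughly like the sketch below. The network calls are passed in as callables so the control flow can be read on its own. The names post_status and post_index, the "node_id" key of the node header, the retry limit, and the saving of a new timestamp after the request (borrowed from step 6 of Introduction) are all assumptions. Only the 120 second wait on HTTP 429 comes from the text above.

```python
import time

RETRY_WAIT_SECONDS = 120  # wait named in the text for an HTTP 429


def check_status(post_status, sleep=time.sleep, max_attempts=3):
    """Return R's node header once R answers HTTP 200, else None.

    post_status is a hypothetical callable returning (status_code, header).
    max_attempts is an assumed limit; the text does not name one.
    """
    for attempt in range(max_attempts):
        status_code, node_header = post_status()
        if status_code == 200 and node_header is not None:
            return node_header
        if status_code != 429:
            return None  # unavailable for a reason other than being busy
        if attempt < max_attempts - 1:
            sleep(RETRY_WAIT_SECONDS)
    return None


def sync(post_status, post_index, last_synced, sleep=time.sleep):
    """Run one sync pass; return new index entries, or None if R is unavailable.

    last_synced maps a Node Id to the timestamp L has synced up to.
    post_index(since) is a hypothetical callable for the POST request with a
    timestamp filter; it is assumed to return (entries, new_timestamp).
    """
    header = check_status(post_status, sleep=sleep)
    if header is None:
        return None
    node_id = header["node_id"]  # assumed field name in the node header
    since = last_synced.get(node_id, 0)
    entries, new_timestamp = post_index(since)
    last_synced[node_id] = new_timestamp
    return entries
```

Passing sleep as a parameter keeps the 120 second wait out of tests. A real client would likely also persist last_synced in its database rather than in a dictionary.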