Relayer Operational Overview
The Relayer stack is fast, respects your workflow, and keeps you in control
To meet almost any use case and performance target, the relayer stack is built around parallelization and scale. A consequence of this approach is that the configurable parameters directly determine the hardware and network capacity needed to reach a given performance level.
Because the performance expected by a home user and by an enterprise customer differs so widely, a single broad recommended configuration is difficult to give. A typical (asymmetrical) home internet line, which may offer only 30-50 Mbit/s of upload, is easy to saturate on relatively trivial hardware. Multi-user, SME-level use is therefore assumed to require a fiber connection, typically 500+ Mbit/s, to provide acceptable performance. With a reasonable erasure code (EC) setting such as a 30:60 data-to-parity scheme (30 data shards plus 60 parity shards, i.e. 3x redundancy), a 1 Gbit/s line can be saturated with an 8-12 core CPU and 16 GB of RAM; the same EC setting can reach 2.4 Gbit/s with a 32-core CPU and 64 GB of RAM.
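As a rough illustration (not drawn from the relayer codebase), the following Go sketch shows how the EC ratio translates into upload amplification and effective user-data throughput; the shard counts mirror the 30:60 example above, and the line rate is an assumed figure.

```go
package main

import "fmt"

// Illustrative only: how erasure-code parameters translate into upload
// amplification. The values mirror the 30 data + 60 parity example above.
func main() {
	dataShards := 30.0   // data shards per EC group (assumed example value)
	parityShards := 60.0 // parity shards per EC group (assumed example value)

	// Redundancy: total bytes sent to the network per byte of user data.
	redundancy := (dataShards + parityShards) / dataShards // 3.0x

	// With a fully saturated 1 Gbit/s upstream, the rate at which user data
	// actually leaves the cache is the line rate divided by the redundancy.
	lineRateGbit := 1.0
	fmt.Printf("redundancy: %.1fx\n", redundancy)
	fmt.Printf("effective user-data rate: %.2f Gbit/s\n", lineRateGbit/redundancy)
}
```

In other words, at 3x redundancy roughly a third of the raw upstream line rate is available as effective user-data throughput, which is why the EC setting and the link speed have to be considered together.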
In operation, the relayer stack receives uploaded data and initially places it in a local cache. EC rows are taken from this cache and transformed into shards before going out to the network. The local cache also lets a dataset be transferred to the relayer faster than it can be uploaded out to the network, so workflows can offload a set quickly and let it upload (or drain) over a longer period, perhaps overnight. In the future, the cache will also allow frequently accessed files to be kept locally, improving performance and reducing bandwidth when they are accessed.
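A minimal sketch of this cache-then-drain pattern is shown below; the type and identifier names are hypothetical and only illustrate the shape of the flow, not the relayer's actual API.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical object handed to the cache; names are illustrative only.
type object struct {
	name string
	size int
}

func main() {
	cache := make(chan object, 1024) // local cache acting as an upload queue

	// Background drainer: pulls objects from the cache and pushes their
	// shards out to the network at whatever rate the upstream link allows.
	done := make(chan struct{})
	go func() {
		for obj := range cache {
			time.Sleep(10 * time.Millisecond) // stands in for EC + network upload
			fmt.Printf("drained %s (%d bytes)\n", obj.name, obj.size)
		}
		close(done)
	}()

	// Ingest: objects are handed to the cache at local speed and the call
	// returns immediately; the drain continues in the background.
	for i := 0; i < 5; i++ {
		cache <- object{name: fmt.Sprintf("file-%d", i), size: 1 << 20}
	}
	close(cache)
	<-done
}
```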
To allow the relayer to access and serve the data passing through it, a PostgreSQL database is used. This database stores shard maps, per-user metadata and credentials, and bucket metadata. Small files (less than 4 MB) are packed together into 4 MB sectors, as they are otherwise too small to run erasure code transforms on directly. Files under 256 B cannot be packed efficiently and are instead kept in the local database.
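The size-based routing described above can be sketched as follows; the thresholds come from the text, while the constant names and placement labels are assumptions made for the example.

```go
package main

import "fmt"

// Size thresholds from the description above; names are illustrative only.
const (
	inlineLimit = 256     // bytes: small enough to keep in the local database
	sectorSize  = 4 << 20 // 4 MB packed sectors for small files
)

// placement reports where an object of the given size would be stored.
func placement(size int) string {
	switch {
	case size < inlineLimit:
		return "inline in the local PostgreSQL database"
	case size < sectorSize:
		return "packed into a 4 MB sector with other small files"
	default:
		return "erasure-coded directly into shards"
	}
}

func main() {
	for _, size := range []int{100, 64 << 10, 32 << 20} {
		fmt.Printf("%8d bytes -> %s\n", size, placement(size))
	}
}
```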
In many ways the relayer architecture resembles the original Renter-Host Protocol, but with important improvements. As in that protocol, the renter (here, the relayer) produces EC shards from data objects, uploads and retrieves them, and maintains metadata including shard locations. Where it improves on the protocol is in distributing these pieces across a wider, tunable set of hosts, increasing both throughput and robustness. Where it differs is in the metadata and database architecture, which enables controllable permissions, sharing, and optimizations around small files and caching, along with other performance- and enterprise-oriented additions.
The result is a scalable, performant cloud that delivers enterprise features and optimizations. Far more control is given to the user than on any other cloud service offering, and with that power comes the operator's responsibility to configure and allocate their provision responsibly.