Performance evaluations will play a key role in maturing the Relayer's capabilities and our understanding of them.
We are excited to announce a long-running series of performance evaluation articles. These articles will examine the performance characteristics of various parts of our product stack and offer insights and comparisons. We are driving a “measure everything” culture here at XNS, and have formalized a methodology to identify, test, and analyze many components of our infrastructure and software.
Launching first is an article highlighting the results of a sustained Relayer upload throughput test, and a benchmark for what a traditional cloud provider could achieve on the same configuration. Over the following months we will focus on performance-related topics like latency, cache-drain, hardware scaling, and configuration parameters. Tackling these topics will build a foundation for analytics, reveal key strengths (or weaknesses), and provide a wealth of reference information useful for us, our partners, and our users.
A testing culture is critically important to foster. Our approach has evolved into a multi-step process involving team members across disciplines. It ranges from discussing which aspects can and should be measured, to developing test matrices and approaches, to systematically deploying tests and uploading standardized results.
Though specifics can and will vary based on which aspect is in focus, our testing process has matured to resemble the following:
- Through discussion with the team and partners, identify a component of the Relayer or its ecosystem to focus on. Determine how best to measure this component directly.
- Develop test matrices. Several rounds of testing may be expected.
- Establish what hardware or other requirements are needed. It is typical for a bare-metal virtual private server (VPS) matching a certain spec to be rented specifically for testing.
- Run our tool suite on this VPS, which will automatically increment through test cases and upload results to our database.
- Monitor the test run (anywhere from 12 hours to 7+ days), typically through a series of custom Grafana dashboards. These usually show hardware utilization (disk, CPU, RAM, networking) and telemetry pulled from the Relayer stack itself.
- Perform an early analysis of results to determine whether alternative testing or modifications are required. You want to know as early as possible if this is the case.
- Debrief and begin analytics on results. Don’t stop once you’ve validated or invalidated your assumptions; try to find interesting correlations or unexpected results.
- Formalize findings, plot key performance indicators (KPIs), prepare for publishing.
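The test-matrix and automated-run steps above might be sketched roughly as follows. This is only an illustration under stated assumptions, not our actual tool suite: the parameter names in `MATRIX` and the `run_case` and `upload_result` callables are hypothetical stand-ins for whatever a given evaluation measures and wherever its results are stored.

```python
from itertools import product

# Hypothetical matrix dimensions, chosen only for illustration.
MATRIX = {
    "file_size_mb": [1, 64, 1024],
    "concurrent_uploads": [1, 8],
    "cache_size_gb": [50, 200],
}


def build_test_cases(matrix):
    """Expand a dict of parameter lists into one dict per combination."""
    names = list(matrix)
    combos = product(*(matrix[name] for name in names))
    return [dict(zip(names, values)) for values in combos]


def run_suite(cases, run_case, upload_result):
    """Step through every case in order and record one result per case.

    run_case and upload_result are caller-supplied (hypothetical) hooks:
    the first executes a single test case and returns its metrics, the
    second stores the standardized result record.
    """
    for case_id, params in enumerate(cases):
        record = {"case_id": case_id, "params": params}
        try:
            record["metrics"] = run_case(params)
            record["status"] = "ok"
        except Exception as exc:
            # A failed case is still recorded so the run can continue.
            record["status"] = "error"
            record["error"] = str(exc)
        upload_result(record)
    return len(cases)
```

Recording failed cases instead of aborting keeps a multi-day run going, which makes the early-analysis step easier: gaps in the matrix are visible in the results rather than hidden behind a stopped process.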
We look forward to sharing the results of ongoing evaluations, premiering in the companion article Upload Throughput Evaluation. These will provide key insights to our team, and it’s our hope that publishing these findings openly will serve our readers and network in becoming not just the best decentralized storage network, but the best home for cloud data, period.