As we run validators on several Cosmos-based blockchains, we have devoted significant effort to automating their setup and maintenance. This improves our efficiency and resilience and reduces the chance of human error.
In this post, we'll explore how we've built our system and the tools we use daily to ensure smooth operations. Let's dive into the specifics of our Cosmos validators infrastructure.
Our infrastructure is deployed on Kubernetes across various providers, including cloud providers such as AWS and GCP, but mainly bare-metal providers like OVH and Data Packet, all managed by a central control plane.
We've standardized on large servers, which allows us to run multiple nodes for different blockchains on a single machine. This approach gives us great flexibility in where we run our nodes, primarily across Europe and Asia. It also enables dynamic adjustments of the CPU, RAM, and disk resources as needed. However, for blockchains with high block rates, like Injective or dYdX, we use dedicated servers equipped with high-frequency CPUs.
Our entire system is managed through a GitOps workflow, which offers several benefits: a single source of truth, a comprehensive change history, and integration with our Continuous Integration processes.
We've also fully automated the bootstrap and maintenance of Cosmos-based chains, including default configuration setup, snapshot management, and upgrade procedures. This level of automation streamlines our operations and ensures consistency across our deployments.
Our security strategy is comprehensive and multifaceted, ensuring the protection of sensitive data and the uninterrupted operation of our infrastructure:
These measures uphold a high security standard across our infrastructure, safeguarding our validators and the reliability of the networks they support.
Initially, we ran a single validator per chain, backed by a spare in a different geographic location.
As we evolved, we embraced the use of Horcrux cosigners, which substantially enhanced our system's security and efficiency.
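To make the threshold idea behind cosigners concrete, here is a minimal Python sketch. It models availability only (how many cosigners must be reachable before a block can be signed), not the cryptography. The 2-of-3 layout, the cosigner names, and the function names are assumptions chosen for illustration; they are not taken from Horcrux or from a real configuration.

```python
from itertools import combinations


def signing_quorums(cosigners: list[str], threshold: int) -> list[tuple[str, ...]]:
    """List every minimal set of cosigners that can jointly produce a signature."""
    if not 1 <= threshold <= len(cosigners):
        raise ValueError("threshold must be between 1 and the number of cosigners")
    return list(combinations(cosigners, threshold))


def can_sign(reachable: set[str], cosigners: list[str], threshold: int) -> bool:
    """Return True when at least `threshold` configured cosigners are reachable."""
    return len(reachable & set(cosigners)) >= threshold


# Hypothetical cosigners, for example one per geographic location.
COSIGNERS = ["cosigner-a", "cosigner-b", "cosigner-c"]
THRESHOLD = 2  # assumed 2-of-3 setup

if __name__ == "__main__":
    for quorum in signing_quorums(COSIGNERS, THRESHOLD):
        print("quorum:", ", ".join(quorum))
    # Losing one cosigner still leaves a quorum; losing two does not.
    print(can_sign({"cosigner-a", "cosigner-c"}, COSIGNERS, THRESHOLD))
    print(can_sign({"cosigner-b"}, COSIGNERS, THRESHOLD))
```

Under these assumptions, any single cosigner can go offline without halting signing, and no single machine holds enough key material to sign alone, which is the trade-off a single validator with a spare cannot offer.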
This evolved validator architecture underscores our commitment to not only maintaining but continually enhancing the security, efficiency, and resilience of our blockchain operations. By leveraging advanced technologies like Horcrux cosigners and WireGuard VPNs, we ensure that our infrastructure remains robust and capable of adapting to the ever-evolving landscape of blockchain technology.
To monitor all our validators efficiently, we rely on two sources of metrics:
The cosmos-validator-watcher goes beyond basic monitoring: it reports total stake and reward commissions, and it tracks our votes on current on-chain governance proposals.
Leveraging our GitOps workflow, the validator watcher lets us automate the upgrade process through webhooks. This approach is more efficient than swapping binaries in place, a common practice with tools like Cosmovisor, and in-place swaps are especially awkward in an immutable container.
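The GitOps side of a webhook-driven upgrade can be sketched roughly as follows: a handler receives an upgrade event and computes the manifest change to commit, instead of replacing a binary inside a running container. The payload field `upgrade_name`, the version map, the manifest layout, and all names below are hypothetical; the actual webhook payload sent by cosmos-validator-watcher may differ.

```python
import json


def bump_image_for_upgrade(payload: str, manifest: dict, versions: dict) -> dict:
    """Return a copy of `manifest` whose image tag matches the announced upgrade.

    `payload` is a JSON string with a hypothetical `upgrade_name` field, and
    `versions` maps upgrade names to container image tags.
    """
    event = json.loads(payload)
    upgrade_name = event["upgrade_name"]  # hypothetical field name
    if upgrade_name not in versions:
        raise KeyError(f"no image version mapped for upgrade {upgrade_name!r}")

    # Split "repository:tag"; simplification: registries with a port and no tag
    # are not handled here.
    repository, sep, _old_tag = manifest["image"].rpartition(":")
    if not sep:
        repository = manifest["image"]

    updated = dict(manifest)  # leave the original manifest untouched
    updated["image"] = f"{repository}:{versions[upgrade_name]}"
    return updated


if __name__ == "__main__":
    manifest = {"name": "examplechain-validator",
                "image": "registry.example/examplechain:v1.0.0"}
    versions = {"v2": "v2.0.0"}
    payload = json.dumps({"upgrade_name": "v2"})
    print(bump_image_for_upgrade(payload, manifest, versions))
```

In a setup like this, the returned manifest would be committed to the Git repository, and the usual GitOps reconciliation would roll out a new immutable container with the upgraded binary.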
For those interested in a deeper dive into how our cosmos-validator-watcher operates, we’ve made the information available on our GitHub repository at https://github.com/kilnfi/cosmos-validator-watcher.
Horcrux is a Multi-Party Computation (MPC) signing service for Tendermint nodes. It improves the security and availability of validator infrastructure by using a cluster of signer nodes, which provides fault tolerance, secures private keys through threshold Ed25519 signatures, and improves performance. Explore the documentation to upgrade your validator infrastructure with Horcrux.
Kiln is the leading enterprise-grade staking platform, enabling institutional customers to stake their digital assets programmatically and white-label staking functionality into their offerings. Kiln runs validators on all major PoS blockchains, with over $4B of stake under management. As an experienced node operator on Cosmos-based chains, we offer staking services and real-time data for various chains, including ATOM, Osmosis, TIA, INJ, KAVA, and more, to fully meet our customers' requirements. Last year, we proudly open-sourced our Cosmos Validator Watcher, a tool designed to streamline monitoring and alerting.