General Development Principles
I. Security
This project carries significant security risk because of the assets it manages. GPU clusters are attractive targets, particularly for cryptocurrency and high-performance computing workloads, so hacking attempts are a realistic threat.
To harden the system, we apply the following measures:
- Modular Infrastructure: We architected the infrastructure as distinct, modular blocks, each with a single responsibility. Example: separate APIs for miners, customers, public web access, internal requests, analytics dashboards, monitoring, etc.
- Robust Firewall and Authentication Layer: We use a separate firewall layer with robust monitoring and extensive alerting. We pay particular attention to the authorization layer, splitting and limiting access rights as much as possible for all infrastructure users.
- Extensive Logging Layers: Every action/transaction/request is logged. Any suspicious activity will trigger an alert for manual verification.
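The access-splitting and logging measures above can be illustrated with a minimal sketch. The role names, permission strings, and the `is_allowed` helper are hypothetical and chosen only for illustration; they do not describe the project's actual authorization layer.

```python
import logging

# Hypothetical roles and permissions, for illustration only.
# Each role is granted only what its own API needs (least privilege).
ROLE_PERMISSIONS = {
    "miner": {"miner_api:read", "miner_api:write"},
    "customer": {"customer_api:read", "customer_api:write"},
    "analyst": {"analytics:read"},
}

audit_log = logging.getLogger("audit")


def is_allowed(role: str, permission: str) -> bool:
    """Return True only if the role was explicitly granted the permission."""
    allowed = permission in ROLE_PERMISSIONS.get(role, set())
    # Every request is logged; denials are logged at WARNING so that an
    # alerting rule could pick them up for manual verification.
    level = logging.INFO if allowed else logging.WARNING
    audit_log.log(level, "role=%s permission=%s allowed=%s", role, permission, allowed)
    return allowed
```

In this sketch an unknown role receives an empty permission set, so access is denied by default rather than granted by omission.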
II. High load
Because the project manages a large number of high-load infrastructure components, the infrastructure itself must not become a bottleneck for the entire system. To achieve this, we prioritize the following strategies:
- Modular Design: By breaking down the system logic into smaller, more manageable pieces, we can facilitate faster and more efficient scaling on demand.
- Operational Queuing: We queue operations between components. This prevents overloading of any single component and ensures smooth data flow across the system, especially during peak loads.
- Comprehensive Monitoring and Alerting: We maintain detailed logs of all system activities and continuously monitor system health, performance metrics, and resource utilization to ensure optimal operation. Real-time alerts notify the relevant teams of potential issues or irregularities, enabling swift action and mitigation.
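As a rough illustration of operational queuing, the sketch below pushes jobs through a bounded in-process queue so that producers are slowed down when workers fall behind. The function name `run_with_queue` and its parameters are assumptions made for this example; a production system would more likely use a dedicated message broker, which this document does not specify.

```python
import queue
import threading

_STOP = object()  # unique sentinel telling a worker to exit


def run_with_queue(jobs, handler, max_pending=4, workers=2):
    """Process jobs through a bounded queue; results are returned unordered.

    Assumes handler does not raise; a real system would also need
    error handling and retries.
    """
    # put() blocks once max_pending items are waiting, which applies
    # backpressure to the producer instead of overloading the workers.
    pending = queue.Queue(maxsize=max_pending)
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            job = pending.get()
            if job is _STOP:
                return
            outcome = handler(job)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for job in jobs:
        pending.put(job)
    for _ in threads:
        pending.put(_STOP)  # one sentinel per worker
    for t in threads:
        t.join()
    return results
```

The bounded `maxsize` is the key design choice here: an unbounded queue would accept work faster than it can be processed and only move the overload elsewhere.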
III. Billing
With our aim of becoming the number one affordable GPU provider globally, we established an infrastructure that guarantees cost-efficiency while safeguarding against unexpected costs.
The infrastructure is built with the following strategies in mind:
- Automated Cost Monitoring/Alerting: We integrated real-time cost monitoring tools to track GPU usage and associated costs. This helps prevent overuse and keeps operations within budget.
- Extensive Notification Capabilities: We implemented a comprehensive notification system that can alert users and administrators through various channels (SMS, IMs, emails), chosen according to the urgency and nature of the message.
- Forecasting: We incorporate predictive analytics tools to forecast future GPU usage and costs based on historical data and trends, and we provide these forecasts to both our clients and suppliers. This allows our most important asset, our customers, to budget, allocate resources, and plan more effectively.
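The billing strategies above can be sketched in a few lines. The thresholds, channel names, and the functions `budget_alert` and `forecast_next` are hypothetical, and the linear trend is only a stand-in for whatever predictive analytics tools are actually used.

```python
from statistics import mean

# Hypothetical thresholds: fraction of budget spent -> notification channel.
# Ordered from most to least urgent so the first match wins.
ALERT_LEVELS = [(1.0, "sms"), (0.8, "im"), (0.5, "email")]


def budget_alert(spent, budget):
    """Return the channel for the highest threshold crossed, or None."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = spent / budget
    for threshold, channel in ALERT_LEVELS:
        if ratio >= threshold:
            return channel
    return None


def forecast_next(daily_costs):
    """Forecast the next day's cost with a least-squares linear trend."""
    n = len(daily_costs)
    if n < 2:
        raise ValueError("need at least two data points")
    xs = range(n)
    x_mean = mean(xs)
    y_mean = mean(daily_costs)
    slope = sum(
        (x - x_mean) * (y - y_mean) for x, y in zip(xs, daily_costs)
    ) / sum((x - x_mean) ** 2 for x in xs)
    # Extrapolate the fitted line to the next index, n.
    return y_mean + slope * (n - x_mean)
```

For example, a customer who has spent 85 of a 100-unit budget would be routed to the "im" channel in this sketch, and a steadily rising cost series yields a forecast that continues the same trend.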