General Development Principles

I. Security

This project carries significant inherent security risk because of its nature and the assets it manages. GPU clusters are attractive targets, especially for cryptocurrency mining and high-performance computing, so we treat hacking attempts as a realistic threat.

To harden the system, we have the following measures in place:

Modular Infrastructure: We architected the infrastructure as distinct, modular blocks, each with a single responsibility. For example, there are separate APIs for miners, customers, public web access, internal requests, analytics dashboards, and monitoring, among others.
Robust Firewall and Authentication Layer: We use a separate firewall layer with robust monitoring and extensive alerting. We pay particular attention to the authorization layer, splitting and limiting access rights as much as possible for all infrastructure users.
Extensive Logging Layers: Every action, transaction, and request is logged. Any suspicious activity triggers an alert for manual verification.
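
The access-limiting and logging measures above can be illustrated with a minimal sketch. It assumes a simple role-to-permission map and treats every denied request as "suspicious"; the role names, permission strings, and the AuditLog class are hypothetical and are not taken from the project's actual implementation.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-permission map: each role only sees the API it needs.
ROLE_PERMISSIONS = {
    "miner": {"miner_api:read", "miner_api:write"},
    "customer": {"customer_api:read", "customer_api:write"},
    "analyst": {"analytics:read"},
}


@dataclass
class AuditLog:
    """Keeps every request and a separate list of entries needing manual review."""
    entries: list = field(default_factory=list)
    alerts: list = field(default_factory=list)

    def record(self, user, role, permission, allowed):
        entry = {"user": user, "role": role, "permission": permission, "allowed": allowed}
        self.entries.append(entry)  # every request is logged, allowed or not
        if not allowed:
            # Assumption: a denied request counts as suspicious activity.
            self.alerts.append(entry)


def authorize(log, user, role, permission):
    """Allow the request only if the role explicitly grants the permission."""
    allowed = permission in ROLE_PERMISSIONS.get(role, set())
    log.record(user, role, permission, allowed)
    return allowed
```

In this sketch, unknown roles get an empty permission set, so access is denied by default and must be granted explicitly.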

II. High Load

Because the project manages a large number of high-load infrastructure components, the infrastructure itself must not become a bottleneck for the entire system. To achieve this, we prioritize the following strategies:

Modular Design: By breaking down the system logic into smaller, more manageable pieces, we can scale individual components faster and more efficiently on demand.
Operational Queuing: Queuing operations between components prevents any single component from being overloaded and keeps data flowing smoothly across the system, especially during peak loads.
Comprehensive Monitoring and Alerting: We maintain detailed logs of all system activities and continuously monitor system health, performance metrics, and resource utilization to ensure optimal operation. Real-time alerts notify the relevant teams of potential issues or irregularities, enabling swift action and mitigation.
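
As a rough illustration of operational queuing, the sketch below uses a bounded queue from the Python standard library: when the queue is full, new work is rejected instead of piling up on a busy component. The function names submit and drain, and the queue size, are assumptions chosen for the example, not part of the project.

```python
import queue


def submit(jobs, job):
    """Try to enqueue a job; return False instead of blocking when the queue is full."""
    try:
        jobs.put_nowait(job)
        return True
    except queue.Full:
        # Backpressure: the caller can retry later or route the job elsewhere.
        return False


def drain(jobs, handler):
    """Process queued jobs in FIFO order until the queue is empty; return the count."""
    processed = 0
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return processed
        handler(job)
        jobs.task_done()
        processed += 1
```

A caller would create the queue with an explicit capacity, for example queue.Queue(maxsize=2), so that the limit on in-flight work is visible in one place and can be tuned per component.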

III. Billing

With our aim to become the number one affordable GPU provider globally, we established an infrastructure that guarantees cost-efficiency while safeguarding against unexpected costs.

The infrastructure is built with the following strategies in mind:

Automated Cost Monitoring/Alerting: We integrated real-time cost monitoring tools to track GPU usage and associated costs. This helps prevent overuse and keeps operations within budget.
Extensive Notification Capabilities: We implemented a comprehensive notification system that alerts users and administrators through various channels (SMS, instant messengers, email), chosen according to the urgency and nature of the message.
Forecasting: We incorporate predictive analytics tools to forecast future GPU usage and costs based on historical data and trends, and we provide these forecasts to both our clients and suppliers. This allows our customers, our most important asset, to budget, allocate resources, and plan more effectively.
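
To make the cost monitoring and forecasting ideas concrete, here is a minimal sketch that fits a straight-line trend to historical daily costs and compares projected spend against a budget. The least-squares trend is a deliberately simple stand-in for whatever predictive tooling is actually used; the function names, the 80% warning ratio, and the alert levels are assumptions made for illustration.

```python
def linear_forecast(history, steps_ahead):
    """Extrapolate a least-squares straight line fitted to past daily costs."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two data points to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    var_x = sum((x - mean_x) ** 2 for x in range(n))
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history))
    slope = cov / var_x
    intercept = mean_y - slope * mean_x
    # Predict the next steps_ahead days after the last observed day (index n - 1).
    return [intercept + slope * (n - 1 + k) for k in range(1, steps_ahead + 1)]


def budget_alert(spent, projected, budget, warn_ratio=0.8):
    """Classify spend so far plus projected spend against the budget."""
    total = spent + sum(projected)
    if total > budget:
        return "critical"
    if total >= warn_ratio * budget:
        return "warning"  # assumed threshold: 80% of the budget
    return "ok"
```

The returned level could then drive the choice of notification channel, for instance sending "critical" over SMS and "warning" by email, although that mapping is also an assumption.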