Before June 2022, io.net was exclusively devoted to developing institutional-grade quantitative trading systems for both the United States stock market and the cryptocurrency markets. Our primary challenge was constructing the infrastructure necessary to accommodate a robust backend trading system with significant computational power.
Our trading strategies, which bordered on high-frequency trading (HFT), required real-time monitoring of tick data for over 1,000 stocks and 150 cryptocurrencies. HFT is a method of trading that uses powerful computer programs to transact a large number of orders in fractions of a second, relying on complex algorithms to analyze multiple markets and execute orders based on market conditions.
Furthermore, our system had to dynamically backtest and adjust algorithm parameters for each asset in real time. It also had to be optimized to support trading for more than 30,000 individual clients across ETrade.com, Alpaca Markets, and Binance.com, while keeping latency below 200 milliseconds from a market event to the system placing the corresponding order on the client's account.
Such an infrastructure typically requires a dedicated team of MLOps and DevOps professionals. However, our discovery of Ray (ray.io), an open-source library used by OpenAI to distribute GPT-3/4 training across over 300,000 CPUs and GPUs, revolutionized our approach and streamlined our infrastructure management. It also cut the time needed to build this backend from over six months to less than 60 days.
After integrating Ray into our backend, we prepared to deploy the application on a cluster of GPU and CPU workers to handle our substantial computing needs. At that point we hit a wall: the cost of running such a system on overpriced on-demand GPU cloud providers.
For instance, an NVIDIA A100 card cost over $80 per day. We needed more than 50 of these cards running an average of 25 days per month, which amounts to $80 x 50 cards x 25 days = $100,000 per month.
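The arithmetic above can be reproduced with a small sketch. The function name `monthly_gpu_cost` and its parameters are illustrative and not part of any io.net tooling. Because the figures quoted are lower bounds ("over $80", "more than 50"), the result should be read as a floor rather than an exact bill.

```python
def monthly_gpu_cost(price_per_card_day: float, cards: int, days_per_month: int) -> float:
    """Return the monthly rental cost in USD for a fleet of on-demand GPUs."""
    return price_per_card_day * cards * days_per_month


# Figures quoted in the text: $80 per card per day, 50 cards, 25 days per month.
cost = monthly_gpu_cost(80, 50, 25)
print(f"${cost:,.0f} per month")  # $100,000 per month
```

Changing any one input shows how quickly the bill scales: doubling the card count or the daily price doubles the monthly cost.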
This cost posed a severe challenge for us as well as for other self-funded startups in the AI/ML industry.
Even at such high prices, demand keeps growing: compute requirements for AI applications have reportedly been doubling every three months, or by other estimates growing 10x every 18 months. OpenAI, for example, had to rent more than 300K CPUs and 10K GPUs to train GPT-3, and this is just the beginning.
Average market price: https://www.paperspace.com/pricing