A Simple Architecture for a Crypto Trading Bot
Joshua Ali (@Joshua472963523)
Background
To gain an edge in algo trading, you first need to acquire large volumes of rich data, which allows you to develop sophisticated instruments for multi-timeframe analysis. Key data includes raw trades, liquidations, candlesticks (Klines), and open interest candles. The trouble is that crypto exchanges typically do not let you poll historical raw trade or liquidation data, or at least not very far back, so you must rely on their WebSocket streams running 24/7 to ingest it. Historical data is increasingly difficult to obtain, and some exchanges no longer offer downloadable liquidation data. This is a problem if you wish to backtest, because you will be missing an important part of the puzzle; it would be like trying to navigate a desert with a faulty compass.
It is therefore important to develop dedicated services to collect, process, and manage this data at scale.
This post outlines a simple architecture I've developed for ingesting high-frequency raw data, as well as general data like Klines, from multiple crypto exchanges and efficiently managing the associated storage requirements.
Choosing the Right Database: QuestDB
After experimenting with various time-series databases, I found QuestDB to be an outstanding solution for high-frequency data ingestion, capable of handling millions of entries per second with exceptional performance. I was previously using TimescaleDB but found it insufficiently optimised for the volume and speed required for real-time data ingestion.
I first learned about it from a blog post by the quant Dean Markwick, who takes a mathematical approach to trading.
A major challenge you'll face is storage. You will need tens, or even hundreds, of gigabytes of disk space if you plan to store raw trades for many symbols across multiple exchanges over several months. With QuestDB, you can mitigate this by configuring a Time-To-Live (TTL) on database tables to automatically purge data older than your specified retention period (e.g., 30 days).
Here's an example of how to create a trades table with TTL configuration:
CREATE TABLE IF NOT EXISTS trades (
symbol SYMBOL,
price DOUBLE,
quantity DOUBLE,
timestamp TIMESTAMP,
is_buyer_maker BOOLEAN,
exchange SYMBOL,
trade_id STRING,
value DOUBLE
) TIMESTAMP(timestamp)
PARTITION BY DAY
TTL 15 DAYS
DEDUP UPSERT KEYS(timestamp, trade_id, exchange, symbol);
This configuration provides several benefits:
- Automatic partitioning by day for optimal query performance
- Deduplication to prevent duplicate trade records
- Automatic cleanup of data older than 15 days
- Efficient indexing on timestamp for time-series queries
If your goal is to collect data indefinitely, especially for backtesting, you can extend the TTL or omit it entirely. You can create additional tables for other data such as Klines, Open Interest, and Funding Rates, customising the schema and partitioning to your requirements. You can also create Materialized Views for quick access to this data; these are essentially derived tables that aggregate the base table at specific intervals.
const intervals = [
{ name: '10s', partition: 'HOUR', ttl: '6h' },
{ name: '1m', partition: 'DAY', ttl: '7d' },
{ name: '5m', partition: 'DAY', ttl: '7d' },
{ name: '15m', partition: 'DAY', ttl: '7d' },
{ name: '30m', partition: 'DAY', ttl: '14d' },
{ name: '1h', partition: 'DAY', ttl: '30d' },
{ name: '4h', partition: 'DAY', ttl: '60d' },
{ name: '1d', partition: 'WEEK', ttl: '5w' }
];
// Create OHLC views for trades
for (const interval of intervals) {
const viewName = `trades_OHLC_${interval.name}`;
const createViewSQL = `
CREATE MATERIALIZED VIEW IF NOT EXISTS ${viewName}
WITH BASE trades REFRESH INCREMENTAL
AS (
SELECT
timestamp,
symbol,
exchange,
first(price) AS open,
max(price) AS high,
min(price) AS low,
last(price) AS close,
sum(quantity) AS volume,
sum(value) AS value
FROM trades
SAMPLE BY ${interval.name}
) PARTITION BY ${interval.partition} TTL ${interval.ttl};
`;
await this.executeHttpQuery(createViewSQL);
console.log(`Created materialized view: ${viewName}`);
}
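Once the views exist, downstream code can read pre-aggregated candles directly instead of re-sampling raw trades every time. As a quick illustration, the query below reads recent 1-minute candles from one of the views created above; the symbol and exchange values are placeholders:
-- hypothetical read of 1-minute candles from the trades_OHLC_1m view
SELECT timestamp, symbol, open, high, low, close, volume
FROM trades_OHLC_1m
WHERE symbol = 'BTCUSDT'
  AND exchange = 'binance'
  AND timestamp > dateadd('h', -6, now());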
The Utility of Each Data Source
Raw Trades
Volume Profile Analysis: Construct precise volume profiles (histograms of volume traded at different price levels) over customised time ranges and price intervals. This enables identification of volume-based support and resistance levels such as the Value Area.
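As a sketch of what this looks like against the trades table above, the query below bins prices into fixed-width buckets and totals the volume traded in each bucket; the symbol, the 24-hour window, and the $10 bucket width are placeholder assumptions:
-- hypothetical volume profile: total, taker-buy and taker-sell volume per $10 price bucket
SELECT
  round(price / 10, 0) * 10 AS price_level,
  sum(quantity) AS volume,
  sum(CASE WHEN is_buyer_maker THEN 0 ELSE quantity END) AS buy_volume,
  sum(CASE WHEN is_buyer_maker THEN quantity ELSE 0 END) AS sell_volume
FROM trades
WHERE symbol = 'BTCUSDT'
  AND timestamp > dateadd('h', -24, now())
ORDER BY price_level;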
Order Flow Analysis: Generate footprint candles that break traded volume down by price level within each candle. This granular data yields key insights such as the following (a query sketch follows this list):
- Stacked Imbalances: Identifying successive price levels dominated by either buyers or sellers
- Volume Delta / Cumulative Volume Delta Analysis: Tracking the difference between buy and sell volume at each price level
- Market Profile Construction: Building comprehensive market profiles showing value areas and point of control
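As one concrete example, per-candle volume delta can be computed straight from the raw trades. The sketch below assumes Binance-style semantics, where is_buyer_maker = true means the aggressor was a seller; the symbol and the 1-minute bucket are placeholders, and a running total of the delta (CVD) can be accumulated by the consumer:
-- hypothetical per-minute volume delta (taker buys minus taker sells)
SELECT
  timestamp,
  sum(CASE WHEN is_buyer_maker THEN 0 ELSE quantity END) AS buy_volume,
  sum(CASE WHEN is_buyer_maker THEN quantity ELSE 0 END) AS sell_volume,
  sum(CASE WHEN is_buyer_maker THEN -quantity ELSE quantity END) AS delta
FROM trades
WHERE symbol = 'BTCUSDT'
SAMPLE BY 1m;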
Liquidations
Capturing liquidation data through WebSockets enables construction of real-time liquidation heatmaps, providing valuable insights into potential cascade events and market stress points.
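The schema can mirror the trades table. Below is a sketch of a liquidations table plus a per-minute aggregation that could feed a heatmap; the column names, retention, and symbol are assumptions rather than a fixed design:
-- hypothetical liquidations table, partitioned and aged out like trades
CREATE TABLE IF NOT EXISTS liquidations (
  symbol SYMBOL,
  exchange SYMBOL,
  side SYMBOL,
  price DOUBLE,
  quantity DOUBLE,
  value DOUBLE,
  timestamp TIMESTAMP
) TIMESTAMP(timestamp)
PARTITION BY DAY
TTL 30 DAYS;

-- per-minute liquidated value by side, a building block for a heatmap
SELECT timestamp, side, sum(value) AS liquidated_value
FROM liquidations
WHERE symbol = 'BTCUSDT'
SAMPLE BY 1m;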
Higher Timeframe (HTF) Data
Klines, open interest, and funding rates remain very important because they support HTF analysis with a wide range of indicators, and this data is readily available from exchanges via REST. The architecture described below therefore includes a separate worker for collecting it.
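As a sketch of how this might be stored, the table below has no TTL so history is kept for backtesting, and its deduplication keys make repeated polls and historical backfills idempotent; the column names and partitioning are assumptions:
-- hypothetical klines table for REST-polled candles
CREATE TABLE IF NOT EXISTS klines (
  symbol SYMBOL,
  exchange SYMBOL,
  timeframe SYMBOL,       -- e.g. '1m', '1h', '1d'
  open DOUBLE,
  high DOUBLE,
  low DOUBLE,
  close DOUBLE,
  volume DOUBLE,
  timestamp TIMESTAMP     -- candle open time
) TIMESTAMP(timestamp)
PARTITION BY MONTH
DEDUP UPSERT KEYS(timestamp, symbol, exchange, timeframe);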
Architecture Overview

The system architecture consists of three primary microservices, each optimised for specific responsibilities:
Core Services
1. Market Ingestor Service
- Purpose: Real-time data collection and persistence (a minimal ingestion sketch follows this list)
- Connections: Multiple crypto exchanges (Binance, Coinbase, Kraken, etc.) via WebSockets
- Data Types: Raw trades, liquidation events, order book snapshots
- Storage: High-frequency writes to QuestDB via TCP with automatic partitioning
- Features:
- Connection resilience with automatic reconnection
- Data validation and filtering
- Rate limiting and backpressure handling
2. Kline Worker Service
- Purpose: Periodic aggregated data collection
- Method: RESTful API polling with configurable intervals
- Data Types: Kline data (multiple timeframes), Open Interest, funding rates
- Features:
- Configurable polling intervals per data type
- Historical data backfilling capabilities
- Data consistency validation
3. Signals Engine
- Purpose: Real-time technical analysis and signal generation
- Frequency: High-frequency execution (sub-second to minute intervals)
- Processing: Computes technical indicators, volume profiles, and custom signals
- Storage: Results cached in Redis for ultra-low latency access
- Features:
- Scalable computation pipeline
- Custom indicator development framework
- Real-time alert system
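To make the ingestor's write path concrete, here is a minimal sketch (not the production service) that pipes a single Binance trade stream into the trades table using the ws package and the official @questdb/nodejs-client. The stream URL, payload field names, and ILP endpoint are assumptions to adapt per exchange:
// minimal sketch: one WebSocket trade stream -> QuestDB trades table
import WebSocket from 'ws';
import { Sender } from '@questdb/nodejs-client';

// ILP over TCP (port 9009); 'http::addr=localhost:9000' also works for ILP over HTTP
const sender = Sender.fromConfig('tcp::addr=localhost:9009');
await sender.connect();

const ws = new WebSocket('wss://stream.binance.com:9443/ws/btcusdt@trade');

ws.on('message', async (raw) => {
  // Binance trade payload: s=symbol, p=price, q=quantity, T=trade time (ms), m=isBuyerMaker, t=tradeId
  const t = JSON.parse(raw);
  const price = Number(t.p);
  const quantity = Number(t.q);

  await sender
    .table('trades')
    .symbol('symbol', t.s)
    .symbol('exchange', 'binance')
    .floatColumn('price', price)
    .floatColumn('quantity', quantity)
    .floatColumn('value', price * quantity)
    .booleanColumn('is_buyer_maker', t.m)
    .stringColumn('trade_id', String(t.t))
    .at(t.T, 'ms'); // exchange timestamp becomes the designated timestamp

  await sender.flush(); // a real ingestor would batch and flush on a timer
});

ws.on('close', () => {
  // a real ingestor reconnects here with backoff (connection resilience)
});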
Together, these interconnected services form the foundation for flexible applications, which can query QuestDB directly and perform extremely fast, effectively O(1), reads of technical data cached in Redis.
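For the Redis side, the hand-off can be as simple as the signals engine writing the latest computed values under a per-market key and applications reading them back in constant time. In the sketch below, the key name, fields, values, and expiry are purely illustrative:
// minimal sketch of the Redis hand-off between the signals engine and consumers
import { createClient } from 'redis';

const redis = createClient({ url: 'redis://localhost:6379' });
await redis.connect();

// signals engine: cache the latest indicator snapshot for a market
await redis.hSet('signals:binance:BTCUSDT', {
  vwap: '64350.12',
  cvd_1h: '-142.7',
  updated_at: String(Date.now()),
});
await redis.expire('signals:binance:BTCUSDT', 60); // stale snapshots age out

// trading application: constant-time read of the cached snapshot
const snapshot = await redis.hGetAll('signals:binance:BTCUSDT');
console.log(snapshot.vwap, snapshot.cvd_1h);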
Versatile Applications
Research and Development
- Backtesting Engine: Historical strategy validation with tick-level precision
- Paper Trading: Risk-free strategy testing with real market conditions
- Market Research: Deep analysis of market microstructure and behavior patterns
Production Trading
- Live Trading Execution: Real-time strategy deployment with minimal latency
- Risk Management: Dynamic position sizing and stop-loss mechanisms
- Portfolio Management: Multi-strategy and multi-asset coordination
Performance Considerations
Scalability Features
- Horizontal Scaling: Each service can be independently scaled based on load
- Load Balancing: WebSocket connections distributed across multiple ingestor instances
- Data Sharding: QuestDB partitioning enables efficient parallel processing
Monitoring and Observability
- Metrics Collection: Comprehensive monitoring of ingestion rates, processing latency, and system health
- Alerting System: Proactive notifications for data gaps, system errors, or performance degradation
- Dashboard Integration: Real-time visualisation of system performance and data quality
This architecture provides the foundation for sophisticated crypto trading operations, offering the flexibility and performance needed to compete in today's fast-moving digital asset markets.