pairs_trading/PAIRS_TRADING_BACKTEST_USAGE.md
2025-07-20 18:11:45 +00:00


# Enhanced Pairs Trading Backtest Usage Guide
## Overview
The enhanced `pt_backtest.py` script now supports multi-day and multi-instrument backtesting with SQLite database output. This guide explains how to use the new features.
## New Features
### 1. Multi-Day Data Processing
- Process multiple data files in a single run
- Support for wildcard patterns in configuration files
- CLI override for data file specification
### 2. Dynamic Instrument Selection
- Auto-detection of instruments from database
- CLI override for instrument specification
- No need to manually update configuration files
### 3. SQLite Database Output
- Automated storage of backtest results
- Structured data format for analysis
- Database output can be disabled by passing `--result_db NONE`
## Command Line Arguments
### Required Arguments
- `--config`: Path to configuration file
- `--result_db`: Path to the SQLite database for results. The argument is always required; pass `NONE` to disable database output.
### Optional Arguments
- `--datafiles`: Comma-separated list of data files (overrides config)
- `--instruments`: Comma-separated list of instruments (overrides auto-detection)
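
The four options above could be parsed along these lines. This is a minimal sketch assuming `argparse`; the guide does not show how `pt_backtest.py` actually parses its arguments, and the helper names `parse_args` and `split_csv` are hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Parse the four documented options (parsing details are assumed)."""
    parser = argparse.ArgumentParser(description="Pairs trading backtest")
    parser.add_argument("--config", required=True,
                        help="Path to configuration file")
    parser.add_argument("--result_db", required=True,
                        help="SQLite database for results, or NONE to disable")
    parser.add_argument("--datafiles", default=None,
                        help="Comma-separated data files (overrides config)")
    parser.add_argument("--instruments", default=None,
                        help="Comma-separated instruments (overrides auto-detection)")
    return parser.parse_args(argv)


def split_csv(value):
    """Turn 'a,b, c' into ['a', 'b', 'c']; an omitted option (None) gives []."""
    if not value:
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


args = parse_args([
    "--config", "configuration/crypto.cfg",
    "--result_db", "NONE",
    "--instruments", "BTC-USDT,ETH-USDT",
])
instruments = split_csv(args.instruments)  # explicit list from the CLI
datafiles = split_csv(args.datafiles)      # empty: fall back to the config
save_results = args.result_db != "NONE"    # NONE disables database output
```

An empty list from `split_csv` is the signal to fall back to the configuration file (for data files) or to auto-detection (for instruments).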
## Usage Examples
### Basic Usage (Auto-detect instruments, use config datafiles)
```bash
python src/pt_backtest.py --config configuration/crypto.cfg --result_db results.db
```
### Specify Instruments via CLI
```bash
python src/pt_backtest.py \
  --config configuration/crypto.cfg \
  --result_db results.db \
  --instruments "BTC-USDT,ETH-USDT,ADA-USDT"
```
### Override Data Files via CLI
```bash
python src/pt_backtest.py \
  --config configuration/crypto.cfg \
  --result_db results.db \
  --datafiles "20250528.mktdata.ohlcv.db,20250529.mktdata.ohlcv.db"
```
### Complete Override (Custom instruments and data files)
```bash
python src/pt_backtest.py \
  --config configuration/crypto.cfg \
  --result_db results.db \
  --instruments "BTC-USDT,ETH-USDT" \
  --datafiles "20250528.mktdata.ohlcv.db,20250529.mktdata.ohlcv.db"
```
### Disable Database Output
```bash
python src/pt_backtest.py \
  --config configuration/crypto.cfg \
  --result_db NONE
```
## Configuration File Updates
### Wildcard Support in Data Files
The configuration file now supports wildcards in the `datafiles` array:
```json
{
  "datafiles": [
    "2025*.mktdata.ohlcv.db",
    "specific_file.db",
    "202405*.mktdata.ohlcv.db"
  ]
}
```
### Multiple Patterns
You can specify multiple wildcard patterns:
```json
{
  "datafiles": [
    "202405*.mktdata.ohlcv.db",
    "202406*.mktdata.ohlcv.db",
    "special_data.db"
  ]
}
```
## Database Schema
The script creates a `pt_bt_results` table with the following schema:
| Column | Type | Description |
|--------|------|-------------|
| date | DATE | Trading date extracted from filename |
| pair | TEXT | Trading pair name (e.g., "BTC-USDT & ETH-USDT") |
| symbol | TEXT | Individual symbol (e.g., "BTC-USDT") |
| open_time | DATETIME | Trade opening time |
| open_side | TEXT | Opening side (BUY/SELL) |
| open_price | REAL | Opening price |
| open_quantity | INTEGER | Opening quantity |
| open_disequilibrium | REAL | Disequilibrium at opening |
| close_time | DATETIME | Trade closing time |
| close_side | TEXT | Closing side (BUY/SELL) |
| close_price | REAL | Closing price |
| close_quantity | INTEGER | Closing quantity |
| close_disequilibrium | REAL | Disequilibrium at closing |
| symbol_return | REAL | Individual symbol return (%) |
| pair_return | REAL | Combined pair return (%) |
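
The table above corresponds roughly to the DDL below. This is a sketch built from the documented column names and types only; the script's real `CREATE TABLE` statement (constraints, indexes) is not shown in this guide. The helper `date_from_filename` is hypothetical and assumes filenames start with `YYYYMMDD`, as in `20250528.mktdata.ohlcv.db`.

```python
import os
import sqlite3
from datetime import datetime

# Column names and types copied from the schema table; constraints are unknown.
CREATE_RESULTS_TABLE = """
CREATE TABLE IF NOT EXISTS pt_bt_results (
    date DATE,
    pair TEXT,
    symbol TEXT,
    open_time DATETIME,
    open_side TEXT,
    open_price REAL,
    open_quantity INTEGER,
    open_disequilibrium REAL,
    close_time DATETIME,
    close_side TEXT,
    close_price REAL,
    close_quantity INTEGER,
    close_disequilibrium REAL,
    symbol_return REAL,
    pair_return REAL
)
"""


def date_from_filename(filename):
    """Read the leading YYYYMMDD of a data file name (assumed naming scheme)."""
    base = os.path.basename(filename)
    return datetime.strptime(base[:8], "%Y%m%d").date()


conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute(CREATE_RESULTS_TABLE)
# PRAGMA table_info yields one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(pt_bt_results)")]
trading_date = date_from_filename("20250528.mktdata.ohlcv.db")
conn.close()
```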
## Auto-Detection Logic
### Instrument Auto-Detection
When `--instruments` is not specified, the script:
1. Connects to each data file
2. Queries distinct `instrument_id` values from the configured table
3. Removes the configured prefix (`instrument_id_pfx`)
4. Uses the resulting symbols for pair generation
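
The four steps above can be sketched as follows. The function name `detect_instruments`, the table name `bars`, and the prefix `PAIR-` are placeholders chosen for the demo; the real table name and `instrument_id_pfx` come from the configuration file and are not given in this guide.

```python
import os
import sqlite3
import tempfile


def detect_instruments(datafiles, table, instrument_id_pfx):
    """Collect distinct instrument_id values across files and strip the prefix."""
    symbols = set()
    for path in datafiles:
        conn = sqlite3.connect(path)
        try:
            # Table names cannot be bound as SQL parameters, so `table`
            # must come from trusted configuration.
            rows = conn.execute(
                f"SELECT DISTINCT instrument_id FROM {table}"
            ).fetchall()
        finally:
            conn.close()
        for (instrument_id,) in rows:
            if instrument_id.startswith(instrument_id_pfx):
                instrument_id = instrument_id[len(instrument_id_pfx):]
            symbols.add(instrument_id)
    return sorted(symbols)


# Demo against a throwaway database with made-up rows.
with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "20250528.mktdata.ohlcv.db")
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE bars (instrument_id TEXT, close REAL)")
    conn.executemany(
        "INSERT INTO bars VALUES (?, ?)",
        [("PAIR-BTC-USDT", 1.0), ("PAIR-ETH-USDT", 2.0), ("PAIR-BTC-USDT", 1.1)],
    )
    conn.commit()
    conn.close()
    detected = detect_instruments([db_path], "bars", "PAIR-")
```

Collecting into a set means an instrument that appears in several data files is reported once.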
### Data File Resolution
The script resolves data files in this order:
1. If `--datafiles` is specified, use those files
2. Otherwise, process each pattern in config `datafiles`:
   - Expand wildcards using `glob.glob()`
   - Resolve relative paths using `data_directory`
   - Remove duplicates and sort
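
The resolution order above might look like this in code. It is a sketch, not the script's implementation: the function name `resolve_datafiles` is hypothetical, and returning the CLI list unchanged (without joining it to `data_directory`) is an assumption.

```python
import glob
import os
import tempfile


def resolve_datafiles(cli_datafiles, config_patterns, data_directory):
    """Prefer the CLI list; otherwise expand the config patterns."""
    if cli_datafiles:
        return list(cli_datafiles)
    resolved = set()  # a set removes files matched by more than one pattern
    for pattern in config_patterns:
        if not os.path.isabs(pattern):
            pattern = os.path.join(data_directory, pattern)
        resolved.update(glob.glob(pattern))
    return sorted(resolved)


# Demo in a temporary directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("20250528.mktdata.ohlcv.db", "20250529.mktdata.ohlcv.db",
                 "20240501.mktdata.ohlcv.db", "special_data.db"):
        open(os.path.join(tmp, name), "w").close()
    patterns = ["2025*.mktdata.ohlcv.db", "202505*.mktdata.ohlcv.db",
                "special_data.db"]
    found = [os.path.basename(p) for p in resolve_datafiles([], patterns, tmp)]
    from_cli = resolve_datafiles(["a.db", "b.db"], patterns, tmp)
```

Here the two overlapping `2025*` and `202505*` patterns match the same files, which end up in the result only once; a pattern that matches nothing contributes no entries.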
## Output
### Console Output
- Lists all data files to be processed
- Shows auto-detected or specified instruments
- Displays trade signals for each file
- Prints returns by day and pair
- Shows grand totals and outstanding positions
### Database Output
- Creates database and table automatically
- Stores detailed trade information
- Includes calculated returns
- One record per symbol per trade
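
Because results are stored one record per symbol per trade, they can be summarized with plain SQL. The sketch below groups by day and pair. The demo table is trimmed to four of the documented columns, the rows are made up, the ISO date strings are an assumed storage format, and summing `symbol_return` is just one possible aggregate, not necessarily how the script computes its own totals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the file passed to --result_db
# Trimmed version of pt_bt_results with made-up demo rows.
conn.execute(
    "CREATE TABLE pt_bt_results "
    "(date DATE, pair TEXT, symbol TEXT, symbol_return REAL)"
)
conn.executemany(
    "INSERT INTO pt_bt_results VALUES (?, ?, ?, ?)",
    [
        ("2025-05-28", "BTC-USDT & ETH-USDT", "BTC-USDT", 0.5),
        ("2025-05-28", "BTC-USDT & ETH-USDT", "ETH-USDT", -0.25),
        ("2025-05-29", "BTC-USDT & ETH-USDT", "BTC-USDT", 1.0),
        ("2025-05-29", "BTC-USDT & ETH-USDT", "ETH-USDT", 0.5),
    ],
)

# Record count and summed symbol returns per day and pair.
summary = conn.execute(
    """
    SELECT date, pair, COUNT(*) AS n_records, SUM(symbol_return) AS total_return
    FROM pt_bt_results
    GROUP BY date, pair
    ORDER BY date, pair
    """
).fetchall()
conn.close()

for day, pair, n_records, total_return in summary:
    print(f"{day}  {pair}  records={n_records}  return={total_return:+.2f}%")
```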
## Error Handling
The script handles the following failure cases:
- Invalid data files are skipped with warnings
- Database connection errors are reported
- Auto-detection failures fall back gracefully
- Processing errors are logged with stack traces
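
The skip-and-continue behaviour described above follows a common pattern, sketched here. The names `process_all` and `fake_process` are hypothetical, and the set of caught exception types is an assumption; the script's actual handlers are not shown in this guide.

```python
import logging
import sqlite3

logger = logging.getLogger("pt_backtest_sketch")


def process_all(datafiles, process_one):
    """Run process_one on each file; log and skip files that raise."""
    processed, skipped = [], []
    for path in datafiles:
        try:
            process_one(path)
        except (sqlite3.Error, OSError, ValueError):
            # exc_info=True attaches the stack trace to the log record.
            logger.warning("Skipping %s", path, exc_info=True)
            skipped.append(path)
        else:
            processed.append(path)
    return processed, skipped


def fake_process(path):
    """Stand-in for the real per-file backtest; fails on 'bad' files."""
    if "bad" in path:
        raise sqlite3.DatabaseError("file is not a database")


processed, skipped = process_all(
    ["20250528.mktdata.ohlcv.db", "bad.db"], fake_process
)
```

One unreadable file then costs a warning in the log rather than the whole multi-day run.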
## Performance Considerations
- Wildcard expansion happens once at startup
- Database connections are opened and closed per operation
- Large numbers of files are processed sequentially
- Memory usage scales with the number of instruments and data points
## Troubleshooting
### Common Issues
1. **No instruments found**: Check that the database contains data for the specified `exchange_id`
2. **No data files found**: Verify wildcard patterns and the `data_directory` path
3. **Database errors**: Ensure write permissions for the result database path
4. **Memory issues**: Consider processing fewer files at once or reducing instrument count
### Debug Tips
- Use `--result_db NONE` to disable database output during testing
- Start with a small set of instruments using `--instruments`
- Test with explicit file lists using `--datafiles` before using wildcards
- Check console output for detailed processing information