pairs_trading/PAIRS_TRADING_BACKTEST_USAGE.md
2025-07-20 18:11:45 +00:00

Enhanced Pairs Trading Backtest Usage Guide

Overview

The enhanced pt_backtest.py script now supports multi-day and multi-instrument backtesting with SQLite database output. This guide explains how to use the new features.

New Features

1. Multi-Day Data Processing

  • Process multiple data files in a single run
  • Support for wildcard patterns in configuration files
  • CLI override for data file specification

2. Dynamic Instrument Selection

  • Auto-detection of instruments from database
  • CLI override for instrument specification
  • No need to manually update configuration files

3. SQLite Database Output

  • Automated storage of backtest results
  • Structured data format for analysis
  • Optional database output (can be disabled)

Command Line Arguments

Required Arguments

  • --config: Path to configuration file
  • --result_db: Path to SQLite database for results (use "NONE" to disable)

Optional Arguments

  • --datafiles: Comma-separated list of data files (overrides config)
  • --instruments: Comma-separated list of instruments (overrides auto-detection)
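
The arguments above could be declared with argparse roughly as follows. This is a minimal sketch, not the parser that pt_backtest.py actually uses: the helper names split_csv and build_parser are hypothetical, and the exact-match comparison against "NONE" is an assumption about how the script decides to skip database output.

```python
import argparse


def split_csv(value):
    """Split a comma-separated CLI value into trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_parser():
    # Hypothetical parser mirroring the documented arguments.
    parser = argparse.ArgumentParser(description="Pairs trading backtest")
    parser.add_argument("--config", required=True,
                        help="Path to configuration file")
    parser.add_argument("--result_db", required=True,
                        help='Path to SQLite results database, or "NONE" to disable')
    parser.add_argument("--datafiles", type=split_csv, default=None,
                        help="Comma-separated data files (overrides config)")
    parser.add_argument("--instruments", type=split_csv, default=None,
                        help="Comma-separated instruments (overrides auto-detection)")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args([
    "--config", "configuration/crypto.cfg",
    "--result_db", "NONE",
    "--instruments", "BTC-USDT, ETH-USDT",
])

# Assumption: database output is skipped only for the literal value "NONE".
write_results = args.result_db != "NONE"
```

Leaving --datafiles and --instruments at None lets later code tell "not given" apart from "given but empty", which is what the config fallback and auto-detection depend on.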

Usage Examples

Basic Usage (Auto-detect instruments, use config datafiles)

python src/pt_backtest.py --config configuration/crypto.cfg --result_db results.db

Specify Instruments via CLI

python src/pt_backtest.py \
    --config configuration/crypto.cfg \
    --result_db results.db \
    --instruments "BTC-USDT,ETH-USDT,ADA-USDT"

Override Data Files via CLI

python src/pt_backtest.py \
    --config configuration/crypto.cfg \
    --result_db results.db \
    --datafiles "20250528.mktdata.ohlcv.db,20250529.mktdata.ohlcv.db"

Complete Override (Custom instruments and data files)

python src/pt_backtest.py \
    --config configuration/crypto.cfg \
    --result_db results.db \
    --instruments "BTC-USDT,ETH-USDT" \
    --datafiles "20250528.mktdata.ohlcv.db,20250529.mktdata.ohlcv.db"

Disable Database Output

python src/pt_backtest.py \
    --config configuration/crypto.cfg \
    --result_db NONE

Configuration File Updates

Wildcard Support in Data Files

The configuration file now supports wildcards in the datafiles array:

{
    "datafiles": [
        "2025*.mktdata.ohlcv.db",
        "specific_file.db",
        "202405*.mktdata.ohlcv.db"
    ]
}

Multiple Patterns

You can specify multiple wildcard patterns:

{
    "datafiles": [
        "202405*.mktdata.ohlcv.db",
        "202406*.mktdata.ohlcv.db",
        "special_data.db"
    ]
}

Database Schema

The script creates a pt_bt_results table with the following schema:

| Column | Type | Description |
|---|---|---|
| date | DATE | Trading date extracted from filename |
| pair | TEXT | Trading pair name (e.g., "BTC-USDT & ETH-USDT") |
| symbol | TEXT | Individual symbol (e.g., "BTC-USDT") |
| open_time | DATETIME | Trade opening time |
| open_side | TEXT | Opening side (BUY/SELL) |
| open_price | REAL | Opening price |
| open_quantity | INTEGER | Opening quantity |
| open_disequilibrium | REAL | Disequilibrium at opening |
| close_time | DATETIME | Trade closing time |
| close_side | TEXT | Closing side (BUY/SELL) |
| close_price | REAL | Closing price |
| close_quantity | INTEGER | Closing quantity |
| close_disequilibrium | REAL | Disequilibrium at closing |
| symbol_return | REAL | Individual symbol return (%) |
| pair_return | REAL | Combined pair return (%) |
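
For reference, a table with these columns could be created with the standard sqlite3 module as sketched below. The column names and types come from the schema above. The CREATE TABLE statement itself, the absence of keys or constraints, and the helper name init_results_db are assumptions, since the script's actual DDL is not shown in this guide.

```python
import sqlite3

# Assumed DDL: column names and types follow the documented schema;
# constraints and indexes are unknown, so none are declared here.
CREATE_RESULTS_TABLE = """
CREATE TABLE IF NOT EXISTS pt_bt_results (
    date DATE,
    pair TEXT,
    symbol TEXT,
    open_time DATETIME,
    open_side TEXT,
    open_price REAL,
    open_quantity INTEGER,
    open_disequilibrium REAL,
    close_time DATETIME,
    close_side TEXT,
    close_price REAL,
    close_quantity INTEGER,
    close_disequilibrium REAL,
    symbol_return REAL,
    pair_return REAL
)
"""


def init_results_db(path):
    """Open (or create) the results database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(CREATE_RESULTS_TABLE)
    conn.commit()
    return conn


# Use an in-memory database so the sketch leaves no file behind.
conn = init_results_db(":memory:")
columns = [row[1] for row in conn.execute("PRAGMA table_info(pt_bt_results)")]
conn.close()
```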

Auto-Detection Logic

Instrument Auto-Detection

When --instruments is not specified, the script:

  1. Connects to each data file
  2. Queries distinct instrument_id values from the configured table
  3. Removes the configured prefix (instrument_id_pfx)
  4. Uses the resulting symbols for pair generation
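
The four steps above can be sketched as follows. This is an illustration under stated assumptions rather than the script's own code: the function name detect_instruments is hypothetical, the table name "bars" and the prefix "PAIR-" in the demo are made up, and any filtering by exchange or other columns is omitted.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def detect_instruments(datafiles, table, instrument_id_pfx):
    """Collect distinct instrument IDs from every data file, minus the prefix."""
    symbols = set()
    for path in datafiles:
        # Step 1: connect to each data file.
        with closing(sqlite3.connect(path)) as conn:
            # Step 2: query distinct instrument_id values. The table name comes
            # from trusted configuration; identifiers cannot be bound as parameters.
            rows = conn.execute(
                f"SELECT DISTINCT instrument_id FROM {table}"
            ).fetchall()
        for (instrument_id,) in rows:
            # Step 3: remove the configured prefix when present.
            if instrument_id.startswith(instrument_id_pfx):
                instrument_id = instrument_id[len(instrument_id_pfx):]
            symbols.add(instrument_id)
    # Step 4: hand back a stable, de-duplicated list for pair generation.
    return sorted(symbols)


# Demo against a throwaway database with made-up table name and prefix.
with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "20250528.mktdata.ohlcv.db")
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute("CREATE TABLE bars (instrument_id TEXT)")
        conn.executemany(
            "INSERT INTO bars VALUES (?)",
            [("PAIR-BTC-USDT",), ("PAIR-ETH-USDT",), ("PAIR-BTC-USDT",)],
        )
        conn.commit()
    detected = detect_instruments([db_path], "bars", "PAIR-")
```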

Data File Resolution

The script resolves data files in this order:

  1. If --datafiles is specified, use those files
  2. Otherwise, process each pattern in config datafiles:
    • Resolve relative patterns against data_directory
    • Expand wildcards using glob.glob()
    • Remove duplicates and sort the result
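
A minimal sketch of this resolution order is shown below. The function name resolve_datafiles is hypothetical, the config is assumed to be an already-parsed dict with datafiles and data_directory keys, and whether CLI-supplied files are also resolved against data_directory is not stated in this guide, so the sketch returns them unchanged.

```python
import glob
import os
import tempfile


def resolve_datafiles(cli_datafiles, config):
    """Return the data files to process: CLI list first, config patterns otherwise."""
    if cli_datafiles:
        # Assumption: CLI entries are used exactly as given.
        return list(cli_datafiles)

    data_dir = config.get("data_directory", ".")
    resolved = set()  # a set removes files matched by more than one pattern
    for pattern in config.get("datafiles", []):
        if not os.path.isabs(pattern):
            pattern = os.path.join(data_dir, pattern)
        # glob.glob returns only existing paths, for literal names as well.
        resolved.update(glob.glob(pattern))
    return sorted(resolved)


# Demo with empty placeholder files in a temporary data directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("20240501.mktdata.ohlcv.db",
                 "20240502.mktdata.ohlcv.db",
                 "special_data.db"):
        open(os.path.join(tmp, name), "w").close()

    config = {
        "data_directory": tmp,
        "datafiles": [
            "202405*.mktdata.ohlcv.db",
            "2024*.mktdata.ohlcv.db",   # overlaps the first pattern on purpose
            "special_data.db",
        ],
    }
    from_config = [os.path.basename(p) for p in resolve_datafiles(None, config)]
    from_cli = resolve_datafiles(["a.db", "b.db"], config)
```

Because overlapping patterns collapse into one set, each file is processed once no matter how many patterns match it.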

Output

Console Output

  • Lists all data files to be processed
  • Shows auto-detected or specified instruments
  • Displays trade signals for each file
  • Prints returns by day and pair
  • Shows grand totals and outstanding positions

Database Output

  • Creates database and table automatically
  • Stores detailed trade information
  • Includes calculated returns
  • One record per symbol per trade
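
Because each trade is stored once per symbol, the results can be summarized with ordinary SQL. The sketch below builds an in-memory table with the documented columns, inserts three made-up rows (illustrative values only, not real backtest output), and counts trades per day and symbol. The ISO date string format is an assumption, and summing percentage returns is shown only as an example of aggregation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE pt_bt_results (
        date DATE, pair TEXT, symbol TEXT,
        open_time DATETIME, open_side TEXT, open_price REAL,
        open_quantity INTEGER, open_disequilibrium REAL,
        close_time DATETIME, close_side TEXT, close_price REAL,
        close_quantity INTEGER, close_disequilibrium REAL,
        symbol_return REAL, pair_return REAL
    )
""")

# Made-up sample rows; the columns not listed here stay NULL.
sample_rows = [
    ("2025-05-28", "BTC-USDT & ETH-USDT", "BTC-USDT", 1.5),
    ("2025-05-28", "BTC-USDT & ETH-USDT", "ETH-USDT", 0.25),
    ("2025-05-28", "BTC-USDT & ETH-USDT", "BTC-USDT", -0.5),
]
conn.executemany(
    "INSERT INTO pt_bt_results (date, pair, symbol, symbol_return) "
    "VALUES (?, ?, ?, ?)",
    sample_rows,
)

# Number of trades and summed symbol return for each day and symbol.
summary = conn.execute("""
    SELECT date, symbol, COUNT(*) AS trades, SUM(symbol_return) AS total_return
    FROM pt_bt_results
    GROUP BY date, symbol
    ORDER BY date, symbol
""").fetchall()
conn.close()
```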

Error Handling

The script handles the following error cases:

  • Invalid data files are skipped with warnings
  • Database connection errors are reported
  • Auto-detection failures fall back gracefully
  • Processing errors are logged with stack traces
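
One way to get the skip-and-continue behavior described above is a per-file try/except around the processing step. The sketch below is an assumption about structure, not the script's actual code: process_all, run_backtest, and fake_backtest are hypothetical names, and the logger name is made up.

```python
import logging
import sqlite3

logger = logging.getLogger("pt_backtest_example")  # hypothetical logger name


def process_all(datafiles, run_backtest):
    """Run the backtest for each file, skipping files that fail."""
    processed, skipped = [], []
    for path in datafiles:
        try:
            run_backtest(path)
        except sqlite3.Error as exc:
            # Invalid or unreadable data file: warn and move on.
            logger.warning("Skipping %s: %s", path, exc)
            skipped.append(path)
        except Exception:
            # Any other processing error: log with the full stack trace.
            logger.exception("Processing failed for %s", path)
            skipped.append(path)
        else:
            processed.append(path)
    return processed, skipped


def fake_backtest(path):
    """Stand-in for the real per-file backtest, used only for this demo."""
    if path == "bad.db":
        raise sqlite3.DatabaseError("file is not a database")
    if path == "broken.db":
        raise ValueError("unexpected data")


processed, skipped = process_all(["good.db", "bad.db", "broken.db"], fake_backtest)
```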

Performance Considerations

  • Wildcard expansion happens once at startup
  • Database connections are opened/closed per operation
  • Files are processed sequentially, so run time grows with the number of files
  • Memory usage scales with the number of instruments and data points

Troubleshooting

Common Issues

  1. No instruments found: Check that the database contains data for the specified exchange_id
  2. No data files found: Verify wildcard patterns and data_directory path
  3. Database errors: Ensure write permissions for the result database path
  4. Memory issues: Consider processing fewer files at once or reducing instrument count

Debug Tips

  • Use --result_db NONE to disable database output during testing
  • Start with a small set of instruments using --instruments
  • Test with explicit file lists using --datafiles before using wildcards
  • Check console output for detailed processing information