## **Revision Document – v3 Output Contract & Figure Specifications**

This single guide merges **I/O plumbing**, **logging**, **CI hooks**, **artefact paths**, and **figure design** into one actionable playbook. Apply the steps **in order**, submitting small PRs so CI remains green throughout.

---

### 0 ▪ Foundations

| Step | File(s) | Action |
|------|---------|--------|
| 0.1 | **`config.yaml`** | Add `base_dirs: {results: results, models: models, logs: logs}` and `output: {figure_dpi: 150, figure_size: [16, 9], log_level: INFO}`. |
| 0.2 | `src/utils/run_id.py` | `make_run_id()` → `"20250418_152310_ab12cd"` (timestamp + short git hash; see the sketch below). |
| 0.3 | `src/__init__.py` | Expose `__version__`, `GIT_SHA`, `BUILD_DATE`. |
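For step 0.2, a minimal sketch of `make_run_id()`, assuming `git` is on `PATH` and the process runs inside the repository:

```python
# src/utils/run_id.py -- illustrative sketch only.
import subprocess
from datetime import datetime, timezone

def make_run_id() -> str:
    """Return e.g. '20250418_152310_ab12cd' (UTC timestamp + short git hash)."""
    ts = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    sha = subprocess.check_output(
        ["git", "rev-parse", "--short=6", "HEAD"], text=True
    ).strip()
    return f"{ts}_{sha}"
```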
---

### 1 ▪ Core I/O & Logging

| File | Content |
|------|---------|
| **`src/io_manager.py`** | `IOManager(cfg, run_id)`<br>• `path(section, name)`: returns the full path under `results`/`models`/`logs`/`figures`.<br>• `save_json`, `save_df` (CSV ≤ 100 MB, else Parquet), `save_figure` (uses cfg dpi/size). |
| **`src/logger_setup.py`** | `setup_logger(cfg, run_id, io)` with colourised console handler (INFO) + rotating file handler (DEBUG) in `logs/<run_id>/`. A combined sketch of both modules follows the banner below. |

**`run.py` entry banner**

```python
run_id = make_run_id()
cfg = load_config(args.config)
io = IOManager(cfg, run_id)
logger = setup_logger(cfg, run_id, io)
logger.info(f"GRU‑SAC v{__version__} | commit {GIT_SHA} | run {run_id}")
logger.info(f"Loaded config file: {args.config}")
```
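A minimal combined sketch of both modules, assuming the config keys from step 0.1; the handler sizes, formatter, logger name, and the memory-size proxy for the 100 MB rule are all illustrative:

```python
# Illustrative sketch of src/io_manager.py and src/logger_setup.py; only the
# class/function names, cfg keys, and file formats come from the contract.
import json
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path

class IOManager:
    def __init__(self, cfg: dict, run_id: str):
        self.cfg, self.run_id = cfg, run_id

    def path(self, section: str, name: str) -> Path:
        """Full artefact path for this run; falls back to the section name
        itself for dirs not listed in base_dirs (e.g. figures)."""
        base = Path(self.cfg["base_dirs"].get(section, section))
        p = base / self.run_id / name
        p.parent.mkdir(parents=True, exist_ok=True)
        return p

    def save_json(self, obj: dict, name: str) -> None:
        # The artefact table names these files *.txt even though they hold JSON.
        self.path("results", f"{name}.txt").write_text(json.dumps(obj, indent=2))

    def save_df(self, df, name: str) -> None:
        # Contract: CSV up to 100 MB, else Parquet (needs pyarrow installed);
        # in-memory size is used here as a cheap proxy for on-disk size.
        if df.memory_usage(deep=True).sum() <= 100 * 2**20:
            df.to_csv(self.path("results", f"{name}.csv"), index=False)
        else:
            df.to_parquet(self.path("results", f"{name}.parquet"))

    def save_figure(self, fig, name: str) -> None:
        out = self.cfg["output"]
        fig.set_size_inches(*out["figure_size"])
        fig.savefig(self.path("figures", f"{name}.png"), dpi=out["figure_dpi"])

def setup_logger(cfg: dict, run_id: str, io: IOManager) -> logging.Logger:
    logger = logging.getLogger("gru_sac")
    logger.setLevel(logging.DEBUG)
    console = logging.StreamHandler()  # colourisation (e.g. colorlog) omitted
    console.setLevel(getattr(logging, cfg["output"]["log_level"]))
    file_handler = RotatingFileHandler(
        io.path("logs", "run.log"), maxBytes=5 * 2**20, backupCount=3)
    file_handler.setLevel(logging.DEBUG)
    fmt = logging.Formatter("%(asctime)s %(levelname)-8s %(name)s: %(message)s")
    for handler in (console, file_handler):
        handler.setFormatter(fmt)
        logger.addHandler(handler)
    return logger
```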
---

### 2 ▪ Stage Outputs

| Stage | Implementation notes | Artefacts |
|-------|----------------------|-----------|
| **Data load & preprocess** | After sampling/NaN purge, save:<br>`io.save_json(summary, "preprocess_summary")`<br>`io.save_df(df.head(20), "head_preprocessed")` | `results/<run_id>/preprocess_summary.txt`<br>`head_preprocessed.csv` |
| **Feature engineering** | Generate correlation heat‑map (see figure table) → `io.save_figure(..., "feature_corr_heatmap")` | `feature_corr_heatmap.png` |
| **Label generation** | Log the label distribution; produce histogram figure. | `label_histogram.png` |
| **Baseline 1 & 2** | Consolidate in `baseline_checker.py`; each returns a dict with accuracy, CI, etc.<br>`io.save_json(report, "baseline1_report")` (and 2). | `baseline1_report.txt` / `baseline2_report.txt` |
| **Feature whitelist** | Save JSON to `models/<run_id>/final_whitelist_<run_id>.json`. | — |
| **GRU training** | Use Keras `CSVLogger` to write `logs/<run_id>/gru_history.csv`; after training, plot the learning curve (see the sketch after this table). | `gru_learning_curve.png` + `.keras` model |
| **Calibration (Vector)** | Save `calibrator_vec_<run_id>.npy`; plot reliability curve. | `reliability_curve_val_<run_id>.png` |
| **SAC training** | Write `episode_rewards.csv`, plot the reward curve, save the final agent under `models/sac_train_<run_id>/`. | `sac_reward_plot.png` |
| **Back‑test** | Save step‑level CSV, metrics JSON, summary figure. | `backtest_results_<run_id>.csv`<br>`performance_metrics_<run_id>.txt`<br>`backtest_summary_<run_id>.png` |
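For the GRU row above, a sketch of the training hooks; only `CSVLogger` and the artefact names come from the contract, while `plot_learning_curve` and the epoch/patience values are hypothetical:

```python
# Sketch of the GRU-training stage outputs; assumes the `io` manager from
# section 1 and an already-compiled Keras model.
import pandas as pd
from tensorflow.keras.callbacks import CSVLogger, EarlyStopping

def train_gru(model, X_train, y_train, X_val, y_val, io, epochs: int = 100):
    history_csv = io.path("logs", "gru_history.csv")
    model.fit(
        X_train, y_train,
        validation_data=(X_val, y_val),
        epochs=epochs,
        callbacks=[CSVLogger(str(history_csv)),
                   EarlyStopping(patience=10, restore_best_weights=True)],
    )
    model.save(io.path("models", "gru_model.keras"))  # the .keras artefact
    history = pd.read_csv(history_csv)                # epoch, loss, val metrics
    fig = plot_learning_curve(history)                # hypothetical helper
    io.save_figure(fig, "gru_learning_curve")
```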
---

### 3 ▪ Figure Specifications

| File | Visualises | Layout / Details |
|------|------------|------------------|
| **feature_corr_heatmap.png** | Pearson correlation of engineered features (pre‑prune). | Square heat‑map, features sorted by \|ρ\| vs target; diverging palette centred at 0; annotate cells with \|ρ\| > 0.5; colour‑bar. |
| **label_histogram.png** | Direction‑label class mix (train split). | Bar chart: Down / Flat / Up (binary shows two bars). Percentages on bar tops; title shows the ε value. |
| **gru_learning_curve.png** | GRU training progress. | 3 stacked panes: total loss (log‑y), val dir3 accuracy, vertical dashed “early‑stop” marker; shared epoch axis. |
| **reliability_curve_val_*.png** | Calibration quality post‑Vector scaling. | Left 70 %: reliability diagram (10 equal‑frequency bins). Right 30 %: histogram of predicted p_up. Title shows ECE & Brier (see the sketch after this table). |
| **sac_reward_plot.png** | Offline SAC learning curve. | Smoothed episode reward (EMA 0.2) vs steps; action variance on twin y‑axis; checkpoint ticks. |
| **backtest_summary_*.png** | Live back‑test overview. | 3 stacked plots:<br>1) Price line with blue/red background shading where edge ≥ 0.1.<br>2) Position‑size step graph.<br>3) Equity curve with shaded draw‑downs; textbox shows Sharpe & Max DD. |

_All figures_: 16 × 9 in, 150 DPI, `plt.tight_layout()`, footer `"© GRU‑SAC v3"` at bottom right.
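As one concrete example of these specs, a sketch of the reliability figure with equal-frequency binning and ECE/Brier computed inline; the variable names are illustrative:

```python
# Reliability diagram per the spec above: left 70 % diagram, right 30 %
# histogram of predicted p_up, ECE & Brier in the title.
import matplotlib.pyplot as plt
import numpy as np

def plot_reliability(p_up: np.ndarray, y: np.ndarray, n_bins: int = 10):
    # Equal-frequency bin edges; assumes enough distinct p_up values that
    # no bin ends up empty.
    edges = np.quantile(p_up, np.linspace(0, 1, n_bins + 1))
    idx = np.clip(np.searchsorted(edges, p_up, side="right") - 1, 0, n_bins - 1)
    conf = np.array([p_up[idx == b].mean() for b in range(n_bins)])
    acc = np.array([y[idx == b].mean() for b in range(n_bins)])
    weight = np.bincount(idx, minlength=n_bins) / len(p_up)
    ece = float(np.sum(weight * np.abs(acc - conf)))
    brier = float(np.mean((p_up - y) ** 2))

    fig, (ax_rel, ax_hist) = plt.subplots(
        1, 2, figsize=(16, 9), gridspec_kw={"width_ratios": [7, 3]})
    ax_rel.plot([0, 1], [0, 1], "k--", label="perfectly calibrated")
    ax_rel.plot(conf, acc, "o-", label="model")
    ax_rel.set_xlabel("Predicted p_up")
    ax_rel.set_ylabel("Empirical up-frequency")
    ax_rel.legend()
    ax_hist.hist(p_up, bins=20)
    ax_hist.set_xlabel("Predicted p_up")
    fig.suptitle(f"Reliability (val): ECE {ece:.3f}, Brier {brier:.3f}")
    fig.tight_layout()
    return fig
```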
---

### 4 ▪ Unit Tests

* `tests/test_output_contract.py`
  * Run the mini‑pipeline (`tests/smoke.yaml`); assert each required file exists and is > 2 KB (sketched below).
  * Validate JSON keys (`accuracy`, `ci_lower`, etc.).
  * `assert_allclose(softmax(logits), probs)` for the logits view.
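A sketch of the contract test; the `run_dir` fixture, assumed here to resolve to `results/<run_id>/` of the smoke run, is a guess about the test harness:

```python
# tests/test_output_contract.py -- illustrative sketch.
import json
from pathlib import Path

REQUIRED = ["preprocess_summary.txt", "baseline1_report.txt",
            "baseline2_report.txt"]

def test_required_artefacts(run_dir: Path):
    for name in REQUIRED:
        artefact = run_dir / name
        assert artefact.exists(), f"{name} missing"
        assert artefact.stat().st_size > 2048, f"{name} under 2 KB"

def test_baseline_report_keys(run_dir: Path):
    report = json.loads((run_dir / "baseline1_report.txt").read_text())
    assert {"accuracy", "ci_lower"} <= report.keys()
```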
---

### 5 ▪ CI Workflow (`.github/workflows/pipeline.yml`)

```yaml
jobs:
  build-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with: {python-version: "3.10"}
      - run: pip install -r requirements.txt
      - run: black --check .
      - run: ruff check .
      - run: pytest -q
      - name: Smoke e2e
        run: python run.py --config tests/smoke.yaml
      - name: Upload artefacts
        uses: actions/upload-artifact@v4
        with:
          name: run-${{ github.sha }}
          path: |
            results/*/*
            logs/*/*
```

---

### 6 ▪ Documentation Updates

* **`README.md`** → new *Outputs* section reproducing the artefact table.
* **`docs/v3_changelog.md`** → one‑pager summarising the v3 versus v2 differences (labels, calibration, outputs).

---

### 7 ▪ Roll‑out Plan (5‑PR cadence)

1. **PR #1** – run‑id, IOManager, logger, CI log upload.
2. **PR #2** – data & feature stage outputs + tests.
3. **PR #3** – GRU training outputs + calibration figure.
4. **PR #4** – SAC & back‑test outputs, reward & summary figures.
5. **PR #5** – docs & README refresh.

Tag `v3.0.0` after PR #5 passes.

---

### 8 ▪ Success Criteria for CI

Fail the pipeline when **any** of the following occurs (a sketch of such a gate follows the list):

* `baseline1_report.txt` CI lower bound < 0.52
* `edge_filtered_accuracy` (val) < 0.60
* Back‑test Sharpe < 1.2 or Max DD > 15 %
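A sketch of how these gates could be enforced as a final CI step; the script path and the metric key names inside the reports are assumptions, and which report carries `edge_filtered_accuracy` is likewise a guess:

```python
# scripts/check_gates.py -- illustrative CI gate; invoke as
#   python scripts/check_gates.py results/<run_id>
import json
import sys
from pathlib import Path

def check_gates(run_dir: Path) -> list[str]:
    failures = []
    baseline = json.loads((run_dir / "baseline1_report.txt").read_text())
    if baseline["ci_lower"] < 0.52:                 # key name assumed
        failures.append("baseline1 CI lower bound < 0.52")
    perf_file = next(run_dir.glob("performance_metrics_*.txt"))
    perf = json.loads(perf_file.read_text())
    if perf["edge_filtered_accuracy"] < 0.60:       # key name assumed
        failures.append("edge-filtered accuracy (val) < 0.60")
    if perf["sharpe"] < 1.2 or perf["max_drawdown"] > 0.15:
        failures.append("Sharpe < 1.2 or Max DD > 15 %")
    return failures

if __name__ == "__main__":
    problems = check_gates(Path(sys.argv[1]))
    for problem in problems:
        print("GATE FAIL:", problem)
    sys.exit(1 if problems else 0)
```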
---

Implementing this **single integrated revision** provides:

* **Deterministic artefact paths** for every run.
* **Rich, shareable figures** for quick diagnostics.
* **Audit‑ready logs/reports** for research traceability.

Merge each step once CI is green; you’ll have a reproducible, fully instrumented pipeline ready for iterative accuracy pushes toward the 65 % target.