## **Revision Document – v3 Output Contract & Figure Specifications**

This single guide merges **I/O plumbing**, **logging**, **CI hooks**, **artefact paths**, and **figure design** into one actionable playbook.

Apply the steps **in order**, submitting small PRs so CI remains green throughout.

---
### 0 ▪ Foundations
| Step | File(s) | Action |
|------|---------|--------|
| 0.1 | **`config.yaml`** | Add:<br>`base_dirs: {results: results, models: models, logs: logs}`<br>`output: {figure_dpi: 150, figure_size: [16, 9], log_level: INFO}` |
| 0.2 | `src/utils/run_id.py` | `make_run_id()` → `"20250418_152310_ab12cd"` (timestamp + short git‑hash; see the sketch below). |
| 0.3 | `src/__init__.py` | Expose `__version__`, `GIT_SHA`, `BUILD_DATE`. |

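
A minimal sketch of `make_run_id()` for step 0.2 (the `nogit` fallback for non‑git environments is an assumption, not part of the spec):

```python
# src/utils/run_id.py: sketch, UTC timestamp + short git hash.
import subprocess
from datetime import datetime, timezone


def make_run_id() -> str:
    """Return an id such as '20250418_152310_ab12cd'."""
    ts = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    try:
        sha = subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"], text=True
        ).strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        sha = "nogit"  # assumption: degrade gracefully outside a git checkout
    return f"{ts}_{sha}"
```
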
---
### 1 ▪ Core I/O & Logging
| File | Content |
|------|---------|
| **`src/io_manager.py`** | `IOManager(cfg, run_id)` (sketch below)<br>• `path(section, name)`: returns the full path under `results`/`models`/`logs`/`figures`.<br>• `save_json`, `save_df` (CSV if ≤ 100 MB, else Parquet), `save_figure` (uses the cfg dpi/size). |
| **`src/logger_setup.py`** | `setup_logger(cfg, run_id, io)` with a colourised console handler (INFO) plus a rotating file handler (DEBUG) writing to `logs/<run_id>/`. |

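
A minimal sketch of `IOManager` under this contract (the exact directory layout and the in‑memory size check for the CSV/Parquet switch are assumptions):

```python
# src/io_manager.py: sketch; section dirs come from cfg["base_dirs"],
# with a fallback so "figures" works even if it isn't listed there.
import json
from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd


class IOManager:
    def __init__(self, cfg: dict, run_id: str):
        self.cfg, self.run_id = cfg, run_id

    def path(self, section: str, name: str) -> Path:
        """Full artefact path, e.g. results/<run_id>/<name>; creates parents."""
        base = Path(self.cfg["base_dirs"].get(section, section))
        full = base / self.run_id / name
        full.parent.mkdir(parents=True, exist_ok=True)
        return full

    def save_json(self, obj: dict, name: str) -> Path:
        out = self.path("results", f"{name}.json")
        out.write_text(json.dumps(obj, indent=2))
        return out

    def save_df(self, df: pd.DataFrame, name: str) -> Path:
        # CSV for frames up to ~100 MB in memory, Parquet beyond that.
        if df.memory_usage(deep=True).sum() <= 100 * 2**20:
            out = self.path("results", f"{name}.csv")
            df.to_csv(out, index=False)
        else:
            out = self.path("results", f"{name}.parquet")
            df.to_parquet(out)
        return out

    def save_figure(self, fig: plt.Figure, name: str) -> Path:
        out = self.path("figures", f"{name}.png")
        fig.set_size_inches(*self.cfg["output"]["figure_size"])
        fig.savefig(out, dpi=self.cfg["output"]["figure_dpi"])
        plt.close(fig)
        return out
```
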
**`run.py` entry banner**

```python
# Locations per §0/§1; `load_config` lives wherever your config loading sits.
from src import GIT_SHA, __version__
from src.io_manager import IOManager
from src.logger_setup import setup_logger
from src.utils.run_id import make_run_id

run_id = make_run_id()
cfg = load_config(args.config)
io = IOManager(cfg, run_id)
logger = setup_logger(cfg, run_id, io)
logger.info(f"GRU‑SAC v{__version__} | commit {GIT_SHA} | run {run_id}")
logger.info(f"Loaded config file: {args.config}")
```
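
A matching `setup_logger` sketch (plain handlers shown; a colour formatter such as `colorlog` would slot into the console handler, which is an assumption here):

```python
# src/logger_setup.py: sketch; console at cfg log_level, file at DEBUG.
import logging
from logging.handlers import RotatingFileHandler


def setup_logger(cfg: dict, run_id: str, io) -> logging.Logger:
    logger = logging.getLogger("gru_sac")
    logger.setLevel(logging.DEBUG)

    # Console handler at the level configured in config.yaml.
    console = logging.StreamHandler()
    console.setLevel(cfg["output"].get("log_level", "INFO"))
    logger.addHandler(console)

    # Rotating file handler capturing the full DEBUG stream.
    file_handler = RotatingFileHandler(
        io.path("logs", "pipeline.log"),  # -> logs/<run_id>/pipeline.log
        maxBytes=10 * 2**20,
        backupCount=3,
    )
    file_handler.setLevel(logging.DEBUG)
    file_handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(file_handler)
    return logger
```
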
---
### 2 ▪ Stage Outputs
| Stage | Implementation notes | Artefacts |
|-------|----------------------|-----------|
| **Data load & preprocess** | After sampling and the NaN purge, save (see the sketch after this table):<br>`io.save_json(summary, "preprocess_summary")`<br>`io.save_df(df.head(20), "head_preprocessed")` | `results/<run_id>/preprocess_summary.json`<br>`head_preprocessed.csv` |
| **Feature engineering** | Generate the correlation heat‑map (see the figure table) → `io.save_figure(..., "feature_corr_heatmap")` | `feature_corr_heatmap.png` |
| **Label generation** | Log the class distribution; produce the histogram figure. | `label_histogram.png` |
| **Baseline 1 & 2** | Consolidate in `baseline_checker.py`; each returns a dict with accuracy, CI, etc.<br>`io.save_json(report, "baseline1_report")` (likewise for baseline 2). | `baseline1_report.json` / `baseline2_report.json` |
| **Feature whitelist** | Save JSON to `models/<run_id>/final_whitelist_<run_id>.json`. | — |
| **GRU training** | Use the Keras `CSVLogger` callback to write `logs/<run_id>/gru_history.csv`; after training, plot the learning curve. | `gru_learning_curve.png` + `.keras` model |
| **Calibration (vector scaling)** | Save `calibrator_vec_<run_id>.npy`; plot the reliability curve. | `reliability_curve_val_<run_id>.png` |
| **SAC training** | Write `episode_rewards.csv`, plot the reward curve, and save the final agent under `models/sac_train_<run_id>/`. | `sac_reward_plot.png` |
| **Back‑test** | Save the step‑level CSV, metrics JSON, and summary figure. | `backtest_results_<run_id>.csv`<br>`performance_metrics_<run_id>.json`<br>`backtest_summary_<run_id>.png` |

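
As an illustration of the per‑stage pattern, a hypothetical preprocess stage wired to `IOManager` (the sampling logic is elided and the `summary` keys are placeholders, not part of the contract):

```python
import pandas as pd


def preprocess_stage(raw_df: pd.DataFrame, io, logger) -> pd.DataFrame:
    """Purge NaN rows, then emit this stage's contract artefacts."""
    df = raw_df.dropna()
    summary = {
        "rows_in": len(raw_df),
        "rows_out": len(df),
        "nan_rows_dropped": len(raw_df) - len(df),
    }
    io.save_json(summary, "preprocess_summary")   # -> preprocess_summary.json
    io.save_df(df.head(20), "head_preprocessed")  # -> head_preprocessed.csv
    logger.info("Preprocess complete: %s", summary)
    return df
```
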
---
### 3 ▪ Figure Specifications
| File | Visualises | Layout / Details |
|------|-------------|------------------|
| **feature_corr_heatmap.png** | Pearson correlation of engineered features (pre‑prune). | Square heat‑map, features sorted by \|ρ\| vs the target; diverging palette centred at 0; annotate cells with \|ρ\| > 0.5; colour‑bar. |
| **label_histogram.png** | Direction‑label class mix (train split). | Bar chart: Down / Flat / Up (binary labels show two bars). Percentages on bar tops; title shows the ε value. |
| **gru_learning_curve.png** | GRU training progress. | Stacked panes sharing the epoch axis: total loss (log‑y) and val dir3 accuracy; a vertical dashed line marks the early‑stop epoch. |
| **reliability_curve_val_*.png** | Calibration quality after vector scaling. | Left 70 %: reliability diagram (10 equal‑frequency bins). Right 30 %: histogram of predicted p_up. Title shows ECE & Brier score. |
| **sac_reward_plot.png** | Offline SAC learning curve. | Smoothed episode reward (EMA 0.2) vs steps; action variance on a twin y‑axis; checkpoint ticks. |
| **backtest_summary_*.png** | Live back‑test overview. | 3 stacked plots:<br>1) Price line with blue/red background shading where edge ≥ 0.1.<br>2) Position‑size step graph.<br>3) Equity curve with shaded draw‑downs; textbox shows Sharpe & Max DD. |

_All figures_: 16 × 9 in at 150 DPI, `plt.tight_layout()`, footer `"© GRU‑SAC v3"` at bottom‑right.
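
A small helper sketch applying these shared conventions (the name `finalize_figure` is an assumption):

```python
import matplotlib.pyplot as plt


def finalize_figure(fig: plt.Figure) -> plt.Figure:
    """Apply the shared 16 x 9, tight-layout, and footer conventions."""
    fig.set_size_inches(16, 9)
    fig.tight_layout()
    fig.text(0.99, 0.01, "© GRU‑SAC v3", ha="right", va="bottom", fontsize=8)
    return fig
```
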
---
### 4 ▪ Unit Tests
* `tests/test_output_contract.py` (see the sketch below)
  * Run the mini‑pipeline (`tests/smoke.yaml`) and assert that each required file exists and is larger than 2 KB.
  * Validate JSON keys (`accuracy`, `ci_lower`, etc.).
  * `np.testing.assert_allclose(softmax(logits), probs)` for the logits view.
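
A sketch of how those assertions might look (the `run_dir` fixture and the `REQUIRED` list are illustrative, not part of the contract):

```python
# tests/test_output_contract.py: sketch; extend REQUIRED per the stage table.
import json
from pathlib import Path

import numpy as np
from scipy.special import softmax

REQUIRED = ["preprocess_summary.json", "head_preprocessed.csv"]


def test_artefacts_exist(run_dir: Path):
    for name in REQUIRED:
        f = run_dir / name
        assert f.exists() and f.stat().st_size > 2048, f"missing or too small: {name}"


def test_baseline_report_keys(run_dir: Path):
    report = json.loads((run_dir / "baseline1_report.json").read_text())
    assert {"accuracy", "ci_lower"} <= report.keys()


def test_logits_view_matches_probs(logits: np.ndarray, probs: np.ndarray):
    np.testing.assert_allclose(softmax(logits, axis=-1), probs, atol=1e-6)
```
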
---
### 5 ▪ CI Workflow (`.github/workflows/pipeline.yml`)
```yaml
jobs:
  build-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with: {python-version: "3.10"}
      - run: pip install -r requirements.txt
      - run: black --check .
      - run: ruff check .
      - run: pytest -q
      - name: Smoke e2e
        run: python run.py --config tests/smoke.yaml
      - name: Upload artefacts
        uses: actions/upload-artifact@v4
        with:
          name: run-${{ github.sha }}
          path: |
            results/*/*
            logs/*/*
```
---
### 6 ▪ Documentation Updates
* **`README.md`** → new *Outputs* section reproducing the artefact table.
* **`docs/v3_changelog.md`** → one‑pager summarising v3 versus v2 differences (labels, calibration, outputs).
---
### 7 ▪ Roll‑out Plan (5‑PR cadence)
1. **PR #1** – run‑id, IOManager, logger, CI log upload.
2. **PR #2** – data & feature stage outputs + tests.
3. **PR #3** – GRU training outputs + calibration figure.
4. **PR #4** – SAC & back‑test outputs, reward & summary figs.
5. **PR #5** – docs & README refresh.

Tag `v3.0.0` after PR #5 passes.

---
### 8 ▪ Success Criteria for CI
Fail the pipeline when **any** of the following occurs (a gate‑script sketch follows the list):

* CI lower bound (`ci_lower`) in `baseline1_report.json` < 0.52
* `edge_filtered_accuracy` (val) < 0.60
* Back‑test Sharpe < 1.2 or Max DD > 15 %
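
A sketch of the gate script CI could run after the smoke e2e (file names follow the §2 contract; the `sharpe` and `max_drawdown` key names are assumptions):

```python
# scripts/check_success_criteria.py: illustrative gate, exits non-zero on failure.
import json
import sys
from pathlib import Path

run_dir = Path(sys.argv[1])  # e.g. results/<run_id>
run_id = run_dir.name

baseline = json.loads((run_dir / "baseline1_report.json").read_text())
metrics = json.loads((run_dir / f"performance_metrics_{run_id}.json").read_text())

failures = []
if baseline["ci_lower"] < 0.52:
    failures.append("baseline1 CI lower bound < 0.52")
if metrics["edge_filtered_accuracy"] < 0.60:
    failures.append("edge-filtered accuracy (val) < 0.60")
if metrics["sharpe"] < 1.2 or metrics["max_drawdown"] > 0.15:
    failures.append("Sharpe < 1.2 or Max DD > 15 %")

if failures:
    print("FAIL:", "; ".join(failures))
    sys.exit(1)
print("All success criteria met.")
```
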
---
Implementing this **single integrated revision** provides:

* **Deterministic artefact paths** for every run.
* **Rich, shareable figures** for quick diagnostics.
* **Audit‑ready logs/reports** for research traceability.

Merge each step once CI is green; you’ll have a reproducible, fully instrumented pipeline ready for iterative accuracy pushes toward the 65 % target.