gru_sac_predictor/prompts/output_artefacts.txt
2025-04-20 17:52:49 +00:00


## **Revision Document v3: Output Contract & Figure Specifications**
This single guide merges **I/O plumbing**, **logging**, **CI hooks**, **artefact paths**, and **figure design** into one actionable playbook.
Apply the steps **in order**, submitting small PRs so CI remains green throughout.
---
### 0  Foundations
| Step | File(s) | Action |
|------|---------|--------|
| 0.1 | **`config.yaml`** | Add: ```yaml base_dirs: {results: results, models: models, logs: logs} output: {figure_dpi: 150, figure_size: [16, 9], log_level: INFO}``` |
| 0.2 | `src/utils/run_id.py` | `make_run_id()` → `"20250418_152310_ab12cd"` (timestamp + short git hash). |
| 0.3 | `src/__init__.py` | Expose `__version__`, `GIT_SHA`, `BUILD_DATE`. |
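A minimal sketch of `make_run_id()` from step 0.2; the `nogit` fallback is an assumption for environments where the repo hash cannot be read:

```python
# Sketch of src/utils/run_id.py -- assumes git is on PATH; falls back
# to "nogit" (an assumption, not in the spec) outside a repository.
import subprocess
from datetime import datetime, timezone


def make_run_id() -> str:
    """Return e.g. '20250418_152310_ab12cd' (UTC timestamp + short git hash)."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    try:
        sha = subprocess.check_output(
            ["git", "rev-parse", "--short=6", "HEAD"],
            text=True,
            stderr=subprocess.DEVNULL,
        ).strip()
    except (subprocess.CalledProcessError, FileNotFoundError):
        sha = "nogit"
    return f"{stamp}_{sha}"
```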
---
### 1  Core I/O & Logging
| File | Content |
|------|---------|
| **`src/io_manager.py`** | `IOManager(cfg, run_id)` <br>• `path(section, name)`: returns full path under `results` / `models` / `logs` / `figures`.<br>• `save_json`, `save_df` (CSV if ≤ 100 MB, else Parquet), `save_figure` (uses cfg dpi/size). |
| **`src/logger_setup.py`** | `setup_logger(cfg, run_id, io)` with colourised console (INFO) + rotating file handler (DEBUG) in `logs/<run_id>/`. |
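A sketch of the `IOManager` core, assuming the `base_dirs` config keys from step 0.1; `save_df` and `save_figure` are omitted for brevity:

```python
# Sketch of src/io_manager.py -- path() creates per-run directories on
# demand; save_json writes pretty-printed JSON. Storing reports with a
# .txt extension follows the artefact names in the stage-outputs table.
import json
from pathlib import Path


class IOManager:
    def __init__(self, cfg: dict, run_id: str):
        self.cfg, self.run_id = cfg, run_id
        self.base = {k: Path(v) for k, v in cfg["base_dirs"].items()}

    def path(self, section: str, name: str) -> Path:
        d = self.base[section] / self.run_id
        d.mkdir(parents=True, exist_ok=True)
        return d / name

    def save_json(self, obj: dict, name: str) -> Path:
        p = self.path("results", f"{name}.txt")
        p.write_text(json.dumps(obj, indent=2))
        return p
```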
**`run.py` entry banner**
```python
run_id = make_run_id()
cfg = load_config(args.config)
io = IOManager(cfg, run_id)
logger = setup_logger(cfg, run_id, io)
logger.info(f"GRU-SAC v{__version__} | commit {GIT_SHA} | run {run_id}")
logger.info(f"Loaded config file: {args.config}")
```
---
### 2  Stage Outputs
| Stage | Implementation notes | Artefacts |
|-------|---------------------|-----------|
| **Data load & preprocess** | After sampling/NaN purge save: <br>`io.save_json(summary, "preprocess_summary")`<br>`io.save_df(df.head(20), "head_preprocessed")` | `results/<run_id>/preprocess_summary.txt`<br>`head_preprocessed.csv` |
| **Feature engineering** | Generate correlation heatmap (see figure table) → `io.save_figure(...,"feature_corr_heatmap")` | `feature_corr_heatmap.png` |
| **Label generation** | Log distribution; produce histogram figure. | `label_histogram.png` |
| **Baseline 1 & 2** | Consolidate in `baseline_checker.py`; each returns dict with accuracy, CI etc. <br>`io.save_json(report,"baseline1_report")` (and 2). | `baseline1_report.txt / baseline2_report.txt` |
| **Feature whitelist** | Save JSON to `models/<run_id>/final_whitelist_<run_id>.json`. | — |
| **GRU training** | Use Keras CSVLogger to `logs/<run_id>/gru_history.csv`; after training plot learning curve. | `gru_learning_curve.png` + `.keras` model |
| **Calibration (Vector)** | Save `calibrator_vec_<run_id>.npy`; plot reliability curve. | `reliability_curve_val_<run_id>.png` |
| **SAC training** | Write `episode_rewards.csv`, plot reward curve, save final agent under `models/sac_train_<run_id>/`. | `sac_reward_plot.png` |
| **Backtest** | Save step-level CSV, metrics JSON, summary figure. | `backtest_results_<run_id>.csv`<br>`performance_metrics_<run_id>.txt`<br>`backtest_summary_<run_id>.png` |
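The baseline reports above must carry `accuracy`, `ci_lower`, etc. A sketch of how `baseline_checker.py` might build that dict, using a Wilson score interval (one reasonable CI choice; the spec does not mandate it):

```python
# Sketch of a baseline_checker.py report dict -- the Wilson score
# interval is an assumption; the key names mirror the contract
# validated by tests/test_output_contract.py (accuracy, ci_lower, ...).
import math


def baseline_report(n_correct: int, n_total: int, z: float = 1.96) -> dict:
    """Accuracy with a Wilson score 95% CI, as a JSON-serialisable dict."""
    p = n_correct / n_total
    denom = 1 + z * z / n_total
    centre = (p + z * z / (2 * n_total)) / denom
    half = (z / denom) * math.sqrt(
        p * (1 - p) / n_total + z * z / (4 * n_total * n_total)
    )
    return {
        "accuracy": round(p, 4),
        "ci_lower": round(centre - half, 4),
        "ci_upper": round(centre + half, 4),
        "n": n_total,
    }
```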
---
### 3  Figure Specifications
| File | Visualises | Layout / Details |
|------|-------------|------------------|
| **feature_corr_heatmap.png** | Pearson correlation of engineered features (pre-prune). | Square heatmap, features sorted by \|ρ\| vs target; diverging palette centred at 0; annotate \|ρ\| > 0.5; colourbar. |
| **label_histogram.png** | Direction-label class mix (train split). | Bar chart: Down / Flat / Up (binary shows two). Percentages on bar tops; title shows ε value. |
| **gru_learning_curve.png** | GRU training progress. | 3 stacked panes: total loss (log-y), val dir3 accuracy, vertical dashed “early-stop” line; shared epoch axis. |
| **reliability_curve_val_*.png** | Calibration quality post-Vector scaling. | Left 70 %: reliability diagram (10 equal-frequency bins). Right 30 %: histogram of predicted p_up. Title shows ECE & Brier. |
| **sac_reward_plot.png** | Offline SAC learning curve. | Smoothed episode reward (EMA 0.2) vs steps; action variance on twin y-axis; checkpoint ticks. |
| **backtest_summary_*.png** | Live backtest overview. | 3 stacked plots:<br>1) Price line + blue/red background for edge ≥ 0.1.<br>2) Position-size step graph.<br>3) Equity curve with shaded drawdowns; text box shows Sharpe & Max DD. |
_All figs_: 16 × 9 in, 150 DPI, `plt.tight_layout()`, footer `"© GRU-SAC v3"` bottom-right.
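The ECE and Brier numbers shown in the reliability-curve title can be computed as below. This is a dependency-free sketch (the function name is hypothetical), using the 10 equal-frequency bins the figure spec calls for:

```python
# Sketch of the ECE / Brier computation for the reliability-curve title.
# Equal-frequency binning: sort by predicted probability, then slice
# into n_bins near-equal chunks.
def ece_and_brier(p_up, y, n_bins=10):
    """Expected calibration error (equal-freq bins) and Brier score."""
    pairs = sorted(zip(p_up, y))
    n = len(pairs)
    ece = 0.0
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        conf = sum(p for p, _ in chunk) / len(chunk)   # mean predicted p_up
        acc = sum(lab for _, lab in chunk) / len(chunk)  # empirical freq of "up"
        ece += len(chunk) / n * abs(acc - conf)
    brier = sum((p - lab) ** 2 for p, lab in zip(p_up, y)) / n
    return ece, brier
```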
---
### 4  Unit Tests
* `tests/test_output_contract.py`
  * Run the mini-pipeline (`tests/smoke.yaml`); assert each required file exists and is > 2 KB.
  * Validate JSON keys (`accuracy`, `ci_lower`, etc.).
  * `assert_allclose(softmax(logits), probs)` for the logits view.
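A sketch of the file-size and JSON-key checks; the `REQUIRED` list and the `check_artefacts` helper name are assumptions, while the > 2 KB and key thresholds come from the bullets above:

```python
# Sketch of tests/test_output_contract.py -- REQUIRED and the helper
# name are illustrative; thresholds mirror the contract (> 2 KB files,
# required report keys).
import json
from pathlib import Path

REQUIRED = ["preprocess_summary.txt", "baseline1_report.txt", "baseline2_report.txt"]


def check_artefacts(run_dir: Path) -> None:
    for name in REQUIRED:
        f = run_dir / name
        assert f.exists() and f.stat().st_size > 2048, f"missing/small: {name}"
    report = json.loads((run_dir / "baseline1_report.txt").read_text())
    for key in ("accuracy", "ci_lower"):
        assert key in report, f"missing key: {key}"
```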
---
### 5  CI Workflow (`.github/workflows/pipeline.yml`)
```yaml
jobs:
  build-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with: {python-version: "3.10"}
      - run: pip install -r requirements.txt
      - run: black --check .
      - run: ruff .
      - run: pytest -q
      - name: Smoke e2e
        run: python run.py --config tests/smoke.yaml
      - name: Upload artefacts
        uses: actions/upload-artifact@v4
        with:
          name: run-${{ github.sha }}
          path: |
            results/*/*
            logs/*/*
```
---
### 6  Documentation Updates
* **`README.md`** → new *Outputs* section reproducing the artefact table.
* **`docs/v3_changelog.md`** → one-pager summarising v3 versus v2 differences (labels, calibration, outputs).
---
### 7  Rollout Plan (5-PR cadence)
1. **PR #1**: run-id, IOManager, logger, CI log upload.
2. **PR #2**: data & feature stage outputs + tests.
3. **PR #3**: GRU training outputs + calibration figure.
4. **PR #4**: SAC & backtest outputs, reward & summary figs.
5. **PR #5**: docs & README refresh.
Tag `v3.0.0` after PR #5 passes.
---
### 8  Success Criteria for CI
Fail the pipeline when **any** occurs:
* `baseline1_report.txt` CI lower bound (CILB) < 0.52
* `edge_filtered_accuracy` (val) < 0.60
* Backtest Sharpe < 1.2 or Max DD > 15 %
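These gates could be enforced by a small script run at the end of the CI job. A sketch under stated assumptions: the metric key names (`ci_lower`, `edge_filtered_accuracy`, `sharpe`, `max_drawdown`) and the un-suffixed file names are illustrative, not mandated by the contract:

```python
# Sketch of a CI gate for the section-8 thresholds -- key names and
# simplified file names (no <run_id> suffix) are assumptions.
import json
from pathlib import Path


def gate(run_dir: str) -> int:
    """Return 0 if all success criteria hold, 1 otherwise (CI exit code)."""
    run = Path(run_dir)
    baseline = json.loads((run / "baseline1_report.txt").read_text())
    metrics = json.loads((run / "performance_metrics.txt").read_text())
    failures = []
    if baseline["ci_lower"] < 0.52:
        failures.append("baseline1 CI lower bound < 0.52")
    if metrics["edge_filtered_accuracy"] < 0.60:
        failures.append("edge-filtered val accuracy < 0.60")
    if metrics["sharpe"] < 1.2 or metrics["max_drawdown"] > 0.15:
        failures.append("backtest Sharpe/Max DD gate")
    for msg in failures:
        print("GATE FAIL:", msg)
    return 1 if failures else 0
```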
---
Implementing this **single integrated revision** provides:
* **Deterministic artefact paths** for every run.
* **Rich, shareable figures** for quick diagnostics.
* **Audit-ready logs/reports** for research traceability.
Merge each step once CI is green; you'll have a reproducible, fully instrumented pipeline ready for iterative accuracy pushes toward the 65 % target.