Below is a consolidated list of every “validation gate” we enforce in the pipeline, split by the **GRU** (prediction) stage and the **SAC** (position‑sizing) stage. Each check either **aborts** the run (hard fail) or **warns** you that a non‑critical gate didn’t clear.
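The tables below reference per-gate config keys (e.g. `binary_ci_lb`). As a hypothetical sketch of how those thresholds might be consolidated in one place (the dict name and layout are assumptions, but the keys and default values are the ones quoted in the tables):

```python
# Hypothetical consolidation of the gate thresholds referenced below.
# Keys mirror the config names in the tables; values are the quoted defaults.
GATE_THRESHOLDS = {
    # GRU stage
    "binary_ci_lb": 0.52,          # G1, G6
    "binary_rf_ci_lb": 0.54,       # G2
    "ternary_ci_lb": 0.40,         # G3
    "ternary_rf_ci_lb": 0.42,      # G4
    "forward_ci_lb": 0.52,         # G5
    "calibration_ci_lb": 0.55,     # G7
    # SAC stage
    "edge_binary_ci_lb": 0.60,     # G8
    "edge_binary_rf_ci_lb": 0.62,  # G9
    "edge_ternary_ci_lb": 0.57,    # G10
    "edge_ternary_rf_ci_lb": 0.58, # G11
    "backtest.sharpe_lb": 1.2,     # G12
    "backtest.max_dd_ub": 0.15,    # G12 (15 %)
}
```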
---
## GRU‑stage Validation Gates
| Gate # | Check | Data used | Threshold | Action on Fail |
|--------|-------|-----------|-----------|----------------|
| **G1** | **Raw binary LR** (internal split) | train 80/20 split | CI LB ≥ `binary_ci_lb` (0.52) | **Abort** |
| **G2** | **Raw binary RF** (internal split, optional) | train 80/20 split | CI LB ≥ `binary_rf_ci_lb` (0.54) | **Abort** (if enabled) |
| **G3** | **Raw ternary LR** (internal split, if `use_ternary`) | train 80/20 split | CI LB ≥ `ternary_ci_lb` (0.40) | **Warn** |
| **G4** | **Raw ternary RF** (internal split, optional) | train 80/20 split | CI LB ≥ `ternary_rf_ci_lb` (0.42) | **Warn** |
| **G5** | **Forward‑fold binary LR** (true OOS, test fold t+1) | fold’s test set | CI LB ≥ `forward_ci_lb` (0.52) | **Abort** |
| **G6** | **Feature‑selection re‑baseline** (post‑prune binary LR) | pruned train 80/20 split | CI LB ≥ `binary_ci_lb` (0.52) | **Abort** |
| **G7** | **Calibration check** (edge‑filtered `p_cal` on val split) | val split | CI LB ≥ `calibration_ci_lb` (0.55) | **Abort** |
> **Notes:**
> • G1 catches “no predictive signal” cheaply.
> • G5 ensures it actually *generalises* forward.
> • G7 makes sure your calibrated probabilities have real edge before SAC ever sees them.
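Every “CI LB” gate above compares the lower bound of a confidence interval on directional accuracy against a floor. The source doesn’t say which interval method is used; a minimal sketch using the Wilson score lower bound (a common choice for binomial proportions — function names here are illustrative, not the pipeline’s API):

```python
import math

def wilson_lower_bound(successes: int, n: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for a binomial proportion
    (z = 1.96 gives a two-sided 95 % interval)."""
    if n == 0:
        return 0.0
    p = successes / n
    denom = 1 + z**2 / n
    centre = p + z**2 / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (centre - margin) / denom

def passes_gate(successes: int, n: int, ci_lb_threshold: float) -> bool:
    """A gate clears when the accuracy CI lower bound meets its floor."""
    return wilson_lower_bound(successes, n) >= ci_lb_threshold
```

For example, 560 correct directions out of 1000 samples gives a Wilson lower bound of roughly 0.529, which would clear G1’s 0.52 floor but not G7’s 0.55.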
---
## SAC‑stage Validation Gates
| Gate # | Check | Data used | Threshold | Action on Fail |
|--------|-------|-----------|-----------|----------------|
| **G8** | **Edge‑filtered binary LR** | val split probabilities | CI LB ≥ `edge_binary_ci_lb` (0.60) | **Abort** |
| **G9** | **Edge‑filtered binary RF** | val split probabilities | CI LB ≥ `edge_binary_rf_ci_lb` (0.62) | **Abort** |
| **G10** | **Edge‑filtered ternary LR** (if `use_ternary`) | val split `p_cat[:,2] − p_cat[:,0]` | CI LB ≥ `edge_ternary_ci_lb` (0.57) | **Warn** |
| **G11** | **Edge‑filtered ternary RF** (if `use_ternary`) | val split `p_cat[:,2] − p_cat[:,0]` | CI LB ≥ `edge_ternary_rf_ci_lb` (0.58) | **Warn** |
| **G12** | **Backtest performance** (Sharpe / Max DD on test fold) | aggregated test folds | Sharpe ≥ `backtest.sharpe_lb` (1.2)<br>Max DD ≤ `backtest.max_dd_ub` (15 %) | **Abort** |
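The “edge‑filtered” gates (G8–G11) score accuracy only on the subset of samples whose calibrated probability clears a confidence threshold. A hedged sketch of the binary case (the exact filter rule, mask convention, and function name are assumptions):

```python
import numpy as np

def edge_filtered_accuracy(p_cal: np.ndarray, y: np.ndarray,
                           edge_threshold: float = 0.60) -> tuple[float, int]:
    """Directional accuracy on the high-confidence subset only.

    p_cal: calibrated P(up) per sample; y: realised direction (1 = up, 0 = down).
    A sample counts as 'high edge' when p_cal sits at least edge_threshold
    away from indifference on either side.
    """
    mask = (p_cal >= edge_threshold) | (p_cal <= 1 - edge_threshold)
    if mask.sum() == 0:
        return 0.0, 0  # no confident samples -> the gate cannot clear
    preds = (p_cal[mask] >= 0.5).astype(int)
    acc = float((preds == y[mask]).mean())
    return acc, int(mask.sum())
```

The CI lower bound for G8/G9 would then be taken over the filtered count, which is why a very aggressive `edge_threshold` can paradoxically make the gate harder to pass: fewer samples widen the interval.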
> **Notes:**
> • G8–G9 gate the **high‑confidence edge** you feed into SAC. If they fail, SAC will only ever go all‑in/flat, so we abort.
> • G10–G11 warn you if your flat/no‑move sizing is shaky; SAC can still run, but you’ll get a console warning suggesting you tweak your flat thresholds or features.
> • G12 validates final **live‑like** performance; if you can’t hit the Sharpe/Max DD targets on the unseen test folds, the entire run is considered a no‑go.
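G12’s two statistics can be computed from the aggregated test‑fold return series. A minimal sketch, assuming daily bars (so an annualisation factor of 252) and a zero risk‑free rate — both assumptions, as the source doesn’t specify either:

```python
import numpy as np

def sharpe_ratio(returns: np.ndarray, periods_per_year: int = 252) -> float:
    """Annualised Sharpe of per-period returns (risk-free rate assumed zero)."""
    if returns.std() == 0:
        return 0.0
    return float(returns.mean() / returns.std() * np.sqrt(periods_per_year))

def max_drawdown(returns: np.ndarray) -> float:
    """Largest peak-to-trough decline of the compounded equity curve, as a fraction."""
    equity = np.cumprod(1 + returns)
    peak = np.maximum.accumulate(equity)
    return float((1 - equity / peak).max())

def g12_passes(returns: np.ndarray,
               sharpe_lb: float = 1.2, max_dd_ub: float = 0.15) -> bool:
    """G12 clears only when both the Sharpe floor and the drawdown cap hold."""
    return sharpe_ratio(returns) >= sharpe_lb and max_drawdown(returns) <= max_dd_ub
```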
---
### What to do on each pattern
1. **Any GRU‑abort gate (G1, G2, G5, G6, G7) fails** →
   **Stop** before training. Improve features, horizon, calibration settings, or prune strategy.

2. **GRU passes but SAC‑binary edge gates (G8/G9) fail** →
   **Stop** before SAC training. This means your probabilities have no reliable high‑confidence edge; tweak the calibration threshold or retrain the GRU.

3. **GRU & SAC‑binary gates pass, but SAC‑ternary edge gates (G10/G11) warn** →
   **Proceed** with a warning: consider adding flat‑specific features or raising the `edge_threshold`.

4. **All gates pass** →
   The full pipeline runs to completion: GRU training, SAC training, and backtest, producing models, logs, and performance reports.
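The abort/warn dispatch these four patterns describe can be sketched as a simple gate runner (the `Gate` class and `run_gates` name are hypothetical, not the pipeline’s actual API):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Gate:
    name: str                   # e.g. "G1"
    check: Callable[[], bool]   # returns True when the gate clears
    abort_on_fail: bool         # True -> hard fail, False -> warn only

def run_gates(gates: list[Gate]) -> bool:
    """Run gates in order; raise on the first hard failure, warn on soft ones."""
    for gate in gates:
        if gate.check():
            continue
        if gate.abort_on_fail:
            raise RuntimeError(f"{gate.name} failed -- aborting run")
        print(f"WARNING: {gate.name} did not clear (non-critical)")
    return True
```

Under this scheme G1, G2, G5–G9, and G12 would be constructed with `abort_on_fail=True`, and the ternary gates G3, G4, G10, G11 with `abort_on_fail=False`.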
By strictly enforcing these gates, you ensure every GRU and SAC model you train has demonstrable, forward‑tested edge, maximizing your chances of hitting that 65 % directional target in live trading.