Below is a consolidated list of every “validation gate” we enforce in the pipeline, split by the **GRU** (prediction) stage and the **SAC** (position‑sizing) stage. Each check either **aborts** the run (hard fail) or **warns** you that a non‑critical gate didn’t clear.
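The tables below reference per-gate config keys (e.g. `binary_ci_lb`). As a hypothetical sketch of how those thresholds might be consolidated in one place (the dict name and layout are assumptions, but the keys and default values are the ones quoted in the tables):

```python
# Hypothetical consolidation of the gate thresholds referenced below.
# Keys mirror the config names in the tables; values are the quoted defaults.
GATE_THRESHOLDS = {
    # GRU stage
    "binary_ci_lb": 0.52,          # G1, G6
    "binary_rf_ci_lb": 0.54,       # G2
    "ternary_ci_lb": 0.40,         # G3
    "ternary_rf_ci_lb": 0.42,      # G4
    "forward_ci_lb": 0.52,         # G5
    "calibration_ci_lb": 0.55,     # G7
    # SAC stage
    "edge_binary_ci_lb": 0.60,     # G8
    "edge_binary_rf_ci_lb": 0.62,  # G9
    "edge_ternary_ci_lb": 0.57,    # G10
    "edge_ternary_rf_ci_lb": 0.58, # G11
    "backtest.sharpe_lb": 1.2,     # G12
    "backtest.max_dd_ub": 0.15,    # G12 (15 %)
}
```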
---
## GRU‑stage Validation Gates
| Gate # | Check | Data used | Threshold | Action on Fail |
|--------|-------|-----------|-----------|----------------|
| **G1** | **Raw binary LR** (internal split) | train 80/20 split | CI LB ≥ `binary_ci_lb` (0.52) | **Abort** |
| **G2** | **Raw binary RF** (internal split, optional) | train 80/20 split | CI LB ≥ `binary_rf_ci_lb` (0.54) | **Abort** (if enabled) |
| **G3** | **Raw ternary LR** (internal split, if `use_ternary`) | train 80/20 split | CI LB ≥ `ternary_ci_lb` (0.40) | **Warn** |
| **G4** | **Raw ternary RF** (internal split, optional) | train 80/20 split | CI LB ≥ `ternary_rf_ci_lb` (0.42) | **Warn** |
| **G5** | **Forward‑fold binary LR** (true OOS, test fold t+1) | fold’s test set | CI LB ≥ `forward_ci_lb` (0.52) | **Abort** |
| **G6** | **Feature‑selection re‑baseline** (post‑prune binary LR) | pruned train 80/20 split | CI LB ≥ `binary_ci_lb` (0.52) | **Abort** |
| **G7** | **Calibration check** (edge‑filtered `p_cal` on val split) | val split | CI LB ≥ `calibration_ci_lb` (0.55) | **Abort** |
> **Notes:**
> • G1 catches “no predictive signal” cheaply.
> • G5 ensures it actually *generalises* forward.
> • G7 makes sure your calibrated probabilities have real edge before SAC ever sees them.
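Every “CI LB” gate above compares the lower bound of a confidence interval on directional accuracy against a floor. The source doesn’t say which interval method is used; a minimal sketch using the Wilson score lower bound (a common choice for binomial proportions — function names here are illustrative, not the pipeline’s API):

```python
import math

def wilson_lower_bound(successes: int, n: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for a binomial proportion
    (z = 1.96 gives a two-sided 95 % interval)."""
    if n == 0:
        return 0.0
    p = successes / n
    denom = 1 + z**2 / n
    centre = p + z**2 / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (centre - margin) / denom

def passes_gate(successes: int, n: int, ci_lb_threshold: float) -> bool:
    """A gate clears when the accuracy CI lower bound meets its floor."""
    return wilson_lower_bound(successes, n) >= ci_lb_threshold
```

For example, 560 correct directions out of 1000 samples gives a Wilson lower bound of roughly 0.529, which would clear G1’s 0.52 floor but not G7’s 0.55.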
---
## SAC‑stage Validation Gates
| Gate # | Check | Data used | Threshold | Action on Fail |
|--------|-------|-----------|-----------|----------------|
| **G8** | **Edge‑filtered binary LR** | val split probabilities | CI LB ≥ `edge_binary_ci_lb` (0.60) | **Abort** |
| **G9** | **Edge‑filtered binary RF** | val split probabilities | CI LB ≥ `edge_binary_rf_ci_lb` (0.62) | **Abort** |
| **G10** | **Edge‑filtered ternary LR** (if `use_ternary`) | val split `p_cat[:,2] − p_cat[:,0]` | CI LB ≥ `edge_ternary_ci_lb` (0.57) | **Warn** |
| **G11** | **Edge‑filtered ternary RF** (if `use_ternary`) | val split `p_cat[:,2] − p_cat[:,0]` | CI LB ≥ `edge_ternary_rf_ci_lb` (0.58) | **Warn** |
| **G12** | **Backtest performance** (Sharpe / Max DD on test fold) | aggregated test folds | Sharpe ≥ `backtest.sharpe_lb` (1.2)<br>Max DD ≤ `backtest.max_dd_ub` (15 %) | **Abort** |
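The “edge‑filtered” gates (G8–G11) score accuracy only on the subset of samples whose calibrated probability clears a confidence threshold. A hedged sketch of the binary case (the exact filter rule, mask convention, and function name are assumptions):

```python
import numpy as np

def edge_filtered_accuracy(p_cal: np.ndarray, y: np.ndarray,
                           edge_threshold: float = 0.60) -> tuple[float, int]:
    """Directional accuracy on the high-confidence subset only.

    p_cal: calibrated P(up) per sample; y: realised direction (1 = up, 0 = down).
    A sample counts as 'high edge' when p_cal sits at least edge_threshold
    away from indifference on either side.
    """
    mask = (p_cal >= edge_threshold) | (p_cal <= 1 - edge_threshold)
    if mask.sum() == 0:
        return 0.0, 0  # no confident samples -> the gate cannot clear
    preds = (p_cal[mask] >= 0.5).astype(int)
    acc = float((preds == y[mask]).mean())
    return acc, int(mask.sum())
```

The CI lower bound for G8/G9 would then be taken over the filtered count, which is why a very aggressive `edge_threshold` can paradoxically make the gate harder to pass: fewer samples widen the interval.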
> **Notes:**
> • G8–G9 gate the **high‑confidence edge** you feed into SAC. If they fail, SAC will only ever go all‑in/flat, so we abort.
> • G10–G11 warn you if your flat/no‑move sizing is shaky; SAC can still run, but you’ll get a console warning suggesting you tweak your flat thresholds or features.
> • G12 validates final **live‑like** performance; if you can’t hit the Sharpe/Max DD targets on the unseen test folds, the entire run is considered a no‑go.
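G12’s two statistics can be computed from the aggregated test‑fold return series. A minimal sketch, assuming daily bars (so an annualisation factor of 252) and a zero risk‑free rate — both assumptions, as the source doesn’t specify either:

```python
import numpy as np

def sharpe_ratio(returns: np.ndarray, periods_per_year: int = 252) -> float:
    """Annualised Sharpe of per-period returns (risk-free rate assumed zero)."""
    if returns.std() == 0:
        return 0.0
    return float(returns.mean() / returns.std() * np.sqrt(periods_per_year))

def max_drawdown(returns: np.ndarray) -> float:
    """Largest peak-to-trough decline of the compounded equity curve, as a fraction."""
    equity = np.cumprod(1 + returns)
    peak = np.maximum.accumulate(equity)
    return float((1 - equity / peak).max())

def g12_passes(returns: np.ndarray,
               sharpe_lb: float = 1.2, max_dd_ub: float = 0.15) -> bool:
    """G12 clears only when both the Sharpe floor and the drawdown cap hold."""
    return sharpe_ratio(returns) >= sharpe_lb and max_drawdown(returns) <= max_dd_ub
```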
---
### What to do on each pattern
1. **Any GRU‑abort gate (G1, G2, G5, G6, G7) fails** →
   **Stop** before training. Improve features, horizon, calibration settings, or prune strategy.

2. **GRU passes but SAC‑binary edge gates (G8/G9) fail** →
   **Stop** before SAC training. This means your probabilities have no reliable high‑confidence edge; tweak the calibration threshold or retrain the GRU.

3. **GRU & SAC‑binary gates pass, but SAC‑ternary edge gates (G10/G11) warn** →
   **Proceed** with a warning: consider adding flat‑specific features or raising the `edge_threshold`.

4. **All gates pass** →
   The full pipeline runs to completion: GRU training, SAC training, and backtest, producing models, logs, and performance reports.
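The abort/warn dispatch these four patterns describe can be sketched as a simple gate runner (the `Gate` class and `run_gates` name are hypothetical, not the pipeline’s actual API):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Gate:
    name: str                   # e.g. "G1"
    check: Callable[[], bool]   # returns True when the gate clears
    abort_on_fail: bool         # True -> hard fail, False -> warn only

def run_gates(gates: list[Gate]) -> bool:
    """Run gates in order; raise on the first hard failure, warn on soft ones."""
    for gate in gates:
        if gate.check():
            continue
        if gate.abort_on_fail:
            raise RuntimeError(f"{gate.name} failed -- aborting run")
        print(f"WARNING: {gate.name} did not clear (non-critical)")
    return True
```

Under this scheme G1, G2, G5–G9, and G12 would be constructed with `abort_on_fail=True`, and the ternary gates G3, G4, G10, G11 with `abort_on_fail=False`.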
By strictly enforcing these gates, you ensure every GRU and SAC model you train has demonstrable, forward‑tested edge, maximizing your chances of hitting that 65 % directional target in live trading.