gru_sac_predictor/prompts/validation_gates.txt
2025-04-20 17:52:49 +00:00

Below is a consolidated list of every “validation gate” we enforce in the pipeline, split by the **GRU** (prediction) stage and the **SAC** (position-sizing) stage. Each check either **aborts** the run (hard-fail) or **warns** you that a non-critical gate didn't clear.

---
## GRU-stage Validation Gates
| Gate # | Check | Data used | Threshold | Action on Fail |
|--------|----------------------------------------------------------------|-----------------------|------------------|--------------------|
| **G1** | **Raw binary LR** (internal split) | train 80/20 split | CI LB ≥ `binary_ci_lb` (0.52) | **Abort** |
| **G2** | **Raw binary RF** (internal split, optional) | train 80/20 split | CI LB ≥ `binary_rf_ci_lb` (0.54) | **Abort** (if enabled) |
| **G3** | **Raw ternary LR** (internal split, if `use_ternary`) | train 80/20 split | CI LB ≥ `ternary_ci_lb` (0.40) | **Warn** |
| **G4** | **Raw ternary RF** (internal split, optional) | train 80/20 split | CI LB ≥ `ternary_rf_ci_lb` (0.42) | **Warn** |
| **G5** | **Forward-fold binary LR** (true OOS, test fold t+1) | fold's test set | CI LB ≥ `forward_ci_lb` (0.52) | **Abort** |
| **G6** | **Feature-selection re-baseline** (post-prune binary LR) | pruned train 80/20 | CI LB ≥ `binary_ci_lb` (0.52) | **Abort** |
| **G7** | **Calibration check** (edge-filtered p_cal on val split) | val split | CI LB ≥ `calibration_ci_lb` (0.55) | **Abort** |
> **Notes:**
> • G1 catches “no predictive signal” cheaply.
> • G5 ensures it actually *generalises* forward.
> • G7 makes sure your calibrated probabilities have real edge before SAC ever sees them.
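
The GRU-stage gates all compare a confidence-interval lower bound (CI LB) on accuracy with a configured threshold. The text does not say how the interval is computed, so the sketch below assumes a 95 % Wilson score interval, and it assumes that "edge-filtered" means keeping only predictions with `|p - 0.5| >= edge_threshold`. The function names are hypothetical and are not taken from the pipeline.

```python
import math


def wilson_lower_bound(hits, n, z=1.96):
    """Lower bound of the Wilson score interval for hits/n (assumed CI method)."""
    if n == 0:
        return 0.0
    p_hat = hits / n
    denom = 1.0 + z * z / n
    centre = p_hat + z * z / (2.0 * n)
    margin = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n))
    return (centre - margin) / denom


def edge_filtered_hits(p_up, y_true, edge_threshold=0.0):
    """Count correct calls among samples with |p - 0.5| >= edge_threshold.

    The filter definition is an assumption; with edge_threshold=0.0 every
    sample is kept, which corresponds to a "raw" gate such as G1.
    """
    hits = n = 0
    for p, y in zip(p_up, y_true):
        if abs(p - 0.5) < edge_threshold:
            continue  # low-confidence prediction, excluded from the gate
        n += 1
        hits += int((p >= 0.5) == bool(y))
    return hits, n


def gate_passes(hits, n, threshold):
    """True when the CI lower bound clears the configured threshold."""
    return wilson_lower_bound(hits, n) >= threshold


# 560 correct out of 1000 clears binary_ci_lb = 0.52; 530 out of 1000 does not.
print(gate_passes(560, 1000, 0.52), gate_passes(530, 1000, 0.52))
```

A lower bound is stricter than the point estimate: 53 % raw accuracy on 1000 samples looks above 0.52, but its lower bound is just under 0.50, so the gate fails.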
---
## SAC-stage Validation Gates
| Gate # | Check | Data used | Threshold | Action on Fail |
|---------|-------------------------------------------------------------|-----------------------|---------------------|----------------------|
| **G8** | **Edge-filtered binary LR** | val split probabilities | CI LB ≥ `edge_binary_ci_lb` (0.60) | **Abort** |
| **G9** | **Edge-filtered binary RF** | val split probabilities | CI LB ≥ `edge_binary_rf_ci_lb` (0.62) | **Abort** |
| **G10** | **Edge-filtered ternary LR** (if `use_ternary`) | val split `p_cat[:,2] - p_cat[:,0]` | CI LB ≥ `edge_ternary_ci_lb` (0.57) | **Warn** |
| **G11** | **Edge-filtered ternary RF** (if `use_ternary`) | val split `p_cat[:,2] - p_cat[:,0]` | CI LB ≥ `edge_ternary_rf_ci_lb` (0.58) | **Warn** |
| **G12** | **Backtest performance** (Sharpe / Max DD on test fold) | aggregated test folds | Sharpe ≥ `backtest.sharpe_lb` (1.2)<br>Max DD ≤ `backtest.max_dd_ub` (15 %) | **Abort** if violated |
> **Notes:**
> • G8 and G9 gate the **high-confidence edge** you feed into SAC. If they fail, SAC will only ever go all-in/flat, so we abort.
> • G10 and G11 warn you if your flat/no-move sizing is shaky. SAC can still run, but you'll get a console warning suggesting you tweak your flat thresholds or features.
> • G12 validates final **live-like** performance; if you can't hit the Sharpe/Max DD targets on the unseen test folds, the entire run is considered a no-go.
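
The G12 check can be illustrated with a minimal sketch. It assumes the backtest yields a list of per-period simple returns, that Sharpe is annualised with a zero risk-free rate, and that Max DD is measured on the compounded equity curve. The helper names and the `periods_per_year` default are hypothetical; the pipeline's real bar frequency is not stated here.

```python
import math


def sharpe_ratio(returns, periods_per_year=252):
    """Annualised Sharpe ratio with a zero risk-free rate (assumption)."""
    n = len(returns)
    if n < 2:
        return 0.0
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)  # sample variance
    if var == 0.0:
        return 0.0
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded equity curve, as a fraction."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst


def backtest_gate(returns, sharpe_lb=1.2, max_dd_ub=0.15, periods_per_year=252):
    """Return (passed, sharpe, max_dd); both conditions must hold to pass."""
    sharpe = sharpe_ratio(returns, periods_per_year)
    max_dd = max_drawdown(returns)
    return sharpe >= sharpe_lb and max_dd <= max_dd_ub, sharpe, max_dd


# A 20 % drawdown breaches the 15 % cap regardless of the Sharpe value.
passed, sharpe, max_dd = backtest_gate([0.10, -0.20, 0.05])
print(passed, round(max_dd, 2))
```

Both conditions are combined with `and`, which matches the table's "Abort if violated": breaching either the Sharpe floor or the drawdown cap is enough to fail the gate.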
---
### What to do on each pattern
1. **Any GRU-abort gate (G1, G2, G5, G6, G7) fails** →
**Stop** before training. Improve features, horizon, calibration settings, or prune strategy.
2. **GRU passes but SAC-binary edge gates (G8/G9) fail** →
**Stop** before SAC training. This means your probabilities have no reliable high-confidence edge; tweak the calibration threshold or retrain the GRU.
3. **GRU & SAC-binary gates pass, but SAC-ternary edge gates (G10/G11) warn** →
**Proceed** with a warning: consider adding flat-specific features or raising the `edge_threshold`.
4. **All gates pass** →
Full pipeline runs to completion: GRU training, SAC training, backtest, resulting in models, logs, and performance reports.
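
The patterns above amount to a small dispatch over gate results. The sketch below is illustrative only: it assumes results arrive as a mapping from gate ID to pass/fail with skipped gates omitted, and the `decide` function and its return labels are hypothetical. The G12 branch follows the SAC table's "Abort if violated" rather than one of the four numbered patterns.

```python
# Gate groupings taken from the two tables; the data layout is an assumption.
GRU_ABORT = ("G1", "G2", "G5", "G6", "G7")
SAC_ABORT = ("G8", "G9")
BACKTEST_ABORT = ("G12",)
WARN_ONLY = ("G3", "G4", "G10", "G11")

STAGES = (
    ("stop before training", GRU_ABORT),
    ("stop before SAC training", SAC_ABORT),
    ("no-go after backtest", BACKTEST_ABORT),
)


def decide(results):
    """Map {gate_id: passed} to (action, aborting_gates, warning_gates).

    Gates that were skipped (for example G2 when RF is disabled) are simply
    absent from `results` and are treated as not failed.
    """
    failed = {gate for gate, ok in results.items() if not ok}
    warnings = [gate for gate in WARN_ONLY if gate in failed]
    for action, gates in STAGES:
        aborting = [gate for gate in gates if gate in failed]
        if aborting:
            return action, aborting, warnings
    return "proceed", [], warnings


print(decide({"G1": True, "G5": True, "G8": True, "G10": False}))
# -> ('proceed', [], ['G10'])
```

Stages are checked in pipeline order, so a GRU-stage failure is reported even if a later gate would also have failed.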

By strictly enforcing these gates, you ensure every GRU and SAC model you train has demonstrable, forward-tested edge, maximizing your chances of hitting that 65 % directional target in live trading.