Below is a consolidated set of revision instructions, plus the key code snippets you'll need, to switch your GRU/SAC pipeline to supervise **log-returns** (so your "ret"/`mu` and `gauss_params` heads both target the log-return), and to wire everything end-to-end so you get a clean 55%+ edge before SAC.

---

## 🛠 Revision Playbook

1. **Compute forward log-returns in your data pipeline**

   In `TradingPipeline.define_labels_and_align` (or wherever you set up `df`):

   ```python
   # Replace any raw-return calculation with the forward log-return
   N = config['gru']['prediction_horizon']
   df['fwd_log_ret'] = np.log(df['close'].shift(-N) / df['close'])
   df['direction_label'] = (df['fwd_log_ret'] > 0).astype(int)

   # If you use a ternary label, derive a per-row flat band from rolling volatility
   flat_thr = config['gru']['flat_sigma_multiplier'] * df['fwd_log_ret'].rolling(…).std()
   # flat_thr varies per row, so pd.cut (which needs scalar bin edges) won't work;
   # np.select handles row-wise thresholds: 0 = down, 1 = flat, 2 = up
   df['dir3_label'] = np.select(
       [df['fwd_log_ret'] < -flat_thr, df['fwd_log_ret'] > flat_thr],
       [0, 2], default=1
   )
   ```

2. **Align your targets**

   Drop the last `N` rows so `fwd_log_ret` has no NaNs:

   ```python
   df = df.iloc[:-N]
   ```

3. **Pass the log-return into both heads**

   When you build your sequences and target dicts:

   ```python
   y_ret_seq = ...   # shape (n_seq, 1), built from fwd_log_ret
   y_dir3_seq = ...  # one-hot encoding of dir3_label, shape (n_seq, 3)
   y_train = {'mu': y_ret_seq,            # Huber head
              'gauss_params': y_ret_seq,  # NLL head uses the same target
              'dir3': y_dir3_seq}         # classification head
   ```

4. **Update your GRU builder to match**

   Make sure your v3 model has exactly three outputs:

   ```python
   model = Model(inputs, outputs=[mu_output, gauss_params_output, dir3_output])
   model.compile(
       optimizer=Adam(lr),
       loss={
           'mu': Huber(delta),
           'gauss_params': gaussian_nll,
           'dir3': categorical_focal_loss
       },
       loss_weights={'mu': 1.0, 'gauss_params': 0.2, 'dir3': 0.4},
       metrics={'dir3': 'accuracy'}
   )
   ```

5. **Train with the new targets dict**

   In `GRUModelHandler.train(...)`, replace your fit call with:

   ```python
   history = model.fit(
       X_train_seq, y_train_dict,
       validation_data=(X_val_seq, y_val_dict),
       …  # callbacks etc.; the full call is shown in snippet 3 below
   )
   ```

6. **Calibrate on the `dir3` softmax outputs**

   Your calibrator (temperature or vector scaling) must now consume the 3-class logits or probabilities (a minimal temperature-scaling sketch follows this list):

   ```python
   raw_logits = handler.predict_logits(X_val_seq)
   calibrator.fit(raw_logits, y_val_dir3)
   ```

7. **Feed SAC the log-return μ and σ**

   In your `TradingEnv`, when you construct the state:

   ```python
   mu, log_sigma, probs = gru_handler.predict(X_step)
   sigma = np.exp(log_sigma)
   edge = 2 * calibrated_p_up - 1   # if binary; with dir3 probs, p_up - p_down is one option
   z_score = np.abs(mu) / sigma     # signal strength in volatility units
   state = [mu, sigma, edge, z_score, prev_position]
   ```

8. **Re-run the baseline check on log-returns**

   Your logistic baseline in `run_baseline_checks` should now be trained on `X_train_pruned` vs. `y_dir3_label` (or the binary label), ensuring the CI lower bound is ≥ 0.52 before you even build your GRU.

9. **Validate end-to-end edge**

   After these changes, you should see:

   - baseline logistic CI lower bound ≥ 0.52,
   - GRU "edge" hit-rate ≥ 0.55 on validation, and
   - a SAC backtest that clears meaningful Sharpe/win-rate gates.
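Step 6 references a calibrator without showing one, so here is a minimal temperature-scaling sketch. The class name `TemperatureCalibrator` is hypothetical, and it assumes `raw_logits` has shape `(n, 3)` with `y_val_dir3` one-hot; swap in your own calibrator API as needed:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import log_softmax, softmax

class TemperatureCalibrator:
    """Hypothetical sketch: fit one scalar T > 0 so softmax(logits / T) is calibrated."""

    def __init__(self):
        self.T = 1.0

    def fit(self, logits, y_onehot):
        # Minimize the validation NLL of the true class as a function of T
        def nll(T):
            logp = log_softmax(logits / T, axis=-1)
            return -np.mean(np.sum(y_onehot * logp, axis=-1))

        self.T = minimize_scalar(nll, bounds=(0.05, 10.0), method='bounded').x
        return self

    def predict_proba(self, logits):
        return softmax(logits / self.T, axis=-1)
```

With this in place, step 6 becomes `calibrator = TemperatureCalibrator().fit(raw_logits, y_val_dir3)`, and the calibrated `p_up` used in step 7 would be `calibrator.predict_proba(logits)[:, 2]` (assuming class 2 = up, per the label encoding in step 1).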
---

## 🔧 Example Code Snippets

### 1) Gaussian NLL stays the same:

```python
import tensorflow as tf
from tensorflow.keras import saving  # TF >= 2.12; on Keras 3 use `from keras import saving`

@saving.register_keras_serializable(package='GRU')
def gaussian_nll(y_true, y_pred):
    # y_pred packs [mu, log_sigma]; NLL of y_true under N(mu, sigma^2), constants dropped
    mu, log_sigma = tf.split(y_pred, 2, axis=-1)
    y_true = tf.reshape(y_true, tf.shape(mu))
    inv_var = tf.exp(-2.0 * log_sigma)  # 1 / sigma^2
    return tf.reduce_mean(0.5 * inv_var * tf.square(y_true - mu) + log_sigma)
```

### 2) Build & compile v3 model:

```python
from tensorflow.keras import layers, Model
from tensorflow.keras.losses import Huber
from tensorflow.keras.optimizers import Adam

def build_gru_model_v3(...):
    inp = layers.Input((lookback, n_features))
    x = layers.GRU(gru_units, return_sequences=True)(inp)
    x = layers.LayerNormalization()(x)
    if attention_units > 0:
        x = layers.MultiHeadAttention(...)(x, x)
    x = layers.GlobalAveragePooling1D()(x)

    gauss = layers.Dense(2, name='gauss_params')(x)            # [mu, log_sigma]
    mu = layers.Lambda(lambda z: z[:, 0:1], name='mu')(gauss)  # mu slice feeds the Huber head
    dir3_logits = layers.Dense(3, name='dir3_logits')(x)
    dir3 = layers.Activation('softmax', name='dir3')(dir3_logits)

    model = Model(inp, [mu, gauss, dir3])
    model.compile(
        optimizer=Adam(lr),
        loss={'mu': Huber(delta),
              'gauss_params': gaussian_nll,
              'dir3': categorical_focal_loss},
        loss_weights={'mu': 1.0, 'gauss_params': 0.2, 'dir3': 0.4},
        metrics={'dir3': 'accuracy'}
    )
    return model
```

### 3) Fitting in your handler:

```python
history = self.model.fit(
    X_train_seq, y_train_dict,
    validation_data=(X_val_seq, y_val_dict),
    epochs=max_epochs,
    batch_size=batch_size,
    callbacks=[early_stop, csv_logger, TqdmCallback()]
)
```

---

### Why these changes boost edge

- **Log-returns** stabilize variance and symmetrize up/down moves (a price that rises 10% and falls back to its start needs a -9.1% raw return, but exactly -log(1.1) in log space).
- **NLL + Huber on the log-return** gives the model both distributional uncertainty (σ) and a robust point-error measure.
- A **proper softmax head** over three classes (down/flat/up) cleans up the classification target.
- **Calibration + an optimized edge threshold** ensure your SAC agent only sees high-confidence signals (edge ≥ thr; a threshold-sweep sketch is in the P.S. below).

Together, these changes should get your baseline GRU above 55% "edge" on validation, so the SAC agent can learn a meaningful sizing policy rather than fight noise.

Let me know if you need any further refinements!
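P.S. Since the "optimized edge threshold" above has no snippet, here's a minimal sweep sketch. The names `calibrated_p_up` and `fwd_log_ret_val` are assumptions (calibrated up-probabilities and the forward log-returns aligned to your validation sequences); the 0.55 gate matches step 9:

```python
import numpy as np

def optimize_edge_threshold(calibrated_p_up, fwd_log_ret_val, min_hit_rate=0.55):
    """Sketch: keep the loosest threshold whose directional hit-rate clears the gate."""
    edge = 2 * calibrated_p_up - 1              # signed edge in [-1, 1]
    best = None                                 # (threshold, hit_rate, coverage)
    for thr in np.arange(0.05, 0.95, 0.05):
        mask = np.abs(edge) >= thr              # signals the SAC agent would act on
        if mask.sum() < 30:                     # too few samples for a stable estimate
            continue
        hit_rate = (np.sign(edge[mask]) == np.sign(fwd_log_ret_val[mask])).mean()
        if hit_rate >= min_hit_rate and (best is None or mask.mean() > best[2]):
            best = (thr, hit_rate, mask.mean())
    return best  # None means no threshold clears the gate
```

Maximizing coverage subject to a hit-rate gate is just one choice; maximizing mean `edge * fwd_log_ret_val` over the masked samples would target expected value instead.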