Event-Driven Representation Learning in Sparse Financial Time Series

A Macro-Contextual Conceptual Framework and Methodology

Niran Pravithana

VI. Experimental Design, Dataset Construction & Reproducibility Framework

This section establishes the methodological standards for the study as a whole:

  • How data is prepared
  • How labels and event sequences are constructed
  • How data partitioning and time periods are designed
  • How experiments are conducted to ensure unbiased and reproducible results

Formally: this section constitutes the "research protocol" ensuring that results derive from transparent, reproducible processes rather than trial-and-error tuning.

6.1 Dataset Definition — Multi-Asset Event-Time Panel

Let there be a set of assets

$$\mathcal{A} = \{a_1,\dots,a_M\}$$

And a data time range

$$[T_{start}, T_{end}]$$

For asset $a$, the unified event set (from Section 2):

$$\tilde{\mathcal{S}}^a = \{(t_i^a, x_i^a, v_i^a)\}_{i=1}^{N_a}$$

With discrete-time prices

$$P^a(t), \quad t\in\mathcal{T}^a$$

The complete system dataset

$$\mathcal{D} = \big\{ (\tilde{\mathcal{S}}^a, P^a(t)) \mid a\in\mathcal{A} \big\}$$

Structurally:

  • This is event-time panel data
  • Not based on uniform-interval sampling
  • Emphasizing the "event → sequence → outcome" structure
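The panel structure above can be sketched as a minimal data layout. This is an illustrative sketch only; the asset keys, event fields, and timestamps are hypothetical, chosen to mirror the tuples $(t_i^a, x_i^a, v_i^a)$ and the price series $P^a(t)$ defined above.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Event:
    t: float   # event timestamp t_i^a
    x: str     # event type x_i^a
    v: float   # event value / magnitude v_i^a

# Hypothetical event-time panel: one event list and one price series per asset.
events: Dict[str, List[Event]] = {
    "a1": [Event(1.0, "signal_up", 0.7), Event(3.5, "cross_ma", -0.2)],
    "a2": [Event(0.5, "signal_up", 0.1)],
}
prices: Dict[str, Dict[float, float]] = {
    "a1": {0.0: 100.0, 1.0: 101.2, 3.5: 99.8},
    "a2": {0.0: 50.0, 0.5: 50.4},
}

# The dataset D pairs each asset's event sequence with its price series.
dataset = {a: (events[a], prices[a]) for a in events}
```

Note that timestamps are irregular per asset: nothing in the structure assumes uniform-interval sampling.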

6.2 Event Construction Protocol (Asset & Macro Levels)

6.2.1 Asset-Level Event Extraction

Events derive from various feature types:

  • Boolean condition triggers
  • Factor-state transitions
  • Indicator crossing events
  • Structural / fundamental signals

Define the event generator

$$\Phi_{asset}: \text{raw feature stream} \longrightarrow \mathcal{X}_{asset}$$

Requirements:

  1. Events must derive from information known at that time
  2. Back-filled data is prohibited
  3. Timestamps must strictly precede outcomes
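The three requirements can be made concrete with a minimal sketch of $\Phi_{asset}$ for one feature type, a boolean condition trigger. The function name, threshold, and event label are hypothetical; the point is that each emitted event uses only the stream values observed up to its own timestamp.

```python
def extract_events(feature_stream, threshold=0.0):
    """Sketch of Phi_asset for a boolean condition trigger: emit an event
    whenever the condition flips from False to True. The stream is assumed
    time-ordered, and each decision uses only the value at time t
    (no look-ahead, no back-fill)."""
    events = []
    prev = False
    for t, value in feature_stream:
        cond = value > threshold          # known at time t
        if cond and not prev:
            events.append((t, "threshold_cross", value))
        prev = cond
    return events
```

For example, `extract_events([(1, -0.5), (2, 0.3), (3, 0.8)])` emits a single event at `t = 2`, the moment the condition first becomes true.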

6.2.2 Macro-Level Event Definition

The macro event set:

$$\mathcal{M} = \{(t_j^{macro}, m_j)\}$$

Must be defined according to ex-ante observable rules, such as:

  • Official QE start dates from announcements
  • Interest rate change dates
  • Shock dates recorded in public sources

Hindsight definitions, such as retrospectively labeling a period a crisis, are prohibited.

Source documentation must be specified and definitions frozen before training begins.

6.2.3 Unified Event Merge Procedure

The sequence merge process (for asset $a$):

$$\tilde{\mathcal{S}}^a = \text{merge-sort}(\mathcal{S}^a, \mathcal{M})$$

Enforced invariants:

  1. Non-decreasing time: $t_1^a \le t_2^a \le \dots$
  2. Identical macro events across all assets
  3. No retroactive event creation
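A merge-sort of two already-sorted streams reduces to a stable two-way merge, and invariant 1 can be asserted directly. This is a sketch under the assumption that events are `(timestamp, payload, ...)` tuples and that each input stream is individually time-ordered.

```python
import heapq

def merge_events(asset_events, macro_events):
    """Sketch of the unified merge: combine an asset's event stream with
    the shared macro stream into one non-decreasing time order. Both
    inputs are assumed individually sorted by timestamp."""
    merged = list(heapq.merge(asset_events, macro_events, key=lambda e: e[0]))
    # Invariant 1: non-decreasing timestamps.
    assert all(merged[i][0] <= merged[i + 1][0] for i in range(len(merged) - 1))
    return merged
```

Invariant 2 is satisfied by construction when the same `macro_events` list is passed for every asset; invariant 3 is a property of how the input streams were built, not of the merge itself.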

6.3 Outcome Label Construction

Let future returns be

$$r^a(t,\Delta) = \frac{P^a(t+\Delta)-P^a(t)}{P^a(t)}$$

Define the outcome function

$$y_t^a = g(r^a(t,\Delta))$$

Threshold-based example:

$$y_t^a = \begin{cases} 1, & r^a(t,\Delta) \ge \tau_{up}\\ -1, & r^a(t,\Delta) \le \tau_{down}\\ 0, & \text{otherwise} \end{cases}$$

Critical requirements:

  • $y_t^a$ must use only prices in the interval $[t, t+\Delta]$
  • Median / forward fill across future periods is prohibited
  • $\tau$ and $\Delta$ must be specified before experiments and locked
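The threshold-based label above, together with the no-forward-fill rule, can be sketched as follows. The function signature is hypothetical; it assumes prices are stored in a timestamp-keyed mapping, and it returns `None` rather than imputing when the future price is missing.

```python
def make_label(prices, t, delta, tau_up, tau_down):
    """Sketch of y_t^a: forward return over [t, t + delta], thresholded.
    Uses only P(t) and P(t + delta); if the future price is missing,
    the sample is dropped (no forward fill or median imputation)."""
    p0, p1 = prices.get(t), prices.get(t + delta)
    if p0 is None or p1 is None:
        return None
    r = (p1 - p0) / p0
    if r >= tau_up:
        return 1
    if r <= tau_down:
        return -1
    return 0
```

Here `tau_up`, `tau_down`, and `delta` would be the pre-registered values of $\tau_{up}$, $\tau_{down}$, and $\Delta$, locked before any experiment runs.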

6.4 Causal-Safe Training Window Construction

For each sample at time $t$

$$X_t^a = \tilde{\mathcal{S}}^a_{(-\infty,t)}$$
$$Y_t^a = y_t^a$$

That is

$$(X_t^a, Y_t^a) \quad\text{constructed as} \quad \textbf{past} \rightarrow \textbf{future}$$

Rolling retrospective windows that consume future data are prohibited.
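The past → future construction can be stated in a few lines. This sketch assumes events are `(timestamp, payload)` tuples and labels are stored in a time-keyed mapping; the strict inequality `e[0] < t` is what enforces the causal boundary.

```python
def build_sample(events, labels, t):
    """Sketch of (X_t, Y_t): the input window contains only events strictly
    before t (S~^a restricted to (-inf, t)); the label is the outcome at t,
    computed from prices in [t, t + delta]."""
    X_t = [e for e in events if e[0] < t]   # past only
    Y_t = labels[t]                          # future outcome
    return X_t, Y_t
```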

6.5 Temporal Splitting & Forward Evaluation Protocol

To maintain time-causal validity, partition into:

  • Train
  • Validation
  • Test

With

$$T_{train} < T_{val} < T_{test}$$

And rolling-forward evaluation:

$$\Pi_k = \Pi\left( [t_k, t_{k+1}] \right)$$

Benefits:

  • Verification of performance stability over time
  • Reduced risk of selecting biased periods
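The ordered partition $T_{train} < T_{val} < T_{test}$ amounts to splitting samples by timestamp, never by shuffling. A minimal sketch, assuming samples are `(timestamp, ...)` tuples and the cut points `t_val` and `t_test` are hypothetical boundaries:

```python
def temporal_split(samples, t_val, t_test):
    """Sketch of the time-causal partition: assign each sample to
    train / validation / test purely by its timestamp."""
    train = [s for s in samples if s[0] < t_val]
    val   = [s for s in samples if t_val <= s[0] < t_test]
    test  = [s for s in samples if s[0] >= t_test]
    return train, val, test
```

Rolling-forward evaluation then repeats this split at successive cut points $t_k$, scoring $\Pi_k$ on each held-out window $[t_k, t_{k+1}]$.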

6.6 Experimental Arms (What We Compare Against)

For meaningful research results, comparable baselines are required.

6.6.1 Baseline Models

  1. Random / Majority baseline
  2. Logistic regression on aggregated features
  3. GRU / LSTM (no macro tokens)
  4. Transformer without macro tokens
  5. Transformer with macro tokens (proposed)

Objective: not to "beat" baselines, but to demonstrate that incorporating "event sequences + macro context" provides structural informational value.

6.7 Hyperparameter Governance (Pre-Analysis Rule)

To avoid tuning bias, hyperparameter ranges are pre-registered before any training run.

Examples:

$$d \in \{64, 128, 256\}$$
$$L \in \{2, 4, 6\}$$
$$\lambda_g \in \{10^{-4}, 10^{-3}, 10^{-2}\}$$

Final model selection:

  • Based on time-split validation
  • Retrospective selection from test set is prohibited
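The pre-registered ranges above define a fixed grid of candidate configurations. A minimal sketch (the variable names are hypothetical; the values are those listed above):

```python
from itertools import product

# Pre-registered grid, frozen before any training run.
GRID = {
    "d":        [64, 128, 256],        # embedding dimension
    "L":        [2, 4, 6],             # number of layers
    "lambda_g": [1e-4, 1e-3, 1e-2],    # regularization weight
}

# Enumerate every candidate configuration: 3 * 3 * 3 = 27 in total.
# Each is scored only on the time-split validation set; the test set
# plays no role in selection.
configs = [dict(zip(GRID, vals)) for vals in product(*GRID.values())]
```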

6.8 Reproducibility Requirements

Research is considered reproducible when it includes:

  1. Versioned dataset recipe (raw data need not be released, but the construction procedure must be)
  2. Config-locked experiment files (e.g., YAML / JSON)
  3. Recorded commit hash, random seed, hyperparameters, training logs
  4. Consistency verification, e.g.
$$\text{Hash}(\tilde{\mathcal{S}}^a) = \text{constant}$$
confirming that event sequences remain unchanged between runs.
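One way to realize the hash check is to serialize the merged event sequence canonically and digest it. This is a sketch, assuming events are JSON-serializable tuples or lists; SHA-256 is one reasonable choice of digest, not a requirement of the protocol.

```python
import hashlib
import json

def sequence_hash(events):
    """Sketch of Hash(S~^a): canonically serialize the merged event
    sequence and digest it, so any silent change to the data between
    runs produces a different fingerprint."""
    canonical = json.dumps(events, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()
```

Recording this hash alongside the commit hash, seed, and config file lets a later run verify it is training on byte-identical sequences.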

6.9 Error & Risk Audit — What Can Go Wrong

For transparency, potential risks must be assessed:

  • Regime mis-labeling
  • Survivorship bias in stock universe
  • Corporate actions causing price jumps
  • Missing-event distortion
  • Redundancy of correlated events

All items should be documented in a risk appendix.

6.10 Interpretation Scope & Ethical Boundary

This document specifies that:

  • Results are for structural research purposes
  • Not to be interpreted as profit prediction tools
  • No direct causal claims are made
  • This is pattern-relation research only

6.11 Connection to Final Sections

With Section 6 establishing foundations for dataset, experiment protocol, and reproducibility:

The next section addresses Limitations, Extensions & Future Research Directions (scope limitations, extension frameworks, and future research directions).