Event-Driven Representation Learning in Sparse Financial Time Series

A Macro-Contextual Conceptual Framework and Methodology

Niran Pravithana

II. Event Representation & Embedding Framework

This section bridges the gap between raw data and a mathematical framework the model can learn from. The objectives are to:

  • Support sparse, asynchronous, heterogeneous event streams
  • Integrate asset-level events and macro-level events within a unified sequence
  • Provide a directly implementable system architecture

2.1 Formal Event Structure

A single event is represented as

$$z_i = (t_i, x_i, v_i)$$

where

  • $t_i$ = event timestamp
  • $x_i$ = event type (feature / indicator / macro tag)
  • $v_i$ = quantitative value (Boolean, discrete, or continuous)

The complete event domain is defined as

$$x_i \in \mathcal{X}_{asset} \cup \mathcal{X}_{macro}$$

Events may therefore be either:

  • Company/stock-level signals (micro-structure / strategy / factor signals)
  • Macroeconomic events (QE, QT, crisis flags, policy shocks, etc.)
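As a minimal sketch, the tuple $z_i = (t_i, x_i, v_i)$ maps directly onto a small record type. The field names and the `scope` flag marking membership in $\mathcal{X}_{asset}$ versus $\mathcal{X}_{macro}$ are illustrative choices, not part of the formal definition:

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Event:
    """One event z_i = (t_i, x_i, v_i)."""
    t: float                     # timestamp t_i (e.g. Unix seconds)
    x: str                       # event type x_i
    v: Union[bool, int, float]   # value v_i: Boolean, discrete, or continuous
    scope: str = "asset"         # "asset" or "macro": X_asset vs. X_macro

e = Event(t=1712001234, x="feature_X_217", v=1)
```

Keeping the record immutable (`frozen=True`) makes events safe to share between the asset-level and macro-level sequences built in the next subsection.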

2.2 Unified Event Sequence

For asset $a$, let there be an asset-specific event set

$$\mathcal{S}^a = \{z_i^a\}_{i=1}^{N_a}$$

and a shared set of market-wide macroeconomic events

$$\mathcal{M} = \{z_j^{macro}\}_{j=1}^{K}$$

We construct a unified event sequence by

$$\tilde{\mathcal{S}}^a = \text{merge-sort}(\mathcal{S}^a, \mathcal{M})$$

The result is a single sequence containing both:

  • Events specific to that stock
  • Macro events occurring within the same time period

Events are ordered chronologically, enabling the model to perceive continuous temporal relationships between events at both levels within a unified structure.
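A sketch of the merge step, assuming each input sequence is already sorted by timestamp (so `heapq.merge` performs the merge phase of merge sort in $O(N_a + K)$). Events are represented here as plain `(t, x, v)` tuples; the sample event names are hypothetical:

```python
import heapq

def merge_event_sequences(asset_events, macro_events):
    """Merge S^a and M into one chronologically ordered sequence.

    Both inputs are lists of (t, x, v) tuples, each already sorted
    by t; heapq.merge interleaves them without re-sorting."""
    return list(heapq.merge(asset_events, macro_events, key=lambda z: z[0]))

S_a = [(10, "signal_up", 1), (40, "signal_down", 0)]   # asset-level events
M = [(25, "QE_announcement", 1)]                        # shared macro events
merged = merge_event_sequences(S_a, M)
```

After the merge, the macro event at $t = 25$ sits between the two asset events, exactly the interleaving the model is meant to see.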

2.3 Time Encoding & Temporal Geometry

Since the data constitutes an asynchronous / irregular time series, inter-event intervals carry structural meaning. We define

$$\Delta t_i = t_i - t_{i-1}$$

and encode it as an embedding

$$\tau_i = \phi_{\tau}(\Delta t_i)$$

Possible implementations include:

  • Log-scale buckets
  • Continuous projection layers
  • Positional-style time kernels

The key insight is enabling the model to perceive the "tempo" of patterns, not merely their sequential order.
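Of the three options, log-scale bucketing is the simplest to sketch. The bucket count and base below are illustrative hyperparameters; bucket $b$ covers roughly $[\text{base}^{b-1}, \text{base}^{b})$ seconds, so short and long gaps land in well-separated buckets:

```python
import math

def log_bucket(delta_t, num_buckets=16, base=2.0):
    """One possible phi_tau: map an inter-event gap delta_t (seconds)
    to a discrete log-scale bucket index in [0, num_buckets)."""
    if delta_t <= 0:
        return 0  # simultaneous or first-in-sequence events
    b = int(math.floor(math.log(delta_t, base))) + 1
    return min(b, num_buckets - 1)
```

The bucket index would then look up a learned embedding row, giving the token its $\tau_i$ component.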

2.4 Event Token Representation

We define an event-to-vector transformation function

$$e_i = f_{\theta}(x_i, v_i, \tau_i)$$

Decomposed into components

$$e_i = \big[ \text{type-emb}(x_i), \text{feature-emb}(x_i), \text{value-proj}(v_i), \tau_i \big]$$

Component descriptions:

  • type-emb — Indicates whether the token is an asset-event or macro-event
  • feature-emb — Distinguishes signal types such as Feature-ID, Regime-ID
  • value-proj — Handles Boolean / discrete / continuous values in a unified vector form
  • $\tau_i$ — Encodes temporal interval meaning

This structure enables development teams to directly map real features to token embeddings.
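A concrete sketch of the concatenation $e_i = [\text{type-emb}; \text{feature-emb}; \text{value-proj}; \tau_i]$, with random lookup tables standing in for learned embedding layers. All sub-dimensions, table names, and feature IDs here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
D_TYPE, D_FEAT, D_VAL, D_TAU = 4, 8, 4, 4   # illustrative sub-dimensions

# Hypothetical lookup tables standing in for trained embedding layers.
type_emb = {"asset_event": rng.normal(size=D_TYPE),
            "macro_event": rng.normal(size=D_TYPE)}
feature_emb = {f"feature_{k}": rng.normal(size=D_FEAT) for k in range(4)}
W_val = rng.normal(size=(D_VAL, 1))          # projection for a scalar v_i
tau_table = rng.normal(size=(16, D_TAU))     # one row per delta-t bucket

def event_token(event_type, feature_id, value, tau_bucket):
    """e_i = [type-emb(x_i); feature-emb(x_i); value-proj(v_i); tau_i]."""
    return np.concatenate([
        type_emb[event_type],
        feature_emb[feature_id],
        (W_val @ np.array([[float(value)]])).ravel(),
        tau_table[tau_bucket],
    ])

e_i = event_token("asset_event", "feature_2", 1.0, tau_bucket=5)
# e_i is a (D_TYPE + D_FEAT + D_VAL + D_TAU)-dimensional token vector
```

In a real system the tables would be `nn.Embedding`-style trainable parameters; the concatenation structure is what carries over.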

2.5 Sparsity Awareness & Event Importance

Because the feature set is large while only a subset is expected to carry causal significance in any given context, we introduce a feature-gating function

$$\alpha_i = g_{\psi}(x_i)$$

Applied as a scaling factor

$$\tilde{e}_i = \alpha_i \cdot e_i$$

With regularization to encourage sparsity

$$\Omega(\psi) = \lambda \|\alpha\|_1$$

This does not force feature reduction, but allows the representation to gradually self-select important features.
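The gate and its $\ell_1$ penalty can be sketched in a few lines; in training, $\alpha$ would be produced by the learned function $g_\psi(x_i)$ rather than given directly as below:

```python
import numpy as np

def gated_tokens(E, alpha):
    """Scale each token e_i by its gate alpha_i: e~_i = alpha_i * e_i.
    E has shape (n, d); alpha has shape (n,)."""
    return alpha[:, None] * E

def l1_penalty(alpha, lam=1e-3):
    """Sparsity regularizer Omega(psi) = lam * ||alpha||_1."""
    return lam * np.abs(alpha).sum()

E = np.ones((3, 4))
alpha = np.array([0.0, 0.5, 1.0])   # gates for three tokens
E_tilde = gated_tokens(E, alpha)    # row 0 is zeroed out entirely
```

Because the penalty is on $\alpha$ rather than on the token vectors, a gate driven to zero silences a feature everywhere it appears, without pruning it from the vocabulary.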

2.6 Implementation-Ready View

At the system level, a single token can be viewed as JSON, for example:

{
  "t": 1712001234,
  "type": "asset_event",
  "feature": "feature_X_217",
  "value": 1,
  "delta_t": 5400
}

Mapping through embedding layers as described above yields vector $e_i$, which is then fed into the sequence model.

In other words:

  • Data engineering handles sequence construction
  • The ML model operates only on embedding and sequence layers

2.7 Connection to Sequence Model

Given the vector sequence

$$\mathbf{E}^a = (e_1, \dots, e_n)$$

The sequence model (e.g., Transformer / TCN) learns

$$\mathbf{H}^a = F_{\Theta}(\mathbf{E}^a)$$

This forms the foundation for:

  • Learning pattern accumulation
  • Performing regime-conditioned analysis downstream
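As a minimal stand-in for $F_\Theta$, a single-head causal self-attention pass over the token matrix $\mathbf{E}^a$ illustrates the shape contract $\mathbf{H}^a = F_\Theta(\mathbf{E}^a)$; a real backbone would stack such layers with feed-forward blocks and normalization:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_layer(E, Wq, Wk, Wv):
    """One single-head self-attention pass H = F_Theta(E).
    A causal mask ensures token i attends only to events at or
    before t_i, matching the chronological sequence order."""
    Q, K, V = E @ Wq, E @ Wk, E @ Wv
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    n = scores.shape[0]
    scores = np.where(np.tril(np.ones((n, n), dtype=bool)), scores, -np.inf)
    return softmax(scores) @ V

rng = np.random.default_rng(1)
n, d = 6, 20                         # 6 event tokens, 20-dim embeddings
E = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
H = rng_out = self_attention_layer(E, Wq, Wk, Wv)
```

Under the causal mask, the first token's output depends only on itself, which is why the representation of early events cannot leak information from later macro shocks.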

The next section formally describes the backbone architecture and training objective.