finmlkit.label.kit module

An API wrapper around the core numba functions for better usability.

class finmlkit.label.kit.SampleWeights

Bases: object

A wrapper class for calculating time decay and class balance weights. These weights should be computed on the training window of the full dataset.

static compute_final_weights(avg_uniqueness: Series, time_decay_intercept: float = 1.0, return_attribution: Series = None, vertical_touch_weights: Series = None, labels: Series = None) → DataFrame

Compute the time decay and class balance weights based on the average uniqueness and return attribution. Return attribution, when provided, is normalized to sum to the event count.

Parameters:
  • avg_uniqueness – Average uniqueness weights for the events.

  • return_attribution – Unnormalized return attribution. Provide this to use it as the information weights instead of average uniqueness.

  • vertical_touch_weights – Provide vertical touch weights if you want to apply them to the final weights.

  • time_decay_intercept – The intercept for the time decay function. 1.0 means no decay, 0.0 means full decay. Negative values will erase the oldest portion of the weights.

  • labels – Provide labels if you want to apply class balancing to the final weights.

Returns:

A pandas DataFrame containing the individual weight components and the combined weights.
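The semantics of time_decay_intercept (1.0 for no decay, 0.0 for full decay, negative values zeroing out the oldest events) match the piecewise-linear decay commonly applied to uniqueness weights. The sketch below illustrates that idea together with event-count normalization and a simple class balancing scheme. It is not finmlkit's implementation: the helper names linear_time_decay, normalize_to_event_count, and class_balance_weights are hypothetical, the decay is assumed to run along cumulative average uniqueness from oldest to newest, and the balancing formula is an assumed inverse-frequency rule.

```python
import numpy as np
import pandas as pd


def linear_time_decay(avg_uniqueness: pd.Series, intercept: float = 1.0) -> pd.Series:
    """Hypothetical helper: piecewise-linear decay factors in [0, 1].

    The newest event always receives a factor of 1.0. Assumes the index
    sorts from oldest to newest and that `intercept` lies in (-1, 1].
    """
    if not -1.0 < intercept <= 1.0:
        raise ValueError("intercept must lie in (-1, 1]")
    cum = avg_uniqueness.sort_index().cumsum()
    total = cum.iloc[-1]
    if intercept >= 0:
        # Line rises from roughly `intercept` (oldest) to 1.0 (newest).
        slope = (1.0 - intercept) / total
    else:
        # Line crosses zero inside the sample; older events are clipped to 0.
        slope = 1.0 / ((intercept + 1.0) * total)
    const = 1.0 - slope * total
    return (const + slope * cum).clip(lower=0.0)


def normalize_to_event_count(weights: pd.Series) -> pd.Series:
    """Hypothetical helper: rescale weights so that they sum to len(weights)."""
    return weights * len(weights) / weights.sum()


def class_balance_weights(labels: pd.Series) -> pd.Series:
    """Hypothetical helper: inverse-frequency weight per sample (assumed rule)."""
    counts = labels.value_counts()
    per_class = len(labels) / (len(counts) * counts)
    return labels.map(per_class)
```

With four equally unique events, an intercept of 0.0 produces factors that grow linearly toward the newest event, while an intercept of -0.5 sets the oldest half of the factors to zero.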

static compute_info_weights(trades: TradesData, labels: DataFrame, normalize: bool = False) → DataFrame

Computes the average uniqueness and (non-normalized) return attribution for the events.

Parameters:
  • trades – The raw trades on which the events are evaluated.

  • labels – Labels DataFrame containing event indices and touch indices (the output of the compute_labels method).

  • normalize – Whether to normalize the returned weights.

Returns:

A pandas DataFrame containing the average uniqueness, return attribution, and vertical touch weights.
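For orientation, average uniqueness and return attribution are usually defined through label concurrency: the number of events whose evaluation windows overlap at each time step. The sketch below shows those textbook definitions on an integer time grid. It is a simplified illustration, not finmlkit's numba code: the function names are hypothetical, events are assumed to span inclusive index ranges, and the library itself works on raw trades rather than a precomputed return array.

```python
import numpy as np


def concurrency(start_idx, end_idx, n_steps):
    """Hypothetical helper: number of events active at each time step."""
    counts = np.zeros(n_steps, dtype=np.int64)
    for s, e in zip(start_idx, end_idx):
        counts[s:e + 1] += 1  # event is active on [s, e], both inclusive
    return counts


def average_uniqueness(start_idx, end_idx, n_steps):
    """Mean of 1 / concurrency over each event's own span."""
    c = concurrency(start_idx, end_idx, n_steps)
    return np.array([np.mean(1.0 / c[s:e + 1]) for s, e in zip(start_idx, end_idx)])


def return_attribution(log_returns, start_idx, end_idx):
    """Absolute sum of concurrency-scaled log returns over each event's span."""
    r = np.asarray(log_returns, dtype=np.float64)
    c = concurrency(start_idx, end_idx, len(r))
    return np.array([abs(np.sum(r[s:e + 1] / c[s:e + 1])) for s, e in zip(start_idx, end_idx)])
```

An event that overlaps no other event has an average uniqueness of 1.0. Overlap lowers both its uniqueness and the share of the return attributed to it.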

class finmlkit.label.kit.TBMLabel(features: DataFrame, target_ret_col: str, min_ret: float, horizontal_barriers: tuple[float, float], vertical_barrier: Timedelta, min_close_time: Timedelta = Timedelta('0 days 00:00:01'), is_meta: bool = False)

Bases: object

compute_labels(trades: TradesData) → tuple[DataFrame, DataFrame]

Compute the labels for the events using the triple barrier method.

Parameters:

trades – The raw trades data on which the events will be evaluated.

Returns:

A tuple containing:
  • The features DataFrame with the event indices and other features.

  • A DataFrame containing labels, event indices, touch indices, returns, and weights.
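As background, the triple barrier method labels each event by whichever barrier the price path touches first: an upper profit-taking barrier, a lower stop-loss barrier, or a vertical time barrier. The sketch below labels a single event on a plain price series. It is an illustration under stated assumptions, not the TBMLabel implementation: the function name and its arguments are hypothetical, barriers are assumed to be positive multiples of a target log return, a vertical touch is assumed to yield label 0, and the way finmlkit orders and scales horizontal_barriers is not specified here.

```python
import numpy as np
import pandas as pd


def triple_barrier_label(prices: pd.Series, event_time: pd.Timestamp,
                         target_ret: float, upper_mult: float,
                         lower_mult: float, vertical_barrier: pd.Timedelta):
    """Hypothetical helper: return (label, touch_time, log_return) for one event."""
    # Price path from the event up to and including the vertical barrier.
    path = prices.loc[event_time:event_time + vertical_barrier]
    log_ret = np.log(path / path.iloc[0])
    upper = upper_mult * target_ret
    lower = -lower_mult * target_ret
    for t, r in log_ret.items():
        if r >= upper:
            return 1, t, float(r)   # profit-taking barrier touched first
        if r <= lower:
            return -1, t, float(r)  # stop-loss barrier touched first
    # Neither horizontal barrier was touched: the vertical barrier decides.
    return 0, log_ret.index[-1], float(log_ret.iloc[-1])
```

The returned touch time marks the end of the event's evaluation window, which is the span that overlap-based sample weights are computed over.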

compute_weights(trades: TradesData, normalized: bool = False) → DataFrame

Computes the sample average uniqueness and return attribution.

Parameters:
  • trades – The same raw trades data passed to compute_labels().

  • normalized – Whether to normalize the weights.

Returns:

A DataFrame containing the sample average uniqueness and return attribution.

property event_count: int

Get the number of events in the features DataFrame.

Returns:

The number of events.

property event_range: str

Get the range of event timestamps.

Returns:

A string containing the first and last event timestamps.

property event_returns: Series

Get the log returns associated with each event.

Returns:

A pandas Series containing the log returns.

property features: DataFrame

Get the features corresponding to the generated labels. This may be a subset of the original features DataFrame because of the TBM evaluation window.

Returns:

The features DataFrame.

property first_event_timestamp: Timestamp | None

Get the timestamp of the first event.

Returns:

The timestamp of the first event.

property full_output: DataFrame

Get the full output DataFrame containing labels, event indices, touch indices, returns, and weights.

Returns:

A pandas DataFrame containing the full output.

property labels: Series

Get the labels for the events.

Returns:

A pandas Series containing the labels.

property last_event_timestamp: Timestamp | None

Get the timestamp of the last event.

Returns:

The timestamp of the last event.

property target_returns: Series

Get the target returns for the events.

Returns:

A pandas Series containing the target returns.