otp.Source.distinct

Source.distinct(running=False, bucket_interval=0, bucket_time='end', bucket_units='seconds', bucket_end_condition=None, boundary_tick_bucket='new', selection='first')

Outputs all distinct values for a specified set of key fields.

Parameters
  • keys (str or list) – Specifies a list of tick attributes for which unique values are found. The ticks in the input time series must contain those attributes.

  • key_attrs_only (bool) – If set to true, output ticks will contain only key fields. Otherwise, output ticks will contain all fields of an input tick in which a given distinct combination of key values was first encountered.

  • running (bool) –

    Controls whether the aggregation is calculated over a sliding window. Together, the running and bucket_interval parameters determine when new buckets are created.

    • running = True

      The aggregation is calculated over a sliding window.

      • bucket_interval = N (N > 0)

        The window size is N. An output tick is generated when a tick enters the window (arrival event) and when it exits the window (exit event).

      • bucket_interval = 0

        The left boundary of the window is bound to the query start time. For each tick, the aggregation is calculated over the interval [start_time; tick_t], where tick_t is the timestamp of that tick.

    • running = False

      Buckets partition the [query start time, query end time) interval into non-overlapping intervals of size bucket_interval (the last interval may be smaller). If bucket_interval is set to 0, a single bucket is created for the entire interval.

    Default: False, which creates fully independent buckets. Number of buckets = (end - start) / bucket_interval, rounded up when the last bucket is shorter than bucket_interval.

  • bucket_interval (int) – Determines the length of each bucket (the units depend on bucket_units).

  • bucket_time (Literal['start', 'end']) –

    Controls the timestamp assigned to output ticks.

    • start

      the timestamp assigned to the bucket is the start time of the bucket.

    • end

      the timestamp assigned to the bucket is the end time of the bucket.

  • bucket_units (Literal['seconds', 'ticks', 'days', 'months', 'flexible']) –

    Sets the units of bucket_interval.

    If set to flexible, bucket_end_condition must be set.

  • bucket_end_condition (condition) – An expression that is evaluated on every tick. If it evaluates to “True”, a new bucket is created. This parameter is only used if bucket_units is set to “flexible”.

  • boundary_tick_bucket (Literal['new', 'previous']) –

    Controls boundary tick ownership.

    • previous

      A tick on which bucket_end_condition evaluates to “true” belongs to the bucket being closed.

    • new

      A tick on which bucket_end_condition evaluates to “true” belongs to the new bucket.

    This parameter is only used if bucket_units is set to “flexible”.

  • selection (Literal['first', 'last']) – Controls the selection of the respective beginning or trailing part of ticks.
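
The bucketing rules described above for running = False and for bucket_units = 'flexible' can be modelled in plain Python. This is a minimal sketch under stated assumptions, not onetick-py code: the helper names fixed_buckets and flexible_buckets are hypothetical, times are plain numbers expressed in bucket units, and empty buckets are dropped for simplicity. The real engine may handle edge cases differently.

```python
def fixed_buckets(start, end, bucket_interval):
    """Model of running=False: split [start, end) into non-overlapping buckets."""
    if bucket_interval == 0:
        # A zero interval means a single bucket for the whole query interval.
        return [(start, end)]
    buckets = []
    left = start
    while left < end:
        right = min(left + bucket_interval, end)  # the last bucket may be shorter
        buckets.append((left, right))
        left = right
    return buckets


def flexible_buckets(ticks, end_condition, boundary_tick_bucket="new"):
    """Model of bucket_units='flexible': a bucket ends when end_condition(tick) is true."""
    buckets = [[]]
    for tick in ticks:
        if end_condition(tick):
            if boundary_tick_bucket == "previous":
                buckets[-1].append(tick)  # boundary tick stays in the bucket being closed
                buckets.append([])
            else:
                buckets.append([tick])    # boundary tick starts the new bucket
        else:
            buckets[-1].append(tick)
    # Assumption of this sketch: empty buckets are not reported.
    return [b for b in buckets if b]


print(fixed_buckets(0, 10, 4))
print(flexible_buckets([1, 2, 10, 3, 4], lambda t: t >= 10, "new"))
print(flexible_buckets([1, 2, 10, 3, 4], lambda t: t >= 10, "previous"))
```

In this model, a 10-unit query interval with bucket_interval = 4 yields three buckets, the last one covering only two units, and the tick with value 10 moves between the two buckets depending on boundary_tick_bucket.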

Return type

Source

Examples

>>> data = otp.Ticks(dict(x=[1, 3, 1, 5, 3]))
>>> data = data.distinct('x')
>>> data.to_df()
        Time  x
0 2003-12-04  1
1 2003-12-04  3
2 2003-12-04  5
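
To make the effect of keys and key_attrs_only concrete without a OneTick installation, the selection logic can be modelled on plain dictionaries. This is an illustrative sketch only: distinct_ticks is a hypothetical helper, not part of onetick-py, the default of key_attrs_only=True is a choice made for this sketch, and returning results in first-seen order is an assumption.

```python
def distinct_ticks(ticks, keys, key_attrs_only=True):
    """Keep the first tick seen for each distinct combination of key values."""
    if isinstance(keys, str):
        keys = [keys]  # accept a single field name as well as a list
    seen = set()
    result = []
    for tick in ticks:
        combo = tuple(tick[k] for k in keys)
        if combo in seen:
            continue  # this combination of key values was already emitted
        seen.add(combo)
        if key_attrs_only:
            result.append({k: tick[k] for k in keys})  # key fields only
        else:
            result.append(dict(tick))  # all fields of the first matching tick
    return result


# Same x values as the example above, plus a hypothetical extra field y.
ticks = [{"x": x, "y": i} for i, x in enumerate([1, 3, 1, 5, 3])]
print(distinct_ticks(ticks, "x"))
# [{'x': 1}, {'x': 3}, {'x': 5}]
print(distinct_ticks(ticks, "x", key_attrs_only=False))
# [{'x': 1, 'y': 0}, {'x': 3, 'y': 1}, {'x': 5, 'y': 3}]
```

With key_attrs_only=False, the y value in each output row comes from the tick in which that x value was first encountered, which mirrors the description of key_attrs_only in the parameter list.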