otp.Source.update_timestamp

Source.update_timestamp(timestamp_field, timestamp_psec_field=None, max_delay_of_original_timestamp=0, max_delay_of_new_timestamp=0, max_out_of_order_interval=0, max_delay_handling='complain', out_of_order_timestamp_handling='complain', zero_timestamp_handling=None, log_sequence_violations=False, inplace=False)

Assigns alternative timestamps to input ticks.

If the resulting timestamps are out of order, the time series is sorted in ascending order of those timestamps.

Ticks with equal alternative timestamps are sorted in ascending order of the respective original timestamps, and equal ones among those are further sorted by ascending values of the OMDSEQ field, if such a field is present.
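The ordering rule above can be pictured as a sort on a three-part key. The following is a minimal pure-Python sketch, not the OneTick implementation: the dict-based ticks and the field names NEW_TS and ORIG_TS are assumptions made for illustration, and a missing OMDSEQ is simply treated as 0.

```python
from datetime import datetime

# Illustrative ticks as plain dicts; NEW_TS and ORIG_TS are hypothetical field names.
ticks = [
    {"NEW_TS": datetime(2022, 3, 2, 1, 0), "ORIG_TS": datetime(2022, 3, 2, 0, 0, 2), "OMDSEQ": 0, "PRICE": 1.2},
    {"NEW_TS": datetime(2022, 3, 2, 1, 0), "ORIG_TS": datetime(2022, 3, 2, 0, 0, 1), "OMDSEQ": 1, "PRICE": 1.1},
    {"NEW_TS": datetime(2022, 3, 2, 1, 0), "ORIG_TS": datetime(2022, 3, 2, 0, 0, 1), "OMDSEQ": 0, "PRICE": 1.0},
    {"NEW_TS": datetime(2022, 3, 2, 0, 30), "ORIG_TS": datetime(2022, 3, 2, 0, 0, 3), "OMDSEQ": 0, "PRICE": 1.3},
]


def sort_key(tick):
    # 1) alternative timestamp, 2) original timestamp, 3) OMDSEQ if present.
    return (tick["NEW_TS"], tick["ORIG_TS"], tick.get("OMDSEQ", 0))


ordered = sorted(ticks, key=sort_key)
prices = [tick["PRICE"] for tick in ordered]
print(prices)
```

The tick with the earliest alternative timestamp comes first; the three ticks sharing an alternative timestamp are then ordered by original timestamp and, where that also ties, by OMDSEQ.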

Parameters
  • timestamp_field (str) – Specifies the name of the input field, which is assumed to carry alternative timestamps.

  • timestamp_psec_field (str) –

    Fractional (< 1 millisecond) parts of alternative timestamps will be taken from this field. They are assumed to be in picoseconds (0.001 of nanosecond).

    If this field is specified, only the millisecond part of each alternative timestamp is used, even if the alternative timestamps have nanosecond granularity; the nanosecond part is overwritten.

    Useful in case timestamp_field has lower than millisecond granularity.

  • max_delay_of_original_timestamp (int or datetime offset) –

    Changes the start time of the query to original_start_time - max_delay_of_original_timestamp to make sure it processes all possible ticks that might have a new timestamp in the query time range.

    If integer value is specified, it is assumed to be milliseconds.

  • max_delay_of_new_timestamp (int or datetime offset) –

    Changes the end time of the query to original_end_time + max_delay_of_new_timestamp to make sure it processes all possible ticks that might have a new timestamp in the query time range.

    If integer value is specified, it is assumed to be milliseconds.

  • max_out_of_order_interval (int or datetime offset) –

    Specifies the maximum out-of-order interval for alternative timestamps. Ticks with new timestamps out of order will be sorted in ascending order.

    This is the only parameter that leads to accumulation of ticks.

    If integer value is specified, it is assumed to be milliseconds.

  • max_delay_handling (str ('complain' or 'discard' or 'use_original_timestamp' or 'use_new_timestamp')) –

    This parameter specifies how to process ticks that are delayed by more than the value given in the max_delay_of_original_timestamp or max_delay_of_new_timestamp parameter.

    • complain: raise an exception

    • discard: do not add tick to the output time series

    • use_original_timestamp: assign original timestamp to the tick

    • use_new_timestamp: try to assign the new timestamp to the tick. If the new timestamp of the previous tick is greater than the new timestamp of this tick, the previous one is used. NOTE: A previously propagated timestamp could be from a heartbeat. When there is no previous propagated tick and the new timestamp falls behind the query start time, the query start time is used; if the new timestamp exceeds the query end time, the query end time is used.

    This parameter is processed before any action from out_of_order_timestamp_handling parameter.

  • out_of_order_timestamp_handling (str ('complain' or 'use_previous_value' or 'use_original_timestamp')) –

    This parameter specifies how to process ticks that are out of order by more than the value given in the max_out_of_order_interval parameter.

    • complain: raise an exception

    • use_previous_value: assign new timestamp from a previous propagated tick

    • use_original_timestamp: try to use the original timestamp. If the maximum out-of-order interval is still exceeded, an exception is raised.

    This parameter is processed after any action from max_delay_handling parameter.

  • zero_timestamp_handling (str ('preserve_sequence'), optional) –

    This parameter specifies how to process ticks with zero alternative timestamps.

    If the value is None, the actions from max_delay_handling and out_of_order_timestamp_handling are applied to such ticks.

    If the value is preserve_sequence, the new timestamp is set to the maximum of the query start time and the new timestamp of the previous propagated tick.

  • log_sequence_violations (bool) – If set to True, then warnings about actions from parameters out_of_order_timestamp_handling, max_delay_handling and zero_timestamp_handling are logged into the log file.

  • inplace (bool) – The flag controls whether operation should be applied inplace or not. If inplace=True, then it returns nothing. Otherwise method returns a new modified object.
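
As an illustration of the timestamp_psec_field description above, the arithmetic might look like the sketch below. This is plain Python rather than OneTick code: the helper name combine_timestamp is hypothetical, timestamps are modeled as integer nanoseconds since the epoch, and truncating the picosecond value to whole nanoseconds is an assumption of this sketch.

```python
NS_PER_MS = 1_000_000   # nanoseconds in one millisecond
PS_PER_NS = 1_000       # picoseconds in one nanosecond


def combine_timestamp(timestamp_ns: int, psec: int) -> int:
    """Keep the whole-millisecond part of timestamp_ns and take the
    sub-millisecond part from psec (given in picoseconds).

    Hypothetical helper: the result is truncated to nanoseconds (assumption).
    """
    if not 0 <= psec < NS_PER_MS * PS_PER_NS:
        raise ValueError("psec must be a fraction of one millisecond")
    whole_ms_in_ns = (timestamp_ns // NS_PER_MS) * NS_PER_MS
    return whole_ms_in_ns + psec // PS_PER_NS


# ...123 ms plus 456_789 ns; the 456_789 ns part is replaced by 250_000 ns.
original = 1_646_179_200_123_456_789
combined = combine_timestamp(original, 250_000_000)
print(combined)
```

The sub-millisecond digits of the original value do not survive, which matches the statement that the nanosecond part is overwritten when this field is specified.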

Return type

Source or None

Note

If max_delay_of_original_timestamp or max_delay_of_new_timestamp is specified, this method automatically uses the modify_query_times() method. This changes the start or end time of the query and may therefore change the logic of nodes placed higher in the graph.

Also there are some limitations on using this method in the graph, e.g. this method can’t be used in the “diamond” pattern and can’t be used twice in the same graph.
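
The query-time adjustment described in this note can be sketched with standard-library datetime arithmetic. This is an illustration of the documented formulas only (start minus max_delay_of_original_timestamp, end plus max_delay_of_new_timestamp, integers read as milliseconds); the helper names are hypothetical, and timedelta stands in for onetick.py datetime offsets.

```python
from datetime import datetime, timedelta


def to_timedelta(value):
    """Integers are interpreted as milliseconds, as the parameter docs state."""
    if isinstance(value, timedelta):
        return value
    return timedelta(milliseconds=value)


def adjusted_query_times(start, end,
                         max_delay_of_original_timestamp=0,
                         max_delay_of_new_timestamp=0):
    # Hypothetical helper mirroring the documented start/end adjustment.
    new_start = start - to_timedelta(max_delay_of_original_timestamp)
    new_end = end + to_timedelta(max_delay_of_new_timestamp)
    return new_start, new_end


start = datetime(2022, 3, 2)
end = datetime(2022, 3, 3)
new_start, new_end = adjusted_query_times(
    start, end,
    max_delay_of_original_timestamp=timedelta(hours=1),
    max_delay_of_new_timestamp=500,  # milliseconds
)
print(new_start, new_end)
```

Any node above update_timestamp in the graph sees the widened interval, which is why its results can differ from those of the same node run over the original query range.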

Examples

Data and timestamps from the database:

>>> start = otp.dt(2022, 3, 2)
>>> end = otp.dt(2022, 3, 3)
>>> data = otp.DataSource('NYSE_TAQ', symbols='AAPL', tick_type='TRD')
>>> otp.run(data, start=start, end=end)
                     Time  PRICE  SIZE
0 2022-03-02 00:00:00.000    1.0   100
1 2022-03-02 00:00:00.001    1.1   101
2 2022-03-02 00:00:00.002    1.2   102

Adding one hour to all ticks. Parameter max_delay_of_original_timestamp must be specified in this case:

>>> data = otp.DataSource('NYSE_TAQ', symbols='AAPL', tick_type='TRD')
>>> data['ORIG_TS'] = data['TIMESTAMP']
>>> data['NEW_TS'] = data['TIMESTAMP'] + otp.Hour(1)
>>> data = data.update_timestamp('NEW_TS', max_delay_of_original_timestamp=otp.Hour(1))
>>> otp.run(data, start=start, end=end)[['Time', 'PRICE', 'SIZE', 'ORIG_TS']]
                     Time  PRICE  SIZE                  ORIG_TS
0 2022-03-02 01:00:00.000    1.0   100  2022-03-02 00:00:00.000
1 2022-03-02 01:00:00.001    1.1   101  2022-03-02 00:00:00.001
2 2022-03-02 01:00:00.002    1.2   102  2022-03-02 00:00:00.002

Subtracting one day from all ticks. Parameter max_delay_of_new_timestamp must be specified in this case:

>>> data = otp.DataSource('NYSE_TAQ', symbols='AAPL', tick_type='TRD')
>>> data['ORIG_TS'] = data['TIMESTAMP']
>>> data['NEW_TS'] = data['TIMESTAMP'] - otp.Day(1)
>>> data = data.update_timestamp('NEW_TS', max_delay_of_new_timestamp=otp.Day(1))
>>> otp.run(data, start=start - otp.Day(1), end=end)[['Time', 'PRICE', 'SIZE', 'ORIG_TS']]
                     Time  PRICE  SIZE                 ORIG_TS
0 2022-03-01 00:00:00.000    1.0   100 2022-03-02 00:00:00.000
1 2022-03-01 00:00:00.001    1.1   101 2022-03-02 00:00:00.001
2 2022-03-01 00:00:00.002    1.2   102 2022-03-02 00:00:00.002

Parameter max_delay_handling can be used to specify how to handle ticks exceeding the maximum:

>>> data = otp.DataSource('NYSE_TAQ', symbols='AAPL', tick_type='TRD')
>>> data['ORIG_TS'] = data['TIMESTAMP']
>>> data['NEW_TS'] = data.apply(
...     lambda row: row['TIMESTAMP'] + otp.Hour(24)
...     if row['PRICE'] == 1.1
...     else row['TIMESTAMP'] + otp.Hour(1)
... )
>>> data = data.update_timestamp('NEW_TS',
...                              max_delay_of_original_timestamp=otp.Hour(1),
...                              max_delay_handling='discard')
>>> otp.run(data, start=start, end=end)[['Time', 'PRICE', 'SIZE', 'ORIG_TS']]
                     Time  PRICE  SIZE                  ORIG_TS
0 2022-03-02 01:00:00.000    1.0   100  2022-03-02 00:00:00.000
1 2022-03-02 01:00:00.002    1.2   102  2022-03-02 00:00:00.002
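
The effect of max_delay_handling in this example can be mimicked in plain Python. The sketch below is not how OneTick implements it: it assumes that a tick counts as delayed too much when its new timestamp is later than its original timestamp by more than the allowed delay, it uses (orig_ts, new_ts, price) tuples as a stand-in for ticks, and it leaves out the use_new_timestamp option.

```python
from datetime import datetime, timedelta


def apply_max_delay_handling(ticks, max_delay, handling="complain"):
    """ticks: iterable of (orig_ts, new_ts, price) tuples (illustrative layout)."""
    if handling not in ("complain", "discard", "use_original_timestamp"):
        raise ValueError(f"unsupported handling in this sketch: {handling!r}")
    result = []
    for orig_ts, new_ts, price in ticks:
        # Assumption: "delayed too much" means new_ts - orig_ts > max_delay.
        if new_ts - orig_ts > max_delay:
            if handling == "complain":
                raise RuntimeError(f"tick at {orig_ts} exceeds the maximum delay")
            if handling == "discard":
                continue
            new_ts = orig_ts  # use_original_timestamp
        result.append((new_ts, price))
    return result


base = datetime(2022, 3, 2)
ms = timedelta(milliseconds=1)
ticks = [
    (base, base + timedelta(hours=1), 1.0),
    (base + ms, base + ms + timedelta(hours=24), 1.1),   # shifted by 24 hours
    (base + 2 * ms, base + 2 * ms + timedelta(hours=1), 1.2),
]
kept = apply_max_delay_handling(ticks, timedelta(hours=1), handling="discard")
print(kept)
```

With handling="discard" only the ticks priced 1.0 and 1.2 remain, as in the output above; with the default "complain" the same input raises an error instead.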

Parameter max_out_of_order_interval can be used in case the new timestamps are out of order:

>>> data = otp.DataSource('NYSE_TAQ', symbols='AAPL', tick_type='TRD')
>>> data['ORIG_TS'] = data['TIMESTAMP']
>>> data = data.agg({'COUNT': otp.agg.count()}, running=True, all_fields=True)
>>> data['NEW_TS'] = data['TIMESTAMP'] - otp.Minute(data['COUNT'])
>>> data = data.update_timestamp('NEW_TS',
...                              max_delay_of_new_timestamp=otp.Hour(10),
...                              max_out_of_order_interval=otp.Minute(100))
>>> otp.run(data, start=start - otp.Hour(2), end=end)[['Time', 'PRICE', 'SIZE', 'ORIG_TS', 'COUNT']]
                     Time  PRICE  SIZE                 ORIG_TS  COUNT
0 2022-03-01 23:57:00.002    1.2   102 2022-03-02 00:00:00.002      3
1 2022-03-01 23:58:00.001    1.1   101 2022-03-02 00:00:00.001      2
2 2022-03-01 23:59:00.000    1.0   100 2022-03-02 00:00:00.000      1
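
One way to picture why max_out_of_order_interval leads to accumulation of ticks is a bounded reordering buffer. The following is a rough pure-Python sketch under stated assumptions, not the OneTick algorithm: ticks are (new_ts, price) tuples arriving in original order, a tick is held until the latest new timestamp seen is at least the interval ahead of it, and a tick that lags the latest new timestamp by more than the interval raises an error (similar in spirit to the 'complain' handling).

```python
import heapq
from datetime import datetime, timedelta


def reorder(ticks, max_out_of_order_interval):
    """Yield (new_ts, payload) in ascending new_ts using a bounded buffer."""
    heap = []      # min-heap of (new_ts, arrival_index, payload)
    latest = None  # largest new timestamp seen so far
    for index, (new_ts, payload) in enumerate(ticks):
        if latest is not None and new_ts < latest - max_out_of_order_interval:
            raise RuntimeError(f"tick {index} is out of order beyond the interval")
        heapq.heappush(heap, (new_ts, index, payload))
        latest = new_ts if latest is None else max(latest, new_ts)
        # A buffered tick is safe to release once no acceptable future tick
        # can precede it.
        while heap and heap[0][0] <= latest - max_out_of_order_interval:
            ts, _, released = heapq.heappop(heap)
            yield ts, released
    while heap:  # flush whatever is still buffered at the end of input
        ts, _, released = heapq.heappop(heap)
        yield ts, released


base = datetime(2022, 3, 2)
ms = timedelta(milliseconds=1)
# New timestamps as in the example: original time minus COUNT minutes.
arriving = [
    (base - timedelta(minutes=1), 1.0),
    (base + ms - timedelta(minutes=2), 1.1),
    (base + 2 * ms - timedelta(minutes=3), 1.2),
]
result = list(reorder(arriving, timedelta(minutes=100)))
print(result)
```

With a 100-minute interval all three ticks stay in the buffer until the input ends and come out in ascending order of the new timestamps, matching the table above. A much smaller interval, such as 30 seconds, would make the second tick an out-of-order violation in this sketch.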

See also

UPDATE_TIMESTAMP OneTick event processor