Query start / end flow#
Query interval#
onetick.py runs a query for a specified query interval, which can be set implicitly or explicitly.
The query interval [start, end) is specified using the start and end parameters (or the date parameter).
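The [start, end) notation means the interval is half-open: a tick stamped exactly at start is included, while a tick stamped exactly at end is not. The following is a minimal sketch of that rule in plain Python; in_interval is a hypothetical helper written for illustration and is not part of onetick.py.

```python
from datetime import datetime


def in_interval(ts, start, end):
    """Return True if ts falls in the half-open interval [start, end)."""
    return start <= ts < end


start = datetime(2022, 3, 1)
end = datetime(2022, 3, 2)

# A timestamp equal to start is inside the interval.
first_tick = in_interval(datetime(2022, 3, 1, 0, 0, 0), start, end)
# The last representable microsecond of March 1 is still inside.
last_tick = in_interval(datetime(2022, 3, 1, 23, 59, 59, 999999), start, end)
# A timestamp equal to end is outside the interval.
boundary_tick = in_interval(datetime(2022, 3, 2, 0, 0, 0), start, end)

print(first_tick, last_tick, boundary_tick)
```

Because end is excluded, consecutive intervals such as [March 1, March 2) and [March 2, March 3) never count the same tick twice.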
There are two ways of defining start and end for a query: on a source or for the whole query (e.g., in otp.run). Both methods can be combined in some cases.
Query interval on query execution#
Query interval can be set when the query is executed:
# trades are retrieved for the interval [2022/3/1, 2022/3/2) specified when the query is executed on line 2
trades = otp.DataSource(db='NYSE_TAQ', tick_type='TRD', symbols='AAPL')
trades(start=otp.dt(2022, 3, 1), end=otp.dt(2022, 3, 2))
The query interval specified when executing the query applies to every source that does not specify its own interval:
# trades are retrieved for the interval [2022/3/1, 2022/3/2) specified when the query is executed
trades = otp.DataSource(db='NYSE_TAQ', tick_type='TRD', symbols='AAPL')
# quotes are retrieved for the interval [2022/3/1, 2022/3/2) specified when the query is executed
quotes = otp.DataSource(db='NYSE_TAQ', tick_type='QTE', symbols='AAPL')
res = otp.merge([trades, quotes])
res(start=otp.dt(2022, 3, 1), end=otp.dt(2022, 3, 2))
Query interval on a source#
Query interval can be specified when a source is defined:
trades = otp.DataSource(db='NYSE_TAQ', tick_type='TRD', symbols='AAPL', start=otp.dt(2022, 3, 1), end=otp.dt(2022, 3, 2))
trades()
Every source can specify its own interval, and different sources can have different intervals. In the example below, one source computes the volume on March 1 and another computes the volume on March 2. The two sources are then merged, and no interval has to be set for the resulting query.
>>> day1_trades = otp.DataSource(db='NYSE_TAQ', symbol='AAPL', tick_type='TRD', start=otp.dt(2023, 3, 1), end=otp.dt(2023, 3, 2))
>>> day1_volume = day1_trades.agg({'VOLUME': otp.agg.sum(day1_trades['SIZE'])}, bucket_time='start')
>>> day1_volume()  # volume on March 1
        Time    VOLUME
0 2023-03-01  62351689
>>> day2_trades = otp.DataSource(db='NYSE_TAQ', symbol='AAPL', tick_type='TRD', start=otp.dt(2023, 3, 2), end=otp.dt(2023, 3, 3))
>>> day2_volume = day2_trades.agg({'VOLUME': otp.agg.sum(day2_trades['SIZE'])}, bucket_time='start')
>>> day2_volume()  # volume on March 2
        Time    VOLUME
0 2023-03-02  60242644
>>> res = day1_volume + day2_volume # merge ticks
>>> otp.run(res)
        Time    VOLUME
0 2023-03-01  62351689
1 2023-03-01  60242644
The interval can also be specified using the date parameter of otp.DataSource, which sets the start and end parameters to 00:00:00 of the given day and 00:00:00 of the next day, respectively.
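The mapping from a single date to a [start, end) pair can be sketched with the standard library. day_interval below is a hypothetical helper used only to illustrate the rule described above; it is not how onetick.py implements the date parameter, and it uses naive datetimes, so timezone handling is left out.

```python
from datetime import date, datetime, time, timedelta


def day_interval(day):
    """Return (start, end) covering one calendar day as [00:00:00, next day 00:00:00)."""
    start = datetime.combine(day, time.min)  # 00:00:00 of the given day
    end = start + timedelta(days=1)          # 00:00:00 of the next day
    return start, end


start, end = day_interval(date(2022, 3, 1))
print(start, end)
```

Under this rule, passing a date of March 1, 2022 is equivalent to passing start=otp.dt(2022, 3, 1) and end=otp.dt(2022, 3, 2).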
Default query interval#
onetick.py uses the default values onetick.py.config.default_start_time and onetick.py.config.default_end_time for the start and end parameters when they are not set. The default values are useful in here.
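Taken together, the rules on this page suggest an order of precedence: an interval set on a source is used for that source, otherwise the interval given at execution applies, otherwise the configured defaults are used. The sketch below is a simplified model of that order in plain Python. resolve_interval, DEFAULT_START and DEFAULT_END are hypothetical names, and the default values are placeholders rather than values read from onetick.py.

```python
from datetime import datetime

# Placeholder defaults for illustration only (assumed values, not read from the library).
DEFAULT_START = datetime(2003, 12, 1)
DEFAULT_END = datetime(2003, 12, 4)


def resolve_interval(source=None, run=None, default=(DEFAULT_START, DEFAULT_END)):
    """Pick the interval for one source: its own, else the run-level one, else the default."""
    if source is not None:
        return source
    if run is not None:
        return run
    return default


run_interval = (datetime(2022, 3, 1), datetime(2022, 3, 2))
own_interval = (datetime(2023, 3, 1), datetime(2023, 3, 2))

# The source has its own interval, so the run-level interval is ignored for it.
with_own = resolve_interval(source=own_interval, run=run_interval)
# The source has no interval, so it inherits the run-level one.
inherited = resolve_interval(run=run_interval)
# Nothing is set anywhere, so the defaults apply.
fallback = resolve_interval()

print(with_own, inherited, fallback)
```

This model treats start and end as a pair; it does not cover cases where only one of the two is set.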
The otp.dt class#
The start and end parameters accept standard datetime.datetime values as well as otp.dt values. The otp.dt class was introduced to support nanoseconds and DST, which the standard Python datetime.datetime class does not support.
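The nanosecond limitation of datetime.datetime can be checked directly with the standard library. The snippet below is a small illustration that uses only documented datetime behavior and does not involve onetick.py.

```python
from datetime import datetime, timedelta

# The smallest difference datetime can represent is one microsecond.
print(datetime.resolution)

# datetime has no nanosecond argument, so this call is rejected.
nanosecond_error = None
try:
    datetime(2022, 1, 1, nanosecond=456)
except TypeError as exc:
    nanosecond_error = exc

print(type(nanosecond_error).__name__)
```

By contrast, otp.dt accepts a nanosecond argument.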
otp.dt can be used in any onetick.py API call that accepts a date or time as input:
>>> data = otp.Ticks(X=[1, 2, 3])
>>> data['TIME_VALUE'] = otp.dt(2022, 1, 1, nanosecond=456)
>>> otp.run(data)
                     Time  X                    TIME_VALUE
0 2003-12-01 00:00:00.000  1 2022-01-01 00:00:00.000000456
1 2003-12-01 00:00:00.001  2 2022-01-01 00:00:00.000000456
2 2003-12-01 00:00:00.002  3 2022-01-01 00:00:00.000000456
Timezone#
The timezone can be specified in otp.run using the timezone parameter. If it is not set, the default timezone is used.
The default timezone can be changed using the OTP_DEFAULT_TZ environment variable or the otp.config['tz'] config variable:
otp.config['tz'] = 'GMT'
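The timezone matters because the same wall-clock start and end denote different absolute moments in different timezones. The sketch below illustrates this with the standard library only. It uses a fixed UTC-5 offset as a stand-in for New York time on March 1, 2022, an assumption made to keep the example dependency-free; it is not how onetick.py resolves timezones.

```python
from datetime import datetime, timedelta, timezone

GMT = timezone.utc
# Fixed-offset stand-in for US Eastern Standard Time (no DST rules).
EST = timezone(timedelta(hours=-5), "EST")

# The same wall-clock start, midnight on March 1, in two timezones.
start_gmt = datetime(2022, 3, 1, tzinfo=GMT)
start_est = datetime(2022, 3, 1, tzinfo=EST)

# Midnight in EST happens five hours after midnight in GMT.
shift = start_est - start_gmt
print(shift)  # 5:00:00
```

A query for [2022/3/1, 2022/3/2) therefore covers a window shifted by five hours depending on which of these two timezones is in effect.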