Data Retrieval#

import onetick.py as otp
otp.config['tz'] = 'EST5EDT'

OneTick is a time series database meaning that each record has a timestamp and timestamps of consecutive records are non-decreasing. Multiple time series are stored in OneTick. An individual time series is identified by the symbol (aka ticker, financial instrument, security), tick type (i.e., the type of data such as trades or quotes), and the name of the database where the time series is stored.

# you can retrieve the list of databases available
dbs = otp.databases()
list(dbs)[:5]
['ACTIV_ALPHA', 'ACTIV_CANADA', 'ACTIV_CSE', 'ACTIV_CSE2', 'ACTIV_CX2']
# the list of dates with data for the db
dbs['ACTIV_CANADA'].dates()[-5:]
[datetime.date(2023, 4, 15),
 datetime.date(2023, 4, 16),
 datetime.date(2023, 4, 17),
 datetime.date(2023, 4, 18),
 datetime.date(2023, 4, 19)]
# or just the last day with data
dbs['ACTIV_CANADA'].last_date
datetime.date(2023, 4, 19)
# and the list of tick types
dbs['ACTIV_CANADA'].tick_types()
['DAY', 'QTE', 'STAT', 'TRD']

We can now retrieve symbols traded in a given time range. (In many financial markets, there are properties that remain constant throughout the trading day. Examples include the name of a financial instrument and the set of instruments traded).

symbols = otp.Symbols('ACTIV_CANADA')
otp.run(symbols, start=otp.dt(2023,4,11), end=otp.dt(2023,4,12))
Time SYMBOL_NAME
0 2023-04-11 A.H
1 2023-04-11 AAA.P
2 2023-04-11 AAAJ.P
3 2023-04-11 AAB
4 2023-04-11 AAC.P
... ... ...
5737 2023-04-11 ZYZ.SB.A
5738 2023-04-11 ZYZ.UN.U
5739 2023-04-11 ZZE.H
5740 2023-04-11 ZZZ
5741 2023-04-11 ZZZD

5742 rows × 2 columns

We used the otp.run method above, which executed a query that retrieved the list of symbols.

The start and end timestamps were specified with onetick-py’s datetime class otp.dt.

Now that we have database names, tick types, and symbols, we are ready to query a time series.

q = otp.DataSource('NYSE_TAQ', tick_type='TRD')
otp.config['tz'] = 'EST5EDT' # set the timezone
otp.run(q, start=otp.dt(2023,3,29,9,30), end=otp.dt(2023,3,29,10), symbols=['SPY'])
Time EXCHANGE COND STOP_STOCK SOURCE TRF TTE TICKER PRICE DELETED_TIME TICK_STATUS SIZE CORR SEQ_NUM TRADE_ID PARTICIPANT_TIME TRF_TIME OMDSEQ
0 2023-03-29 09:30:00.000877568 P T N C 0 SPY 399.9200 1969-12-31 19:00:00 0 400 0 82274 52983525290315 2023-03-29 09:30:00.000854016 1969-12-31 19:00:00.000000000 0
1 2023-03-29 09:30:00.001151232 T T N C 0 SPY 399.9200 1969-12-31 19:00:00 0 1000 0 82280 62879132897848 2023-03-29 09:30:00.000801064 1969-12-31 19:00:00.000000000 0
2 2023-03-29 09:30:00.001154304 T T N C 0 SPY 399.9200 1969-12-31 19:00:00 0 1000 0 82281 62879132897849 2023-03-29 09:30:00.000801064 1969-12-31 19:00:00.000000000 1
3 2023-03-29 09:30:00.001921280 T T N C 0 SPY 399.9300 1969-12-31 19:00:00 0 657 0 82282 62879132897891 2023-03-29 09:30:00.001574348 1969-12-31 19:00:00.000000000 2
4 2023-03-29 09:30:00.010831360 Z F N C 1 SPY 399.9250 1969-12-31 19:00:00 0 100 0 82401 52983525039159 2023-03-29 09:30:00.010560000 1969-12-31 19:00:00.000000000 0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
73249 2023-03-29 09:59:59.690966784 D I N C T 0 SPY 399.8582 1969-12-31 19:00:00 0 3 0 301129 71675710421041 2023-03-29 09:59:59.690309443 2023-03-29 09:59:59.690612916 0
73250 2023-03-29 09:59:59.697699840 D I N C N 0 SPY 399.8424 1969-12-31 19:00:00 0 5 0 301131 79371887987030 2023-03-29 09:59:59.696245391 2023-03-29 09:59:59.697677652 0
73251 2023-03-29 09:59:59.707425024 D I N C T 0 SPY 399.8600 1969-12-31 19:00:00 0 1 0 301132 71675710421059 2023-03-29 09:59:59.633000000 2023-03-29 09:59:59.707071825 0
73252 2023-03-29 09:59:59.928770304 D I N C T 0 SPY 399.8600 1969-12-31 19:00:00 0 1 0 301136 71675710423605 2023-03-29 09:59:59.897000000 2023-03-29 09:59:59.928417034 0
73253 2023-03-29 09:59:59.949504768 D I N C T 0 SPY 399.8500 1969-12-31 19:00:00 0 35 0 301138 71675710423626 2023-03-29 09:59:59.948898628 2023-03-29 09:59:59.949147444 0

73254 rows × 18 columns

Note that we specified the start and end of the time series to retrieve the corresponding interval.

Let’s just keep the columns we’re interested in to make it more digestible.

q = otp.DataSource('NYSE_TAQ', tick_type='TRD')
q = q[['PRICE','SIZE','COND','EXCHANGE']]
otp.run(q, start=otp.dt(2023,3,29,9,30), end=otp.dt(2023,3,29,10), symbols=['SPY'])
Time PRICE SIZE COND EXCHANGE
0 2023-03-29 09:30:00.000877568 399.9200 400 T P
1 2023-03-29 09:30:00.001151232 399.9200 1000 T T
2 2023-03-29 09:30:00.001154304 399.9200 1000 T T
3 2023-03-29 09:30:00.001921280 399.9300 657 T T
4 2023-03-29 09:30:00.010831360 399.9250 100 F Z
... ... ... ... ... ...
73249 2023-03-29 09:59:59.690966784 399.8582 3 I D
73250 2023-03-29 09:59:59.697699840 399.8424 5 I D
73251 2023-03-29 09:59:59.707425024 399.8600 1 I D
73252 2023-03-29 09:59:59.928770304 399.8600 1 I D
73253 2023-03-29 09:59:59.949504768 399.8500 35 I D

73254 rows × 5 columns

We can retrieve multiple time series.

q = otp.DataSource('NYSE_TAQ', tick_type='TRD')
q = q[['PRICE','SIZE','COND','EXCHANGE']]
mult = otp.run(q, start=otp.dt(2023,3,29,9,30), end=otp.dt(2023,3,29,16), symbols=['SPY', 'QQQ'])
mult
{'SPY':                                 Time    PRICE  SIZE  COND EXCHANGE
 0      2023-03-29 09:30:00.000877568  399.920   400    T         P
 1      2023-03-29 09:30:00.001151232  399.920  1000    T         T
 2      2023-03-29 09:30:00.001154304  399.920  1000    T         T
 3      2023-03-29 09:30:00.001921280  399.930   657    T         T
 4      2023-03-29 09:30:00.010831360  399.925   100   F          Z
 ...                              ...      ...   ...   ...      ...
 537609 2023-03-29 15:59:59.994555136  401.350   643   F          P
 537610 2023-03-29 15:59:59.995045376  401.350   900   F          P
 537611 2023-03-29 15:59:59.997313024  401.340   100              Z
 537612 2023-03-29 15:59:59.997354752  401.340   498              N
 537613 2023-03-29 15:59:59.997406208  401.340   200              T
 
 [537614 rows x 5 columns],
 'QQQ':                                 Time    PRICE  SIZE  COND EXCHANGE
 0      2023-03-29 09:30:00.004574216  310.880   484  @F          Z
 1      2023-03-29 09:30:00.041854238  310.880   100  @           P
 2      2023-03-29 09:30:00.041854358  310.880   100  @  Q        P
 3      2023-03-29 09:30:00.079583114  310.900    50  @  I        J
 4      2023-03-29 09:30:00.079611884  310.900    50  @  I        Y
 ...                              ...      ...   ...   ...      ...
 409494 2023-03-29 15:59:59.974708442  312.810    38  @  I        Q
 409495 2023-03-29 15:59:59.988868197  312.810    32  @F I        X
 409496 2023-03-29 15:59:59.988872215  312.805   200  @F          Q
 409497 2023-03-29 15:59:59.988874072  312.810    49  @F I        Q
 409498 2023-03-29 15:59:59.989201372  312.810   600  @F          K
 
 [409499 rows x 5 columns]}

Each time series is returned as the value of a dict keyed by the corresponding symbol.

mult['QQQ']
Time PRICE SIZE COND EXCHANGE
0 2023-03-29 09:30:00.004574216 310.880 484 @F Z
1 2023-03-29 09:30:00.041854238 310.880 100 @ P
2 2023-03-29 09:30:00.041854358 310.880 100 @ Q P
3 2023-03-29 09:30:00.079583114 310.900 50 @ I J
4 2023-03-29 09:30:00.079611884 310.900 50 @ I Y
... ... ... ... ... ...
409494 2023-03-29 15:59:59.974708442 312.810 38 @ I Q
409495 2023-03-29 15:59:59.988868197 312.810 32 @F I X
409496 2023-03-29 15:59:59.988872215 312.805 200 @F Q
409497 2023-03-29 15:59:59.988874072 312.810 49 @F I Q
409498 2023-03-29 15:59:59.989201372 312.810 600 @F K

409499 rows × 5 columns

We can also retrieve all of the symbols from the database or all of the symbols matching a pattern.

q = otp.DataSource('NYSE_TAQ', tick_type='TRD')
q = q[['PRICE','SIZE','COND','EXCHANGE']]
mult = otp.run(q, start=otp.dt(2023,3,29,9,30), end=otp.dt(2023,3,29,16), symbols=otp.Symbols('NYSE_TAQ', pattern='AAA%'))
mult
{'AAA':                            Time    PRICE  SIZE  COND EXCHANGE
 0 2023-03-29 10:13:42.861570816  24.5699     1     I        D
 1 2023-03-29 11:58:56.726636288  24.5100    39     I        V
 2 2023-03-29 12:30:32.157703680  24.5350    48     I        V,
 'AAAU':                               Time   PRICE   SIZE  COND EXCHANGE
 0    2023-03-29 09:30:00.019952128  19.480   8892   O X        Z
 1    2023-03-29 09:30:00.236440064  19.475      4     I        Z
 2    2023-03-29 09:30:00.287164928  19.475      1     I        D
 3    2023-03-29 09:30:00.331036416  19.480      1     I        D
 4    2023-03-29 09:30:01.001156864  19.480      1     I        D
 ...                            ...     ...    ...   ...      ...
 1370 2023-03-29 15:59:57.926450432  19.470   7600   F          Y
 1371 2023-03-29 15:59:57.926453504  19.470   6200   F          Y
 1372 2023-03-29 15:59:57.926459648  19.470  10400   F          Y
 1373 2023-03-29 15:59:57.927043072  19.470  27709   F          N
 1374 2023-03-29 15:59:57.929156864  19.470    908   F          A
 
 [1375 rows x 5 columns]}

You may choose to merge all of the time series by time.

q = otp.DataSource('NYSE_TAQ', tick_type='TRD')
q = q[['PRICE','SIZE','COND','EXCHANGE']]
q = otp.merge([q], symbols=['SPY', 'QQQ'], identify_input_ts=True)
single = otp.run(q, start=otp.dt(2023,3,29,9,30), end=otp.dt(2023,3,29,16))
single
Time PRICE SIZE COND EXCHANGE SYMBOL_NAME TICK_TYPE
0 2023-03-29 09:30:00.000877568 399.92 400 T P SPY TRD
1 2023-03-29 09:30:00.001151232 399.92 1000 T T SPY TRD
2 2023-03-29 09:30:00.001154304 399.92 1000 T T SPY TRD
3 2023-03-29 09:30:00.001921280 399.93 657 T T SPY TRD
4 2023-03-29 09:30:00.004574216 310.88 484 @F Z QQQ TRD
... ... ... ... ... ... ... ...
947108 2023-03-29 15:59:59.994555136 401.35 643 F P SPY TRD
947109 2023-03-29 15:59:59.995045376 401.35 900 F P SPY TRD
947110 2023-03-29 15:59:59.997313024 401.34 100 Z SPY TRD
947111 2023-03-29 15:59:59.997354752 401.34 498 N SPY TRD
947112 2023-03-29 15:59:59.997406208 401.34 200 T SPY TRD

947113 rows × 7 columns