Aggregating

Let’s start with an unaggregated time series.

import onetick.py as otp

s = otp.dt(2024, 2, 1, 9, 30)
e = otp.dt(2024, 2, 1, 9, 30, 1)

q = otp.DataSource('US_COMP_SAMPLE', tick_type='TRD')
q = q[['PRICE', 'SIZE', 'COND', 'EXCHANGE']]
otp.run(q, start=s, end=e, symbols=['AAPL'])
Time PRICE SIZE COND EXCHANGE
0 2024-02-01 09:30:00.000961260 184.010 302 @FT P
1 2024-02-01 09:30:00.000961491 184.000 100 @FT P
2 2024-02-01 09:30:00.000961701 184.000 1 @FTI P
3 2024-02-01 09:30:00.000973163 184.000 1 @FTI P
4 2024-02-01 09:30:00.000973355 184.000 5 @FTI P
... ... ... ... ... ...
574 2024-02-01 09:30:00.987184691 183.900 9 @F I K
575 2024-02-01 09:30:00.990378350 183.920 1 @ I D
576 2024-02-01 09:30:00.991941892 183.935 1 @ I D
577 2024-02-01 09:30:00.993785116 183.905 300 @ D
578 2024-02-01 09:30:00.996512511 183.934 5 @ I D

579 rows × 5 columns

Note the total number of trades: 579. We will compare the aggregated results against it.

The agg method aggregates data.

By default, it aggregates over the entire queried interval:

q = otp.DataSource('US_COMP_SAMPLE', tick_type='TRD')
q = q[['PRICE', 'SIZE', 'COND', 'EXCHANGE']]
q = q.agg({
    'volume': otp.agg.sum('SIZE'),
    'vwap': otp.agg.vwap('PRICE', 'SIZE'),
    'count': otp.agg.count(),
})
otp.run(q, start=s, end=e, symbols=['AAPL'])
Time volume vwap count
0 2024-02-01 09:30:01 1349013 183.901435 579
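
To make the three metrics concrete, here is a minimal plain-Python sketch (not onetick-py code) of what a sum, a volume-weighted average price, and a count compute. It uses only the first five trades from the unaggregated output above; the helper name `aggregate` is hypothetical and exists only for this illustration.

```python
# Illustrative only: plain Python, not the onetick-py API.
# (PRICE, SIZE) of the first five trades shown in the unaggregated output.
trades = [
    (184.010, 302),
    (184.000, 100),
    (184.000, 1),
    (184.000, 1),
    (184.000, 5),
]


def aggregate(ticks):
    """Return volume, VWAP and tick count for a list of (price, size) pairs."""
    volume = sum(size for _, size in ticks)
    notional = sum(price * size for price, size in ticks)
    # VWAP is the traded notional divided by the traded volume.
    vwap = notional / volume if volume else float('nan')
    return {'volume': volume, 'vwap': vwap, 'count': len(ticks)}


result = aggregate(trades)
print(result['volume'], round(result['vwap'], 6), result['count'])
```

For these five trades the volume is 409 and the VWAP rounds to 184.007384.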

Or over fixed buckets (also known as bars or windows), for example 100-millisecond buckets (bucket_interval is given in seconds):

q = otp.DataSource('US_COMP_SAMPLE', tick_type='TRD')
q = q[['PRICE', 'SIZE', 'COND', 'EXCHANGE']]
q = q.agg({
    'volume': otp.agg.sum('SIZE'),
    'vwap': otp.agg.vwap('PRICE', 'SIZE')
}, bucket_interval=.1)
otp.run(q, start=s, end=e, symbols=['AAPL'])
Time volume vwap
0 2024-02-01 09:30:00.100 11967 183.968874
1 2024-02-01 09:30:00.200 3182 183.898847
2 2024-02-01 09:30:00.300 2178 183.903108
3 2024-02-01 09:30:00.400 1849 183.919567
4 2024-02-01 09:30:00.500 2595 183.890730
5 2024-02-01 09:30:00.600 1306428 183.900126
6 2024-02-01 09:30:00.700 14454 183.946954
7 2024-02-01 09:30:00.800 5586 183.942992
8 2024-02-01 09:30:00.900 253 183.925455
9 2024-02-01 09:30:01.000 521 183.910768
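
The bucketing idea can be sketched in plain Python (this is not onetick-py code). Each tick is assigned to a fixed-length bucket, and each bucket is labelled by its end time, as in the output above. The function name `bucketize`, the choice of ticks, and the handling of a tick that falls exactly on a bucket boundary are assumptions made for this illustration.

```python
# Illustrative only: plain Python, not the onetick-py API.
BUCKET_NS = 100_000_000    # 0.1 s expressed in nanoseconds
QUERY_NS = 1_000_000_000   # the queried interval is 1 s long

# (nanoseconds since the query start, PRICE, SIZE): a few trades from the
# unaggregated output above.
ticks = [
    (961_260, 184.010, 302),
    (961_491, 184.000, 100),
    (987_184_691, 183.900, 9),
    (993_785_116, 183.905, 300),
]


def bucketize(ticks, bucket_ns, query_ns):
    """Sum SIZE per fixed bucket; each bucket is keyed by its end offset."""
    n_buckets = query_ns // bucket_ns
    # Every bucket exists up front, so empty buckets still appear with 0.
    volumes = {(i + 1) * bucket_ns: 0 for i in range(n_buckets)}
    for offset_ns, _price, size in ticks:
        # Assumption: a tick exactly on a boundary goes to the later bucket,
        # except at the query end, where it stays in the last bucket.
        index = min(offset_ns // bucket_ns, n_buckets - 1)
        volumes[(index + 1) * bucket_ns] += size
    return volumes


volumes = bucketize(ticks, BUCKET_NS, QUERY_NS)
for end_ns, volume in volumes.items():
    print(f'+{end_ns / 1e9:.1f}s volume={volume}')
```

Only the first and last buckets receive volume here because the sample contains ticks from the start and end of the interval only.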

Or over a sliding window:

q = otp.DataSource('US_COMP_SAMPLE', tick_type='TRD')
q = q[['PRICE', 'SIZE', 'COND', 'EXCHANGE']]
q = q.agg({
    'volume': otp.agg.sum('SIZE'),
    'vwap': otp.agg.vwap('PRICE', 'SIZE')
}, bucket_interval=.1, running=True)
otp.run(q, start=s, end=e, symbols=['AAPL'])
Time volume vwap
0 2024-02-01 09:30:00.000961260 302 184.010000
1 2024-02-01 09:30:00.000961491 402 184.007512
2 2024-02-01 09:30:00.000961701 403 184.007494
3 2024-02-01 09:30:00.000973163 404 184.007475
4 2024-02-01 09:30:00.000973355 409 184.007384
... ... ... ...
1127 2024-02-01 09:30:00.994112863 520 183.910846
1128 2024-02-01 09:30:00.995240159 518 183.910695
1129 2024-02-01 09:30:00.996512511 523 183.910918
1130 2024-02-01 09:30:00.996556050 522 183.910843
1131 2024-02-01 09:30:00.997939525 521 183.910768

1132 rows × 3 columns

Note that there are more output ticks than trades (1132 versus 579). This is because an output tick is created not only when an input tick enters the window but also when it drops out.
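
The enter-and-drop-out behaviour can be sketched in plain Python (this is not how OneTick is implemented, only an illustration of the idea). Every tick produces an entry event at its own timestamp and an exit event one window length later, and each event emits an output row. The sample values, the function name `running_volume`, the ordering of events with equal timestamps, and the rule that exits after the query end are not reported are all assumptions.

```python
# Illustrative only: plain Python, not the onetick-py API.
WINDOW_NS = 100_000_000    # 0.1 s sliding window
END_NS = 1_000_000_000     # end of the queried interval

# (nanoseconds since the query start, SIZE); hypothetical sample values.
ticks = [(10_000_000, 302), (50_000_000, 100), (950_000_000, 5)]


def running_volume(ticks, window_ns, end_ns):
    """Emit (time, volume) whenever a tick enters or leaves the window."""
    events = []
    for t, size in ticks:
        events.append((t, size))                   # tick enters the window
        # Assumption: exits that fall after the query end are not reported.
        if t + window_ns <= end_ns:
            events.append((t + window_ns, -size))  # tick drops out
    # At equal times, exits (negative deltas) sort before entries;
    # this tie-break is an assumption.
    events.sort()
    volume, out = 0, []
    for t, delta in events:
        volume += delta
        out.append((t, volume))
    return out


rows = running_volume(ticks, WINDOW_NS, END_NS)
for t, volume in rows:
    print(t, volume)
```

Three input ticks yield five output rows in this sketch: three entries and two exits, because the last tick would leave the window only after the query end.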

We can display all fields of the incoming tick along with the current values of the sliding window metrics.

q = otp.DataSource('US_COMP_SAMPLE', tick_type='TRD')
q = q[['PRICE', 'SIZE', 'COND', 'EXCHANGE']]
q = q.agg({
    'volume': otp.agg.sum('SIZE'),
    'vwap': otp.agg.vwap('PRICE', 'SIZE')
}, bucket_interval=.1, running=True, all_fields=True)
otp.run(q, start=s, end=e, symbols=['AAPL'])
Time PRICE SIZE COND EXCHANGE volume vwap
0 2024-02-01 09:30:00.000961260 184.010 302 @FT P 302 184.010000
1 2024-02-01 09:30:00.000961491 184.000 100 @FT P 402 184.007512
2 2024-02-01 09:30:00.000961701 184.000 1 @FTI P 403 184.007494
3 2024-02-01 09:30:00.000973163 184.000 1 @FTI P 404 184.007475
4 2024-02-01 09:30:00.000973355 184.000 5 @FTI P 409 184.007384
... ... ... ... ... ... ... ...
574 2024-02-01 09:30:00.987184691 183.900 9 @F I K 249 183.921265
575 2024-02-01 09:30:00.990378350 183.920 1 @ I D 226 183.919447
576 2024-02-01 09:30:00.991941892 183.935 1 @ I D 221 183.918959
577 2024-02-01 09:30:00.993785116 183.905 300 @ D 521 183.910921
578 2024-02-01 09:30:00.996512511 183.934 5 @ I D 523 183.910918

579 rows × 7 columns

In this case, the number of output ticks again equals the number of trades, because an output tick is created only when an input tick arrives.
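
A plain-Python sketch of this arrival-only mode follows (again an illustration, not onetick-py code). On each arrival, ticks that are too old are evicted, the new tick is added, and one row is emitted carrying the tick's own fields plus the current window metrics. The function name `running_with_fields` and the exact eviction boundary are assumptions; the input is the first five trades from the output above.

```python
# Illustrative only: plain Python, not the onetick-py API.
from collections import deque

WINDOW_NS = 100_000_000  # 0.1 s sliding window

# (nanoseconds since the query start, PRICE, SIZE): first five trades above.
ticks = [
    (961_260, 184.010, 302),
    (961_491, 184.000, 100),
    (961_701, 184.000, 1),
    (973_163, 184.000, 1),
    (973_355, 184.000, 5),
]


def running_with_fields(ticks, window_ns):
    """Emit one row per arriving tick with the current window metrics."""
    window = deque()
    volume, notional, rows = 0, 0.0, []
    for t, price, size in ticks:
        # Assumption: a tick that is exactly window_ns old has dropped out.
        while window and window[0][0] <= t - window_ns:
            _, old_price, old_size = window.popleft()
            volume -= old_size
            notional -= old_price * old_size
        window.append((t, price, size))
        volume += size
        notional += price * size
        vwap = notional / volume if volume else float('nan')
        rows.append({'t': t, 'PRICE': price, 'SIZE': size,
                     'volume': volume, 'vwap': vwap})
    return rows


rows = running_with_fields(ticks, WINDOW_NS)
for row in rows:
    print(row['t'], row['PRICE'], row['SIZE'], row['volume'], round(row['vwap'], 6))
```

The running volumes come out as 302, 402, 403, 404 and 409, which agrees with the first five rows of the output above.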

All of the aggregation operations support grouping.

q = otp.DataSource('US_COMP_SAMPLE', tick_type='TRD')
q = q[['PRICE', 'SIZE', 'COND', 'EXCHANGE']]
q = q.agg({
    'volume': otp.agg.sum('SIZE'),
    'vwap': otp.agg.vwap('PRICE', 'SIZE')
}, group_by=['EXCHANGE'])
otp.run(q, start=s, end=e, symbols=['AAPL'])
Time EXCHANGE volume vwap
0 2024-02-01 09:30:01 B 184 183.957554
1 2024-02-01 09:30:01 D 3291 183.943396
2 2024-02-01 09:30:01 H 15 183.914667
3 2024-02-01 09:30:01 J 129 183.945426
4 2024-02-01 09:30:01 K 2746 183.925703
5 2024-02-01 09:30:01 N 709 183.942271
6 2024-02-01 09:30:01 P 5681 183.938351
7 2024-02-01 09:30:01 Q 1330182 183.900941
8 2024-02-01 09:30:01 U 433 183.943580
9 2024-02-01 09:30:01 V 1669 183.922624
10 2024-02-01 09:30:01 X 93 183.937527
11 2024-02-01 09:30:01 Y 1036 183.935656
12 2024-02-01 09:30:01 Z 2845 183.938042
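
Grouping can be pictured as keeping one set of accumulators per key. The plain-Python sketch below (not onetick-py code) groups a handful of trades from the unaggregated output by EXCHANGE; the function name `aggregate_by_exchange` is hypothetical.

```python
# Illustrative only: plain Python, not the onetick-py API.
from collections import defaultdict

# (EXCHANGE, PRICE, SIZE): a few trades from the unaggregated output above.
ticks = [
    ('P', 184.010, 302),
    ('P', 184.000, 100),
    ('K', 183.900, 9),
    ('D', 183.920, 1),
    ('D', 183.905, 300),
]


def aggregate_by_exchange(ticks):
    """Return volume and VWAP per exchange, sorted by exchange code."""
    volume = defaultdict(int)
    notional = defaultdict(float)
    for exchange, price, size in ticks:
        volume[exchange] += size
        notional[exchange] += price * size
    return {
        ex: {'volume': volume[ex], 'vwap': notional[ex] / volume[ex]}
        for ex in sorted(volume)
    }


groups = aggregate_by_exchange(ticks)
for exchange, metrics in groups.items():
    print(exchange, metrics['volume'], round(metrics['vwap'], 6))
```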

Note that in non-running mode, OneTick unconditionally divides the whole time interval into buckets of the specified length. This means you always get one output tick per bucket, even if the input contains fewer ticks. For example, aggregating empty data still produces 10 ticks:

t = otp.Empty()
t = t.agg({'COUNT': otp.agg.count()}, bucket_interval=0.1)
otp.run(t, start=s, end=e)
Time COUNT
0 2024-02-01 09:30:00.100 0
1 2024-02-01 09:30:00.200 0
2 2024-02-01 09:30:00.300 0
3 2024-02-01 09:30:00.400 0
4 2024-02-01 09:30:00.500 0
5 2024-02-01 09:30:00.600 0
6 2024-02-01 09:30:00.700 0
7 2024-02-01 09:30:00.800 0
8 2024-02-01 09:30:00.900 0
9 2024-02-01 09:30:01.000 0
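
The count of 10 follows from the interval arithmetic alone: a 1-second query divided into 0.1-second buckets. The plain-Python sketch below (not onetick-py code) reproduces the bucket end timestamps with the standard datetime module. The function name `empty_bucket_counts` is hypothetical, and ignoring a trailing partial bucket is an assumption of the sketch, not a statement about OneTick.

```python
# Illustrative only: plain Python, not the onetick-py API.
from datetime import datetime, timedelta

start = datetime(2024, 2, 1, 9, 30)
end = datetime(2024, 2, 1, 9, 30, 1)
bucket = timedelta(milliseconds=100)


def empty_bucket_counts(start, end, bucket):
    """Return (bucket end time, 0) for every whole bucket in [start, end]."""
    # timedelta // timedelta gives an integer number of whole buckets;
    # a trailing partial bucket is ignored in this sketch (assumption).
    n_buckets = (end - start) // bucket
    return [(start + (i + 1) * bucket, 0) for i in range(n_buckets)]


rows = empty_bucket_counts(start, end, bucket)
for ts, count in rows:
    print(ts.time(), count)
```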

A list of all aggregations appears here. It can also be retrieved with dir(otp.agg).

Aggregation Use Cases

Creating Bars

Golden Cross strategy