Aggregating

Let’s start with an unaggregated time series.

import onetick.py as otp

s = otp.dt(2024, 2, 1, 9, 30)
e = otp.dt(2024, 2, 1, 9, 30, 1)

q = otp.DataSource('US_COMP_SAMPLE', tick_type='TRD')
q = q[['PRICE', 'SIZE', 'COND', 'EXCHANGE']]
otp.run(q, start=s, end=e, symbols=['AAPL'])

	Time	PRICE	SIZE	COND	EXCHANGE
0	2024-02-01 09:30:00.000961260	184.010	302	@FT	P
1	2024-02-01 09:30:00.000961491	184.000	100	@FT	P
2	2024-02-01 09:30:00.000961701	184.000	1	@FTI	P
3	2024-02-01 09:30:00.000973163	184.000	1	@FTI	P
4	2024-02-01 09:30:00.000973355	184.000	5	@FTI	P
...	...	...	...	...	...
574	2024-02-01 09:30:00.987184691	183.900	9	@F I	K
575	2024-02-01 09:30:00.990378350	183.920	1	@ I	D
576	2024-02-01 09:30:00.991941892	183.935	1	@ I	D
577	2024-02-01 09:30:00.993785116	183.905	300	@	D
578	2024-02-01 09:30:00.996512511	183.934	5	@ I	D

579 rows × 5 columns

Note the total number of trades (579); we will compare it with the aggregated output.

The agg method aggregates data.

By default, it aggregates over the entire queried interval:

q = otp.DataSource('US_COMP_SAMPLE', tick_type='TRD')
q = q[['PRICE', 'SIZE', 'COND', 'EXCHANGE']]
q = q.agg({
    'volume': otp.agg.sum('SIZE'),
    'vwap': otp.agg.vwap('PRICE', 'SIZE'),
    'count': otp.agg.count(),
})
otp.run(q, start=s, end=e, symbols=['AAPL'])

	Time	volume	vwap	count
0	2024-02-01 09:30:01	1349013	183.901435	579
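As a rough cross-check of what these three aggregations compute, the sketch below reproduces them in plain Python for the first five trades of the unaggregated output. It does not use onetick.py: the `trades` list and the variable names are illustrative, and the formulas (sum of sizes, size-weighted average price, tick count) are the standard definitions, not a statement about OneTick internals.

```python
# First five (PRICE, SIZE) pairs copied from the unaggregated output above.
trades = [(184.010, 302), (184.000, 100), (184.000, 1), (184.000, 1), (184.000, 5)]

# Standard definitions, assumed to correspond to sum, vwap and count.
volume = sum(size for _, size in trades)
notional = sum(price * size for price, size in trades)
vwap = notional / volume
count = len(trades)

print(volume, round(vwap, 6), count)  # 409 184.007384 5
```

The same 409 and 184.007384 appear later in this section as the sliding-window values after the fifth trade.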

Or over fixed buckets (also known as bars or windows), for example 100-millisecond buckets:

q = otp.DataSource('US_COMP_SAMPLE', tick_type='TRD')
q = q[['PRICE', 'SIZE', 'COND', 'EXCHANGE']]
q = q.agg({
    'volume': otp.agg.sum('SIZE'),
    'vwap': otp.agg.vwap('PRICE', 'SIZE')
}, bucket_interval=.1)
otp.run(q, start=s, end=e, symbols=['AAPL'])

	Time	volume	vwap
0	2024-02-01 09:30:00.100	11967	183.968874
1	2024-02-01 09:30:00.200	3182	183.898847
2	2024-02-01 09:30:00.300	2178	183.903108
3	2024-02-01 09:30:00.400	1849	183.919567
4	2024-02-01 09:30:00.500	2595	183.890730
5	2024-02-01 09:30:00.600	1306428	183.900126
6	2024-02-01 09:30:00.700	14454	183.946954
7	2024-02-01 09:30:00.800	5586	183.942992
8	2024-02-01 09:30:00.900	253	183.925455
9	2024-02-01 09:30:01.000	521	183.910768
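Each output tick above carries the end time of its bucket. A minimal sketch of that labelling in plain Python follows. The tick offsets are hypothetical, `bucket_end_ns` is an illustrative helper rather than part of onetick.py, and the half-open boundary convention is an assumption that the sample data does not exercise.

```python
from collections import defaultdict

BUCKET_NS = 100_000_000  # 0.1 s expressed in nanoseconds


def bucket_end_ns(offset_ns, bucket_ns=BUCKET_NS):
    """Return the end offset of the bucket that contains a tick.

    Assumption: buckets are half-open, [start, end). OneTick's handling of
    a tick that lands exactly on a boundary may differ.
    """
    return (offset_ns // bucket_ns + 1) * bucket_ns


# Hypothetical ticks: (nanoseconds since query start, SIZE).
ticks = [(961_260, 302), (961_491, 100), (150_000_000, 40), (990_378_350, 1)]

volume_by_bucket = defaultdict(int)
for offset_ns, size in ticks:
    volume_by_bucket[bucket_end_ns(offset_ns)] += size

for end_ns, volume in sorted(volume_by_bucket.items()):
    print(f"+{end_ns / 1e9:.1f}s", volume)
```

Working in integer nanoseconds avoids floating-point rounding when a tick falls close to a bucket boundary.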

Or over a sliding window:

q = otp.DataSource('US_COMP_SAMPLE', tick_type='TRD')
q = q[['PRICE', 'SIZE', 'COND', 'EXCHANGE']]
q = q.agg({
    'volume': otp.agg.sum('SIZE'),
    'vwap': otp.agg.vwap('PRICE', 'SIZE')
}, bucket_interval=.1, running=True)
otp.run(q, start=s, end=e, symbols=['AAPL'])

	Time	volume	vwap
0	2024-02-01 09:30:00.000961260	302	184.010000
1	2024-02-01 09:30:00.000961491	402	184.007512
2	2024-02-01 09:30:00.000961701	403	184.007494
3	2024-02-01 09:30:00.000973163	404	184.007475
4	2024-02-01 09:30:00.000973355	409	184.007384
...	...	...	...
1127	2024-02-01 09:30:00.994112863	520	183.910846
1128	2024-02-01 09:30:00.995240159	518	183.910695
1129	2024-02-01 09:30:00.996512511	523	183.910918
1130	2024-02-01 09:30:00.996556050	522	183.910843
1131	2024-02-01 09:30:00.997939525	521	183.910768

1132 rows × 3 columns

Note that there are more output ticks (1132) than trades (579). This is because an output tick is created not only when an input tick enters the window but also when it drops out.
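The entry and exit behaviour can be sketched in plain Python as a list of signed events. The ticks are hypothetical, and the rule that an exit is emitted only if it falls before the query end is an assumption, although it is consistent with the counts above (579 entries plus 553 exits).

```python
WINDOW_NS = 100_000_000   # 0.1 s sliding window
END_NS = 1_000_000_000    # query end, 1 s after the start

# Hypothetical ticks: (nanoseconds since query start, SIZE).
ticks = [(10_000_000, 5), (50_000_000, 7), (950_000_000, 3)]

events = []
for t, size in ticks:
    events.append((t, size))                    # tick enters the window
    if t + WINDOW_NS < END_NS:                  # assumed: exits past the end are not emitted
        events.append((t + WINDOW_NS, -size))   # tick drops out of the window

running_volume = 0
output = []
for t, delta in sorted(events):
    running_volume += delta
    output.append((t, running_volume))

print(len(ticks), "input ticks ->", len(output), "output ticks")
```

Here three input ticks yield five output ticks: the last tick enters too close to the end of the interval for its exit to be reported.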

We can display all fields of the incoming tick along with the current values of the sliding window metrics.

q = otp.DataSource('US_COMP_SAMPLE', tick_type='TRD')
q = q[['PRICE', 'SIZE', 'COND', 'EXCHANGE']]
q = q.agg({
    'volume': otp.agg.sum('SIZE'),
    'vwap': otp.agg.vwap('PRICE', 'SIZE')
}, bucket_interval=.1, running=True, all_fields=True)
otp.run(q, start=s, end=e, symbols=['AAPL'])

	Time	PRICE	SIZE	COND	EXCHANGE	volume	vwap
0	2024-02-01 09:30:00.000961260	184.010	302	@FT	P	302	184.010000
1	2024-02-01 09:30:00.000961491	184.000	100	@FT	P	402	184.007512
2	2024-02-01 09:30:00.000961701	184.000	1	@FTI	P	403	184.007494
3	2024-02-01 09:30:00.000973163	184.000	1	@FTI	P	404	184.007475
4	2024-02-01 09:30:00.000973355	184.000	5	@FTI	P	409	184.007384
...	...	...	...	...	...	...	...
574	2024-02-01 09:30:00.987184691	183.900	9	@F I	K	249	183.921265
575	2024-02-01 09:30:00.990378350	183.920	1	@ I	D	226	183.919447
576	2024-02-01 09:30:00.991941892	183.935	1	@ I	D	221	183.918959
577	2024-02-01 09:30:00.993785116	183.905	300	@	D	521	183.910921
578	2024-02-01 09:30:00.996512511	183.934	5	@ I	D	523	183.910918

579 rows × 7 columns

In this case, the number of output ticks again equals the number of trades, because an output tick is created only on the arrival of an input tick.
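A per-arrival trailing window of this kind can be sketched with a deque: each incoming tick is appended, expired ticks are evicted, and one row is emitted that carries the tick's own fields plus the window metrics. The ticks are hypothetical, and the exact eviction boundary (a tick leaves once it is a full window old) is an assumption.

```python
from collections import deque

WINDOW_NS = 100_000_000  # 0.1 s trailing window

# Hypothetical ticks: (nanoseconds since query start, PRICE, SIZE).
ticks = [(10_000_000, 184.0, 5), (50_000_000, 184.1, 5), (130_000_000, 184.2, 10)]

window = deque()
rows = []
for t, price, size in ticks:
    window.append((t, price, size))
    # Assumption: a tick leaves the window once it is WINDOW_NS old or older.
    while window[0][0] <= t - WINDOW_NS:
        window.popleft()
    volume = sum(s for _, _, s in window)
    vwap = sum(p * s for _, p, s in window) / volume
    # One output row per input tick: original fields plus window metrics.
    rows.append((t, price, size, volume, round(vwap, 6)))

for row in rows:
    print(row)
```

Recomputing the sums on every tick keeps the sketch short; a production version would maintain running totals instead.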

All of the aggregation operations support grouping.

q = otp.DataSource('US_COMP_SAMPLE', tick_type='TRD')
q = q[['PRICE', 'SIZE', 'COND', 'EXCHANGE']]
q = q.agg({
    'volume': otp.agg.sum('SIZE'),
    'vwap': otp.agg.vwap('PRICE', 'SIZE')
}, group_by=['EXCHANGE'])
otp.run(q, start=s, end=e, symbols=['AAPL'])

	Time	EXCHANGE	volume	vwap
0	2024-02-01 09:30:01	B	184	183.957554
1	2024-02-01 09:30:01	D	3291	183.943396
2	2024-02-01 09:30:01	H	15	183.914667
3	2024-02-01 09:30:01	J	129	183.945426
4	2024-02-01 09:30:01	K	2746	183.925703
5	2024-02-01 09:30:01	N	709	183.942271
6	2024-02-01 09:30:01	P	5681	183.938351
7	2024-02-01 09:30:01	Q	1330182	183.900941
8	2024-02-01 09:30:01	U	433	183.943580
9	2024-02-01 09:30:01	V	1669	183.922624
10	2024-02-01 09:30:01	X	93	183.937527
11	2024-02-01 09:30:01	Y	1036	183.935656
12	2024-02-01 09:30:01	Z	2845	183.938042
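Conceptually, grouping keeps a separate accumulator per key and emits one result per key. The plain-Python sketch below illustrates this with hypothetical ticks; it shows the general idea, not how OneTick implements group_by.

```python
from collections import defaultdict

# Hypothetical ticks: (EXCHANGE, PRICE, SIZE).
ticks = [("P", 184.0, 100), ("D", 183.9, 50), ("P", 184.2, 100), ("D", 183.9, 150)]

# EXCHANGE -> [volume, sum of PRICE * SIZE]
totals = defaultdict(lambda: [0, 0.0])
for exchange, price, size in ticks:
    totals[exchange][0] += size
    totals[exchange][1] += price * size

# One (volume, vwap) pair per exchange, sorted by key like the output above.
result = {
    exchange: (volume, round(notional / volume, 6))
    for exchange, (volume, notional) in sorted(totals.items())
}
print(result)
```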

Note that in non-running mode OneTick unconditionally divides the whole time interval into buckets of the specified length. You therefore always get one output tick per bucket, even if the input data has fewer ticks. For example, aggregating empty data still produces 10 ticks:

t = otp.Empty()
t = t.agg({'COUNT': otp.agg.count()}, bucket_interval=0.1)
otp.run(t, start=s, end=e)

	Time	COUNT
0	2024-02-01 09:30:00.100	0
1	2024-02-01 09:30:00.200	0
2	2024-02-01 09:30:00.300	0
3	2024-02-01 09:30:00.400	0
4	2024-02-01 09:30:00.500	0
5	2024-02-01 09:30:00.600	0
6	2024-02-01 09:30:00.700	0
7	2024-02-01 09:30:00.800	0
8	2024-02-01 09:30:00.900	0
9	2024-02-01 09:30:01.000	0
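The reason is that the bucket grid depends only on the query interval and the bucket length, not on the data. A minimal plain-Python sketch of that idea follows; `count_per_bucket` is an illustrative helper, and both the half-open bucket convention and the handling of a trailing partial bucket are assumptions.

```python
BUCKET_NS = 100_000_000                 # 0.1 s bucket length
START_NS, END_NS = 0, 1_000_000_000     # 1 s query interval


def count_per_bucket(tick_offsets_ns):
    """Count ticks per bucket on a grid fixed by the query interval."""
    span_ns = END_NS - START_NS
    n_buckets = -(-span_ns // BUCKET_NS)  # ceiling division
    counts = [0] * n_buckets              # every bucket exists up front
    for t in tick_offsets_ns:
        counts[(t - START_NS) // BUCKET_NS] += 1
    return counts


print(count_per_bucket([]))                      # ten zeros
print(count_per_bucket([961_260, 150_000_000]))  # two non-empty buckets, eight zeros
```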

A list of all aggregations appears here. It can also be retrieved with dir(otp.agg).

Aggregation Use Cases

Creating Bars

Golden Cross strategy