Order Book Analytics#
onetick-py offers functions for analyzing tick-by-tick order book data. An order book can be represented in three ways. We show only the top 3 levels for ease of exposition.
A book can be displayed with a tick per level per side. We refer to a level in the book as a ‘price level’ or ‘prl’.
import onetick.py as otp
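snapshot_time = otp.dt(2022, 3, 2, 10)
# One tick per price level per side, limited to the top 3 levels.
prl = otp.ObSnapshot(db='CME', tick_type='PRL_FULL', max_levels=3)
# The raw string keeps the backslash in the symbol name literal.
otp.run(prl, symbols=r'NQ\H22', start=snapshot_time, end=snapshot_time)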
| | Time | PRICE | UPDATE_TIME | SIZE | LEVEL | BUY_SELL_FLAG |
---|---|---|---|---|---|---|
0 | 2022-03-02 10:00:00 | 14092.00 | 2022-03-02 09:59:59.928574271 | 1 | 1 | 1 |
1 | 2022-03-02 10:00:00 | 14092.25 | 2022-03-02 09:59:59.901649655 | 3 | 2 | 1 |
2 | 2022-03-02 10:00:00 | 14092.50 | 2022-03-02 09:59:59.873574571 | 2 | 3 | 1 |
3 | 2022-03-02 10:00:00 | 14090.75 | 2022-03-02 09:59:59.923928381 | 3 | 1 | 0 |
4 | 2022-03-02 10:00:00 | 14090.50 | 2022-03-02 09:59:59.923521093 | 3 | 2 | 0 |
5 | 2022-03-02 10:00:00 | 14090.25 | 2022-03-02 09:59:59.924031067 | 4 | 3 | 0 |
Alternatively, a book can show a tick per level with both ask and bid price/size info.
snapshot_time = otp.dt(2022, 3, 2, 10)
# One tick per level, with bid and ask fields side by side.
prl = otp.ObSnapshotWide(db='CME', tick_type='PRL_FULL', max_levels=3)
otp.run(prl, symbols=r'NQ\H22', start=snapshot_time, end=snapshot_time)
| | Time | BID_PRICE | BID_UPDATE_TIME | BID_SIZE | ASK_PRICE | ASK_UPDATE_TIME | ASK_SIZE | LEVEL |
---|---|---|---|---|---|---|---|---|
0 | 2022-03-02 10:00:00 | 14090.75 | 2022-03-02 09:59:59.923928381 | 3 | 14092.00 | 2022-03-02 09:59:59.928574271 | 1 | 1 |
1 | 2022-03-02 10:00:00 | 14090.50 | 2022-03-02 09:59:59.923521093 | 3 | 14092.25 | 2022-03-02 09:59:59.901649655 | 3 | 2 |
2 | 2022-03-02 10:00:00 | 14090.25 | 2022-03-02 09:59:59.924031067 | 4 | 14092.50 | 2022-03-02 09:59:59.873574571 | 2 | 3 |
Finally, all levels can be displayed in one tick.
snapshot_time = otp.dt(2022, 3, 2, 10)
# A single tick whose field names carry the level number as a suffix.
prl = otp.ObSnapshotFlat(db='CME', tick_type='PRL_FULL', max_levels=3)
print(otp.run(prl, symbols=r'NQ\H22', start=snapshot_time, end=snapshot_time))
Time BID_PRICE1 BID_UPDATE_TIME1 BID_SIZE1 ASK_PRICE1 ASK_UPDATE_TIME1 ASK_SIZE1 BID_PRICE2 BID_UPDATE_TIME2 BID_SIZE2 ASK_PRICE2 ASK_UPDATE_TIME2 ASK_SIZE2 BID_PRICE3 BID_UPDATE_TIME3 BID_SIZE3 ASK_PRICE3 ASK_UPDATE_TIME3 ASK_SIZE3
0 2022-03-02 10:00:00 14090.75 2022-03-02 09:59:59.923928381 3 14092.0 2022-03-02 09:59:59.928574271 1 14090.5 2022-03-02 09:59:59.923521093 3 14092.25 2022-03-02 09:59:59.901649655 3 14090.25 2022-03-02 09:59:59.924031067 4 14092.5 2022-03-02 09:59:59.873574571 2
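The three layouts carry the same information. As a rough illustration in plain Python (no OneTick calls; the helper names `to_wide` and `to_flat` are made up for this sketch), the per-level-per-side rows shown earlier can be regrouped into the wide and flat layouts. The update-time columns are left out for brevity, and the reading of BUY_SELL_FLAG as 1 for asks and 0 for bids is inferred from the sample output above.

```python
# (PRICE, SIZE, LEVEL, BUY_SELL_FLAG) copied from the per-level-per-side output.
long_rows = [
    (14092.00, 1, 1, 1),
    (14092.25, 3, 2, 1),
    (14092.50, 2, 3, 1),
    (14090.75, 3, 1, 0),
    (14090.50, 3, 2, 0),
    (14090.25, 4, 3, 0),
]


def to_wide(rows):
    """Group rows by LEVEL, giving one dict per level with bid and ask fields."""
    levels = {}
    for price, size, level, flag in rows:
        side = 'ASK' if flag == 1 else 'BID'  # assumption read off the sample data
        entry = levels.setdefault(level, {'LEVEL': level})
        entry[f'{side}_PRICE'] = price
        entry[f'{side}_SIZE'] = size
    return [levels[level] for level in sorted(levels)]


def to_flat(wide_rows):
    """Merge all levels into one dict, suffixing each field with its level."""
    flat = {}
    for row in wide_rows:
        for key, value in row.items():
            if key != 'LEVEL':
                flat[f"{key}{row['LEVEL']}"] = value
    return flat


wide = to_wide(long_rows)
flat = to_flat(wide)
for row in wide:
    print(row)
print(flat)
```

The wide rows line up with the ObSnapshotWide table and the flat dict with the ObSnapshotFlat output, apart from the omitted time columns.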
We can output the book (in any of the three representations) on every change to the price or size at any of the levels.
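# running=True emits a new snapshot on every change to the top 3 levels.
prl = otp.ObSnapshotFlat(db='CME', tick_type='PRL_FULL', max_levels=3, running=True)
# Drop the update-time columns to keep the output narrow.
prl = prl.drop(r".+TIME\d")
print(otp.run(prl, symbols=r'NQ\H22', start=otp.dt(2022, 3, 2, 10), end=otp.dt(2022, 3, 2, 10) + otp.Milli(100)))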
Time BID_PRICE1 BID_SIZE1 ASK_PRICE1 ASK_SIZE1 BID_PRICE2 BID_SIZE2 ASK_PRICE2 ASK_SIZE2 BID_PRICE3 BID_SIZE3 ASK_PRICE3 ASK_SIZE3
0 2022-03-02 10:00:00.000000000 14090.75 3 14092.00 1 14090.50 3 14092.25 3 14090.25 4 14092.50 2
1 2022-03-02 10:00:00.005759409 14091.00 1 14092.00 1 14090.75 3 14092.25 3 14090.50 3 14092.50 2
2 2022-03-02 10:00:00.005759583 14091.00 1 14092.25 3 14090.75 3 14092.50 2 14090.50 3 14092.75 2
3 2022-03-02 10:00:00.005837891 14091.00 1 14092.25 2 14090.75 3 14092.50 2 14090.50 3 14092.75 2
4 2022-03-02 10:00:00.006117137 14091.25 1 14092.25 2 14091.00 1 14092.50 2 14090.75 3 14092.75 2
5 2022-03-02 10:00:00.006155241 14091.25 1 14092.25 2 14091.00 1 14092.50 3 14090.75 3 14092.75 2
6 2022-03-02 10:00:00.006287769 14091.25 1 14092.25 1 14091.00 1 14092.50 3 14090.75 3 14092.75 2
7 2022-03-02 10:00:00.006485627 14091.00 1 14092.25 1 14090.75 3 14092.50 3 14090.50 3 14092.75 2
8 2022-03-02 10:00:00.006542011 14091.00 1 14092.25 1 14090.75 3 14092.50 2 14090.50 3 14092.75 2
9 2022-03-02 10:00:00.006551707 14091.00 1 14092.25 1 14090.75 3 14092.50 1 14090.50 3 14092.75 2
10 2022-03-02 10:00:00.006736777 14091.00 1 14092.25 2 14090.75 3 14092.50 1 14090.50 3 14092.75 2
11 2022-03-02 10:00:00.006802977 14091.00 1 14092.25 3 14090.75 3 14092.50 1 14090.50 3 14092.75 2
12 2022-03-02 10:00:00.006955737 14091.00 1 14092.25 3 14090.75 3 14092.50 2 14090.50 3 14092.75 2
13 2022-03-02 10:00:00.015703265 14091.00 1 14092.25 2 14090.75 3 14092.50 2 14090.50 3 14092.75 2
The ObSnapshot method doesn’t require specifying max_levels. The entire book is returned when the parameter is not specified.
snapshot_time = otp.dt(2022, 3, 2, 10)
# No max_levels: every level on both sides of the book is returned.
prl = otp.ObSnapshot(db='CME', tick_type='PRL_FULL')
otp.run(prl, symbols=r'NQ\H22', start=snapshot_time, end=snapshot_time)
| | Time | PRICE | UPDATE_TIME | SIZE | LEVEL | BUY_SELL_FLAG |
---|---|---|---|---|---|---|
0 | 2022-03-02 10:00:00 | 14092.00 | 2022-03-02 09:59:59.928574271 | 1 | 1 | 1 |
1 | 2022-03-02 10:00:00 | 14092.25 | 2022-03-02 09:59:59.901649655 | 3 | 2 | 1 |
2 | 2022-03-02 10:00:00 | 14092.50 | 2022-03-02 09:59:59.873574571 | 2 | 3 | 1 |
3 | 2022-03-02 10:00:00 | 14092.75 | 2022-03-02 09:59:59.699031867 | 2 | 4 | 1 |
4 | 2022-03-02 10:00:00 | 14093.00 | 2022-03-02 09:59:59.901890847 | 4 | 5 | 1 |
... | ... | ... | ... | ... | ... | ... |
1567 | 2022-03-02 10:00:00 | 6490.00 | 2022-03-01 22:59:59.999000000 | 3 | 831 | 0 |
1568 | 2022-03-02 10:00:00 | 1586.00 | 2022-03-01 22:59:59.999000000 | 1 | 832 | 0 |
1569 | 2022-03-02 10:00:00 | 786.50 | 2022-03-01 22:59:59.999000000 | 1 | 833 | 0 |
1570 | 2022-03-02 10:00:00 | 200.00 | 2022-03-01 22:59:59.999000000 | 1 | 834 | 0 |
1571 | 2022-03-02 10:00:00 | 1.00 | 2022-03-01 22:59:59.999000000 | 1 | 835 | 0 |
1572 rows × 6 columns
Book Imbalance#
Let’s compute the time-weighted book imbalance. The imbalance at a given time is the sum of the bid sizes at the top x levels minus the sum of the ask sizes at the top x levels, divided by the sum of the two: (bid_vol - ask_vol) / (bid_vol + ask_vol). Values close to 1 mean the book is much heavier on the bid side, values close to -1 mean it is much heavier on the ask side, and zero means both sides hold the same size.
x = 3
prl = otp.ObSnapshotWide(db='CME', tick_type='PRL_FULL', max_levels=x, running=True)
prls_df = otp.run(prl, symbols=r'NQ\H22', start=otp.dt(2022, 3, 2, 10), end=otp.dt(2022, 3, 2, 10) + otp.Milli(100))
print(prls_df.head(7))
# Each book update produces x ticks (one per level), so buckets of x ticks give per-snapshot totals.
prl = prl.agg({'ask_vol': otp.agg.sum('ASK_SIZE'), 'bid_vol': otp.agg.sum('BID_SIZE')}, bucket_units='ticks', bucket_interval=x)
prl['imb'] = (prl['bid_vol'] - prl['ask_vol']) / (prl['bid_vol'] + prl['ask_vol'])
prls_df = otp.run(prl, symbols=r'NQ\H22', start=otp.dt(2022, 3, 2, 10), end=otp.dt(2022, 3, 2, 10) + otp.Milli(100))
print(prls_df.head())
# Summary statistics of the imbalance over the whole query interval.
imb_stats = prl.agg({
    'tw_imb': otp.agg.tw_average('imb'),
    'mean': otp.agg.average('imb'),
    'stdev': otp.agg.stddev('imb'),
})
print(otp.run(imb_stats, symbols=r'NQ\H22', start=otp.dt(2022, 3, 2, 10), end=otp.dt(2022, 3, 2, 10) + otp.Milli(100)))
Time BID_PRICE BID_UPDATE_TIME BID_SIZE ASK_PRICE ASK_UPDATE_TIME ASK_SIZE LEVEL
0 2022-03-02 10:00:00.000000000 14090.75 2022-03-02 09:59:59.923928381 3 14092.00 2022-03-02 09:59:59.928574271 1 1
1 2022-03-02 10:00:00.000000000 14090.50 2022-03-02 09:59:59.923521093 3 14092.25 2022-03-02 09:59:59.901649655 3 2
2 2022-03-02 10:00:00.000000000 14090.25 2022-03-02 09:59:59.924031067 4 14092.50 2022-03-02 09:59:59.873574571 2 3
3 2022-03-02 10:00:00.005759409 14091.00 2022-03-02 10:00:00.005759409 1 14092.00 2022-03-02 09:59:59.928574271 1 1
4 2022-03-02 10:00:00.005759409 14090.75 2022-03-02 09:59:59.923928381 3 14092.25 2022-03-02 09:59:59.901649655 3 2
5 2022-03-02 10:00:00.005759409 14090.50 2022-03-02 09:59:59.923521093 3 14092.50 2022-03-02 09:59:59.873574571 2 3
6 2022-03-02 10:00:00.005759583 14091.00 2022-03-02 10:00:00.005759409 1 14092.25 2022-03-02 09:59:59.901649655 3 1
Time ask_vol bid_vol imb
0 2022-03-02 10:00:00.000000000 6 10 0.250000
1 2022-03-02 10:00:00.005759409 6 7 0.076923
2 2022-03-02 10:00:00.005759583 7 7 0.000000
3 2022-03-02 10:00:00.005837891 6 7 0.076923
4 2022-03-02 10:00:00.006117137 6 5 -0.090909
Time tw_imb mean stdev
0 2022-03-02 10:00:00.100 0.079814 0.063728 0.12232
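As a cross-check of the definition, the sketch below recomputes the imbalance in plain Python from the top-3 sizes in the running flat snapshot output shown earlier on this page. It does not call OneTick. The time weighting is an assumption on our part: each value is weighted by how long it stays in effect, until the next tick or the end of the 100 ms window.

```python
# (nanoseconds after 10:00:00, top-3 bid sizes, top-3 ask sizes), copied from
# the running ObSnapshotFlat output earlier on this page.
ticks = [
    (0,        (3, 3, 4), (1, 3, 2)),
    (5759409,  (1, 3, 3), (1, 3, 2)),
    (5759583,  (1, 3, 3), (3, 2, 2)),
    (5837891,  (1, 3, 3), (2, 2, 2)),
    (6117137,  (1, 1, 3), (2, 2, 2)),
    (6155241,  (1, 1, 3), (2, 3, 2)),
    (6287769,  (1, 1, 3), (1, 3, 2)),
    (6485627,  (1, 3, 3), (1, 3, 2)),
    (6542011,  (1, 3, 3), (1, 2, 2)),
    (6551707,  (1, 3, 3), (1, 1, 2)),
    (6736777,  (1, 3, 3), (2, 1, 2)),
    (6802977,  (1, 3, 3), (3, 1, 2)),
    (6955737,  (1, 3, 3), (3, 2, 2)),
    (15703265, (1, 3, 3), (2, 2, 2)),
]
END_NS = 100_000_000  # the query window is 100 ms long


def imbalance(bid_sizes, ask_sizes):
    """(bid_vol - ask_vol) / (bid_vol + ask_vol) over the supplied levels."""
    bid_vol, ask_vol = sum(bid_sizes), sum(ask_sizes)
    return (bid_vol - ask_vol) / (bid_vol + ask_vol)


imbs = [imbalance(bids, asks) for _, bids, asks in ticks]
mean_imb = sum(imbs) / len(imbs)

# Assumed weighting: time until the next tick, or until the end of the window.
times = [t for t, _, _ in ticks] + [END_NS]
durations = [times[i + 1] - times[i] for i in range(len(ticks))]
tw_imb = sum(v * d for v, d in zip(imbs, durations)) / sum(durations)

print([round(v, 6) for v in imbs[:5]])
print(round(mean_imb, 6), round(tw_imb, 6))
```

Under that assumption the first five values match the imb column above, and the simple and time-weighted averages come out at roughly 0.0637 and 0.0798, in line with the mean and tw_imb figures. The time-weighted value is dominated by the last tick, which stays in effect for about 84 of the 100 ms.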
Book sweep#
There are two versions of book sweep: by price and by quantity. Book sweep by price takes a price as input and returns the total quantity available at that price or better. Book sweep by quantity takes a quantity as input and returns the VWAP that would result if that quantity were executed immediately.
def side_to_direction(side):
    return 1 if side == 'ASK' else -1
def sweep_by_price(side, price):
    prl = otp.ObSnapshot(db='CME', tick_type='PRL_FULL', side=side)
    direction = side_to_direction(side)
    # Keep levels priced at `price` or better: asks at or below it, bids at or above it.
    prl, _ = prl[direction * prl['PRICE'] <= direction * price]
    prl = prl.agg({'total_qty': otp.agg.sum('SIZE')})
    return otp.run(prl, symbols=r'NQ\H22', start=otp.dt(2022, 3, 2, 10), end=otp.dt(2022, 3, 2, 10))
print(sweep_by_price('BID', 14075))
print(sweep_by_price('ASK', 14077))
Time total_qty
0 2022-03-02 10:00:00 169
Time total_qty
0 2022-03-02 10:00:00 0
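The direction trick in the price filter can be seen on a small in-memory book. The following is a sketch in plain Python, not OneTick code: `sweep_by_price_local` is a hypothetical helper, and the book holds only the top few levels from the snapshots above, so its totals cover that truncated book rather than the full one.

```python
# (price, size) pairs taken from the snapshot outputs above.
asks = [(14092.00, 1), (14092.25, 3), (14092.50, 2), (14092.75, 2), (14093.00, 4)]
bids = [(14090.75, 3), (14090.50, 3), (14090.25, 4)]


def sweep_by_price_local(levels, side, price):
    """Total size at `price` or better on one side of the book."""
    # Multiplying by -1 for bids turns "at or above" into the same <= test
    # that is used for asks ("at or below").
    direction = 1 if side == 'ASK' else -1
    return sum(size for level_price, size in levels
               if direction * level_price <= direction * price)


print(sweep_by_price_local(asks, 'ASK', 14092.50))  # asks at 14092.50 or lower
print(sweep_by_price_local(bids, 'BID', 14090.50))  # bids at 14090.50 or higher
print(sweep_by_price_local(asks, 'ASK', 14077))     # below the best ask
```

The last call returns 0 for the same reason the OneTick query does: no ask is priced at or below 14077.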
def sweep_by_qty(side, qty):
    prl = otp.ObSnapshot(db='CME', tick_type='PRL_FULL', side=side)
    # Running sum of SIZE across levels; all_fields=True keeps PRICE and SIZE on each tick.
    prl = prl.agg({'total_qty': otp.agg.sum('SIZE')}, running=True, all_fields=True)
    direction = side_to_direction(side)
    # Keep only the levels needed to fill qty: those where the size accumulated before the level is below qty.
    prl, _ = prl[prl['total_qty'] - prl['SIZE'] < qty]
    # update the SIZE in the last tick only so that total_qty is exactly qty
    prl['SIZE'] = prl.apply(lambda tick: prl['SIZE'] - (prl['total_qty'] - qty) if prl['total_qty'] > qty else prl['SIZE'])
    prl = prl.agg({'VWAP': otp.agg.vwap('PRICE', 'SIZE')})
    return otp.run(prl, symbols=r'NQ\H22', start=otp.dt(2022, 3, 2, 10), end=otp.dt(2022, 3, 2, 10))
print(sweep_by_qty('BID', 10))
print(sweep_by_qty('ASK', 10))
Time VWAP
0 2022-03-02 10:00:00 14090.475
Time VWAP
0 2022-03-02 10:00:00 14092.525
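The same calculation can be worked through by hand. Below is a minimal sketch in plain Python (`sweep_vwap` is a hypothetical helper, not part of onetick-py) that walks the levels from best to worst, takes only part of the last level when needed, and divides the accumulated notional by the quantity. The levels are copied from the snapshot outputs above.

```python
# (price, size) pairs ordered from best to worst, from the snapshots above.
asks = [(14092.00, 1), (14092.25, 3), (14092.50, 2), (14092.75, 2), (14093.00, 4)]
bids = [(14090.75, 3), (14090.50, 3), (14090.25, 4)]


def sweep_vwap(levels, qty):
    """VWAP of filling `qty` immediately against `levels` (best level first)."""
    remaining = qty
    notional = 0.0
    for price, size in levels:
        take = min(size, remaining)  # the last level may be only partly used
        notional += price * take
        remaining -= take
        if remaining == 0:
            break
    if remaining > 0:
        raise ValueError('not enough size in the supplied levels')
    return notional / qty


print(sweep_vwap(bids, 10))
print(sweep_vwap(asks, 10))
```

On the ask side the sweep takes 1 + 3 + 2 + 2 lots from the first four levels and only 2 of the 4 lots at 14093.00, which is the adjustment the SIZE update in the OneTick version performs on the last tick.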
Market By Order#
Order book data may be annotated with ‘key’ fields, which let us break down the book by each value of those fields. For example, a book could be keyed by market participant ID, allowing us to see the book with the orders of a given market participant only. Some exchanges provide ‘market-by-order’ data where the book is keyed by order ID. Set show_full_detail to True to see the book broken down to the most granular level. The example below is a market-by-order book.
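# show_full_detail=True breaks each price level down by the book's key fields (here, individual orders).
prl = otp.ObSnapshot('CME', tick_type='PRL_FULL', side='BID', show_full_detail=True)
prl = prl.first(5)
print(otp.run(prl, symbols=r'NQ\H22', start=otp.dt(2022, 3, 2, 10), end=otp.dt(2022, 3, 2, 10)))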
Time ORDER_ID BUY_SELL_FLAG ORDER_TYPE PRICE SIZE TIME_PRIORITY RECORD_TYPE DELETED_TIME TICK_STATUS OMDSEQ LEVEL UPDATE_TIME
0 2022-03-02 10:00:00 6830848559575 0 L 14090.75 1 43115683040 R 1970-01-01 0 12 1 2022-03-02 09:59:59.923928381
1 2022-03-02 10:00:00 6830848559573 0 L 14090.75 1 43115683038 R 1970-01-01 0 9 1 2022-03-02 09:59:59.923735565
2 2022-03-02 10:00:00 6830848559571 0 L 14090.75 1 43115683036 R 1970-01-01 0 6 1 2022-03-02 09:59:59.923710433
3 2022-03-02 10:00:00 6830848559570 0 L 14090.50 1 43115683035 R 1970-01-01 0 0 2 2022-03-02 09:59:59.923521093
4 2022-03-02 10:00:00 6830848548970 0 L 14090.50 1 43115668616 R 1970-01-01 0 0 2 2022-03-02 09:59:17.797610227
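To relate this view to the aggregated book, the order-level rows can be summed by price. The sketch below does this in plain Python on the five rows printed above; it is only an illustration, and because the query kept just the first five orders, the total for the second price level is incomplete.

```python
from collections import defaultdict

# (ORDER_ID, PRICE, SIZE) copied from the market-by-order output above.
orders = [
    (6830848559575, 14090.75, 1),
    (6830848559573, 14090.75, 1),
    (6830848559571, 14090.75, 1),
    (6830848559570, 14090.50, 1),
    (6830848548970, 14090.50, 1),
]

# Sum order sizes per price to rebuild price-level totals.
size_by_price = defaultdict(int)
for _order_id, price, size in orders:
    size_by_price[price] += size

for price in sorted(size_by_price, reverse=True):  # best bid first
    print(price, size_by_price[price])
```

The three single-lot orders at 14090.75 add up to the bid size of 3 shown at level 1 in the earlier snapshots.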
Market-by-order data can be used to analyze/validate the priority mechanism used by the exchange.
prl = otp.ObSnapshot('CME', tick_type='PRL_FULL', side='BID', show_full_detail=True)
"""
ORDER_TYPE:
L = Limit order
I = Implied order
Implied liquidity doesn’t have priority as it's always last to execute at any price level.
It also doesn’t have an order ID, so the IDs that we see in the db are synthetic
(consisting of 1 or 2 for the 1st/2nd implied level, and E/F for the buy/sell side respectively).
In order to rank the orders within a given price point by priority, we need to sort first by ORDER_TYPE (“L” comes before “I”),
then by TIME_PRIORITY (lowest value comes first).
"""
# ORDER_TYPE is sorted in descending order so that 'L' comes before 'I'.
prl = prl.sort(['LEVEL', 'ORDER_TYPE', 'TIME_PRIORITY'], ascending=[True, False, True])
orders = otp.run(prl, symbols=r'NQ\H22', start=otp.dt(2022, 3, 2, 10), end=otp.dt(2022, 3, 2, 10))
orders = orders[['ORDER_ID', 'PRICE', 'LEVEL', 'TIME_PRIORITY','SIZE', 'BUY_SELL_FLAG', 'ORDER_TYPE']]
orders.head()
| | ORDER_ID | PRICE | LEVEL | TIME_PRIORITY | SIZE | BUY_SELL_FLAG | ORDER_TYPE |
---|---|---|---|---|---|---|---|
0 | 6830848559571 | 14090.75 | 1 | 43115683036 | 1 | 0 | L |
1 | 6830848559573 | 14090.75 | 1 | 43115683038 | 1 | 0 | L |
2 | 6830848559575 | 14090.75 | 1 | 43115683040 | 1 | 0 | L |
3 | 6830848545469 | 14090.50 | 2 | 43115663812 | 1 | 0 | L |
4 | 6830848548970 | 14090.50 | 2 | 43115668616 | 1 | 0 | L |
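The ranking rule described in the code comment above can also be written as a sort key. The sketch below is plain Python on a handful of rows: five are copied from the first market-by-order output on this page, and the last one is a made-up implied order (hypothetical ID and values) added only to show that ORDER_TYPE takes precedence over TIME_PRIORITY.

```python
orders = [
    {'ORDER_ID': 6830848559575, 'LEVEL': 1, 'ORDER_TYPE': 'L', 'TIME_PRIORITY': 43115683040},
    {'ORDER_ID': 6830848559573, 'LEVEL': 1, 'ORDER_TYPE': 'L', 'TIME_PRIORITY': 43115683038},
    {'ORDER_ID': 6830848559571, 'LEVEL': 1, 'ORDER_TYPE': 'L', 'TIME_PRIORITY': 43115683036},
    {'ORDER_ID': 6830848559570, 'LEVEL': 2, 'ORDER_TYPE': 'L', 'TIME_PRIORITY': 43115683035},
    {'ORDER_ID': 6830848548970, 'LEVEL': 2, 'ORDER_TYPE': 'L', 'TIME_PRIORITY': 43115668616},
    # Hypothetical implied order, not taken from the data.
    {'ORDER_ID': 'IMPLIED-EXAMPLE', 'LEVEL': 1, 'ORDER_TYPE': 'I', 'TIME_PRIORITY': 0},
]


def priority_key(order):
    """Sort by level, then limit orders before implied ones, then oldest first."""
    type_rank = 0 if order['ORDER_TYPE'] == 'L' else 1
    return (order['LEVEL'], type_rank, order['TIME_PRIORITY'])


ranked = sorted(orders, key=priority_key)
for order in ranked:
    print(order['LEVEL'], order['ORDER_TYPE'], order['TIME_PRIORITY'], order['ORDER_ID'])
```

Within level 1 the limit orders come out in increasing TIME_PRIORITY, and the implied order is placed after them even though its TIME_PRIORITY value is the lowest.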