Order Book Analytics#

import onetick.py as otp

onetick-py offers functions for analyzing tick-by-tick order book data. An order book can be represented in three ways. For ease of exposition, the examples below show only the top 3 levels.

A book can be displayed with a tick per level per side. We refer to a level in the book as a ‘price level’ or ‘prl’.

snapshot_time = otp.dt(2022, 3, 2, 10)
prl = otp.ObSnapshot(db='CME', tick_type='PRL_FULL', max_levels=3)
# raw string keeps the backslash in the symbol name literal
otp.run(prl, symbols=r'NQ\H22', start=snapshot_time, end=snapshot_time)
                 Time     PRICE                   UPDATE_TIME  SIZE  LEVEL  BUY_SELL_FLAG
0 2022-03-02 10:00:00  14092.00 2022-03-02 09:59:59.928574271     1      1              1
1 2022-03-02 10:00:00  14092.25 2022-03-02 09:59:59.901649655     3      2              1
2 2022-03-02 10:00:00  14092.50 2022-03-02 09:59:59.873574571     2      3              1
3 2022-03-02 10:00:00  14090.75 2022-03-02 09:59:59.923928381     3      1              0
4 2022-03-02 10:00:00  14090.50 2022-03-02 09:59:59.923521093     3      2              0
5 2022-03-02 10:00:00  14090.25 2022-03-02 09:59:59.924031067     4      3              0

Alternatively, a book can show a tick per level with both ask and bid price/size info.

snapshot_time = otp.dt(2022, 3, 2, 10)
prl = otp.ObSnapshotWide(db='CME', tick_type='PRL_FULL', max_levels=3)
otp.run(prl, symbols=r'NQ\H22', start=snapshot_time, end=snapshot_time)
                 Time  BID_PRICE               BID_UPDATE_TIME  BID_SIZE  ASK_PRICE               ASK_UPDATE_TIME  ASK_SIZE  LEVEL
0 2022-03-02 10:00:00   14090.75 2022-03-02 09:59:59.923928381         3   14092.00 2022-03-02 09:59:59.928574271         1      1
1 2022-03-02 10:00:00   14090.50 2022-03-02 09:59:59.923521093         3   14092.25 2022-03-02 09:59:59.901649655         3      2
2 2022-03-02 10:00:00   14090.25 2022-03-02 09:59:59.924031067         4   14092.50 2022-03-02 09:59:59.873574571         2      3

Finally, all levels can be displayed in one tick.

snapshot_time = otp.dt(2022, 3, 2, 10)
prl = otp.ObSnapshotFlat(db='CME', tick_type='PRL_FULL', max_levels=3)
print(otp.run(prl, symbols=r'NQ\H22', start=snapshot_time, end=snapshot_time))
                 Time  BID_PRICE1              BID_UPDATE_TIME1  BID_SIZE1  ASK_PRICE1              ASK_UPDATE_TIME1  ASK_SIZE1  BID_PRICE2              BID_UPDATE_TIME2  BID_SIZE2  ASK_PRICE2              ASK_UPDATE_TIME2  ASK_SIZE2  BID_PRICE3              BID_UPDATE_TIME3  BID_SIZE3  ASK_PRICE3              ASK_UPDATE_TIME3  ASK_SIZE3
0 2022-03-02 10:00:00    14090.75 2022-03-02 09:59:59.923928381          3     14092.0 2022-03-02 09:59:59.928574271          1     14090.5 2022-03-02 09:59:59.923521093          3    14092.25 2022-03-02 09:59:59.901649655          3    14090.25 2022-03-02 09:59:59.924031067          4     14092.5 2022-03-02 09:59:59.873574571          2
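
The three layouts carry the same information. As a rough illustration in plain Python (no OneTick connection needed), the sketch below reshapes the per-side ticks shown above into the wide and flat layouts. The helpers `to_wide` and `to_flat` are hypothetical names introduced here and are not part of onetick-py; the UPDATE_TIME columns are left out for brevity.

```python
# Per-side ticks copied from the ObSnapshot output above:
# (PRICE, SIZE, LEVEL, BUY_SELL_FLAG), where flag 1 is the ask side and 0 the bid side.
ticks = [
    (14092.00, 1, 1, 1),
    (14092.25, 3, 2, 1),
    (14092.50, 2, 3, 1),
    (14090.75, 3, 1, 0),
    (14090.50, 3, 2, 0),
    (14090.25, 4, 3, 0),
]


def to_wide(ticks):
    """One row per level holding both sides (hypothetical helper)."""
    rows = {}
    for price, size, level, flag in ticks:
        side = 'ASK' if flag == 1 else 'BID'
        row = rows.setdefault(level, {'LEVEL': level})
        row[f'{side}_PRICE'] = price
        row[f'{side}_SIZE'] = size
    return [rows[level] for level in sorted(rows)]


def to_flat(wide_rows):
    """A single row holding every level (hypothetical helper)."""
    flat = {}
    for row in wide_rows:
        for name, value in row.items():
            if name != 'LEVEL':
                # Suffix each field with its level, e.g. BID_PRICE1.
                flat[f"{name}{row['LEVEL']}"] = value
    return flat


wide = to_wide(ticks)
flat = to_flat(wide)
print(wide[0])
print(flat['BID_PRICE1'], flat['ASK_PRICE3'])
```

The wide rows correspond to the ObSnapshotWide output and the flat dictionary to the single ObSnapshotFlat tick.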

We can output the book (in any of the three representations) on every change to the price or size at any of the levels.

prl = otp.ObSnapshotFlat(db='CME', tick_type='PRL_FULL', max_levels=3, running=True)
# drop the *_UPDATE_TIME<level> columns to keep the output narrow
prl = prl.drop(r".+TIME\d")
print(otp.run(prl, symbols=r'NQ\H22', start=otp.dt(2022, 3, 2, 10), end=otp.dt(2022, 3, 2, 10) + otp.Milli(100)))
                            Time  BID_PRICE1  BID_SIZE1  ASK_PRICE1  ASK_SIZE1  BID_PRICE2  BID_SIZE2  ASK_PRICE2  ASK_SIZE2  BID_PRICE3  BID_SIZE3  ASK_PRICE3  ASK_SIZE3
0  2022-03-02 10:00:00.000000000    14090.75          3    14092.00          1    14090.50          3    14092.25          3    14090.25          4    14092.50          2
1  2022-03-02 10:00:00.005759409    14091.00          1    14092.00          1    14090.75          3    14092.25          3    14090.50          3    14092.50          2
2  2022-03-02 10:00:00.005759583    14091.00          1    14092.25          3    14090.75          3    14092.50          2    14090.50          3    14092.75          2
3  2022-03-02 10:00:00.005837891    14091.00          1    14092.25          2    14090.75          3    14092.50          2    14090.50          3    14092.75          2
4  2022-03-02 10:00:00.006117137    14091.25          1    14092.25          2    14091.00          1    14092.50          2    14090.75          3    14092.75          2
5  2022-03-02 10:00:00.006155241    14091.25          1    14092.25          2    14091.00          1    14092.50          3    14090.75          3    14092.75          2
6  2022-03-02 10:00:00.006287769    14091.25          1    14092.25          1    14091.00          1    14092.50          3    14090.75          3    14092.75          2
7  2022-03-02 10:00:00.006485627    14091.00          1    14092.25          1    14090.75          3    14092.50          3    14090.50          3    14092.75          2
8  2022-03-02 10:00:00.006542011    14091.00          1    14092.25          1    14090.75          3    14092.50          2    14090.50          3    14092.75          2
9  2022-03-02 10:00:00.006551707    14091.00          1    14092.25          1    14090.75          3    14092.50          1    14090.50          3    14092.75          2
10 2022-03-02 10:00:00.006736777    14091.00          1    14092.25          2    14090.75          3    14092.50          1    14090.50          3    14092.75          2
11 2022-03-02 10:00:00.006802977    14091.00          1    14092.25          3    14090.75          3    14092.50          1    14090.50          3    14092.75          2
12 2022-03-02 10:00:00.006955737    14091.00          1    14092.25          3    14090.75          3    14092.50          2    14090.50          3    14092.75          2
13 2022-03-02 10:00:00.015703265    14091.00          1    14092.25          2    14090.75          3    14092.50          2    14090.50          3    14092.75          2

ObSnapshot doesn’t require max_levels to be specified. When the parameter is omitted, the entire book is returned.

snapshot_time = otp.dt(2022, 3, 2, 10)
prl = otp.ObSnapshot(db='CME', tick_type='PRL_FULL')
otp.run(prl, symbols=r'NQ\H22', start=snapshot_time, end=snapshot_time)
                    Time     PRICE                   UPDATE_TIME  SIZE  LEVEL  BUY_SELL_FLAG
0    2022-03-02 10:00:00  14092.00 2022-03-02 09:59:59.928574271     1      1              1
1    2022-03-02 10:00:00  14092.25 2022-03-02 09:59:59.901649655     3      2              1
2    2022-03-02 10:00:00  14092.50 2022-03-02 09:59:59.873574571     2      3              1
3    2022-03-02 10:00:00  14092.75 2022-03-02 09:59:59.699031867     2      4              1
4    2022-03-02 10:00:00  14093.00 2022-03-02 09:59:59.901890847     4      5              1
...                  ...       ...                           ...   ...    ...            ...
1567 2022-03-02 10:00:00   6490.00 2022-03-01 22:59:59.999000000     3    831              0
1568 2022-03-02 10:00:00   1586.00 2022-03-01 22:59:59.999000000     1    832              0
1569 2022-03-02 10:00:00    786.50 2022-03-01 22:59:59.999000000     1    833              0
1570 2022-03-02 10:00:00    200.00 2022-03-01 22:59:59.999000000     1    834              0
1571 2022-03-02 10:00:00      1.00 2022-03-01 22:59:59.999000000     1    835              0

1572 rows × 6 columns

Book Imbalance#

Let’s compute the time-weighted book imbalance. The imbalance at a given time is the sum of the bid sizes at the top x levels minus the sum of the ask sizes at the top x levels, divided by the sum of these two terms. Values close to 1 mean the book is much heavier on the bid side, values close to -1 mean it is much heavier on the ask side, and 0 means both sides have the same total size.
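
To make the definition concrete, here is a minimal plain-Python sketch that applies it to the top 3 levels of the 10:00:00 snapshot shown earlier. The function `book_imbalance` is a hypothetical helper for illustration only, and returning 0.0 for an empty book is an assumption of this sketch.

```python
def book_imbalance(bid_sizes, ask_sizes):
    """(sum of bid sizes - sum of ask sizes) / (sum of bid sizes + sum of ask sizes)."""
    bid_vol = sum(bid_sizes)
    ask_vol = sum(ask_sizes)
    total = bid_vol + ask_vol
    if total == 0:
        # Both sides empty: the ratio is undefined, this sketch returns 0.0.
        return 0.0
    return (bid_vol - ask_vol) / total


# Bid sizes 3, 3, 4 and ask sizes 1, 3, 2 at the top 3 levels.
imb = book_imbalance([3, 3, 4], [1, 3, 2])
print(imb)  # 0.25, i.e. (10 - 6) / (10 + 6)
```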

x = 3

# running wide book: every book update emits x ticks, one per level
prl = otp.ObSnapshotWide(db='CME', tick_type='PRL_FULL', max_levels=x, running=True)
prls_df = otp.run(prl, symbols=r'NQ\H22', start=otp.dt(2022, 3, 2, 10), end=otp.dt(2022, 3, 2, 10) + otp.Milli(100))
print(prls_df.head(7))

# sum the sizes over each group of x ticks, i.e. over the levels of one book update
prl = prl.agg({'ask_vol': otp.agg.sum('ASK_SIZE'), 'bid_vol': otp.agg.sum('BID_SIZE')}, bucket_units='ticks', bucket_interval=x)
# imbalance of each book update, between -1 and 1
prl['imb'] = (prl['bid_vol'] - prl['ask_vol']) / (prl['bid_vol'] + prl['ask_vol'])
prls_df = otp.run(prl, symbols=r'NQ\H22', start=otp.dt(2022, 3, 2, 10), end=otp.dt(2022, 3, 2, 10) + otp.Milli(100))
print(prls_df.head())

# time-weighted average, simple mean and standard deviation of the imbalance
imb_stats = prl.agg({
    'tw_imb': otp.agg.tw_average('imb'),
    'mean':   otp.agg.average('imb'),
    'stdev':  otp.agg.stddev('imb'),
})
print(otp.run(imb_stats, symbols=r'NQ\H22', start=otp.dt(2022, 3, 2, 10), end=otp.dt(2022, 3, 2, 10) + otp.Milli(100)))
                           Time  BID_PRICE               BID_UPDATE_TIME  BID_SIZE  ASK_PRICE               ASK_UPDATE_TIME  ASK_SIZE  LEVEL
0 2022-03-02 10:00:00.000000000   14090.75 2022-03-02 09:59:59.923928381         3   14092.00 2022-03-02 09:59:59.928574271         1      1
1 2022-03-02 10:00:00.000000000   14090.50 2022-03-02 09:59:59.923521093         3   14092.25 2022-03-02 09:59:59.901649655         3      2
2 2022-03-02 10:00:00.000000000   14090.25 2022-03-02 09:59:59.924031067         4   14092.50 2022-03-02 09:59:59.873574571         2      3
3 2022-03-02 10:00:00.005759409   14091.00 2022-03-02 10:00:00.005759409         1   14092.00 2022-03-02 09:59:59.928574271         1      1
4 2022-03-02 10:00:00.005759409   14090.75 2022-03-02 09:59:59.923928381         3   14092.25 2022-03-02 09:59:59.901649655         3      2
5 2022-03-02 10:00:00.005759409   14090.50 2022-03-02 09:59:59.923521093         3   14092.50 2022-03-02 09:59:59.873574571         2      3
6 2022-03-02 10:00:00.005759583   14091.00 2022-03-02 10:00:00.005759409         1   14092.25 2022-03-02 09:59:59.901649655         3      1
                           Time  ask_vol  bid_vol       imb
0 2022-03-02 10:00:00.000000000        6       10  0.250000
1 2022-03-02 10:00:00.005759409        6        7  0.076923
2 2022-03-02 10:00:00.005759583        7        7  0.000000
3 2022-03-02 10:00:00.005837891        6        7  0.076923
4 2022-03-02 10:00:00.006117137        6        5 -0.090909
                     Time    tw_imb      mean    stdev
0 2022-03-02 10:00:00.100  0.079814  0.063728  0.12232
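
The time-weighted average differs from the simple mean because each imbalance value is weighted by how long it stayed in effect. The sketch below illustrates the idea in plain Python, assuming each value holds until the next tick and the last value holds until the end of the interval. `time_weighted_average` is a hypothetical helper, not the onetick-py implementation, and the interval is cut at the fifth tick purely for illustration, so the result is not expected to match the output above.

```python
def time_weighted_average(ticks, end):
    """Average of piecewise-constant values weighted by holding time.

    ticks: list of (time, value) pairs sorted by time.
    end:   end of the interval; the last value is held until then.
    """
    if not ticks:
        raise ValueError('need at least one tick')
    start = ticks[0][0]
    if end <= start:
        raise ValueError('end must be after the first tick')
    # Pair every tick with the time of the next one (or the interval end).
    next_times = [t for t, _ in ticks[1:]] + [end]
    weighted_sum = 0.0
    for (t, value), t_next in zip(ticks, next_times):
        weighted_sum += value * (t_next - t)
    return weighted_sum / (end - start)


# Nanoseconds after 10:00:00 and the first four imbalance values printed above.
imb_ticks = [
    (0, 0.25),
    (5_759_409, 1 / 13),
    (5_759_583, 0.0),
    (5_837_891, 1 / 13),
]
tw = time_weighted_average(imb_ticks, end=6_117_137)
mean = sum(value for _, value in imb_ticks) / len(imb_ticks)
print(tw, mean)
```

The first value (0.25) was in effect for almost 5.8 ms out of roughly 6.1 ms, so it dominates the time-weighted figure while counting as only one of four observations in the simple mean.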

Book sweep#

There are two versions of book sweep: by price and by quantity. Book sweep by price takes a price as input and returns the total quantity available at that price or better. Book sweep by quantity takes a quantity as input and returns the VWAP that would result if that quantity were executed immediately against the book.

def side_to_direction(side):
    # asks get better as the price falls, bids as it rises
    return 1 if side == 'ASK' else -1

def sweep_by_price(side, price):
    prl = otp.ObSnapshot(db='CME', tick_type='PRL_FULL', side=side)
    direction = side_to_direction(side)
    # keep only the levels priced at `price` or better for the given side
    prl, _ = prl[direction * prl['PRICE'] <= direction * price]
    prl = prl.agg({'total_qty': otp.agg.sum('SIZE')})
    return otp.run(prl, symbols=r'NQ\H22', start=otp.dt(2022, 3, 2, 10), end=otp.dt(2022, 3, 2, 10))

print(sweep_by_price('BID', 14075))
print(sweep_by_price('ASK', 14077))
                 Time  total_qty
0 2022-03-02 10:00:00        169
                 Time  total_qty
0 2022-03-02 10:00:00          0
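
The price filter relies on a sign trick: multiplying both sides of the comparison by -1 for bids turns "price at or below the limit" into "price at or above the limit". The following plain-Python sketch shows the same logic on the levels printed in the snapshots above. `sweep_by_price_local` is a hypothetical in-memory helper, and only the displayed levels are used, so the bid total for 14075 from the full book cannot be reproduced here.

```python
def side_to_direction(side):
    # Asks get better as the price falls, bids as it rises.
    return 1 if side == 'ASK' else -1


def sweep_by_price_local(levels, side, price):
    """Total size available at `price` or better (hypothetical helper)."""
    direction = side_to_direction(side)
    return sum(size for level_price, size in levels
               if direction * level_price <= direction * price)


# (PRICE, SIZE) pairs taken from the snapshots above.
asks = [(14092.00, 1), (14092.25, 3), (14092.50, 2), (14092.75, 2), (14093.00, 4)]
bids = [(14090.75, 3), (14090.50, 3), (14090.25, 4)]

print(sweep_by_price_local(asks, 'ASK', 14077))     # 0: no ask at or below 14077
print(sweep_by_price_local(asks, 'ASK', 14092.50))  # 6 = 1 + 3 + 2
print(sweep_by_price_local(bids, 'BID', 14090.50))  # 6 = 3 + 3
```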

def sweep_by_qty(side, qty):
    prl = otp.ObSnapshot(db='CME', tick_type='PRL_FULL', side=side)
    # running sum of SIZE from the best level downwards
    prl = prl.agg({'total_qty': otp.agg.sum('SIZE')}, running=True, all_fields=True)
    # keep the levels needed to fill qty: the quantity before each kept level is below qty
    prl, _ = prl[prl['total_qty'] - prl['SIZE'] < qty]
    # update the SIZE in the last tick only so that total_qty is exactly qty
    prl['SIZE'] = prl.apply(lambda tick: tick['SIZE'] - (tick['total_qty'] - qty) if tick['total_qty'] > qty else tick['SIZE'])
    prl = prl.agg({'VWAP': otp.agg.vwap('PRICE', 'SIZE')})
    return otp.run(prl, symbols=r'NQ\H22', start=otp.dt(2022, 3, 2, 10), end=otp.dt(2022, 3, 2, 10))

print(sweep_by_qty('BID', 10))
print(sweep_by_qty('ASK', 10))
                 Time       VWAP
0 2022-03-02 10:00:00  14090.475
                 Time       VWAP
0 2022-03-02 10:00:00  14092.525
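
The two VWAP values can be checked by hand against the levels shown in the earlier snapshots. The plain-Python sketch below walks the book from the best level, takes a partial fill at the last level, and divides the notional by the quantity. `sweep_by_qty_local` is a hypothetical in-memory helper, not part of onetick-py.

```python
def sweep_by_qty_local(levels, qty):
    """VWAP of filling `qty` against `levels` (hypothetical helper).

    levels: (PRICE, SIZE) pairs ordered from the best price to the worst.
    """
    if qty <= 0:
        raise ValueError('qty must be positive')
    remaining = qty
    notional = 0.0
    for price, size in levels:
        fill = min(size, remaining)  # partial fill at the last level touched
        notional += price * fill
        remaining -= fill
        if remaining == 0:
            break
    if remaining > 0:
        raise ValueError('not enough liquidity in the supplied levels')
    return notional / qty


# (PRICE, SIZE) pairs taken from the snapshots above, best level first.
bids = [(14090.75, 3), (14090.50, 3), (14090.25, 4)]
asks = [(14092.00, 1), (14092.25, 3), (14092.50, 2), (14092.75, 2), (14093.00, 4)]

bid_vwap = sweep_by_qty_local(bids, 10)
ask_vwap = sweep_by_qty_local(asks, 10)
print(bid_vwap, ask_vwap)  # 14090.475 14092.525
```

On the ask side the first four levels supply 8 contracts, so only 2 of the 4 contracts at 14093.00 are used, which is the same adjustment the last-tick SIZE update performs in the query above.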

Market By Order#

Order book data may be annotated with ‘key’ fields that let you break down the book by each value of those fields. For example, a book could be keyed by market participant ID, allowing you to see the book with the orders of a given market participant only. Some exchanges provide ‘market-by-order’ data, where the book is keyed by order ID. Set show_full_detail to True to see the book broken down to the most granular level. The example below is a market-by-order book.

prl = otp.ObSnapshot('CME', tick_type='PRL_FULL', side='BID', show_full_detail=True)
# show the first 5 orders only
prl = prl.first(5)
print(otp.run(prl, symbols=r'NQ\H22', start=otp.dt(2022, 3, 2, 10), end=otp.dt(2022, 3, 2, 10)))
                 Time       ORDER_ID  BUY_SELL_FLAG ORDER_TYPE     PRICE  SIZE  TIME_PRIORITY RECORD_TYPE DELETED_TIME  TICK_STATUS  OMDSEQ  LEVEL                   UPDATE_TIME
0 2022-03-02 10:00:00  6830848559575              0          L  14090.75     1    43115683040           R   1970-01-01            0      12      1 2022-03-02 09:59:59.923928381
1 2022-03-02 10:00:00  6830848559573              0          L  14090.75     1    43115683038           R   1970-01-01            0       9      1 2022-03-02 09:59:59.923735565
2 2022-03-02 10:00:00  6830848559571              0          L  14090.75     1    43115683036           R   1970-01-01            0       6      1 2022-03-02 09:59:59.923710433
3 2022-03-02 10:00:00  6830848559570              0          L  14090.50     1    43115683035           R   1970-01-01            0       0      2 2022-03-02 09:59:59.923521093
4 2022-03-02 10:00:00  6830848548970              0          L  14090.50     1    43115668616           R   1970-01-01            0       0      2 2022-03-02 09:59:17.797610227
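
Summing the order sizes at each price recovers the aggregated price levels seen earlier. The plain-Python sketch below does this for the bid orders listed in this section: five come from the output above, and the sixth (ORDER_ID 6830848545469) comes from the sorted listing further down. `aggregate_by_price` is a hypothetical helper used only for illustration.

```python
from collections import defaultdict

# (ORDER_ID, PRICE, SIZE) for the bid orders at the two best price levels.
orders = [
    (6830848559575, 14090.75, 1),
    (6830848559573, 14090.75, 1),
    (6830848559571, 14090.75, 1),
    (6830848559570, 14090.50, 1),
    (6830848548970, 14090.50, 1),
    (6830848545469, 14090.50, 1),
]


def aggregate_by_price(orders):
    """Collapse individual orders into (PRICE, total SIZE) levels."""
    sizes = defaultdict(int)
    for _order_id, price, size in orders:
        sizes[price] += size
    # Best bid first: the highest price is level 1.
    return sorted(sizes.items(), reverse=True)


levels = aggregate_by_price(orders)
print(levels)  # [(14090.75, 3), (14090.5, 3)]
```

Both totals agree with the BID_SIZE values at levels 1 and 2 of the 10:00:00 snapshot.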

Market-by-order data can be used to analyze or validate the priority mechanism used by the exchange.

prl = otp.ObSnapshot('CME', tick_type='PRL_FULL', side='BID', show_full_detail=True)

"""
ORDER_TYPE:
L = Limit order
I = Implied order

Implied liquidity doesn’t have priority as it's always last to execute at any price level. 
It also doesn’t have an order ID, so the IDs that you see in the db are synthetic 
(consisting of 1 or 2 for the 1st/2nd implied level, and E/F for the buy/sell side respectively).

In order to rank the orders within a given price point by priority, you need to sort first by ORDER_TYPE (“L” comes before “I”),
then by TIME_PRIORITY (lowest value comes first).
"""
# ORDER_TYPE is sorted in descending order so that 'L' comes before 'I'
prl = prl.sort(['LEVEL', 'ORDER_TYPE', 'TIME_PRIORITY'], ascending=[True, False, True])
orders = otp.run(prl, symbols=r'NQ\H22', start=otp.dt(2022, 3, 2, 10), end=otp.dt(2022, 3, 2, 10))
orders = orders[['ORDER_ID', 'PRICE', 'LEVEL', 'TIME_PRIORITY', 'SIZE', 'BUY_SELL_FLAG', 'ORDER_TYPE']]
orders.head()
        ORDER_ID     PRICE  LEVEL  TIME_PRIORITY  SIZE  BUY_SELL_FLAG  ORDER_TYPE
0  6830848559571  14090.75      1    43115683036     1              0           L
1  6830848559573  14090.75      1    43115683038     1              0           L
2  6830848559575  14090.75      1    43115683040     1              0           L
3  6830848545469  14090.50      2    43115663812     1              0           L
4  6830848548970  14090.50      2    43115668616     1              0           L
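
The ranking rule can also be expressed with an explicit sort key. The plain-Python sketch below ranks a few of the orders shown in this section by level, then order type, then TIME_PRIORITY. It uses an explicit rank table rather than a descending sort on the type letter. The implied order is hypothetical (its ID and TIME_PRIORITY are made up) and is included only to show that 'I' sorts after 'L' regardless of TIME_PRIORITY.

```python
# (ORDER_ID, LEVEL, ORDER_TYPE, TIME_PRIORITY)
orders = [
    ('6830848559575', 1, 'L', 43115683040),
    ('6830848559573', 1, 'L', 43115683038),
    ('6830848559571', 1, 'L', 43115683036),
    ('6830848559570', 2, 'L', 43115683035),
    ('6830848548970', 2, 'L', 43115668616),
    # Hypothetical implied order, not taken from the data above.
    ('implied-example', 1, 'I', 0),
]

# Limit orders execute before implied liquidity at the same price level.
TYPE_RANK = {'L': 0, 'I': 1}


def priority_key(order):
    _order_id, level, order_type, time_priority = order
    return (level, TYPE_RANK[order_type], time_priority)


ranked = sorted(orders, key=priority_key)
for order in ranked:
    print(order)
```

Within level 1 the three limit orders appear in increasing TIME_PRIORITY order, followed by the implied order even though its TIME_PRIORITY value is the lowest.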