{ "cells": [ { "cell_type": "markdown", "id": "4a33fb14", "metadata": {}, "source": [ "# Time-Based Joins" ] }, { "cell_type": "code", "execution_count": 1, "id": "f41be6b8-0bcd-4afa-8412-1f533f3c5196", "metadata": {}, "outputs": [], "source": [ "import onetick.py as otp" ] }, { "cell_type": "markdown", "id": "5eaac7ad-b507-48bf-aff2-7ff5d812e147", "metadata": {}, "source": [ "Time series analysis often involves looking up information from other time series for relevant time ranges. OneTick has built-in functions for this." ] }, { "cell_type": "markdown", "id": "6f00e839-f381-49b2-9249-c1f36b36c12c", "metadata": {}, "source": [ "# `join_by_time`: Enhancing a time series with information from other time series at the time of each tick" ] }, { "cell_type": "markdown", "id": "810b313d-8033-46ee-aac3-15c61faf6635", "metadata": {}, "source": [ "Below we'll enhance the trades with the prevailing quote (i.e., best bid and ask) at the time of each trade. We'll first look just at trades, then just at quotes, and finally we'll join the two." ] }, { "cell_type": "markdown", "id": "9bed10d4-dcd8-414d-b618-d600807f39af", "metadata": {}, "source": [ "Let's examine the trades first." ] }, { "cell_type": "code", "execution_count": 2, "id": "7872865c-768c-4a7d-b30e-6887d0e7b1cf", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
TimePRICESIZE
02023-03-29 10:00:00.005852928399.8500100
12023-03-29 10:00:00.005852928399.8500100
22023-03-29 10:00:00.005856000399.8500100
32023-03-29 10:00:00.006580736399.85001
42023-03-29 10:00:00.012014592399.85001
............
21362023-03-29 10:00:59.285156096399.8800100
21372023-03-29 10:00:59.687216896399.89001
21382023-03-29 10:00:59.911493888399.8899100
21392023-03-29 10:00:59.974551040399.8701100
21402023-03-29 10:00:59.998262528399.89001
\n", "

2141 rows × 3 columns

\n", "
" ], "text/plain": [ " Time PRICE SIZE\n", "0 2023-03-29 10:00:00.005852928 399.8500 100\n", "1 2023-03-29 10:00:00.005852928 399.8500 100\n", "2 2023-03-29 10:00:00.005856000 399.8500 100\n", "3 2023-03-29 10:00:00.006580736 399.8500 1\n", "4 2023-03-29 10:00:00.012014592 399.8500 1\n", "... ... ... ...\n", "2136 2023-03-29 10:00:59.285156096 399.8800 100\n", "2137 2023-03-29 10:00:59.687216896 399.8900 1\n", "2138 2023-03-29 10:00:59.911493888 399.8899 100\n", "2139 2023-03-29 10:00:59.974551040 399.8701 100\n", "2140 2023-03-29 10:00:59.998262528 399.8900 1\n", "\n", "[2141 rows x 3 columns]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "otp.config['tz'] = 'EST5EDT'\n", "trd = otp.DataSource('NYSE_TAQ', tick_type='TRD')\n", "trd = trd[['PRICE','SIZE']]\n", "otp.run(trd, start=otp.dt(2023,3,29,10), end=otp.dt(2023,3,29,10,1), symbols=['SPY'])" ] }, { "cell_type": "markdown", "id": "fd6139cf-af7b-4bc9-bcfe-08ef9271edd0", "metadata": {}, "source": [ "Now let's take a look at the quotes (or rather the 'national best bid/offer')." ] }, { "cell_type": "code", "execution_count": 3, "id": "ce9c9f4e-419b-4ebb-aba7-a40f8639ac77", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
TimeBID_PRICEASK_PRICE
02023-03-29 10:00:00.005847040399.85399.86
12023-03-29 10:00:00.005907200399.85399.86
22023-03-29 10:00:00.005957376399.85399.86
32023-03-29 10:00:00.005992448399.85399.86
42023-03-29 10:00:00.006007296399.85399.86
............
264092023-03-29 10:00:59.998479872399.88399.89
264102023-03-29 10:00:59.998615296399.88399.89
264112023-03-29 10:00:59.998635264399.88399.89
264122023-03-29 10:00:59.999427072399.88399.89
264132023-03-29 10:00:59.999838208399.88399.89
\n", "

26414 rows × 3 columns

\n", "
" ], "text/plain": [ " Time BID_PRICE ASK_PRICE\n", "0 2023-03-29 10:00:00.005847040 399.85 399.86\n", "1 2023-03-29 10:00:00.005907200 399.85 399.86\n", "2 2023-03-29 10:00:00.005957376 399.85 399.86\n", "3 2023-03-29 10:00:00.005992448 399.85 399.86\n", "4 2023-03-29 10:00:00.006007296 399.85 399.86\n", "... ... ... ...\n", "26409 2023-03-29 10:00:59.998479872 399.88 399.89\n", "26410 2023-03-29 10:00:59.998615296 399.88 399.89\n", "26411 2023-03-29 10:00:59.998635264 399.88 399.89\n", "26412 2023-03-29 10:00:59.999427072 399.88 399.89\n", "26413 2023-03-29 10:00:59.999838208 399.88 399.89\n", "\n", "[26414 rows x 3 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "qte = otp.DataSource('TAQ_NBBO', tick_type='NBBO')\n", "qte = qte[['BID_PRICE','ASK_PRICE']]\n", "otp.run(qte, start=otp.dt(2023,3,29,10), end=otp.dt(2023,3,29,10,1), symbols=['SPY'])" ] }, { "cell_type": "markdown", "id": "bb18486d-ccb2-4add-b235-6803077465a0", "metadata": {}, "source": [ "We \"enhance\" the trades with the information from the quotes." ] }, { "cell_type": "code", "execution_count": 4, "id": "5ca34e70-019c-45e0-923a-61765ed71e7a", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
TimePRICESIZEBID_PRICEASK_PRICE
02023-03-29 10:00:00.005852928399.8500100399.85399.86
12023-03-29 10:00:00.005852928399.8500100399.85399.86
22023-03-29 10:00:00.005856000399.8500100399.85399.86
32023-03-29 10:00:00.006580736399.85001399.84399.85
42023-03-29 10:00:00.012014592399.85001399.85399.87
..................
21362023-03-29 10:00:59.285156096399.8800100399.88399.89
21372023-03-29 10:00:59.687216896399.89001399.87399.89
21382023-03-29 10:00:59.911493888399.8899100399.87399.89
21392023-03-29 10:00:59.974551040399.8701100399.87399.89
21402023-03-29 10:00:59.998262528399.89001399.88399.89
\n", "

2141 rows × 5 columns

\n", "
" ], "text/plain": [ " Time PRICE SIZE BID_PRICE ASK_PRICE\n", "0 2023-03-29 10:00:00.005852928 399.8500 100 399.85 399.86\n", "1 2023-03-29 10:00:00.005852928 399.8500 100 399.85 399.86\n", "2 2023-03-29 10:00:00.005856000 399.8500 100 399.85 399.86\n", "3 2023-03-29 10:00:00.006580736 399.8500 1 399.84 399.85\n", "4 2023-03-29 10:00:00.012014592 399.8500 1 399.85 399.87\n", "... ... ... ... ... ...\n", "2136 2023-03-29 10:00:59.285156096 399.8800 100 399.88 399.89\n", "2137 2023-03-29 10:00:59.687216896 399.8900 1 399.87 399.89\n", "2138 2023-03-29 10:00:59.911493888 399.8899 100 399.87 399.89\n", "2139 2023-03-29 10:00:59.974551040 399.8701 100 399.87 399.89\n", "2140 2023-03-29 10:00:59.998262528 399.8900 1 399.88 399.89\n", "\n", "[2141 rows x 5 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "enh_trd = otp.join_by_time([trd, qte])\n", "otp.run(enh_trd, start=otp.dt(2023,3,29,10), end=otp.dt(2023,3,29,10,1), symbols=['SPY'])" ] }, { "cell_type": "markdown", "id": "6478876f-e63d-4d04-81e5-5c314b9eaf07", "metadata": {}, "source": [ "In other words, each trade is joined with the quote that was active at the time the trade took place. You can examine the quote time to make sure it's before the trade time." ] }, { "cell_type": "code", "execution_count": 5, "id": "3fbcedd7-9dba-4fa5-8dd8-f8f01c0226c0", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
TimePRICESIZEBID_PRICEASK_PRICEquote_time
02023-03-29 10:00:00.005852928399.8500100399.85399.862023-03-29 10:00:00.005847040
12023-03-29 10:00:00.005852928399.8500100399.85399.862023-03-29 10:00:00.005847040
22023-03-29 10:00:00.005856000399.8500100399.85399.862023-03-29 10:00:00.005847040
32023-03-29 10:00:00.006580736399.85001399.84399.852023-03-29 10:00:00.006548736
42023-03-29 10:00:00.012014592399.85001399.85399.872023-03-29 10:00:00.011275008
.....................
21362023-03-29 10:00:59.285156096399.8800100399.88399.892023-03-29 10:00:59.285148672
21372023-03-29 10:00:59.687216896399.89001399.87399.892023-03-29 10:00:59.686418176
21382023-03-29 10:00:59.911493888399.8899100399.87399.892023-03-29 10:00:59.879587072
21392023-03-29 10:00:59.974551040399.8701100399.87399.892023-03-29 10:00:59.933540352
21402023-03-29 10:00:59.998262528399.89001399.88399.892023-03-29 10:00:59.998239232
\n", "

2141 rows × 6 columns

\n", "
" ], "text/plain": [ " Time PRICE SIZE BID_PRICE ASK_PRICE quote_time\n", "0 2023-03-29 10:00:00.005852928 399.8500 100 399.85 399.86 2023-03-29 10:00:00.005847040\n", "1 2023-03-29 10:00:00.005852928 399.8500 100 399.85 399.86 2023-03-29 10:00:00.005847040\n", "2 2023-03-29 10:00:00.005856000 399.8500 100 399.85 399.86 2023-03-29 10:00:00.005847040\n", "3 2023-03-29 10:00:00.006580736 399.8500 1 399.84 399.85 2023-03-29 10:00:00.006548736\n", "4 2023-03-29 10:00:00.012014592 399.8500 1 399.85 399.87 2023-03-29 10:00:00.011275008\n", "... ... ... ... ... ... ...\n", "2136 2023-03-29 10:00:59.285156096 399.8800 100 399.88 399.89 2023-03-29 10:00:59.285148672\n", "2137 2023-03-29 10:00:59.687216896 399.8900 1 399.87 399.89 2023-03-29 10:00:59.686418176\n", "2138 2023-03-29 10:00:59.911493888 399.8899 100 399.87 399.89 2023-03-29 10:00:59.879587072\n", "2139 2023-03-29 10:00:59.974551040 399.8701 100 399.87 399.89 2023-03-29 10:00:59.933540352\n", "2140 2023-03-29 10:00:59.998262528 399.8900 1 399.88 399.89 2023-03-29 10:00:59.998239232\n", "\n", "[2141 rows x 6 columns]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "qte['quote_time'] = qte['Time']\n", "enh_trd = otp.join_by_time([trd, qte])\n", "otp.run(enh_trd, start=otp.dt(2023,3,29,10), end=otp.dt(2023,3,29,10,1), symbols=['SPY'])" ] }, { "cell_type": "markdown", "id": "f1c192bc-3273-44fe-a1e2-9a34dbbdd349", "metadata": {}, "source": [ "Let's try the same query for a less liquid symbol." ] }, { "cell_type": "code", "execution_count": 6, "id": "1843eb21-3ed7-4aab-8b08-75b1bf9fd986", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
TimePRICESIZEBID_PRICEASK_PRICEquote_time
02023-03-29 10:00:00.73495398440.95007NaNNaN1969-12-31 19:00:00.000000000
12023-03-29 10:00:01.49224857640.950010040.9240.952023-03-29 10:00:01.158007296
22023-03-29 10:00:03.34810393640.9500840.9240.952023-03-29 10:00:03.347166976
32023-03-29 10:00:03.34812185640.950039140.9240.952023-03-29 10:00:03.348119296
42023-03-29 10:00:03.34816998440.950010040.9240.952023-03-29 10:00:03.348169472
.....................
1772023-03-29 10:00:53.95733939241.00006140.9841.002023-03-29 10:00:53.957334528
1782023-03-29 10:00:55.37712256041.0100540.9841.012023-03-29 10:00:54.791796736
1792023-03-29 10:00:55.61297356840.9916140.9841.012023-03-29 10:00:54.791796736
1802023-03-29 10:00:57.38135065641.00001040.9941.032023-03-29 10:00:57.381330688
1812023-03-29 10:00:57.38249318441.010010041.0041.032023-03-29 10:00:57.381917184
\n", "

182 rows × 6 columns

\n", "
" ], "text/plain": [ " Time PRICE SIZE BID_PRICE ASK_PRICE quote_time\n", "0 2023-03-29 10:00:00.734953984 40.9500 7 NaN NaN 1969-12-31 19:00:00.000000000\n", "1 2023-03-29 10:00:01.492248576 40.9500 100 40.92 40.95 2023-03-29 10:00:01.158007296\n", "2 2023-03-29 10:00:03.348103936 40.9500 8 40.92 40.95 2023-03-29 10:00:03.347166976\n", "3 2023-03-29 10:00:03.348121856 40.9500 391 40.92 40.95 2023-03-29 10:00:03.348119296\n", "4 2023-03-29 10:00:03.348169984 40.9500 100 40.92 40.95 2023-03-29 10:00:03.348169472\n", ".. ... ... ... ... ... ...\n", "177 2023-03-29 10:00:53.957339392 41.0000 61 40.98 41.00 2023-03-29 10:00:53.957334528\n", "178 2023-03-29 10:00:55.377122560 41.0100 5 40.98 41.01 2023-03-29 10:00:54.791796736\n", "179 2023-03-29 10:00:55.612973568 40.9916 1 40.98 41.01 2023-03-29 10:00:54.791796736\n", "180 2023-03-29 10:00:57.381350656 41.0000 10 40.99 41.03 2023-03-29 10:00:57.381330688\n", "181 2023-03-29 10:00:57.382493184 41.0100 100 41.00 41.03 2023-03-29 10:00:57.381917184\n", "\n", "[182 rows x 6 columns]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "otp.run(enh_trd, start=otp.dt(2023,3,29,10), end=otp.dt(2023,3,29,10,1), symbols=['AA'])" ] }, { "cell_type": "markdown", "id": "1158bea9-f28f-4fa1-b655-a0224e76fece", "metadata": {}, "source": [ "We didn't find a matching quote for the first trade. This happens when the first event following the query start time is a trade rather than a quote. We can fix this by instructing OneTick to look for a quote back in time from the start time. A complete code snipped is below." ] }, { "cell_type": "code", "execution_count": 7, "id": "056d8e72-19ec-4fe8-97f4-3132652949d3", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
TimePRICESIZEBID_PRICEASK_PRICEquote_time
02023-03-29 10:00:00.73495398440.9500740.9240.952023-03-29 10:00:00.000000000
12023-03-29 10:00:01.49224857640.950010040.9240.952023-03-29 10:00:01.158007296
22023-03-29 10:00:03.34810393640.9500840.9240.952023-03-29 10:00:03.347166976
32023-03-29 10:00:03.34812185640.950039140.9240.952023-03-29 10:00:03.348119296
42023-03-29 10:00:03.34816998440.950010040.9240.952023-03-29 10:00:03.348169472
.....................
1772023-03-29 10:00:53.95733939241.00006140.9841.002023-03-29 10:00:53.957334528
1782023-03-29 10:00:55.37712256041.0100540.9841.012023-03-29 10:00:54.791796736
1792023-03-29 10:00:55.61297356840.9916140.9841.012023-03-29 10:00:54.791796736
1802023-03-29 10:00:57.38135065641.00001040.9941.032023-03-29 10:00:57.381330688
1812023-03-29 10:00:57.38249318441.010010041.0041.032023-03-29 10:00:57.381917184
\n", "

182 rows × 6 columns

\n", "
" ], "text/plain": [ " Time PRICE SIZE BID_PRICE ASK_PRICE quote_time\n", "0 2023-03-29 10:00:00.734953984 40.9500 7 40.92 40.95 2023-03-29 10:00:00.000000000\n", "1 2023-03-29 10:00:01.492248576 40.9500 100 40.92 40.95 2023-03-29 10:00:01.158007296\n", "2 2023-03-29 10:00:03.348103936 40.9500 8 40.92 40.95 2023-03-29 10:00:03.347166976\n", "3 2023-03-29 10:00:03.348121856 40.9500 391 40.92 40.95 2023-03-29 10:00:03.348119296\n", "4 2023-03-29 10:00:03.348169984 40.9500 100 40.92 40.95 2023-03-29 10:00:03.348169472\n", ".. ... ... ... ... ... ...\n", "177 2023-03-29 10:00:53.957339392 41.0000 61 40.98 41.00 2023-03-29 10:00:53.957334528\n", "178 2023-03-29 10:00:55.377122560 41.0100 5 40.98 41.01 2023-03-29 10:00:54.791796736\n", "179 2023-03-29 10:00:55.612973568 40.9916 1 40.98 41.01 2023-03-29 10:00:54.791796736\n", "180 2023-03-29 10:00:57.381350656 41.0000 10 40.99 41.03 2023-03-29 10:00:57.381330688\n", "181 2023-03-29 10:00:57.382493184 41.0100 100 41.00 41.03 2023-03-29 10:00:57.381917184\n", "\n", "[182 rows x 6 columns]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "trd = otp.DataSource('NYSE_TAQ', tick_type='TRD')\n", "trd = trd[['PRICE','SIZE']]\n", "\n", "qte = otp.DataSource('TAQ_NBBO', tick_type='NBBO', back_to_first_tick=60)\n", "qte = qte[['BID_PRICE','ASK_PRICE']]\n", "qte['quote_time'] = qte['Time']\n", "\n", "enh_trd = otp.join_by_time([trd, qte])\n", "otp.run(enh_trd, start=otp.dt(2023,3,29,10), end=otp.dt(2023,3,29,10,1), symbols=['AA'])" ] }, { "cell_type": "markdown", "id": "66f31c5b-05b3-4d7a-a208-33453fd99039", "metadata": {}, "source": [ "# Use case: Prevaling quote at the time of a trade\n", "The code snippet above implements this use case." ] }, { "cell_type": "markdown", "id": "5e93226d-78dd-413d-ab56-5fbab57a604c", "metadata": {}, "source": [ "# Use case: Computing Markouts\n", "A common TCA and quant research use case is to see what happens to the quote at certain intervals (aka markouts) before/after the trade. We can compute markouts efficiently by shifting the time series for each markout and then doing `join_by_time`." ] }, { "cell_type": "code", "execution_count": null, "id": "5d154585-d546-4879-9337-913b04abf8b6", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
TimePRICESIZEASK_PRICE_-1BID_PRICE_-1quote_time_-1ASK_PRICE_1BID_PRICE_1quote_time_1ASK_PRICE_5BID_PRICE_5quote_time_5
02023-03-08 10:00:00.014762069181.4250300181.47181.412023-03-08 09:59:59.002564435181.49181.482023-03-08 10:00:01.000000000181.15181.042023-03-08 10:00:05.012557243
12023-03-08 10:00:00.014764168181.440012181.47181.412023-03-08 09:59:59.002564435181.49181.482023-03-08 10:00:01.000000000181.15181.042023-03-08 10:00:05.012557243
22023-03-08 10:00:00.014767570181.4400200181.47181.412023-03-08 09:59:59.002564435181.49181.482023-03-08 10:00:01.000000000181.15181.042023-03-08 10:00:05.012557243
32023-03-08 10:00:00.014799259181.45002181.47181.412023-03-08 09:59:59.002564435181.49181.482023-03-08 10:00:01.000000000181.15181.042023-03-08 10:00:05.012557243
42023-03-08 10:00:00.015374686181.4400300181.47181.412023-03-08 09:59:59.002564435181.49181.482023-03-08 10:00:01.000000000181.15181.042023-03-08 10:00:05.012557243
.......................................
7052023-03-09 10:00:00.956041264183.5899500183.46183.432023-03-09 09:59:59.911620614183.59183.542023-03-09 10:00:01.945769144183.65183.572023-03-09 10:00:05.954673686
7062023-03-09 10:00:00.967881686183.580025183.46183.432023-03-09 09:59:59.963372673183.59183.542023-03-09 10:00:01.963256371183.66183.572023-03-09 10:00:05.967652794
7072023-03-09 10:00:00.978963708183.4718136183.46183.422023-03-09 09:59:59.971349853183.59183.542023-03-09 10:00:01.963256371183.66183.572023-03-09 10:00:05.967652794
7082023-03-09 10:00:00.983358396183.550020183.46183.412023-03-09 09:59:59.983288366183.59183.542023-03-09 10:00:01.963256371183.66183.572023-03-09 10:00:05.967652794
7092023-03-09 10:00:00.990945922183.5650200183.44183.412023-03-09 09:59:59.988781014183.59183.542023-03-09 10:00:01.963256371183.66183.572023-03-09 10:00:05.967652794
\n", "

710 rows × 12 columns

\n", "
" ], "text/plain": [ " Time PRICE SIZE ASK_PRICE_-1 BID_PRICE_-1 quote_time_-1 ASK_PRICE_1 BID_PRICE_1 quote_time_1 ASK_PRICE_5 BID_PRICE_5 quote_time_5\n", "0 2023-03-08 10:00:00.014762069 181.4250 300 181.47 181.41 2023-03-08 09:59:59.002564435 181.49 181.48 2023-03-08 10:00:01.000000000 181.15 181.04 2023-03-08 10:00:05.012557243\n", "1 2023-03-08 10:00:00.014764168 181.4400 12 181.47 181.41 2023-03-08 09:59:59.002564435 181.49 181.48 2023-03-08 10:00:01.000000000 181.15 181.04 2023-03-08 10:00:05.012557243\n", "2 2023-03-08 10:00:00.014767570 181.4400 200 181.47 181.41 2023-03-08 09:59:59.002564435 181.49 181.48 2023-03-08 10:00:01.000000000 181.15 181.04 2023-03-08 10:00:05.012557243\n", "3 2023-03-08 10:00:00.014799259 181.4500 2 181.47 181.41 2023-03-08 09:59:59.002564435 181.49 181.48 2023-03-08 10:00:01.000000000 181.15 181.04 2023-03-08 10:00:05.012557243\n", "4 2023-03-08 10:00:00.015374686 181.4400 300 181.47 181.41 2023-03-08 09:59:59.002564435 181.49 181.48 2023-03-08 10:00:01.000000000 181.15 181.04 2023-03-08 10:00:05.012557243\n", ".. ... ... ... ... ... ... ... ... ... ... ... ...\n", "705 2023-03-09 10:00:00.956041264 183.5899 500 183.46 183.43 2023-03-09 09:59:59.911620614 183.59 183.54 2023-03-09 10:00:01.945769144 183.65 183.57 2023-03-09 10:00:05.954673686\n", "706 2023-03-09 10:00:00.967881686 183.5800 25 183.46 183.43 2023-03-09 09:59:59.963372673 183.59 183.54 2023-03-09 10:00:01.963256371 183.66 183.57 2023-03-09 10:00:05.967652794\n", "707 2023-03-09 10:00:00.978963708 183.4718 136 183.46 183.42 2023-03-09 09:59:59.971349853 183.59 183.54 2023-03-09 10:00:01.963256371 183.66 183.57 2023-03-09 10:00:05.967652794\n", "708 2023-03-09 10:00:00.983358396 183.5500 20 183.46 183.41 2023-03-09 09:59:59.983288366 183.59 183.54 2023-03-09 10:00:01.963256371 183.66 183.57 2023-03-09 10:00:05.967652794\n", "709 2023-03-09 10:00:00.990945922 183.5650 200 183.44 183.41 2023-03-09 09:59:59.988781014 183.59 183.54 2023-03-09 10:00:01.963256371 183.66 183.57 2023-03-09 10:00:05.967652794\n", "\n", "[710 rows x 12 columns]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import onetick.query as otq\n", "\n", "s = otp.dt(2023, 3, 8, 10)\n", "e = otp.dt(2023, 3, 9, 10, 0, 1)\n", "markouts = [-1, 1, 5] \n", "\n", "trd = otp.DataSource('NYSE_TAQ', tick_type='TRD')\n", "trd = trd[['PRICE', 'SIZE']]\n", "\n", "qte_by_markout = []\n", "for m in markouts:\n", " qte = otp.DataSource('TAQ_NBBO', tick_type='NBBO', back_to_first_tick=86400)\n", " qte = qte[['ASK_PRICE', 'BID_PRICE']]\n", " qte = qte.rename({'ASK_PRICE': f'ASK_PRICE_{m}', \n", " 'BID_PRICE': f'BID_PRICE_{m}'})\n", " qte[f'quote_time_{m}'] = qte['Time']\n", " \n", " # shift the data by m seconds\n", " # change the timestamp so it fits into original time range (required by OneTick)\n", " qte.sink(otq.ModifyQueryTimes(start_time=f'_START_TIME + {m * 1000}',\n", " output_timestamp=f'TIMESTAMP - {m * 1000}',\n", " end_time=f'_END_TIME + {m * 1000}'))\n", " qte_by_markout.append(qte)\n", "\n", "trd = otp.join_by_time([trd] + qte_by_markout)\n", "otp.run(trd, start=s, end=e, symbols='TSLA', apply_times_daily=True)" ] }, { "cell_type": "markdown", "id": "c7f14ed4-8248-430b-b13a-74a4fe694223", "metadata": {}, "source": [ "# `join_with_query`: Executing a query on each tick" ] }, { "cell_type": "markdown", "id": "d12786f9-dbcd-4726-b667-6b47d71aff64", "metadata": {}, "source": [ "Sometimes we want to do a look up based on the information provided in a tick. For example, we may have a series of order ticks each containing an order arrival and exit times and we may want to find the market vwap during the interval. Let's take this one step at a time." ] }, { "cell_type": "markdown", "id": "7db241f8-69e6-48dc-91a9-ebe496fb2a55", "metadata": {}, "source": [ "Let's find the market vwap for a given time range (i.e., for a given `start` and `end`)." ] }, { "cell_type": "code", "execution_count": 8, "id": "d3994b99-f9f7-4925-8352-ab3094c3d5ba", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Timemarket_vwap
02023-04-10 10:27:00160.477922
\n", "
" ], "text/plain": [ " Time market_vwap\n", "0 2023-04-10 10:27:00 160.477922" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "q = otp.DataSource('NYSE_TAQ', tick_type='TRD')\n", "q = q.agg({'market_vwap': otp.agg.vwap('PRICE','SIZE')})\n", "otp.run(q, start=otp.dt(2023,4,10,10,25,59), end=otp.dt(2023,4,10,10,27), symbols='AAPL')" ] }, { "cell_type": "markdown", "id": "279ab9ef-844c-4a30-a526-52e09c1b65e2", "metadata": {}, "source": [ "Next let's create a couple of orders each with its own values for `start`/`end` specified in the `arrival`/`exit` columns." ] }, { "cell_type": "code", "execution_count": 42, "id": "2572553c-b769-43f7-b312-b77d2508675a", "metadata": {}, "outputs": [], "source": [ "# set some defaults\n", "from datetime import datetime\n", "otp.config['default_start_time'] = datetime(2003, 12, 1, 0, 0, 0)\n", "otp.config['default_end_time'] = datetime(2003, 12, 4, 0, 0, 0)\n", "otp.config['default_db'] = 'DEMO_L1'\n", "otp.config['default_symbol'] = 'AAPL'" ] }, { "cell_type": "code", "execution_count": 43, "id": "d38d98d1-e151-4f11-bc39-bf477aa692df", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Timearrivalexitsym
02003-12-01 00:00:00.0002023-04-10 10:25:592023-04-10 10:26:07AAPL
12003-12-01 00:00:00.0012023-04-10 10:26:092023-04-10 10:27:48MSFT
\n", "
" ], "text/plain": [ " Time arrival exit sym\n", "0 2003-12-01 00:00:00.000 2023-04-10 10:25:59 2023-04-10 10:26:07 AAPL\n", "1 2003-12-01 00:00:00.001 2023-04-10 10:26:09 2023-04-10 10:27:48 MSFT" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "orders = otp.Ticks(arrival=[otp.dt(2023,4,10,10,25,59), otp.dt(2023,4,10,10,26,9)], \n", " exit=[otp.dt(2023,4,10,10,26,7), otp.dt(2023,4,10,10,27,48)],\n", " sym=['AAPL', 'MSFT'])\n", "otp.run(orders)" ] }, { "cell_type": "markdown", "id": "844a4826-7771-41c7-8f5d-2a6ed65a83c2", "metadata": {}, "source": [ "We can wrap the code that finds vwap into a function and call it for each order while passing the relevant parameters for `start`, `end`, and `symbol`." ] }, { "cell_type": "code", "execution_count": 30, "id": "fb8b8e54-ac60-49d8-9927-9bddd873055b", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Timemarket_vwaparrivalexitsym
02022-04-01 00:00:00.000160.5488902023-04-10 10:25:592023-04-10 10:26:07AAPL
12022-04-01 00:00:00.001286.1367792023-04-10 10:26:092023-04-10 10:27:48MSFT
\n", "
" ], "text/plain": [ " Time market_vwap arrival exit sym\n", "0 2022-04-01 00:00:00.000 160.548890 2023-04-10 10:25:59 2023-04-10 10:26:07 AAPL\n", "1 2022-04-01 00:00:00.001 286.136779 2023-04-10 10:26:09 2023-04-10 10:27:48 MSFT" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def vwap(symbol):\n", " q = otp.DataSource('NYSE_TAQ', tick_type='TRD')\n", " q = q.agg({'market_vwap': otp.agg.vwap('PRICE','SIZE')})\n", " return q\n", "\n", "orders = otp.Ticks(arrival=[otp.dt(2023,4,10,10,25,59), otp.dt(2023,4,10,10,26,9)], \n", " exit=[otp.dt(2023,4,10,10,26,7), otp.dt(2023,4,10,10,27,48)],\n", " sym=['AAPL', 'MSFT'])\n", "orders = orders.join_with_query(vwap, start=orders['arrival'], end=orders['exit'], symbol=orders['sym'])\n", "otp.run(orders)" ] }, { "cell_type": "markdown", "id": "7624b26c-4e9d-4b74-8fc4-80bfa769162f", "metadata": {}, "source": [ "# Use case: Interval VWAP\n", "The code above provides an implementation for this use case. However, a more efficient implementation may be useful when the number of orders is large. We provide a more efficient implementation below." ] }, { "cell_type": "code", "execution_count": 36, "id": "5e5d69e8-571d-47be-8ce9-f07824c92d2b", "metadata": {}, "outputs": [], "source": [ "orders['_PARAM_START_TIME_NANOS'] = orders['arrival'] \n", "orders['_PARAM_END_TIME_NANOS'] = orders['exit'] \n", "orders['SYMBOL_NAME'] = orders['sym'] " ] }, { "cell_type": "code", "execution_count": 37, "id": "618781ac-c3ff-4530-9a2b-1e5ad7ec5e97", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'AAPL': Time market_vwap\n", " 0 2023-04-10 10:26:07 160.54889,\n", " 'MSFT': Time market_vwap\n", " 0 2023-04-10 10:27:48 286.136779}" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "otp.run(vwap, symbols=orders, start=otp.dt(2022,12,1), end=otp.dt(2022,12,2))" ] }, { "cell_type": "code", "execution_count": null, "id": "b6226756-6dfa-4177-a2a2-c0851f0f766b", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.8" } }, "nbformat": 4, "nbformat_minor": 5 }