otp.Source.join_with_query#
- Source.join_with_query(query, how='outer', symbol=None, params=None, start=None, end=None, timezone=None, prefix=None, caching=None, keep_time=None, where=None, default_fields_for_outer_join=None, symbol_time=None, concurrency=None, process_query_async=True, **kwargs)#
For each tick executes
query
.- Parameters
query (callable, Source) –
Callable
query
should returnSource
. This object will be evaluated by OneTick (not python) for every tick. Note python code will be executed only once, so all python’s conditional expressions will be evaluated only once too. Callable should havesymbol
parameter and the parameters with names fromparams
if they are specified in this method.If
query
is aSource
object then it will be propagated as a query to OneTick.how ('inner', 'outer') – Type of join. If inner, then each tick is propagated only if its
query
execution has a non-empty result.params (dict) – Mapping of the parameters’ names and their values for the
query
.Columns
can be used as a value.symbol (str, Operation, dict, Source, or Tuple[Union[str, Operation], Union[dict, Source]]) –
Symbol name to use in
query
. In addition, symbol params can be passed along with symbol name.Symbol name can be passed as a string or as an
Operation
.Symbol parameters can be passed as a dictionary. Also, the main
Source
object, or the object containing a symbol parameter list, can be used as a list of symbol parameter. Special symbol parameters (_PARAM_START_TIME_NANOS and _PARAM_END_TIME_NANOS) will be ignored and will not be propagated toquery
.symbol
will be interpreted as a symbol name or as symbol parameters, depending on its type. You can pass both as a tuple.If symbol name is not passed, then symbol name from the main source is used.
start (datetime, Operation) – Start time of
query
. By default, start time of the main source is used.end (datetime, Operation) – End time of
query
. By default, end time of the main source is used.start_time –
Deprecated since version 1.48.4: The same as
start
.end_time –
Deprecated since version 1.48.4: The same as
end
.timezone (Optional, str) – Timezone of
query
. By default, timezone of the main source is used.prefix (str) – Prefix for the names of joined tick fields.
caching (str) –
If None caching is disabled (default). You can specify caching by using values:
’cross_symbol’: cache is the same for all symbols
’per_symbol’: cache is different for each symbol.
Note
When parameter
process_query_async
is set toTrue
(default), caching may work unexpectedly, because ticks will be accumulated in batches andquery
will be processed in different threads.keep_time (str) – Name for the joined timestamp column. None means no timestamp column will be joined.
where (Operation) – Condition to filter ticks for which the result of the
query
will be joined.default_fields_for_outer_join (dict) – When you use outer join, all output ticks will have fields from the schema of the joined source. If nothing was joined to a particular output tick, these fields will have default values for their type. This parameter allows to override the values that would be added to ticks for which nothing was joined. Dictionary keys should be field names, and dictionary values should be constants or
Operation
expressionssymbol_time (datetime, Operation) – Time that will be used by Onetick to map the symbol with which
query
is executed to the reference data. This parameter is only necessary if the query is expected to perform symbology conversions.concurrency (int) – Specifies number of threads for asynchronous processing of
query
per unbound symbol list. By default, the number of threads is 1.process_query_async (bool) –
Switches between synchronous and asynchronous execution of queries.
While asynchronous execution is generally much more effective, in certain cases synchronous execution may still be preferred (e.g., when there are a few input ticks, each initiating a memory-consuming query).
In asynchronous mode typically while parallel thread is processing the query, EP accumulates some input ticks.
self (Source) –
- Returns
Source with joined ticks from
query
- Return type
Examples
>>> d = otp.Ticks(Y=[-1]) >>> d = d.update(dict(Y=1), where=(d.Symbol.name == "a")) >>> data = otp.Ticks(X=[1, 2], ... S=["a", "b"]) >>> res = data.join_with_query(d, how='inner', symbol=data['S']) >>> otp.run(res)[["X", "Y", "S"]] X Y S 0 1 1 a 1 2 -1 b
>>> d = otp.Ticks(ADDED=[-1]) >>> d = d.update(dict(ADDED=1), where=(d.Symbol.name == "3")) # symbol name is always string >>> data = otp.Ticks(A=[1, 2], B=[2, 4]) >>> res = data.join_with_query(d, how='inner', symbol=(data['A'] + data['B'])) >>> df = otp.run(res) >>> df[["A", "B", "ADDED"]] A B ADDED 0 1 2 1 1 2 4 -1
Constants as symbols are also supported
>>> d = otp.Ticks(ADDED=[d.Symbol.name]) >>> data = otp.Ticks(A=[1, 2], B=[2, 4]) >>> res = data.join_with_query(d, how='inner', symbol=1) >>> df = otp.run(res) >>> df[["A", "B", "ADDED"]] A B ADDED 0 1 2 1 1 2 4 1
Function object as query is also supported (Note it will be executed only once in python’s code)
>>> def func(symbol): ... d = otp.Ticks(TYPE=["six"]) ... d = d.update(dict(TYPE="three"), where=(symbol.name == "3")) # symbol is always converted to string ... d["TYPE"] = symbol['PREF'] + d["TYPE"] + symbol['POST'] ... return d >>> data = otp.Ticks(A=[1, 2], B=[2, 4]) >>> res = data.join_with_query(func, how='inner', symbol=(data['A'] + data['B'], dict(PREF="_", POST="$"))) >>> df = otp.run(res) >>> df[["A", "B", "TYPE"]] A B TYPE 0 1 2 _three$ 1 2 4 _six$
It’s possible to pass the source itself as a list of symbol parameters, which will make all of its fields accessible through the “symbol” object
>>> def func(symbol): ... d = otp.Ticks(TYPE=["six"]) ... d["TYPE"] = symbol['PREF'] + d["TYPE"] + symbol['POST'] ... return d >>> data = otp.Ticks(A=[1, 2], B=[2, 4], PREF=["_", "$"], POST=["$", "_"]) >>> res = data.join_with_query(func, how='inner', symbol=data) >>> df = otp.run(res) >>> df[["A", "B", "TYPE"]] A B TYPE 0 1 2 _six$ 1 2 4 $six_
The examples above can be rewritten by using onetick query parameters instead of symbol parameters. OTQ parameters are global for query, while symbol parameters can be redefined by bound symbols.
>>> def func(symbol, pref, post): ... d = otp.Ticks(TYPE=["six"]) ... d = d.update(dict(TYPE="three"), where=(symbol.name == "3")) # symbol is always converted to string ... d["TYPE"] = pref + d["TYPE"] + post ... return d >>> data = otp.Ticks(A=[1, 2], B=[2, 4]) >>> res = data.join_with_query(func, how='inner', symbol=(data['A'] + data['B']), ... params=dict(pref="_", post="$")) >>> df = otp.run(res) >>> df[["A", "B", "TYPE"]] A B TYPE 0 1 2 _three$ 1 2 4 _six$
Some or all onetick query parameters can be column or expression also
>>> def func(symbol, pref, post): ... d = otp.Ticks(TYPE=["six"]) ... d = d.update(dict(TYPE="three"), where=(symbol.name == "3")) # symbol is always converted to string ... d["TYPE"] = pref + d["TYPE"] + post ... return d >>> data = otp.Ticks(A=[1, 2], B=[2, 4], PREF=["^", "_"], POST=["!", "$"]) >>> res = data.join_with_query(func, how='inner', symbol=(data['A'] + data['B']), ... params=dict(pref=data["PREF"] + ".", post=data["POST"])) >>> df = otp.run(res) >>> df[["A", "B", "TYPE"]] A B TYPE 0 1 2 ^.three! 1 2 4 _.six$
You can specify start and end time of the query to select specific ticks from db
>>> d = otp.Ticks(Y=[1, 2]) >>> data = otp.Ticks(X=[1, 2]) >>> start = datetime(2003, 12, 1, 0, 0, 0, 1000, tzinfo=pytz.timezone("EST5EDT")) >>> end = datetime(2003, 12, 1, 0, 0, 0, 3000, tzinfo=pytz.timezone("EST5EDT")) >>> res = data.join_with_query(d, how='inner', start=start, end=end) >>> otp.run(res) Time Y X 0 2003-12-01 00:00:00.000 1 1 1 2003-12-01 00:00:00.000 2 1 2 2003-12-01 00:00:00.001 1 2 3 2003-12-01 00:00:00.001 2 2
Use keep_time param to keep or rename original timestamp column
>>> d = otp.Ticks(Y=[1, 2]) >>> data = otp.Ticks(X=[1, 2]) >>> res = data.join_with_query(d, how='inner', keep_time="ORIG_TIME") >>> otp.run(res) Time Y ORIG_TIME X 0 2003-12-01 00:00:00.000 1 2003-12-01 00:00:00.000 1 1 2003-12-01 00:00:00.000 2 2003-12-01 00:00:00.001 1 2 2003-12-01 00:00:00.001 1 2003-12-01 00:00:00.000 2 3 2003-12-01 00:00:00.001 2 2003-12-01 00:00:00.001 2
See also
JOIN_WITH_QUERY OneTick event processor