otp.run
- run(query, *, symbols=None, start=utils.adaptive, end=utils.adaptive, date=None, start_time_expression=None, end_time_expression=None, timezone=utils.default, context=utils.default, username=None, alternative_username=None, password=None, batch_size=utils.default, running=False, query_properties=None, concurrency=utils.default, apply_times_daily=None, symbol_date=None, query_params=None, time_as_nsec=True, treat_byte_arrays_as_strings=True, output_matrix_per_field=False, output_structure=None, return_utc_times=None, connection=None, callback=None, svg_path=None, use_connection_pool=False, node_name=None, require_dict=False, max_expected_ticks_per_symbol=None, log_symbol=utils.default, encoding=None, manual_dataframe_callback=False)
- Executes a query and returns its result.
- Parameters
- query (onetick.py.Source, otq.Ep, otq.Graph, otq.GraphQuery, otq.ChainQuery, str, otq.Chainlet, Callable) – The query to execute. It can be a source, the path to a query on disk, or a onetick.query graph or event processor. For OTQ files, pass the path (including the filename) of the OTQ file. If the file contains more than one query, the query to run must be specified (that is, 'path_to_file/otq_file.otq::query_to_run'). query can also be a function that takes a symbol object as its first parameter. This object can be used to get the symbol name and symbol parameters. The function must return a Source.
- symbols (str, list of str, list of otq.Symbol, onetick.py.Source, pd.DataFrame, optional) – Symbol(s) to run the query for, passed as a string, a list of strings, a pd.DataFrame with the SYMBOL_NAME column, or a “symbols” query whose results include the SYMBOL_NAME column. The start/end times for the symbols query will be taken from the parameters below. See symbols for more details.
- start (datetime.datetime, onetick.py.datetime, pyomd.timeval_t, optional) – The start time of the query. If a datetime.datetime is passed, OneTick ignores the timezone of the object, so we suggest passing only otp.datetime objects. onetick.py uses default_start_time as the default value. If you don’t want to specify a start time, e.g. to use the time saved in the query, pass None. See also the timezone argument.
- end (datetime.datetime, onetick.py.datetime, pyomd.timeval_t, optional) – The end time of the query. If a datetime.datetime is passed, OneTick ignores the timezone of the object, so we suggest passing only otp.datetime objects. onetick.py uses default_end_time as the default value. If you don’t want to specify an end time, e.g. to use the time saved in the query, pass None. See also the timezone argument.
- date (datetime.date, onetick.py.date, optional) – The date to run the query for. Can be set instead of the start and end parameters. If set, the query runs over the interval from 0:00 to 24:00 of the specified date.
- start_time_expression (str, optional) – OneTick expression for the start time of the query. If specified, it takes precedence over start. Supported only if query is a Source, Graph or Event Processor.
- end_time_expression (str, optional) – OneTick expression for the end time of the query. If specified, it takes precedence over end. Supported only if query is a Source, Graph or Event Processor.
- timezone (str, optional) – The timezone of the start and end times, as well as of the output timestamps. It has higher priority than the timezone of the start and end parameters. If the parameter is omitted, tick timestamps are formatted with the default tz.
- context (str, optional) – Specifies which instance of OneTick tick_servers to connect to. If not set, the default context is used.
- username (Optional[str]) – The username used to make the connection. By default, the user who started the process is used.
- alternative_username (str) – The username used for authentication. Needs to be set only when the tick server is configured to use password-based authentication. By default, default_auth_username is used.
- password (str, optional) – The password used for authentication. Needs to be set only when the tick server is configured to use password-based authentication. Note: not supported and ignored on older OneTick versions. By default, default_password is used.
- batch_size (int) – Number of symbols to run in one batch. By default, the value from default_batch_size is used.
- running (bool, optional) – Indicates whether a query is CEP or not. Default is False. 
- query_properties (pyomd.QueryProperties or dict, optional) – Query properties, such as ONE_TO_MANY_POLICY, ALLOW_GRAPH_REUSE, etc.
- concurrency (int, optional) – The maximum number of CPU cores to use to process the query. By default, the value from default_concurrency is used.
- apply_times_daily (bool) – Runs the query for every day in the start-end time range, using the time components of the start and end datetimes. Note that these daily intervals are executed separately, so you don’t have access to the data from previous or next days (see the apply_times_daily example in the Examples section).
- symbol_date (Optional[Union[datetime.datetime, int]]) – The symbol date used to look up symbology mapping information in the reference database, expressed as a datetime object or an integer in YYYYMMDD format.
- query_params (dict) – Parameters of the query. 
- time_as_nsec (bool) – Outputs timestamps with up to nanosecond granularity. If False, timestamps are output with microsecond granularity. The signature above lists True as the default.
- treat_byte_arrays_as_strings (bool) – Outputs byte arrays as strings (defaults to True) 
- output_matrix_per_field (bool) – Changes output format to list of matrices per field. 
- output_structure (otp.Source.OutputStructure, optional) – Structure (type) of the result. Supported values are:
- df (default) - the result is returned as pandas.DataFrame, or as dict[symbol: pandas.DataFrame] when multiple symbols or a first-stage query are used.
- map - the result is returned as SymbolNumpyResultMap. 
- list - the result is returned as list. 
 
 
- return_utc_times (bool) – If True, times are returned in the UTC timezone; otherwise they are returned in the local timezone.
- connection (pyomd.Connection) – The connection to be used for discovering nested .otq files.
- callback (onetick.py.CallbackBase) – Class with callback methods. If set, the output of the query should be controlled with callbacks and this function returns nothing.
- svg_path – 
- use_connection_pool (bool) – 
- node_name (str, List[str], optional) – Name of the output node to select the result from. If the query graph has several output nodes, you can specify the name of the node whose result should be returned. If node_name is specified, the query must be given as a path on disk and output_structure must be df.
- require_dict (bool) – If set to True, result will be forced to be a dictionary even if it’s returned for a single symbol 
- max_expected_ticks_per_symbol (int) – Expected maximum number of ticks per symbol (used for performance optimizations). By default, max_expected_ticks_per_symbol is used.
- log_symbol (bool) – Logs the symbol currently being executed. Note that this only works with unbound symbols. In this case otp.run is also executed in callback mode and no value is returned from the function, so it should be used only for debugging purposes. This logging will not work if some other value is specified in the callback parameter. By default, otp.config.log_symbol is used.
- encoding (str, optional) – The encoding of string fields. 
- manual_dataframe_callback (bool) – Creates the dataframe manually using callback mode. Only works if output_structure='df' is specified and the callback parameter is not. May improve performance in some cases.
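
The interval semantics described for date and apply_times_daily can be sketched with the standard library alone. The helpers below (day_interval and daily_intervals) are hypothetical names, not part of onetick.py, and the day-splitting rule is one reading of the parameter descriptions above, not OneTick's actual implementation:

```python
from datetime import date, datetime, time, timedelta
from typing import List, Tuple

Interval = Tuple[datetime, datetime]


def day_interval(day: date) -> Interval:
    """Return the 0:00 to 24:00 interval covering one calendar day."""
    start = datetime.combine(day, time.min)
    return start, start + timedelta(days=1)


def daily_intervals(start: datetime, end: datetime) -> List[Interval]:
    """Build one interval per day from the time-of-day parts of start and end.

    Assumption: an end time-of-day at or before the start time-of-day
    means the daily interval finishes on the following day.
    """
    intervals: List[Interval] = []
    day = start.date()
    while True:
        day_start = datetime.combine(day, start.time())
        day_end = datetime.combine(day, end.time())
        if day_end <= day_start:
            day_end += timedelta(days=1)
        if day_end > end:
            # The next daily interval would run past the requested range.
            break
        intervals.append((day_start, day_end))
        day += timedelta(days=1)
    return intervals


print(day_interval(date(2003, 12, 1)))
for interval in daily_intervals(datetime(2023, 4, 3), datetime(2023, 4, 5)):
    print(interval)
```

Under these assumptions, the range from 2023-04-03 to 2023-04-05 yields two separate daily intervals. That matches the two days visible in the apply_times_daily example in the Examples section, where PREV_COUNT resets to 0 at the start of each day.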
 
- Returns
- result of the query 
- Return type
- result, list, dict, pandas.DataFrame, None 
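
Because the return value may be a single object, a dict keyed by symbol, or None in callback mode, calling code sometimes normalizes it before use. The following is a minimal caller-side sketch of that idea. as_symbol_dict is a hypothetical helper, not an onetick.py function, and plain strings stand in for pandas.DataFrame objects:

```python
from typing import Any, Dict


def as_symbol_dict(result: Any, symbol: str) -> Dict[str, Any]:
    """Return a dict keyed by symbol for any of the result shapes."""
    if result is None:
        # Callback mode returns nothing, so there is nothing to wrap.
        return {}
    if isinstance(result, dict):
        # Already keyed by symbol (e.g. multiple symbols were used).
        return result
    # Single-symbol result: wrap it under the given symbol name.
    return {symbol: result}


# Stand-in values; with onetick.py these would be pandas.DataFrame objects.
print(as_symbol_dict("frame for AAPL", "AAPL"))
print(as_symbol_dict({"A": "frame A", "B": "frame B"}, "unused"))
```

Passing require_dict=True to otp.run, as described in the parameter list, asks the library itself to return a dictionary for a single symbol, which makes a wrapper like this unnecessary for the df output structure.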
- Examples

Running onetick.py.Source and setting start and end times:

    >>> data = otp.Tick(A=1)
    >>> otp.run(data, start=otp.dt(2003, 12, 2), end=otp.dt(2003, 12, 4))
            Time  A
    0 2003-12-02  1

Setting query interval with date parameter:

    >>> data = otp.Tick(A=1)
    >>> data['START'] = data['_START_TIME']
    >>> data['END'] = data['_END_TIME']
    >>> otp.run(data, date=otp.dt(2003, 12, 1))
            Time  A      START        END
    0 2003-12-01  1 2003-12-01 2003-12-02

Running otq.Ep and passing query parameters:

    >>> ep = otq.TickGenerator(bucket_interval=0, fields='long A = $X').tick_type('TT')
    >>> otp.run(ep, symbols='LOCAL::', query_params={'X': 1})
            Time  A
    0 2003-12-04  1

Running in callback mode:

    >>> class Callback(otp.CallbackBase):
    ...     def __init__(self):
    ...         self.result = None
    ...     def process_tick(self, tick, time):
    ...         self.result = tick
    >>> data = otp.Tick(A=1)
    >>> callback = Callback()
    >>> otp.run(data, callback=callback)
    >>> callback.result
    {'A': 1}

Running with apply_times_daily. Note that daily intervals are processed separately so, for example, we can’t access column COUNT from previous day.

    >>> trd = otp.DataSource('NYSE_TAQ', symbols='AAPL', tick_type='TRD')
    >>> trd = trd.agg({'COUNT': otp.agg.count()},
    ...               bucket_interval=12 * 3600, bucket_time='start')
    >>> trd['PREV_COUNT'] = trd['COUNT'][-1]
    >>> otp.run(trd, apply_times_daily=True,
    ...         start=otp.dt(2023, 4, 3), end=otp.dt(2023, 4, 5), timezone='EST5EDT')
                     Time   COUNT  PREV_COUNT
    0 2023-04-03 00:00:00  328447           0
    1 2023-04-03 12:00:00  240244      328447
    2 2023-04-04 00:00:00  263293           0
    3 2023-04-04 12:00:00  193018      263293

Using a function as a query, accessing symbol name and parameters:

    >>> def query(symbol):
    ...     t = otp.Tick(X='x')
    ...     t['SYMBOL_NAME'] = symbol.name
    ...     t['SYMBOL_PARAM'] = symbol.PARAM
    ...     return t
    >>> symbols = otp.Ticks({'SYMBOL_NAME': ['A', 'B'], 'PARAM': [1, 2]})
    >>> result = otp.run(query, symbols=symbols)
    >>> result['A']
            Time  X SYMBOL_NAME  SYMBOL_PARAM
    0 2003-12-01  x           A             1
    >>> result['B']
            Time  X SYMBOL_NAME  SYMBOL_PARAM
    0 2003-12-01  x           B             2

Debugging unbound symbols with log_symbol parameter:

    >>> data = otp.Tick(X=1)
    >>> symbols = otp.Ticks({'SYMBOL_NAME': ['A', 'B'], 'PARAM': [1, 2]})
    >>> otp.run(query, symbols=symbols, log_symbol=True)
    Running query <onetick.py.sources.Tick object at ...>
    Processing symbol A
    Processing symbol B

By default, some non-standard characters in data strings could be processed incorrectly:

    >>> data = ['AA測試AA']
    >>> source = otp.Ticks({'A': data})
    >>> otp.run(source)
            Time         A
    0 2003-12-01  AA測試AA

To fix this you can pass encoding parameter to otp.run:

    data = ['AA測試AA']
    source = otp.Ticks({'A': data})
    df = otp.run(source, encoding="utf-8")
    print(df)

            Time         A
    0 2003-12-01  AA測試AA
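
When query is a string, the description of the query parameter gives the form 'path_to_file/otq_file.otq::query_to_run' for selecting one query inside an OTQ file. A small sketch of taking such a string apart is shown below. split_otq_path is a hypothetical helper, not part of onetick.py, and it assumes the last '::' in the string separates the file path from the query name:

```python
from typing import Optional, Tuple


def split_otq_path(query: str) -> Tuple[str, Optional[str]]:
    """Split 'file.otq::name' into (file path, query name or None)."""
    # rpartition splits on the LAST '::'; sep is empty if none was found.
    path, sep, name = query.rpartition("::")
    if not sep:
        # No query name given: the whole string is the file path.
        return query, None
    return path, name


print(split_otq_path("path_to_file/otq_file.otq::query_to_run"))
print(split_otq_path("path_to_file/otq_file.otq"))
```

A helper like this can be useful for validating or logging the file and query name before the string is handed to otp.run. The string itself is passed to otp.run unchanged.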