# otp.run

### run(query, \*, symbols=None, start=utils.adaptive, end=utils.adaptive, date=None, start_time_expression=None, end_time_expression=None, timezone=utils.default, context=utils.default, username=None, alternative_username=None, password=None, batch_size=utils.default, running=False, query_properties=None, concurrency=utils.default, apply_times_daily=None, symbol_date=None, query_params=None, time_as_nsec=True, treat_byte_arrays_as_strings=True, output_matrix_per_field=False, output_structure=None, return_utc_times=None, connection=None, callback=None, svg_path=None, use_connection_pool=False, node_name=None, require_dict=False, max_expected_ticks_per_symbol=None, log_symbol=utils.default, encoding=None, manual_dataframe_callback=False, print_symbol_errors=utils.default)

Executes a query and returns its result.

* **Parameters:**
  * **query** ([`onetick.py.Source`](source/root.md#onetick.py.Source), otq.Ep, otq.GraphQuery, otq.ChainQuery, str, otq.Chainlet, Callable, otq.SqlQuery, [`onetick.py.SqlQuery`](misc/sql.md#onetick.py.SqlQuery)) -- 

    The query to execute: a source, the path to a query on disk, or an onetick.query graph or event processor.
    When running an OTQ file, pass the path (including the filename) to the file.
    If the file contains more than one query, specify which one to run
    (that is, `'path_to_file/otq_file.otq::query_to_run'`).

    `query` can also be a function that takes a symbol object as its first parameter.
    This object can be used to get the symbol name and symbol parameters.
    The function must return a [`Source`](source/root.md#onetick.py.Source).
  * **symbols** (str, list of str, list of otq.Symbol, [`onetick.py.Source`](source/root.md#onetick.py.Source), [pandas.DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), optional) -- Symbol(s) to run the query for passed as a string, a list of strings,
    a [pandas.DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) with the `SYMBOL_NAME` column,
    or as a "symbols" query whose results include the `SYMBOL_NAME` column.
    The start/end times for the symbols query will be taken from the parameters below.
    See [symbols](../static/concepts/symbols.md#symbols-bound-and-unbound) for more details.
  * **start** ([`datetime.datetime`](https://docs.python.org/3/library/datetime.html#datetime.datetime), [`otp.datetime`](datetime/dt.md#onetick.py.datetime), `pyomd.timeval_t`, optional) -- The start time of the query. Can be timezone-naive or timezone-aware. See also the `timezone` argument.
    By default, onetick.py uses [`otp.config.default_start_time`](config.md#onetick.py.configuration.Config.default_start_time).
    To leave the start time unspecified, e.g. to use the time saved in the query,
    pass None.
  * **end** ([`datetime.datetime`](https://docs.python.org/3/library/datetime.html#datetime.datetime), [`otp.datetime`](datetime/dt.md#onetick.py.datetime), `pyomd.timeval_t`, optional) -- The end time of the query (note that it's non-inclusive).
    Can be timezone-naive or timezone-aware. See also the `timezone` argument.
    By default, onetick.py uses [`otp.config.default_end_time`](config.md#onetick.py.configuration.Config.default_end_time).
    To leave the end time unspecified, e.g. to use the time saved in the query,
    pass None.
  * **date** ([`datetime.date`](https://docs.python.org/3/library/datetime.html#datetime.date), [`otp.date`](datetime/date.md#onetick.py.date), optional) -- The date to run the query for. Can be set instead of `start` and `end` parameters.
    If set, the query interval runs from 0:00 to 24:00 of the specified date.
  * **start_time_expression** (str, [`Operation`](operation/root.md#onetick.py.Operation), optional) -- Start time onetick expression of the query. If specified, it will take precedence over `start`.
    Supported only if query is Source, Graph or Event Processor.
  * **end_time_expression** (str, [`Operation`](operation/root.md#onetick.py.Operation), optional) -- End time onetick expression of the query. If specified, it will take precedence over `end`.
    Supported only if query is Source, Graph or Event Processor.
  * **timezone** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *,* *optional*) -- The timezone of output timestamps.
    When the `start` and/or `end` arguments are timezone-naive, it also defines their timezone.
    If the parameter is omitted, tick timestamps are formatted
    with the default [`otp.config.tz`](config.md#onetick.py.configuration.Config.tz).
  * **context** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *,* *optional*) -- Allows specification of different contexts from OneTick configuration to connect to.
    If not set then default [`otp.config.context`](config.md#onetick.py.configuration.Config.context) is used.
    See [guide about switching contexts](../static/getting_started/session.md#switching-contexts) for examples.
  * **username** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *|* *None*) -- The username to make the connection.
    By default, the user who started the process is used, or the value specified in
    [`otp.config.default_username`](config.md#onetick.py.configuration.Config.default_username).
  * **alternative_username** ([*str*](https://docs.python.org/3/library/stdtypes.html#str)) -- The username used for authentication.
    Needs to be set only when the tick server is configured to use password-based authentication.
    By default,
    [`otp.config.default_auth_username`](config.md#onetick.py.configuration.Config.default_auth_username) is used.
    Not supported for WebAPI mode.
  * **password** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *,* *optional*) -- The password used for authentication.
    Needs to be set only when the tick server is configured to use password-based authentication.
    By default, [`otp.config.default_password`](config.md#onetick.py.configuration.Config.default_password) is used.
    Note: not supported and ignored on older OneTick versions.
    Not supported for WebAPI mode.
  * **batch_size** ([*int*](https://docs.python.org/3/library/functions.html#int)) -- Number of symbols to process in one batch. Larger batch sizes reduce overhead
    but use more memory.
    By default, the value from
    [`otp.config.default_batch_size`](config.md#onetick.py.configuration.Config.default_batch_size) is used.
    Not supported for WebAPI mode.
  * **running** ([*bool*](https://docs.python.org/3/library/functions.html#bool) *,* *optional*) -- Set to True for CEP (Complex Event Processing) real-time streaming queries.
    Default is False.
  * **query_properties** (`pyomd.QueryProperties` or dict, optional) -- Query properties, such as ONE_TO_MANY_POLICY, ALLOW_GRAPH_REUSE, etc.
  * **concurrency** ([*int*](https://docs.python.org/3/library/functions.html#int) *,* *optional*) -- The maximum number of CPU cores to use to process the query.
    By default, the value from
    [`otp.config.default_concurrency`](config.md#onetick.py.configuration.Config.default_concurrency) is used.
  * **apply_times_daily** ([*bool*](https://docs.python.org/3/library/functions.html#bool)) -- 

    Runs the query for every day in the `start`-`end` time range,
    using the time components of `start` and `end` datetimes.

    Note that those daily intervals are executed separately, so you don't have access
    to the data from previous or next days (see example in the next section).
  * **symbol_date** ([`datetime.datetime`](https://docs.python.org/3/library/datetime.html#datetime.datetime), int, str, optional) -- Date used for resolving symbols in date-dependent symbologies, where the same
    identifier can map to different instruments on different dates.
    Accepts a datetime object or integer in `YYYYMMDD` format (e.g., `20220301`).
  * **query_params** ([*dict*](https://docs.python.org/3/library/stdtypes.html#dict)) -- Parameters of the query.
  * **time_as_nsec** ([*bool*](https://docs.python.org/3/library/functions.html#bool)) -- If True, output timestamps have nanosecond granularity.
    If False, timestamps are truncated to microsecond granularity.
    Default is True.
  * **treat_byte_arrays_as_strings** ([*bool*](https://docs.python.org/3/library/functions.html#bool)) -- Outputs byte arrays as strings (defaults to True)
  * **output_matrix_per_field** ([*bool*](https://docs.python.org/3/library/functions.html#bool)) -- Changes output format to list of matrices per field.
    Not supported for WebAPI mode.
  * **output_structure** (*otp.Source.OutputStructure* *,* *optional*) -- 

    Structure (type) of the result. Supported values are:
    : - df (default) - the result is returned as [pandas.DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object
        or dictionary of symbol names and [pandas.DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) objects
        in case of using multiple symbols or first stage query.
      - map - the result is returned as SymbolNumpyResultMap.
      - list - the result is returned as list.
      - polars - the result is returned as
        [polars.DataFrame](https://docs.pola.rs/api/python/stable/reference/dataframe/index.html) object
        or dictionary of symbol names and dataframe objects
        (**Only supported in WebAPI mode**).
      - pandas - the result is returned as [pandas.DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html)
        (**Only supported in WebAPI mode**).

    The df output structure is the default for both standard and WebAPI modes.
    In this mode, onetick-py converts the numpy data structure returned from onetick.query to a pandas.DataFrame.

    The pandas output structure (available only in WebAPI mode) instead returns a pandas.DataFrame
    directly from OneTick.
  * **return_utc_times** ([*bool*](https://docs.python.org/3/library/functions.html#bool)) -- If True, return timestamps in UTC timezone. If False, return in local timezone.
    Not supported for WebAPI mode.
  * **connection** (`pyomd.Connection`) -- The connection to be used for discovering nested .otq files.
    Not supported for WebAPI mode.
  * **callback** ([`onetick.py.CallbackBase`](misc/callback.md#onetick.py.CallbackBase)) -- Class with callback methods.
    If set, the output of the query should be controlled with callbacks
    and this function returns nothing.
  * **svg_path** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *,* *optional*) -- Path to render graph in case `otq.API_CONFIG['RENDER_GRAPH_ON_ERROR']` is set.
  * **use_connection_pool** ([*bool*](https://docs.python.org/3/library/functions.html#bool)) -- Default is False. If set to True, the connection pool is used.
    Not supported for WebAPI mode.
  * **node_name** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *,* *List* *[*[*str*](https://docs.python.org/3/library/stdtypes.html#str) *]* *,* *optional*) -- Name of the output node to select the result from, for query graphs with several output nodes.
    If `node_name` is specified, the query must be given as a path on disk
    and `output_structure` must be `df`.
  * **require_dict** ([*bool*](https://docs.python.org/3/library/functions.html#bool)) -- If True, the result is always returned as a dictionary keyed by symbol name,
    even when only a single symbol is queried. Default is False.
  * **max_expected_ticks_per_symbol** ([*int*](https://docs.python.org/3/library/functions.html#int)) -- Expected maximum number of ticks per symbol (used for performance optimizations).
    By default, [`otp.config.max_expected_ticks_per_symbol`](config.md#onetick.py.configuration.Config.max_expected_ticks_per_symbol) is used.
    Not supported for WebAPI mode.
  * **log_symbol** ([*bool*](https://docs.python.org/3/library/functions.html#bool)) -- Log currently executed symbol.
    Note that this only works with unbound symbols.
    Also in this case [`otp.run`](#onetick.py.run) is executed in `callback` mode
    and no value is returned from the function, so it should be used only for debugging purposes.
    This logging will not work if another value is specified in the `callback` parameter.
    By default, [`otp.config.log_symbol`](config.md#onetick.py.configuration.Config.log_symbol) is used.
  * **encoding** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *,* *optional*) -- The encoding of string fields.
  * **manual_dataframe_callback** ([*bool*](https://docs.python.org/3/library/functions.html#bool)) -- Create dataframe manually with `callback` mode.
    Only works if `output_structure='df'` is specified and parameter `callback` is not.
    May improve performance in some cases.
  * **print_symbol_errors** ([*bool*](https://docs.python.org/3/library/functions.html#bool)) -- If True (default), symbol-level errors from OneTick are printed as Python warnings.
    Applicable only when `output_structure` is `'df'`.
    By default, [`otp.config.print_symbol_errors`](config.md#onetick.py.configuration.Config.print_symbol_errors)
    is used, which is True by default.
* **Returns:**
  result of the query
* **Return type:**
  result, list, dict, [pandas.DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), None

### Examples

Running [`onetick.py.Source`](source/root.md#onetick.py.Source) and setting start and end times:

```pycon
>>> data = otp.Tick(A=1)
>>> otp.run(data, start=otp.dt(2003, 12, 2), end=otp.dt(2003, 12, 4))
        Time  A
0 2003-12-02  1
```

Setting query interval with `date` parameter:

```pycon
>>> data = otp.Tick(A=1)
>>> data['START'] = data['_START_TIME']
>>> data['END'] = data['_END_TIME']
>>> otp.run(data, date=otp.dt(2003, 12, 1))
        Time  A      START        END
0 2003-12-01  1 2003-12-01 2003-12-02
```
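
The `START` and `END` columns above show that `date` expands to a full-day interval. As a minimal standard-library sketch of that mapping (the helper name `day_interval` is hypothetical and is not part of onetick.py):

```python
from datetime import date, datetime, time, timedelta
from typing import Tuple


def day_interval(day: date) -> Tuple[datetime, datetime]:
    """Return (start, end) covering 0:00 to 24:00 of `day` (hypothetical helper)."""
    start = datetime.combine(day, time.min)  # midnight at the start of the day
    end = start + timedelta(days=1)          # midnight of the following day
    return start, end


start, end = day_interval(date(2003, 12, 1))
print(start)  # 2003-12-01 00:00:00
print(end)    # 2003-12-02 00:00:00
```

The two printed values match the `START` and `END` values in the example output.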

Running otq.Ep and passing query parameters:

```pycon
>>> ep = otq.TickGenerator(bucket_interval=0, fields='long A = $X').tick_type('TT')
>>> otp.run(ep, symbols='LOCAL::', query_params={'X': 1})
        Time  A
0 2003-12-04  1
```

Running in callback mode:

```pycon
>>> class Callback(otp.CallbackBase):
...     def __init__(self):
...         self.result = None
...     def process_tick(self, tick, time):
...         self.result = tick
>>> data = otp.Tick(A=1)
>>> callback = Callback()
>>> otp.run(data, callback=callback)
>>> callback.result
{'A': 1}
```

Running with `apply_times_daily`.
Note that daily intervals are processed separately, so, for example,
we can't access the **COUNT** column from the previous day.

```pycon
>>> trd = otp.DataSource('US_COMP', symbols='AAPL', tick_type='TRD')  
>>> trd = trd.agg({'COUNT': otp.agg.count()},
...               bucket_interval=12 * 3600, bucket_time='start')  
>>> trd['PREV_COUNT'] = trd['COUNT'][-1]  
>>> otp.run(trd, apply_times_daily=True,
...         start=otp.dt(2023, 4, 3), end=otp.dt(2023, 4, 5), timezone='EST5EDT')  
                 Time   COUNT  PREV_COUNT
0 2023-04-03 00:00:00  328447           0
1 2023-04-03 12:00:00  240244      328447
2 2023-04-04 00:00:00  263293           0
3 2023-04-04 12:00:00  193018      263293
```
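
To picture what "one interval per day, using the time components of `start` and `end`" means, here is a plain-Python sketch. It is only an illustration under the assumption that the end time of day is later than the start time of day (for example a 9:30 to 16:00 session). It does not model the midnight-to-midnight case used in the example above, and the helper name `daily_intervals` is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterator, Tuple


def daily_intervals(start: datetime, end: datetime) -> Iterator[Tuple[datetime, datetime]]:
    """Yield one (day_start, day_end) pair per calendar day from start to end.

    Hypothetical helper: assumes end.time() is later than start.time().
    """
    day = start.date()
    while day <= end.date():
        yield (datetime.combine(day, start.time()),
               datetime.combine(day, end.time()))
        day += timedelta(days=1)


intervals = list(daily_intervals(datetime(2023, 4, 3, 9, 30),
                                 datetime(2023, 4, 5, 16, 0)))
for day_start, day_end in intervals:
    print(day_start, "->", day_end)
```

Each pair is an independent window, which is why state such as a previous bucket's value does not carry over from one day to the next.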

Using a function as a `query`, accessing symbol name and parameters:

```pycon
>>> def query(symbol):
...     t = otp.Tick(X='x')
...     t['SYMBOL_NAME'] = symbol.name
...     t['SYMBOL_PARAM'] = symbol.PARAM
...     return t
>>> symbols = otp.Ticks({'SYMBOL_NAME': ['A', 'B'], 'PARAM': [1, 2]})
>>> result = otp.run(query, symbols=symbols)
>>> result['A']
        Time  X SYMBOL_NAME  SYMBOL_PARAM
0 2003-12-01  x           A             1
>>> result['B']
        Time  X SYMBOL_NAME  SYMBOL_PARAM
0 2003-12-01  x           B             2
```

Debugging unbound symbols with `log_symbol` parameter:

```pycon
>>> data = otp.Tick(X=1)
>>> symbols = otp.Ticks({'SYMBOL_NAME': ['A', 'B'], 'PARAM': [1, 2]})
>>> otp.run(data, symbols=symbols, log_symbol=True)  
Running query <onetick.py.sources.ticks.Tick object at ...>
Processing symbol A
Processing symbol B
```

By default, some non-ASCII characters in string data may be decoded incorrectly:

```pycon
>>> data = ['AA測試AA']
>>> source = otp.Ticks({'A': data})
>>> otp.run(source)
        Time           A
0 2003-12-01  AAæ¸¬è©¦AA
```

To fix this, pass the `encoding` parameter to `otp.run`:

```python
data = ['AA測試AA']
source = otp.Ticks({'A': data})
df = otp.run(source, encoding="utf-8")
print(df)
```

```none
        Time        A
0 2003-12-01  AA測試AA
```
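
The garbled string in the earlier output is what appears when UTF-8 bytes are decoded as Latin-1. The standard-library sketch below reproduces it without OneTick. Whether OneTick performs this exact decoding internally is an assumption; the sketch only shows that the output is consistent with it.

```python
original = "AA測試AA"

# Encode the text as UTF-8 bytes, as they would be stored in a string field.
raw_bytes = original.encode("utf-8")

# Decoding those bytes with a single-byte codec produces mojibake.
garbled = raw_bytes.decode("latin-1")

# Decoding with the matching codec restores the text.
restored = raw_bytes.decode("utf-8")

print(garbled)   # AAæ¸¬è©¦AA
print(restored)  # AA測試AA
```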

Note that query `start` time is inclusive, but query `end` time is not,
meaning that ticks with timestamps equal to the query end time will not be included:

```pycon
>>> data = otp.Tick(A=1, bucket_interval=24*60*60)
>>> data['A'] = data['TIMESTAMP'].dt.day_of_month()
>>> otp.run(data, start=otp.dt(2003, 12, 1), end=otp.dt(2003, 12, 4))
        Time  A
0 2003-12-01  1
1 2003-12-02  2
2 2003-12-03  3
>>> otp.run(data, start=otp.dt(2003, 12, 1), end=otp.dt(2003, 12, 2))
        Time  A
0 2003-12-01  1
```

If you want to include such ticks, you can add one nanosecond to the query end time:

```pycon
>>> otp.run(data, start=otp.dt(2003, 12, 1), end=otp.dt(2003, 12, 2) + otp.Nano(1))
        Time  A
0 2003-12-01  1
1 2003-12-02  2
```
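
The inclusive-start, exclusive-end rule is a half-open interval check. A minimal sketch with integer nanosecond timestamps (the function name and the sample values are illustrative and do not come from onetick.py):

```python
DAY_NS = 24 * 60 * 60 * 10**9  # nanoseconds in one day


def in_query_interval(ts_ns: int, start_ns: int, end_ns: int) -> bool:
    """True if the timestamp falls in the half-open interval [start, end)."""
    return start_ns <= ts_ns < end_ns


# Two ticks: one exactly at the start, one exactly at the end of a one-day query.
ticks = [0, DAY_NS]
start_ns, end_ns = 0, DAY_NS

included = [t for t in ticks if in_query_interval(t, start_ns, end_ns)]
included_extended = [t for t in ticks if in_query_interval(t, start_ns, end_ns + 1)]

print(len(included))           # 1: the tick at the end time is excluded
print(len(included_extended))  # 2: extending the end by 1 ns includes it
```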

Using pandas.DataFrame as a symbol list:

```pycon
>>> symbols_df = pd.DataFrame({'SYMBOL_NAME': ['AAPL', 'MSFT'], 'SYMBOL_PARAM': ['a', 'b']})
>>> data = otp.Tick(A=1)
>>> data['SYMBOL_NAME'] = data.Symbol.name
>>> data['SYMBOL_PARAM'] = data.Symbol.get('SYMBOL_PARAM', otp.string[64])
>>> result = otp.run(data, symbols=symbols_df)
>>> result['AAPL']
        Time  A SYMBOL_NAME SYMBOL_PARAM
0 2003-12-01  1        AAPL            a
>>> result['MSFT']
        Time  A SYMBOL_NAME SYMBOL_PARAM
0 2003-12-01  1        MSFT            b
```

Setting `timezone` controls the output timestamp timezone.
When `start`/`end` are timezone-naive, it also defines their timezone:

```pycon
>>> data = otp.Tick(A=1)
>>> otp.run(data, start=otp.dt(2003, 12, 1), end=otp.dt(2003, 12, 2), timezone='EST5EDT')
        Time  A
0 2003-12-01  1
```
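
As a rough illustration of what "defines their timezone" means for a timezone-naive `start`, the sketch below attaches an offset to a naive datetime and converts it to UTC. It uses a fixed UTC-5 offset as a stand-in for `'EST5EDT'` in December, which is an assumption made to avoid depending on a time zone database. It does not call onetick.py.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-5 offset standing in for 'EST5EDT' during winter (assumption).
EST = timezone(timedelta(hours=-5), "EST")

naive_start = datetime(2003, 12, 1)            # no timezone attached
aware_start = naive_start.replace(tzinfo=EST)  # interpret the wall time as EST
as_utc = aware_start.astimezone(timezone.utc)  # the same instant in UTC

print(aware_start)  # 2003-12-01 00:00:00-05:00
print(as_utc)       # 2003-12-01 05:00:00+00:00
```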

Use `require_dict=True` to always get a dictionary result,
even when running a single symbol:

```pycon
>>> data = otp.Tick(A=1)
>>> result = otp.run(data, require_dict=True)
>>> type(result)
<class 'dict'>
```

Running for multiple symbols returns a dictionary keyed by symbol name:

```pycon
>>> data = otp.DataSource(db='SOME_DB', tick_type='TT')
>>> result = otp.run(data, symbols=['S1', 'S2'])
>>> result['S1']
                     Time  X
0 2003-12-01 00:00:00.000  1
1 2003-12-01 00:00:00.001  2
2 2003-12-01 00:00:00.002  3
>>> result['S2']
                     Time  X
0 2003-12-01 00:00:00.000 -3
1 2003-12-01 00:00:00.001 -2
2 2003-12-01 00:00:00.002 -1
```

Using a [`Source`](source/root.md#onetick.py.Source) as `symbols` creates a first-stage query
that dynamically generates the symbol list. The source must produce a `SYMBOL_NAME` column:

```python
# First-stage query: get symbols from a reference database
symbol_src = otp.DataSource('REF_DB', tick_type='SYMBOLS')
symbol_src = symbol_src[['SYMBOL_NAME']]

data = otp.DataSource('US_COMP', tick_type='TRD')
result = otp.run(data, symbols=symbol_src, date=otp.dt(2022, 3, 1))
# result is a dict keyed by symbol names from symbol_src
```

`output_structure` controls the format of the return value.
Use `'list'` to get raw results as a list of tuples:

```python
data = otp.DataSource('US_COMP', tick_type='TRD')
result = otp.run(data, symbols='AAPL', output_structure='list')
# result is [(symbol, ticks_data, error_data, node_name), ...]
```

Use `output_structure='map'` for a `SymbolNumpyResultMap` object:

```python
data = otp.DataSource('US_COMP', tick_type='TRD')
result = otp.run(data, symbols='AAPL', output_structure='map')
```

`running=True` marks the query as a CEP (Complex Event Processing) query
for real-time streaming:

```python
# CEP query for real-time data
data = otp.DataSource('US_COMP', tick_type='TRD')
result = otp.run(data, symbols='AAPL', running=True,
                 start=otp.dt(2023, 1, 1), end=otp.dt(2099, 1, 1))
```

`batch_size` and `concurrency` tune performance for multi-symbol queries:

```python
large_symbol_list = ['AAPL', 'MSFT', 'IBM']  # placeholder; in practice a much longer list
data = otp.DataSource('US_COMP', tick_type='TRD')
result = otp.run(data, symbols=large_symbol_list,
                 batch_size=50,    # process 50 symbols per batch
                 concurrency=4)    # use 4 CPU cores
```
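
Conceptually, `batch_size` groups the symbol list into chunks. The sketch below shows that grouping in plain Python; it is not how OneTick implements batching, and the helper name `batches` and the generated symbol names are hypothetical.

```python
from typing import Iterator, List, Sequence


def batches(symbols: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `batch_size` symbols."""
    for i in range(0, len(symbols), batch_size):
        yield list(symbols[i:i + batch_size])


symbols = [f"SYM{i}" for i in range(120)]  # made-up symbol names
sizes = [len(batch) for batch in batches(symbols, 50)]
print(sizes)  # [50, 50, 20]
```

With 120 symbols and a batch size of 50, the last batch is smaller than the rest.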

`symbol_date` specifies the date for resolving symbols
in date-dependent symbologies:

```python
data = otp.DataSource('US_COMP', tick_type='TRD')
result = otp.run(data, symbols=['AAPL', 'MSFT'],
                 symbol_date=otp.dt(2022, 3, 1),
                 date=otp.dt(2022, 3, 1))

# Also accepts integer YYYYMMDD format
result = otp.run(data, symbols=['AAPL'], symbol_date=20220301,
                 date=otp.dt(2022, 3, 1))
```
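
If you have a `datetime.date` and want the integer `YYYYMMDD` form, the conversion is simple arithmetic. A small sketch (the helper name `to_yyyymmdd` is hypothetical):

```python
from datetime import date


def to_yyyymmdd(day: date) -> int:
    """Convert a date to an integer in YYYYMMDD form."""
    return day.year * 10000 + day.month * 100 + day.day


print(to_yyyymmdd(date(2022, 3, 1)))  # 20220301
```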

`start_time_expression` and `end_time_expression` allow dynamic time boundaries
using OneTick expressions. They take precedence over `start`/`end`:

```python
data = otp.DataSource('US_COMP', tick_type='TRD')
result = otp.run(data, symbols='AAPL',
                 start_time_expression='20220301093000',
                 end_time_expression='20220301160000')
```
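
The literal strings in the example follow a `YYYYMMDDhhmmss` layout. As a sketch, such a string can be produced from a Python datetime with `strftime`. The helper name is hypothetical, and the format is inferred from the example above only; other forms of OneTick time expressions are not covered here.

```python
from datetime import datetime


def to_time_expression(ts: datetime) -> str:
    """Format a datetime as a YYYYMMDDhhmmss string (format inferred from the example)."""
    return ts.strftime("%Y%m%d%H%M%S")


start_expr = to_time_expression(datetime(2022, 3, 1, 9, 30))
end_expr = to_time_expression(datetime(2022, 3, 1, 16, 0))
print(start_expr)  # 20220301093000
print(end_expr)    # 20220301160000
```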

`query_properties` passes OneTick query properties as a dict:

```python
data = otp.DataSource('US_COMP', tick_type='TRD')
result = otp.run(data, symbols='AAPL',
                 query_properties={'ALLOW_GRAPH_REUSE': 'true'},
                 date=otp.dt(2022, 3, 1))
```

`node_name` selects the output from a specific node when running
an OTQ file with multiple output nodes:

```python
result = otp.run('path/to/multi_output.otq',
                 symbols='AAPL', node_name='OUTPUT_1',
                 date=otp.dt(2022, 3, 1))
```
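
When an OTQ file holds several queries, the `query` parameter description says the query to run is appended after `::`. A small sketch of building that string (the helper name `otq_query_path` is hypothetical and not part of onetick.py):

```python
from typing import Optional


def otq_query_path(otq_file: str, query_name: Optional[str] = None) -> str:
    """Return 'file.otq', or 'file.otq::query_name' when a query name is given."""
    if query_name is None:
        return otq_file
    return f"{otq_file}::{query_name}"


single = otq_query_path("path_to_file/otq_file.otq")
named = otq_query_path("path_to_file/otq_file.otq", "query_to_run")
print(single)  # path_to_file/otq_file.otq
print(named)   # path_to_file/otq_file.otq::query_to_run
```

The resulting string is what would be passed as the first argument to `otp.run`.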
