# otp.perf

## Public API functions and classes

### measure_perf(src_or_otq, summary_file=None, context=adaptive)

Run the **measure_perf.exe** tool on a .otq file or a [`onetick.py.Source`](source/root.md#onetick.py.Source) object.
The result is saved to the `summary_file`.
If it is not set, a temporary [`onetick.py.utils.temp.TmpFile`](misc/tmp_file.md#onetick.py.utils.temp.TmpFile) is generated and returned.

* **Parameters:**
  * **src_or_otq** ([`Source`](source/root.md#onetick.py.Source) or str) -- [`Source`](source/root.md#onetick.py.Source) object or path to an existing .otq file.
  * **summary_file** ([*str*](https://docs.python.org/3/library/stdtypes.html#str)) -- path to the resulting summary file.
    By default, a temporary file is used.
  * **context** ([*str*](https://docs.python.org/3/library/stdtypes.html#str)) -- context that will be used to run the query.
* **Return type:**
  A tuple with the path to the generated query file and the path to the summary file.

### Examples

```pycon
>>> t = otp.Tick(A=1)
>>> otq_file, summary_file = otp.perf.measure_perf(t)
>>> with open(summary_file) as f:  
...    print(f.read())
Running result of ...
...
index,EP_name,tag,...
...
```

### *class* PerformanceSummaryFile(summary_file)

Bases: [`object`](https://docs.python.org/3/library/functions.html#object)

Class to read and parse a `summary_file` generated by OneTick's measure_perf.exe.

Parsed result is accessible via public properties of the class.

* **Parameters:**
  **summary_file** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *|* [*PathLike*](https://docs.python.org/3/library/os.html#os.PathLike)) -- path to the summary file.

### Examples

```pycon
>>> t = otp.Tick(A=1)
>>> otq_file, summary_file = otp.perf.measure_perf(t)
>>> result = otp.perf.PerformanceSummaryFile(summary_file)
>>> print(result.ordinary_summary.dataframe)  
index      EP_name  tag ...
    0  PASSTHROUGH    0 ...
...
```

#### summary_file

path to the summary file

#### summary_text

the text of the summary file

#### ordinary_summary

[`Ordinary summary`](#onetick.py.perf.OrdinarySummary)

#### presort_summary

[`Presort summary`](#onetick.py.perf.PresortSummary)

#### cep_summary

[`CEP summary`](#onetick.py.perf.CEPSummary)

### *class* MeasurePerformance(src_or_otq, summary_file=None, context=adaptive)

Bases: [`PerformanceSummaryFile`](#onetick.py.perf.PerformanceSummaryFile)

Class to run OneTick's measure_perf.exe on the specified query and parse the result.

Additionally, debug information about the Python code location of event processor objects
may be added to the result if the `stack_info` configuration parameter is set.

Parsed result is accessible via public properties of the class.

* **Parameters:**
  * **src_or_otq** ([`Source`](source/root.md#onetick.py.Source) or str) -- [`Source`](source/root.md#onetick.py.Source) object or path to an existing .otq file.
  * **summary_file** ([*str*](https://docs.python.org/3/library/stdtypes.html#str)) -- path to the resulting summary file.
    By default, a temporary file is used.
  * **context** ([*str*](https://docs.python.org/3/library/stdtypes.html#str)) -- context that will be used to run the query.

### Examples

```pycon
>>> t = otp.Tick(A=1)
>>> result = otp.perf.MeasurePerformance(t)
>>> print(result.ordinary_summary.dataframe)  
index      EP_name  tag ...
    0  PASSTHROUGH    0 ...
...
```

#### summary_file

path to the summary file

#### summary_text

the text of the summary file

#### ordinary_summary

[`Ordinary summary`](#onetick.py.perf.OrdinarySummary)

#### presort_summary

[`Presort summary`](#onetick.py.perf.PresortSummary)

#### cep_summary

[`CEP summary`](#onetick.py.perf.CEPSummary)

## Ordinary summary objects

### *class* OrdinarySummary

Bases: `PerformanceSummary`

This is the first section in the summary file containing the largest portion of the summary for graph nodes.

* **Parameters:**
  **text** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *|* *None*)

#### text

text of the summary (csv format)

#### dataframe

pandas.DataFrame from the data of the summary

#### entries

list of corresponding entries objects

#### entries_dict

mapping of EP tags to corresponding entry objects
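
The parsing this class performs can be illustrated with a small, self-contained sketch.
The `summary_text` excerpt, its field subset, and the digit-only type coercion below are
hypothetical simplifications for illustration, not the library's actual implementation:

```python
import csv
import io

# Hypothetical excerpt of an ordinary summary section in CSV form
# (field names follow OrdinarySummaryEntry; values are made up).
summary_text = """index,EP_name,tag,running_time,processed_tick_events
0,PASSTHROUGH,0,12,1
1,TICK_GENERATOR,1,7,1
"""

# Parse each CSV line into a plain dict, mirroring what the summary
# objects expose as `entries` and `entries_dict`.
entries = [
    {k: int(v) if v.isdigit() else v for k, v in row.items()}
    for row in csv.DictReader(io.StringIO(summary_text))
]

# `entries_dict` maps EP tags to the corresponding entry.
entries_dict = {entry["tag"]: entry for entry in entries}

print(entries_dict[0]["EP_name"])  # PASSTHROUGH
```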

### *class* OrdinarySummaryEntry

Data class for each line of ordinary performance summary.

* **Parameters:**
  * **index** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **EP_name** ([*str*](https://docs.python.org/3/library/stdtypes.html#str))
  * **tag** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **running_time_with_children** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **running_time** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **processed_tick_events** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **processed_schema_events** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **processed_timer_events** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **max_accumulated_ticks_count** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **max_introduced_latency** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **ep_introduces_delay_flag** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **allocated_memory_with_children** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **allocated_memory** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **unreleased_memory_with_children** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **unreleased_memory** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **peak_allocated_memory** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **stack_info** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *|* *None*)
  * **traceback** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *|* *None*)

#### asdict()

Return entry as a dictionary of field names and their values.

* **Return type:**
  [dict](https://docs.python.org/3/library/stdtypes.html#dict)

#### *classmethod* field_names()

Get list of entries field names.

#### *classmethod* fields()

Get list of entries field objects.
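
The behavior of these helpers can be sketched with the standard `dataclasses` module,
which provides direct analogues. The `Entry` class below is a minimal hypothetical
stand-in carrying only a subset of the real entry fields:

```python
from dataclasses import dataclass, asdict, fields

# Minimal stand-in for a summary entry class (illustrative field subset).
@dataclass
class Entry:
    index: int
    EP_name: str
    tag: int

e = Entry(index=0, EP_name="PASSTHROUGH", tag=0)

# asdict() returns the entry as a field-name-to-value mapping.
print(asdict(e))  # {'index': 0, 'EP_name': 'PASSTHROUGH', 'tag': 0}

# field_names()/fields() analogues via the dataclasses module.
print([f.name for f in fields(Entry)])  # ['index', 'EP_name', 'tag']
```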

#### stack_info *: [str](https://docs.python.org/3/library/stdtypes.html#str) | [None](https://docs.python.org/3/library/constants.html#None)* *= None*

internal stack info number to identify debug information

#### traceback *: [str](https://docs.python.org/3/library/stdtypes.html#str) | [None](https://docs.python.org/3/library/constants.html#None)* *= None*

Python traceback string identifying the location of the Python code that created OneTick's EP

#### index *: [int](https://docs.python.org/3/library/functions.html#int)*

Sequential number of the EP in the report

#### EP_name *: [str](https://docs.python.org/3/library/stdtypes.html#str)*

Name of the EP

#### tag *: [int](https://docs.python.org/3/library/functions.html#int)*

EP full tag (scope will be added to the tag if there is any)

#### running_time_with_children *: [int](https://docs.python.org/3/library/functions.html#int)*

Time elapsed for EP execution with its child nodes in microseconds

#### running_time *: [int](https://docs.python.org/3/library/functions.html#int)*

Individual time elapsed for EP execution in microseconds

#### processed_tick_events *: [int](https://docs.python.org/3/library/functions.html#int)*

Number of ticks processed by the EP

#### processed_schema_events *: [int](https://docs.python.org/3/library/functions.html#int)*

Number of tick descriptors processed by the EP

#### processed_timer_events *: [int](https://docs.python.org/3/library/functions.html#int)*

Number of timer events processed by the EP

#### max_accumulated_ticks_count *: [int](https://docs.python.org/3/library/functions.html#int)*

Maximum number of ticks accumulated by the EP during query execution.
This field is calculated only for aggregations (for example, EPs with a sliding window or GROUP_BY).
For all other EPs, it has the value of 0.

#### max_introduced_latency *: [int](https://docs.python.org/3/library/functions.html#int)*

For continuous queries, each EP measures the latency in microseconds for all the ticks it has propagated.
The latency of a tick is considered to be the difference between
tick propagation host time and the timestamp of the tick.
The maximum value of this latency (calculated by the EP)
is reported by measure_perf.exe in the summary of that EP.

The latency is calculated neither for aggregations with BUCKET_TIME=BUCKET_START
(as ticks are propagated by overwritten timestamps that are equal to the bucket start) nor for their child EPs.
For such cases, the following max_introduced_latency
special values indicate the reason why the maximum introduced latency was not calculated:

* -3 indicates that the EP is the culprit for latency calculation interruption
* -2 indicates that the latency calculation for the EP is turned off because
  its source EP's max_introduced_latency is -3
* -1 indicates that the query is non-continuous
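
The special values above can be decoded when reading a summary. The helper below is a
hypothetical convenience, not part of the `otp.perf` API:

```python
# Hypothetical helper translating a max_introduced_latency value from a
# summary entry into a human-readable explanation.
SPECIAL_LATENCY_VALUES = {
    -3: "EP interrupted the latency calculation",
    -2: "latency calculation turned off: source EP's max_introduced_latency is -3",
    -1: "query is non-continuous",
}

def describe_latency(value: int) -> str:
    if value in SPECIAL_LATENCY_VALUES:
        return SPECIAL_LATENCY_VALUES[value]
    return f"maximum introduced latency: {value} microseconds"

print(describe_latency(-1))   # query is non-continuous
print(describe_latency(150))  # maximum introduced latency: 150 microseconds
```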

#### ep_introduces_delay_flag *: [int](https://docs.python.org/3/library/functions.html#int)*

There are EPs (like PRESORT, Aggregations, and others)
that are allowed to propagate received ticks with some delay.
This flag indicates if the EP introduces delay.

#### allocated_memory_with_children *: [int](https://docs.python.org/3/library/functions.html#int)*

The amount of memory allocated by EP and its child nodes.

#### allocated_memory *: [int](https://docs.python.org/3/library/functions.html#int)*

The amount of memory allocated by EP.

#### unreleased_memory_with_children *: [int](https://docs.python.org/3/library/functions.html#int)*

The amount of memory unreleased by the EP and its child nodes.
The usual cause of non-zero unreleased memory is the EP's and its child nodes' cached data.

#### unreleased_memory *: [int](https://docs.python.org/3/library/functions.html#int)*

The amount of memory unreleased by the EP. The usual cause of non-zero unreleased memory is the EP's cached data.

#### peak_allocated_memory *: [int](https://docs.python.org/3/library/functions.html#int)*

Peak memory utilization introduced by EP and its child nodes.

## Presort summary objects

### *class* PresortSummary

Bases: `PerformanceSummary`

In the PRESORT EPs summary section, **measure_perf.exe** provides a report for each source branch
of a PRESORT EP, showing the maximum number of ticks accumulated by PRESORT for that branch.

Please note that some PRESORT EP types, like the SYNCHRONIZE_TIME EP,
do not support performance measurement yet.

Each line of this section contains fields identifying the branch for which the report is printed
and a field containing the maximum number of ticks accumulated by PRESORT for that branch.

The location of a branch is determined by the source and sink EP names and tags.

* **Parameters:**
  **text** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *|* *None*)

#### text

text of the summary (csv format)

#### dataframe

pandas.DataFrame from the data of the summary

#### entries

list of corresponding entries objects

#### entries_dict

mapping of EP tags to corresponding entry objects

### *class* PresortSummaryEntry

Data class for each line of PRESORT performance summary.

* **Parameters:**
  * **index** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **presort_source_ep_name** ([*str*](https://docs.python.org/3/library/stdtypes.html#str))
  * **presort_sink_ep_name** ([*str*](https://docs.python.org/3/library/stdtypes.html#str))
  * **presort_source_ep_tag** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **presort_sink_ep_tag** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **max_accumulated_ticks_count** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **stack_info** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *|* *None*)
  * **traceback** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *|* *None*)

#### asdict()

Return entry as a dictionary of field names and their values.

* **Return type:**
  [dict](https://docs.python.org/3/library/stdtypes.html#dict)

#### *classmethod* field_names()

Get list of entries field names.

#### *classmethod* fields()

Get list of entries field objects.

#### stack_info *: [str](https://docs.python.org/3/library/stdtypes.html#str) | [None](https://docs.python.org/3/library/constants.html#None)* *= None*

internal stack info number to identify debug information

#### traceback *: [str](https://docs.python.org/3/library/stdtypes.html#str) | [None](https://docs.python.org/3/library/constants.html#None)* *= None*

Python traceback string identifying the location of the Python code that created OneTick's EP

#### index *: [int](https://docs.python.org/3/library/functions.html#int)*

Sequential number of the branch in PRESORT EPs summary section

#### presort_source_ep_name *: [str](https://docs.python.org/3/library/stdtypes.html#str)*

Source EP name of combined PRESORT EP source branch for which the summary was reported

#### presort_sink_ep_name *: [str](https://docs.python.org/3/library/stdtypes.html#str)*

Combined PRESORT EP name

#### presort_source_ep_tag *: [int](https://docs.python.org/3/library/functions.html#int)*

Source EP tag of combined PRESORT EP source branch for which the summary was reported

#### presort_sink_ep_tag *: [int](https://docs.python.org/3/library/functions.html#int)*

Combined PRESORT EP tag

#### max_accumulated_ticks_count *: [int](https://docs.python.org/3/library/functions.html#int)*

Maximum accumulated ticks count by PRESORT for the located branch.

## CEP summary objects

### *class* CEPSummary

Bases: `PerformanceSummary`

The last summary type produced by **measure_perf.exe**
is the latency summary for root EPs of the executed top-level query in CEP mode.

Each root EP in CEP mode measures tick arrival latency before processing a tick and propagating it to the sinks,
down the graph.

Note that for non-CEP mode this summary is not printed at all.

The summary in this section estimates the relationship between the following two variables
using statistical analysis metrics:

* dependent variable -- tick latency
* independent variable -- tick arrival time into the root node

Please note that these values are calculated across all ticks in all symbols processed by the query.

Calculated stats for ROOT EPs are printed once the query is finished and there are no more ticks left to arrive.

This summary contains the mean of the latencies, their standard deviation,
the average slope of the linear regression function (calculated by the least squares method),
and the average variance from the regression function, computed from the latencies of the ticks
passed through each root EP of a top-level query.

For each root node, one line is printed with the fields containing values for each of the above-mentioned metrics.
This summary should be enough to identify slow-consumer queries and to debug and optimize them.

* **Parameters:**
  **text** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *|* *None*)

#### text

text of the summary (csv format)

#### dataframe

pandas.DataFrame from the data of the summary

#### entries

list of corresponding entries objects

#### entries_dict

mapping of EP tags to corresponding entry objects
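
The metrics reported per root EP can be reproduced with a short self-contained sketch.
The arrival times and latencies below are made-up illustrative samples; the actual tool
computes these values internally across all ticks in all symbols:

```python
import statistics

# Hypothetical per-tick samples for one root EP: arrival times (x) and
# measured latencies (y), both in microseconds.
arrival_times = [0.0, 10.0, 20.0, 30.0, 40.0]
latencies = [5.0, 6.0, 8.0, 7.0, 9.0]

n = len(latencies)
mean = statistics.fmean(latencies)          # latencies_mean
stdev = statistics.pstdev(latencies)        # latencies_standard_deviation

# Least-squares slope of latency as a function of arrival time
# (latencies_average_slope).
x_mean = statistics.fmean(arrival_times)
slope = sum((x - x_mean) * (y - mean) for x, y in zip(arrival_times, latencies)) \
    / sum((x - x_mean) ** 2 for x in arrival_times)
intercept = mean - slope * x_mean

# Average variance of the latencies around the regression line
# (latencies_variance_from_regression_line).
variance_from_regression = sum(
    (y - (slope * x + intercept)) ** 2 for x, y in zip(arrival_times, latencies)
) / n

print(mean, stdev, slope, variance_from_regression)
```

A positive slope here would suggest latency growing over time, i.e. a slow consumer.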

### *class* CEPSummaryEntry

Data class for each line of CEP performance summary.

* **Parameters:**
  * **index** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **sink_ep_name** ([*str*](https://docs.python.org/3/library/stdtypes.html#str))
  * **sink_ep_tag** ([*int*](https://docs.python.org/3/library/functions.html#int))
  * **latencies_mean** ([*float*](https://docs.python.org/3/library/functions.html#float))
  * **latencies_standard_deviation** ([*float*](https://docs.python.org/3/library/functions.html#float))
  * **latencies_average_slope** ([*float*](https://docs.python.org/3/library/functions.html#float))
  * **latencies_variance_from_regression_line** ([*float*](https://docs.python.org/3/library/functions.html#float))
  * **stack_info** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *|* *None*)
  * **traceback** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *|* *None*)

#### asdict()

Return entry as a dictionary of field names and their values.

* **Return type:**
  [dict](https://docs.python.org/3/library/stdtypes.html#dict)

#### *classmethod* field_names()

Get list of entries field names.

#### *classmethod* fields()

Get list of entries field objects.

#### stack_info *: [str](https://docs.python.org/3/library/stdtypes.html#str) | [None](https://docs.python.org/3/library/constants.html#None)* *= None*

internal stack info number to identify debug information

#### traceback *: [str](https://docs.python.org/3/library/stdtypes.html#str) | [None](https://docs.python.org/3/library/constants.html#None)* *= None*

Python traceback string identifying the location of the Python code that created OneTick's EP

#### index *: [int](https://docs.python.org/3/library/functions.html#int)*

Sequential number of the root EP in the root EPs summary section

#### sink_ep_name *: [str](https://docs.python.org/3/library/stdtypes.html#str)*

Root EP name for which summary is provided

#### sink_ep_tag *: [int](https://docs.python.org/3/library/functions.html#int)*

Root EP tag for which summary is provided

#### latencies_mean *: [float](https://docs.python.org/3/library/functions.html#float)*

Mean of the latencies of all ticks passed through the node

#### latencies_standard_deviation *: [float](https://docs.python.org/3/library/functions.html#float)*

Standard deviation of the latencies of all ticks passed through the node

#### latencies_average_slope *: [float](https://docs.python.org/3/library/functions.html#float)*

Average slope of the linear regression function, found by the least squares method,
calculated from the latencies of all ticks passed through the root node.
As mentioned earlier, the regression function describes the relationship
between two variables: tick latency and tick arrival timestamp.

#### latencies_variance_from_regression_line *: [float](https://docs.python.org/3/library/functions.html#float)*

The average variance of tick latencies from the computed linear regression line.
