otp.Source.write_parquet

Source.write_parquet(output_path, compression_type='snappy', num_tick_per_row_group=1000, partitioning_keys='', propagate_input_ticks=False, inplace=False)

Writes the input tick series to a Parquet data file.

The input must not contain a field named ‘time’, because the EP adds that field itself to the resulting file(s).

Parameters
  • output_path (str) – Path where the ticks are saved in Parquet format. In partitioned mode, this is the path to the root directory of the Parquet files. In non-partitioned mode, this is the path to the Parquet file itself.

  • compression_type (str) – Compression type for the Parquet files. Must be one of: gzip, lz4, none, snappy (default), zstd.

  • num_tick_per_row_group (int) – Number of rows per row group.

  • partitioning_keys (list, str) –

    List of fields (list or comma-separated string) to be used as keys for partitioning.

    Setting this parameter will switch this EP to partitioned mode.

    In non-partitioned mode, if the path points to a file that already exists, it will be overwritten. When partitioning is active:

    • The target directory must be empty

    • Key fields and their string values will be automatically URL-encoded to avoid conflicts with filesystem naming rules.

    Pseudo-fields ‘_SYMBOL_NAME’ and ‘_TICK_TYPE’ may be used as partitioning_keys and will be added to the schema automatically.

  • propagate_input_ticks (bool) – Controls whether the input ticks are propagated. If set to True, ticks will be propagated; by default (False) they are not.

  • inplace (bool) – Controls whether the operation is applied in place. If inplace=True, the method modifies the current object and returns nothing. Otherwise, it returns a new modified object.
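
The handling of partitioning_keys described above (a list or a comma-separated string, with key fields and values URL-encoded) can be pictured with the following minimal sketch. It is not the EP's implementation: the helper names normalize_keys and partition_dir are hypothetical, the Hive-style key=value directory layout is an assumption that this page does not confirm, and the exact encoding rules used by the EP may differ from urllib.parse.quote.

```python
from urllib.parse import quote


def normalize_keys(partitioning_keys):
    """Hypothetical helper: accept a list or a comma-separated string
    and return a list of field names (empty list = non-partitioned)."""
    if isinstance(partitioning_keys, str):
        return [k.strip() for k in partitioning_keys.split(",") if k.strip()]
    return list(partitioning_keys)


def partition_dir(root, keys, tick):
    """Hypothetical helper: build a directory path for one tick.

    Assumes a Hive-style key=value layout; every key and value is
    percent-encoded so characters such as '/' cannot break the path.
    """
    parts = [
        f"{quote(k, safe='')}={quote(str(tick[k]), safe='')}" for k in keys
    ]
    return "/".join([root.rstrip("/")] + parts)


keys = normalize_keys("_SYMBOL_NAME, EXCHANGE")
path = partition_dir(
    "/data/trades", keys, {"_SYMBOL_NAME": "BRK/B", "EXCHANGE": "N Y"}
)
print(path)  # /data/trades/_SYMBOL_NAME=BRK%2FB/EXCHANGE=N%20Y
```

The point of the encoding step is that a value such as BRK/B would otherwise be read as two nested directories. The default partitioning_keys='' normalizes to an empty list here, which corresponds to non-partitioned mode.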

Examples

Simple usage

>>> data = otp.Ticks(A=[1, 2, 3])
>>> data = data.write_parquet("/path/to/parquet/file")  
>>> df = otp.run(data)  
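
Partitioned usage

The following is a sketch of partitioned mode using only the parameters documented on this page. The function name write_partitioned, the sample field B, and the choice of zstd compression are illustrative assumptions. The code has not been run against a OneTick installation, and the directory passed as output_dir must be empty.

```python
def write_partitioned(output_dir):
    """Write three sample ticks to Parquet files partitioned by field B."""
    # Imported lazily so this sketch can be defined without OneTick installed.
    import onetick.py as otp

    data = otp.Ticks(A=[1, 2, 3], B=["x", "y", "x"])
    # Passing partitioning_keys switches the EP to partitioned mode,
    # so output_dir is treated as the root directory of the Parquet files.
    data = data.write_parquet(
        output_dir,
        partitioning_keys="B",
        compression_type="zstd",
        propagate_input_ticks=True,  # also pass the ticks through to the output
    )
    return otp.run(data)
```

Calling write_partitioned("/path/to/parquet/dir") in a configured OneTick environment would write the files and, because propagate_input_ticks=True, is expected to return the propagated ticks from otp.run.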

See also

WRITE_TO_PARQUET OneTick event processor