otp.inspection.DB#

class DB(name, context=utils.default)#

Bases: object

An object of available databases that the otp.databases() function returns. It helps to make initial analysis on the database level: available tick types, dates with data, symbols, tick schema, etc.

property access_info: dict#

Get access info for this database and current user.

All dates are returned in GMT timezone.

This function result is cached, cache will be cleared after each otp.databases call.

Examples

>>> some_db = otp.databases()['SOME_DB']
>>> some_db.access_info
{'DB_NAME': 'SOME_DB',
 'READ_ACCESS': 1,
 'WRITE_ACCESS': 1,
 'MIN_AGE_SET': 0,
 'MIN_AGE_MSEC': 0,
 'MAX_AGE_SET': 0,
 'MAX_AGE_MSEC': 0,
 'MIN_START_DATE_SET': 0,
 'MIN_START_DATE_MSEC': Timestamp('1970-01-01 00:00:00'),
 'MAX_END_DATE_SET': 0,
 'MAX_END_DATE_MSEC': Timestamp('1970-01-01 00:00:00'),
 'MIN_AGE_DB_DAYS': 0,
 'MIN_AGE_DB_DAYS_SET': 0,
 'MAX_AGE_DB_DAYS': 0,
 'MAX_AGE_DB_DAYS_SET': 0,
 'CEP_ACCESS': 0,
 'DESTROY_ACCESS': 0}

See also

ACCESS_INFO OneTick event processor

show_config(config_type='locator_entry')#

Shows the specified configuration for a database.

Parameters

config_type (str) –

If ‘locator_entry’ is specified, a string representing db’s locator entry along with VDB_FLAG (this flag equals 1 when the database is virtual and 0 otherwise) will be returned.

If ‘db_time_intervals’ is specified, then time intervals configured in the locator file will be propagated including additional information, such as LOCATION, ARCHIVE_DURATION, DAY_BOUNDARY_TZ, DAY_BOUNDARY_OFFSET, ALTERNATIVE_LOCATIONS, etc.

Return type

dict

Examples

>>> some_db = otp.databases()['SOME_DB']
>>> print(some_db.show_config()['LOCATOR_STRING'])  
<DB ARCHIVE_COMPRESSION_TYPE="NATIVE_PLUS_GZIP" ID="SOME_DB" SYMBOLOGY="BZX" TICK_TIMESTAMP_TYPE="NANOS" >
<LOCATIONS >
    <LOCATION ACCESS_METHOD="file" DAY_BOUNDARY_TZ="EST5EDT"
              END_TIME="21000101000000" LOCATION="..." START_TIME="20021230000000" />
</LOCATIONS>
<RAW_DATA />
</DB>
>>> some_db.show_config(config_type='db_time_intervals')  
{'START_DATE': 1041206400000, 'END_DATE': 4102444800000,
 'GROWABLE_ARCHIVE_FLAG': 0, 'ARCHIVE_DURATION': 0,
 'LOCATION': '...', 'DAY_BOUNDARY_TZ': 'EST5EDT', 'DAY_BOUNDARY_OFFSET': 0, 'ALTERNATIVE_LOCATIONS': ''}

See also

DB/SHOW_CONFIG OneTick event processor

property min_acl_start_date: Optional[datetime.date]#

Minimum start date set in ACL for current user. Returns None if not set.

property max_acl_end_date: Optional[datetime.date]#

Maximum end date set in ACL for current user. Returns None if not set.

dates(respect_acl=False, check_index_file=None)#

Returns list of dates in GMT timezone for which data is available.

Parameters
  • respect_acl (bool) – If True then only the dates that current user has access to will be returned

  • check_index_file (bool) – If True (default), then file index will be searched for to determine if a database is loaded for a date. This check may be expensive, in terms of time it takes, when the file resides on NFS or on object storage, such as S3. If this parameter is set to False, then only the database directory for a date will be searched. This will increase performance, but may also return the days that are configured but where there is actually no data.

Returns

Returns None when there is no data in the database

Return type

datetime.date or None

Examples

>>> some_db = otp.databases()['SOME_DB']
>>> some_db.dates()
[datetime.date(2003, 12, 1)]
property last_date#

The latest date on which db has data and the current user has access to.

Returns

Returns None when there is no data in the database

Return type

datetime.date or None

Examples

>>> some_db = otp.databases()['SOME_DB']
>>> some_db.last_date
datetime.date(2003, 12, 1)
tick_types(date=None, timezone=None)#

Returns list of tick types for the date.

Parameters
  • date (otp.dt, datetime.datetime, optional) – Date for the tick types look up. None means the last_date

  • timezone (str, optional) – Timezone for the look up. None means the default timezone.

Returns

List with string values of available tick types.

Return type

list

Examples

>>> nyse_taq_db = otp.databases()['NYSE_TAQ']
>>> nyse_taq_db.tick_types(date=otp.dt(2022, 3, 1))
['QTE', 'TRD']
schema(date=None, tick_type=None, timezone=None, check_index_file=None)#

Gets the schema of the database.

Parameters
  • date (otp.dt, datetime.datetime, optional) – Date for the schema. None means the last_date

  • tick_type (str, optional) – Specifies a tick type for schema. None means use the one available tick type, if there are multiple tick types then it raises the Exception. It uses the tick_types() method.

  • timezone (str, optional) – Allows to specify a timezone for searching tick types.

Returns

Dict where keys are field names and values are onetick.py types. It’s compatible with the onetick.py.Source.schema methods.

Return type

dict

Examples

>>> nyse_taq_db = otp.databases()['NYSE_TAQ']
>>> nyse_taq_db.schema(tick_type='TRD', date=otp.dt(2022, 3, 1))
{'PRICE': <class 'float'>, 'SIZE': <class 'int'>}
symbols(date=None, timezone=None, tick_type=None, pattern='.*')#

Finds a list of available symbols in the database

Parameters
  • date (otp.dt, datetime.datetime, optional) – Date for the symbols look up. None means the last_date

  • tick_type (str, optional) – Tick type for symbols. None means union across all tick types.

  • timezone (str, optional) – Timezone for the lookup. None means the default timezone.

  • pattern (str) – Regular expression to select symbols.

Return type

List[str]

Examples

>>> nyse_taq_db = otp.databases()['NYSE_TAQ']
>>> nyse_taq_db.symbols(date=otp.dt(2022, 3, 1), tick_type='TRD', pattern='^AAP.*')
['AAP', 'AAPL']
show_archive_stats(start=utils.adaptive, end=utils.adaptive, date=None, timezone='GMT')#

This method shows various stats about the queried symbol, as well as an archive as a whole for each day within the queried interval.

Accelerator databases are not supported. Memory databases will be ignored even within their life hours.

Archive stats returned:

  • COMPRESSION_TYPE - archive compression type. In older archives native compression flag is not stored, so for example for gzip compression this field may say “GZIP or NATIVE_PLUS_GZIP”. The meta_data_upgrader.exe tool can be used to determine and inject that information in such cases in order to get a more precise result in this field.

  • TIME_RANGE_VALIDITY - whether lowest and highest loaded timestamps (see below) are known. Like native compression flag, this information is missing in older archives and can be added using meta_data_upgrader.exe tool.

  • LOWEST_LOADED_DATETIME - the lowest loaded timestamp for the queried interval (across all symbols)

  • HIGHEST_LOADED_DATETIME - the highest loaded timestamp for the queried interval (across all symbols)

  • TOTAL_TICKS - the number of ticks for the queried interval (across all symbols). Also missing in older archives and can be added using meta_data_upgrader.exe. If not available, -1 will be returned.

  • SYMBOL_DATA_SIZE - the size of the symbol in archive in bytes. This information is also missing in older archives, however the other options, it cannot later be added. In such cases -1 will be returned.

  • TOTAL_SYMBOLS - the number of symbols for the queried interval

  • TOTAL_SIZE - archive size in bytes for the queried interval (including the garbage potentially accumulated during appends).

Note

Fields LOWEST_LOADED_DATETIME and HIGHEST_LOADED_DATETIME are returned in GMT timezone, so the default value of parameter timezone is GMT too.

Examples

Show stats for a particular date for a database SOME_DB:

>>> db = otp.databases()['SOME_DB']
>>> db.show_archive_stats(date=otp.dt(2003, 12, 1))  
                 Time  COMPRESSION_TYPE TIME_RANGE_VALIDITY LOWEST_LOADED_DATETIME HIGHEST_LOADED_DATETIME...
0 2003-12-01 05:00:00  NATIVE_PLUS_GZIP               VALID    2003-12-01 05:00:00 2003-12-01 05:00:00.002...

See also

SHOW_ARCHIVE_STATS OneTick event processor

Return type

pandas.core.frame.DataFrame

ref_data(ref_data_type, symbol_date=None, start=utils.adaptive, end=utils.adaptive, date=None, timezone='GMT')#

Shows reference data for the specified security and reference data type.

It can be used to view corporation actions, symbol name changes, primary exchange info and symbology mapping for a securities, as well as the list of symbologies, names of custom adjustment types for corporate actions present in a reference database as well as names of continuous contracts in database symbology.

Parameters
  • ref_data_type (str) –

    Type of reference data to be queried. Possible values are:

    • corp_actions

    • symbol_name_history

    • primary_exchange

    • symbol_calendar

    • symbol_currency

    • symbology_mapping

    • symbology_list

    • custom_adjustment_type_list

    • all_calendars

    • all_continuous_contract_names

  • symbol_date – This parameter must be specified for some reference data types to be queried.

Return type

pandas.core.frame.DataFrame

Examples

Show calendars for a database TRAIN_A_PRL_TRD in the given range:

>>> db = otp.databases()['TRAIN_A_PRL_TRD']  
>>> db.ref_data('all_calendars',  
...             start=otp.dt(2018, 2, 1),
...             end=otp.dt(2018, 2, 9),
...             symbol_date=otp.dt(2018, 2, 1))
                 Time        END_DATETIME CALENDAR_NAME SESSION_NAME SESSION_FLAGS DAY_PATTERN  START_HHMMSS          END_HHMMSS TIMEZONE  PRIORITY DESCRIPTION
0 2018-02-01 00:00:00 2018-02-06 23:59:59          FRED      Regular             R   0.0.12345         93000              160000  EST5EDT         0
1 2018-02-06 23:59:59 2018-02-07 23:59:59          FRED      Holiday             H   0.0.12345         93000              160000  EST5EDT         1
2 2018-02-07 23:59:59 2050-12-31 23:59:59          FRED      Regular             F   0.0.12345         93000              160000  EST5EDT         0

See also

REF_DATA OneTick event processor