otp.inspection.DB

class DB(name, context=utils.default)

Bases: object

Represents one of the available databases returned by the otp.databases() function. It supports initial analysis at the database level: available tick types, dates with data, symbols, tick schema, etc.

property access_info: dict

Get access info for this database and current user.

All dates are returned in GMT timezone.

The result is cached; the cache is cleared after each otp.databases() call.

Examples

>>> some_db = otp.databases()['SOME_DB']
>>> some_db.access_info
{'DB_NAME': 'SOME_DB',
 'READ_ACCESS': 1,
 'WRITE_ACCESS': 1,
 'MIN_AGE_SET': 0,
 'MIN_AGE_MSEC': 0,
 'MAX_AGE_SET': 0,
 'MAX_AGE_MSEC': 0,
 'MIN_START_DATE_SET': 0,
 'MIN_START_DATE_MSEC': Timestamp('1970-01-01 00:00:00'),
 'MAX_END_DATE_SET': 0,
 'MAX_END_DATE_MSEC': Timestamp('1970-01-01 00:00:00'),
 'MIN_AGE_DB_DAYS': 0,
 'MIN_AGE_DB_DAYS_SET': 0,
 'MAX_AGE_DB_DAYS': 0,
 'MAX_AGE_DB_DAYS_SET': 0,
 'CEP_ACCESS': 0,
 'DESTROY_ACCESS': 0}
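
The flag fields in the example above appear to be 0/1 integers, with each *_SET key indicating whether the matching limit is in effect. The following is a minimal sketch of reading them under that assumption. The helper name summarize_access is hypothetical and not part of onetick.py, and the sample dict is a subset of the example output (date fields omitted).

```python
def summarize_access(info):
    """Summarize a dict shaped like DB.access_info.

    Assumption: READ_ACCESS / WRITE_ACCESS are 0/1 flags and every
    key ending in _SET says whether the matching limit is in effect.
    """
    return {
        "can_read": bool(info.get("READ_ACCESS", 0)),
        "can_write": bool(info.get("WRITE_ACCESS", 0)),
        # Names of the limits whose *_SET flag is non-zero.
        "active_limits": sorted(
            key[: -len("_SET")]
            for key, value in info.items()
            if key.endswith("_SET") and value
        ),
    }


# Values copied from the example above (date fields omitted).
sample = {
    "DB_NAME": "SOME_DB",
    "READ_ACCESS": 1,
    "WRITE_ACCESS": 1,
    "MIN_AGE_SET": 0,
    "MAX_AGE_SET": 0,
    "MIN_START_DATE_SET": 0,
    "MAX_END_DATE_SET": 0,
    "CEP_ACCESS": 0,
    "DESTROY_ACCESS": 0,
}
summary = summarize_access(sample)
print(summary)
```

With a live connection, the same helper could be called as summarize_access(some_db.access_info).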

See also

ACCESS_INFO OneTick event processor

show_config(config_type='locator_entry')

Shows the specified configuration for a database.

Parameters

config_type (str) –

If ‘locator_entry’ is specified, returns a string representing the database’s locator entry along with VDB_FLAG (this flag equals 1 when the database is virtual and 0 otherwise).

If ‘db_time_intervals’ is specified, returns the time intervals configured in the locator file together with additional information such as LOCATION, ARCHIVE_DURATION, DAY_BOUNDARY_TZ, DAY_BOUNDARY_OFFSET, ALTERNATIVE_LOCATIONS, etc.

Return type

dict

Examples

>>> some_db = otp.databases()['SOME_DB']
>>> print(some_db.show_config()['LOCATOR_STRING'])  
<DB ARCHIVE_COMPRESSION_TYPE="NATIVE_PLUS_GZIP" ID="SOME_DB" SYMBOLOGY="BZX" TICK_TIMESTAMP_TYPE="NANOS" >
<LOCATIONS >
    <LOCATION ACCESS_METHOD="file" DAY_BOUNDARY_TZ="EST5EDT"
              END_TIME="21000101000000" LOCATION="..." START_TIME="20021230000000" />
</LOCATIONS>
<RAW_DATA />
</DB>
>>> some_db.show_config(config_type='db_time_intervals')  
{'START_DATE': 1041206400000, 'END_DATE': 4102444800000,
 'GROWABLE_ARCHIVE_FLAG': 0, 'ARCHIVE_DURATION': 0,
 'LOCATION': '...', 'DAY_BOUNDARY_TZ': 'EST5EDT', 'DAY_BOUNDARY_OFFSET': 0, 'ALTERNATIVE_LOCATIONS': ''}
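
The START_DATE and END_DATE values in the example above look like epoch milliseconds. Read that way in UTC, they decode to the same digits as START_TIME and END_TIME in the locator entry. A minimal sketch of the conversion, assuming that interpretation (the helper name ms_to_utc is hypothetical):

```python
from datetime import datetime, timezone


def ms_to_utc(ms):
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


# Values copied from the 'db_time_intervals' example above.
intervals = {"START_DATE": 1041206400000, "END_DATE": 4102444800000}
start = ms_to_utc(intervals["START_DATE"])
end = ms_to_utc(intervals["END_DATE"])
print(start.isoformat())  # 2002-12-30T00:00:00+00:00
print(end.isoformat())    # 2100-01-01T00:00:00+00:00
```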

See also

DB/SHOW_CONFIG OneTick event processor

property min_acl_start_date: Optional[datetime.date]

Minimum start date set in ACL for current user. Returns None if not set.

property max_acl_end_date: Optional[datetime.date]

Maximum end date set in ACL for current user. Returns None if not set.

dates(respect_acl=False)

Returns the list of dates (in the GMT timezone) for which data is available.

Returns

Returns None when there is no data in the database.

Return type

list of datetime.date or None

Examples

>>> some_db = otp.databases()['SOME_DB']
>>> some_db.dates()
[datetime.date(2003, 12, 1)]
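
Because dates() can return None, callers usually need a guard before indexing or iterating the result. A minimal sketch of such a guard (the helper name date_span is hypothetical and not part of onetick.py):

```python
import datetime


def date_span(dates):
    """Return (first, last) from a list of dates, or None when there is no data."""
    if not dates:  # covers both None and an empty list
        return None
    return min(dates), max(dates)


# Value copied from the example above.
example = [datetime.date(2003, 12, 1)]
span = date_span(example)
print(span)
print(date_span(None))
```

With a live connection this could be called as date_span(some_db.dates()).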

property last_date

The latest date on which the database has data that the current user can access.

Returns

Returns None when there is no data in the database.

Return type

datetime.date or None

Examples

>>> some_db = otp.databases()['SOME_DB']
>>> some_db.last_date
datetime.date(2003, 12, 1)

tick_types(date=None, timezone=None)

Returns list of tick types for the date.

Parameters
  • date (otp.dt, datetime.datetime, optional) – Date for the tick type lookup. None means last_date.

  • timezone (str, optional) – Timezone for the lookup. None means the default timezone.

Returns

List with string values of available tick types.

Return type

list

Examples

>>> nyse_taq_db = otp.databases()['NYSE_TAQ']
>>> nyse_taq_db.tick_types(date=otp.dt(2022, 3, 1))
['QTE', 'TRD']

schema(date=None, tick_type=None, timezone=None)

Gets the schema of the database.

Parameters
  • date (otp.dt, datetime.datetime, optional) – Date for the schema. None means last_date.

  • tick_type (str, optional) – Tick type for the schema. None means use the single available tick type; if there are multiple tick types, an exception is raised. The tick_types() method is used to find them.

  • timezone (str, optional) – Timezone used when searching for tick types.

Returns

Dict where keys are field names and values are onetick.py types. It’s compatible with the onetick.py.Source.schema methods.

Return type

dict

Examples

>>> nyse_taq_db = otp.databases()['NYSE_TAQ']
>>> nyse_taq_db.schema(tick_type='TRD', date=otp.dt(2022, 3, 1))
{'PRICE': <class 'float'>, 'SIZE': <class 'int'>}
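
Since schema() raises when tick_type is omitted and several tick types exist, one option is to list the tick types first and request each schema explicitly. A minimal sketch of that loop follows. The helper schemas_by_tick_type and the FakeDB stand-in are hypothetical, and the QTE fields in FakeDB are made up so the sketch runs without a OneTick server.

```python
def schemas_by_tick_type(db, date=None, timezone=None):
    """Map each tick type to its schema for one date.

    `db` is expected to expose tick_types() and schema() with the
    signatures documented above. Passing tick_type explicitly avoids
    the exception raised when several tick types exist.
    """
    return {
        tick_type: db.schema(date=date, tick_type=tick_type, timezone=timezone)
        for tick_type in db.tick_types(date=date, timezone=timezone)
    }


class FakeDB:
    """Hypothetical stand-in for otp.inspection.DB (illustration only)."""

    _schemas = {
        "QTE": {"ASK_PRICE": float, "BID_PRICE": float},  # made-up fields
        "TRD": {"PRICE": float, "SIZE": int},
    }

    def tick_types(self, date=None, timezone=None):
        return sorted(self._schemas)

    def schema(self, date=None, tick_type=None, timezone=None):
        return self._schemas[tick_type]


result = schemas_by_tick_type(FakeDB())
print(result["TRD"])
```

With a live connection, FakeDB() would be replaced by an object such as otp.databases()['NYSE_TAQ'].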

symbols(date=None, timezone=None, tick_type=None, pattern='.*')

Finds a list of available symbols in the database.

Parameters
  • date (otp.dt, datetime.datetime, optional) – Date for the symbol lookup. None means last_date.

  • tick_type (str, optional) – Tick type for symbols. None means union across all tick types.

  • timezone (str, optional) – Timezone for the lookup. None means the default timezone.

  • pattern (str) – Regular expression to select symbols.

Return type

List[str]

Examples

>>> nyse_taq_db = otp.databases()['NYSE_TAQ']
>>> nyse_taq_db.symbols(date=otp.dt(2022, 3, 1), tick_type='TRD', pattern='^AAP.*')
['AAP', 'AAPL']
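
When the prefix comes from user input, it may contain characters that are special in regular expressions (for example the dot in a symbol like BRK.B). A minimal sketch of building an anchored pattern is shown below. The helper name prefix_pattern is hypothetical, and it is an assumption that the regular-expression dialect used by symbols() treats these escapes the same way as Python's re module. The local check uses Python's re only.

```python
import re


def prefix_pattern(prefix):
    """Build an anchored regular expression for symbols starting with `prefix`."""
    return "^" + re.escape(prefix) + ".*"


pattern = prefix_pattern("AAP")
print(pattern)

# Local sanity check with Python's own regex engine.
candidates = ["AAP", "AAPL", "MSFT"]
matches = [s for s in candidates if re.match(pattern, s)]
print(matches)
```

The resulting string could then be passed as the pattern argument, as in nyse_taq_db.symbols(pattern=pattern).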

show_archive_stats(start=utils.adaptive, end=utils.adaptive, date=None, timezone='GMT')

Shows various stats about the queried symbol, as well as about the archive as a whole, for each day within the queried interval.

Accelerator databases are not supported. Memory databases will be ignored even within their life hours.

Archive stats returned:

  • COMPRESSION_TYPE - archive compression type. Older archives do not store the native compression flag, so for gzip compression, for example, this field may say “GZIP or NATIVE_PLUS_GZIP”. In such cases the meta_data_upgrader.exe tool can be used to determine and inject that information, which gives a more precise result in this field.

  • TIME_RANGE_VALIDITY - whether the lowest and highest loaded timestamps (see below) are known. Like the native compression flag, this information is missing in older archives and can be added using the meta_data_upgrader.exe tool.

  • LOWEST_LOADED_DATETIME - the lowest loaded timestamp for the queried interval (across all symbols)

  • HIGHEST_LOADED_DATETIME - the highest loaded timestamp for the queried interval (across all symbols)

  • TOTAL_TICKS - the number of ticks for the queried interval (across all symbols). This is also missing in older archives and can be added using meta_data_upgrader.exe. If not available, -1 is returned.

  • SYMBOL_DATA_SIZE - the size of the symbol in the archive, in bytes. This information is also missing in older archives; however, unlike the other fields, it cannot be added later. In such cases -1 is returned.

  • TOTAL_SYMBOLS - the number of symbols for the queried interval

  • TOTAL_SIZE - archive size in bytes for the queried interval (including the garbage potentially accumulated during appends).

Note

Fields LOWEST_LOADED_DATETIME and HIGHEST_LOADED_DATETIME are returned in GMT timezone, so the default value of parameter timezone is GMT too.

Examples

Show stats for a particular date for a database SOME_DB:

>>> db = otp.databases()['SOME_DB']
>>> db.show_archive_stats(date=otp.dt(2003, 12, 1))  
                 Time  COMPRESSION_TYPE TIME_RANGE_VALIDITY LOWEST_LOADED_DATETIME HIGHEST_LOADED_DATETIME...
0 2003-12-01 05:00:00  NATIVE_PLUS_GZIP               VALID    2003-12-01 05:00:00 2003-12-01 05:00:00.002...
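
TOTAL_TICKS and SYMBOL_DATA_SIZE use -1 to mean "not available", which can distort sums or averages if left in place. A minimal sketch of masking that marker follows. It works on plain row dicts, such as those produced by pandas DataFrame.to_dict('records'), so that it runs without a OneTick server. The helper name mask_unavailable is hypothetical and the sample rows are made up.

```python
SENTINEL_FIELDS = ("TOTAL_TICKS", "SYMBOL_DATA_SIZE")


def mask_unavailable(rows):
    """Replace the -1 'not available' marker with None in archive-stats rows."""
    cleaned = []
    for row in rows:
        row = dict(row)  # copy so the caller's data is not modified
        for field in SENTINEL_FIELDS:
            if row.get(field) == -1:
                row[field] = None
        cleaned.append(row)
    return cleaned


# Made-up rows for illustration only.
rows = [
    {"TOTAL_TICKS": 3, "SYMBOL_DATA_SIZE": -1},
    {"TOTAL_TICKS": -1, "SYMBOL_DATA_SIZE": 128},
]
cleaned_rows = mask_unavailable(rows)
print(cleaned_rows)
```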

See also

SHOW_ARCHIVE_STATS OneTick event processor

Return type

pandas.core.frame.DataFrame