# otp.inspection.DB

### *class* DB(name, description='', context=utils.default)

Bases: [`object`](https://docs.python.org/3/library/functions.html#object)

An object of available databases that the [`otp.databases()`](databases.md#onetick.py.databases) function returns.
It helps to make initial analysis on the database level: available tick types,
dates with data, symbols, tick schema, etc.

#### access_info(deep_scan=False, username=None)

Get access info for this database and `username`.

All dates are returned in GMT timezone.

* **Parameters:**
  * **deep_scan** -- If False (default) then the access fields are returned from the configuration of the database
    (basically the same fields as specified in the locator) and the dictionary is returned.
    If True then access fields are returned for each available remote host and time interval
    and the [pandas.DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) object is returned.
  * **username** -- Can be used to specify the user for which the query will be executed.
    By default the query is executed for the current user.
* **Return type:**
  *DataFrame* | [dict](https://docs.python.org/3/library/stdtypes.html#dict)

### Examples

By default access fields from the basic configuration of the database are returned:

```pycon
>>> some_db = otp.databases()['SOME_DB']
>>> some_db.access_info()  
{'DB_NAME': 'SOME_DB',
 'READ_ACCESS': 1,
 'WRITE_ACCESS': 1,
 'MIN_AGE_SET': 0,
 'MIN_AGE_MSEC': 0,
 'MAX_AGE_SET': 0,
 'MAX_AGE_MSEC': 0,
 'MIN_START_DATE_SET': 0,
 'MIN_START_DATE_MSEC': Timestamp('1970-01-01 00:00:00'),
 'MAX_END_DATE_SET': 0,
 'MAX_END_DATE_MSEC': Timestamp('1970-01-01 00:00:00'),
 'MIN_AGE_DB_DAYS': 0,
 'MIN_AGE_DB_DAYS_SET': 0,
 'MAX_AGE_DB_DAYS': 0,
 'MAX_AGE_DB_DAYS_SET': 0,
 'CEP_ACCESS': 1,
 'DESTROY_ACCESS': 0,
 'MEMDB_ACCESS': 1}
```

Set parameter `deep_scan` to True to return access fields from each available host and time interval:

```pycon
>>> some_db.access_info(deep_scan=True)  
   DB_NAME  READ_ACCESS  WRITE_ACCESS  MIN_AGE_SET  MIN_AGE_MSEC  MAX_AGE_SET  MAX_AGE_MSEC              MIN_START_DATE_SET MIN_START_DATE_MSEC  MAX_END_DATE_SET MAX_END_DATE_MSEC  MIN_AGE_DB_DAYS              MIN_AGE_DB_DAYS_SET  MAX_AGE_DB_DAYS  MAX_AGE_DB_DAYS_SET  CEP_ACCESS  DESTROY_ACCESS  MEMDB_ACCESS                SERVER_ADDRESS INTERVAL_START INTERVAL_END
0  SOME_DB            1             1            0             0            0             0                               0          1970-01-01                 0        1970-01-01                0                                0                0                    0           1               0             1                           ...     2002-12-30   2100-01-01
```

#### SEE ALSO
**ACCESS_INFO** OneTick event processor

#### show_config(config_type='locator_entry')

Shows the specified configuration for a database.

* **Parameters:**
  **config_type** ([*str*](https://docs.python.org/3/library/stdtypes.html#str)) -- 

  If **'locator_entry'** is specified, a string representing db's locator entry along with VDB_FLAG
  (this flag equals 1 when the database is virtual and 0 otherwise) will be returned.

  If **'db_time_intervals'** is specified,
  then time intervals configured in the locator file will be propagated
  including additional information, such as
  LOCATION, ARCHIVE_DURATION, DAY_BOUNDARY_TZ, DAY_BOUNDARY_OFFSET, ALTERNATIVE_LOCATIONS, etc.
* **Return type:**
  [dict](https://docs.python.org/3/library/stdtypes.html#dict)

### Examples

```python
some_db = otp.databases()['SOME_DB']
print(some_db.show_config()['LOCATOR_STRING'])
```

```none
<DB ARCHIVE_COMPRESSION_TYPE="NATIVE_PLUS_ZSTD" ID="SOME_DB" SYMBOLOGY="BZX" TICK_TIMESTAMP_TYPE="NANOS" >
<LOCATIONS >
    <LOCATION ACCESS_METHOD="file" DAY_BOUNDARY_TZ="EST5EDT"
              END_TIME="21000101000000" LOCATION="..." START_TIME="20021230000000" />
</LOCATIONS>
<RAW_DATA />
</DB>
```

```pycon
>>> some_db = otp.databases()['SOME_DB']
>>> some_db.show_config(config_type='db_time_intervals')  
{'START_DATE': 1041206400000, 'END_DATE': 4102444800000,
 'GROWABLE_ARCHIVE_FLAG': 0, 'ARCHIVE_DURATION': 0,
 'LOCATION': '...', 'DAY_BOUNDARY_TZ': 'EST5EDT', 'DAY_BOUNDARY_OFFSET': 0, 'ALTERNATIVE_LOCATIONS': ''}
```

#### SEE ALSO
**DB/SHOW_CONFIG** OneTick event processor

#### *property* min_acl_start_date *: [date](https://docs.python.org/3/library/datetime.html#datetime.date) | [None](https://docs.python.org/3/library/constants.html#None)*

Minimum start date set in ACL for current user.
Returns None if not set.

#### *property* max_acl_end_date *: [date](https://docs.python.org/3/library/datetime.html#datetime.date) | [None](https://docs.python.org/3/library/constants.html#None)*

Maximum end date set in ACL for current user.
Returns None if not set.

#### dates(respect_acl=False, check_index_file=utils.adaptive)

Returns list of dates in GMT timezone for which data is available.

* **Parameters:**
  * **respect_acl** ([*bool*](https://docs.python.org/3/library/functions.html#bool)) -- If True then only the dates that current user has access to will be returned
  * **check_index_file** ([*bool*](https://docs.python.org/3/library/functions.html#bool)) -- If True, then file *index* will be searched for to determine if a database is loaded for a date.
    This check may be expensive, in terms of time it takes,
    when the file resides on NFS or on object storage, such as S3.
    If this parameter is set to False, then only the database directory for a date will be searched.
    This will increase performance, but may also return the days that are configured
    but where there is actually no data.
    By default this option is set to False if it is supported by API and the server,
    otherwise it is set to True.
* **Returns:**
  Returns `None` when there is no data in the database
* **Return type:**
  `datetime.date` or `None`

### Examples

```pycon
>>> some_db = otp.databases()['SOME_DB']
>>> some_db.dates()
[datetime.date(2003, 12, 1)]
```

#### *property* last_date

The latest date on which db has data and the current user has access to.

* **Returns:**
  Returns `None` when there is no data in the database
* **Return type:**
  `datetime.date` or `None`

### Examples

```pycon
>>> some_db = otp.databases()['SOME_DB']
>>> some_db.last_date
datetime.date(2003, 12, 1)
```

#### tick_types(date=None, timezone=None)

Returns list of tick types for the `date`.

* **Parameters:**
  * **date** ([`otp.dt`](../datetime/dt.md#onetick.py.datetime), [`datetime.datetime`](https://docs.python.org/3/library/datetime.html#datetime.datetime), optional) -- Date for the tick types look up. `None` means the [`last_date`](#onetick.py.db._inspection.DB.last_date)
  * **timezone** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *,* *optional*) -- Timezone for the look up. `None` means the default timezone.
* **Returns:**
  List with string values of available tick types.
* **Return type:**
  [list](https://docs.python.org/3/library/stdtypes.html#list)

### Examples

```pycon
>>> us_comp_db = otp.databases()['US_COMP']
>>> us_comp_db.tick_types(date=otp.dt(2022, 3, 1))
['QTE', 'TRD']
```

#### schema(date=None, tick_type=None, timezone=None, check_index_file=utils.adaptive)

Gets the schema of the database.

* **Parameters:**
  * **date** ([`otp.dt`](../datetime/dt.md#onetick.py.datetime), [`datetime.datetime`](https://docs.python.org/3/library/datetime.html#datetime.datetime), optional) -- Date for the schema. `None` means the [`last_date`](#onetick.py.db._inspection.DB.last_date)
  * **tick_type** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *,* *optional*) -- Specifies a tick type for schema. `None` means use the one available
    tick type, if there are multiple tick types then it raises the `Exception`.
    It uses the [`tick_types()`](#onetick.py.db._inspection.DB.tick_types) method.
  * **timezone** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *,* *optional*) -- Allows to specify a timezone for searching tick types.
  * **check_index_file** ([*bool*](https://docs.python.org/3/library/functions.html#bool)) -- If True, then file *index* will be searched for to determine if a database is loaded for a date.
    This check may be expensive, in terms of time it takes,
    when the file resides on NFS or on object storage, such as S3.
    If this parameter is set to False, then only the database directory for a date will be searched.
    This will increase performance, but may also return the days that are configured
    but where there is actually no data.
    By default this option is set to False if it is supported by API and the server,
    otherwise it is set to True.
* **Returns:**
  Dict where keys are field names and values are `onetick.py` [types](../../static/concepts/schema.md#schema-concept).
  It's compatible with the [`onetick.py.Source.schema`](../source/schema.md#onetick.py.Source.schema) methods.
* **Return type:**
  [dict](https://docs.python.org/3/library/stdtypes.html#dict)

### Examples

```pycon
>>> us_comp_db = otp.databases()['US_COMP']
>>> us_comp_db.schema(tick_type='TRD', date=otp.dt(2022, 3, 1))
{'PRICE': <class 'float'>, 'SIZE': <class 'int'>}
```

#### symbols(date=None, timezone=None, tick_type=None, pattern='.\*')

Finds a list of available symbols in the database

* **Parameters:**
  * **date** ([`otp.dt`](../datetime/dt.md#onetick.py.datetime), [`datetime.datetime`](https://docs.python.org/3/library/datetime.html#datetime.datetime), optional) -- Date for the symbols look up. `None` means the [`last_date`](#onetick.py.db._inspection.DB.last_date)
  * **tick_type** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *,* *optional*) -- Tick type for symbols. `None` means union across all tick types.
  * **timezone** ([*str*](https://docs.python.org/3/library/stdtypes.html#str) *,* *optional*) -- Timezone for the lookup. `None` means the default timezone.
  * **pattern** ([*str*](https://docs.python.org/3/library/stdtypes.html#str)) -- Regular expression to select symbols.
* **Return type:**
  [list](https://docs.python.org/3/library/stdtypes.html#list)[[str](https://docs.python.org/3/library/stdtypes.html#str)]

### Examples

```pycon
>>> us_comp_db = otp.databases()['US_COMP']
>>> us_comp_db.symbols(date=otp.dt(2022, 3, 1), tick_type='TRD', pattern='^AAP.*')
['AAP', 'AAPL']
```

#### show_archive_stats(start=utils.adaptive, end=utils.adaptive, date=None, timezone='GMT')

This method shows various stats about the queried symbol,
as well as an archive as a whole for each day within the queried interval.

Accelerator databases are not supported.
Memory databases will be ignored even within their life hours.

Archive stats returned:

> * COMPRESSION_TYPE - archive compression type.
>   In older archives native compression flag is not stored,
>   so for example for gzip compression this field may say "GZIP or NATIVE_PLUS_GZIP".
>   The meta_data_upgrader.exe tool can be used to determine and inject that information in such cases
>   in order to get a more precise result in this field.
> * TIME_RANGE_VALIDITY - whether lowest and highest loaded timestamps (see below) are known.
>   Like native compression flag, this information is missing in older archives
>   and can be added using meta_data_upgrader.exe tool.
> * LOWEST_LOADED_DATETIME - the lowest loaded timestamp for the queried interval (across all symbols)
> * HIGHEST_LOADED_DATETIME - the highest loaded timestamp for the queried interval (across all symbols)
> * TOTAL_TICKS - the number of ticks for the queried interval (across all symbols).
>   Also missing in older archives and can be added using meta_data_upgrader.exe.
>   If not available, -1 will be returned.
> * SYMBOL_DATA_SIZE - the size of the symbol in archive in bytes.
>   This information is also missing in older archives, however the other options, it cannot later be added.
>   In such cases -1 will be returned.
> * TOTAL_SYMBOLS - the number of symbols for the queried interval
> * TOTAL_SIZE - archive size in bytes for the queried interval
>   (including the garbage potentially accumulated during appends).

#### NOTE
Fields **LOWEST_LOADED_DATETIME** and **HIGHEST_LOADED_DATETIME** are returned in GMT timezone,
so the default value of parameter `timezone` is GMT too.

### Examples

Show stats for a particular date for a database SOME_DB:

```python
db = otp.databases()['SOME_DB']
stats = db.show_archive_stats(date=otp.dt(2003, 12, 1))
print(stats)
```

```none
                 Time  COMPRESSION_TYPE TIME_RANGE_VALIDITY LOWEST_LOADED_DATETIME HIGHEST_LOADED_DATETIME...
0 2003-12-01 05:00:00  NATIVE_PLUS_ZSTD               VALID    2003-12-01 05:00:00 2003-12-01 05:00:00.002...
```

#### SEE ALSO
**SHOW_ARCHIVE_STATS** OneTick event processor

* **Return type:**
  *DataFrame*

#### ref_data(ref_data_type, symbol_date=None, start=utils.adaptive, end=utils.adaptive, date=None, timezone='GMT', symbol='')

Shows reference data for the specified security and reference data type.

It can be used to view corporation actions,
symbol name changes,
primary exchange info and symbology mapping for a securities,
as well as the list of symbologies,
names of custom adjustment types for corporate actions present in a reference database
as well as names of continuous contracts in database symbology.

* **Parameters:**
  * **ref_data_type** ([*str*](https://docs.python.org/3/library/stdtypes.html#str)) -- 

    Type of reference data to be queried. Possible values are:
    > * corp_actions
    > * symbol_name_history
    > * primary_exchange
    > * symbol_calendar
    > * symbol_currency
    > * symbology_mapping
    > * symbology_list
    > * custom_adjustment_type_list
    > * all_calendars
    > * all_continuous_contract_names
  * **symbol_date** -- This parameter must be specified for some reference data types to be queried.
  * **symbol** ([*str*](https://docs.python.org/3/library/stdtypes.html#str)) -- Symbol name for the query (may be useful for some `ref_data_type`).
* **Return type:**
  *DataFrame*

### Examples

Show calendars for a database TRAIN_A_PRL_TRD in the given range:

```pycon
>>> db = otp.databases()['TRAIN_A_PRL_TRD']  
>>> db.ref_data('all_calendars',  
...             start=otp.dt(2018, 2, 1),
...             end=otp.dt(2018, 2, 9),
...             symbol_date=otp.dt(2018, 2, 1))
                 Time        END_DATETIME CALENDAR_NAME SESSION_NAME SESSION_FLAGS DAY_PATTERN  START_HHMMSS          END_HHMMSS TIMEZONE  PRIORITY DESCRIPTION
0 2018-02-01 00:00:00 2018-02-06 23:59:59          FRED      Regular             R   0.0.12345         93000              160000  EST5EDT         0
1 2018-02-06 23:59:59 2018-02-07 23:59:59          FRED      Holiday             H   0.0.12345         93000              160000  EST5EDT         1
2 2018-02-07 23:59:59 2050-12-31 23:59:59          FRED      Regular             F   0.0.12345         93000              160000  EST5EDT         0
```

Set symbol name with `symbol` parameter:

```pycon
>>> db = otp.databases()['US_COMP_SAMPLE']  
>>> db.ref_data(ref_data_type='corp_actions',  
...             start=otp.dt(2025, 1, 2),
...             end=otp.dt(2025, 7, 2),
...             symbol_date=otp.dt(2025, 7, 1),
...             symbol='WMT',
...             timezone='America/New_York')
        Time  MULTIPLICATIVE_ADJUSTMENT  ADDITIVE_ADJUSTMENT ADJUSTMENT_TYPE
0 2025-03-21                   1.000000                0.235   CASH_DIVIDEND
1 2025-03-21                   0.997261                0.000  MULTI_ADJ_CASH
2 2025-05-09                   1.000000                0.235   CASH_DIVIDEND
3 2025-05-09                   0.997588                0.000  MULTI_ADJ_CASH
```

#### SEE ALSO
**REF_DATA** OneTick event processor
