pybroker.data module

Contains DataSources used to fetch external data.

class Alpaca(api_key: str, api_secret: str)

Bases: DataSource

Retrieves stock data from Alpaca.

Parameters:
  • api_key – Alpaca API key.

  • api_secret – Alpaca API secret.

query(symbols: str | Iterable[str], start_date: str | datetime, end_date: str | datetime, timeframe: str | None = '1d', adjust: str | None = None) → DataFrame

Queries data. Cached data is returned if caching is enabled by calling pybroker.cache.enable_data_source_cache().

Parameters:
  • symbols – Symbols of the data to query.

  • start_date – Start date of the data to query (inclusive).

  • end_date – End date of the data to query (inclusive).

  • timeframe

    Formatted string that specifies the timeframe resolution to query. The timeframe string supports the following units:

    • "s"/"sec": seconds

    • "m"/"min": minutes

    • "h"/"hour": hours

    • "d"/"day": days

    • "w"/"week": weeks

    An example timeframe string is 1h 30m.

  • adjust – The type of adjustment to make.

Returns:

pandas.DataFrame containing the queried data.
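
To make the timeframe format concrete, the following is a minimal sketch of how a string such as 1h 30m breaks down into a total number of seconds. The helper timeframe_to_seconds is hypothetical and written only for illustration; it is not PyBroker's own parser, and it assumes that parts are separated by whitespace.

```python
import re

# Seconds per unit, using the unit names listed in the parameter description.
_UNIT_SECONDS = {
    "s": 1, "sec": 1,
    "m": 60, "min": 60,
    "h": 3600, "hour": 3600,
    "d": 86400, "day": 86400,
    "w": 604800, "week": 604800,
}

# One part of a timeframe string: an integer followed by a unit, e.g. "30m".
_PART = re.compile(r"(\d+)([a-z]+)")


def timeframe_to_seconds(timeframe: str) -> int:
    """Hypothetical helper: total seconds described by a timeframe string."""
    total = 0
    for part in timeframe.lower().split():
        match = _PART.fullmatch(part)
        if match is None or match.group(2) not in _UNIT_SECONDS:
            raise ValueError(f"Invalid timeframe part: {part!r}")
        total += int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
    return total


print(timeframe_to_seconds("1h 30m"))  # 5400
print(timeframe_to_seconds("1d"))      # 86400
```

The same format applies to every timeframe parameter in this module.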

class AlpacaCrypto(api_key: str, api_secret: str)

Bases: DataSource

Retrieves crypto data from Alpaca.

Parameters:
  • api_key – Alpaca API key.

  • api_secret – Alpaca API secret.

COLUMNS: Final = ('symbol', 'date', 'open', 'high', 'low', 'close', 'volume', 'vwap', 'trade_count')

TRADE_COUNT: Final = 'trade_count'

query(symbols: str | Iterable[str], start_date: str | datetime, end_date: str | datetime, timeframe: str | None = '1d', _: str | None = None) → DataFrame

Queries data. Cached data is returned if caching is enabled by calling pybroker.cache.enable_data_source_cache().

Parameters:
  • symbols – Symbols of the data to query.

  • start_date – Start date of the data to query (inclusive).

  • end_date – End date of the data to query (inclusive).

  • timeframe

    Formatted string that specifies the timeframe resolution to query. The timeframe string supports the following units:

    • "s"/"sec": seconds

    • "m"/"min": minutes

    • "h"/"hour": hours

    • "d"/"day": days

    • "w"/"week": weeks

    An example timeframe string is 1h 30m.

  • adjust – The type of adjustment to make.

Returns:

pandas.DataFrame containing the queried data.

class DataSource

Bases: ABC, DataSourceCacheMixin

Base class for querying data from an external source. Extend this class and override _fetch_data() to implement a custom DataSource that can be used with pybroker.strategy.Strategy.

abstract _fetch_data(symbols: frozenset[str], start_date: datetime, end_date: datetime, timeframe: str | None, adjust: str | None) → DataFrame

Override this method to return data from a custom source. The returned pandas.DataFrame must contain the following columns: symbol, date, open, high, low, and close.

Parameters:
  • symbols – Ticker symbols of the data to query.

  • start_date – Start date of the data to query (inclusive).

  • end_date – End date of the data to query (inclusive).

  • timeframe

    Formatted string that specifies the timeframe resolution to query. The timeframe string supports the following units:

    • "s"/"sec": seconds

    • "m"/"min": minutes

    • "h"/"hour": hours

    • "d"/"day": days

    • "w"/"week": weeks

    An example timeframe string is 1h 30m.

  • adjust – The type of adjustment to make.

Returns:

pandas.DataFrame containing the queried data.
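
As an illustration of the override described above, here is a minimal sketch of a custom DataSource that serves rows from an in-memory pandas.DataFrame. The class name InMemoryDataSource and the sample rows are hypothetical. The import fallback is only there so that the sketch also runs where PyBroker is not installed; real code would import DataSource directly.

```python
from datetime import datetime

import pandas as pd

try:
    from pybroker.data import DataSource
except ImportError:  # fallback so the sketch runs without PyBroker installed
    DataSource = object


class InMemoryDataSource(DataSource):
    """Hypothetical DataSource backed by an in-memory DataFrame."""

    def __init__(self, rows: pd.DataFrame):
        super().__init__()
        self._rows = rows

    def _fetch_data(self, symbols, start_date, end_date, timeframe, adjust):
        df = self._rows
        # Keep the requested symbols within the inclusive date range.
        mask = (
            df["symbol"].isin(list(symbols))
            & (df["date"] >= start_date)
            & (df["date"] <= end_date)
        )
        # Return the columns that the base class requires.
        return df.loc[mask, ["symbol", "date", "open", "high", "low", "close"]]


rows = pd.DataFrame({
    "symbol": ["AAA", "AAA", "BBB"],
    "date": pd.to_datetime(["2023-01-03", "2023-01-04", "2023-01-03"]),
    "open": [10.0, 10.5, 20.0],
    "high": [11.0, 11.0, 21.0],
    "low": [9.5, 10.0, 19.5],
    "close": [10.5, 10.8, 20.5],
})
source = InMemoryDataSource(rows)
df = source._fetch_data(
    frozenset({"AAA"}), datetime(2023, 1, 1), datetime(2023, 1, 3), "1d", None
)
print(df)
```

The sketch calls _fetch_data() directly only to show what the override returns. In normal use, callers go through the public query() method.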

query(symbols: str | Iterable[str], start_date: str | datetime, end_date: str | datetime, timeframe: str | None = '', adjust: str | None = None) → DataFrame

Queries data. Cached data is returned if caching is enabled by calling pybroker.cache.enable_data_source_cache().

Parameters:
  • symbols – Symbols of the data to query.

  • start_date – Start date of the data to query (inclusive).

  • end_date – End date of the data to query (inclusive).

  • timeframe

    Formatted string that specifies the timeframe resolution to query. The timeframe string supports the following units:

    • "s"/"sec": seconds

    • "m"/"min": minutes

    • "h"/"hour": hours

    • "d"/"day": days

    • "w"/"week": weeks

    An example timeframe string is 1h 30m.

  • adjust – The type of adjustment to make.

Returns:

pandas.DataFrame containing the queried data.
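
query() accepts either a single symbol string or an iterable of symbols, while _fetch_data() receives a frozenset[str]. The sketch below shows one way such a normalization could look. The helper normalize_symbols is hypothetical and is not taken from PyBroker's source. It shows why a lone string needs special handling: passing a str directly to frozenset() would split it into characters.

```python
from typing import Iterable, Union


def normalize_symbols(symbols: Union[str, Iterable[str]]) -> frozenset:
    """Hypothetical helper: turn one symbol or many into a frozenset."""
    if isinstance(symbols, str):
        # Wrap the string so it is treated as one symbol, not as characters.
        return frozenset((symbols,))
    return frozenset(symbols)


print(normalize_symbols("SPY"))                          # frozenset({'SPY'})
print(sorted(normalize_symbols(["SPY", "QQQ", "SPY"])))  # ['QQQ', 'SPY']
```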

class DataSourceCacheMixin

Bases: object

Mixin that implements fetching and storing cached DataSource data.

get_cached(symbols: Iterable[str], timeframe: str, start_date: str | datetime | Timestamp | datetime64, end_date: str | datetime | Timestamp | datetime64, adjust: str | None) → tuple[DataFrame, Iterable[str]]

Retrieves cached data from disk when caching is enabled with pybroker.cache.enable_data_source_cache().

Parameters:
  • symbols – Iterable of symbols for fetching cached data.

  • timeframe

    Formatted string that specifies the timeframe resolution of the cached data. The timeframe string supports the following units:

    • "s"/"sec": seconds

    • "m"/"min": minutes

    • "h"/"hour": hours

    • "d"/"day": days

    • "w"/"week": weeks

    An example timeframe string is 1h 30m.

  • start_date – Starting date of the cached data (inclusive).

  • end_date – Ending date of the cached data (inclusive).

  • adjust – The type of adjustment to make.

Returns:

tuple[pandas.DataFrame, Iterable[str]] containing a pandas.DataFrame with the cached data, and an Iterable[str] of symbols for which no cached data was found.

set_cached(timeframe: str, start_date: str | datetime | Timestamp | datetime64, end_date: str | datetime | Timestamp | datetime64, adjust: str | None, data: DataFrame)

Stores data to disk cache when caching is enabled with pybroker.cache.enable_data_source_cache().

Parameters:
  • timeframe

    Formatted string that specifies the timeframe resolution of the data to cache. The timeframe string supports the following units:

    • "s"/"sec": seconds

    • "m"/"min": minutes

    • "h"/"hour": hours

    • "d"/"day": days

    • "w"/"week": weeks

    An example timeframe string is 1h 30m.

  • start_date – Starting date of the data to cache (inclusive).

  • end_date – Ending date of the data to cache (inclusive).

  • adjust – The type of adjustment to make.

  • data – pandas.DataFrame containing the data to cache.
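
The store and lookup contract of the two methods above can be pictured with a toy in-memory model. Everything in this sketch is an assumption made for illustration: the functions toy_set_cached and toy_get_cached, the dictionary cache, the per-symbol key layout, and the use of plain dict rows in place of a pandas.DataFrame. PyBroker's actual cache is stored on disk and may be organized differently.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Assumed key layout: (symbol, timeframe, start_date, end_date, adjust).
CacheKey = Tuple[str, str, str, str, Optional[str]]
_cache: Dict[CacheKey, List[dict]] = {}


def toy_set_cached(timeframe: str, start_date: str, end_date: str,
                   adjust: Optional[str], data: List[dict]) -> None:
    """Stores rows per symbol under the query parameters."""
    grouped: Dict[str, List[dict]] = {}
    for row in data:
        grouped.setdefault(row["symbol"], []).append(row)
    for symbol, rows in grouped.items():
        _cache[(symbol, timeframe, start_date, end_date, adjust)] = rows


def toy_get_cached(symbols: Iterable[str], timeframe: str, start_date: str,
                   end_date: str, adjust: Optional[str]):
    """Returns (cached rows, symbols for which nothing was cached)."""
    hits: List[dict] = []
    misses: List[str] = []
    for symbol in symbols:
        key = (symbol, timeframe, start_date, end_date, adjust)
        if key in _cache:
            hits.extend(_cache[key])
        else:
            misses.append(symbol)
    return hits, misses


toy_set_cached("1d", "2023-01-01", "2023-01-31", None,
               [{"symbol": "AAA", "close": 10.5}])
rows, missing = toy_get_cached(["AAA", "BBB"], "1d",
                               "2023-01-01", "2023-01-31", None)
print(missing)  # ['BBB']
```

The second element of the returned tuple tells the caller which symbols still need to be fetched from the external source.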

class YFinance

Bases: DataSource

Retrieves data from Yahoo Finance.

ADJ_CLOSE: Final = 'adj_close'

Column name of adjusted close prices.

query(symbols: str | Iterable[str], start_date: str | datetime, end_date: str | datetime, _timeframe: str | None = '', _adjust: str | None = None) → DataFrame

Queries data from Yahoo Finance. Only daily data is supported.

Parameters:
  • symbols – Ticker symbols of the data to query.

  • start_date – Start date of the data to query (inclusive).

  • end_date – End date of the data to query (inclusive).

Returns:

pandas.DataFrame containing the queried data.