Creating a Custom Data Source

PyBroker comes with pre-built DataSources for Yahoo Finance, Alpaca, and AKShare, which you can use right away without any additional setup. But if you have a specific need or want to use a different data source, PyBroker also allows you to create your own DataSource class.

Extending DataSource

The example below implements a new DataSource called CSVDataSource, which loads data from a CSV file. CSVDataSource reads the file data/prices.csv into a Pandas DataFrame and returns the rows that match the requested symbols and date range:

[1]:
import pandas as pd
import pybroker
from pybroker.data import DataSource

class CSVDataSource(DataSource):

    def __init__(self):
        super().__init__()
        # Register custom columns in the CSV.
        pybroker.register_columns('rsi')

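    def _fetch_data(self, symbols, start_date, end_date, _timeframe, _adjust):
        df = pd.read_csv('data/prices.csv')
        # Parse dates before filtering so the assignment is not made on a slice.
        df['date'] = pd.to_datetime(df['date'])
        # Keep the requested symbols within the inclusive date range.
        in_symbols = df['symbol'].isin(symbols)
        in_range = (df['date'] >= start_date) & (df['date'] <= end_date)
        return df[in_symbols & in_range]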

To make the custom 'rsi' column from the CSV file available to PyBroker, we register it using pybroker.register_columns. This allows PyBroker to use this custom column when it processes the data.

The data returned from a custom DataSource must include the following columns, which PyBroker expects: symbol, date, open, high, low, and close.
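Because a missing column is easy to overlook in a hand-built CSV, it can help to check for the required columns before returning data. The following is a minimal sketch, not part of PyBroker: the helper missing_required_columns and the constant REQUIRED_COLUMNS are hypothetical names, and the toy values are rounded from the sample output in this notebook.

```python
import pandas as pd

# Columns the text above lists as required by PyBroker.
REQUIRED_COLUMNS = ('symbol', 'date', 'open', 'high', 'low', 'close')


def missing_required_columns(df):
    """Return the required column names absent from df (hypothetical helper)."""
    return [col for col in REQUIRED_COLUMNS if col not in df.columns]


# Toy frame that lacks the 'close' column.
incomplete = pd.DataFrame({
    'symbol': ['DIS'],
    'date': pd.to_datetime(['2021-06-01']),
    'open': [180.18],
    'high': [181.01],
    'low': [178.74],
})
print(missing_required_columns(incomplete))  # prints ['close']
```

A check like this could be called at the end of _fetch_data, raising an error when the returned list is not empty.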

Now we can query the CSV data from an instance of CSVDataSource:

[2]:
csv_data_source = CSVDataSource()
df = csv_data_source.query(['MCD', 'NKE', 'DIS'], '6/1/2021', '12/1/2021')
df
Loading bar data...
Loaded bar data: 0:00:00

[2]:
          date  symbol        open        high         low       close        rsi
0   2021-06-01     DIS  180.179993  181.009995  178.740005  178.839996  46.321532
1   2021-06-01     MCD  235.979996  235.990005  232.740005  233.240005  46.522926
2   2021-06-01     NKE  137.850006  138.050003  134.210007  134.509995  53.308085
3   2021-06-02     DIS  179.039993  179.100006  176.929993  177.000000  42.635256
4   2021-06-02     MCD  233.970001  234.330002  232.809998  233.779999  48.051484
...        ...     ...         ...         ...         ...         ...        ...
382 2021-11-30     MCD  247.380005  247.899994  243.949997  244.600006  40.461178
383 2021-11-30     NKE  168.789993  171.550003  167.529999  169.240005  51.505558
384 2021-12-01     DIS  146.699997  148.369995  142.039993  142.149994  16.677555
385 2021-12-01     MCD  245.759995  250.899994  244.110001  244.179993  39.853689
386 2021-12-01     NKE  170.889999  173.369995  166.679993  166.699997  46.704527

387 rows × 7 columns
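The result contains rows dated on both endpoints (2021-06-01 and 2021-12-01) because the filter in _fetch_data compares with >= and <=. The sketch below reproduces that filtering on a small in-memory DataFrame so the behavior can be seen in isolation. The function name filter_bars is hypothetical, and the close values are placeholders rather than real prices.

```python
import pandas as pd


def filter_bars(df, symbols, start_date, end_date):
    """Keep rows for the given symbols with start_date <= date <= end_date."""
    in_symbols = df['symbol'].isin(symbols)
    in_range = (df['date'] >= start_date) & (df['date'] <= end_date)
    return df[in_symbols & in_range]


bars = pd.DataFrame({
    'symbol': ['DIS', 'DIS', 'DIS', 'MCD'],
    'date': pd.to_datetime(
        ['2021-05-28', '2021-06-01', '2021-12-01', '2021-12-02']),
    'close': [1.0, 2.0, 3.0, 4.0],  # placeholder values, not real prices
})

result = filter_bars(
    bars,
    ['DIS', 'MCD'],
    pd.Timestamp('2021-06-01'),
    pd.Timestamp('2021-12-01'),
)
# Rows before the start date and after the end date are dropped.
print(result['date'].dt.strftime('%Y-%m-%d').tolist())
# prints ['2021-06-01', '2021-12-01']
```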

To use CSVDataSource in a backtest, we create a new Strategy object and pass the custom DataSource:

[3]:
from pybroker import Strategy

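def buy_low_sell_high_rsi(ctx):
    pos = ctx.long_pos()
    # Enter when the latest RSI is below 30 and there is no open long position.
    if not pos and ctx.rsi[-1] < 30:
        ctx.buy_shares = 100
    # Exit the whole position once the latest RSI rises above 70.
    elif pos and ctx.rsi[-1] > 70:
        ctx.sell_shares = pos.shares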

strategy = Strategy(csv_data_source, '6/1/2021', '12/1/2021')
strategy.add_execution(buy_low_sell_high_rsi, ['MCD', 'NKE', 'DIS'])
result = strategy.backtest()
result.orders
Backtesting: 2021-06-01 00:00:00 to 2021-12-01 00:00:00

Loading bar data...
Loaded bar data: 0:00:00

Test split: 2021-06-01 00:00:00 to 2021-12-01 00:00:00
100% (129 of 129) |######################| Elapsed Time: 0:00:00 Time:  0:00:00

Finished backtest: 0:00:02
[3]:
    type symbol        date  shares  limit_price  fill_price  fees
id
1    buy    NKE  2021-09-21     100          NaN      154.86   0.0
2   sell    NKE  2021-11-04     100          NaN      173.82   0.0
3    buy    DIS  2021-11-16     100          NaN      159.40   0.0

Note that because we registered the custom rsi column with PyBroker, it can be accessed in the ExecContext using ctx.rsi.

Using a Pandas DataFrame

If you do not need the flexibility of implementing your own DataSource, then you can pass a Pandas DataFrame to a Strategy instead.

To demonstrate, the earlier example can be re-implemented as follows:

[4]:
df = pd.read_csv('data/prices.csv')
df['date'] = pd.to_datetime(df['date'])

pybroker.register_columns('rsi')

strategy = Strategy(df, '6/1/2021', '12/1/2021')
strategy.add_execution(buy_low_sell_high_rsi, ['MCD', 'NKE', 'DIS'])
result = strategy.backtest()
result.orders
Backtesting: 2021-06-01 00:00:00 to 2021-12-01 00:00:00

Test split: 2021-06-01 00:00:00 to 2021-12-01 00:00:00
100% (129 of 129) |######################| Elapsed Time: 0:00:00 Time:  0:00:00

Finished backtest: 0:00:00
[4]:
    type symbol        date  shares  limit_price  fill_price  fees
id
1    buy    NKE  2021-09-21     100          NaN      154.86   0.0
2   sell    NKE  2021-11-04     100          NaN      173.82   0.0
3    buy    DIS  2021-11-16     100          NaN      159.40   0.0
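When working with a DataFrame directly, a custom column does not have to come from the file. It can presumably also be computed in pandas before the DataFrame is handed to the Strategy. The sketch below is an illustration under assumptions: the column name close_avg_2, the symbols AAA and BBB, and the prices are all made up, and the code stops short of calling PyBroker. As with rsi above, such a column would still need to be registered with pybroker.register_columns before it could be read from the context.

```python
import pandas as pd

bars = pd.DataFrame({
    'symbol': ['AAA', 'AAA', 'AAA', 'BBB', 'BBB', 'BBB'],
    'date': pd.to_datetime(['2021-06-01', '2021-06-02', '2021-06-03'] * 2),
    'close': [10.0, 12.0, 14.0, 20.0, 30.0, 40.0],  # placeholder values
})

# Two-bar average of close, computed separately for each symbol so that
# values from one symbol never leak into another symbol's average.
bars['close_avg_2'] = (
    bars.groupby('symbol')['close']
        .transform(lambda s: s.rolling(2).mean())
)

# The first bar of each symbol has no prior bar, so its average is NaN.
print(bars[['symbol', 'date', 'close', 'close_avg_2']])
```

Grouping by symbol matters here because the DataFrame holds several symbols interleaved in one table, as in the CSV used throughout this notebook.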