Writing Indicators
This notebook explains how to create and integrate custom stock indicators in PyBroker. Indicators in PyBroker are written using NumPy, a powerful library for numerical computing. To optimize performance, we’ll also be utilizing Numba, a JIT compiler that translates Python code into efficient machine code. Numba is especially helpful for accelerating code that involves loops and NumPy arrays. Here’s how we import these libraries:
[1]:
import numpy as np
from numba import njit
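As a rough illustration of the kind of code Numba targets, the sketch below JIT-compiles a plain Python loop over a NumPy array. The function name `mean_abs_change` and the sample prices are made up for this example, and the `try`/`except` fallback is only an assumption added so that the snippet still runs where Numba is not installed:

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (assumption for this sketch): run as plain Python without Numba.
    def njit(func):
        return func


@njit
def mean_abs_change(values):
    # Average absolute bar-to-bar change, written as an explicit loop.
    total = 0.0
    for i in range(1, len(values)):
        total += abs(values[i] - values[i - 1])
    return total / (len(values) - 1)


prices = np.array([10.0, 10.5, 10.0, 11.0])
avg_move = mean_abs_change(prices)
```

With Numba available, the first call triggers compilation, so the speedup only shows on later calls or on large arrays.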
The following code shows an indicator function that calculates close prices minus a moving average (CMMA), which can be used for a mean reversion strategy:
[2]:
def cmma(bar_data, lookback):
    @njit  # Enable Numba JIT.
    def vec_cmma(values):
        # Initialize the result array.
        n = len(values)
        out = np.array([np.nan for _ in range(n)])
        # For all bars starting at lookback:
        for i in range(lookback, n):
            # Calculate the moving average for the lookback.
            ma = 0
            for j in range(i - lookback, i):
                ma += values[j]
            ma /= lookback
            # Subtract the moving average from value.
            out[i] = values[i] - ma
        return out
    # Calculate with close prices.
    return vec_cmma(bar_data.close)
The cmma function takes two arguments: bar_data, which is an instance of the BarData class that holds OHLCV data and custom fields, and lookback, which is a user-defined argument for the lookback of the moving average.
The vec_cmma function is JIT-compiled by Numba and nested inside cmma. This is necessary since a Numba-compiled function supports a NumPy array as an argument but not an instance of a Python class like BarData. Note that the computation of the indicator values is vectorized, meaning that it is performed on all of the historical data at once. This approach significantly speeds up the backtesting process.
The next step is to register the indicator function with PyBroker using the following code:
[3]:
import pybroker
cmma_20 = pybroker.indicator('cmma_20', cmma, lookback=20)
Here, we are giving the name cmma_20 to the indicator function and specifying the lookback parameter as 20 bars. Any parameters of the indicator function that come after bar_data are supplied as user-defined keyword arguments to pybroker.indicator. Registering the indicator function returns a new Indicator instance that references the function we defined.
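The way a keyword argument such as lookback gets bound at registration time can be pictured with functools.partial. This is an analogy only, not a description of how PyBroker implements it; the toy `momentum` function and the SimpleNamespace stand-in for BarData are assumptions made for this sketch:

```python
from functools import partial
from types import SimpleNamespace

import numpy as np


def momentum(bar_data, lookback):
    """Toy indicator: close minus the close `lookback` bars earlier."""
    close = bar_data.close
    out = np.full(len(close), np.nan)
    out[lookback:] = close[lookback:] - close[:-lookback]
    return out


# Binding lookback by keyword leaves a callable that needs only bar_data.
momentum_2 = partial(momentum, lookback=2)

# Stand-in for BarData (hypothetical): only the close field is provided.
bars = SimpleNamespace(close=np.array([10.0, 11.0, 13.0, 12.0]))
values = momentum_2(bars)
```

The same function can be bound several times with different lookbacks, which mirrors registering one indicator function under several names.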
The following is an example of how to use the registered Indicator in PyBroker with some data downloaded from Yahoo Finance:
[4]:
from pybroker import YFinance
pybroker.enable_data_source_cache('yfinance')
yfinance = YFinance()
df = yfinance.query('PG', '4/1/2020', '4/1/2022')
Loading bar data...
[*********************100%***********************] 1 of 1 completed
Loaded bar data: 0:00:01
[5]:
cmma_20(df)
[5]:
2020-04-01 NaN
2020-04-02 NaN
2020-04-03 NaN
2020-04-06 NaN
2020-04-07 NaN
...
2022-03-25 1.967502
2022-03-28 3.288005
2022-03-29 4.968507
2022-03-30 3.790999
2022-03-31 2.171002
Length: 505, dtype: float64
As you can see, the Indicator instance is a Callable. When called, it returns the computed indicator values as a Pandas Series.
The Indicator class also provides functions for measuring its information content. For example, you can compute the interquartile range (IQR):
[6]:
cmma_20.iqr(df)
[6]:
4.655495452880842
Or compute the relative entropy:
[7]:
cmma_20.relative_entropy(df)
[7]:
0.7495800114455111
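The interquartile range is the spread between the 75th and 25th percentiles of the indicator values. A minimal sketch of that definition in NumPy follows, assuming the leading NaN values are ignored; `iqr_sketch` is a hypothetical helper and may not match PyBroker's implementation in every detail:

```python
import numpy as np


def iqr_sketch(values):
    """75th minus 25th percentile, skipping NaN entries (hypothetical helper)."""
    q75, q25 = np.nanpercentile(values, [75, 25])
    return q75 - q25


# Two NaN warmup values followed by five indicator values.
sample = np.array([np.nan, np.nan, 1.0, 2.0, 3.0, 4.0, 5.0])
spread = iqr_sketch(sample)  # 4.0 - 2.0 = 2.0
```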
Using the Indicator in a Strategy
After implementing our indicator, the next step is to integrate it into a trading strategy. The following example shows a simple strategy that goes long when the 20-day CMMA is less than 0, that is, when the last close price drops below the 20-day moving average:
[8]:
def buy_cmma_cross(ctx):
    if ctx.long_pos():
        return
    # Place a buy order if the most recent value of the 20 day CMMA is < 0:
    if ctx.indicator('cmma_20')[-1] < 0:
        ctx.buy_shares = ctx.calc_target_shares(1)
        ctx.hold_bars = 3
The indicator values are retrieved by calling ctx.indicator on the ExecContext and passing in the registered name of the cmma_20 indicator.
(Note: you can also retrieve indicator data for another symbol by passing the symbol to ExecContext#indicator().)
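One way to sanity-check the entry rule without running a backtest is to call it with a small stand-in object. In the sketch below, `FakeContext` is hypothetical and implements only the attributes the rule touches; it is not PyBroker's ExecContext, and its share sizing (100,000 of equity at 50 per share) is an arbitrary assumption:

```python
import numpy as np


class FakeContext:
    """Hypothetical stand-in for ExecContext, for illustration only."""

    def __init__(self, indicator_values, has_long=False):
        self._values = {'cmma_20': np.asarray(indicator_values, dtype=float)}
        self._has_long = has_long
        self.buy_shares = None
        self.hold_bars = None

    def long_pos(self):
        # Truthy when a long position is open.
        return object() if self._has_long else None

    def indicator(self, name):
        return self._values[name]

    def calc_target_shares(self, target_size):
        # Assumed sizing: 100,000 of equity at 50 per share.
        return int(100_000 * target_size / 50)


def buy_cmma_cross(ctx):
    if ctx.long_pos():
        return
    if ctx.indicator('cmma_20')[-1] < 0:
        ctx.buy_shares = ctx.calc_target_shares(1)
        ctx.hold_bars = 3


below = FakeContext([1.2, 0.4, -0.7])   # latest value is negative
buy_cmma_cross(below)
above = FakeContext([-0.7, 0.4, 1.2])   # latest value is positive
buy_cmma_cross(above)
held = FakeContext([1.2, 0.4, -0.7], has_long=True)
buy_cmma_cross(held)
```

Only `below` ends up with an order set; `above` fails the indicator test and `held` returns early because a long position is already open.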
[9]:
from pybroker import Strategy
strategy = Strategy(yfinance, '4/1/2020', '4/1/2022')
strategy.add_execution(buy_cmma_cross, 'PG', indicators=cmma_20)
Here, the buy_cmma_cross function is added to the Strategy along with the cmma_20 indicator. We can enable caching of the computed indicator values to disk with the following:
[10]:
pybroker.enable_indicator_cache('my_indicators')
[10]:
<diskcache.core.Cache at 0x7f45b0a73bb0>
Finally, we can run the backtest with the following code. The warmup argument specifies that 20 bars need to pass before running the backtest execution:
[11]:
result = strategy.backtest(warmup=20)
result.metrics_df.round(4)
Backtesting: 2020-04-01 00:00:00 to 2022-04-01 00:00:00
Loaded cached bar data.
Computing indicators...
100% (1 of 1) |##########################| Elapsed Time: 0:00:00 Time: 0:00:00
Test split: 2020-04-01 00:00:00 to 2022-03-31 00:00:00
100% (505 of 505) |######################| Elapsed Time: 0:00:00 Time: 0:00:00
Finished backtest: 0:00:01
[11]:
 | name | value |
---|---|---|
0 | trade_count | 60.0000 |
1 | initial_market_value | 100000.0000 |
2 | end_market_value | 100759.3600 |
3 | total_pnl | 759.3600 |
4 | unrealized_pnl | 0.0000 |
5 | total_return_pct | 0.7594 |
6 | total_profit | 41596.7500 |
7 | total_loss | -40837.3900 |
8 | total_fees | 0.0000 |
9 | max_drawdown | -13446.9300 |
10 | max_drawdown_pct | -11.9774 |
11 | win_rate | 53.3333 |
12 | loss_rate | 46.6667 |
13 | winning_trades | 32.0000 |
14 | losing_trades | 28.0000 |
15 | avg_pnl | 12.6560 |
16 | avg_return_pct | 0.0293 |
17 | avg_trade_bars | 3.0000 |
18 | avg_profit | 1299.8984 |
19 | avg_profit_pct | 1.2609 |
20 | avg_winning_trade_bars | 3.0000 |
21 | avg_loss | -1458.4782 |
22 | avg_loss_pct | -1.3782 |
23 | avg_losing_trade_bars | 3.0000 |
24 | largest_win | 4263.4500 |
25 | largest_win_pct | 4.1000 |
26 | largest_win_bars | 3.0000 |
27 | largest_loss | -4675.6700 |
28 | largest_loss_pct | -4.1700 |
29 | largest_loss_bars | 3.0000 |
30 | max_wins | 7.0000 |
31 | max_losses | 4.0000 |
32 | sharpe | 0.0023 |
33 | profit_factor | 1.0092 |
34 | ulcer_index | 1.8823 |
35 | upi | 0.0019 |
36 | equity_r2 | 0.0015 |
37 | std_error | 3385.1968 |
When the backtest runs, PyBroker computes the indicator values. If there are multiple indicators added to the Strategy, then PyBroker will compute them in parallel across multiple CPU cores.
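Several rows of the metrics table above are tied together by simple arithmetic, which gives a quick way to read them. The sketch below recomputes a few of them from values copied out of the table; it shows relationships that hold for these numbers and is not a statement of how PyBroker computes each metric internally:

```python
# Values copied from the metrics table above.
initial_market_value = 100000.00
end_market_value = 100759.36
total_profit = 41596.75
total_loss = -40837.39
trade_count = 60
winning_trades = 32

total_pnl = end_market_value - initial_market_value
total_return_pct = total_pnl / initial_market_value * 100
win_rate = winning_trades / trade_count * 100
avg_pnl = total_pnl / trade_count

print(round(total_pnl, 2))                  # 759.36
print(round(total_profit + total_loss, 2))  # 759.36
print(round(total_return_pct, 4))           # 0.7594
print(round(win_rate, 4))                   # 53.3333
print(round(avg_pnl, 4))                    # 12.656
```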
Vectorized Helpers
The PyBroker library provides vectorized helper functions to make the process of computing indicators easier. One of these helper functions is highv, which calculates the highest value for every period of n bars.
In the following example, an indicator function called hhv is defined that uses highv to calculate the highest high price for every period of 5 bars:
[12]:
from pybroker import highv
def hhv(bar_data, period):
    return highv(bar_data.high, period)
hhv_5 = pybroker.indicator('hhv_5', hhv, period=5)
hhv_5(df)
[12]:
2020-04-01 NaN
2020-04-02 NaN
2020-04-03 NaN
2020-04-06 NaN
2020-04-07 120.059998
...
2022-03-25 153.919998
2022-03-28 153.919998
2022-03-29 156.470001
2022-03-30 156.470001
2022-03-31 156.470001
Length: 505, dtype: float64
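To make the shape of that output concrete, here is a plain NumPy sketch of a rolling maximum that leaves the first n - 1 entries as NaN, as in the series above. `rolling_high` is a hypothetical name used for illustration; it is not PyBroker's highv, and it assumes NumPy 1.20 or newer for `sliding_window_view`:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def rolling_high(values, n):
    """Highest value over each window of n bars (hypothetical helper)."""
    values = np.asarray(values, dtype=np.float64)
    out = np.full(len(values), np.nan)
    if len(values) >= n:
        # Each row of the view is one window of n consecutive values.
        out[n - 1:] = sliding_window_view(values, n).max(axis=1)
    return out


highs = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0]
rolled = rolling_high(highs, 3)
# The first two entries are NaN, followed by 4.0, 4.0, 5.0, 9.0, 9.0.
```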
The pybroker.vect module also includes other vectorized helpers such as lowv, sumv, returnv, and cross, the last of which is used to compute crossovers.
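For readers unfamiliar with crossovers, the sketch below shows one common definition in plain NumPy: series a crosses above series b on a bar when it was at or below b on the previous bar and is above b on the current one. `cross_above` is a hypothetical helper, and the exact boundary handling in PyBroker's cross may differ:

```python
import numpy as np


def cross_above(a, b):
    """True where a moves from at-or-below b to above b (hypothetical helper)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    out = np.zeros(len(a), dtype=bool)
    # Compare each bar with the one before it; the first bar has no predecessor.
    out[1:] = (a[:-1] <= b[:-1]) & (a[1:] > b[1:])
    return out


fast = [1.0, 2.0, 3.0, 2.0, 4.0]
slow = [2.0, 2.0, 2.0, 3.0, 3.0]
signals = cross_above(fast, slow)
# Crossovers occur at index 2 and index 4.
```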
Additionally, PyBroker includes convenient wrappers for highest and lowest indicators. Our hhv indicator can be rewritten as:
[13]:
from pybroker import highest
hhv_5 = highest('hhv_5', 'high', period=5)
hhv_5(df)
[13]:
2020-04-01 NaN
2020-04-02 NaN
2020-04-03 NaN
2020-04-06 NaN
2020-04-07 120.059998
...
2022-03-25 153.919998
2022-03-28 153.919998
2022-03-29 156.470001
2022-03-30 156.470001
2022-03-31 156.470001
Length: 505, dtype: float64
Computing Multiple Indicators
An IndicatorSet can be used to calculate multiple indicators. The cmma_20 and hhv_5 indicators can be computed together by adding them to the IndicatorSet. The resulting output will be a Pandas DataFrame containing both:
[14]:
from pybroker import IndicatorSet
indicator_set = IndicatorSet()
indicator_set.add(cmma_20, hhv_5)
indicator_set(df)
Computing indicators...
100% (2 of 2) |##########################| Elapsed Time: 0:00:01 Time: 0:00:01
[14]:
 | symbol | date | cmma_20 | hhv_5 |
---|---|---|---|---|
0 | PG | 2020-04-01 | NaN | NaN |
1 | PG | 2020-04-02 | NaN | NaN |
2 | PG | 2020-04-03 | NaN | NaN |
3 | PG | 2020-04-06 | NaN | NaN |
4 | PG | 2020-04-07 | NaN | 120.059998 |
... | ... | ... | ... | ... |
500 | PG | 2022-03-25 | 1.967502 | 153.919998 |
501 | PG | 2022-03-28 | 3.288005 | 153.919998 |
502 | PG | 2022-03-29 | 4.968507 | 156.470001 |
503 | PG | 2022-03-30 | 3.790999 | 156.470001 |
504 | PG | 2022-03-31 | 2.171002 | 156.470001 |
505 rows × 4 columns
Using TA-Lib
TA-Lib is a widely used technical analysis library that implements many financial indicators. Integrating TA-Lib with PyBroker is straightforward. Here is an example:
[15]:
import talib
rsi_20 = pybroker.indicator('rsi_20', lambda data: talib.RSI(data.close, timeperiod=20))
rsi_20(df)
[15]:
2020-04-01 NaN
2020-04-02 NaN
2020-04-03 NaN
2020-04-06 NaN
2020-04-07 NaN
...
2022-03-25 49.373093
2022-03-28 51.014810
2022-03-29 53.407971
2022-03-30 51.610544
2022-03-31 49.029540
Length: 505, dtype: float64
Built-In Indicators
PyBroker also includes built-in indicators that are available in the indicator module.
In the next tutorial, you will learn how to train a model using custom indicators in PyBroker.