Archive
Calendar Strategy: Fed Days
UPDATE: A reader pointed out a problem with the original post: the prices > SMA(prices,100) statement introduced look-ahead bias. In the calendar strategy logic I did not use the usual lag of one day because the important days are known beforehand. However, the prices > SMA(prices,100) statement should be lagged by one day. I updated the plots and source code.
Today, I want to follow up with the Calendar Strategy: Option Expiry post. Let’s examine the importance of the FED meeting days as presented in the Fed Days And Intermediate-Term Highs post.
Let’s dive in and examine the historical performance of SPY during FED meeting days:
###############################################################################
# Load Systematic Investor Toolbox (SIT)
# https://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
setInternet2(TRUE)
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

#*****************************************************************
# Load historical data
#******************************************************************
load.packages('quantmod')

tickers = spl('SPY')

data <- new.env()
getSymbols.extra(tickers, src = 'yahoo', from = '1980-01-01', env = data, set.symbolnames = T, auto.assign = T)
    for(i in data$symbolnames) data[[i]] = adjustOHLC(data[[i]], use.Adjusted=T)
bt.prep(data, align='keep.all', fill.gaps = T)

#*****************************************************************
# Setup
#*****************************************************************
prices = data$prices
    n = ncol(prices)
    dates = data$dates

models = list()

universe = prices > 0
    # 100 day SMA filter
    universe = universe & prices > SMA(prices,100)

# Find Fed Days
info = get.FOMC.dates(F)
    key.date.index = na.omit(match(info$day, dates))

key.date = NA * prices
    key.date[key.date.index,] = T

#*****************************************************************
# Strategy
#*****************************************************************
signals = list(T0=0)
    for(i in 1:15) signals[[paste0('N',i)]] = 0:i
signals = calendar.signal(key.date, signals)
models = calendar.strategy(data, signals, universe = universe)

strategy.performance.snapshoot(models, T, sort.performance=F)
Please note the 100 day moving average filter above. If we take it out, the performance deteriorates significantly.
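To make the one-day lag from the update note concrete, here is a minimal base R sketch. It runs on simulated prices, and `sma()` and `lag1()` are hypothetical helpers written for this illustration; it is not how SIT applies the lag internally. The point is only that a filter computed from today's close cannot be used for a position that was already opened at yesterday's close.

```r
# Base R sketch; sma() and lag1() are hypothetical helpers, not SIT functions.
sma <- function(x, n) {
  # Trailing simple moving average; the first n - 1 values are NA.
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
}
lag1 <- function(x) c(NA, head(x, -1))  # shift a vector forward by one day

set.seed(1)
prices <- 100 + cumsum(rnorm(300))      # simulated prices, for illustration only

filter.same.day <- prices > sma(prices, 100)  # needs today's close
filter.lagged   <- lag1(filter.same.day)      # known at yesterday's close
```

With the lagged version, the decision for day t depends only on closes up to day t-1, which is what removes the look-ahead bias.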
# custom stats
out = sapply(models, function(x) list(
    CAGR = 100*compute.cagr(x$equity),
    MD = 100*compute.max.drawdown(x$equity),
    Win = x$trade.summary$stats['win.prob', 'All'],
    Profit = x$trade.summary$stats['profitfactor', 'All']
    ))
performance.barchart.helper(out, sort.performance = F)

strategy.performance.snapshoot(models$N15, control=list(main=T))

last.trades(models$N15)

trades = models$N15$trade.summary$trades
    trades = make.xts(parse.number(trades[,'return']), as.Date(trades[,'entry.date']))

layout(1:2)
par(mar = c(4,3,3,1), cex = 0.8)
barplot(trades, main='N15 Trades', las=1)
plot(cumprod(1+trades/100), type='b', main='N15 Trades', las=1)
N15 Strategy:
With this post I wanted to show how easily we can study calendar strategy performance using the Systematic Investor Toolbox.
Next, I will look at the importance of the Dividend days.
To view the complete source code for this example, please have a look at the bt.calendar.strategy.fed.days.test() function in bt.test.r at github.
Calendar Strategy: Option Expiry
Today, I want to follow up with the Calendar Strategy: Month End post. Let’s examine the performance around Option Expiry days as presented in the The Mooost Wonderful Tiiiiiiime of the Yearrrrrrrrr! post.
First, I created two convenience functions for creating a calendar signal and back-testing a calendar strategy: the calendar.signal and calendar.strategy functions are in strategy.r at github.
Now, let’s dive in and examine the historical performance of SPY during the Option Expiry period in December:
###############################################################################
# Load Systematic Investor Toolbox (SIT)
# https://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
setInternet2(TRUE)
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

#*****************************************************************
# Load historical data
#******************************************************************
load.packages('quantmod')

tickers = spl('SPY')

data <- new.env()
getSymbols.extra(tickers, src = 'yahoo', from = '1980-01-01', env = data, set.symbolnames = T, auto.assign = T)
    for(i in data$symbolnames) data[[i]] = adjustOHLC(data[[i]], use.Adjusted=T)
bt.prep(data, align='keep.all', fill.gaps = T)

#*****************************************************************
# Setup
#*****************************************************************
prices = data$prices
    n = ncol(prices)
    dates = data$dates

models = list()

universe = prices > 0

# Find Friday before options expiration week in December
years = date.year(range(dates))
second.friday = third.friday.month(years[1]:years[2], 12) - 7
    key.date.index = na.omit(match(second.friday, dates))

key.date = NA * prices
    key.date[key.date.index,] = T

#*****************************************************************
# Strategy: Op-ex week in December most bullish week of the year for the SPX
#   Buy: December Friday prior to op-ex.
#   Sell X days later: 100K/trade 1984-present
# http://quantifiableedges.blogspot.com/2011/12/mooost-wonderful-tiiiiiiime-of.html
#*****************************************************************
signals = list(T0=0)
    for(i in 1:15) signals[[paste0('N',i)]] = 0:i
signals = calendar.signal(key.date, signals)
models = calendar.strategy(data, signals, universe = universe)

strategy.performance.snapshoot(models, T, sort.performance=F)
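For readers who want to see how the key date is derived without the toolbox, the following base R sketch computes the third Friday of a month and then steps back seven days to get the Friday before option expiry week. The `third.friday()` helper is a hypothetical stand-in for illustration and is not the SIT `third.friday.month` implementation.

```r
# Base R sketch; third.friday() is a hypothetical helper, not the SIT function.
third.friday <- function(year, month) {
  first <- as.Date(sprintf('%d-%02d-01', year, month))
  wday  <- as.POSIXlt(first)$wday          # 0 = Sunday, ..., 5 = Friday
  first.friday <- first + (5 - wday) %% 7  # first Friday on or after the 1st
  first.friday + 14                        # two weeks later: the third Friday
}

# Friday before December option expiry week, for a few sample years
key.dates <- third.friday(2010:2014, 12) - 7
print(key.dates)
```

In a back-test these dates would still need to be matched against the actual trading calendar, as the post does with match(second.friday, dates).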
Strategies vary in performance. Next, let’s examine them in a bit more detail:
# custom stats
out = sapply(models, function(x) list(
    CAGR = 100*compute.cagr(x$equity),
    MD = 100*compute.max.drawdown(x$equity),
    Win = x$trade.summary$stats['win.prob', 'All'],
    Profit = x$trade.summary$stats['profitfactor', 'All']
    ))
performance.barchart.helper(out, sort.performance = F)

# Plot 15 day strategy
strategy.performance.snapshoot(models$N15, control=list(main=T))

# Plot trades for 15 day strategy
last.trades(models$N15)

# Make a summary plot of trades for 15 day strategy
trades = models$N15$trade.summary$trades
    trades = make.xts(parse.number(trades[,'return']), as.Date(trades[,'entry.date']))

layout(1:2)
par(mar = c(4,3,3,1), cex = 0.8)
barplot(trades, main='Trades', las=1)
plot(cumprod(1+trades/100), type='b', main='Trades', las=1)
Details for the 15 day strategy:
With this post I wanted to show how easily we can study calendar strategy performance using the Systematic Investor Toolbox.
Next, I will look at the importance of the FED meeting days.
To view the complete source code for this example, please have a look at the bt.calendar.strategy.option.expiry.test() function in bt.test.r at github.
Calendar Strategy: Month End
Calendar Strategy is a very simple strategy that buys and sells on predetermined days, known in advance. Today I want to show how we can easily investigate performance at and around Month End days.
First let’s load historical prices for SPY from Yahoo Finance and compute SPY performance at the month-ends. I.e. the strategy will open a long position at the close on the 30th and sell the position at the close on the 31st.
###############################################################################
# Load Systematic Investor Toolbox (SIT)
# https://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
setInternet2(TRUE)
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

#*****************************************************************
# Load historical data
#******************************************************************
load.packages('quantmod')

tickers = spl('SPY')

data <- new.env()
getSymbols.extra(tickers, src = 'yahoo', from = '1980-01-01', env = data, set.symbolnames = T, auto.assign = T)
    for(i in data$symbolnames) data[[i]] = adjustOHLC(data[[i]], use.Adjusted=T)
bt.prep(data, align='keep.all', fill.gaps = T)

#*****************************************************************
# Setup
#*****************************************************************
prices = data$prices
    n = ncol(prices)

models = list()

period.ends = date.month.ends(data$dates, F)

#*****************************************************************
# Strategy
#*****************************************************************
key.date = NA * prices
    key.date[period.ends] = T

universe = prices > 0
signal = key.date

data$weight[] = NA
    data$weight[] = ifna(universe & key.date, F)
models$T0 = bt.run.share(data, do.lag = 0, trade.summary=T, clean.signal=T)
Please note that above, in the bt.run.share call, I set the do.lag parameter equal to zero (the default value for the do.lag parameter is one). The default is one because the signal (the decision to trade) is derived using all information available today, so the position can only be implemented the next day. I.e.
portfolio.returns = lag(signal, do.lag) * returns = lag(signal, 1) * returns
However, in case of the calendar strategy there is no need to lag signal because the trade day is known in advance. I.e.
portfolio.returns = lag(signal, do.lag) * returns = signal * returns
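The difference between the two cases can be seen on a toy example. This is a base R sketch with made-up prices, and `lag.vec()` is a hypothetical helper standing in for the lag operator in the formulas above; it is not the SIT back-test engine.

```r
# Base R sketch; lag.vec() is a hypothetical helper, prices are made up.
lag.vec <- function(x, k = 1) {
  # Shift forward by k periods, padding with NA (k = 0 returns x unchanged).
  if (k == 0) return(x)
  c(rep(NA, k), head(x, -k))
}

prices  <- c(100, 101, 103, 102, 104)
returns <- c(NA, diff(prices) / head(prices, -1))  # close-to-close returns
signal  <- c(0, 0, 1, 0, 0)                        # key date is day 3

ret.lag1 <- lag.vec(signal, 1) * returns  # default: position earns the day 4 return
ret.lag0 <- lag.vec(signal, 0) * returns  # calendar: position earns the day 3 return
```

With do.lag = 1 the strategy would capture the return of the day after the key date, so the lag is dropped when the aim is to measure the return on the key date itself.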
Next, I created two functions to help with signal creation and strategy testing:
calendar.strategy <- function(data, signal, universe = data$prices > 0) {
    data$weight[] = NA
        data$weight[] = ifna(universe & signal, F)
    bt.run.share(data, do.lag = 0, trade.summary=T, clean.signal=T)
}

calendar.signal <- function(key.date, offsets = 0) {
    signal = mlag(key.date, offsets[1])
    for(i in offsets) signal = signal | mlag(key.date, i)
    signal
}

# Trade on key.date
models$T0 = calendar.strategy(data, key.date)

# Trade next day after key.date
models$N1 = calendar.strategy(data, mlag(key.date,1))
# Trade two days next(after) key.date
models$N2 = calendar.strategy(data, mlag(key.date,2))

# Trade a day prior to key.date
models$P1 = calendar.strategy(data, mlag(key.date,-1))
# Trade two days prior to key.date
models$P2 = calendar.strategy(data, mlag(key.date,-2))

# Trade: open 2 days before the key.date and close 2 days after the key.date
signal = key.date | mlag(key.date,-1) | mlag(key.date,-2) | mlag(key.date,1) | mlag(key.date,2)
models$P2N2 = calendar.strategy(data, signal)

# same, but using helper function above
models$P2N2 = calendar.strategy(data, calendar.signal(key.date, -2:2))

strategy.performance.snapshoot(models, T)

strategy.performance.snapshoot(models, control=list(comparison=T), sort.performance=F)
Above, T0 is a calendar strategy that buys on the 30th and sells on the 31st. I.e. the position is only held on a month end day. P1 and P2 are two strategies that buy a day prior and two days prior, respectively. N1 and N2 are two strategies that buy a day after and two days after, respectively.
The N1 strategy, which buys on the 31st and sells on the 1st of the next month, seems to work best for SPY.
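The offset logic is easy to check on a plain logical vector. The base R sketch below mirrors the idea of OR-ing shifted copies of the key date; `shift()` and `calendar.signal.sketch()` are hypothetical names, and unlike the toolbox's mlag this version pads with FALSE rather than NA, which is an assumption made to keep the example short.

```r
# Base R sketch; shift() and calendar.signal.sketch() are hypothetical helpers.
shift <- function(x, k) {
  # k > 0 moves flags later in time, k < 0 moves them earlier; pads with FALSE.
  n <- length(x)
  if (k > 0) c(rep(FALSE, k), head(x, n - k))
  else if (k < 0) c(tail(x, n + k), rep(FALSE, -k))
  else x
}

calendar.signal.sketch <- function(key, offsets = 0) {
  # A day is in the signal if it sits at any of the offsets from a key date.
  Reduce(`|`, lapply(offsets, function(k) shift(key, k)))
}

key  <- c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE)  # key date is day 4
p2n2 <- calendar.signal.sketch(key, -2:2)                  # days 2 to 6
```

Here offset 1 corresponds to N1 (the day after the key date) and offset -1 to P1 (the day before).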
Finally, let’s look at the actual trades:
last.trades <- function(model, n=20, make.plot=T, return.table=F) {
    ntrades = min(n, nrow(model$trade.summary$trades))
    trades = last(model$trade.summary$trades, ntrades)
    if(make.plot) {
        layout(1)
        plot.table(trades)
    }
    if(return.table) trades
}

last.trades(models$P2)
The P2 strategy enters position at the close 3 days before the month end and exits positions at the close 2 days before the month end. I.e. the performance is due to returns only 2 days before the month end.
With this post I wanted to show how easily we can study calendar strategy performance using the Systematic Investor Toolbox.
Next, I will demonstrate calendar strategy applications to a variety of important dates.
To view the complete source code for this example, please have a look at the bt.calendar.strategy.month.end.test() function in bt.test.r at github.
Probabilistic Momentum with Intraday data
I want to follow up the Intraday data post with testing the Probabilistic Momentum strategy on Intraday data. I will use Intraday data for SPY and GLD from the Bonnot Gang to test the strategy.
##############################################################################
# Load Systematic Investor Toolbox (SIT)
# https://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
setInternet2(TRUE)
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

#*****************************************************************
# Load historical data
#******************************************************************
load.packages('quantmod')

# data from http://thebonnotgang.com/tbg/historical-data/
# please save SPY and GLD 1 min data at the given path
spath = 'c:/Desktop/'
data = bt.load.thebonnotgang.data('SPY,GLD', spath)

data1 <- new.env()
    data1$FI = data$GLD
    data1$EQ = data$SPY
data = data1
bt.prep(data, align='keep.all', fill.gaps = T)

lookback.len = 120
confidence.level = 60/100

prices = data$prices
    ret = prices / mlag(prices) - 1

models = list()

#*****************************************************************
# Simple Momentum
#******************************************************************
momentum = prices / mlag(prices, lookback.len)
data$weight[] = NA
    data$weight$EQ[] = momentum$EQ > momentum$FI
    data$weight$FI[] = momentum$EQ <= momentum$FI
models$Simple = bt.run.share(data, clean.signal=T)

#*****************************************************************
# Probabilistic Momentum + Confidence Level
# http://cssanalytics.wordpress.com/2014/01/28/are-simple-momentum-strategies-too-dumb-introducing-probabilistic-momentum/
# http://cssanalytics.wordpress.com/2014/02/12/probabilistic-momentum-spreadsheet/
#******************************************************************
ir = sqrt(lookback.len) * runMean(ret$EQ - ret$FI, lookback.len) / runSD(ret$EQ - ret$FI, lookback.len)
momentum.p = pt(ir, lookback.len - 1)

data$weight[] = NA
    data$weight$EQ[] = iif(cross.up(momentum.p, confidence.level), 1, iif(cross.dn(momentum.p, (1 - confidence.level)), 0,NA))
    data$weight$FI[] = iif(cross.dn(momentum.p, (1 - confidence.level)), 1, iif(cross.up(momentum.p, confidence.level), 0,NA))
models$Probabilistic = bt.run.share(data, clean.signal=T)

data$weight[] = NA
    data$weight$EQ[] = iif(cross.up(momentum.p, confidence.level), 1, iif(cross.up(momentum.p, (1 - confidence.level)), 0,NA))
    data$weight$FI[] = iif(cross.dn(momentum.p, (1 - confidence.level)), 1, iif(cross.up(momentum.p, confidence.level), 0,NA))
models$Probabilistic.Leverage = bt.run.share(data, clean.signal=T)

#*****************************************************************
# Create Report
#******************************************************************
strategy.performance.snapshoot(models, T)
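The momentum.p quantity above is the Student t probability of a rolling information ratio of the return spread. The base R sketch below reproduces that calculation on simulated returns; the `roll()` helper is hypothetical and simply stands in for the runMean and runSD calls, and the simulated series are not market data.

```r
# Base R sketch on simulated returns; roll() is a hypothetical helper.
roll <- function(x, n, f) {
  # Apply f to each trailing window of length n; the first n - 1 values are NA.
  out <- rep(NA_real_, length(x))
  if (length(x) >= n) for (i in n:length(x)) out[i] <- f(x[(i - n + 1):i])
  out
}

set.seed(42)
lookback.len <- 120
ret.eq <- rnorm(500, 0.0005, 0.010)  # simulated "EQ" returns
ret.fi <- rnorm(500, 0.0001, 0.008)  # simulated "FI" returns
spread <- ret.eq - ret.fi

# sqrt(n) * mean / sd is the one-sample t-statistic of the spread
ir <- sqrt(lookback.len) * roll(spread, lookback.len, mean) / roll(spread, lookback.len, sd)

# Probability that EQ outperforms FI, from the t distribution with n - 1 df
momentum.p <- pt(ir, lookback.len - 1)
```

Seen this way, the confidence level of 60% acts as a threshold on that probability: the strategy switches into EQ only when momentum.p crosses above 0.6 and into FI only when it crosses below 0.4.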
Next, let’s examine the hourly performance of the strategy.
#*****************************************************************
# Hourly Performance
#******************************************************************
strategy.name = 'Probabilistic.Leverage'
ret = models[[strategy.name]]$ret
    ret.number = 100*as.double(ret)

dates = index(ret)
factor = format(dates, '%H')

layout(1:2)
par(mar=c(4,4,1,1))
boxplot(tapply(ret.number, factor, function(x) x),outline=T, main=paste(strategy.name, 'Distribution of Returns'), las=1)
barplot(tapply(ret.number, factor, function(x) sum(x)), main=paste(strategy.name, 'P&L by Hour'), las=1)
There are lots of abnormal returns in the 9:30-10:00am box due to big overnight returns. I.e. a return from the prior day’s close to today’s open. If we exclude this observation every day, the distribution for each hour is more consistent.
#*****************************************************************
# Hourly Performance: Remove first return of the day (i.e. overnight)
#******************************************************************
day.stat = bt.intraday.day(dates)
ret.number[day.stat$day.start] = 0

layout(1:2)
par(mar=c(4,4,1,1))
boxplot(tapply(ret.number, factor, function(x) x),outline=T, main=paste(strategy.name, 'Distribution of Returns'), las=1)
barplot(tapply(ret.number, factor, function(x) sum(x)), main=paste(strategy.name, 'P&L by Hour'), las=1)
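The overnight exclusion amounts to zeroing the first bar of every day before grouping by hour. Here is a minimal base R sketch with made-up time stamps and returns; it uses duplicated() to find the first bar of each day, which is an assumption standing in for the toolbox's bt.intraday.day helper.

```r
# Base R sketch with made-up data; duplicated() stands in for bt.intraday.day.
dates <- as.POSIXct(c('2013-11-01 09:31:00', '2013-11-01 09:32:00',
                      '2013-11-01 16:00:00', '2013-11-04 09:31:00',
                      '2013-11-04 09:32:00'), tz = 'UTC')
ret <- c(0.001, -0.002, 0.0005, 0.015, 0.001)

day       <- format(dates, '%Y%m%d')
day.start <- !duplicated(day)        # TRUE for the first bar of each day

ret.intraday <- ret
ret.intraday[day.start] <- 0         # drop the overnight (close to open) return

hour   <- format(dates, '%H')
by.hour <- tapply(ret.intraday, hour, sum)  # P&L by hour without overnight moves
```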
The strategy performs best in the morning and dwindles down in the afternoon and overnight.
These hourly seasonality plots are just a different way to analyze performance of the strategy based on Intraday data.
To view the complete source code for this example, please have a look at the bt.strategy.intraday.thebonnotgang.test() function in bt.test.r at github.
Intraday data
In the Intraday Backtest post I showed an example of loading and working with Forex Intraday data from FXHISTORICALDATA.COM. Recently, I came across another interesting source of Intraday data at the Bonnot Gang site. Please note that you will have to register to get access to the Intraday data; the registration is free.
Today, I want to examine the quality of the Intraday data from the Bonnot Gang and show how it can be integrated into a Backtest using the Systematic Investor Toolbox. For the example below, please first download and save 1 minute Intraday historical data for SPY and GLD. Next let’s load and plot time series for SPY.
###############################################################################
# Load Systematic Investor Toolbox (SIT)
# https://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

#*****************************************************************
# Load historical data
#******************************************************************
load.packages('quantmod')

# data from http://thebonnotgang.com/tbg/historical-data/
spath = 'c:/Desktop/'

# http://stackoverflow.com/questions/14440661/dec-argument-in-data-tablefread
Sys.localeconv()["decimal_point"]
Sys.setlocale("LC_NUMERIC", "French_France.1252")

data <- new.env()
data$SPY = read.xts(paste0(spath,'SPY_1m.csv'),
    sep = ';', date.column = 3, format='%Y-%m-%d %H:%M:%S', index.class = c("POSIXlt", "POSIXt"))

data$GLD = read.xts(paste0(spath,'GLD_1m.csv'),
    sep = ';', date.column = 3, format='%Y-%m-%d %H:%M:%S', index.class = c("POSIXlt", "POSIXt"))

#*****************************************************************
# Create plot for Nov 1, 2012 and 2013
#******************************************************************
layout(c(1,1,2))
plota(data$SPY['2012:11:01'], type='candle', main='SPY on Nov 1st, 2012', plotX = F)
plota(plota.scale.volume(data$SPY['2012:11:01']), type = 'volume')

layout(c(1,1,2))
plota(data$SPY['2013:11:01'], type='candle', main='SPY on Nov 1st, 2013', plotX = F)
plota(plota.scale.volume(data$SPY['2013:11:01']), type = 'volume')
It is immediately apparent that the data provider changed the time scale: in 2012 data was recorded from 9:30 to 16:00, and in 2013 data was recorded from 13:30 to 20:00.
Next, let’s check if there are any big gaps in the Intraday series.
#*****************************************************************
# Data check for Gaps in the series Intraday
#******************************************************************
i = 'GLD'
dates = index(data[[i]])
factor = format(dates, '%Y%m%d')
gap = tapply(dates, factor, function(x) max(diff(x)))

gap[names(gap[gap > 4*60])]
data[[i]]['2013:02:19']

i = 'SPY'
dates = index(data[[i]])
factor = format(dates, '%Y%m%d')
gap = tapply(dates, factor, function(x) max(diff(x)))

gap[names(gap[gap > 4*60])]
data[[i]]['2013:02:19']
Please see below the dates for GLD with gaps over 4 minutes:
20120801 12
20121226 22
20130219 48
20130417  6
20130531  6
20130705  8
20131105  4
20131112  4
20140124 14
20140210 22
20140303  6
A detailed look at Feb 19th, 2013 shows a 48 minute gap between 14:54 and 15:42:
> data[[i]]['2013:02:19 14:50::2013:02:19 15:45']
                        open    high      low   close volume
2013-02-19 14:50:54 155.3110 155.315 155.3001 155.315   8900
2013-02-19 14:51:56 155.3100 155.310 155.3100 155.310 119900
2013-02-19 14:52:52 155.3100 155.330 155.3000 155.305 354600
2013-02-19 14:53:55 155.2990 155.300 155.2800 155.280      0
2013-02-19 14:54:54 155.2900 155.290 155.2659 155.279  10500
2013-02-19 15:42:57 155.3400 155.360 155.3400 155.350 587900
2013-02-19 15:43:57 155.3501 155.355 155.3300 155.332   8300
2013-02-19 15:44:59 155.3395 155.340 155.3200 155.340  10700
2013-02-19 15:45:55 155.3300 155.340 155.3300 155.340   5100
So there is definitely something going on with data acquisition at that time.
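The gap check itself needs nothing beyond base R. The sketch below uses a handful of made-up time stamps that imitate the Feb 19th case, and it converts the time stamps to numeric seconds before differencing; that conversion is a choice made for this illustration, because diff() on date-times picks its units automatically and the comparison against 4*60 is only meaningful in seconds.

```r
# Base R sketch with made-up time stamps that imitate the Feb 19th, 2013 gap.
times <- c('2013-02-19 14:53:00', '2013-02-19 14:54:00',
           '2013-02-19 15:42:00', '2013-02-19 15:43:00',
           '2013-02-20 09:31:00', '2013-02-20 09:32:00',
           '2013-02-20 09:33:00')
dates <- as.POSIXct(times, tz = 'UTC')

day <- format(dates, '%Y%m%d')

# Largest gap within each day, in seconds
gap.sec <- tapply(as.numeric(dates), day, function(x) max(diff(x)))

# Days with a gap of more than 4 minutes
bad.days <- gap.sec[gap.sec > 4 * 60]
print(bad.days / 60)   # gap length in minutes
```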
Next, let’s compare Intraday data with daily data:
#*****************************************************************
# Data check : compare with daily
#******************************************************************
data.daily <- new.env()
quantmod::getSymbols(spl('SPY,GLD'), src = 'yahoo', from = '1970-01-01', env = data.daily, auto.assign = T)

layout(1)
plota(data$GLD, type='l', col='blue', main='GLD')
    plota.lines(data.daily$GLD, type='l', col='red')
plota.legend('Intraday,Daily', 'blue,red')

plota(data$SPY, type='l', col='blue', main='SPY')
    plota.lines(data.daily$SPY, type='l', col='red')
plota.legend('Intraday,Daily', 'blue,red')
The Intraday data matches Daily data very well.
Please note that the raw Intraday data comes with time stamps in seconds. For back-testing purposes we will also want to round each date time up to the next minute, so that we can merge the Intraday data series without introducing multiple entries for the same minute. For example:
#*****************************************************************
# Round to the next minute
#******************************************************************
GLD.sample = data$GLD['2012:07:10::2012:07:10 09:35']
SPY.sample= data$SPY['2012:07:10::2012:07:10 09:35']

merge( Cl(GLD.sample), Cl(SPY.sample) )

# round to the next minute
index(GLD.sample) = as.POSIXct(format(index(GLD.sample) + 60, '%Y-%m-%d %H:%M'), format = '%Y-%m-%d %H:%M')
index(SPY.sample) = as.POSIXct(format(index(SPY.sample) + 60, '%Y-%m-%d %H:%M'), format = '%Y-%m-%d %H:%M')

merge( Cl(GLD.sample), Cl(SPY.sample) )
> merge( Cl(GLD.sample), Cl(SPY.sample) )
                       close close.1
2012-07-10 09:30:59 155.0900 136.030
2012-07-10 09:31:59 155.1200 136.139
2012-07-10 09:32:58 155.1100      NA
2012-07-10 09:32:59       NA 136.180
2012-07-10 09:33:56 155.1400      NA
2012-07-10 09:33:59       NA 136.100
2012-07-10 09:34:59 155.0999 136.110
2012-07-10 09:35:59 155.0200 136.180

> merge( Cl(GLD.sample), Cl(SPY.sample) )
                       close close.1
2012-07-10 09:31:00 155.0900 136.030
2012-07-10 09:32:00 155.1200 136.139
2012-07-10 09:33:00 155.1100 136.180
2012-07-10 09:34:00 155.1400 136.100
2012-07-10 09:35:00 155.0999 136.110
2012-07-10 09:36:00 155.0200 136.180
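The rounding step works by adding 60 seconds and then dropping the seconds. A base R sketch on three of the time stamps from the output above shows the effect in isolation; the fixed 'UTC' time zone is an assumption made so the example behaves the same on any machine.

```r
# Base R sketch; the 'UTC' time zone is an assumption for reproducibility.
stamps <- as.POSIXct(c('2012-07-10 09:32:58', '2012-07-10 09:32:59',
                       '2012-07-10 09:33:56'), tz = 'UTC')

# Add one minute, then truncate to the minute by formatting without seconds
rounded <- as.POSIXct(format(stamps + 60, '%Y-%m-%d %H:%M'),
                      format = '%Y-%m-%d %H:%M', tz = 'UTC')

print(format(rounded, '%H:%M:%S'))
```

Note that a bar stamped exactly on the minute would also be moved to the following minute under this rule, so it is a shift to the next minute rather than rounding to the nearest one.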
I got the impression that this Intraday data is not really authentic, but was collected by taking Intraday snapshots of the quotes and later processed to create one minute bars. But I might be wrong.
Next, let’s clean the Intraday data by removing any day with time gaps over 4 minutes, and let’s round all times up to the next minute:
#*****************************************************************
# Clean data
#******************************************************************
# remove dates with gaps over 4 min
for(i in ls(data)) {
    dates = index(data[[i]])
    factor = format(dates, '%Y%m%d')
    gap = tapply(dates, factor, function(x) max(diff(x)))
    data[[i]] = data[[i]][ is.na(match(factor, names(gap[gap > 4*60]))) ]
}

common = unique(format(index(data[[ls(data)[1]]]), '%Y%m%d'))
for(i in ls(data)) {
    dates = index(data[[i]])
    factor = format(dates, '%Y%m%d')
    common = intersect(common, unique(factor))
}

# remove days that are not present in both time series
for(i in ls(data)) {
    dates = index(data[[i]])
    factor = format(dates, '%Y%m%d')
    data[[i]] = data[[i]][!is.na(match(factor, common)),]
}

#*****************************************************************
# Round to the next minute
#******************************************************************
for(i in ls(data))
    index(data[[i]]) = as.POSIXct(format(index(data[[i]]) + 60, '%Y-%m-%d %H:%M'), tz = Sys.getenv('TZ'), format = '%Y-%m-%d %H:%M')
Once Intraday data is ready, we can test a simple equal weight strategy:
#*****************************************************************
# Load historical data
#******************************************************************
bt.prep(data, align='keep.all', fill.gaps = T)

prices = data$prices
dates = data$dates
nperiods = nrow(prices)

models = list()

#*****************************************************************
# Benchmarks
#******************************************************************
data$weight[] = NA
    data$weight$SPY = 1
models$SPY = bt.run.share(data, clean.signal=F)

data$weight[] = NA
    data$weight$GLD = 1
models$GLD = bt.run.share(data, clean.signal=F)

data$weight[] = NA
    data$weight$SPY = 0.5
    data$weight$GLD = 0.5
models$EW = bt.run.share(data, clean.signal=F)

#*****************************************************************
# Create Report
#******************************************************************
strategy.performance.snapshoot(models, T)
In this post, I tried to outline the basic steps you need to take if you are planning to work with a new data source. Next, I plan to follow with more examples of testing Intraday strategies.
To view the complete source code for this example, please have a look at the bt.intraday.thebonnotgang.test() function in bt.test.r at github.