Archive for January, 2012

Multiple Factor Model – Fundamental Data

January 29, 2012

The Multiple Factor Model can be used to decompose returns and calculate risk. There are many examples of Multiple Factor Models in use across the industry and in academic research.
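As a quick refresher (standard textbook notation, not tied to any particular commercial model), a multiple factor model expresses each stock's return as a combination of common factor returns plus a stock-specific residual:

	Return(i) = Alpha(i) + Sum over factors k [ Exposure(i,k) x Factor.Return(k) ] + Residual(i)

Once returns are decomposed this way, portfolio risk can be split into a piece explained by the factor covariances and a piece coming from the stock-specific residuals (assuming the residuals are uncorrelated with the factors).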

The factors in the model are usually created from pricing, fundamental, analyst-estimate, and proprietary data. I will only show examples of factors built from pricing and fundamental data because this information is readily available from Yahoo Finance and ADVFN.

This is the first post in a series about Multiple Factor Models. In this post I will show how to get a company's fundamental data into R, create a simple factor, and run a correlation analysis. In the next posts, I will show how to:

  • Build Factors and compute quantile spreads
  • Backtest Multiple Factor Model
  • Calculate Risk using Multiple Factor Model

I created a fund.data() function in fundamental.data.r at github to download a company's historical fundamental data from ADVFN. The following code loads historical quarterly fundamental data for Wal-Mart Stores and computes rolling annual Earnings per Share (EPS) using the Systematic Investor Toolbox:

###############################################################################
# Load Systematic Investor Toolbox (SIT)
# http://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)


###############################################################################
# determine date when fundamental data is available
# use 'date preliminary data loaded' when available
# otherwise lag 'quarter end date' 2 months for Q1/2/3 and 3 months for Q4
###############################################################################		
date.fund.data <- function(data)
{
	# construct date
	quarter.end.date = as.Date(paste(data['quarter end date',], '/1', sep=''), '%Y/%m/%d')	
	quarterly.indicator = data['quarterly indicator',]
	date.preliminary.data.loaded = as.Date(data['date preliminary data loaded',], '%Y-%m-%d') + 1
	
	months = seq(quarter.end.date[1], tail(quarter.end.date,1)+365, by='1 month') 
	index = match(quarter.end.date, months)
	quarter.end.date = months[ iif(quarterly.indicator == '4', index+3, index+2) + 1 ] - 1
		
	fund.date = date.preliminary.data.loaded
		fund.date[is.na(fund.date)] = quarter.end.date[is.na(fund.date)] 

	return(fund.date)
}

	#*****************************************************************
	# Load historical fundamental data
	# http://advfn.com/p.php?pid=financials&symbol=NYSE:WMT&mode=quarterly_reports
	#****************************************************************** 
	Symbol = 'NYSE:WMT'	
	fund = fund.data(Symbol, 80)
	
	# construct date
	fund.date = date.fund.data(fund)	
	
	#*****************************************************************
	# Create and Plot Earnings per share
	#****************************************************************** 
	EPS.Q = as.double(fund['Diluted EPS from Total Operations',])
		EPS.Q = as.xts(EPS.Q, fund.date)	
	EPS = runSum(EPS.Q, 4)

	# Plot
	layout(1:2)
	par(mar=c(2,2,2,1))
	x = barplot(EPS.Q, main='Wal-Mart Quarterly Earnings per share', border=NA)
	text(x, EPS.Q, fund['quarterly indicator',], adj=c(0.5,-0.3), cex=0.8, xpd = TRUE)

	barplot(EPS, main='Wal-Mart Rolling Annual Earnings per share', border=NA)

You can see a pronounced seasonality in the quarterly EPS for Wal-Mart: Q4 is always strong and stands out. The common way to deal with seasonality in income statement items is to use a rolling annual sum, i.e. the sum of the last 4 quarters.

Next let’s align Wal-Mart prices and EPS and plot them on the same graph.

	#*****************************************************************
	# Load historical data
	#****************************************************************** 
	load.packages('quantmod')
	tickers = 'WMT'
		
	data <- new.env()
	getSymbols(tickers, src = 'yahoo', from = '1980-01-01', env = data, auto.assign = T)
		for(i in ls(data)) data[[i]] = adjustOHLC(data[[i]], use.Adjusted=T)
		
	data$WMT = merge(data$WMT, EPS)
		# back fill EPS
		data$WMT$EPS = ifna.prev(coredata(data$WMT$EPS))	
	
	# Plot
	y = data$WMT['1990::']
	plota(Cl(y), type = 'l', LeftMargin=3)
			
	plota2Y(y$EPS, type='l', las=1, col='red', col.axis = 'red')
								
	plota.legend('WMT(rhs),WMT.EPS(lhs)', 'blue,red', list(Cl(y),y$EPS))

Next let’s repeat the above steps for all companies in the Dow Jones index.

	#*****************************************************************
	# Load historical data
	#****************************************************************** 
	load.packages('quantmod')		
	tickers = dow.jones.components()
	
	# get fundamental data
	data.fund <- new.env()
		temp = paste(iif( nchar(tickers) <= 3, 'NYSE:', 'NASDAQ:'), tickers, sep='')
		for(i in 1:len(tickers)) data.fund[[tickers[i]]] = fund.data(temp[i], 80)
	save(data.fund, file='data.fund.Rdata')
	
		
	# get pricing data
	data <- new.env()
	getSymbols(tickers, src = 'yahoo', from = '1970-01-01', env = data, auto.assign = T)
		for(i in ls(data)) data[[i]] = adjustOHLC(data[[i]], use.Adjusted=T)	
	save(data, file='data.Rdata')
	

	#load(file='data.fund.Rdata')
	#load(file='data.Rdata')
	
			
	# combine fundamental and pricing data
	for(i in tickers) {
		fund = data.fund[[i]]
		fund.date = date.fund.data(fund)
		
		EPS.Q = as.double(fund['Diluted EPS from Total Operations',])
			EPS.Q = as.xts(EPS.Q, fund.date)	
		EPS = runSum(EPS.Q, 4)
				
		data[[i]] = merge(data[[i]], EPS)
	}

	bt.prep(data, align='keep.all', dates='1995::2011')

It takes a while to download historical fundamental data for all companies in the Dow Jones index, so I recommend saving your results with the save(data.fund, file='data.fund.Rdata') command. Later on, if you want to run the code again, just call load(file='data.fund.Rdata') instead of downloading all the data again.
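For example, one way to wrap the download step above in a simple cache (a sketch using base R file.exists / load / save; the file name is arbitrary) is:

	# download fundamental data only if a cached copy is not found
	if( file.exists('data.fund.Rdata') ) {
		load(file='data.fund.Rdata')
	} else {
		data.fund <- new.env()
			temp = paste(iif( nchar(tickers) <= 3, 'NYSE:', 'NASDAQ:'), tickers, sep='')
			for(i in 1:len(tickers)) data.fund[[tickers[i]]] = fund.data(temp[i], 80)
		save(data.fund, file='data.fund.Rdata')
	}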

Next let’s create monthly factors. EP factor = (Earnings per share) / Price. VOMO factor = Volume x Momentum.

	#*****************************************************************
	# Compute monthly factors
	#****************************************************************** 
	prices = data$prices
		prices = bt.apply.matrix(prices, function(x) ifna.prev(x))
	
	# create factors
	factors = list()

	# E/P
	EPS = bt.apply(data, function(x) ifna.prev(x[, 'EPS']))
	factors$EP = EPS / prices
			
	# VOMO - Volume x Momentum
	volume = bt.apply(data, function(x) ifna.prev(Vo(x)))
	factors$VOMO = (prices / mlag(prices,10) - 1) * bt.apply.matrix(volume, runMean, 22) / bt.apply.matrix(volume, runMean, 66)
		
	
	# find month ends
	month.ends = endpoints(prices, 'months')
	
	prices = prices[month.ends,]
	n = ncol(prices)
	nperiods = nrow(prices)
	
	ret = prices / mlag(prices) - 1
	next.month.ret = mlag(ret, -1)
	
	factors$EP = factors$EP[month.ends,]	
	factors$VOMO = factors$VOMO[month.ends,]			

Next let's run a correlation analysis for the EP factor. You can do the correlation analysis for the VOMO factor as homework (see the sketch after the results below).

	#*****************************************************************
	# Correlation Analysis
	#****************************************************************** 
	x = as.vector(factors$EP)
 	y = as.vector(next.month.ret)
 	
 	cor.test(x, y, use = 'complete.obs', method = 'pearson')			

 	# Plot
	par(mar=c(4,4,2,1)) 	 	 	
 	plot(x, y, pch=20, main='Correlation Analysis for EP factor', xlab='EP', ylab='Next Month Return')
 		abline(lm(y ~ x), col='blue', lwd=2)

>  cor.test(x, y, use = 'complete.obs', method = 'pearson')
        Pearson's product-moment correlation
data:  x and y
t = 3.6931, df = 5867, p-value = 0.0002235
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.02260247 0.07365350
sample estimates:
       cor
0.04815943

The correlation between EP and next month returns is small, but significantly different from zero. A small correlation is not a surprise and is typical for this type of analysis. In the next posts, I will show that even this weak dependence can be profitable.
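As a starting point for the homework above, the same analysis carries over to the VOMO factor by swapping the factor matrix (a minimal sketch mirroring the EP code):

	#*****************************************************************
	# Correlation Analysis for VOMO factor (homework sketch)
	#****************************************************************** 
	x = as.vector(factors$VOMO)
	y = as.vector(next.month.ret)

	cor.test(x, y, use = 'complete.obs', method = 'pearson')

	# Plot
	par(mar=c(4,4,2,1))
	plot(x, y, pch=20, main='Correlation Analysis for VOMO factor', xlab='VOMO', ylab='Next Month Return')
		abline(lm(y ~ x), col='blue', lwd=2)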

To view the complete source code for this example, please have a look at the fm.fund.data.test() function in factor.model.test.r at github.

Categories: Factor Model, Factors, R, Strategy

Time Series Matching with Dynamic Time Warping

January 20, 2012

THIS IS NOT INVESTMENT ADVICE. The information is provided for informational purposes only.

In the Time Series Matching post, I used a one-to-one mapping to compute the distance between the query (current pattern) and the reference (historical time series). The following chart visualizes this concept: the distance is the sum of the vertical lines.

An alternative way to map one time series to another is Dynamic Time Warping (DTW). The DTW algorithm looks for the minimum-distance mapping between the query and the reference. The following chart visualizes the one-to-many mapping possible with DTW.
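To make the difference concrete, here is a small standalone toy example (made-up numbers, using the dtw package and base R dist; it is not part of the SIT code that follows):

	# toy example: one-to-one Euclidean distance vs Dynamic Time Warping
	library(dtw)
	query     = c(1, 2, 3, 4, 3, 2, 1)
	reference = c(1, 1, 2, 3, 4, 3, 2)

	# one-to-one mapping: distance between points at the same index
	dist(rbind(query, reference))

	# one-to-many mapping: DTW finds the minimum-distance alignment
	alignment = dtw(query, reference)
	alignment$distance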

To check if there is a difference between the simple one-to-one mapping and DTW, I will search for time series matches that are similar to the most recent 90 days of SPY in the last 10 years of history. The following code loads historical prices from Yahoo Finance, sets up the problem, and computes the Euclidean distance over the historical rolling window using the Systematic Investor Toolbox:

###############################################################################
# Load Systematic Investor Toolbox (SIT)
# http://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

	#*****************************************************************
	# Load historical data
	#****************************************************************** 
	load.packages('quantmod')	
	tickers = 'SPY'

	data = getSymbols(tickers, src = 'yahoo', from = '1950-01-01', auto.assign = F)

	#*****************************************************************
	# Euclidean distance, one to one mapping
	#****************************************************************** 
	obj = bt.matching.find(Cl(data), normalize.fn = normalize.mean, dist.fn = 'dist.euclidean', plot=T)

	matches = bt.matching.overlay(obj, plot.index=1:90, plot=T)

	layout(1:2)
	matches = bt.matching.overlay(obj, plot=T, layout=T)
	bt.matching.overlay.table(obj, matches, plot=T, layout=T)

Next, let's examine the top 10 matches using the Dynamic Time Warping distance. I will use the Dynamic Time Warping implementation from the dtw package.

	#*****************************************************************
	# Dynamic time warping distance	
	#****************************************************************** 
	# http://en.wikipedia.org/wiki/Dynamic_time_warping
	# http://dtw.r-forge.r-project.org/
	#****************************************************************** 
	load.packages('dtw')

	obj = bt.matching.find(Cl(data), normalize.fn = normalize.mean, dist.fn = 'dist.DTW', plot=T)

	matches = bt.matching.overlay(obj, plot.index=1:90, plot=T)

	layout(1:2)
	matches = bt.matching.overlay(obj, plot=T, layout=T)
	bt.matching.overlay.table(obj, matches, plot=T, layout=T)

Both algorithms produced very similar matches and very similar predictions. I would use these predictions as an educated guess about market action going forward. So far, it looks like the market will not be going up at full throttle in the next 22 days.

To view the complete source code for this example, please have a look at the bt.matching.dtw.test() function in bt.test.r at github.

Categories: R, Strategy, Trading Strategies

Time Series Matching strategy backtest

January 17, 2012

This is a quick post to address comments raised in the Time Series Matching post. I will show a very simple example of backtesting a Time Series Matching strategy using a distance weighted prediction. I have to warn you: the strategy's performance is worse than Buy and Hold.

I used the code from the Time Series Matching post and re-arranged it into 3 functions:

  • bt.matching.find – finds historical matches similar to the given query (pattern).
  • bt.matching.overlay – creates a matrix that overlays all matches one on top of each other.
  • bt.matching.overlay.table – creates and plots a summary table of the matches.

I will use historical prices for ^GSPC to extend the SPY time series. I will create a monthly backtest that trades at the end of the month, starting in January 1994. Each month, I will look for the best historical matches to the last 90 days within the last 10 years of history.

I will compute a distance weighted average prediction for the next month and will go long if the prediction is positive, otherwise stay in cash. This is a very simple backtest, and the strategy's performance is worse than Buy and Hold.
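To make the distance weighted average prediction concrete, here is a toy illustration with made-up numbers (the backtest loop below uses a slightly different, rank-based weighting scheme):

	# toy example: distance weighted prediction (illustration only)
	month.ahead.forecast = c(0.02, -0.01, 0.03)  # next month return following each match
	match.dist           = c(1.0, 1.5, 2.0)      # distance of each match to the query

	# closer matches (smaller distance) get larger weights; weights sum to 1
	weights = (1 / match.dist) / sum(1 / match.dist)
	prediction = weighted.mean(month.ahead.forecast, w = weights)

	# go long next month if the prediction is positive, otherwise stay in cash
	position = ifelse(prediction > 0, 1, 0)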

The following code loads historical prices from Yahoo Finance and sets up the Time Series Matching strategy backtest using the Systematic Investor Toolbox:

###############################################################################
# Load Systematic Investor Toolbox (SIT)
###############################################################################
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

	#*****************************************************************
	# Load historical data
	#****************************************************************** 
	load.packages('quantmod')	
	tickers = spl('SPY,^GSPC')

	data <- new.env()
	getSymbols(tickers, src = 'yahoo', from = '1950-01-01', env = data, auto.assign = T)
	bt.prep(data, align='keep.all')

	# combine SPY and ^GSPC
	scale = as.double( data$prices$SPY['1993:01:29'] / data$prices$GSPC['1993:01:29'] )
	hist = c(scale * data$prices$GSPC['::1993:01:28'], data$prices$SPY['1993:01:29::'])

	#*****************************************************************
	# Backtest setup:
	# Starting January 1994, each month search for the 10 best matches 
	# similar to the last 90 days in the last 10 years of history data
	#
	# Invest next month if distance weighted prediction is positive
	# otherwise stay in cash
	#****************************************************************** 
	month.ends = endpoints(hist, 'months')
		month.ends = month.ends[month.ends > 0]		
	
	start.index = which(date.year(index(hist[month.ends])) == 1994)[1]
	weight = hist * NA
	
	for( i in start.index : len(month.ends) ) {
		obj = bt.matching.find(hist[1:month.ends[i],], normalize.fn = normalize.first)
		matches = bt.matching.overlay(obj)
		
		# compute prediction for next month
		n.match = len(obj$min.index)
		n.query = len(obj$query)				
		month.ahead.forecast = matches[,(2*n.query+22)]/ matches[,2*n.query] - 1
		
		# Distance weighted average
		temp = round(100*(obj$dist / obj$dist[1] - 1))		
			n.weight = max(temp) + 1
			weights = (n.weight - temp) / ( n.weight * (n.weight+1) / 2)
		weights = weights / sum(weights)
			# barplot(weights)
		avg.direction = weighted.mean(month.ahead.forecast[1:n.match], w=weights)
		
		# Logic
		weight[month.ends[i]] = 0
		if( avg.direction > 0 ) weight[month.ends[i]] = 1
	}

Next, let’s compare the Time Series Matching strategy to Buy & Hold:

	#*****************************************************************
	# Code Strategies
	#****************************************************************** 
	tickers = 'SPY'

	data <- new.env()
	getSymbols(tickers, src = 'yahoo', from = '1950-01-01', env = data, auto.assign = T)
	bt.prep(data, align='keep.all')
	
	prices = data$prices  
	
	# Buy & Hold	
	data$weight[] = 1
	buy.hold = bt.run(data)	

	# Strategy
	data$weight[] = NA
		data$weight[] = weight['1993:01:29::']
		capital = 100000
		data$weight[] = (capital / prices) * bt.exrem(data$weight)
	test = bt.run(data, type='share', capital=capital, trade.summary=T)
		
	#*****************************************************************
	# Create Report
	#****************************************************************** 
	plotbt.custom.report.part1(test, buy.hold, trade.summary=T)

How would you change the strategy or backtest to make it profitable? Please share your ideas. I am looking forward to exploring them.

To view the complete source code for this example, please have a look at the bt.matching.backtest.test() function in bt.test.r at github.

Time Series Matching

January 13, 2012

THIS IS NOT INVESTMENT ADVICE. The information is provided for informational purposes only.

If it looks like a duck, swims like a duck, and quacks like a duck, then it probably is a duck.

Do you want to know what the S&P 500 will do in the next week, month, or quarter? One way to make an educated guess is to find historical periods similar to the current market environment and examine what happened afterwards. I will call this process time series matching, but you can find similar techniques referred to as technical patterns and fractals. To get some flavor of fractals, I recently read two interesting articles on the topic.

I recommend reading the following article about time series matching to understand the different approaches:

I will use a simple method outlined in the article How to Accelerate Model Deployment using Rook by Jean-Robert Avettand-Fenoel to find time series matches that are similar to the most recent 90 days of SPY.

The following code loads historical prices from Yahoo Finance, sets up the problem, and computes the Euclidean distance over the historical rolling window using the Systematic Investor Toolbox:

###############################################################################
# Load Systematic Investor Toolbox (SIT)
###############################################################################
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

	#*****************************************************************
	# Load historical data
	#****************************************************************** 
	load.packages('quantmod')	
	tickers = 'SPY'

	data = getSymbols(tickers, src = 'yahoo', from = '1950-01-01', auto.assign = F)

	#*****************************************************************
	# Setup search
	#****************************************************************** 
	data = last(data, 252*10)
	reference = coredata(Cl(data))
		n = len(reference)
		query = reference[(n-90+1):n]	
		reference = reference[1:(n-90)]
		
		n.query = len(query)
		n.reference = len(reference)

	#*****************************************************************
	# Compute Distances
	#****************************************************************** 		
	dist = rep(NA, n.reference)
	query.normalized = (query - mean(query)) / sd(query)
	
	for( i in n.query : n.reference ) {
		window = reference[ (i - n.query + 1) : i]
		window.normalized = (window - mean(window)) / sd(window)
		dist[i] = stats:::dist(rbind(query.normalized, window.normalized))
	}

Next, let’s select the best 10 matches to the ‘query’ pattern in the SPY history:

	#*****************************************************************
	# Find Matches
	#****************************************************************** 			
	min.index = c()
	n.match = 10
	
	# only look at the minimums 
	temp = dist
		temp[ temp > mean(dist, na.rm=T) ] = NA
		
	# remove n.query, points to the left/right of the minimums
	for(i in 1:n.match) {
		if(any(!is.na(temp))) {
			index = which.min(temp)
			min.index[i] = index
			temp[max(0,index - 2*n.query) : min(n.reference,(index + n.query))] = NA
		}
	}
	n.match = len(min.index)
		
	#*****************************************************************
	# Plot Matches
	#****************************************************************** 		
	dates = index(data)[1:len(dist)]
	
	par(mar=c(2, 4, 2, 2))
	plot(dates, dist, type='l',col='gray', main='Top Matches', ylab='Euclidean Distance', xlab='')
		abline(h = mean(dist, na.rm=T), col='darkgray', lwd=2)
		points(dates[min.index], dist[min.index], pch=22, col='red', bg='red')
		text(dates[min.index], dist[min.index], 1:n.match, adj=c(1,1), col='black',xpd=TRUE)
		

	plota(data, type='l', col='gray', main=tickers)
		plota.lines(last(data,90), col='blue')
		for(i in 1:n.match) {
		plota.lines(data[(min.index[i]-n.query + 1):min.index[i]], col='red')
		}
		text(index(data)[min.index - n.query/2], reference[min.index - n.query/2], 1:n.match, 
			adj=c(1,-1), col='black',xpd=TRUE)
		plota.legend('Pattern,Match #','blue,red')

Next, let’s overlay all matches with the ‘query’ pattern and examine their historical performance after the match took place:

	#*****************************************************************
	# Overlay all Matches
	#****************************************************************** 		
	matches = matrix(NA, nr=(n.match+1), nc=3*n.query)
	temp = c(rep(NA, n.query), reference, query)
	for(i in 1:n.match) {
		matches[i,] = temp[ (min.index[i] - n.query + 1):(min.index[i] + 2*n.query) ]	
	}

	# add the 'query' pattern
	matches[(n.match+1),] = temp[ (len(temp) - 2*n.query + 1):(len(temp) + n.query) ]		
	
	# normalize
	for(i in 1:(n.match+1)) {
		matches[i,] = matches[i,] / matches[i,n.query]
	}
		

	#*****************************************************************
	# Plot all Matches
	#****************************************************************** 				
	temp = 100 * ( t(matches[,-c(1:n.query)]) - 1)
	
	par(mar=c(2, 4, 2, 2))
	matplot(temp, type='l',col='gray',lwd=2, lty='dotted', xlim=c(1,2.5*n.query),
		main = paste('Pattern Prediction with', n.match, 'neighbours'),ylab='Normalized', xlab='')
		lines(temp[,(n.match+1)], col='black',lwd=4)
	
	points(rep(2*n.query,n.match), temp[2*n.query,1:n.match], pch=21, lwd=2, col='gray', bg='gray')
				
	bt.plot.dot.label <- function(x, data, xfun, col='red') {
		for(j in 1:len(xfun)) {
			y = match.fun(xfun[[j]])(data)
			points(x, y, pch=21, lwd=4, col=col, bg=col)
			text(x, y, paste(names(xfun)[j], ':', round(y,1),'%'),
				adj=c(-0.1,0), cex = 0.8, col=col,xpd=TRUE)			
		}
	}
	
	bt.plot.dot.label(2*n.query, temp[2*n.query,1:n.match], 
		list(Min=min,Max=max,Median=median,'Bot 25%'=function(x) quantile(x,0.25),'Top 75%'=function(x) quantile(x,0.75)))
	bt.plot.dot.label(n.query, temp[n.query,(n.match+1)], list(Current=min))

Next, let’s summarize all matches performance in a table:

	#*****************************************************************
	# Table with predictions
	#****************************************************************** 		
	temp = matrix( double(), nr=(n.match+4), 6)
		rownames(temp) = c(1:n.match, spl('Current,Min,Average,Max'))
		colnames(temp) = spl('Start,End,Return,Week,Month,Quarter')
		
	# compute returns
	temp[1:(n.match+1),'Return'] = matches[,2*n.query]/ matches[,n.query]
	temp[1:(n.match+1),'Week'] = matches[,(2*n.query+5)]/ matches[,2*n.query]
	temp[1:(n.match+1),'Month'] = matches[,(2*n.query+20)]/ matches[,2*n.query]
	temp[1:(n.match+1),'Quarter'] = matches[,(2*n.query+60)]/ matches[,2*n.query]
			
	# compute average returns
	index = spl('Return,Week,Month,Quarter')
	temp['Min', index] = apply(temp[1:(n.match+1),index],2,min,na.rm=T)
	temp['Average', index] = apply(temp[1:(n.match+1),index],2,mean,na.rm=T)
	temp['Max', index] = apply(temp[1:(n.match+1),index],2,max,na.rm=T)
	
	# format
	temp[] = plota.format(100*(temp-1),1,'','%')
		
	# enter dates
	temp['Current', 'Start'] = format(index(last(data,90)[1]), '%d %b %Y')
	temp['Current', 'End'] = format(index(last(data,1)[1]), '%d %b %Y')
	for(i in 1:n.match) {
		temp[i, 'Start'] = format(index(data[min.index[i] - n.query + 1]), '%d %b %Y')
		temp[i, 'End'] = format(index(data[min.index[i]]), '%d %b %Y')	
	}
		
	# plot table
	plot.table(temp, smain='Match #')

The Time Series Matching analysis can be used to make an educated guess about what the S&P 500 will do in the next week, month, or quarter. This educated guess is based on historical data, and there are no guarantees that history will repeat itself.

In the next post, I will examine other distance measures for Time Series Matching and show an example of Dynamic Time Warping.

To view the complete source code for this example, please have a look at the bt.matching.test() function in bt.test.r at github.

Categories: R, Strategy, Trading Strategies

Trading using Garch Volatility Forecast

January 6, 2012

Quantum Financier wrote an interesting article, Regime Switching System Using Volatility Forecast. The article presents an elegant algorithm to switch between mean-reversion and trend-following strategies based on market volatility. Two models are examined: one using historical volatility and another using a GARCH(1,1) volatility forecast. The mean-reversion strategy is modeled with RSI(2): Long when RSI(2) < 50, and Short otherwise. The trend-following strategy is modeled with an SMA 50/200 crossover: Long when SMA(50) > SMA(200), and Short otherwise.

I want to show how to implement these ideas using the backtesting library in the Systematic Investor Toolbox.

The following code loads historical prices from Yahoo Finance and compares the performance of the Buy and Hold, Mean-Reversion, and Trend-Following strategies using the backtesting library in the Systematic Investor Toolbox:

###############################################################################
# Load Systematic Investor Toolbox (SIT)
###############################################################################
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

	#*****************************************************************
	# Load historical data
	#****************************************************************** 
	load.packages('quantmod')	
	tickers = 'SPY'

	data <- new.env()
	getSymbols(tickers, src = 'yahoo', from = '1970-01-01', env = data, auto.assign = T)
		for(i in ls(data)) data[[i]] = adjustOHLC(data[[i]], use.Adjusted=T)		
	bt.prep(data, align='remove.na', dates='2000::2012')

	#*****************************************************************
	# Code Strategies
	#****************************************************************** 
	prices = data$prices  
	n = len(tickers)  
	nperiods = nrow(prices)
	
	# Buy & Hold	
	data$weight[] = 1
	buy.hold = bt.run(data)	

		
	# Mean-Reversion(MR) strategy - RSI2
	rsi2 = bt.apply.matrix(prices, RSI, 2)
	data$weight[] = NA
		data$weight[] = iif(rsi2 < 50, 1, -1)	
		capital = 100000
		data$weight[] = (capital / prices) * bt.exrem(data$weight)
	mr = bt.run(data, type='share', capital=capital, trade.summary=T)
				
		
	# Trend Following(TF) strategy - MA 50/200 crossover
	sma.short = bt.apply.matrix(prices, SMA, 50)
	sma.long = bt.apply.matrix(prices, SMA, 200)
	data$weight[] = NA
		data$weight[] = iif(sma.short > sma.long, 1, -1)
		capital = 100000
		data$weight[] = (capital / prices) * bt.exrem(data$weight)
	tf = bt.run(data, type='share', capital=capital, trade.summary=T)
		
	#*****************************************************************
	# Create Report
	#****************************************************************** 
	plotbt.custom.report.part1(mr, tf, buy.hold)

Next, let’s create a strategy that switches between mean-reversion and trend-following strategies based on historical market volatility.

	#*****************************************************************
	# Regime Switching based on historical market volatility
	# Classify current volatility by percentile using a 252 day look-back period
	# percentrank(MA(percentrank(Stdev( diff(log(close)) ,21),252),21),250)
	#****************************************************************** 
	ret.log = bt.apply.matrix(prices, ROC, type='continuous')
	hist.vol = bt.apply.matrix(ret.log, runSD, n = 21)
	vol.rank = percent.rank(SMA(percent.rank(hist.vol, 252), 21), 250)

	# Regime Switching Historical
	data$weight[] = NA
		data$weight[] = iif(vol.rank > 0.5, 
					iif(rsi2 < 50, 1, -1),
					iif(sma.short > sma.long, 1, -1)
				)
		capital = 100000
		data$weight[] = (capital / prices) * bt.exrem(data$weight)
	regime.switching = bt.run(data, type='share', capital=capital, trade.summary=T)
	
	#*****************************************************************
	# Create Report
	#****************************************************************** 
	plotbt.custom.report.part1(regime.switching, mr, tf, buy.hold)

Next, let's create a GARCH(1,1) volatility forecast. For anyone who wants to find out what GARCH is all about or to refresh their knowledge, there are many good introductory articles on the subject.
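For reference (standard textbook notation, not specific to either R package discussed below), the GARCH(1,1) model describes the conditional variance recursively as

	Sigma[t]^2 = w + a * r[t-1]^2 + b * Sigma[t-1]^2

where r[t-1] is the previous day's return. The same recursion is used by the one-day-ahead forecast helpers in the code below.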

There are a few R packages to fit GARCH models. I will consider the garch function from the tseries package and the garchFit function from the fGarch package. The garch function from tseries is fast but does not always find a solution. The garchFit function from fGarch is slower but converges more consistently. To demonstrate the speed difference between the garch and garchFit functions, I created a simple benchmark:

	#*****************************************************************
	# Benchmarking Garch algorithms 
	#****************************************************************** 
	load.packages('tseries,fGarch,rbenchmark')	
	temp = garchSim(n=252)

	test1 <- function() {
		fit1=garch(temp, order = c(1, 1), control = garch.control(trace = F))
	}
	test2 <- function() {
		fit2=garchFit(~ garch(1,1), data = temp, include.mean=FALSE, trace=F)
	}
		 	
	benchmark(
		test1(),
		test2(),
		columns=spl('test,replications,elapsed,relative'),
		order='relative',
		replications=100
	)

The garchFit function is on average 6 times slower than the garch function. So, to forecast volatility, I will try to use the garch function whenever it can find a solution and the garchFit function otherwise.

	#*****************************************************************
	# Forecast Volatility using Garch	
	# garch from tseries is fast, but does not consistently converge
	# garchFit from fGarch is slower, but converges consistently
	#****************************************************************** 
	load.packages('tseries,fGarch')	
			
	# Sigma[t]^2 = w + a*r[t-1]^2 + b*Sigma[t-1]^2
	garch.predict.one.day <- function(fit, r.last) 
	{
		h.last = tail( fitted(fit)[,1] ,1)			
		sqrt(sum( coef(fit) * c(1,  r.last^2, h.last^2) ))	
	}

	# same as predict( fit, n.ahead=1, doplot=F)[3]
	garchFit.predict.one.day <- function(fit, r.last) 
	{
		h.last = tail(sqrt(fit@h.t), 1)
		sqrt(sum( fit@fit$matcoef[,1] * c(1,  r.last^2, h.last^2) ))
	}
	
	garch.vol = NA * hist.vol
	for( i in (252+1):nperiods ) {
		temp = as.vector(ret.log[ (i-252+1):i, ])
		r.last =  tail( temp, 1 )
			
		fit = tryCatch( garch(temp, order = c(1, 1), control = garch.control(trace = F)),
	    				error=function( err ) FALSE, warning=function( warn ) FALSE )
	                    
		if( !is.logical( fit ) ) {
			if( i == 252+1 ) garch.vol[1:252] = fitted(fit)[,1]
			garch.vol[i] = garch.predict.one.day(fit, r.last)
		} else {
			fit = tryCatch( garchFit(~ garch(1,1), data = temp, include.mean=FALSE, trace=F),
	    				error=function( err ) FALSE, warning=function( warn ) FALSE )
	    				
			if( !is.logical( fit ) ) {
				if( i == 252+1 ) garch.vol[1:252] = sqrt(fit@h.t)
				garch.vol[i] = garchFit.predict.one.day(fit, r.last)
			} 
		}			
		if( i %% 100 == 0) cat(i, '\n')
	}
	garch.vol = ifna.prev(garch.vol)

Now, let’s create a strategy that switches between mean-reversion and trend-following strategies based on GARCH(1,1) volatility forecast.

	#*****************************************************************
	# Regime Switching using Garch
	#****************************************************************** 		
	vol.rank = percent.rank(SMA(percent.rank(garch.vol, 252), 21), 250)

	# Regime Switching Garch
	data$weight[] = NA
		data$weight[] = iif(vol.rank > 0.5, 
					iif(rsi2 < 50, 1, -1),
					iif(sma.short > sma.long, 1, -1)
				)
		capital = 100000
		data$weight[] = (capital / prices) * bt.exrem(data$weight)
	regime.switching.garch = bt.run(data, type='share', capital=capital, trade.summary=T)	

	#*****************************************************************
	# Create Report
	#****************************************************************** 
	plotbt.custom.report.part1(regime.switching.garch, regime.switching, buy.hold)

The switching strategy that uses GARCH(1,1) volatility forecast performed slightly better than the one that uses historical volatility.

There are many different approaches you can take to incorporate forecasting into your models and trading strategies. R has a very rich set of packages to model and forecast time series. Here are some examples that I found interesting:

To view the complete source code for this example, please have a look at the bt.volatility.garch() function in bt.test.r at github.

Categories: R, Strategy, Trading Strategies