Archive

Archive for the ‘Factor Model’ Category

Factor Attribution

I came across a very descriptive visualization of the Factor Attribution that I will replicate today. There is the Three Factor Rolling Regression Viewer at the mas financial tools web site that performs rolling window Factor Analysis of the “three-factor model” of Fama and French. The factor returns are available from the Kenneth R French: Data Library. I recommend reading the Efficient Frontier: Rolling Your Own: Three-Factor Analysis by W. Bernstein for a step by step instructions.

Let’s start by loading historical returns for the Vanguard Small Cap Value Index (VISVX) and aligning them with Fama/French Monthly Factors. Please note I wrote a helper function, get.fama.french.data(), to simplify process of loading and analyzing factor data from the Kenneth R French: Data Library.

###############################################################################
# Load Systematic Investor Toolbox (SIT)
# http://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
setInternet2(TRUE)
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

	#*****************************************************************
	# Load historical data
	#****************************************************************** 
	load.packages('quantmod')	
	tickers = 'VISVX'
	
	data <- new.env()
	getSymbols(tickers, src = 'yahoo', from = '1980-01-01', env = data, auto.assign = T)
	for(i in ls(data)) {
		temp = adjustOHLC(data[[i]], use.Adjusted=T)							
		
		period.ends = endpoints(temp, 'months')
			period.ends = period.ends[period.ends > 0]

		# reformat date to match Fama French Data
		monthly.dates = as.Date(paste(format(index(temp)[period.ends], '%Y%m'),'01',sep=''), '%Y%m%d')
		data[[i]] = make.xts(coredata(temp[period.ends,]), monthly.dates)
	}
	
	# Fama/French factors
	factors = get.fama.french.data('F-F_Research_Data_Factors', periodicity = 'months',download = T, clean = F)
	
	# add factors and align
	data$factors = factors$data / 100
	bt.prep(data, align='remove.na', dates='1994::')

Next, let’s run Factor Attribution over all available data:

	#*****************************************************************
	# Facto Loadings Regression over whole period
	#****************************************************************** 
	prices = data$prices
		nperiods = nrow(prices)
		dates = index(data$prices)
	
	# compute simple returns	
	hist.returns = ROC(prices[,tickers], type = 'discrete')
		hist.returns = hist.returns - data$factors$RF
		colnames(hist.returns) = 'fund'
	hist.returns = cbind(hist.returns, data$factors$Mkt.RF,
					data$factors$SMB, data$factors$HML)

	fit.all = summary(lm(fund~Mkt.RF+SMB+HML, data=hist.returns))
		estimate.all = c(fit.all$coefficients[,'Estimate'], fit.all$r.squared)
		std.error.all = c(fit.all$coefficients[,'Std. Error'], NA)
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0006828  0.0012695  -0.538    0.591
Mkt.RF       0.9973980  0.0262881  37.941   <2e-16 ***
SMB          0.5478299  0.0364984  15.010   <2e-16 ***
HML          0.6316528  0.0367979  17.165   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Multiple R-squared: 0.9276,     Adjusted R-squared: 0.9262

All factor loadings are significant.

Next, let’s run a 36 month rolling window Factor Attribution:

	#*****************************************************************
	# Facto Loadings Regression over 36 Month window
	#****************************************************************** 							
	window.len = 36
	
	colnames = spl('alpha,MKT,SMB,HML,R2')
	estimate = make.xts(matrix(NA, nr = nperiods, len(colnames)), dates)
	colnames(estimate) = colnames
	std.error = estimate
	
	# main loop
	for( i in window.len:nperiods ) {
		window.index = (i - window.len + 1) : i
		
		fit = summary(lm(fund~Mkt.RF+SMB+HML, data=hist.returns[window.index,]))
		estimate[i,] = c(fit$coefficients[,'Estimate'], fit$r.squared)
		std.error[i,] = c(fit$coefficients[,'Std. Error'], NA)
		
		if( i %% 10 == 0) cat(i, '\n')
	}

Finally, let’s re-create the timeseries and histogram charts as presented by Three Factor Rolling Regression Viewer.

	#*****************************************************************
	# Reports
	#****************************************************************** 
	layout(matrix(1:10,nc=2,byrow=T))
	
	for(i in 1:5) {	
		#-------------------------------------------------------------------------
		# Time plot
		#-------------------------------------------------------------------------
		est = estimate[,i]
		est.std.error = ifna(std.error[,i], 0)
		
		plota(est, 
			ylim = range( c(
				range(est + est.std.error, na.rm=T),
				range(est - est.std.error, na.rm=T)		
				)))
	
		polygon(c(dates,rev(dates)), 
			c(coredata(est + est.std.error), 
			rev(coredata(est - est.std.error))), 
			border=NA, col=col.add.alpha('red',50))
	
		est = estimate.all[i]
		est.std.error = std.error.all[i]
	
		polygon(c(range(dates),rev(range(dates))), 
			c(rep(est + est.std.error,2),
			rep(est - est.std.error,2)),
			border=NA, col=col.add.alpha('blue',50))
		
		abline(h=0, col='blue', lty='dashed')
			
		abline(h=est, col='blue')
	
		plota.lines(estimate[,i], type='l', col='red')
		
		#-------------------------------------------------------------------------
		# Histogram
		#-------------------------------------------------------------------------
		par(mar = c(4,3,2,1))
		hist(estimate[,i], col='red', border='gray', las=1,
			xlab='', ylab='', main=colnames(estimate)[i])
			abline(v=estimate.all[i], col='blue', lwd=2)
	}

In the next post, I plan to replicate the Style charts and provide more examples.

To view the complete source code for this example, please have a look at the three.factor.rolling.regression() function in bt.test.r at github.

Categories: Factor Model, Factors, R

Multiple Factor Model Summary

In this post I want to summarize all the material I covered in the Multiple Factor Models series. The Multiple Factor Model can be used to decompose returns and calculate risk. Following are some examples of the Multiple Factor Models:

The factors in the model are usually created using pricing, fundamental, analyst estimates, and proprietary data. I will only show examples of factors using pricing and fundamental data because these infromation is readily available from Yahoo Fiance and ADVFN.

Following is a summary of all posts that I wrote about Multiple Factor Models:

  1. Multiple Factor Model – Fundamental Data – in this post I demonstrate how to get company’s Fundamental Data into R, create a simple factor, and run correlation analysis.
  2. Multiple Factor Model – Building Fundamental Factors – in this post I demonstrate how to build Fundamental factors described in the CSFB Alpha Factor Framework and compute quantiles spreads. For details of the CSFB Alpha Factor Framework please read CSFB Quantitative Research, Alpha Factor Framework on page 11, page 49 by P. N. Patel, S. Yao, R. Carlson, A. Banerji, J. Handelman.
  3. Multiple Factor Model – Building CSFB Factors – in this post I demonstrate how to build majority of factors described in the CSFB Alpha Factor Framework, run cross sectional regression to estimate factor loading, create and test Alpha model.
  4. Multiple Factor Model – Building Risk Model – in this post I demonstrate how to build a multiple factor risk model, compute factor covariance using shrinkage estimator, forecast stocks specific variances using GARCH(1,1).
  5. Portfolio Optimization – Why do we need a Risk Model – in this post I explain why do we need a risk model and demonstrate how it is used during portfolio construction process.
  6. Multiple Factor Model – Building 130/30 Index – in this post I demonstrate how to build 130/30 Index based on the CSFB Factors and the Risk Model we created previously. The 130/30: The New Long-Only (2008) by A. Lo, P. Patel paper presents a very detailed step by step guide to building 130/30 Index using average CSFB Factors as the alpha model and MSCI Barra Multi-Factor Risk model.
  7. Multiple Factor Model – Building 130/30 Index (Update) – in this post I demonstrate how to build Market-Neutral and Minimum Variance strategies and compare their performance to the 130/30 Index.

There is an excellent discussion of portfolio construction problems and possible solutions in the The top 7 portfolio optimization problems post by Pat Burns. I want to highlight two problems that are relevant to the Multiple Factor Models.

Categories: Factor Model, Factors

Multiple Factor Model – Building 130/30 Index (Update)

This is just a quick update to my last post: Multiple Factor Model – Building 130/30 Index. I want to introduce Market-Neutral and Minimum Variance strategies and compare their performance to the 130/30 Index.

I have updated the source code of the fm.long.short.test() function in factor.model.test.r at github to include Market-Neutral and Minimum Variance strategies.

Let’s start by creating a Minimum Variance strategy. I will use the same framework as in the Multiple Factor Model – Building 130/30 Index post. At each month end, I will solve for a portfolio that has minimum variance, ignoring all alpha information.

	#*****************************************************************
	# Construct LONG ONLY minimum variance portfolio using the multiple factor risk model
	#****************************************************************** 
	weights$long.min.var.alpha = weight
				
	for(t in 36:nperiods) {	
		#--------------------------------------------------------------------------
		# Create constraints
		#--------------------------------------------------------------------------
		# set min/max wgts for individual stocks: 0 =< x <= 10/100
		constraints = new.constraints(n, lb = 0, ub = 10/100)
		
		# wgts must sum to 1 (fully invested)
		constraints = add.constraints(rep(1,n), type = '=', b = 1, constraints)
			
		#--------------------------------------------------------------------------
		# Create factor exposures constraints
		#--------------------------------------------------------------------------	
		# adjust prior constraints, add factor exposures
		constraints = add.variables(nfactors, constraints)
		
		# BX - X1 = 0
		constraints = add.constraints(rbind(ifna(factor.exposures[t,,], 0), -diag(nfactors)), rep(0, nfactors), type = '=', constraints)
	
		#--------------------------------------------------------------------------
		# Create Covariance matrix
		# [Qu  0]
		# [ 0 Qf]
		#--------------------------------------------------------------------------
		temp = diag(n)
			diag(temp) = ifna(specific.variance[t,], mean(coredata(specific.variance[t,]), na.rm=T))^2
		cov.temp = diag(n + nfactors)
			cov.temp[1:n,1:n] = temp
		cov.temp[(n+1):(n+nfactors),(n+1):(n+nfactors)] = factor.covariance[t,,]
				
		#--------------------------------------------------------------------------
		# Setup optimizations
		#--------------------------------------------------------------------------	
		# set expected return
		alpha = factors.avg$AVG[t,] / 5
		expected.return = c(ifna(coredata(alpha),0), rep(0, nfactors))
			
		# remove companies that have no beta from optimization
		index = which(is.na(beta[t,]))
		if( len(index) > 0) {
			constraints$ub[index] = 0
			constraints$lb[index] = 0
		}
	
		# find solution
		sol = solve.QP.bounds(Dmat = cov.temp, dvec = 0 * expected.return, 
					Amat = constraints$A, bvec = constraints$b, 
					meq = constraints$meq, lb = constraints$lb, ub = constraints$ub)
					
		weights$long.min.var.alpha[t,] = sol$solution[1:n]
		
		cat(t, '\n')
	}

Next, let’s construct Market-Neutral portfolio. I will restrict portfolio weights to be +/- 10% and will use portfolio construction technique that I documented in the 130/30 Portfolio Construction post.

	#*****************************************************************
	# Construct Market-Neutral portfolio 100:100 with beta=0 using the multiple factor risk model
	# based on the examples in the aa.long.short.test functions
	#****************************************************************** 
	weights$market.neutral.alpha = weight

	for(t in 36:nperiods) {	
		#--------------------------------------------------------------------------
		# Split x into x.long and x.short, x_long and x_short >= 0
		# SUM(x.long) - SUM(x.short) = 0
		#--------------------------------------------------------------------------			
		# x.long and x.short >= 0
		# x.long <= 0.1 
		# x.short <= 0.1 
		constraints = new.constraints(2*n, lb = 0, ub = c(rep(0.1,n), rep(0.1,n)))
		
		# SUM (x.long - x.short) = 0
		constraints = add.constraints(c(rep(1,n), -rep(1,n)), 0, type = '=', constraints)		
	
		# SUM (x.long + x.short) = 2
		constraints = add.constraints(c(rep(1,n), rep(1,n)), 2, type = '=', constraints)		
				
		#--------------------------------------------------------------------------
		# beta of portfolio is the weighted average of the individual asset betas		
		# http://www.duke.edu/~charvey/Classes/ba350/riskman/riskman.htm
		#--------------------------------------------------------------------------
		temp = ifna(as.vector(beta[t,]),0)
		constraints = add.constraints(c(temp, -temp), type = '=', b = 0, constraints)
			
		#--------------------------------------------------------------------------
		# Create factor exposures constraints
		#--------------------------------------------------------------------------	
		# adjust prior constraints, add factor exposures
		constraints = add.variables(nfactors, constraints)
		
		# BX - X1 = 0
		temp = ifna(factor.exposures[t,,], 0)
		constraints = add.constraints(rbind(temp, -temp, -diag(nfactors)), rep(0, nfactors), type = '=', constraints)
			
		#--------------------------------------------------------------------------
		# Add binary constraints	
		#--------------------------------------------------------------------------	
		# adjust prior constraints: add b.i
		constraints = add.variables(n, constraints)
	
		# index of binary variables b.i
		constraints$binary.index = (2*n + nfactors + 1):(3*n + nfactors)
	
		# binary variable b.i : x.long < b, x.short < (1 - b)
		# x.long < b
		constraints = add.constraints(rbind(diag(n), 0*diag(n), matrix(0,nfactors,n), -diag(n)), rep(0, n), type = '<=', constraints)

		# x.short < (1 - b)
		constraints = add.constraints(rbind(0*diag(n), diag(n), matrix(0,nfactors,n), diag(n)), rep(1, n), type = '<=', constraints)

		#--------------------------------------------------------------------------
		# set expected return
		#--------------------------------------------------------------------------	
		# set expected return
		alpha = factors.avg$AVG[t,] / 5
		temp = ifna(coredata(alpha),0)
		expected.return = c(temp, -temp, rep(0, nfactors), rep(0, n))

		#--------------------------------------------------------------------------
		# Create Covariance matrix
		# [Qu  0]
		# [ 0 Qf]
		#--------------------------------------------------------------------------
		temp = diag(n)
			diag(temp) = ifna(specific.variance[t,], mean(coredata(specific.variance[t,]), na.rm=T))^2
			
		# | cov -cov |
		# |-cov  cov |		
		temp = cbind( rbind(temp, -temp), rbind(-temp, temp) )
		
		cov.temp = 0*diag(2*n + nfactors + n)
			cov.temp[1:(2*n),1:(2*n)] = temp
		cov.temp[(2*n+1):(2*n+nfactors),(2*n+1):(2*n+nfactors)] = factor.covariance[t,,]
	
		#--------------------------------------------------------------------------
		# Adjust Covariance matrix
		#--------------------------------------------------------------------------
		if(!is.positive.definite(cov.temp)) {
			cov.temp <- make.positive.definite(cov.temp, 0.000000001)
		}	
		
		#--------------------------------------------------------------------------
		# page 9, Risk: We use the Barra default setting, risk aversion value of 0.0075, and
		# AS-CF risk aversion ratio of 1.
		#
		# The Effects of Risk Aversion on Optimization (2010) by S. Liu, R. Xu
		# page 4/5
		#--------------------------------------------------------------------------
		risk.aversion = 0.0075
			
		# remove companies that have no beta from optimization
		index = which(is.na(beta[t,]))
		if( len(index) > 0) {
			constraints$ub[index] = 0
			constraints$lb[index] = 0
			constraints$ub[2*index] = 0
			constraints$lb[2*index] = 0			
		}
	
		# find solution
		sol = solve.QP.bounds(Dmat = 2* risk.aversion * cov.temp, dvec = expected.return, 
					Amat = constraints$A, bvec = constraints$b, 
					meq = constraints$meq, lb = constraints$lb, ub = constraints$ub,
					binary.vec = constraints$binary.index)					
					
		x = sol$solution[1:n] - sol$solution[(n+1):(2*n)]
		weights$market.neutral.alpha[t,] = x
		
		cat(t, '\n')
	}

Next, let’s re-create all summary charts from the Multiple Factor Model – Building 130/30 Index post.

Please note that we are not comparing apples to apples here. For example, Market-Neutral strategy has beta set equal to zero while 130/30 Index has beta set equal to one. But at the same time there are a few interesting observations:

  • Minimum Variance strategy has the smallest drawdown among all long strategies, but its performance is lacking
  • Market-Neutral strategy does surprisingly well in this environment and has the best Sharpe ratio
  • Market-Neutral strategy also has the highest portfolio turnover; hence, high trading costs will make this strategy less attractive

To view the complete source code for this example, please have a look at the fm.long.short.test() function in factor.model.test.r at github.

Multiple Factor Model – Building 130/30 Index

Nico brought to my attention the 130/30: The New Long-Only (2008) by A. Lo, P. Patel paper in his comment to the Multiple Factor Model – Building CSFB Factors post. This paper presents a very detailed step by step guide to building 130/30 Index using average CSFB Factors as the alpha model and MSCI Barra Multi-Factor Risk model. Today, I want to adapt this methodology and to show how to build 130/30 Index based on the CSFB Factors we created in the Multiple Factor Model – Building CSFB Factors post and the Risk Model we created in the Multiple Factor Model – Building Risk Model post.

Let’s start by loading the CSFB factors that we saved at the end of the Multiple Factor Model – Building CSFB Factors post. [If you are missing data.factors.Rdata file, please execute fm.all.factor.test() function first to create and save CSFB factors.] Next, let’s load the multiple factor risk model we saved at the end of the Multiple Factor Model – Building Risk Model post. [If you are missing risk.model.Rdata file, please execute fm.risk.model.test() function first to create and save multiple factor risk model.] Next, I will compute betas over a two-year rolling window.

###############################################################################
# Load Systematic Investor Toolbox (SIT)
# http://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

	#*****************************************************************
	# Load data
	#****************************************************************** 
	load.packages('quantmod')	

	# Load CSFB factor data that we saved at the end of the fm.all.factor.test function
	load(file='data.factors.Rdata')
		nperiods = nrow(next.month.ret)
		tickers = colnames(next.month.ret)
		n = len(tickers)
		
	# Load multiple factor risk model data that we saved at the end of the fm.risk.model.test function	
	load(file='risk.model.Rdata')
		factor.exposures = all.data[,,-1]	
		factor.names = dimnames(factor.exposures)[[3]]
		nfactors = len(factor.names)
			
	#*****************************************************************
	# Compute Betas: b = cov(r,m) / var(m)
	# The betas are measured on a two-year rolling window
	# http://en.wikipedia.org/wiki/Beta_(finance)
	#****************************************************************** 
	ret = mlag(next.month.ret)
	beta = ret * NA

	# 1/n benchmark portfolio
	benchmark = ntop(ret, n)	
	benchmark.ret = rowSums(benchmark * ret, na.rm=T)
	
	# estimate betas
	for(t in 24:nperiods) {
		t.index = (t-23):t
		benchmark.var = var( benchmark.ret[t.index], na.rm=T )
		
		t.count = count(ret[t.index, ])
		t.cov = cov( ifna(ret[t.index,], 0), benchmark.ret[t.index], use='complete.obs' )
		
		# require at least 20 months of history
		beta[t,] = iif(t.count > 20, t.cov/benchmark.var, NA)
	}

To construct efficient portfolio each month, I will solve mean-variance portfolio optimization problem of the following form:

Max weight * return - risk.aversion * weight * Covariance * weight 
Sum weight = 1
Sum weight * beta = 1
0 <= weight <= 0.1

Please read The Effects of Risk Aversion on Optimization (2010) by S. Liu, R. Xu paper for the detailed discussion of this optimization problem.

Please note that I will use risk.aversion = 0.0075 and portfolio beta = 1 as discussed on the page 22 of the 130/30: The New Long-Only (2008) by A. Lo, P. Patel paper. To model portfolio beta constraint, I will use the fact that portfolio beta is equal to the weighted average of the individual asset betas.

	#*****************************************************************
	# Construct LONG ONLY portfolio using the multiple factor risk model
	#****************************************************************** 
	load.packages('quadprog,corpcor,kernlab')
	
	weight = NA * next.month.ret
	weights = list()
		weights$benchmark = ntop(beta, n)		
		weights$long.alpha = weight
	
	for(t in 36:nperiods) {	
		#--------------------------------------------------------------------------
		# Create constraints
		#--------------------------------------------------------------------------
		# set min/max wgts for individual stocks: 0 =< x <= 10/100
		constraints = new.constraints(n, lb = 0, ub = 10/100)
		
		# wgts must sum to 1 (fully invested)
		constraints = add.constraints(rep(1,n), type = '=', b = 1, constraints)
			
		#--------------------------------------------------------------------------
		# beta of portfolio is the weighted average of the individual asset betas		
		# http://www.duke.edu/~charvey/Classes/ba350/riskman/riskman.htm
		#--------------------------------------------------------------------------
		constraints = add.constraints(ifna(as.vector(beta[t,]),0), type = '=', b = 1, constraints)
			
		#--------------------------------------------------------------------------
		# Create factor exposures constraints
		#--------------------------------------------------------------------------	
		# adjust prior constraints, add factor exposures
		constraints = add.variables(nfactors, constraints)
		
		# BX - X1 = 0
		constraints = add.constraints(rbind(ifna(factor.exposures[t,,], 0), -diag(nfactors)), rep(0, nfactors), type = '=', constraints)
	
		#--------------------------------------------------------------------------
		# Create Covariance matrix
		# [Qu  0]
		# [ 0 Qf]
		#--------------------------------------------------------------------------
		temp = diag(n)
			diag(temp) = ifna(specific.variance[t,], mean(coredata(specific.variance[t,]), na.rm=T))^2
		cov.temp = diag(n + nfactors)
			cov.temp[1:n,1:n] = temp
		cov.temp[(n+1):(n+nfactors),(n+1):(n+nfactors)] = factor.covariance[t,,]
		
		#--------------------------------------------------------------------------
		# create input assumptions
		#--------------------------------------------------------------------------
		ia = list()	
		ia$n = nrow(cov.temp)
		ia$annual.factor = 12
		
		ia$symbols = c(tickers, factor.names)
		
		ia$cov = cov.temp	
		
		#--------------------------------------------------------------------------
		# page 9, Risk: We use the Barra default setting, risk aversion value of 0.0075, and
		# AS-CF risk aversion ratio of 1.
		#
		# The Effects of Risk Aversion on Optimization (2010) by S. Liu, R. Xu
		# page 4/5
		#--------------------------------------------------------------------------
		risk.aversion = 0.0075
		ia$cov.temp = ia$cov	
	
		# set expected return
		alpha = factors.avg$AVG[t,] / 5
		ia$expected.return = c(ifna(coredata(alpha),0), rep(0, nfactors))
			
		# remove companies that have no beta from optimization
		index = which(is.na(beta[t,]))
		if( len(index) > 0) {
			constraints$ub[index] = 0
			constraints$lb[index] = 0
		}
	
		# find solution
		sol = solve.QP.bounds(Dmat = 2* risk.aversion * ia$cov.temp, dvec = ia$expected.return, 
					Amat = constraints$A, bvec = constraints$b, 
					meq = constraints$meq, lb = constraints$lb, ub = constraints$ub)
					
		weights$long.alpha[t,] = sol$solution[1:n]
	}

Next, let’s construct 130/30 portfolio. I will restrict portfolio weights to be +/- 10% and will use portfolio construction technique that I documented in the 130/30 Portfolio Construction post.

	#*****************************************************************
	# Construct Long/Short 130:30 portfolio using the multiple factor risk model
	# based on the examples in the aa.long.short.test functions
	#****************************************************************** 
	weights$long.short.alpha = weight
				
	for(t in 36:nperiods) {	
		#--------------------------------------------------------------------------
		# Create constraints
		#--------------------------------------------------------------------------
		# set min/max wgts for individual stocks: -10/100 =< x <= 10/100
		constraints = new.constraints(n, lb = -10/100, ub = 10/100)
		
		# wgts must sum to 1 (fully invested)
		constraints = add.constraints(rep(1,n), type = '=', b = 1, constraints)
			
		#--------------------------------------------------------------------------
		# beta of portfolio is the weighted average of the individual asset betas		
		# http://www.duke.edu/~charvey/Classes/ba350/riskman/riskman.htm
		#--------------------------------------------------------------------------
		constraints = add.constraints(ifna(as.vector(beta[t,]),0), type = '=', b = 1, constraints)
			
		#--------------------------------------------------------------------------
		# Create factor exposures constraints
		#--------------------------------------------------------------------------	
		# adjust prior constraints, add factor exposures
		constraints = add.variables(nfactors, constraints)
		
		# BX - X1 = 0
		constraints = add.constraints(rbind(ifna(factor.exposures[t,,], 0), -diag(nfactors)), rep(0, nfactors), type = '=', constraints)
			
		#--------------------------------------------------------------------------
		# Create 130:30
		# -v.i <= x.i <= v.i, v.i>0, SUM(v.i) = 1.6
		#--------------------------------------------------------------------------		
		# adjust prior constraints, add v.i
		constraints = add.variables(n, constraints)
	
		# -v.i <= x.i <= v.i
		#   x.i + v.i >= 0
		constraints = add.constraints(rbind(diag(n), matrix(0,nfactors,n)  ,diag(n)), rep(0, n), type = '>=', constraints)
		#   x.i - v.i <= 0
		constraints = add.constraints(rbind(diag(n), matrix(0,nfactors,n), -diag(n)), rep(0, n), type = '<=', constraints)
		
		# SUM(v.i) = 1.6
		constraints = add.constraints(c(rep(0, n), rep(0, nfactors), rep(1, n)), 1.6, type = '=', constraints)
		
		#--------------------------------------------------------------------------
		# Create Covariance matrix
		# [Qu  0]
		# [ 0 Qf]
		#--------------------------------------------------------------------------
		temp = diag(n)
			diag(temp) = ifna(specific.variance[t,], mean(coredata(specific.variance[t,]), na.rm=T))^2
		cov.temp = 0*diag(n + nfactors + n)
			cov.temp[1:n,1:n] = temp
		cov.temp[(n+1):(n+nfactors),(n+1):(n+nfactors)] = factor.covariance[t,,]
		
		#--------------------------------------------------------------------------
		# create input assumptions
		#--------------------------------------------------------------------------
		ia = list()	
		ia$n = nrow(cov.temp)
		ia$annual.factor = 12
		
		ia$symbols = c(tickers, factor.names, tickers)
		
		ia$cov = cov.temp	
	
		#--------------------------------------------------------------------------
		# page 9, Risk: We use the Barra default setting, risk aversion value of 0.0075, and
		# AS-CF risk aversion ratio of 1.
		#
		# The Effects of Risk Aversion on Optimization (2010) by S. Liu, R. Xu
		# page 4/5
		#--------------------------------------------------------------------------
		risk.aversion = 0.0075
		ia$cov.temp = ia$cov
	
		# set expected return
		alpha = factors.avg$AVG[t,] / 5
		ia$expected.return = c(ifna(coredata(alpha),0), rep(0, nfactors), rep(0, n))
			
		# remove companies that have no beta from optimization
		index = which(is.na(beta[t,]))
		if( len(index) > 0) {
			constraints$ub[index] = 0
			constraints$lb[index] = 0
		}
	
		# find solution
		sol = solve.QP.bounds(Dmat = 2* risk.aversion * ia$cov.temp, dvec = ia$expected.return, 
					Amat = constraints$A, bvec = constraints$b, 
					meq = constraints$meq, lb = constraints$lb, ub = constraints$ub)
			
		weights$long.short.alpha[t,] = sol$solution[1:n]
	}

At this point, we created monthly long-only and 130:30 portfolios. Let’s examine their transition maps.

	#*****************************************************************
	# Plot Transition Maps
	#****************************************************************** 	
	layout(1:3)
	for(i in names(weights)) plotbt.transition.map(weights[[i]], i)

The portfolio weights for the long-only portfolio (long.alpha) sum up to 100% and for the 130:30 portfolio (long.short.alpha) is 130% long and 30% short as expected.

Next let’s create trading strategies based on these portfolios and check their performance.

	#*****************************************************************
	# Create strategies
	#****************************************************************** 	
	prices = data$prices
		prices = bt.apply.matrix(prices, function(x) ifna.prev(x))
		
	# find month ends
	month.ends = endpoints(prices, 'months')

	# create strategies that invest in each qutile
	models = list()
	
	for(i in names(weights)) {
		data$weight[] = NA
			data$weight[month.ends,] = weights[[i]]
			capital = 100000
			data$weight[] = (capital / prices) * (data$weight)	
		models[[i]] = bt.run(data, type='share', capital=capital)
	}
			
	#*****************************************************************
	# Create Report
	#****************************************************************** 	
	models = rev(models)
	
	plotbt.custom.report.part1(models, dates='1998::')
	plotbt.custom.report.part2(models)

The 130:30 portfolio works best, producing high returns with smaller drawdowns than long-only and benchmark (1/N) portfolios.

The note of caution: the above results are based on $0 commission rate (i.e. trading is free). Also, I’m using the current Dow Jones index components through out the whole history; hence, introducing survivorship bias (i.e. Dow Jones index changed its composition a few times in the last 20 years).

To see how much of the problem is my assumption of $0 commission rate, let’s have a look at the annual portfolio turnovers.

	# Plot Portfolio Turnover for each strategy
	layout(1)
	barplot.with.labels(sapply(models, compute.turnover, data), 'Average Annual Portfolio Turnover')

The 130:30 portfolio has a pretty high portfolio turnover; therefore, it will not perform as well in the real life as in our backtest.

To view the complete source code for this example, please have a look at the fm.long.short.test() function in factor.model.test.r at github.

Portfolio Optimization – Why do we need a Risk Model

February 26, 2012 1 comment

In the last post, Multiple Factor Model – Building Risk Model, I have shown how to build a multiple factor risk model. In this post I want to explain why do we need a risk model and how it is used during portfolio construction process.

The covariance matrix is used during the mean-variance portfolio optimization to estimate portfolio risk. The sample covariance matrix can estimated using historical asset’s returns, but the use of the sample covariance matrix becomes problematic when the number of assets (N) is of the same order of magnitude or larger than the number of observations (T). In typical applications, there can be over a thousand stocks to choose from, but rarely more than ten years of monthly data (i.e. N = 1,000 and T = 120). I recommend following references for a detailed discussion:

One possible solution to estimating a covariance matrix is to model risk by a low-dimensional multiple factor model. Another advantage of using a multiple factor model to describe covariances between assets is that portfolio optimization algorithm works faster because the covariance matrix is quite sparse and well structured. For a detailed discussion please read: Portfolio Optimization with Factors, Scenarios, and Realistic Short Positions by B. Jacobs, K. Levy, and H. Markowitz (2005) [In problems with a large number of securities, computation time may differ by orders of magnitude between using a dense covariance matrix(sample covariance matrix) and using the diagonal or nearly diagonal covariance matrices permitted by a multiple factor model of covariance.]

The outline of this post:

  • Construct minimum variance portfolio using the sample covariance matrix
  • Construct minimum variance portfolio using the multiple factor risk model we created in the prior post

Let’s start by loading the multiple factor risk model we saved at the end of the prior post. [If you are missing risk.model.Rdata file, please execute fm.risk.model.test() function first to create and save multiple factor risk model.] Next, I will construct minimum variance portfolio using the historical covariance matrix.

###############################################################################
# Load Systematic Investor Toolbox (SIT)
# http://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)
	#*****************************************************************
	# Load data
	#****************************************************************** 
	load.packages('quantmod')	

	# Load factor data that we saved at the end of the fm.all.factor.test function
	load(file='data.factors.Rdata')
		nperiods = nrow(next.month.ret)
		n = ncol(next.month.ret)
		tickers = colnames(next.month.ret)
		
	# Load multiple factor risk model data that we saved at the end of the fm.risk.model.test function	
	load(file='risk.model.Rdata')
		
		
	#*****************************************************************
	# Construct minimum variance portfolio using the sample covariance matrix
	#****************************************************************** 
	load.packages('quadprog,corpcor')
	
	#--------------------------------------------------------------------------
	# Create Covariance matrix based on the last 24 months of returns
	#--------------------------------------------------------------------------
	temp = last(mlag(next.month.ret),24)
	cov.temp = cov(temp, use='complete.obs', method='pearson')	
	hist.cov = cov.temp
	
	#--------------------------------------------------------------------------
	# Adjust Covariance matrix
	#--------------------------------------------------------------------------
	if(!is.positive.definite(cov.temp)) {
		cov.temp <- make.positive.definite(cov.temp, 0.000000001)
	}	
		
	#--------------------------------------------------------------------------
	# Create constraints
	#--------------------------------------------------------------------------
	# set min/max wgts for individual stocks: 0 =< x <= 1
	constraints = new.constraints(n, lb = 0, ub = 1)
	
	# wgts must sum to 1 (fully invested)
	constraints = add.constraints(rep(1,n), 1, type = '=', constraints)
			
	#--------------------------------------------------------------------------
	# Solve QP problem
	#--------------------------------------------------------------------------		
	sol = solve.QP.bounds(Dmat = cov.temp, dvec = rep(0, nrow(cov.temp)) , 
		Amat=constraints$A, bvec=constraints$b, constraints$meq,
		lb = constraints$lb, ub = constraints$ub)

	#--------------------------------------------------------------------------
	# Plot Portfolio weights
	#--------------------------------------------------------------------------		
	x = round(sol$solution,4)
		x = iif(x < 0, 0, x)
		names(x) = colnames(next.month.ret)
	hist.min.var.portfolio = x

	# plot
	barplot(100*x,las = 2, 
		main = 'Minimum variance portfolio weights (sample covariance matrix)')		

To minimize portfolio risk computed under risk model framework we have to combine specific risk and factor covariance matrix into one covariance matrix. This process is described on pages 4-5 of the Portfolio Optimization with Factors, Scenarios, and Realistic Short Positions by B. Jacobs, K. Levy, and H. Markowitz (2005).

Let K be the number of factors in the risk model. We introduce K additional variables that represent portfolio’s factor exposure to each factor. Let’s add K constraints:

X1...Xn * factor.exposures = Xn+1...Xn+k

The new covariance matrix:

[specific.risk  0                 ]
[0              factor.covariance ]

Next, I will construct minimum variance portfolio using the the multiple factor risk model we created in the prior post.

	#*****************************************************************
	# Construct minimum variance portfolio using the multiple factor risk model
	#****************************************************************** 
	t = nperiods	
	factor.exposures = all.data[t,,-1]	
		nfactors = ncol(factor.exposures)
		
	#--------------------------------------------------------------------------
	# Create constraints
	#--------------------------------------------------------------------------
	# set min/max wgts for individual stocks: 0 =< x <= 1
	constraints = new.constraints(n, lb = 0, ub = 1)
	
	# wgts must sum to 1 (fully invested)
	constraints = add.constraints(rep(1,n), 1, type = '=', constraints)
		
	# adjust prior constraints, add v.i
	constraints = add.variables(nfactors, constraints)
	
	# BX - Xnew = 0
	constraints = add.constraints(rbind(factor.exposures, -diag(nfactors)), rep(0, nfactors), type = '=', constraints)

	#--------------------------------------------------------------------------
	# Create Covariance matrix
	# [Qu  0]
	# [ 0 Qf]
	#--------------------------------------------------------------------------
	temp = diag(n)
		diag(temp) = specific.variance[t,]^2
	cov.temp = diag(n + nfactors)
		cov.temp[1:n,1:n] = temp
	cov.temp[(n+1):(n+nfactors),(n+1):(n+nfactors)] = factor.covariance[t,,]
	
	#--------------------------------------------------------------------------
	# Solve QP problem
	#--------------------------------------------------------------------------
	load.packages('quadprog')
	
	sol = solve.QP.bounds(Dmat = cov.temp, dvec = rep(0, nrow(cov.temp)) , 
		Amat=constraints$A, bvec=constraints$b, constraints$meq,
		lb = constraints$lb, ub = constraints$ub)
	
	#--------------------------------------------------------------------------
	# Plot Portfolio weights
	#--------------------------------------------------------------------------		
	x = round(sol$solution,4)[1:n]
		x = iif(x < 0, 0, x)
		names(x) = colnames(next.month.ret)
	risk.model.min.var.portfolio = x

	# plot
	barplot(100*x,las = 2, 
		main = 'Minimum variance portfolio weights (multiple factor risk model)')

The minimum variance portfolio computed under the risk model is more diversified. Also for a larger stock universe (i.e. 1000-2000 stocks) solving quadratic problem will take less time using a factor risk model that results in a sparse covariance matrix.

To view the complete source code for this example, please have a look at the fm.risk.model.test() function in factor.model.test.r at github.

Multiple Factor Model – Building Risk Model

February 21, 2012 5 comments

This is the fourth post in the series about Multiple Factor Models. I will build on the code presented in the prior post, Multiple Factor Model – Building CSFB Factors, and I will show how to build a multiple factor risk model. For an example of the multiple factor risk models, please read following references:

The outline of this post:

  • Run cross sectional regression to estimate factor returns
  • Compute factor covariance using shrinkage estimator
  • Forecast stocks specific variances using GARCH(1,1)
  • Compute portfolio risk using multiple factor model and compare it to the historical standard deviation of portfolio returns.

Let’s start by loading the CSFB factors that we saved at the end of the prior post. [If you are missing data.factors.Rdata file, please execute fm.all.factor.test() function first to create and save CSFB factors.] Next, I will run cross sectional regression to estimate factor returns.

###############################################################################
# Load Systematic Investor Toolbox (SIT)
# http://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)
	#*****************************************************************
	# Load factor data that we saved at the end of the fm.all.factor.test functions
	#****************************************************************** 
	load.packages('quantmod,abind')	
		
	load(file='data.factors.Rdata')
		# remove Composite Average factor
		factors.avg = factors.avg[which(names(factors.avg) != 'AVG')]	
		
	#*****************************************************************
	# Run cross sectional regression to estimate factor returns
	#****************************************************************** 
	nperiods = nrow(next.month.ret)
	n = ncol(next.month.ret)
		
	# create sector dummy variables: binary 0/1 values for each sector
	nsectors = len(levels(sectors))	
	sectors.matrix = array(double(), c(nperiods, n, nsectors))
		dimnames(sectors.matrix)[[3]] = levels(sectors)		
	for(j in levels(sectors)) {
		sectors.matrix[,,j] = matrix(sectors == j,  nr=nperiods, nc=n, byrow=T)
	}
	
	# create matrix for each factor
	factors.matrix = abind(factors.avg, along = 3)		
	
	# combine sector dummies and all factors
	all.data = abind(sectors.matrix, factors.matrix)		
	
	# create betas and specific.return
	beta = all.data[,1,] * NA
	specific.return = next.month.ret * NA
		nfactors = ncol(beta)
		
	# append next.month.ret to all.data			
	all.data = abind(next.month.ret, all.data, along = 3)
		dimnames(all.data)[[3]][1] = 'Ret'
			
	# estimate betas (factor returns)
	for(t in 12:(nperiods-1)) {		
		temp = all.data[t:t,,]
		x = temp[,-c(1:2)]
		y = temp[,1]
		b = lm(y~x)$coefficients
		
		b[2:nsectors] = b[1] + b[2:nsectors]
		beta[(t+1),] = b		
		
		specific.return[(t+1),] = y - rowSums(temp[,-1] * matrix(b, n, nfactors, byrow=T), na.rm=T)	
	}

To estimate factor returns (betas), we solve for coefficients of the following multiple factor model:

Ret = b1 * F1 + b2 * F2 + ... + bn * Fn + e
where 
b1...bn are estimated factor returns
F1...Fn are factor exposures. I.e. sector dummies and CSFB factor exposures
e is stock specific return, not captured by factors F1...Fn

Note that we cannot include the first sector dummy variable in the regression, otherwise we will get a linearly dependent relationship of the first sector dummy variable with all other sector dummy variables. The sector effect of the first sector dummy variable is absorbed into the intercept in the regression.

There are a few alternative ways of estimating this regression. For example, the robust linear model can be estimated with following code:

	load.packages('MASS')
	temp = rlm(y~x)$coefficients

The quantile regression can can be estimated with following code:

	load.packages('quantreg')
	temp = rq(y ~ x, tau = 0.5)$coefficients

Next let’s look at the cumulative factor returns.

	#*****************************************************************
	# helper function
	#****************************************************************** 	
	fm.hist.plot <- function(temp, smain=NULL) {			
		ntemp = ncol(temp)		
		cols = plota.colors(ntemp)	
		plota(temp, ylim = range(temp), log='y', main=smain)
		for(i in 1:ntemp) plota.lines(temp[,i], col=cols[i])
		plota.legend(colnames(temp), cols, as.list(temp))
	}

	#*****************************************************************
	# Examine historical cumulative factor returns
	#****************************************************************** 	
	temp = make.xts(beta, index(next.month.ret))
		temp = temp['2000::',]
		temp[] = apply(coredata(temp), 2, function(x) cumprod(1 + ifna(x,0)))
	
	fm.hist.plot(temp[,-c(1:nsectors)], 'Factor Returns')

The Price Reversals(PR) and Small Size(SS) factors have done well.

Next let’s estimate the factor covariance matrix over the rolling 24 month window.

	load.packages('BurStFin')	
	factor.covariance = array(double(), c(nperiods, nfactors, nfactors))
		dimnames(factor.covariance)[[2]] = colnames(beta)
		dimnames(factor.covariance)[[3]] = colnames(beta)

	# estimate factor covariance
	for(t in 36:nperiods) {
		factor.covariance[t,,] = var.shrink.eqcor(beta[(t-23):t,])
	}

Next let’s forecast stocks specific variances using GARCH(1,1). I will use the GARCH estimation routine described in the Trading using Garch Volatility Forecast post.

	#*****************************************************************
	# Compute stocks specific variance foreasts using GARCH(1,1)
	#****************************************************************** 	
	load.packages('tseries,fGarch')	

	specific.variance = next.month.ret * NA

	for(i in 1:n) {
		specific.variance[,i] = bt.forecast.garch.volatility(specific.return[,i], 24) 
	}


	# compute historical specific.variance
	hist.specific.variance = next.month.ret * NA
	for(i in 1:n) hist.specific.variance[,i] = runSD(specific.return[,i], 24)	
	
	# if specific.variance is missing use historical specific.variance
	specific.variance[] = ifna(coredata(specific.variance), coredata(hist.specific.variance))	

	#*****************************************************************
	# Save multiple factor risk model to be used later during portfolio construction
	#****************************************************************** 
	save(all.data, factor.covariance, specific.variance, file='risk.model.Rdata')

Now we have all the ingredients to compute a portfolio risk:

Portfolio Risk = (common factor variance + specific variance)^0.5
	common factor variance = (portfolio factor exposure) * factor covariance matrix * (portfolio factor exposure)'
	specific variance = (specific.variance)^2 * (portfolio weights)^2
	#*****************************************************************
	# Compute portfolio risk
	#****************************************************************** 
	portfolio = rep(1/n, n)
		portfolio = matrix(portfolio, n, nfactors)
	
	portfolio.risk = next.month.ret[,1] * NA
	for(t in 36:(nperiods-1)) {	
		portfolio.exposure = colSums(portfolio * all.data[t,,-1], na.rm=T)
		
		portfolio.risk[t] = sqrt(
			portfolio.exposure %*% factor.covariance[t,,] %*% (portfolio.exposure) + 
			sum(specific.variance[t,]^2 * portfolio[,1]^2, na.rm=T)
			)
	}

Next let’s compare portfolio risk estimated using multiple factor risk model with portfolio historical risk.

	#*****************************************************************
	# Compute historical portfolio risk
	#****************************************************************** 
	portfolio = rep(1/n, n)
		portfolio = t(matrix(portfolio, n, nperiods))
	
	portfolio.returns = next.month.ret[,1] * NA
		portfolio.returns[] = rowSums(mlag(next.month.ret) * portfolio, na.rm=T)
	
	hist.portfolio.risk = runSD(portfolio.returns, 24)
	
	#*****************************************************************
	# Plot risks
	#****************************************************************** 			
	plota(portfolio.risk['2000::',], type='l')
		plota.lines(hist.portfolio.risk, col='blue')
		plota.legend('Risk,Historical Risk', 'black,blue')

The multiple factor risk model does a decent job of estimating portfolio risk most of the time.

To view the complete source code for this example, please have a look at the fm.risk.model.test() function in factor.model.test.r at github.

Multiple Factor Model – Building CSFB Factors

February 13, 2012 5 comments

This is the third post in the series about Multiple Factor Models. I will build on the code presented in the prior post, Multiple Factor Model – Building Fundamental Factors, and I will show how to build majority of factors described in the CSFB Alpha Factor Framework. For details of the CSFB Alpha Factor Framework please read CSFB Quantitative Research, Alpha Factor Framework on page 11, page 49 by P. N. Patel, S. Yao, R. Carlson, A. Banerji, J. Handelman.

This post will include long sections of code to extract/format data and build factors. I created a few helper functions for data manipulations and visualizations in the factor.model.r at github. For example, consecutive.changes function counts the number of consecutive positive changes.

The outline of this post:

  • Create majority of CSFB factors
  • Create and test Composite Average factor
  • Run cross sectional regression to estimate factor loading
  • Create and test Alpha model using estimated factor loading

Let’s start by getting Fundamental data and creating factors. In the prior post, I mentioned that it takes a while to download historical fundamental data for all companies in the Dow Jones index, and I recommend saving fundamental data with save(data.fund, file=’data.fund.Rdata’) command. In the following code I will just load historical fundamental data with load(file=’data.fund.Rdata’) command instead of downloading all data again.

###############################################################################
# Load Systematic Investor Toolbox (SIT)
# http://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

	#*****************************************************************
	# Find Sectors for each company in DOW 30
	#****************************************************************** 
	tickers = spl('XLY,XLP,XLE,XLF,XLV,XLI,XLB,XLK,XLU')
	tickers.desc = spl('ConsumerCyclicals,ConsumerStaples,Energy,Financials,HealthCare,Industrials,Materials,Technology,Utilities')
	
	sector.map = c()
	for(i in 1:len(tickers)) {
		sector.map = rbind(sector.map, 
				cbind(sector.spdr.components(tickers[i]), tickers.desc[i])
			)
	}
	colnames(sector.map) = spl('ticker,sector')

	#*****************************************************************
	# Load historical data
	#****************************************************************** 
	load.packages('quantmod,abind')		
	tickers = dow.jones.components()
	
	sectors = factor(sector.map[ match(tickers, sector.map[,'ticker']), 'sector'])
		names(sectors) = tickers
	
	# get fundamental data
	load(file='data.fund.Rdata')
			
	# get pricing data
	load(file='data.Rdata')
	
	#*****************************************************************
	# Combine fundamental and pricing data
	#****************************************************************** 	
	for(i in tickers) {
		fund = data.fund[[i]]
		fund.date = date.fund.data(fund)
			nperiods = ncol(fund)
			
		# D - holds all data to be merged with pricing data
		D = list()
		
		#--------------------------------------------------------------
		# Data for Traditional and Relative Value	
		#--------------------------------------------------------------
				
		# Earnings per Share		
		D$EPS = get.fund.data('Diluted EPS from Total Operations', fund, fund.date, is.12m.rolling=T)
		
		# Sales, exception not available for financial service firms
		D$SALE = get.fund.data('total revenue', fund, fund.date, is.12m.rolling=T)
		
		# Common Shares Outstanding
		D$CSHO = get.fund.data('total common shares out', fund, fund.date)

		# Common Equity
		D$CEQ = get.fund.data('total equity', fund, fund.date)

		# Dividends
		D$DV.PS = get.fund.data('dividends paid per share', fund, fund.date, is.12m.rolling=T)
				
		# Cash Flow
		D$CFL = get.fund.data('net cash from operating activities', fund, fund.date, cash.flow=T, is.12m.rolling=T)
			
		#--------------------------------------------------------------
		# Data for Historical Growth	
		#--------------------------------------------------------------
				
		# Consecutive Quarters of Positive Changes in Trailing 12 Month Cash Flow
		D$CFL.CON.CHG = consecutive.changes(D$CFL)		
			# check
			#merge(D$CFL, sign(diff(D$CFL)), consecutive.changes(D$CFL), consecutive.changes(D$CFL,F))
		
		# Consecutive Quarters of Positive Change in Quarterly Earnings
		D$EPS.CON.CHG = consecutive.changes(D$EPS)
		
		# 12 Month Change in Quarterly Cash Flow
		temp = get.fund.data('net cash from operating activities', fund, fund.date, cash.flow=T)
		D$CFL.CHG = temp / mlag(temp,4)
		
		# 3 Year Average Annual Sales Growth
		D$SALE.3YR.GR = D$SALE
		if(!all(is.na(D$SALE))) D$SALE.3YR.GR = SMA(ifna(D$SALE / mlag(D$SALE,4) - 1,NA), 3*4)

		# 3 Year Average Annual Earnings Growth
		D$EPS.3YR.GR = SMA(D$EPS / mlag(D$EPS,4) - 1, 3*4)
		
		# 12 Quarter Trendline in Trailing 12 Month Earnings		
		D$EPS.TREND = D$EPS * NA
			D$EPS.TREND[12:nperiods] = sapply(12:nperiods, function(i) beta.degree(ols(cbind(1,1:12), D$EPS[(i-12+1):i])$coefficients[2]))
			
		# Slope of Trend Line Through Last 4 Quarters of Trailing 12M Cash Flows		
		D$CFL.TREND = D$CFL * NA
			D$CFL.TREND[4:nperiods] = sapply(4:nperiods, function(i) beta.degree(ols(cbind(1,1:4), D$CFL[(i-4+1):i])$coefficients[2]))

		#--------------------------------------------------------------
		# Data for Profit Trends	
		#--------------------------------------------------------------
		RECT = get.fund.data('receivables', fund, fund.date)
		INVT = get.fund.data('inventories', fund, fund.date)
		
		D$AT = get.fund.data('total assets', fund, fund.date)
		XSGA = get.fund.data('Selling, General & Administrative (SG&A) Expense', fund, fund.date, is.12m.rolling=T)
		
		# Consecutive Quarters of Declines in (Receivables+Inventories) / Sales
		D$RS.CON.CHG = consecutive.changes((RECT + INVT) / D$SALE, F)
				
		# Consecutive Qtrs of Positive Change in Trailing 12M Cash Flow / Sales
		D$CS.CON.CHG = consecutive.changes(D$CFL/D$SALE)
		
		# Overhead = sales, general and administrative costs
		# Consecutive Quarters of Declines in Trailing 12 Month Overhead / Sales
		D$OS.CON.CHG = consecutive.changes(XSGA/D$SALE, F)
				
		# (Industry Relative) Trailing 12 Month (Receivables+Inventories) / Sales
		D$RS = (RECT + INVT) / D$SALE
		
		# (Industry Relative) Trailing 12 Month Sales / Assets
		D$SA = D$SALE / D$AT
		
		# Trailing 12 Month Overhead / Sales
		D$OS = XSGA / D$SALE
		
		# Trailing 12 Month Earnings / Sales
		D$ES = D$EPS / D$SALE						
							
		#--------------------------------------------------------------
		# Merge Fundamental and Pricing data
		#--------------------------------------------------------------
		
		# merge	
		data[[i]] = merge(data[[i]], as.xts(abind(D,along=2), fund.date))						
	}
	
	bt.prep(data, align='keep.all', dates='1995::2011')

	#*****************************************************************
	# Create Factors
	#****************************************************************** 
	prices = data$prices	
		prices = bt.apply.matrix(prices, function(x) ifna.prev(x))
		
	# re-map sectors
	sectors	= sectors[colnames(prices)]	

	# create factors
	factors = list()
	factor.names = list()	

Next let’s create majority of CSFB factors.

	#*****************************************************************
	# Create Traditional Value
	#****************************************************************** 
	factors$TV = list()
	factor.names$TV = 'Traditional Value'

	# Market Value - capitalization
	CSHO =  bt.apply(data, function(x) ifna.prev(x[, 'CSHO']))
	MKVAL = prices * CSHO
	
	# Price / Earnings
	EPS = bt.apply(data, function(x) ifna.prev(x[, 'EPS']))
	factors$TV$EP = EPS / prices
	
	# Price / Trailing Sales
	SALE = bt.apply(data, function(x) ifna.prev(x[, 'SALE']))	
	factors$TV$SP = SALE / MKVAL
	
	# Price / Trailing Cash Flow
	CFL = bt.apply(data, function(x) ifna.prev(x[, 'CFL']))
	factors$TV$CFP = CFL / MKVAL
	
	# Dividend Yield
	DV.PS = bt.apply(data, function(x) ifna.prev(x[, 'DV.PS']))
	factors$TV$DY = DV.PS / prices
	
	# Price / Book Value		
	CEQ = bt.apply(data, function(x) ifna.prev(x[, 'CEQ']))
	factors$TV$BP = CEQ	/ MKVAL

	# Eliminate Price/Sales and Price/Cash Flow for financial firms
	factors$TV$SP[, sectors == 'Financials'] = NA
	factors$TV$CFP[, sectors == 'Financials'] = NA
	
	#*****************************************************************
	# Create Historical Growth
	#****************************************************************** 
	factors$HG = list()
	factor.names$HG = 'Historical Growth'

	for(i in spl('CFL.CON.CHG,EPS.CON.CHG,CFL.CHG,SALE.3YR.GR,EPS.3YR.GR,EPS.TREND,CFL.TREND')) {
		factors$HG[[i]] = bt.apply(data, function(x) ifna.prev(x[, i]))
	}

	#*****************************************************************
	# Create Profit Trends
	#****************************************************************** 
	factors$PT = list()		
	factor.names$PT = 'Profit Trends'	
	
	for(i in spl('RS.CON.CHG,CS.CON.CHG,OS.CON.CHG,RS,SA,OS,ES')) {
		factors$PT[[i]] = bt.apply(data, function(x) ifna.prev(x[, i]))
	}
	
	#*****************************************************************
	# Create Price Momentum
	#****************************************************************** 
	factors$PM = list()
	factor.names$PM = 'Price Momentum'	
	
	# find week ends
	week.ends = endpoints(prices, 'weeks')
		week.prices = prices[week.ends,]
		week.nperiods = nrow(week.prices)
	
	#Slope of 52 Week Trend Line
	factors$PM$S52W.TREND = bt.apply.matrix(week.prices, function(x) {
		c(rep(NA,51),
		sapply(52:week.nperiods, function(i) beta.degree(ols(cbind(1,1:52), x[(i - 52 + 1):i])$coefficients[2]))
		)})
	
	#4/52 Week Price Oscillator
	factors$PM$PP04.52W = bt.apply.matrix(week.prices, EMA, 4) / bt.apply.matrix(week.prices, EMA, 52)
					
	#39 Week Return
	factors$PM$R39W = week.prices / mlag(week.prices, 39)
			
	#51 Week Volume Price Trend
	# compute weekly volume
	temp = bt.apply(data, function(x) cumsum(ifna(Vo(x),0)))
	temp = temp[week.ends,]
		week.volume = temp - mlag(temp)		
	temp = (week.prices - mlag(week.prices)) * week.volume
	factors$PM$VPT51W = bt.apply.matrix(temp, runSum, 51)
			
	# Convert weekly to daily
	for(i in 1:len(factors$PM)) {
		temp = prices * NA
		temp[week.ends,] = factors$PM[[i]]
		factors$PM[[i]] = bt.apply.matrix(temp, function(x) ifna.prev(x))
	}
	
	#Percent Above 260 Day Low
	factors$PM$P260LOW = prices / bt.apply.matrix(prices, runMin, 260)
	
	# Flip sign
	for(i in names(factors$PM)) factors$PM[[i]] = -factors$PM[[i]]
	
	#*****************************************************************
	# Create Price Reversal
	#****************************************************************** 
	factors$PR = list()
	factor.names$PR = 'Price Reversal'	
		
	#5 Day Industry Relative Return
	factors$PR$r5DR = prices/mlag(prices, 5)
		factors$PR$r5DR = factors$PR$r5DR / sector.mean(factors$PR$r5DR, sectors)
	
	#5 Day Money Flow / Volume
	factors$PR$MFV = bt.apply(data, function(x) {
		MFI(cbind(ifna.prev(Hi(x)),ifna.prev(Lo(x)),ifna.prev(Cl(x))), 5) / ifna.prev(Vo(x))
	})
			
	#10 Day MACD - Signal Line
	factors$PR$MACD = bt.apply.matrix(prices, function(x) {
		temp=MACD(x, 10)
		temp[, 'macd'] - temp[, 'signal']
		})		
	
	#14 Day RSI (Relative Strength Indicator)
	factors$PR$RSI = bt.apply.matrix(prices, RSI, 14)
			
	#14 Day Stochastic
	factors$PR$STOCH = bt.apply(data, function(x) {
		stoch(cbind(ifna.prev(Hi(x)),ifna.prev(Lo(x)),ifna.prev(Cl(x))),14)[,'slowD']
	})
		
	#4 Week Industry Relative Return
	factors$PR$rR4W = week.prices / mlag(week.prices,4)
		factors$PR$rR4W = factors$PR$rR4W / sector.mean(factors$PR$rR4W, sectors)
		
		# Convert weekly to daily
		temp = prices * NA
		temp[week.ends,] = factors$PR$rR4W
		factors$PR$rR4W = bt.apply.matrix(temp, function(x) ifna.prev(x))
	
	
	# VOMO - Volume x Momentum
	volume = bt.apply(data, function(x) ifna.prev(Vo(x)))
	factors$PR$VOMO = (prices / mlag(prices,10) - 1) * bt.apply.matrix(volume, runMean, 22) / bt.apply.matrix(volume, runMean, 66)
	
	# Flip sign
	for(i in names(factors$PR)) factors$PR[[i]] = -factors$PR[[i]]
	
	#*****************************************************************
	# Create Small Size
	#****************************************************************** 		
	factors$SS = list()
	factor.names$SS = 'Small Size'		
	
	#Log of Market Capitalization
	factors$SS$MC = log(MKVAL)
	
	#Log of Market Capitalization Cubed
	factors$SS$MC3 = log(MKVAL)^3
	
	#Log of Stock Price
	factors$SS$P = log(prices)
	
	#Log of Total Assets
	factors$SS$AT = log(bt.apply(data, function(x) ifna.prev(x[, 'AT'])))
	
	#Log of Trailing-12-Month Sales
	factors$SS$SALE = log(bt.apply(data, function(x) ifna.prev(x[, 'SALE'])))

	# Flip sign
	for(i in names(factors$SS)) factors$SS[[i]] = -factors$SS[[i]]
	
	#*****************************************************************
	# Convert to monthly
	#****************************************************************** 
	# find month ends
	month.ends = endpoints(prices, 'months')
	
	prices = prices[month.ends,]
		n = ncol(prices)
		nperiods = nrow(prices)
	
	ret = prices / mlag(prices) - 1
		next.month.ret = mlag(ret, -1)
	
	MKVAL = MKVAL[month.ends,]
	
	for(j in 1:len(factors)) {	
		for(i in 1:len(factors[[j]])) {
			factors[[j]][[i]] = factors[[j]][[i]][month.ends,]	
			factors[[j]][[i]][] = ifna(factors[[j]][[i]],NA)
		}
	}

	#*****************************************************************
	# Create Relative Value
	#****************************************************************** 
	factors$RV = list()
	factor.names$RV = 'Relative Value'		
	
	# relative 
	for(i in spl('EP,SP,CFP')) {
		factors$RV[[paste('r',i,sep='')]] = factors$TV[[i]] / sector.mean(factors$TV[[i]], sectors) 		
	}
		
	# spreads, 5 Year Avg = 60 months
	for(i in spl('rEP,rSP,rCFP')) {
		factors$RV[[paste('s',i,sep='')]] = factors$RV[[i]] - 
		apply(factors$RV[[i]], 2, function(x) if(all(is.na(x))) x else SMA(x,60) )
	}
	
	#*****************************************************************
	# Profit Trends (Relative)
	#****************************************************************** 	
	# relative 
	for(i in spl('RS,SA')) {
		factors$PT[[paste('r',i,sep='')]] = factors$PT[[i]] / sector.mean(factors$PT[[i]], sectors)
	}			

Next let’s create Composite Average factor and chart its performance.

	#*****************************************************************
	# Normalize and add Average factor
	#****************************************************************** 
	for(j in names(factors)) {
		factors[[j]] = normalize.normal(factors[[j]])
		factors[[j]] = add.avg.factor(factors[[j]])
	}

	#*****************************************************************
	# Create Composite Average factor
	#****************************************************************** 	
	factors.avg = list()
	for(j in names(factors)) factors.avg[[j]] = factors[[j]]$AVG
	
	factors.avg = add.avg.factor(factors.avg)

	#*****************************************************************
	# Backtest qutiles and qutile spread
	#****************************************************************** 
	plot.quantiles(factors.avg, next.month.ret, 'Average')

	plot.bt.quantiles(factors.avg$AVG, next.month.ret, 'Composite Average', data)

Please note that I’m using the current Dow Jones index components through out the whole history. This is a problem because the Dow Jones index changed its composition a few times in the last 20 years. One of the signs of this bias is high spread for Small Size group of factors. It is not obvious that buying low priced stocks and selling high priced stocks should consistently make money; but it makes sense, if we know beforehand that that low priced companies will be part of Dow Jones index at some point.

Instead of using a simple average of all factors to rank stocks, we can run cross sectional regression to estimate factor loading, and create Alpha model using estimated factor loading. For a complete description of this process, I recommend reading Commonality In The Determinants Of Expected Stock Returns by R. Haugen, N. Baker (1996) pages 8-9.

	#*****************************************************************
	# Save CSFB factors to be used later to create multiple factor Risk Model
	#****************************************************************** 
	save(data, sectors, factors.avg, next.month.ret, file='data.factors.Rdata')
		# remove Composite Average factor
		factors.avg = factors.avg[which(names(factors.avg) != 'AVG')]

	#*****************************************************************
	# Run cross sectional regression and create Alpha model
	#****************************************************************** 
	nperiods = nrow(next.month.ret)
	n = ncol(next.month.ret)
			
	# create matrix for each factor
	factors.matrix = abind(factors.avg, along = 3)	
	all.data = factors.matrix
	
	# betas
	beta = all.data[,1,] * NA
	
	# append next.month.ret to all.data			
	all.data = abind(next.month.ret, all.data, along = 3)
		dimnames(all.data)[[3]][1] = 'Ret'
		
	# estimate betas (factor returns)
	for(t in 12:(nperiods-1)) {
		temp = all.data[t:t,,]
		x = temp[,-1]
		y = temp[,1]
		beta[(t+1),] = lm(y~x-1)$coefficients
	}
	
	
	# create Alpha return forecasts
	alpha = next.month.ret * NA

	for(t in 18:(nperiods-1)) {
		# average betas over the last 6 months
		coef = colMeans(beta[(t-5):t,],na.rm=T)
		alpha[t,] = rowSums(all.data[t,,-1] * t(repmat(coef, 1,n)), na.rm=T)	
	}
	
	#*****************************************************************
	# Backtest qutiles and qutile spread
	#****************************************************************** 
	layout(1:2)
	temp = compute.quantiles(alpha, next.month.ret, plot=T)
	plot.bt.quantiles(alpha, next.month.ret, 'Alpha', data)

The performance of the regression model lags the performance of the simple average of all factors to rank stocks. There are might be many reasons for this, but I want to show you one quick and rational way to increase performance of the regression model.

We do not restrict estimated factor loadings during regression; however, a negative coefficient for a Value factor does not make sense. I don’t want explicitly say that good Value companies should have lower ranks than bad Value companies, just because there is a negative coefficient for a Value factor. One possible solution is to only use factor loadings that are positive to construct Alpha score. Here is the modified code:

	for(t in 18:(nperiods-1)) {
		# average betas over the last 6 months
		coef = colMeans(beta[(t-5):t,],na.rm=T)

		coef = iif(coef > 0, coef, 0)

		alpha[t,] = rowSums(all.data[t,,-1] * t(repmat(coef, 1,n)), na.rm=T)	
	}

The regression model still lags the performance of the simple average of all factors to rank stocks.

In the next post I will show how to create a risk model and estimate risk.

To view the complete source code for this example, please have a look at the fm.all.factor.test() function in factor.model.test.r at github.

Follow

Get every new post delivered to your Inbox.

Join 246 other followers