## Cluster Portfolio Allocation

Today, I want to continue with the clustering theme and show how portfolio weights are determined in the Cluster Portfolio Allocation method. One example of the Cluster Portfolio Allocation method is Cluster Risk Parity (Varadi, Kapler, 2012).

The Cluster Portfolio Allocation method has 3 steps:

- Create Clusters
- Allocate funds within each Cluster
- Allocate funds across all Clusters
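Before turning to real data, the three steps can be sketched on a toy example in base R. The asset names and cluster labels here are made up for illustration, and the cluster assignment (step 1) is simply given rather than estimated.

```r
# Toy example: 5 hypothetical assets already assigned to 2 clusters (step 1 is assumed done)
assets = c('A', 'B', 'C', 'D', 'E')
group  = c(1, 1, 1, 2, 2)

# Step 2: equal weight within each cluster (weights sum to 1 inside every cluster)
within = ave(rep(1, length(group)), group, FUN = function(x) x / sum(x))

# Step 3: equal weight across clusters
ngroups = max(group)
across  = rep(1 / ngroups, ngroups)

# Final weight = within-cluster weight * weight of the asset's cluster
weight = within * across[group]
names(weight) = assets

print(round(weight, 4))
```

In this sketch the three assets in the first cluster share half of the portfolio and the two assets in the second cluster share the other half, whereas a plain equal-weight portfolio would give every asset 20%.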

Below I will illustrate all 3 steps using the “Equal Weight” and “Risk Parity” portfolio allocation methods. Let’s start by loading historical prices for the 10 major asset classes.

```r
###############################################################################
# Load Systematic Investor Toolbox (SIT)
# https://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
setInternet2(TRUE)
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

#*****************************************************************
# Load historical data for ETFs
#******************************************************************
load.packages('quantmod')

tickers = spl('GLD,UUP,SPY,QQQ,IWM,EEM,EFA,IYR,USO,TLT')

data <- new.env()
getSymbols(tickers, src = 'yahoo', from = '1900-01-01', env = data, auto.assign = T)
    for(i in ls(data)) data[[i]] = adjustOHLC(data[[i]], use.Adjusted=T)

bt.prep(data, align='remove.na')

#*****************************************************************
# Setup
#******************************************************************
# compute returns
ret = data$prices / mlag(data$prices) - 1

# setup period
dates = '2012::2012'
ret = ret[dates]
```

Next, let’s compute “Plain” portfolio allocation (i.e. no Clustering)

```r
fn.name = 'equal.weight.portfolio'
fn = match.fun(fn.name)

# create input assumptions
ia = create.historical.ia(ret, 252)

# compute allocation without cluster, for comparison
weight = fn(ia)
```

Next, let’s create clusters and compute portfolio allocation within each Cluster

```r
# create clusters
group = cluster.group.kmeans.90(ia)
ngroups = max(group)

weight0 = rep(NA, ia$n)

# store returns for each cluster
hist.g = NA * ia$hist.returns[,1:ngroups]

# compute weights within each group
for(g in 1:ngroups) {
    if( sum(group == g) == 1 ) {
        weight0[group == g] = 1
        hist.g[,g] = ia$hist.returns[, group == g, drop=F]
    } else {
        # create input assumptions for the assets in this cluster
        ia.temp = create.historical.ia(ia$hist.returns[, group == g, drop=F], 252)

        # compute allocation within cluster
        w0 = fn(ia.temp)

        # set appropriate weights
        weight0[group == g] = w0

        # compute historical returns for this cluster
        hist.g[,g] = ia.temp$hist.returns %*% w0
    }
}
```

Next, let’s compute portfolio allocation across all Clusters and compute final portfolio weights

```r
# create GROUP input assumptions
ia.g = create.historical.ia(hist.g, 252)

# compute allocation across clusters
group.weights = fn(ia.g)

# multiply out group.weights by within group weights
for(g in 1:ngroups)
    weight0[group == g] = weight0[group == g] * group.weights[g]
```

Finally, let’s create reports and compare portfolio allocations

```r
#*****************************************************************
# Create Report
#******************************************************************
load.packages('RColorBrewer')
col = colorRampPalette(brewer.pal(9,'Set1'))(ia$n)

layout(matrix(1:2,nr=2,nc=1))
par(mar = c(0,0,2,0))
index = order(group)
pie(weight[index], labels = paste(colnames(ret), round(100*weight,1),'%')[index], col=col, main=fn.name)
pie(weight0[index], labels = paste(colnames(ret), round(100*weight0,1),'%')[index], col=col, main=paste('Cluster',fn.name))
```

The difference is most striking in the “Equal Weight” portfolio allocation method. The Cluster version allocates 25% to each cluster first, and then allocates equally within each cluster. The Plain version allocates equally among all assets. The “Risk Parity” version below works in a similar way, but instead of equal weights, the focus is on equal risk allocations. For example, UUP gets a much bigger allocation because it is far less risky than any other asset.
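To make the “equal risk” idea concrete, here is a minimal base R sketch that assumes risk parity can be approximated by inverse-volatility weighting. The returns are simulated, the asset names and cluster labels are hypothetical, and `inv.vol` is a helper written for this illustration, not the SIT allocation function.

```r
set.seed(1)

# Simulated daily returns for 4 hypothetical assets with different volatilities
n   = 252
vol = c(low = 0.002, mid1 = 0.01, mid2 = 0.012, high = 0.02)
ret = sapply(vol, function(s) rnorm(n, 0, s))

# Inverse-volatility weights, normalized to sum to 1 (an assumed stand-in for risk parity)
inv.vol = function(r) {
    w = 1 / apply(r, 2, sd)
    w / sum(w)
}

# Plain version: one allocation across all assets
w.plain = inv.vol(ret)

# Cluster version, with made-up cluster labels
group    = c(1, 2, 2, 3)
ngroups  = max(group)
w.within = rep(NA, ncol(ret))
ret.g    = matrix(NA, nrow(ret), ngroups)

for (g in 1:ngroups) {
    r = ret[, group == g, drop = FALSE]
    w.within[group == g] = inv.vol(r)            # allocation within the cluster
    ret.g[, g] = r %*% w.within[group == g]      # return series of the cluster
}

# allocation across clusters, multiplied out by the within-cluster weights
w.cluster = w.within * inv.vol(ret.g)[group]
names(w.cluster) = colnames(ret)

print(round(rbind(plain = w.plain, cluster = w.cluster), 3))
```

In both versions the least volatile asset receives the largest weight, which is the same effect described for UUP above.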

Next week, I will show how to back-test Cluster Portfolio Allocation methods.

To view the complete source code for this example, please have a look at the bt.cluster.portfolio.allocation.test() function in bt.test.r on GitHub.

## Tracking Number of Historical Clusters in DOW 30 and S&P 500

In the Tracking Number of Historical Clusters post, I looked at how 3 different methods were able to identify clusters across the universe of 10 major asset classes. Today, I want to share the impact of clustering on a larger universe. Below I examine the historical time series of the number of clusters in the DOW 30 and S&P 500 indices.

I went back to 1970 for the companies in the DOW 30 index.

I went back to 1994 for the companies in the S&P 500 index.

Takeaways: The markets are changing, and correspondingly the diversification (i.e. number of clusters) goes through cycles, as can be seen in the charts. The results will vary across different methods and must be validated by the user. For example, some readers will consider an average of 10 clusters for the S&P 500 too small, while others might consider 10 clusters sufficient.
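As a rough illustration of how the number of clusters can be tracked through time, the base R sketch below counts clusters in rolling windows of simulated returns. It assumes one particular rule: take the smallest k for which k-means explains at least 90% of the variance. `count.clusters` is a hypothetical helper written for this sketch and is not the code behind the charts.

```r
set.seed(42)

# Simulated daily returns: 6 hypothetical assets driven by 2 common factors
n  = 500
f1 = rnorm(n)
f2 = rnorm(n)
ret = cbind(f1 + 0.3 * rnorm(n), f1 + 0.3 * rnorm(n), f1 + 0.3 * rnorm(n),
            f2 + 0.3 * rnorm(n), f2 + 0.3 * rnorm(n), f2 + 0.3 * rnorm(n))
colnames(ret) = paste0('A', 1:6)

# Smallest k whose k-means fit explains at least `p` of the total variance (assumed rule)
count.clusters = function(r, p = 0.9, max.k = ncol(r) - 1) {
    # 2-D embedding of the correlation distance between assets
    xy = cmdscale(as.dist(1 - cor(r)))
    for (k in 2:max.k) {
        fit = kmeans(xy, k, iter.max = 100, nstart = 100)
        if (fit$betweenss / fit$totss >= p) return(k)
    }
    max.k
}

# Number of clusters in each rolling 250-day window, stepped every 50 days
ends = seq(250, n, by = 50)
n.clusters = sapply(ends, function(e) count.clusters(ret[(e - 249):e, ]))

print(data.frame(window.end = ends, n.clusters = n.clusters))
```

With real index constituents the window length, the step, and the 90% threshold are all choices that change the resulting count, which is one reason the results should be validated by the user.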

## An Example of Seasonality Analysis

Today, I want to demonstrate how easy it is to create a seasonality analysis study and produce a sample summary report. As an example study, I will use the S&P Annual Performance After a Big January post by Avondale Asset Management.

The first step is to load historical prices and find Big Januaries.

```r
###############################################################################
# Load Systematic Investor Toolbox (SIT)
# https://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
setInternet2(TRUE)
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

#*****************************************************************
# Load historical data
#******************************************************************
load.packages('quantmod')
price = getSymbols('^GSPC', src = 'yahoo', from = '1900-01-01', auto.assign = F)

# convert to monthly
price = Cl(to.monthly(price, indexAt='endof'))
ret = price / mlag(price) - 1

#*****************************************************************
# Find Januaries with return > 4%
#******************************************************************
index = which( date.month(index(ret)) == 1 & ret > 4/100 )

# create summary table with return in January and return for the whole year
temp = c(coredata(ret),rep(0,12))
out = cbind(ret[index], sapply(index, function(i) prod(1 + temp[i:(i+11)])-1))
    colnames(out) = spl('January,Year')
```
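The same selection and compounding logic can be checked on a tiny made-up series in base R, without SIT or a data download. The monthly returns below are hypothetical numbers chosen so the result is easy to verify by hand.

```r
# Hypothetical monthly returns for 3 years (Jan 2001 to Dec 2003), made up for illustration
dates = seq(as.Date('2001-01-01'), by = 'month', length.out = 36)
ret = rep(0.01, 36)
ret[1]  = 0.05   # Jan 2001: big January
ret[13] = 0.02   # Jan 2002: below the 4% threshold
ret[25] = 0.06   # Jan 2003: big January

# positions of Januaries with return above 4%
is.jan = format(dates, '%m') == '01'
index  = which(is.jan & ret > 4/100)

# compound the 12 monthly returns starting at each big January;
# the zero padding keeps a partial final year from running off the end
temp = c(ret, rep(0, 12))
year.ret = sapply(index, function(i) prod(1 + temp[i:(i + 11)]) - 1)

out = data.frame(Year = format(dates[index], '%Y'),
                 January = ret[index],
                 Annual = year.ret)
print(out)
```

Because of the zero padding, a year that is still in progress shows its year-to-date return rather than a full-year figure.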

All the hard work is done now. Let’s create a chart and a table to summarize the S&P Annual Performance After a Big January numbers.

```r
#*****************************************************************
# Create Plot
#******************************************************************
col=col.add.alpha(spl('black,gray'),200)
pos = barplot(100*out, border=NA, beside=T, axisnames = F, axes = FALSE,
    col=col, main='Annual Return When S&P500 Rises More than 4% in January')
    axis(1, at = colMeans(pos), labels = date.year(index(out)), las=2)
    axis(2, las=1)
grid(NA, NULL)
abline(h= 100*mean(out$Year), col='red', lwd=2)
plota.legend(spl('January,Annual,Average'), c(col,'red'))

# plot table
plot.table(round(100*as.matrix(out),1))
```

That is it, we are done.

Takeaways: It is very easy to create a seasonality analysis study. Next, you might want to schedule the study script to run at specific times throughout the year and send you a reminder email when the study conditions are met.
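A scheduled script mainly needs a check that tells it whether the study conditions hold for the latest data. The sketch below shows one possible check in base R; `big.january.triggered` and the sample returns are hypothetical, and the email step is left out because it depends on your own mail setup.

```r
# TRUE when the latest observation is a January with return above the threshold
big.january.triggered = function(dates, ret, threshold = 4/100) {
    n = length(ret)
    format(dates[n], '%m') == '01' && ret[n] > threshold
}

# hypothetical month-end dates and monthly returns
dates = as.Date(c('2012-11-30', '2012-12-31', '2013-01-31'))
ret   = c(0.003, 0.007, 0.05)

if (big.january.triggered(dates, ret)) {
    # replace this message with your own notification step
    message('Big January: study conditions are met')
}
```

A script like this could be run with Rscript from whatever scheduler your system provides.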

To view the complete source code for this example, please have a look at the bt.seasonality.january.test() function in bt.test.r on GitHub.