Search Results

Keyword: ‘pattern’

Backtesting Classical Technical Patterns

In the last post, Classical Technical Patterns, I discussed the algorithm and pattern definitions presented in the Foundations of Technical Analysis by A. Lo, H. Mamaysky, J. Wang (2000) paper. Today, I want to check how different patterns performed historically using SPY.

I will follow the rolling window procedure discussed on pages 14-15 of the paper. Let’s begin by loading the historical data for the SPY and running a rolling window pattern search algorithm.

###############################################################################
# Load Systematic Investor Toolbox (SIT)
###############################################################################
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

	#*****************************************************************
	# Load historical data
	#****************************************************************** 
	load.packages('quantmod')
	ticker = 'SPY'
	
	data = getSymbols(ticker, src = 'yahoo', from = '1970-01-01', auto.assign = F)
		data = adjustOHLC(data, use.Adjusted=T)

	#*****************************************************************
	# Search for all patterns over a rolling window
	#****************************************************************** 
	load.packages('sm') 
	history = as.vector(coredata(Cl(data)))
	
	window.L = 35
	window.d = 3
	window.len = window.L + window.d

	patterns = pattern.db()
	
	found.patterns = c()
	
	for(t in window.len : (len(history)-1)) {
		ret = history[(t+1)]/history[t]-1
		
		sample = history[(t - window.len + 1):t]		
		obj = find.extrema( sample )	
		
		if(len(obj$data.extrema.loc) > 0) {
			out =  find.patterns(obj, patterns = patterns, silent=F, plot=F)  
			
			if(len(out)>0) found.patterns = rbind(found.patterns,cbind(t,out,t-window.len+out, ret))
		}
		if( t %% 10 == 0) cat(t, 'out of', len(history), '\n')
	}
	colnames(found.patterns) = spl('t,start,end,tstart,tend,ret')	

There are many patterns that are found multiple times. Let’s remove the entries that refer to the same pattern and keep only the first occurrence.

	#*****************************************************************
	# Clean found patterns
	#****************************************************************** 	
	# remove patterns that finished after window.L
	found.patterns = found.patterns[found.patterns[,'end'] <= window.L,]
		
	# remove the patterns found multiple times, only keep first one
	pattern.names = unique(rownames(found.patterns))
	all.patterns = c()
	for(name in pattern.names) {
		index = which(rownames(found.patterns) == name)
		temp = NA * found.patterns[index,]
		
		i.count = 0
		i.start = 1
		while(i.start < len(index)) {
			i.count = i.count + 1
			temp[i.count,] = found.patterns[index[i.start],]
			subindex = which(found.patterns[index,'tstart'] > temp[i.count,'tend'])			
						
			if(len(subindex) > 0) {
				i.start = subindex[1]
			} else break		
		} 
		all.patterns = rbind(all.patterns, temp[1:i.count,])		
	}	

Now we can visualize the performance of each pattern using the charts from my presentation about Seasonality Analysis and Pattern Matching at the R/Finance conference.

	#*****************************************************************
	# Plot
	#****************************************************************** 	
	# Frequency for each Pattern
	frequency = tapply(rep(1,nrow(all.patterns)), rownames(all.patterns), sum)
	layout(1)
	barplot.with.labels(frequency/100, 'Frequency for each Pattern')

	
	# Summary for each Pattern
	all.patterns[,'ret'] = history[(all.patterns[,'t']+20)] / history[all.patterns[,'t']] - 1
	data_list = tapply(all.patterns[,'ret'], rownames(all.patterns), list)
	group.seasonality(data_list, '20 days after Pattern')


	# Details for BBOT pattern
	layout(1)
	name = 'BBOT'
	index = which(rownames(all.patterns) == name)	
	time.seasonality(data, all.patterns[index,'t'], 20, name)	

The Broadening bottoms (BBOT) and Rectangle tops (RTOP) worked historically well for SPY.

To view the complete source code for this example, please have a look at the bt.patterns.test() function in rfinance2012.r at github.

Advertisements
Categories: Backtesting, R

Classical Technical Patterns

In my presentation about Seasonality Analysis and Pattern Matching at the R/Finance conference, I used examples that I have previously covered in my blog:

The only subject in my presentation that I have not discussed previously was about Classical Technical Patterns. For example, the Head and Shoulders pattern, most often seen in up-trends and generally regarded as a reversal pattern. Below I implemented the algorithm and pattern definitions presented in the Foundations of Technical Analysis by A. Lo, H. Mamaysky, J. Wang (2000) paper.

To identify a price pattern I will follow steps as described in the Foundations of Technical Analysis paper:

  • First, fit a smoothing estimator, a kernel regression estimator, to approximate time series.
  • Next, determine local extrema, tops and bottoms, using fist derivative of the kernel regression estimator.
  • Define classical technical patterns in terms of tops and bottoms.
  • Search for classical technical patterns throughout the tops and bottoms of the kernel regression estimator.

Let’s begin by loading historical prices for SPY:

###############################################################################
# Load Systematic Investor Toolbox (SIT)
###############################################################################
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

	#*****************************************************************
	# Load historical data
	#****************************************************************** 
	load.packages('quantmod')
	ticker = 'SPY'
	
	data = getSymbols(ticker, src = 'yahoo', from = '1970-01-01', auto.assign = F)
		data = adjustOHLC(data, use.Adjusted=T)
	
	#*****************************************************************
	# Find Classical Techical Patterns, based on
	# Pattern Matching. Based on Foundations of Technical Analysis
	# by A.W. LO, H. MAMAYSKY, J. WANG	
	#****************************************************************** 
	plot.patterns(data, 190, ticker)

The first step is to fit a smoothing estimator, a kernel regression estimator, to approximate time series. I used sm package to fit kernel regression:

	library(sm)
	y = as.vector( last( Cl(data), 190) )
	t = 1:len(y)

	# fit kernel regression with cross-validatio
	h = h.select(t, y, method = 'cv')
		temp = sm.regression(t, y, h=h, display = 'none')

	# find estimated fit
	mhat = approx(temp$eval.points, temp$estimate, t, method='linear')$y

The second step is to find local extrema, tops and bottoms, using first derivative of the kernel regression estimator. (more details in the paper on page 15):

	temp = diff(sign(diff(mhat)))
	# loc - location of extrema, loc.dir - direction of extrema
	loc = which( temp != 0 ) + 1
		loc.dir = -sign(temp[(loc - 1)])

I put the logic for the first and second step into the find.extrema() function.

The next step is to define classical technical patterns in terms of tops and bottoms. The pattern.db() function implements the 10 patterns described in the paper on page 12. For example, let’s have a look at the Head and Shoulders pattern. The Head and Shoulders pattern:

  • is defined by 5 extrema points (E1, E2, E3, E4, E5)
  • starts with a maximum (E1)
  • E1 and E5 are within 1.5 percent of their average
  • E2 and E4 are within 1.5 percent of their average

The R code below that corresponds to the Head and Shoulders pattern is a direct translation, from the pattern description in the paper on page 12, and is very readable:

	#****************************************************************** 	
	# Head-and-shoulders (HS)
	#****************************************************************** 	
	pattern = list()
	pattern$len = 5
	pattern$start = 'max'
	pattern$formula = expression({
		avg.top = (E1 + E5) / 2
		avg.bot = (E2 + E4) / 2

		# E3 > E1, E3 > E5
		E3 > E1 &
		E3 > E5 &
		
		# E1 and E5 are within 1.5 percent of their average
		abs(E1 - avg.top) < 1.5/100 * avg.top &
		abs(E5 - avg.top) < 1.5/100 * avg.top &
		
		# E2 and E4 are within 1.5 percent of their average
		abs(E2 - avg.bot) < 1.5/100 * avg.bot &
		abs(E4 - avg.bot) < 1.5/100 * avg.bot
		})
	patterns$HS = pattern		

The last step is a function that searches for all defined patterns in the kernel regression representation of original time series.

I put the logic for this step into the find.patterns() function. Below is a simplified version:

find.patterns <- function
(
	obj, 	# extrema points
	patterns = pattern.db() 
) 
{
	data = obj$data
	extrema.dir = obj$extrema.dir
	data.extrema.loc = obj$data.extrema.loc
	n.index = len(data.extrema.loc)

	# search for patterns
	for(i in 1:n.index) {
	
		for(pattern in patterns) {
		
			# check same sign
			if( pattern$start * extrema.dir[i] > 0 ) {
			
				# check that there is suffcient number of extrema to complete pattern
				if( i + pattern$len - 1 <= n.index ) {
				
					# create enviroment to check pattern: E1,E2,...,En; t1,t2,...,tn
					envir.data = c(data[data.extrema.loc][i:(i + pattern$len - 1)], 
					data.extrema.loc[i:(i + pattern$len - 1)])									
						names(envir.data) = c(paste('E', 1:pattern$len, sep=''), 
							paste('t', 1:pattern$len, sep=''))
					envir.data = as.list(envir.data)					
										
					# check if pattern was found
					if( eval(pattern$formula, envir = envir.data) ) {
						cat('Found', pattern$name, 'at', i, '\n')
					}
				}
			}		
		}	
	}
	
}

I put the logic for the entire process in to the plot.patterns() helper function. The plot.patterns() function first call find.extrema() function to determine extrema points, and next it calls find.patterns() function to find and plot patterns. Let’s find classical technical patterns in the last 150 days of SPY history:

	#*****************************************************************
	# Load historical data
	#****************************************************************** 
	load.packages('quantmod')
	ticker = 'SPY'
	
	data = getSymbols(ticker, src = 'yahoo', from = '1970-01-01', auto.assign = F)
		data = adjustOHLC(data, use.Adjusted=T)
	
	#*****************************************************************
	# Find Classical Techical Patterns, based on
	# Pattern Matching. Based on Foundations of Technical Analysis
	# by A.W. LO, H. MAMAYSKY, J. WANG	
	#****************************************************************** 
	plot.patterns(data, 150, ticker)

It is very easy to define you own custom patterns and I encourage everyone to give it a try.

To view the complete source code for this example, please have a look at the pattern.test() function in rfinance2012.r at github.

Categories: R

R/Finance 2012 presentation

I have attended for the first time the R/Finance conference this year. I must say that I’m impressed with the effort that organizers put into the conference and the breadth and the depth of the material / ideas presented.

I just want to share slides and examples that I used in my presentation about Seasonality Analysis and Pattern Matching.

In the next post, I plan to discuss in more detail the algorithm and the price pattern definitions I used to find classical technical patterns.

Categories: Uncategorized

Time Series Matching with Dynamic Time Warping

January 20, 2012 6 comments

THIS IS NOT INVESTMENT ADVICE. The information is provided for informational purposes only.

In the Time Series Matching post, I used one to one mapping to the compute distance between the query(current pattern) and reference(historical time series). Following chart visualizes this concept. The distance is the sum of vertical lines.

An alternative way to map one time series to another is Dynamic Time Warping(DTW). DTW algorithm looks for minimum distance mapping between query and reference. Following chart visualizes one to many mapping possible with DTW.

To check if there a difference between simple one to one mapping and DTW, I will search for time series matches that are similar to the most recent 90 days of SPY in the last 10 years of history. Following code loads historical prices from Yahoo Fiance, setups the problem and computes Euclidean distance for the historical rolling window using the Systematic Investor Toolbox:

###############################################################################
# Load Systematic Investor Toolbox (SIT)
# https://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

	#*****************************************************************
	# Load historical data
	#****************************************************************** 
	load.packages('quantmod')	
	tickers = 'SPY'

	data = getSymbols(tickers, src = 'yahoo', from = '1950-01-01', auto.assign = F)

	#*****************************************************************
	# Euclidean distance, one to one mapping
	#****************************************************************** 
	obj = bt.matching.find(Cl(data), normalize.fn = normalize.mean, dist.fn = 'dist.euclidean', plot=T)

	matches = bt.matching.overlay(obj, plot.index=1:90, plot=T)

	layout(1:2)
	matches = bt.matching.overlay(obj, plot=T, layout=T)
	bt.matching.overlay.table(obj, matches, plot=T, layout=T)

Next, let’ examine the top 10 matches using Dynamic Time Warping distance. I will use the Dynamic Time Warping implementation from dtw package.

	#*****************************************************************
	# Dynamic time warping distance	
	#****************************************************************** 
	# http://en.wikipedia.org/wiki/Dynamic_time_warping
	# http://dtw.r-forge.r-project.org/
	#****************************************************************** 
	load.packages('dtw')

	obj = bt.matching.find(Cl(data), normalize.fn = normalize.mean, dist.fn = 'dist.DTW', plot=T)

	matches = bt.matching.overlay(obj, plot.index=1:90, plot=T)

	layout(1:2)
	matches = bt.matching.overlay(obj, plot=T, layout=T)
	bt.matching.overlay.table(obj, matches, plot=T, layout=T)

Both algorithms produced very similar matches and very similar predictions. I would use these predictions as an educated guess to market action going forward. So far, it looks like the market will not be going up in full throttle in the next 22 days.

To view the complete source code for this example, please have a look at the bt.matching.dtw.test() function in bt.test.r at github.

Categories: R, Strategy, Trading Strategies

Time Series Matching strategy backtest

January 17, 2012 4 comments

This is a quick post to address comments raised in the Time Series Matching post. I will show a very simple example of backtesting a Time Series Matching strategy using a distance weighted prediction. I have to warn you, the strategy’s performance is worse then the Buy and Hold.

I used the code from Time Series Matching post and re-arranged it into 3 functions:
bt.matching.find – finds historical matches similar to the given query (pattern).
bt.matching.overlay – creates matrix that overlays all matches one on top of each other.
bt.matching.overlay.table – creates and plots matches summary table.

I will use historical prices for ^GSPC to extend SPY time series. I will create a monthly backtest, that trades at the end of the month, staring January 1994. Each month, I will look back for the best historical matches similar to the last 90 days in the last 10 years of history.

I will compute a distance weighted average prediction for the next month and will go long if prediction is positive, otherwise stay in cash. This is a very simple backtest and the strategy’s performance is worse then the Buy and Hold.

Following code loads historical prices from Yahoo Fiance and setups Time Series Matching strategy backtest using the Systematic Investor Toolbox:

###############################################################################
# Load Systematic Investor Toolbox (SIT)
###############################################################################
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

	#*****************************************************************
	# Load historical data
	#****************************************************************** 
	load.packages('quantmod')	
	tickers = spl('SPY,^GSPC')

	data <- new.env()
	getSymbols(tickers, src = 'yahoo', from = '1950-01-01', env = data, auto.assign = T)
	bt.prep(data, align='keep.all')

	# combine SPY and ^GSPC
	scale = as.double( data$prices$SPY['1993:01:29'] / data$prices$GSPC['1993:01:29'] )
	hist = c(scale * data$prices$GSPC['::1993:01:28'], data$prices$SPY['1993:01:29::'])

	#*****************************************************************
	# Backtest setup:
	# Starting January 1994, each month search for the 10 best matches 
	# similar to the last 90 days in the last 10 years of history data
	#
	# Invest next month if distance weighted prediction is positive
	# otherwise stay in cash
	#****************************************************************** 
	month.ends = endpoints(hist, 'months')
		month.ends = month.ends[month.ends > 0]		
	
	start.index = which(date.year(index(hist[month.ends])) == 1994)[1]
	weight = hist * NA
	
	for( i in start.index : len(month.ends) ) {
		obj = bt.matching.find(hist[1:month.ends[i],], normalize.fn = normalize.first)
		matches = bt.matching.overlay(obj)
		
		# compute prediction for next month
		n.match = len(obj$min.index)
		n.query = len(obj$query)				
		month.ahead.forecast = matches[,(2*n.query+22)]/ matches[,2*n.query] - 1
		
		# Distance weighted average
		temp = round(100*(obj$dist / obj$dist[1] - 1))		
			n.weight = max(temp) + 1
			weights = (n.weight - temp) / ( n.weight * (n.weight+1) / 2)
		weights = weights / sum(weights)
			# barplot(weights)
		avg.direction = weighted.mean(month.ahead.forecast[1:n.match], w=weights)
		
		# Logic
		weight[month.ends[i]] = 0
		if( avg.direction > 0 ) weight[month.ends[i]] = 1
	}

Next, let’s compare the Time Series Matching strategy to Buy & Hold:

	#*****************************************************************
	# Code Strategies
	#****************************************************************** 
	tickers = 'SPY'

	data <- new.env()
	getSymbols(tickers, src = 'yahoo', from = '1950-01-01', env = data, auto.assign = T)
	bt.prep(data, align='keep.all')
	
	prices = data$prices  
	
	# Buy & Hold	
	data$weight[] = 1
	buy.hold = bt.run(data)	

	# Strategy
	data$weight[] = NA
		data$weight[] = weight['1993:01:29::']
		capital = 100000
		data$weight[] = (capital / prices) * bt.exrem(data$weight)
	test = bt.run(data, type='share', capital=capital, trade.summary=T)
		
	#*****************************************************************
	# Create Report
	#****************************************************************** 
	plotbt.custom.report.part1(test, buy.hold, trade.summary=T)

How would you change the strategy or backtest to make it profitable? Please share your ideas. I looking forward to exploring them.

To view the complete source code for this example, please have a look at the bt.matching.backtest.test() function in bt.test.r at github.