Calendar Strategy: Fed Days

UPDATE: A reader pointed out a problem with the original post: the prices > SMA(prices,100) statement introduced look-ahead bias. In the calendar strategy logic I did not use the usual lag of one day because the important days are known beforehand. However, the prices > SMA(prices,100) statement should be lagged by one day. I updated the plots and source code.

Today, I want to follow up with the Calendar Strategy: Option Expiry post. Let’s examine the importance of the FED meeting days as presented in the Fed Days And Intermediate-Term Highs post.

Let’s dive in and examine the historical performance of SPY during FED meeting days:

###############################################################################
# Load Systematic Investor Toolbox (SIT)
# http://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
setInternet2(TRUE)
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)
	#*****************************************************************
	# Load historical data
	#****************************************************************** 
	load.packages('quantmod')
		
	tickers = spl('SPY')
		
	data <- new.env()
	getSymbols.extra(tickers, src = 'yahoo', from = '1980-01-01', env = data, set.symbolnames = T, auto.assign = T)
		for(i in data$symbolnames) data[[i]] = adjustOHLC(data[[i]], use.Adjusted=T)
	bt.prep(data, align='keep.all', fill.gaps = T)

	#*****************************************************************
	# Setup
	#*****************************************************************
	prices = data$prices
		n = ncol(prices)
		
	dates = data$dates	
	
	models = list()
	
	universe = prices > 0
		# 100 day SMA filter
		universe = universe & prices > SMA(prices,100)
		
	# Find Fed Days
	info = get.FOMC.dates(F)
		key.date.index = na.omit(match(info$day, dates))
	
	key.date = NA * prices
		key.date[key.date.index,] = T
		
	#*****************************************************************
	# Strategy
	#*****************************************************************
	signals = list(T0=0)
		for(i in 1:15) signals[[paste0('N',i)]] = 0:i	
	signals = calendar.signal(key.date, signals)
	models = calendar.strategy(data, signals, universe = universe)

	strategy.performance.snapshoot(models, T, sort.performance=F)

plot1

Please note the 100-day moving average filter above. If we take it out, the performance deteriorates significantly.
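
The update at the top of this post says the moving average filter should be lagged by one day. As a rough base R illustration of what that means (this is not the SIT code; `sma` and `lag_by` are hypothetical helpers written only for this sketch), compare a same-day filter with its one-day-lagged version:

```r
# Hypothetical helper: simple moving average over the last n values (NA until n values exist)
sma <- function(x, n) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
}

# Hypothetical helper: shift a vector forward by k days, padding with NA
lag_by <- function(x, k = 1) {
  c(rep(NA, k), head(x, -k))
}

# Toy closing prices, made up for illustration
prices <- c(10, 11, 12, 11, 13, 14, 12, 15)

# Same-day filter: uses today's close, which is not known until the close
filter_same_day <- prices > sma(prices, 3)

# Lagged filter: the decision for today uses only yesterday's close and average
filter_lagged <- lag_by(filter_same_day, 1)

print(data.frame(prices, filter_same_day, filter_lagged))
```

With the lagged version, the filter value used on any day depends only on prices that were already available the day before.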

	# custom stats	
	out = sapply(models, function(x) list(
		CAGR = 100*compute.cagr(x$equity),
		MD = 100*compute.max.drawdown(x$equity),
		Win = x$trade.summary$stats['win.prob', 'All'],
		Profit = x$trade.summary$stats['profitfactor', 'All']
		))	
	performance.barchart.helper(out, sort.performance = F)
	
	strategy.performance.snapshoot(models$N15, control=list(main=T))
	
	last.trades(models$N15)
	
	trades = models$N15$trade.summary$trades
		trades = make.xts(parse.number(trades[,'return']), as.Date(trades[,'entry.date']))
	layout(1:2)
		par(mar = c(4,3,3,1), cex = 0.8) 
	barplot(trades, main='N15 Trades', las=1)
	plot(cumprod(1+trades/100), type='b', main='N15 Trades', las=1)

N15 Strategy:

plot2

plot3

plot4

plot5

With this post I wanted to show how easily we can study calendar strategy performance using the Systematic Investor Toolbox.

Next, I will look at the importance of the Dividend days.

To view the complete source code for this example, please have a look at the bt.calendar.strategy.fed.days.test() function in bt.test.r at github.

Categories: Backtesting, R

Calendar Strategy: Option Expiry

Today, I want to follow up with the Calendar Strategy: Month End post. Let’s examine the performance of Option Expiry days as presented in the post The Mooost Wonderful Tiiiiiiime of the Yearrrrrrrrr!

First, I created two convenience functions for creating a calendar signal and back-testing a calendar strategy: the calendar.signal and calendar.strategy functions are in strategy.r at github.

Now, let’s dive in and examine the historical performance of SPY during the Option Expiry period in December:

###############################################################################
# Load Systematic Investor Toolbox (SIT)
# http://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
setInternet2(TRUE)
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

	#*****************************************************************
	# Load historical data
	#****************************************************************** 
	load.packages('quantmod')
		
	tickers = spl('SPY')
		
	data <- new.env()
	getSymbols.extra(tickers, src = 'yahoo', from = '1980-01-01', env = data, set.symbolnames = T, auto.assign = T)
		for(i in data$symbolnames) data[[i]] = adjustOHLC(data[[i]], use.Adjusted=T)
	bt.prep(data, align='keep.all', fill.gaps = T)

	#*****************************************************************
	# Setup
	#*****************************************************************
	prices = data$prices
		n = ncol(prices)
		
	dates = data$dates	
	
	models = list()
	
	universe = prices > 0
		
	# Find Friday before options expiration week in December
	years = date.year(range(dates))
	second.friday = third.friday.month(years[1]:years[2], 12) - 7
		key.date.index = na.omit(match(second.friday, dates))
				
	key.date = NA * prices
		key.date[key.date.index,] = T

	#*****************************************************************
	# Strategy: Op-ex week in December most bullish week of the year for the SPX
	#   Buy: December Friday prior to op-ex.
	#   Sell X days later: 100K/trade 1984-present
	# http://quantifiableedges.blogspot.com/2011/12/mooost-wonderful-tiiiiiiime-of.html
	#*****************************************************************
	signals = list(T0=0)
		for(i in 1:15) signals[[paste0('N',i)]] = 0:i	
	signals = calendar.signal(key.date, signals)
	models = calendar.strategy(data, signals, universe = universe)
	    
	strategy.performance.snapshoot(models, T, sort.performance=F)
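
The key date above is the Friday one week before the third Friday of December. As a rough sketch of how such a date can be computed in base R (`third_friday` is a hypothetical helper written for this illustration, not SIT's third.friday.month):

```r
# Hypothetical helper: date of the third Friday of a given month
third_friday <- function(year, month) {
  first <- as.Date(sprintf('%04d-%02d-01', year, month))
  # Day of week as a number, 0 = Sunday ... 5 = Friday (locale independent)
  wday <- as.integer(format(first, '%w'))
  # Days from the 1st to the first Friday, then two more weeks
  first + (5 - wday) %% 7 + 14
}

opex <- third_friday(2013, 12)
print(opex)        # third Friday of December 2013
print(opex - 7)    # the Friday before option expiry week, used as the key date
```

Using the numeric `%w` format avoids depending on locale-specific weekday names.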

plot1

Strategies vary in performance. Next, let’s examine them in a bit more detail:

	# custom stats	
	out = sapply(models, function(x) list(
		CAGR = 100*compute.cagr(x$equity),
		MD = 100*compute.max.drawdown(x$equity),
		Win = x$trade.summary$stats['win.prob', 'All'],
		Profit = x$trade.summary$stats['profitfactor', 'All']
		))	
	performance.barchart.helper(out, sort.performance = F)
	
	# Plot 15 day strategy
	strategy.performance.snapshoot(models$N15, control=list(main=T))
	
	# Plot trades for 15 day strategy
	last.trades(models$N15)
	
	# Make a summary plot of trades for 15 day strategy
	trades = models$N15$trade.summary$trades
		trades = make.xts(parse.number(trades[,'return']), as.Date(trades[,'entry.date']))
	layout(1:2)
		par(mar = c(4,3,3,1), cex = 0.8) 
	barplot(trades, main='Trades', las=1)
	plot(cumprod(1+trades/100), type='b', main='Trades', las=1)

Details for the 15 day strategy:
plot2

plot3

plot4

plot5

With this post I wanted to show how easily we can study calendar strategy performance using the Systematic Investor Toolbox.

Next, I will look at the importance of the FED meeting days.

To view the complete source code for this example, please have a look at the bt.calendar.strategy.option.expiry.test() function in bt.test.r at github.

Categories: Backtesting, R

Calendar Strategy: Month End

April 28, 2014

Calendar Strategy is a very simple strategy that buys and sells on predetermined days, known in advance. Today I want to show how we can easily investigate performance at and around Month End days.

First, let’s load historical prices for SPY from Yahoo Finance and compute SPY performance at the month-ends, i.e. the strategy will open a long position at the close on the 30th and sell the position at the close on the 31st.

###############################################################################
# Load Systematic Investor Toolbox (SIT)
# http://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
setInternet2(TRUE)
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)
	#*****************************************************************
	# Load historical data
	#****************************************************************** 
	load.packages('quantmod')
		
	tickers = spl('SPY')
		
	data <- new.env()
	getSymbols.extra(tickers, src = 'yahoo', from = '1980-01-01', env = data, set.symbolnames = T, auto.assign = T)
		for(i in data$symbolnames) data[[i]] = adjustOHLC(data[[i]], use.Adjusted=T)
	bt.prep(data, align='keep.all', fill.gaps = T)

	#*****************************************************************
	# Setup
	#*****************************************************************
	prices = data$prices
		n = ncol(prices)
		
	models = list()
		
	period.ends = date.month.ends(data$dates, F)
		
	#*****************************************************************
	# Strategy
	#*****************************************************************
	key.date = NA * prices
		key.date[period.ends] = T
	
	universe = prices > 0
	signal = key.date

	data$weight[] = NA
		data$weight[] = ifna(universe & key.date, F)
	models$T0 = bt.run.share(data, do.lag = 0, trade.summary=T, clean.signal=T) 
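
The month-end days are located above with SIT's date.month.ends. A minimal base R sketch of the same idea follows; `month_end_index` and its `last_date` flag are hypothetical and written only for this illustration, so the exact behaviour of the SIT function may differ:

```r
# Hypothetical helper: index of the last trading day of each completed month
month_end_index <- function(dates, last_date = FALSE) {
  ym <- format(dates, '%Y-%m')
  # A date is a month end when the next date falls in a different month
  ends <- which(ym[-1] != ym[-length(ym)])
  # Optionally treat the final observation as a month end as well
  if (last_date) ends <- c(ends, length(dates))
  ends
}

# Toy trading calendar, made up for illustration
dates <- as.Date(c('2014-01-30', '2014-01-31', '2014-02-03',
                   '2014-02-27', '2014-02-28', '2014-03-03'))
print(dates[month_end_index(dates)])
```

Working from the trading dates themselves, rather than from calendar month ends, handles months whose last calendar day falls on a weekend or holiday.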

Please note that above, in the bt.run.share call, I set the do.lag parameter to zero (its default value is one). The default is one because the signal (the decision to trade) is derived using all information available today, so the position can only be implemented the next day. I.e.

portfolio.returns = lag(signal, do.lag) * returns = lag(signal, 1) * returns

However, in case of the calendar strategy there is no need to lag signal because the trade day is known in advance. I.e.

portfolio.returns = lag(signal, do.lag) * returns = signal * returns
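
To make the two formulas concrete, here is a small base R sketch with toy numbers (the `lag_by` helper is hypothetical and only stands in for the lag applied inside the back-test):

```r
# Hypothetical helper: shift a vector forward by k days, padding with NA
lag_by <- function(x, k) {
  if (k == 0) return(x)
  c(rep(NA, k), head(x, -k))
}

# Toy daily returns and a 0/1 signal, made up for illustration
returns <- c(0.01, -0.02, 0.03, 0.01, -0.01)
signal  <- c(0, 1, 0, 1, 0)

# do.lag = 1: a signal computed at today's close earns tomorrow's return
ret_lag1 <- lag_by(signal, 1) * returns
ret_lag1[is.na(ret_lag1)] <- 0

# do.lag = 0: the trade day is known in advance, so the signal earns the same day's return
ret_lag0 <- lag_by(signal, 0) * returns

print(rbind(ret_lag0, ret_lag1))
```

The same signal produces different portfolio returns depending on the lag, which is why the choice has to match when the information is actually known.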

Next, I created two functions to help with signal creation and strategy testing:

	calendar.strategy <- function(data, signal, universe = data$prices > 0) {
		data$weight[] = NA
			data$weight[] = ifna(universe & signal, F)
		bt.run.share(data, do.lag = 0, trade.summary=T, clean.signal=T)  	
	}
	
	calendar.signal <- function(key.date, offsets = 0) {
		signal = mlag(key.date, offsets[1])
		for(i in offsets) signal = signal | mlag(key.date, i)
		signal
	}

	# Trade on key.date
	models$T0 = calendar.strategy(data, key.date)

	# Trade next day after key.date
	models$N1 = calendar.strategy(data, mlag(key.date,1))
	# Trade two days after key.date
	models$N2 = calendar.strategy(data, mlag(key.date,2))

	# Trade a day prior to key.date
	models$P1 = calendar.strategy(data, mlag(key.date,-1))
	# Trade two days prior to key.date
	models$P2 = calendar.strategy(data, mlag(key.date,-2))
	
	# Trade: open 2 days before the key.date and close 2 days after the key.date
	signal = key.date | mlag(key.date,-1) | mlag(key.date,-2) | mlag(key.date,1) | mlag(key.date,2)
	models$P2N2 = calendar.strategy(data, signal)

	# same, but using helper function above	
	models$P2N2 = calendar.strategy(data, calendar.signal(key.date, -2:2))
		
	strategy.performance.snapshoot(models, T)
	
	strategy.performance.snapshoot(models, control=list(comparison=T), sort.performance=F)

Above, T0 is a calendar strategy that buys on the 30th and sells on the 31st, i.e. the position is only held on the month-end day. P1 and P2 are strategies that buy one day and two days prior, respectively. N1 and N2 are strategies that buy one day and two days after, respectively.
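
To see how the offsets move the holding day around the key date, here is a minimal base R sketch. `shift_signal` is a hypothetical stand-in for SIT's mlag that works on a plain logical vector and pads with FALSE instead of NA, which is a simplification for this illustration:

```r
# Hypothetical stand-in for mlag: positive k shifts later, negative k shifts earlier
shift_signal <- function(x, k) {
  n <- length(x)
  out <- rep(FALSE, n)
  idx <- seq_len(n) + k          # where each day's flag lands after the shift
  keep <- idx >= 1 & idx <= n    # drop flags shifted off either end
  out[idx[keep]] <- x[keep]
  out
}

# Combine several offsets into one holding-period signal
window_signal <- function(key_date, offsets) {
  Reduce(`|`, lapply(offsets, function(k) shift_signal(key_date, k)))
}

# Ten trading days with a single key date on day 5
key_date <- seq_len(10) == 5

print(which(shift_signal(key_date, 1)))      # N1: the day after the key date
print(which(shift_signal(key_date, -2)))     # P2: two days before the key date
print(which(window_signal(key_date, -2:2)))  # P2N2: two days before through two days after
```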

plot1

plot2

The N1 strategy, which buys on the 31st and sells on the 1st of the next month, seems to work best for SPY.

Finally, let’s look at the actual trades:


	last.trades <- function(model, n=20, make.plot=T, return.table=F) {
		ntrades = min(n, nrow(model$trade.summary$trades))		
		trades = last(model$trade.summary$trades, ntrades)
		if(make.plot) {
			layout(1)
			plot.table(trades)
		}	
		if(return.table) trades	
	}
	
	last.trades(models$P2)

plot3

The P2 strategy enters a position at the close 3 days before the month end and exits at the close 2 days before the month end, i.e. its performance comes only from the return on the day 2 days before the month end.

With this post I wanted to show how easily we can study calendar strategy performance using the Systematic Investor Toolbox.

Next, I will demonstrate calendar strategy applications to a variety of important dates.

To view the complete source code for this example, please have a look at the bt.calendar.strategy.month.end.test() function in bt.test.r at github.

Quality of Historical Stock Prices from Yahoo Finance

I recently looked at a strategy that invests in the components of the S&P/TSX 60 index, and discovered some abnormal jumps/drops in the historical data that I could not explain. To help me spot these points and remove them, I created a helper function, data.clean(), in data.r at github. The following is an example of how you can use the data.clean() function:

##############################################################################
# Load Systematic Investor Toolbox (SIT)
# http://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
setInternet2(TRUE)
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

	###############################################################################
	# S&P/TSX 60 Index as of Mar 31 2014
	# http://ca.spindices.com/indices/equity/sp-tsx-60-index
	###############################################################################	
	load.packages('quantmod')

	tickers = spl('AEM,AGU,ARX,BMO,BNS,ABX,BCE,BB,BBD.B,BAM.A,CCO,CM,CNR,CNQ,COS,CP,CTC.A,CCT,CVE,GIB.A,CPG,ELD,ENB,ECA,ERF,FM,FTS,WN,GIL,G,HSE,IMO,K,L,MG,MFC,MRU,NA,PWT,POT,POW,RCI.B,RY,SAP,SJR.B,SC,SLW,SNC,SLF,SU,TLM,TCK.B,T,TRI,THI,TD,TA,TRP,VRX,YRI')
		tickers = gsub('\\.', '-', tickers)
	tickers.suffix = '.TO'

	data <- new.env()
	for(ticker in tickers)
		data[[ticker]] = getSymbols(paste0(ticker, tickers.suffix), src = 'yahoo', from = '1980-01-01', auto.assign = F)

	###############################################################################
	# Plot Abnormal Series
	###############################################################################
	layout(matrix(1:4,2))	
	plota(data$ARX$Adjusted['2000'], type='p', pch='|', main='ARX Adjusted Price in 2000')	
	plota(data$COS$Adjusted['2000'], type='p', pch='|', main='COS Adjusted Price in 2000')	
	plota(data$ERF$Adjusted['2000'], type='p', pch='|', main='ERF Adjusted Price in 2000')	
	plota(data$YRI$Adjusted['1999'], type='p', pch='|', main='YRI Adjusted Price in 1999')	

	###############################################################################
	# Clean data
	###############################################################################
	data.clean(data, min.ratio = 2)	

plot1

> data.clean(data, min.ratio = 2)	
Removing BNS TRP have less than 756 observations
Abnormal price found for ARX 23-Jun-2000 Ratio : 124.7
Abnormal price found for ARX 26-Sep-2000 Inverse Ratio : 99.4
Abnormal price found for COS 23-Jun-2000 Ratio : 124.1
Abnormal price found for COS 26-Sep-2000 Inverse Ratio : 101.1
Abnormal price found for ERF 14-Jun-2000 Ratio : 7.9
Abnormal price found for YRI 18-Feb-1998 Ratio : 2.1
Abnormal price found for YRI 25-May-1999 Ratio : 3

It is surprising that Bank of Nova Scotia (BNS.TO) has only one year’s worth of historical data. I also did not find an explanation for the jumps in ARX, COS, and ERF during 2000.
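
The log above reports a "Ratio" for upward jumps and an "Inverse Ratio" for drops. The sketch below shows one way such a check could work in base R. It is an assumption about the general idea, not the actual data.clean() implementation, and `find_abnormal` is a hypothetical name:

```r
# Hypothetical helper: flag days where the price jumps or drops by more than min_ratio
find_abnormal <- function(prices, min_ratio = 2) {
  ratio <- prices[-1] / prices[-length(prices)]   # today's price over yesterday's
  jump <- which(ratio > min_ratio) + 1            # day on which a jump up appears
  drop <- which(1 / ratio > min_ratio) + 1        # day on which a drop appears
  list(jump = jump, drop = drop)
}

# Toy series with a spike on day 4 that reverses on day 6 (made-up numbers)
prices <- c(10, 10.2, 10.1, 50, 51, 10.3, 10.4)
res <- find_abnormal(prices, min_ratio = 2)
print(res)
```

A jump that is later undone by an inverse move of similar size, as with ARX and COS above, is a hint that the intermediate prices rather than the surrounding ones are suspect.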

Next, I did the same analysis for the stocks in the S&P 100 index:

	###############################################################################
	# S&P 100 as of Mar 31 2014
	# http://ca.spindices.com/indices/equity/sp-100
	###############################################################################	
	tickers = spl('MMM,ABT,ABBV,ACN,ALL,MO,AMZN,AXP,AIG,AMGN,APC,APA,AAPL,T,BAC,BAX,BRK.B,BIIB,BA,BMY,COF,CAT,CVX,CSCO,C,KO,CL,CMCSA,COP,COST,CVS,DVN,DOW,DD,EBAY,EMC,EMR,EXC,XOM,FB,FDX,F,FCX,GD,GE,GM,GILD,GS,GOOG,HAL,HPQ,HD,HON,INTC,IBM,JNJ,JPM,LLY,LMT,LOW,MA,MCD,MDT,MRK,MET,MSFT,MDLZ,MON,MS,NOV,NKE,NSC,OXY,ORCL,PEP,PFE,PM,PG,QCOM,RTN,SLB,SPG,SO,SBUX,TGT,TXN,BK,TWX,FOXA,UNP,UPS,UTX,UNH,USB,VZ,V,WMT,WAG,DIS,WFC')
	tickers.suffix = ''

	data <- new.env()
	for(ticker in tickers)
		data[[ticker]] = getSymbols(paste0(ticker, tickers.suffix), src = 'yahoo', from = '1980-01-01', auto.assign = F)

	###############################################################################
	# Plot Abnormal Series
	###############################################################################    
	layout(matrix(1:4,2))	
	plota(data$AAPL$Adjusted['2000'], type='p', pch='|', main='AAPL Adjusted Price in 2000')	
	plota(data$AIG$Adjusted['2008'], type='p', pch='|', main='AIG Adjusted Price in 2008')	
	plota(data$FDX$Adjusted['1982'], type='p', pch='|', main='FDX Adjusted Price in 1982')	

	###############################################################################
	# Clean data
	###############################################################################
	data.clean(data, min.ratio = 2)	

plot2

> data.clean(data, min.ratio = 2)	
Removing ABBV FB have less than 756 observations
Abnormal price found for AAPL 29-Sep-2000 Inverse Ratio : 2.1
Abnormal price found for AIG 15-Sep-2008 Inverse Ratio : 2.6
Abnormal price found for FDX 13-May-1982 Ratio : 8
Abnormal price found for FDX 06-Aug-1982 Ratio : 7.8
Abnormal price found for FDX 14-May-1982 Inverse Ratio : 8
Abnormal price found for FDX 09-Aug-1982 Inverse Ratio : 8

I first thought that the September 29th, 2000 drop in AAPL was a data error; however, I found the following news item: Apple bruises tech sector, September 29, 2000: 4:33 p.m. ET Computer maker’s warning weighs on hardware, chip stocks; Nasdaq tumbles.

So working with data requires a bit of data manipulation and a bit of detective work. Please always have a look at the data before running any back-tests or drawing any conclusions.

Categories: R

Capturing Intraday data, Backup plan

In the Capturing Intraday data post, I outlined the steps to set up your own process to capture Intraday data. But what do you do if you missed some data points because, for example, the internet was down or your server restarted after a power outage? To fill the gaps in the Intraday data, you could get up to 10 days of historical Intraday data from Google Finance.

I created a wrapper function for Google Finance, getSymbol.intraday.google() in data.r at github, to download historical Intraday quotes. For example,

##############################################################################
# Load Systematic Investor Toolbox (SIT)
# http://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
setInternet2(TRUE)
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

    #*****************************************************************
    # Load Intraday data
    #****************************************************************** 
	data = getSymbol.intraday.google('GOOG', 'NASDAQ', 60, '5d')
	last(data, 10)
	plota(data, type='candle', main='Google Intraday prices')
> last(data, 10)
                       Open    High     Low    Close Volume
2014-03-28 15:52:00 1119.10 1119.61 1119.10 1119.610   4431
2014-03-28 15:53:00 1119.30 1119.30 1118.75 1118.805   3954
2014-03-28 15:54:00 1119.31 1119.45 1119.18 1119.340   5702
2014-03-28 15:55:00 1119.17 1119.40 1119.00 1119.340   8907
2014-03-28 15:56:00 1119.19 1119.35 1119.18 1119.190  11882
2014-03-28 15:57:00 1119.30 1119.30 1119.02 1119.270   6298
2014-03-28 15:58:00 1119.25 1119.35 1119.15 1119.265  10542
2014-03-28 15:59:00 1119.38 1119.49 1118.82 1119.250  29496
2014-03-28 16:00:00 1120.15 1120.15 1119.15 1119.380  71518
2014-03-28 16:01:00 1120.15 1120.15 1120.15 1120.150      0

plot1

So if your Intraday capture process failed, you can rely on Google Finance data to fill the gaps.
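
As a rough sketch of the gap-filling step itself, the base R example below merges a captured series with a backup series, keeping every captured row and taking from the backup only the missing timestamps. All values are made up, and the `backup` data frame is only a stand-in for a downloaded history:

```r
# Captured 1-minute closes with a gap (09:32 and 09:33 are missing)
captured <- data.frame(
  time  = as.POSIXct(c('2014-03-28 09:30:00', '2014-03-28 09:31:00',
                       '2014-03-28 09:34:00'), tz = 'UTC'),
  close = c(100.0, 100.2, 100.5)
)

# Backup series covering the whole period
backup <- data.frame(
  time  = seq(as.POSIXct('2014-03-28 09:30:00', tz = 'UTC'), by = '1 min', length.out = 5),
  close = c(100.1, 100.3, 100.4, 100.6, 100.7)
)

# Keep every captured row; take from the backup only the timestamps that are missing
is_missing <- !(backup$time %in% captured$time)
filled <- rbind(captured, backup[is_missing, ])
filled <- filled[order(filled$time), ]
rownames(filled) <- NULL
print(filled)
```

Preferring the captured rows keeps the series internally consistent wherever your own data exists, and the backup is used only where nothing was recorded.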

Categories: R

Probabilistic Momentum with Intraday data

I want to follow up the Intraday data post with testing the Probabilistic Momentum strategy on Intraday data. I will use Intraday data for SPY and GLD from the Bonnot Gang to test the strategy.

##############################################################################
# Load Systematic Investor Toolbox (SIT)
# http://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
setInternet2(TRUE)
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

	#*****************************************************************
	# Load historical data
	#****************************************************************** 
	load.packages('quantmod')	

	# data from http://thebonnotgang.com/tbg/historical-data/
	# please save SPY and GLD 1 min data at the given path
	spath = 'c:/Desktop/'
	data = bt.load.thebonnotgang.data('SPY,GLD', spath)
	
	data1 <- new.env()		
		data1$FI = data$GLD
		data1$EQ = data$SPY
	data = data1
	bt.prep(data, align='keep.all', fill.gaps = T)

	lookback.len = 120
	confidence.level = 60/100
	
	prices = data$prices
		ret = prices / mlag(prices) - 1 
		
	models = list()
	
	#*****************************************************************
	# Simple Momentum
	#****************************************************************** 
	momentum = prices / mlag(prices, lookback.len)
	data$weight[] = NA
		data$weight$EQ[] = momentum$EQ > momentum$FI
		data$weight$FI[] = momentum$EQ <= momentum$FI
	models$Simple  = bt.run.share(data, clean.signal=T) 	

	#*****************************************************************
	# Probabilistic Momentum + Confidence Level
	# http://cssanalytics.wordpress.com/2014/01/28/are-simple-momentum-strategies-too-dumb-introducing-probabilistic-momentum/
	# http://cssanalytics.wordpress.com/2014/02/12/probabilistic-momentum-spreadsheet/
	#****************************************************************** 
	ir = sqrt(lookback.len) * runMean(ret$EQ - ret$FI, lookback.len) / runSD(ret$EQ - ret$FI, lookback.len)
	momentum.p = pt(ir, lookback.len - 1)
		
	data$weight[] = NA
		data$weight$EQ[] = iif(cross.up(momentum.p, confidence.level), 1, iif(cross.dn(momentum.p, (1 - confidence.level)), 0,NA))
		data$weight$FI[] = iif(cross.dn(momentum.p, (1 - confidence.level)), 1, iif(cross.up(momentum.p, confidence.level), 0,NA))
	models$Probabilistic  = bt.run.share(data, clean.signal=T) 	

	data$weight[] = NA
		data$weight$EQ[] = iif(cross.up(momentum.p, confidence.level), 1, iif(cross.up(momentum.p, (1 - confidence.level)), 0,NA))
		data$weight$FI[] = iif(cross.dn(momentum.p, (1 - confidence.level)), 1, iif(cross.up(momentum.p, confidence.level), 0,NA))
	models$Probabilistic.Leverage = bt.run.share(data, clean.signal=T) 	
	
	#*****************************************************************
	# Create Report
	#******************************************************************        
	strategy.performance.snapshoot(models, T)    
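
The momentum.p calculation above can be reproduced without TTR. The sketch below is a plain base R loop over toy random returns; `prob_momentum` is a hypothetical helper, and it assumes the rolling standard deviation is the usual sample standard deviation:

```r
# Hypothetical helper: probability that asset A outperforms asset B,
# based on the t-statistic of their return difference over a lookback window
prob_momentum <- function(ret_a, ret_b, lookback) {
  d <- ret_a - ret_b
  n <- length(d)
  p <- rep(NA_real_, n)
  for (t in lookback:n) {
    w <- d[(t - lookback + 1):t]              # most recent `lookback` differences
    ir <- sqrt(lookback) * mean(w) / sd(w)    # t-statistic of the mean difference
    p[t] <- pt(ir, df = lookback - 1)         # Student t cumulative probability
  }
  p
}

set.seed(1)
ret_eq <- rnorm(200, mean = 0.001, sd = 0.01)   # toy equity returns
ret_fi <- rnorm(200, mean = 0.000, sd = 0.01)   # toy fixed income returns
p <- prob_momentum(ret_eq, ret_fi, lookback = 120)
print(tail(p, 3))
```

The strategy then switches assets only when this probability crosses the confidence level or one minus the confidence level, which trades less often than the simple momentum comparison.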

plot1

Next, let’s examine the hourly performance of the strategy.

	#*****************************************************************
	# Hourly Performance
	#******************************************************************    
	strategy.name = 'Probabilistic.Leverage'
	ret = models[[strategy.name]]$ret	
		ret.number = 100*as.double(ret)
		
	dates = index(ret)
	factor = format(dates, '%H')
    
	layout(1:2)
	par(mar=c(4,4,1,1))
	boxplot(tapply(ret.number, factor, function(x) x),outline=T, main=paste(strategy.name, 'Distribution of Returns'), las=1)
	barplot(tapply(ret.number, factor, function(x) sum(x)), main=paste(strategy.name, 'P&L by Hour'), las=1)

plot2

There are lots of abnormal returns in the 9:30-10:00am box due to big overnight returns, i.e. the return from the prior day’s close to today’s open. If we exclude this observation every day, the distribution for each hour is more consistent.

   	#*****************************************************************
   	# Hourly Performance: Remove first return of the day (i.e. overnight)
   	#******************************************************************    
   	day.stat = bt.intraday.day(dates)
	ret.number[day.stat$day.start] = 0

   	layout(1:2)
   	par(mar=c(4,4,1,1))
	boxplot(tapply(ret.number, factor, function(x) x),outline=T, main=paste(strategy.name, 'Distribution of Returns'), las=1)
	barplot(tapply(ret.number, factor, function(x) sum(x)), main=paste(strategy.name, 'P&L by Hour'), las=1)
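
The same grouping can be tried on a tiny made-up series in base R. The sketch below finds the first observation of each day with `duplicated`, which is an assumption about what a helper like bt.intraday.day provides rather than its actual code:

```r
# Toy 30-minute returns (in percent) over two trading days
times <- as.POSIXct(c('2014-03-27 09:30:00', '2014-03-27 10:00:00', '2014-03-27 10:30:00',
                      '2014-03-28 09:30:00', '2014-03-28 10:00:00', '2014-03-28 10:30:00'),
                    tz = 'UTC')
ret <- c(1.5, 0.1, -0.2, -2.0, 0.3, 0.1)

hour <- format(times, '%H')
day  <- format(times, '%Y-%m-%d')

# The first observation of each day carries the overnight move
day_start <- !duplicated(day)

# P&L by hour, including the overnight returns
pnl_raw <- tapply(ret, hour, sum)

# P&L by hour after zeroing the first return of each day
ret_clean <- ret
ret_clean[day_start] <- 0
pnl_clean <- tapply(ret_clean, hour, sum)

print(rbind(pnl_raw, pnl_clean))
```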

plot3

The strategy performs best in the morning and dwindles down in the afternoon and overnight.

These hourly seasonality plots are just a different way to analyze performance of the strategy based on Intraday data.

To view the complete source code for this example, please have a look at the bt.strategy.intraday.thebonnotgang.test() function in bt.test.r at github.

Categories: Backtesting, R

Capturing Intraday data

I want to follow up the Intraday data post with an example of how you can capture Intraday data without too much effort by recording 1 minute snapshots of the market.

I will take market snapshots from Yahoo Finance using the following function, which downloads delayed market quotes with date and time stamps:

###############################################################################
# getSymbols interface to Yahoo today's delayed quotes
# based on getQuote.yahoo from quantmod package
###############################################################################            
getQuote.yahoo.today <- function(Symbols) {
    require('data.table')
    what = yahooQF(names = spl('Symbol,Last Trade Time,Last Trade Date,Open,Days High,Days Low,Last Trade (Price Only),Volume'))
    names = spl('Symbol,Time,Date,Open,High,Low,Close,Volume')
    
    all.symbols = lapply(seq(1, len(Symbols), 100), function(x) na.omit(Symbols[x:(x + 99)]))
    out = c()
    
    for(i in 1:len(all.symbols)) {
        # download
        url = paste('http://download.finance.yahoo.com/d/quotes.csv?s=',
            join( trim(all.symbols[[i]]), ','),
            '&f=', what[[1]], sep = '')
        
        txt = join(readLines(url),'\n') 
        data = fread(paste0(txt,'\n'), stringsAsFactors=F, sep=',')
            setnames(data,names)
            setkey(data,'Symbol')      	
      	out = rbind(out, data)
    }
    out
} 

Next we can run the getQuote.yahoo.today function every minute from 9:30 to 16:00 and record market snapshots. Please note that you will have to make some judgement calls about how you want to deal with highs and lows.

Symbols = spl('IBM,AAPL')

prev = c()
while(T) {
    out = getQuote.yahoo.today(Symbols)
	
    if (is.null(prev)) 
        for(i in 1:nrow(out)) {
	    cat(names(out), '\n', sep=',', file=paste0(out$Symbol[i],'.csv'), append=F)
	    cat(unlist(out[i]), '\n', sep=',', file=paste0(out$Symbol[i],'.csv'), append=T)					
	}
    else
        for(i in 1:nrow(out)) {
	    s0 = prev[Symbol==out$Symbol[i]]
	    s1 = out[i]
	    s1$Volume = s1$Volume - s0$Volume
	    s1$Open = s0$Close
	    s1$High = iif(s1$High > s0$High, s1$High, max(s1$Close, s1$Open))
	    s1$Low  = iif(s1$Low  < s0$Low , s1$Low , min(s1$Close, s1$Open))
	    cat(unlist(s1), '\n', sep=',', file=paste0(out$Symbol[i],'.csv'), append=T)					
        }

    # copy
    prev = out
		
    # sleep 1 minute   
    Sys.sleep(60)	
} 
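
The judgement call about highs and lows in the loop above can be isolated in a small function. The sketch below mirrors that logic on plain lists; `make_bar` is a hypothetical name and the input values are only illustrative:

```r
# Hypothetical helper: turn two consecutive cumulative day quotes into one 1-minute bar
make_bar <- function(prev, curr) {
  open <- prev$Close
  # A new day high/low must have printed during the last minute; otherwise
  # the best available estimate is the range spanned by the bar's open and close
  high <- if (curr$High > prev$High) curr$High else max(curr$Close, open)
  low  <- if (curr$Low  < prev$Low)  curr$Low  else min(curr$Close, open)
  list(Open = open, High = high, Low = low, Close = curr$Close,
       Volume = curr$Volume - prev$Volume)   # day volume is cumulative, so take the difference
}

# Illustrative snapshots: day high, day low, last price, cumulative day volume
prev <- list(High = 533.33, Low = 528.3391, Close = 531.34, Volume = 5048146)
curr <- list(High = 533.33, Low = 528.3391, Close = 531.57, Volume = 5055796)
bar <- make_bar(prev, curr)
str(bar)
```

Because the quote only reports the day's high and low, any move inside the minute that stays within the day's range is invisible, so the bar's high and low are approximations.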

For example, I was able to save the following quotes for AAPL:

Symbol   Time      Date    Open    High      Low   Close  Volume
  AAPL 2:57pm 3/10/2014 528.360 533.330 528.3391 531.340 5048146
  AAPL 2:58pm 3/10/2014 531.340 531.570 531.3400 531.570    7650
  AAPL 2:59pm 3/10/2014 531.570 531.570 531.5170 531.517    2223
  AAPL 3:00pm 3/10/2014 531.517 531.517 531.4500 531.450    5283
  AAPL 3:01pm 3/10/2014 531.450 531.450 531.2900 531.290    4413
  AAPL 3:02pm 3/10/2014 531.290 531.490 531.2900 531.490    2440

Unfortunately, there is no way to go back in history unless you buy historical intraday data. But if you want to start recording market moves yourself, the code above should get you started.

Categories: R