Download historical financial data from Yahoo! using Python

Update! I’ve extended and polished the library and published it as pyyahoofinance at github. Feel free to critizise my coding style and laugh at me, and of course to use it and/or contribute.

I’m working on a project where I need to analyze the ups and downs of different stock values along the history and find correlations between the contraction and expansion of the market, the evolution of the indexes and simulate certain trading strategies to become rich and famous. 馃槈

Yahoo! Finance lets you download historical financial data in CSV, but only provides data for one stock at a time. Clicking, saving, cutting and pasting is always tedious, error-prone and a waste of time in general, so I’ve taken advantage of my good friend Python to write a script to plunder (yeah, I’m a pirate, you know 馃槈 Yahoo!’s Standard & Poor’s 500 index in a couple of minutes and save it in a space-separated-values file named “results.txt”.

You can then generate beautiful charts like this one:

Code follows:

#!/usr/bin/python

import urllib2

CLOSE_COLUMN = 4 # the index of the column containing the close value
TICKER_COLUMN = 0 # the index of the column containing the ticker name

# get the Standard & Poor stock tickers
tickers = []
for n in range(0, 500, 50):
    url = urllib2.urlopen("http://download.finance.yahoo.com/d/quotes.csv?s=@%5EGSPC&f=sl1d1t1c1ohgv&e=.csv&h=PAGE".replace('PAGE', str(n)))
    data = url.read()
    stocks = data.split('rn')
    for stock in stocks:
        try:
            ticker = stock.split(',')[TICKER_COLUMN]
            ticker = ticker.replace('"', '') # remove surrounding quotes
            if ticker: # not empty ticker
                tickers.append(ticker)
        except IndexError: # empty row
            pass

global_closes = {}

for ticker in tickers:
    print "getting data from ticker: %s" % ticker
    url = urllib2.urlopen("http://ichart.finance.yahoo.com/table.csv?s=%s&a=00&b=1&c=2000&d=00&e=1&f=2009&g=m&ignore=.csv" % ticker)

    history = url.read()

    measures = history.split('n')
    measures = measures[1:-1] # the last row is empty and the first
                              # one contains the labels

    closes = [measure.split(',')[CLOSE_COLUMN] for measure in measures]

    global_closes[ticker] = closes

columns = [[ticker] + global_closes[ticker] for ticker in global_closes.keys()]

rows = zip(*columns)
out = open('results.txt', 'w')
for row in rows:
    out.write(' '.join(row))
    out.write('n')
out.close()

One thought on “Download historical financial data from Yahoo! using Python

  1. Thanks very much for your code. I am sure this will be useful not only to me but to others that wish to write their own interface fro financial data.

    –V

Leave a Reply to Virgil Cancel reply

Your email address will not be published. Required fields are marked *