python

You are currently browsing articles tagged python.

Update! I’ve extended and polished the library and published it as pyyahoofinance at github. Feel free to critizise my coding style and laugh at me, and of course to use it and/or contribute.

I’m working on a project where I need to analyze the ups and downs of different stock values along the history and find correlations between the contraction and expansion of the market, the evolution of the indexes and simulate certain trading strategies to become rich and famous. ;)

Yahoo! Finance lets you download historical financial data in CSV, but only provides data for one stock at a time. Clicking, saving, cutting and pasting is always tedious, error-prone and a waste of time in general, so I’ve taken advantage of my good friend Python to write a script to plunder (yeah, I’m a pirate, you know ;) Yahoo!’s Standard & Poor’s 500 index in a couple of minutes and save it in a space-separated-values file named “results.txt”.

You can then generate beautiful charts like this one:

Code follows:

#!/usr/bin/python

import urllib2

CLOSE_COLUMN = 4 # the index of the column containing the close value
TICKER_COLUMN = 0 # the index of the column containing the ticker name

# get the Standard & Poor stock tickers
tickers = []
for n in range(0, 500, 50):
    url = urllib2.urlopen("http://download.finance.yahoo.com/d/quotes.csv?s=@%5EGSPC&f=sl1d1t1c1ohgv&e=.csv&h=PAGE".replace('PAGE', str(n)))
    data = url.read()
    stocks = data.split('\r\n')
    for stock in stocks:
        try:
            ticker = stock.split(',')[TICKER_COLUMN]
            ticker = ticker.replace('"', '') # remove surrounding quotes
            if ticker: # not empty ticker
                tickers.append(ticker)
        except IndexError: # empty row
            pass

global_closes = {}

for ticker in tickers:
    print "getting data from ticker: %s" % ticker
    url = urllib2.urlopen("http://ichart.finance.yahoo.com/table.csv?s=%s&a=00&b=1&c=2000&d=00&e=1&f=2009&g=m&ignore=.csv" % ticker)

    history = url.read()

    measures = history.split('\n')
    measures = measures[1:-1] # the last row is empty and the first
                              # one contains the labels

    closes = [measure.split(',')[CLOSE_COLUMN] for measure in measures]

    global_closes[ticker] = closes

columns = [[ticker] + global_closes[ticker] for ticker in global_closes.keys()]

rows = zip(*columns)
out = open('results.txt', 'w')
for row in rows:
    out.write(' '.join(row))
    out.write('\n')
out.close()

Related articles

Tags: , , ,

Last week I started a project using Relational Databases for Zassh.com.

At first, I tried to construct the SQL queries myself using string joins, tuples and so but soon I realized the problems using this method:

  • It’s error-prone.
  • Highly vulnerable to SQL injection attacks.
  • Computers write SQL better than humans.
  • It made me remember my suffering PHP days.

So I decided to try a library I heard a lot of good things about. SQLAlchemy is a pythonic high-performance SQL toolkit and Object Relational Mapper (ORM).

The result couldn’t be better. Instead of fighting against query-strings and tuples, it lets you write SQL magic directly in Python. I discovered I can do calculations using SQL queries I’ve never have thought before, like counting the number of goals a team has scored through two simultaneous joins and a bunch of filters.

If you have to work with Python and SQL, I definitely recommend you to use SQLAlchemy and forget about old error-prone methods. I’m sure you won’t be disappointed.

Related articles

Tags: , ,