Scraping Yahoo! Finance with Ruby and Hpricot


This code extracts the numbers from the Fund operations table on the BLV fund's Profile page at Yahoo! Finance.

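require 'rubygems'
require 'hpricot'
require 'open-uri'
require 'pp'

# Fetch and parse the Profile page (the page's URL goes in the empty string).
page = Hpricot(open(''))

# Collect the contents of every data cell inside each matching table.
fund_operations = []
page.search("//table[@class='yfnc_datamodoutline1']").each do |row|
  row.search("//td[@class='yfnc_datamoddata1']").each do |data|
    fund_operations << data.inner_html
  end
end

pp fund_operations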

The output from this script is:

["N/A", "N/A", "55%", "72", "85.05M", "1.71B"]
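
The values come back as strings, so they need converting before you can do arithmetic with them. Here is a minimal sketch of one way to do that. The parse_value helper is hypothetical (it is not part of the snippet above or of Hpricot), and it assumes that "M" means millions, "B" means billions, and that anything it does not recognise, such as "N/A", should become nil.

```ruby
# Hypothetical helper, not part of the original snippet.
# Assumption: "M" = millions, "B" = billions.
MULTIPLIERS = { 'M' => 1_000_000, 'B' => 1_000_000_000 }

def parse_value(text)
  cleaned = text.strip.delete(',')
  case cleaned
  when /\A(-?\d+(?:\.\d+)?)%\z/
    # Percentage, returned as a fraction (55% becomes 0.55).
    ($1.to_r / 100).to_f
  when /\A(-?\d+(?:\.\d+)?)([MB])\z/
    # Scaled value; Rational arithmetic avoids float rounding surprises.
    ($1.to_r * MULTIPLIERS[$2]).to_f
  when /\A-?\d+\z/
    cleaned.to_i
  when /\A-?\d+\.\d+\z/
    cleaned.to_f
  else
    # "N/A" or any other unrecognised cell.
    nil
  end
end

# The sample output from the script above.
fund_operations = ["N/A", "N/A", "55%", "72", "85.05M", "1.71B"]
p fund_operations.map { |value| parse_value(value) }
# prints [nil, nil, 0.55, 72, 85050000.0, 1710000000.0]
```

Returning nil for unrecognised cells keeps missing data distinct from a real zero; whether that suits you depends on what you do with the numbers afterwards.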

Note that you could also use Scrubyt for this. This snippet explains how to use Scrubyt to scrape web pages: "Scraping Google search results with Scrubyt and Ruby".