
Scraping Yahoo! Finance with Ruby and Hpricot

Posted 6 months ago by christian

This code extracts the numbers from the Fund operations table on the BLV fund’s Profile page at Yahoo! Finance.

    require 'rubygems'
    require 'hpricot'
    require 'open-uri'
    require 'pp'

    # Fetch and parse the BLV profile page.
    page = Hpricot(open('http://finance.yahoo.com/q/pr?s=BLV'))

    # Collect the contents of every data cell in the outlined tables.
    fund_operations = []
    page.search("//table[@class='yfnc_datamodoutline1']").each do |table|
      table.search("//td[@class='yfnc_datamoddata1']").each do |cell|
        fund_operations << cell.inner_html
      end
    end

    pp fund_operations

The output from this script is:

    ["N/A", "N/A", "55%", "72", "85.05M", "1.71B"]
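
The scraped values are still strings. Here is a minimal sketch of turning them into numbers, assuming that the suffixes K, M and B stand for thousands, millions and billions, that "%" marks a percentage, and that "N/A" means no value. The helper name parse_value and the MULTIPLIERS table are made up for this example and are not part of Hpricot:

```ruby
# Hypothetical lookup table: suffix => multiplier (an assumption about
# how the page abbreviates large numbers).
MULTIPLIERS = { 'K' => 1_000, 'M' => 1_000_000, 'B' => 1_000_000_000 }.freeze

# Hypothetical helper: convert one scraped cell such as "85.05M" to a Float.
# Returns nil for "N/A", empty cells and anything it does not recognize.
def parse_value(text)
  text = text.strip
  return nil if text.empty? || text == 'N/A'

  match = text.match(/\A(-?\d+(?:\.\d+)?)([KMB%]?)\z/)
  return nil unless match

  number = Float(match[1])
  suffix = match[2]
  case suffix
  when '%' then number / 100.0   # "55%" becomes a fraction
  when ''  then number           # plain number, no suffix
  else number * MULTIPLIERS.fetch(suffix)
  end
end

fund_operations = ["N/A", "N/A", "55%", "72", "85.05M", "1.71B"]
parsed = fund_operations.map { |cell| parse_value(cell) }
p parsed
```

Returning nil for unknown cells rather than raising keeps the array the same length as the scraped one, so positions still line up with the table rows.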

Note that you could also use Scrubyt for this. The snippet "Scraping Google search results with Scrubyt and Ruby" explains how to use Scrubyt to scrape web pages.

Tagged yahoo, finance, ruby, hpricot

Generate a 56-bit DES encrypted (htpasswd) password with Ruby

Posted 9 months ago by christian

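Run the following in an irb console to generate a 56-bit DES-based crypt hash of a password. Note that traditional DES crypt uses only the first two characters of the salt and the first eight characters of the password: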

    "password".crypt("salt")

The password can be used in an Apache or Nginx htpasswd file to enable basic authentication.

The generated password can also be used in other Unix password files.
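
As a rough sketch, the one-liner above can be wrapped so that it produces a complete htpasswd line with a random salt. The helper names random_salt and htpasswd_line and the user name alice are invented for this example, and the result depends on the platform's crypt(3) still supporting traditional DES hashes:

```ruby
require 'securerandom'

# Characters that traditional crypt(3) accepts in a salt.
SALT_CHARS = (('a'..'z').to_a + ('A'..'Z').to_a + ('0'..'9').to_a + ['.', '/']).freeze

# Hypothetical helper: pick a random two-character salt.
def random_salt
  Array.new(2) { SALT_CHARS[SecureRandom.random_number(SALT_CHARS.size)] }.join
end

# Hypothetical helper: build one "user:hash" line for an htpasswd file.
# String#crypt delegates to the platform's crypt(3).
def htpasswd_line(user, password, salt = random_salt)
  "#{user}:#{password.crypt(salt)}"
end

# Example user name and password, both made up.
puts htpasswd_line('alice', 'password')
```

Appending the printed line to the htpasswd file adds the user. A fresh salt per user means two users with the same password do not end up with the same hash.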

Tagged ruby, irb, htpasswd, nginx, apache

A simple image replacement technique for increased usability and SEO ranking

Posted 12 months ago by christian

This is currently my favorite image replacement technique. I don’t remember where I found it… Using it can improve both your site’s usability and your search engine ranking, by allowing both screen readers and search engines to find your h1 headlines. First create the h1 and the description of your page/site, for example:

    <h1 id="logo">Viagra, Botox, you name it</h1>

Then create the CSS rule for the page title:

    h1#logo {
      text-indent: -9000px;
      background: url(logo.gif);
      width: 200px; /* Width of image */
      height: 50px; /* Height of image */
    }

People using a modern browser that supports CSS will see your logo (the image), while search engines and people using less modern browsers will see the content of the h1 header tag.

Note that if you replace the text of a link, use the outline CSS property to remove the dotted focus border around it:

    .text-replacement {
      text-indent: -9000px;
    }

    .text-replacement a {
      outline: none;
    }

Tagged css, image, replacement, usability, seo

Implementing hanging bullets with CSS

Posted 12 months ago by christian

According to Mark Boulton’s article Five simple steps to better typography – part 2, the text in bulleted lists should be left-aligned with the surrounding text; this is rarely the case on the web, but is easily achievable by using the following CSS style:

    ul {
      list-style-position: outside;
      margin-left: 0;
      padding-left: 0; /* most browsers indent lists with padding, not margin */
    }

Tagged lists, list-style-position, typography, css, bulleted

Reset CSS rules to render HTML identically in all browsers

Posted about 1 year ago by christian

These CSS rules remove most, if not all, browser-specific styles from common HTML elements. Your page will look almost identical in all browsers when using these CSS rules. Note that this is a combination of Tantek Celik’s undohtml.css and YUI’s reset.css.

    /** START BLATANT RIP FROM Tantek Celik's undohtml.css */

    /* link underlines tend to make hypertext less readable,
       because underlines obscure the shapes of the lower halves of words */
    :link,:visited { text-decoration:none }

    /** END BLATANT RIP FROM Tantek Celik's undohtml.css */

    /** START BLATANT RIP FROM YUI's reset.css */

    body,div,dl,dt,dd,ul,ol,li,h1,h2,h3,h4,h5,h6,pre,form,fieldset,input,textarea,p,blockquote,th,td {
      margin:0;
      padding:0;
    }
    table {
      border-collapse:collapse;
      border-spacing:0;
    }
    fieldset,img {
      border:0;
    }
    address,caption,cite,code,dfn,em,strong,th,var {
      font-style:normal;
      font-weight:normal;
    }
    ol,ul {
      list-style:none;
    }
    caption,th {
      text-align:left;
    }
    h1,h2,h3,h4,h5,h6 {
      font-size: 1em;
      font-weight:normal;
    }
    q:before,q:after {
      content:'';
    }
    abbr,acronym {
      border:0;
    }

    /** END BLATANT RIP FROM YUI's reset.css */

Tagged css, reset, browser, compatibility