How to parse CSV data with Ruby

Ruby posted almost 5 years ago by christian

Ruby alternatives for parsing CSV files

  • Ruby String#split (slow)
  • Ruby CSV (slow)
  • FasterCSV (ok, recommended)
  • ccsv (fast & recommended if you have control over CSV format)
  • CSVScan (fast & recommended if you have control over CSV format)
  • Excelsior (fast & recommended if you have control over CSV format)

CSV library benchmarks can be found here and here
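
To get a rough feel for such differences on your own machine, a minimal sketch with Ruby's standard Benchmark module could look like this. The sample data, the row count and the two helper method names are assumptions made up for this example, and only String#split and the standard CSV library are compared, not the gems listed above.

```ruby
require 'benchmark'
require 'csv'

# Made-up in-memory sample: 5,000 identical rows of three plain fields
BENCH_DATA = ("a,b,c\n" * 5_000).freeze

# Parse with String#split; returns the number of rows that have three fields
def count_rows_with_split(data)
  data.each_line.count { |line| line.chomp.split(',').size == 3 }
end

# Parse with the standard CSV library; returns the number of rows that have three fields
def count_rows_with_csv(data)
  CSV.parse(data).count { |row| row.size == 3 }
end

# Print user, system and real time for each approach
Benchmark.bm(8) do |bm|
  bm.report('split:') { count_rows_with_split(BENCH_DATA) }
  bm.report('csv:')   { count_rows_with_csv(BENCH_DATA) }
end
```

Timings depend heavily on the Ruby version and on the shape of the data, so treat the numbers as a local indication only.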

Parsing with plain Ruby

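    filename = 'data.csv'

    # The block form of File.open closes the file automatically
    File.open(filename, 'r') do |file|
      file.each_line do |row|
        # Naive split: every comma is treated as a separator
        columns = row.chomp.split(',')
        puts columns.inspect

        break if file.lineno >= 10 # stop after the first 10 lines
      end
    end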

This option has several problems…
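
One such problem, as an illustration: String#split knows nothing about quoting, so a quoted field that contains the separator is cut in two. The sketch below compares it with CSV.parse_line from the standard library. The sample line and the method names naive_split and csv_split are made up for this example.

```ruby
require 'csv'

# Hypothetical sample row: the second field contains a comma inside quotes
SAMPLE_LINE = 'Smith,"New York, NY",42'

# Naive approach: split on every comma, quoted or not
def naive_split(line)
  line.chomp.split(',')
end

# Quote-aware approach using the standard CSV library
def csv_split(line)
  CSV.parse_line(line)
end

p naive_split(SAMPLE_LINE) # four pieces, the quoted field is broken apart
p csv_split(SAMPLE_LINE)   # three fields, the quoted field stays intact
```

Fields that contain line breaks or escaped quotes run into the same kind of trouble with a plain split.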

Parsing with the CSV library

    require 'csv'

    # Ruby 1.8 CSV API: the third argument is the field separator
    CSV.open('data.csv', 'r', ';') do |row|
      puts row
    end
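
On Ruby 1.9 and later the standard CSV library is the former FasterCSV, and the separator is passed as the col_sep option instead of a positional argument. A minimal sketch of the equivalent call, assuming a semicolon-separated file; the method name read_semicolon_csv is made up for this example.

```ruby
require 'csv'

# Read every row of a semicolon-separated file into an array of arrays.
# `path` is the file to parse, for example 'data.csv'.
def read_semicolon_csv(path)
  rows = []
  CSV.foreach(path, col_sep: ';') do |row|
    rows << row # each row is an array of strings
  end
  rows
end
```

Calling read_semicolon_csv('data.csv') then returns all rows at once. For large files it is usually better to process each row inside the CSV.foreach block instead of collecting them.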

Parsing with the FasterCSV library

    require 'rubygems'
    require 'faster_csv'

    FasterCSV.foreach('data.csv', :quote_char => '"', :col_sep => ';', :row_sep => :auto) do |row|
      puts row[0]
    end

Parsing with the ccsv library

ccsv is hosted on GitHub.

    require 'rubygems'
    require 'ccsv'

    file = 'data.csv' # path to the CSV file

    Ccsv.foreach(file) do |values|
      puts values[0] # first column of each row
    end

Parsing with the CSVScan library

CSVScan can be downloaded from here.

    require 'csvscan'

    open('data.csv') do |io|
      CSVScan.scan(io) do |row|
        puts row
      end
    end
