Ruby alternatives for parsing CSV files
- Ruby String#split (slow)
- Built-in CSV (ok, recommended)
- ccsv (fast & recommended if you have control over CSV format)
- CSVScan (fast & recommended if you have control over CSV format)
- Excelsior (fast & recommended if you have control over CSV format)
CSV library benchmarks can be found here and here
Parsing with plain Ruby
filename = 'data.csv'
file = File.new(filename, 'r')
file.each_line("\n") do |row|
columns = row.split(",")
break if file.lineno > 10
end
This option has several problems...
Parsing with the built-in CSV library
require 'csv'
CSV.open('data.csv', 'r', ';') do |row|
puts row
end
require 'csv'
CSV.foreach("changes.csv", quote_char: '"', col_sep: ';', row_sep: :auto, headers: true) do |row|
puts row[0]
puts row['xxx']
end
Parsing with the ccsv library
require 'rubygems'
require 'ccsv'
Ccsv.foreach(file) do |values|
puts values[0]
end
Parsing with the CSVScan library
CSVScan can be downloaded from here.
require "csvscan"
open("data.csv") do |io|
CSVScan.scan(io) do|row|
puts row
end
end