Hpricot's inner_text doesn't handle HTML entities correctly
Ruby posted 7 months ago by christian
Hpricot’s inner_text method is fubar: it doesn’t handle HTML entities correctly, and you’ll see question marks in the output instead. To fix this, replace calls to Hpricot’s inner_text with a call to the following method (or monkey-patch Hpricot):
require 'rubygems'
require 'htmlentities'

# Strip the tags from the node's inner HTML, then decode the entities.
def inner_text(node)
  text = node.innerHTML.gsub(%r{<.*?>}, "").strip
  HTMLEntities.new.decode(text)
end
Remember to install the htmlentities gem:
sudo gem install htmlentities
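If you want to see what the strip-then-decode step does without Hpricot installed, here is a rough sketch that works on a plain HTML string. It is not the method above: `plain_text` is a hypothetical helper name, and it swaps the htmlentities gem for the standard library's `CGI.unescapeHTML`, which only decodes a handful of named entities (`&amp;`, `&lt;`, `&gt;`, `&quot;`, `&apos;`) plus numeric references. For anything like `&eacute;` you would still want htmlentities.

```ruby
require 'cgi'

# Hypothetical helper for illustration; not part of Hpricot or htmlentities.
# Removes anything that looks like a tag, then decodes the entities that
# the standard library knows about.
def plain_text(html)
  text = html.gsub(/<.*?>/m, "").strip  # /m lets a tag span a line break
  CGI.unescapeHTML(text)                # stdlib stand-in for HTMLEntities#decode
end

puts plain_text("<b>Tom &amp; Jerry&#39;s</b>")
```

The regex is the same naive tag stripper used above, so it will also eat any literal text that happens to sit between a `<` and a `>`.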