Detecting file/data encoding with Ruby and the chardet RubyGem
Ruby posted 8 months ago by christian
You can use the chardet gem to detect the charset of an arbitrary string.
Install the chardet gem by issuing the following command:
    $ sudo gem install chardet
Then in irb:
    require 'rubygems'
    require 'UniversalDetector'
    p UniversalDetector::chardet('Ascii text')
    p UniversalDetector::chardet('åäö')
The output from this example is:
    {"encoding"=>"ascii", "confidence"=>1.0}
    {"encoding"=>"utf-8", "confidence"=>0.87625}
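Once you have a detection result, a common next step is to convert the data to UTF-8. The following is a minimal sketch of that step, not part of the chardet gem: `to_utf8` and its `min_confidence` parameter are hypothetical names chosen for illustration. It assumes Ruby 1.9 or later (for `String#force_encoding`) and a result hash shaped like the output above. It also assumes the detected name is one that Ruby's `Encoding.find` recognizes, and returns nil when it is not.

```ruby
# Hypothetical helper (not part of the chardet gem): converts raw bytes to
# UTF-8 using a detection result shaped like {"encoding"=>..., "confidence"=>...}.
def to_utf8(raw, detection, min_confidence = 0.5)
  name = detection["encoding"]
  confidence = detection["confidence"].to_f

  # Refuse to guess when the detector returned nothing or is unsure.
  return nil if name.nil? || confidence < min_confidence

  begin
    source = Encoding.find(name)
  rescue ArgumentError
    return nil # Ruby knows no encoding by this name
  end

  # Re-tag a copy of the bytes with the detected encoding and make sure
  # they are actually valid in it before transcoding.
  text = raw.dup.force_encoding(source)
  return nil unless text.valid_encoding?

  text.encode(Encoding::UTF_8)
rescue EncodingError
  nil # the bytes could not be converted to UTF-8
end
```

The hash returned by `UniversalDetector::chardet(raw)` in the example above could be passed as `detection`. Returning nil rather than raising leaves it to the caller to decide what to do with data that could not be identified reliably.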
Python users have an identical library available…