Text file encoding
March 23, 2011
Detect the encoding of a text file:
$ file all.txt
all.txt: ISO-8859 text
Get more detail, including the charset:
$ file --mime all.txt
all.txt: text/plain; charset=iso-8859-1
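If you want just the charset in a script, the following is a minimal sketch. It assumes a version of file(1) that supports the -b (brief) and --mime-encoding options; encoding_of and list_non_utf8 are helper names made up for this example, and the result is still only a guess by file(1).

```shell
# encoding_of FILE (hypothetical helper): print only the charset that file(1) guesses.
# -b drops the leading "filename:" part, --mime-encoding prints just the charset.
encoding_of() {
    file -b --mime-encoding "$1"
}

# list_non_utf8 FILE... (hypothetical helper): print "name: charset" for every
# file whose guessed charset is neither UTF-8 nor plain ASCII.
list_non_utf8() {
    for f in "$@"; do
        enc=$(encoding_of "$f")
        case $enc in
            utf-8|us-ascii) ;;                        # nothing to report
            *) printf '%s: %s\n' "$f" "$enc" ;;
        esac
    done
}
```

For example, "list_non_utf8 *.txt" prints only the files that would need converting before they can be treated as UTF-8.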
Change the encoding of a text file:
$ iconv --from-code=UTF-8 --to-code=ISO-8859-2 file.txt > tmp.txt
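The output goes to a separate file because redirecting straight back into file.txt would truncate it before iconv has read it. Here is a rough sketch of wrapping that in a function that only overwrites the original when the conversion succeeds. The name convert_in_place is made up for this example, and -f / -t are the short forms of --from-code / --to-code.

```shell
# convert_in_place FROM TO FILE (hypothetical helper)
# Converts FILE from encoding FROM to encoding TO via a temporary file.
convert_in_place() {
    from=$1
    to=$2
    target=$3
    tmp=$(mktemp) || return 1
    # Overwrite the original only if iconv converted the whole file
    if iconv -f "$from" -t "$to" "$target" > "$tmp"; then
        # cat (rather than mv) keeps the original file's permissions
        cat "$tmp" > "$target"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Usage would look like "convert_in_place UTF-8 ISO-8859-2 file.txt". If the file contains a character that does not exist in the target encoding, iconv fails and the file is left untouched.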
This latter tip is from here.
Update (20110918)
You can also use “chardet” to detect the character encoding. Usage:
$ chardet test.txt
test.txt: utf-8 (confidence: 0.99)