windows - R: can't read Unicode text files even when specifying the encoding


I am using R 3.1.1 on 32-bit Windows 7. I'm having a lot of trouble reading some text files on which I want to perform text analysis. According to Notepad++, the files are encoded as "UCS-2 Little Endian". (grepWin, a tool whose name says it all, reports the file as "Unicode".)

The problem is that I can't seem to read the file even when I specify that encoding. (The characters belong to the standard Spanish Latin set, such as ñ and ó, and should be handled easily by CP1252 or something similar.)

    > Sys.getlocale()
    [1] "LC_COLLATE=Spanish_Spain.1252;LC_CTYPE=Spanish_Spain.1252;LC_MONETARY=Spanish_Spain.1252;LC_NUMERIC=C;LC_TIME=Spanish_Spain.1252"
    > readLines("filename.txt")
    [1] "ÿþE" "" "" "" "" ...
    > readLines("filename.txt", encoding = "UTF-8")
    [1] "\xff\xfeE" "" "" "" "" ...
    > readLines("filename.txt", encoding = "UCS2LE")
    [1] "ÿþE" "" "" "" "" "" "" ...
    > readLines("filename.txt", encoding = "UCS-2")
    [1] "ÿþE" "" "" "" "" ...

Any ideas?

Thank you!


Edit: The "UTF-16", "UTF-16LE" and "UTF-16BE" encodings fail in the same way.


After reading the documentation more closely, I found the answer to my question.

The encoding parameter of readLines only applies to the input strings. The documentation says:

"encoding to be assumed for input strings. It is used to mark character strings as known to be in Latin-1 or UTF-8: it is not used to re-encode the input. To do the latter, specify the encoding as part of the connection con or via options(encoding=): see the examples."

The correct way to read a file with an uncommon encoding is, then:

    filetext <- readLines(con <- file("UnicodeFile.txt", encoding = "UCS-2LE"))
    close(con)
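To try this without the original files, here is a minimal, self-contained sketch. It writes a small little-endian UTF-16 file with a byte-order mark to a temporary path and reads it back through a connection that declares the encoding. The temporary file and the sample words are assumptions made for illustration. R's connection documentation describes dropping a leading byte-order mark for "UCS-2LE"; if your build behaves differently, the first line may start with U+FEFF.

```r
# Hypothetical sample data and temp file (not from the original question).
path <- tempfile(fileext = ".txt")
original <- c("Espa\u00f1a", "canci\u00f3n")

# Encode the text as UTF-16LE bytes with CRLF line endings.
text <- paste0(paste(original, collapse = "\r\n"), "\r\n")
body <- iconv(enc2utf8(text), from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]

# Prepend the little-endian byte-order mark (FF FE) and write the raw bytes.
writeBin(c(as.raw(c(0xff, 0xfe)), body), path)

# The first two bytes are the BOM, which is what showed up as "\xff\xfe"
# when the file was read without re-encoding.
bom <- readBin(path, what = "raw", n = 2)

# Declare the encoding on the connection so the input is re-encoded on read.
con <- file(path, encoding = "UCS-2LE")
lines <- readLines(con)
close(con)

print(lines)
unlink(path)
```

In a locale that can represent these characters (CP1252 or UTF-8, for example), lines should come back equal to the original vector. Passing encoding = "UCS-2LE" directly to readLines would not do this, because that argument only marks the strings and does not convert the bytes.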
