r - Unrecognized column name after loading data table with fread -


I load a data table with fread the way I always do. The file has about 2M records and is tab-delimited.

The load succeeds, and I can print the head of the table and the column names. So far so good.

But then either renaming the first column or setting it as the key fails, complaining that the old column name cannot be found. I'm sure there is no typo in the column name and no leading or trailing space; I tried several times, both with copy/paste and by re-typing. I can apparently rename any other column.

The first column is a long integer ID, so I had to load the bit64 package to get rid of a warning in fread, but it does not seem to help. Is this a clue?

Does anyone have any idea what could cause such symptoms? How can I debug this?

I use R 3.1.0 on 64-bit Windows, with the latest version of all packages.

Edit: more details

Data load command:

txnData <- fread(txnInDataPathFileName, header = TRUE, sep = "\t", na.strings = "NA")

Column names:

  colnames(txnData)
   [1] "txn_ext_id"     "txn_desc"       "txn_type_id"    "site_id"        "date_id"        "device_id"      "cust_id"
   [8] "empl_id"        "txn_start_time" "txn_end_time"   "total_sales"    "total_units"    "gross_margin"

Rename command (which also fails):

  setnames(txnData, "txn_ext_id", "txnId")
  Error in setnames(txnData, "txn_ext_id", "txnId") : Items of 'old' not found in column names: txn_ext_id

And finally, the requested dput output:

  dput(head(txnData))
  structure(list(`txn_ext_id` = structure(c(4.8853692440272e-311, 1.10971996159584e-311,
      9.9460266389845e-312, 1.0227644072435e-311, 1.10329710699982e-311, 1.01930594588518e-311),
      class = "integer64"),
    txn_desc = c("checkout transaction", "checkout transaction", "checkout transaction",
      "checkout transaction", "checkout transaction", "checkout transaction"),
    txn_type_id = c(0L, 0L, 0L, 0L, 0L, 0L),
    site_id = c(982L, 982L, 982L, 982L, 982L, 982L),
    date_id = c("2012-12-24", "2013-11-27", "2013-04-08", "2013-06-04", "2013-11-14", "2013-05-28"),
    device_id = c(8L, 7L, 8L, 53L, 8L, 5L),
    cust_id = structure(c(2.02600292130833e-313, 2.02572944866119e-313, 2.02583815970388e-313,
      2.02580527009968e-313, 2.02568405005593e-313, 2.02736582767668e-313), class = "integer64"),
    empl_id = c("?", "?", "?", "?", "?", "?"),
    txn_start_time = c("2012-12-24T08:35:56", "2013-11-27T12:43:30", "2013-04-08T11:48:29",
      "2013-06-04T15:27:47", "2013-11-14T12:57:38", "2013-05-28T11:03:21"),
    txn_end_time = c("2012-12-24T08:38:00", "2013-11-27T12:47:00", "2013-04-08T11:49:00",
      "2013-06-04T15:35:00", "2013-11-14T13:00:00", "2013-05-28T11:05:00"),
    total_sales = c(48.86, 69.7, 8.53, 33.46, 39.19, 35.56),
    total_units = c(12L, 44L, 3L, 4L, 14L, 17L),
    gross_margin = c(0, 0, 0, 0, 0, 0)),
    .Names = c("txn_ext_id", "txn_desc", "txn_type_id", "site_id", "date_id", "device_id", "cust_id",
      "empl_id", "txn_start_time", "txn_end_time", "total_sales", "total_units", "gross_margin"),
    class = c("data.table", "data.frame"), row.names = c(NA, -6L),
    .internal.selfref = <pointer: 0x00000000002c0788>)

The hidden character, a byte order mark (BOM) at the very start of the file, is displayed as ï»¿ when you get to see it. In theory you can see it in an editor set to ANSI display mode, although I could not get it to show in Notepad++. In R, it shows up when the data table is printed in RStudio, but not in Eclipse-StatET, which I use by default. That explains why I did not notice it immediately.
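When the console hides the character, looking at the raw bytes of the column name is one way to confirm it is there. The following is a minimal base R sketch, not the original poster's code: `has_bom` is a hypothetical helper name, and the column name is built by hand so the example runs without the data file.

```r
# The UTF-8 BOM is the three bytes EF BB BF.
bom_bytes <- as.raw(c(0xEF, 0xBB, 0xBF))

# Illustrative stand-in for names(txnData)[1]: a name with a hidden BOM prefix.
col_name <- rawToChar(c(bom_bytes, charToRaw("txn_ext_id")))

# has_bom() is a hypothetical helper, not part of data.table or base R.
has_bom <- function(x) {
  bytes <- charToRaw(x)
  length(bytes) >= 3 && identical(bytes[1:3], bom_bytes)
}

print(charToRaw(col_name))    # the raw bytes make the hidden prefix visible
print(has_bom(col_name))      # TRUE
print(has_bom("txn_ext_id"))  # FALSE
```

On the real table, the same check would be applied to `names(txnData)[1]`.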

The following link explains how to get rid of BOM characters:

I loaded my file in Notepad++, chose Encoding -> Convert to UTF-8 without BOM, and saved. The BOM character disappeared, and all is well.
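The same conversion can probably be scripted in R rather than done by hand in an editor. This is only a sketch under stated assumptions: `strip_bom_file` is a hypothetical helper name, the paths are temporary placeholder files, and the whole file is read into memory, which may not suit very large inputs.

```r
# strip_bom_file() is a hypothetical helper: it copies a file and drops a
# leading UTF-8 BOM (bytes EF BB BF) if one is present.
strip_bom_file <- function(in_path, out_path) {
  size  <- file.info(in_path)$size
  bytes <- readBin(in_path, what = "raw", n = size)
  bom   <- as.raw(c(0xEF, 0xBB, 0xBF))
  if (length(bytes) >= 3 && identical(bytes[1:3], bom)) {
    bytes <- bytes[-(1:3)]
  }
  writeBin(bytes, out_path)
  invisible(out_path)
}

# Small demo on a temporary tab-delimited file that starts with a BOM.
in_path  <- tempfile(fileext = ".txt")
out_path <- tempfile(fileext = ".txt")
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)),
           charToRaw("txn_ext_id\ttxn_desc\n1\tcheckout\n")), in_path)

strip_bom_file(in_path, out_path)
header <- readLines(out_path, n = 1)
print(header)
```

The cleaned copy can then be passed to fread as usual.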

A pure R solution that does not touch the file is to include the BOM character as a prefix of the old name in the rename command: setnames(dataTable, "ï»¿firstColumnName", "firstColumnName"). It worked in RStudio, and I think it will work in the R console too. However, it does not work in Eclipse-StatET, because the BOM character remains hidden there even when accessing the data table: whether the first column name is given with or without the BOM prefix, setnames fails either way.
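A variant that avoids typing the hidden character at all is to strip the BOM bytes from the names programmatically. The sketch below uses a plain data.frame so it runs without extra packages; `strip_bom` is a hypothetical helper, and the small table is an illustrative stand-in for the loaded data.

```r
# The UTF-8 BOM as a string, built from its bytes so nothing has to be typed.
bom <- rawToChar(as.raw(c(0xEF, 0xBB, 0xBF)))

# Illustrative stand-in for the loaded table: first column name carries a BOM.
df <- data.frame(a = 1:2, b = c("x", "y"))
names(df) <- c(paste0(bom, "txn_ext_id"), "txn_desc")

# strip_bom() is a hypothetical helper: remove a leading BOM, byte-wise.
strip_bom <- function(x) sub(paste0("^", bom), "", x, useBytes = TRUE)

names(df) <- strip_bom(names(df))
print(names(df))
```

With a data.table, the cleaned names could be applied with setnames(txnData, strip_bom(names(txnData))). As far as I know, setnames also accepts a column position for the old name, e.g. setnames(txnData, 1L, "txnId"), which sidesteps the hidden prefix entirely; I have not verified either in Eclipse-StatET.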
