The format consists of lines; every line holds a set of key="value" elements.

Format example: X="1" Y="2" Z="who are you?" Y="4" Z="bla bla..." X="42"

I would like to import this data into R as a table or data.frame, where each key defines a column.

Thanks!

+1  A: 

The following code parses the file you provided in a 'melted' form:

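data <- NULL
stream <- file("path")
open(stream)  # or: stream <- textConnection(' X="1" Y="2" Z="who are you?" Y="4" Z="bla bla..." X="42"')
# Each iteration reads one key (up to "=") and one value (up to " ");
# the loop ends when both scans return nothing.
while (length(ele <- c(scan(stream, what = "string", n = 1, sep = "="),
                       scan(stream, what = "string", n = 1, sep = " "))) > 0) {
    data <- rbind(data, ele)
}
close(stream)
print(data)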

Now crystallizing:

 sapply(unique(data[,1]),function(key) data[data[,1]==key,2])
mbq
This is working. Thanks! Can you give me a hint on how to speed this up? It took about 20 minutes to read 40k lines, maybe because of the many I/O operations on the HDD? I think that reading the whole file into a string and then parsing it should improve performance?
watbywbarif
And one other question: the final data structure is OK for some things, but I would like to have the data in another format, where row coincidence is preserved. Is that possible?
watbywbarif
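One possible reading of "row coincidence is preserved" is one data.frame row per input line, with a column per key. The sketch below assumes exactly that, and also assumes that values never contain a double quote. The two sample lines are made up from the question's example, and `parse_line` is a hypothetical helper name, not something from the answer.

```r
# Assumption: each input line is one record; these sample lines are illustrative.
lines <- c('X="1" Y="2" Z="who are you?"',
           'Y="4" Z="bla bla..." X="42"')

# Hypothetical helper: turn one line into a named character vector.
parse_line <- function(line) {
  # Find every key="value" pair on the line.
  pairs <- regmatches(line, gregexpr('[^ =]+="[^"]*"', line))[[1]]
  keys  <- sub('=.*$', '', pairs)                 # text before the first "="
  vals  <- sub('^[^=]+="(.*)"$', '\\1', pairs)    # text between the quotes
  setNames(vals, keys)
}

records <- lapply(lines, parse_line)
keys <- unique(unlist(lapply(records, names)))

# One row per line, one column per key; a key missing on a line becomes NA.
df <- as.data.frame(
  do.call(rbind, lapply(records, function(r) unname(r[keys]))),
  stringsAsFactors = FALSE
)
names(df) <- keys
print(df)
```

For a real file, `lines` could come from `readLines("path")`, which reads the whole file in a single call instead of item by item.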
About speed: it could go faster if you replaced the "="s with " "s; then everything could be read as a series of space-separated strings by a single scan(stream,what="string",sep=" "). About conserving rows: can you write more about the structure of this file and how you want to represent it in R? I think a new question would be suitable for that.
mbq
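The single-scan idea from the comment above could look roughly like the sketch below. It assumes that the sequence `="` never occurs inside a value; the sample string stands in for the real file contents, and the variable names `txt`, `flat` and `tokens` are illustrative only.

```r
# Assumption: this sample text stands in for the contents of the real file.
txt <- 'X="1" Y="2" Z="who are you?" Y="4" Z="bla bla..." X="42"'

# Replace each '=' that directly precedes an opening quote with a space.
flat <- gsub('="', ' "', txt, fixed = TRUE)

# Read all tokens in one scan() call; double-quoted values stay in one piece.
tokens <- scan(text = flat, what = "character", sep = " ",
               quote = '"', quiet = TRUE)

# Tokens alternate key, value, key, value, ... so fold them into two columns.
data <- matrix(tokens, ncol = 2, byrow = TRUE,
               dimnames = list(NULL, c("key", "value")))
print(data)
```

The result has the same 'melted' key/value layout as the loop in the answer, so the `sapply` step can be applied to it unchanged.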
A: 

Thanks for the reply. I couldn't write code in a comment, so I posted it as an answer. That variable w in the second scan should be stream, I guess? And why did you put n=1? In that case I only get the first value entered repeatedly. Then I replaced dd with data, but it's still not OK; it seems that I read the file more than once. Maybe this length condition is bad? Below is the version I tried, what do you think? Sorry if I did something stupid; I'm new to R, and this is the first time I've done more than call a built-in function :)

data<-NULL  
stream <- file("path")
while(length(ele<-c(scan(stream,what="string",n=-1,sep="=",quiet=TRUE,flush=FALSE),
 scan(stream,what="string",n=-1,sep=" ",quiet=TRUE,flush=FALSE)))>0){
    data<-rbind(data,ele);
}
watbywbarif
I'm new to SO too; I guess I did this badly. I don't know how to put code in a comment. Or should I handle this some other way?
watbywbarif
I think it is a better idea to modify your question. What you've posted is not working because file("path") only creates a connection but does not open it; you must call open(stream) and then scan it. Otherwise each subsequent scan will open and close the connection itself, starting from scratch every time.
mbq
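The difference between an unopened and an opened connection can be seen in a small experiment. This is only a sketch: the temporary file and the variable names are assumptions standing in for the real "path".

```r
# Assumption: a temporary two-line file stands in for the real "path".
path <- tempfile()
writeLines(c("first", "second"), path)

# Unopened connection: each scan() opens it, reads, and closes it again,
# so every call starts from the top of the file.
con <- file(path)
a <- scan(con, what = "character", n = 1, quiet = TRUE)
b <- scan(con, what = "character", n = 1, quiet = TRUE)
close(con)

# Opened connection: the read position is kept between calls.
con <- file(path)
open(con, "r")
c1 <- scan(con, what = "character", n = 1, quiet = TRUE)
c2 <- scan(con, what = "character", n = 1, quiet = TRUE)
close(con)

print(c(a, b))    # both reads from the unopened connection
print(c(c1, c2))  # consecutive reads from the opened connection
unlink(path)
```

This matches the symptom described above, where the first value was entered repeatedly.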
I have edited my answer to reflect all of this.
mbq
And sorry for the buggy code; I should have checked it.
mbq
Thanks for the explanations!
watbywbarif