I'm making word frequency tables with R, and the preferred output format would be a JSON file, something like { "word" : "dog", "frequency" : 12 }. Is there any way to save the table directly in this format? I've been using the write.csv() function and then converting the output to JSON, but this is complicated and time-consuming.

+3  A: 

You could presumably use the rjson package.

Dirk Eddelbuettel
Doesn't it serialize the table as a JSON object into a file? What I need is a plain text file with the data in JSON format.
zolizoli
JSON is a plain text format... what's the problem?
mbq
see ?writeLines : writeLines(toJSON(anobject),file="afile.txt")
Joris Meys
Joris, this doesn't work. The `writeLines` function has no `file` argument; it takes a connection through its `con` argument.
aL3xa
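For reference, here is a minimal sketch of the `writeLines` route with the argument named `con` instead of `file`. It assumes the rjson package is installed; `freq`, `out_file` and the temporary path are made-up names for illustration only. As far as I know, `con` accepts either a connection object or a file name given as a character string.

```r
library(rjson)

# Hypothetical word counts, stored as a named list
freq <- list(cat = 12, dog = 32, mouse = 18)

# Illustrative output path; any writable file name would do
out_file <- tempfile(fileext = ".json")

# toJSON() returns one character string; writeLines() writes it as plain text
writeLines(toJSON(freq), con = out_file)

# Read the file back to check that it parses as JSON
back <- fromJSON(file = out_file)
```

This avoids the sink()/cat() pair, since the JSON string goes straight to the file.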
+1  A: 
set.seed(1)
( tbl <- table(round(runif(100, 1, 5))) )

## 1  2  3  4  5 
## 9 24 30 23 14 

library(rjson)
sink("json.txt")
cat(toJSON(tbl))
sink()

file.show("json.txt")
## {"1":9,"2":24,"3":30,"4":23,"5":14}

or even better:

set.seed(1)
( tab <- table(letters[round(runif(100, 1, 26))]) )

## a b c d e f g h i j k l m n o p q r s t u v w x y z 
## 1 2 4 3 2 5 4 3 5 3 9 4 7 2 2 2 5 5 5 6 5 3 7 3 2 1 

sink("lets.txt")
cat(toJSON(tab))
sink()
file.show("lets.txt")
## {"a":1,"b":2,"c":4,"d":3,"e":2,"f":5,"g":4,"h":3,"i":5,"j":3,"k":9,"l":4,"m":7,"n":2,"o":2,"p":2,"q":5,"r":5,"s":5,"t":6,"u":5,"v":3,"w":7,"x":3,"y":2,"z":1}

Then validate it with http://www.jsonlint.com/ to get pretty formatting. If you have a multidimensional table, you'll have to work on it a bit more...
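One possible way to handle a multidimensional table, as a rough sketch only: flatten it to one row per cell with as.data.frame() and then turn each row into a list. It assumes rjson is installed; the data and the names `words`, `docs`, `tab2`, `long` and `records` are invented for illustration and are not from the question.

```r
library(rjson)

# Hypothetical two-way table: word counts split by document
words <- c("cat", "dog", "cat", "dog", "dog", "mouse")
docs  <- c("a",   "a",   "b",   "b",   "b",   "b")
tab2  <- table(word = words, doc = docs)

# as.data.frame() on a table gives one row per cell, with the count in `Freq`
long <- as.data.frame(tab2, stringsAsFactors = FALSE)

# Build one list per row so that toJSON() emits an array of objects
records <- lapply(seq_len(nrow(long)), function(i) {
  list(word = long$word[i], doc = long$doc[i], frequency = long$Freq[i])
})

json <- toJSON(records)
cat(json, "\n")
```

Cells with a zero count are kept in this sketch; filter `long` first if you only want the combinations that actually occur.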

EDIT:

Oh, now I see, you want the dataset characteristics sink-ed to a JSON file. No problem, just give us some sample data, and I'll work on the code a bit. In practice, you need to get the data into the desired format first and then convert it to JSON. A list should suffice. Give me a sec, I'll update my answer.

EDIT #2: Well, time is relative... it's common knowledge... Here you go:

( dtf <- structure(list(word = structure(1:3, .Label = c("cat", "dog", 
"mouse"), class = "factor"), frequency = c(12, 32, 18)), .Names = c("word", 
"frequency"), row.names = c(NA, -3L), class = "data.frame") )

##   word frequency
## 1   cat        12
## 2   dog        32
## 3 mouse        18

If dtf is a plain data.frame, fine; if it's not, coerce it to one! Long story short, you can do:

toJSON(as.data.frame(t(dtf)))
## [1] "{\"V1\":{\"word\":\"cat\",\"frequency\":\"12\"},\"V2\":{\"word\":\"dog\",\"frequency\":\"32\"},\"V3\":{\"word\":\"mouse\",\"frequency\":\"18\"}}"

I thought I'd need some melt for this one, but a simple t did the trick. Now you only need to deal with the column names after transposing the data.frame. t coerces a data.frame to a matrix, so you need to convert the result back to a data.frame. I used as.data.frame, but you can also use toJSON(data.frame(t(dtf))), in which case you'll get X instead of V in the variable names. Alternatively, you can use a regexp to clean the JSON file (if needed), but that's lousy practice; try to work it out by preparing the data.frame instead.
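Note that in the output above the frequencies come out quoted ("12" rather than 12), because t() goes through a character matrix. If the numbers should stay numeric, one alternative is to build one list per row instead of transposing. This is only a sketch under the assumption that rjson is installed; `records` and `json` are names introduced here for illustration, and the data frame simply recreates the toy data with `word` kept as character.

```r
library(rjson)

# Same toy data as above; stringsAsFactors = FALSE keeps `word` as character
dtf <- data.frame(word = c("cat", "dog", "mouse"),
                  frequency = c(12, 32, 18),
                  stringsAsFactors = FALSE)

# One list per row, so each column keeps its own type
records <- lapply(seq_len(nrow(dtf)), function(i) as.list(dtf[i, ]))

# An unnamed list of named lists is serialized as a JSON array of objects
json <- toJSON(records)
cat(json, "\n")
```

Each element of the resulting array should then have the { "word" : ..., "frequency" : ... } shape asked for in the question, without the V1/V2/V3 wrapper keys.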

I hope this helped a bit...

aL3xa
Thank you very much! Your answer helped me out :D
zolizoli