in my previous question about using serialize() to create a CSV of objects I got a great answer from jmoy where he recommended base64 encoding of my serialized text. That was exactly what I was looking for. Oddly enough, when I try to put this in practice I get results that look right but don't exactly match what I ran through the serialize/encoding process.
The example below takes a list with 3 vectors and serializes each vector. Then each vector is base64 encoded and written to a text file along with a key. The key is simply the index number of the vector. I then reverse the process and read each line back from the csv. At the very end you can see some items don't exactly match. Is this a floating point issue? Something else?
require(caTools)
randList <- NULL
set.seed(2)
randList[[1]] <- rnorm(100)
randList[[2]] <- rnorm(200)
randList[[3]] <- rnorm(300)
#delete file contents
fileName <- "/tmp/tmp.txt"
cat("", file=fileName, append=F)
i <- 1
for (item in randList) {
myLine <- paste(i, ",", base64encode(serialize(item, NULL, ascii=T)), "\n", sep="")
cat(myLine, file=fileName, append=T)
i <- i+1
}
linesIn <- readLines(fileName, n=-1)
parsedThing <- NULL
i <- 1
for (line in linesIn){
parsedThing[[i]] <- unserialize(base64decode(strsplit(linesIn[[i]], split=",")[[1]][[2]], "raw"))
i <- i+1
}
#floating point issue?
identical(randList, parsedThing)
for (i in 1:length(randList[[1]])) {
print(randList[[1]][[i]] == parsedThing[[1]][[i]])
}
i<-3
randList[[1]][[i]] == parsedThing[[1]][[i]]
randList[[1]][[i]]
parsedThing[[1]][[i]]
Here's the abridged output:
> #floating point issue?
> identical(randList, parsedThing)
[1] FALSE
>
> for (i in 1:length(randList[[1]])) {
+ print(randList[[1]][[i]] == parsedThing[[1]][[i]])
+ }
[1] TRUE
[1] TRUE
[1] FALSE
[1] FALSE
[1] TRUE
[1] FALSE
[1] TRUE
[1] TRUE
[1] FALSE
[1] FALSE
...
>
> i<-3
> randList[[1]][[i]] == parsedThing[[1]][[i]]
[1] FALSE
>
> randList[[1]][[i]]
[1] 1.587845
> parsedThing[[1]][[i]]
[1] 1.587845
>