I'm trying to normalize some data which I have in a data frame. I want to take each value and run it through the pnorm function along with the mean and standard deviation of the column the value lives in. Using loops, here's how I would write out what I want to do:
#example data
hist_data <- data.frame(matrix(rnorm(200, mean = 5, sd = 0.5), nrow = 20))
n <- ncol(hist_data) #columns=10
k <- nrow(hist_data) #rows=20
#set up the data frame which we will populate with a loop
normalized <- data.frame(matrix(nrow = k, ncol = n))
#hot loop in loop action
for (i in 1:n) {
  for (j in 1:k) {
    normalized[j, i] <- pnorm(hist_data[j, i], mean = mean(hist_data[, i]), sd = sd(hist_data[, i]))
  }
}
normalized
It seems that in R there should be a handy dandy vectorized way of doing this. I thought I was smart, so I tried using the apply function:
#trouble ahead
#note: nrow=10 here, so this data frame has 10 rows and 20 columns
hist_data <- data.frame(matrix(rnorm(200, mean = 5, sd = 0.5), nrow = 10))
normalized <- apply(hist_data, 2, pnorm, mean = mean(hist_data), sd = sd(hist_data))
normalized
Much to my chagrin, that does NOT produce what I expected. The upper left and bottom right elements of the output are correct, but that's it. So how can I de-loopify my life?
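For reference, here is a minimal sketch of what a loop-free version might look like. It assumes the goal is exactly what the nested loop computes: each value passed through pnorm with the mean and sd of its own column. The fixed seed and the name `col` are illustrative assumptions, not part of the original code.

```r
# Sketch only: column-wise pnorm without explicit loops
set.seed(1)  # assumption: fixed seed just to make the example reproducible
hist_data <- data.frame(matrix(rnorm(200, mean = 5, sd = 0.5), nrow = 20))

# sapply hands each column to the anonymous function as a numeric vector.
# pnorm is vectorized over its first argument, so the whole column is
# processed at once with that column's own mean and sd.
normalized <- as.data.frame(
  sapply(hist_data, function(col) pnorm(col, mean = mean(col), sd = sd(col)))
)
normalized
```

The key difference from the apply attempt is that mean and sd are computed inside the function, on the single column being processed, rather than once on the whole data frame.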
Bonus points if you can tell me what my second code block is actually doing. Kind of a mystery to me still. :)
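One possible reading of the second block, sketched under an assumption: that `mean(hist_data)` and `sd(hist_data)` return one value per column, as older versions of R did for data frames (recent versions warn or stop instead). The stand-ins `col_means` and `col_sds` below are hypothetical names used to reproduce that behavior explicitly.

```r
# Sketch only: reproduce the suspected argument recycling explicitly
set.seed(1)  # assumption: fixed seed just to make the example reproducible
hist_data <- data.frame(matrix(rnorm(200, mean = 5, sd = 0.5), nrow = 10))

# Assumption: stand-ins for what mean(hist_data) and sd(hist_data) may have
# returned in older R, i.e. one value per column (20 of each)
col_means <- colMeans(hist_data)
col_sds <- sapply(hist_data, sd)

# apply passes one 10-element column at a time, but every call to pnorm also
# receives all 20 means and all 20 sds, so pnorm recycles the column to
# length 20 and pairs element k with the mean and sd of column k
recycled <- apply(hist_data, 2, pnorm, mean = col_means, sd = col_sds)
dim(recycled)
```

Under that assumption the result has 20 rows instead of 10, and an entry is only right when its row index happens to match its column index, which would fit the observation that the corner elements look correct while most others do not.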