ansaurus

Question

Root mean square deviation on binned GAM results using R

Answer 1


You say that:

The problem is that the correlations (shown in the bottom left) do not accurately reflect how closely the model fits the data.

You could calculate the correlation between the fitted values and the measured values:

library(mgcv)  # assumption: gam() and s() come from the mgcv package
cor(y, fitted(gam(y ~ s(x))))
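As a minimal sketch of that calculation on made-up data: the vectors `x` and `y` below are simulated purely for illustration, and I am assuming `gam()` is the one from the mgcv package.

```r
library(mgcv)  # assumption: gam() and s() come from mgcv

set.seed(1)
x <- seq(0, 1, length.out = 200)             # hypothetical predictor
y <- sin(2 * pi * x) + rnorm(200, sd = 0.3)  # hypothetical noisy response

fit <- gam(y ~ s(x))

# Correlation between the measured values and the fitted curve
r <- cor(y, fitted(fit))
r

# The squared correlation always lies between 0 and 1
r^2
```

The closer `r^2` is to 1, the more closely the fitted values track the measurements.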

I don't see why you want to bin your data, but you could do it as follows:

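mean.binned <- function(y, n = 5) {
  # Pad y with NA so that its length is a multiple of the bin size n
  pad <- (n - (length(y) %% n)) %% n
  # Each matrix column holds one bin of n consecutive values
  bins <- matrix(c(y, rep(NA, pad)), nrow = n)
  # Column means, ignoring the NA padding
  apply(bins, 2, mean, na.rm = TRUE)
}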

It looks a bit ugly, but it should handle vectors whose length is not a multiple of the bin size (5 in your example): the last bin is padded with NA values, which are ignored when averaging.
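A quick usage sketch, with the function repeated so the snippet runs on its own. The input `1:12` is just an illustrative vector whose length is not a multiple of 5.

```r
mean.binned <- function(y, n = 5) {
  pad <- (n - (length(y) %% n)) %% n            # number of NA values to append
  bins <- matrix(c(y, rep(NA, pad)), nrow = n)  # one bin per column
  apply(bins, 2, mean, na.rm = TRUE)            # mean of each bin
}

y <- 1:12                        # 12 values: two full bins plus a partial one
binned <- mean.binned(y, n = 5)
binned
# [1]  3.0  8.0 11.5
```

The last bin contains only 11 and 12, so its mean is 11.5 rather than NA.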

You also say that:

One way to improve the accuracy of the correlation is to use a root mean square error (RMSE) calculation on binned data.

I don't understand what you mean by this. The correlation is itself one of the terms that determine the mean squared error; see, for example, equation 10 of Murphy (1988, Monthly Weather Review, v. 116, pp. 2417-2424). Could you explain what you mean?
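To make the link concrete, here is a small numerical check of the standard decomposition of the mean squared error into a bias term, two variances and a correlation term: MSE = (mean(f) - mean(o))^2 + s_f^2 + s_o^2 - 2 * s_f * s_o * r, where the standard deviations use a divisor of n. I am assuming this is the relationship the cited equation expresses; the vectors `o` and `f` and the helper `pop.sd` are made up for illustration.

```r
set.seed(42)
o <- rnorm(100)                             # hypothetical observations
f <- 0.8 * o + rnorm(100, sd = 0.5) + 0.2   # hypothetical fitted values

# Standard deviation with divisor n (not n - 1), as the identity requires
pop.sd <- function(v) sqrt(mean((v - mean(v))^2))

mse <- mean((f - o)^2)   # mean squared error computed directly
r <- cor(f, o)           # correlation between fitted and observed values

# The same quantity rebuilt from bias, variances and correlation
decomposed <- (mean(f) - mean(o))^2 + pop.sd(f)^2 + pop.sd(o)^2 -
  2 * pop.sd(f) * pop.sd(o) * r

all.equal(mse, decomposed)
```

For a fixed bias and fixed variances, a higher correlation gives a lower mean squared error, so the two measures are not independent of each other.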

nullglob 2010-06-19 12:32:51

@nullglob: Thank you. What I really was looking for was a number between 0 and 1 that indicates how closely the fitted curve from GAM fits the data. I think I can use `anova()` and a chi-squared test.

Dave Jarvis 2010-06-20 21:03:56
