You say that:
The problem is that the correlations (shown in the bottom left) do not accurately reflect how closely the model fits the data.
You could calculate the correlation between the fitted values and the measured values:
cor(y,fitted(gam(y ~ s(x))))
I don't see why you want to bin your data, but you could do it as follows:
mean.binned <- function(y,n = 5){
apply(matrix(c(y,rep(NA,(n - (length(y) %% n)) %% n)),n),
2,
function(x)mean(x,na.rm = TRUE))
}
It looks a bit ugly, but it should handle vectors whose length is not a multiple of the binning length (i.e. 5 in your example).
You also say that:
One way to improve the accuracy of the correlation is to use a root mean square error (RMSE) calculation on binned data.
I don't understand what you mean by this. The correlation is a factor in determining the mean squared error - for example, see equation 10 of Murphy (1988, Monthly Weather Review, v. 116, pp. 2417-2424). But please explain what you mean.