I have an array with dates as indices which I'm plotting. I'd like to plot a LOESS curve along with it. However, the input for loess is a formula. Is there a good way to define a formula from array index to value which I can then give to the loess function?

+2  A: 

Check out the help page for loess() - it has a couple of examples of specifying the formula. Basically, you need to put your data into a data.frame object with variables given appropriate names, then the formula will be y ~ x, where x and y are the names of the variables you want on the x- and y-axis, respectively.

I prefer the function lowess(), which is a faster, simpler alternative. It has fewer adjustable parameters than loess(), but it is just as good in many applications. Here are some links describing the differences between the two functions.

Below is a simple example for both loess() and lowess():

## create an example data set
x <- sort(rpois(100, 10) + rnorm(100, 0, 2))
y <- x^2 + rnorm(100, 0, 7)
df <- data.frame(x = x, y = y)
plot(x, y)
## fit a lowess and plot it
l.fit1 <- lowess(x, y, f = 0.3)
lines(l.fit1, col = 2, lwd = 2)

## fit a loess and plot it
l.fit2 <- loess(y ~ x, data = df)
lines(x, predict(l.fit2, x), col = 3, lwd = 2)
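
For data indexed by date, as in the question, the same pattern might look like the sketch below. The `dates` and `values` vectors are made-up stand-ins for the asker's array, and the span of 0.3 is an arbitrary choice. The dates are converted to plain numbers first, on the assumption that loess() wants a numeric predictor.

```r
## hypothetical stand-in for the asker's data: one value per day
set.seed(1)
dates  <- seq(as.Date("2010-01-01"), by = "day", length.out = 200)
values <- cumsum(rnorm(200))

## put index and value into a data.frame; dates become day counts
dat <- data.frame(t = as.numeric(dates), value = values)

## the formula reads "value as a function of t"
fit <- loess(value ~ t, data = dat, span = 0.3)

## plot the raw series and overlay the fitted curve
plot(dates, values, type = "l")
lines(dates, predict(fit), col = 2, lwd = 2)
```

Calling predict() without new data returns the fitted values at the original points, so they line up with `dates` one to one.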
nullglob
One thing the links don't mention (or I didn't see it) is that loess() returns a model object you can make predictions with, whereas lowess() just returns the smoothed x and y values. ggplot() apparently uses loess() and predicts at lots of intermediate x values to get its nice smooth lines. If you want anything other than fits at the x values you supplied, loess() is the one to use.
John
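
A rough illustration of that difference, using the built-in `cars` data purely as an example (the 200-point grid is an arbitrary choice):

```r
## loess() gives a model object that can be evaluated at new x values
fit <- loess(dist ~ speed, data = cars)
grid <- data.frame(speed = seq(min(cars$speed), max(cars$speed),
                               length.out = 200))
smooth <- predict(fit, newdata = grid)

## lowess() gives only a list of smoothed coordinates
lw <- lowess(cars$speed, cars$dist)

plot(cars$speed, cars$dist)
lines(grid$speed, smooth, col = 3, lwd = 2)  # loess on a fine grid
lines(lw, col = 2, lwd = 2)                  # lowess at the observed x
```

The grid stays inside the observed range of `speed`, since by default loess() does not extrapolate and would return NA outside it.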
My array has dimensions 15238,1. If I pass it into lowess I get back a list of length 2 where each item is a variable of length 15238. What is it that is in the $x and $y that lowess returns?
Ben McCann
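
A small sketch of what comes back in that case, using a made-up 500 x 1 matrix in place of the real array. When only one argument is given, lowess() appears to treat the position in the array as x, so `$x` holds the indices and `$y` the smoothed values.

```r
## hypothetical one-column array standing in for the 15238 x 1 data
set.seed(42)
m <- matrix(cumsum(rnorm(500)), ncol = 1)

## with no separate y, the row index is used as x
lw <- lowess(m, f = 0.1)
str(lw)        # a list with components x and y

## raw series with the smoothed curve on top
plot(m[, 1], type = "l")
lines(lw, col = 2, lwd = 2)
```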
I'm still having quite a bit of trouble using either function. How do I convert my array to a formula that I can pass into loess()? Is it something like "arrayIndex ~ array[arrayIndex]"?
Ben McCann
@Ben McCann, try something like `x <- arrayIndex; y <- array[arrayIndex]` and then add this into the code above.
nullglob
Thanks for the help. It turns out my problem mostly stemmed from not understanding R's take on object-oriented programming. If I understand correctly, an object in R has both a type and a class, which was a new notion for me coming from other languages.
Ben McCann
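
To make the type/class distinction concrete, here is a quick check on the two kinds of result discussed in this thread (the `cars` data is only an example):

```r
fit <- loess(dist ~ speed, data = cars)
lw  <- lowess(cars$speed, cars$dist)

typeof(fit)   # "list": how the object is stored
class(fit)    # "loess": what generics such as predict() dispatch on
typeof(lw)    # "list"
class(lw)     # "list": no special class, just x and y
```

Both objects are stored as lists, but only the loess() result carries a class that predict() has a method for.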
A: 

There are a number of ways to do what you want; the simplest might be to use scatter.smooth(), which more or less packs everything you need (data plot, loess fit, and curve plot) into a single function call.

data(AirPassengers)                # a monthly time series supplied w/ base R install
AP <- AirPassengers                # short alias used below
scatter.smooth( x = seq_along(AP),
                y = as.vector(AP),
                pch = 20,          # this line and lines below are just aesthetics
                col = "orange",
                lpars = list(lty = "dotted", lwd = 1.5),  # style of the loess curve
                xlab = "")
doug
Thanks for the suggestion. I'd really like to also have access to the underlying loess data and it appears scatter.smooth will only chart it, but good to know about regardless.
Ben McCann
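
If the numbers behind the curve are wanted, one option is loess.smooth() from the same stats package, which returns the curve's coordinates instead of drawing them. The sketch below assumes the same AirPassengers example; the column names `month` and `passengers` are made up for illustration.

```r
AP <- AirPassengers

## coordinates of the smooth curve, evaluated at 50 points by default
curve_xy <- loess.smooth(x = seq_along(AP), y = as.vector(AP))
str(curve_xy)

## for the full model object, fit loess() directly
dat <- data.frame(month = seq_along(AP), passengers = as.vector(AP))
fit <- loess(passengers ~ month, data = dat)
head(fitted(fit))      # smoothed value at each observation
head(residuals(fit))   # observed minus smoothed
```

The direct loess() fit uses that function's own defaults, so its curve will not necessarily match the one drawn by scatter.smooth() unless the span, degree, and family are set to the same values.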