+1  A: 

May I propose a different solution? Just use PostgreSQL to pull the data, feed it into some R script and finally show the results. The R script may be as complicated as you want as long as the user doesn't have to deal with it.
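As a rough sketch of that flow, assuming the query result arrives in R as a data frame: the column names `year` and `value`, the simulated data, and the helper `render_trend_pdf` are all made up for illustration, and in practice the data frame would come from PostgreSQL rather than from `rnorm()`.

```r
# Stand-in for the rows PostgreSQL would return (assumption: two numeric
# columns named year and value; the real data would come from a query).
set.seed(42)
measurements <- data.frame(year = 1900:2009)
measurements$value <- 10 + 0.01 * (measurements$year - 1900) +
  rnorm(nrow(measurements), sd = 0.5)

# Hypothetical helper: fit a LOESS curve and write the chart to a PDF file,
# so the end user never has to touch R directly.
render_trend_pdf <- function(data, path) {
  fit <- loess(value ~ year, data = data)
  pdf(path, width = 8, height = 5)
  on.exit(dev.off())                      # always close the graphics device
  plot(data$year, data$value, pch = 16, col = "grey40",
       xlab = "Year", ylab = "Value")
  lines(data$year, predict(fit), col = "steelblue", lwd = 2)
  invisible(fit)
}

out_file <- file.path(tempdir(), "trend.pdf")
fit <- render_trend_pdf(measurements, out_file)
```

The web layer would then only need to call the script and serve the generated file.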

You may want to have a look at rapache, an Apache module that allows running R scripts in a webpage. A couple of videos illustrating its use:

In particular, check how the San Francisco Estuary Institute Web Query Tool lets the user interact with the parameters.

As for the regression, I'm not an expert, so I may be saying something extremely stupid... but wouldn't something like a LOESS regression be OK for this?

nico
@nico: Right now the architecture is: `Apache -> FORM -> PHP -> JasperReports -> PostgreSQL -> R`. It's easy to say "show the results" just as it's easy to say "use R". I have spent 17 days migrating a working and speedy MySQL database to PostgreSQL just so that I can use R. I am running out of time on the project and so architectural changes are not going to happen at this point.
Dave Jarvis
@nico: I don't know if LOESS is applicable. I'll look into it, though, as my first Googling looks promising. I was told to investigate autoregression modeling, too, but, again, it is much easier to say something than to implement it and produce a working, rock-solid system suitable for use by the general public.
Dave Jarvis
@Dave Jarvis: Of course, I understand it's not so easy to reimplement everything; I wasn't sure how far along you were with the project :) Anyway, my idea of just pulling the data from PostgreSQL and feeding it into R for the regression is still applicable, isn't it? Here's a page with some LOESS examples in R: http://research.stowers-institute.org/efg/R/Statistics/loess.htm
nico
17 days to migrate from MySQL to PostgreSQL? Yeebus. I like Pg better too, but R would have been very happy to read from MySQL as well. And as others have said, you don't need Jasper. But if that is what you know, go for it.
Dirk Eddelbuettel
@nico: Also (before everyone downvotes me), gplot is a plotting tool and R is a fantastic statistical analysis framework. Neither are *reporting* tools.
Dave Jarvis
@Dave Jarvis: I'm definitely not gonna downvote it for that, although I think you can easily do a graph like that in R with a one-liner :) Still, I do not understand why you think the user would have to interact with R. You would have a bunch of R functions that generate one graph or another, the user would not have to interact at all with it. Using RApache you would have been able to use PHP to pull raw data from your DB, send that raw data to R and then have it process it and return it to the PHP script (see the 3rd link in my answer).
nico
@nico: The graphs R produces are still not even close to the same calibre as those of JasperReports. I took a peek at the third link. The lines are aliased, the label spacing is off, the resolution is low, the fonts are Arial, the blue colour doesn't match the site, generating a PDF that someone can print and share in Parliament would be difficult, the graph looks too "scientific", and I could go on.
Dave Jarvis
@nico: That's not to say it isn't a great job and valuable for the scientific community. I just have a hard time imagining it wowing the general public. Again, R and gplot are tools for scientific analysis, not *reporting* tools.
Dave Jarvis
+3  A: 

I don't think autoregression is what you want. Non-linear isn't what you want either, because that implies discontinuous data. You have continuous data; it just may not follow a straight line. If you're just visualizing, and especially if you don't know what the shape is supposed to be, then loess is what you want.

It's easy to also get a confidence interval band around the line if you just plot the data with ggplot2.

    library(ggplot2)
    qplot(x, y, data = df, geom = 'point') + stat_smooth()

That will make a nice plot.

If you want a simpler graph in base R:

    plot(x, y)
    lines(loess.smooth(x, y))
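If you want the confidence band in base R as well, something along these lines should get close to what `stat_smooth()` draws. This is only a sketch on simulated data (the `sin` curve and noise level are made up), using a pointwise 95% band built from the standard errors that `predict()` returns for a loess fit.

```r
# Simulated data, purely for illustration
set.seed(1)
x <- seq(0, 10, length.out = 200)
y <- sin(x) + rnorm(length(x), sd = 0.3)

# Fit the loess curve and ask for standard errors of the fitted values
fit  <- loess(y ~ x)
pred <- predict(fit, newdata = data.frame(x = x), se = TRUE)

# Pointwise 95% band: fitted value +/- t-quantile * standard error
half_width <- qt(0.975, pred$df) * pred$se.fit
upper <- pred$fit + half_width
lower <- pred$fit - half_width

# Draw the band first, then the points and the smooth line on top
plot(x, y, type = "n")
polygon(c(x, rev(x)), c(upper, rev(lower)), col = "grey85", border = NA)
points(x, y, pch = 16, col = "grey40")
lines(x, pred$fit, lwd = 2)
```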
John
@John: Thank you. Neither gplot nor R make beautiful plots, nor would I be able to generate a PDF that looks like the image shown using either gplot or R. And even if they could, the amount of time invested would be a waste -- I already have the graph developed (as shown) and my deadline is July 15th. Trust me when I say there is much more I'd rather do with the project than rewrite something that already works. :-)
Dave Jarvis
+1  A: 

The awesome pl/r package allows you to run R inside PostgreSQL as a procedural language. There are some gotchas because R likes to think about data in terms of vectors, which is not what an RDBMS does. It is still a very useful package, as it gives you R inside of PostgreSQL and saves you some of the roundtrips in your architecture.

And pl/r is apt-get-able for you, as it has been part of Debian / Ubuntu for a while. Start with `apt-cache show postgresql-8.4-plr` (that is on testing; other versions/flavours have it too).

As for the appropriate modeling: that is a whole different ballgame. loess is a fair suggestion for something non-parametric, and you probably also want some sort of dynamic model, either ARMA/ARIMA or lagged regression. The choice of modeling is pretty critical given how politicized the topic is.
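To give a flavour of the dynamic-model options, here is a minimal sketch using only base R. The series is simulated with a made-up AR(1) coefficient of 0.7, so it says nothing about which model suits the real data; it only shows the calls involved.

```r
# Simulated AR(1) series, purely for illustration (coefficient 0.7 is made up)
set.seed(123)
n <- 500
series <- arima.sim(model = list(ar = 0.7), n = n)

# ARIMA fit: order = c(p, d, q), here a plain AR(1) with no differencing
ar_fit <- arima(series, order = c(1, 0, 0))

# Lagged regression: regress each value on the one before it
current  <- series[-1]
previous <- series[-n]
lag_fit  <- lm(current ~ previous)

# Both estimates should land near the coefficient used in the simulation
coef(ar_fit)["ar1"]
coef(lag_fit)["previous"]
```

On real data you would compare candidate orders (for example by AIC) and check the residuals before trusting any of this.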

Dirk Eddelbuettel
Thanks, Dirk. I should have mentioned I already installed PL/R. (It didn't install out-of-the-box due to library path issues with the amd64 package for 8.4.) I'll look into ARMA/ARIMA, lagged regression, and LOESS. I was more wondering what R packages to install on top of PL/R.
Dave Jarvis