Hello.
I have some data in a dataframe calvarbyruno.1 with variables Nominal and PAR, where PAR is the Peak Area Ratio found from analysis of a set of standards using a particular analytical technique, and two lm models of that data (linear and quadratic) for the relationship PAR ~ Nominal. I'm trying to use the predict.lm function to back-calculate Nominal values from my PAR values, but both predict.lm and fitted seem to give me only PAR values. I'm slowly losing my mojo; can anyone help?
calvarbyruno.1 dataframe
structure(list(Nominal = c(1, 3, 6, 10, 30, 50, 150, 250), Run = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("1", "2", "3"), class = "factor"),
PAR = c(1.25000000000000e-05, 0.000960333333333333, 0.00205833333333334,
0.00423333333333333, 0.0322333333333334, 0.614433333333334,
1.24333333333333, 1.86333333333333), PredLin = c(-0.0119152187070942,
0.00375925114245899, 0.0272709559167888, 0.0586198956158952,
0.215364594111427, 0.372109292606959, 1.15583278508462, 1.93955627756228
), PredQuad = c(-0.0615895732702735, -0.0501563307416599,
-0.0330831368244257, -0.0104619953693943, 0.100190275883806,
0.20675348710041, 0.6782336426345, 1.04748729725370)), .Names = c("Nominal",
"Run", "PAR", "PredLin", "PredQuad"), row.names = c(NA, 8L), class = "data.frame")
Linear Model
> summary(callin.1)
Call:
lm(formula = PAR ~ Nominal, data = calvarbyruno.1, weights = Nominal^calweight)
Residuals:
       Min         1Q     Median         3Q        Max
-0.0041172 -0.0037785 -0.0003605  0.0024465  0.0071815
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.007083   0.005037  -1.406   0.2093
Nominal      0.005249   0.001910   2.748   0.0334 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.004517 on 6 degrees of freedom
Multiple R-squared: 0.5572, Adjusted R-squared: 0.4835
F-statistic: 7.551 on 1 and 6 DF, p-value: 0.03338
Quadratic Model
> summary(calquad.1)
Call:
lm(formula = PAR ~ Nominal + I(Nominal^2), data = calvarbyruno.1)
Residuals:
        1         2         3         4         5         6         7         8
 0.053366  0.033186  0.002766 -0.036756 -0.211640  0.177012 -0.021801  0.003867
Coefficients:
               Estimate Std. Error t value Pr(>|t|)
(Intercept)  -6.395e-02  6.578e-02  -0.972  0.37560
Nominal       1.061e-02  2.205e-03   4.812  0.00483 **
I(Nominal^2) -1.167e-05  9.000e-06  -1.297  0.25138
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.128 on 5 degrees of freedom
Multiple R-squared: 0.9774, Adjusted R-squared: 0.9684
F-statistic: 108.2 on 2 and 5 DF, p-value: 7.658e-05
But predict gives me these values, which both seem wrong (although I can't work out what it's doing differently for the second set):
> predict(callin.1)
           1            2            3            4            5            6
-0.001834123  0.008663451  0.024409812  0.045404959  0.150380698  0.255356437
          7           8
0.780235132 1.305113826
> predict(callin.1,type="terms")
Nominal
1 -0.32280040
2 -0.31230282
3 -0.29655646
4 -0.27556131
5 -0.17058558
6 -0.06560984
7 0.45926886
8 0.98414755
attr(,"constant")
[1] 0.3209663
EDIT: As has been pointed out, I've not been very clear on what I'm trying to achieve, so I'll try to explain myself better.
The data is from analysis of a set of standards of known concentrations (Nominal), which gives a particular set of responses, or peak area ratios (PAR). I want to show which model best fits this data, so that I can then use it to analyse unknown samples and find their concentrations.
I'm trying to follow someone else's working for this, which involves:
a) Find the appropriate weight to use, by finding the within-run variance of PAR and fitting that to a model of log(Variance(PAR))=a+b*log(Nominal), where b will be the weight to use (rounded to the nearest integer)
b) Fit the data for each run to a linear model (PAR = a+b*Nominal) and a quadratic model (PAR = a+b*Nominal+c*Nominal^2)
c) Back-calculate the found concentration for each standard and compare it to the Nominal concentration to give the bias
d) Assess bias across the calibration range and pick the model based on the bias
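For reference, steps b) to d) for the linear model can be sketched roughly as below. This is only a minimal sketch under stated assumptions: `cal`, `fit_lin`, `BackCalc` and `BiasPct` are names made up for illustration, only the Nominal and PAR values from the dataframe above are used, and `calweight <- -2` is a placeholder because the real weight would come from step a).

```r
# Standards from the dataframe above (Run 1 only)
cal <- data.frame(
  Nominal = c(1, 3, 6, 10, 30, 50, 150, 250),
  PAR = c(1.25e-05, 0.000960333333333333, 0.00205833333333334,
          0.00423333333333333, 0.0322333333333334, 0.614433333333334,
          1.24333333333333, 1.86333333333333)
)

# ASSUMPTION: placeholder weight exponent; the real value comes from step a)
calweight <- -2

# Step b) weighted linear calibration, PAR = a + b * Nominal
fit_lin <- lm(PAR ~ Nominal, data = cal, weights = Nominal^calweight)
a <- coef(fit_lin)[["(Intercept)"]]
b <- coef(fit_lin)[["Nominal"]]

# Step c) back-calculate the concentration by inverting the fitted line
cal$BackCalc <- (cal$PAR - a) / b

# Step d) percentage bias of the back-calculated value against Nominal
cal$BiasPct <- 100 * (cal$BackCalc - cal$Nominal) / cal$Nominal

print(cal)
```

Note that predict() and fitted() go in the forward direction only (Nominal in, PAR out), which is why the inversion here is done by hand from the coefficients.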
This question is about c). Posts to the R mailing list suggest it's not appropriate to simply run the regression with the terms reversed. I can do the calculation manually for the linear model, but am struggling with the quadratic model. From searching the R mailing list, it seems that others want to do the same thing.
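One possibility for the quadratic case, which I have not validated against the reference working, would be to solve the fitted quadratic PAR = a + b*Nominal + c*Nominal^2 for Nominal with the quadratic formula. The sketch below assumes the unweighted quadratic fit shown above; `cal`, `fit_quad` and `back_calc_quad` are hypothetical names, and the choice of root assumes a positive slope b, so that the chosen root is the one that reduces to the linear solution (PAR - a)/b as c tends to 0.

```r
# Standards from the dataframe above (Run 1 only)
cal <- data.frame(
  Nominal = c(1, 3, 6, 10, 30, 50, 150, 250),
  PAR = c(1.25e-05, 0.000960333333333333, 0.00205833333333334,
          0.00423333333333333, 0.0322333333333334, 0.614433333333334,
          1.24333333333333, 1.86333333333333)
)

# Unweighted quadratic calibration, as in summary(calquad.1)
fit_quad <- lm(PAR ~ Nominal + I(Nominal^2), data = cal)

# Hypothetical helper: invert y = a + b*x + c*x^2 for x
back_calc_quad <- function(fit, y) {
  cf <- unname(coef(fit))
  a <- cf[1]
  b <- cf[2]
  cc <- cf[3]
  disc <- b^2 - 4 * cc * (a - y)
  # sqrt() gives NaN (with a warning) where disc < 0, i.e. where y lies
  # beyond the turning point of the fitted curve and has no real solution
  (-b + sqrt(disc)) / (2 * cc)
}

# Step c) back-calculated concentrations and percentage bias
cal$BackCalcQuad <- back_calc_quad(fit_quad, cal$PAR)
cal$BiasPctQuad <- 100 * (cal$BackCalcQuad - cal$Nominal) / cal$Nominal

print(cal)
```

A quick sanity check is that predict(fit_quad, newdata = data.frame(Nominal = cal$BackCalcQuad)) should return the original PAR values.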