Hello.

I have some data in a dataframe calvarbyruno.1 with variables Nominal and PAR, where PAR is the Peak Area Ratio found from analysis of a set of standards using a particular analytical technique, and two lm models of that data (linear and quadratic) for the relationship PAR ~ Nominal. I'm trying to use the predict.lm function to back-calculate Nominal values, given my PAR values, but both predict.lm and fitted seem to only give me PAR values. I'm slowly losing my mojo; can anyone help?

calvarbyruno.1 dataframe

structure(list(Nominal = c(1, 3, 6, 10, 30, 50, 150, 250), Run = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("1", "2", "3"), class = "factor"), 
    PAR = c(1.25000000000000e-05, 0.000960333333333333, 0.00205833333333334, 
    0.00423333333333333, 0.0322333333333334, 0.614433333333334, 
    1.24333333333333, 1.86333333333333), PredLin = c(-0.0119152187070942, 
    0.00375925114245899, 0.0272709559167888, 0.0586198956158952, 
    0.215364594111427, 0.372109292606959, 1.15583278508462, 1.93955627756228
    ), PredQuad = c(-0.0615895732702735, -0.0501563307416599, 
    -0.0330831368244257, -0.0104619953693943, 0.100190275883806, 
    0.20675348710041, 0.6782336426345, 1.04748729725370)), .Names = c("Nominal", 
"Run", "PAR", "PredLin", "PredQuad"), row.names = c(NA, 8L), class = "data.frame")

Linear Model

> summary(callin.1)

Call:
lm(formula = PAR ~ Nominal, data = calvarbyruno.1, weights = Nominal^calweight)

Residuals:
       Min         1Q     Median         3Q        Max 
-0.0041172 -0.0037785 -0.0003605  0.0024465  0.0071815 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)  
(Intercept) -0.007083   0.005037  -1.406   0.2093  
Nominal      0.005249   0.001910   2.748   0.0334 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 0.004517 on 6 degrees of freedom
Multiple R-squared: 0.5572,     Adjusted R-squared: 0.4835 
F-statistic: 7.551 on 1 and 6 DF,  p-value: 0.03338

Quadratic Model

> summary(calquad.1)

Call:
lm(formula = PAR ~ Nominal + I(Nominal^2), data = calvarbyruno.1)

Residuals:
        1         2         3         4         5         6         7         8 
 0.053366  0.033186  0.002766 -0.036756 -0.211640  0.177012 -0.021801  0.003867 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)   
(Intercept)  -6.395e-02  6.578e-02  -0.972  0.37560   
Nominal       1.061e-02  2.205e-03   4.812  0.00483 **
I(Nominal^2) -1.167e-05  9.000e-06  -1.297  0.25138   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

Residual standard error: 0.128 on 5 degrees of freedom
Multiple R-squared: 0.9774,     Adjusted R-squared: 0.9684 
F-statistic: 108.2 on 2 and 5 DF,  p-value: 7.658e-05

But predict gives me these values, which both seem wrong (although I can't work out what it's doing that's different for the second set).

> predict(callin.1)
           1            2            3            4            5            6 
-0.001834123  0.008663451  0.024409812  0.045404959  0.150380698  0.255356437 
           7            8 
 0.780235132  1.305113826 
> predict(callin.1,type="terms")
      Nominal
1 -0.32280040
2 -0.31230282
3 -0.29655646
4 -0.27556131
5 -0.17058558
6 -0.06560984
7  0.45926886
8  0.98414755
attr(,"constant")
[1] 0.3209663
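
A note on how the two outputs above relate, as a minimal sketch: predict always returns values of the response (PAR here), never the inverse. With type="terms" it returns each term's centred contribution, plus a "constant" attribute, and the two add up to the ordinary prediction (in the output above, -0.32280040 + 0.3209663 is the -0.001834123 from the first call). The fit below is unweighted, which is an assumption, since calweight is not given.

```r
# Data from the question
Nominal <- c(1, 3, 6, 10, 30, 50, 150, 250)
PAR <- c(1.25e-05, 0.000960333333333333, 0.00205833333333334,
         0.00423333333333333, 0.0322333333333334, 0.614433333333334,
         1.24333333333333, 1.86333333333333)

# Unweighted linear fit (assumption: the original used weights = Nominal^calweight)
fit <- lm(PAR ~ Nominal)

fitted_par <- predict(fit)                  # fitted PAR at the observed Nominal
terms_par  <- predict(fit, type = "terms")  # centred per-term contributions

# Term contributions plus the "constant" attribute rebuild the fitted PAR
rebuilt <- rowSums(terms_par) + attr(terms_par, "constant")
all.equal(unname(rebuilt), unname(fitted_par))

# newdata supplies Nominal and returns predicted PAR, not the other way round
predict(fit, newdata = data.frame(Nominal = c(20, 100)))
```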

EDIT: As has been pointed out, I've not been very clear on what I'm trying to achieve, so I'll try to explain myself better.

The data is from analysis of a set of standards of known concentrations (Nominal), which gives a particular set of responses, or peak area ratios (PAR). I want to show which model best fits this data, so that I can then use it to analyse unknown samples and find their concentration.

I'm trying to follow someone else's working for this, which involves:
a) Find the appropriate weight to use, by finding the within-run variance of PAR and fitting that to a model of log(Variance(PAR))=a+b*log(Nominal), where b will be the weight to use (rounded to the nearest integer)
b) Fit the data for each run to a linear model (PAR = a+b*Nominal) and a quadratic model (PAR = a+b*Nominal+c*Nominal^2)
c) Back-calculate the found concentration for each standard and compare it to the Nominal concentration to give the bias
d) Assess bias across the calibration range and pick the model based on the bias

This question is trying to do c). Posts to the R mailing list suggest it's not appropriate to just do the regression with the terms reversed. I can manually do the calculation for the linear model, but am struggling with the quadratic model. It seems from searching the R mailing list that others want to do the same thing.
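
For the linear model, step c) amounts to rearranging PAR = a + b*Nominal into Nominal = (PAR - a)/b. A minimal sketch follows, under two assumptions that are not in the question: the fit is unweighted (calweight is not given), and bias is expressed as the percent deviation of the found value from Nominal.

```r
# Data from the question
Nominal <- c(1, 3, 6, 10, 30, 50, 150, 250)
PAR <- c(1.25e-05, 0.000960333333333333, 0.00205833333333334,
         0.00423333333333333, 0.0322333333333334, 0.614433333333334,
         1.24333333333333, 1.86333333333333)

# Unweighted linear calibration (assumption: no weights)
lin <- lm(PAR ~ Nominal)
a <- coef(lin)[["(Intercept)"]]
b <- coef(lin)[["Nominal"]]

# Back-calculate the concentration from each observed response
found <- (PAR - a) / b

# Percent bias relative to the nominal concentration (assumed definition)
bias <- 100 * (found - Nominal) / Nominal

data.frame(Nominal, PAR, found = round(found, 2), bias = round(bias, 1))
```

The same rearrangement applies to a weighted fit: only the values of a and b change.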

+2  A: 

This has come up on the R mailing list (see here). The post suggests a package I can't find, but these packages might be useful (calib, chemcal).

PaulHurleyuk
This link might help with finding the inverse of a quadratic polynomial http://mathworld.wolfram.com/CompletingtheSquare.html
PaulHurleyuk
+3  A: 

OK, I actually had to try this. After looking at various things, I wrote a function to find the roots of a quadratic equation.

invquad <- function(a, b, c, y, roots = "both", xmin = -Inf, xmax = Inf, na.rm = FALSE) {
  # Calculate the inverse of a quadratic function y = a*x^2 + b*x + c (i.e. find x given y)
  # Gives NaN (with a warning) where there is no real solution
  half_width <- sqrt((y - (c - b^2 / (4 * a))) / a)
  root1 <-  half_width - b / (2 * a)
  root2 <- -half_width - b / (2 * a)
  if (roots == "both") {
    # Discard roots that fall outside [xmin, xmax]
    root1 <- ifelse(root1 < xmin | root1 > xmax, NA, root1)
    root2 <- ifelse(root2 < xmin | root2 > xmax, NA, root2)
    if (na.rm) {
      # One value per y: root1 if it is in range, otherwise root2 (NA if neither is)
      result <- ifelse(is.na(root1), root2, root1)
    } else {
      result <- c(root1, root2)
    }
  }
  # The argument is na.rm (lower case); NA.rm would be treated as a third vector to compare
  if (roots == "min")
    result <- pmin(root1, root2, na.rm = TRUE)
  if (roots == "max")
    result <- pmax(root1, root2, na.rm = TRUE)
  result
}
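
For reference, the two root expressions come from completing the square, as in the MathWorld link. A short derivation, using the function's argument names (a is the quadratic coefficient, b the linear one, c the intercept):

```latex
y = a x^2 + b x + c
  = a\left(x + \frac{b}{2a}\right)^2 + \left(c - \frac{b^2}{4a}\right)
\quad\Longrightarrow\quad
x = -\frac{b}{2a} \pm \sqrt{\frac{y - \left(c - \frac{b^2}{4a}\right)}{a}}
```

The quantity under the square root is negative when y lies beyond the vertex of the parabola, which is where the function returns NaN.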

so, given the original data

> PAR
[1] 0.0000125000 0.0009603333 0.0020583333 0.0042333333 0.0322333333 0.6144333333
[7] 1.2433333333 1.8633333333
> Nominal
[1]   1   3   6  10  30  50 150 250

we can do the analysis, find the coefficients and then find the inverse, using some sensible limits for the Nominal values we expect back...

> bob <- lm(PAR~Nominal+I(Nominal^2))
> bob[[1]][[3]]
[1] -1.166904e-05 # Nominal^2
> bob[[1]][[2]]
[1] 0.01061094 # Nominal
> bob[[1]][[1]]
[1] -0.06395298 # Intercept
> invquad(bob[[1]][[3]],bob[[1]][[2]],bob[[1]][[1]],y=PAR,xmin=-0.2,xmax=300,na.rm=TRUE)
[1]   6.068762   6.159306   6.264217   6.472106   9.157041  69.198703 146.949154
[8] 250.811211
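
As a cross-check on those numbers, here is a rough sketch that swaps the closed form for base R's polyroot and pulls the coefficients with coef() rather than by position. The helper name back_calc, the bounds 0 and 300, and the percent-bias definition are all assumptions made for illustration.

```r
# Data from the question
Nominal <- c(1, 3, 6, 10, 30, 50, 150, 250)
PAR <- c(1.25e-05, 0.000960333333333333, 0.00205833333333334,
         0.00423333333333333, 0.0322333333333334, 0.614433333333334,
         1.24333333333333, 1.86333333333333)

fit <- lm(PAR ~ Nominal + I(Nominal^2))
cf  <- unname(coef(fit))  # intercept, linear, quadratic

# Hypothetical helper: solve cf[1] + cf[2]*x + cf[3]*x^2 = y for a single y
# and keep the real root inside [lower, upper], or NA if there is not exactly one
back_calc <- function(y, cf, lower, upper) {
  r <- polyroot(c(cf[1] - y, cf[2], cf[3]))  # coefficients in increasing order
  r <- Re(r)[abs(Im(r)) < 1e-8]              # drop complex roots
  r <- r[r >= lower & r <= upper]
  if (length(r) == 1) r else NA_real_
}

found <- sapply(PAR, back_calc, cf = cf, lower = 0, upper = 300)

# Percent bias relative to Nominal (assumed definition of "bias")
bias <- 100 * (found - Nominal) / Nominal
round(cbind(Nominal, found, bias), 2)
```

Plugging found back into the fitted polynomial should reproduce PAR, which is a quick way to confirm the inversion picked a valid root.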

Hope this helps....

PaulHurleyuk