Hi everyone,

I am trying to run a pooled cross-sectional analysis with panel data in R. The first thing I did after importing the data was to drop (a) some variables and (b) some years, since I compiled this data from multiple sources and don't have all of the data for every year.

The following is the code I have used:

***firsttry.kept <- firsttry[, -c(3, 4, 5, 12, 13, 15, 16, 18, 19, 20)]***

***reduced <- firsttry.kept[year > 2000,]***
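To make these two steps concrete, here is a minimal sketch of what I intend them to do, on a made-up data frame. The object `toy`, its columns, and its values are hypothetical and are not my real data. The sketch also assumes the year column is referenced explicitly through the data frame (`toy.kept$year`), which may or may not be equivalent to the bare `year` in my line above.

```r
# Hypothetical stand-in for the imported data (not the real data set)
toy <- data.frame(
  state = rep(c("Alabama", "Texas"), each = 3),
  year  = rep(c(1999, 2001, 2002), times = 2),
  extra = 1:6,
  x     = c(0.5, 1.2, 0.7, 2.1, 1.9, 0.3),
  stringsAsFactors = FALSE
)

# (a) drop a variable by position (column 3, "extra")
toy.kept <- toy[, -3]

# (b) keep only the years after 2000, naming the column explicitly
toy.reduced <- toy.kept[toy.kept$year > 2000, ]

# Inspect what is left
names(toy.reduced)
nrow(toy.reduced)
```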

After that, I am using the plm package, and I'm attempting to create id and time indexes:

***hybridsubsidies <- plm.data(reduced, index = c("state", "year"))***

with the hope of running the following:

***pooled1 <- plm(hybridsubsidies$hovsubsidy ~ hybridsubsidies$repgov + hybridsubsidies$eespending, hybridsubsidies, model = c("pooling"))***

But when I run the first line, to index the states and years, I get this error:

***Error in `[.data.frame`(x, , !cst.check) : undefined columns selected***

When I call the data back up, it has dropped a ton of observations, specifically for all states alphabetically from A to P (up to Pennsylvania).

I can't figure out why, although when I dropped the observations for years before 2000, it looks like it created a row.names variable in the data frame to let me know which rows have been deleted. Could this be why? But when I call the names of the data frame, it doesn't list this 'row.names' as a variable.
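As far as I understand it, row names are an attribute of a data frame rather than a column, which would be consistent with `names()` not listing them. The sketch below shows that behaviour on made-up data. The object `toy` and its values are hypothetical, and I am not claiming this is what causes the error.

```r
# Hypothetical data, only to illustrate how row names behave after subsetting
toy <- data.frame(
  state = c("Alabama", "Ohio", "Texas", "Utah"),
  year  = c(1999, 2001, 1998, 2003),
  stringsAsFactors = FALSE
)

# Subsetting rows keeps the original row positions as row names
sub <- toy[toy$year > 2000, ]
kept_labels <- rownames(sub)   # labels of the rows that survived

# names() lists only the real columns; row names are not among them
col_names <- names(sub)

# Row names can be reset to 1..n if the gaps are distracting
rownames(sub) <- NULL
```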

I'm totally confused. Someone told me that R does not like to manipulate data when it is a data frame, so I tried converting the data to a matrix, manipulating it, and then converting it back to a data frame, but that didn't work either.
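One thing I did notice with the matrix route: a matrix can hold only one type, so a data frame that mixes text and numbers comes back with every column as character. A small sketch on made-up data (`toy` and its columns are hypothetical):

```r
# Hypothetical mixed-type data frame: one text column, two numeric columns
toy <- data.frame(
  state    = c("Ohio", "Texas"),
  year     = c(2001, 2002),
  spending = c(1.5, 2.5),
  stringsAsFactors = FALSE
)

# Converting to a matrix coerces everything to a single type (character here)
m <- as.matrix(toy)
matrix_type <- typeof(m)

# Converting back does not restore the numeric columns
back <- as.data.frame(m, stringsAsFactors = FALSE)
back_classes <- sapply(back, class)
```

So the round trip through a matrix may itself change the data, separately from whatever is going wrong with the indexing.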