Hi everyone,
I am trying to run a pooled cross-sectional analysis with panel data in R. The first thing I did after importing the data was to drop (a) some variables and (b) some years, since I compiled this data from multiple sources and don't have all of the data for all of the years.
The following is the code I have used:
***firsttry.kept <- firsttry[, -c(3, 4, 5, 12, 13, 15, 16, 18, 19, 20)]***
***reduced <- firsttry.kept[year > 2000,]***
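For anyone trying to reproduce this, here is a minimal sketch of the same two steps on a toy data frame. The values and the column names other than `state` and `year` are made up for illustration, not my real data. One difference from what I ran: my code refers to `year` bare, which only works if a `year` object happens to be visible in the workspace (for example after `attach()`), whereas the sketch refers to the column explicitly.

```r
# Toy data frame (hypothetical values, not my real data)
toy <- data.frame(
  state = rep(c("Alabama", "Texas"), each = 3),
  year  = rep(c(1999, 2001, 2002), times = 2),
  extra = 1:6,                      # hypothetical column to be dropped
  hovsubsidy = c(0, 1, 1, 0, 0, 1)  # made-up values
)

# (a) drop variables by column position, as in my code
toy.kept <- toy[, -c(3)]

# (b) keep only the years after 2000, referencing the column explicitly
toy.reduced <- toy.kept[toy.kept$year > 2000, ]

print(toy.reduced)
```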
After that, I am using the plm package and attempting to create ID and time indexes:
***hybridsubsidies <-plm.data(reduced, index= c("state","year"))***
with the hope of then running the following:
***pooled1 <- plm(hybridsubsidies$hovsubsidy ~ hybridsubsidies$repgov + hybridsubsidies$eespending,
hybridsubsidies, model = c("pooling"))***
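To make the goal concrete, here is a rough sketch of the workflow I am aiming for, run on a toy panel. Everything in it is an assumption for illustration: the data values are made up, and it uses `pdata.frame()` rather than `plm.data()`, since as far as I understand that is the way the plm package now recommends declaring the indexes. It also uses bare column names in the formula instead of the `hybridsubsidies$` prefix.

```r
library(plm)

# Toy panel: 3 states x 4 years, made-up values (not my real data)
set.seed(1)
toy <- data.frame(
  state      = rep(c("Alabama", "Ohio", "Texas"), each = 4),
  year       = rep(2001:2004, times = 3),
  hovsubsidy = rnorm(12),
  repgov     = rep(c(0, 1, 1, 0), times = 3),
  eespending = runif(12, 10, 50)
)

# Declare the id and time indexes
toy.p <- pdata.frame(toy, index = c("state", "year"))

# Pooled OLS on the indexed data
pooled.toy <- plm(hovsubsidy ~ repgov + eespending,
                  data = toy.p, model = "pooling")
summary(pooled.toy)
```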
But when I run the plm.data line to index the state and year, I get this error:
***Error in `[.data.frame`(x, , !cst.check) : undefined columns selected***
When I call the data back up, it has dropped a ton of observations, specifically all states alphabetically from A to P (up to Pennsylvania).
I can't figure out why. When I dropped the observations for years before 2000, it looks like R created a row.names variable in the data frame to show which rows were deleted. Could this be why? When I call names() on the data frame, though, it doesn't list 'row.names' as a variable.
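To illustrate what I mean about row.names, here is a small toy example (made-up data, not my real dataset). As far as I can tell, subsetting a data frame keeps the original row labels, which is why the numbering has gaps, and those labels are an attribute of the data frame rather than a column, which would explain why `names()` doesn't list them.

```r
# Toy data frame with six rows (hypothetical values)
toy <- data.frame(
  state = rep(c("Alabama", "Texas"), each = 3),
  year  = rep(c(1999, 2001, 2002), times = 2)
)

kept <- toy[toy$year > 2000, ]

rownames(kept)  # original row labels are carried over: "2" "3" "5" "6"
names(kept)     # only the real columns: "state" "year"

# Renumber the rows 1..n if the gaps are a nuisance
rownames(kept) <- NULL
rownames(kept)  # "1" "2" "3" "4"
```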
I'm totally confused. Someone told me that R does not like to manipulate data when it is a data frame, so I tried converting the data to a matrix, manipulating it, and converting it back to a data frame, but that didn't work either.