I've been banging my head against rpart for a few days now (trying to make classification trees for a dataset that I have), and I think it's time to ask for a lifeline at this point :-) I'm sure it's something silly that I'm not seeing, but here's what I've been doing:

EuropeWater <- read.csv(file=paste("/Users/artessaniccola/Documents/",
                       "Magic Briefcase/CityTypology/Europe_water.csv",sep=""))
library(rpart)
attach(EuropeWater)
names(EuropeWater)
[1] "City"          "waterpercapita_m3" "water_class"       "population"       
[5] "GDPpercapita"  "area_km2"          "populationdensity" "climate"            
EuropeWater$water_class <- factor(EuropeWater$water_class, levels=1:3, 
                                  labels=c("Low", "Medium", "High"))
EuropeWater$climate <- factor(EuropeWater$climate, levels=2:4, 
                              labels=c("Arid", "Warm temperate", "Snow"))
EuropeWater_tree <- rpart(EuropeWater$water_class ~ 
               population+GDPpercapita + area_km2 + populationdensity + 
               EuropeWater$climate, 
               data=EuropeWater, method=class)   
Error in as.character(x) : 
          cannot coerce type 'builtin' to vector of type 'character'

and for the life of me, I can't figure out what the Error is about.

Any thoughts would be most appreciated!!!!

+1  A: 

I would start by fixing the formula: remove the redundant EuropeWater$ prefixes, since you already supply the data= argument:

res <- rpart(water_class ~ population + GDPpercapita + area_km2 + 
                           populationdensity + climate, 
             data=EuropeWater, method="class")

Also, make sure that all columns of your data.frame are of the appropriate type. Maybe some of the data read from the csv file was mistakenly read as a factor? A quick summary(EuropeWater) may reveal this.
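As a rough sketch of that type check, the snippet below builds a small made-up data frame (the name `df` and all values are hypothetical stand-ins for the CSV, not the real data), recodes two columns as factors the same way the question does, and then lists the class of each column:

```r
# Hypothetical stand-in for Europe_water.csv; the values are invented.
df <- data.frame(
  population  = c(500000, 1200000, 800000),
  water_class = c(1, 3, 2),
  climate     = c(3, 4, 2)
)

# Recode the integer codes as labelled factors, as in the question.
df$water_class <- factor(df$water_class, levels = 1:3,
                         labels = c("Low", "Medium", "High"))
df$climate <- factor(df$climate, levels = 2:4,
                     labels = c("Arid", "Warm temperate", "Snow"))

# One class per column: predictors meant to be numeric should say
# "numeric" or "integer", and the response should be a factor.
col_classes <- sapply(df, class)
print(col_classes)

# str() shows the same information plus the first few values.
str(df)
```

A column that was expected to be numeric but shows up as "factor" or "character" here usually points at stray text in the CSV.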

Dirk Eddelbuettel
I think `str` could be better for a quick view of the `data.frame` content.
Marek
Sure, or `sapply(EuropeWater, class)` to get class-by-column -- there are many options.
Dirk Eddelbuettel
+5  A: 

Does this work:

EuropeWater_tree <- rpart(EuropeWater$water_class ~ 
 population+GDPpercapita + area_km2 + populationdensity + EuropeWater$climate, 
 data=EuropeWater, method="class")

I think you should quote the method type.
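To illustrate why the quotes matter: unquoted, `class` is the base R function of that name, not a string. The sketch below reproduces the same coercion error outside of rpart, then fits a tree with the method quoted. It uses the `kyphosis` data that ships with rpart rather than the poster's file, and where exactly rpart performs the coercion internally is an assumption here:

```r
library(rpart)  # provides rpart() and the kyphosis example data

# Unquoted, `class` refers to a base R function, not to text.
class_type <- typeof(class)
print(class_type)

# Coercing that function to character fails; capture the message
# instead of stopping the script.
err_msg <- tryCatch(as.character(class),
                    error = function(e) conditionMessage(e))
print(err_msg)

# With the method given as a string, the classification tree fits.
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data = kyphosis, method = "class")
print(fit)
```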

Marek
Good catch! I'll amend my reply too.
Dirk Eddelbuettel
Adding quotes to the method type was exactly the problem. Thank you!!!