Has anyone worked with the NY police stops data mentioned in Gelman and Hill's book Data Analysis Using Regression and Multilevel/Hierarchical Models (ARM)? The data is under

http://www.stat.columbia.edu/~gelman/arm/examples/police/

The file is frisk_with_noise.dat. I removed the description part of this data, renamed past.arrests to arrests, and saved it as frisk.dat. I then called glm from R like this:

library ("foreign")
frisk <- read.table ("frisk.dat", header=TRUE)
attach (frisk)
glm(formula = stops ~ 1, family=poisson, offset=log(arrests))

The glm call is straight out of the ARM book. In any case, I get the error:

Error: NA/NaN/Inf in foreign function call (arg 4)

Any ideas? Gelman has a piece of code in the same directory called police_setup.R that is supposed to contain some cleanup code, but that doesn't work either.

+1  A: 

I haven't gone back to look at exactly what Gelman is doing in this chapter (my copy of the book is in storage ...), but the specific problem with this example is that 'arrests' is zero in some cases, so log(arrests) is -Inf for those rows and cannot be used as an offset. (You don't need library(foreign), and using a data argument to glm is usually safer/better than using attach().)

X <- read.table("frisk_with_noise.dat",skip=6,header=TRUE)
names(X)[3] <- "arrests"
glm(stops~1,family=poisson,offset=log(arrests),data=X,
    subset=arrests>0)
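
To see why the zeros matter, here is a minimal sketch on made-up numbers. The `toy` data frame and its values are hypothetical and are not taken from frisk_with_noise.dat; the exact wording of the error may differ between R versions.

```r
# Toy data (hypothetical values, NOT from frisk_with_noise.dat);
# the third row has zero arrests.
toy <- data.frame(stops   = c(5, 12, 3, 8),
                  arrests = c(10, 40, 0, 20))

# log(0) is -Inf, so the offset is not finite for the third row.
log(toy$arrests)

# Fitting with the zero row included fails; try() captures the error
# so the script keeps running.
fit_all <- try(glm(stops ~ 1, family = poisson,
                   offset = log(arrests), data = toy),
               silent = TRUE)
inherits(fit_all, "try-error")

# Dropping the zero-arrest rows via subset lets the fit go through.
fit_pos <- glm(stops ~ 1, family = poisson,
               offset = log(arrests), data = toy,
               subset = arrests > 0)
coef(fit_pos)
```

For an intercept-only Poisson model with a log offset, the fitted intercept is log(sum(stops) / sum(arrests)) over the rows that are kept, which gives a quick sanity check on the result.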
Ben Bolker
I ran it like you show, and it says:

(Intercept) -0.5877
Degrees of Freedom: 898 Total (i.e. Null); 898 Residual
Null Deviance: 184000
Residual Deviance: 184000 AIC: 189300

The book has:

(Intercept) -3.4 0.0
n = 225, k = 1
residual deviance = 44877, null deviance = 44877 (difference = 0)

This is an improvement though, thanks.