How about simply looping over the N shifts (assuming N is small):
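addZeros <- function(x, N = 3) {
  xx <- x
  z <- x - 1  # zeros become -1, ones become 0
  for (i in 1:N) {
    # shift z right by i places, so each zero pulls down the element i places after it
    xx <- xx + c(rep(0, i), z[-c((NROW(x) - i + 1):NROW(x))])
  }
  xx[xx < 0] <- 0  # anything pulled below zero is set back to 0
  xx
}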
This simply turns every zero into -1 (z <- x - 1) and adds that -1 to each of the N succeeding values; anything that ends up below zero is then set to 0.
> x <- c(1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,0,0,1,0,1)
> x
[1] 1 1 1 1 1 1 1 1 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 1 1 1 1 1
[39] 1 1 1 1 1 1 0 0 1 0 1
> addZeros(x)
[1] 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1
[39] 1 1 1 1 1 1 0 0 0 0 0
EDIT:
After reading your description of the data on the R-help mailing list, it is clear that this is not a case of small N. Hence, you might want to consider a C function for this.
In the file "addZeros.c":
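/* x: 0/1 vector of length *n, modified in place; *N: number of ones to zero after each run of zeros */
void addZeros(int *x, int *N, int *n)
{
    int i, j;
    /* scan backwards so that zeros written here are never mistaken for original zeros */
    for (i = *n - 1; i > 0; i--)
    {
        /* a 0 followed by a 1 marks the end of a run of zeros */
        if ((x[i - 1] == 0) && (x[i] == 1))
        {
            j = 0;
            /* zero at most *N ones, stopping at the end of x or at the next zero */
            while ((j < *N) && (i + j < *n) && (x[i + j] == 1))
            {
                x[i + j] = 0;
                j++;
            }
        }
    }
}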
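Before building the shared library, it can be handy to check the C routine on its own. The following is only a sketch: it repeats the function from above so that it compiles stand-alone, and checkAddZeros is a made-up helper name (not part of addZeros.c) that feeds in the example vector from the top of this answer with N = 3 and compares the result with the R output shown there.

```c
#include <assert.h>
#include <stdio.h>

/* Same routine as in addZeros.c; every argument is a pointer because
   that is how R's .C() interface passes them. */
void addZeros(int *x, int *N, int *n)
{
    int i, j;
    for (i = *n - 1; i > 0; i--) {
        if ((x[i - 1] == 0) && (x[i] == 1)) {
            j = 0;
            while ((j < *N) && (i + j < *n) && (x[i + j] == 1)) {
                x[i + j] = 0;
                j++;
            }
        }
    }
}

/* Hypothetical helper (not in the original file): returns the number of
   positions where the C result differs from the R result for N = 3. */
int checkAddZeros(void)
{
    /* the example input x from the R session above */
    int x[] = {
        1,1,1,1,1,1,1,1,
        0,0,0,0,
        1,1,1,1,1,1,1,1,1,1,1,1,
        0,0,0,0,0,0,0,0,0,
        1,1,1,1,1,1,1,1,1,1,1,
        0,0,1,0,1
    };
    /* the output of addZeros(x) from the R session above */
    int expected[] = {
        1,1,1,1,1,1,1,1,
        0,0,0,0,0,0,0,
        1,1,1,1,1,1,1,1,1,
        0,0,0,0,0,0,0,0,0,0,0,0,
        1,1,1,1,1,1,1,1,
        0,0,0,0,0
    };
    int N = 3;
    int n = (int)(sizeof x / sizeof x[0]);
    int k, mismatches = 0;

    /* both vectors must have the same length */
    assert(sizeof x == sizeof expected);

    addZeros(x, &N, &n);
    for (k = 0; k < n; k++) {
        if (x[k] != expected[k]) {
            /* report 1-based positions, as R would */
            printf("mismatch at position %d: got %d, expected %d\n",
                   k + 1, x[k], expected[k]);
            mismatches++;
        }
    }
    return mismatches;
}
```

Calling it from a one-line main (for instance `int main(void) { return checkAddZeros(); }`) and compiling with any C compiler should exit with status 0 if the two implementations agree on this input.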
In a command prompt (on Windows, press Win+R and type cmd), run "R CMD SHLIB addZeros.c". If R is not on the path (i.e. Windows reports that R is not a recognised command), you need to give the full path to the R executable. On my system:
"c:\Program Files\R\R-2.10.1\bin\R.exe" CMD SHLIB addZeros.c
On Windows this should produce a DLL (a .so file on Linux). If you do not already have Rtools, you need to download and install it first (it is a collection of tools, such as Perl and MinGW). Download the newest version from
http://www.murdoch-sutherland.com/Rtools/
The R wrapper function for this would be:
addZeros2 <- function(x, N) {
  # load the compiled library on first use (looked up in the working directory)
  if (!is.loaded("addZeros"))
    dyn.load(file.path(paste("addZeros", .Platform$dynlib.ext, sep = "")))
  .C("addZeros",
     x = as.integer(x),
     as.integer(N),
     as.integer(NROW(x)))$x
}
Note that R's working directory should be the directory containing the DLL (on my system, setwd("C:/Users/eyjo/Documents/Forrit/R/addZeros")) before addZeros2 is called for the first time (alternatively, just give the full path to the DLL file in dyn.load). It is good practice to keep such files in a sub-directory of the project (e.g. "c"); then just add "c/" in front of "addZeros" in the file path.
To illustrate:
> x <- rbinom(1000000, 1, 0.9)
>
> system.time(addZeros(x, 10))
   user  system elapsed
   0.45    0.14    0.59
> system.time(addZeros(x, 400))
   user  system elapsed
  15.87    3.70   19.64
>
> system.time(addZeros2(x, 10))
   user  system elapsed
   0.01    0.02    0.03
> system.time(addZeros2(x, 400))
   user  system elapsed
   0.03    0.00    0.03
>
Here addZeros is my original suggestion in plain R, and addZeros2 is the wrapper around the C function.