Fast Levenshtein distance in R?
Is there a package that contains a Levenshtein distance function implemented in C or Fortran? I have many strings to compare, and stringMatch from MiscPsycho is too slow for this. ...
I have data that looks like this: for_y_axis <-c(0.49534,0.80796,0.93970,0.99998) for_x_axis <-c(1,2,3,4) count <-c(0,33,0,4) What I want to do is plot the graph using for_x_axis and for_y_axis, but mark the point with "o" if the count value is equal to 0 (zero) and with "x" if the count value is greater than zero. Is th...
Good afternoon, After computing a rather large vector (a bit shorter than 2^20 elements), I have to store the result in a database. The script takes about 4 hours to execute with simple code such as: #Do the processing myVector<-processData(myData) #Sends everything to the database lapply(myVector,sendToDB) What do you think is ...
Suppose I have two vectors of the same length: x <-c(0.49534,0.80796,0.93970,0.99998) count <-c(0,33,0,4) How can I split the vector 'x' into two vectors: a vector grzero that contains the values in x whose count value is greater than 0, and a vector eqzero with the values in x whose count value is equal to zero. Yielding > print(grzero) > [1] ...
When I load a library (with a NAMESPACE), the functions .onLoad and .onAttach are called, as is .onUnload when I detach the library, unloading the namespace. I was wondering whether R defines a way that would save me the work of detaching/unloading the library by hand in each of my scripts that use the xxx library. For this I would ...
I wish to know what type of objects I've got in my environment. I can list them like this: ls() But running something like sapply(ls(), class) would (obviously) not tell us what type (class) of objects we have (function, numeric, factor and so on...). Using ls.str() will tell me what class my objects are, but I w...
(This question might be too difficult, and maybe not worth the hassle to solve - however, if there is an easy solution - I would be curious to know) Let's say I create an image (using the grid package) which looks like this: require(grid) grid.newpage() grid.polygon(x=c((0:4)/10, rep(.5, 5), (10:6)/10, rep(.5, 5)), y=c(rep...
Hello! In R, I would like to create a loop which takes the first 3000 columns of my data frame and writes them into one file, the next 3000 columns into another file, and so on and so forth until all columns have been divided as such. What would be the best way to do this? I understand there are the isplit and iterators functions avai...
What if you want to apply a function other than format to a list of POSIXct objects? For instance, say I want to take a vector of times, truncate those times to the hour, and apply an arbitrary function to each one of those times. > obs.times=as.POSIXct(c('2010-01-02 12:37:45','2010-01-02 08:45:45','2010-01-09 14:45:53')) > obs.truncat...
I've got a scatter plot. I'd like to scale the size of each point by its frequency. So I've got a frequency column of the same length. However, if I do: ... + geom_point(size=Freq) I get this error: When _setting_ aesthetics, they may only take one value. Problems: size which I interpret as all points can only have 1 size. So how w...
I'm trying to figure out how to use merge() to update a database. Here is an example. Take for example the data frame foo foo <- data.frame(index=c('a', 'b', 'c', 'd'), value=c(100, 101, NA, NA)) Which has the following values index value 1 a 100 2 b 101 3 c NA 4 d NA And the data frame bar bar <- data....
I have a data frame in R with a POSIXct variable sessionstarttime. Each row is identified by an integer ID variable of a specified location. The number of rows is different for each location. I plot the overall graph simply by: myplot <- ggplot(bigMAC, aes(x = sessionstarttime)) + geom_freqpoly() Is it possible to create a loop that will create a...
Hi All, Lately I have seen some cool examples of mapping in R and wanted to give this a shot. I currently have ArcView at work, but my spatial join is not working correctly (most likely user error). Objective: I need a list of countries and what World Region they belong to. I have two layers (one country detail, the other region de...
My main class folder was named com.test. I changed it to com.myApplication, and now when I add objects to my layout, my R.java won't get updated. All the objects that I had before work fine and are in R.java, but anything new that I create won't get added to R.java. Any idea how to resolve this problem? Thanks ...
Hi, I'm stuck with a simple loop that takes more than an hour to run, and need help to speed it up. Basically, I have a matrix with 31 columns and 400 000 rows. The first 30 columns have values, and the 31st column has a column-number. I need to, per row, retrieve the value in the column indicated by the 31st column. Example row: [26...
I want to get the indices of the non-zero elements in a matrix. For example, X <- matrix(c(1,0,3,4,0,5), byrow=TRUE, nrow=2); should give me something like this: row col 1 1 1 3 2 1 2 3 Can anyone please tell me how to do that? Thank you ...
Is there a simple way in R to extract only the text elements of an HTML page? I think this is known as 'screen scraping', but I have no experience of it; I just need a simple way of extracting the text you'd normally see in a browser when visiting a URL. ...
I am wondering how to use apply on a multidimensional array. I have something like the following: A <- array(0, c(2, 2, 5)) for(i in 1:5) { A[, , i] <- matrix(rnorm(4), 2, 2) } I would like to take the average of those slices to get a single 2 by 2 matrix. Any way I come up with is pretty kludgy. I was hoping to be able to use ...
Hello guys, I'm trying to order this dataframe by population and date, so I'm using the order() and rank() functions: idgeoville date population 1 5 1950 500 2 8 1950 450 3 4 1950 350 4 3 1950 350 5 4 2000 650 6 5 ...
I want to use random forests for attribute reduction. One problem with my data is that I don't have a discrete class, only a continuous one, which indicates how an example differs from 'normal'. This class attribute is a kind of distance, from zero to infinity. Is there any way to use a random forest for such data? ...