Looking for something similar to

http://code.google.com/p/templatemaker/

but for Ruby...

Basically, comparing a set of random sample strings (100 of them) and extracting only the unique strings...

+1  A: 

You can put all the strings in an array and find the unique ones, like so:

string_array = ['string1', 'string2', 'string3', 'string4', 'string1', 'string1', 'string4' ]
unique_strings = string_array.uniq
#=> ['string1', 'string2', 'string3', 'string4']
nas
A: 

I don't know of anything that already exists, but it shouldn't be that difficult to write something.

As far as I can tell, something along the following lines should work:

  • when #learn is first called with a string, chunk that string based on some regular expression and store the chunks in an array
  • when #learn is next called with a string, chunk that string and compare the chunks to what has already been stored, looking for chunks that differ from the stored ones
    • for those chunks that differ, store some marker in the array in their place, maybe a symbol :hole (to use the terminology from the link)
  • when a string is then passed to #extract, chunk it again and compare it to the signature that was built by #learn; if there is a match, extract the chunks that lie in the place of the :hole symbols and return them as an array
  • the #as_text(string) method would simply replace the :hole symbols in the signature array with the supplied string, join the chunks back up and return the result
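
The steps above could be sketched roughly as follows. This is only a minimal illustration, not a port of templatemaker: the class name TemplateSketch, the HOLE and CHUNKER constants and the whitespace-based chunking regex are all assumptions made for the example, and it only handles samples that split into the same number of chunks.

```ruby
# Hypothetical sketch of the #learn / #extract / #as_text idea described above.
# The class name, constants and chunking regex are illustrative assumptions.
class TemplateSketch
  HOLE = :hole

  # Split into alternating runs of whitespace and non-whitespace, so that
  # joining the chunks reproduces the original string exactly.
  CHUNKER = /\s+|\S+/

  attr_reader :signature

  def initialize
    @signature = nil
  end

  # First call stores the chunks; later calls replace differing chunks with HOLE.
  # Returns false (and learns nothing) if the chunk counts do not line up.
  def learn(string)
    chunks = string.scan(CHUNKER)
    if @signature.nil?
      @signature = chunks
      return true
    end
    return false unless chunks.length == @signature.length

    @signature = @signature.zip(chunks).map do |stored, chunk|
      stored == chunk ? stored : HOLE
    end
    true
  end

  # Returns the chunks sitting in the HOLE positions, or nil if the string
  # does not match the learned signature.
  def extract(string)
    chunks = string.scan(CHUNKER)
    return nil if @signature.nil? || chunks.length != @signature.length

    holes = []
    @signature.zip(chunks).each do |stored, chunk|
      if stored == HOLE
        holes << chunk
      elsif stored != chunk
        return nil
      end
    end
    holes
  end

  # Fills every HOLE with the supplied string and joins the chunks back up.
  def as_text(filler)
    (@signature || []).map { |chunk| chunk == HOLE ? filler : chunk }.join
  end
end

template = TemplateSketch.new
template.learn("Hello, Alice! Welcome back.")
template.learn("Hello, Bob! Welcome back.")
template.extract("Hello, Carol! Welcome back.") #=> ["Carol!"]
template.as_text("NAME")                        #=> "Hello, NAME Welcome back."
```

Because the comparison is position by position, a sample with an extra word shifts every later chunk and is rejected. Handling that would need some kind of alignment step (for example a longest-common-subsequence diff over the chunks), which is left out here.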

I'm sure there are issues with this algorithm, as it won't work exactly like the one in the supplied link, but it is a start. I just chucked it together.

But as far as I know there's nothing already available in Ruby to do this.

tobyclemson