ansaurus

Question

Answer 1

A:

@dictionary.inject(@text) {|text, d|
  text.gsub d[:from], d[:to]
}

NV 2010-02-09 15:27:00

With `@dictionary = [{:to=>"crazy bob", :from=>"lazy"}, {:to=>"mad ben", :from=>"crazy bob"}]` the result is `"quick brown fox jumps over the mad ben dog"`. As you can see, the same text is substituted twice. Does anybody know a one-pass solution (one that does not override previous replacements)? Good work @NV, thanks.

astropanic 2010-02-09 15:38:46

What output do you expect? "quick brown fox jumps over the mad ben dog" or not?

NV 2010-02-09 15:49:47

What I meant in my previous comment: the first substitution, from lazy to crazy bob, is later replaced by the one from crazy bob to mad ben. It shouldn't replace existing substitutions.

astropanic 2010-02-09 17:51:02
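One way to avoid the double substitution described in the comments above is to match all `:from` strings in a single pass, so replaced text is never rescanned. The following is a minimal sketch, not the answerer's method: the helper name `replace_once` is hypothetical, and it assumes Ruby 1.9 or later (for `each_with_object`).

```ruby
# Hypothetical helper: replaces every :from string in one pass over the text.
def replace_once(text, dictionary)
  # Turn the array of {:from, :to} pairs into a from => to lookup.
  lookup = dictionary.each_with_object({}) { |pair, h| h[pair[:from]] = pair[:to] }
  # Regexp.union escapes each string; longer keys go first so that
  # "crazy bob" is preferred over a shorter key starting at the same position.
  pattern = Regexp.union(lookup.keys.sort_by { |key| -key.length })
  # Each match is replaced exactly once; the replacement is not rescanned.
  text.gsub(pattern) { |match| lookup[match] }
end

dictionary = [{:to=>"crazy bob", :from=>"lazy"}, {:to=>"mad ben", :from=>"crazy bob"}]
puts replace_once("quick brown fox jumps over the lazy dog", dictionary)
# prints: quick brown fox jumps over the crazy bob dog
```

Because `gsub` continues scanning after each replacement rather than inside it, the inserted "crazy bob" is not turned into "mad ben".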

Answer 2

A:

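# Build a from => to lookup. The block must return the accumulator (res);
# returning the assigned string instead makes the next iteration raise IndexError.
@enhaced_dictionary = @dictionary.inject({}) {|res, e| res[e[:from]] = e[:to]; res }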
@compiled = @text.split(/\s/).map do |e|
  @enhaced_dictionary[e] ?  @enhaced_dictionary[e] : e
end.join(' ')

clyfe 2010-02-09 15:33:19

`IndexError: string not matched`

Jonas Elfström 2010-02-09 16:06:09

Answer 3

+3 A:

@dictionary.each do |pair|
  @text.gsub!(/#{pair[:from]}/, pair[:to])
end

Or if you'd prefer it on a single line:

@dictionary.each { |pair| @text.gsub!(/#{pair[:from]}/, pair[:to]) }

It's exactly the same code, just using { } instead of do end for the block (braces are the usual Ruby convention for single-line blocks).

mlambie 2010-02-09 15:34:29

"all occurences", so instead of `sub` should be `gsub`.

NV 2010-02-09 15:42:04

Good point, thanks NV. Updated now.

mlambie 2010-02-09 15:43:09
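One caveat with interpolating `pair[:from]` into a regex literal, as the `gsub!` answer above does: any regex metacharacters in the dictionary entry are interpreted as such. A small sketch of the difference, using made-up sample data and the standard `Regexp.escape`:

```ruby
# Sample data is hypothetical; the "." in "1.5" is a regex wildcard when unescaped.
text = "price: 1.5 or 105"
pair = {:from => "1.5", :to => "2"}

# Unescaped: /1.5/ also matches "105".
unescaped = text.gsub(/#{pair[:from]}/, pair[:to])

# Escaped: only the literal string "1.5" matches.
escaped = text.gsub(/#{Regexp.escape(pair[:from])}/, pair[:to])

puts unescaped  # prints: price: 2 or 2
puts escaped    # prints: price: 2 or 105
```

Passing the string itself as the pattern, as in `text.gsub(pair[:from], pair[:to])`, also matches literally.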

Answer 4

+1 A:

If the dictionary contained only single words, without the multi-word pair {"over the"=>"under the"}, I think something like this would be faster than scanning the string over and over again, as most of the solutions here do.

First I convert the array to a pure Hash

h=Hash.new
@dictionary.each {|ft| h[ft[:from]]=ft[:to]}
=> {"quick"=>"lazy", "over the"=>"under the", "jumps"=>"flies"}

then I scan the string word by word

@text.split(/ /).map{|w| h[w] || w}.join(" ")
=> "lazy brown fox flies over the lazy dog"

Also it doesn't suffer from the multiple substitution problem.

h["brown"]="quick"
=> {"brown"=>"quick", "quick"=>"lazy", "over the"=>"under the", "jumps"=>"flies"}
@text.split(/ /).map{|w| h[w] || w}.join(" ")
=> "lazy quick fox flies over the lazy dog"

I did some benchmarks, and I had to add many more replacement pairs than I expected before the solution above caught up with gsub!.

require 'benchmark'

@dictionary = [{:to=>"lazy", :from=>"quick"}, {:to=>"flies", :from=>"jumps"}, {:from => "over the", :to => "under the"}]
@text = "quick brown fox jumps over the lazy dog" * 10000
Benchmark.bm do |benchmark|
  benchmark.report do
    h=Hash.new
    @dictionary.each {|ft| h[ft[:from]]=ft[:to]}
    @text.split(/ /).map{|w| h[w] || w}.join(' ')
  end
  benchmark.report do
    @dictionary.each { |pair| @text.gsub!(/#{pair[:from]}/, pair[:to]) }
  end

  @dictionary+=[{:to=>"black", :from=>"brown"}, {:to=>"ox", :from=>"fox"}, {:to=>"hazy", :from=>"lazy"}, {:to=>"frog", :from=>"dog"}]
  @dictionary=@dictionary*15

  benchmark.report do
    h=Hash.new
    @dictionary.each {|ft| h[ft[:from]]=ft[:to]}
    @text.split(/ /).map{|w| h[w] || w}.join(' ')
  end
  benchmark.report do
    @dictionary.each { |pair| @text.gsub!(/#{pair[:from]}/, pair[:to]) }
  end
end

The results:

      user     system      total        real
  0.890000   0.060000   0.950000 (  0.962106)
  0.200000   0.020000   0.220000 (  0.217235)
  0.980000   0.060000   1.040000 (  1.042783)
  0.980000   0.030000   1.010000 (  1.011380)

The gsub! solution was about 4.5 times faster with only three replacement pairs. At 105 replacement pairs the split solution is finally as fast: it only got about 10% slower going from three pairs to 105, while the gsub! solution got about five times slower.

Jonas Elfström 2010-02-09 15:48:12

I don't think `split` and `join` are a good idea. What if `@text = " lazy quick fox \n flies over \t the lazy dog"`?

NV 2010-02-09 16:02:56

`@text.split(/ /).each{|w| h[w] || w}.join(" ")` `=> " lazy quick fox \n flies over \t the lazy dog"`

Jonas Elfström 2010-02-09 16:26:15
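If preserving arbitrary whitespace is a concern, one alternative to `split` and `join` is to substitute word by word with a single `gsub` over runs of non-whitespace, which leaves spaces, tabs and newlines untouched. This is only a sketch: the helper name `replace_words` and the sample data are hypothetical, and like the split approach it cannot handle multi-word keys such as "over the" or words with attached punctuation.

```ruby
# Hypothetical helper: looks up each whitespace-delimited word in a hash,
# keeping the original whitespace between words exactly as it was.
def replace_words(text, lookup)
  # /\S+/ matches each run of non-whitespace; fetch falls back to the word itself.
  text.gsub(/\S+/) { |word| lookup.fetch(word, word) }
end

lookup = {"quick"=>"lazy", "jumps"=>"flies"}
text = " lazy quick fox \n jumps over \t the lazy dog"
result = replace_words(text, lookup)
p result
# prints: " lazy lazy fox \n flies over \t the lazy dog"
```

Each word is looked up once, so this also avoids the multiple-substitution problem.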


Replace words from a dictionary