mirror of https://github.com/rails/rails.git synced 2022-11-09 12:12:34 -05:00
rails--rails/guides/rails_guides/levenshtein.rb
Yuki Nishijima f7ba69436a Speed up Levenshtein by 50% and reduce 97% of memory usage
Calculating -------------------------------------
             each_char   924.000  i/100ms
        each_codepoint     1.381k i/100ms
  -------------------------------------------------
             each_char      9.320k (± 5.1%) i/s -     47.124k
        each_codepoint     13.857k (± 3.6%) i/s -     70.431k

  Comparison:
        each_codepoint:    13857.4 i/s
             each_char:     9319.5 i/s - 1.49x slower

The full report can be found here:
  https://gist.github.com/yuki24/a80988f35aceac76f1d5
2015-04-11 15:28:23 -07:00
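
The benchmark above compares two ways of walking a string. As a minimal sketch of the difference (the sample word is an arbitrary choice, and the explanation in the comments is an assumption about why the numbers improve, not something the commit message states): `each_char` yields one-character String objects, while `each_codepoint` yields Integers, which are cheaper to create and compare.

```ruby
# Arbitrary sample word; \u00E9 is the precomposed "é".
word = "r\u00E9sum\u00E9"

# each_char yields a new one-character String for every character.
chars = word.each_char.to_a

# each_codepoint yields the Integer code point of every character.
# Assumption: avoiding per-character String objects is what saves memory.
codepoints = word.each_codepoint.to_a

puts chars.inspect
puts codepoints.inspect
```

Both enumerations have the same length, so the distance computed over code points matches the one computed over characters.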


module RailsGuides
  module Levenshtein
    # This code is based directly on the Text gem implementation
    # Returns a value representing the "cost" of transforming str1 into str2
    def self.distance(str1, str2)
      s = str1
      t = str2
      n = s.length
      m = t.length
      return m if (0 == n)
      return n if (0 == m)

      d = (0..m).to_a
      x = nil

      # avoid duplicating an enumerable object in the loop
      str2_codepoint_enumerable = str2.each_codepoint

      str1.each_codepoint.with_index do |char1, i|
        e = i+1

        str2_codepoint_enumerable.with_index do |char2, j|
          cost = (char1 == char2) ? 0 : 1
          x = [
            d[j+1] + 1, # insertion
            e + 1,      # deletion
            d[j] + cost # substitution
          ].min
          d[j] = e
          e = x
        end

        d[m] = x
      end

      return x
    end
  end
end
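
For comparison, the same distance can be computed with a full table instead of the single row `d` used above. This is only a sketch for cross-checking: `full_matrix_distance` is a hypothetical helper, not part of RailsGuides or the Text gem, and it trades memory for readability.

```ruby
# Hypothetical reference implementation (not part of RailsGuides).
# table[i][j] holds the distance between the first i code points of str1
# and the first j code points of str2.
def full_matrix_distance(str1, str2)
  a = str1.codepoints
  b = str2.codepoints

  # First row and first column are the distances to or from an empty prefix.
  table = Array.new(a.length + 1) do |i|
    Array.new(b.length + 1) { |j| i.zero? ? j : (j.zero? ? i : 0) }
  end

  (1..a.length).each do |i|
    (1..b.length).each do |j|
      cost = a[i - 1] == b[j - 1] ? 0 : 1
      table[i][j] = [
        table[i - 1][j] + 1,       # cell above
        table[i][j - 1] + 1,       # cell to the left
        table[i - 1][j - 1] + cost # diagonal: match or substitution
      ].min
    end
  end

  table[a.length][b.length]
end

puts full_matrix_distance("kitten", "sitting")
```

The single-row version keeps only the previous row in `d` and the value to the left in `e`, so it should return the same result for any pair of strings while using memory proportional to the length of `str2`.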