Edit distance (Levenshtein-Distance) algorithm explanation

Problem Detail: I want to calculate the edit distance (aka Levenshtein-Distance) between two words: «solo» and «oslo». According to this site we’ll get the result matrix: Edit distance (Levenshtein-Distance) calculation matrix What I don’t understand is: In case of comparison the last «o» from «solo» with the first «o» of «oslo» will see the submatrix: 3 2
4
3 As far as I understand, in order to calculate the bottom right value, which is equal in this example to 3 of this submatrix, we’ll get the min(3, 2, 4) + (0 if the letters are the same, otherwise 1). So, why in this example the bottom right value is equal to 3 and not to 2?

Asked By : Mike B.

Answered By : Mike B.

If we look deeply at algorithm implementation in C: http://en.m.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#C we’ll see the main string which fills the matrix:

matrix[x][y] = MIN3(matrix[x-1][y] + 1, matrix[x][y-1] + 1,                      matrix[x-1][y-1] + (s1[y-1] == s2[x-1] ? 0 : 1)); 

as we can see, the value of cells with indexes matrix[x-1][y] (upper cell) and matrix[x][y-1] (left cell) in automatically increased, the upper-left cell (by diagonal) with index matrix[x-1][y-1] is increased by 1 only if the letters are different and only after that we’ll take a minimum from these 3 values. In other words: 3 2
4
min(3+0, 2+1, 4+1) = 3

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/28734

Leave a Reply