Back in 2015 the String.levenshtein_distance/2
function was removed, and replaced with the String.jaro_distance/2
we have now.
This stackoverflow answer explains the difference between the two algorithims.
My question is, why was levenshtein distance removed, when it has certain use cases that Jaro does not satisfy. To go deeper, why are any of these string comparisons still in STDlib? I do find them immensely useful, but it does seem odd to have them as an incomplete set.
Interesting enough, both the Jaro implementation we have in STDlib right now and the older levenshtein implementation both have “alternates” that aim to improve their ability to differentiate string distance. Jaro has the winkler modification, which gives favorability to strings that share longer prefixes, and levenshtein has the damerau modification, which allows it to recognize transpositions, and the Boehmer & Rees modification, which allows for arbitrary length transpositions.