StringTools
ApproximateSearch
find the first approximate occurrence of a string in another string
ApproximateSearchAll
find all approximate occurrences of a string in another string
Calling Sequence
Parameters
Description
Examples
ApproximateSearch( pattern, text, k )
ApproximateSearchAll( pattern, text, k )
pattern - string; pattern to search for
text - string; text to search
k - non-negative integer; maximum edit distance
The ApproximateSearch(pattern, text, k) command locates the first substring of the string text whose Levenshtein (edit) distance from the string pattern is less than or equal to k. As in the examples on this page, the value returned is the offset marking the end of that match, or 0 if no substring of text is within distance k of pattern.
The ApproximateSearchAll(pattern, text, k) command locates all occurrences of substrings of the string text that are within Levenshtein distance k of the string pattern. An expression sequence of offsets marking the ends of matches is returned.
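The behavior described above can be modeled outside Maple. The following is a minimal Python sketch, not the algorithm that StringTools actually uses (this page does not say which one it is). It assumes a standard dynamic-programming scheme for approximate matching in which a match may start at any position in the text. The function names approximate_search_all and approximate_search are hypothetical stand-ins, and returning 0 for "no match" is taken from the k = 0 example on this page.

```python
def approximate_search_all(pattern: str, text: str, k: int) -> list:
    """Return 1-based end offsets of substrings of text within edit distance k of pattern."""
    m = len(pattern)
    # col[i] is the smallest edit distance between pattern[:i] and any
    # substring of text that ends at the current text position.
    # Before any text is read, pattern[:i] can only be matched by i deletions.
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        prev_diag = col[0]  # col[i - 1] as it was at the previous text position
        # col[0] stays 0 because a match may start anywhere in text.
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            new = min(prev_diag + cost,  # match or substitution
                      col[i] + 1,        # extra character in text
                      col[i - 1] + 1)    # character of pattern missing from text
            prev_diag = col[i]
            col[i] = new
        if col[m] <= k:
            ends.append(j)
    return ends


def approximate_search(pattern: str, text: str, k: int) -> int:
    """Return the first end offset, or 0 when nothing matches (assumed convention)."""
    ends = approximate_search_all(pattern, text, k)
    return ends[0] if ends else 0
```

Under these assumptions the sketch agrees with the outputs shown in the examples on this page for "foo" in "defoe" with k from 0 to 3, and for "gataa" in "cagataagagaa" with k = 2. It runs in time proportional to length(pattern) times length(text).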
Note that for k=0, approximate searching degenerates into exact searching, for which StringTools[Search] and StringTools[SearchAll] provide faster algorithms.
If length(pattern) ≤ k, then every substring of text whose length equals the length of pattern matches pattern.
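The reason is that two strings of the same length m are never more than m edits apart, since substituting every character is always possible. A small Python check of this bound is sketched below. The helper levenshtein is a plain textbook implementation written for this illustration, not the Maple command StringTools[Levenshtein].

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        cur = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j - 1] + (ca != cb),  # match or substitution
                           prev[j] + 1,               # deletion from a
                           cur[j - 1] + 1))           # insertion into a
        prev = cur
    return prev[-1]


pattern, text = "foo", "defoe"
m = len(pattern)
# Every substring of text that has the same length as pattern.
windows = [text[s:s + m] for s in range(len(text) - m + 1)]
distances = {w: levenshtein(pattern, w) for w in windows}
print(distances)  # each value is at most m
```

With k at least m = 3, all three windows of "defoe" therefore count as matches for "foo".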
For a related concept that uses the Hamming metric instead of the edit distance, see StringTools[HammingSearch].
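To illustrate how the two metrics differ, here is a short Python sketch of the Hamming distance, which counts mismatched positions and is defined only for strings of equal length. The function name hamming is chosen for this illustration and says nothing about how StringTools[HammingSearch] is implemented or what it returns.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


# A one-character rotation changes every position, so the Hamming distance is large.
shifted = hamming("abcde", "bcdea")
# A single substitution changes exactly one position.
one_sub = hamming("gataa", "gagaa")
print(shifted, one_sub)
```

The rotated pair "abcde" and "bcdea" is only 2 edits apart in the Levenshtein sense (delete the leading "a", then append "a"), so an edit-distance search tolerates shifts that a Hamming search does not.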
All of the StringTools package commands treat strings as (null-terminated) sequences of 8-bit (ASCII) characters. Thus, there is no support for multibyte character encodings, such as Unicode encodings.
> with(StringTools):
> ApproximateSearch("foo", "defoe", 0);
0
> ApproximateSearch("foo", "defoe", 1);
4
> ApproximateSearch("foo", "defoe", 2);
3
> ApproximateSearch("foo", "defoe", 3);
1
> ApproximateSearch("foo", "defoe", 4);
> ApproximateSearch("gataa", "cagataagagaa", 2);
5
> ApproximateSearchAll("gataa", "cagataagagaa", 2);
5, 6, 7, 8, 9, 11, 12
See Also
searchtext
string
StringTools[HammingDistance]
StringTools[HammingSearch]
StringTools[HammingSearchAll]
StringTools[Levenshtein]
StringTools[Search]
StringTools[SearchAll]