StringTools
SimilarityCoefficient
computes the similarity coefficient of two strings
Calling Sequence
SimilarityCoefficient( s, t )
SimilarityCoefficient( s, t, n )
Parameters
s - Maple string
t - Maple string
n - (optional) positive integer
Description
The SimilarityCoefficient(s, t) command computes the similarity coefficient of two strings s and t, defined as follows.
Let N(S) denote the set of trigrams (substrings of length 3) of a string S. Then the similarity coefficient of s and t is
nops(N(s) ∪ N(t)) / nops(N(s) ∩ N(t))
that is, the number of distinct trigrams occurring in either string divided by the number occurring in both. By convention, strings that have no trigrams in common have similarity coefficient equal to infinity.
If the optional argument n is given, the similarity coefficient is computed from n-grams (substrings of length n) instead of the default trigrams.
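The definition above can be sketched outside Maple as follows. This is a minimal Python illustration of the formula only, not Maple's implementation: the names `ngrams` and `similarity_coefficient` are hypothetical, comparison is assumed to be case-sensitive, and a string shorter than n is assumed to contribute no n-grams.

```python
from fractions import Fraction
import math

def ngrams(s, n=3):
    """Return the set of distinct length-n substrings of s (hypothetical helper)."""
    return {s[i:i + n] for i in range(len(s) - n + 1)}

def similarity_coefficient(s, t, n=3):
    """Size of the n-gram union divided by size of the n-gram intersection.

    Returns an exact Fraction, or math.inf when no n-gram is shared.
    """
    a, b = ngrams(s, n), ngrams(t, n)
    common = a & b
    if not common:
        return math.inf  # convention: no shared n-grams means infinity
    return Fraction(len(a | b), len(common))

print(similarity_coefficient("Canada", "Canary"))        # 3
print(similarity_coefficient("Kline", "Cline"))          # 2
print(similarity_coefficient("Constance", "Connor", 4))  # inf
```

Because the union is in the numerator, two strings with identical n-gram sets give 1, and larger values mean that a smaller share of the n-grams is common to both strings.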
In typical applications, n is taken to be either 2 or 3 (the default). Note that Maple computes this measure as an exact rational quantity, rather than a floating-point approximation. You can obtain a floating-point result by applying evalf to the result.
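The difference between an exact rational result and a floating-point approximation can be illustrated with a short Python sketch. The counts used here (a union of 10 n-grams and an intersection of 4) are made-up values for illustration, and the conversion to `float` is only analogous in spirit to applying evalf in Maple.

```python
from fractions import Fraction

# Hypothetical counts: 10 distinct n-grams in the union, 4 in the intersection.
exact = Fraction(10, 4)   # stored exactly and reduced to lowest terms
approx = float(exact)     # floating-point approximation of the same value

print(exact)    # 5/2
print(approx)   # 2.5
```

Keeping the exact form avoids rounding when coefficients are compared or combined; converting to a float is only needed for display or numeric work.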
All of the StringTools package commands treat strings as (null-terminated) sequences of 8-bit (ASCII) characters. Thus, there is no support for multibyte character encodings, such as Unicode encodings.
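To see why a byte-oriented view matters for n-gram counts, consider this general Python illustration. It assumes UTF-8 encoded input and a hypothetical helper `ngram_count`; it shows byte-versus-character counting in general and makes no claim about how Maple stores or processes any particular string.

```python
text = "café"                      # 4 characters; "é" is outside ASCII
as_bytes = text.encode("utf-8")    # "é" becomes two bytes: 0xC3 0xA9

def ngram_count(seq, n=3):
    """Number of distinct length-n windows in a str or bytes sequence."""
    return len({seq[i:i + n] for i in range(len(seq) - n + 1)})

print(len(text), len(as_bytes))                  # 4 5
print(ngram_count(text), ngram_count(as_bytes))  # 2 3
```

A routine that works on 8-bit units sees each byte of a multibyte character as a separate character, so the n-gram sets, and any coefficient derived from them, can differ from what a character-level count would give.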
Examples
> with(StringTools):
> SimilarityCoefficient("Canada", "Canary");
3
> SimilarityCoefficient("Kline", "Cline");
2
> SimilarityCoefficient("mathematics", "mathematische");
2
> SimilarityCoefficient("Constance", "Connor", 1);
3
> SimilarityCoefficient("Constance", "Connor", 2);
11/2
> SimilarityCoefficient("Constance", "Connor", 3);
10
> SimilarityCoefficient("Constance", "Connor", 4);
∞
> SimilarityCoefficient("Constance", "Connor", 5);
∞
See Also
evalf
string
StringTools[NGrams]