StringTools
NGrams
compute the n-grams of a string
Calling Sequence
NGrams( s, n )
Parameters
s - string
n - non-negative (32-bit) integer
Description
The NGrams( s, n ) command computes the ordered list of n-grams of the string s. The second argument n must be a 32-bit positive integer no larger than the length of s.
An n-gram of s is a substring of s consisting of exactly n contiguous characters from s.
The empty string has no n-grams, for any value of n.
Note that NGrams( s, 1 ) is equivalent to Explode( s ).
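The definition above amounts to sliding a window of width n across s. The following is a minimal Python sketch of that definition, not the Maple implementation; the function name ngrams is hypothetical, and returning an empty list when n exceeds the length of s is an assumption of the sketch, since the behavior of NGrams in that case is not described here.

```python
def ngrams(s: str, n: int) -> list:
    """Return the ordered list of all length-n contiguous substrings of s."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # A string of length L has L - n + 1 window start positions.
    # If n > L (which includes the empty string), the range is empty,
    # so the result is the empty list (an assumption of this sketch).
    return [s[i:i + n] for i in range(len(s) - n + 1)]


print(ngrams("abcd", 2))                  # ['ab', 'bc', 'cd']
print(ngrams("abcd", 1) == list("abcd"))  # True: 1-grams are the characters
print(ngrams("", 3))                      # []
```

The n = 1 case mirrors the stated equivalence with Explode: each 1-gram is a single character of the string, in order.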
All of the StringTools package commands treat strings as (null-terminated) sequences of 8-bit (ASCII) characters. Thus, there is no support for multibyte character encodings, such as Unicode encodings.
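To see why byte-oriented processing matters for multibyte text, the Python sketch below computes n-grams over the raw UTF-8 bytes of a string. It illustrates the general consequence of treating a string as a sequence of 8-bit units and is not captured Maple output; the helper byte_ngrams is hypothetical.

```python
def byte_ngrams(data: bytes, n: int) -> list:
    """Return all length-n contiguous byte slices of data, in order."""
    return [data[i:i + n] for i in range(len(data) - n + 1)]


text = "na\u00efve"              # five characters; U+00EF is i with diaeresis
encoded = text.encode("utf-8")   # U+00EF becomes the two bytes C3 AF

print(len(text), len(encoded))   # 5 6
print(byte_ngrams(encoded, 1))
# [b'n', b'a', b'\xc3', b'\xaf', b'v', b'e']
```

The single non-ASCII character is split across two 1-grams, neither of which is a valid character on its own. Text that is pure ASCII is unaffected, because each character occupies exactly one byte.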
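Examples
> with(StringTools):
> NGrams("abcd", 1);
    ["a", "b", "c", "d"]
> NGrams("abcd", 2);
    ["ab", "bc", "cd"]
> NGrams("abcd", 3);
    ["abc", "bcd"]
> NGrams("abcd", 4);
    ["abcd"]
> NGrams("mathematics", 3);
    ["mat", "ath", "the", "hem", "ema", "mat", "ati", "tic", "ics"]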
See Also
length
StringTools[Explode]
StringTools[SimilarityCoefficient]