StringTools
Entropy
compute the Entropy of a string
Calling Sequence
Entropy( s )
Parameters
s - Maple string
Description
The Entropy(s) command returns the Shannon entropy of the string s as a floating-point number.
Shannon's entropy is defined as -add( P( ch ) * log[ 2 ]( P( ch ) ), ch = Support( s ) ), where P( ch ) = CountCharacterOccurrences( s, ch ) / length( s ). It is a measure of the information content of the string, and can be interpreted as the number of bits required to encode each character of the string given perfect compression. The entropy is maximal when each character is equally likely. For arbitrary non-null characters, this maximal value is log[ 2 ]( 255 ) = 7.99435.
(The null byte, with code point 0, cannot appear in a Maple string. If all 256 single-byte code points could appear, then the maximal entropy would be log[ 2 ]( 256 ) = 8, which is the number of bits per byte.)
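As a rough illustration of the bits-per-character interpretation, the following sketch multiplies the entropy by the string length to estimate how many bits an ideal code based only on single-character frequencies would need. The names bits and bytes are illustrative only, and the estimate ignores the cost of storing the code itself.

```maple
with( StringTools ):

s := "Mathematics":

# Estimated total size in bits: bits per character times number of characters.
bits := Entropy( s ) * length( s );

# Round up to whole bytes (illustrative; real compressors add overhead).
bytes := ceil( bits / 8 );
```

For this 11-character string the entropy is about 3.1 bits per character, so the estimate is about 34 bits, or 5 bytes when rounded up.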
Note that the entropy is computed as a floating-point number, at hardware (double) precision.
All of the StringTools package commands treat strings as (null-terminated) sequences of 8-bit (ASCII) characters. Thus, there is no support for multibyte character encodings, such as Unicode encodings.
Examples
use StringTools in Entropy( "Mathematics" ) end use;
3.09579525500093355
with( StringTools ):
Entropy( "aaaaaaaaaaaaaaaaaaaaaaaaaa" );
-0.
Entropy( "aaaaaaaaaaaaaaaaaaaaaaaaaaB" );
0.228538143953528006
Entropy( Iota( 1, 255 ) );
7.99435343685886934
Entropy( Random( 1000000 ) );
7.99417106407216149
evalf( log[ 2 ]( 255 ) );
7.994353436
Entropy( Random( 1000000, lower ) );
4.70042263084046397
evalf( log[ 2 ]( 26 ) );
4.700439718
Entropy( Repeat( "ab", 100 ) );
1.
Entropy( Repeat( "abc", 100 ) );
1.58496250072115585
Entropy( Repeat( "abcde", 100 ) );
2.32192809488736218
Entropy( Repeat( Random( 10 ), 10000 ) );
3.32192809488736218
The following steps illustrate the definition of Entropy.
s := Random( 30, lower );
s := "rbygsggdjijjtiqelzxehfnojeorwr"
occ := [ seq( CountCharacterOccurrences( s, ch ), ch = Support( s ) ) ];
occ := [1, 1, 3, 1, 3, 1, 2, 4, 1, 1, 2, 1, 3, 1, 1, 1, 1, 1, 1]
L := map( `/`, occ, length( s ) );
L := [1/30, 1/30, 1/10, 1/30, 1/10, 1/30, 1/15, 2/15, 1/30, 1/30, 1/15, 1/30, 1/10, 1/30, 1/30, 1/30, 1/30, 1/30, 1/30]
U := map( p -> -evalf( p * log[ 2 ]( p ) ), L );
U := [0.1635630199, 0.1635630199, 0.3321928095, 0.1635630199, 0.3321928095, 0.1635630199, 0.2604593730, 0.3875854127, 0.1635630199, 0.1635630199, 0.2604593730, 0.1635630199, 0.3321928095, 0.1635630199, 0.1635630199, 0.1635630199, 0.1635630199, 0.1635630199, 0.1635630199]
convert( U, `+` );
4.031401848
Entropy( s );
4.03140184539217117
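The steps above can be collected into a single procedure. This is a minimal sketch, not the implementation used by Entropy: the name EntropyFromDefinition is hypothetical, the result is computed at the current setting of Digits rather than at hardware precision, and returning 0.0 for the empty string is an assumption made here.

```maple
with( StringTools ):

# Hypothetical helper (not part of StringTools): Shannon entropy computed
# directly from the definition, using software floats at the current Digits.
EntropyFromDefinition := proc( s :: string )
    local n, ch, p;
    n := length( s );
    if n = 0 then
        # Assumption: treat the empty string as having zero entropy.
        return 0.0;
    end if;
    # Sum -p*log[2](p) over the relative frequency p of each distinct character.
    -add( evalf( p * log[ 2 ]( p ) ),
          p = [ seq( CountCharacterOccurrences( s, ch ) / n, ch = Support( s ) ) ] );
end proc:

EntropyFromDefinition( "rbygsggdjijjtiqelzxehfnojeorwr" );
```

The value should agree with Entropy( s ) up to roughly the working precision, as with the convert( U, `+` ) result shown above.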
See Also
add
convert
evalf
length
log
map
seq
string
StringTools[CountCharacterOccurrences]
StringTools[Iota]
StringTools[Random]
StringTools[Repeat]
StringTools[Support]
with