ArrayTools
Lookup
look up the values in one 1-D container based on the matches in another 1-D container
Calling Sequence
Parameters
Options
Description
Examples
Compatibility
Calling Sequence
Lookup( matchvalue, matchdata, lookupdata, options )
Lookup( matchvalue, data, orientation, matchlabel, lookuplabel, options )
Parameters
matchvalue - value to match
data - 2-D Array, Matrix, or DataFrame containing both the match and lookup data
matchdata - 1-D Array, Vector, DataSeries, list, or set containing the match data
lookupdata - 1-D Array, Vector, DataSeries, list, or set containing the lookup data
orientation - (optional) either row or column; specifies whether the match and lookup data are in rows or columns. The default is column.
matchlabel - (optional) index or label (for a DataFrame) that specifies which row or column of data is to be scanned for matchvalue. When data is an Array or Matrix, the default is 2. When data is a DataFrame, the default is 1.
lookuplabel - (optional) index or label (for a DataFrame) that specifies which row or column of data is used to look up the results that correspond to the matches. When data is an Array or Matrix, the default is 1. When data is a DataFrame, the default is the container of DataFrame labels (row labels when orientation is column, column labels when orientation is row).
options - (optional) equation(s) of the form keyword = value, where keyword is one of compiled, digits, direction, indices, match, numresults, output, relativeerror, or ulp.
Options
compiled: Either true or false, specifies if auxiliary procedures for floating-point comparisons are to be compiled if they are currently uncompiled. The default is false.
digits: Positive integer, specifies the working precision used for floating-point calculations and comparisons. The default is Digits.
direction: Either forward or reverse, specifies if scanning is to be performed in the forward direction or in reverse. The default is forward.
indices: Either absolute or relative, specifies if a label passed for matchlabel or lookuplabel is to be considered an absolute or relative index in the event that it is ambiguous. (See Note)
match: Specifies how matches are to be determined. There are a few options:
equal: Matches have to be equal to matchvalue.
exact: Matches have to be exact to be counted. This is the default, and it is equivalent to equal.
float: Matches are determined using floating-point comparisons.
greater_equal: Matches have to be greater than or equal to matchvalue.
greater_than: Matches have to be greater than matchvalue.
less_equal: Matches have to be less than or equal to matchvalue.
less_than: Matches have to be less than matchvalue.
regexp: Match strings using regular expressions.
wildcard: Match strings using wildcards.
An expression of type callable (e.g. procedure) that takes currentvalue as its first argument and matchvalue as its second argument (where currentvalue is the value being compared to matchvalue) and returns either true (for a match) or false.
List of the form [callable,...]. The first term in the list is an expression of type callable (e.g. procedure) that takes currentvalue and matchvalue (where currentvalue is the value being compared to matchvalue) as its first and second arguments, respectively, and returns either true or false. Any additional terms in the list are passed as additional arguments to the callable.
numresults: (optional) Either a non-negative integer or a range of non-negative integers, specifies the minimum and maximum number of matches to be found and returned. The default is 1..n, where n is the size of the match data container.
output: (optional) One of Array, list, set, sequence, Vector, Vector[column], and Vector[row], specifies the format of the output. The default is sequence.
relativeerror: (optional) Either true or false, specifies whether floating-point errors are measured in relative terms (true) or in absolute terms (false). The default is true.
ulp: (optional) Positive integer, specifies the number of units in the last place to be used for floating-point comparisons. The default is 1.
Description
The Lookup command searches for matchvalue in one container, column, or row of data, and for all the matches, looks up and returns the corresponding values in another container, column, or row.
When two 1-D containers, i.e. matchdata and lookupdata, are passed, they must have the same dimensions.
The numresults option also accepts ranges of the form i.., ..j, and .., which are equivalent to, respectively, i..n, 0..j, and 0..n, where n is the size of the match container.
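The open-ended range forms can be sketched as follows, using the two row Vectors from Example 1. This is only an illustration of the calling syntax described above; the outputs are omitted rather than guessed.

```maple
with(ArrayTools):

# Same data as in Example 1; the match container X has n = 9 elements
X := Vector[row]([1, 2, 3, 4, 5, 4, 3, 2, 1]):
Y := Vector[row]([10, 20, 30, 40, 50, 60, 70, 80, 90]):

Lookup(3, X, Y, numresults = 1 ..);   # documented as equivalent to 1 .. 9
Lookup(3, X, Y, numresults = .. 1);   # documented as equivalent to 0 .. 1
Lookup(3, X, Y, numresults = ..);     # documented as equivalent to 0 .. 9
```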
The command works fastest when match is exact or float.
When match=exact, an algorithm based on the member command is used.
When match=float, all data must be coercible to software or hardware floats, and comparisons are made with the digits, relativeerror, and ulp options providing flexibility and tolerance. Unless Digits>evalhf(Digits) or UseHardwareFloats=false, the values will be checked using evalhf mode or a compiled procedure. A couple of notes:
The compilable procedures will be compiled when the Lookup command is first called with compiled=true, and any subsequent calls to Lookup with match=float, unless Digits>evalhf(Digits) or UseHardwareFloats=false, will use these compiled procedures, even if compiled=false is passed as an option.
There is overhead when compiling for the first time, but the speedup should be considerable. Thus, the compiled=true option is recommended when the containers are large or when Lookup will be called many times in the session.
It is most efficient to pass numeric data in containers that already have either float[8] or complex[8] datatypes.
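The notes above on match=float can be illustrated with a minimal sketch. The Vectors M and R and their values are hypothetical sample data, not taken from this page, and compiled=true assumes that compilation is available in your Maple installation.

```maple
with(ArrayTools):

# Hypothetical sample data, stored with the float[8] datatype
M := Vector([0.25, 0.5, 0.75, 0.5], datatype = float[8]):
R := Vector([10.0, 20.0, 30.0, 40.0], datatype = float[8]):

# Floating-point matching with the default tolerance (ulp = 1, relative error);
# compiled = true requests compilation of the comparison helpers on first use
Lookup(0.5, M, R, match = float, compiled = true);

# A later float lookup in the same session, with a looser tolerance of 2 ULPs
Lookup(0.75, M, R, match = float, ulp = 2);
```

Passing the data as float[8] Vectors follows the efficiency recommendation above.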
Comparisons when match=regexp and match=wildcard are made, respectively, with the StringTools[RegMatch] and StringTools[WildcardMatch] commands.
Note: By absolute and relative index, we mean the following:
For an Array with, say, dimensions i..j, the allowed absolute indices are any integers in the range i..j (inclusive), and the allowed relative indices are any integers in the range 1..n or the range −n..−1 (for indexing from the end), where n=j−i+1 is the number of elements.
For a Matrix, any row or column would have dimensions of the form 1..n, where n is the size of the row or column, and any integer in the range 1..n would be allowed as either an absolute or a relative index. The allowable range of relative indices would also include −n..−1.
For a DataFrame, the absolute indices are the row and column labels. The relative indices are the same as for a Matrix.
This command is part of the ArrayTools package, so it can be used in the short form Lookup only after executing the command with(ArrayTools). However, it can always be accessed through the long form of the command by using ArrayTools:-Lookup.
Examples
with(ArrayTools):
Example 1
Consider the following two row Vectors:
X := Vector[row]([1, 2, 3, 4, 5, 4, 3, 2, 1])
X := [1 2 3 4 5 4 3 2 1]
Y := Vector[row]([10, 20, 30, 40, 50, 60, 70, 80, 90])
Y := [10 20 30 40 50 60 70 80 90]
What values in Y occur at the same places that 3 appears in X?
Lookup(3, X, Y)
30,70
If we instead search from right to left:
Lookup(3, X, Y, direction = reverse)
70,30
The number of results can be specified, in both the forward and reverse directions:
Lookup(3, X, Y, numresults = 1)
30
Lookup(3, X, Y, direction = reverse, numresults = 1)
70
Example 2
Consider the following Matrix:
A := Matrix([[a, 1, 4, 7], [b, 2, 5, 8], [c, 3, 6, 9]])
A := [a 1 4 7]
     [b 2 5 8]
     [c 3 6 9]
If we don't specify the orientation (row or column), match container, and lookup container, it is assumed the match container is the second column and the lookup container is the first column:
Lookup(2, A)
b
Indexing from both the left and the right is recognized:
Lookup(7, A, column, 4, 1)
a
Lookup(7, A, column, -1, 1)
a
We can also specify rows as the match and lookup containers:
Lookup(6, A, row, 3, 1)
4
Example 3
Exact matches for real and complex data may require floats:
U := [sqrt(2), evalf(sqrt(2)), sqrt(3), evalf(sqrt(3)) + Float(2, 1 - Digits)]
U := [sqrt(2), 1.414213562, sqrt(3), 1.732050810]
V := [a, b, c, d]
We cannot detect the numerical approximation for sqrt(2) without floats, but with floats and the default options we can detect it:
Lookup(sqrt(2), U, V)
a
Lookup(sqrt(2), U, V, match = float)
a,b
For the perturbed numerical approximation of sqrt(3), on the other hand, we will need to use floats with more ULPs:
Lookup(sqrt(3), U, V)
c
Lookup(sqrt(3), U, V, match = float)
c
Lookup(sqrt(3), U, V, match = float, ulp = 2)
c,d
Example 4
Consider the following two lists:
A := ["ab", "abc", "ad", "abbc", "a2c", "aBc"]
B := [10, 20, 30, 40, 50, 60]
Matches with strings can be detected using regular expressions. For example, let's take the match value to be the regular expression for a string starting with "a", ending with "c", and having one or more characters, all lowercase letters, between them:
Lookup("(^)(a)([a-z]+)(c)($)", A, B, match = regexp)
20,40
We can also use wildcards:
Lookup("a*c", A, B, match = wildcard)
20,40,50,60
Example 5
The Lookup command is not limited to equality matches; it also accepts comparison modes and custom matching procedures. For example, to search for values in matchdata that are less than matchvalue:
Lookup(4, [1, 2, 3, 4, 5], [6, 7, 8, 9, 10], match = less_than)
6,7,8
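A user-defined procedure can be supplied through the match option in the same way. The sketch below assumes a hypothetical helper named IsMultiple, which is not part of ArrayTools; it takes currentvalue and matchvalue in the order described in the Options section and returns true or false.

```maple
with(ArrayTools):

# Hypothetical matcher: true when currentvalue is an exact multiple of matchvalue.
# evalb forces the equation to evaluate to true or false.
IsMultiple := (currentvalue, matchvalue) -> evalb(irem(currentvalue, matchvalue) = 0):

Lookup(2, [1, 2, 3, 4, 5], [6, 7, 8, 9, 10], match = IsMultiple);
```

Under that contract, the call should pick out the lookup values paired with the even entries 2 and 4. Note that a bare comparison such as currentvalue < matchvalue stays unevaluated in Maple, so wrapping it in evalb is the safer choice for a matcher that must return true or false.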
When match is a list with the first element being a procedure, the additional terms in the list are passed as additional arguments to the procedure:
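F := Vector[row]([seq(1 .. 26)])
F := [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26]
G := Vector[row]([seq("a" .. "z")])
G := ["a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"]
Lookup(20, F, G, match = [verify, neighbourhood(5, open)])
"p", "q", "r", "s", "t", "u", "v", "w", "x"
Lookup(20, F, G, match = [verify, neighbourhood(5, closed)])
"o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y"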
Example 6
Consider the following 2-D Array:
A := Array(0 .. 3, 0 .. 2, [[24, 11, 14], [10, 15, 13], [1, 13, 18], [5, 25, 9]])
Because the columns are indexed 0..2 rather than 1..3, the column index 2 is ambiguous: as an absolute index it refers to the third column, and as a relative index it refers to the second column. To indicate which type of index you are using with the Lookup command when there is ambiguity, pass the indices option:
Lookup(13, A, column, 2, 1, indices = absolute)
15
Lookup(13, A, column, 2, 1, indices = relative)
1
Example 7
In this example, we work with a DataFrame of batting statistics for the 2021 Toronto Blue Jays hitters with 100 or more at bats:
Data := Matrix([
[0.242, 165, 19, 40, 8, 0, 8, 24, 19, 0.328, 0.436],
[0.298, 640, 121, 191, 30, 1, 29, 102, 40, 0.343, 0.484],
[0.224, 250, 27, 56, 10, 1, 7, 27, 37, 0.322, 0.356],
[0.282, 131, 16, 37, 6, 2, 4, 15, 9, 0.329, 0.45],
[0.223, 184, 32, 41, 13, 0, 11, 28, 17, 0.299, 0.473],
[0.264, 299, 59, 79, 19, 1, 22, 50, 37, 0.352, 0.555],
[0.246, 114, 9, 28, 6, 0, 2, 11, 8, 0.293, 0.351],
[0.276, 500, 62, 138, 28, 2, 21, 84, 32, 0.319, 0.466],
[0.265, 652, 115, 173, 39, 2, 45, 102, 66, 0.334, 0.538],
[0.241, 511, 59, 123, 25, 1, 22, 81, 27, 0.281, 0.423],
[0.253, 198, 22, 50, 15, 0, 1, 10, 15, 0.31, 0.343],
[0.209, 139, 12, 29, 4, 1, 4, 8, 9, 0.272, 0.338],
[0.311, 222, 32, 69, 13, 1, 2, 17, 22, 0.376, 0.405],
[0.296, 550, 92, 163, 29, 0, 32, 116, 36, 0.346, 0.524],
[0.311, 604, 123, 188, 29, 1, 48, 111, 86, 0.401, 0.601]]):
Players := ["Alejandro Kirk", "Bo Bichette", "Cavan Biggio", "Corey Dickerson", "Danny Jansen", "George Springer", "Joe Panik", "Lourdes Gurriel Jr", "Marcus Semien", "Randal Grichuk", "Reese McGuire", "Rowdy Tellez", "Santiago Espinal", "Teoscar Hernandez", "Vladimir Guerrero Jr"]:
Categories := ["AVG", "AB", "R", "H", "2B", "3B", "HR", "RBI", "BB", "OBP", "SLG"]:
DF := DataFrame(Data, rows = Players, columns = Categories)
Recall that, for a DataFrame, if no lookup column is specified, the row labels are used.
Which players hit 30 or more home runs?
Lookup(30, DF, "HR", match = greater_equal)
"Marcus Semien", "Teoscar Hernandez", "Vladimir Guerrero Jr"
Which players hit 0.300 or higher and had 500 or more at bats?
select(member, Lookup(0.300, DF, "AVG", match = greater_equal, output = list), Lookup(500, DF, "AB", match = greater_equal, output = list))
["Vladimir Guerrero Jr"]
Teoscar Hernandez led the Blue Jays with 116 RBIs in 2021. Who were the (at most) top five players that drove in 100 or more runs?
Lookup(100, sort(DF, "RBI"), "RBI", match = greater_equal, direction = reverse, numresults = 1 .. 5)
"Teoscar Hernandez", "Vladimir Guerrero Jr", "Marcus Semien", "Bo Bichette"
Compatibility
The ArrayTools[Lookup] command was introduced in Maple 2022.
For more information on Maple 2022 changes, see Updates in Maple 2022.
See Also
ArrayTools[IsSubsequence]
ListTools
member
StringTools
verify