Python has about a dozen built in functions that enable a "vectorized" style of coding. I say "style" because this is not true vectorization. Even so, learning how to use them will save you time and produce code that is faster and has fewer bugs.
The problem statement
Given some function (lets call it F) and a list of values
[1, 2, 3, 4, 5]
generate every sequential 3-tuple
(1, 2, 3) (2, 3, 4) (3, 4, 5)
and find the 3-tuple which produces the largest return value of F.
This requires generating all the 3-tuples, passing them one by one to the function, and keeping track of the results.
The imperative-style solution employs a "test and swap" strategy. It requires a for loop, an if statement, and some variables to keep track of intermediate results
i = 0
for x in num_list:
some_var = some_function(num_list[i:i+3])
if some_var > some_other_var:
some_other_var = some_var
final_list = num_list[i:i+3]
i += 1
This looks pefectly reasonable if you come from a C, Java, or Basic background. In fact, with minor syntax changes, this code would work in any of those languages.
But there are a few problems
- the some_other_var variable isn't initialized
- the slice [i:i+3] goes beyond the end of the list
So you see, this kind of low-level code is difficult to get right even though it's only 7 lines of code.
Using Python's higher level strategy
You can reduce this block of code to this one linemax( zip(num_list, num_list[1:], num_list[2:]), key=F )
if you understand how the zip and max functions work.
Generating the 3-tuples
You can use list slicing combined with the zip function to generate these groups.Start with a list of numbers
a = [1, 2, 3, 4, 5]
a[0:]
a[1:]
a[2:]
Now, if you line up the slices they form a triangular matrix and the columns form the groups you are looking for
[ 1, 2, 3, 4, 5 ]
[ 2, 3, 4, 5 ] group 1
[ 3, 4, 5 ]
[ 1, 2, 3, 4, 5 ]
[ 2, 3, 4, 5 ] group 2
[ 3, 4, 5 ]
[ 1, 2, 3, 4, 5 ]
[ 2, 3, 4, 5 ] group 3
[ 3, 4, 5 ]
[ 2, 3, 4, 5 ] group 3
[ 3, 4, 5 ]
You can generate the groups by passing the slices to the zip function
groups = zip(a[0:], a[1:], a[2:])
The zip function automatically stops when it reaches the end of the shortest slice. This prevents the "off by 1" errors that are so common in this type of code.
The max function normally works by comparing values with the "<" operator, but you can override this by specifying a key function.
max(groups, key=F)
I have another post that explains the key parameter in more detail.
Finding the maximum return value
The max function normally works by comparing values with the "<" operator, but you can override this by specifying a key function.
max(groups, key=F)
I have another post that explains the key parameter in more detail.
No comments:
Post a Comment