I am trying to use itertools.product to manage the bookkeeping of some nested for loops, where the number of nested loops is not known in advance. Below is a specific example where I have chosen two nested for loops; the choice of two is only for clarity, what I need is a solution that works for an arbitrary number of loops.
This question provides an extension/generalization of the question appearing here:
Efficient algorithm for evaluating a 1-d array of functions on a same-length 1d numpy array
Now I am extending the above technique using an itertools trick I learned here:
Iterating over an unknown number of nested loops in python
Preamble:
from itertools import product
def trivial_functional(i, j): return lambda x : (i+j)*x
idx1 = [1, 2, 3, 4]
idx2 = [5, 6, 7]
joint = [idx1, idx2]
func_table = []
for items in product(*joint):
f = trivial_functional(*items)
func_table.append(f)
At the end of the above itertools loop, I have a 12-element, 1-d array of functions, func_table, each element having been built from the trivial_functional.
Question:
Suppose I am given a pair of integers, (i_1, i_2), where these integers are to be interpreted as the indices of idx1 and idx2, respectively. How can I use itertools.product to determine the correct corresponding element of the func_table array?
I know how to hack the answer by writing my own function that mimics the itertools.product bookkeeping, but surely there is a built-in feature of itertools.product that is intended for exactly this purpose?
I don't know of a way of calculating the flat index other than doing it yourself. Fortunately this isn't that difficult:
def product_flat_index(factors, indices):
if len(factors) == 1: return indices[0]
else: return indices[0] * len(factors[0]) + product_flat_index(factors[1:], indices[1:])
>> product_flat_index(joint, (2, 1))
9
An alternative approach is to store the results in a nested array in the first place, making translation unnecessary, though this is more complex:
from functools import reduce
from operator import getitem, setitem, itemgetter
def get_items(container, indices):
return reduce(getitem, indices, container)
def set_items(container, indices, value):
c = reduce(getitem, indices[:-1], container)
setitem(c, indices[-1], value)
def initialize_table(lengths):
if len(lengths) == 1: return [0] * lengths[0]
subtable = initialize_table(lengths[1:])
return [subtable[:] for _ in range(lengths[0])]
func_table = initialize_table(list(map(len, joint)))
for items in product(*map(enumerate, joint)):
f = trivial_functional(*map(itemgetter(1), items))
set_items(func_table, list(map(itemgetter(0), items)), f)
>>> get_items(func_table, (2, 1)) # same as func_table[2][1]
<function>
So numerous answers were quite useful, thanks to everyone for the solutions.
It turns out that if I recast the problem slightly with Numpy, I can accomplish the same bookkeeping, and solve the problem I was trying to solve with vastly improved speed relative to pure python solutions. The trick is just to use Numpy's reshape method together with the normal multi-dimensional array indexing syntax.
Here's how this works. We just convert func_table into a Numpy array, and reshape it:
func_table = np.array(func_table)
component_dimensions = [len(idx1), len(idx2)]
func_table = np.array(func_table).reshape(component_dimensions)
Now func_table can be used to return the correct function not just for a single 2d point, but for a full array of 2d points:
dim1_pts = [3,1,2,1,3,3,1,3,0]
dim2_pts = [0,1,2,1,2,0,1,2,1]
func_array = func_table[dim1_pts, dim2_pts]
As usual, Numpy to the rescue!
This is a little messy, but here you go:
from itertools import product
def trivial_functional(i, j): return lambda x : (i+j)*x
idx1 = [1, 2, 3, 4]
idx2 = [5, 6, 7]
joint = [enumerate(idx1), enumerate(idx2)]
func_map = {}
for indexes, items in map(lambda x: zip(*x), product(*joint)):
f = trivial_functional(*items)
func_map[indexes] = f
print(func_map[(2, 0)](5)) # 40 = (3+5)*5
I'd suggest using enumerate() in the right place:
from itertools import product
def trivial_functional(i, j): return lambda x : (i+j)*x
idx1 = [1, 2, 3, 4]
idx2 = [5, 6, 7]
joint = [idx1, idx2]
func_table = []
for items in product(*joint):
f = trivial_functional(*items)
func_table.append(f)
From what I understood from your comments and your code, func_table is simply indexed by the occurence of a certain input in the sequence. You can access it back again using:
for index, items in enumerate(product(*joint)):
# because of the append(), index is now the
# position of the function created from the
# respective tuple in join()
func_table[index](some_value)
Related
i have got a question about tuple indexing and slicing in python. I want to write better and clearer code. This is a simplified version of my problem:
I have a tuple a = (1,2,3,4,5) and i want to index into it so that i get b = (1,2,4).
Is it possible to do this in one operation or do i have do b = a[0:2] + (a[3],)? I have thought about indexing with another tuple, which is not possible, i have also searched if there is a way to combine a slice and an index. It just seems to me, that there must be a better way to do it.
Thank you very much :)
Just for fun, you can implement an indexer and use simple syntax to get the results:
from collections.abc import Sequence
class SequenceIndexer:
def __init__(self, seq: Sequence):
self.seq = seq
def __getitem__(self, item):
if not isinstance(item, tuple):
return self.seq[item]
result = []
for i in item:
if isinstance(i, slice):
result.extend(self.seq[i])
else:
result.append(self.seq[i])
return result
a = 1, 2, 3, 4, 5
indexer = SequenceIndexer(a)
print(indexer[:2, 3]) # [1, 2, 4]
print(indexer[:3, 2, 4, -3:]) # [1, 2, 3, 3, 5, 3, 4, 5]
As stated in the comment you can use itemgetter
import operator
a = (1,2,3,4)
result = operator.itemgetter(0, 1, 3)(a)
If you do this kind of indexing often and your arrays are big, you might also have a look at numpy as it supports more complex indexing schemes:
import numpy as np
a = (1,2,3,4)
b = np.array(a)
result = b[[0,1,3]]
If the array is very large and or you have a huge list of single points you want to include it will get convinient to construct the slicing list in advance. For the example above it will look like this:
singlePoints = [3,]
idx = [*range(0,2), *singlePoints]
result = b[idx]
you can easily use unpacking but dont name the parameters you dont need you can name them as underscore (_).
one, two, _, four, _ = (1, 2, 3, 4, 5)
now you have your numbers.
this is one simple way.
b = (one, two, four)
or you can use numpy as well.
I have an array:
arr = [5,5,5,5,5,5]
I want to increment a particular range in the arr by 'n'. So if n=2 and the range is [2,5].
The array should look like this:
arr = [5,5,7,7,7,5]
Needed to do this without a for loop, for a problem im trying to solve.
Tried:
arr[2:5] = [n]*3
but that obviously replaces the entries and becomes:
arr = [5,5,3,3,3,5]
Any suggestions would be highly appriciated.
n = 2
arr_range = slice(2, 5)
arr = [5,5,7,7,7,5]
arr[arr_range] = map(lambda x: x+n, arr[arr_range])
# arr
# [5, 5, 9, 9, 9, 5]
But I would recommend using numpy...
import numpy as np
n = 2
arr_range = slice(2, 5)
arr = np.array([5,5,7,7,7,5])
arr[arr_range] += n
You actually have a list, not an array. If you convert it to a Numpy array it is simple.
>>> n=3
>>> arr = np.array([5,5,5,5,5,5])
>>> arr[2:5] += n
>>> arr
array([5, 5, 8, 8, 8, 5])
You have basically two options (for code see below):
Use slice assignment via a list comprehension (a[:] = [x+1 for x in a]),
Use a for-loop (even though you exclude this in your question, I don't see a legitimate reason for doing so).
They come with pros and cons. Let's assume you are going to replace some fraction of the list items (as opposed to a fixed number of items). The for-loop runs in Python and hence might be slower but it has O(1) memory usage. The list comprehension and slice assignment both operate in C (assuming you are using CPython) but it has O(N) memory usage due to the temporary list.
Using a generator doesn't buy anything since it is converted to a list anyway before the assignment happens (this is necessary because if the generator had fewer or more items than the slice, the list would need to be resized accordingly; see the source code).
Using a map adds even more overhead since it needs to call the mapped function on every item.
The following is a performance comparison of the different methods. The for-loop is fastest for very small lists since it has minimal overhead (just the range object). For more than about a dozen items, the list comprehension clearly outperforms the other methods and especially for larger lists (len(a) > 3e5) the difference to the generator becomes noticeable (the generator cannot provide information about its size, so the generated list needs to be resized as more items are fetched). For very large lists the difference between for-loop and list comprehension seems to shrink again since the memory overhead tends to outweigh the loop cost, but reaching that point would require unusually large lists (where you'd be better off using something like Numpy anyway).
This is the code using the perfplot package:
import numpy
import perfplot
def use_generator(a):
i = slice(0, len(a)//2)
a[i] = (x+1 for x in a[i])
def use_map(a):
i = slice(0, len(a)//2)
a[i] = map(lambda x: x+1, a[i])
def use_list(a):
i = slice(0, len(a)//2)
a[i] = [x+1 for x in a[i]]
def use_loop(a):
for i in range(len(a)//2):
a[i] += 1
perfplot.show(
setup=lambda n: [0]*n,
kernels=[use_generator, use_map, use_list, use_loop],
n_range=[2**k for k in range(1, 26)],
xlabel="len(a)",
equality_check=None,
)
I have two arrays and want to sum each element of both arrays and find the maximum sum.
I have programmed it like this:
sum = []
for element in arrayOne:
sum.append(max([item + element for item in arrayTwo]))
print max(sum)
is there any better way to achieve this?
You can use numpy.
import numpy as np
a = np.array(arrayOne)
b = np.array(arrayTwo)
max = max(a + b)
print(max)
Use itertools.product with max:
from itertools import product
print(max(sum(x) for x in product(arrayOne, arrayTwo)))
Or using map:
print(max(map(sum,product(arrayOne, arrayTwo))))
max_sum = max(map(sum, zip(arrayOne, arrayTwo)))
Upd.
If you need max from sum of all elements in array:
max_sum = max(sum(arrayOne), sum(arrayTwo))
If arrayOne and arrayTwo are nested lists ([[1, 2], [3, 3], [3, 5], [4, 9]]) and you need to find element with max sum:
max_sum = max(map(sum, arrayOne + arrayTwo))
P. S. Next time, please, provide examples of input and output to not let us guess what do you need.
To find a maximum of all pairwise sums of elements of two arrays of lengths n and m respectively one can just
max(arrayOne) + max(arrayTwo)
which would perform at worst in O(max(n, m)) instead of O(n*m) when going over all the combinations.
However, if, for whatever reason, it is necessary to iterate over all the pairs, the solution might be
max(foo(one, two) for one in arrayOne for two in arrayTwo)
Where foo can be any function of two numeric parameters outputting a number (or an object of any class that implements ordering).
By the way, please avoid redefining built-ins like sum in your code.
I know you can create easily nested lists in python like this:
[[1,2],[3,4]]
But how to create a 3x3x3 matrix of zeroes?
[[[0] * 3 for i in range(0, 3)] for j in range (0,3)]
or
[[[0]*3]*3]*3
Doesn't seem right. There is no way to create it just passing a list of dimensions to a method? Ex:
CreateArray([3,3,3])
In case a matrix is actually what you are looking for, consider the numpy package.
http://docs.scipy.org/doc/numpy/reference/generated/numpy.zeros.html#numpy.zeros
This will give you a 3x3x3 array of zeros:
numpy.zeros((3,3,3))
You also benefit from the convenience features of a module built for scientific computing.
List comprehensions are just syntactic sugar for adding expressiveness to list initialization; in your case, I would not use them at all, and go for a simple nested loop.
On a completely different level: do you think the n-dimensional array of NumPy could be a better approach?
Although you can use lists to implement multi-dimensional matrices, I think they are not the best tool for that goal.
NumPy addresses this problem
Link
>>> a = array( [2,3,4] )
>>> a
array([2, 3, 4])
>>> type(a)
<type 'numpy.ndarray'>
But if you want to use the Python native lists as a matrix the following helper methods can become handy:
import copy
def Create(dimensions, item):
for dimension in dimensions:
item = map(copy.copy, [item] * dimension)
return item
def Get(matrix, position):
for index in position:
matrix = matrix[index]
return matrix
def Set(matrix, position, value):
for index in position[:-1]:
matrix = matrix[index]
matrix[position[-1]] = value
Or use the nest function defined here, combined with repeat(0) from the itertools module:
nest(itertools.repeat(0),[3,3,3])
Just nest the multiplication syntax:
[[[0] * 3] * 3] * 3
It's therefore simple to express this operation using folds
def zeros(dimensions):
return reduce(lambda x, d: [x] * d, [0] + dimensions)
Or if you want to avoid reference replication, so altering one item won't affect any other you should instead use copies:
import copy
def zeros(dimensions):
item = 0
for dimension in dimensions:
item = map(copy.copy, [item] * dimension)
return item
I'm sure there's a nice way to do this in Python, but I'm pretty new to the language, so forgive me if this is an easy one!
I have a list, and I'd like to pick out certain values from that list. The values I want to pick out are the ones whose indexes in the list are specified in another list.
For example:
indexes = [2, 4, 5]
main_list = [0, 1, 9, 3, 2, 6, 1, 9, 8]
the output would be:
[9, 2, 6]
(i.e., the elements with indexes 2, 4 and 5 from main_list).
I have a feeling this should be doable using something like list comprehensions, but I can't figure it out (in particular, I can't figure out how to access the index of an item when using a list comprehension).
[main_list[x] for x in indexes]
This will return a list of the objects, using a list comprehension.
t = []
for i in indexes:
t.append(main_list[i])
return t
map(lambda x:main_list[x],indexes)
If you're good with numpy:
import numpy as np
main_array = np.array(main_list) # converting to numpy array
out_array = main_array.take([2, 4, 5])
out_list = out_array.tolist() # if you want a list specifically
I think Yuval A's solution is a pretty clear and simple. But if you actually want a one line list comprehension:
[e for i, e in enumerate(main_list) if i in indexes]
As an alternative to a list comprehension, you can use map with list.__getitem__. For large lists you should see better performance:
import random
n = 10**7
L = list(range(n))
idx = random.sample(range(n), int(n/10))
x = [L[x] for x in idx]
y = list(map(L.__getitem__, idx))
assert all(i==j for i, j in zip(x, y))
%timeit [L[x] for x in idx] # 474 ms per loop
%timeit list(map(L.__getitem__, idx)) # 417 ms per loop
For a lazy iterator, you can just use map(L.__getitem__, idx). Note in Python 2.7, map returns a list, so there is no need to pass to list.
I have noticed that there are two optional ways to do this job, either by loop or by turning to np.array. Then I test the time needed by these two methods, the result shows that when dataset is large
【[main_list[x] for x in indexes]】is about 3~5 times faster than
【np.array.take()】
if your code is sensitive to the computation time, the highest voted answer is a good choice.