Summing each element of two arrays - python

I have two arrays and want to sum each element of both arrays and find the maximum sum.
I have programmed it like this:
sum = []
for element in arrayOne:
    sum.append(max([item + element for item in arrayTwo]))
print(max(sum))
Is there any better way to achieve this?

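You can use numpy. Note that a + b adds elementwise, so this assumes both arrays have the same length:
import numpy as np
a = np.array(arrayOne)
b = np.array(arrayTwo)
max_sum = np.max(a + b)  # avoid rebinding the built-in max
print(max_sum)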
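Note that a + b adds the arrays element by element. If the goal is instead the maximum over all pairs, as in the question's loop, broadcasting is one option. A minimal sketch, assuming made-up sample values for arrayOne and arrayTwo (the question did not give any):

```python
import numpy as np

# Hypothetical sample inputs; the question did not provide any.
arrayOne = [1, 5, 3]
arrayTwo = [2, 4]

a = np.array(arrayOne)
b = np.array(arrayTwo)

# a[:, None] has shape (3, 1); adding b (shape (2,)) broadcasts to (3, 2),
# so pairwise[i, j] == a[i] + b[j].
pairwise = a[:, None] + b
max_pairwise = int(pairwise.max())
print(max_pairwise)  # 9
```

This builds the full n-by-m table of sums, so it is only sensible when that table fits comfortably in memory.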

Use itertools.product with max:
from itertools import product
print(max(sum(x) for x in product(arrayOne, arrayTwo)))
Or using map:
print(max(map(sum,product(arrayOne, arrayTwo))))

max_sum = max(map(sum, zip(arrayOne, arrayTwo)))
Update:
If you need the larger of the two arrays' total sums:
max_sum = max(sum(arrayOne), sum(arrayTwo))
If arrayOne and arrayTwo are nested lists ([[1, 2], [3, 3], [3, 5], [4, 9]]) and you need to find element with max sum:
max_sum = max(map(sum, arrayOne + arrayTwo))
P.S. Next time, please provide examples of input and expected output so that we don't have to guess what you need.

To find the maximum of all pairwise sums of elements of two arrays of lengths n and m respectively, one can just compute
max(arrayOne) + max(arrayTwo)
which takes O(n + m) time, i.e. O(max(n, m)), instead of the O(n*m) needed to go over all the combinations.
However, if, for whatever reason, it is necessary to iterate over all the pairs, the solution might be
max(foo(one, two) for one in arrayOne for two in arrayTwo)
Where foo can be any function of two numeric parameters outputting a number (or an object of any class that implements ordering).
By the way, please avoid redefining built-ins like sum in your code.
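The shortcut works because the largest pairwise sum is always reached by pairing the largest element of each array. A small sketch that checks this against the brute-force version on random inputs (the function names are made up for illustration):

```python
import random

def max_pair_sum_fast(xs, ys):
    # One pass over each list: O(n + m).
    return max(xs) + max(ys)

def max_pair_sum_brute(xs, ys):
    # Visits every pair: O(n * m).
    return max(x + y for x in xs for y in ys)

random.seed(0)
for _ in range(100):
    xs = [random.randint(-50, 50) for _ in range(random.randint(1, 8))]
    ys = [random.randint(-50, 50) for _ in range(random.randint(1, 8))]
    assert max_pair_sum_fast(xs, ys) == max_pair_sum_brute(xs, ys)

print(max_pair_sum_fast([1, 5, 3], [2, 4]))  # 9
```

Both functions raise ValueError on an empty list, since max() has nothing to return.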

Related

Find minimum values of both "columns" of list of lists

Given a list like the next one:
foo_list = [[1,8],[2,7],[3,6]]
I've found in questions like Tuple pairs, finding minimum using python and
minimum of list of lists that the pair with the minimum value of a list of lists can be found using a generator like:
min(x for x in foo_list)
which returns
[1, 8]
But I was wondering if there is a similar way to return both minimum values of the "columns" of the list:
output = [1,6]
I know this can be achieved using numpy arrays:
output = np.min(np.array(foo_list), axis=0)
But I'm interested in finding such a way of doing so with generators (if possible).
Thanks in advance!
[min(l) for l in zip(*foo_list)]
returns [1, 6]
zip(*foo_list) transposes the list, so each "column" becomes one tuple, and min() is then applied to each of those tuples.
Thanks @mousetail for the suggestion.
You can use two min() calls for this, like so:
min1 = min(a for a, _ in foo_list)
min2 = min(b for _, b in foo_list)
print([min1, min2])
Will this do? If you don't want to use a third-party library, you could also use a plain old loop, which I think would be more efficient.
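For reference, the plain-loop variant mentioned above might look like the sketch below. The function name column_minimums is made up for illustration, and whether it is actually faster than the zip or two-generator versions has not been measured here.

```python
def column_minimums(rows):
    """Return the minimum of each column in a single pass over rows."""
    rows = iter(rows)
    try:
        first = next(rows)
    except StopIteration:
        return []  # no rows, so no columns
    minimums = list(first)
    for row in rows:
        for i, value in enumerate(row):
            if value < minimums[i]:
                minimums[i] = value
    return minimums

foo_list = [[1, 8], [2, 7], [3, 6]]
print(column_minimums(foo_list))  # [1, 6]
```

Unlike the two-generator version, this reads foo_list only once, which matters if the rows come from an iterator rather than a list.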

Determining the indices of each group of duplicate values in an array in Python in the fastest way

I want to find the indices of each group of duplicate values, like this:
s = [2,6,2,88,6,...]
The result should give the indices in the original s: [[0,2],[1,4],..], or present the same information in another form.
I looked at many solutions, and this is the fastest way I found to get the duplicated values:
s = np.sort(a, axis=None)
s[:-1][s[1:] == s[:-1]]
But after sorting, the indices no longer match the original s.
In my case, the list has ~200 million values and I want the fastest way to do this. I store the values in an array because I want to use the GPU to speed it up.
Using hash structures like dict helps.
For example:
import numpy as np
from collections import defaultdict
a=np.array([2,4,2,88,15,4])
table=defaultdict(list)
for ind,num in enumerate(a):
    table[num]+=[ind]
Outputs:
{2: [0, 2], 4: [1, 5], 88: [3], 15: [4]}
If you want to show duplicated elements in the order from small to large:
for k,v in sorted(table.items()):
    if len(v)>1:
        print(k,":",v)
Outputs:
2 : [0, 2]
4 : [1, 5]
The speed depends on how many distinct values the list contains.
See if this meets your performance requirements (here, s is your input array):
counts = np.bincount(s)
cum_counts = np.add.accumulate(counts)
sorted_inds = np.argsort(s)
result = np.split(sorted_inds, cum_counts[:-1])
Notes:
The result would be a list of arrays.
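Each of these arrays would contain the indices of one value in s. E.g., if the value 13 is repeated 7 times in s, there would be an array with 7 indices among the arrays of result. Values between 0 and max(s) that do not occur in s produce empty arrays.
If you want to ignore singleton values of s (values that occur only once in s), filter the result by length, e.g. [r for r in result if r.size > 1]. The minlength argument of np.bincount() does not do this; it only sets the minimum number of bins.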
(This is a variation of my other answer. Here, instead of splitting the large array sorted_inds, we take slices from it, so it's likely to have a different kind of performance characteristic)
If s is the input array:
counts = np.bincount(s)
cum_counts = np.add.accumulate(counts)
sorted_inds = np.argsort(s)
result = [sorted_inds[:cum_counts[0]]] + [sorted_inds[cum_counts[i]:cum_counts[i+1]] for i in range(cum_counts.size-1)]
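np.bincount only accepts non-negative integers and allocates one bin per value up to max(s). The sketch below is a variation that does not rely on bincount: it sorts once and splits wherever the sorted value changes. The function name duplicate_index_groups is made up for illustration, and no timing claims are made for 200 million values.

```python
import numpy as np

def duplicate_index_groups(values):
    """Return a list of index arrays, one per value that occurs more than once."""
    values = np.asarray(values)
    # A stable sort keeps the original indices in ascending order within each group.
    order = np.argsort(values, kind="stable")
    sorted_vals = values[order]
    # Positions in the sorted array where a new value starts.
    boundaries = np.flatnonzero(sorted_vals[1:] != sorted_vals[:-1]) + 1
    groups = np.split(order, boundaries)
    # Drop values that occur only once.
    return [g for g in groups if g.size > 1]

s = [2, 6, 2, 88, 6]
print([g.tolist() for g in duplicate_index_groups(s)])  # [[0, 2], [1, 4]]
```

The groups come out ordered by value, not by first occurrence in s.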

Numpy minimum like np.outer()

Maybe I'm just being lazy here, but let's say that I have two arrays, of length n and m, and I'd like a pairwise minimum of all of the elements of the two arrays compared against each other. For example:
a = [1,5,3]
b = [2,4]
cross_min(a,b)
= [[1,1],[2,4],[2,3]]
This is similar to the behavior of np.outer(), except that instead of multiplying the two arrays, it computes the minimum of the two elements.
Is there an operation in numpy that does a similar thing?
I know that I can just run np.minimum() along b and stack the results together. I'm wondering if this is a well-known operation that I just don't know the name of.
You can use np.minimum.outer(a, b)
Alternatively, you can turn one of the arrays into a 2d array and then rely on broadcasting with np.minimum:
import numpy as np
a = np.array([1,5,3])
b = np.array([2,4])
np.minimum(a[:,None], b)
#array([[1, 1],
# [2, 4],
# [2, 3]])
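Both suggestions compute the same table. A short sketch that checks this on the example from the question (the variable names via_outer and via_broadcast are made up for illustration):

```python
import numpy as np

a = np.array([1, 5, 3])
b = np.array([2, 4])

# Every binary ufunc has an .outer method that applies it to all pairs.
via_outer = np.minimum.outer(a, b)

# Same result via broadcasting: shape (3, 1) against shape (2,) gives (3, 2).
via_broadcast = np.minimum(a[:, None], b)

assert np.array_equal(via_outer, via_broadcast)
print(via_outer.tolist())  # [[1, 1], [2, 4], [2, 3]]
```

The same pattern applies to other binary ufuncs, for example np.maximum.outer or np.add.outer.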

Is there a one line code to find maximal value in a matrix?

To find the maximal value in a matrix of numbers, we can code 5 lines to solve the problem:
ans = matrix[0][0]
for x in range(len(matrix)):
    for y in range(len(matrix[0])):
        ans = max(ans, matrix[x][y])
return ans
Is there a one line solution for this problem?
The one that I came up with is pretty awkward actually:
return max(max(matrix, key=max))
or
return max(map(max, matrix))
You can use a generator expression to find the maximum in your matrix. That way you avoid building an intermediate list in memory.
maximum = max(max(row) for row in matrix)
instead of the list comprehension given in another answer here:
maximum = max([max(row) for row in matrix])
This is from PEP 289 (the rationale section):
...many of the use cases do not need to have a full list created in
memory. Instead, they only need to iterate over the elements one at a
time.
...
Generator expressions are especially useful with functions like sum(), min(), and max() that reduce an iterable input to a single value
...
The utility of generator expressions is greatly enhanced when combined with reduction functions like sum(), min(), and max().
Also, take a look at this SO post: Generator Expressions vs. List Comprehension.
By matrix, I assume you mean a 2d-list.
max([max(i) for i in matrix])
Using numpy.amax:
>>> import numpy as np
>>> my_array
array([[1, 2, 3],
       [9, 8, 6]])
>>> np.amax(my_array)
9
You can also flatten your array:
from itertools import chain
flatten = chain.from_iterable
max(flatten(matrix))
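The pure-Python one-liners agree on a regular matrix but behave differently when a row is empty. A small sketch, assuming a made-up sample matrix and a made-up ragged input:

```python
from itertools import chain

matrix = [[1, 2, 3], [9, 8, 6]]

by_rows = max(max(row) for row in matrix)      # maximum of the row maxima
by_map = max(map(max, matrix))                 # same idea, using map
by_flatten = max(chain.from_iterable(matrix))  # maximum over all elements
assert by_rows == by_map == by_flatten == 9

# Hypothetical ragged input with an empty row.
ragged = [[4], [], [7, 2]]
flat_max = max(chain.from_iterable(ragged))    # empty rows contribute nothing
try:
    max(max(row) for row in ragged)
    row_wise_failed = False
except ValueError:
    # max([]) raises ValueError, so the row-wise version fails here.
    row_wise_failed = True

print(flat_max, row_wise_failed)  # 7 True
```

If empty rows are possible, the flattening version is the more forgiving choice; all variants still fail on a matrix with no elements at all.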

using python itertools to manage nested for loops

I am trying to use itertools.product to manage the bookkeeping of some nested for loops, where the number of nested loops is not known in advance. Below is a specific example where I have chosen two nested for loops; the choice of two is only for clarity, what I need is a solution that works for an arbitrary number of loops.
This question provides an extension/generalization of the question appearing here:
Efficient algorithm for evaluating a 1-d array of functions on a same-length 1d numpy array
Now I am extending the above technique using an itertools trick I learned here:
Iterating over an unknown number of nested loops in python
Preamble:
from itertools import product
def trivial_functional(i, j): return lambda x : (i+j)*x
idx1 = [1, 2, 3, 4]
idx2 = [5, 6, 7]
joint = [idx1, idx2]
func_table = []
for items in product(*joint):
    f = trivial_functional(*items)
    func_table.append(f)
At the end of the above itertools loop, I have a 12-element, 1-d array of functions, func_table, each element having been built from the trivial_functional.
Question:
Suppose I am given a pair of integers, (i_1, i_2), where these integers are to be interpreted as the indices of idx1 and idx2, respectively. How can I use itertools.product to determine the correct corresponding element of the func_table array?
I know how to hack the answer by writing my own function that mimics the itertools.product bookkeeping, but surely there is a built-in feature of itertools.product that is intended for exactly this purpose?
I don't know of a way of calculating the flat index other than doing it yourself. Fortunately this isn't that difficult:
def product_flat_index(factors, indices):
    # The last factor varies fastest in product(), so accumulate in row-major order.
    flat = 0
    for factor, index in zip(factors, indices):
        flat = flat * len(factor) + index
    return flat
>>> product_flat_index(joint, (2, 1))
7
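itertools.product advances the last iterable fastest, so with idx1 of length 4 and idx2 of length 3 the pair of indices (2, 1) lands at position 2*3 + 1 = 7. The sketch below checks that row-major rule against the actual enumeration order; the helper name flat_index is made up for illustration. NumPy's np.ravel_multi_index performs the same computation.

```python
from itertools import product

def flat_index(lengths, indices):
    """Row-major position of an index tuple in product() order."""
    flat = 0
    for length, index in zip(lengths, indices):
        flat = flat * length + index
    return flat

idx1 = [1, 2, 3, 4]
idx2 = [5, 6, 7]
joint = [idx1, idx2]
lengths = [len(factor) for factor in joint]

# Enumerate every index tuple in the same order product(*joint) visits values.
for position, index_tuple in enumerate(product(*(range(n) for n in lengths))):
    assert flat_index(lengths, index_tuple) == position

values = list(product(*joint))
print(flat_index(lengths, (2, 1)), values[flat_index(lengths, (2, 1))])  # 7 (3, 6)
```

Because the helper loops over however many lengths it is given, it applies unchanged to three or more nested loops.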
An alternative approach is to store the results in a nested array in the first place, making translation unnecessary, though this is more complex:
from functools import reduce
from operator import getitem, setitem, itemgetter
def get_items(container, indices):
    return reduce(getitem, indices, container)
def set_items(container, indices, value):
    c = reduce(getitem, indices[:-1], container)
    setitem(c, indices[-1], value)
def initialize_table(lengths):
    if len(lengths) == 1: return [0] * lengths[0]
    # Build each subtable separately so nested lists are never shared.
    return [initialize_table(lengths[1:]) for _ in range(lengths[0])]
func_table = initialize_table(list(map(len, joint)))
for items in product(*map(enumerate, joint)):
    f = trivial_functional(*map(itemgetter(1), items))
    set_items(func_table, list(map(itemgetter(0), items)), f)
>>> get_items(func_table, (2, 1)) # same as func_table[2][1]
<function>
So numerous answers were quite useful, thanks to everyone for the solutions.
It turns out that if I recast the problem slightly with Numpy, I can accomplish the same bookkeeping, and solve the problem I was trying to solve with vastly improved speed relative to pure python solutions. The trick is just to use Numpy's reshape method together with the normal multi-dimensional array indexing syntax.
Here's how this works. We just convert func_table into a Numpy array, and reshape it:
import numpy as np
component_dimensions = [len(idx1), len(idx2)]
func_table = np.array(func_table).reshape(component_dimensions)
Now func_table can be used to return the correct function not just for a single 2d point, but for a full array of 2d points:
dim1_pts = [3,1,2,1,3,3,1,3,0]
dim2_pts = [0,1,2,1,2,0,1,2,1]
func_array = func_table[dim1_pts, dim2_pts]
As usual, Numpy to the rescue!
This is a little messy, but here you go:
from itertools import product
def trivial_functional(i, j): return lambda x : (i+j)*x
idx1 = [1, 2, 3, 4]
idx2 = [5, 6, 7]
joint = [enumerate(idx1), enumerate(idx2)]
func_map = {}
for indexes, items in map(lambda x: zip(*x), product(*joint)):
    f = trivial_functional(*items)
    func_map[indexes] = f
print(func_map[(2, 0)](5)) # 40 = (3+5)*5
I'd suggest using enumerate() in the right place:
from itertools import product
def trivial_functional(i, j): return lambda x : (i+j)*x
idx1 = [1, 2, 3, 4]
idx2 = [5, 6, 7]
joint = [idx1, idx2]
func_table = []
for items in product(*joint):
    f = trivial_functional(*items)
    func_table.append(f)
From what I understood from your comments and your code, func_table is simply indexed by the position at which a given input occurs in the sequence. You can access it again using:
for index, items in enumerate(product(*joint)):
    # because of the append(), index is now the
    # position of the function created from the
    # respective tuple in product(*joint)
    func_table[index](some_value)
