Using map to find the average of nested lists - python

I am trying to find a concise single line of code that will calculate the mean of each nested list. The input is a two-dimensional list of integers and the output is a float value for each nested list. The kicker is that I am trying to do this with the map() built-in, but am unsure how. I'm just trying to play around with a couple of things.
Comprehension code:
row_sum = [(sum(idx)/float(len(idx))) for idx in matrix]
return row_sum
Any tips would be greatly appreciated.

If you're intent on using map, this should work:
row_sum = list(map(lambda idx: sum(idx)/float(len(idx)), matrix))

Seems pretty straightforward. You can either write your own "mean" function or use the one from the statistics library:
>>> import statistics
>>> rows = [[1,2,3], [4,5,6]]
>>> list(map(statistics.mean, rows))
[2, 5]
I'm on Python 3, so / is not integer division:
>>> def average(lst): return sum(lst)/len(lst)
...
>>> list(map(average, rows))
[2.0, 5.0]
Interesting that statistics.mean returned an int...
>>> rows = [[1,2,3], [4,5,6,3]]
>>> list(map(statistics.mean, rows))
[2, 4.5]
Very interesting...
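That behavior comes from statistics.mean preserving the input type when it can: with integer data and an exact integer mean it returns an int, otherwise a float. If you always want floats, one option (a small sketch, not from the original answers) is to convert each row before averaging:
>>> import statistics
>>> rows = [[1, 2, 3], [4, 5, 6, 3]]
>>> list(map(lambda r: statistics.mean(map(float, r)), rows))
[2.0, 4.5]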

Related

Find minimum values of both "columns" of list of lists

Given a list like the following:
foo_list = [[1,8],[2,7],[3,6]]
I've found in questions like "Tuple pairs, finding minimum using python" and
"minimum of list of lists" that the pair with the minimum value in a list of lists can be found using a generator like:
min(x for x in foo_list)
which returns
[1, 8]
But I was wondering if there is a similar way to return both minimum values of the "columns" of the list:
output = [1,6]
I know this can be achieved using numpy arrays:
output = np.min(np.array(foo_list), axis=0)
But I'm interested in finding such a way of doing so with generators (if possible).
Thanks in advance!
[min(l) for l in zip(*foo_list)]
returns [1, 6]
zip(*foo_list) gives the transpose of the list, and min() then finds the minimum of each resulting column.
Thanks @mousetail for the suggestion.
You can use two min() calls for this, like:
min1 = min(a for a, _ in foo_list)
min2 = min(b for _, b in foo_list)
print([min1, min2])
Will this do? If you don't want to use a third-party library like numpy, you can also use a plain old loop, which does it in a single pass (see the sketch below).
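A sketch of that plain-loop version, finding both column minimums in a single pass (assumes the two-element rows from the question):
foo_list = [[1, 8], [2, 7], [3, 6]]
min1 = min2 = float('inf')
for a, b in foo_list:
    if a < min1:
        min1 = a
    if b < min2:
        min2 = b
print([min1, min2])  # [1, 6]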

How do you find the largest/smallest number amongst several array's?

I'm trying to get the largest/smallest number returned out of two or more numpy.array of equal length. Since the max()/min() functions don't work on multiple arrays, this is some of the best (worst) I've come up with:
max(max(a1), max(a2), max(a3), ...) / min(min(a1), min(a2), min(a3), ...)
Alternatively one can use numpy's maximum (and minimum), but those only work on two arrays at a time.
Thanks in advance
This is linear time and works with NumPy arrays:
>>> import itertools
>>> max(itertools.chain([1,2,3], [1,2,4], [-1, -2, 5]))
5
max()/min() function doesn't work on multiple arrays
map will work on a list of arrays and return a single list (in Python 3, wrap it in list() to see the values), and you can then find the max or min of that list.
>>> l1 = [1,2,3]
>>> l2 = [3,4,1]
>>> l3 = [6,1,8]
>>> map(max, [l1, l2, l3])
[3, 4, 8]
>>> max(map(max, [l1, l2, l3]))
8
Combine your arrays into one, then take the min/max along the new axis (or drop the axis argument entirely to get a single overall value):
A = np.array([a1,a2, ... , an])
A.min(axis=0), A.max(axis=0)
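For a concrete runnable sketch of that last approach (a1, a2, a3 here are small made-up arrays, not from the question): stacking gives one combined array, a plain .min()/.max() gives the single overall value, and axis=0 gives the elementwise result across the arrays.
import numpy as np

a1 = np.array([1, 2, 3])
a2 = np.array([1, 2, 4])
a3 = np.array([-1, -2, 5])

A = np.stack([a1, a2, a3])              # shape (3, 3): one row per input array
print(A.min(), A.max())                 # -2 5  -> single smallest/largest value
print(A.min(axis=0), A.max(axis=0))     # [-1 -2  3] [1 2 5] -> elementwise across the arrays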

using python itertools to manage nested for loops

I am trying to use itertools.product to manage the bookkeeping of some nested for loops, where the number of nested loops is not known in advance. Below is a specific example where I have chosen two nested for loops; the choice of two is only for clarity, what I need is a solution that works for an arbitrary number of loops.
This question provides an extension/generalization of the question appearing here:
Efficient algorithm for evaluating a 1-d array of functions on a same-length 1d numpy array
Now I am extending the above technique using an itertools trick I learned here:
Iterating over an unknown number of nested loops in python
Preamble:
from itertools import product
def trivial_functional(i, j): return lambda x : (i+j)*x
idx1 = [1, 2, 3, 4]
idx2 = [5, 6, 7]
joint = [idx1, idx2]
func_table = []
for items in product(*joint):
    f = trivial_functional(*items)
    func_table.append(f)
At the end of the above itertools loop, I have a 12-element, 1-d array of functions, func_table, each element having been built from the trivial_functional.
Question:
Suppose I am given a pair of integers, (i_1, i_2), where these integers are to be interpreted as the indices of idx1 and idx2, respectively. How can I use itertools.product to determine the correct corresponding element of the func_table array?
I know how to hack the answer by writing my own function that mimics the itertools.product bookkeeping, but surely there is a built-in feature of itertools.product that is intended for exactly this purpose?
I don't know of a way of calculating the flat index other than doing it yourself. Fortunately this isn't that difficult:
def product_flat_index(factors, indices):
    if len(factors) == 1: return indices[0]
    # the last factor varies fastest in itertools.product, so weight earlier indices accordingly
    else: return product_flat_index(factors[:-1], indices[:-1]) * len(factors[-1]) + indices[-1]

>>> product_flat_index(joint, (2, 1))
7
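As an aside (not part of the original answer), NumPy ships a helper that computes the same row-major flat index from per-axis indices and the per-factor lengths:
import numpy as np
print(np.ravel_multi_index((2, 1), (4, 3)))  # 7, with (4, 3) = (len(idx1), len(idx2))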
An alternative approach is to store the results in a nested array in the first place, making translation unnecessary, though this is more complex:
from functools import reduce
from operator import getitem, setitem, itemgetter
def get_items(container, indices):
    return reduce(getitem, indices, container)
def set_items(container, indices, value):
    c = reduce(getitem, indices[:-1], container)
    setitem(c, indices[-1], value)
def initialize_table(lengths):
    if len(lengths) == 1: return [0] * lengths[0]
    subtable = initialize_table(lengths[1:])
    return [subtable[:] for _ in range(lengths[0])]
func_table = initialize_table(list(map(len, joint)))
for items in product(*map(enumerate, joint)):
    f = trivial_functional(*map(itemgetter(1), items))
    set_items(func_table, list(map(itemgetter(0), items)), f)
>>> get_items(func_table, (2, 1)) # same as func_table[2][1]
<function>
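Continuing the session above, the retrieved function can be called directly (a quick usage check with the same trivial_functional):
>>> get_items(func_table, (2, 1))(10)   # (idx1[2] + idx2[1]) * x = (3 + 6) * 10
90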
So numerous answers were quite useful, thanks to everyone for the solutions.
It turns out that if I recast the problem slightly with Numpy, I can accomplish the same bookkeeping, and solve the problem I was trying to solve with vastly improved speed relative to pure python solutions. The trick is just to use Numpy's reshape method together with the normal multi-dimensional array indexing syntax.
Here's how this works. We just convert func_table into a Numpy array, and reshape it:
func_table = np.array(func_table)
component_dimensions = [len(idx1), len(idx2)]
func_table = func_table.reshape(component_dimensions)
Now func_table can be used to return the correct function not just for a single 2d point, but for a full array of 2d points:
dim1_pts = [3,1,2,1,3,3,1,3,0]
dim2_pts = [0,1,2,1,2,0,1,2,1]
func_array = func_table[dim1_pts, dim2_pts]
As usual, Numpy to the rescue!
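Putting the pieces together, a self-contained sketch of the reshape approach (the evaluation loop at the end is just one way to call the selected functions; x = 10 is an arbitrary test value):
import numpy as np
from itertools import product

def trivial_functional(i, j): return lambda x: (i + j) * x

idx1 = [1, 2, 3, 4]
idx2 = [5, 6, 7]

func_table = [trivial_functional(*items) for items in product(idx1, idx2)]
func_table = np.array(func_table).reshape(len(idx1), len(idx2))  # 4 x 3 array of functions

dim1_pts = [3, 1, 2]   # indices into idx1
dim2_pts = [0, 1, 2]   # indices into idx2
func_array = func_table[dim1_pts, dim2_pts]   # picks the functions for (4,5), (2,6), (3,7)

print([f(10) for f in func_array])  # [90, 80, 100]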
This is a little messy, but here you go:
from itertools import product
def trivial_functional(i, j): return lambda x : (i+j)*x
idx1 = [1, 2, 3, 4]
idx2 = [5, 6, 7]
joint = [enumerate(idx1), enumerate(idx2)]
func_map = {}
for indexes, items in map(lambda x: zip(*x), product(*joint)):
    f = trivial_functional(*items)
    func_map[indexes] = f
print(func_map[(2, 0)](5)) # 40 = (3+5)*5
I'd suggest using enumerate() in the right place:
from itertools import product
def trivial_functional(i, j): return lambda x : (i+j)*x
idx1 = [1, 2, 3, 4]
idx2 = [5, 6, 7]
joint = [idx1, idx2]
func_table = []
for items in product(*joint):
    f = trivial_functional(*items)
    func_table.append(f)
From what I understood from your comments and your code, func_table is simply indexed by the position at which each input tuple occurs in the sequence. You can access it again using:
for index, items in enumerate(product(*joint)):
    # because of the append(), index is now the
    # position of the function created from the
    # respective tuple in joint
    func_table[index](some_value)

How to calculate mean in python?

I have a list for which I want to calculate the average (mean?) of the values.
When I do this:
import numpy as np #in the beginning of the code
goodPix = ['96.7958', '97.4333', '96.7938', '96.2792', '97.2292']
PixAvg = np.mean(goodPix)
I'm getting this error code:
ret = um.add.reduce(arr, axis=axis, dtype=dtype, out=out, keepdims=keepdims)
TypeError: cannot perform reduce with flexible type
I tried to find some help but didn't find anything helpful.
Thank you all.
Convert your list from strings to floats (note np.float was removed in newer NumPy releases; the built-in float works as the dtype):
>>> gp = np.array(goodPix, dtype=float)
>>> np.mean(gp)
96.906260000000003
There is a statistics library if you are using Python >= 3.4:
https://docs.python.org/3/library/statistics.html
You may use its mean() function like this. Let's say you have a list of numbers of which you want to find the mean:
nums = [11, 13, 12, 15, 17]
import statistics as s
s.mean(nums)
It has other functions too, like stdev, variance, mode, etc.
The list elements are still strings instead of floats. Try the following:
goodPix = ['96.7958', '97.4333', '96.7938', '96.2792', '97.2292']
gp2 = []
for i in goodPix:
    gp2.append(float(i))
np.mean(gp2)
Using a list comprehension:
>>> np.mean([float(n) for n in goodPix])
96.906260000000003
If you're not using numpy, the obvious way to calculate the arithmetic mean of a list of values is to divide the sum of all elements by the number of elements, which is easily achieved using the two built-ins sum() and len(), e.g.:
>>> l = [1,3]
>>> sum(l)/len(l)
2.0
In case the list elements are strings, one way to convert them is with a list comprehension:
>>> s = ['1','3']
>>> l = [float(e) for e in s]
>>> l
[1.0, 3.0]
For an integer result, use the // operator ("floored quotient of x and y") or convert with int().
For many other solutions, also see Calculating arithmetic mean (one type of average) in Python
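Applied to the question's goodPix list of strings, the same sum()/len() idea without numpy looks like this (just a sketch):
goodPix = ['96.7958', '97.4333', '96.7938', '96.2792', '97.2292']
mean = sum(float(p) for p in goodPix) / len(goodPix)
print(mean)  # roughly 96.90626, matching the numpy result above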

How can I sum a column of a list?

I have a Python array, like so:
[[1,2,3],
[1,2,3]]
I can sum a row by doing sum(array[i]); how can I sum a column, using a double for loop?
I.e. for the first column I'd get 2, then 4 for the second, then 6 for the third.
Using a for loop (in a generator expression):
data = [[1,2,3],
[1,2,3]]
column = 1
print(sum(row[column] for row in data)) # -> 4
Try this:
a = [[1,2,3],
[1,2,3]]
print([sum(x) for x in zip(*a)])
zip function description
You don't need a loop; use zip() to transpose the list, then take the desired column:
sum(list(zip(*data))[i])
(Note in 2.x, zip() returns a list, so you don't need the list() call).
Edit: The simplest solution to this problem, without using zip(), would probably be:
column_sum = 0
for row in data:
    column_sum += row[i]
We just loop through the rows, taking the element and adding it to our total.
This is, however, less efficient and rather pointless given we have built-in functions to do this for us. In general, use zip().
[sum(row[i] for row in array) for i in range(len(array[0]))]
That should do it. len(array[0]) is the number of columns, so i iterates through those. The generator expression row[i] for row in array goes through all of the rows and selects a single column, for each column number.
I think the easiest way (assuming data is a numpy array) is this:
sumcolumn = data.sum(axis=0)
print(sumcolumn)
you can use zip():
In [16]: lis=[[1,2,3],
....: [1,2,3]]
In [17]: map(sum,zip(*lis))
Out[17]: [2, 4, 6]
or with a simple nested for loop:
In [25]: for i in xrange(len(lis[0])):
   ....:     summ = 0
   ....:     for x in lis:
   ....:         summ += x[i]
   ....:     print summ
   ....:
2
4
6
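Note that the transcript above is Python 2 (map() returns a list, xrange, print statement). A rough Python 3 equivalent:
lis = [[1, 2, 3],
       [1, 2, 3]]
print(list(map(sum, zip(*lis))))        # [2, 4, 6]
for i in range(len(lis[0])):            # or with plain loops
    print(sum(row[i] for row in lis))   # 2, then 4, then 6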
You may be interested in numpy, which has more advanced array features.
One of which is to easily sum a column:
from numpy import array
a = array([[1,2,3],
[1,2,3]])
column_idx = 1
a[:, column_idx].sum() # ":" here refers to the whole array, no filtering.
You can use numpy:
import numpy as np
a = np.array([[1,2,3],[1,2,3]])
a.sum(0)
