Average every 2 values in a list using a loop - python

I have a question about how to average every second element of a list in Python.
Example:
a = [1, 3, 4, 1, 5, 2]
Here the first average is (1 + 4 + 5)/3 and the second is (3 + 1 + 2)/3, so the new list would be:
amean = [3.3333, 2]
So far I can compute the first average, but I have no idea how to write a loop that goes back and starts averaging from the second element, (3 + 1 + 2)/3.
Here's what I have so far:
import numpy as np
a = [1., 3., 4., 1., 5., 2.]
def altElement(my_list):
    # take every second element, starting at index 0
    b = my_list[:len(my_list):2]
    print(b)
    return np.mean(b)
print(altElement(a))
Does anyone have any idea how to create this loop?

import numpy as np
a = np.asarray([1, 3, 4, 1, 5, 2])
print(a[::2].mean())   # elements at even indices: 1st, 3rd, 5th
print(a[1::2].mean())  # elements at odd indices: 2nd, 4th, 6th
Output:
3.33333333333
2.0
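Edit as per comment (average every 24 elements):
import numpy as np
a = range(1, 73)
# zip(*[iter(a)] * 24) yields consecutive, non-overlapping groups of 24 items
# (the Python 2 idiom map(None, ...) does not work in Python 3)
for chunk in zip(*[iter(a)] * 24):
    print(np.array(chunk).mean())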
Output:
12.5
36.5
60.5
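When the length is an exact multiple of the group size, the same group means can also be sketched with a reshape instead of a loop. This is an alternative to the iterator idiom above rather than the answerer's method, and the name chunk_means is only illustrative:

```python
import numpy as np

a = np.arange(1, 73)  # 72 values, assumed to be an exact multiple of 24

# each row holds one consecutive group of 24 values; average along each row
chunk_means = a.reshape(-1, 24).mean(axis=1)
print(chunk_means)  # [12.5 36.5 60.5]
```

If the length is not a multiple of 24, reshape raises a ValueError, so the trailing elements would have to be handled separately.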

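np.mean(my_list[1::2]) will give you the average of the other elements. (A plain list has no .mean() method, so my_list[1::2].mean() only works if my_list is a NumPy array.)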

If you want pure Python and not NumPy:
mean = [sum(a[i::2]) / len(a[i::2]) for i in range(2)]
On Python 2, use xrange instead of range, and add from __future__ import division or convert with map(float, a) so that integer division does not truncate the result.
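The comprehension above can be wrapped in a small helper so that the step is not hard-coded. This is a minimal Python 3 sketch; the function name stride_means and its step parameter are illustrative and not part of the original answer:

```python
def stride_means(values, step=2):
    """Return the mean of values[offset::step] for each offset in range(step)."""
    means = []
    for offset in range(step):
        group = values[offset::step]
        if group:  # skip offsets that have no elements (very short input)
            means.append(sum(group) / len(group))
    return means


a = [1, 3, 4, 1, 5, 2]
print(stride_means(a))  # [3.3333333333333335, 2.0]
```

Because it slices with a stride instead of reshaping, this version also works when the list length is not a multiple of step.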

Another approach, assuming you have an even number of elements: reshape the array so that the 1st, 3rd, 5th, ... elements land in the first column and the 2nd, 4th, 6th, ... elements land in the second column of a 2D array, then take the mean of each column:
b = np.array([a]).reshape(-1,2).mean(axis=0)
Example Output
>>> a = [1.,3.,4.,1., 5., 2.]
>>> b = np.array([a]).reshape(-1,2).mean(axis=0)
>>> b
array([ 3.33333333, 2. ])
The output is a NumPy array, so if you want a list, call the tolist() method on it:
>>> b.tolist()
[3.3333333333333335, 2.0]

The following solution is inefficient, but because the question is very basic, it may help to see the most basic approach before the more efficient ones that use NumPy or a list comprehension:
a = [1, 3, 4, 1, 5, 2]
list_1 = []
list_2 = []
for idx, elem in enumerate(a):
    if idx % 2 == 0:
        list_1.append(elem)
    else:
        list_2.append(elem)
print("Mean of the elements at even indices:", sum(list_1) / float(len(list_1)))
print("Mean of the elements at odd indices:", sum(list_2) / float(len(list_2)))

Related

Python sum values from multiple lists (more than two)

Looking for a pythonic way to sum values from multiple lists:
I have got the following list of lists:
a = [0,5,2]
b = [2,1,1]
c = [1,1,1]
d = [5,3,4]
my_list = [a,b,c,d]
I am looking for the output:
[8,10,8]
I've used:
print ([sum(x) for x in zip(*my_list )])
but zip only works when I have 2 elements in my_list.
Any idea?
zip works for an arbitrary number of iterables:
>>> list(map(sum, zip(*my_list)))
[8, 10, 8]
This is roughly equivalent to your comprehension, which also works:
>>> [sum(x) for x in zip(*my_list)]
[8, 10, 8]
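As a self-contained check, the question's four lists can be summed column by column as follows. The second half is an assumption beyond the question: zip stops at the shortest input, so for lists of unequal length itertools.zip_longest with a fill value of 0 is one possible choice. The names column_sums, ragged and ragged_sums are illustrative:

```python
from itertools import zip_longest

a = [0, 5, 2]
b = [2, 1, 1]
c = [1, 1, 1]
d = [5, 3, 4]
my_list = [a, b, c, d]

# zip(*my_list) yields one tuple per position: (0, 2, 1, 5), (5, 1, 1, 3), ...
column_sums = [sum(column) for column in zip(*my_list)]
print(column_sums)  # [8, 10, 8]

# hypothetical ragged input: missing positions are treated as 0
ragged = [[1, 2, 3], [10, 20]]
ragged_sums = [sum(column) for column in zip_longest(*ragged, fillvalue=0)]
print(ragged_sums)  # [11, 22, 3]
```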
NumPy has a nice way of doing this, and it also handles very large arrays. First, create my_list as a NumPy array:
import numpy as np
a = [0,5,2]
b = [2,1,1]
c = [1,1,1]
d = [5,3,4]
my_list = np.array([a,b,c,d])
To get the sum of each column (summing down the rows), use axis=0:
np.sum(my_list, axis=0)
Alternatively, the sum of each row is given by axis=1:
np.sum(my_list, axis=1)
I'd make it a NumPy array and then sum along axis 0:
import numpy
my_list = numpy.array([a, b, c, d])
print(my_list.sum(axis=0))
Output:
[ 8 10 8]

Loop over clump_masked indices

I have an array y_filtered that contains some masked values. I want to replace these values by some value I calculate based on their neighbouring values. I can get the indices of the masked values by using masked_slices = ma.clump_masked(y_filtered). This returns a list of slices, e.g. [slice(194, 196, None)].
I can easily get the values from my masked array by using y_filtered[masked_slices], and even loop over them. However, I need to access the index of each value as well, so I can calculate its new value based on its neighbours. Enumerate (logically) returns 0, 1, etc. instead of the indices I need.
Here's the solution I came up with.
# get indices of masked data
masked_slices = ma.clump_masked(y_filtered)
y_enum = [(i, y_i) for i, y_i in zip(range(len(y_filtered)), y_filtered)]
for sl in masked_slices:
    for i, y_i in y_enum[sl]:
        # simplified example calculation
        y_filtered[i] = np.average(y_filtered[i-2:i+2])
It is a very ugly method in my opinion, and I think there has to be a better way to do this. Any suggestions?
Thanks!
EDIT:
I figured out a better way to achieve what I think you want to do. This code takes every window of 5 elements and computes its (masked) average, then uses those values to fill the gaps in the original array. If an index does not have any unmasked value close enough, it is simply left masked:
import numpy as np
from numpy.lib.stride_tricks import as_strided
SMOOTH_MARGIN = 2
x = np.ma.array(data=[1, 2, 3, 4, 5, 6, 8, 9, 10],
                mask=[0, 1, 0, 0, 1, 1, 1, 1, 0])
print(x)
# [1 -- 3 4 -- -- -- -- 10]
pad_data = np.pad(x.data, (SMOOTH_MARGIN, SMOOTH_MARGIN), mode='constant')
pad_mask = np.pad(x.mask, (SMOOTH_MARGIN, SMOOTH_MARGIN), mode='constant',
                  constant_values=True)
k = 2 * SMOOTH_MARGIN + 1
isize = x.dtype.itemsize
msize = x.mask.dtype.itemsize
x_pad = np.ma.array(
    data=as_strided(pad_data, (len(x), k), (isize, isize), writeable=False),
    mask=as_strided(pad_mask, (len(x), k), (msize, msize), writeable=False))
x_avg = np.ma.average(x_pad, axis=1).astype(x_pad.dtype)
fill_mask = ~x_avg.mask & x.mask
result = x.copy()
result[fill_mask] = x_avg[fill_mask]
print(result)
# [1 2 3 4 3 4 10 10 10]
(note all the values are integers here because x was originally of integer type)
The originally posted code has a few errors. First, it both reads and writes values in y_filtered inside the loop, so the results at later indices are affected by earlier iterations; this can be fixed by reading from a copy of the original y_filtered. Second, [i-2:i+2] should probably be [max(i-2, 0):i+3], so that the window is symmetric around i and never starts before index zero.
You could do this:
from itertools import chain
import numpy as np
import numpy.ma as ma
# get indices of masked data
masked_slices = ma.clump_masked(y_filtered)
# walk over every masked index, one clump at a time
for idx in chain.from_iterable(range(s.start, s.stop) for s in masked_slices):
    y_filtered[idx] = np.average(y_filtered[max(idx - 2, 0):idx + 3])
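The loop above still reads from the array it is writing to. A minimal sketch of the copy-based fix mentioned earlier might look like this; the sample data and the name source are assumptions made for illustration, and the window mean stands in for whatever neighbour-based calculation is really needed:

```python
import numpy as np
import numpy.ma as ma

# hypothetical sample data: indices 2 and 3 are masked
y_filtered = ma.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
                      mask=[0, 0, 1, 1, 0, 0])

# read neighbours from an untouched copy, write results into y_filtered
source = y_filtered.copy()

for sl in ma.clump_masked(source):
    for idx in range(sl.start, sl.stop):
        window = source[max(idx - 2, 0):idx + 3]
        # the masked mean ignores masked entries inside the window
        y_filtered[idx] = window.mean()

print(y_filtered)
```

Here index 2 is filled with (1 + 2 + 5)/3 and index 3 with (2 + 5 + 6)/3, and neither result depends on the other because all reads come from source.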

Python: How do you make a variable which can be used as an index call to slice another variable?

I am coming from a MATLAB background and moving over to Python. I am trying to set up a variable holding a vector of indices that can then be used to slice another array.
In MATLAB I would do this:
A = [2,3,4,5,6; 9,4,3,2,1; 5,4,3,2,5]; %some arbitrary matrix
begin = 2; %the first index I want to pull
end = 4; %the last index I want to pull
idx = 2:4; %the vector of indices I want
A(:,idx) %results in me pulling out the 2nd, 3rd and 4th column of A
Now in Python, what is the equivalent?
import numpy as np
A = np.array([[2,3,4,5,6],[9,4,3,2,1],[5,4,3,2,5]]) #some arbitrary matrix
begin = 1 #first index
end = 3 #last index
idx = ??? #This is the part I don't know! <<<-------------------
A[:,idx] #I want the same result as the Matlab example above
Obviously for this trivial example I could just have idx = [1,2,3], but I have much more complicated scenario in real life where I cannot write out the indices manually.
I have tried using the range and np.arange functions but they give the error that the object is not callable.
When I look at some MATLAB-to-NumPy conversion guides, they suggest that the idx = 2:4 command in MATLAB is equivalent to idx = range(1,3) in Python, but this is apparently not quite true?
Any help is appreciated.
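You need slice. Note that, as with range, the stop value is excluded, so slice(begin, end) selects only columns 1 and 2 here; use slice(begin, end + 1) to include column 3 as well: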
>>> import numpy as np
>>> A = np.array([[2,3,4,5,6],[9,4,3,2,1],[5,4,3,2,5]])
>>> begin = 1
>>> end = 3
>>> s = slice(begin, end)
>>> A[:,s]
array([[3, 4],
       [4, 3],
       [4, 3]])
You would need to do this:
idx = range(begin, end + 1)
Notice that you need to add 1 to the end value because range does not include its final value; range(begin, end) stops at end - 1.
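To match the MATLAB result (three columns), the inclusive end has to be turned into an exclusive stop. The sketch below compares a slice object with an index array; the names idx_slice and idx_array are illustrative:

```python
import numpy as np

A = np.array([[2, 3, 4, 5, 6],
              [9, 4, 3, 2, 1],
              [5, 4, 3, 2, 5]])
begin = 1  # first column to pull (0-based)
end = 3    # last column to pull (0-based, inclusive)

idx_slice = slice(begin, end + 1)      # basic slicing: returns a view of A
idx_array = np.arange(begin, end + 1)  # integer-array indexing: returns a copy

print(A[:, idx_slice])
# [[3 4 5]
#  [4 3 2]
#  [4 3 2]]
print(np.array_equal(A[:, idx_slice], A[:, idx_array]))  # True
```

The slice is the cheaper option when the indices are contiguous; an index array is needed only when the columns are irregular.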
A fairly general and convenient way to "freeze" an indexing expression is np.s_:
a = np.arange(12).reshape(3, 4)
idx = np.s_[1:3]
a[:, idx]
# array([[ 1, 2],
# [ 5, 6],
# [ 9, 10]])
idx = np.s_[::2, [1, 3, 0]]
a[idx]
# array([[ 1, 3, 0],
# [ 9, 11, 8]])

How can I do this NumPy array operation in an efficient manner?

I am new to python.
If I have an (m x n) array, how can I find which column has the most repetitions of a particular value, e.g. 1? Is there an easy operation for this, rather than writing iterative loops?
Welcome to Python and NumPy. You can get the column with the most 1s by first checking which values in your array are 1, then counting the matches along each column, and finally taking the argmax. In code it looks something like this:
>>> import numpy as np
>>> (m, n) = (4, 5)
>>> a = np.zeros((m, n))
>>> a[2, 3] = 1.
>>>
>>> a_eq_1 = a == 1
>>> repetitions = a_eq_1.sum(axis=0)
>>> np.argmax(repetitions)
3
Or more compactly:
>>> np.argmax((a == 1).sum(axis=0))
3
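A slightly larger example makes the per-column counts visible. This sketch uses np.count_nonzero with an axis argument, which is an alternative to summing the boolean array; the sample array and the names counts and best_column are made up for illustration:

```python
import numpy as np

# hypothetical 3 x 3 array
a = np.array([[1, 0, 1],
              [0, 0, 1],
              [1, 0, 1]])

# count how many entries equal 1 in each column
counts = np.count_nonzero(a == 1, axis=0)

# argmax returns the first column index if several columns tie
best_column = int(np.argmax(counts))
print(counts, best_column)  # [2 0 3] 2
```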

compare two following values in numpy array

What is the best way to access two consecutive values in a NumPy array?
Example:
npdata = np.array([13,15,20,25])
for i in range( len(npdata) ):
    print(npdata[i] - npdata[i+1])
This looks really messy and also needs exception handling for the last iteration of the loop.
Any ideas?
Thanks!
NumPy provides the function diff for this basic use case:
>>> import numpy
>>> x = numpy.array([1, 2, 4, 7, 0])
>>> numpy.diff(x)
array([ 1, 2, 3, -7])
Your snippet computes something closer to -numpy.diff(x).
How about range(len(npdata) - 1)?
Here's the code (using a plain list, but the idea is the same):
>>> ar = [1, 2, 3, 4, 5]
>>> for i in range(len(ar) - 1):
...     print(ar[i] + ar[i + 1])
...
3
5
7
9
As you can see it successfully prints the sums of all consecutive pairs in the array, without any exceptions for the last iteration.
You can use ediff1d to get differences of consecutive elements. More generally, a[1:] - a[:-1] will give the differences of consecutive elements and can be used with other operators as well.
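Applied to the question's data, the slicing idiom can be sketched as follows. The names forward and backward are illustrative; backward reproduces what the question's loop was trying to print:

```python
import numpy as np

npdata = np.array([13, 15, 20, 25])

forward = npdata[1:] - npdata[:-1]   # next minus current, same as np.diff(npdata)
backward = npdata[:-1] - npdata[1:]  # current minus next, as in the question's loop

print(forward)   # [2 5 5]
print(backward)  # [-2 -5 -5]

# the same pairing also works in an explicit loop, with no index arithmetic
for left, right in zip(npdata[:-1], npdata[1:]):
    print(left - right)
```

Both slices have length len(npdata) - 1, so no special handling of the last element is needed.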
