Python: How to get values of an array at certain index positions? - python

I have a numpy array like this:
a = [0,88,26,3,48,85,65,16,97,83,91]
How can I get the values at certain index positions in ONE step? For example:
ind_pos = [1,5,7]
The result should be:
[88,85,16]

Just index using your ind_pos (this requires a to be a NumPy array rather than a plain list):
ind_pos = [1,5,7]
print (a[ind_pos])
[88 85 16]
In [55]: a = [0,88,26,3,48,85,65,16,97,83,91]
In [56]: import numpy as np
In [57]: arr = np.array(a)
In [58]: ind_pos = [1,5,7]
In [59]: arr[ind_pos]
Out[59]: array([88, 85, 16])
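The difference between the two snippets above is whether a is a plain list or a NumPy array. A minimal sketch of that distinction, reusing the names from the question (the try/except is only there for illustration):

```python
import numpy as np

a = [0, 88, 26, 3, 48, 85, 65, 16, 97, 83, 91]
ind_pos = [1, 5, 7]

# A plain Python list cannot be indexed with another list.
try:
    a[ind_pos]
    list_indexing_works = True
except TypeError:
    list_indexing_works = False

# Converting to an ndarray enables integer array ("fancy") indexing.
arr = np.asarray(a)
picked = arr[ind_pos]

print(list_indexing_works)  # False
print(picked.tolist())      # [88, 85, 16]
```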

The one liner "no imports" version
a = [0,88,26,3,48,85,65,16,97,83,91]
ind_pos = [1,5,7]
[ a[i] for i in ind_pos ]

Although you ask about numpy arrays, you can get the same behavior for regular Python lists by using operator.itemgetter.
>>> from operator import itemgetter
>>> a = [0,88,26,3,48,85,65,16,97,83,91]
>>> ind_pos = [1, 5, 7]
>>> print(itemgetter(*ind_pos)(a))
(88, 85, 16)
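One caveat with itemgetter: when given a single index it returns the bare item rather than a 1-tuple, and it cannot be called with no indices at all. The sketch below wraps it in a helper named pick, which is a hypothetical name introduced here for illustration only, so that the result is always a list:

```python
from operator import itemgetter

a = [0, 88, 26, 3, 48, 85, 65, 16, 97, 83, 91]

def pick(seq, indices):
    """Hypothetical helper: return seq[i] for each i in indices, always as a list."""
    indices = list(indices)
    if not indices:
        # itemgetter() requires at least one argument.
        return []
    if len(indices) == 1:
        # With one argument, itemgetter returns the item itself, not a tuple.
        return [itemgetter(indices[0])(seq)]
    return list(itemgetter(*indices)(seq))

print(pick(a, [1, 5, 7]))  # [88, 85, 16]
print(pick(a, [5]))        # [85]
```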

You can use index arrays, simply pass your ind_pos as an index argument as below:
a = np.array([0,88,26,3,48,85,65,16,97,83,91])
ind_pos = np.array([1,5,7])
print(a[ind_pos])
# [88,85,16]
Index arrays do not have to be numpy arrays; they can also be lists or other sequence-like objects (though not tuples, which numpy interprets as one index per dimension).
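To make the tuple exception concrete, here is a small sketch using the array from the question. The flag name tuple_ok is just an illustrative choice:

```python
import numpy as np

a = np.array([0, 88, 26, 3, 48, 85, 65, 16, 97, 83, 91])

from_list = a[[1, 5, 7]]             # list -> integer array indexing
from_array = a[np.array([1, 5, 7])]  # ndarray -> same result

# A tuple is treated as one index per dimension, i.e. a[1, 5, 7],
# which fails on a 1-D array.
try:
    a[(1, 5, 7)]
    tuple_ok = True
except IndexError:
    tuple_ok = False

print(from_list.tolist(), from_array.tolist(), tuple_ok)
```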

Your code would be:
a = [0,88,26,3,48,85,65,16,97,83,91]
ind_pos = [a[1],a[5],a[7]]
print(ind_pos)
This prints [88, 85, 16].

Related

How to sum arrays in nested arrays?

I have a nested array
array([[1,2,4], [2,5,6]])
I want to sum each array in it to get:
array([[7], [13]])
How to do that? When I sum along axis 1, e.g. np.sum(np.array([[1,2,4], [2,5,6]]), axis=1), it gives
array([7, 13])
Using sum over axis 1:
>>> a = np.array([[1,2,4], [2,5,6]])
>>> a.sum(axis=1, keepdims=True)
array([[ 7],
       [13]])
Or without numpy:
>>> a = [[1,2,4], [2,5,6]]
>>> [[sum(l)] for l in a]
[[7], [13]]
I am not sure what the array() function is, but if it's just a list,
then this should work:
a=array([[1,2,4], [2,5,6]])
b=[[sum(x)] for x in a] #new list of answers
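For the numpy route, the shape difference that the question runs into can be seen directly. This is a minimal sketch; the variable names flat, column and same are illustrative:

```python
import numpy as np

a = np.array([[1, 2, 4], [2, 5, 6]])

flat = a.sum(axis=1)                    # shape (2,): the summed axis is dropped
column = a.sum(axis=1, keepdims=True)   # shape (2, 1): the summed axis is kept
same = a.sum(axis=1)[:, np.newaxis]     # equivalent: add the axis back afterwards

print(flat.shape, column.shape)
print(column.tolist())  # [[7], [13]]
```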

Python sum values from multiple lists (more than two)

Looking for a pythonic way to sum values from multiple lists:
I have got the following list of lists:
a = [0,5,2]
b = [2,1,1]
c = [1,1,1]
d = [5,3,4]
my_list = [a,b,c,d]
I am looking for the output:
[8,10,8]
I've used:
print ([sum(x) for x in zip(*my_list )])
but zip only works when I have 2 elements in my_list.
Any idea?
zip works for an arbitrary number of iterables:
>>> list(map(sum, zip(*my_list)))
[8, 10, 8]
which is, of course, roughly equivalent to your comprehension which also works:
>>> [sum(x) for x in zip(*my_list)]
[8, 10, 8]
Numpy has a nice way of doing this, and it can also handle very large arrays. First we create my_list as a numpy array:
import numpy as np
a = [0,5,2]
b = [2,1,1]
c = [1,1,1]
d = [5,3,4]
my_list = np.array([a,b,c,d])
To get the sum over the columns, you can do the following
np.sum(my_list, axis=0)
Alternatively, the sum over the rows can be retrieved by
np.sum(my_list, axis=1)
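As a quick check of which axis does what, here is the same data with both results written out:

```python
import numpy as np

my_list = np.array([[0, 5, 2],
                    [2, 1, 1],
                    [1, 1, 1],
                    [5, 3, 4]])

col_sums = np.sum(my_list, axis=0)  # collapses the rows: one total per column
row_sums = np.sum(my_list, axis=1)  # collapses the columns: one total per row

print(col_sums.tolist())  # [8, 10, 8]
print(row_sums.tolist())  # [7, 4, 3, 12]
```

axis=0 is the one that matches the output asked for in the question.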
I'd make it a numpy array and then sum along axis 0:
import numpy
my_list = numpy.array([a,b,c,d])
my_list.sum(axis=0)
Output:
[ 8 10 8]

How to subset a list based on an array in Python

I have following column names in a list:
vars = ['age','balance','day','duration','campaign','pdays','previous','job_admin.','job_blue-collar']
I have one array which consists of array indexes
(array([1, 5, 7], dtype=int64),)
I want to subset the list based on array indexes
Desired output should be
vars = ['balance','pdays','job_admin.']
I have tried something like this in python
for i, a in enumerate(X):
    if i in new_L:
        print(i)
But it does not work.
Simply use a loop to do that:
result=[]
for i in your_array:
    result.append(vars[i])
or as a one-liner:
[vars[i] for i in your_array]
If you're using numpy anyway, use its advanced indexing
import numpy as np
vars = ['age','balance','day','duration','campaign','pdays',
        'previous','job_admin.','job_blue-collar']
indices = (np.array([1, 5, 7]),)
sub_array = np.asarray(vars)[indices]
# --> array(['balance', 'pdays', 'job_admin.'], dtype='<U15')
or if you want a list
sub_list = np.asarray(vars)[indices].tolist()
# --> ['balance', 'pdays', 'job_admin.']
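The index in the question is a one-element tuple holding an array, which is the shape np.where returns. Assuming that is where it came from (the question does not say), a sketch of unpacking it to index the plain list could look like this. The mask is made up for illustration, and the list is named vars_ here only to avoid shadowing the built-in vars:

```python
import numpy as np

vars_ = ['age', 'balance', 'day', 'duration', 'campaign', 'pdays',
         'previous', 'job_admin.', 'job_blue-collar']

# Hypothetical boolean mask, used only to produce an index tuple
# of the same form as the one in the question.
mask = np.array([False, True, False, False, False, True, False, True, False])
indices = np.where(mask)  # a 1-tuple containing the array [1, 5, 7]

# Unpack the single array from the tuple and index the plain list.
subset = [vars_[i] for i in indices[0]]
print(subset)  # ['balance', 'pdays', 'job_admin.']
```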
index = [1, 5, 7]
vars = [vars[i] for i in index]
If I understand correctly, your data are:
vars = ['age','balance','day','duration','campaign','pdays','previous','job_admin.','job_blue-collar']
and indexes are:
idx = [1, 5, 7]
Then you can do:
>>> [vars[i] for i in idx]
['balance', 'pdays', 'job_admin.']
You can use operator.itemgetter:
>>> import numpy as np
>>> import operator
>>> vars = ['age','balance','day','duration','campaign','pdays','previous','job_admin.','job_blue-collar']
>>> idx = np.array([1,5,7])
>>> operator.itemgetter(*idx)(vars)
('balance', 'pdays', 'job_admin.')
This is actually the fastest solution posted so far.
>>> from timeit import repeat
>>> kwds = dict(globals=globals(), number=1000000)
>>>
>>> repeat("np.asarray(vars)[idx]", **kwds)
[2.2382465780247003, 2.225632123881951, 2.1969433058984578]
>>> repeat("[vars[i] for i in idx]", **kwds)
[0.9384958958253264, 0.9366465201601386, 0.9373494561295956]
>>> repeat("operator.itemgetter(*idx)(vars)", **kwds)
[0.9045725339092314, 0.9015877249184996, 0.9032398068811744]
Interestingly, it becomes more than twice as fast if we convert idx to a list first, and that's including the cost of conversion:
>>> repeat("operator.itemgetter(*idx.tolist())(vars)", **kwds)
[0.4062491739168763, 0.4086623480543494, 0.4049343201331794]
We can even afford to convert the result to a list and still be much faster than all the other solutions:
>>> repeat("list(operator.itemgetter(*idx.tolist())(vars))", **kwds)
[0.561687784967944, 0.5593925788998604, 0.5586365279741585]
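One plausible reason for the tolist() speed-up (this is an assumption, the timings above do not prove the cause) is that unpacking an ndarray creates a NumPy scalar object per element, whereas tolist() produces built-in ints in a single pass. The type difference itself is easy to check:

```python
import numpy as np

idx = np.array([1, 5, 7])

# Unpacking (or iterating over) an ndarray yields NumPy scalar objects...
unpacked = [*idx]
# ...while tolist() converts everything to built-in Python ints.
converted = idx.tolist()

print(type(unpacked[0]).__name__)   # a NumPy integer type, platform dependent
print(type(converted[0]).__name__)  # int
```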

Using the reduce function on a multidimensional array

So I have a particular array that has 2 separate arrays within itself. What I am looking to do is to average each of those 2 separate arrays, so for instance, if I have my original array such as [(2,3,4),(4,5,6)] and I want an output array like [3,5], how would I do this? My attempt to do this is as follows:
averages = reduce(sum(array)/len(array), [array])
>>> map(lambda x: sum(x)/len(x), [(2,3,4),(4,5,6)])
[3, 5]
reduce is not a good choice here. Just use a list comprehension:
>>> a = [(2,3,4),(4,5,6)]
>>> [sum(t)/len(t) for t in a]
[3, 5]
Note that / is integer division for ints in Python 2. In Python 3 it is true division and gives [3.0, 5.0]; use // there if you want integer results.
If you have numpy available, you have a nicer option:
>>> import numpy as np
>>> a = np.array(a)
>>> a.mean(axis=1)
array([ 3., 5.])
You can do this with a list comprehension:
data = [(2,3,4),(4,5,6)]
averages = [ sum(tup)/len(tup) for tup in data ]
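Since the answers above were written for Python 2, here is a sketch of how the same comprehension behaves under Python 3, along with statistics.mean from the standard library as an alternative. The variable names are illustrative:

```python
from statistics import mean

data = [(2, 3, 4), (4, 5, 6)]

true_div = [sum(t) / len(t) for t in data]    # Python 3: / always returns floats
floor_div = [sum(t) // len(t) for t in data]  # // keeps integer (floor) results
with_mean = [mean(t) for t in data]           # stdlib helper for the same average

print(true_div)   # [3.0, 5.0]
print(floor_div)  # [3, 5]
```

Note that floor division only equals the true average when the sum divides evenly, as it does for this data.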

Vectorised code for selecting elements of a 2D array

Well the following code obviously returns the element in position ind in matrix:
def select_coord(a,ind):
    return a[ind]
However I don't know how to vectorise this. In other words:
b=np.asarray([[2,3,4,5],[7,6,8,10]])
indices=np.asarray([2,3])
select_coord(b,indices)
Should return [4,10].
Which can be written with a for loop:
def new_select_record(a,indices):
    ret=[]
    for i in range(a.shape[0]):
        # pick column indices[i] from row i
        ret.append(a[i, indices[i]])
    return np.asarray(ret)
Is there a way to write this in a vectorised manner?
To get b[0, 2], b[1, 3]:
>>> import numpy as np
>>> b = np.array([[2,3,4,5], [7,6,8,10]])
>>> indices = np.array([2, 3])
>>> b[np.arange(len(indices)), indices]
array([ 4, 10])
how about: np.diag(b[:,[2,3]])?
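The two suggestions above can be compared side by side, together with np.take_along_axis, which is not mentioned in the answers and is added here only as a further option (it requires NumPy 1.15 or later). A minimal sketch:

```python
import numpy as np

b = np.array([[2, 3, 4, 5], [7, 6, 8, 10]])
indices = np.array([2, 3])

# Pair each row number with its column index.
by_arange = b[np.arange(b.shape[0]), indices]

# take_along_axis expects an index array with the same number of dimensions as b.
by_take = np.take_along_axis(b, indices[:, np.newaxis], axis=1).ravel()

# np.diag also works, but first builds a full (rows x rows) intermediate block.
by_diag = np.diag(b[:, indices])

print(by_arange.tolist(), by_take.tolist(), by_diag.tolist())
```

All three give the same values here; the np.diag variant does extra work that grows with the square of the number of rows, so it is likely the least attractive for large inputs.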
