Assembly numpy vector - python

How to add a numpy array A to elements of a numpy array B with indices given by an index array C?
Ideally, I can write:
import numpy as np
A = np.zeros(4, float)
B = np.array([1, 2, 3, 4])
C = np.array([1, 2, 1, 3])
A[C] += B
print(A)
output:
[0, 4, 2, 4]
but it doesn't work since (according to documentation) A[C] is a copy.
(I only wonder why it does in fact work when each index in C appears only once.)
I need to do it fast (for big arrays).

It looks like your example was supposed to be
A = np.zeros(4, dtype=float)
B = np.array([1, 2, 3, 4])
C = np.array([1, 2, 1, 3])
A[C] += B
print(A)
If so, then instead of +=, you want numpy.add.at. add.at does what += does, but with repeated indices handled the way you want. Similar constructs work for other operators, e.g. subtract.at for -=.
numpy.add.at(A, C, B)
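To see the difference concretely, here is a minimal sketch comparing += with numpy.add.at on the arrays from the question. The np.bincount line is an alternative not mentioned above; it assumes C holds non-negative integers and that the accumulation starts from zeros.

```python
import numpy as np

B = np.array([1, 2, 3, 4])
C = np.array([1, 2, 1, 3])   # index 1 appears twice

# Buffered in-place add: for the repeated index, only one contribution survives.
A_buffered = np.zeros(4, dtype=float)
A_buffered[C] += B

# Unbuffered add: every occurrence of an index contributes.
A_at = np.zeros(4, dtype=float)
np.add.at(A_at, C, B)

# Alternative for the zero-initialised case (assumes non-negative integer indices).
A_bincount = np.bincount(C, weights=B, minlength=4)

print(A_buffered)
print(A_at)
print(A_bincount)
```

Which of add.at and bincount is faster for big arrays can depend on the NumPy version, so it is worth timing both on your own data.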

Related

What is the difference between Python's range() and numpy.arange()?

I learned from a web search that numpy.arange takes less space than Python's range function, but when I tried
the code below it gave me a different result.
import sys
import numpy as np
x = range(1, 10000)
print(sys.getsizeof(x))  # --> output is 48
a = np.arange(1, 10000, 1, dtype=np.int8)
print(sys.getsizeof(a))  # --> output is 10095
Could anyone please explain?
In Python 3, range is an object that can generate a sequence of numbers; it is not the actual sequence. You may need to brush up on some basic Python reading, paying attention to things like lists and generators, and their differences.
In [359]: x = range(3)
In [360]: x
Out[360]: range(0, 3)
We have to use something like list or a list comprehension to actually create those numbers:
In [361]: list(x)
Out[361]: [0, 1, 2]
In [362]: [i for i in x]
Out[362]: [0, 1, 2]
A range is often used in a for i in range(3): print(i) kind of loop.
arange is a numpy function that produces a numpy array:
In [363]: arr = np.arange(3)
In [364]: arr
Out[364]: array([0, 1, 2])
We can iterate on such an array, but it is slower than [362]:
In [365]: [i for i in arr]
Out[365]: [0, 1, 2]
But for doing math, the array is much better:
In [366]: arr * 10
Out[366]: array([ 0, 10, 20])
The array can also be created from the list [361] (and, for compatibility with earlier Python 2 usage, from the range itself):
In [376]: np.array(list(x)) # np.array(x)
Out[376]: array([0, 1, 2])
But this is slower than using arange directly (that's an implementation detail).
Despite the similarity in names, these shouldn't be seen as simple alternatives. Use range in basic Python constructs such as for loops and comprehensions. Use arange when you need an array.
An important innovation in Python (compared to earlier languages) is that we can iterate directly on a list. We don't have to step through indices. And if we need indices along with the values, we can use enumerate:
In [378]: alist = ['a','b','c']
In [379]: for i in range(3): print(alist[i]) # index iteration
a
b
c
In [380]: for v in alist: print(v) # iterate on list directly
a
b
c
In [381]: for i,v in enumerate(alist): print(i,v) # index and values
0 a
1 b
2 c
Thus you might not see range used that much in basic Python code.
The range type constructor creates range objects, which represent sequences of integers with a start, stop, and step in a space-efficient manner, calculating the values on the fly.
The np.arange function returns a numpy.ndarray object, which is essentially a wrapper around a primitive array. This is a fast and relatively compact representation compared to a Python list such as list(range(N)). But range objects are more space efficient still: they take constant space, so for all practical purposes range(a) is the same size as range(b) for any integers a and b.
As an aside, take care when interpreting the results of sys.getsizeof; you must understand what it is measuring. Do not naively compare the sizes of Python lists and numpy.ndarray objects, for example.
Perhaps whatever you read was referring to Python 2, where range returned a list. List objects do generally require more space than numpy.ndarray objects.
arange stores each individual value of the array, while range stores only three values (start, stop and step). That is why arange takes more space than range.
As the question is about size, this is the answer.
But NumPy arrays and arange have many advantages over Python lists in terms of speed, space and efficiency.
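A small sketch of the size comparison discussed above. It assumes CPython; sys.getsizeof values vary by platform and version, so no expected numbers are shown. ndarray.nbytes is used to count only the array's data buffer.

```python
import sys
import numpy as np

small = range(10)
large = range(10_000_000)
# A range stores only start, stop and step, so its size does not grow with its length.
print(sys.getsizeof(small), sys.getsizeof(large))

arr = np.arange(1, 10000)
# An array stores every value: nbytes is the element count times the bytes per element.
print(arr.size, arr.itemsize, arr.nbytes)

# sys.getsizeof on a list counts the list's pointer table, not the int objects
# it refers to, which is one reason naive size comparisons can mislead.
as_list = list(range(1, 10000))
print(sys.getsizeof(as_list))
```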

Numpy minimum like np.outer()

Maybe I'm just being lazy here, but let's say that I have two arrays, of length n and m, and I'd like a pairwise minimum of all of the elements of the two arrays compared against each other. For example:
a = [1,5,3]
b = [2,4]
cross_min(a,b)
= [[1,1],[2,4],[2,3]]
This is similar to the behavior of np.outer(), except that instead of multiplying the two arrays, it computes the minimum of the two elements.
Is there an operation in numpy that does a similar thing?
I know that I can just run np.minimum() along b and stack the results together. I'm wondering if this is a well-known operation that I just don't know the name of.
You can use np.minimum.outer(a, b)
You might turn one of the arrays into a 2D array, and then make use of the broadcasting rules and np.minimum:
import numpy as np
a = np.array([1,5,3])
b = np.array([2,4])
np.minimum(a[:,None], b)
#array([[1, 1],
# [2, 4],
# [2, 3]])
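Both suggestions produce the same matrix. A quick sketch on the arrays from the question:

```python
import numpy as np

a = np.array([1, 5, 3])
b = np.array([2, 4])

via_outer = np.minimum.outer(a, b)          # ufunc.outer applies the ufunc to every pair
via_broadcast = np.minimum(a[:, None], b)   # shape (3, 1) against (2,) broadcasts to (3, 2)

print(via_outer)
# [[1 1]
#  [2 4]
#  [2 3]]
print(np.array_equal(via_outer, via_broadcast))
```

The .outer method is available on NumPy's binary ufuncs in general, for example np.add.outer or np.maximum.outer.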

Python column addition of numpy arrays with shift

How can I perform column addition with a shift using NumPy arrays?
I have a two-dimensional array and need an extended copy of it.
a = np.array([[0, 2, 4, 6, 8],
              [1, 3, 5, 7, 9]])
I want something like the following (this is pseudocode and does not work; as far as I know there is no a.columns in NumPy):
shift = 3
mult_factor = 0.7
for column in a.columns - shift :
out[column] = a[column] + 0.7 * a[column + shift]
I also know that I can do something similar with indexes, but it seems like overkill to enumerate three values and use only one (j):
for (i, j), value in np.ndenumerate(a):
    print(i, j)
I found that I can iterate over the columns, but not over their indexes:
for column in a.T:
    print(column)
Then I thought that I could simply do this with something similar to xrange, but applied to a multidimensional array:
In [225]: for column in np.ndindex(a.shape[1]):
   .....:     print(column)
   .....:
(0,)
(1,)
(2,)
(3,)
(4,)
So now I only know how to do this with a plain range loop (xrange in Python 2), and I am not sure that it is the best solution.
out = np.zeros(a.shape)
shift = 2
mult_factor = 0.7
for i in range(a.shape[1] - shift):
    print(a[:, i])
    out[:, i] = a[:, i] + mult_factor * a[:, i + shift]
However, this will not be as fast in Python as it could be.
Can you advise me on how this will perform, and whether there is a faster way to do column addition of NumPy arrays with a shift?
out = a[:, :-shift] + mult_factor * a[:, shift:]
I think this is what you're looking for. It's a vectorized form of your loop, operating on large slices of a instead of column by column.
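A short check, using the array and parameters from the question, that the slice expression matches the column-by-column loop. One caveat: with shift = 0 the slice a[:, :-shift] is empty, so this sketch writes the end of the slice explicitly as n - shift.

```python
import numpy as np

a = np.array([[0, 2, 4, 6, 8],
              [1, 3, 5, 7, 9]])
shift = 2
mult_factor = 0.7
n = a.shape[1]

# Vectorized: columns 0..n-shift-1 plus a scaled copy of columns shift..n-1.
out_vec = a[:, :n - shift] + mult_factor * a[:, shift:]

# Reference loop, one column at a time.
out_loop = np.zeros((a.shape[0], n - shift))
for i in range(n - shift):
    out_loop[:, i] = a[:, i] + mult_factor * a[:, i + shift]

print(np.allclose(out_vec, out_loop))
```

Note that the vectorized result has n - shift columns, whereas the loop in the question writes into an array with n columns and leaves the last shift columns at zero.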
I'm not positive I completely understand what the computed quantity should be, but here are two things that seem germane to what you are asking:
If you have a 2D array called a that you wish to convert to a list of 1D arrays, which are the columns of a, you can do this:
cols = [c for c in a.T]
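It looks like what you want can be accomplished with matrix multiplication, if I am not mistaken. You could make a banded matrix in numpy using numpy.diag or, since the values along each band are constant (1, mult_factor, or 0), you could use scipy.linalg.toeplitz:
import scipy.linalg
n = a.shape[1]
band = np.eye(1, n)[0]                 # first row of T: 1 on the diagonal
band[shift] = mult_factor              # mult_factor on the shift-th superdiagonal
col = np.eye(1, n - shift)[0]          # first column of T, one row per output column
T = scipy.linalg.toeplitz(col, band)   # shape (n - shift, n)
out = np.inner(a, T)                   # shape (m, n - shift)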
For large matrices, it might make sense to use a sparse matrix for T if you only want to add two or a few columns of a.

Find whether a numpy array is a subset of a larger array in Python

I have 2 arrays, for the sake of simplicity let's say the original one is a random set of numbers:
import numpy as np
a=np.random.rand(N)
Then I sample and shuffle a subset of this array:
b = np.array(...)  # shuffled sample of a, size < N
The shuffling does not store the index values, so b is an unordered subset of a.
Is there an easy way to get the original indexes of the elements of b, i.e. their positions in a? For example, if element 2 of b has index 4 in a, I would like an array recording that assignment.
I could use a for loop and check element by element, but perhaps there is a more Pythonic way.
Thanks
I think the most computationally efficient thing to do is to keep track of the indices that associate b with a as b is created.
For example, instead of sampling a, sample the indices of a:
import random
indices = random.sample(range(len(a)), k)  # k < N
b = a[indices]
On the off chance a happens to be sorted you could do:
>>> from numpy import array
>>> a = array([1, 3, 4, 10, 11])
>>> b = array([11, 1, 4])
>>> a.searchsorted(b)
array([4, 0, 2])
If a is not sorted you're probably best off going with something like @unutbu's answer.
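If the indices were not recorded and a is not sorted, the searchsorted idea can still be applied through an argsort. This is a sketch, not something taken from the answers above, and it assumes that the values in a are unique and that every element of b is an exact copy of an element of a. The names true_idx and recovered are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random(10)                          # unsorted source array
true_idx = rng.permutation(len(a))[:4]      # kept here only to verify the result
b = a[true_idx]                             # shuffled subset of a

order = np.argsort(a)                       # indices that would sort a
pos = np.searchsorted(a, b, sorter=order)   # positions of b within the sorted view
recovered = order[pos]                      # map back to indices into the original a

print(np.array_equal(recovered, true_idx))
```

If b could contain values that are not in a, the result would need an extra check such as comparing a[recovered] with b.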

Pythonic way to get the first AND the last element of the sequence

What is the easiest and cleanest way to get the first AND the last elements of a sequence? E.g., I have a sequence [1, 2, 3, 4, 5], and I'd like to get [1, 5] via some kind of slicing magic. What I have come up with so far is:
l = len(s)
result = s[0:l:l-1]
I actually need this for a bit more complex task. I have a 3D numpy array, which is cubic (i.e. is of size NxNxN, where N may vary). I'd like an easy and fast way to get a 2x2x2 array containing the values from the vertices of the source array. The example above is an oversimplified, 1D version of my task.
Use this:
result = [s[0], s[-1]]
Since you're using a numpy array, you may want to use fancy indexing:
a = np.arange(27)
indices = [0, -1]
b = a[indices] # array([0, 26])
For the 3d case:
vertices = [(0,0,0),(0,0,-1),(0,-1,0),(0,-1,-1),(-1,-1,-1),(-1,-1,0),(-1,0,0),(-1,0,-1)]
indices = tuple(zip(*vertices)) # Must be a tuple of per-axis index sequences. Can store this for later use.
a = np.arange(27).reshape((3,3,3)) # Dummy array for testing. Can be any shape or size :)
vertex_values = a[indices].reshape((2,2,2))
I first write down all the vertices (although I am willing to bet there is a clever way to do it using itertools which would let you scale this up to N dimensions ...). The order you specify the vertices is the order they will be in the output array. Then I "transpose" the list of vertices (using zip) so that all the x indices are together and all the y indices are together, etc. (that's how numpy likes it). At this point, you can save that index array and use it to index your array whenever you want the corners of your box. You can easily reshape the result into a 2x2x2 array (although the order I have it is probably not the order you want).
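One way the itertools idea could look, as a sketch rather than the answerer's own code: itertools.product enumerates every combination of first and last index along each axis, so the same lines work for any number of dimensions.

```python
import itertools
import numpy as np

a = np.arange(27).reshape((3, 3, 3))   # dummy cubic array

# Every combination of first (0) and last (-1) index along each axis.
vertices = list(itertools.product([0, -1], repeat=a.ndim))
indices = tuple(zip(*vertices))        # one tuple of indices per axis
vertex_values = a[indices].reshape((2,) * a.ndim)

print(vertex_values)
```

With this ordering, vertex_values[i, j, k] is the corner selected by taking the first (0) or last (1) position along each axis in turn.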
This would give you a list of the first and last element in your sequence:
result = [s[0], s[-1]]
Alternatively, this would give you a tuple
result = s[0], s[-1]
With the particular case of a (N,N,N) ndarray X that you mention, would the following work for you?
s = slice(0,N,N-1)
X[s,s,s]
Example
>>> N = 3
>>> X = np.arange(N*N*N).reshape(N,N,N)
>>> s = slice(0,N,N-1)
>>> print(X[s,s,s])
[[[ 0  2]
  [ 6  8]]

 [[18 20]
  [24 26]]]
>>> from operator import itemgetter
>>> first_and_last = itemgetter(0, -1)
>>> first_and_last([1, 2, 3, 4, 5])
(1, 5)
Why do you want to use a slice? Getting each element with
result = [s[0], s[-1]]
is better and more readable.
If you really need to use the slice, then your solution is the simplest working one that I can think of.
This also works for the 3D case you've mentioned.
