import numpy as np
I have two arrays of size n (to simplify, I use in this example n = 2):
A = array([[1,2,3],[1,2,3]])
B is a two-dimensional array with n rows, each holding one random integer: 1, 2 or 3.
Let's pretend:
B = array([[1],[3]])
What is the most Pythonic way to subtract B from A in order to obtain C, C = array([[2,3],[1,2]])?
I tried to use np.subtract, but due to the broadcasting rules I do not obtain C. I do not want to use a mask or indices, only the elements' values. I also tried np.delete and np.where, without success.
Thank you.
This might work and should be quite Pythonic:
dd=[[val for val in A[i] if val not in B[i]] for i in range(len(A))]  # use xrange on Python 2
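If every row loses the same number of elements (true for the example above, but an assumption in general), the same result can be sketched with a broadcast comparison. This does build a boolean mask internally, so it is an alternative to the list comprehension rather than exactly what the question asked for; the name keep is only illustrative.

```python
import numpy as np

A = np.array([[1, 2, 3], [1, 2, 3]])
B = np.array([[1], [3]])

# B has shape (n, 1), so A != B compares each row of A with that row's value in B.
keep = A != B

# Boolean indexing flattens the result; reshape restores one row per input row.
# Assumption: every row drops the same number of elements.
C = A[keep].reshape(len(A), -1)
print(C)
```

If the rows can lose different numbers of elements, the reshape fails and the list-based version above is the safer choice.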
I am trying to make the transition from Excel to Python, but I have run into some trouble computing a running sum over the elements of a one-dimensional array.
In excel this can be easily done, as in the attached image.
From excel I can clearly see the mathematical pattern for achieving this. My approach was to create a for loop, by indexing the array A.
This is my code:
import numpy as np
A = np.array([0.520094,0.850895E-1,-0.108374e1])
B = np.array([0]) #initialize array
B[0] = A[0]
A is equivalent to column A in excel & similarly B
Using a for loop to sum every element/row:
for i in range(len(A)):
    i = i+1
    B.append([B[i-1]+A[i]])
print(B)
This strategy doesn't work and I keep getting errors. Any suggestions as to how I could make this work, or is there a more elegant way of doing this?
Just use np.cumsum:
import numpy as np
A = np.array([0.520094,0.850895E-1,-0.108374e1])
cumsum = np.cumsum(A)
print(cumsum)
Output:
[ 0.520094 0.6051835 -0.4785565]
A manual approach would look like this:
A = np.array([0.520094,0.850895E-1,-0.108374e1])
B = [] # Create B as a list and not a numpy array, because it's faster to append
for i in range(len(A)):
    cumulated = A[i]
    if i > 0:
        cumulated += B[i-1]
    B.append(cumulated)
B = np.array(B) # Convert B from list to a numpy array
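As a quick sanity check, the manual loop and np.cumsum can be compared directly. The sketch below uses a slightly different loop (a single running accumulator named total, which is my own choice, not part of the answer above) and confirms that both give the same values.

```python
import numpy as np

A = np.array([0.520094, 0.850895e-1, -0.108374e1])

# Manual running sum with one accumulator variable.
running = []
total = 0.0
for value in A:
    total += value
    running.append(total)
running = np.array(running)

# The built-in does the same thing in one call.
builtin = np.cumsum(A)
print(np.allclose(running, builtin))
```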
I am using numpy and I want to generate an array of size n with random integers from a to b (upper bound exclusive) that are not in the array arr (if it helps, all values in arr are unique). I want the probability to be distributed uniformly among the remaining possible values. I am aware I can do it this way:
randlist = np.random.randint(a, b, n)
while np.intersect1d(randlist, arr).size > 0:
    randlist = np.random.randint(a, b, n)
But this seems really inefficient. What would be the fastest way to do this?
Simplest vectorized way would be with np.setdiff1d + np.random.choice -
c = np.setdiff1d(np.arange(a,b),arr)
out = np.random.choice(c,n)
Another way with masking -
mask = np.ones(b-a,dtype=bool)
mask[arr-a] = 0
idx = np.flatnonzero(mask)+a
out = idx[np.random.randint(0,len(idx),n)]
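Here is a small runnable sketch of both approaches with made-up inputs (the values of a, b, n and arr are assumptions for illustration only). It checks that neither result contains an excluded value. Note that the masking version assumes every value in arr lies in [a, b); otherwise arr - a would index outside the mask or wrap around.

```python
import numpy as np

np.random.seed(0)  # only to make the sketch repeatable
a, b, n = 0, 10, 1000
arr = np.array([2, 3, 7])  # values to exclude; all inside [a, b)

# Approach 1: build the allowed values, then sample from them.
c = np.setdiff1d(np.arange(a, b), arr)
out1 = np.random.choice(c, n)

# Approach 2: mark excluded positions in a boolean mask, then sample indices.
mask = np.ones(b - a, dtype=bool)
mask[arr - a] = False
idx = np.flatnonzero(mask) + a
out2 = idx[np.random.randint(0, len(idx), n)]

for out in (out1, out2):
    print(out.shape, np.isin(out, arr).any())
```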
I am assigning values to a numpy array by looking up values in other numpy arrays. These arrays have potentially different indices. Here is an example:
import numpy as np
A=1; B=2; C=3; D=4; E=5
X = np.random.normal(0,1,(A,B,C,E))
Y = np.random.normal(0,1,(A,B,D))
Z = np.random.normal(0,1,(A,C))
Result = np.zeros((A,B,C,D,E))
for a in range(A):
    for b in range(B):
        for c in range(C):
            for d in range(D):
                for e in range(E):
                    Result[a,b,c,d,e] = Z[a,c] + Y[a,b,d] + X[a,b,c,e]
What is the best way to optimize this code? I can remove the E for loop using Result[a,b,c,d,:] = Z[a,c] + Y[a,b,d] + X[a,b,c,:]. But then how to remove the rest of the loops? I was also thinking that I could manipulate X,Y,Z before assignment so it merges easily with the dimensions of Result. There must be more elegant ways. Thanks for tips.
Here's one way:
Result = Z[:,None,:,None,None] + Y[:,:,None,:,None] + X[:,:,:,None,:]
To produce this vectorized version, all I did was replace the various indices into X, Y, and Z with full a,b,c,d,e-style indexing, inserting None where missing indices were found. For example, Y[a,b,d] becomes Y[a,b,None,d,None], which vectorizes into Y[:,:,None,:,None].
In numpy, indexing by None tells the array to pretend like it has an additional axis. This doesn't change the size of the array, but it does change how operations get broadcasted, which is what we need here. Check out the numpy broadcasting docs for more info.
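To see that the broadcast expression really matches the nested loops, the two can be compared on the shapes from the question. This is only a verification sketch; the name expected is mine.

```python
import numpy as np

A, B, C, D, E = 1, 2, 3, 4, 5
X = np.random.normal(0, 1, (A, B, C, E))
Y = np.random.normal(0, 1, (A, B, D))
Z = np.random.normal(0, 1, (A, C))

# Reference result from explicit loops (innermost E loop already sliced away).
expected = np.zeros((A, B, C, D, E))
for a in range(A):
    for b in range(B):
        for c in range(C):
            for d in range(D):
                expected[a, b, c, d, :] = Z[a, c] + Y[a, b, d] + X[a, b, c, :]

# Vectorized version: None inserts a length-1 axis wherever an index is missing.
Result = Z[:, None, :, None, None] + Y[:, :, None, :, None] + X[:, :, :, None, :]

print(Y[:, :, None, :, None].shape)  # (1, 2, 1, 4, 1)
print(np.allclose(Result, expected))
```

Each operand ends up with five axes, with length 1 in the positions it does not depend on, so broadcasting stretches it along those axes.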
I have an array with 4 values in it, called array r, using the numpy array command.
from numpy import array, amax, amin
r = array([r1,r2,r3,r4])
I need to sum the max and the min of this array:
g_1 = amax(r)+amin(r)
Now I need to compare this value (g_1) with the sum of the two other elements of the array (I don't know what value is the max when I program this part of the code) and I don't know how to do that.
from numpy import sum
g_2 = sum(r) - g_1
comp = g_1 <= g_2
The sum of the other two elements of the array is simply the sum of all elements of the array, minus the max and min values: sum(r) - g_1
You might as well sort the array, which will probably require fewer comparisons overall:
import numpy as np
r_sort = np.sort(r)
g_1 = r_sort[0] + r_sort[-1]
g_2 = r_sort[1] + r_sort[2]
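A worked example with concrete numbers (the four values are made up) shows that the "total minus extremes" approach and the sorting approach agree:

```python
import numpy as np

r = np.array([3.0, 1.0, 4.0, 2.0])  # hypothetical values for r1..r4

# Approach 1: max + min, and everything else is the total minus that.
g_1 = np.amax(r) + np.amin(r)
g_2 = np.sum(r) - g_1

# Approach 2: sort once, then pick the outer and inner pairs.
r_sort = np.sort(r)
g_1_sorted = r_sort[0] + r_sort[-1]
g_2_sorted = r_sort[1] + r_sort[2]

comp = g_1 <= g_2
print(g_1, g_2, comp)
```

Both approaches also behave sensibly when the maximum or minimum occurs more than once, since they remove exactly one max and one min.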
I tried to use a for and if to see if I can do it right and the code I wrote seems to work:
from numpy import array, amax, amin, size
r=array([r1,r2,r3,r4])
g_1=amax(r)+amin(r)
g_2=0  # the accumulator has to exist before the loop adds to it
for j in range(size(r)):
    if r[j] != amax(r) and r[j] != amin(r):
        g_2+=r[j]
This code seems to correctly return the g_2 I was looking for. It is probably not the best solution; what do you think about it?
I am working with data from netcdf files, with multi-dimensional variables, read into numpy arrays. I need to scan all values in all dimensions (axes in numpy) and alter some values. But, I don't know in advance the dimension of any given variable. At runtime I can, of course, get the ndims and shapes of the numpy array.
How can I program a loop through all values without knowing the number of dimensions or the shapes in advance? If I knew a variable had exactly 2 dimensions, I would do
shp=myarray.shape
for i in range(shp[0]):
    for j in range(shp[1]):
        do_something(myarray[i][j])
You should look into ravel, nditer and ndindex.
# For the simple case
for value in np.nditer(a):
    do_something_with(value)
# This is similar to above
for value in a.ravel():
    do_something_with(value)
# Or if you need the index
for idx in np.ndindex(a.shape):
    a[idx] = do_something_with(a[idx])
On an unrelated note, numpy arrays are indexed a[i, j] instead of a[i][j]. In Python, a[i, j] is equivalent to indexing with a tuple, i.e. a[(i, j)].
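Since the question is about altering values, here is a minimal sketch of in-place modification with np.nditer, which works for any number of dimensions. The rule applied (negate the even values) is a made-up placeholder for whatever change is actually needed.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)
original = a.copy()

# op_flags=["readwrite"] lets the loop write back into the array;
# x is a zero-dimensional view, so x[...] = value assigns in place.
with np.nditer(a, op_flags=["readwrite"]) as it:
    for x in it:
        if x % 2 == 0:
            x[...] = -x

print(a[0, 0])
```

For a rule this simple, a vectorized expression such as a[a % 2 == 0] *= -1 would be faster; the explicit loop is mainly useful when the per-element logic cannot be expressed with array operations.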
You can use the flat attribute of numpy arrays, which returns an iterator (a numpy.flatiter) over all values, no matter the shape.
For instance:
>>> A = np.array([[1,2,3],[4,5,6]])
>>> for x in A.flat:
...     print(x)
1
2
3
4
5
6
You can also set the values in the same order they're returned, e.g. like this:
>>> A.flat[:] = [x // 2 if x % 2 == 0 else x for x in A.flat]
>>> A
array([[1, 1, 3],
       [2, 5, 3]])
The order in which flat returns the elements is row-major (C-style), with the last index varying fastest, regardless of how the array happens to be laid out in memory.
And this will work for any dimension.
** -- Edit -- **
To clarify the point about ordering: the order of elements returned by flat does not change, so something like row1 = A.flat[:N] does return the first row of a 2-D array with N columns. It is still clearer to index the row directly (A[0]) than to rely on the flattened order.
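A quick check of the ordering question: the sketch below iterates over the same data stored in C order and in Fortran (column-major) order. As far as I know, flat is documented to iterate in row-major order in both cases, and the check agrees with that.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])
F = np.asfortranarray(A)  # same values, column-major memory layout

c_order = [int(v) for v in A.flat]
f_order = [int(v) for v in F.flat]

print(F.flags["F_CONTIGUOUS"])
print(c_order == f_order)
```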
This might be the easiest with recursion:
a = numpy.array(range(30)).reshape(5, 3, 2)
def recursive_do_something(array):
    if len(array.shape) == 1:
        for obj in array:
            do_something(obj)
    else:
        for subarray in array:
            recursive_do_something(subarray)
recursive_do_something(a)
In case you want the indices:
a = numpy.array(range(30)).reshape(5, 3, 2)
def do_something(x, indices):
    print(indices, x)
def recursive_do_something(array, indices=None):
    indices = indices or []
    if len(array.shape) == 1:
        # enumerate here too, so the index along the last axis is included
        for i, obj in enumerate(array):
            do_something(obj, indices + [i])
    else:
        for i, subarray in enumerate(array):
            recursive_do_something(subarray, indices + [i])
recursive_do_something(a)
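If the goal is simply to visit every element together with its full index, np.ndenumerate gives the same information without recursion. This is an alternative technique, not the recursive method above; the name pairs is illustrative.

```python
import numpy as np

a = np.arange(30).reshape(5, 3, 2)

# ndenumerate yields (index_tuple, value) for every element, for any ndim.
pairs = [(index, int(value)) for index, value in np.ndenumerate(a)]

print(pairs[0])
print(pairs[-1])
```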
Look into Python's itertools module.
Python 2: http://docs.python.org/2/library/itertools.html#itertools.product
Python 3: http://docs.python.org/3.3/library/itertools.html#itertools.product
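This will allow you to do something along the lines of
from itertools import product
# product needs one iterable per axis, so build a range for each dimension
for lengths in product(*(range(n) for n in shp)):
    do_something(myarray[lengths])  # lengths is one full index tuple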
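A runnable version of the itertools idea, on a made-up 3-D array, might look like the following. It passes one range per axis to product and indexes the array with the resulting tuple; the names visited and idx are illustrative.

```python
from itertools import product

import numpy as np

myarray = np.arange(24).reshape(2, 3, 4)
shp = myarray.shape

# One range per axis; product yields index tuples with the last axis varying fastest.
visited = [int(myarray[idx]) for idx in product(*(range(n) for n in shp))]

print(visited[:5])
```

This visits the elements in the same order as np.ndindex(shp), which is usually the more direct choice when numpy is already in use.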