I have a NumPy array of shape (10, 10) containing zeros and non-zeros. I need to add a certain value to a subpart of this array, but only where the initial value is not zero. Conceptually:
a[2:7,2:7] += 0.5  # but only with the condition that a != 0
Currently I do it in a rather cumbersome way: first making a copy of the array, modifying the copy consistently, and then adding it back to the first.
b = a.copy()
b[b!=0] = 1
b[2:7,2:7] *= 0.5
b[b ==1] =0
a += b
Is there a more elegant way to achieve this?
As Thomas Kühn correctly wrote in the comments, it's good enough to create a reference (a view) to that subpart of the array and modify it. So the following does the job:
b = a[2:7,2:7]
b[b!=0] += 0.5
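For completeness, the same update can also be written with np.where, which builds the new subarray in one expression instead of mutating through a view. A minimal sketch, using a hypothetical example array:

```python
import numpy as np

# Hypothetical example: a (10, 10) array of zeros with a non-zero patch
a = np.zeros((10, 10))
a[3:5, 3:5] = 1.0

# Add 0.5 where the slice is non-zero; keep it unchanged elsewhere.
sub = a[2:7, 2:7]
a[2:7, 2:7] = np.where(sub != 0, sub + 0.5, sub)
```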
I am trying to make the transition from Excel to Python, but have run into some trouble with cumulatively summing the elements of a one-dimensional array.
In Excel this can easily be done, as in the attached image.
From Excel I can clearly see the mathematical pattern for achieving this. My approach was to create a for loop by indexing the array A.
This is my code:
import numpy as np
A = np.array([0.520094, 0.850895E-1, -0.108374E1])
B = np.array([0])  # initialize array
B[0] = A[0]
Here A is equivalent to column A in Excel, and similarly B.
Using a for loop to sum every element/row:
for i in range(len(A)):
    i = i + 1
    B.append([B[i-1] + A[i]])
print(B)
This strategy doesn't work and I keep getting errors. Any suggestion as to how I could make this work, or is there a more elegant way of doing this?
Just use np.cumsum:
import numpy as np
A = np.array([0.520094,0.850895E-1,-0.108374e1])
cumsum = np.cumsum(A)
print(cumsum)
Output:
[ 0.520094 0.6051835 -0.4785565]
A manual approach would look like this:
A = np.array([0.520094, 0.850895E-1, -0.108374E1])
B = []  # create B as a list, not a numpy array, because appending is faster
for i in range(len(A)):
    cumulated = A[i]
    if i > 0:
        cumulated += B[i-1]
    B.append(cumulated)
B = np.array(B)  # convert B from a list to a numpy array
I'm trying to make an RGB color picture editor using just NumPy.
I've tried using a nested for loop, but it's really slow (over a minute).
I want to control the first, second, and third elements (r, g, b) of the third dimension of the nested array. Thanks.
This is to just look at the numbers:
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

img = plt.imread('galaxy.jpg')
img = np.array(img)

for i in range(len(img)):
    for j in range(len(img[i])):
        for k in img[i][j]:
            print(k)
Perhaps this might help you. np.ndenumerate() lets you iterate through a matrix without writing nested for loops. In a quick test, the second for loop in the example below was slightly faster than your triple-nested loop as far as printing is concerned. Printing is very slow, so removing the print statements should also help with speed. As for modifying the values, I added r, g, b, and a variables that can be adjusted to scale the various pixel channels. Just a thought, but perhaps it will give you more ideas to expand on. Also, I didn't check which index values correspond to r, g, b, or a.
r = 1.0
g = 1.0
b = 1.0
a = 1.0

for index, pixel in np.ndenumerate(img):  # <--- achieves the same as your original code
    print(pixel)

for index, pixel in np.ndenumerate(img):
    i = index[0]
    j = index[1]
    print("{} {} {} {}".format(img[i][j][0], img[i][j][1], img[i][j][2], img[i][j][3]))

for index, pixel in np.ndenumerate(img):
    i = index[0]
    j = index[1]
    img[i][j][0] *= r
    img[i][j][1] *= g
    img[i][j][2] *= b
    img[i][j][3] *= a
Hope this helps
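If the goal is to scale the channels rather than inspect them, the per-pixel loops can be replaced by a single broadcast multiplication, which is dramatically faster. A minimal sketch, using a hypothetical 2x2 RGBA image in place of the real galaxy.jpg:

```python
import numpy as np

# Hypothetical stand-in for the loaded image: 2x2 pixels, 4 channels (RGBA)
img = np.ones((2, 2, 4), dtype=float)

r, g, b, a = 0.5, 1.0, 2.0, 1.0

# Broadcasting multiplies every pixel's four channels by the four
# factors in one vectorized operation, with no Python-level loops.
scaled = img * np.array([r, g, b, a])
```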
I have what I'm sure is a very simple problem to solve, but I can't seem to get it right and have not been able to search for an answer, likely because I'm using the wrong vocabulary.
My goal is to have an array, array_1, which holds different test cases. For each element of that array, I want to run a Monte Carlo simulation with the current element as the input to a function. I would like to take the mean of all the results (num_samples of them) and store it in another array, so that I end up with an easily visualized array of means. Hard-coding each condition is easy, but I'm looking for a more automated method. Any help would be appreciated. What I'm currently working with is below:
import numpy as np

num_samples = 5
array_1 = ([1, 2, 3])
array_2 = np.zeros(num_samples)
array_3 = ([])

def func_add(a, b):
    return a + b + 2

#def func_append(c):

for j in array_1:
    for i in range(num_samples):
        r = np.random.randint(1, 2)
        array_2[i] = func_add(j, r)
    c = np.mean(array_2)  # this value I want to put in a new array to have an 'array of means'
    #print(b)
    array_3 = np.append(array_3, c)

print(array_2)
print(np.mean(array_2))
print(c)
print(array_3)
Which returns:
[6. 6. 6. 6. 6.]
6.0
6.0
[4. 5. 6.]
EDIT 2: The results for array_3 seem to make sense, but now I'm curious why array_2 only contains 6's. In the first pass of the loops, j = 1 and r = 1, so the function should return 4 and place it at index 0 of array_2. Or do those values all get overwritten by the last pass of the outer loop? That would also make sense, I think.
Thank you in advance.
EDIT: I think the problem may be that I'm pulling the value from array_1 when what I want is to put the mean from the processing into the first index of some array (meaning I might have to create a third array to hold those values?)
Outside the loop, you can instantiate c as an empty numpy array. Inside the for loop, you append the mean to the end of c. Note that np.append returns a new array rather than modifying its argument in place, so the result must be assigned back:
c = np.append(c, np.mean(array_2))
Thus, the array c grows with each iteration until it contains all results.
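Putting the pieces together, here is a sketch of the whole computation with the means collected in a plain list (names reused from the question; note that np.random.randint(1, 2) always returns 1, because the upper bound is exclusive):

```python
import numpy as np

num_samples = 5
array_1 = [1, 2, 3]

def func_add(a, b):
    return a + b + 2

# For each test case, draw num_samples values, apply the function,
# and keep only the mean of the samples.
means = []
for j in array_1:
    samples = func_add(j, np.random.randint(1, 2, size=num_samples))
    means.append(samples.mean())
means = np.array(means)  # the 'array of means': [4., 5., 6.]
```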
I have an array A of dimension (654 x 2). Within a loop I have an if statement; if the condition is true for a value of i, I need to append the i-th row of A to a separate array B. That means the dimension of B is not known to me beforehand. So how do I initialize such an array B in Python? If this is not the right procedure, please suggest alternative ways to accomplish the same thing.
You could initialize B as an empty list:
B = []
Then, whenever needed, append the row values to it:
B.append([x, y])
You do not provide ANY code to start from; please read How to create a Minimal, Complete, and Verifiable example to learn how to ask a question.
From the almost zero information you've provided, you should try doing something like this:
B = []
for i in range(n):
    if i % 2 == 0:  # example of condition
        B += [A[i]]
print(B)
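Since A here is a NumPy array, the loop-and-append pattern can also be replaced by boolean-mask indexing, which selects all matching rows at once. A sketch with a hypothetical smaller A and an example condition:

```python
import numpy as np

# Hypothetical (6, 2) stand-in for the (654, 2) array A
A = np.arange(12).reshape(6, 2)

# A boolean condition per row selects every matching row in one step.
mask = A[:, 0] % 4 == 0
B = A[mask]  # rows whose first element is a multiple of 4
```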
Definition: array A = (a1, a2, ..., an) is >= array B = (b1, b2, ..., bn) if they are equal sized and a_i >= b_i for every i from 1 to n.
For example:
[1,2,3] >= [1,2,0]
[1,2,0] not comparable with [1,0,2]
[1,0,2] >= [1,0,0]
I have a list consisting of a large number of such arrays (approx. 10000, but it can be bigger). The arrays' elements are positive integers. I need to remove from this list every array that is bigger than at least one of the other arrays. In other words: if there exists a B such that A >= B, then remove A.
Here is my current O(n^2) approach, which is extremely slow: I simply compare every array with all the others and remove it if it's bigger. Are there any ways to speed it up?
import numpy as np
import time
import random

def filter_minimal(lst):
    n = len(lst)
    to_delete = set()
    for i in range(n-1):
        if i in to_delete:
            continue
        for j in range(i+1, n):
            if j in to_delete:
                continue
            if all(lst[i] >= lst[j]):
                to_delete.add(i)
                break
            elif all(lst[i] <= lst[j]):
                to_delete.add(j)
    return [lst[i] for i in range(len(lst)) if i not in to_delete]

def test(number_of_arrays, size):
    x = [np.array([random.randrange(0, 10) for _ in range(size)]) for i in range(number_of_arrays)]
    return filter_minimal(x)

a = time.time()
result = test(400, 10)
print(time.time() - a)
print(len(result))
P.S. I've noticed that using numpy.all instead of the built-in Python all slows the program down dramatically. What could be the reason?
Might not be exactly what you are asking for, but this should get you started.
import numpy as np

def compare(x, y):
    # Reshape x to a higher-dimensional array
    compare_array = x.reshape(-1, 1, x.shape[-1])
    # You can now compare every x with every y element-wise simultaneously
    mask = (y >= compare_array)
    # Create a mask that first ensures that all elements of y are greater
    # than or equal to those of x, and then that this holds at least once.
    mask = np.any(np.all(mask, axis=-1), axis=-1)
    # Apply this mask to x
    return x[mask]

def test(number_of_arrays, size, maxval):
    # Create arrays of shape (number_of_arrays, size) with maximum value maxval.
    x = np.random.randint(maxval, size=(number_of_arrays, size))
    y = np.random.randint(maxval, size=(number_of_arrays, size))
    return compare(x, y)

print(test(50, 10, 20))
First of all, we need to check the objective carefully. Is it true that we delete any array that is >= ANY of the other arrays, even the deleted ones? For example, if A >= B and C >= A and B = C, do we need to delete only A, or both A and C? If we only need to delete INCOMPATIBLE arrays, then it is a much harder problem, because different partitions of the set of arrays may be compatible, so you would have the problem of finding the largest valid partition.
Assuming the easy problem, a better way to define the problem is that you want to KEEP all arrays which have at least one element < the corresponding element in ALL the other arrays. (In the hard problem, it is the corresponding element in the other KEPT arrays. We will not consider this.)
Stage 1
To solve this problem what you do is arrange the arrays in columns and then sort each row while maintaining the key to the array and the mapping of each array-row to position (POSITION lists). For example, you might end up with a result in stage 1 like this:
row 1: B C D A E
row 2: C A E B D
row 3: E D B C A
Meaning that for the first element (row 1) array B has a value >= C, C >= D, etc.
Now, sort and iterate the last column of this matrix ({E D A} in the example). For each item, check if the element is less than the previous element in its row. For example, in row 1, you would check if E < A. If this is true you return immediately and keep the result. For example, if E_row1 < A_row1 then you can keep array E. Only if the values in the row are equal do you need to do a stage 2 test (see below).
In the example shown you would keep E, D, A (as long as they passed the test above).
Stage 2
This leaves B and C. Sort the POSITION list for each. For example, this will tell you that the row with B's minimum position is row 2. Now do a direct comparison between B and every array below it in the minimum row, here row 2. Here there is only one such array, D. Do a direct comparison between B and D. This shows that B < D in row 3, therefore B is compatible with D. If the item is compatible with every array below its minimum position, keep it. We keep B.
Now we do the same thing for C. In C's case we need only do one direct comparison, with A. C dominates A, so we do not keep C.
Note that in addition to testing items that did not appear in the last column we need to test items that had equality in Stage 1. For example, imagine D=A=E in row 1. In this case we would have to do direct comparisons for every equality involving the array in the last column. So, in this case we direct compare E to A and E to D. This shows that E dominates D, so E is not kept.
The final result is we keep A, B, and D. C and E are discarded.
The overall performance of this algorithm is n^2 log n for Stage 1 plus somewhere between n (lower bound) and n log n (upper bound) for Stage 2. So the maximum running time is n^2 log n + n log n and the minimum running time is n^2 log n + n. Note that the running time of your algorithm is cubic, n^3: you compare each pair of arrays (n^2 pairs) and each comparison takes n element comparisons, giving n * n * n.
In general, this will be much faster than the brute force approach. Most of the time will be spent sorting the original matrix, a more or less unavoidable task. Note that you could potentially improve my algorithm by using priority queues instead of sorting, but the resulting algorithm would be much more complicated.
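For comparison, the broadcast trick from the earlier answer can be pushed all the way to a fully vectorized version of the original filter. This is only a sketch: it needs O(n^2 * k) memory for the pairwise comparison, and it treats exact duplicates differently from the loop version (both copies are dropped, since each dominates the other):

```python
import numpy as np

def filter_minimal_vectorized(arrs):
    """Keep only arrays that do not dominate any other array."""
    a = np.asarray(arrs)
    # dominates[i, j]: every element of row i is >= the matching element of row j
    dominates = np.all(a[:, None, :] >= a[None, :, :], axis=-1)
    np.fill_diagonal(dominates, False)  # ignore self-comparisons
    # Drop every row that dominates at least one other row.
    return a[~dominates.any(axis=1)]

result = filter_minimal_vectorized([[1, 2, 3], [1, 2, 0], [1, 0, 2], [1, 0, 0]])
```

With the example arrays from the question, only [1, 0, 0] survives, since every other array dominates at least one of the rest.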