I'm trying to make an RGB color picture editor using just numpy.
I've tried using a nested for loop, but it's really slow (over a minute).
I want to control the first, second, and third elements (r, g, b) of the third dimension of the nested array. Thanks
This is to just look at the numbers:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

img = plt.imread('galaxy.jpg')
img = np.array(img)

for i in range(len(img)):
    for j in range(len(img[i])):
        for k in img[i][j]:
            print(k)
Perhaps this might help you. np.ndenumerate() lets you iterate through a matrix without writing nested for loops yourself. I did a quick test, and my second for loop (in the example below) is slightly faster than your triple nested loop, as far as printing is concerned. Printing is very slow, so taking out the print statements might help with speed. As for modifying the values, I added r, g, b, a variables that can be changed to scale the various pixel values. Just a thought, but perhaps it might give you more ideas to expand on. Also, I didn't check which index values correspond to r, g, b, or a.
r = 1.0
g = 1.0
b = 1.0
a = 1.0

for index, pixel in np.ndenumerate(img):  # <--- achieves the same as your original code
    print(pixel)

# np.ndenumerate visits every scalar, so index is (row, col, channel);
# guard on the channel index to handle each pixel only once
for index, pixel in np.ndenumerate(img):
    i, j, k = index
    if k == 0:
        print("{} {} {} {}".format(img[i][j][0], img[i][j][1], img[i][j][2], img[i][j][3]))

for index, pixel in np.ndenumerate(img):
    i, j, k = index
    if k == 0:
        img[i][j][0] *= r
        img[i][j][1] *= g
        img[i][j][2] *= b
        img[i][j][3] *= a
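For speed, though, the scaling doesn't need a Python loop at all; a minimal vectorized sketch (assuming the last axis really holds the channels, and that scaling in floating point and casting back is acceptable):

scales = np.array([r, g, b, a])
# the length-4 scale vector broadcasts across every pixel along the
# last (channel) axis of the (H, W, 4) image in one multiply
img = (img * scales).astype(img.dtype)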
Hope this helps
Related
I am working with a 4D data set, where I have a nested for loop (4 loops). The for loop works, but it takes a while to run: ~5 minutes. I am trying to rewrite this properly with a list comprehension instead, but I am getting confused about exactly how to do this given my nested loops:
import numpy as np

data = np.random.rand(12, 27, 282, 375)
stdev_data = np.std(data, axis=1)

## nested for loop
count = []
for i in range(data.shape[0]):
    for j in range(data.shape[1]):
        for lat in range(data.shape[2]):
            for lon in range(data.shape[3]):
                count.append((data[i, j, lat, lon] < -1.282 * stdev_data[i, lat, lon]).sum(axis=0))
reshape_counts = np.reshape(count, data.shape)
This is my attempt at the list comprehension:
i, j, lat, lon = data.shape[0], data.shape[1], data.shape[2], data.shape[3]
print(i, j, lat, lon)
test_list = [[(data < -1.282 * stdev_data).sum(axis=0) for lon in lat] for j in i]
I get an error saying 'int' object is not iterable. How do I rewrite my nested for loop in the form of list comprehension to speed up the process?
Given that you are using numpy, I suggest you take advantage of the fact that their for loops are written in C, and often optimized. You will still end up stepping through the data, but a lot faster. This approach is called vectorization.
In this case, you seek to make a boolean mask, which arguably simplifies the operation. Keep in mind that the .sum() call in your expression is a red herring: you are actually summing a scalar boolean, which will always give you zero or one.
Here is how you would find points smaller than -1.282 of the sigma in the second dimension:
result = data < -1.282 * stdev_data[:, None, ...]
Alternatively, you could do
result = data < -1.282 * stdev_data.reshape(stdev_data.shape[0], 1, *stdev_data.shape[1:])
or
result = data < -1.282 * np.reshape(stdev_data, stdev_data.shape[:1] + (1,) + stdev_data.shape[1:])
An even easier solution would be to pass keepdims=True to np.std from the very beginning:
result = data < -1.282 * np.std(data, axis=1, keepdims=True)
keepdims=True ensures that the output of std has the shape (12, 1, 282, 375) instead of just (12, 282, 375), so you don't need to re-insert the dimension yourself.
Now if you actually wanted to compute the counts as your question seems to imply, you could just sum the result mask along the second dimension:
counts = result.sum(axis=1)
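Putting the pieces together, a minimal end-to-end sketch:

import numpy as np

data = np.random.rand(12, 27, 282, 375)
stdev_data = np.std(data, axis=1, keepdims=True)  # shape (12, 1, 282, 375)
result = data < -1.282 * stdev_data               # boolean mask, shape (12, 27, 282, 375)
counts = result.sum(axis=1)                       # per-cell counts, shape (12, 282, 375)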
Finally, to answer your actual question exactly as stated: for loops translate directly into list comprehensions. In your case, that means four fors in the comprehension, in exactly the order you originally had them:
[data[i, j, lat, lon] < -1.282 * stdev_data[i, lat, lon]
for i in range(data.shape[0])
for j in range(data.shape[1])
for lat in range(data.shape[2])
for lon in range(data.shape[3])]
Since comprehensions are surrounded by brackets, you are free to write their contents on separate lines as I've done, although this is of course not required. Notice that the only real differences are that the contents of the append comes first and there are no colons. Also, that red herring sum is gone.
I don't think the sum would do anything except convert False to 0 and True to 1, since you'd just be comparing two numbers to each other. I think the following would do the same thing. (I couldn't find a way to get rid of the last loop; if you really need it to be faster, maybe joblib or numba would help, but I don't use them very much, so I'm not sure.)
count = np.empty(data.shape)
for j in range(data.shape[1]):
    count[:, j, ...] = (data[:, j, ...] < -1.282 * stdev_data).astype(np.int32)
But also: the standard deviation can't be negative, so nothing will satisfy the condition above. You're comparing against a negative threshold, yet all your data is between 0 and 1, so I'd recommend double-checking everything.
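That claim is easy to verify directly (a quick sketch):

import numpy as np

data = np.random.rand(12, 27, 282, 375)           # everything in [0, 1)
stdev_data = np.std(data, axis=1, keepdims=True)  # always >= 0
# no value in [0, 1) can fall below a non-positive threshold
print((data < -1.282 * stdev_data).any())  # False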
I have a numpy array with zeros and non-zeros and shape (10,10).
To a subpart of this array I need to add a certain value, but only where the initial value is not zero.
a[2:7, 2:7] += 0.5  # but with the condition that a != 0
Currently, I do it in a rather cumbersome way, by first making a copy of the array and modifying the second array consistently and then copying back to the first.
b = a.copy()
b[b != 0] = 1
b[2:7, 2:7] *= 0.5
b[b == 1] = 0
a += b
Is there more elegant way to achieve this?
As Thomas Kühn correctly wrote in the comments, it's good enough to create a reference to that subpart of the array and modify it. So the following does the job:
b = a[2:7, 2:7]  # a view, not a copy: modifying b modifies a
b[b != 0] += 0.5
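Alternatively, the condition can be folded into the one-liner from the question (a sketch, assuming a holds floats; the boolean mask contributes 0.5 where a is non-zero and 0 elsewhere):

a[2:7, 2:7] += 0.5 * (a[2:7, 2:7] != 0)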
I have to evaluate the following expression, given two quite large matrices A,B and a very complicated function F:
[image in the original post: the mathematical expression to evaluate]
I was wondering whether there is an efficient way to first find the indices i, j that will give a non-zero element after the multiplication of the matrices, so that I can avoid the quite slow for loops.
Current working code
import numpy as np

# Starting with 4 random matrices
A = np.random.randint(0, 2, size=(50, 50))
B = np.random.randint(0, 2, size=(50, 50))
C = np.random.randint(0, 2, size=(50, 50))
D = np.random.randint(0, 2, size=(50, 50))

indices = []
for i in range(A.shape[0]):
    for j in range(A.shape[0]):
        if A[i, j] != 0:
            for k in range(B.shape[1]):
                if B[j, k] != 0:
                    for l in range(C.shape[1]):
                        if A[i, j] * B[j, k] * C[k, l] * D[l, i] != 0:
                            indices.append((i, j, k, l))
print(indices)
As you can see, in order to get the indices I need I have to use nested loops (= huge computational time).
My guess would be NO: you cannot avoid the for loops. To find all the indices i, j you need to visit all the elements, which defeats the purpose of the check. Therefore, you should go ahead and use plain elementwise multiplication and dot products in numpy; it should be quite fast, with the for loops taken care of by numpy.
However, if you plan on using a Python loop then the answer is YES, you can avoid them by using numpy, using the following pseudo-code (=hand-waving):
i, j = np.indices((N, M))  # CAREFUL: you may need to swap i <-> j or N <-> M
fs = F(i, j, z)            # array of values of function F
                           # for a given z over the index grid
R = np.dot(A * fs, B)      # summation over j
# return R  # if necessary, do a summation over i: np.sum(R, axis=...)
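To make the hand-waving concrete, here is a runnable toy version with a made-up F (purely illustrative; the shapes and the function are assumptions):

import numpy as np

N, M = 4, 5
A = np.random.rand(N, M)
B = np.random.rand(M, N)
z = 0.5

def F(i, j, z):
    # hypothetical stand-in for the "very complicated function"
    return np.sin(i + 2 * j) * z

i, j = np.indices((N, M))
fs = F(i, j, z)        # (N, M) array of F values
R = np.dot(A * fs, B)  # sums over j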
If the issue is that computing fs = F(i, j, z) is a very slow operation, then you will have to identify the elements of A that are zero, using two loops built into numpy (so they are quite fast):
good = np.nonzero(A)                # hidden double loop (for 2D data)
fs = np.zeros_like(A, dtype=float)  # float dtype so F's values aren't truncated
fs[good] = F(i[good], j[good], z)   # compute F only where A != 0
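For completeness, the index search in the question's own code can also be vectorized with broadcasting; a sketch (the full boolean grid here has 50^4 entries, about 6 MB, so memory becomes the limiting factor for larger matrices):

# mask[i, j, k, l] is True exactly when A[i,j]*B[j,k]*C[k,l]*D[l,i] != 0
mask = ((A[:, :, None, None] != 0) &
        (B[None, :, :, None] != 0) &
        (C[None, None, :, :] != 0) &
        (D.T[:, None, None, :] != 0))
indices = np.argwhere(mask)  # rows are (i, j, k, l)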
I'm trying to map a color histogram where each pixel also has another (float) property, alpha, taken from an array of the same size.
I eventually want a dictionary of (color) -> (count, sum), where count is the histogram count for that color and sum is the sum of the alpha values that correspond to that color.
Here's simple Python code that does what I want (c and alpha have the same length, and are very long):
hist = {}  # maps str(color) -> [count, alpha_sum]; 'dict' would shadow the builtin
for i in range(len(c)):
    if str(c[i]) in hist:
        hist[str(c[i])][0] += 1
        hist[str(c[i])][1] += alpha[i]
    else:
        hist[str(c[i])] = [1, alpha[i]]  # count starts at 1, not 0
but naturally that takes a lot of time. Any ideas for a numpy equivalent?
Thanks
Okay, so I eventually found a very nice solution, using only numpy, based on this answer:
https://stackoverflow.com/a/8732260/1752591
It's a function that sums up a vector according to another vector of (group) indices.
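For reference, the linked function looks roughly like this (a sketch adapted from that answer; see the link for the canonical version):

import numpy as np

def sum_by_group(values, groups):
    # sort both arrays by group id so equal ids are adjacent
    order = np.argsort(groups, kind='stable')
    groups = groups[order]
    values = values[order]
    # running sum; the last entry of each run holds that group's running total
    values = np.cumsum(values)
    index = np.ones(len(groups), dtype=bool)
    index[:-1] = groups[1:] != groups[:-1]
    values = values[index]
    groups = groups[index]
    # subtract the previous group's total to isolate each group's sum
    values[1:] = values[1:] - values[:-1]
    return values, groups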
So all I had to do was give an id to each color and build the dictionary:
# (inside the enclosing function)
d = alpha.reshape((-1))
id = color_code_image(colormap)
v, g = sum_by_group(d, id)
count, g = sum_by_group(np.ones(len(d)), id)
avg = v / count
return dict(np.array([g, avg]).T)
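Assuming id holds non-negative integer codes, np.bincount would give the same counts and sums even more directly (a sketch):

counts = np.bincount(id)           # histogram count per color id
sums = np.bincount(id, weights=d)  # per-id sum of alpha values
present = counts > 0
avg = sums[present] / counts[present]
result = dict(zip(np.nonzero(present)[0], avg))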
Okay, so I've got a piece of Python code which really needs optimizing.
It's a Game-of-Life iteration that runs over a small (80x60-pixel) image and extracts the RGB values from it.
I'm currently using nested for loops; I'd rather swap them out for the faster C-implemented map() function, but if I do that I can't figure out how to get the x, y values, nor how to reach the local variables defined outside the scope of the functions I'd need to define.
Would using map() be any faster than the current set of for loops? How could I use it and still get x, y?
I currently use pygame Surfaces, and I've tried the surfarray/pixelarray modules, but since I'm changing/getting every pixel, it's a lot slower than Surface.get_at()/set_at().
Also, slightly irrelevant... do you think this could be made quicker if Python weren't traversing a list of numbers but just incrementing a number, like in other languages? Why doesn't Python include a normal for() as well as its foreach()?
The number of conditionals there probably makes things slower too, right? The slowest part is checking for neighbours (where it builds the list n)... I replaced that whole bit with slice access on a 2D array, but it doesn't work properly.
Redacted version of code:
xr = xrange(80)
yr = xrange(60)

# surface is an instance of pygame.Surface;
# bind the methods themselves, don't call them here
get_at = surface.get_at
set_at = surface.set_at

for x in xr:
    # ....
    for y in yr:
        # ...
        pixelR = get_at((x, y))[0]
        pixelG = get_at((x, y))[1]
        pixelB = get_at((x, y))[2]
        # ... more complex stuff here which changes R, G, B values independently of each other
        set_at((x, y), (pixelR, pixelG, pixelB))
Full version of the function:
# xr, yr = xrange(80), xrange(60)
def live(surface, xr, yr):
    randint = random.randint
    set_at = surface.set_at
    get_at = surface.get_at
    perfect = perfectNeighbours  #
    minN = minNeighbours         # All global variables that are defined in a config file.
    maxN = maxNeighbours         #
    pos = actual                 # actual = (80, 60)
    n = []
    append = n.append
    NEIGHBOURS = 0
    for y in yr:  # going height-first for aesthetic reasons.
        decay = randint(1, maxDecay)
        growth = randint(1, maxGrowth)
        for x in xr:
            r, g, b, a = get_at((x, y))
            del n[:]
            NEIGHBOURS = 0
            if x > 0 and y > 0 and x < pos[0] - 1 and y < pos[1] - 1:
                append(get_at((x - 1, y - 1))[1])
                append(get_at((x + 1, y - 1))[1])
                append(get_at((x, y - 1))[1])
                append(get_at((x - 1, y))[1])
                append(get_at((x + 1, y))[1])
                append(get_at((x - 1, y + 1))[1])
                append(get_at((x + 1, y + 1))[1])
                append(get_at((x, y + 1))[1])
            for a in n:
                if a > 63:
                    NEIGHBOURS += 1
            if NEIGHBOURS == 0 and (r, g, b) == (0, 0, 0): pass
            else:
                if NEIGHBOURS < minN or NEIGHBOURS > maxN:
                    g = 0
                    b = 0
                elif NEIGHBOURS == perfect:
                    g += growth
                    if g > 255:
                        g = 255
                    b += growth
                    if b > growth: b = growth
                else:
                    if g > 10: r = g - 10
                    if g > 200: b = g - 100
                    if r > growth: g = r
                    g -= decay
                    if g < 0:
                        g = 0
                        b = 0
                    r -= 1
                    if r < 0:
                        r = 0
            set_at((x, y), (r, g, b))
What's making your code slow is probably not the loops; they are incredibly fast.
What slows down your code is the number of function calls. For example,
pixelR = get_at((x,y))[0]
pixelG = get_at((x,y))[1]
pixelB = get_at((x,y))[2]
is a lot slower than (about 3 times, I'd guess)
r, g, b, a = get_at((x,y))
Every get_at, set_at call locks the surface, therefore it's faster to directly access the pixels using the available methods. The one that seems most reasonable is Surface.get_buffer.
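If numpy is available, pygame.surfarray offers a similarly direct view of the pixels; a minimal sketch (assuming a 24/32-bit surface; note the surface stays locked while the view exists):

import numpy as np
import pygame.surfarray as surfarray

px = surfarray.pixels3d(surface)  # (width, height, 3) uint8 view into the pixels
# vectorized edit: brighten every pixel's green channel at once
px[..., 1] = np.minimum(px[..., 1].astype(np.int16) + 10, 255)
del px  # drop the view to unlock the surface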
Using map doesn't work in your example, because you need the indexes. With as few as 80 and 60 numbers it might even be faster to use range() instead of xrange().
map(do_stuff, ((x, y) for x in xrange(80) for y in xrange(60)))
where do_stuff would presumably be defined like so:
def do_stuff(coords):
    r, g, b, a = get_at(coords)
    # ... whatever you need to do with those ...
    set_at(coords, (r, g, b))
You could alternatively use a list comprehension instead of a generator expression as the second argument to map (replace ((x, y) ...) with [(x, y) ...]) and use range instead of xrange. I'd say that it's not very likely to have a significant effect on performance, though.
Edit: Note that gs is certainly right about the for loops not being the main thing in need of optimisation in your code... Cutting down on superfluous calls to get_at is more important. In fact, I'm not sure if replacing the loops with map will actually improve performance here at all... Having said that, I find the map version more readable (perhaps because of my FP background...), so here you go anyway. ;-)
Since you are reading and rewriting every pixel, I think you can get the best speed improvement by not using a Surface.
I suggest first taking your 80x60 image and converting it to a plain bitmap file with 32-bit pixels. Then read the pixel data into a python array object. Now you can walk over the array object, reading values, calculating new values, and poking the new values into place with maximum speed. When done, save your new bitmap image, and then convert it to a Surface.
You could also use 24-bit pixels, but that should be slower. 32-bit pixels means one pixel is one 32-bit integer value, which makes the array of pixels much easier to index. 24-bit packed pixels means each pixel is 3 bytes, which is much more annoying to index into.
I believe you will gain much more speed out of this approach than by trying to avoid the use of for. If you try this, please post something here to let us know how well it worked or didn't. Good luck.
EDIT: I thought that an array has only a single index. I'm not sure how you managed to get two indexes to work. I was expecting you to do something like this:
def __i(x, y):
    assert 0 <= x < 80
    assert 0 <= y < 60
    i = (y * 80 + x) * 4
    return i

def red(x, y):
    return __a[__i(x, y)]

def green(x, y):
    return __a[__i(x, y) + 1]

def blue(x, y):
    return __a[__i(x, y) + 2]

def rgb(x, y):
    i = __i(x, y)
    return __a[i], __a[i + 1], __a[i + 2]

def set_rgb(x, y, r, g, b):
    i = __i(x, y)
    __a[i] = r       # note: __a, not _a as originally typed
    __a[i + 1] = g
    __a[i + 2] = b

# example:
r, g, b = rgb(23, 33)
Since a Python array can only hold a single type, you will want to set the type to "unsigned byte" and then index like I showed.
Where of course __a is the actual array variable.
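For instance, the buffer could be created like this (a sketch, assuming 80x60 pixels at 4 bytes each):

from array import array

# 'B' = unsigned byte; one flat zeroed buffer of 80*60 pixels, 4 bytes (RGBA) each
__a = array('B', [0]) * (80 * 60 * 4)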
If none of this is helpful, try converting your bitmap into a list, or perhaps three lists. You can use nested lists to get 2D addressing.
I hope this helps. If it is not helpful, then I am not understanding what you are doing; if you explain more I'll try to improve the answer.