Optimizing or replacing array iteration with python loop by numpy functionality

I have the following code that works as expected, but I'm curious whether the loop can be replaced by a native numpy function/method for better performance. I have one array holding RGB values that I use as a lookup table, and two 2D arrays holding greyscale values (0-255). Each value in these two arrays is an index along one axis of the lookup table.
As mentioned, what would be really nice is getting rid of the (slow) loop in python and using a faster numpy method.
#!/usr/bin/env python3
from PIL import Image
import numpy as np
dim = (2000, 2000)
rows, cols = dim
# holding a 256x256 RGB color lookup table
color_map = np.random.randint(0, 256, (256, 256, 3))
# image 1 greyscale values (upper bound is exclusive, so 256 gives 0-255)
color_map_idx_row = np.random.randint(0, 256, dim)
# image 2 greyscale values
color_map_idx_col = np.random.randint(0, 256, dim)
# output image data
result_data = np.zeros((rows, cols, 3), dtype=np.uint8)
# is there any built in function in numpy that could
# replace this loop?
# -------------------------------------------------------
for i in range(rows):
    for j in range(cols):
        row_idx = color_map_idx_row.item(i, j)
        col_idx = color_map_idx_col.item(i, j)
        rgb_color = color_map[row_idx,col_idx]
        result_data[i,j] = rgb_color
img = Image.fromarray(result_data, 'RGB')
img.save('result.png')

You can replace the double-for loop with fancy-indexing:
In [33]: result_alt = color_map[color_map_idx_row, color_map_idx_col]
This confirms the result is the same:
In [36]: np.allclose(result_data, result_alt)
Out[36]: True
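As a minimal, self-contained sketch of why this works: indexing a (256, 256, 3) table with two integer arrays of the same (rows, cols) shape returns a (rows, cols, 3) array, one RGB triple per index pair. The small sizes, the names lut, idx_row and idx_col, and the seeded generator are assumptions chosen for illustration, not part of the original code.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable

# Small stand-ins for the arrays in the question (sizes are illustrative).
lut = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)
idx_row = rng.integers(0, 256, size=(4, 5))
idx_col = rng.integers(0, 256, size=(4, 5))

# Reference result: the explicit double loop.
looped = np.zeros((4, 5, 3), dtype=np.uint8)
for i in range(4):
    for j in range(5):
        looped[i, j] = lut[idx_row[i, j], idx_col[i, j]]

# Integer array ("fancy") indexing does the same lookup in one step.
vectorized = lut[idx_row, idx_col]

print(vectorized.shape)                    # (4, 5, 3)
print(np.array_equal(looped, vectorized))  # True
```

Since the values are integers, np.array_equal is enough for the comparison; np.allclose also works but is meant for floating-point tolerance.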

You can reshape the 3D array into a 2D array whose second axis holds the three channels, then index its rows with linear indices computed from the row and column index arrays. Note that the reshaped array is only a view, so it does not allocate any additional memory. The linear index is the row index times the number of columns of the lookup table, plus the column index. Thus, we would have -
n = color_map.shape[1]
out = color_map.reshape(-1,3)[color_map_idx_row*n + color_map_idx_col]
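For a square lookup table both leading dimensions are equal, so the choice of multiplier does not matter, but in general the multiplier has to be the number of columns (shape[1]). The sketch below uses a deliberately non-square, hypothetical table to check this, and compares the hand-computed index against np.ravel_multi_index. All names and sizes are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical non-square lookup table: 4 rows, 6 columns, 3 channels.
lut = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)
idx_row = rng.integers(0, 4, size=(3, 3))
idx_col = rng.integers(0, 6, size=(3, 3))

n_cols = lut.shape[1]
flat_lut = lut.reshape(-1, 3)  # a view, because lut is C-contiguous

# Manual linear index and the equivalent NumPy helper.
linear = idx_row * n_cols + idx_col
linear_alt = np.ravel_multi_index((idx_row, idx_col), lut.shape[:2])

out = flat_lut[linear]
print(np.array_equal(linear, linear_alt))          # True
print(np.array_equal(out, lut[idx_row, idx_col]))  # True
print(np.shares_memory(flat_lut, lut))             # True
```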

Related

Enumerate through 2D array in numpy and append to new array

I'm trying to enumerate through a 2D numpy array of shape (512, 512), which holds the pixel values of an image. So basically it's an array representing width and height in pixel values for the image. I'm trying to enumerate through each element to output: [y_index, x_index, pixel_value]. I need these 3 output values to be stored in an array (either the existing one or a new one, whichever is more efficient to execute).
So the input array would have a shape of (512, 512) and, if I'm not mistaken, the output array would have a shape of (262144, 3): 262144 is the total number of pixels in a 512x512 matrix, and 3 because there are 3 columns for the 3 pieces of data that I need to extract: pixel value, y coordinate, x coordinate. So basically I want an array of pixel values and their y, x coordinates.
In the code below, I used ndenumerate to enumerate through the numpy array (img) of pixel values (512, 512), but I'm struggling with how to store the output values in an array. I created a coordinates array to store the output values, but my attempt in the last line is clearly incorrect. How can I solve this?
img = as_np_array[:, :, 0]
img = img.reshape(512, 512)
coordinates = np.empty([262,144, 3])
for index, x in np.ndenumerate(img):
    coordinates = np.append(coordinates, [[x], index[0], index[1]])
The other challenge I'm facing is that this code takes about 7-10 minutes (possibly more at times) to execute on my Intel Core i7 2.7GHz (4 cores) processor. Is there more efficient code that would run faster?
Any help would be greatly appreciated.
You could use numpy.indices to do this. What you want ultimately is image_data with y, x indices and the corresponding pixels (px). There are three columns in image_data.
row, col = np.indices(img.shape)
y, x, px = row.flatten(), col.flatten(), img.flatten()
image_data = np.array([y, x, px]).T
Detailed Example:
img = np.arange(20).reshape(5, 4)
def process_image_data(img):
    row, col = np.indices(img.shape)
    return (row.flatten(), col.flatten(), img.flatten())
y, x, px = process_image_data(img)
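To make the resulting layout concrete, here is a small sketch on a 2x3 array standing in for the 512x512 image. It uses np.column_stack in place of np.array([...]).T, which is a swapped-in but equivalent way to build the (N, 3) table; the tiny input is an assumption for illustration.

```python
import numpy as np

img = np.arange(6).reshape(2, 3)  # tiny stand-in for the 512x512 image

row, col = np.indices(img.shape)

# One row per pixel: [y_index, x_index, pixel_value].
image_data = np.column_stack((row.ravel(), col.ravel(), img.ravel()))

print(image_data.shape)  # (6, 3)
print(image_data[4])     # [1 1 4]
```

For a (512, 512) input the same code yields an array of shape (262144, 3) without any Python-level loop.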
The solution that worked for me is this:
import csv
with open('img_pixel_coor1.csv', 'w', newline='') as f:
    headernames = ['y_coord', 'x_coord', 'pixel_value']
    thewriter = csv.DictWriter(f, fieldnames=headernames)
    thewriter.writeheader()
    # write one CSV row per pixel
    for index, pix in np.ndenumerate(img):
        thewriter.writerow({'y_coord' : index[0], 'x_coord' : index[1], 'pixel_value' : pix})
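If the goal is a CSV file, one possible loop-free alternative is to build the (N, 3) table with np.indices and hand it to np.savetxt. This is only a sketch under assumptions: the tiny image, the temporary output path and the variable names are hypothetical, and integer pixel values are assumed (hence fmt="%d").

```python
import os
import tempfile

import numpy as np

img = np.arange(6).reshape(2, 3)  # tiny stand-in image

row, col = np.indices(img.shape)
table = np.column_stack((row.ravel(), col.ravel(), img.ravel()))

# Hypothetical output path inside a temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "img_pixel_coords.csv")

# comments="" stops savetxt from prefixing the header line with "# ".
np.savetxt(out_path, table, fmt="%d", delimiter=",",
           header="y_coord,x_coord,pixel_value", comments="")

with open(out_path) as f:
    lines = f.read().splitlines()

print(lines[0])  # y_coord,x_coord,pixel_value
print(lines[1])  # 0,0,0
```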

Numpy 2D spatial mask to be filled with specific values from a 2D array to form a 3D structure

I'm quite new to programming in general, but I could not figure this problem out until now.
I've got a two-dimensional numpy array mask, let's say mask.shape is (3800, 3500), which is filled with 0s and 1s representing the spatial resolution of a 2D image, where a 1 represents a visible pixel and a 0 represents background.
I've got a second two-dimensional array data, where data.shape is (909, x) and x is exactly the number of 1s in the first array. I now want to replace each 1 in the first array with a vector of length 909 from the second array, resulting in a final 3D array of shape (3800, 3500, 909), which is basically a 2D x by y image where selected pixels have a spectrum of 909 values in the z direction.
I tried
mask_vector = mask.flatten
ones = np.ones((909,1))
mask_909 = mask_vector.dot(ones) #results in a 13300000 by 909 2d array
count = 0
for i in mask_vector:
    if i == 1:
        mask_909[i,:] = data[:,count]
        count += 1
result = mask_909.reshape((3800,3500,909))
This results in a viable 3D array giving a 2D picture when doing plt.imshow(result.mean(axis=2))
But the values are still only 1s and 0s not the wanted spectral data in z direction.
I also tried using np.where but broadcasting fails as the two 2D arrays have clearly different shapes.
Has anybody got a solution? I am sure that there must be an easy way...
Basically, you simply need to use np.where to locate the 1s in your mask array. Then initialize your result array to zero and replace the third dimension with your data using the outputs of np.where:
import numpy as np
m, n, k = 380, 350, 91
mask = np.round(np.random.rand(m, n))
x = np.sum(mask == 1)
data = np.random.rand(k, x)
result = np.zeros((m, n, k))
row, col = np.where(mask == 1)
result[row,col] = data.transpose()
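The same assignment can also be written with a boolean mask instead of the index arrays returned by np.where. The sketch below, on tiny illustrative sizes with a seeded generator (both assumptions), checks that the two forms agree and that the i-th visible pixel, in row-major order, receives column i of data.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, k = 4, 5, 3  # tiny illustrative sizes
mask = rng.integers(0, 2, size=(m, n))
x = int(mask.sum())
data = rng.random((k, x))

# Approach from the answer: integer indices from np.where.
result = np.zeros((m, n, k))
row, col = np.where(mask == 1)
result[row, col] = data.T

# Alternative: assign through the boolean mask directly.
result_bool = np.zeros((m, n, k))
result_bool[mask == 1] = data.T

print(np.array_equal(result, result_bool))  # True
```

Both forms visit the masked pixels in row-major order, so this only gives the intended result if the columns of data are stored in that same order.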

removing entries from a numpy array

I have a multidimensional numpy array with the shape (4, 2000). Each column in the array is a 4D element where the first two elements represent 2D positions.
Now, I have an image mask with the same shape as an image which is binary and tells me which pixels are valid or invalid. An entry of 0 in the mask highlights pixels that are invalid.
What I would like to do is filter my first array based on this mask, i.e. remove entries where the position elements in my first array correspond to invalid pixels in the image. This can be done by looking up the corresponding entries in the mask and marking for deletion those columns which correspond to a 0 entry in the mask.
So, something like:
import numpy as np
# Let mask be a 2D array of 0 and 1s
array = np.random.rand(4, 2000)
for i in range(2000):
    current = array[:, i]
    if mask[current[0], current[1]] <= 0:
        # Somehow remove this entry from my array.
If possible, I would like to do this without looping as I have in my incomplete code.
You could select the x and y coordinates from array like this:
xarr, yarr = array[0, :], array[1, :]
Then form a boolean array of shape (2000,) which is True wherever the mask is 1:
idx = mask[xarr, yarr].astype(bool)
mask[xarr, yarr] is using so-called "integer array indexing".
All it means here is that the ith element of idx equals mask[xarr[i], yarr[i]].
Then select those columns from array:
result = array[:, idx]
import numpy as np
mask = np.random.randint(2, size=(500,500))
array = np.random.randint(500, size=(4, 2000))
xarr, yarr = array[0, :], array[1, :]
idx = mask[xarr, yarr].astype(bool)
result = array[:, idx]
cols = []
for i in range(2000):
    current = array[:, i]
    if mask[current[0], current[1]] > 0:
        cols.append(i)
expected = array[:, cols]
assert np.allclose(result, expected)
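The example above stores the positions as integers. If the array holds floats, as with np.random.rand in the question, the position rows have to be converted to integers before they can be used as indices. A minimal sketch, assuming that truncating the float positions is the intended pixel lookup (the sizes and scaling are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
mask = rng.integers(0, 2, size=(50, 50))

# Hypothetical data: float array whose first two rows are pixel positions.
array = rng.random((4, 200))
array[:2] *= 50  # scale positions into the mask's index range [0, 50)

# Float arrays cannot be used as indices, so truncate to integers first.
xarr = array[0].astype(int)
yarr = array[1].astype(int)

keep = mask[xarr, yarr].astype(bool)
result = array[:, keep]

print(result.shape[0])                     # 4
print(result.shape[1] == int(keep.sum()))  # True
```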
I'm not sure if I'm reading the question right. Let's try again!
You have an array with 2 dimensions and you want to remove all columns that have masked data. Again, apologies if I've read this wrong.
import numpy.ma as ma
a = ma.array([[1,2,3,4,5],[6,7,8,9,10]], mask=[[0,0,0,1,0],[0,0,1,0,0]])
a[:,~a.mask.any(0)] # this is where the action happens
The a.mask.any(0) call identifies all columns that contain a masked value, as a Boolean array. It's negated (the '~' operator, since unary minus is not supported on Boolean arrays in current numpy) because we want the inverse, and that array is then used to keep only the unmasked columns via indexing.
This gives me an array:
[[1 2 5],[6 7 10]]
In other words, all columns containing masked data anywhere have been removed from the array. Hope I got it right this time.
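A runnable sketch of this approach, for reference. Recent NumPy versions negate Boolean arrays with ~ rather than unary minus, and numpy.ma also provides compress_cols, which drops every column containing a masked value and returns a plain ndarray. The variable names are illustrative.

```python
import numpy as np
import numpy.ma as ma

a = ma.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]],
             mask=[[0, 0, 0, 1, 0], [0, 0, 1, 0, 0]])

# Columns with at least one masked entry.
bad_cols = a.mask.any(axis=0)

kept = a[:, ~bad_cols]            # masked array with no masked entries left
kept_plain = ma.compress_cols(a)  # plain ndarray with the same columns

print(kept.data.tolist())   # [[1, 2, 5], [6, 7, 10]]
print(kept_plain.tolist())  # [[1, 2, 5], [6, 7, 10]]
```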

Assigning a RGB column array to an arbitrary number of columns in an image array

Let's say I have a 4x5 RGB image array, a single RGB row array, and a single RGB column array.
import numpy as np
img=np.zeros((4,5,3))
row=np.arange(15).reshape((5,3))
col=np.arange(12).reshape((4,3))
It is simple to assign the row array to multiple rows of the image array.
img[1:3] = row
It is equally simple to assign the column array to a single column of the image array.
img[:,1,:] = col
It is easy enough to assign the column array to multiple columns of the image array using a loop.
for n in (2,3):
img[:,n,:] = col
But looping seems inefficient. Is there a better way (i.e., without looping) to assign the RGB column array to an arbitrary number of columns?
img[:, [2,3], :] = col[:, None, :]
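The reason this works is broadcasting: col[:, None, :] has shape (4, 1, 3), and the target img[:, [2,3], :] has shape (4, 2, 3), so the length-1 axis is stretched across both selected columns. A short sketch comparing it with the loop from the question (the names img_loop and img_bcast are illustrative):

```python
import numpy as np

col = np.arange(12).reshape((4, 3))

# Loop version from the question.
img_loop = np.zeros((4, 5, 3))
for n in (2, 3):
    img_loop[:, n, :] = col

# Broadcast version: the inserted length-1 axis is stretched
# across the two selected columns.
img_bcast = np.zeros((4, 5, 3))
img_bcast[:, [2, 3], :] = col[:, None, :]

print(col[:, None, :].shape)                # (4, 1, 3)
print(np.array_equal(img_loop, img_bcast))  # True
```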

How to get an array from RGB values of a bitmap image?

I am running this code
from PIL import Image
import numpy as np
im = Image.open("/Users/Hugo/green_leaves.jpg")
im.load()
height, widht = im.size
p = np.array([0,0,0])
for row in range(height):
    for col in range(widht):
        a = im.getpixel((row,col))
        p = np.append(a.asarray())
But I am getting the following error
Traceback (most recent call last):
File "/Users/hugo/PycharmProjects/Meteo API/image.py", line 17, in <module>
p = np.append(a.asarray())
AttributeError: 'tuple' object has no attribute 'asarray'
Could you help me?
You mentioned numpy. If you want a numpy array of the image, don't iterate through it, just do data = np.array(im).
E.g.
from PIL import Image
import numpy as np
im = Image.open("/Users/Hugo/green_leaves.jpg")
p = np.array(im)
Building up a numpy array by repeatedly appending to it is very inefficient. Numpy arrays aren't like Python lists (which serve that purpose very well). They're fixed-size, homogeneous, memory-efficient arrays.
If you did want to build up a numpy array through appending, use a list (which can be efficiently appended to) and then convert that list to a numpy array.
However, in this case, PIL images support being converted to numpy arrays directly.
On one more note, the example I gave above isn't 100% equivalent to your code. p will be a height by width by numbands (3 or 4) array, instead of a numpixels by numbands array as it was in your original example.
If you want to reshape the array into numpixels by numbands, just do:
p = p.reshape(-1, p.shape[2])
(Or equivalently, p.shape = -1, p.shape[2])
This will reshape the array into a width*height by numbands (either 3 or 4, depending on whether or not there's an alpha channel) array, in other words a sequence of the red, green, blue (and alpha) pixel values in the image. The -1 is a placeholder that tells numpy to calculate the appropriate size for the first axis based on the other sizes that are specified.
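To illustrate how pixels map to rows after the reshape, here is a sketch on a synthetic array standing in for np.array(im) (an assumption made so the example does not depend on an image file). With the default row-major layout, pixel (y, x) ends up at row y * width + x.

```python
import numpy as np

height, width, bands = 2, 3, 3
# Synthetic stand-in for np.array(im): every value is unique.
p = np.arange(height * width * bands).reshape(height, width, bands)

flat = p.reshape(-1, p.shape[2])
print(flat.shape)  # (6, 3)

# Pixel (y, x) ends up at row y * width + x of the flattened array.
y, x = 1, 2
print(np.array_equal(flat[y * width + x], p[y, x]))  # True
```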
Initialize p as a list, and convert it to a numpy array after the for-loop:
p=[]
for row in range(height):
    for col in range(widht):
        a = im.getpixel((row,col))
        p.append(a)
p=np.asarray(p)
This will create an array of shape (*, 3), which is the same shape as np.array(im).reshape(-1, 3). So if you need this, just use the latter form ;)
