I'm trying to enumerate through a 2D numpy array of shape (512, 512), which holds the pixel values of an image. So basically it's an array representing width and height in pixel values for the image. I'm trying to enumerate through each element to output: [y_index, x_index, pixel_value]. I need these 3 output values to be stored in an array (either the existing one or a new one, whichever is more efficient to execute).
So the input array would have a shape of (512, 512), and if I'm not mistaken, the output array would have an array shape (262144, 3). 262144 is the total number of pixels in a 512x512 matrix. And 3 because there are 3 columns, for 3 pieces of data that I need to extract: pixel value, y coordinate, x coordinate. So basically I want to have an array of pixel values and their y, x coordinates.
In the code below, I used ndenumerate to enumerate through the numpy array (img) of pixel values (512, 512), but I'm struggling with how to store the output values in an array. I created a coordinates array to store the output values, but my attempt in the last line clearly does not achieve the desired effect. How can I solve this?
img = as_np_array[:, :, 0]
img = img.reshape(512, 512)
coordinates = np.empty([262,144, 3])
for index, x in np.ndenumerate(img):
    coordinates = np.append(coordinates, [[x], index[0], index[1]])
The other challenge I'm facing is speed: on my Intel Core i7 2.7GHz (4 cores) processor, this code takes about 7-10 minutes (possibly more at times) to run. Is there more efficient code that would run faster?
Any help would be greatly appreciated.
You could use numpy.indices to do this. What you want ultimately is image_data with y, x indices and the corresponding pixels (px). There are three columns in image_data.
row, col = np.indices(img.shape)
y, x, px = row.flatten(), col.flatten(), img.flatten()
image_data = np.array([y, x, px]).T
Detailed Example:
img = np.arange(20).reshape(5, 4)
def process_image_data(img):
    row, col = np.indices(img.shape)
    return (row.flatten(), col.flatten(), img.flatten())
y, x, px = process_image_data(img)
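As a minimal sketch of the same idea, assuming a small stand-in array instead of the real 512x512 image, np.column_stack can assemble the three flattened arrays into a single (N, 3) array in one step. The function name to_coordinate_table is made up for this illustration.

```python
import numpy as np

def to_coordinate_table(img):
    """Return an (H*W, 3) array whose rows are [y, x, pixel_value]."""
    row, col = np.indices(img.shape)
    return np.column_stack((row.ravel(), col.ravel(), img.ravel()))

# Small stand-in for the (512, 512) image from the question.
img = np.arange(20).reshape(5, 4)
table = to_coordinate_table(img)
print(table.shape)   # (20, 3)
print(table[:3])     # first three [y, x, pixel_value] rows
```

For a (512, 512) input the same call should give the (262144, 3) array described in the question, without any Python-level loop.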
The solution that worked for me is this:
import csv
with open('img_pixel_coor1.csv', 'w', newline='') as f:
    headernames = ['y_coord', 'x_coord', 'pixel_value']
    thewriter = csv.DictWriter(f, fieldnames=headernames)
    thewriter.writeheader()
    for index, pix in np.ndenumerate(img):
        thewriter.writerow({'y_coord' : index[0], 'x_coord' : index[1], 'pixel_value' : pix})
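If the goal is a CSV file, the per-pixel loop can probably be avoided as well. The following is only a sketch, assuming integer pixel values and a small stand-in image: it builds the (N, 3) table with np.indices and writes it in one call with np.savetxt. The temporary directory is a choice made here so the example does not overwrite anything.

```python
import os
import tempfile
import numpy as np

# Small stand-in for the real image; integer pixel values are assumed.
img = np.arange(12).reshape(3, 4)

row, col = np.indices(img.shape)
table = np.column_stack((row.ravel(), col.ravel(), img.ravel()))

# Hypothetical output location (a temporary directory) for this sketch.
csv_path = os.path.join(tempfile.mkdtemp(), "img_pixel_coor1.csv")

# comments="" stops savetxt from prefixing the header line with "# ".
np.savetxt(csv_path, table, fmt="%d", delimiter=",",
           header="y_coord,x_coord,pixel_value", comments="")

with open(csv_path) as f:
    lines = f.read().splitlines()
print(lines[:3])
```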
Related
I have a numpy array of images in that shape:
(50000, 32, 32, 3)
50000 is the number of images
32, 32 are the height and width
3 are the RGB values with a range of 0-1
I would like to convert it to a 2D shape of:
(50000, 1024)
Here each of the 50000 images would be represented in one row,
and the RGB value would be converted to, let's say, a hexadecimal value.
I've gone through a lot of conversion approaches on Stack Overflow and I've found some.
I know that if my array were a 3D array with already converted values, I could easily use the reshape() function to convert it to 2D.
Now what I'm searching for is the easiest way to convert the RGB values and reshape my array.
Would this be possible in one or two lines, or should I use an external function?
First convert the RGB values in the last dimension to the HEX value using whatever function you like. This SO answer may help.
Reshape then works on any number of dimensions:
import numpy as np
def rgb2hex(r, g, b):
    return '#%02x%02x%02x' % (r, g, b)
vfunc = np.vectorize(rgb2hex)
a = (np.random.uniform(0,1,(10,5,5,3))*255).astype(int)
c = vfunc(a[:,:,:,0], a[:,:,:,1], a[:,:,:,2])
c.reshape((10,25))
In order to do so, you firstly need to reshape the ndarray (np.reshape):
a = np.random.randint(1,10,(500, 32, 32, 3))
a_r = np.reshape(a, (500, 1024, 3))
print(a_r.shape)
# (500, 1024, 3)
Now, in order to convert the RGB values along the last dimension to hexadecimal representation as you suggest, you could define a function that returns a hexadecimal representation of the three values with a simple string formatting:
def rgb_to_hex(rgb):
    return '#{:02X}{:02X}{:02X}'.format(*rgb.reshape(3))
In order to apply the conversion along all rows in the last axis, you can use np.apply_along_axis:
a_new = np.apply_along_axis(rgb_to_hex, axis=-1, arr=a_r)
print(a_new.shape)
# (500, 1024)
The following combines the RGB values into a single value
x=np.zeros((100,32,32,3))
x[:,:,:,0] = np.trunc(x[:,:,:,0]) + np.trunc(x[:,:,:,1] *256) + np.trunc(x[:,:,:,2] *65536)
y=x[:,:,:,0]
print(y.shape)
The resulting shape of y: (100, 32, 32)
Next you can use the reshape function on y.
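Here is a rough sketch of the whole pipeline, assuming the RGB values are floats in the 0-1 range as stated in the question. It scales them to 0-255, packs each pixel into one integer laid out as 0xRRGGBB (that channel order is a choice made for this example), and then reshapes to one row per image. A batch of 10 random images stands in for the real 50000.

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.random((10, 32, 32, 3))      # stand-in for (50000, 32, 32, 3)

# Scale the [0, 1] floats to integers in 0..255.
rgb = np.rint(images * 255).astype(np.uint32)

# Pack the three channels into a single 0xRRGGBB integer per pixel.
packed = (rgb[..., 0] << 16) | (rgb[..., 1] << 8) | rgb[..., 2]

# One row per image, 32 * 32 = 1024 packed values per row.
flat = packed.reshape(len(images), -1)
print(flat.shape)
```

The packed integers can be turned into hex strings afterwards if needed, but keeping them numeric is usually more convenient for further array work.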
I'm quite new to programming in general, but I could not figure this problem out until now.
I've got a two-dimensional numpy array mask, let's say mask.shape is (3800,3500), which is filled with 0s and 1s representing the spatial resolution of a 2D image, where a 1 represents a visible pixel and 0 represents background.
I've got a second two-dimensional array data, where data.shape is (909,x) and x is exactly the number of 1s in the first array. I now want to replace each 1 in the first array with a vector of length 909 from the second array, resulting in a final 3D array of shape (3800,3500,909), which is basically a 2D x by y image where select pixels have a spectrum of 909 values in the z direction.
I tried
mask_vector = mask.flatten
ones = np.ones((909,1))
mask_909 = mask_vector.dot(ones) #results in a 13300000 by 909 2d array
count = 0
for i in mask_vector:
    if i == 1:
        mask_909[i,:] = data[:,count]
        count += 1
result = mask_909.reshape((3800,3500,909))
This results in a viable 3D array giving a 2D picture when doing plt.imshow(result.mean(axis=2))
But the values are still only 1s and 0s not the wanted spectral data in z direction.
I also tried using np.where but broadcasting fails as the two 2D arrays have clearly different shapes.
Has anybody got a solution? I am sure that there must be an easy way...
Basically, you simply need to use np.where to locate the 1s in your mask array. Then initialize your result array to zero and replace the third dimension with your data using the outputs of np.where:
import numpy as np
m, n, k = 380, 350, 91
mask = np.round(np.random.rand(m, n))
x = np.sum(mask == 1)
data = np.random.rand(k, x)
result = np.zeros((m, n, k))
row, col = np.where(mask == 1)
result[row,col] = data.transpose()
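To double-check the ordering, here is a small sketch (with tiny stand-in sizes instead of 3800, 3500 and 909) that does the same assignment with a boolean mask and compares it against an explicit loop. It assumes the columns of data are ordered row by row through the mask, which is the order the question's loop implies.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 6, 5, 4                         # small stand-ins for 3800, 3500, 909
mask = rng.integers(0, 2, size=(m, n))
data = rng.random((k, int(mask.sum())))   # one column per visible pixel

# Boolean indexing visits the selected pixels in row-major order.
result = np.zeros((m, n, k))
result[mask == 1] = data.T

# The same assignment written as an explicit loop, for comparison.
expected = np.zeros((m, n, k))
count = 0
for i in range(m):
    for j in range(n):
        if mask[i, j] == 1:
            expected[i, j, :] = data[:, count]
            count += 1

print(np.array_equal(result, expected))
```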
x = np.linspace(0,10, 5)
y = 2*x
points = np.array([x, y]).T.reshape(-1, 1, 2)
What does the third line mean? I know what reshape(m,n) means, but what does reshape(-1, 1, 2) mean?
Your question is not entirely clear, so I'm guessing the -1 part is what troubles you.
From the documentation:
The new shape should be compatible with the original shape. If an integer, then the result will be a 1-D array of that length. One shape dimension can be -1. In this case, the value is inferred from the length of the array and remaining dimensions.
The whole line meaning is this (breaking it down for simplicity):
points = np.array([x, y]) -> create a 2 X 5 np.array consisting of x,y
.T -> transpose
.reshape(-1, 1, 2) -> reshape it, in this case to a 5X1X2 array (as can be seen from the output of points.shape [(5L, 1L, 2L)])
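The breakdown above can be checked step by step with a short sketch. The intermediate names stacked and pairs are introduced here only for illustration.

```python
import numpy as np

x = np.linspace(0, 10, 5)
y = 2 * x

stacked = np.array([x, y])           # shape (2, 5)
pairs = stacked.T                    # shape (5, 2): one (x, y) pair per row
points = pairs.reshape(-1, 1, 2)     # -1 is inferred as 10 / (1 * 2) = 5

print(stacked.shape, pairs.shape, points.shape)
```

The array holds 10 elements in total, so with the last two dimensions fixed at 1 and 2, the only compatible size for the first dimension is 5.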
vertices = np.array([[100,300],[200,200],[400,300],[200,400]],np.int32)
vertices.shape
pts = vertices.reshape((-1,1,2))
Consider the above code
Here we have created a set of vertices to be plotted on an image using OpenCV, but OpenCV expects a 3D array and we only have the vertices in a 2D array. So .reshape((-1,1,2)) lets us keep the original array intact while adding an extra axis of length 1 to the array, giving shape (4, 1, 2) (notice the extra brackets added to the list). The last dimension still holds the (x, y) coordinates of each vertex; it does not hold color information.
I have some images I want to work with. The problem is that there are two kinds of images: both are 106 x 106 pixels, but some are in color and some are black and white.
one with only two (2) dimensions:
(106,106)
and one with three (3)
(106,106,3)
Is there a way I can strip this last dimension?
I tried np.delete, but it did not seem to work.
np.shape(np.delete(Xtrain[0], [2] , 2))
Out[67]: (106, 106, 2)
You could use numpy's basic indexing (an extension to Python's built-in slice notation):
x = np.zeros( (106, 106, 3) )
result = x[:, :, 0]
print(result.shape)
prints
(106, 106)
A shape of (106, 106, 3) means you have 3 sets of things that have shape (106, 106). So in order to "strip" the last dimension, you just have to pick one of these (that's what the indexing does).
You can keep any slice you want. I arbitrarily chose to keep the 0th, since you didn't specify what you wanted. So, result = x[:, :, 1] and result = x[:, :, 2] would give the desired shape as well: it all just depends on which slice you need to keep.
If you have a multi-dimensional array, this might help:
pred_mask[0,...] #Remove First Dim
pred_mask[...,0] #Remove Last Dim
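A quick sketch of what the ellipsis does to the shape, using a hypothetical 4D array (a batch of one RGB image) as the input. np.squeeze is shown alongside because it is easy to confuse with indexing: it only drops axes of length 1.

```python
import numpy as np

# Hypothetical input: a batch containing a single 106x106 RGB image.
pred_mask = np.zeros((1, 106, 106, 3))

first_removed = pred_mask[0, ...]          # picks element 0 along the first axis
last_removed = pred_mask[..., 0]           # picks element 0 along the last axis

# squeeze can remove the length-1 batch axis, but not the 3-channel axis.
squeezed = np.squeeze(pred_mask, axis=0)

print(first_removed.shape, last_removed.shape, squeezed.shape)
```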
Just take the mean value over the colors dimension (axis=2):
Xtrain_monochrome = Xtrain.mean(axis=2)
When the shape of your array is (106, 106, 3), you can visualize it as a table with 106 rows and 106 columns filled with data points, where each point is an array of 3 numbers which we can represent as [x, y, z]. Therefore, if you want to get the dimensions (106, 106), the data points in your table must be single numbers rather than arrays. You can achieve this by extracting either the x-component, y-component or z-component of each data point, or by applying a function that aggregates the three components, like the mean, sum, max etc. You can extract any component just like @matt Messersmith suggested above.
Well, you should be careful when you are trying to reduce the dimensions of an image.
An image is normally a 3-D matrix that contains the RGB values of each pixel. If you want to reduce it to 2-D, what you are really doing is converting a colored RGB image into a grayscale image.
There are several ways to do this: you can take the maximum of the three, the minimum, the average, the sum, etc., depending on the accuracy you want in your image. A widely used option is to take a weighted average of the RGB values using the formula
Y = 0.299R + 0.587G + 0.114B
where R stands for RED, G is GREEN and B is BLUE. In numpy, this can be written as
new_image = img[:, :, 0]*0.299 + img[:, :, 1]*0.587 + img[:, :, 2]*0.114
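Since the question mixes (106, 106) and (106, 106, 3) images, a small helper that handles both cases might be convenient. This is only a sketch: the name to_grayscale and the pass-through behaviour for 2D inputs are assumptions made here, and the weights are the ones from the formula above.

```python
import numpy as np

# Weights from Y = 0.299R + 0.587G + 0.114B.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])

def to_grayscale(img):
    """Return a 2D array: 2D inputs pass through, RGB inputs are weighted."""
    if img.ndim == 2:
        return img
    # Matrix product over the last axis: (H, W, 3) @ (3,) -> (H, W).
    return img[..., :3] @ LUMA_WEIGHTS

color = np.ones((106, 106, 3))
mono = np.ones((106, 106))
print(to_grayscale(color).shape, to_grayscale(mono).shape)
```

With a helper like this, every image ends up with the same 2D shape regardless of which kind it started as.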
Actually np.delete would work if you would apply it two times,
if you want to preserve the first channel for example then you could run the following:
Xtrain = np.delete(Xtrain,2,2) # this will get rid of the 3rd component of the 3 dimensions
print(Xtrain.shape) # will now output (106,106,2)
# again we apply np.delete but on the second component of the 3rd dimension
Xtrain = np.delete(Xtrain,1,2)
print(Xtrain.shape) # will now output (106,106,1)
# you may finally squeeze your output to get a 2d array
Xtrain = Xtrain.squeeze()
print(Xtrain.shape) # will now output (106,106)
I have the following code that works as expected, but I'm curious whether the loop can be replaced by a native numpy function/method for better performance. What I have is one array holding RGB values that I use as a lookup table, and two 2D arrays holding greyscale values (0-255). Each value of these two arrays corresponds to the value of one axis of the lookup table.
As mentioned, what would be really nice is getting rid of the (slow) loop in python and using a faster numpy method.
#!/usr/bin/env python3
from PIL import Image
import numpy as np
dim = (2000, 2000)
rows, cols = dim
# holding a 256x256 RGB color lookup table
color_map = np.random.randint(0, 256, (256,256,3))
# image 1 greyscale values
color_map_idx_row = np.random.randint(0, 255, dim)
# image 2 greyscale values
color_map_idx_col = np.random.randint(0, 255, dim)
# output image data
result_data = np.zeros((rows, cols, 3), dtype=np.uint8)
# is there any built in function in numpy that could
# replace this loop?
# -------------------------------------------------------
for i in range(rows):
    for j in range(cols):
        row_idx = color_map_idx_row.item(i, j)
        col_idx = color_map_idx_col.item(i, j)
        rgb_color = color_map[row_idx,col_idx]
        result_data[i,j] = rgb_color
img = Image.fromarray(result_data, 'RGB')
img.save('result.png')
You can replace the double-for loop with fancy-indexing:
In [33]: result_alt = color_map[color_map_idx_row, color_map_idx_col]
This confirms the result is the same:
In [36]: np.allclose(result_data, result_alt)
Out[36]: True
You can reshape the 3D array into a 2D array with axis=1 holding the three channels. Then index its rows with linear indices computed from the row and column index arrays. Note that the reshaped array is only a view, so it does not take up additional memory. Thus, we would have -
n = color_map.shape[1]  # number of columns in the lookup table
out = color_map.reshape(-1,3)[color_map_idx_row*n + color_map_idx_col]
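For anyone who wants to convince themselves that both approaches match the original loop, here is a small self-contained sketch. It uses random stand-in data and a 20x30 index grid instead of 2000x2000 so the loop stays fast; the shortened names idx_row and idx_col are introduced only for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
color_map = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)
idx_row = rng.integers(0, 256, size=(20, 30))
idx_col = rng.integers(0, 256, size=(20, 30))

# Reference result from the explicit double loop.
looped = np.zeros((20, 30, 3), dtype=np.uint8)
for i in range(20):
    for j in range(30):
        looped[i, j] = color_map[idx_row[i, j], idx_col[i, j]]

# Fancy indexing with two index arrays of the same shape.
indexed = color_map[idx_row, idx_col]

# Linear indexing into the flattened table; the stride is the column count.
n_cols = color_map.shape[1]
linear = color_map.reshape(-1, 3)[idx_row * n_cols + idx_col]

print(np.array_equal(looped, indexed), np.array_equal(looped, linear))
```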