I'm trying to map the colors of an image onto a predefined palette. Each output pixel is assigned the palette color with the smallest squared error relative to the input pixel.
I have written the code with Python loops. Is there a better, vectorized way to do this?
import numpy as np
import skimage.io as io
palette = [
    [180, 0, 0],
    [255, 150, 0],
    [255, 200, 0],
    [0, 128, 0],
]
IMG = io.imread('lena.jpg')[:, :, :3]
DIM = IMG.shape
IOUT = np.empty(DIM)
for x in range(DIM[0]):
    for y in range(DIM[1]):
        # index of the palette color with the smallest squared distance
        P = ((np.array(palette) - IMG[x, y, :]) ** 2).sum(axis=1).argmin()
        IOUT[x, y, :] = palette[P]
Can the loops be avoided and solved using numpy operations itself?
Don't loop over all pixels, but over all colors:
import pylab as pl
palette = pl.array([[180, 0, 0], [255, 150, 0], [255, 200, 0], [0, 128, 0]])
img = pl.imread('lena.jpg')[:, :, :3].astype('float')
R, G, B = img[:, :, 0].copy(), img[:, :, 1].copy(), img[:, :, 2].copy()
dist = pl.inf * R
for i in range(len(palette)):
    new_dist = pl.square(img[:, :, 0] - palette[i, 0]) \
        + pl.square(img[:, :, 1] - palette[i, 1]) \
        + pl.square(img[:, :, 2] - palette[i, 2])
    R[new_dist < dist] = palette[i, 0]
    G[new_dist < dist] = palette[i, 1]
    B[new_dist < dist] = palette[i, 2]
    dist = pl.minimum(dist, new_dist)
pl.subplot(1, 2, 1)
pl.imshow(img / 255)  # float RGB data must lie in [0, 1] for imshow
pl.subplot(1, 2, 2)
pl.imshow(pl.dstack((R, G, B)) / 255)
Edit: The loop-less alternative. ;)
import pylab as pl
palette = pl.array([[180, 0 , 0], [255, 150, 0], [255, 200, 0], [0, 128, 0]])
img = pl.imread('lena.jpg')[:, :, :3]
pl.subplot(1, 2, 1)
pl.imshow(img)
# add a trailing axis so every pixel broadcasts against every palette color
IMG = img.reshape(img.shape + (1,))
PAL = palette.transpose().reshape((1, 1, 3, -1))
idx = pl.argmin(pl.sum((IMG - PAL)**2, axis=2), axis=2)
img = palette[idx, :]
pl.subplot(1, 2, 2)
pl.imshow(img)
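The broadcasting idea above can also be sketched in plain NumPy without depending on the image size. This is a minimal sketch, not the answer's exact code: the function name `quantize_to_palette` and the tiny 2x2 test image are made up for illustration, and the inputs are cast to a signed integer type on the assumption that the image is `uint8`, so that the subtraction cannot wrap around.

```python
import numpy as np

def quantize_to_palette(img, palette):
    """Replace every pixel by the nearest palette color (squared distance)."""
    img = np.asarray(img, dtype=np.int64)          # (H, W, 3)
    palette = np.asarray(palette, dtype=np.int64)  # (K, 3)
    # (H, W, 1, 3) - (1, 1, K, 3) broadcasts to (H, W, K, 3)
    diff = img[:, :, None, :] - palette[None, None, :, :]
    dist = (diff ** 2).sum(axis=-1)                # (H, W, K)
    idx = dist.argmin(axis=-1)                     # (H, W) palette indices
    return palette[idx].astype(np.uint8)           # (H, W, 3)

palette = [[180, 0, 0], [255, 150, 0], [255, 200, 0], [0, 128, 0]]
# hypothetical 2x2 image used only to exercise the function
demo = np.array([[[200, 10, 10], [250, 190, 5]],
                 [[10, 120, 10], [255, 140, 0]]], dtype=np.uint8)
out = quantize_to_palette(demo, palette)
```

The intermediate array holds H x W x K x 3 values, so for very large images or palettes the loop over colors shown earlier may use noticeably less memory.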
I am trying to warp an image based on the orientation of the camera relative to an ArUco marker in the middle of the image. I have managed to get the translation part working, but the rotation is not: the image does not seem to rotate about the centre of the marker's axes. The reference image was taken straight on, and the warped image is overlaid on it.
# Find centre of the marker
top_left_x = (corners[0][0][0, 0])
top_left_y = (corners[0][0][0, 1])
top_right_x = (corners[0][0][1, 0])
top_right_y = (corners[0][0][1, 1])
bottom_right_x = (corners[0][0][2, 0])
bottom_right_y = (corners[0][0][2, 1])
bottom_left_x = (corners[0][0][3, 0])
bottom_left_y = (corners[0][0][3, 1])
# Compare this to the centre of the image to calculate the offset
mid_x = top_right_x - (top_right_x - top_left_x) / 2
mid_y = bottom_left_y - (bottom_left_y - top_left_y) / 2
x_centre = 960
y_centre = 540
x_offset = x_centre - mid_x
y_offset = y_centre - mid_y
if x_centre > mid_x:  # gone right
    x_offset = 1 * (x_centre - mid_x)  # correction to the left
if x_centre < mid_x:  # gone left
    x_offset = -1 * (mid_x - x_centre)  # correction to the right
if y_centre > mid_y:  # gone down
    y_offset = 1 * (y_centre - mid_y)  # correction upwards
if y_centre < mid_y:  # gone up
    y_offset = -1 * (mid_y - y_centre)  # correction downwards
current_z_distance = (math.sqrt((pos_camera[0]**2) + (pos_camera[1]**2) +
(pos_camera[2]**2))) * 15.4
img = cv2.imread('Corrected.png')
corrected_z = 31 # Distance when image was taken
initial_z_distance = corrected_z * 15.4 # Pixels
delta_z = (initial_z_distance - current_z_distance)
scale_factor = current_z_distance / initial_z_distance  # how much larger the image now is; used for scaling
z_translation = delta_z * 1.54 # how much the image has moved. negative for going
z_translation = 0
z_axis = 960 / scale_factor
proj2dto3d = np.array([[1, 0, -mid_x],
[0, 1, -mid_y],
[0, 0, 0],
[0, 0, 1]], np.float32)
proj3dto2d = np.array([[z_axis, 0, mid_x, 0],
[0, z_axis, mid_y, 0], # defines to centre of rotation
[0, 0, 1, 0]], np.float32)
trans = np.array([[1, 0, 0, x_offset * -1], # Working
[0, 1, 0, y_offset * -1],
[0, 0, 1, 960], # keep as 960
[0, 0, 0, 1]], np.float32)
x = math.degrees(roll_marker) * -1 # forwards and backwards
y = math.degrees(pitch_marker) * -1 # Left and right
z = 0
rx = np.array([[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]], np.float32) #
ry = np.array([[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]], np.float32)
rz = np.array([[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]], np.float32)
ax = float(x * (math.pi / 180.0)) # 0
ay = float(y * (math.pi / 180.0))
az = float(z * (math.pi / 180.0)) # 0
rx[1, 1] = math.cos(ax) # 0
rx[1, 2] = -math.sin(ax) # 0
rx[2, 1] = math.sin(ax) # 0
rx[2, 2] = math.cos(ax) # 0
ry[0, 0] = math.cos(ay)
ry[0, 2] = -math.sin(ay)
ry[2, 0] = math.sin(ay)
ry[2, 2] = math.cos(ay)
rz[0, 0] = math.cos(az) # 0
rz[0, 1] = -math.sin(az) # 0
rz[1, 0] = math.sin(az) # 0
rz[1, 1] = math.cos(az) # 0
# Translation matrix
# r = rx.dot(ry) # if we remove the lines we put r=ry
r = rx.dot(ry) # order may need to be changed
final = proj3dto2d.dot(trans.dot(r.dot(proj2dto3d))) # just rotation
dst = cv2.warpPerspective(img, final, (img.shape[1], img.shape[0]), None, cv2.INTER_LINEAR, cv2.BORDER_CONSTANT, (255, 255, 255))
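One detail worth checking in the code above is the marker centre, since it is used as the centre of rotation: `mid_x` is taken from the top edge only and `mid_y` from the left edge only. A minimal sketch of a simpler estimate, assuming `corners` has the layout used above (`corners[0][0]` is a 4x2 array of corner points), averages all four corners instead. The helper name `marker_centre` and the sample coordinates are made up for illustration, and this is not claimed to be the cause of the rotation problem.

```python
import numpy as np

def marker_centre(corners):
    """Mean of the four detected corner points of the first marker."""
    pts = np.asarray(corners[0][0], dtype=np.float64)  # shape (4, 2)
    return pts.mean(axis=0)

# hypothetical detection: a 100 px square marker near the image centre
corners = [np.array([[[900.0, 500.0], [1000.0, 500.0],
                      [1000.0, 600.0], [900.0, 600.0]]])]
mid_x, mid_y = marker_centre(corners)

x_centre, y_centre = 960, 540
x_offset = x_centre - mid_x  # the sign already encodes the direction
y_offset = y_centre - mid_y
```

Because the sign of the difference carries the direction, the four `if` branches that recompute the offsets become unnecessary with this form.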
I have spent hours on the following problem. From the data in my Excel sheet I have created a 6x36 matrix of zeros called zeros and a 6x6 coordinate transformation matrix called matrix_tran [image 1].
I can't find a way to replace entries of zeros with the values of matrix_tran. They must go into columns (4, 5, 6, 7, 8, 9), which are given by the connection vector of element 15 in the Excel sheet, that is, the last row processed by the for loop [image 2].
In summary: below I show the result I currently get and how it should look [images 3 and 4 respectively].
I would very much appreciate your help. Please excuse my English; it is not my native language.
import pandas as pd
import numpy as np
ex = pd.ExcelFile('matrix_tr.xlsx')
hoja = ex.parse('Hoja1')
cols = 36
for n in range(0, len(hoja)):
    A = hoja['ELEMENT #'][n]
    B = hoja['1(i)'][n]
    C = hoja['2(i)'][n]
    D = hoja['3(i)'][n]
    E = hoja['1(j)'][n]
    F = hoja['2(j)'][n]
    G = hoja['3(j)'][n]
    H = hoja['X(i)'][n]
    I = hoja['Y(i)'][n]
    J = hoja['X(j)'][n]
    K = hoja['Y(j)'][n]
    L = np.sqrt((J-H)**2+(K-I)**2)
    lx = (J-H)/L
    ly = (K-I)/L
zeros = np.zeros((6, cols))
counters = hoja.loc[:, ["1(i)", "2(i)", "3(i)", "1(j)", "2(j)", "3(j)"]]
for _, i1, i2, i3, j1, j2, j3 in counters.itertuples():
    matrix_tran = np.array([[lx, ly, 0, 0, 0, 0],
                            [-ly, lx, 0, 0, 0, 0],
                            [0, 0, 1, 0, 0, 0],
                            [0, 0, 0, lx, ly, 0],
                            [0, 0, 0, -ly, lx, 0],
                            [0, 0, 0, 0, 0, 1]])
    zeros[:, [i1 - 1, i2 - 1, i3 - 1, j1 - 1, j2 - 1, j3 - 1]] = matrix_tran
Try with a transposed zeros matrix
import pandas as pd
import numpy as np
ex = pd.ExcelFile('c:/tmp/SO/matrix_tr.xlsx')
hoja = ex.parse('Hoja1')
counters = hoja.loc[:, ["1(i)", "2(i)", "3(i)", "1(j)", "2(j)", "3(j)"]]
# zeros matrix transposed
cols = 36
zeros_trans = np.zeros((cols,6))
# last row only
for n in range(14, len(hoja)):
    Xi = hoja['X(i)'][n]
    Yi = hoja['Y(i)'][n]
    Xj = hoja['X(j)'][n]
    Yj = hoja['Y(j)'][n]
    X = Xj-Xi
    Y = Yj-Yi
    L = np.sqrt(X**2+Y**2)
    lx = X/L
    ly = Y/L
    matrix_tran = np.array([[lx, ly, 0, 0, 0, 0],
                            [-ly, lx, 0, 0, 0, 0],
                            [0, 0, 1, 0, 0, 0],
                            [0, 0, 0, lx, ly, 0],
                            [0, 0, 0, -ly, lx, 0],
                            [0, 0, 0, 0, 0, 1]])
    i = 0
    for r in counters.iloc[n]:
        zeros_trans[r-1] = matrix_tran[i]
        i += 1
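For comparison, the placement can also be sketched without an inner loop, using NumPy's integer-array indexing on the columns. This is a minimal sketch that follows the direct column assignment from the question: column k of the 6x6 block ends up in the column named by the k-th entry of the connection vector. The helper name `place_block` and the values `lx = 0.6`, `ly = 0.8` are made up for illustration; the real values come from the Excel sheet.

```python
import numpy as np

def place_block(n_cols, connection, block):
    """Copy the columns of `block` into a zero matrix at 1-based positions."""
    out = np.zeros((block.shape[0], n_cols))
    cols = np.asarray(connection) - 1  # Excel numbering is 1-based
    out[:, cols] = block               # column k of block -> column cols[k]
    return out

lx, ly = 0.6, 0.8  # hypothetical direction cosines
matrix_tran = np.array([[lx, ly, 0, 0, 0, 0],
                        [-ly, lx, 0, 0, 0, 0],
                        [0, 0, 1, 0, 0, 0],
                        [0, 0, 0, lx, ly, 0],
                        [0, 0, 0, -ly, lx, 0],
                        [0, 0, 0, 0, 0, 1]])
result = place_block(36, [4, 5, 6, 7, 8, 9], matrix_tran)
```

Note that this writes the block as-is, whereas assigning rows of matrix_tran to rows of a transposed zeros matrix places its transpose, so it is worth checking which orientation the expected result actually shows.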
I have a color array like this:
colors = [
    [50, 0, 255],   # Blue
    [255, 0, 0],    # Red
    [255, 0, 180],  # Pink
    [255, 140, 0],  # Orange
    [255, 255, 0],  # Yellow
    [0, 255, 255],  # Cyan
]
I have a big pixels array:
pixels = [0] * width * height * 3
I want to build an image which contains random blocks of the same color:
for y in range(block_count_y):
    for x in range(block_count_x):
        color = random.choice(colors)
        for yp in range(block_height):
            for xp in range(block_width):
                offset = (y * width * block_height + yp * width + x * block_width + xp) * 3
                pixels[offset + 0] = color[0]
                pixels[offset + 1] = color[1]
                pixels[offset + 2] = color[2]
Then I have to convert the pixel list to bytes.
My loops are very slow, and I am looking for a way to optimize them.
I have tried NumPy and list comprehensions, but I could not get them to work.
Any ideas?
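One possible NumPy approach, given here as a minimal sketch rather than a definitive solution: pick one random color index per block, look the colors up in the palette, then enlarge each block with `np.repeat`. The function name `random_block_image` and the small block counts are made up for illustration, and the sketch assumes that `width == block_count_x * block_width` and `height == block_count_y * block_height`, i.e. the blocks tile the image exactly.

```python
import numpy as np

def random_block_image(colors, block_count_x, block_count_y,
                       block_width, block_height, rng=None):
    """Return a (height, width, 3) uint8 image made of random color blocks."""
    rng = np.random.default_rng() if rng is None else rng
    palette = np.asarray(colors, dtype=np.uint8)                   # (K, 3)
    # one palette index per block
    choice = rng.integers(0, len(palette), size=(block_count_y, block_count_x))
    blocks = palette[choice]                                       # (by, bx, 3)
    # stretch every block to block_height x block_width pixels
    image = np.repeat(blocks, block_height, axis=0)
    image = np.repeat(image, block_width, axis=1)
    return image

colors = [[50, 0, 255], [255, 0, 0], [255, 0, 180],
          [255, 140, 0], [255, 255, 0], [0, 255, 255]]
img = random_block_image(colors, block_count_x=4, block_count_y=3,
                         block_width=5, block_height=2,
                         rng=np.random.default_rng(0))
data = img.tobytes()  # row-major R, G, B bytes, same layout as the flat list
```

`tobytes()` yields the pixels in the same row-major, interleaved RGB order that the offset formula in the loops produces.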
I want to ask you about calculating the histogram in Python using OpenCV. I used this code:
hist = cv2.calcHist(im, [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256])
The result gave me the histogram of each color channel with 8 bins, but what I want to get is:
1st bin (R=0-32,G=0-32,B=0-32),
2nd bin (R=33-64,G=0-32,B=0-32),
and so on,
so I will have 512 bins in total.
From my point of view, your cv2.calcHist call isn't correct:
hist = cv2.calcHist(im, [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256])
The first parameter should be a list of images:
hist = cv2.calcHist([im], [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256])
Let's see this small example:
import cv2
import numpy as np
# Red blue square of size [4, 4], i.e. eight pixels (255, 0, 0) and eight pixels (0, 0, 255); Attention: BGR ordering!
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[:, 0:2, 2] = 255
image[:, 2:4, 0] = 255
# Calculate histogram with two bins [0 - 127] and [128 - 255] per channel:
# Result should be hist["bin 0", "bin 0", "bin 1"] = 8 (red) and hist["bin 1", "bin 0", "bin 0"] = 8 (blue)
# Original cv2.calcHist call with two bins [0 - 127] and [128 - 255]
hist = cv2.calcHist(image, [0, 1, 2], None, [2, 2, 2], [0, 256, 0, 256, 0, 256])
print(hist, '\n') # Not correct
# Correct cv2.calcHist call
hist = cv2.calcHist([image], [0, 1, 2], None, [2, 2, 2], [0, 256, 0, 256, 0, 256])
print(hist, '\n') # Correct
[[[8. 0.]
[0. 0.]]
[[0. 0.]
[0. 4.]]]
[[[0. 8.]
[0. 0.]]
[[8. 0.]
[0. 0.]]]
As you can see, your version only has 12 values in total, whereas there are 16 pixels in the image! Also, it's not clear which "bins" (if any) are represented.
So, with the proper cv2.calcHist call, your general approach is correct! Maybe you just need a little hint on how to read the resulting hist:
import cv2
import numpy as np
# Colored rectangle of size [32, 16] with one "color" per bin for eight bins per channel,
# i.e. 512 pixels, such that each of the resulting 512 bins has value 1
x = np.linspace(16, 240, 8, dtype=np.uint8)
image = np.reshape(np.moveaxis(np.array(np.meshgrid(x, x, x)), [0, 1, 2, 3], [3, 0, 1, 2]), (32, 16, 3))
# Correct cv2.calcHist call
hist = cv2.calcHist([image], [0, 1, 2], None, [8, 8, 8], [0, 256, 0, 256, 0, 256])
# Lengthy output of each histogram bin
for B in np.arange(hist.shape[0]):
    for G in np.arange(hist.shape[1]):
        for R in np.arange(hist.shape[2]):
            r = 'R=' + str(R*32).zfill(3) + '-' + str((R+1)*32-1).zfill(3)
            g = 'G=' + str(G*32).zfill(3) + '-' + str((G+1)*32-1).zfill(3)
            b = 'B=' + str(B*32).zfill(3) + '-' + str((B+1)*32-1).zfill(3)
            print('(' + r + ', ' + g + ', ' + b + '): ', int(hist[B, G, R]))
(R=000-031, G=000-031, B=000-031): 1
(R=032-063, G=000-031, B=000-031): 1
(R=064-095, G=000-031, B=000-031): 1
[... 506 more lines ...]
(R=160-191, G=224-255, B=224-255): 1
(R=192-223, G=224-255, B=224-255): 1
(R=224-255, G=224-255, B=224-255): 1
Hope that helps!
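As a cross-check that does not depend on OpenCV, the same joint histogram can be sketched with `np.histogramdd`. This is a swapped-in NumPy technique, not what cv2.calcHist does internally; the function name `joint_histogram` is made up for illustration, and the sketch reuses the red/blue 4x4 test image with two bins per channel. Flattening the 3-D result gives the single list of bins asked for in the question (8 x 8 x 8 = 512 entries when `bins=8`).

```python
import numpy as np

def joint_histogram(image_bgr, bins=8):
    """3-D histogram over (B, G, R) with `bins` equal-width bins per channel."""
    pixels = image_bgr.reshape(-1, 3).astype(np.float64)  # one row per pixel
    hist, _ = np.histogramdd(pixels, bins=(bins, bins, bins),
                             range=((0, 256), (0, 256), (0, 256)))
    return hist

# eight red and eight blue pixels, BGR channel order
image = np.zeros((4, 4, 3), dtype=np.uint8)
image[:, 0:2, 2] = 255
image[:, 2:4, 0] = 255

hist = joint_histogram(image, bins=2)
flat = hist.ravel()  # flat index = (B_bin * bins + G_bin) * bins + R_bin
```

With two bins per channel, the red pixels land in `hist[0, 0, 1]` and the blue pixels in `hist[1, 0, 0]`, matching the corrected cv2.calcHist output above.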
I'm working on implementing a semantic segmentation network in Tensorflow, and I'm trying to figure out how to write out summary images of the labels during training. I want to encode the images in a similar style to the class segmentation annotations used in the Pascal VOC dataset.
For example, let's assume I have a network that trains with a batch size of 1 and 4 classes. The network's final predictions have shape [1, 3, 3, 4].
Essentially, I want to take the output predictions and run them through argmax to get a tensor containing the most likely class at each point in the output:
[[[0, 1, 3],
[2, 0, 1],
[3, 1, 2]]]
The annotated images use a color palette of 256 colors to encode labels. I have a tensor containing all the color triples:
[[ 0, 0, 0],
[128, 0, 0],
[ 0, 128, 0],
[128, 128, 0],
[ 0, 0, 128],
[224, 224, 192]]
How could I obtain a tensor of shape [1, 3, 3, 3] (a single 3x3 color image) that indexes into the color palette using the values obtained from argmax?
[[palette[0], palette[1], palette[3]],
[palette[2], palette[0], palette[1]],
[palette[3], palette[1], palette[2]]]
I could easily wrap some numpy and PIL code in tf.py_func but I'm wondering if there is a pure Tensorflow way of obtaining this result.
For those curious, this is the solution I got using just numpy. It works quite well, but I still dislike the use of tf.py_func:
import numpy as np
import tensorflow as tf
def voc_colormap(N=256):
    bitget = lambda val, idx: ((val & (1 << idx)) != 0)
    cmap = np.zeros((N, 3), dtype=np.uint8)
    for i in range(N):
        r = g = b = 0
        c = i
        for j in range(8):
            r |= (bitget(c, 0) << 7 - j)
            g |= (bitget(c, 1) << 7 - j)
            b |= (bitget(c, 2) << 7 - j)
            c >>= 3
        cmap[i, :] = [r, g, b]
    return cmap
VOC_COLORMAP = voc_colormap()
def grayscale_to_voc(input, name="grayscale_to_voc"):
    return tf.py_func(grayscale_to_voc_impl, [input], tf.uint8, stateful=False, name=name)
def grayscale_to_voc_impl(input):
    return np.squeeze(VOC_COLORMAP[input])
You can use tf.gather_nd(), but you will need to modify the shapes of the palette and logits to obtain the desired image, for example:
import tensorflow as tf
import numpy as np
import PIL.Image as Image
# We can load the palette from some random image in the PASCAL VOC dataset
palette = Image.open('.../VOC2012/SegmentationClass/2007_000032.png').getpalette()
# We build a random logits tensor of the requested size
batch_size = 1
height = width = 3
num_classes = 4
logits = np.random.random_sample((batch_size, height, width, num_classes))
logits_argmax = np.argmax(logits, axis=3) # shape = (1, 3, 3)
# array([[[3, 3, 0],
# [1, 3, 1],
# [0, 2, 0]]])
sess = tf.InteractiveSession()
image = tf.gather_nd(
params=tf.reshape(palette, [-1, 3]), # reshaped from list to RGB
indices=tf.reshape(logits_argmax, [batch_size, -1, 1]))
image = tf.cast(tf.reshape(image, [batch_size, height, width, 3]), tf.uint8)
# array([[[[128, 128, 0],
# [128, 128, 0],
# [ 0, 0, 0]],
# [[128, 0, 0],
# [128, 128, 0],
# [128, 0, 0]],
# [[ 0, 0, 0],
# [ 0, 128, 0],
# [ 0, 0, 0]]]], dtype=uint8)
The resulting tensor can be fed directly to tf.summary.image(), but depending on your implementation you may want to upsample it before writing the summary.
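To sanity-check the gathered image outside the graph, the same lookup can be sketched in plain NumPy, where indexing a palette array with an integer label array returns one color per label. This is a minimal sketch that uses the example class map from the question and only the first four palette colors listed there; it is a reference for comparison, not a replacement for the TensorFlow op.

```python
import numpy as np

# first four entries of the palette listed in the question
palette = np.array([[0, 0, 0],
                    [128, 0, 0],
                    [0, 128, 0],
                    [128, 128, 0]], dtype=np.uint8)

# example class map of shape (batch, height, width) = (1, 3, 3)
labels = np.array([[[0, 1, 3],
                    [2, 0, 1],
                    [3, 1, 2]]])

# integer-array indexing: the result has shape labels.shape + (3,)
image = palette[labels]
```

Comparing `image` with the evaluated tensor for the same labels is a quick way to confirm that the reshapes around the gather keep pixels in the right order.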