I have a problem with a convolution kernel in Python. It is about a simple convolution operator: I have an input matrix and an output matrix, and I want to find a possible convolution kernel of size 5x5. How can I solve this problem with Python, NumPy or TensorFlow?
import numpy as np
import scipy.signal as ss

input_img = np.array([[94, 166, 76, 106, 152, 232],
                      [48, 242, 30, 98, 46, 210],
                      [52, 60, 86, 60, 216, 248],
                      [52, 236, 116, 240, 224, 184],
                      [138, 160, 146, 254, 236, 252],
                      [94, 100, 224, 246, 152, 74]], dtype=float)
output_img = np.array([[15, 49, 23, 105, 0, 0],
                       [43, 30, 108, 124, 0, 0],
                       [58, 120, 112, 92, 0, 0],
                       [73, 127, 118, 126, 0, 0],
                       [112, 123, 76, 37, 0, 0],
                       [0, 0, 0, 0, 0, 0]], dtype=float)
# I want to find this kernel
conv = np.zeros((5, 5), dtype=int)
# So that applying the convolution operator yields the output_img values defined above
output_img = ss.convolve2d(input_img, conv, mode='same')
As far as I understand, you need to reconstruct the window weights from given input and output arrays and a known window size. I think this is possible, especially if the input array (image) is sufficiently big.
Look at the code below:
import scipy.signal as ss
import numpy as np

source_dataset = np.random.rand(20, 10)
sample_convolution = np.diag([1, 1, 1])
output_dataset = ss.convolve2d(source_dataset, sample_convolution, mode='same')
conv_size = sample_convolution.shape[0]

# Given output_dataset, source_dataset, and conv_size we need to reconstruct
# window weights.
def reconstruct(data, output, csize):
    half_size = int(csize / 2)
    min_row_ind = half_size
    max_row_ind = int(data.shape[0]) - half_size
    min_col_ind = half_size
    max_col_ind = int(data.shape[1]) - half_size
    A = list()
    b = list()
    for i in np.arange(min_row_ind, max_row_ind, dtype=int):
        for j in np.arange(min_col_ind, max_col_ind, dtype=int):
            # One equation per interior pixel: flattened window -> output value
            A.append(data[(i - half_size):(i + half_size + 1), (j - half_size):(j + half_size + 1)].ravel().tolist())
            b.append(output[i, j])
            # Solve as soon as there are csize*csize independent equations
            if len(A) == csize * csize and np.linalg.matrix_rank(A) == csize * csize:
                return (np.linalg.pinv(A) @ np.array(b)[:, np.newaxis]).reshape(csize, csize)
    if len(A) < csize * csize:
        raise Exception("Insufficient data")

result = reconstruct(source_dataset, output_dataset, 3)
I got the following result
array([[ 1.00000000e+00, -1.77635684e-15, -1.11022302e-16],
[ 0.00000000e+00, 1.00000000e+00, -8.88178420e-16],
[ 0.00000000e+00, -1.22124533e-15, 1.00000000e+00]])
So, it works as expected, but it definitely needs to be improved to take into account edge effects, the case when the window size is even, etc.
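One possible refinement, given here only as a sketch under stated assumptions: the kernel is square with an odd size, the output came from a zero-filled 'same' convolution, and border pixels are simply ignored. The helper name estimate_kernel is hypothetical. It fits all interior windows at once with np.linalg.lstsq instead of stopping at the first csize*csize equations, and it flips the fitted weights because a convolution applies the kernel reversed along both axes (the diagonal test kernel above is symmetric under that flip, so the difference does not show there).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def estimate_kernel(data, output, ksize):
    """Least-squares estimate of an odd-sized square kernel (hypothetical helper)."""
    half = ksize // 2
    # One row per interior pixel: the flattened ksize x ksize window around it.
    windows = sliding_window_view(data, (ksize, ksize))
    A = windows.reshape(-1, ksize * ksize)
    # Matching output values, with the border rows and columns cropped away.
    b = output[half:output.shape[0] - half, half:output.shape[1] - half].ravel()
    weights, *_ = np.linalg.lstsq(A, b, rcond=None)
    # The fitted weights are in correlation order; flip both axes to get
    # the kernel in the orientation a convolution expects.
    return weights.reshape(ksize, ksize)[::-1, ::-1]
```

With more equations than unknowns, the least-squares fit also tolerates a little noise or rounding in the output, which the exact pinv solve on the first few windows does not.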
Using standard numpy and cv2.filter2D solutions I can apply static convolutions to an image:
import numpy as np
convolution_kernel = np.array([[-2, -1, 0],
                               [-1, 1, 1],
                               [0, 1, 2]])
import cv2
image = cv2.imread('1.png')
result = cv2.filter2D(image, -1, convolution_kernel)
(example from https://stackoverflow.com/a/58383803/3310334)
Every pixel at [i, j] in the output image has a value calculated by centering a 3x3 "window" onto [i, j] in the input image, and then multiplying each value in the window by the corresponding value in the convolution kernel (Hadamard product) and finally summing the 9 products to get the value for [i, j] in the output image (for each color channel).
(image from: https://github.com/ashushekar/image-convolution-from-scratch#convolution)
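The per-pixel rule described above can be written out directly. This is only an illustrative sketch for a single-channel image and a square, odd-sized kernel; filter_pixel is a hypothetical helper, and the zero padding at the border is an assumption (cv2.filter2D uses a reflected border by default, so values near the edges can differ).

```python
import numpy as np

def filter_pixel(image, kernel, i, j):
    """Value of output pixel [i, j] for a 2-D image (hypothetical helper)."""
    size = kernel.shape[0]
    half = size // 2
    # Zero-pad so that windows centred on border pixels are still full-sized.
    padded = np.pad(image, half)
    # In padded coordinates, rows i..i+size-1 are rows i-half..i+half of the image.
    window = padded[i:i + size, j:j + size]
    # Hadamard (element-wise) product, then the sum of the products.
    return np.sum(window * kernel)

convolution_kernel = np.array([[-2, -1, 0],
                               [-1, 1, 1],
                               [0, 1, 2]])
image = np.arange(25, dtype=float).reshape(5, 5)   # made-up test image
center_value = filter_pixel(image, convolution_kernel, 2, 2)
```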
In my case, the function that computes each output pixel is not as simple as the sum of a Hadamard product. Instead, each pixel is calculated from operations performed on known-size windows into two input matrices, centered on that pixel.
I have two input matrixes ("images"), like
A = np.array([[179,  97,  77, 118, 144, 105],
              [ 68,  56, 184, 210, 141, 230],
              [178, 166, 218,  47, 106, 172],
              [ 38, 183,  50, 185,  48,  87],
              [ 60, 200, 228, 232,   6, 190],
              [253,  75, 231, 166, 117, 134]])
B = np.array([[116,  95,  94, 220,  80, 223],
              [135,   9, 166,  78,   5, 129],
              [102, 167, 120,  81, 141,  29],
              [ 83, 117,  81, 129, 255,  48],
              [130, 231, 165,   7, 187, 169],
              [ 44, 137,  16,  50, 229, 202]])
And in the output matrix, each [i, j] pixel should be calculated as the sum of all of A[u,v] ** 2 - B[u,v] ** 2 values for [u, v] coordinates within 3x3 "windows" onto the two (same-sized) input matrixes.
How can I calculate this output matrix quickly in Python?
Using numpy, it seems to be the 3x3 sums of A * A - B * B, but how to do those sums? Or is there another "2d map" process I could be using?
I've written a loop-based solution to calculate the expected output for these two examples:
W = 3  # size of kernel is WxW
out = np.zeros(A.shape)
difference_of_squares = A * A - B * B
for i, j in np.ndindex(out.shape):
    starti = max(i - W//2, 0)  # use smaller kernels at input's boundaries, output will have same dimension as input
    stopi = min(i - W//2 + W, np.shape(out)[0])  # I'm not worried at this point about what happens at boundaries
    startj = max(j - W//2, 0)  # standard convolution solutions are often just reducing output size or padding input with zeroes
    stopj = min(j - W//2 + W, np.shape(out)[1])
    out[i, j] = np.sum(difference_of_squares[starti:stopi, startj:stopj])
print(out)
[[ 8423. 11816. 10372. 41125. 35287. 31747.]
[ 29370. 65887. 38811. 61252. 51033. 51845.]
[ 24756. 60119. 109133. 35101. 70005. 18757.]
[ 8641. 62463. 126935. 14530. 2255. -64752.]
[ 36623. 110426. 163513. 33812. -50035. -146450.]
[ 22268. 100132. 130190. 83010. -10163. -88994.]]
You can use scipy.signal.convolve2d:
from scipy.signal import convolve2d
# Same shape as original (6x6)
>>> convolve2d(A**2-B**2, np.ones((3, 3), dtype=int), mode='same')
array([[ 8423, 11816, 10372, 41125, 35287, 31747],
[ 29370, 65887, 38811, 61252, 51033, 51845],
[ 24756, 60119, 109133, 35101, 70005, 18757],
[ 8641, 62463, 126935, 14530, 2255, -64752],
[ 36623, 110426, 163513, 33812, -50035, -146450],
[ 22268, 100132, 130190, 83010, -10163, -88994]])
# Shape reduced by 2 in each dimension (4x4)
>>> convolve2d(A**2-B**2, np.ones((3, 3), dtype=int), mode='valid')
array([[ 65887, 38811, 61252, 51033],
[ 60119, 109133, 35101, 70005],
[ 62463, 126935, 14530, 2255],
[110426, 163513, 33812, -50035]])
Note: You have to play around with the "mode" and "boundary" parameters until you get what you want.
Update
If the border is not a problem at this point, you can use sliding_window_view:
from numpy.lib.stride_tricks import sliding_window_view
>>> np.sum(sliding_window_view(A**2-B**2, (3, 3)), axis=(2, 3))
array([[ 65887, 38811, 61252, 51033],
[ 60119, 109133, 35101, 70005],
[ 62463, 126935, 14530, 2255],
[110426, 163513, 33812, -50035]])
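If the output should keep the input's shape, one option is to zero-pad before taking the windows. The sketch below assumes an odd window size, and window_sums_same is a hypothetical helper name. Since padded zeros add nothing to a sum, the result should match the truncated border windows of the loop-based version in the question.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def window_sums_same(values, size=3):
    """Sum over size x size windows, keeping the input shape (hypothetical helper)."""
    half = size // 2
    # Zero padding contributes nothing to a sum, so border windows behave
    # like windows that were cut off at the edge of the input.
    padded = np.pad(values, half)
    return sliding_window_view(padded, (size, size)).sum(axis=(2, 3))

A = np.array([[179,  97,  77, 118, 144, 105],
              [ 68,  56, 184, 210, 141, 230],
              [178, 166, 218,  47, 106, 172],
              [ 38, 183,  50, 185,  48,  87],
              [ 60, 200, 228, 232,   6, 190],
              [253,  75, 231, 166, 117, 134]])
B = np.array([[116,  95,  94, 220,  80, 223],
              [135,   9, 166,  78,   5, 129],
              [102, 167, 120,  81, 141,  29],
              [ 83, 117,  81, 129, 255,  48],
              [130, 231, 165,   7, 187, 169],
              [ 44, 137,  16,  50, 229, 202]])

out = window_sums_same(A**2 - B**2)   # shape (6, 6)
```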
I want to build a function name by concatenating "string" + variable + "string" and then call that function.
For simplicity, I am using this condensed version of my code, and I want to minimize the hard-coded contents of the function do_update(a):
ROTATE = '90'
ROT20 = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [126, 129, 153, 189, 129, 165, 129, 126],
    [126, 255, 231, 195, 255, 219, 255, 126],
    [0, 8, 28, 62, 127, 127, 127, 54],
    [0, 8, 28, 62, 127, 62, 28, 8],
    [62, 28, 62, 127, 127, 28, 62, 28],
    [62, 28, 62, 127, 62, 28, 8, 8],
    [0, 0, 24, 60, 60, 24, 0, 0],
]

def updatevalues90(a):
    b = [0] * 8  # pre-fill so that b[i] exists before it is incremented
    for i in range(8):
        for j in range(8):
            b[i] += a[j] + i
    return b

def do_update(a):
    if ROTATE == '90':
        ROT = [updatevalues90(char) for char in a]
    elif ROTATE == '180':
        ROT = [updatevalues180(char) for char in a]
    elif ROTATE == '270':
        ROT = [updatevalues270(char) for char in a]

do_update(ROT20)
Everything I have tried has resulted in Invalid Syntax or ROT filled with the string name of what I want.
I want to take the function call to updatevalues90(char) and instead of needing it hard coded, I want to change it to:
ROT = ["updatevalues" + ROTATE + "(char)" for char in a]
So that whatever value is in ROTATE will become part of the function call, i.e. function name.
My question is how in Python do I concatenate the strings and a variable name into a useable function name?
I think eval, but I can't get the syntax to work for me. Maybe there is something simpler in Python that works?
Store your functions in a dict:
updaters = {
    '90': updatevalues90,
    '180': updatevalues180,
    '270': updatevalues270
}

def do_update(a):
    ROT = [updaters[ROTATE](char) for char in a]
    # return ROT ?
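A self-contained sketch of the same dictionary dispatch follows. The three rotate functions are hypothetical stand-ins for updatevalues90 and friends, and each glyph is assumed to be a list of rows. Passing the rotation in as an argument and returning the result are design choices of this sketch, not part of the original code.

```python
def rotate90(rows):
    """Rotate a list of rows 90 degrees clockwise (hypothetical stand-in)."""
    return [list(r) for r in zip(*rows[::-1])]

def rotate180(rows):
    """Rotate a list of rows by 180 degrees (hypothetical stand-in)."""
    return [row[::-1] for row in rows[::-1]]

def rotate270(rows):
    """Rotate a list of rows 90 degrees counterclockwise (hypothetical stand-in)."""
    return [list(r) for r in zip(*rows)][::-1]

# The functions themselves are the dictionary values; no strings are evaluated.
UPDATERS = {'90': rotate90, '180': rotate180, '270': rotate270}

def do_update(chars, rotate):
    try:
        updater = UPDATERS[rotate]
    except KeyError:
        raise ValueError(f"unsupported rotation: {rotate!r}") from None
    return [updater(char) for char in chars]

glyphs = [[[1, 2], [3, 4]]]
print(do_update(glyphs, '90'))   # [[[3, 1], [4, 2]]]
```

Looking the function up in a dictionary avoids eval entirely, and an unknown key fails with a clear error instead of executing an arbitrary string.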
I have an image processing task in which we're prohibited from using NumPy, so we need to code everything from scratch. I've done the log image transformation, but now I'm stuck on creating an array without NumPy.
So here's my last output code :
Output :
new_log =
[[236,
232,
226,
.
.
.
198,
204]]
I need to convert this to an array so I can write the image like this (with Numpy)
new_log =
array([[236, 232, 226, ..., 208, 209, 212],
[202, 197, 187, ..., 198, 200, 203],
[192, 188, 180, ..., 205, 206, 207],
...,
[233, 226, 227, ..., 172, 189, 199],
[235, 233, 228, ..., 175, 182, 192],
[235, 232, 228, ..., 195, 198, 204]], dtype=uint8)
cv.imwrite('log_transformed.jpg', new_log)
# new_log must be shaped like the second output
You can make a straightforward function to take your list and reshape it in a similar way to NumPy's np.reshape(). But it's not going to be fast, and it doesn't know anything about data types (NumPy's dtype) so... my advice is to challenge whoever it is that doesn't like NumPy. Especially if you're using OpenCV — it depends on NumPy!
Here's an example of what you could do in pure Python:
def reshape(l, shape):
    """Reshape a list.

    Example
    -------
    >>> l = [1,2,3,4,5,6,7,8,9]
    >>> reshape(l, shape=(3, -1))
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    """
    nrows, ncols = shape
    if ncols == -1:
        ncols = len(l) // nrows
    if nrows == -1:
        nrows = len(l) // ncols
    array = []
    for r in range(nrows):
        row = []
        for c in range(ncols):
            row.append(l[ncols*r + c])
        array.append(row)
    return array
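A more compact variant of the same idea, as a sketch: it assumes the pixel values are already in one flat list and that the image width is known. The name to_rows and the sample values are made up for illustration.

```python
def to_rows(flat, ncols):
    """Split a flat list into rows of ncols items each (hypothetical helper)."""
    if ncols <= 0 or len(flat) % ncols != 0:
        raise ValueError("list length must be a positive multiple of ncols")
    # Each slice is one row; the step of the range moves to the next row start.
    return [flat[i:i + ncols] for i in range(0, len(flat), ncols)]

pixels = [236, 232, 226, 198, 200, 204]
print(to_rows(pixels, 3))   # [[236, 232, 226], [198, 200, 204]]
```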
I have multiple 5x5 arrays which are contained within one large array - the overarching shape is: 5 x 5 x 29. I want to sum every 5 x 5 array to produce one single array, instead of 29 single arrays.
I know that you can do something along the lines of:
new_data = data1[:,:,0] + data1[:,:,1] + ... + data1[:,:,28]
However, this gets very cumbersome for large arrays. Is there an easier way to do this?
Assuming you are using NumPy, you should be able to do this with:
In [13]: data1 = np.arange(100).reshape(5, 5, 4) # For example
In [14]: data1[:,:,0] + data1[:,:,1] + data1[:,:,2] + data1[:,:,3] # Bad way
Out[14]:
array([[ 6, 22, 38, 54, 70],
[ 86, 102, 118, 134, 150],
[166, 182, 198, 214, 230],
[246, 262, 278, 294, 310],
[326, 342, 358, 374, 390]])
In [15]: data1.sum(axis=2) # Good way
Out[15]:
array([[ 6, 22, 38, 54, 70],
[ 86, 102, 118, 134, 150],
[166, 182, 198, 214, 230],
[246, 262, 278, 294, 310],
[326, 342, 358, 374, 390]])
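The same call applies unchanged to the 5 x 5 x 29 shape from the question. The sketch below uses made-up random data and checks the result against an explicit loop over the last axis.

```python
import numpy as np

rng = np.random.default_rng(0)
data1 = rng.integers(0, 10, size=(5, 5, 29))   # made-up stand-in data

# Collapse the last axis: the 29 slices are added into one (5, 5) array.
total = data1.sum(axis=2)

# Equivalent explicit loop, for comparison only.
looped = np.zeros((5, 5), dtype=data1.dtype)
for k in range(data1.shape[2]):
    looped += data1[:, :, k]

print(total.shape, np.array_equal(total, looped))   # (5, 5) True
```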
If you are saying you have a list of arrays, then use a for loop.
new_data = np.zeros((5, 5))
for i in range(29):
    new_data += data1[:,:,i]
If you are saying you have a tensor or some ND array you should review and research numpy's ND array docs.
You can use a for loop. Like this:
import numpy as np
new_data = np.zeros((5, 5))
for i in range(29):
    new_data += data1[:,:,i]
I am trying to use python to just compute a local pixel color average, however my output is not at all that.
Image:
Output:
Code:
import cv2

image = cv2.imread('perspective.jpeg')
for i in range(image.shape[1]):
    for j in range(image.shape[0]):
        up = image[min(j + 1, image.shape[0]-1), i]
        down = image[max(j - 1, 0), i]
        right = image[j, min(i + 1, image.shape[1]-1)]
        left = image[j, max(i - 1, 0)]
        average = (up + down + left + right + image[j, i]) / 5
        image[j, i] = average
The issue you are observing is due to integer overflow while computing the average. The pixels are of type np.uint8, and adding them together produces a result that is also np.uint8, which is not large enough to hold the sum.
The solution to this problem is to cast the pixels to a larger data type before adding them, and then cast the final value back to np.uint8 before storing it in the result image.
In fact, casting only one of the values (say up) to a larger data type suffices, as the rest are automatically promoted when the addition is performed.
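A small demonstration of the wraparound and of the promotion, using made-up single-element arrays rather than real pixels:

```python
import numpy as np

a = np.array([200], dtype=np.uint8)
b = np.array([100], dtype=np.uint8)

wrapped = a + b                       # stays uint8: 300 wraps to 300 - 256 = 44
promoted = a.astype(np.float32) + b   # one float32 operand promotes the result

print(wrapped, wrapped.dtype)     # [44] uint8
print(promoted, promoted.dtype)   # [300.] float32
```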
The corrected code may look like this:
import cv2
import numpy as np

image = cv2.imread('perspective.jpeg')
for i in range(image.shape[1]):
    for j in range(image.shape[0]):
        up = np.float32(image[min(j + 1, image.shape[0]-1), i])
        down = image[max(j - 1, 0), i]
        right = image[j, min(i + 1, image.shape[1]-1)]
        left = image[j, max(i - 1, 0)]
        average = (up + down + left + right + image[j, i]) / 5
        image[j, i] = np.uint8(average)
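A separate point, not raised above: the loop writes each average back into the image it is still reading from, so later pixels are averaged over neighbours that have already been modified. The sketch below is a vectorized alternative that reads only from an untouched float copy. The name cross_average is hypothetical, and it assumes an H x W x C uint8 image with edge pixels replicated, which mirrors the min()/max() index clamping in the loop.

```python
import numpy as np

def cross_average(image):
    """5-point (plus-shaped) average of an H x W x C uint8 image (hypothetical helper)."""
    # Work in float32 so that sums above 255 do not wrap around.
    src = image.astype(np.float32)
    # Replicate the edge pixels on the two spatial axes only.
    p = np.pad(src, ((1, 1), (1, 1), (0, 0)), mode='edge')
    # Centre pixel plus its four direct neighbours.
    total = (p[1:-1, 1:-1] + p[:-2, 1:-1] + p[2:, 1:-1]
             + p[1:-1, :-2] + p[1:-1, 2:])
    # Truncate back to uint8, as np.uint8(average) does in the loop.
    return (total / 5).astype(np.uint8)
```

Because the result is a new array, the input image is left unchanged and every output pixel depends only on original values.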
You can easily do this with filter2D as shown in the example below. It will work on any number of channels.
im = np.random.randint(0, 256, (5, 5), np.uint8)
kernel = np.array([[0, 1./5, 0], [1./5, 1./5, 1./5], [0, 1./5, 0]])
filt = cv2.filter2D(im, cv2.CV_8U, kernel)
For example:
im
array([[ 14, 127, 221, 74, 2],
[132, 251, 88, 19, 215],
[183, 140, 17, 60, 76],
[208, 144, 182, 11, 64],
[183, 89, 217, 131, 23]], dtype=uint8)
filt
array([[106, 173, 120, 67, 116],
[166, 148, 119, 91, 66],
[161, 147, 97, 37, 95],
[172, 153, 114, 90, 37],
[155, 155, 160, 79, 83]], dtype=uint8)
You can choose the border type; I've used the default.