Why doesn't cv2 dilate actually affect my image? - python

So, I'm generating a binary image (well, really a grayscale, 8-bit image used as binary) with Python and OpenCV, writing a small number of polygons to the image, and then dilating the image using a kernel. However, my source and destination images always end up the same, no matter what kernel I use. Any thoughts?
from matplotlib import pyplot
import numpy as np
import cv2
binary_image = np.zeros(image.shape,dtype='int8')
for rect in list_of_rectangles:
    cv2.fillConvexPoly(binary_image, np.array(rect), 255)
kernel = np.ones((11,11),'int')
dilated = cv2.dilate(binary_image,kernel)
if np.array_equal(dilated, binary_image):
    print("EPIC FAIL!!")
else:
    print("eureka!!")
All I get is EPIC FAIL!
Thanks!

So, it turns out the problem was in the creation of both the kernel and the image. I believe OpenCV expects 'uint8' as the data type for both the kernel and the image. In this particular case, I created the kernel with dtype='int', which defaults to 'int64', and I created the image as 'int8', not 'uint8'. Somehow this did not trigger an exception, but caused the dilation to fail in a surprising fashion.
Changing the above two lines to
binary_image = np.zeros(image.shape,dtype='uint8')
kernel = np.ones((11,11),'uint8')
fixed the problem, and now I get eureka! Hooray!
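For reference, here is a minimal, self-contained sketch of the corrected version (the canvas size and the single rectangle below are made up purely for illustration):
import numpy as np
import cv2
# hypothetical example data: a 200x200 canvas and one rectangle
list_of_rectangles = [[(20, 20), (80, 20), (80, 60), (20, 60)]]
binary_image = np.zeros((200, 200), dtype='uint8')              # uint8, not int8
for rect in list_of_rectangles:
    cv2.fillConvexPoly(binary_image, np.array(rect, dtype=np.int32), 255)
kernel = np.ones((11, 11), 'uint8')                             # uint8, not int64
dilated = cv2.dilate(binary_image, kernel)
print(np.array_equal(dilated, binary_image))                    # False: the polygons have grown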

Related

How to detect black object on black background using Python OpenCV

I am trying to detect a black tape on a black background.
No tape, with tape (cropped pictures):
I first cropped the area of the tape from the original image and then performed thresholding on it. Below is the image when there is no tape:
You can notice there is an almost solid line. The black tape is placed right next to it, and when it is placed this line becomes very light. Below is the image:
Are there any good image processing techniques I can use to detect when the black tape is placed and when it's not?
Below is the code I am currently using:
import cv2
import os
import imutils
from pathlib import Path
import numpy as np
def on_mouse(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDOWN:
        print("X: {} | Y: {}".format(x, y))
dirPath = Path(__file__).parents[2]
imgPath = os.path.join(dirPath, "img", "img.png")
win_name = "Image"
cv2.namedWindow(win_name)
cv2.setMouseCallback(win_name, on_mouse)
img = cv2.imread(imgPath)
img = imutils.resize(img, width=800)
roiImg = img[298:337, 520:591]
img_gray = cv2.cvtColor(roiImg, cv2.COLOR_BGR2GRAY)
rett, thresh = cv2.threshold(img_gray, 50, 255, cv2.THRESH_BINARY)
cv2.imshow(win_name, img)
cv2.imshow("Thres", thresh)
cv2.waitKey(0)
cv2.destroyAllWindows()
Here is the link to test video: https://drive.google.com/file/d/1P3Xkx_SuHidDs1UdacS3-DZqA-CiXQOX/view?usp=sharing
Below is the image with the area marked in red where the tape is usually placed.
Thanks
There is no way to write stable image processing software here.
In this industrial environment you get differences in ambient light, reflections, shadows, different presentation angles, sunlight, etc. These will change the brightness of your image, partially or globally, far more than the presence of a nearly invisible tape, so it is not possible to find a good threshold.
So I guess you are on the right track with the "temporary solution" of detecting the gray hand.
If you really want to detect the tape, you need a hardware change that gets you away from this black-on-black situation:
Use white tape on the black part, or black tape on a white part. Only to mention :-)
Use dark-field/bright-field illumination instead of ambient light. This probably will not work here, because the angles of the part and the tape are similar.
Use a different wavelength with more contrast than visible light. This needs a specific camera and illumination, and the best wavelength depends a lot on the material, but it is the most professional and stable solution here.

reproduce same output with scikit-image resize and OpenCV resize function

I'm trying to reproduce the same output with these snippets:
Scikit-Image + Keras
from keras.models import model_from_json
import numpy as np
from skimage.io import imread
from skimage.transform import resize
image = resize(imread(img_path, as_grey=False), (80, 80), preserve_range=True, mode='constant')
image /= 255.
img_array = np.array([image])
pred_IN = model.predict(img_array)
OpenCV
import cv2
model = cv2.dnn.readNet('mynet.prototxt', 'mynet.caffemodel')
image = cv2.imread(image_path)
img = cv2.dnn.blobFromImage(image, scalefactor=(1.0/255.0), size=(80, 80), swapRB=True, crop=False)
model.setInput(img)
pred = model.forward()
The problem is that I cannot get the same data to pass to the network (the DNN module in the OpenCV case). The network is the same and the input data is the same, but the results are slightly different. The reason is that the resize function behaves differently between scikit-image and OpenCV (used internally by blobFromImage), and I don't know how to adapt the OpenCV code to match scikit-image.
My final application will use OpenCV in C++, so I need to match these snippets, as my network has been trained with data generated by scikit-image.
I think the reason is that skimage uses antialiasing (a Gaussian blur from scipy.ndimage before rescaling) by default. You can achieve a similar result with resize in OpenCV by blurring your image (e.g. using cv2.GaussianBlur) before cv2.resize. The result is not exactly the same, but with a proper blur kernel size it is very, very similar (almost identical). Hope it'll help :)
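For illustration, a rough sketch of that suggestion (the kernel size and sigma are assumptions to be tuned against the skimage output, not exact equivalents of its internal filter):
import cv2
image = cv2.imread("example.jpg")                        # placeholder path
blurred = cv2.GaussianBlur(image, (5, 5), 0)             # approximate skimage's anti-aliasing
resized = cv2.resize(blurred, (80, 80), interpolation=cv2.INTER_LINEAR)
resized = resized.astype('float32') / 255.0              # same scaling as the Keras snippet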

overcome Graphdef cannot be larger than 2GB in tensorflow -- Image transformations with tf

I've referred to this question, but I don't quite understand the second method provided by mrry:
overcome Graphdef cannot be larger than 2GB in tensorflow
Basically, I'm trying to use tf's built in image transformation methods on images. I'm running into the error provided in the title.
Also, do I need to keep creating a new session for each iteration?
Currently, this process is a little slow and am not sure how to speed it up.
import tensorflow as tf
import os
from scipy.ndimage import imread
from scipy.misc import imresize, imsave
import matplotlib.pyplot as plt
for fish in Fishes:
    fish_images = os.listdir(os.path.join('C:\\Users\\Moondra\\Desktop\\Fishes', fish))  # get the image files
    os.makedirs(SAVE_DIR + fish, exist_ok=True)
    for num, fish_image in enumerate(fish_images):
        image = imread(os.path.join('C:\\Users\\Moondra\\Desktop\\Fishes', fish, fish_image))
        new_img = tf.image.adjust_brightness(image, .4)  # image transformation
        with tf.Session() as sess:
            new_image = sess.run(new_img)
            imsave(os.path.join(SAVE_DIR, fish, fish + str(num) + '.jpg'), new_image)
This is not how TF should be used.
You should create the graph once.
You should create the session once.
Your current code does both inside a loop, which causes the slowness and the memory issue. The problem lies in the fact that TF is not an imperative language, so
new_img = tf.image.adjust_brightness(image, .4)  # image transformation
is not an application of a function to the image. It creates an operation in the graph and stores a reference to that operation in new_img. So each time you call this function, your graph grows.
So in pseudocode it should be:
create placeholder for image name
create transformed image op - new_img
create session
for each image
    run the new_img op in the session, providing the path to the placeholder via feed_dict
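As a rough sketch of that structure (TF1-style graph and session, with placeholder directory paths, assuming the inputs are JPEGs):
import os
import tensorflow as tf
from scipy.misc import imsave

IMAGE_DIR = 'C:\\Users\\Moondra\\Desktop\\Fishes\\some_fish'   # placeholder input folder
SAVE_DIR = 'C:\\Users\\Moondra\\Desktop\\Fishes_out'           # placeholder output folder

# build the graph once: a placeholder for the file name, decode, one brightness op
path_ph = tf.placeholder(tf.string)
raw = tf.read_file(path_ph)
img = tf.image.decode_jpeg(raw, channels=3)
bright_op = tf.image.adjust_brightness(img, 0.4)

# create the session once and reuse it for every image
with tf.Session() as sess:
    for num, name in enumerate(os.listdir(IMAGE_DIR)):
        new_image = sess.run(bright_op, feed_dict={path_ph: os.path.join(IMAGE_DIR, name)})
        imsave(os.path.join(SAVE_DIR, str(num) + '.jpg'), new_image)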

Alternative to .dilate() OpenCV

I am using cv2 and Pillow in my script:
image = Image.open("img1.png")
#do some stuff to the image
image.save("result1.png")
image = cv2.imread("result1.png")
kernel = np.ones((5, 5), np.uint8)
dilated_image = cv2.dilate(image, kernel, iterations=3)
cv2.imwrite("result2.png", dilated_image)
final_image = Image.open("result2.png")
#do some other stuff to the image
final_image.save("final_result.png")
As you can see, I have to switch between OpenCV and Pillow and save three images. What I want is to save just one result instead of three.
Is there a way to stay with Pillow and dilate the image at almost the same speed, without using cv2?
I have already tried image.filter(ImageFilter.MaxFilter(size=3)), but it takes too much CPU time. The reason it takes so long is that, to get the same effect as cv2.dilate(image, kernel, iterations=5), I have to use at least image.filter(ImageFilter.MaxFilter(size=15)).
If you are just looking for an alternative to the OpenCV function in a standard library, you can try SciPy's function (SO Question here).
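For example, a minimal sketch using scipy.ndimage.grey_dilation (the 13x13 footprint is an assumption meant to approximate a 5x5 kernel applied with iterations=3, i.e. 3*(5-1)+1; verify the output against cv2.dilate for your images):
import numpy as np
from PIL import Image
from scipy import ndimage

image = Image.open("img1.png")
# ...do some stuff to the image with Pillow...
arr = np.array(image)
size = (13, 13) if arr.ndim == 2 else (13, 13, 1)    # don't dilate across colour channels
dilated = ndimage.grey_dilation(arr, size=size)
final_image = Image.fromarray(dilated)
# ...do some other stuff to the image...
final_image.save("final_result.png")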

Numpy (OpenCV) image array to OpenGL texture (pi3d)

I'm using pi3d to display an ImageSprite on the screen, the texture of which comes from an image I'm loading.
displayTexture = pi3d.Texture("display/display.jpg", blend=True, mipmap=True)
displaySlide = pi3d.ImageSprite(texture=displayTexture, shader=shader, w=800, h=600)
This texture image is actually something I'm creating in-program. It's an OpenCV image and therefore just a numpy array. At the moment I'm saving it just to load it again as a texture, but is there a way to just constantly update the texture of the sprite with the changing numpy array values?
I looked into the openCV OpenGL support but from what I could see it only supports Windows at this stage and is therefore not suitable for this use.
Edit: Should have mentioned I'm happy with a lower-level solution too. I'm currently trying to use .tostring() on the image array and use the resulting byte list with glTexImage2D to produce a texture, but no dice so far.
Yes, you can pass a PIL.Image to pi3d.Texture and it will create a new Texture using that. There is a bit of work involved, so it will impact the frame rate if it's a big Texture. You also need to update the pointer in the Buffer that holds the Texture array so the new Texture gets used.
There is a method to load a numpy array into a PIL.Image (Image.fromarray()), so this would be an easy route. However it's a bit convoluted, as pi3d already converts the PIL.Image into a numpy array, see https://github.com/tipam/pi3d/blob/master/pi3d/Texture.py#L163
The following works OK as a short-cut into the workings of pi3d.Texture, but it's a bit of a hack calling the 'private' function _load_opengl. I might look at making a more robust method of doing this (i.e. for mapping videos to 3D objects etc.)
#!/usr/bin/python
from __future__ import absolute_import, division, print_function, unicode_literals
import demo
import pi3d
import random
import numpy as np
from PIL import Image, ImageDraw
DISPLAY = pi3d.Display.create(x=150, y=150)
shader = pi3d.Shader("uv_flat")
im = Image.open("textures/PATRN.PNG")
#draw = ImageDraw.Draw(im) # there are various PIL libraries you could use
nparr = np.array(im)
tex = pi3d.Texture(im) # can pass PIL.Image rather than path as string
sprite = pi3d.ImageSprite(tex, shader, w=10.0, h=10.0)
mykeys = pi3d.Keyboard()
while DISPLAY.loop_running():
    #draw.line((random.randint(0,im.size[0]),
    #           random.randint(0,im.size[1]),
    #           random.randint(0,im.size[0]),
    #           random.randint(0,im.size[1])), fill=128) # draw random lines
    #nparr = np.array(im)
    nparr += np.random.randint(-2, 2, nparr.shape) # random noise
    tex.image = nparr
    tex._load_opengl()
    sprite.draw()
    if mykeys.read() == 27:
        mykeys.close()
        DISPLAY.destroy()
        break
PS: I can't remember in which version of pi3d the switch to numpy textures happened, but it's quite recent, so you probably have to upgrade.
EDIT:
The switch from Texture.image being a bytes object to a numpy array was in v1.14, posted on 18 Mar 2015.
To clarify the steps to use a numpy array to initialise and refresh a changing image:
...
im = Image.fromarray(cv2im) # cv2im is a numpy array
tex = pi3d.Texture(im) # create Texture from PIL image
sprite = pi3d.ImageSprite(tex, shader, w=10.0, h=10.0)
...
tex.image = cv2im # set Texture.image to modified numpy array
tex._load_opengl() # re-run OpenGLESv2 routines
