Is there a Python equivalent of https://www.mathworks.com/help/images/sliding-neighborhood-operations.html, or a way to do it efficiently?
I want to be able to fit a custom-sized window, say [2x2], over an image/array and apply a custom function to the values in that window. For example, as on the MATLAB page, it could set each pixel to the max of the surrounding numbers, though I know a convolution can do that. Specifically, I want to apply the delta2000 function from https://python-colormath.readthedocs.io/en/latest/delta_e.html to each pixel and its neighbors and see the results.
I have not been able to find anything.
For discrete convolution operations, which are the most common sliding neighborhood methods (used to apply filters to images, such as Gaussian or blurring filters, etc.), OpenCV is a good library (as it is for any image processing task).
You should also check NumPy, which is used to operate on arrays (for instance, 2D or 3D images) and works well with OpenCV.
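For a custom function rather than a convolution, one possible approach is NumPy's sliding_window_view, which exposes every window of the array without copying. The following is only a minimal sketch: the helper name sliding_apply is made up for illustration, the windows are not padded at the borders, and the Python loop will be slow for large images. scipy.ndimage.generic_filter is another option worth checking if SciPy is available.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sliding_apply(image, window_shape, func):
    """Apply func to every window of a 2D array (hypothetical helper, no padding)."""
    # windows has shape (H - h + 1, W - w + 1, h, w) and is a view, not a copy
    windows = sliding_window_view(image, window_shape)
    out_h, out_w = windows.shape[:2]
    result = np.empty((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            # func can be any callable that reduces a window to a single number
            result[i, j] = func(windows[i, j])
    return result


image = np.arange(16).reshape(4, 4)
local_max = sliding_apply(image, (2, 2), np.max)
print(local_max)
```

Any callable that turns a window into one number can be passed as func, so a color-difference measure between the center pixel and its neighbors could be plugged in the same way.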
Related
I'm trying to remove the differences between two frames and keep the non-changing graphics. I would probably repeat the same process with more frames to get more accurate results. My idea is to simplify the frames by removing things that aren't needed, which should make the rest of the processing I do afterwards simpler.
The different frames come from the same video, so there is no need to deal with different sizes, orientations, etc. If the same graphic is in another frame but with a different orientation or scale, I would like to remove it as well. For example:
Image 1
Image 2
Result (more or less; I suppose the real one will be uglier but contain similar information)
One of the problems with this idea is that the source video, even though it consists of computer-generated graphics, is compressed, so it's not easy to tell whether a change in the tonality of a pixel is an actual change or not.
Ideally I'm not looking at the pixel level, and given the differences in saturation introduced by the compression that probably isn't possible anyway. I'm looking for unchanged "objects" in the image. I want to extract the information layer shown on top of what's happening behind it.
Over the last couple of days I have tried to achieve this in a Python script using OpenCV with all kinds of combinations of absdiff, subtract, threshold, equalizeHist and Canny, but so far I haven't found the right implementation and would appreciate any guidance. How would you achieve it?
This will be extremely hard. You would need to employ proper computer vision, and if you're not an expert in that field, you'll have a really hard time.
How about this: forgetting about tooling and libraries, you have two images, i.e. two equally sized sequences of RGB pixels, image A and image B, plus the output image R. Allocate R with the same size as A or B.
Run a single loop over every pixel, reading pixel a from A and pixel b from B. Each is a 3-element (RGB) vector. Find the distance between the two vectors, e.g. the magnitude of the vector (b-a). If this is less than some tolerance, write either a or b to the same offset in the result image R. If not, write some default (background) color to R.
You can most likely do this in a hardware-accelerated way using OpenCV or some other library, but it's up to you to find a tool that does what you want.
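The per-pixel loop described above can be written without an explicit loop in NumPy. This is a rough sketch, assuming both frames are uint8 RGB arrays of the same shape; the function name, the tolerance of 30 and the magenta background color are arbitrary choices for illustration, and the tolerance would need tuning against the compression noise in the real video.

```python
import numpy as np


def keep_unchanged(a, b, tolerance=30.0, background=(0, 0, 0)):
    """Keep pixels of a whose RGB distance to b is below tolerance (illustrative helper)."""
    # Work in float so the subtraction cannot wrap around in uint8
    diff = b.astype(np.float32) - a.astype(np.float32)
    distance = np.linalg.norm(diff, axis=-1)  # magnitude of (b - a) per pixel
    unchanged = distance < tolerance

    result = np.empty_like(a)
    result[...] = background          # default color everywhere
    result[unchanged] = a[unchanged]  # copy pixels that did not change
    return result


frame_a = np.zeros((2, 2, 3), dtype=np.uint8)
frame_b = frame_a.copy()
frame_a[0, 0] = (200, 10, 10)
frame_b[0, 0] = (205, 12, 9)       # almost identical: treated as unchanged
frame_b[1, 1] = (255, 255, 255)    # very different: treated as changed

result = keep_unchanged(frame_a, frame_b, tolerance=30.0, background=(255, 0, 255))
print(result)
```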
I am currently working on a summer research project, and we have generated 360 slices of a tumor. I now need to compile (if that's the right word) these images into one large 3D image. Is there a way to do this with either a Python module or an outside tool? I would prefer to use free software if possible.
Perhaps via matplotlib, though I suppose it may require some preprocessing:
https://www.youtube.com/watch?v=5E5mVVsrwZw
In your case, the z axis (third dimension) would be given by the position of each image in your sequence. Before proceeding, though, I suppose you would need to extract the shape of the object you want to reconstruct. If I take any one of your 2D images, I expect to find an RGB value for each pixel; but if you want to plot, for instance, a skull as in the linked video, then as I understand it you would need to extract the borders of the object from each frame (a 2D shape) and then plot the series. In any case, the processing may depend on how your data is encoded. Perhaps it is sufficient to simply plot the series of images.
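As a rough illustration of using the image index as the z axis, the slices can be stacked into a single 3D NumPy array. The sketch below uses synthetic binary slices (a made-up sphere) in place of the real tumor images, which would have to be loaded and segmented first; the helper name and the sizes are assumptions for the example only.

```python
import numpy as np


def make_fake_slices(n_slices=36, size=32):
    """Generate synthetic binary slices of a sphere (stand-in for real, segmented images)."""
    yy, xx = np.mgrid[:size, :size]
    center = (size - 1) / 2
    slices = []
    for k in range(n_slices):
        z = (k - (n_slices - 1) / 2) / (n_slices / 2)  # roughly -1 .. 1
        radius = (size / 2 - 2) * np.sqrt(max(0.0, 1 - z ** 2))
        disk = (yy - center) ** 2 + (xx - center) ** 2 <= radius ** 2
        slices.append(disk.astype(np.uint8))
    return slices


slices = make_fake_slices()
# The slice index becomes the first (z) axis of the volume
volume = np.stack(slices, axis=0)
print(volume.shape)  # (z, y, x)
```

The resulting array can then be handed to whatever 3D plotting or volume-rendering tool you prefer.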
Some useful links I found:
https://www.researchgate.net/post/How_to_reconstruct_3D_images_from_two_or_four_2D_images
Python: 3D contour from a 2D image - pylab and contourf
Is there any way to obtain background from cv2.BackgroundSubtractorMOG2 in python?
In other words, is there any technique to compute an image based on last n frames of a video, which can be used as background?
Such a technique would be pretty complicated, but you might want to look at some keywords: image stitching, gradient-based methods, PatchMatch, image filling. MATLAB, for example, has a function that tries to interpolate missing values from nearby pixels. You could extend this method to work in 3D (it shouldn't be too difficult in the linear case).
More generally, it is sort of an ill-posed problem since there is no way to know what goes in the missing region.
To address your question specifically, you might first take the difference between the original frame and the extracted image, which should reveal the background. Then use ROI fill-in or a similar method. There are likely some examples you can find on the web, such as this.
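For the narrower question of computing a background image from the last n frames, one simple alternative is a per-pixel temporal median over a rolling buffer. To be clear, this is a different technique from MOG2, sketched here with a hypothetical MedianBackground class, and it assumes a static camera and foreground objects that cover any given pixel in fewer than half of the buffered frames. Recent OpenCV versions also expose getBackgroundImage() on the subtractor object, which may be the more direct route if your build has it.

```python
from collections import deque

import numpy as np


class MedianBackground:
    """Rolling per-pixel median over the last n frames (illustrative, not MOG2)."""

    def __init__(self, n_frames=25):
        self.frames = deque(maxlen=n_frames)  # old frames drop out automatically

    def update(self, frame):
        self.frames.append(np.asarray(frame))

    def background(self):
        if not self.frames:
            raise ValueError("no frames added yet")
        stacked = np.stack(self.frames, axis=0)
        return np.median(stacked, axis=0).astype(stacked.dtype)


model = MedianBackground(n_frames=5)
for k in range(5):
    frame = np.full((4, 4), 100, dtype=np.uint8)  # static background
    frame[0, k % 4] = 255                         # small moving "object"
    model.update(frame)

bg = model.background()
print(bg)
```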
I have written a program in Python which automatically reads score sheets like this one
At the moment I am using the following basic strategy:
Deskew the image using ImageMagick
Read into Python using PIL, converting the image to B&W
Calculate the sums of pixels in the rows and the columns
Find peaks in these sums
Check the intersections implied by these peaks for fill.
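The projection and peak-finding steps above can be sketched roughly as follows. This is a minimal illustration on a tiny synthetic sheet, not the actual program: the function name, the min_fraction threshold and the single-pixel "dots" are all assumptions made for the example.

```python
import numpy as np


def projection_peaks(binary, min_fraction=0.5):
    """Return row and column indices where the ink projection has a local peak."""

    def peaks(profile):
        if profile.max() == 0:
            return []
        threshold = min_fraction * profile.max()
        found = []
        for i, value in enumerate(profile):
            left = profile[i - 1] if i > 0 else -1
            right = profile[i + 1] if i < len(profile) - 1 else -1
            # Strictly above the left neighbor, so a plateau reports its left edge once
            if value >= threshold and value > left and value >= right:
                found.append(i)
        return found

    row_sums = binary.sum(axis=1)  # one value per row
    col_sums = binary.sum(axis=0)  # one value per column
    return peaks(row_sums), peaks(col_sums)


# Synthetic 9x9 sheet, 1 = ink: dots on a 2x3 grid with one left blank
sheet = np.zeros((9, 9), dtype=int)
sheet[np.ix_([2, 6], [1, 4, 7])] = 1
sheet[6, 4] = 0

rows, cols = projection_peaks(sheet)
# Check each implied intersection for fill
filled = [(r, c) for r in rows for c in cols if sheet[r, c]]
print(rows, cols, filled)
```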
The result of running the program is shown in this image:
You can see the peak plots below and to the right of the image shown in the top left. The lines in the top left image are the positions of the columns and the red dots show the identified scores. The histogram bottom right shows the fill levels of each circle, and the classification line.
The problem with this method is that it requires careful tuning, and is sensitive to differences in scanning settings. Is there a more robust way of recognising the grid, which will require less a-priori information (at the moment I am using knowledge about how many dots there are) and is more robust to people drawing other shapes on the sheets? I believe it may be possible using a 2D Fourier Transform, but I'm not sure how.
I am using the EPD, so I have quite a few libraries at my disposal.
First of all, I find your initial method quite sound and I would have probably tried the same way (I especially appreciate the row/column projection followed by histogramming, which is an underrated method that is usually quite efficient in real applications).
However, since you want a more robust processing pipeline, here is a proposal that can probably be fully automated (and that would also remove the need for deskewing via ImageMagick):
1. Feature extraction: extract the circles via a generalized Hough transform. As suggested in other answers, you can use OpenCV's Python wrapper for that. The detector may miss some circles but this is not important.
2. Apply a robust alignment detector using the circle centers. You can use Desolneux's parameter-less detector described here. Don't be put off by the math; the procedure is quite simple to implement (and you can find example implementations online).
3. Get rid of diagonal lines by a selection on the orientation.
4. Find the intersections of the lines to get the dots. You can use these coordinates for deskewing by assuming ideal fixed positions for these intersections.
This pipeline may be a bit CPU-intensive (especially step 2, which performs some kind of greedy search), but it should be quite robust and automatic.
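The last two steps (dropping diagonal lines, then intersecting the rest) might look something like the sketch below. It assumes the detected alignments are available in the usual Hough normal form (rho, theta), with x*cos(theta) + y*sin(theta) = rho; that representation, the 10 degree tolerance and the helper names are assumptions for illustration, and the alignment detector itself is not implemented here.

```python
import numpy as np


def classify(theta, tol=np.deg2rad(10)):
    """Label a line as 'vertical', 'horizontal' or None (diagonal) from its normal angle."""
    t = theta % np.pi
    if min(t, np.pi - t) <= tol:
        return "vertical"      # normal points along x, so the line is x = rho
    if abs(t - np.pi / 2) <= tol:
        return "horizontal"    # normal points along y, so the line is y = rho
    return None


def intersect(line_a, line_b):
    """Intersection (x, y) of two lines in (rho, theta) form, or None if parallel."""
    (rho1, th1), (rho2, th2) = line_a, line_b
    A = np.array([[np.cos(th1), np.sin(th1)],
                  [np.cos(th2), np.sin(th2)]])
    if abs(np.linalg.det(A)) < 1e-9:
        return None
    x, y = np.linalg.solve(A, np.array([rho1, rho2]))
    return float(x), float(y)


lines = [(10.0, 0.0), (30.0, 0.0), (5.0, np.pi / 2), (20.0, np.pi / 4)]
vertical = [l for l in lines if classify(l[1]) == "vertical"]
horizontal = [l for l in lines if classify(l[1]) == "horizontal"]

# Every vertical/horizontal pair gives one candidate dot position
points = [intersect(v, h) for v in vertical for h in horizontal]
print(points)
```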
The correct way to do this is to use connected component analysis on the image, to segment it into "objects". Then you can use higher-level algorithms (e.g. a Hough transform on the components' centroids) to detect the grid and also determine for each cell whether it's on or off, by looking at the number of active pixels it contains.
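To make the idea concrete, here is a small, unoptimized sketch of connected component labeling with 4-connectivity, followed by the centroid and pixel count of each component. The function names are made up for this example; in practice a library routine such as scipy.ndimage.label would probably be faster and is worth checking first.

```python
from collections import deque

import numpy as np


def label_components(binary):
    """Label 4-connected components of a 2D boolean image using breadth-first search."""
    binary = np.asarray(binary, dtype=bool)
    labels = np.zeros(binary.shape, dtype=int)
    h, w = binary.shape
    current = 0
    for sr in range(h):
        for sc in range(w):
            if not binary[sr, sc] or labels[sr, sc]:
                continue
            current += 1                      # start a new component
            labels[sr, sc] = current
            queue = deque([(sr, sc)])
            while queue:
                r, c = queue.popleft()
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if 0 <= nr < h and 0 <= nc < w and binary[nr, nc] and not labels[nr, nc]:
                        labels[nr, nc] = current
                        queue.append((nr, nc))
    return labels, current


def component_stats(labels, count):
    """Centroid (row, col) and area in pixels for each labeled component."""
    stats = []
    for k in range(1, count + 1):
        rows, cols = np.nonzero(labels == k)
        stats.append({"centroid": (float(rows.mean()), float(cols.mean())),
                      "area": int(rows.size)})
    return stats


image = np.zeros((6, 6), dtype=bool)
image[0:2, 0:2] = True   # a filled 2x2 blob
image[4, 4] = True       # a single stray pixel

labels, count = label_components(image)
stats = component_stats(labels, count)
print(count, stats)
```

The areas could then be thresholded to decide whether a cell counts as filled, and the centroids passed on to the grid detection step.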
I have a problem where I need to search for a pattern (available as a NumPy ndarray) within another image (also a NumPy ndarray) and compute a template match (the position of minimum difference in the image). My question is: is there any built-in function in the Python Imaging Library, NumPy, or anything else that can do this without my having to write the function manually?
Thank you....
This is likely best done as an inverse convolution or correlation. NumPy/SciPy have code to do both.
edit: including a little example.
Go here for the ipython notebook file: http://nbviewer.ipython.org/4020770/
I made a little Gaussian and then used scipy.signal.correlate2d with the original image and a small subset of it.
You can see that the highest values of the correlation are centered around where the subset of the image was taken. Note that for large kernels or images this code can take a while (because correlation is expensive).
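Since the question asks for the position of minimum difference, here is a small NumPy-only sketch that computes the sum of squared differences (SSD) at every offset and takes the minimum. Note that this swaps in SSD for the plain correlation used in the notebook, because raw correlation peaks can be pulled toward bright regions; the function name is hypothetical, and the approach is brute force, so it will be slow and memory-hungry for large images.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def match_template_ssd(image, template):
    """Return the (row, col) of the best match and the full SSD map (illustrative helper)."""
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    # One template-sized window per valid offset: shape (H - h + 1, W - w + 1, h, w)
    windows = sliding_window_view(image, template.shape)
    ssd = ((windows - template) ** 2).sum(axis=(-2, -1))
    row, col = np.unravel_index(np.argmin(ssd), ssd.shape)
    return (int(row), int(col)), ssd


rng = np.random.default_rng(0)
image = rng.random((20, 30))
template = image[5:9, 12:17].copy()  # cut a patch out of the image itself

position, ssd = match_template_ssd(image, template)
print(position)
```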