I want to use PIL Images in conjunction with tkinter display. I would like to be able to do a custom pixel by pixel (probably exclusive or) operation on two images to create a third image. A cursory read of the PIL.Image documentation does not make it clear how best to get at those pixels. Any suggestion on the most efficient way to do this? (The two images will be stills from the same camera, so there will be no issue of size or mode mismatch.)
If there is a way to get access to the pixel map in such a way that I could pass it to a Numpy routine or a custom C extension, that would be ideal.
Related
Im trying to remove the differences between two frames and keep the non-chaning graphics. Would probably repeat the same process with more frames to get more accurate results. My idea is to simplify the frames removing things that won't need to simplify the rest of the process that will do after.
The different frames are coming from the same video so no need to deal with different sizes, orientation, etc. If the same graphic its in another frame but with a different orientation or scale, I would like to also remove it. For example:
Image 1
Image 2
Result (more or less, I suppose that will be uglier but containing a similar information)
One of the problems of this idea is that the source video, even if they are computer generated graphics, is compressed so its not that easy to identify if a change on the tonality of a pixel its actually a change or not.
Im ideally not looking at a pixel level and given the differences in saturation applied by the compression probably is not possible. Im looking for unchaged "objects" in the image. I want to extract the information layer shown on top of whats happening behind it.
During the last couple of days I have tried to achieve it in a Python script by using OpenCV with all kinds of combinations of absdiffs, subtracts, thresholds, equalizeHists, canny but so far haven't found the right implementation and would appreciate any guidance. How would you achieve it?
Im ideally not looking at a pixel level and given the differences in saturation applied by the compression probably is not possible. Im looking for unchaged "objects" in the image. I want to extract the information layer shown on top of whats happening behind it.
This will be extremely hard. You would need to employ proper CV and if you're not an expert in that field, you'll have really hard time.
How about this, forgetting about tooling and libs, you have two images, ie. two equally sized sequences of RGB pixels. Image A and Image B, and the output image R. Allocate output image R of the same size as A or B.
Run a single loop for every pixel, read pixel a and from A and pixel b from B. You get a 3-element (RGB) vector. Find distance between the two vectors, eg. magnitude of a vector (b-a), if this is less than some tolerance, write either a or b to the same offset into result image R. If not, write some default (background) color to R.
You can most likely do this with some HW accelerated way using OpenCV or some other library, but that's up to you to find a tool that does what you want.
I'm looking for OpenCV or other Python function that displays a NumPy array as an image like this:
Referenced from this tutorial.
What function creates this kind of grey-scale image with pixel values display?
Is there a color image equivalent?
MATLAB has a function called showPixelValues().
The best way to do this is to search "heat map" or "confusion matrix" rather than image, then there are two good options:
Using matplotlib only, with imshow() and text() as building blocks the solution is actually not that hard, and here are some examples.
Using seaborn which is a data visualisation package and the solution is essentially a one-liner using seaborn.heatmap() as shown in these examples.
My problem was really tunnel vision, coming at it from an image processing mindset, and not thinking about what other communities have a need to display the same thing even if they call it by a different name.
I'm a beginner in opencv using python. I have many 16 bit gray scale images and need to detect the same object every time in the different images. Tried template matching in opencv python but needed to take different templates for different images which could be not desirable. Can any one suggest me any algorithm in python to do it efficiently.
Your question is way too general. Feature matching is a very vast field.
The type of algorithm to be used totally depends on the object you want to detect, its environment etc.
So if your object won't change its size or angle in the image then use Template Matching.
If the image will change its size and orientation you can use SIFT or SURF.
If your object has unique color features that is different from its background, you can use hsv method.
If you have to classify a group of images as you object,for example all the cricket bats should be detected then you can train a number of positive images to tell the computer how the object looks like and negative image to tell how it doesn't, it can be done using haar training.
u can try out sliding window method. if ur object is the same in all samples
One way to do this is to look for known colors, shapes, and sizes.
You could start by performing an HSV threshold on your image, by converting your image to HSV colorspace and then calling
cv2.inRange(source, (minHue, minSat, minVal), (maxHue, maxSat, maxVal))
Next, you could use cv2.findContours to find all the areas in your image that meet your color requirements. Then, you could use methods such as boundingRect and contourArea to find specific attributes of the object that you want.
What you will end up with is essentially a 'pipeline' that can take a frame, and look for a shape that fits the criteria you have set. Depending on the complexity of what you want to do (you didn't say what you're looking for), this may or may not work, but I have used it with reasonable success.
GRIP is an application that allows you to threshold things in a visual way, and it will also generate Python code for you if you want. I don't really recommend using the generated code as-is because I've run into some problems that way. Here's the link to GRIP: https://github.com/WPIRoboticsProjects/GRIP
If the object you want to detect has different size in every image and also slightly varies in shape too, then I recommend you use HaarCascade of that object. If the object is very general then you can easily find haar cascade for it online. Otherwise it is not very difficult to make haar cascades(can be a littile time consuming though).
You can use this tutorial by sentdex to make HaarCascade here.
Or If you want to know how to use HaarCascades then you can get it on this link
here.
I have a large 2D array (4000x3000) saved as a numpy array which I would like to display and save while keeping the ability to look at each individual pixels.
For the display part, I currently use matplotlib imshow() function which works very well.
For the saving part, it is not clear to me how I can save this figure and preserve the information contained in all 12M pixels. I tried adjusting the figure size and the resolution (dpi) of the saved image but it is not obvious which figsize/dpi settings should be used to match the resolution of the large 2D matrix displayed. Here is an example code of what I'm doing (arr is a numpy array of shape (3000,4000)):
fig = pylab.figure(figsize=(16,12))
pylab.imshow(arr,interpolation='nearest')
fig.savefig("image.png",dpi=500)
One option would be to increase the resolution of the saved image substantially to be sure all pixels will be properly recorded but this has the significant drawback of creating an image of extremely large size (at least much larger than the 4000x3000 pixels image which is all that I would really need). It also has the disadvantage that not all pixels will be of exactly the same size.
I also had a look at the Python Image Library but it is not clear to me how it could be used for this purpose, if at all.
Any help on the subject would be much appreciated!
I think I found a solution which works fairly well. I use figimage to plot the numpy array without resampling. If you're careful in the size of the figure you create, you can keep full resolution of your matrix whatever size it has.
I figured out that figimage plots a single pixel with size 0.01 inch (this number might be system dependent) so the following code will for example save the matrix with full resolution (arr is a numpy array of shape (3000,4000)):
rows = 3000
columns = 4000
fig = pylab.figure(figsize=(columns*0.01,rows*0.01))
pylab.figimage(arr,cmap=cm.jet,origin='lower')
fig.savefig("image.png")
Two issues I still have with this options:
there is no markers indicating column/row numbers making it hard to know which pixel is which besides the ones on the edges
if you decide to interactively look at the image, it is not possible to zoom in/out
A solution that also solves the above 2 issues would be terrific, if it exists.
The OpenCV library was designed for scientific analysis of images. Consequently, it doesn't "resample" images without your explicitly asking for it. To save an image:
import cv2
cv2.imwrite('image.png', arr)
where arr is your numpy array. The saved image will be the same size as your array arr.
You didn't mention the color-model that you are using. Pngs, like jpegs, are usually 8-bit per color channel. OpenCV will support up to 16-bits per channel if you request it.
Documentation on OpenCV's imwrite is here.
I have a script to save between 8 and 12 images to a local folder. These images are always GIFs. I am looking for a python script to combine all the images in that one specific folder into one image. The combined 8-12 images would have to be scaled down, but I do not want to compromise the original quality(resolution) of the images either (ie. when zoomed in on the combined images, they would look as they did initially)
The only way I am able to do this currently is by copying each image to power point.
Is this possible with python (or any other language, but preferably python)?
As an input to the script, I would type in the path where only the images are stores (ie. C:\Documents and Settings\user\My Documents\My Pictures\BearImages)
EDIT: I downloaded ImageMagick and have been using it with the python api and from the command line. This simple command worked great for what I wanted: montage "*.gif" -tile x4 -geometry +1+1 -background none combine.gif
If you want to be able to zoom into the images, you do not want to scale them. You'll have to rely on the image viewer to do the scaling as they're being displayed - that's what PowerPoint is doing for you now.
The input images are GIF so they all contain a palette to describe which colors are in the image. If your images don't all have identical palettes, you'll need to convert them to 24-bit color before you combine them. This means that the output can't be another GIF; good options would be PNG or JPG depending on whether you can tolerate a bit of loss in the image quality.
You can use PIL to read the images, combine them, and write the result. You'll need to create a new image that is the size of the final result, and copy each of the smaller images into different parts of it.
You may want to outsource the image manipulation part to ImageMagick. It has a montage command that gets you 90% of the way there; just pass it some options and the names of the files in the directory.
Have a look at Python Imaging Library.
The handbook contains several examples on both opening files, combining them and saving the result.
The easiest thing to do is turn the images into numpy matrices, and then construct a new, much bigger numpy matrix to house all of them. Then convert the np matrix back into an image. Of course it'll be enormous, so you may want to downsample.