I want to create a collage of images on a defined area. Since I have a large number of images, I am looking for an algorithm that can solve this problem. The goal of this algorithm should be to maximize the area that is covered by the images.
There are also two rules that should be met:
1.) It is allowed to resize the images, but only proportionally (avoid 'squeezing'; lock the aspect ratio).
2.) There must be a maximum and minimum height and width (to avoid some photos being disproportionately big compared to others, and to prevent the algorithm from shrinking a photo to a size where you can't see the image anymore).
I also have two (optional) goals that should be addressed by the algorithm:
3.) The images should have as much contact with the borders as possible.
4.) I am not able to define the second goal algorithmically, so please excuse my loose language here: the algorithm should try to create a 'pretty' distribution of the images. For example, one could agree that the second collage looks prettier than the first one, because there is a more harmonious ratio between the number of uncovered-area shapes and their sizes. Also, in contrast to the first example, the uncovered-area shapes in the second example take the form of rectangles, which makes the whole image look 'calmer'.
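To make rules 1.) and 2.) concrete, here is a minimal sketch of one naive baseline: a shelf-packing pass with proportional scaling. It is not the requested algorithm (it ignores goals 3.) and 4.) entirely), and the function name, the fixed row height, and the size limit are all assumptions:

```python
def shelf_collage(sizes, area_w, area_h, row_h=200, max_w=400):
    """Naive shelf-packing baseline (hypothetical helper).
    sizes: list of (w, h) pixel sizes; returns a list of (index, x, y, w, h).
    Each image is scaled proportionally so its height equals row_h (rule 1);
    images whose scaled width would exceed max_w are skipped (rule 2)."""
    placements, x, y = [], 0.0, 0.0
    for i, (w, h) in enumerate(sizes):
        nw = w * (row_h / h)          # proportional scaling only, no squeezing
        if nw > max_w:                # would violate the maximum-width rule
            continue
        if x + nw > area_w:           # current row is full: start a new shelf
            x, y = 0.0, y + row_h
        if y + row_h > area_h:        # collage area exhausted
            break
        placements.append((i, x, y, nw, row_h))
        x += nw
    return placements
```

A real solution would vary the row heights and reorder the images to shrink the leftover gaps, which is where the coverage maximization and the aesthetic goals come in.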
I have a program where I get one image as input and have to compare it with ~640 known images to find the most similar one. To do this I was thinking of using OpenCV's matchTemplate, as it seems very fast and effective at doing what I want.
I noticed that matching two 400x240 px images 1000 times in a loop is much slower than matching a 400x240 px template against a single 1400x240 px image, despite both amounting to roughly 1000 match evaluations.
My idea was to combine the 640 images into one big image, arranging them in a grid (easy to do, since they all have the same size).
Doing this, I could really speed up the process if I could restrict matchTemplate to only some of the sub-images of the big combined reference (the ones with their top-left corner on a grid position, i.e. the sub-images that are actually one of the 640 images I combined into the big reference image), but there doesn't seem to be a way to tell matchTemplate to match only at a specified set of positions.
How could I go about speeding up this process? Is there a different library (I'm working with Python) that does something similar to matchTemplate but lets me specify which sub-images to match? Is there an entirely different approach that better suits my goal?
EDIT:
Basically, what I have to do is this: I'm taking a screenshot from a video game, and in the screenshot there's a clean portrait of some character (400x240 px). I have clean portraits of all the characters (80 of them, 8 skins each, for a total of 640 portraits) and I want to find which portrait is closest to the one in the screenshot so that I can identify the character being played. It would be awesome if I could also "mask" some pixels, as with matchTemplate from OpenCV, because some specific parts of the 400x240 px rectangle change from player to player, and I'd rather not have to account for those and could just mask out those (known) pixels.
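One observation that may help: since the screenshot portrait and all 640 references have exactly the same size, a sliding template search is unnecessary; a direct, masked pixel comparison against every reference at once is enough. Below is a minimal sketch of that idea; the file paths, the mask file, and the squared-difference score are all assumptions:

```python
import glob
import cv2
import numpy as np

# Load all 640 reference portraits into one (640, H, W, 3) array.
# "portraits/*.png" and the other file names are placeholders.
refs = np.stack([cv2.imread(p) for p in sorted(glob.glob("portraits/*.png"))])
query = cv2.imread("screenshot_portrait.png")

# Boolean mask of pixels to compare; False = player-specific pixels to ignore.
mask = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE) > 0

# Broadcast a masked sum of squared differences against all references at once.
diff = (refs.astype(np.int32) - query.astype(np.int32)) ** 2
scores = diff[:, mask].sum(axis=(1, 2))

best = int(np.argmin(scores))
print("closest portrait index:", best)
```

This avoids matchTemplate entirely; cv2.matchTemplate with TM_SQDIFF on same-size images computes essentially the same score, just one reference at a time.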
My goal is to draw a rectangular border around the face by removing the neck area connected to the face region. All positive values here represent skin-color pixels. So far, I have produced the binary image below by filtering with OpenCV and Python. Code so far: skinid.py
Below is the test image.
Noise removal has also been applied to this binary image.
Up to this point, I followed this paper: Face segmentation using skin-color map in videophone applications. For most of it, I used custom functions rather than built-in OpenCV functions because I wanted to do it from scratch (although some erosion, opening, and closing operations were used to tune things up).
I want to know a way to split the neck from the whole face area and remove it like this,
as I am quite new to the whole image processing area.
Perform a distance transform (built into OpenCV, or you could write it by hand; it's a pretty fun and easy one to implement by applying the erode function iteratively and adding the result into another matrix each round: slow, but conceptually easy).

On the binary image you presented above, the highest value in the distance transform (and, honestly, I think this generalizes pretty well across mug shots) will be at the center of the face. That pixel is the center of your box, and its value after the distance transform gives you a pretty solid approximation of the face size, since it is the pixel distance from the center of the face to the horizontal edges of the face.

Depending on what you are after, you may just be able to multiply that distance by, say, 1.5 (work out a standard face width-to-height ratio to choose your best multiplier), set that as your circle radius (or half the side length of a box), and call it a day. Comment if you need anything clarified; I am pretty confident in this answer and would be happy to write up some quick code (in C++ OpenCV) if it would help.
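In the meantime, here is a rough Python sketch of the idea (the answer offers C++, but the question uses Python); the file names and the 1.5 width-to-height multiplier are assumptions taken from the text:

```python
import cv2

# Load the binary skin mask (white = skin). "skin_mask.png" is a placeholder.
mask = cv2.imread("skin_mask.png", cv2.IMREAD_GRAYSCALE)

# Distance transform: each pixel's value is its distance to the nearest
# zero (background) pixel.
dist = cv2.distanceTransform(mask, cv2.DIST_L2, 5)

# The maximum of the distance transform sits at the center of the face;
# its value approximates the half-width of the face.
_, max_val, _, (cx, cy) = cv2.minMaxLoc(dist)
half_w = int(max_val)
half_h = int(max_val * 1.5)   # assumed width-to-height multiplier from the answer

out = cv2.cvtColor(mask, cv2.COLOR_GRAY2BGR)
cv2.rectangle(out, (cx - half_w, cy - half_h), (cx + half_w, cy + half_h),
              (0, 0, 255), 2)
cv2.imwrite("face_box.png", out)
```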
(Alternative idea) You could tweak your color filter a bit to reject darker areas; at least in the image presented, this will create a nice separation between face and neck thanks to the shadowing of the chin. (You may have to dial back your dilate/closing operations, though.)
I am relatively new to Python. I would like to make some string-art portraits. I was watching this video, which really intrigued me:
https://youtu.be/RSRNZaq30W0?t=56
I understand that to achieve this, I would first need to load the image, then do some edge detection, and then use some form of Delaunay triangulation, but I have no idea where to even start.
I looked up some sample code for OpenCV and figured out how to do basic edge-detection. How do I then convert those to points? And then what sort of algorithm would I need to "fill in" the different gradients?
I don't even know if this is the right approach to achieve this. Could someone please point me in the right direction and perhaps give me some sample code to get started? I would really appreciate it.
Edge detection or triangulation is less important in this application. The core part is to understand the pseudo-code at 1:27 of the video. The final product uses a single string that wraps around different nails in a particular way, so that darker areas in the original image end up with less string density, and brighter areas have more strings crossing over.
The initial preparation is to:
generate an edge-detection version of the image (A)
generate a blurred version of the image (B)
Then the first step is to create random positions for the nails. Apparently, to achieve a good outcome, if a randomly generated nail is close enough to an 'edge' of the black-and-white image, you should 'snap' it to the edge, so that later the strings wrapping around these edge nails will create an accurate boundary, just like in the original picture. Here you use image (A) to adjust your nails. For example, perform some potential minimization (a rough sketch follows the list below):
1) Add a small random position change to the nails. If a nail now gets close enough to a white point (an edge) in image (A), move it directly to that position.
2) Compute the potential. Make sure your potential function penalizes two points that come too close. Repeat 1) 100 times and pick the configuration with the lowest potential.
3) Iterate 1) and 2) 20 times.
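Here is a rough sketch of that minimization in Python/NumPy; the function name, the perturbation size, the snap distance, and the inverse-distance potential are all my own assumptions, not the video's:

```python
import numpy as np

def place_nails(edge_img, n_nails=200, snap_dist=3.0, iters=20, trials=100):
    """Perturb nails, snap them onto nearby edge pixels of image (A), and
    keep the lowest-potential configuration (hypothetical parameters)."""
    rng = np.random.default_rng()
    h, w = edge_img.shape
    edges = np.argwhere(edge_img > 0).astype(float)   # (y, x) edge pixels
    nails = rng.uniform([0, 0], [h, w], size=(n_nails, 2))

    def potential(pts):
        # Repulsive potential: penalize pairs of nails that come too close.
        d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
        d[d == 0] = np.inf                 # ignore self-distances
        return np.sum(1.0 / d)

    for _ in range(iters):                 # step 3): iterate 1) and 2)
        best, best_pot = nails, potential(nails)
        for _ in range(trials):            # steps 1) and 2), repeated
            cand = nails + rng.normal(0, 2, size=nails.shape)
            cand = np.clip(cand, [0, 0], [h - 1, w - 1])
            for i, p in enumerate(cand):   # snap to a close-enough edge pixel
                dists = np.linalg.norm(edges - p, axis=1)
                j = np.argmin(dists)
                if dists[j] < snap_dist:
                    cand[i] = edges[j]
            pot = potential(cand)
            if pot < best_pot:
                best, best_pot = cand, pot
        nails = best
    return nails
```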
Next, you decide how you want the string to wrap around the nails.
Starting from a nail A, look at some neighboring nails (within a certain radius) B1, B2, B3, etc. Imagine attaching a string of a certain width from A to Bi; it visually changes your string image P in a slight way. Render line segment A-B1 on P to get P1, render A-B2 on P to get P2, etc.
Find the best Bi so that the new image Pi looks closest to the original. You can just do a pixel-wise comparison between the string image and the original picture, and use this measurement to score each Bi. The video author used the blurred image (B) to get rid of textures that might randomly throw off the scoring.
Now the optimal Bi becomes the new A. Find its neighbors and loop. The algorithm can stop once adding any new string only worsens the score.
There are cases where bright areas in a photo are widely separated, so any white string crossing the dark gap would only decrease the score. Use your judgment to tweak the algorithm to work around those non-convex scenarios.
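A compact sketch of that greedy loop, assuming white string on a dark background, a squared-difference score against the blurred target (B), and an additive line brightness; all parameter values are placeholders:

```python
import cv2
import numpy as np

def greedy_strings(target, nails, max_lines=2000, radius=150.0, line_val=60.0):
    """target: blurred grayscale image (B), bright = more string wanted.
    nails: (N, 2) array of (y, x) nail positions."""
    target = target.astype(np.float32)
    canvas = np.zeros_like(target)
    a = 0                                      # start from an arbitrary nail
    for _ in range(max_lines):
        best_b, best_canvas = None, None
        best_err = np.sum((target - canvas) ** 2)
        dists = np.linalg.norm(nails - nails[a], axis=1)
        for b in np.nonzero((dists > 0) & (dists < radius))[0]:
            # Render the candidate string A-Bi additively, then score it.
            line = np.zeros_like(canvas)
            p1 = (int(nails[a][1]), int(nails[a][0]))   # cv2 uses (x, y)
            p2 = (int(nails[b][1]), int(nails[b][0]))
            cv2.line(line, p1, p2, 1.0, 1)
            trial = np.minimum(canvas + line * line_val, 255.0)
            err = np.sum((target - trial) ** 2)
            if err < best_err:
                best_b, best_err, best_canvas = b, err, trial
        if best_b is None:                     # no string improves the score
            break
        canvas, a = best_canvas, best_b        # the optimal Bi becomes the new A
    return canvas.astype(np.uint8)
```

Note this stops at the first local optimum; the non-convex cases mentioned above may need a restart from a different nail or an occasional score-worsening move.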
I am trying to identify the state of a valve (on or off). My approach is to take two images, one of each state, and compare the current image with those two to see which state it belongs to.
I have tried comparing via new_image - on_image and new_image - off_image, then counting the number of differing pixels. It works, but I feel like in some cases it might not, and there must be a better way to do a simple classification like this.
Any references or ideas?
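For reference, the pixel-count comparison described above fits in a few lines (a sketch; the threshold value is arbitrary):

```python
import cv2
import numpy as np

def diff_count(a, b, thresh=30):
    """Count pixels that differ noticeably between two grayscale images."""
    d = cv2.absdiff(a, b)
    return int(np.count_nonzero(d > thresh))

# The valve is in whichever state's reference image differs least:
# state = "on" if diff_count(new_image, on_image) < diff_count(new_image, off_image) else "off"
```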
Subtracting pixels might not be very robust if your camera position changes slightly. If you don't shy away from using OpenCV, there is an interesting recipe for finding a predefined object in a picture:
Feature Matching + Homography to find Objects
You could cut out the lever from your image and search for it in every new image. From the coordinates, and especially the rotation, you can determine the status of the valve. This might even work in the awkward cases where someone has half opened (or, for pessimists, half closed) the valve, or where the lever is partially covered.
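A minimal sketch of that recipe using ORB features (the file names and parameter values are assumptions; the OpenCV tutorial uses SIFT with a FLANN matcher, but the idea is the same):

```python
import cv2
import numpy as np

# "lever.png" (the cut-out lever) and "current.png" are placeholder names.
lever = cv2.imread("lever.png", cv2.IMREAD_GRAYSCALE)
scene = cv2.imread("current.png", cv2.IMREAD_GRAYSCALE)

# Detect and match ORB features between the template and the new image.
orb = cv2.ORB_create(1000)
kp1, des1 = orb.detectAndCompute(lever, None)
kp2, des2 = orb.detectAndCompute(scene, None)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:50]

# Estimate the homography mapping the lever template into the scene.
src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# For a roughly rigid motion, the lever's rotation can be read off the
# homography's upper-left 2x2 block.
angle = np.degrees(np.arctan2(H[1, 0], H[0, 0]))
print("estimated lever rotation (degrees):", angle)
```

Thresholding that angle (e.g. near 0 degrees = open, near 90 = closed; the mapping depends on your setup) then gives the valve state.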
This is something that I'm sure there must be a package out there for, but for the life of me I cannot find it.
I've trained a CNN on fixed-size images to look for heads, and now I want to give it a larger image to search within for heads. The way this is typically done is with a sliding window: a sub-image is cut from the larger image, and the bounding box of that sub-image is slid across the larger image, often with significant overlap (maybe 50%). Also, since the thing I'm looking for might be bigger or smaller than in the training data, I need to start with a huge bounding box (say, 3 times the size of the training images), scale it down to the training size, slide it over the image, and then try again with something 2.5x the size, then 2x, 1.5x, 1x, 0.75x, 0.5x, 0.25x, etc.
It's not too complicated for me to write on my own, but my implementation would be slow and messy. There must be a Python package that does this. What's it called?!
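In case it helps, here is a minimal generator for exactly that multi-scale sliding window, with roughly 50% overlap per scale; the function name and default values are placeholders, not from any particular package:

```python
import cv2

def pyramid_windows(img, win=64,
                    scales=(3.0, 2.5, 2.0, 1.5, 1.0, 0.75, 0.5, 0.25)):
    """Yield (x, y, scale, patch) for a multi-scale sliding window.
    Each patch is resized to win x win, the CNN's training input size."""
    h, w = img.shape[:2]
    for s in scales:
        box = int(win * s)                # window size in the original image
        if box > min(h, w) or box < 1:
            continue
        step = max(1, box // 2)           # ~50% overlap between windows
        for y in range(0, h - box + 1, step):
            for x in range(0, w - box + 1, step):
                patch = cv2.resize(img[y:y + box, x:x + box], (win, win))
                yield x, y, s, patch
```

Each yielded patch can then be pushed through the CNN; for speed, batching the patches matters far more than the window loop itself.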