I need to write code for the following assignment.
First we need to understand what the vignette looks like in isolation, without the clutter of a beautiful scene. For this, a picture of a flat surface of a single colour that is evenly lit (such as a white wall or plain blue sky during mid-day) is obtained and used as a “vignette profile”. You are given the following vignette profile image.
In this image, the pixels towards the center of the image have higher RGB values (i.e.
brighter pixels) than the pixels that are away from the center. The image is also black and
white, so red, green and blue values of any given pixel are the same. You must not modify
this image.
We need to use the vignette profile image from step #1 as a filter for our normal photos
captured by our camera so that the vignette can be removed. For this you need to divide
the photo image (with vignette) by the vignette profile image. Because the pixels of the vignette image have smaller (darker) RGB values towards its edges, dividing the corresponding pixels of the original image by these small numbers makes them brighter, which cancels out the darkening caused by the vignette.
These are the hints
Hint 1: This requires you to perform your operations on your images, pixel by pixel. I.e. you cannot do it in a single step.
Hint 2: The first challenge for you will be keeping the RGB values resulting from the division within the 0-255 range for each channel, as valid RGB values are between 0 (darkest) and 255 (brightest).
Edit:
Sample code:
def runA1(picture): myFile = pickAFile() picture = makePicture(myFile)
myFile2 = pickAFile()
picture2 = makePicture(myFile2)
for x in range(0,getWidth(picture)):
for y in range(0,getHeight(picture)):
px = getPixel(picture,x,y)
color = getColor(px)
color = makeLighter(color)
setColor(px,color)
for x in range(2,getWidth(picture)):
for y in range(2,getHeight(picture)):
px = getPixel(picture,x,y)
color = getColor(px)
color = makeDarker(color)
setColor(px,color)
show(picture2)
Since you haven't demonstrated that you've tried anything on your own, I'm only going to give hints as to what you should try.
Consider a single pixel represented as three integers, 0-255, in the form (R,G,B). The corresponding pixel from the vignette mask has value A, again 0-255. Divide (R,G,B) by A and multiply by 255 to get the un-vignetted pixel (RR,GG,BB). (Why do we need to multiply by 255?)
Decide what you want to do about R,G,B values exceeding 255. What happens if the vignette value A is zero?
Do this for each pixel in the image, starting with the top row of pixels working left to right, then the next row down, and so on until you're done.
Incidentally, this kind of thing is a one-step operation in a language with first-class numerical matrix support - say MATLAB, Octave, Numpy/Scipy. Here's a MATLAB example:
processed_image = original_image ./ repmat(vignette_image,[1 1 3]) * 255
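For comparison, roughly the same one-step operation in NumPy might look like the sketch below (the file names are made up, the profile is assumed to be a single-channel image of the same size as the photo, and zero pixels in the profile are guarded against):

import numpy as np
from PIL import Image

photo = np.array(Image.open('photo.jpg'), dtype=np.float64)                      # H x W x 3
vignette = np.array(Image.open('vignette.jpg').convert('L'), dtype=np.float64)   # H x W
vignette = np.maximum(vignette, 1)                  # avoid division by zero at the edges

corrected = photo / vignette[..., np.newaxis] * 255          # divide by the profile, rescale
corrected = np.clip(corrected, 0, 255).astype(np.uint8)      # keep each channel in 0-255
Image.fromarray(corrected).save('corrected.jpg')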
Edit 2:
Some comments on your sample code:
Your indentation is wrong - this code will not run. Maybe this got mangled when you pasted it into StackOverflow. Please fix it. In particular,
the def statement has to be on a line by its own.
myFile2 = ... has to be indented inside the def statement.
myFile, myFile2 - these variable names could be more meaningful. (Which of these is the original photo? Which is the vignette mask? You could try calling these variables original_file and vignette_file instead. Ditto for picture and picture2.)
Where are the comments in your code? It's hard to tell what your code does.
Apart from this, you need to post more code. Your example needs to be a Short, Self-Contained, Correct Example. Right now your code example is not self-contained, because to run it we would also need the getPixel(), getColor(), makeLighter(), etc. functions. It's also not runnable because of the indentation errors.
I have recently started studying steganography and I've come across a problem that I just don't seem to understand. Basically, the image is a png which contains a hidden flag in it.
When you extract the bit planes from the image, you can see that there's an image in the blue and green planes that you can see in the red one. To reveal the flag in clear text, you have to remove those images from the red one by XORing the LSB or something. I am not totally sure.
This is what the image in the red plane looks like if you don't remove the others.
My question is how do I go about doing this kind of thing? This is the image in question.
Actually the hidden image is in the lowest 3 bit planes. Doing a full bit decomposition makes that clear.
Start by loading the image to a numpy array, which will have dimensions MxNx3.
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
img = Image.open('stego.png')
data = np.array(img)
All you have to do now is XOR each colour plane with another and then keep the 3 least significant bits (lsb).
extracted = (data[...,0] ^ data[...,1] ^ data[...,2]) & 0x07
plt.imshow(extracted)
plt.show()
In case it wasn't obvious, the & 0x07 part is an AND operation with the binary number 00000111, just written in hexadecimal for conciseness.
If you don't keep all 3 lsb, then you'll either be missing some letters in the solution, or everything will be there but some edges won't be as smooth. Of the two, the missing letters are the real problem; the rough edges are only cosmetic.
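For completeness, the full bit decomposition mentioned at the start can be visualised along these lines (a sketch assuming the same 'stego.png' file):

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

data = np.array(Image.open('stego.png'))

# Rows are the R, G, B channels; columns are bit positions from LSB (0) to MSB (7).
fig, axes = plt.subplots(3, 8, figsize=(16, 6))
for ch in range(3):
    for bit in range(8):
        plane = (data[..., ch] >> bit) & 1
        axes[ch, bit].imshow(plane, cmap='gray')
        axes[ch, bit].set_title(f'ch {ch}, bit {bit}', fontsize=8)
        axes[ch, bit].axis('off')
plt.tight_layout()
plt.show()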
I'm currently trying to start with an original RGB image, convert it to LUV, perform some operations (namely, rotate the hues), then convert it back to RGB for display purposes. However, I'm encountering a vexing issue where the RGB-to-LUV conversion (and vice versa) seems to change the image. Specifically, if I begin with an LUV image, convert it to RGB, and then convert it back to LUV, without changing anything else, the result differs from the original. This has happened with both the Python (cv2) and Matlab (open source) implementations of the color conversion algorithms, as well as with my own hand-coded implementations. Here is an example:
luv1 = np.array([[[100,6.12,0]]]).astype('float32')
rgb1 = cv2.cvtColor(luv1,cv2.COLOR_Luv2RGB)
luv2 = cv2.cvtColor(rgb1,cv2.COLOR_RGB2Luv)
print(luv2)
[[[99.36293 1.3064307 -1.0494182]]]
As you can see, the LUV coordinates have changed from the input. Is this because certain LUV coordinates have no direct match in RGB space?
Yes, remove the astype('uint8') bit in your code, and the difference should disappear if the conversion is implemented correctly.
You can see the equations for the conversion on Wikipedia. Nothing there is irreversible; the conversions are perfect inverses of each other.
However, this conversion contains a 3rd power, which does stretch some values significantly. The rounding of the conversion to an integer can introduce a significant shift of color.
Also, the Luv domain is highly irregular, and it might not be easy to verify that a given set of Luv values leads to a valid RGB value. Your statement "I've verified that luv1 has entries that all fall in the allowable input ranges" makes me believe that you think the Luv domain is a box. It is not. The ranges for u and v change with L. One good exercise is to start with a sampling of the RGB cube, map those samples to Luv, and then plot the points to see the shape of the Luv domain. Wikipedia has an example of what this looks like for the sRGB gamut.
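A minimal version of that exercise with OpenCV and matplotlib could look like this (a sketch; the grid resolution is arbitrary):

import numpy as np
import cv2
import matplotlib.pyplot as plt

# Sample the RGB cube on a coarse grid, as float32 in [0,1] so cvtColor
# uses the floating-point conversion path.
vals = np.linspace(0, 1, 11, dtype=np.float32)
r, g, b = np.meshgrid(vals, vals, vals, indexing='ij')
rgb = np.stack([r, g, b], axis=-1).reshape(1, -1, 3)

luv = cv2.cvtColor(rgb, cv2.COLOR_RGB2Luv).reshape(-1, 3)

# Plot the (u, v) footprint of the cube, coloured by L: the valid u/v
# ranges clearly depend on L, so the domain is not a box.
plt.scatter(luv[:, 1], luv[:, 2], c=luv[:, 0], s=4)
plt.xlabel('u')
plt.ylabel('v')
plt.colorbar(label='L')
plt.show()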
The OpenCV cvtColor function will clamp RGB values to the [0,1] range (if of type float32), leading to irreversible changes of color if the input is out of gamut. That is most likely what happens with the (100, 6.12, 0) sample above: L = 100 corresponds to full white, and the only RGB value with that lightness is neutral (u = v = 0), so any non-zero chroma at L = 100 falls outside the gamut and gets clipped.
Here is an example that shows that the conversion is reversible. I start with RGB values because these are easy to verify as valid:
import numpy as np
import cv2
rgb1 = np.array([[[1.0,1.0,1.0],[0.5,1.0,0.5],[0.0,0.5,0.5],[0.0,0.0,0.0]]], 'float32')
luv1 = cv2.cvtColor(rgb1, cv2.COLOR_RGB2Luv)
rgb2 = cv2.cvtColor(luv1, cv2.COLOR_Luv2RGB)
np.max(np.abs(rgb2-rgb1))
This returns 2.8897537e-06, which is within numerical precision for 32-bit floats.
My question is not too far off from the "Image Alignment (ECC) in OpenCV ( C++ / Python )" article.
I also found the following article about facial alignment to be very interesting, but WAY more complex than my problem.
Wow! I can really go down the rabbit-hole.
My question is WAY more simple.
I have a scanned document that I have treated as a "template". In this template I have manually mapped the pixel regions that I require info from as:
area = (x1,y1,x2,y2)
such that x1<x2, y1<y2.
Now, these regions are, as is likely obvious, a bit too specific to my "template".
All other files that I want to extract data from are mostly shifted by some unknown amount such that their true area for my desired data is:
area = (x1 + ε1, y1 + ε2, x2 + ε1, y2 + ε2)
Where ε1, ε2 are unknown in advance.
But the documents are otherwise HIGHLY similar outside of this shift.
I want to discover, ideally through OpenCV, what translation is required (ignoring rotation/Euclidean transforms for the time being) to "align" these images so as to discover my ε values, shift my area, and parse my data directly.
I have thought about using tesseract to mine the text from the document and then parse from there, but there are check boxes, either filled or empty, that carry meaningful information for my problem.
The code I currently have for cropping the image is:
from PIL import Image
img = Image.open(img_path)
area = area_lookup['key']
cropped_img = img.crop(area)
cropped_img.show()
My two sample images are attached.
We can assume my first image is my "template".
As you can see, the two images are very "similar" but one is moved slightly (human error). There may be cases where the rotation is more extreme, or the image is shifted more.
I would like to transform image 2 to be as aligned to image 1 as possible, and then parse data from it.
Any help would be sincerely appreciated.
Thank you very much
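For what it's worth, one lightweight way to estimate a pure translation in OpenCV is phase correlation; a rough sketch follows (file names and coordinates are made up, and the sign convention should be checked once on a scan with a known offset):

import cv2
import numpy as np

# Hypothetical file names; both scans are assumed to have the same pixel dimensions.
template = cv2.imread('template.png', cv2.IMREAD_GRAYSCALE)
scan = cv2.imread('shifted_scan.png', cv2.IMREAD_GRAYSCALE)

# Phase correlation estimates the translational shift between the two images.
(dx, dy), response = cv2.phaseCorrelate(np.float32(template), np.float32(scan))

# Offset the template's crop box by the recovered shift before cropping the scan.
x1, y1, x2, y2 = (100, 200, 400, 260)   # example template coordinates
area = (int(round(x1 + dx)), int(round(y1 + dy)),
        int(round(x2 + dx)), int(round(y2 + dy)))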
I'm using PIL to transform a portion of the screen perspectively.
The original image-data is a pygame Surface which needs to be converted to a PIL Image.
Therefore I found the tostring-function of pygame which exists for that purpose.
However the result looks pretty odd (see attached screenshot). What is going wrong with this code:
rImage = pygame.Surface((1024,768))
#draw something to the Surface
sprite = pygame.sprite.RenderPlain((playboard,))
sprite.draw(rImage)
pil_string_image = pygame.image.tostring(rImage, "RGBA",False)
pil_image = Image.fromstring("RGBA",(660,660),pil_string_image)
What am I doing wrong?
As I noted in a comment, pygame documentation
for pygame.image.fromstring(string, size, format, flipped=False) says “The size and format image must compute the exact same size as the passed string buffer. Otherwise an exception will be raised”. Thus, using (1024,768) in place of (660,660), or vice versa – in general, the same dimensions for the two calls – is more likely to work. (I say “more likely to work” instead of “will work” because I didn't test any cases.)
The reason for suspecting a problem like this: the strange look of part of the image resembles a display screen set to a raster rate it can't synchronize to, i.e. lines of the image start displaying at points other than the left margin; in this case because the image lines are longer than the display lines. I'm assuming the snowflakes are sprites, generated separately from the distorted image.
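A corrected version of that conversion might look like the following sketch (Image.frombytes is the current Pillow name for the older Image.fromstring):

import pygame
from PIL import Image

rImage = pygame.Surface((1024, 768))
# ... draw to the Surface as before ...

raw = pygame.image.tostring(rImage, "RGBA", False)
# The size passed to PIL must match the Surface the bytes came from.
pil_image = Image.frombytes("RGBA", rImage.get_size(), raw)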
I have a number of images from Chinese genealogies, and I would like to be able to programmatically categorize them. Generally speaking, one type of image has primarily line-by-line text, while the other type may be in a grid or chart format.
Example photos
'Desired' type: http://www.flickr.com/photos/63588871#N05/8138563082/
'Other' type: http://www.flickr.com/photos/63588871#N05/8138561342/in/photostream/
Question: Is there a (relatively) simple way to do this? I have experience with Python, but little knowledge of image processing. Direction to other resources is appreciated as well.
Thanks!
Assuming that at least some of the grid lines are exactly or almost exactly vertical, a fairly simple approach might work.
I used PIL to find all the columns in the image where more than half of the pixels were darker than some threshold value.
Code
from PIL import Image, ImageDraw  # PIL/Pillow modules

withlines = Image.open('withgrid.jpg')
nolines = Image.open('nogrid.jpg')

def findlines(image):
    w, h = image.size
    s = w * h
    im = image.point(lambda i: 255 * (i < 60))  # threshold
    d = im.getdata()                            # faster than per-pixel operations
    linecolumns = []
    for col in range(w):
        # count the black pixels in this column
        black = sum(d[x] for x in range(col, s, w)) // 255
        if black > 450:
            linecolumns += [col]
    # return an image showing the detected lines
    im2 = image.convert('RGB')
    draw = ImageDraw.Draw(im2)
    for col in linecolumns:
        draw.line((col, 0, col, h - 1), fill='#f00', width=1)
    return im2

findlines(withlines).show()
findlines(nolines).show()
Results
Detected vertical lines are shown in red for illustration.
As you can see, four of the grid lines are detected, and with some processing to ignore the left and right sides and the center of the book, there should be no false positives on the desired type.
This means that you could use the above code to detect black columns and discard those that are near the edges or the center of the book. If any black columns remain, classify the image as the "other", undesired type of picture.
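A rough sketch of that rule, reusing the column-detection idea and the withlines image from the code above (the margin fraction and the black-pixel threshold are guesses and would need tuning on the real scans):

def has_grid(image, margin=0.15):
    w, h = image.size
    s = w * h
    d = image.point(lambda i: 255 * (i < 60)).getdata()
    # columns with a substantial number of black pixels
    columns = [col for col in range(w)
               if sum(d[x] for x in range(col, s, w)) // 255 > 450]
    # discard columns near the left/right edges or the center of the book
    keep = [c for c in columns
            if margin * w < c < (1 - margin) * w and abs(c - w / 2) > margin * w]
    return len(keep) > 0

print('grid/chart page' if has_grid(withlines) else 'line-by-line text page')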
AFAIK, there is no easy way to solve this. You will need a decent amount of image processing and some basic machine learning to classify these kinds of images (and even then it probably won't be 100% successful).
Another note:
While this can be solved using only machine learning techniques, I would advise you to start by looking into some image processing techniques and try to convert your images into a form that shows a clear difference between the two types. A good starting point for that is reading about the FFT. After that, have a look at some digital image processing techniques. When you feel comfortable that you have a decent understanding of these, you can read up on pattern recognition.
This is only one suggested approach though, there are more ways to achieve this.
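As a tiny illustration of the FFT suggestion: long vertical grid lines concentrate spectral energy along the horizontal frequency axis of the page's 2D spectrum, so comparing that energy between the two page types is one possible crude feature. A sketch, reusing the 'withgrid.jpg' file name from the other answer:

import numpy as np
from PIL import Image

page = np.asarray(Image.open('withgrid.jpg').convert('L'), dtype=float)
spectrum = np.abs(np.fft.fftshift(np.fft.fft2(page)))

# Energy along the zero-vertical-frequency row, excluding the DC term.
h, w = spectrum.shape
horizontal_energy = spectrum[h // 2, :].sum() - spectrum[h // 2, w // 2]
print(horizontal_energy)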