How do you insert invisible watermarks in images for copyright purposes? I'm looking for a Python library.
What algorithm do you use? What about performance and efficiency?
You might want to look into steganography, that is, hiding data inside images. There are forms that won't get lost even if you convert the image to a lossier format or crop parts of it out.
I'm looking for "unbreakable" watermarks, so data stored in exif or image metadata are out.
I have found some interesting stuff on the web while waiting for replies here:
http://www.cosy.sbg.ac.at/~pmeerw/Watermarking/
There is a master's thesis that is fairly exhaustive about algorithms and their characteristics (what they do and how unbreakable they are). I haven't had time to read it in depth, but this stuff looks serious. There are algorithms that support JPEG compression, cropping, gamma correction or downscaling in some way. It's C, but I can port it to Python or use C libraries from Python.
However, it's from 2001, and I guess seven years is a long time in this field :( Does anybody have some similar and more recent stuff?
I use the following code. It requires PIL:
from PIL import Image, ImageEnhance

def reduceOpacity(im, opacity):
    """Returns an image with reduced opacity."""
    assert 0 <= opacity <= 1
    if im.mode != 'RGBA':
        im = im.convert('RGBA')
    else:
        im = im.copy()
    alpha = im.split()[3]
    alpha = ImageEnhance.Brightness(alpha).enhance(opacity)
    im.putalpha(alpha)
    return im

def watermark(im, mark, position, opacity=1):
    """Adds a watermark to an image."""
    if opacity < 1:
        mark = reduceOpacity(mark, opacity)
    if im.mode != 'RGBA':
        im = im.convert('RGBA')
    # create a transparent layer the size of the image and draw the
    # watermark in that layer.
    layer = Image.new('RGBA', im.size, (0, 0, 0, 0))
    if position == 'tile':
        for y in range(0, im.size[1], mark.size[1]):
            for x in range(0, im.size[0], mark.size[0]):
                layer.paste(mark, (x, y))
    elif position == 'scale':
        # scale, but preserve the aspect ratio
        ratio = min(float(im.size[0]) / mark.size[0], float(im.size[1]) / mark.size[1])
        w = int(mark.size[0] * ratio)
        h = int(mark.size[1] * ratio)
        mark = mark.resize((w, h))
        layer.paste(mark, ((im.size[0] - w) // 2, (im.size[1] - h) // 2))  # integer division so paste gets ints
    else:
        layer.paste(mark, position)
    # composite the watermark with the layer
    return Image.composite(layer, im, layer)

img = Image.open('/path/to/image/to/be/watermarked.jpg')
mark1 = Image.open('/path/to/watermark1.png')
mark2 = Image.open('/path/to/watermark2.png')
img = watermark(img, mark1, (img.size[0] - mark1.size[0] - 5, img.size[1] - mark1.size[1] - 5), 0.5)
img = watermark(img, mark2, 'scale', 0.01)
The watermark is too faint to see. Only a solid color image would really show it. I can use it to create an image that doesn't show a watermark, but if I do a bit-by-bit subtraction using the original image, I can demonstrate that my watermark is there.
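For instance, here is a minimal sketch of that subtraction check using PIL's ImageChops (the filenames are hypothetical):
from PIL import Image, ImageChops

original = Image.open('original.jpg').convert('RGB')
marked = Image.open('watermarked.jpg').convert('RGB')
diff = ImageChops.difference(marked, original)  # per-pixel absolute difference
# the 1%-opacity mark is only a few levels above zero, so amplify it
diff.point(lambda v: min(255, v * 64)).show()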
If you want to see how it works, go to TylerGriffinPhotography.com. Each image on the site is watermarked twice: once with the watermark in the lower right corner at 50% opacity (5px from the edge), and once over the whole image at 1% opacity (using "scale", which scales the watermark to the whole image). Can you figure out what the second, low opacity watermark shape is?
If you're talking about steganography, here's an old, not-too-fancy module I did for a friend once (Python 2.x code):
The code
from __future__ import division
import math, os, array, random
import itertools as it
import Image as I
import sys

def encode(txtfn, imgfn):
    with open(txtfn, "rb") as ifp:
        txtdata = ifp.read()
    txtdata = txtdata.encode('zip')
    img = I.open(imgfn).convert("RGB")
    pixelcount = img.size[0]*img.size[1]
##    sys.stderr.write("image %dx%d\n" % img.size)
    factor = len(txtdata) / pixelcount
    width = int(math.ceil(img.size[0]*factor**.5))
    height = int(math.ceil(img.size[1]*factor**.5))
    pixelcount = width * height
    if pixelcount < len(txtdata): # just a sanity check
        sys.stderr.write("phase 2, %d bytes in %d pixels?\n" % (len(txtdata), pixelcount))
        sys.exit(1)
##    sys.stderr.write("%d bytes in %d pixels (%dx%d)\n" % (len(txtdata), pixelcount, width, height))
    img = img.resize((width, height), I.ANTIALIAS)
    txtarr = array.array('B')
    txtarr.fromstring(txtdata)
    # pad with random bytes so there is exactly one byte per pixel
    txtarr.extend(random.randrange(256) for x in xrange(pixelcount - len(txtdata)))
    newimg = img.copy()
    newimg.putdata([
        (
            r & 0xf8 | (c & 0xe0) >> 5,
            g & 0xfc | (c & 0x18) >> 3,
            b & 0xf8 | (c & 0x07),
        )
        for (r, g, b), c in it.izip(img.getdata(), txtarr)])
    newimg.save(os.path.splitext(imgfn)[0] + '.png', optimize=1, compression=9)

def decode(imgfn, txtfn):
    img = I.open(imgfn)
    with open(txtfn, 'wb') as ofp:
        arrdata = array.array('B',
            ((r & 0x7) << 5 | (g & 0x3) << 3 | (b & 0x7)
             for r, g, b in img.getdata())).tostring()
        findata = arrdata.decode('zip')
        ofp.write(findata)

if __name__ == "__main__":
    if sys.argv[1] == 'e':
        encode(sys.argv[2], sys.argv[3])
    elif sys.argv[1] == 'd':
        decode(sys.argv[2], sys.argv[3])
The algorithm
It stores one byte of data per image pixel, using the 3 least-significant bits of the blue band, the 2 LSBs of the green one, and the 3 LSBs of the red one.
encode function: An input text file is compressed by zlib, and the input image is resized (keeping proportions) to ensure that there are at least as many pixels as compressed bytes. A PNG image with the same name as the input image (so don't use a ".png" filename as input if you leave the code as-is :) is saved containing the steganographic data.
decode function: The previously stored zlib-compressed data are extracted from the input image, and saved uncompressed under the provided filename.
I verified the old code still runs, so here's an example image containing steganographic data:
You'll notice that the noise added is barely visible.
Well, invisible watermarking is not that easy. Check out Digimarc and the money they have earned on it. There is no free C/Python code that some lone genius has written and left for free usage. I've implemented my own algorithm, and the name of the tool is SignMyImage. Google it if interested.
What about EXIF? It's probably not as secure as what you're thinking of, but most users don't even know it exists, and if you make it that easy to read the watermark information, those who care will still be able to do it anyway.
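For what it's worth, with a recent Pillow the standard EXIF Copyright tag (0x8298) can be written in a few lines; a hedged sketch (filenames and the notice text are made up):
from PIL import Image

img = Image.open('photo.jpg')
exif = img.getexif()
exif[0x8298] = 'Copyright 2008 Example Photography'  # EXIF "Copyright" tag
img.save('photo_tagged.jpg', exif=exif)

# reading it back is just as easy, which is the point made above
print(Image.open('photo_tagged.jpg').getexif().get(0x8298))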
I don't think there is a library that does this out of the box. If you want to implement your own, I would definitely go with the Python Imaging Library (PIL).
This is a Python Cookbook recipe that uses PIL to add a visible watermark to an image. If it's enough for your needs, you could use this to add a watermark with enough transparency that it is only visible if you know what you are looking for.
There is a newer (2005) digital watermarking FAQ at watermarkingworld.org
I was going to post an answer similar to Ugh's. I would suggest putting a small TXT file describing the image source (and perhaps a small copyright statement, if one applies) into the image in a manner that is difficult to detect and break.
I'm not sure how important it is to be unbreakable, but a simple solution might just be to append a text file to the end of the image. Something like "This image belongs to ...".
If you open the image in a viewer/browser, it looks like a normal jpeg, but if you open it in a text editor, the last line would be readable.
The same method allows you to include an actual file in an image (hide a file inside an image). I've found that it's a bit hit-or-miss, but 7-Zip files seem to work. You could hide all sorts of copyright goodies inside the image.
Again, it's not unbreakable by any stretch of the imagination, but it's completely invisible to the naked eye.
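A minimal sketch of the append trick (hypothetical filename and notice); most JPEG viewers ignore bytes after the end-of-image marker, so the file still displays normally:
notice = b"\nThis image belongs to Example Corp.\n"
with open("photo.jpg", "ab") as f:  # append in binary mode
    f.write(notice)

# read the notice back without a text editor
with open("photo.jpg", "rb") as f:
    print(f.read()[-len(notice):])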
Some image formats have headers where you can store arbitrary information as well.
For example, the PNG specification has a chunk where you can store text data. This is similar to the answers above, but without adding random data to the image data itself.
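As a sketch of that approach with Pillow's PNG support (the key and value here are made up):
from PIL import Image
from PIL.PngImagePlugin import PngInfo

img = Image.open("photo.png")
meta = PngInfo()
meta.add_text("Copyright", "2008 Example Photography")  # becomes a tEXt chunk
img.save("photo_tagged.png", pnginfo=meta)

print(Image.open("photo_tagged.png").text)  # {'Copyright': '2008 Example Photography'}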
I am trying to process an image file into something that can be displayed on a Black/White/Red e-ink display, but I am running into a problem with the output resolution.
Based on the example code for the display, it expects two arrays of bytes (one for Black/White, one for Red), each 15,000 bytes. The resolution of the e-ink display is 400x300.
I'm using the following Python script to generate two BMP files: one for Black/White and one for Red. This is all working, but the file sizes are 360,000 bytes each, which won't fit in the ESP32 memory. The input image (a PNG file) is 195,316 bytes.
The library I'm using has a function called EPD_4IN2B_V2_Display(BLACKWHITEBUFFER, REDBUFFER);, which wants the full image (one channel for BW, one for Red) to be in memory. But, with these image sizes, it won't fit on the ESP32. And, the example uses 15KB for each color channel (BW, R), so I feel like I'm missing something in the image processing necessary to make this work.
Can anyone shed some light on what I'm missing? How would I update the Python image-processing script to account for this?
I am using the Waveshare 4.2inch E-Ink display and the Waveshare ESP32 driver board. A lot of the Python code is based on this StackOverflow post but I can't seem to find the issue.
import io
import traceback
from wand.image import Image as WandImage
from PIL import Image

# This function takes as input a filename for an image.
# It resizes the image into the dimensions supported by the ePaper display,
# then remaps the image into a tri-color scheme using a palette (affinity)
# for remapping and the Floyd-Steinberg algorithm for dithering.
# It then splits the image into two component parts:
#   - a white and black image (with the red pixels removed)
#   - a white and red image (with the black pixels removed)
# It then converts these into PIL Images and returns them.
# The PIL Images can be used by the ePaper library to display.
def getImagesToDisplay(filename):
    print(filename)
    red_image = None
    black_image = None
    try:
        with WandImage(filename=filename) as img:
            img.resize(400, 300)
            with WandImage() as palette:
                with WandImage(width=1, height=1, pseudo="xc:red") as red:
                    palette.sequence.append(red)
                with WandImage(width=1, height=1, pseudo="xc:black") as black:
                    palette.sequence.append(black)
                with WandImage(width=1, height=1, pseudo="xc:white") as white:
                    palette.sequence.append(white)
                palette.concat()
                img.remap(affinity=palette, method='floyd_steinberg')
                red = img.clone()
                black = img.clone()
                red.opaque_paint(target='black', fill='white')
                black.opaque_paint(target='red', fill='white')
                red_image = Image.open(io.BytesIO(red.make_blob("bmp")))
                black_image = Image.open(io.BytesIO(black.make_blob("bmp")))
                red_bytes = io.BytesIO(red.make_blob("bmp"))
                black_bytes = io.BytesIO(black.make_blob("bmp"))
    except Exception:
        print('traceback.format_exc():\n%s' % traceback.format_exc())
    return (red_image, black_image, red_bytes, black_bytes)

if __name__ == "__main__":
    print("Running...")
    file_path = "testimage-tree.png"
    with open(file_path, "rb") as f:
        image_data = f.read()
    red_image, black_image, red_bytes, black_bytes = getImagesToDisplay(file_path)
    print("bw: ", red_bytes)
    print("red: ", black_bytes)
    black_image.save("output/bw.bmp")
    red_image.save("output/red.bmp")
    print("BW file size:", len(black_image.tobytes()))
    print("Red file size:", len(red_image.tobytes()))
if __name__ == "__main__":
print("Running...")
file_path = "testimage-tree.png"
with open(file_path, "rb") as f:
image_data = f.read()
red_image, black_image, red_bytes, black_bytes = getImagesToDisplay(file_path)
print("bw: ", red_bytes)
print("red: ", black_bytes)
black_image.save("output/bw.bmp")
red_image.save("output/red.bmp")
print("BW file size:", len(black_image.tobytes()))
print("Red file size:", len(red_image.tobytes()))
As requested, and in case it is useful for future readers, I'll expand a little on what I said in the comments (which was verified to be indeed the cause of the problem).
An e-ink display usually needs a black & white image, that is, 1 bit per pixel. Not grayscale (one byte per pixel for a single channel), and even less RGB (3 channels/bytes per pixel).
I am not familiar with bi-color red/black displays, but it seems quite logical that one behaves just like two binary displays (one black & white display and one black-white & red display) sharing the same location.
What your code seemingly does is remove all black pixels from an RGB image and use that as the red image, and remove all red pixels from the same RGB image and use that as the black image. But since those images are obtained with clone, they are still RGB images: RGB images that happen to contain only black and white pixels, or only red and white pixels, but RGB images all the same.
With PIL, it is the mode that controls how images are represented in memory and, therefore, how they are saved to file.
The relevant modes are RGB, L (grayscale, i.e. one byte per pixel), and 1 (binary, i.e. 1 bit per pixel).
So what you need is to convert to mode 1, using the .convert('1') method on both of your images.
Note that 400×300×3 (uncompressed RGB data for your image) is 360,000 bytes, which is what you got. 400×300 (mode L for the same image) would be 120,000, and 400×300/8 (mode 1, 1 bit per pixel) is 15,000, which is precisely the size you expected. So that is another confirmation that a 1-bit-per-pixel image is indeed what is wanted.
I generate a grayscale image and save it in jpg format.
import os
import numpy as np
import cv2

SCENE_WIDTH = 28
SCENE_HEIGHT = 28
# draw random noise
p, n = 0.5, SCENE_WIDTH*SCENE_HEIGHT
scene_noise = np.random.binomial(1, p, n).reshape((SCENE_WIDTH, SCENE_HEIGHT))*255
scene_noise = scene_noise.astype(np.uint8)
n = scene_noise
print('%d bytes' % (n.size * n.itemsize)) # 784 bytes
cv2.imwrite('scene_noise.jpg', scene_noise)
print('noise: ', os.path.getsize("scene_noise.jpg")) # 1549 bytes
from PIL import Image
im = Image.fromarray(scene_noise)
im.save('scene_noise2.jpg')
print('noise2: ', os.path.getsize("scene_noise2.jpg")) # 1017 bytes
when I change from:
scene_noise = np.random.binomial(1, p, n).reshape((SCENE_WIDTH, SCENE_HEIGHT))*255
to:
scene_noise = np.random.binomial(255, p, n).reshape((SCENE_WIDTH, SCENE_HEIGHT))
the size of the file decreases by almost half, to ~775 bytes.
Can you explain why the JPG file is bigger than the raw version, and why the size decreases when I change the colors from pure black and white to the full grayscale spectrum?
cv2.__version__.split(".") # ['4', '1', '2']
Two things here:
can you explain why the JPEG file is bigger than the raw version?
The sizes differ because you are not comparing the same things. The first object is a NumPy array; the second is a JPEG file. The JPEG file is bigger than the NumPy array (i.e. right after creating it with OpenCV) because JPEG encoding includes overhead that a NumPy array neither stores nor needs.
can you explain why the size decreases when I change colours from black and white to a full grayscale spectrum?
This is due to JPEG encoding. If you truly want to understand everything that happens, I highly suggest learning how JPEG encoding works, as I will not go into much detail about it (I am in no way a specialist in this topic). It is well documented in the Wikipedia JPEG article. The general idea is that the more contrast you have in your picture, the bigger the file will be. Here, a picture with only black and white pixels forces constant jumps between 0 and 255, whereas a grayscale picture usually does not show changes that large between adjacent pixels.
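A quick illustration of that contrast effect, in the spirit of the question's snippet (the arrays and filenames are made up; exact sizes will vary):
import os
import numpy as np
import cv2

noise = (np.random.binomial(1, 0.5, 28 * 28).reshape(28, 28) * 255).astype(np.uint8)
gradient = np.tile((np.arange(28) * 9).astype(np.uint8), (28, 1))  # smooth ramp

cv2.imwrite("noise.jpg", noise)
cv2.imwrite("gradient.jpg", gradient)
# the binary noise compresses far worse than the smooth gradient
print(os.path.getsize("noise.jpg"), os.path.getsize("gradient.jpg"))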
I have a problem with an FFT implementation in Python; I'm getting completely strange results.
OK so, I want to open an image, get the value of every pixel in RGB, then apply the FFT to it, and convert it back to an image.
My steps:
1) I open the image with the PIL library in Python, like this:
from PIL import Image
im = Image.open("test.png")
2) I get the pixels:
pixels = list(im.getdata())
3) I separate every pixel into its r, g, b values:
for x in range(width):
    for y in range(height):
        r, g, b = pixels[x*width+y]
        red[x][y] = r
        green[x][y] = g
        blue[x][y] = b
4) Let's assume that I have one pixel (111, 111, 111), and I apply the FFT to all the red values, like this:
red = np.fft.fft(red)
And then:
print (red[0][0], green[0][0], blue[0][0])
My output is:
(53866+0j) 111 111
It's completely wrong, I think. My image is 64x64, and the FFT from GIMP is completely different. Actually, my FFT gives me only arrays with huge values; that's why my output image is black.
Do you have any idea where the problem is?
[EDIT]
I've changed it, as suggested, to:
red= np.fft.fft2(red)
And after that I scale it
scale = 1/(width*height)
red= abs(red* scale)
And still, I'm getting only a black image.
[EDIT2]
OK, so let's take one image. Assume that I don't want to open it and save it as a grayscale image directly; I'm doing it like this:
def getGray(pixel):
    r, g, b = pixel
    return (r+g+b)/3

im = Image.open("test.png")
im.load()
pixels = list(im.getdata())
width, height = im.size
for x in range(width):
    for y in range(height):
        greyscale[x][y] = getGray(pixels[x*width+y])
data = []
for x in range(width):
    for y in range(height):
        pix = greyscale[x][y]
        data.append(pix)
img = Image.new("L", (width, height), "white")
img.putdata(data)
img.save('out.png')
After this, I'm getting this image , which is OK. So now I want to apply the FFT to my image before saving it to the new one, so I'm doing this:
scale = 1/(width*height)
greyscale = np.fft.fft2(greyscale)
greyscale = abs(greyscale * scale)
after loading it. After saving it to a file, I have . So let's now open test.png with GIMP and use the FFT filter plugin. I'm getting this image, which is correct.
How can I handle this?
Great question. I’ve never heard of it but the Gimp Fourier plugin seems really neat:
A simple plug-in to do a Fourier transform on your image. The major advantage of this plugin is to be able to work with the transformed image inside GIMP. You can then draw or apply filters in Fourier space, and get the modified image with an inverse FFT.
This idea—of doing Gimp-style manipulation on frequency-domain data and transforming back to an image—is very cool! Despite years of working with FFTs, I’ve never thought about doing this. Instead of messing with Gimp plugins and C executables and ugliness, let’s do this in Python!
Caveat. I experimented with a number of ways to do this, attempting to get something close to the output Gimp Fourier image (gray with moiré pattern) from the original input image, but I simply couldn’t. The Gimp image appears to be somewhat symmetric around the middle of the image, but it’s not flipped vertically or horizontally, nor is it transpose-symmetric. I’d expect the plugin to be using a real 2D FFT to transform an H×W image into a H×W array of real-valued data in the frequency domain, in which case there would be no symmetry (it’s just the to-complex FFT that’s conjugate-symmetric for real-valued inputs like images). So I gave up trying to reverse-engineer what the Gimp plugin is doing and looked at how I’d do this from scratch.
The code. Very simple: read an image, apply scipy.fftpack.rfft in the leading two dimensions to get the “frequency-image”, rescale to 0–255, and save.
Note how this is different from the other answers! No grayscaling—the 2D real-to-real FFT happens independently on all three channels. No abs needed: the frequency-domain image can legitimately have negative values, and if you make them positive, you can’t recover your original image. (Also a nice feature: no compromises on image size. The size of the array remains the same before and after the FFT, whether the width/height is even or odd.)
from PIL import Image
import numpy as np
import scipy.fftpack as fp

## Functions to go from image to frequency-image and back
im2freq = lambda data: fp.rfft(fp.rfft(data, axis=0), axis=1)
freq2im = lambda f: fp.irfft(fp.irfft(f, axis=1), axis=0)

## Read in data file and transform
data = np.array(Image.open('test.png'))
freq = im2freq(data)
back = freq2im(freq)
# Make sure the forward and backward transforms work!
assert np.allclose(data, back)

## Helper functions to rescale a frequency-image to [0, 255] and save
remmax = lambda x: x / x.max()
remmin = lambda x: x - np.amin(x, axis=(0, 1), keepdims=True)
touint8 = lambda x: (remmax(remmin(x)) * (256 - 1e-4)).astype(int)

def arr2im(data, fname):
    out = Image.new('RGB', data.shape[1::-1])
    out.putdata(list(map(tuple, data.reshape(-1, 3))))  # list() so it also works on Python 3
    out.save(fname)

arr2im(touint8(freq), 'freq.png')
(Aside: FFT-lover geek note. Look at the documentation for rfft for details, but I used Scipy’s FFTPACK module because its rfft interleaves real and imaginary components of a single pixel as two adjacent real values, guaranteeing that the output for any-sized 2D image (even vs odd, width vs height) will be preserved. This is in contrast to Numpy’s numpy.fft.rfft2 which, because it returns complex data of size width/2+1 by height/2+1, forces you to deal with one extra row/column and deal with deinterleaving complex-to-real yourself. Who needs that hassle for this application.)
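If the layout difference is unclear, a quick shape check makes it concrete:
import numpy as np
import scipy.fftpack as fp

a = np.random.rand(4, 6)
print(fp.rfft(fp.rfft(a, axis=0), axis=1).shape)  # (4, 6): real, interleaved, same size
print(np.fft.rfft2(a).shape)                      # (4, 4): complex, width//2 + 1 columns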
Results. Given input named test.png:
this snippet produces the following output (global min/max have been rescaled and quantized to 0-255):
And upscaled:
In this frequency-image, the DC (0 Hz frequency) component is in the top-left, and frequencies move higher as you go right and down.
Now, let’s see what happens when you manipulate this image in a couple of ways. Instead of this test image, let’s use a cat photo.
I made a few mask images in Gimp that I then load into Python and multiply the frequency-image with to see what effect the mask has on the image.
Here’s the code:
# Make frequency-image of cat photo
freq = im2freq(np.array(Image.open('cat.jpg')))
# Load three frequency-domain masks (DSP "filters")
bpfMask = np.array(Image.open('cat-mask-bpfcorner.png')).astype(float) / 255
hpfMask = np.array(Image.open('cat-mask-hpfcorner.png')).astype(float) / 255
lpfMask = np.array(Image.open('cat-mask-corner.png')).astype(float) / 255
# Apply each filter and save the output
arr2im(touint8(freq2im(freq * bpfMask)), 'cat-bpf.png')
arr2im(touint8(freq2im(freq * hpfMask)), 'cat-hpf.png')
arr2im(touint8(freq2im(freq * lpfMask)), 'cat-lpf.png')
Here’s a low-pass filter mask on the left, and on the right, the result—click to see the full-res image:
In the mask, black = 0.0, white = 1.0. So the lowest frequencies are kept here (white), while the high ones are blocked (black). This blurs the image by attenuating high frequencies. Low-pass filters are used all over the place, including when decimating (“downsampling”) an image (though they will be shaped much more carefully than my drawing in Gimp 😜).
Here’s a band-pass filter, where the lowest frequencies (see that bit of white in the top-left corner?) and high frequencies are kept, but the middling-frequencies are blocked. Quite bizarre!
Here’s a high-pass filter, where the top-left corner that was left white in the above mask is blacked out:
This is how edge-detection works.
Postscript. Someone, make a webapp using this technique that lets you draw masks and apply them to an image real-time!!!
There are several issues here.
1) Manual conversion to grayscale isn't good. Use Image.open("test.png").convert('L')
2) Most likely there is an issue with types. You shouldn't pass an np.ndarray from fft2 to a PIL image without being sure their types are compatible. abs(np.fft.fft2(something)) will return an array of type np.float32 or something like this, whereas a PIL image is going to receive something like an array of type np.uint8.
3) Scaling suggested in the comments looks wrong. You actually need your values to fit into 0..255 range.
Here's my code that addresses these 3 points:
import numpy as np
from PIL import Image

def fft(channel):
    fft = np.fft.fft2(channel)
    fft *= 255.0 / fft.max()  # proper scaling into the 0..255 range
    return np.absolute(fft)

input_image = Image.open("test.png")
channels = input_image.split()  # splits an image into R, G, B channels
result_array = np.zeros_like(input_image)  # make sure data types, sizes and
# numbers of channels of the input and output numpy arrays are the same
if len(channels) > 1:  # grayscale images have only one channel
    for i, channel in enumerate(channels):
        result_array[..., i] = fft(channel)
else:
    result_array[...] = fft(channels[0])
result_image = Image.fromarray(result_array)
result_image.save('out.png')
I must admit I haven't managed to get results identical to the GIMP FFT plugin. As far as I can see, it does some post-processing. My results are all a very low-contrast mess, and GIMP seems to overcome this by tuning the contrast and scaling down non-informative channels (in your case, all channels except red are just empty). Refer to the image:
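One common way to lift such a low-contrast spectrum (not necessarily what GIMP does) is to log-scale the magnitudes before converting to uint8; a sketch on a single grayscale channel:
import numpy as np
from PIL import Image

channel = np.asarray(Image.open("test.png").convert("L"), dtype=float)
magnitude = np.abs(np.fft.fft2(channel))
log_mag = np.log1p(magnitude)  # log(1 + m) compresses the huge DC peak
scaled = (255 * log_mag / log_mag.max()).astype(np.uint8)
Image.fromarray(scaled).save("spectrum.png")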
Let's assume the image is stored as a png file and I need to drop every odd line and resize the result horizontally to 50% in order to keep the aspect ratio.
The result must have 50% of the resolution of the original image.
It will not be enough to recommend an existing image library, like PIL, I would like to see some working code.
UPDATE - Even if the question received a correct answer, I want to warn others that PIL is not in great shape: the project website has not been updated in months, there is no link to a bug tracker, and list activity is quite low. I was surprised to discover that a simple BMP file saved with Paint was not loaded by PIL.
Is it essential to keep every even line (in fact, define "even" - are you counting from 1 or 0 as the first row of the image?)
If you don't mind which rows are dropped, use PIL:
from PIL import Image

img = Image.open("file.png")
size = list(img.size)
size[0] //= 2  # integer division keeps the dimensions as ints
size[1] //= 2
downsized = img.resize(size, Image.NEAREST)  # NEAREST drops the lines
downsized.save("file_small.png")
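If it does matter which rows are dropped, a hedged sketch with numpy slicing keeps exactly the even rows (counting the first row as row 0), then halves the width to preserve the aspect ratio:
import numpy as np
from PIL import Image

img = Image.open("file.png").convert("RGB")
arr = np.asarray(img)[::2]  # keep rows 0, 2, 4, ... (drop the odd rows)
half = Image.fromarray(arr)
result = half.resize((img.width // 2, half.height), Image.NEAREST)
result.save("file_small.png")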
I recently wanted to deinterlace some stereo images, extracting the images for the left and right eye. For that I wrote:
import math
from PIL import Image

def deinterlace_file(input_file, output_format_str, row_names=('Left', 'Right')):
    print("Deinterlacing {}".format(input_file))
    source = Image.open(input_file)
    source.load()
    dim = source.size
    scaled_size1 = (math.floor(dim[0]), math.floor(dim[1]/2) + 1)
    scaled_size2 = (math.floor(dim[0]/2), math.floor(dim[1]/2) + 1)
    top = Image.new(source.mode, scaled_size1)
    top_pixels = top.load()
    other = Image.new(source.mode, scaled_size1)
    other_pixels = other.load()
    for row in range(dim[1]):
        for col in range(dim[0]):
            pixel = source.getpixel((col, row))
            row_int = math.floor(row / 2)
            if row % 2:
                top_pixels[col, row_int] = pixel
            else:
                other_pixels[col, row_int] = pixel
    top_final = top.resize(scaled_size2, Image.NEAREST)  # downsize to maintain aspect ratio
    other_final = other.resize(scaled_size2, Image.NEAREST)  # downsize to maintain aspect ratio
    top_final.save(output_format_str.format(row_names[0]))
    other_final.save(output_format_str.format(row_names[1]))
output_format_str should be something like: "filename-{}.png" where the {} will be replaced with the row name.
Note that it ends up with the image being half of its original size. If you don't want this, you can tweak the last scaling step.
It's not the fastest operation, as it goes through the image pixel by pixel, but I could not see an easy way to extract rows from an image.
I have some PNG image links that I want to download, "convert to thumbnails" and save to PDF using Python and Cairo.
Now, I have working code, but I don't know how to control the image size on paper. Is there a way to resize a PyCairo surface to the dimensions I want (which happen to be smaller than the original)? I want the original pixels to be "shrunk" to a higher resolution (on paper).
Also, I tried the Image.rescale() function from PIL, but it gives me back a 20x20 pixel output (out of a 200x200 pixel original image, which is not the banner example in the code). What I want is a 200x200 pixel image plotted inside a 20x20 mm square on paper (instead of the 200x200 mm square I am getting now).
My current code is:
#!/usr/bin/python
import cairo, urllib, StringIO, Image  # could I do it without the Image module?

paper_width = 210
paper_height = 297
margin = 20
point_to_milimeter = 72/25.4

pdfname = "out.pdf"
pdf = cairo.PDFSurface(pdfname, paper_width*point_to_milimeter, paper_height*point_to_milimeter)
cr = cairo.Context(pdf)
cr.scale(point_to_milimeter, point_to_milimeter)

f = urllib.urlopen("http://cairographics.org/cairo-banner.png")
i = StringIO.StringIO(f.read())
im = Image.open(i)
# are these StringIO operations really necessary?
imagebuffer = StringIO.StringIO()
im.save(imagebuffer, format="PNG")
imagebuffer.seek(0)
imagesurface = cairo.ImageSurface.create_from_png(imagebuffer)

### EDIT: best answer from Jeremy, and an alternate answer from mine:
best_answer = True  # put False to use my own alternate answer
if best_answer:
    cr.save()
    cr.scale(0.5, 0.5)
    cr.set_source_surface(imagesurface, margin, margin)
    cr.paint()
    cr.restore()
else:
    cr.set_source_surface(imagesurface, margin, margin)
    pattern = cr.get_source()
    scalematrix = cairo.Matrix()  # this can also be used to shear, rotate, etc.
    scalematrix.scale(2, 2)  # matrix numbers seem to be the opposite - the greater the number, the smaller the source
    scalematrix.translate(-margin, -margin)  # this is necessary, don't ask me why - negative values!!
    pattern.set_matrix(scalematrix)
    cr.paint()
pdf.show_page()
Note that the beautiful Cairo banner does not even fit the page...
The ideal result would be for me to control the width and height of this image in user-space units (millimeters, in this case), to create a nice header image, for example.
Thanks for reading and for any help or comment!!
Try scaling the context when you draw the image.
E.g.
cr.save() # push a new context onto the stack
cr.scale(0.5, 0.5) # scale the context by (x, y)
cr.set_source_surface(imagesurface, margin, margin)
cr.paint()
cr.restore() # pop the context
See: http://cairographics.org/documentation/pycairo/2/reference/context.html for more details.
This is not answering the question; I just wanted to share heltonbiker's current code, edited to run with Python 3.2:
import cairo, urllib.request, io
from PIL import Image
paper_width = 210
paper_height = 297
margin = 20
point_to_millimeter = 72/25.4
pdfname = "out.pdf"
pdf = cairo.PDFSurface( pdfname,
paper_width*point_to_millimeter,
paper_height*point_to_millimeter
)
cr = cairo.Context(pdf)
cr.scale(point_to_millimeter, point_to_millimeter)
# load image
f = urllib.request.urlopen("http://cairographics.org/cairo-banner.png")
i = io.BytesIO(f.read())
im = Image.open(i)
imagebuffer = io.BytesIO()
im.save(imagebuffer, format="PNG")
imagebuffer.seek(0)
imagesurface = cairo.ImageSurface.create_from_png(imagebuffer)
cr.save()
cr.scale(0.5, 0.5)
cr.set_source_surface(imagesurface, margin, margin)
cr.paint()
cr.restore()
pdf.show_page()
Jeremy Flores solved my problem very well by scaling the target context before setting the imagesurface as the source. Even so, some day you might actually NEED to resize a surface (or transform it in some other way), so I will briefly describe the rationale used in my alternate answer (already included in the question), deduced after thoroughly reading the docs:
Set your surface as the context's source - it implicitly creates a cairo.Pattern!!
Use Context.get_source() to get the pattern back;
Create a cairo.Matrix;
Apply this matrix (with all its transforms) to the pattern;
Paint!
The only problem seems to be that the transformations always work around the origin, so scaling and rotation must be preceded and followed by complementary translations to the origin (bleargh).
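In code, that translate-transform-translate dance looks something like this sketch (the pivot point and surface sizes are made up):
import cairo

src = cairo.ImageSurface(cairo.FORMAT_ARGB32, 100, 100)
dst = cairo.ImageSurface(cairo.FORMAT_ARGB32, 200, 200)
cr = cairo.Context(dst)
cr.set_source_surface(src, 0, 0)
pattern = cr.get_source()

cx, cy = 50, 50        # hypothetical pivot point
m = cairo.Matrix()
m.translate(cx, cy)
m.scale(2, 2)          # pattern matrices act "inverted": bigger numbers
                       # paint the source smaller (as noted in the question)
m.translate(-cx, -cy)  # net effect: scale around (cx, cy) instead of the origin
pattern.set_matrix(m)
cr.paint()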