Image.fromarray() is taking every element in the matrix modulo 256 - python

I am writing a script to encrypt and decrypt an image in Python 3 using PIL. I convert the image into a numpy array and then multiply every element of the array by 10. I noticed that PIL's fromarray() reduces every element of the array modulo 256 if it is larger than 255, so when I try to retrieve the original values of the matrix I don't get them back.

For example, if the original value is 40, then 10 times that is 400, and fromarray() turns it into 400 mod 256 = 144. If I add 256 to 144 I get 400, and dividing by 10 gives me back 40. But if the value is 54, then 10 times that is 540, and 540 mod 256 is 28. To get back the original value I need to add 256 twice, which gives me 540. And 540 isn't the only number that gives 28 when taken mod 256, so I can never know whether to add 256 once, twice, or more. Is there any way I can stop it from replacing every element of the matrix with its value mod 256?
from PIL import Image
from numpy import *
from pylab import *
#encryption
img1 = (Image.open('image.jpeg').convert('L'))
img1.show() #displaying the image
img = array(Image.open('image.jpeg').convert('L'))
a,b = img.shape
print(img)
print((a,b))
tup = a,b
for i in range (0, tup[0]):
    for j in range (0, tup[1]):
        img[i][j] = img[i][j]*10 #converting every element of the original array to 10 times its value
print(img)
imgOut = Image.fromarray(img)
imgOut.show()
imgOut.save('img.jpeg')
#decryption
img2 = (Image.open('img.jpeg'))
img2.show()
img3 = array(Image.open('img.jpeg'))
print(img3)
a1,b1 = img3.shape
print((a1,b1))
tup1 = a1,b1
for i1 in range (0, tup1[0]):
    for j1 in range (0, tup1[1]):
        img3[i1][j1] = ((img3[i1][j1])/10) #reverse of encryption
print(img3)
imgOut1 = Image.fromarray(img3)
imgOut1.show()
part of the original matrix before multiplying by 10:
[41 42 45 ... 47 41 33]
[41 43 45 ... 44 38 30]
[41 42 46 ... 41 36 30]
[43 43 44 ... 56 56 55]
[45 44 45 ... 55 55 54]
[46 46 46 ... 53 54 54]
part of the matrix after multiplying by 10:
[[154 164 194 ... 214 154 74]
[154 174 194 ... 184 124 44]
[154 164 204 ... 154 104 44]
[174 174 184 ... 48 48 38]
[194 184 194 ... 38 38 28]
[204 204 204 ... 18 28 28]
part of the expected matrix after dividing by 10:
[41 42 45 ... 47 41 33]
[41 43 45 ... 44 38 30]
[41 42 46 ... 41 36 30]
[43 43 44 ... 56 56 55]
[45 44 45 ... 55 55 54]
[46 46 46 ... 53 54 54]
part of the output the script is providing:
[[41 41 45 ... 48 40 33]
[41 43 44 ... 44 37 31]
[41 41 48 ... 41 35 30]
[44 42 43 ... 30 30 29]
[44 42 45 ... 29 29 29]
[45 47 44 ... 28 28 28]]

There are several problems with what you're trying to do here.
PIL images are either 8 or 16 bits per channel (to the best of my knowledge). When you load a JPEG, it's loaded as 8 bits per channel, so the underlying data type is an unsigned 8-bit integer, i.e. range 0..255. Operations that would overflow or underflow this range wrap around, which is the modulus behavior you're seeing.
You could convert the 8-bit PIL image to a floating point numpy array with np.array(img).astype('float32') and then normalize this to 0..1 by dividing with 255.
At this point you have non-quantized floating point numbers you can freely mangle however you wish.
However, then you still need to save the resulting image, at which point you again have a format problem. I believe TIFFs and some HDR image formats support floating point data, but if you want something that is widely readable, you'd likely go for PNG or JPEG.
For an encryption use case, JPEGs are not a good choice, as they're always inherently lossy, and you will, more likely than not, not get the same data back.
PNGs can be 8 or 16 bits per channel, but still, you'd have the problem of having to compress a basically infinite "dynamic range" of pixels (let's say you'd multiplied everything by a thousand!) into 0..255 or 0..65535.
An obvious way to do this is to find the maximum value in the image (np.max(...)), divide everything by it (so now you're back to 0..1), then multiply with the maximum value of the image data format... so with a simple multiplication "cipher" as you'd described, you'd essentially get the same image back.
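For example, a minimal sketch of that float round trip (assuming a single-channel image as in your script; the variable names are just illustrative):
import numpy as np
from PIL import Image
img = Image.open('image.jpeg').convert('L')
data = np.array(img).astype('float32') / 255.0   # non-quantized floats in 0..1
data = data * 10.0                               # mangle freely, nothing wraps
data = data / np.max(data)                       # squeeze back into 0..1
Image.fromarray((data * 255).astype('uint8')).save('out.png')  # PNG rather than JPEG, to stay lossless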
Another way would be to clip the out-of-range values at the allowed limits, i.e. everything below zero becomes zero and everything above the maximum becomes, say, 65535. That'd be a lossy operation though, and you'd have no way of getting the unclipped values back.

First of all, PIL only supports 8-bit per channel images - although Pillow (the PIL fork) supports many more formats including higher bit-depths. The JPEG format is defined as only 8-bit per channel.
Calling Image.open() on a JPEG in PIL will therefore return an 8-bit image, so any operations on individual pixels will be performed as the equivalent of uint8_t arithmetic in the backing representation. Since a uint8_t can only hold values from 0 to 255, all your arithmetic is necessarily modulo 256.
If you want to avoid this, you'll need to convert the representation to a higher bit-depth, such as 16 bpp or 32 bpp. You can do this with NumPy, for example:
img16 = np.array(img, dtype=np.uint16)
# or
img32 = np.array(img, dtype=np.uint32)
That will give you the extended precision that you desire.
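As a quick illustration of the difference this makes (a small sketch using the ×10 example from the question):
import numpy as np
from PIL import Image
img16 = np.array(Image.open('image.jpeg').convert('L'), dtype=np.uint16)
img16 *= 10                    # values up to 2550 fit easily in uint16, no modulo-256 wrapping
recovered = img16 // 10        # the original 0..255 values come back exactly
print(recovered.max() <= 255)  # True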
However - your code example shows that you are trying to encrypt and decrypt the image data. In that case, you do want to use modulo arithmetic! You just need to do some more research on actual encryption algorithms.

As none of the answers helped me that much and I have since solved the problem, I would like to give an answer hoping that one day it will help someone. Here the keys are (3, 25777) and (16971, 25777).
The working code is as follows:
from PIL import Image
import numpy as np
#encryption
img1 = (Image.open('image.jpeg').convert('L'))
img1.show()
img = np.array(Image.open('image.jpeg').convert('L'))
img16 = np.array(img, dtype=np.uint32)
a,b = img.shape
print('\n\nOriginal image: ')
print(img16)
print((a,b))
tup = a,b
for i in range (0, tup[0]):
    for j in range (0, tup[1]):
        x = img16[i][j]
        x = pow(int(x), 3, 25777)  # modular exponentiation with the public key (3, 25777)
        img16[i][j] = x
print('\n\nEncrypted image: ')
print(img16)
imgOut = Image.fromarray(img16)
imgOut.show()
#decryption
img3_16 = np.array(img16, dtype=np.uint32)  # work on a copy of the encrypted image
print('\n\nEncrypted image: ')
print(img3_16)
a1,b1 = img3_16.shape
print((a1,b1))
tup1 = a1,b1
for i1 in range (0, tup1[0]):
    for j1 in range (0, tup1[1]):
        x1 = img3_16[i1][j1]
        x1 = pow(int(x1), 16971, 25777)  # 3-argument pow with the private key (16971, 25777) keeps the intermediate numbers small
        img3_16[i1][j1] = x1
print('\n\nDecrypted image: ')
print(img3_16)
imgOut1 = Image.fromarray(img3_16)
imgOut1.show()
Feel free to point out the faults. Thank you.

Related

How to keep the header and trailer while zlib decompress and compress

I have raw data extracted from PDF and I decompressed the raw data and compressed it again.
I expected the same header and trailer, but the header was changed.
Original Hex Header
48 89 EC 57 ....
Converted Hex Header
78 9C EC BD ...
I dug into zlib compression and found that 48 is also a valid zlib header byte, but 78 is the one most commonly used for zlib compression.
This is my code, which decompresses and compresses:
import zlib

decompress_wbit = 12
compress_variable = 6
output_data = zlib.decompress(open(raw_data, "rb").read(), decompress_wbit)
output_data = zlib.compress(output_data, 6)
output_file = open(raw_data + '_', "wb")
output_file.write(output_data)
output_file.close()
I changed decompress_wbit and compress_variable, but the header still stays 78, so I am not sure how to get 48 as the header.
Here is a short description of the zlib header.
CINFO (bits 12-15)
Indicates the window size as a power of two, from 0 (256 bytes) to 7 (32768 bytes). This will usually be 7. Higher values are not allowed.
CM (bits 8-11)
The compression method. Only Deflate (8) is allowed.
FLEVEL (bits 6-7)
Roughly indicates the compression level, from 0 (fast/low) to 3 (slow/high)
FDICT (bit 5)
Indicates whether a preset dictionary is used. This is usually 0. 1 is technically allowed, but I don't know of any Deflate formats that define preset dictionaries.
FCHECK (bits 0-4)
A checksum (5 bits, 0..31), whose value is calculated such that the entire 16-bit header value is divisible by 31 with no remainder.
Typically, only the CINFO and FLEVEL fields can be freely changed, and FCHECK must be calculated based on the final value. Assuming no preset dictionary, there is no choice in what the other fields contain, so a total of 32 possible headers are valid. Here they are:
FLEVEL:        0       1       2       3
CINFO:
  0          08 1D   08 5B   08 99   08 D7
  1          18 19   18 57   18 95   18 D3
  2          28 15   28 53   28 91   28 CF
  3          38 11   38 4F   38 8D   38 CB
  4          48 0D   48 4B   48 89   48 C7
  5          58 09   58 47   58 85   58 C3
  6          68 05   68 43   68 81   68 DE
  7          78 01   78 5E   78 9C   78 DA
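(Not part of the original description, but each entry in that table is easy to sanity-check in a couple of lines:)
def is_valid_zlib_header(b1, b2):
    # the 16-bit value formed by the two header bytes must be divisible by 31
    return (b1 * 256 + b2) % 31 == 0
print(is_valid_zlib_header(0x48, 0x89))  # True
print(is_valid_zlib_header(0x78, 0x9C))  # True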
Please let me know how to keep the zlib header while decompressing & compressing.
Thanks for your time.
I will first note that it doesn't matter. The data will be decompressed fine with that zlib header. Why do you care?
You are giving zlib.compress a small amount of data that permits a smaller window. Since it is permitted, the Python library is electing to compress with a smaller window.
A way to avoid that would be to use zlib.compressobj instead. Upon initialization, it doesn't know how much data you will be feeding it and will default to the largest window size.
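A sketch of that approach, reusing the variable names from the question above (raw_data, decompress_wbit and compress_variable are assumed to be defined as in that snippet):
import zlib
data = zlib.decompress(open(raw_data, "rb").read(), decompress_wbit)
# a compressobj doesn't know how much data will follow, so it keeps the
# default (largest) window instead of shrinking it for a small input
co = zlib.compressobj(compress_variable, zlib.DEFLATED, zlib.MAX_WBITS)
output_data = co.compress(data) + co.flush()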

Improving Numpy For Loop Speed

I'm trying to find the pixels closest to an RGB value of (0,0,255). I'm trying to calculate the distance of the pixel in RGB values to that value using a 3D Pythagoras calculation, add them to a list, and then return the X and Y coordinates of the values that have the lowest distance. Here's what I have:
# import the necessary packages
import numpy as np
import scipy.spatial as sp
import matplotlib.pyplot as plt
import cv2
import math
from PIL import Image, ImageDraw, ImageFont
background = Image.open("test.tif").convert('RGBA')
png = background.save("test.png")
retina = cv2.imread("test.png")
#convert BGR to RGB image
retina = cv2.cvtColor(retina, cv2.COLOR_BGR2RGB)
h,w,bpp = np.shape(retina)
min1_d = float('inf')
min1_coords = (None, None)
min2_d = float('inf')
min2_coords = (None, None)
for py in range(0,h):
    for px in range (0,w):
        r = retina[py][px][0]
        g = retina[py][px][1]
        b = retina[py][px][2]
        d = math.sqrt(((r-0)**2) + ((g-0)**2) + ((255-b)**2))
        print(str(r) + "," + str(g) + "," + str(b) + ",," + str(px) + "," + str(py) + ",," + str(d))
        if d < min1_d:
            min2_d = min1_d
            min2_coords = min1_coords
            min1_d = d
            min1_coords = (px, py)
        elif d < min2_d: # if it's not the smallest, check if it's the second smallest
            min2_d = d
            min2_coords = (px, py)
print(min1_coords, min2_coords)
width, height = background.size
x_max = int(width)
y_max = int(height)
img = Image.new('RGBA', (x_max, y_max), (255,255,255,0))
draw = ImageDraw.Draw(img)
draw.point(min1_coords, (0,0,255))
draw.point(min2_coords, (0,0,255))
foreground = img
background.paste(foreground, (0, 0), foreground)
foreground.save("test_bluer.png")
background.save("test_bluer_composite.png")
How can I speed up my for loops? I believe this answer is on the right track, but I'm not sure how to implement the px and py variables while slicing as this answer shows.
You can speed up your code by vectorizing the for loop:
r = retina[:,:,0].astype(np.float64)
g = retina[:,:,1].astype(np.float64)
b = retina[:,:,2].astype(np.float64)
# cast to float first so the squares don't wrap around in uint8
d = np.sqrt(r**2 + g**2 + (255-b)**2)
You can find the coordinates of the minimum with:
min_coords = np.unravel_index(np.argmin(d), np.shape(d))
If you want to find the second smallest distance just change the previous minimum to be a larger distance:
d[min_coords[0],min_coords[1]] = np.inf
min_coords = np.unravel_index(np.argmin(d), np.shape(d))
# min_coords now has the second smallest distance
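If you'd rather not overwrite entries of d, another option (not from the original answer, just a variation) is np.argpartition, which gives the flat indices of the k smallest distances in one call:
flat_idx = np.argpartition(d.ravel(), 2)[:2]   # indices of the two smallest distances (unordered)
min1_coords, min2_coords = (np.unravel_index(i, d.shape) for i in flat_idx)
Note that, like np.argmin above, these coordinates come back in (row, column) order.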
Here is one way in Python/OpenCV.
Read the input
Define your color (pure blue)
Create an image of the color desired
Compute an image representing the rmse difference
Threshold the rmse image
Get the coordinates of all white pixels
Input:
import cv2
import numpy as np
# read image
img = cv2.imread('red_blue2.png')
# reference color (blue)
color = (255,0,0)
# create image the size of the input, but with blue color
ref = np.full_like(img, color)
# compute rmse difference image
# (cast to float so the squares and their sum don't wrap around in uint8)
diff = cv2.absdiff(img, ref).astype(np.float32)
diff2 = diff*diff
b,g,r = cv2.split(diff2)
rmse = np.sqrt( ( b+g+r )/3 )
# threshold for pixels within 1 graylevel different
thresh = cv2.threshold(rmse, 1, 255, cv2.THRESH_BINARY_INV)[1]
# get coordinates
coords = np.argwhere(thresh == 255)
for coord in coords:
    print(coord[1],coord[0])
# write results to disk
cv2.imwrite("red_blue2_rmse.png", (20*rmse).clip(0,255).astype(np.uint8))
cv2.imwrite("red_blue2_thresh.png", thresh)
# display it
cv2.imshow("rmse", rmse)
cv2.imshow("thresh", thresh)
cv2.waitKey(0)
RMSE Image (scaled in brightness by 20x for viewing):
Thresholded rmse image:
Coordinates:
127 0
128 0
127 1
128 1
127 2
128 2
127 3
128 3
127 4
128 4
127 5
128 5
127 6
128 6
127 7
128 7
127 8
128 8
127 9
128 9
127 10
128 10
127 11
128 11
127 12
128 12
127 13
128 13
127 14
128 14
127 15
128 15
127 16
128 16
127 17
128 17
127 18
128 18
127 19
128 19
127 20
128 20
127 21
128 21
127 22
128 22
127 23
128 23
127 24
128 24
127 25
128 25
127 26
128 26
127 27
128 27
127 28
128 28
127 29
128 29
127 30
128 30
127 31
128 31
127 32
128 32
127 33
128 33
127 34
128 34
127 35
128 35
127 36
128 36
127 37
128 37
127 38
128 38
127 39
128 39
127 40
128 40
127 41
128 41
127 42
128 42
127 43
128 43
127 44
128 44
127 45
128 45
127 46
128 46
127 47
128 47
127 48
128 48
127 49
128 49
As commented, subtract the RGB value from the array, square it, average (or sum) the per-pixel channel values, and take the minimum.
Here is my variant:
import numpy
rgb_value = numpy.array([17,211,51])
img = numpy.random.randint(255, size=(1000,1000,3),dtype=numpy.uint8)
img_goal = numpy.average(numpy.square(numpy.subtract(img, rgb_value)), axis=2)
result = numpy.where(img_goal == numpy.amin(img_goal))
result_list = [result[0].tolist(),result[1].tolist()]
for i in range(len(result_list[0])):
    print("RGB needed:", rgb_value)
    print("Pixel:", result_list[0][i], result_list[1][i])
    print("RGB gotten:", img[result_list[0][i]][result_list[1][i]])
    print("Distance to value:", img_goal[result_list[0][i]][result_list[1][i]])
There can be multiple results with the same values.

Extracting thumbnails from an image in numpy

I have a weird question; it concerns slicing arrays and extracting small thumbnail cutouts. I do have a solution, but it's a clunky for loop which runs fairly slowly on big images.
The current solution looks something like this:
import numpy as np
image = np.arange(0,10000,1).reshape(100,100) #create an image
cutouts = np.zeros((100,10,10)) #array to hold the thumbnails
l = 0
for i in range(0,10):
    for j in range(0,10): #step a (10,10) box across the image + save results
        cutouts[l,:,:] = image[(i*10):(i+1)*10, (j*10):(j+1)*10]
        l = l+1
print(cutouts[0,:,:])
[[ 0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
[ 100. 101. 102. 103. 104. 105. 106. 107. 108. 109.]
[ 200. 201. 202. 203. 204. 205. 206. 207. 208. 209.]
[ 300. 301. 302. 303. 304. 305. 306. 307. 308. 309.]
[ 400. 401. 402. 403. 404. 405. 406. 407. 408. 409.]
[ 500. 501. 502. 503. 504. 505. 506. 507. 508. 509.]
[ 600. 601. 602. 603. 604. 605. 606. 607. 608. 609.]
[ 700. 701. 702. 703. 704. 705. 706. 707. 708. 709.]
[ 800. 801. 802. 803. 804. 805. 806. 807. 808. 809.]
[ 900. 901. 902. 903. 904. 905. 906. 907. 908. 909.]]
So, like I said, this works. But once I get to very large images (I work in astronomy) with a couple of different colour bands, it gets slow and clunky. In my dream world, I'd be able to do something like:
import numpy as np
image = np.arange(0,10000,1).reshape(100,100) #create an image
cutouts = image.reshape(100,10,10)
BUT, that doesn't create the right thumbnails, because it reads a whole row into the first (10,10) array before moving on to the next one:
print(cutouts[0,:,:])
[[ 0 1 2 3 4 5 6 7 8 9]
[10 11 12 13 14 15 16 17 18 19]
[20 21 22 23 24 25 26 27 28 29]
[30 31 32 33 34 35 36 37 38 39]
[40 41 42 43 44 45 46 47 48 49]
[50 51 52 53 54 55 56 57 58 59]
[60 61 62 63 64 65 66 67 68 69]
[70 71 72 73 74 75 76 77 78 79]
[80 81 82 83 84 85 86 87 88 89]
[90 91 92 93 94 95 96 97 98 99]]
So yeah, that's the problem: am I going mad and the for loop is the best way to do it, or is there some clever way I can slice the image array so that it produces the thumbnails I need?
Cheers!
Reshape to 4D, permute axes, reshape again -
H,W = 10,10 # height,width of thumbnail imgs
m,n = image.shape
cutouts = image.reshape(m//H,H,n//W,W).swapaxes(1,2).reshape(-1,H,W)
More info on the intuition behind it.
A more compact version with scikit-image builtin : view_as_blocks -
from skimage.util.shape import view_as_blocks
cutouts = view_as_blocks(image,(H,W)).reshape(-1,H,W)
If you are okay with the intermediate 4D output, it would be a view into the input image and hence virtually free at runtime. Let's verify the view part -
In [51]: np.shares_memory(image, image.reshape(m//H,H,n//W,W))
Out[51]: True
In [52]: np.shares_memory(image, view_as_blocks(image,(H,W)))
Out[52]: True
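As a quick sanity check (assuming image, cutouts and H, W, m, n from the snippets above), the vectorized result matches the original double loop exactly:
fast = image.reshape(m//H, H, n//W, W).swapaxes(1,2).reshape(-1, H, W)
print(np.array_equal(fast, cutouts))   # True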

Importing an image as grayscale and converting it to grayscale doesn't produce the same result when multiplying it by 255

I'm currently working on a project to isolate a number plate from an image.
When I import an image using cv2.imread("filename",0), the grayscale image I obtain is more or less the same (maybe a few rounding differences, since I convert it to integers) as when I import it using cv2.imread("filename") and then convert it to grayscale using np.dot(original_image[...,:3], [0.299, 0.587, 0.144]).
However, when I multiply both ndarrays with 255 I do not obtain the same output matrices. Both grayscale images are of the same dimensions, produce the same output when I convert them to a figure, are of the same type and produce the same otsu threshold. Why does this happen? Does OpenCV display and save image ndarrays differently?
How can I manipulate the second grayscale image to produce the same output as the first grayscale image after multiplying it with 255?
import cv2
import numpy as np

def func():
    rgb_image=cv2.imread('filename')
    gray_image=cv2.imread('filename',0)
    rgb_converted_to_gray_image=np.dot(rgb_image[...,:3], [0.299, 0.587, 0.144])
    print("Before multiplying with 255")
    print(gray_image)
    print("------------")
    print(rgb_converted_to_gray_image)
    gray_image=gray_image*255
    rgb_converted_to_gray_image=rgb_converted_to_gray_image*255
    print("After multiplying with 255")
    print(gray_image)
    print("------------")
    print(rgb_converted_to_gray_image)
The output is as follows:
Before multiplying with 255
[[32 29 34 ... 92 88 86]
[33 28 32 ... 85 85 86]
[35 29 28 ... 85 93 99]
...
[ 8 8 8 ... 32 32 32]
[ 8 8 8 ... 32 32 32]
[ 8 8 8 ... 33 33 33]]
------------
[[ 27.512 24.721 29.129 ... 105.014 100.894 98.989]
[ 29.14 23.99 27.069 ... 97.804 97.804 99.432]
[ 30.912 25.02 23.547 ... 98.701 106.797 112.977]
...
[ 9.292 9.292 9.292 ... 33.558 33.558 33.558]
[ 9.292 9.292 9.292 ... 33.558 33.558 33.558]
[ 9.292 9.292 9.292 ... 34.588 34.588 34.588]]
After multiplying with 255:
[[224 227 222 ... 164 168 170]
[223 228 224 ... 171 171 170]
[221 227 228 ... 171 163 157]
...
[248 248 248 ... 224 224 224]
[248 248 248 ... 224 224 224]
[248 248 248 ... 223 223 223]]
------------
[[ 7015.56 6303.855 7427.895 ... 26778.57 25727.97 25242.195]
[ 7430.7 6117.45 6902.595 ... 24940.02 24940.02 25355.16 ]
[ 7882.56 6380.1 6004.485 ... 25168.755 27233.235 28809.135]
...
[ 2369.46 2369.46 2369.46 ... 8557.29 8557.29 8557.29 ]
[ 2369.46 2369.46 2369.46 ... 8557.29 8557.29 8557.29 ]
[ 2369.46 2369.46 2369.46 ... 8819.94 8819.94 8819.94 ]]
Thus, what I would like is for the last matrix to look the same as the one above it.
There are two reasons due to which the difference in results is being observed.
Difference in data-type
Channel order
The first reason, as pointed out by @Cris Luengo in a comment, is the data type difference between gray_image and rgb_converted_to_gray_image. gray_image has type uint8 whereas rgb_converted_to_gray_image is floating point. As a result of the multiplication by 255, the values of gray_image wrap around within the uint8 range. To get around this issue, you can force floating point multiplication just by changing 255 to 255.0.
gray_image = gray_image * 255.0
Now comes the second issue. Even if we do floating point multiplication, the results will be different because OpenCV images are stored in channel order BGR by default, while you are providing the gray-scale conversion coefficients in RGB order. Also, the coefficient for blue value is incorrect. It should be 0.114 instead of 0.144. To verify the logical correctness of RGB coefficient values, check that their sum should be equal to 1. The corrected coefficients array should be like this:
[0.114, 0.587, 0.299]
The final code may look like this:
def func():
    rgb_image=cv2.imread('filename')
    gray_image=cv2.imread('filename',0)
    rgb_converted_to_gray_image=np.dot(rgb_image[...,:3], [0.114, 0.587, 0.299])
    print("Before multiplying with 255")
    print(gray_image)
    print("------------")
    print(rgb_converted_to_gray_image)
    gray_image=gray_image*255.0
    rgb_converted_to_gray_image=rgb_converted_to_gray_image*255
    print("After multiplying with 255")
    print(gray_image)
    print("------------")
    print(rgb_converted_to_gray_image)
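To see the effect of the fix, you can also compare the two gray images directly (a rough check, not part of the original answer; small rounding differences remain because cv2.imread(..., 0) rounds to 8-bit internally, and the exact values can also depend on the codec used to decode the file):
import cv2
import numpy as np
bgr = cv2.imread('filename')
gray = cv2.imread('filename', 0).astype(np.float64)
converted = np.dot(bgr[..., :3], [0.114, 0.587, 0.299])
print(np.abs(gray - converted).max())   # maximum absolute difference between the two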

numpy array multiply 10 got wrong result

I got a very weird result from a test with a Python 2.7 numpy array. Please look at this code.
import numpy as np
times = np.arange(5., 85, 0.1)
print times
times = np.array(times * 10, dtype=np.int)
print times
The original times should be [5.0 ~ 84.9]. After multiplying by 10, it should become [50 ~ 849], but the result is like this:
[ 50 51 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66
67 68 69 70 71 72 ... ]
There are two 51s between 50 and 52.
The problem is that your third entry isn't exactly 52.0 but 51.999999999999993 (see Is floating point math broken?). Truncating that value therefore results in 51.
The correct way would be to first round the values. (As pointed out in Safest way to convert float to integer in python? all small enough integer numbers can be exactly expressed as a float.) You therefore have to calculate: times = np.array(np.round(times * 10), dtype=np.int)
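A minimal demonstration of the fix (written with dtype=int; np.int was simply an alias for the built-in int and has been removed from newer NumPy versions):
import numpy as np
times = np.arange(5., 85, 0.1)
# round first, so 51.999999999999993 becomes 52 instead of being truncated to 51
times = np.array(np.round(times * 10), dtype=int)
print(times[:6])   # [50 51 52 53 54 55]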
