I generate a grayscale image and save it in jpg format.
import os
import cv2
import numpy as np

SCENE_WIDTH = 28
SCENE_HEIGHT = 28

# draw random noise
p, n = 0.5, SCENE_WIDTH*SCENE_HEIGHT
scene_noise = np.random.binomial(1, p, n).reshape((SCENE_WIDTH, SCENE_HEIGHT))*255
scene_noise = scene_noise.astype(np.uint8)
n = scene_noise
print('%d bytes' % (n.size * n.itemsize)) # 784 bytes
cv2.imwrite('scene_noise.jpg', scene_noise)
print('noise: ', os.path.getsize("scene_noise.jpg")) # 1549 bytes
from PIL import Image
im = Image.fromarray(scene_noise)
im.save('scene_noise2.jpg')
print('noise2: ', os.path.getsize("scene_noise2.jpg")) # 1017 bytes
When I change from:
scene_noise = np.random.binomial(1, p, n).reshape((SCENE_WIDTH, SCENE_HEIGHT))*255
to:
scene_noise = np.random.binomial(255, p, n).reshape((SCENE_WIDTH, SCENE_HEIGHT))
The file size decreases by almost half, to ~775 bytes.
Can you please explain why the JPG file is bigger than the raw version, and why the size decreases when I change the colors from black-and-white to the full grayscale spectrum?
cv2.__version__.split(".") # ['4', '1', '2']
Two things here:
can you explain why the JPEG file is bigger than the raw version?
The sizes differ because you are not comparing the same things. The first object is a NumPy array and the second is a JPEG file. The JPEG file is bigger than the NumPy array (i.e. the array right after creating it with OpenCV) because JPEG encoding includes header and metadata overhead that a NumPy array does not store or need.
can you explain why the size decreases when I change colours from black and white to a full grayscale spectrum?
This is due to JPEG encoding. If you truly want to understand everything that happens, I highly suggest learning how JPEG encoding works, as I will not go into much detail here (I am in no way a specialist in this topic); it is well documented in the Wikipedia JPEG article. The general idea is that the more contrast a picture has, the bigger its file will be. Here, a black-and-white-only picture forces constant jumps between 0 and 255, whereas a full grayscale picture does not usually see as big a change between adjacent pixels.
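For illustration, a self-contained sketch comparing the two noise patterns (filenames are arbitrary). The binomial(255, ...) image clusters around 128, so adjacent pixels differ far less and the DCT coefficients compress better:

import os
import cv2
import numpy as np

n = 28 * 28
bw = (np.random.binomial(1, 0.5, n) * 255).astype(np.uint8).reshape(28, 28)
gray = np.random.binomial(255, 0.5, n).astype(np.uint8).reshape(28, 28)

cv2.imwrite('bw.jpg', bw)
cv2.imwrite('gray.jpg', gray)

# expect the black-and-white file to be noticeably larger
print(os.path.getsize('bw.jpg'), os.path.getsize('gray.jpg'))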
I am trying to process an image file into something that can be displayed on a Black/White/Red e-ink display, but I am running into a problem with the output resolution.
Based on the example code for the display, it expects two arrays of bytes (one for Black/White, one for Red), each 15,000 bytes. The resolution of the e-ink display is 400x300.
I'm using the following Python script to generate two BMP files: one for Black/White and one for Red. This is all working, but the file sizes are 360,000 bytes each, which won't fit in the ESP32 memory. The input image (a PNG file) is 195,316 bytes.
The library I'm using has a function called EPD_4IN2B_V2_Display(BLACKWHITEBUFFER, REDBUFFER);, which wants the full image (one channel for BW, one for Red) to be in memory. But, with these image sizes, it won't fit on the ESP32. And, the example uses 15KB for each color channel (BW, R), so I feel like I'm missing something in the image processing necessary to make this work.
Can anyone shed some light on what I'm missing? How would I update the Python image-processing script to account for this?
I am using the Waveshare 4.2inch E-Ink display and the Waveshare ESP32 driver board. A lot of the Python code is based on this StackOverflow post but I can't seem to find the issue.
import io
import traceback
from wand.image import Image as WandImage
from PIL import Image

# This function takes as input a filename for an image.
# It resizes the image into the dimensions supported by the ePaper display,
# then remaps the image into a tri-color scheme using a palette (affinity)
# for remapping and the Floyd-Steinberg algorithm for dithering.
# It then splits the image into two component parts:
#   a white and black image (with the red pixels removed)
#   a white and red image (with the black pixels removed)
# Finally, it converts these into PIL Images and returns them.
# The PIL Images can be used by the ePaper library for display.
def getImagesToDisplay(filename):
    print(filename)
    red_image = None
    black_image = None
    red_bytes = None
    black_bytes = None
    try:
        with WandImage(filename=filename) as img:
            img.resize(400, 300)
            with WandImage() as palette:
                with WandImage(width=1, height=1, pseudo="xc:red") as red:
                    palette.sequence.append(red)
                with WandImage(width=1, height=1, pseudo="xc:black") as black:
                    palette.sequence.append(black)
                with WandImage(width=1, height=1, pseudo="xc:white") as white:
                    palette.sequence.append(white)
                palette.concat()
                img.remap(affinity=palette, method='floyd_steinberg')
                red = img.clone()
                black = img.clone()
                red.opaque_paint(target='black', fill='white')
                black.opaque_paint(target='red', fill='white')
                red_image = Image.open(io.BytesIO(red.make_blob("bmp")))
                black_image = Image.open(io.BytesIO(black.make_blob("bmp")))
                red_bytes = io.BytesIO(red.make_blob("bmp"))
                black_bytes = io.BytesIO(black.make_blob("bmp"))
    except Exception:
        print('traceback.format_exc():\n%s' % traceback.format_exc())
    return (red_image, black_image, red_bytes, black_bytes)

if __name__ == "__main__":
    print("Running...")
    file_path = "testimage-tree.png"
    with open(file_path, "rb") as f:
        image_data = f.read()
    red_image, black_image, red_bytes, black_bytes = getImagesToDisplay(file_path)
    print("bw: ", red_bytes)
    print("red: ", black_bytes)
    black_image.save("output/bw.bmp")
    red_image.save("output/red.bmp")
    print("BW file size:", len(black_image.tobytes()))
    print("Red file size:", len(red_image.tobytes()))
As requested, and in case it may be useful to future readers, I will expand a little on what I said in the comments (which was verified to indeed be the cause of the problem).
An e-ink display usually needs a black-and-white image, that is, a 1-bit-per-pixel image. Not grayscale (1 byte per pixel), and even less RGB (3 channels/bytes per pixel).
I am not familiar with bi-color red/black displays, but it seems quite logical that they behave just like two binary displays sharing the same location (one black-and-white display and one red-and-white display).
What your code seemingly does is remove all black pixels from an RGB image and use it as the red image, and remove all red pixels from the same RGB image and use it as the black image. But since those images are obtained with clone, they are still RGB images: RGB images that happen to contain only black and white pixels, or only red and white pixels, but RGB images all the same.
With PIL, it is the mode that controls how images are represented in memory and, therefore, how they are saved to file.
Relevant modes are RGB, L (grayscale, i.e. 1 byte per pixel), and 1 (binary, i.e. 1 bit per pixel).
So what you need is to convert to mode 1, using the .convert('1') method on both your images.
Note that 400×300×3 (uncompressed RGB data for your image) is 360,000, which is what you got. 400×300 (mode L for the same image) is 120,000, and 400×300/8 (mode 1, 1 bit/pixel) is 15,000, which is precisely the expected size you mentioned. So that is another confirmation that, indeed, a 1 bit/pixel image is expected.
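For illustration, a minimal sketch building on the getImagesToDisplay() function above (whether your EPD library also expects the bits inverted is something to check against its docs):

red_image, black_image, _, _ = getImagesToDisplay("testimage-tree.png")

# mode '1' packs 8 pixels per byte
black_buf = black_image.convert('1').tobytes()
red_buf = red_image.convert('1').tobytes()

print(len(black_buf), len(red_buf))  # 400*300/8 = 15000 bytes each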
I'm reading a DICOM gray image file as:
gray = dicom.dcmread(file).pixel_array
This gives me an (x,y) shape, but I need an RGB (x,y,3) shape.
I'm trying to convert it using OpenCV:
img = cv2.cvtColor(gray, cv2.COLOR_GRAY2RGB)
And for testing I'm writing it to a file: cv2.imwrite('dcm.png', img)
I get an extremely dark image on output, which is wrong. What is the correct way to convert a pydicom image to RGB?
To answer your question, you need to provide a bit more info and be a bit clearer.
First, what are you trying to do? Are you trying to get only an (x,y,3) array in memory, or are you trying to convert the DICOM file to a .png file? They are very different things.
Secondly, what modality is your DICOM image?
It's likely (unless it's ultrasound or perhaps nuc med) a 16-bit greyscale image, meaning your gray array above holds 16-bit data.
So the first thing to understand is window levelling and how to display a 16-bit image in 8 bits. Have a look here: http://www.upstate.edu/radiology/education/rsna/intro/display.php.
If it's a 16-bit image and you want to view it as a greyscale image in RGB format, then you need to know what window/level you're using or need, and adjust appropriately before saving.
Thirdly, as lenik mentioned in another answer, you need to apply the DICOM slope/intercept values to your pixel data prior to using it.
If your problem is just making a new array with an extra dimension for RGB (so sizes (r,c) to (r,c,3)), then it's easy:
import numpy as np

# orig is your 2D array read in with dcmread:
r, c = orig.shape
new = np.empty((r, c, 3), dtype=orig.dtype)
new[:, :, 2] = new[:, :, 1] = new[:, :, 0] = orig

# or with broadcasting
new[:, :, :] = orig[:, :, np.newaxis]
That will give you the third dimension. BUT the values will all still be 16-bit, not 8-bit as needed if you want RGB. (Assuming the image you read with dcmread is CT, MR or equivalent 16-bit DICOM, the dtype is likely uint16.)
If you want it to be RGB, then you need to convert the values from 16-bit to 8-bit. For that you'll need to decide on a window/level and apply it to select the 8-bit values from the full 16-bit data range, as in the sketch below.
Likely your problem above ("I've got extremely dark image on output which is wrong") is actually correct output, but it looks dark because of the default window/level cv2 is using, or because you didn't apply the slope/intercept.
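For illustration, a minimal window/level sketch (the level/width defaults here are assumptions; pick values appropriate for your modality):

import numpy as np

def apply_window(pixels, level=40, width=400):
    """Map a 16-bit array to 8-bit using a window centre/width."""
    lo = level - width / 2.0
    hi = level + width / 2.0
    clipped = np.clip(pixels.astype(np.float64), lo, hi)
    return ((clipped - lo) / (hi - lo) * 255).astype(np.uint8)

# stack the windowed result into an (r, c, 3) uint8 RGB array
rgb = np.stack([apply_window(orig)] * 3, axis=-1)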
If what you want to do is convert the DICOM to png (or jpg), then you should probably use PIL or matplotlib rather than cv2. Both offer easy ways to save a 16-bit 2D array (which is what your gray is in the code above), and both allow you to specify window and level when saving to png or jpg. OpenCV is complete overkill (meaning much bigger/slower to load, and a much higher learning curve).
Some pseudo code using matplotlib. You will need to adjust the vmin/vmax values; the ones here would be approximately OK for a CT image.
import matplotlib.pyplot as plt
from pydicom import dcmread

df = dcmread(file)
slope = float(df.RescaleSlope)
intercept = float(df.RescaleIntercept)
df_data = intercept + df.pixel_array * slope

# tell matplotlib to 'plot' the image, with the 'gray' colormap, and set the
# min/max values (i.e. 'black' and 'white') to correspond to
# values of -100 and 300 in your array
plt.imshow(df_data, cmap='gray', vmin=-100, vmax=300)

# save as a png file
plt.savefig('png-copy.png')
That will save a png version, but with axes drawn as well. To save just the image, without axes and with no whitespace, use this:
inches = (3,3)
dpi = 150
fig, ax = plt.subplots(figsize=inches, dpi=dpi)
fig.subplots_adjust(left=0, right=1, top=1, bottom=0, wspace=0, hspace=0)
ax.axis('off')  # hide the axes and ticks entirely
ax.imshow(df_data, cmap='gray', vmin=-100, vmax=300)
fig.savefig('copy-without-whitespace.png')
The full tutorial on reading DICOM files is here: https://www.kaggle.com/gzuidhof/full-preprocessing-tutorial
Basically, you have to extract the slope and intercept parameters from the DICOM file and do the math for every pixel: hu = pixel_value * slope + intercept. All of this is explained in the tutorial with code samples and pictures.
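A minimal sketch of that conversion (the filename is hypothetical):

from pydicom import dcmread

ds = dcmread('scan.dcm')
hu = ds.pixel_array * float(ds.RescaleSlope) + float(ds.RescaleIntercept)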
I tried so hard to convert PNG to Bitmap smoothly, but failed every time. Now I think I might have found the reason: it's because of the alpha channels ('feather' in Photoshop).
Input image:
Output I expected:
Current output:
I want to convert it to an 8-bit Bitmap, colour every invisible (alpha) pixel purple (#FF00FF), and map those pixels to index zero (the very first palette entry).
But apparently, the background area and the invisible area around the actual image end up with different colours.
I want all of them coloured the same as the background. What should I do?
I tried these three approaches:
# 1) convert straight to RGB
image = Image.open(file).convert('RGB')
# 2) convert to palette mode and force the first palette entry to purple
image = Image.open(file)
image = image.convert('P')
pp = image.getpalette()
pp[0] = 255
pp[1] = 0
pp[2] = 255
image.putpalette(pp)
# 3) quantize down to 256 colours
image = Image.open('feather.png')
result = image.quantize(colors=256, method=2)
The third method looks better, but it becomes the same when I save it as a bitmap. I just want to get this over with now; I've wasted too much time on it.
If I remove the background from the output file, it still looks awkward.
Your question is kind of misleading, as you stated:
I want to convert it to 8bit Bitmap and colour every invisible(alpha) pixels to purple(#FF00FF) and set them to dot zero. (very first palette)
But in the description you gave an input image that has no alpha channel. Luckily, I have seen your previous question, Convert PNG to 8 bit bitmap, so I obtained the image containing alpha (the one you mentioned in the description but didn't post).
HERE IS THE IMAGE WITH ALPHA:-
Now we have to obtain the .bmp equivalent of this image, in P mode.
from PIL import Image

image = Image.open(r"Image_loc")
# a purple canvas the same size as the input
new_img = Image.new("RGB", (image.size[0], image.size[1]), (255, 0, 255))
# paste the image over purple using its own alpha as the mask, then quantize
cmp_img = Image.composite(image, new_img, image).quantize(colors=256, method=2)
cmp_img.save("Destination_path.bmp")
OUTPUT IMAGE:-
It looks like the default library under Ubuntu changes the colors a bit during compression. I tried setting quality and subsampling, but I see no improvement; has anyone ever faced a similar issue?
subsampling = 0 , quality = 100
# CORRECT COLORS FROM NPARRAY
cv2.imshow("Object cam:{}".format(self.camera_id), self.out)
print(self.out.item(1, 1, 0))  # B
print(self.out.item(1, 1, 1))  # G
print(self.out.item(1, 1, 2))  # R

self.out = cv2.cvtColor(self.out, cv2.COLOR_BGR2RGB)

from PIL import Image
im = Image.fromarray(self.out)
r, g, b = im.getpixel((1, 1))
# just printing pixels, and they match
print(r, g, b)

# WRONG COLORS
im.save(self.out_ramdisk_img, format='JPEG', subsampling=0, quality=100)
The JPEG image should have the same colors as in imshow, but it's a bit more purple.
That is a natural result of JPEG compression. JPEG uses floating-point arithmetic to calculate integer pixel values, and this occurs at several stages of the compression, so small pixel value changes are expected.
When you see blanket changes in color, they are usually the result of input color values that fall outside the gamut of the YCbCr color space. Such values get clamped.
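To see this for yourself, here is a small self-contained round-trip check (a sketch; the filename is arbitrary):

import numpy as np
from PIL import Image

rgb = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
Image.fromarray(rgb).save('check.jpg', format='JPEG', subsampling=0, quality=100)
decoded = np.asarray(Image.open('check.jpg'), dtype=np.int16)

# nonzero drift even at quality=100: JPEG is not lossless
print(np.abs(decoded - rgb.astype(np.int16)).max())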
How do you insert invisible watermarks in images for copyright purposes? I'm looking for a python library.
What algorithm do you use? What about performance and efficiency?
You might want to look into steganography; that is, hiding data inside of images. There are forms that won't get lost if you convert to a lossier format or even crop parts of the image out.
I'm looking for "unbreakable" watermarks, so data stored in exif or image metadata are out.
I have found some interesting stuff on the web while waiting for replies here:
http://www.cosy.sbg.ac.at/~pmeerw/Watermarking/
There is a master's thesis there that is fairly exhaustive about algorithms and their characteristics (what they do and how unbreakable they are). I haven't had time to read it in depth, but this stuff looks serious. There are algorithms that survive JPEG compression, cropping, gamma correction or downscaling in some way. It's C, but I can port it to Python or use the C libraries from Python.
However, it's from 2001, and I guess 7 years is a long time in this field :( Does anybody have similar, more recent material?
I use the following code. It requires PIL:
from PIL import Image, ImageEnhance

def reduceOpacity(im, opacity):
    """Returns an image with reduced opacity."""
    assert 0 <= opacity <= 1
    if im.mode != 'RGBA':
        im = im.convert('RGBA')
    else:
        im = im.copy()
    alpha = im.split()[3]
    alpha = ImageEnhance.Brightness(alpha).enhance(opacity)
    im.putalpha(alpha)
    return im
def watermark(im, mark, position, opacity=1):
    """Adds a watermark to an image."""
    if opacity < 1:
        mark = reduceOpacity(mark, opacity)
    if im.mode != 'RGBA':
        im = im.convert('RGBA')
    # create a transparent layer the size of the image and draw the
    # watermark in that layer
    layer = Image.new('RGBA', im.size, (0, 0, 0, 0))
    if position == 'tile':
        for y in range(0, im.size[1], mark.size[1]):
            for x in range(0, im.size[0], mark.size[0]):
                layer.paste(mark, (x, y))
    elif position == 'scale':
        # scale, but preserve the aspect ratio
        ratio = min(float(im.size[0]) / mark.size[0], float(im.size[1]) / mark.size[1])
        w = int(mark.size[0] * ratio)
        h = int(mark.size[1] * ratio)
        mark = mark.resize((w, h))
        layer.paste(mark, ((im.size[0] - w) // 2, (im.size[1] - h) // 2))
    else:
        layer.paste(mark, position)
    # composite the watermark with the layer
    return Image.composite(layer, im, layer)
img = Image.open('/path/to/image/to/be/watermarked.jpg')
mark1 = Image.open('/path/to/watermark1.png')
mark2 = Image.open('/path/to/watermark2.png')
img = watermark(img, mark1, (img.size[0]-mark1.size[0]-5, img.size[1]-mark1.size[1]-5), 0.5)
img = watermark(img, mark2, 'scale', 0.01)
The watermark is too faint to see; only a solid-color image would really show it. I can use it to create an image that doesn't appear to have a watermark, but if I do a bit-by-bit subtraction against the original image, I can demonstrate that my watermark is there.
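For illustration, a minimal sketch of that subtraction check (file names are hypothetical):

from PIL import Image, ImageChops

original = Image.open('original.jpg').convert('RGB')
marked = Image.open('watermarked.jpg').convert('RGB')

# the per-pixel difference is tiny at 1% opacity, so amplify it to make it visible
diff = ImageChops.difference(marked, original)
diff.point(lambda v: min(255, v * 64)).save('recovered-mark.png')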
If you want to see how it works, go to TylerGriffinPhotography.com. Each image on the site is watermarked twice: once with the watermark in the lower right corner at 50% opacity (5px from the edge), and once over the whole image at 1% opacity (using "scale", which scales the watermark to the whole image). Can you figure out what the second, low opacity watermark shape is?
If you're talking about steganography, here's an old, not-too-fancy module I did for a friend once (Python 2.x code):
the code
from __future__ import division
import math, os, array, random
import itertools as it
import Image as I
import sys
def encode(txtfn, imgfn):
    with open(txtfn, "rb") as ifp:
        txtdata = ifp.read()
    txtdata = txtdata.encode('zip')
    img = I.open(imgfn).convert("RGB")
    pixelcount = img.size[0]*img.size[1]
    ## sys.stderr.write("image %dx%d\n" % img.size)
    factor = len(txtdata) / pixelcount
    width = int(math.ceil(img.size[0]*factor**.5))
    height = int(math.ceil(img.size[1]*factor**.5))
    pixelcount = width * height
    if pixelcount < len(txtdata):  # just a sanity check
        sys.stderr.write("phase 2, %d bytes in %d pixels?\n" % (len(txtdata), pixelcount))
        sys.exit(1)
    ## sys.stderr.write("%d bytes in %d pixels (%dx%d)\n" % (len(txtdata), pixelcount, width, height))
    img = img.resize((width, height), I.ANTIALIAS)
    txtarr = array.array('B')
    txtarr.fromstring(txtdata)
    # pad with random bytes so there is one data byte per pixel
    txtarr.extend(random.randrange(256) for x in xrange(pixelcount - len(txtdata)))
    newimg = img.copy()
    newimg.putdata([
        (
            r & 0xf8 | (c & 0xe0) >> 5,
            g & 0xfc | (c & 0x18) >> 3,
            b & 0xf8 | (c & 0x07),
        )
        for (r, g, b), c in it.izip(img.getdata(), txtarr)])
    newimg.save(os.path.splitext(imgfn)[0]+'.png', optimize=1, compression=9)

def decode(imgfn, txtfn):
    img = I.open(imgfn)
    with open(txtfn, 'wb') as ofp:
        arrdata = array.array('B',
            ((r & 0x7) << 5 | (g & 0x3) << 3 | (b & 0x7)
             for r, g, b in img.getdata())).tostring()
        findata = arrdata.decode('zip')
        ofp.write(findata)

if __name__ == "__main__":
    if sys.argv[1] == 'e':
        encode(sys.argv[2], sys.argv[3])
    elif sys.argv[1] == 'd':
        decode(sys.argv[2], sys.argv[3])
the algorithm
It stores one byte of data per image pixel, using the 3 least-significant bits of the blue band, the 2 LSBs of the green band, and the 3 LSBs of the red band.
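For illustration, here is how one data byte is split across one pixel's RGB channels (written in Python 3 for readability; the module itself is Python 2):

byte = 0b10110101                        # the data byte to hide
r, g, b = 200, 120, 64                   # original pixel values

r2 = (r & 0xF8) | ((byte & 0xE0) >> 5)   # top 3 bits into red's LSBs
g2 = (g & 0xFC) | ((byte & 0x18) >> 3)   # middle 2 bits into green's LSBs
b2 = (b & 0xF8) | (byte & 0x07)          # bottom 3 bits into blue's LSBs

recovered = (r2 & 0x07) << 5 | (g2 & 0x03) << 3 | (b2 & 0x07)
assert recovered == byte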
encode function: the input text file is compressed with zlib, and the input image is resized (keeping proportions) to ensure that there are at least as many pixels as compressed bytes. A PNG image with the same name as the input image is saved containing the steganographic data (so don't use a ".png" filename as input if you leave the code as-is :).
decode function: the previously stored zlib-compressed data is extracted from the input image and saved uncompressed under the provided filename.
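Usage, assuming the module is saved as stego.py (the argument order follows the __main__ block above):

python stego.py e secret.txt photo.jpg    # writes photo.png containing secret.txt
python stego.py d photo.png recovered.txt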
I verified the old code still runs, so here's an example image containing steganographic data:
You'll notice that the noise added is barely visible.
Well, invisible watermarking is not that easy. Check out Digimarc and what money they earned on it. There is no free C/Python code that some lonely genius wrote and left around for free usage. I've implemented my own algorithm; the tool is called SignMyImage. Google it if interested... F>
What about Exif? It's probably not as secure as what you're thinking of, but most users don't even know it exists, and if you make the watermark information that easy to read, those who care will still be able to do it anyway.
I don't think there is a library that does this out of the box. If you want to implement your own, I would definitely go with the Python Imaging Library (PIL).
This is a Python Cookbook recipe that uses PIL to add a visible watermark to an image. If it's enough for your needs, you could use this to add a watermark with enough transparency that it is only visible if you know what you are looking for.
There is a newer (2005) digital watermarking FAQ at watermarkingworld.org
I was going to post an answer similar to Ugh's. I would suggest putting a small TXT file describing the image source (and perhaps a small copyright statement, if one applies) into the image in a manner that is difficult to detect and break.
I'm not sure how important it is to be unbreakable, but a simple solution might just be to append a text file to the end of the image, something like "This image belongs to ...".
If you open the image in a viewer/browser, it looks like a normal JPEG, but if you open it in a text editor, the last line is readable.
The same method allows you to include an actual file in an image (hide a file inside an image). I've found that it's a bit hit-or-miss, but 7-zip files seem to work. You could hide all sorts of copyright goodies inside the image.
Again, it's not unbreakable by any stretch of the imagination, but it's completely invisible to the naked eye.
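For illustration, a minimal sketch of the append trick (the filename is hypothetical); viewers stop reading at the JPEG end-of-image marker, so the trailing bytes are ignored:

with open('photo.jpg', 'ab') as f:
    f.write(b'\nThis image belongs to ...')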
Some image formats have headers where you can store arbitrary information as well.
For example, the PNG specification has a chunk where you can store text data. This is similar to the answers above, but without adding random data to the image data itself.
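For example, here is a small sketch using PIL's support for PNG text chunks (file names are hypothetical):

from PIL import Image
from PIL.PngImagePlugin import PngInfo

meta = PngInfo()
meta.add_text('Copyright', 'This image belongs to ...')
Image.open('photo.png').save('photo-tagged.png', pnginfo=meta)

# the text survives in the file's tEXt chunk
print(Image.open('photo-tagged.png').text['Copyright'])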