When saving a multi-grid figure as a PNG at 300 dpi, I lose quality. However, this problem does not occur when saving the figure as a PDF.
Here is the small portion of the code that saves the generated image:
fig.savefig(filepath, format='pdf', bbox_inches='tight', dpi=300)
fig.savefig(filepath, format='png', bbox_inches='tight', dpi=300)
Is there a way to obtain a good resolution png of an image such as the above without having to resort to using pdf?
.pdf images are vector graphics, and thus preserve all information. In other words, setting dpi=300 for the PDF output doesn't do anything (unless you have set specific objects to be rasterized using rasterized=True).
.png images are raster graphics (e.g. check this out). Therefore you have to adjust the dpi to get the balance of filesize vs. quality that you want. In other words, the image is saving correctly, it's just lower quality than the 'perfect' pdf.
The choice of image output format depends on how you will use it. Vector graphics (.pdf, .svg) are great if you have simple plots that you want to scale (zoom) perfectly. However, if you are plotting many points (>10,000 or so), this can lead to very large filesizes. In this case it may be better to rasterize the figure because a person can't see that many data points anyway.
"Which raster format should you use?" .png and .jpg are the most common. The former has better compression for images with large patches of the same color, while the latter has better compression for high pixel variability (e.g. photographs). Check this out for more info.
Note that while .png is considered 'lossless', it is only so in the sense that it preserves the rasterized information. Information is still lost when saving/converting to rasterized format.
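For illustration, here is a minimal sketch of both points, with made-up data and arbitrary dpi values (600 for the PNG is just an example of trading file size for sharpness, not a recommendation):
import numpy as np
import matplotlib.pyplot as plt

x = np.random.rand(50000)
y = np.random.rand(50000)

fig, ax = plt.subplots()
# rasterized=True embeds this artist as a bitmap even in the PDF,
# which is when dpi starts to matter for PDF output too
ax.scatter(x, y, s=1, rasterized=True)

fig.savefig('figure.pdf', bbox_inches='tight', dpi=300)
# for PNG, a higher dpi trades a larger file for a sharper image
fig.savefig('figure.png', bbox_inches='tight', dpi=600)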
In matplotlib, I am using LineCollection to draw and color the counties, where the boundaries of the counties are given. When I am saving the figure as a pdf file:
fig.savefig('filename.pdf',dpi=300)
the file size is quite big. However, on saving it as a png file:
fig.savefig('filename.png',dpi=300)
and then converting it to pdf using the Linux convert command, the file is small. I tried reducing the dpi; however, that does not change the pdf file size. Is there a way the figures can be saved directly as smaller pdf files from matplotlib?
The PDF is larger, since it contains all the vector information. By saving a PNG, you produce a rasterized image. It seems that in your case, you can produce a smaller PDF by rasterizing the plot directly:
plt.plot(x, y, 'r-', rasterized=True)
Here, x and y are some plot coordinates. You basically have to use the additional keyword argument rasterized to achieve the effect.
I think using "rasterized=True" effectively saves the image similarly to the png format. When you zoom in, you will see blurred pixels.
If you want the figures to be high quality, my suggestion is to sample from the data and make a plot. The pdf file size is roughly proportional to the number of data points it needs to remember.
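To illustrate that suggestion, a minimal sketch that subsamples the data before plotting (the data and the step of 10 are made up):
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 100, 1000000)
y = np.sin(x)

# plot only every 10th point; the PDF stays small because it
# stores far fewer vector elements
fig, ax = plt.subplots()
ax.plot(x[::10], y[::10], 'r-')
fig.savefig('filename.pdf')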
I have a bunch of data in FITS file format that I need to visualize.
Some details:
I use python with astropy to manipulate and preview the data;
The data stored in the FITS file is basically a numpy array with 70 lines ("orders") of 8096 pixels each (an echelle spectrum)
the data is saved as a multipage pdf where each page corresponds to one specific order of the FITS observations
I want to display the data as per figure 1:
figure 1 corresponds to one single order from each FITS file;
the "grey" region on the top panel corresponds to regions with no data/observations;
each "line" on the top panel corresponds to a different observation (x-axis: wavelenght, y-axis: date of observation; z-axis: flux)
the red line is optional
the bottom panel is the same data as above, but with all observations overlapping (x-axis: wavelength, y-axis: flux)
the flux is normalised to the median on the panels, but the FITS files will sometimes have values well above 10^7
Now, I am facing the following problem. If I save in pdf (or even png, etc.), I am limited by the dpi I use. The higher the dpi, the better I can preview the data, but the file becomes impossible to work with due to its size; with a low dpi, the data appears blurry. When I preview the data with matplotlib's show, I can zoom in and out with no problems, but since I generate my images on a server at work, that approach is impractical to use remotely.
So, my question is: is there a file format I could use to store my data similarly to figure 1 (ideally a multipage format that I could create with python, but not limited to that) that would allow me to work with "infinite" resolution, similar to what I have with matplotlib's show? There are several FITS file viewers in the wild, but to my knowledge they do not allow previewing the data as I wish.
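For context, a minimal sketch of how such a multipage PDF can be produced with matplotlib's PdfPages (the file name 'spectrum.fits' and the array layout are placeholders, assuming the data is read with astropy):
import numpy as np
from astropy.io import fits
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages

# placeholder file; the real data holds 70 orders of 8096 pixels each
data = fits.getdata('spectrum.fits')

with PdfPages('orders.pdf') as pdf:
    for i, order in enumerate(data):
        fig, ax = plt.subplots()
        ax.plot(order / np.median(order), lw=0.5)
        ax.set_title('Order %d' % i)
        ax.set_xlabel('pixel')
        ax.set_ylabel('normalised flux')
        pdf.savefig(fig, dpi=150)
        plt.close(fig)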
I want to compare two images (.png format) pixel by pixel using selenium in python, or alternatively using the pillow library.
I have a base image, and I get the compare image by taking a screenshot of the webpage. I want to compare those two images and assert that they are equal. How can I do it?
Below is what I have tried:
def assert_images_are_equal(base_image, compare_image):
    with open(base_image, 'rb') as f1, open(compare_image, 'rb') as f2:
        base_image_contents = f1.read()
        compare_image_contents = f2.read()
    assert base_image_contents == compare_image_contents
But this doesn't always work. I want to compare pixel by pixel. Could someone help me with this using the pillow library or any other library apart from PIL? Thanks.
It is rather difficult to say whether 2 images are the same or similar, because it depends on your definitions of "same" and "similar".
You can make a solid red image, save it as a PNG and then save the exact same image again and it could be different because the PNG format contains a timestamp in the image header that may have ticked over to the next second in between saves.
You can make a solid red PNG file that is 8-bits deep, and another that is 16-bits deep and you cannot see the difference but the data will be grossly different.
You can make a TIF file in Motorola byte order and the same file in Intel byte order. Visually, and in calculations, they will be indistinguishable, but the files will be grossly different.
You can make a GIF file that is red and it will look no different from a PNG file but the files will differ.
You can make a palette image and a true-colour image and the pixels will be grossly different but they will look identical.
You could make a simple black image with a white rectangle in the middle and write it using one JPEG library and it will come out different from the same image written with a different JPEG library, or even a different release version of the same library.
There are many more cases...
On a more helpful note, you may want to look at Perceptual Hashing, which tells you if images look pretty similar. One library that does Perceptual Hashing is ImageMagick and it has a Python binding here and here.
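For the literal pixel-by-pixel check the question asks about, here is a minimal sketch using Pillow's ImageChops; it compares decoded pixel data, so it side-steps the header/metadata pitfalls above, although palette and bit-depth differences can still make visually identical files decode differently:
from PIL import Image, ImageChops

def assert_images_are_equal(base_image, compare_image):
    # decode both files and compare pixel data, ignoring metadata
    img1 = Image.open(base_image).convert('RGB')
    img2 = Image.open(compare_image).convert('RGB')
    assert img1.size == img2.size, 'images differ in size'
    # ImageChops.difference is zero everywhere when the pixels match,
    # and getbbox() returns None for an all-zero image
    diff = ImageChops.difference(img1, img2)
    assert diff.getbbox() is None, 'images differ in pixel content'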
I wrote the following code:
from moviepy.editor import *
from PIL import Image
clip = VideoFileClip("video.mp4")
video = CompositeVideoClip([clip])
video.write_videofile("video_new.mp4", fps=clip.fps)
Then, to check whether the frames had changed and, if so, which function changed them, I retrieved the first frame of 'clip', 'video', and 'video_new.mp4' and compared them:
clip1 = VideoFileClip("video_new.mp4")
img1 = clip.get_frame(0)
img2 = video.get_frame(0)
img3 = clip1.get_frame(0)
a = img1[0, 0, 0]
b = img2[0, 0, 0]
c = img3[0, 0, 0]
I found that a=24 and b=24, but c=26. In fact, on running an array comparison loop, I found that 'img1' and 'img2' were identical but 'img3' was different.
I suspect that the function video.write_videofile is responsible for the change in the array, but I don't know why. Can anybody explain this to me and also suggest a way to write clips without changing their frames?
PS: I read the docs of 'VideoFileClip', 'FFMPEG_VideoWriter', and 'FFMPEG_VideoReader' but could not find anything useful. I need to read back the exact frame as it was before writing, in code I'm working on. Please suggest a way.
Like JPEG, MPEG-4 uses lossy compression, so it's not surprising that the frames read from "video_new.mp4" are not perfectly identical to those in "video.mp4". And as well as the variations caused purely by the lossy compression there are also variations that arise due to the wide variety of encoding options that can be used by programs that write MPEG data.
If you really need to be able to read back the exact same frame data that you write then you will have to use a different file format, but be warned: your files will be huge!
The choice of video format partly depends on what the image data is like and on what you want to do with it. If the data uses 256 colours or less, and you don't intend to perform transformations on it that will modify the colours, a simple GIF anim is a good choice. But bear in mind that even something like non-integer scaling modifies colours.
If you want to analyze the image data and transform it in various ways, it makes sense to use a format with better colour support than GIF, eg a stream of PNG images, which I assume is what Zulko mentions in his answer. FWIW, there's an anim format related to PNG called MNG, but it is not well supported or widely known.
Another option is to use a stream of PPM images, or maybe even a stream of YUV data, which is useful for certain kinds of analysis and convenient if you do intend to encode as MPEG for final consumption. The PPM format is very simple and easy to work with; YUV is slightly messy since it's a raw format with no header data, so you have to keep track of the image size and resolution data yourself.
The file size of PPM or YUV streams is large, since they incorporate no compression at all, but of course they can be compressed using standard compression techniques, if you want to save a little space when saving them to disk. OTOH, typical video processing workflows that use such streams often don't bother writing them to disk: they are sent in pipelines (perhaps using named pipes), so the file size is (mostly) irrelevant.
Although such formats take up a lot of space compared to MPEG-based files, they are far superior for use as intermediate formats while performing image data analysis and transformation, since every time you write & read back MPEG you are losing a little bit of quality.
I assume that you intend to do your image data analysis and transformations using PIL/Pillow. But you can also work with PPM & YUV streams using the ffmpeg / avconv command line programs; and the ffmpeg family happily work with sets of individual image files and GIF anims, too.
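As a concrete example of the "stream of PNG images" idea, moviepy can dump every frame as a separate lossless PNG (a sketch; the naming pattern is arbitrary, and the output will be far larger than the MP4):
from moviepy.editor import VideoFileClip

clip = VideoFileClip("video.mp4")
# one lossless PNG per frame; reading these back gives the exact pixels
clip.write_images_sequence("frame_%04d.png", fps=clip.fps)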
You can have lossless compression with the 'png' codec:
clip.write_videofile('clip_new.avi', codec='png')
EDIT @PM 2Ring: when you write the line above, it makes a video that is compressed using the png algorithm (I'm not sure whether each frame is a png or if it's more subtle).
I have a script to save between 8 and 12 images to a local folder. These images are always GIFs. I am looking for a python script to combine all the images in that one specific folder into one image. The combined 8-12 images would have to be scaled down, but I do not want to compromise the original quality (resolution) of the images either (i.e. when zoomed in on the combined image, they would look as they did initially).
The only way I am able to do this currently is by copying each image to PowerPoint.
Is this possible with python (or any other language, but preferably python)?
As an input to the script, I would type in the path where only the images are stored (i.e. C:\Documents and Settings\user\My Documents\My Pictures\BearImages).
EDIT: I downloaded ImageMagick and have been using it with the Python API and from the command line. This simple command worked great for what I wanted: montage "*.gif" -tile x4 -geometry +1+1 -background none combine.gif
If you want to be able to zoom into the images, you do not want to scale them. You'll have to rely on the image viewer to do the scaling as they're being displayed - that's what PowerPoint is doing for you now.
The input images are GIF so they all contain a palette to describe which colors are in the image. If your images don't all have identical palettes, you'll need to convert them to 24-bit color before you combine them. This means that the output can't be another GIF; good options would be PNG or JPG depending on whether you can tolerate a bit of loss in the image quality.
You can use PIL to read the images, combine them, and write the result. You'll need to create a new image that is the size of the final result, and copy each of the smaller images into different parts of it.
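A minimal sketch of that PIL approach, assuming all the GIFs share the same dimensions (the 4-column grid and output name are arbitrary choices):
import glob
import os
from PIL import Image

folder = r"C:\Documents and Settings\user\My Documents\My Pictures\BearImages"
paths = sorted(glob.glob(os.path.join(folder, "*.gif")))

# open as 24-bit RGB so differing GIF palettes don't clash
images = [Image.open(p).convert("RGB") for p in paths]
w, h = images[0].size

cols = 4
rows = (len(images) + cols - 1) // cols

combined = Image.new("RGB", (cols * w, rows * h), "white")
for i, img in enumerate(images):
    combined.paste(img, ((i % cols) * w, (i // cols) * h))

combined.save("combined.png")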
You may want to outsource the image manipulation part to ImageMagick. It has a montage command that gets you 90% of the way there; just pass it some options and the names of the files in the directory.
Have a look at Python Imaging Library.
The handbook contains several examples on both opening files, combining them and saving the result.
The easiest thing to do is turn the images into numpy matrices, and then construct a new, much bigger numpy matrix to house all of them. Then convert the np matrix back into an image. Of course it'll be enormous, so you may want to downsample.
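A minimal sketch of that numpy approach, again assuming the images share identical dimensions:
import glob
import numpy as np
from PIL import Image

arrays = [np.asarray(Image.open(p).convert("RGB"))
          for p in sorted(glob.glob("*.gif"))]

# stack the images into one tall matrix, then turn it back into an image
big = np.vstack(arrays)
Image.fromarray(big).save("combined.png")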