Saving images in Python at a very high quality

How can I save Python plots at very high quality?
That is, when I keep zooming in on the object saved in a PDF file, there should not be any blurring.
Also, what would be the best format to save it in?
PNG, EPS, or something else? I can't use PDF, because a hidden number ends up in the file and messes with Latexmk compilation.

If you are using Matplotlib and are trying to get good figures in a LaTeX document, save as an EPS. Specifically, try something like this after running the commands to plot the image:
plt.savefig('destination_path.eps', format='eps')
I have found that EPS files work best and the dpi parameter is what really makes them look good in a document.
To specify the orientation of a 3D figure before saving, call the following after creating the plot but before the plt.savefig call (assuming you have plotted on a 3D axes named ax):
ax.view_init(elev=elevation_angle, azim=azimuthal_angle)
Here elevation_angle is a number (in degrees) specifying the elevation of the viewpoint above the x-y plane, and azimuthal_angle specifies the azimuthal angle (rotation around the z axis).
I find that it is easiest to determine these values by first plotting the image, then rotating it and watching the current values of the angles appear towards the bottom of the window, just below the actual plot. Keep in mind that the x, y, z positions appear there by default, but they are replaced with the two angles when you start to click, drag, and rotate the image.
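For anyone who wants to see the two steps together, here is a minimal sketch of fixing the view angle and then saving as EPS. The surface data, the angle values (30 and 45), and the in-memory buffer are my own assumptions for illustration; they are not from the answer above.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Made-up surface data, only here to have something to look at.
x, y = np.meshgrid(np.linspace(-2, 2, 40), np.linspace(-2, 2, 40))
z = np.exp(-(x**2 + y**2))

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(x, y, z, cmap="viridis")

# Fix the camera before saving: elev is measured up from the x-y plane,
# azim is the rotation about the z axis (both in degrees).
ax.view_init(elev=30, azim=45)

# Written to an in-memory buffer here; pass a file name such as
# 'destination_path.eps' to write to disk instead.
eps_buffer = io.BytesIO()
fig.savefig(eps_buffer, format="eps")
```

The angles you read off the interactive window can be pasted straight into the view_init call.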

Just to add my results, also using Matplotlib.
.eps made all my text bold and removed transparency. .svg gave me high-resolution pictures that actually looked like my graph.
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
# Do the plot code
fig.savefig('myimage.svg', format='svg', dpi=1200)
I used 1200 dpi because a lot of scientific journals require images at 1200, 600, or 300 dpi, depending on what the image shows. You can convert to the desired dpi and format in GIMP or Inkscape.
Strictly speaking, the dpi doesn't matter for the vector parts of an .svg, since vector graphics have "infinite resolution"; it only comes into play once the file is converted to a raster format.

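You can create a figure that saves at 1920x1080 pixels (1080p) using:
fig = plt.figure(figsize=(19.20,10.80))
This gives 1920x1080 only at 100 dpi, which is Matplotlib's default figure dpi; the pixel size is the figure size in inches times the dpi. You can also go much higher or lower. The above solutions work well for printing, but these days you often want the image to go into a PNG/JPG or appear in a widescreen format.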
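Since the pixel size of a saved raster image is the figure size in inches multiplied by the dpi, it can help to compute the inches from the pixels you want. Below is a small sketch of that idea; the helper name save_png_in_pixels is hypothetical (it is not a Matplotlib function), and the image is written to memory only so that its size can be read back.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.image as mpimg
import matplotlib.pyplot as plt


def save_png_in_pixels(fig, target, width_px, height_px, dpi=100):
    """Resize fig so that saving at `dpi` yields width_px x height_px pixels."""
    fig.set_size_inches(width_px / dpi, height_px / dpi)
    fig.savefig(target, format="png", dpi=dpi)


fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])

png_buffer = io.BytesIO()
save_png_in_pixels(fig, png_buffer, 1920, 1080)

# Read the PNG back to check its size in pixels.
png_buffer.seek(0)
pixels = mpimg.imread(png_buffer, format="png")
print(pixels.shape[:2])  # (rows, columns) of the saved image
```

Raising dpi while keeping the same pixel target shrinks the figure in inches, so text and lines come out relatively larger in the image.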

Okay, I found that spencerlyon2's answer works. However, in case anybody doesn't know what to do with that one line, I had to do it this way:
import matplotlib.pyplot as plt
beingsaved = plt.figure()
# Some scatter plots (X_1_x, X_1_y, X_2_x, X_2_y are your own data arrays)
plt.scatter(X_1_x, X_1_y)
plt.scatter(X_2_x, X_2_y)
beingsaved.savefig('destination_path.eps', format='eps', dpi=1000)

In case you are working with seaborn plots instead of plain Matplotlib, you can save a .png image like this.
Let's suppose you have a matrix object (either Pandas or NumPy), and you want to draw a heatmap:
import seaborn as sb
image = sb.heatmap(matrix) # This gets you the heatmap
image.figure.savefig("C:/Your/Path/ ... /your_image.png") # This saves it
This code worked with the latest version of Seaborn at the time of writing. Other code around Stack Overflow worked only for earlier versions.
Another approach I like is to set the size of the next image before plotting, as follows:
plt.subplots(figsize=(15,15))
Then I display the plot in the console, from which I can copy and paste it wherever I want. (Since Seaborn is built on top of Matplotlib, this works without any problem.)

Related

Controlling resolution of full domain pcolormesh cells

I'm not sure whether this is a Cartopy or Matplotlib question, so I apologize if this would have been better suited for Matplotlib.
I am transitioning from NCL (NCAR Command Language https://www.ncl.ucar.edu/) to Python. Previously, I was using NCL to contour with a method of "CellFill" (https://www.ncl.ucar.edu/Document/Graphics/Resources/cn.shtml#cnFillMode). In Python, I am using pcolormesh to render a gridded dataset with a horizontal grid spacing of 3-km. In NCL, regardless of whether I am plotting the full domain or an area zoom, the resolution of the resulting image appears to be consistent using a PNG output. In Python however, if I use pcolormesh with an area zoom it looks identical to my NCL plot but if I try and plot the full domain, it looks different.
I've traced this down to the figure resolution. At the full domain view in Python, however I configure my figure settings, the 3-km cells in certain areas become "blurred together", making it appear as if the entire region has a certain contour value when in actuality there are areas with no values in between.
Here is a CONUS example of pcolormesh:
And here is a full CONUS version from NCL:
There are several areas of note, but one obvious area is the NM/AZ region. If I zoom in very closely in both Python and NCL in this region, the resulting images look identical. But at the CONUS view it looks like there's much more shading in this area than there actually should be in the Python version.
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import matplotlib.pyplot as plt
from matplotlib.colors import BoundaryNorm
# BORDERWIDTH, BORDERCOLOR, LEVELS, DSTRING and diffsum are defined elsewhere in the script
crs = ccrs.PlateCarree() # Lat/Lon
fig = plt.figure(1, figsize=(15, 15))
ax1 = plt.subplot(111, projection=crs) # create the axes before adding features to it
ax1.add_feature(cfeature.COASTLINE.with_scale('50m'), linewidth=BORDERWIDTH, edgecolor=BORDERCOLOR)
ax1.add_feature(cfeature.STATES, linewidth=BORDERWIDTH, edgecolor=BORDERCOLOR)
ax1.add_feature(cfeature.BORDERS, linewidth=BORDERWIDTH, edgecolor=BORDERCOLOR)
norm = BoundaryNorm(LEVELS, ncolors=plt.get_cmap('plasma').N, clip=False)
cf1 = ax1.pcolormesh(diffsum.lon0, diffsum.lat0, diffsum, cmap='plasma', transform=ccrs.PlateCarree(), norm=norm)
plt.savefig('testing%s.png' % (DSTRING))
Note that if I manually increase the DPI of the resulting image to something ridiculous like 1000, or increase the figure size to 100x100 inches, it also looks OK, but the resulting image is so gigantic that it is cumbersome to view on screen.
Is there something I am missing about pcolormesh that I should be doing to help better adapt the resolution of the cells being shaded with respect to the resolution of the actual figure itself?

Python matplotlib reducing the image resolution [duplicate]

In matplotlib, I am using LineCollection to draw and color the countries, where the boundaries of the countries are given. When I save the figure as a PDF file:
fig.savefig('filename.pdf',dpi=300)
the file size is quite large. However, on saving the figures as PNG files:
fig.savefig('filename.png',dpi=300)
and then converting them to PDF using the Linux convert command, the files are small. I tried reducing the dpi, but that does not change the PDF file size. Is there a way to save the figures directly as smaller PDF files from matplotlib?
The PDF is larger, since it contains all the vector information. By saving a PNG, you produce a rasterized image. It seems that in your case, you can produce a smaller PDF by rasterizing the plot directly:
plt.plot(x, y, 'r-', rasterized=True)
Here, x and y are some plot coordinates. You basically have to use the additional keyword argument rasterized to achieve the effect.
I think using "rasterized = True" effectively saves the image similarly to the PNG format. When you zoom in, you will see blurry pixels.
If you want the figures to be high quality, my suggestion is to sample from the data and plot the sample. The PDF file size is roughly proportional to the number of data points it needs to store.
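To get a feel for the size difference, here is a rough sketch that saves the same marker plot to PDF once as vector data and once with rasterized=True. The helper name pdf_size_bytes, the number of points, and the dpi value are my own choices for illustration, and the PDFs are written to memory rather than to disk.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def pdf_size_bytes(x, y, rasterized):
    """Plot the points as markers and return the size of the resulting PDF."""
    fig, ax = plt.subplots()
    ax.plot(x, y, "r.", rasterized=rasterized)
    buffer = io.BytesIO()
    # dpi only affects the rasterized artist; axes and text stay vector.
    fig.savefig(buffer, format="pdf", dpi=100)
    plt.close(fig)
    return buffer.getbuffer().nbytes


rng = np.random.default_rng(0)
x = rng.uniform(size=100_000)
y = rng.uniform(size=100_000)

vector_size = pdf_size_bytes(x, y, rasterized=False)
raster_size = pdf_size_bytes(x, y, rasterized=True)
print(f"vector: {vector_size} bytes, rasterized: {raster_size} bytes")
```

Only the plotted markers are turned into a bitmap; the axes, ticks, and labels remain vector and stay sharp when zooming.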

Vector graphics + matplotlib pcolorfast

To keep it short: Is there a way to export plots created with methods like pcolorfast, which basically draw pixels, as "real" vector graphics?
I tried to do just that using savefig and saving to a PDF. The resulting plot was a vector graphic, but the parts drawn by pcolorfast (so basically, what is inside the axes) were something like a bitmap. I checked this using Inkscape.
This resulted in really low resolution plots even though the arrays drawn with pcolorfast were about 3000x4000. I achieved higher resolution by increasing the dpi when exporting, but I'd really appreciate a conversion to a real vector graphic.
Edit: I updated my original code with the piece of code below, which should illustrate exactly what I am doing. I tried to incorporate the rasterized tip, but it has had no effect. I still end up with a very small PDF file where the plots are actually raster images (png). I am going to provide you with the data I used and the resulting PDF.
http://www.megafileupload.com/k5ku/test_array1.txt
http://www.megafileupload.com/k5kv/test_array2.txt
http://www.megafileupload.com/k5kw/test.pdf
import numpy as np
import matplotlib.pyplot as plt
arr1= np.loadtxt("test_array1.txt")
arr2= np.loadtxt("test_array2.txt")
fig, (ax1, ax2)=plt.subplots(1, 2)
ax1.set_rasterized(False)
ax1.pcolorfast(arr1)
ax2.pcolorfast(arr2, rasterized=False)
plt.show()
fig.set_rasterized(False)
fig.savefig("test.pdf")
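One thing that may explain the behaviour: as far as I understand, pcolorfast called with only the data array draws a single image artist (AxesImage), and images are embedded as bitmaps in a PDF whatever the rasterized setting is, whereas pcolormesh draws one quadrilateral per cell. The sketch below only checks which artist type each call returns; the small random array is a stand-in for the question's data, which is not reproduced here.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import QuadMesh
from matplotlib.image import AxesImage

# Small stand-in array (assumption); the original arrays were about 3000x4000.
arr = np.random.default_rng(0).random((30, 40))

fig, (ax1, ax2) = plt.subplots(1, 2)
fast_artist = ax1.pcolorfast(arr)   # uniform grid: drawn as a single image
mesh_artist = ax2.pcolormesh(arr)   # drawn as one quadrilateral per cell

print(type(fast_artist).__name__, type(mesh_artist).__name__)

pdf_buffer = io.BytesIO()
fig.savefig(pdf_buffer, format="pdf")
```

Note that a vector mesh with one quadrilateral per cell gets very large for arrays of that size, so raising the dpi of the embedded image may still be the more practical route.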

Matplotlib saves pdf with data outside set

I have a problem with Matplotlib. I usually make big plots with many data points and then, after zooming or setting limits, I save only a specific subset of the original plot as a PDF. The problem comes when I open this file: matplotlib saves all the data into the PDF and merely hides the points outside the visible range. This makes it almost impossible to open those plots afterwards or to import them into LaTeX.
Any idea of how I could solve this problem is really welcome.
Thanks a lot
If you don't have a requirement to use PDF figures, you can save the matplotlib figures as .png; this format just contains the pixels shown on the screen. For example, I tried saving a large scatter plot as PDF and its size was 198M; as PNG it came out at 270K. I've also never had any problems using PNG inside LaTeX.
I have not tested that this will work, but it might be worth rasterizing some of the artists:
fig, ax = plt.subplots()
ax.imshow(..., rasterized=True)
fig.savefig('test.pdf', dpi=600)
This will rasterize the artist when saving to vector formats. If you use a high enough dpi, this should give you reasonable quality.

How to prevent Matplotlib from clipping away my axis labels?

I'm preparing some plots for a scientific paper, which need to be wide and short in order to fit into the page limit. However, when I save them as pdf, the x axis labels are missing, because (I think) they're outside the bounding box.
Putting the following into an iPython notebook reproduces the problem.
%pylab inline
pylab.rcParams['figure.figsize'] = (8.0, 2.0)
plot([1,5,2,4,6,2,1])
xlabel("$x$")
ylabel("$y$")
savefig("test.pdf")
The resulting pdf file looks like this:
How can I change the bounding box of the pdf file? Ideally I'd like a solution that "does it properly", i.e. automatically adjusts the size so that everything fits neatly, including getting rid of that unnecessary space to the left and right - but I'm in a hurry, so I'll settle for any way to change the bounding box, and I'll guess numbers until it looks right if I have to.
After a spot of Googling, I found an answer: you can give bbox_inches='tight' to the savefig command and it will automatically adjust the bounding box to the size of the contents:
%pylab inline
pylab.rcParams['figure.figsize'] = (8.0, 2.0)
plot([1,5,2,4,6,2,1])
xlabel("$x$")
ylabel("$y$")
savefig("test.pdf",bbox_inches='tight')
Those are some tight inches, I guess.
Note that this is slightly different from Ffisegydd's answer, since it adjusts the bounding box to the plot, rather than changing the plot to fit the bounding box. (But both are fine for my purposes.)
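For anyone not working in a %pylab notebook, here is roughly the same comparison written with plain pyplot. It is only a sketch: the helper name saved_shape is hypothetical, and I save to PNG in memory (instead of test.pdf) so that the canvas size of each output can be read back in pixels.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.image as mpimg
import matplotlib.pyplot as plt


def saved_shape(fig, **savefig_kwargs):
    """Save fig as PNG into memory and return (height_px, width_px)."""
    buffer = io.BytesIO()
    fig.savefig(buffer, format="png", dpi=100, **savefig_kwargs)
    buffer.seek(0)
    return mpimg.imread(buffer, format="png").shape[:2]


fig, ax = plt.subplots(figsize=(8.0, 2.0))
ax.plot([1, 5, 2, 4, 6, 2, 1])
ax.set_xlabel("$x$")
ax.set_ylabel("$y$")

default_shape = saved_shape(fig)                     # fixed 8 x 2 inch canvas
tight_shape = saved_shape(fig, bbox_inches="tight")  # canvas fitted to the artists
print(default_shape, tight_shape)
```

The first size is always the figure size times the dpi, while the second depends on how far the labels extend, which is why the tight version no longer clips them.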
You can use plt.tight_layout() to have matplotlib adjust the layout of your plot. tight_layout() will automatically adjust the dimensions, and can also be used when you have (for example) overlapping labels/ticks/etc.
%pylab inline
pylab.rcParams['figure.figsize'] = (8.0, 2.0)
plot([1,5,2,4,6,2,1])
xlabel("$x$")
ylabel("$y$")
tight_layout()
savefig("test.pdf")
Here is a .png of the output (can't upload pdfs to SO but I've checked it and it works the same way for a pdf).
If you are preparing the plot for a scientific paper, I suggest doing the 'clipping' yourself, using
plt.subplots_adjust(left=left, bottom=bottom, right=right, top=top)
after creating the figure and before saving it. Pass the margins as keyword arguments (fractions of the figure width and height), since the positional order is left, bottom, right, top. If you are running from an IPython console, you can also call subplots_adjust after the figure has been generated and tune the margins by trial and error. Some backends (I think at least the Qt backend) also expose a GUI for this feature.
Doing this by hand takes time, but most of the time it gives a more precise result, especially with LaTeX rendering in my experience.
This is the only option whenever you have to stack two figures vertically or horizontally (with a package like subfigure, for example), as tight_layout will not guarantee the same margins in the two figures, and the axes will end up misaligned in the paper.
This is a nice link on using matplotlib for publications, covering for example how to set the figure width to match the journal column width.
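To illustrate the point about matching margins, here is a small sketch that applies the same subplots_adjust values to two figures with different tick labels. The margin numbers and the helper name make_wide_figure are made up for the example; in practice you would tune the values by trial and error as described above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Illustrative guesses (fractions of the figure size), to be tuned by hand.
MARGINS = dict(left=0.08, right=0.98, bottom=0.25, top=0.95)


def make_wide_figure(y_values):
    """Create an 8 x 2 inch figure whose axes box is placed by MARGINS."""
    fig, ax = plt.subplots(figsize=(8.0, 2.0))
    ax.plot(y_values)
    ax.set_xlabel("$x$")
    ax.set_ylabel("$y$")
    fig.subplots_adjust(**MARGINS)
    return fig, ax


fig_a, ax_a = make_wide_figure([1, 5, 2, 4, 6, 2, 1])
fig_b, ax_b = make_wide_figure([1000, 5000, 2000, 4000])

# Both axes boxes occupy the same fraction of their figures, so the two
# plots line up when stacked or placed side by side in a document.
print(ax_a.get_position().bounds)
print(ax_b.get_position().bounds)
```

With tight_layout the second figure would get a wider left margin because of its longer tick labels, which is exactly the misalignment mentioned above.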
