In matplotlib, I am using LineCollection to draw and color the countries, where the boundaries of the counties are given. When I am saving the figure as a pdf file:
fig.savefig('filename.pdf',dpi=300)
the file size is quite large. However, when saving the figures as PNG files:
fig.savefig('filename.png',dpi=300)
and then converting them to PDF using the Linux convert command, the files are small. I tried reducing the dpi; however, that does not change the PDF file size. Is there a way to save the figures directly as smaller PDF files from matplotlib?
The PDF is larger, since it contains all the vector information. By saving a PNG, you produce a rasterized image. It seems that in your case, you can produce a smaller PDF by rasterizing the plot directly:
plt.plot(x, y, 'r-', rasterized=True)
Here, x and y are some plot coordinates. You basically have to use the additional keyword argument rasterized to achieve the effect.
I think using "rasterized = True" effectively saves the image similarly to png format. When you zoom in, you will see blurring pixels.
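To get a feel for the size difference, the sketch below saves the same figure into memory twice, once fully vector and once with the collection rasterized. The random segments, the segment count, and the helper name `pdf_size` are made up for illustration; actual sizes depend on the data and the dpi.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import LineCollection


def pdf_size(rasterized, n_segments=30000, dpi=50):
    """Return the size in bytes of a PDF holding n_segments random lines."""
    rng = np.random.default_rng(0)
    # Each segment is a pair of (x, y) points: shape (n_segments, 2, 2).
    segments = rng.random((n_segments, 2, 2))
    fig, ax = plt.subplots()
    ax.add_collection(LineCollection(segments, linewidths=0.5,
                                     rasterized=rasterized))
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    buffer = io.BytesIO()
    fig.savefig(buffer, format="pdf", dpi=dpi)
    plt.close(fig)
    return buffer.getbuffer().nbytes


vector_bytes = pdf_size(rasterized=False)
raster_bytes = pdf_size(rasterized=True)
print(f"vector: {vector_bytes} bytes, rasterized: {raster_bytes} bytes")
```

The axes, ticks, and labels stay vector in both cases; only the collection itself is turned into an image.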
If you want the figures to be high quality, my suggestion is to sample from the data and plot the sample. The PDF file size is roughly proportional to the number of data points it needs to store.
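The sampling idea can be sketched as follows: keep every n-th point so that the vector file has far fewer vertices to store. The helper `thin` and the target of 2,000 points are illustrative assumptions; plain striding can miss narrow spikes, so it is worth comparing the result against the full data.

```python
import numpy as np


def thin(x, y, max_points=2000):
    """Reduce x and y to at most max_points by keeping every n-th sample."""
    x = np.asarray(x)
    y = np.asarray(y)
    # Smallest stride that brings the length down to max_points or fewer.
    step = max(1, int(np.ceil(len(x) / max_points)))
    return x[::step], y[::step]


x = np.linspace(0.0, 10.0, 1_000_000)
y = np.sin(x)
x_small, y_small = thin(x, y)
print(len(x), "->", len(x_small))
# The thinned arrays can then be passed to plt.plot(x_small, y_small).
```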
Related
I create a figure with matplotlib figure, axes = plt.subplots(nrows=3, ncols=2), plot various stuff axes[0,0].pcolormesh(...) and then export the figure to PDF figure.savefig('figure.pdf') or to PNG figure.savefig('figure.png').
I have to use PNG, because the PDF-file would be huge, but this makes the figure labels and other texts blurry.
Is there a way to export the figure to PDF -- so that labels, etc. are vector graphics -- but with the plots being exported to PNG within the resulting PDF-file? In short: export to PDF, but plots within that PDF to PNG (for small file sizes).
This is one of the huge advantages of Matplotlib over other libraries. If you do:
fig, ax = plt.subplots()
ax.pcolormesh(np.random.randn(500, 500), rasterized=True)
fig.savefig('Test.pdf', dpi=50)
The axes and labels will still be vectors, but the pcolormesh will be rasterized at 50 dpi. Of course, for publication you should use a higher dpi, but it is still excellent for reducing large data sets. Note that you will also get aliasing artifacts if you downsample data, so use with caution.
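As a rough check that the dpi argument drives the size of the rasterized part, the following sketch saves the same mixed figure to memory at two resolutions. The helper name `mixed_pdf_size` and the two dpi values are chosen only for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np


def mixed_pdf_size(dpi):
    """Save a rasterized pcolormesh with vector axes to an in-memory PDF."""
    rng = np.random.default_rng(0)
    fig, ax = plt.subplots()
    ax.pcolormesh(rng.standard_normal((500, 500)), rasterized=True)
    ax.set_xlabel("x")  # labels remain vector text in the PDF
    ax.set_ylabel("y")
    buffer = io.BytesIO()
    fig.savefig(buffer, format="pdf", dpi=dpi)
    plt.close(fig)
    return buffer.getbuffer().nbytes


size_low = mixed_pdf_size(dpi=50)
size_high = mixed_pdf_size(dpi=300)
print(f"dpi=50: {size_low} bytes, dpi=300: {size_high} bytes")
```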
When I use Matplotlib's plt.show() I get a nice plot which can be zoomed to very high precision (practically infinite). But when I save it as an image, it loses all this information and only keeps the detail that the resolution allows.
Is there any way I can save the plot with the entire information, i.e. like those interactive plots which can be rescaled at any time?
P.S- I know I can set dpi to get high quality images. This is not what I want. I want image similar to Plot which python shows when I run the program. What format is that? Or is it just very high resolution image?
Note: I am plotting .csv files which include data varying from 10^(-10) to the 100s. Thus when I save the plot as a .png file I lose all the information/kinks of the graph at very small scales and only retain features from 1-100.
Maybe the interactive graphics library Bokeh is an option for you. Its API is just a little different from what you know from matplotlib.
Bokeh creates plots as HTML files that you can view in your browser. For each graphic you can select wheel zoom to zoom interactively into your graphic. You can interactively change the range that you want to be plotted. Therefore you don't lose information in your graphic.
While saving a multiple grid figure in png with 300 as dpi, I lose quality
However this error does not occur while saving the figure as a pdf.
Here is the small portion of the code that saves the image generated:
fig.savefig(filepath, format='pdf', bbox_inches='tight', dpi=300)
fig.savefig(filepath, format='png', bbox_inches='tight', dpi=300)
Is there a way to obtain a good resolution png of an image such as the above without having to resort to using pdf?
.pdf images are vector graphics, and thus preserve all information. In other words setting dpi=300 in the pdf creation doesn't do anything (unless you have set specific objects to be rasterized using rasterized = True).
.png images are raster graphics. Therefore you have to adjust the dpi to get the balance of filesize vs. quality that you want. In other words, the image is saving correctly, it's just lower quality than the 'perfect' pdf.
The choice of image output format depends on how you will use it. Vector graphics (.pdf, .svg) are great if you have simple plots that you want to scale (zoom) perfectly. However, if you are plotting many points (>10,000 or so), this can lead to very large filesizes. In this case it may be better to rasterize the figure because a person can't see that many data points anyway.
"Which raster format should you use?" .png and .jpg are the most common. The former has better compression for images with large patches of the same color, while the latter has better compression for high pixel variability (e.g. photographs).
Note that while .png is considered 'lossless', it is only so in the sense that it preserves the rasterized information. Information is still lost when saving/converting to rasterized format.
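To make the role of dpi for raster output concrete, here is a small sketch that saves one figure as PNG at two dpi values and reads back the pixel dimensions. The helper name `png_shape` and the 4 x 3 inch figure size are illustrative assumptions, and bbox_inches='tight' is deliberately left out because it crops the image and changes the final size.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.image as mpimg
import matplotlib.pyplot as plt


def png_shape(dpi, figsize=(4, 3)):
    """Save a simple figure as PNG and return (height, width) in pixels."""
    fig, ax = plt.subplots(figsize=figsize)
    ax.plot([0, 1], [0, 1])
    buffer = io.BytesIO()
    fig.savefig(buffer, format="png", dpi=dpi)
    plt.close(fig)
    buffer.seek(0)
    # imread returns an array of shape (height, width, channels).
    return mpimg.imread(buffer, format="png").shape[:2]


shape_100 = png_shape(100)
shape_300 = png_shape(300)
print(shape_100, shape_300)
```

The pixel count is simply the figure size in inches multiplied by the dpi, so tripling the dpi triples both dimensions.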
I have a problem with Matplotlib. I usually make big plots with many data points and then, after zooming or setting limits, I save only a specific subset of the original plot as a PDF. The problem comes when I open this file: matplotlib saves all the data into the PDF and merely hides the points outside of the range. This makes it almost impossible to open those plots afterwards or to import them into LaTeX.
Any idea of how I could solve this problem is really welcome.
Thanks a lot
If you don't have a requirement to use PDF figures, you can save the matplotlib figures as .png; this format just contains the data on the screen, e.g. I tried saving a large scatter plot as PDF, its size was 198M; as png it came out as 270K; plus I've never had any problems using png inside latex.
I have not tested that this will work, but it might be worth rasterizing some of the artists:
fig, ax = plt.subplots()
ax.imshow(..., rasterized=True)
fig.savefig('test.pdf', dpi=600)
which will rasterize the artist when saving to vector formats. If you use a high enough dpi this should give you reasonable quality.
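A different approach, not taken from the answers above, is to drop the data outside the visible range before plotting so that it never reaches the file. The sketch below assumes the limits are known in advance; the helper name `crop_to_xlim` and the example limits are hypothetical. For line plots, the segments that cross the edge of the window are cut off at the last visible sample.

```python
import numpy as np


def crop_to_xlim(x, y, xmin, xmax):
    """Return only the samples whose x value lies inside [xmin, xmax]."""
    x = np.asarray(x)
    y = np.asarray(y)
    visible = (x >= xmin) & (x <= xmax)
    return x[visible], y[visible]


x = np.linspace(0.0, 100.0, 100_001)
y = np.cos(x)
x_view, y_view = crop_to_xlim(x, y, 10.0, 12.0)
print(len(x), "->", len(x_view))
# Plot x_view and y_view instead of x and y, then set the same limits.
```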
How can I save Python plots at very high quality?
That is, when I keep zooming in on the object saved in a PDF file, why isn't there any blurring?
Also, what would be the best mode to save it in?
PNG, EPS, or some other? I can't use PDF, because there is a hidden number that messes with the Latexmk compilation.
If you are using Matplotlib and are trying to get good figures in a LaTeX document, save as an EPS. Specifically, try something like this after running the commands to plot the image:
plt.savefig('destination_path.eps', format='eps')
I have found that EPS files work best. Because EPS is a vector format, the dpi parameter only affects elements that are rasterized; lines and text stay sharp at any zoom.
To specify the orientation of a 3D figure before saving, simply call the following before the plt.savefig call, but after creating the plot (assuming you have plotted using a 3D axes with the name ax):
ax.view_init(elev=elevation_angle, azim=azimuthal_angle)
Where elevation_angle is a number (in degrees) specifying the elevation angle (measured up from the x-y plane) and azimuthal_angle specifies the azimuthal angle (around the z axis).
I find that it is easiest to determine these values by first plotting the image and then rotating it and watching the current values of the angles appear towards the bottom of the window just below the actual plot. Keep in mind that the x, y, z, positions appear by default, but they are replaced with the two angles when you start to click+drag+rotate the image.
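Put together, setting the view and saving might look like the sketch below. The helix data, the helper name `save_helix`, the angles, and the output file name are all made up for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D  # noqa: F401 (needed on older versions)


def save_helix(path, elevation_angle=30, azimuthal_angle=45):
    """Plot a helix, set the camera angles, and save the figure to path."""
    t = np.linspace(0, 4 * np.pi, 200)
    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    ax.plot(np.cos(t), np.sin(t), t)
    # Set the camera after plotting and before saving.
    ax.view_init(elev=elevation_angle, azim=azimuthal_angle)
    fig.savefig(path, format="eps")
    angles = (ax.elev, ax.azim)
    plt.close(fig)
    return angles


out_path = os.path.join(tempfile.gettempdir(), "helix.eps")
angles = save_helix(out_path)
print(angles)
```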
Just to add my results, also using Matplotlib.
.eps made all my text bold and removed transparency. .svg gave me high-resolution pictures that actually looked like my graph.
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
# Do the plot code
fig.savefig('myimage.svg', format='svg', dpi=1200)
I used 1200 dpi because a lot of scientific journals require images in 1200 / 600 / 300 dpi, depending on what the image is of. Convert to desired dpi and format in GIMP or Inkscape.
Obviously the dpi doesn't matter since .svg are vector graphics and have "infinite resolution".
At the default 100 dpi, you can create a figure that is 1920x1080 pixels (1080p) using:
fig = plt.figure(figsize=(19.20,10.80))
You can also go much higher or lower. The above solutions work well for printing, but these days you want the created image to go into a PNG/JPG or appear in a wide screen format.
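Since the pixel size is the figure size in inches times the dpi, a small helper can compute the figsize for any target resolution. This is only a sketch; the helper name `figsize_for_pixels` and the commented-out file name are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt


def figsize_for_pixels(width_px, height_px, dpi=100):
    """Return the (width, height) in inches that yields the given pixel size."""
    return width_px / dpi, height_px / dpi


dpi = 100
fig = plt.figure(figsize=figsize_for_pixels(1920, 1080, dpi), dpi=dpi)
pixel_size = fig.canvas.get_width_height()
print(pixel_size)
# Passing the same dpi to savefig keeps the saved image at that pixel size:
# fig.savefig("wide.png", dpi=dpi)
plt.close(fig)
```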
Okay, I found spencerlyon2's answer working. However, in case anybody would find himself/herself not knowing what to do with that one line, I had to do it this way:
beingsaved = plt.figure()
# Some scatter plots
plt.scatter(X_1_x, X_1_y)
plt.scatter(X_2_x, X_2_y)
beingsaved.savefig('destination_path.eps', format='eps', dpi=1000)
In case you are working with seaborn plots, instead of Matplotlib, you can save a .png image like this:
Let's suppose you have a matrix object (either Pandas or NumPy), and you want to take a heatmap:
import seaborn as sb
image = sb.heatmap(matrix) # This gets you the heatmap
image.figure.savefig("C:/Your/Path/ ... /your_image.png") # This saves it
This code is compatible with the latest version of Seaborn. Other code around Stack Overflow worked only for previous versions.
Another way I like is this. I set the size of the next image as follows:
plt.subplots(figsize=(15,15))
And then later I plot the output in the console, from which I can copy-paste it where I want. (Since Seaborn is built on top of Matplotlib, there will not be any problem.)