python matplotlib save graph as data file - python

I want to create a python script that zooms in and out of matplotlib graphs along the horizontal axis. My plot is a set of horizontal bar graphs.
I also want the script to work with any generic matplotlib graph.
I do not want to just load an image and zoom into that, I want to zoom into the graph along the horizontal axis. (I know how to do this)
Is there some way I can save and load a created graph as a data file or is there an object I can save and load later?
(typically, I would be creating my graph and then displaying it with the matplotlib plt.show, but the graph creation takes time and I do not want to recreate the graph every time I want to display it)

You can use the pickle module to save your figure and then load it back.
Save your plot into a pickle file:
import pickle
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.plot([1, 2, 5, 10])
# Pickle the Figure itself; plt.plot() returns a list of lines, not the axes
with open("plot.pickle", "wb") as f:
    pickle.dump(fig, f)
And then load it back:
import pickle
import matplotlib.pyplot as plt
with open("plot.pickle", "rb") as f:
    fig = pickle.load(f)
plt.show()
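Since the question is about zooming along the horizontal axis of a stored graph, here is a minimal sketch of the full round trip, assuming a horizontal bar chart with made-up values. It pickles the figure to bytes rather than to a file and uses the non-interactive Agg backend so it runs without a display; both choices are assumptions made for illustration. Note that a pickled figure is not guaranteed to load under a different matplotlib version.

```python
import pickle

import matplotlib
matplotlib.use("Agg")  # assumption: no display needed for this sketch
import matplotlib.pyplot as plt

# Build a horizontal bar chart (illustrative values only)
fig, ax = plt.subplots()
ax.barh([0, 1, 2], [3, 7, 5])

# Serialize the whole figure, then discard the original
payload = pickle.dumps(fig)
plt.close(fig)

# Restore the figure and zoom along the horizontal axis only
restored = pickle.loads(payload)
restored_ax = restored.axes[0]
restored_ax.set_xlim(2, 6)

xlim = restored_ax.get_xlim()
n_bars = len(restored_ax.patches)  # the bars survive the round trip
```

In an interactive session you would call plt.show() after loading instead of inspecting the axes.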

Adding to Cedric's answer:
If you get a pickle error about pickling functions, add the 'dill' library to your pickling script. You just need to import it at the start; it will do the rest.

Related

Matplotlib movies from complete figures without using setData

I am interested in making movies using matplotlib. Examples I've seen so far, such as this one for moviewriter, seem to have you editing the data in-place for each frame. This is very efficient, avoiding redrawing the parts of the image that stay the same each time. However, it can be clunky for rapid data exploration. I would like a recipe that lets me simply take a fully drawn figure as each frame (clearing the same figure object each time is fine).
The reason for this: I often create moderately complicated figures using custom functions, with a form like plotme(ax, data, **options). Often I develop these functions without animations in mind, and later want to animate the figures by calling the plotting function in a loop. I don't want to have to change the logic of the functions to "setData" of existing artists in the figure for each frame.
Although the example code you've shown updates existing plot objects, there is no reason that you need to do so. The critical part of the attached code is the writer.grab_frame() which simply gets a screen capture of the current figure.
Here is an example without using existing plot objects
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import matplotlib.animation as manimation
FFMpegWriter = manimation.writers['ffmpeg']
metadata = dict(title='Movie Test', artist='Matplotlib',
                comment='Movie support!')
writer = FFMpegWriter(fps=15, metadata=metadata)
fig = plt.figure()
with writer.saving(fig, "writer_test.mp4", 100):
    for k in range(10):
        # Create a new plot object
        plt.plot(range(k), range(k), 'o')
        writer.grab_frame()
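The same idea can be applied to a custom plotting function called in a loop, clearing the figure before each frame. The following is only a sketch: plotme is a hypothetical stand-in for the asker's own function, and PillowWriter (which writes an animated GIF) is swapped in for FFMpegWriter so the example does not depend on an ffmpeg install. The output goes to a temporary directory.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt
import matplotlib.animation as manimation


def plotme(ax, data, **options):
    """Hypothetical stand-in for a custom plotting function."""
    ax.plot(data, **options)
    ax.set_xlim(0, 10)
    ax.set_ylim(0, 10)


writer = manimation.PillowWriter(fps=5)
fig = plt.figure()
movie_path = os.path.join(tempfile.mkdtemp(), "frames.gif")
frames_written = 0

with writer.saving(fig, movie_path, dpi=100):
    for k in range(1, 6):
        fig.clear()                # start every frame from an empty figure
        ax = fig.add_subplot(111)  # fresh axes for this frame
        plotme(ax, list(range(k)), marker="o")
        writer.grab_frame()        # capture the fully drawn figure
        frames_written += 1

plt.close(fig)
```

Redrawing everything per frame is slower than updating artists in place, but it leaves the plotting function untouched.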

Matplotlib doesn't forget previous data when saving figures with savefig

import matplotlib.pyplot as plt
plt.plot([1,2,3],[1,2,3],'ro')
plt.axis([-4,4,-4,4])
plt.savefig('azul.png')
plt.plot([0,1,2],[0,0,0],'ro')
plt.axis([-4,4,-4,4])
plt.savefig('amarillo.png')
Output: the second image, amarillo.png, still contains the points from the first plot.
Why does this happen, and how can I solve it?
What you see is completely expected behaviour. You can plot as much data as you like to the same figure, as often as you like, which is frequently useful.
If you want to create several figures in the same script using the matplotlib state machine, you need to close one figure before generating the next.
So in this very simple case, just add plt.close() between the two figures.
import matplotlib.pyplot as plt
plt.plot([1,2,3],[1,2,3],'bo')
plt.axis([-4,4,-4,4])
plt.savefig('azul.png')
plt.close()
plt.plot([0,1,2],[0,0,0],'yo')
plt.axis([-4,4,-4,4])
plt.savefig('amarillo.png')
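An alternative to relying on plt.close() is to give every image its own Figure and Axes objects, so nothing is shared through pyplot's current-figure state. This is a sketch, not part of the original answer; the helper name save_points is made up for illustration, and the files are written to a temporary directory.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # no window needed when only saving files
import matplotlib.pyplot as plt


def save_points(xs, ys, fmt, filename):
    """Plot one data set on its own figure, save it, and return the line count."""
    fig, ax = plt.subplots()
    ax.plot(xs, ys, fmt)
    ax.axis([-4, 4, -4, 4])
    fig.savefig(filename)
    n_lines = len(ax.lines)
    plt.close(fig)  # free the figure once it is saved
    return n_lines


out_dir = tempfile.mkdtemp()
azul = os.path.join(out_dir, "azul.png")
amarillo = os.path.join(out_dir, "amarillo.png")

lines_in_first = save_points([1, 2, 3], [1, 2, 3], "bo", azul)
lines_in_second = save_points([0, 1, 2], [0, 0, 0], "yo", amarillo)
```

Each figure holds exactly one line, so the second image no longer inherits the first data set.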

add data to an existing histogram with python

As part of a project I'm working on, I need to add data to a histogram in a loop. Part of the requirements of the project is that I don't use arrays to store data. Here's the pseudocode of what I'm trying to do:
import matplotlib.pyplot as plt  # could be numpy if that works better
plt.hist(define histogram with n bins)
for i in range(bignumber):
    MCMC to find datapoint
    add point to histogram
plt.plot()
What I'm having trouble with is how to predefine a histogram with no data and then append data to it as it's generated.
As a bit of self-advertisement (disclaimer!): for updatable histograms, you can use my library called physt: https://github.com/janpipek/physt . After you collect all the data, you can plot the results in a way similar to matplotlib (in fact, it uses matplotlib behind the scenes).
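Without any extra library, one possible approach is to fix the bin edges up front and keep only one counter per bin, incrementing the right counter as each point arrives. The sketch below does not use physt; the bin range, the bin count, and the Gaussian sampler standing in for the MCMC step are all assumptions. Only the counts are stored, never the individual points.

```python
import random

import matplotlib
matplotlib.use("Agg")  # off-screen rendering
import matplotlib.pyplot as plt

N_BINS = 20
LOW, HIGH = -4.0, 4.0              # assumed histogram range
bin_width = (HIGH - LOW) / N_BINS
counts = [0] * N_BINS              # one counter per bin, no per-point storage


def add_point(value):
    """Increment the bin containing value; values outside the range are dropped."""
    if LOW <= value < HIGH:
        index = int((value - LOW) / bin_width)
        counts[min(index, N_BINS - 1)] += 1  # guard against float rounding


random.seed(0)
for _ in range(10_000):
    add_point(random.gauss(0.0, 1.0))  # stand-in for the MCMC step

# Draw the accumulated counts as a bar chart
left_edges = [LOW + i * bin_width for i in range(N_BINS)]
fig, ax = plt.subplots()
ax.bar(left_edges, counts, width=bin_width, align="edge")
n_drawn_bars = len(ax.patches)
```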

Matplotlib - Tcl_AsyncDelete: async handler deleted by the wrong thread?

I'm asking this question because I can't solve one problem in Python/Django (actually in pure Python it's ok) which leads to RuntimeError: tcl_asyncdelete async handler deleted by the wrong thread. This is somehow related to the way how I render matplotlib plots in Django. The way I do it is:
...
import matplotlib.pyplot as plt
...
fig = plt.figure()
...
plt.close()
I have minimized my code as far as possible. But the catch is that even if I have just one line of code:
fig = plt.figure()
I still see this RuntimeError. I hope I could solve the problem if I knew the correct way of closing/cleaning/destroying plots in Python/Django.
By default matplotlib uses the Tk GUI toolkit. When you render an image without using the toolkit (i.e. into a file or a string), matplotlib still instantiates a window that doesn't get displayed, causing all kinds of problems. To avoid that, you should use the Agg backend. It can be activated like so --
import matplotlib
matplotlib.use('Agg')
from matplotlib import pyplot
For more information please refer to matplotlib documentation -- http://matplotlib.org/faq/howto_faq.html#matplotlib-in-a-web-application-server
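In a web view, the figure usually needs to end up as image bytes rather than as a window. Here is a minimal sketch of that, building the Figure directly with an Agg canvas instead of going through pyplot, so no GUI toolkit or global figure state is involved. The function name render_png and the sample data are made up for illustration; how the bytes are returned to the client depends on your Django view.

```python
import io

from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure


def render_png(xs, ys):
    """Render a line plot to PNG bytes without pyplot or any GUI toolkit."""
    fig = Figure(figsize=(4, 3), dpi=100)
    FigureCanvasAgg(fig)  # attach an off-screen Agg canvas to the figure
    ax = fig.add_subplot(111)
    ax.plot(xs, ys)
    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    return buf.getvalue()


png_bytes = render_png([1, 2, 3], [1, 4, 9])
```

Because the figure is never registered with pyplot, there is nothing to close afterwards; it is garbage-collected like any other object.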
The above (accepted) answer is a solution for a terminal environment. If you debug in an IDE, you might still want to use 'TkAgg' for displaying data. To prevent this issue, apply these two simple rules:
1.Every time you display your data, create a new fig = plt.figure()
2.Don't close old figures manually (e.g. when using a debug mode)
Example code:
import matplotlib
matplotlib.use('TkAgg')
from matplotlib import pyplot as plt
fig = plt.figure()
plt.imshow(data[:, :, :3])  # data: your own image array; plt.plot() cannot draw a 3-D array
plt.show()
This has proved to be a good intermediate solution under macOS with the PyCharm IDE.
If you don't need to show plots while debugging, the following works:
import matplotlib
matplotlib.use('Agg')
from matplotlib import pyplot as plt
However, if you would like to plot while debugging, you need to do 3 steps:
1.Keep the backend set to 'TkAgg' as follows:
import matplotlib
matplotlib.use('TkAgg')
from matplotlib import pyplot as plt
or simply
import matplotlib.pyplot as plt
2.As Fábio also mentioned, you need to add fig(no. #i)=plt.figure(no.#i) for each figure #i. For example, for plot no. #1, add:
fig1 = plt.figure(1)
plt.plot(yourX,yourY)
plt.show()
3.Add breakpoints. You need at least two breakpoints: one somewhere at the beginning of your code (before the first plot), and another at the point by which you want all plots to have been drawn. All figures created before the second breakpoint are then plotted, and you don't even need to close any figure manually.
For me, this happened due to parallel access to data by both Matplotlib and TensorBoard, after TensorBoard's server had been running for a week straight.
Restarting TensorBoard (tensorboard --logdir . --samples_per_plugin images=100) solved this for me.
I encountered this problem when plotting graphs live with matplotlib in my tkinter application.
The easiest solution I found was to always delete subplots. I found you don't need to instantiate a new figure; you only need to delete the old subplot (using del subplot), then remake it.
Before plotting a new graph, make sure to delete the old subplot.
Example:
from matplotlib.figure import Figure
f = Figure(figsize=(5, 5), dpi=100)
a = f.add_subplot(111)
# Inside the loop that updates the graph every 5 seconds:
del a  # delete the old subplot reference
a = f.add_subplot(111)  # redefine the subplot
Finding this simple solution to fix this "async handler bug" was excruciatingly painful; I hope this helps someone else :)
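A related way to reuse one subplot for live updates is to clear the existing axes before each redraw instead of deleting and recreating it. This is a different technique from the one described above, shown only as a sketch: an off-screen Agg canvas stands in for the tkinter canvas, and the short loop stands in for the periodic update.

```python
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure

f = Figure(figsize=(5, 5), dpi=100)
FigureCanvasAgg(f)  # off-screen canvas; a tkinter app would use FigureCanvasTkAgg
a = f.add_subplot(111)

for step in range(3):  # stand-in for the periodic update loop
    a.clear()          # drop the previous lines, keep the same axes object
    a.plot([0, 1, 2], [step, step + 1, step + 2])
    f.canvas.draw()    # redraw with the new data

lines_after_updates = len(a.lines)
axes_count = len(f.axes)
```

After any number of updates the figure still holds a single axes with a single line, so old artists do not accumulate.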

Exporting plot with full data or saving as a script

I'm using Python with matplotlib to create plots out of data, and I'd like to save these plots to a PDF file (but I could also use a more specific format).
I'm basically using these instructions:
plt.plot(data)
figname = ''.join([filename, '_', label, '.pdf'])
plt.savefig(figname)
But what this does is create an image of the plot at the zoom level at which it is displayed; I would like to create a copy that shows all the points (>10000) that I'm plotting, so that I can zoom to any level.
What is the correct way to do that?
EDIT: is there a format (such as '.fig' for Matlab) that directly calls the Matplotlib viewer with the data I saved?
Maybe it's possible to create a .py script that saves the points and that I can call to quickly re-display them? I think that this is what the Matlab .fig file does.
I don't know of any native Matplotlib file format which includes your data; in fact, I'm not sure the Matplotlib objects even have a write function defined.
What I do instead to simulate the Matlab .fig concept is to save the processed data (as a numpy array, or pickled) and run a separate .py script to recreate the Matplotlib plots.
So in steps:
1.Process your data and make some pretty plots until you are fully content
2.Save/pickle your processed data as close to the plot commands as possible (you might even want to store the data going into a histogram if making the histogram takes a long time)
3.Write a new script in which you import the data and copy/paste the plotting commands from the original script
It is a bit clumsy, but it works. If you really want, you could embed the pickled data as a string in your plotting script (Embed pickle (or arbitrary) data in python script). This gives you the benefit of working with a single python script containing both the data as well as the plotting code.
Edit
You can check for the existence of your stored processed data file and skip the processing steps if this file exists. So:
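import os
DATA_FILE = "processed_data.file"
# process_raw_data, read_data_from_file and plot are placeholders for your own functions
if not os.path.exists(DATA_FILE):
    my_data = process_raw_data()  # should also write DATA_FILE for next time
else:
    my_data = read_data_from_file(DATA_FILE)
plot(my_data)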
In this way, you can have one script for both creating the graph in the first place, and re-plotting the graph using pre-processed data.
You might want to add a runtime argument for forcing a re-processing of the data in case you change something to the processing script and don't want to manually remove your processed data file.
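The caching idea, including the switch that forces re-processing, could look roughly like the sketch below. It assumes the processed data is a NumPy array stored with np.save; process_raw_data is a hypothetical placeholder for the slow step, and the cache lives in a temporary directory for the sake of the example. In a real script the force flag would typically come from a command-line argument.

```python
import os
import tempfile

import numpy as np


def process_raw_data():
    """Hypothetical placeholder for the slow processing step."""
    rng = np.random.default_rng(0)
    return np.cumsum(rng.normal(size=1000))


def load_or_process(cache_path, force=False):
    """Return (data, source); reuse the cached file unless force is set."""
    if force or not os.path.exists(cache_path):
        data = process_raw_data()
        np.save(cache_path, data)  # store the result for the next run
        return data, "processed"
    return np.load(cache_path), "loaded"


cache_path = os.path.join(tempfile.mkdtemp(), "processed_data.npy")

first, first_source = load_or_process(cache_path)    # no cache yet
second, second_source = load_or_process(cache_path)  # read from the cache
third, third_source = load_or_process(cache_path, force=True)  # recompute
```

The plotting commands then run on the returned array regardless of where it came from.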
Use plt.xlim and plt.ylim to set the domain and range.
Set figsize to indirectly control the pixel resolution of the final image. (figsize sets the size of the figure in inches; the default dpi is 100.)
You can also control the dpi in the call to plt.savefig.
With figsize = (10, 10) and dpi = 100, the image will have resolution 1000x1000.
For example,
import matplotlib.pyplot as plt
import numpy as np
x, y = np.random.random((2,10000))
plt.plot(x, y, ',')
figname = '/tmp/test.pdf'
xmin, xmax = 0, 1
ymin, ymax = 0, 1
plt.xlim(xmin, xmax)
plt.ylim(ymin, ymax)
plt.savefig(figname)
Your PDF viewer should be able to zoom into any region so that individual points can be distinguished.
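For raster formats such as PNG, the relation between figsize, dpi and pixel resolution can be checked directly. The following sketch saves the same figure at two dpi values and reads the images back to inspect their size; the file names and the temporary output directory are assumptions made for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # off-screen rendering
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.random((2, 10000))

fig, ax = plt.subplots(figsize=(10, 10))  # size in inches
ax.plot(x, y, ",")                        # one pixel marker per point
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)

out_dir = tempfile.mkdtemp()
low_path = os.path.join(out_dir, "test_100dpi.png")
high_path = os.path.join(out_dir, "test_200dpi.png")
fig.savefig(low_path, dpi=100)   # 10 in * 100 dpi per side
fig.savefig(high_path, dpi=200)  # 10 in * 200 dpi per side
plt.close(fig)

# Read the images back and keep only (height, width)
low_shape = plt.imread(low_path).shape[:2]
high_shape = plt.imread(high_path).shape[:2]
```

Doubling the dpi doubles the pixel count along each side while the figure size in inches stays the same.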
