I'm hoping to find a way to optimise the following situation. I have a large contour plot created with imshow of matplotlib. I then want to use this contour plot to create a large number of png images, where each image is a small section of the contour image by changing the x and y limits and the aspect ratio.
So no plot data is changing in the loop, only the axis limits and the aspect ratio are changing between each png image.
The following MWE creates 70 png images in a "figs" folder demonstrating the simplified idea. About 80% of the runtime is taken up by fig.savefig('figs/'+filename).
I've looked into the following without coming up with an improvement:
An alternative to matplotlib with a focus on speed -- I've struggled to find any examples/documentation of contour/surface plots with similar requirements
Multiprocessing -- Similar questions I've seen here appear to require fig = plt.figure() and ax.imshow to be called within the loop, since fig and ax can't be pickled. In my case this will be more expensive than any speed gains achieved by implementing multiprocessing.
I'd appreciate any insight or suggestions you might have.
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import time, os

def make_plot(x, y, fig, ax):
    aspect = np.random.random(1)+y/2.0-x
    xrand = np.random.random(2)*x
    xlim = [min(xrand), max(xrand)]
    yrand = np.random.random(2)*y
    ylim = [min(yrand), max(yrand)]
    filename = '{:d}_{:d}.png'.format(x,y)
    # only the aspect ratio and the axis limits change between images
    ax.set_aspect(abs(aspect[0]))
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)
    fig.savefig('figs/'+filename)

if not os.path.isdir('figs'):
    os.makedirs('figs')
data = np.random.rand(25, 25)
fig = plt.figure()
ax = fig.add_axes([0., 0., 1., 1.])
# in the real case, imshow is an expensive calculation which can't be put inside the loop
ax.imshow(data, interpolation='nearest')
tstart = time.perf_counter()  # time.clock() was removed in Python 3.8
for i in range(1, 8):
    for j in range(3, 13):
        make_plot(i, j, fig, ax)
print('took {:.2f} seconds'.format(time.perf_counter()-tstart))
Since the limitation in this case is the call to plt.savefig() it cannot be optimized a lot. Internally the figure is rendered from scratch and that takes a while. Possibly reducing the number of vertices to be drawn might reduce the time a bit.
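To see where the time goes, the two steps can be timed separately. The following is only a minimal sketch, assuming the headless Agg backend and an in-memory buffer instead of files in "figs"; the helper name timed_crop is made up for illustration.

```python
import io
import time

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure()
ax = fig.add_axes([0.0, 0.0, 1.0, 1.0])
ax.imshow(np.random.rand(25, 25), interpolation="nearest")


def timed_crop(xlim, ylim):
    """Change the view, save to memory, and time both steps separately."""
    t0 = time.perf_counter()
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)
    t1 = time.perf_counter()
    buf = io.BytesIO()
    fig.savefig(buf, format="png")  # the figure is re-rendered here
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1, buf.getvalue()


limits_time, save_time, png_bytes = timed_crop((2, 10), (3, 12))
print("set limits: {:.4f} s, savefig: {:.4f} s".format(limits_time, save_time))
```

On most setups the savefig share should dominate, which would match the roughly 80% reported in the question, but the exact numbers depend on the machine and backend.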
The time to run your code on my machine (Win 8, i5 with 4 cores 3.5GHz) is 2.5 seconds. This seems not too bad. One can get a little improvement by using Multiprocessing.
A note about Multiprocessing: It may seem surprising that using the state machine of pyplot inside multiprocessing should work at all. But it does.
And in this case here, since every image is based on the same figure and axes object, one does not even have to create new figures and axes.
I modified an answer I gave here a while ago for your case and the total time is roughly halved using multiprocessing and 5 processes on 4 cores. I appended a barplot which shows the effect of multiprocessing.
import numpy as np
#import matplotlib as mpl
#mpl.use('agg') # use of agg seems to slow things down a bit
import matplotlib.pyplot as plt
import multiprocessing
import time, os

def make_plot(d):
    start = time.time()  # wall-clock time, comparable across processes
    x, y = d
    #using aspect in this way causes a warning for me
    #aspect = np.random.random(1)+y/2.0-x
    xrand = np.random.random(2)*x
    xlim = [min(xrand), max(xrand)]
    yrand = np.random.random(2)*y
    ylim = [min(yrand), max(yrand)]
    filename = '{:d}_{:d}.png'.format(x,y)
    ax = plt.gca()
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)
    plt.savefig('figs/'+filename)
    stop = time.time()
    return np.array([x,y, start, stop])

if not os.path.isdir('figs'):
    os.makedirs('figs')
data = np.random.rand(25, 25)
fig = plt.figure()
ax = fig.add_axes([0., 0., 1., 1.])
ax.imshow(data, interpolation='nearest')

some_list = []
for i in range(1, 8):
    for j in range(3, 13):
        some_list.append((i, j))

if __name__ == "__main__":
    tstart = time.time()
    print(tstart)
    num_proc = 5
    p = multiprocessing.Pool(num_proc)
    nu = p.map(make_plot, some_list)
    tooktime = 'Plotting of {} frames took {:.2f} seconds'
    tooktime = tooktime.format(len(some_list), time.time()-tstart)
    print(tooktime)
    nu = np.array(nu)
    fig, ax = plt.subplots(figsize=(8,5))
    # one bar per image, positioned at its start time relative to tstart
    ax.barh(np.arange(len(some_list)), nu[:,3]-nu[:,2],
            height=np.ones(len(some_list)), left=nu[:,2]-tstart, align="center")
    ax.set_xlabel("time [s]")
    ax.set_ylabel("image number")
    plt.show()
I want to plot a 3D tensor plane by plane using matplotlib in a loop.
However, in this example, matplotlib keeps on adding colorbars to the figure:
data = np.random.rand(100,100,10)
for i in range(10):
    plt.imshow(np.squeeze(data[:, :, i]))
    plt.colorbar()  # a new colorbar is added on every pass
    plt.pause(2)
Caveat: I've seen some complicated answers to this simple question, which didn't work. The problem may sound simple, but I'm thinking there might be an easy (short) solution.
The easy solution
Clear the figure in each loop run.
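import numpy as np
import matplotlib.pyplot as plt

data = np.random.rand(100,100,10) * np.linspace(1,7,10)
fig = plt.figure()
for i in range(10):
    plt.clf()  # clear the figure, including the previous colorbar
    plt.imshow(np.squeeze(data[:, :, i]))
    plt.colorbar()
    plt.pause(2)
plt.show()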
The efficient solution
Use the same image and just update the data. Also use a FuncAnimation instead of a loop to run everything within the GUI event loop.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

data = np.random.rand(100,100,10) * np.linspace(1,7,10)
fig, ax = plt.subplots()
im = ax.imshow(np.squeeze(data[:, :, 0]))
cbar = fig.colorbar(im, ax=ax)

def update(i):
    im.set_data(data[:, :, i])
    im.autoscale()  # rescale the color limits so the colorbar follows the new data

ani = FuncAnimation(fig, update, frames=data.shape[2], interval=2000)
plt.show()
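The same idea can be checked without a GUI. The sketch below is only an illustration under stated assumptions: it uses the Agg backend and a plain loop in place of FuncAnimation, and the array sizes are arbitrary. It updates one image in place and confirms that no extra colorbar axes appear.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt
import numpy as np

data = np.random.rand(20, 20, 5) * np.linspace(1, 7, 5)

fig, ax = plt.subplots()
im = ax.imshow(data[:, :, 0])
cbar = fig.colorbar(im, ax=ax)
n_axes_before = len(fig.axes)  # image axes + one colorbar axes

for i in range(data.shape[2]):
    im.set_data(data[:, :, i])  # reuse the existing image
    im.autoscale()              # refresh vmin/vmax from the new data
    fig.canvas.draw()

n_axes_after = len(fig.axes)
final_clim = im.get_clim()
print(n_axes_before, n_axes_after, final_clim)
```

Because only the data and the color limits change, the figure keeps exactly two axes no matter how many planes are shown.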
So here is a solution. Unfortunately it is not short at all. If someone knows how to make this less complicated, feel free to post another answer.
This is a slightly modified version of this answer
import matplotlib.pyplot as plt
import numpy as np

def visualize_tensor(data, delay=0.5):
    """ data must be 3 dimensional array and
    have format:
    [height x width x channels]"""
    assert(np.ndim(data) == 3)
    # Get number of channels from last dimension
    num_channels = np.shape(data)[-1]
    # Plot data of first channel
    fig = plt.figure()
    ax = fig.add_subplot(111)
    data_first_channel = data[:, :, 0]
    plot = ax.imshow(data_first_channel)
    # Create colorbar
    cbar = plt.colorbar(plot)
    # Iterate over all channels
    for i in range(num_channels):
        print(f"channel = {i}")
        data_nth_channel = np.squeeze(data[:, :, i])
        plot.set_data(data_nth_channel)
        vmin = np.min(data_nth_channel) # get minimum of nth channel
        vmax = np.max(data_nth_channel) # get maximum of nth channel
        # set the limits on the image (cbar.set_clim was removed from matplotlib);
        # the colorbar follows its mappable
        plot.set_clim(vmin=vmin, vmax=vmax)
        cbar_ticks = np.linspace(vmin, vmax, num=11, endpoint=True)
        cbar.set_ticks(cbar_ticks)
        plt.pause(delay)
Example execution:
data = np.random.rand(20,20,10)
visualize_tensor(data)
Using plot.autoscale() forces the colorbar to adapt dynamically, see this answer
This question intrigued me, as hacking at matplotlib is somewhat of a hobby of mine. Besides the solution posted by @mcExchange, one could use this:
from matplotlib.pyplot import subplots
import numpy as np
%matplotlib notebook
d = np.random.rand(10, 10)
fig, ax = subplots(figsize = (2,2))
# create mappable
h = ax.imshow(d)
# create colorbar
cb = fig.colorbar(h)
# show non-blocking
fig.show()
for i in range(100):
    # generate new data
    h.set_data(np.random.randn(*d.shape) + 1)
    # rescale the color limits so that the colorbar updates
    h.autoscale()
    # flush events update time
    ax.set_title(f't = {i}')
    fig.canvas.draw(); fig.canvas.flush_events();
How did I get this solution?
The docs state that colorbar.update_normal only updates if the norm on the mappable is different than before. Setting the data doesn't change the norm, so a function has to be called manually to register the update.
Behind the scenes the following happens:
# rescale data for cb trigger
h.norm.autoscale(h._A) #h._A is the representation of the data
# update mappable
cb.update_normal(h)
I want to show a jpg in a window which updates multiple times per second.
I have coded a very very compact program with just 100 lines of code (a neural network which creates the image) and don't want to put in another 100 lines of code to just show the image.
Is there anything I can do to solve this problem?
Many thx, jj
As it was stated in the comments that IO is not an issue, we shall go straight to the standard image plotting tools available in matplotlib, since it is the de facto standard plotting library for Python. Without knowing the dimensions of typical images originating in neural networks, a quick comparison of the average time it takes to call e.g. imshow, pcolormesh and matshow for different image dimensions cannot hurt (pcolor is significantly slower, so it is omitted).
import matplotlib.pyplot as plt
import numpy as np
import timeit
n = 13
repeats = 20
timetable = np.zeros((4, n-1))
labellist = ['imshow', 'matshow', 'pcolormesh']
for i in range(1, n):
    image = np.random.rand(2**i, 2**i)
    print('image size:', 2**i)
    timetable[0, i - 1] = 2**i
    timetable[1, i - 1] = timeit.timeit("plt.imshow(image)", setup="from __main__ import plt, image", number=repeats)/repeats
    timetable[2, i - 1] = timeit.timeit("plt.matshow(image)", setup="from __main__ import plt, image", number=repeats)/repeats
    timetable[3, i - 1] = timeit.timeit("plt.pcolormesh(image)", setup="from __main__ import plt, image", number=repeats)/repeats
# plot the results in a fresh figure
plt.figure()
for i in range(1, 4):
    plt.semilogy(timetable[0, :], timetable[i, :], label=labellist[i - 1])
plt.legend()
plt.xlabel('image size')
plt.ylabel('avg. exec. time [s]')
plt.ylim(1e-3, 1)
plt.show()
So, imshow it is. An elegant way to update or animate a plot in matplotlib is the animation framework it offers. That way one does not have to bother with many lines of code, as it was asked for. Here is a simple example:
import matplotlib.pyplot as plt
import numpy as np
import time
from matplotlib import animation
data = np.random.rand(128, 128)
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
im = ax.imshow(data, animated=True)
def update_image(i):
    data = np.random.rand(128, 128)
    im.set_array(data)  # push the new frame into the existing image
    # time.sleep(.5)
    # plt.pause(0.5)

ani = animation.FuncAnimation(fig, update_image, interval=0)
plt.show()
In this example the neural network would be called out of the update function. The update behaviour under heavy computational work can be emulated by time.sleep. If your application is multi-threaded plt.pause might come in handy to give the other threads time to do their work. interval=0 basically makes the plot update as often as possible.
I hope this points you in the general direction and is helpful. If you do not want to utilize animations, canvas clearing and/or blitting need to be taken care of manually.
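For readers who prefer to skip the animation framework, here is a rough sketch of what manual blitting of an image could look like. It is an illustration under stated assumptions rather than a definitive recipe: it uses the Agg backend so that it runs without a display, the helper name blit_frame is made up, and random frames stand in for the neural network output.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; a GUI backend works the same way
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
# animated=True keeps the image out of the normal draw, so the cached
# background contains only the axes, ticks and labels
im = ax.imshow(np.zeros((32, 32)), vmin=0, vmax=1, animated=True)
fig.canvas.draw()  # draw the static parts once
background = fig.canvas.copy_from_bbox(ax.bbox)


def blit_frame(frame):
    """Redraw only the image on top of the cached background."""
    fig.canvas.restore_region(background)
    im.set_data(frame)
    ax.draw_artist(im)
    fig.canvas.blit(ax.bbox)
    fig.canvas.flush_events()


frames_drawn = 0
for _ in range(5):
    blit_frame(np.random.rand(32, 32))  # stand-in for a freshly generated image
    frames_drawn += 1

rgba = np.asarray(fig.canvas.buffer_rgba())
print(frames_drawn, rgba.shape)
```

Fixing vmin and vmax up front keeps the color scale stable between frames, so nothing but the image itself has to be redrawn.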
Although there are many matplotlib optimization posts around, I didn't find the exact tips I want here, such as:
Matplotlib slow with large data sets, how to enable decimation?
Matplotlib - Fast way to create many subplots?
My problem is that I have cached CSV files of time-series data (40 of them).
I'd like to plot them in one plot with 40 subplots in a vertical series, and output them to a single rasterized image.
My code using matplotlib is as follows:
def _Draw(self):
    """Output a graph of subplots."""
    BigFont = 10
    # Prepare subplots.
    nFiles = len(self.inFiles)
    fig = plt.figure()
    for i, f in enumerate(self.inFiles[0:3]):
        pltTitle = '{}:{}'.format(i, f)
        colorFile = self._GenerateOutpath(f, '_rgb.csv')
        data = np.loadtxt(colorFile, delimiter=Separator)
        nRows = data.shape[0]
        ind = np.arange(nRows)
        vals = np.ones((nRows, 1))
        ax = fig.add_subplot(nFiles, 1, i+1)
        ax.set_title(pltTitle, fontsize=BigFont, loc='left')
        ax.bar(ind, vals, width=1.0, edgecolor='none', color=data)
    figout = plt.gcf()
    plt.savefig(self.args.outFile, dpi=300, bbox_inches='tight')
The script hangs for the whole night. My data sets are typically matrices of about 10,000 x 3 to 30,000 x 3.
In my case, I don't think I can use memmapfile to avoid memory hog because the subplot seems to be the problem here, not the data imported each loop.
I have no idea where to start to optimize this workflow.
I could, however, forget about subplots and generate one plot image per data at a time, and stitch the 40 images later, but that is not ideal.
Is there an easy way in matplotlib to do this?
Your problem is the way you're plotting your data.
Using bar to plot tens of thousands of bars of exactly the same size is very inefficient compared to using imshow to accomplish the same thing.
For example:
import numpy as np
import matplotlib.pyplot as plt
# Random r,g,b data similar to what you seem to be loading in....
data = np.random.random((30000, 3))
# Make data a 1 x size x 3 array
data = data[None, ...]
# Plotting using `imshow` instead of `bar` will be _much_ faster.
fig, ax = plt.subplots()
ax.imshow(data, interpolation='nearest', aspect='auto')
This should be essentially equivalent to what you're currently doing, but will draw much faster and use less memory.
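Applied to the stacked-subplot layout from the question, the idea might look like the sketch below. It is a hedged illustration: the function name plot_color_strips is invented, random arrays stand in for the CSV files, only three strips are drawn instead of 40, and the Agg backend is assumed so that it runs headless.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, output goes to a buffer
import matplotlib.pyplot as plt
import numpy as np


def plot_color_strips(series, font_size=10):
    """Draw one imshow strip per (n, 3) RGB array, stacked vertically."""
    fig, axes = plt.subplots(nrows=len(series), squeeze=False,
                             figsize=(8, 1.0 * len(series)))
    for i, (ax, rgb) in enumerate(zip(axes[:, 0], series)):
        # (n, 3) -> (1, n, 3): a single row of n RGB pixels
        ax.imshow(rgb[None, ...], interpolation="nearest", aspect="auto")
        ax.set_title(str(i), fontsize=font_size, loc="left")
        ax.set_yticks([])
    return fig


# stand-ins for the cached CSV data (values in [0, 1), as imshow expects for RGB floats)
series = [np.random.random((n, 3)) for n in (10000, 20000, 30000)]
fig = plot_color_strips(series)

buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=100)
png_bytes = buf.getvalue()
print(len(fig.axes), len(png_bytes))
```

Each subplot then holds a single image artist instead of tens of thousands of rectangle patches, which is where the speedup comes from.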
I'm working on some computer vision algorithm and I'd like to show how a numpy array changes in each step.
What works now is that if I have a simple imshow( array ) at the end of my code, the window displays and shows the final image.
However what I'd like to do is to update and display the imshow window as the image changes in each iteration.
So for example I'd like to do:
import numpy as np
import matplotlib.pyplot as plt
import time
array = np.zeros( (100, 100), np.uint8 )
for i in range( 0, 100 ):
    for j in range( 0, 50 ):
        array[j, i] = 1
        plt.imshow( array )
The problem is that this way, the Matplotlib window doesn't get activated, only once the whole computation is finished.
I've tried both native matplotlib and pyplot, but the results are the same. For plotting commands I found an .ion() switch, but here it doesn't seem to work.
Q1. What is the best way to continuously display updates to a numpy array (actually a uint8 greyscale image)?
Q2. Is it possible to do this with an animation function, like in the dynamic image example? I'd like to call a function inside a loop, thus I don't know how to achieve this with an animation function.
You don't need to call imshow all the time. It is much faster to use the object's set_data method:
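myobj = imshow(first_image)
for pixel in pixels:
    myobj.set_data(updated_image)  # updated_image: placeholder for the array after processing this pixel
    draw()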
The draw() should make sure that the backend updates the image.
UPDATE: your question was significantly modified. In such cases it is better to ask another question. Here is a way to deal with your second question:
Matplotlib's animation only deals with one increasing dimension (time), so your double loop won't do. You need to convert your indices to a single index. Here is an example:
import numpy as np
from matplotlib import pyplot as plt
from matplotlib import animation
nx = 150
ny = 50
fig = plt.figure()
data = np.zeros((nx, ny))
im = plt.imshow(data, cmap='gist_gray_r', vmin=0, vmax=1)
def init():
    im.set_data(np.zeros((nx, ny)))

def animate(i):
    # convert the single frame index back into the two loop indices
    xi = i // ny
    yi = i % ny
    data[xi, yi] = 1
    im.set_data(data)
    return im

anim = animation.FuncAnimation(fig, animate, init_func=init, frames=nx * ny)
plt.show()
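The index conversion used in animate can be checked on its own. This small sketch (the helper name frame_to_indices is made up) shows that a single frame counter visits the same (xi, yi) pairs, in the same order, as a nested double loop would.

```python
nx, ny = 150, 50


def frame_to_indices(i, ny):
    """Map a flat frame number to (outer, inner) loop indices."""
    return divmod(i, ny)  # same as (i // ny, i % ny)


flat_order = [frame_to_indices(i, ny) for i in range(nx * ny)]
nested_order = [(xi, yi) for xi in range(nx) for yi in range(ny)]
print(flat_order[:3], flat_order[-1])
```

The two lists are identical, so any double loop of this shape can be driven by one frame counter.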
I struggled to make it work because many posts talk about this problem, but no one seems to care about providing a working example. In this case, however, the reasons were different:
I couldn't use Tiago's or Bily's answers because they are not in the
same paradigm as the question. In the question, the refresh is
scheduled by the algorithm itself, while with funcanimation or
videofig, we are in an event driven paradigm. Event driven
programming is unavoidable for modern user interface programming, but
when you start from a complex algorithm, it might be difficult to
convert it to an event driven scheme - and I wanted to be able to do
it in the classic procedural paradigm too.
Bub Espinja's reply suffered from another problem: I didn't try it in the
context of jupyter notebooks, but repeating imshow is wrong since it
recreates new data structures each time which causes an important
memory leak and slows down the whole display process.
Also, Tiago mentioned calling draw() but did not specify where to get it from, and, by the way, you don't need it. The function you really need to call is flush_events(). Sometimes it works without it, but only because it has been triggered from somewhere else; you can't count on that. The really tricky point is that if you call imshow() on an empty table, you need to specify vmin and vmax, or it will fail to initialize its color map and set_data will fail too.
Here is a working solution :
import numpy as np
import matplotlib.pyplot as plt

IMAGE_SIZE = 500  # assumed value; the definition is missing from this listing
plt.ion()  # interactive mode, so that the windows appear without blocking

fig1, ax1 = plt.subplots()
fig2, ax2 = plt.subplots()
fig3, ax3 = plt.subplots()
# this example doesn't work because array only contains zeroes
array = np.zeros(shape=(IMAGE_SIZE, IMAGE_SIZE), dtype=np.uint8)
axim1 = ax1.imshow(array)
# In order to solve this, one needs to set the color scale with vmin/vmax
# I found this, thanks to @jettero's comment.
array = np.zeros(shape=(IMAGE_SIZE, IMAGE_SIZE), dtype=np.uint8)
axim2 = ax2.imshow(array, vmin=0, vmax=99)
# alternatively this process can be automated from the data
array[0, 0] = 99 # this value allows imshow to initialise its color scale
axim3 = ax3.imshow(array)
del array
for _ in range(50):
    print(".", end="")
    matrix = np.random.randint(0, 100, size=(IMAGE_SIZE, IMAGE_SIZE), dtype=np.uint8)
    # push the new data to each image, then let the GUI process the update
    axim1.set_data(matrix)
    fig1.canvas.flush_events()
    axim2.set_data(matrix)
    fig2.canvas.flush_events()
    axim3.set_data(matrix)
    fig3.canvas.flush_events()
UPDATE: I added the vmin/vmax solution based on @Jettero's comment (I missed it at first).
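The color-scale pitfall can be made visible by inspecting the color limits directly. The sketch below is a small illustration, assuming the Agg backend and an arbitrary 8 x 8 image size: an image created from an all-zero array keeps the limits (0, 0) after set_data, while one created with explicit vmin/vmax keeps the intended range.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, only the limits are inspected
import matplotlib.pyplot as plt
import numpy as np

zeros = np.zeros((8, 8), dtype=np.uint8)
fig, (ax_auto, ax_fixed) = plt.subplots(1, 2)

im_auto = ax_auto.imshow(zeros)                    # limits taken from the data: 0..0
im_fixed = ax_fixed.imshow(zeros, vmin=0, vmax=99)  # limits given explicitly

new = np.random.randint(0, 100, size=(8, 8), dtype=np.uint8)
im_auto.set_data(new)   # set_data does not touch the color limits
im_fixed.set_data(new)

auto_clim = tuple(im_auto.get_clim())
fixed_clim = tuple(im_fixed.get_clim())

im_auto.autoscale()     # explicit rescale from the current data
rescaled_clim = tuple(im_auto.get_clim())
print(auto_clim, fixed_clim, rescaled_clim)
```

Calling autoscale() after set_data is a third option when the value range is not known in advance.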
If you are using Jupyter, maybe this answer interests you.
I read on this site that the clear_output function from IPython.display can do the trick:
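%matplotlib inline
from matplotlib import pyplot as plt
from IPython.display import clear_output

for i in range(len(list_of_frames)):
    plt.imshow(list_of_frames[i])
    plt.title('Frame %d' % i)
    plt.show()
    clear_output(wait=True)  # replace the previous output once the next frame is ready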
It is true that this method is quite slow, but it can be used for testing purposes.
I implemented a handy script that just suits your needs. Try it out here
An example that shows images in a custom directory is like this:
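import os
import glob
from matplotlib.pyplot import imread  # scipy.misc.imread was removed from SciPy

# video_dir must point to the directory that holds the frames
img_files = glob.glob(os.path.join(video_dir, '*.jpg'))

def redraw_fn(f, axes):
    img_file = img_files[f]
    img = imread(img_file)
    if not redraw_fn.initialized:
        redraw_fn.im = axes.imshow(img, animated=True)
        redraw_fn.initialized = True
    else:
        redraw_fn.im.set_array(img)
redraw_fn.initialized = False

# videofig comes from the script linked above
videofig(len(img_files), redraw_fn, play_fps=30)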
I had a similar problem: I wanted to update the image without repeatedly replacing the axes, but neither plt.imshow() nor ax.imshow() was updating the displayed figure.
I finally discovered that some form of draw() was required. But fig.canvas.draw(), ax.draw() ... all did not work. I finally found the solution here:
%matplotlib notebook #If using Jupyter Notebook
import matplotlib.pyplot as plt
import numpy as np
imData = np.array([[1,3],[3,1]])
# Setup and plot image
fig = plt.figure()
ax = plt.subplot(111)
im = ax.imshow(imData)
# Change image contents
newImData = np.array([[2,2],[2,2]])
im.set_data( newImData )
import numpy as np
import matplotlib.pyplot as plt
k = 10
array = np.zeros((k, k))
for i in range(k):
    for j in range(k):
        array[i, j] = 1
I'm currently evaluating different python plotting libraries. Right now I'm trying matplotlib and I'm quite disappointed with the performance. The following example is modified from SciPy examples and gives me only ~ 8 frames per second!
Any ways of speeding this up or should I pick a different plotting library?
from pylab import *
import time
fig = figure()
ax1 = fig.add_subplot(611)
ax2 = fig.add_subplot(612)
ax3 = fig.add_subplot(613)
ax4 = fig.add_subplot(614)
ax5 = fig.add_subplot(615)
ax6 = fig.add_subplot(616)
x = arange(0,2*pi,0.01)
y = sin(x)
line1, = ax1.plot(x, y, 'r-')
line2, = ax2.plot(x, y, 'g-')
line3, = ax3.plot(x, y, 'y-')
line4, = ax4.plot(x, y, 'm-')
line5, = ax5.plot(x, y, 'k-')
line6, = ax6.plot(x, y, 'p-')
# turn off interactive plotting - speeds things up by 1 Frame / second
ioff()
tstart = time.time() # for profiling
for i in arange(1, 200):
    line1.set_ydata(sin(x+i/10.0)) # update the data
    draw() # redraw the canvas
print('FPS:', 200/(time.time()-tstart))
First off, (though this won't change the performance at all) consider cleaning up your code, similar to this:
import matplotlib.pyplot as plt
import numpy as np
import time
x = np.arange(0, 2*np.pi, 0.01)
y = np.sin(x)
fig, axes = plt.subplots(nrows=6)
styles = ['r-', 'g-', 'y-', 'm-', 'k-', 'c-']
lines = [ax.plot(x, y, style)[0] for ax, style in zip(axes, styles)]
fig.show()
tstart = time.time()
for i in range(1, 20):
    for j, line in enumerate(lines, start=1):
        line.set_ydata(np.sin(j*x + i/10.0))
    fig.canvas.draw()  # full redraw of the whole figure on every frame
print('FPS:', 20/(time.time()-tstart))
With the above example, I get around 10fps.
Just a quick note, depending on your exact use case, matplotlib may not be a great choice. It's oriented towards publication-quality figures, not real-time display.
However, there are a lot of things you can do to speed this example up.
There are two main reasons why this is as slow as it is.
1) Calling fig.canvas.draw() redraws everything. It's your bottleneck. In your case, you don't need to re-draw things like the axes boundaries, tick labels, etc.
2) In your case, there are a lot of subplots with a lot of tick labels. These take a long time to draw.
Both these can be fixed by using blitting.
To do blitting efficiently, you'll have to use backend-specific code. In practice, if you're really worried about smooth animations, you're usually embedding matplotlib plots in some sort of gui toolkit, anyway, so this isn't much of an issue.
However, without knowing a bit more about what you're doing, I can't help you there.
Nonetheless, there is a gui-neutral way of doing it that is still reasonably fast.
import matplotlib.pyplot as plt
import numpy as np
import time
x = np.arange(0, 2*np.pi, 0.1)
y = np.sin(x)
fig, axes = plt.subplots(nrows=6)
fig.show()
# We need to draw the canvas before we start animating...
fig.canvas.draw()
styles = ['r-', 'g-', 'y-', 'm-', 'k-', 'c-']
def plot(ax, style):
    return ax.plot(x, y, style, animated=True)[0]
lines = [plot(ax, style) for ax, style in zip(axes, styles)]
# Let's capture the background of the figure
backgrounds = [fig.canvas.copy_from_bbox(ax.bbox) for ax in axes]
tstart = time.time()
for i in range(1, 2000):
    items = enumerate(zip(lines, axes, backgrounds), start=1)
    for j, (line, ax, background) in items:
        # restore the clean background, redraw only the line, then blit that axes
        fig.canvas.restore_region(background)
        line.set_ydata(np.sin(j*x + i/10.0))
        ax.draw_artist(line)
        fig.canvas.blit(ax.bbox)
print('FPS:', 2000/(time.time()-tstart))
This gives me ~200fps.
To make this a bit more convenient, there's an animations module in recent versions of matplotlib.
As an example:
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import numpy as np
x = np.arange(0, 2*np.pi, 0.1)
y = np.sin(x)
fig, axes = plt.subplots(nrows=6)
styles = ['r-', 'g-', 'y-', 'm-', 'k-', 'c-']
def plot(ax, style):
return ax.plot(x, y, style, animated=True)[0]
lines = [plot(ax, style) for ax, style in zip(axes, styles)]
def animate(i):
    for j, line in enumerate(lines, start=1):
        line.set_ydata(np.sin(j*x + i/10.0))
    return lines

# We'd normally specify a reasonable "interval" here...
ani = animation.FuncAnimation(fig, animate, range(1, 200),
                              interval=0, blit=True)
plt.show()
Matplotlib makes great publication-quality graphics, but is not very well optimized for speed.
There are a variety of python plotting packages that are designed with speed in mind:
[ edit: pyqwt is no longer maintained; the previous maintainer is recommending pyqtgraph ]
To start, Joe Kington's answer provides very good advice using a gui-neutral approach, and you should definitely take his advice (especially about blitting) and put it into practice. For more info on this approach, read the Matplotlib Cookbook.
However, the non-GUI-neutral (GUI-biased?) approach is key to speeding up the plotting. In other words, the backend is extremely important to plot speed.
Put these two lines before you import anything else from matplotlib:
import matplotlib
matplotlib.use('GTKAgg')
Of course, there are various options to use instead of GTKAgg, but according to the cookbook mentioned before, this was the fastest. See the link about backends for more options.
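As a rough sketch of the mechanics, the backend can be selected and then verified as shown below. Agg is used here purely as a stand-in, on the assumption that it is available everywhere because it needs no GUI toolkit; substitute whichever interactive backend your installation provides.

```python
import matplotlib
matplotlib.use("Agg")  # select the backend before pyplot is imported

import matplotlib.pyplot as plt

active = matplotlib.get_backend()
print("active backend:", active)
```

Printing the active backend is a quick way to confirm that the selection took effect before timing anything.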
For the first solution proposed by Joe Kington ( .copy_from_bbox & .draw_artist & canvas.blit), I had to capture the backgrounds after the fig.canvas.draw() line, otherwise the background had no effect and I got the same result as you mentioned. If you put it after the fig.show() it still does not work as proposed by Michael Browne.
So just put the background line after the canvas.draw():
# We need to draw the canvas before we start animating...
fig.canvas.draw()
# Let's capture the background of the figure
backgrounds = [fig.canvas.copy_from_bbox(ax.bbox) for ax in axes]
This may not apply to many of you, but I'm usually operating my computers under Linux, so by default I save my matplotlib plots as PNG and SVG. This works fine under Linux but is unbearably slow on my Windows 7 installations [MiKTeX under Python(x,y) or Anaconda], so I've taken to adding this code, and things work fine over there again:
import platform # Don't save as SVG if running under Windows.
# Plot code goes here.
fig.savefig('figure_name.png', dpi = 200)
if platform.system() != 'Windows':
    # In my installations of Windows 7, it takes an inordinate amount of time to save
    # graphs as .svg files, so on that platform I've disabled the call that does so.
    # The first run of a script is still a little slow while everything is loaded in,
    # but execution times of subsequent runs are improved immensely.
    fig.savefig('figure_name.svg')
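If the same check is needed in several scripts, it could be wrapped in a small helper. This is only a sketch; the names output_formats and save_figure are made up for illustration and are not part of matplotlib.

```python
import platform


def output_formats(system=None):
    """Return the file formats to save; SVG is skipped on Windows."""
    if system is None:
        system = platform.system()
    return ["png"] if system == "Windows" else ["png", "svg"]


def save_figure(fig, basename, dpi=200, system=None):
    """Save fig once per selected format and return the written paths."""
    paths = []
    for ext in output_formats(system):
        path = "{}.{}".format(basename, ext)
        fig.savefig(path, dpi=dpi)
        paths.append(path)
    return paths


print(output_formats("Windows"), output_formats("Linux"))
```

Passing the system name explicitly makes the platform decision easy to test without switching machines.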