GPU memory increases with each frame (leak)?

GPU memory increases with each frame (leak)? - python

I am rendering an image which is updated every frame by making it a texture of a square 2D plate (comprised from 2 triangles). However, GPU memory seems to increase monotonically with every frame.
The draw function looks like this:
prog = gloo.Program(vertex, fragment, count=4)
def Draw(self, inImRGB):
texture = inImRGB.copy().view(gloo.Texture2D)
texture.interpolation = gl.GL_LINEAR
CBackgroundImage._prog['texture'] = texture
CBackgroundImage._prog.draw(gl.GL_TRIANGLE_STRIP)
And it is called periodically, for each new available image, using the following callback:
from glumpy import app
window = app.Window(...)
#window.event
def on_draw():
window.clear()
bgImageObj.Draw(newImRGB)
Any idea why GPU memory keeps accumulating ? Should I somehow free the texture each new frame, or fill it in a different way? If so, how?

texture = inImRGB.copy().view(gloo.Texture2D)
Creates and all new texture; eventually the Phython GC will clean up the old stuff, but that's not going to happen, if there's no shortage of memory.
Create the texture during initialization and then reuse.

Related

Drawing a pyglet batch increases memory usage?

I'm trying to use Pyglet's batch drawing capabilities to speed up the rendering of an animation which mainly consists of drawing a large set of vertices many times per second using GL_LINES. Speed-wise, I succeeded, as previously laggy animations now run crisp up to 60 fps. However, I noticed that every time a new frame of animation is drawn (by calling batch.draw() on a few batches, within a function scheduled to be called every frame by pyglet.clock.schedule_interval()), the memory used by the program steadily rises.
I expect the memory usage to substantially increase while creating the batches (and it does), but I don't understand why a mere call to batch.draw() also seems to be causing it, or at least why the memory usage seems to accumulate over future frames. Do I need to somehow manually delete previously drawn images from memory each time I need to redraw a frame? If so, I've found nothing in the Pyglet documentation about needing to do such a thing.
In my program, I call pyglet.gl.glClear(pyglet.gl.GL_COLOR_BUFFER_BIT) at the beginning of each new frame update in order to clear the previous frame, but perhaps the previous frame is still persisting in memory?
Below is an independent example script which exhibits this precise behavior:
import pyglet as pg
import random
currentFrame = 0
window = pg.window.Window(600,600)
# A list of lists of batches to be drawn at each frame.
frames = []
# Make 200 frames...
for frameCount in range(200):
batches = []
for n in range(3):
batch = pg.graphics.Batch()
# Generate a random set of vertices within the window space
batch.add(1000, pg.gl.GL_LINES, None,
("v2f", tuple(random.random()*600 for i in range(2000))),
("c3B", (255,255,255)*1000))
batches.append(batch)
frames.append(batches)
# This function will be called every time the window is redrawn
def update(dt, frames=frames):
global currentFrame
if currentFrame >= len(frames):
pg.clock.unschedule(update)
print("Animation complete")
return
batches = frames[currentFrame]
# Clear the previous frame
pg.gl.glClearColor(0,0,0,1) # Black background color
pg.gl.glClear(pg.gl.GL_COLOR_BUFFER_BIT)
# Define line width for this frame and draw the batches
pg.gl.glLineWidth(5)
for batch in batches:
batch.draw()
currentFrame += 1
# Call the update() function 30 times per second.
pg.clock.schedule_interval(update, 1.0/30)
pg.app.run()
While the animation is running, no new batches should be created, and yet the memory used by Python steadily increases thruout the duration of the animation.

I'm sorry that this might be an extremely late answer, but for anyone coming across this...
you have created 2 lists which are the source of the linear memory usage increase:
frames = []
and
batches = []
on lines 8 and 12 respectively.
Then, in your for loop you add 3 batches into the "batches" list and then add those into the "frames" list.
for frameCount in range(200):
batches = []
for n in range(3):
batch = pg.graphics.Batch()
# Generate a random set of vertices within the window space
batch.add(1000, pg.gl.GL_LINES, None,
("v2f", tuple(random.random()*600 for i in range(2000))),
("c3B", (255,255,255)*1000))
batches.append(batch)
frames.append(batches)
fortunately, you do override whatever is in "batches" in here:
batches = frames[currentFrame]
but that still leaves out "frames", which is never reset and for every frame, has 3 more batches in it.
if you start with 0 frames and end with 200 frames, your memory usage with this "frames" list will go from storing nothing to storing 600 batches.
try to reset the "frames" lst after each frame and keeping track of your count of frames with a simple integer.
I hope this helps at least someone

Pyglet: blit_into texture and alpha

I've been using pyglet for a while now and I really like it. I've got one thing I'd like to do but have been unable to do so far, however.
I'm working on a 2D roleplaying game and I'd like the characters to be able to look different - that is to say, I wouldn't like use completely prebuilt sprites, but instead I'd like there to be a range of, say, hairstyles and equipment, visible on characters in the game.
So to get this thing working, I thought the most sensible way to go on about it would be to create a texture with pyglet.image.Texture.create() and blit the correct sprite source images on that texture using Texture.blit_into. For example, I could blit a naked human image on the texture, then blit a hair texture on that, etc.
human_base = pyglet.image.load('x/human_base.png').get_image_data()
hair_style = pyglet.image.load('x/human_hair1.png').get_image_data()
texture = pyglet.image.Texture.create(width=human_base.width,height=human_base.height)
texture.blit_into(human_base, x=0, y=0, z=0)
texture.blit_into(hair_style, x=0, y=0, z=1)
sprite = pyglet.sprite.Sprite(img=texture, x=0, y=0, batch=my_sprite_batch)
The problem is that blitting the second image into the texture "overwrites" the texture already blitted in. Even though both of the images have an alpha channel, the image below (human_base) is not visible after hair_style is blit on top of it.
One reading this may be wondering why do it this way instead of, say, creating two different pyglet.sprite.Sprite objects, one for human_base and one for hair_style and just move them together. One thing is the draw ordering: the game is tile-based and isometric, so sorting a visible object consisting of multiple sprites with differing layers (or ordered groups, as pyglet calls them) would be a major pain.
So my question is, is there a way to retain alpha when using blit_into with pyglet. If there is no way to do it, please, any suggestions for alternative ways to go on about this would be very much appreciated!

setting the blend function correctly should fix this:
pyglet.gl.glBlendFunc(pyglet.gl.GL_SRC_ALPHA,pyglet.gl.GL_ONE_MINUS_SRC_ALPHA)

I ran into the very same problem and couldn't find a proper solution. Apparently blitting two RGBA images/textures overlapping together will remove the image beneath. Another approache I came up with was using every 'clothing image' on every character as an independent sprite attached to batches and groups, but that was far from the optimal and reduced the FPS dramatically.
I got my own solution by using PIL
import pyglet
from PIL import Image
class main(pyglet.window.Window):
def __init__ (self):
TILESIZE = 32
super(main, self).__init__(800, 600, fullscreen = False)
img1 = Image.open('under.png')
img2 = Image.open('over.png')
img1.paste(img2,(0,0),img2.convert('RGBA'))
img = img1.transpose(Image.FLIP_TOP_BOTTOM)
raw_image=img.tostring()
self.image=pyglet.image.ImageData(TILESIZE,TILESIZE,'RGBA',raw_image)
def run(self):
while not self.has_exit:
self.dispatch_events()
self.clear()
self.image.blit(0,0)
self.flip()
x = main()
x.run()
This may well not be the optimal solution, but if you do the loading in scene loading, then it won't matter, and with the result you can do almost almost anything you want to (as long as you don't blit it on another texture, heh). If you want to get just 1 tile (or a column or a row or a rectangular box) out of a tileset with PIL, you can use the crop function.

Understanding performance limitations of the Tkinter Canvas

I've created a simple application to display a scatterplot of data using Tkinter's Canvas widget (see the simple example below). After plotting 10,000 data points, the application becomes very laggy, which can be seen by trying to change the size of the window.
I realize that each item added to the Canvas is an object, so there may be some performance issues at some point, however, I expected that level to be much higher than 10,000 simple oval objects. Further, I could accept some delays when drawing the points or interacting with them, but after they are drawn, why would just resizing the window be so slow?
After reading effbot's performance issues with the Canvas widget it seems there may be some unneeded continuous idle tasks during resizing that need to be ignored:
The Canvas widget implements a straight-forward damage/repair display
model. Changes to the canvas, and external events such as Expose, are
all treated as “damage” to the screen. The widget maintains a dirty
rectangle to keep track of the damaged area.
When the first damage event arrives, the canvas registers an idle task
(using after_idle) which is used to “repair” the canvas when the
program gets back to the Tkinter main loop. You can force updates by
calling the update_idletasks method.
So, the question is whether there is any way to use update_idletasks to make the application more responsive once the data has been plotted? If so, how?
Below is the simplest working example. Try resizing the window after it loads to see how laggy the application becomes.
Update
I originally observed this problem in Mac OS X (Mavericks), where I get a substantial spike in CPU usage when just resizing the window. Prompted by Ramchandra's comments I've tested this in Ubuntu and this doesn't seem to occur. Perhaps this is a Mac Python/Tk problem? Wouldn't be the first I've run into, see my other question:
PNG display in PIL broken on OS X Mavericks?
Could someone also try in Windows (I don't have access to a Windows box)?
I may try running on the Mac with my own compiled version of Python and see if the problem persists.
Minimal working example:
import Tkinter
import random
LABEL_FONT = ('Arial', 16)
class Application(Tkinter.Frame):
def __init__(self, master, width, height):
Tkinter.Frame.__init__(self, master)
self.master.minsize(width=width, height=height)
self.master.config()
self.pack(
anchor=Tkinter.NW,
fill=Tkinter.NONE,
expand=Tkinter.FALSE
)
self.main_frame = Tkinter.Frame(self.master)
self.main_frame.pack(
anchor=Tkinter.NW,
fill=Tkinter.NONE,
expand=Tkinter.FALSE
)
self.plot = Tkinter.Canvas(
self.main_frame,
relief=Tkinter.RAISED,
width=512,
height=512,
borderwidth=1
)
self.plot.pack(
anchor=Tkinter.NW,
fill=Tkinter.NONE,
expand=Tkinter.FALSE
)
self.radius = 2
self._draw_plot()
def _draw_plot(self):
# Axes lines
self.plot.create_line(75, 425, 425, 425, width=2)
self.plot.create_line(75, 425, 75, 75, width=2)
# Axes labels
for i in range(11):
x = 75 + i*35
y = x
self.plot.create_line(x, 425, x, 430, width=2)
self.plot.create_line(75, y, 70, y, width=2)
self.plot.create_text(
x, 430,
text='{}'.format((10*i)),
anchor=Tkinter.N,
font=LABEL_FONT
)
self.plot.create_text(
65, y,
text='{}'.format((10*(10-i))),
anchor=Tkinter.E,
font=LABEL_FONT
)
# Plot lots of points
for i in range(0, 10000):
x = round(random.random()*100.0, 1)
y = round(random.random()*100.0, 1)
# use floats to prevent flooring
px = 75 + (x * (350.0/100.0))
py = 425 - (y * (350.0/100.0))
self.plot.create_oval(
px - self.radius,
py - self.radius,
px + self.radius,
py + self.radius,
width=1,
outline='DarkSlateBlue',
fill='SteelBlue'
)
root = Tkinter.Tk()
root.title('Simple Plot')
w = 512 + 12
h = 512 + 12
app = Application(root, width=w, height=h)
app.mainloop()

There is actually a problem with some distributions of TKinter and OS Mavericks. Apparently you need to install ActiveTcl 8.5.15.1. There is a bug with TKinter and OS Mavericks. If it still isn't fast eneough, there are some more tricks below.
You could still save the multiple dots into one image. If you don't change it very often, it should still be faster. If you are changing them more often, here are some other ways to speed up a python program. This other stack overflow thread talks about using cython to make a faster class. Because most of the slowing down is probably due to the graphics this probably won't make it a lot faster but it could help.
Suggestions on how to speed up a distance calculation
you could also speed up the for loop by defining an iterator ( ex: iterator = (s.upper() for s in list_to_iterate_through) ) beforehand, but this is called to draw the window, not constantly as the window is maintained, so this shouldn't matter very much. Also, a another way to speed things up, taken from python docs, is to lower the frequency of python's background checks:
"The Python interpreter performs some periodic checks. In particular, it decides whether or not to let another thread run and whether or not to run a pending call (typically a call established by a signal handler). Most of the time there's nothing to do, so performing these checks each pass around the interpreter loop can slow things down. There is a function in the sys module, setcheckinterval, which you can call to tell the interpreter how often to perform these periodic checks. Prior to the release of Python 2.3 it defaulted to 10. In 2.3 this was raised to 100. If you aren't running with threads and you don't expect to be catching many signals, setting this to a larger value can improve the interpreter's performance, sometimes substantially."
Another thing I found online is that for some reason setting the time by changing os.environ['TZ'] will speed up the program a small amount.
If this still doesn't work, than it is likely that TKinter is not the best program to do this in. Pygame could be faster, or a program that uses the graphics card like open GL (I don't think that is available for python, however)

Tk must be getting bogged down looping over all of those ovals. I'm not
sure that the canvas was ever intended to hold so many items at once.
One solution is to draw your plot into an image object, then place the image
into your canvas.

plotting / wxPython - Why is this call to MemoryDC.SelectObject() slow?

I'm attempting to do real-time plotting of data in matplotlib, in a "production" application that uses wxPython. I had been using Chaco for this purpose, but I'm trying to avoid Chaco in the future for many reasons, one of which is that since it's not well-documented I often must spend a long time reading the Chaco source code when I want to add even the smallest feature to one of my plots. One aspect where Chaco wins out over matplotlib is in speed, so I'm exploring ways to get acceptable performance from matplotlib.
One technique I've seen widely used for fast plots in matplotlib is to set animated to True for elements of the plot which you wish to update often, then draw the background (axes, tick marks, etc.) only once, and use the canvas.copy_from_bbox() method to save the background. Then, when drawing a new foreground (the plot trace, etc.), you use canvas.restore_region() to simply copy the pre-rendered background to the screen, then draw the new foreground with axis.draw_artist() and canvas.blit() it to the screen.
I wrote up a fairly simple example that embeds a FigureCanvasWxAgg in a wxPython Frame and tries to display a single trace of random data at 45 FPS. When the program is running with the Frame at the default size (hard-coded in my source), it achieves ~13-14 frames per second on my machine. When I maximize the window, the refresh drops to around 5.5 FPS. I don't think this will be fast enough for my application, especially once I start adding additional elements to be rendered in real-time.
My code is posted here: basic_fastplot.py
I wondered if this could be made faster, so I profiled the code and found that by far the largest consumers of processing time are the calls to canvas.blit() at lines 99 and 109. I dug a little further, instrumenting the matplotlib code itself to find that most of this time is spent in a specific call to MemoryDC.SelectObject(). There are several calls to SelectObject in surrounding code, but only the one marked below takes any appreciable amount of time.
From the matplotlib source, backend_wxagg.py:
class FigureCanvasWxAgg(FigureCanvasAgg, FigureCanvasWx):
# ...
def blit(self, bbox=None):
"""
Transfer the region of the agg buffer defined by bbox to the display.
If bbox is None, the entire buffer is transferred.
"""
if bbox is None:
self.bitmap = _convert_agg_to_wx_bitmap(self.get_renderer(), None)
self.gui_repaint()
return
l, b, w, h = bbox.bounds
r = l + w
t = b + h
x = int(l)
y = int(self.bitmap.GetHeight() - t)
srcBmp = _convert_agg_to_wx_bitmap(self.get_renderer(), None)
srcDC = wx.MemoryDC()
srcDC.SelectObject(srcBmp) # <<<< Most time is spent here, 30milliseconds or more!
destDC = wx.MemoryDC()
destDC.SelectObject(self.bitmap)
destDC.BeginDrawing()
destDC.Blit(x, y, int(w), int(h), srcDC, x, y)
destDC.EndDrawing()
destDC.SelectObject(wx.NullBitmap)
srcDC.SelectObject(wx.NullBitmap)
self.gui_repaint()
My questions:
What could SelectObject() be doing that is taking so long? I had sort of assumed it would just be setting up pointers, etc., not doing much copying or computation.
Is there any way I might be able to speed this up (to get maybe 10 FPS at full-screen)?

Any way to speed up Python and Pygame?

I am writing a simple top down rpg in Pygame, and I have found that it is quite slow.... Although I am not expecting python or pygame to match the FPS of games made with compiled languages like C/C++ or event Byte Compiled ones like Java, But still the current FPS of pygame is like 15. I tried rendering 16-color Bitmaps instead of PNGs or 24 Bitmaps, which slightly boosted the speed, then in desperation , I switched everything to black and white monochrome bitmaps and that made the FPS go to 35. But not more. Now according to most Game Development books I have read, for a user to be completely satisfied with game graphics, the FPS of a 2d game should at least be 40, so is there ANY way of boosting the speed of pygame?

Use Psyco, for python2:
import psyco
psyco.full()
Also, enable doublebuffering. For example:
from pygame.locals import *
flags = FULLSCREEN | DOUBLEBUF
screen = pygame.display.set_mode(resolution, flags, bpp)
You could also turn off alpha if you don't need it:
screen.set_alpha(None)
Instead of flipping the entire screen every time, keep track of the changed areas and only update those. For example, something roughly like this (main loop):
events = pygame.events.get()
for event in events:
# deal with events
pygame.event.pump()
my_sprites.do_stuff_every_loop()
rects = my_sprites.draw()
activerects = rects + oldrects
activerects = filter(bool, activerects)
pygame.display.update(activerects)
oldrects = rects[:]
for rect in rects:
screen.blit(bgimg, rect, rect)
Most (all?) drawing functions return a rect.
You can also set only some allowed events, for more speedy event handling:
pygame.event.set_allowed([QUIT, KEYDOWN, KEYUP])
Also, I would not bother with creating a buffer manually and would not use the HWACCEL flag, as I've experienced problems with it on some setups.
Using this, I've achieved reasonably good FPS and smoothness for a small 2d-platformer.

All of these are great suggestions and work well, but you should also keep in mind two things:
1) Blitting surfaces onto surfaces is faster than drawing directly. So pre-drawing fixed images onto surfaces (outside the main game loop), then blitting the surface to the main screen will be more efficient. For exmample:
# pre-draw image outside of main game loop
image_rect = get_image("filename").get_rect()
image_surface = pygame.Surface((image_rect.width, image_rect.height))
image_surface.blit(get_image("filename"), image_rect)
......
# inside main game loop - blit surface to surface (the main screen)
screen.blit(image_surface, image_rect)
2) Make sure you aren't wasting resources by drawing stuff the user can't see. for example:
if point.x >= 0 and point.x <= SCREEN_WIDTH and point.y >= 0 and point.y <= SCREEN_HEIGHT:
# then draw your item
These are some general concepts that help me keep FPS high.

When using images it is important to convert them with the convert()-function of the image.
I have read that convert() disables alpha which is normally quite slow.
I also had speed problems until I used a colour depth of 16 bit and the convert function for my images. Now my FPS are around 150 even if I blit a big image to the screen.
image = image.convert()#video system has to be initialed
Also rotations and scaling takes a lot of time to calculate. A big, transformed image can be saved in another image if it is immutable.
So the idea is to calculate once and reuse the outcome multiple times.

When loading images, if you absolutely require transparency or other alpha values, use the Surface.convert_alpha() method. I have been using it for a game I've been programming, and it has been a huge increase in performance.
E.G: In your constructor, load your images using:
self.srcimage = pygame.image.load(imagepath).convert_alpha()
As far as I can tell, any transformations you do to the image retains the performance this method calls. E.G:
self.rotatedimage = pygame.transform.rotate(self.srcimage, angle).convert_alpha()
becomes redundant if you are using an image that has had convert_alpha() ran on it.

First, always use 'convert()' because it disables alpha which makes bliting faster.
Then only update the parts of the screen that need to be updated.
global rects
rects = []
rects.append(pygame.draw.line(screen, (0, 0, 0), (20, 20), (100, 400), 1))
pygame.display.update(rects) # pygame will only update those rects
Note:
When moving a sprite you have to include in the list the rect from their last position.

You could try using Psyco (http://psyco.sourceforge.net/introduction.html). It often makes quite a bit of difference.

There are a few things to consider for a well-performing Pygame application:
Ensure that the image Surface has the same format as the display Surface. Use convert() (or convert_alpha()) to create a Surface that has the same pixel format. This improves performance when the image is blit on the display, because the formats are compatible and blit does not need to perform an implicit transformation. e.g.:
surf = pygame.image.load('my_image.png').convert_alpha()
Do not load the images in the application loop. pygame.image.load is a very time-consuming operation because the image file must be loaded from the device and the image format must be decoded. Load the images once before the application loop, but use the images in the application loop.
If you have a static game map that consists of tiles, you can buy performance by paying with memory usage. Create a large Surface with the complete map. blit the area which is currently visible on the screen:
game_map = pygame.Surface((tile_size * columns, tile_size * rows))
for x in range(columns):
for y in range(rows):
tile_image = # image for this row and column
game_map.blit(tile_image , (x * tile_size, y * tile_size))
while game:
# [...]
map_sub_rect = screen.get_rect(topleft = (camera_x, camera_y))
screen.blit(game_map, (0, 0), map_sub_rect)
# [...]
If the text is static, the text does not need to be rendered in each frame. Create the text surface once at the beginning of the program or in the constructor of a class, and blit the text surface in each frame.
If the text is dynamic, it cannot even be pre-rendered. However, the most time-consuming is to create the pygame.font.Font/pygame.font.SysFont object. At the very least, you should avoid creating the font in every frame.
In a typical application you don't need all permutations of fonts and font sizes. You just need a couple of different font objects. Create a number of fonts at the beginning of the application and use them when rendering the text. For Instance. e.g.:
fontComic40 = pygame.font.SysFont("Comic Sans MS", 40)
fontComic180 = pygame.font.SysFont("Comic Sans MS", 180)

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.