I work with very large images, of the kind used in GIS and astronomy. I need a library, preferably in Python, that lets me append pieces to an image and write it to disk piece by piece, without having to hold the whole image in RAM at once.
Edit:
Thanks to those who commented. I work with microscopy images, mostly ones that can be opened with OpenSlide; some of them are in this list. My goal is to have just one big file containing an image, a file that other people can open, instead of a bunch of tiles.
But unless I have lots and lots of RAM (which I don't always have, and other people don't always have either), I can't create images as big as the original and store them with something like PIL.Image. I wish I could create an initial file and then append the rest of the image to it as I generate it.
Just as in GIS and astronomy, microscopy has to create images from scans and process them, so I was wondering if anyone knew a way to do this.
I don't think that's entirely possible: to use data, a computer has to copy it into RAM.
If you just want to append your data to your image, use PIL.Image.
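For what it's worth, libvips can assemble a huge image from pieces without ever holding it all in memory, and it has a Python binding (pyvips). A minimal sketch, assuming the image already exists as 16 tiles on disk; the filenames and the 4x4 grid are placeholders:

import pyvips

# Open each tile lazily; pyvips doesn't decode pixels until they are needed.
tiles = [pyvips.Image.new_from_file("tile_%d.tif" % i) for i in range(16)]

# Join the tiles into one large virtual image (a 4x4 grid here), then write
# it as a tiled, pyramidal BigTIFF. pyvips evaluates the pipeline in chunks,
# so the whole mosaic never has to fit in RAM.
mosaic = pyvips.Image.arrayjoin(tiles, across=4)
mosaic.tiffsave("whole_slide.tif", tile=True, pyramid=True, bigtiff=True)

Because the pipeline is evaluated lazily when tiffsave runs, peak memory stays roughly proportional to a band of tiles rather than the whole mosaic.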
I am writing a file browser using PyGTK. For image files I show previews by loading the images with pixbuf_new_from_file and scaling them. In directories with large files (like when browsing a portfolio) this takes too long. Is it possible to load the images at a lower resolution?
The whole code can be found on Git. In dirFrame.py, the function renderMainDirContent is the part that takes too long.
pixbuf_new_from_file_at_size seems to load the full image and then scale it, as it has almost no effect on performance.
It seems there is no faster way to do this with Python. Using numpy to load and scale images improves performance, but for acceptable performance you need to save thumbnails, at least for large images.
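If you can use Pillow for the previews, JPEGs can be decoded at a reduced resolution with draft() before scaling, and the result cached to disk so the cost is only paid once. A rough sketch; the cache directory and thumbnail size are arbitrary choices:

import os
from PIL import Image

THUMB_DIR = ".thumbnails"  # hypothetical cache location
THUMB_SIZE = (128, 128)

def get_thumbnail(path):
    os.makedirs(THUMB_DIR, exist_ok=True)
    thumb_path = os.path.join(THUMB_DIR, os.path.basename(path) + ".png")
    if not os.path.exists(thumb_path):
        im = Image.open(path)
        # For JPEGs, draft() lets the decoder stop at roughly the target
        # size, which is much faster than decoding the full image.
        im.draft("RGB", THUMB_SIZE)
        im.thumbnail(THUMB_SIZE)
        im.save(thumb_path)
    return thumb_path  # load this with pixbuf_new_from_file as before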
Following up on my last question: read big image file as an array in python
Due to the memory limitations of my laptop, I would like to implement an image segmentation algorithm with a Python generator that reads one pixel at a time rather than the whole image.
My laptop runs Windows 7 (64-bit) with 4 GB of RAM and an Intel Core i7-2860QM CPU, and the images I am processing are over 2 GB. The algorithm I want to apply is watershed segmentation: http://scikits-image.org/docs/dev/auto_examples/plot_watershed.html
The only similar example I can find is http://vkedco.blogspot.com/2012/04/rgb-to-gray-level-to-binary-python.html, but what I need is not just converting one pixel value at a time; I need to take the relations among neighboring pixels into account. How can I do this?
Any ideas or hints for me? Thanks in advance!
Since the RGB-to-graylevel conversion is a purely local operation, a streaming approach is trivial; the position of the pixels is irrelevant. Watershed, however, is a global operation: one pixel can change the output dramatically. You have several options:
Write an implementation of watershed that works on tiles and iterates over multiple passes through the image. This sounds difficult to me.
Use a local method to segment (e.g. thresholding); see the sketch after this list.
Get a computer with more RAM. RAM is cheap and you can stick tons of it into a desktop system.
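To make the second option concrete: a purely local method such as thresholding can be streamed over the image chunk by chunk. A minimal sketch, assuming the image is stored as raw 8-bit grayscale with known dimensions (the filename, shape, and threshold are placeholders):

import numpy as np

HEIGHT, WIDTH = 40000, 60000  # placeholder dimensions
CHUNK_ROWS = 1024             # rows processed per pass

# Memory-map the raw grayscale data; nothing is read until it is sliced.
src = np.memmap("huge_image.raw", dtype=np.uint8, mode="r",
                shape=(HEIGHT, WIDTH))
dst = np.memmap("segmented.raw", dtype=np.uint8, mode="w+",
                shape=(HEIGHT, WIDTH))

for row in range(0, HEIGHT, CHUNK_ROWS):
    block = src[row:row + CHUNK_ROWS]
    dst[row:row + CHUNK_ROWS] = (block > 128) * 255  # fixed threshold
dst.flush()

Only CHUNK_ROWS rows of pixels are ever resident at once, so this stays well within a 4 GB machine even for images over 2 GB.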
I am trying to decide whether it is better to use a pre-rendered large image for a scrolling map game or to render the tiles individually on screen each frame. I have tried programming the game both ways and don't see any obvious difference in speed, but that might be due to my lack of experience.
Besides memory, is there a speed reason not to use a pre-rendered map?
The only reasons I can think of for picking one over the other on modern hardware (anything as fast, and with as much RAM, as, say, an iPhone) would be technical ones that make the game code itself easier to follow. There's not much performance-wise to distinguish them.
One exception I can think of is if you are using a truly massive background and doing tile rendering on a GPU: tiles can be textures, and you'll get a modest speed bump since you don't need to push much data between the CPU and GPU per frame, and it will use very little video RAM.
Memory and speed are closely related. If your tile set fits in video memory, but the pre-rendered map doesn't, speed will suffer.
Maybe it really depends on the map's size, but this shouldn't be a problem even on a low-end computer.
The problem with big images is that redrawing everything on them takes a lot of time, so you end up with an inflexible map.
A real advantage of an optimized image (use the convert() function and a 16-bit display mode), though, is fast blitting.
I also work with big images on a middling computer and get around 150 FPS by blitting huge images that take roughly 100 MB of RAM:
image = image.convert()  # the video system has to be initialized first
The following code creates a 5000x5000 image, draws something on it, blits it to the screen and fills the screen 20 times, and at the end reports how long one blit plus one flip took.
import pygame as p
import random as r

def draw(dr, image, count, radius, r):
    # Scatter randomly coloured circles across the 5000x5000 surface.
    for i in range(0, 5000, 5000 // count):
        for j in range(0, 5000, 5000 // count):
            dr.circle(image, (r.randint(0, 255), r.randint(0, 255), r.randint(0, 255)), [i, j], radius, 0)

def geschw_test(screen, image, p):
    # Time a single blit plus display flip, in milliseconds.
    t1 = p.time.get_ticks()
    screen.blit(image, (-100, -100))
    p.display.flip()
    return p.time.get_ticks() - t1

p.init()
image = p.Surface([5000, 5000])
image.fill((255, 255, 255))
image.set_colorkey((255, 255, 255))
screen = p.display.set_mode([1440, 900], p.SWSURFACE, 16)
image = image.convert()  # convert to the display's pixel format: extremely efficient
screen.fill((70, 200, 70))
draw(p.draw, image, 65, 50, r)  # draw on the surface

zahler = 0
anz = 20
speed_arr = []
while zahler < anz:
    zahler += 1
    screen.fill((0, 0, 0))
    speed_arr.append(geschw_test(screen, image, p))
p.quit()

speed = sum(speed_arr)
print(round(speed / anz, 1), "milliseconds per blit with flip")
It depends on the size of the map you want to make, but with current technology it's hard for a rendered tile map to take longer than expected. Tile-based games are almost extinct these days, but they remain good practice and a natural entry point into game programming.
I am creating custom images that I later convert to an image pyramid for Seadragon AJAX. The images and the image pyramid are created using PIL. It currently takes a few hours to generate the images and the image pyramid for approximately 100 pictures with a combined width and height of about 32,000,000 by 1000 (yes, the image is very long and narrow). The performance is roughly similar to another algorithm I have tried (i.e. deepzoom.py). I plan to see whether python-gd would perform better, since most of its functionality is coded in C (from the GD library). I would assume a significant performance increase; however, I am curious to hear the opinion of others. In particular, the resizing and cropping are slow in PIL (with Image.ANTIALIAS). Will this improve considerably if I use python-gd?
Thanks in advance for the comments and suggestions.
EDIT: The performance difference between PIL and python-gd seems minimal. I will refactor my code to reduce performance bottlenecks and add support for multiple processors. I've tested out Python's 'multiprocessing' module, and the results are encouraging.
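For illustration, the parallel version looks roughly like this; a sketch only, with a made-up source strip, tile geometry, and output paths:

from multiprocessing import Pool
from PIL import Image

TILE = 256

def render_tile(job):
    # Crop one tile out of a source strip and shrink it for the pyramid.
    src_path, box, out_path = job
    im = Image.open(src_path)
    tile = im.crop(box).resize((TILE, TILE), Image.ANTIALIAS)
    tile.save(out_path)

if __name__ == "__main__":
    # One job per tile: (source image, crop box, output file).
    jobs = [("strip.png", (x, 0, x + 2 * TILE, 2 * TILE),
             "tile_%d.png" % (x // (2 * TILE)))
            for x in range(0, 4096, 2 * TILE)]
    with Pool() as pool:
        pool.map(render_tile, jobs)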
PIL is mostly in C.
Antialiasing is slow. When you turn off antialiasing, what happens to the speed?
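A quick way to measure that: time the same resize with and without antialiasing (note that newer Pillow spells Image.ANTIALIAS as Image.LANCZOS). A small sketch with a stand-in image:

import time
from PIL import Image

im = Image.new("RGB", (8000, 1000))  # stand-in for one of the strips

for name, flt in [("ANTIALIAS", Image.ANTIALIAS),
                  ("NEAREST", Image.NEAREST)]:
    t0 = time.time()
    im.resize((4000, 500), flt)
    print(name, round(time.time() - t0, 3), "s")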
VIPS includes a fast DeepZoom creator. I timed deepzoom.py, and on my machine I see:
$ time ./wtc.py
real 0m29.601s
user 0m29.158s
sys 0m0.408s
peak RES 450 MB
where wtc.jpg is a 10,000 x 10,000 pixel RGB JPEG image, and wtc.py is using these settings.
VIPS is around three times faster and needs a quarter of the memory:
$ time vips dzsave wtc.jpg wtc --overlap 2 --tile-size 128 --suffix .png[compression=0]
real 0m10.819s
user 0m37.084s
sys 0m15.314s
peak RES 100 MB
I'm not sure why sys is so much higher.
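If you want to drive this from Python instead of the shell, the same operation is exposed through the pyvips binding; something like:

import pyvips

# access="sequential" streams the source image top to bottom,
# keeping memory use low.
image = pyvips.Image.new_from_file("wtc.jpg", access="sequential")
image.dzsave("wtc", overlap=2, tile_size=128, suffix=".png[compression=0]")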