Speeding up grabbing alphachannel - python

I don't know if there is anything that can be done to speed up my code at all, probably not by much if at all, but I thought I would ask here.
I am working on a python script for a program that uses a custom embedded python interpreter so I can only use the default libraries. External libraries like Pillow and Numpy don't work because they changed the name of the python dll and so the precompiled libraries can't interact with it.
This program doesn't support pasting transparent images from the clipboard outside of its own proprietary format. So I'm writing a script to cover that feature. It grabs the CF_DIBv5 format from the clipboard using ctypes and checks to see if it is 32bpp and that an alphamask exists.
Here's the slow part. I then need to isolate the alpha channel and save it as its own separate image. I can do this easily enough: grab a long from the byte string, AND it with the mask to get the alpha channel, and pack it back into my new bitmap byte string. On a small 300x300 image, this takes close to 10 seconds, which isn't horrible; I will gladly live with that. However, I fear it's going to be horribly slow on larger megapixel images.
I'm not showing the complete code here because it's a horrible ugly mess and most of it is just defining the structures I'm using for my bitmap class and getting ctypes working. But here are the important parts where I loop over the data.
rowsizemask = calcRowSize(24, bmp.header.bV5Width)  # returns bytes per row needed
rowmaskpadding = b'\x00' * (rowsizemask - bmp.header.bV5Width * 3)  # creates padding bytes
# loop over image data
for y in range(bmp.header.bV5Height):
    for x in range(bmp.header.bV5Width):
        offset, color = unpack(offset, ">L", buff)  # calls struct.unpack in custom function
        color = color[0] & bmp.header.bV5AlphaMask  # gets alpha channel
        newbmp.pixels += struct.pack(">3B", color, color, color)  # creates 24bpp listing
    newbmp.pixels += rowmaskpadding  # pad row to meet BMP specs
So what do you think? Am I missing something obvious? Or is this about as good as it's going to get with pure python only?

Okay, so after some more digging. I realized I could use ctypes.create_string_buffer to create a binary string of the perfect size and then use slices to change the values.
There are more tiny optimizations and code cleanups I can do but this has taken it from a script that can easily take several minutes to complete on a 900x900 pixel image, to just a few seconds.
Is this the best option? No idea, but it works. And it's faster than I had thought possible. See the edited code here. The changes are minor.
rowSizeMask = calcRowSize(24, bmp.header.bV5Width)  # returns bytes per row needed
paddingLength = rowSizeMask - bmp.header.bV5Width * 3  # padding bytes per row
writeOffset = 0
# create pixel buffer
# rowSizeMask includes padding, multiply by height for total byte count
newBmp.pixels = ctypes.create_string_buffer(bmp.header.bV5Height * rowSizeMask)
# loop over image data
for y in range(bmp.header.bV5Height):
    for x in range(bmp.header.bV5Width):
        offset, color = unpack(offset, ">L", buff)  # calls struct.unpack in custom function
        color = color[0] & bmp.header.bV5AlphaMask  # gets alpha channel
        newBmp.pixels[writeOffset:writeOffset+3] = struct.pack(">3B", color, color, color)  # creates 24bpp listing
        writeOffset += 3
    # the buffer starts out zero-filled, so the row padding is already in place; just skip over it
    writeOffset += paddingLength
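A further idea that stays within the standard library is to drop the per-pixel loop and let strided slices do the work one row at a time. The sketch below is only an illustration under some assumptions: Python 3, the source pixels are a bytes-like object of tightly packed 32bpp data, and the alpha mask is 0xFF000000, so alpha is the fourth byte of each pixel in memory. The function name alpha_to_gray24 and its alpha_index parameter are made up for this example and are not part of the script above.

```python
def alpha_to_gray24(pixels, width, height, alpha_index=3):
    """Return 24bpp rows (padded to 4 bytes) holding the alpha channel as gray."""
    src_stride = width * 4                # 32bpp rows never need padding
    dst_stride = (width * 3 + 3) & ~3     # 24bpp rows are padded to a multiple of 4
    out = bytearray(dst_stride * height)  # zero-filled, so the padding is already there
    for y in range(height):
        row = pixels[y * src_stride:(y + 1) * src_stride]
        alpha = row[alpha_index::4]       # every 4th byte of the row (assumed alpha)
        dst = y * dst_stride
        for channel in range(3):
            # write the alpha value into the B, G and R slots of each output pixel
            out[dst + channel:dst + width * 3:3] = alpha
    return bytes(out)
```

Row order is left untouched, so a bottom-up DIB stays bottom-up. If the mask in the header points at a different byte, alpha_index would have to change accordingly.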

Related

Is there a proper way to convert common picture file extensions into a .PGM "P2" using PIL or cv2?

Edit: Problem solved and code updated.
I apologize in advance for the long post. I wanted to bring as much as I could to the table. My question consists of two parts.
Background: I was in need of a simple Python script that would convert common picture file extensions into a .PGM ASCII file. I had no issues coming up with a naive solution as PGM seems pretty straight forward.
# convert-to-pgm.py is a script for converting image types supported by PIL into their .pgm
# ascii counterparts, as well as resizing the image to have a width of 909 and keeping the
# aspect ratio. Its main purpose will be to feed NOAA style images into an APT-encoder
# program.
from PIL import Image, ImageOps, ImageEnhance
import numpy as np
# Open image, convert to greyscale, check width and resize if necessary
im = Image.open(r"pics/NEKO.JPG").convert("L")
image_array = np.array(im)
print(f"Original 2D Picture Array:\n{image_array}") # data is stored differently depending on
# im.mode (RGB vs L vs P)
image_width, image_height = im.size
print(f"Size: {im.size}") # Mode: {im.mode}")
# im.show()
if image_width != 909:
    print("Resizing to width of 909 keeping aspect ratio...")
    new_width = 909
    ratio = (new_width / float(image_width))
    new_height = int((float(image_height) * float(ratio)))
    im = im.resize((new_width, new_height))
    print(f"New Size: {im.size}")
# im.show()
# Save image data in a numpy array and make it 1D.
image_array1 = np.array(im).ravel()
print(f"Picture Array: {image_array1}")
# create file w .pgm ext to store data in, first 4 lines are: pgm type, comment, image size,
# maxVal (=white, 0=black)
file = open("output.pgm", "w+")
file.write("P2\n# Created by convert-to-pgm.py \n%d %d\n255\n" % im.size)
# Storing greyscale data in file with \n delimiter
for number in image_array1:
    # file.write(str(image_array1[number]) + '\n') #### This was the culprit of the hindered image quality...changed to line below. Thanks to Mark in comments.
    file.write(str(number) + '\n')
file.close()
im.save(r"pics/NEKO-greyscale.jpg")  # save() returns None, so don't assign it back to im
# Strings to replace the newline characters
WINDOWS_LINE_ENDING = b'\r\n'
UNIX_LINE_ENDING = b'\n'
with open('output.pgm', 'rb') as open_file:
    content = open_file.read()
content = content.replace(WINDOWS_LINE_ENDING, UNIX_LINE_ENDING)
with open('output.pgm', 'wb') as open_file:
    open_file.write(content)  # the with block closes the file on exit
This produces a .PGM file that, when opened with a text editor, looks similar to the same image that was exported as a .PGM using GIMP (My prior solution was to use the GIMP export tool to manually convert the pictures and I couldn't find any other converters that supported the "P2" format). However, the quality of the resulting picture is severely diminished compared to what is produced using the GIMP export tool. I have tried a few methods of image enhancement (brightness, equalize, posterize, autocontrast, etc.) to get a better result, but none have been entirely successful. So my first question: what can I do differently to obtain a result that looks more like what GIMP produces? I am not looking for perfection, just a little clarity and a learning experience. How can I automatically adjust {insert whatever} for the best picture?
Below is the .PGM image produced by my version compared to GIMP's version, opened in a text editor, using the same input .jpg.
My version vs. GIMP's version:
Below are comparisons of adding various enhancements before creating the .pgm file compared to the original .jpg and the original .jpg converted as a greyscale ("L"). All photos are opened through GIMP.
Original .jpg
Greyscale .jpg, after .convert("L") command
This is ideally what I want my .PGM to look like. Why is the numpy array data close to, yet different from, the data in the GIMP .PGM file, even though the produced greyscale image looks identical to what GIMP produces?
Answer: Because it wasn't saving the correct data. :D
GIMP's Resulting .PGM
My Resulting .PGM
My Resulting .PGM with lower brightness, with Brightness.enhance(0.5)
Resulting .PGM with posterize, ImageOps.posterize(im, 4)
SECOND PROBLEM:
My last issue comes when viewing the .PGM picture using various PGM viewers, such as these online tools (here and here). The .PGM file is not viewable through one of the above links, but works "fine" when viewing with the other link or with GIMP. Likewise, the .PGM file I produce with my script is also not currently compatible with the program that I intend to use it for. This is most important to me, since its purpose is to feed the properly formatted PGM image into the program. I'm certain that something in the first four lines of the .PGM file is altering the program's ability to sense that it is indeed a PGM, and I'm pretty sure that it's something trivial, since some other viewers are also not capable of reading my PGM. So my main question is: Is there a proper way to do this conversion or, with the proper adjustments, is my script suitable? Am I missing something entirely obvious? I have minimal knowledge on image processing.
GitHub link to the program that I'm feeding the .PGM images into: here
More info on this particular issue: The program throws a fault when run with one of my .PGM images, but works perfectly with the .PGM images produced with GIMP. The program is in C++ and the line "ASSERT(buf[2] == '\n')" returns the error, implying that my .PGM file is not in the correct format. If I comment this line out and recompile, another "ASSERT(width == 909)..." throws an error, implying that my .PGM does not have a width of 909 pixels. If I comment this line out as well and recompile, I am left with the infamous "segmentation fault (core dumped)." I compiled this on Windows, with cygwin64. Everything seems to be in place, so the program is having trouble reading the contents of the file (or understanding '\n'?). How could this be if both my version and GIMP's version are essentially identical in format, when viewed with a text editor?
Terminal output:
Thanks to all for the help, any and all insight/criticism is acceptable.
The first part of my question was answered in the comments, it was a silly mistake on my end as I'm still learning syntax. The above code now works as intended.
I was able to do a little more research on the second part of my problems and I noticed something very important, and also feel quite silly for missing it yesterday.
So of course the reason why my program was having a problem reading the '\n' character was simply because Windows encodes newline characters as CRLF aka '\r\n' as opposed to the Unix way of LF aka '\n'. So in my script at the very end I just add the simple code [taken from here]:
# replacement strings
WINDOWS_LINE_ENDING = b'\r\n'
UNIX_LINE_ENDING = b'\n'
with open('output.pgm', 'rb') as open_file:
    content = open_file.read()
content = content.replace(WINDOWS_LINE_ENDING, UNIX_LINE_ENDING)
with open('output.pgm', 'wb') as open_file:
    open_file.write(content)
Now, regardless on whether the text file is encoded with CRLF or LF, the script will work properly.
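An alternative worth trying is to avoid writing CRLF in the first place, by opening the output file with newline="\n" so Python does not translate line endings on Windows. The following is a minimal sketch, not the original script: the helper name write_pgm_p2 and its parameters are made up for illustration, and it writes one value per line as the script above does.

```python
def write_pgm_p2(path, width, height, pixels, maxval=255):
    """Write a plain-text (P2) PGM with LF line endings on every platform."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match width * height")
    # newline="\n" turns off the "\n" -> "\r\n" translation that text mode
    # would otherwise apply on Windows.
    with open(path, "w", newline="\n") as f:
        f.write(f"P2\n{width} {height}\n{maxval}\n")
        for value in pixels:
            f.write(f"{int(value)}\n")  # one grey value per line
```

With the script above, this could presumably be called as write_pgm_p2("output.pgm", im.size[0], im.size[1], image_array1), which would make the separate replace step unnecessary.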

How do I create a Mat matrix using OpenCV in Python?

Obviously this line is wrong.
matOut = cv2::Mat::Mat(height, width, cv2.CV_8UC4)
this one too.
Mat matOut(height, width, cv2.CV_8UC4)
How do I create an empty matrix in modern OpenCV with a given size, shape, and format? The latest OpenCV docs on the topic don't seem to be helping... I gleaned the formats used above directly from that document. Note: I'm assuming OpenCV (import cv2), Python 3, import numpy, etc.
I was looking to create an empty matrix with the intent of copying content from a different buffer into it...
edit, more failed attempts...
matOut = cv2.numpy.Mat(height, width, cv2.CV_8UC4)
matOut = numpy.array(height, width, cv2.CV_8UC4)
So user696969 called out a largely successful solution to the question asked. You create a new shaped area via:
matOut = numpy.zeros([height, width, 4], dtype=numpy.uint8)
note I have replaced the desired content, cv2.CV_8UC4, with its expected response, the number 4. There are 4 eight bit bytes in a simple RGBA pixel descriptor. I would have preferred if OpenCV tooling had performed that response as a function call, but that didn't seem to work...
I do want to share my use case. I originally was going to create an empty shaped matrix so I could transfer data from a single dimensional array there. As I worked this problem I realized there was a better way. I start the routine where I have received a file containing 8 bit RGBA data without any prefix metadata. Think raw BMP without any header info.
matContent = numpy.frombuffer(fileContent, numpy.uint8)
matContentReshaped = matContent.reshape(height, width, 4)
cv2.imshow("Display Window", matContentReshaped)
k = cv2.waitKey(0)
That's it. Easy peasy. Thanks to user696969 and eldesgraciado for help here.
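For anyone reusing this, a slightly more defensive version of the same idea might look like the sketch below. It assumes the file holds exactly width * height * 4 bytes of 8-bit, 4-channel data with no header; the function name rgba_from_raw is made up for this example. The array returned by numpy.frombuffer on a bytes object is read-only, so the sketch copies it to get something that can be modified.

```python
import numpy

def rgba_from_raw(file_content, width, height):
    """Turn headerless 8-bit, 4-channel bytes into a (height, width, 4) array."""
    expected = width * height * 4
    if len(file_content) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(file_content)}")
    flat = numpy.frombuffer(file_content, dtype=numpy.uint8)  # read-only view
    return flat.reshape(height, width, 4).copy()              # writable copy
```

One thing to watch: OpenCV display functions treat 4-channel data as BGRA, so if the file really stores RGBA, red and blue will look swapped in cv2.imshow unless the array is converted first, for example with cv2.cvtColor(img, cv2.COLOR_RGBA2BGRA).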

How to view real time mosaicing of large image?

I have written code that stitches approximately 100x100 images. I want to view this stitching process in real time. I am using pyvips to create the large image, and I save the final image in .DZI format because it needs a very small memory footprint to display.
The code below is copied from https://github.com/jcupitt/pyvips/issues/43 just for testing purposes.
#!/usr/bin/env python
import sys
import pyvips
# overlap joins by this many pixels
H_OVERLAP = 100
V_OVERLAP = 100
# number of images in mosaic
ACROSS = 40
DOWN = 40
if len(sys.argv) < 2 + ACROSS * DOWN:
    print('usage: %s output-image input1 input2 ..' % sys.argv[0])
    sys.exit(1)
def join_left_right(filenames):
    # merge one row of tiles left to right
    images = [pyvips.Image.new_from_file(filename) for filename in filenames]
    row = images[0]
    for image in images[1:]:
        row = row.merge(image, 'horizontal', H_OVERLAP - row.width, 0)
    return row
def join_top_bottom(rows):
    # merge the finished rows top to bottom
    image = rows[0]
    for row in rows[1:]:
        image = image.merge(row, 'vertical', 0, V_OVERLAP - image.height)
    return image
rows = []
for y in range(0, DOWN):
    start = 2 + y * ACROSS
    end = start + ACROSS
    rows.append(join_left_right(sys.argv[start:end]))
image = join_top_bottom(rows)
image.write_to_file(sys.argv[1])
To run this code:
$ export VIPS_DISC_THRESHOLD=100
$ export VIPS_PROGRESS=1
$ export VIPS_CONCURRENCY=1
$ mkdir sample
$ for i in {1..1600}; do cp ~/pics/k2.jpg sample/$i.jpg; done
$ time ./mergeup.py x.dz sample/*.jpg
Here, cp ~/pics/k2.jpg copies the k2.jpg image 1600 times from the pics folder, so change it according to your image name and location.
I want to display this process in real time. Right now I can only display the image after the final mosaic has been created. One idea is to create the large image first, display it, and then insert the smaller images, but I don't know how this can be done. I am confused because we also have to build the pyramid structure, so if we create the large image first, we have to replace the images at each level with the new ones. Creating the .DZI image is expensive, so I don't want to create it on every loop iteration. Replacing images may be a solution. Any suggestions?
I suppose you have two challenges: how to keep the pyramid up-to-date on the server, and how to keep it up-to-date on the client. The brute force method would be to constantly rebuild the DZI on the server, and periodically flush the tiles on the client (so they reload). For something like that you'll also need to add a cache bust to the tile URLs each time, or the browser will think it should just use its local copy (not realizing it has updated). Of course this brute force method is probably too slow (though it might be interesting to try!).
For a little more finesse, you'd want to make a pyramid that's exactly aligned with the sub images. That way when you change a single sub image, it's obvious which tiles need to be updated. You can do this with DZI if you have square sub images and you use a tile size that is some even fraction of the sub image size. Also no tile overlap. Of course you'll have to build your own DZI constructor, since the existing ones aren't primed to simply replace individual tiles. If you know which tiles you changed on the server, you can communicate that to the client (either via periodic polling or with something like web sockets) and then flush only those tiles (again with the cache busting).
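To make the "which tiles need to be updated" step concrete, here is a rough sketch of the bookkeeping. It assumes the usual Deep Zoom layout (the top level is ceil(log2(max(width, height))), and each lower level halves the resolution), a tile overlap of 0, and a changed region given in full-resolution pixels. The function name dirty_tiles and its parameters are made up for this example; it only computes tile coordinates and does not write any tiles.

```python
import math

def dirty_tiles(image_w, image_h, tile_size, x, y, w, h):
    """Map each pyramid level to the (col, row) tiles touched by a changed region."""
    max_level = math.ceil(math.log2(max(image_w, image_h)))
    result = {}
    for level in range(max_level, -1, -1):
        scale = 2 ** (max_level - level)  # full-res pixels per pixel at this level
        left, top = x // scale, y // scale
        right = -(-(x + w) // scale)      # ceiling division
        bottom = -(-(y + h) // scale)
        cols = range(left // tile_size, (right - 1) // tile_size + 1)
        rows = range(top // tile_size, (bottom - 1) // tile_size + 1)
        result[level] = [(c, r) for r in rows for c in cols]
    return result
```

For example, dirty_tiles(1024, 1024, 256, 256, 0, 256, 256) reports a single tile per level when the sub image is aligned with the tile grid, which is the point of choosing a tile size that divides the sub image size. The resulting list is what the server would send to the client so it can flush only those tiles.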
Another solution you could experiment with would be to not attempt a pyramid, per se, but just a flat set of tiles at a reasonable resolution to allow the user to pan around the scene. This would greatly simplify your pyramid updating on the server, since all you would need to do would be replace a single image for each sub image. This could be loaded and shown in a custom (non-OpenSeadragon) fashion on the client, or you could even use OpenSeadragon's multi-image feature to take advantage of its panning and zooming, like here: http://www.letsfathom.com/ (each album cover is its own independent image object).

Reading pre-processed cr2 RAW image data in python

I am trying to read raw image data from a cr2 (canon raw image file). I want to read the data only (no header, etc.) pre-processed if possible (i.e pre-bayer/the most native unprocessed data) and store it in a numpy array. I have tried a bunch of libraries such as opencv, rawkit, rawpy but nothing seems to work correctly.
Any suggestion on how I should do this? What I should use? I have tried a bunch of things.
Thank you
Since libraw/dcraw can read cr2, it should be easy to do. With rawpy:
#!/usr/bin/env python
import rawpy
raw = rawpy.imread("/some/path.cr2")
bayer = raw.raw_image # with border
bayer_visible = raw.raw_image_visible # just visible area
Both bayer and bayer_visible are then a 2D numpy array.
You can use rawkit to get this data, however, you won't be able to use the actual rawkit module (which provides higher level APIs for dealing with Raw images). Instead, you'll want to use mostly the libraw module which allows you to access the underlying LibRaw APIs.
It's hard to tell exactly what you want from this question, but I'm going to assume the following: Raw bayer data, including the "masked" border pixels (which aren't displayed, but are used to calculate various things about the image). Something like the following (completely untested) script will allow you to get what you want:
#!/usr/bin/env python
import ctypes
from rawkit.raw import Raw
with Raw(filename="some_file.CR2") as raw:
    raw.unpack()
    # For more information, see the LibRaw docs:
    # http://www.libraw.org/docs/API-datastruct-eng.html#libraw_rawdata_t
    rawdata = raw.data.contents.rawdata
    data_size = rawdata.sizes.raw_height * rawdata.sizes.raw_width
    data_pointer = ctypes.cast(
        rawdata.raw_image,
        ctypes.POINTER(ctypes.c_ushort * data_size)
    )
    data = data_pointer.contents
    # Grab the first few pixels for demonstration purposes...
    for i in range(5):
        print('Pixel {}: {}'.format(i, data[i]))
There's a good chance that I'm misunderstanding something and the size is off, in which case this will segfault eventually, but this isn't something I've tried to make LibRaw do before.
More information can be found in this question on the LibRaw forums, or in the LibRaw struct docs.
Storing the data in a numpy array I leave as an exercise for the reader, or for a follow-up answer (I have no experience with numpy).
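As a possible follow-up for the numpy part: a ctypes array exposes the buffer protocol, so numpy.frombuffer can wrap it without a Python-level loop. The sketch below is untested against rawkit itself and uses a hand-made ctypes array in place of the data object from the script above; the function name ctypes_ushorts_to_array is made up for this example.

```python
import ctypes
import numpy

def ctypes_ushorts_to_array(data, height, width):
    """Copy a ctypes array of unsigned shorts into a 2D uint16 numpy array."""
    flat = numpy.frombuffer(data, dtype=numpy.uint16)  # view on the ctypes memory
    # copy() detaches the result from the original buffer, which matters if that
    # memory is owned by LibRaw and released when the Raw object is closed.
    return flat.reshape(height, width).copy()

# Stand-in for the `data` object obtained from data_pointer.contents.
fake_data = (ctypes.c_ushort * 6)(1, 2, 3, 4, 5, 6)
bayer = ctypes_ushorts_to_array(fake_data, 2, 3)
```

In the rawkit script this would presumably be called inside the with block as ctypes_ushorts_to_array(data, rawdata.sizes.raw_height, rawdata.sizes.raw_width); reshape raises a ValueError if the sizes do not match the buffer.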

Best dtype for creating large arrays with numpy

I am looking to store pixel values from satellite imagery into an array. I've been using
np.empty((image_width, image_length))
and it worked for smaller subsets of an image, but when using it on the entire image (3858 x 3743) the code terminates very quickly and all I get is an array of zeros.
I load the image values into the array using a loop and opening the image with gdal
img = gdal.Open(os.path.join(fn + "\{0}".format(fname))).ReadAsArray()
but when I include print img_array I end up with just zeros.
I have tried almost every single dtype that I could find in the numpy documentation but keep getting the same result.
Is numpy unable to load this many values or is there a way to optimize the array?
I am working with 8-bit tiff images that contain NDVI (decimal) values.
Thanks
Not certain what type of images you are trying to read, but in the case of RADARSAT-2 images you can do the following:
dataset = gdal.Open("RADARSAT_2_CALIB:SIGMA0:" + inpath + "product.xml")
S_HH = dataset.GetRasterBand(1).ReadAsArray()
S_VV = dataset.GetRasterBand(2).ReadAsArray()
# gets the intensity (Intensity = re**2+imag**2), and amplitude = sqrt(Intensity)
self.image_HH_I = numpy.real(S_HH)**2+numpy.imag(S_HH)**2
self.image_VV_I = numpy.real(S_VV)**2+numpy.imag(S_VV)**2
But that is specific to that type of image (in this case each image contains several bands, so I need to read each band separately with GetRasterBand(i) and then call ReadAsArray()). If there is a specific GDAL driver for the type of image you want to read, life gets very easy.
If you give some more info on the type of images you want to read, I can maybe help more specifically.
Edit: did you try something like this ? (not sure if that will work on tiff, or how many bits the header is, hence the something:)
A=open(filename,"rb")
B=numpy.fromfile(A,dtype='uint8')[something:].reshape(3858,3743)
C=B*1.0
A.close()
Edit: The problem is solved when using 64bit python instead of 32bit, due to memory errors at 2Gb when using the 32bit python version.
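To get a feel for how much memory is involved, a quick back-of-the-envelope calculation helps. The sketch below only multiplies the element count by the size of one element; the ITEM_SIZES table and the array_bytes helper are made up for this example and cover just a few common dtypes.

```python
# Bytes per element for a few common numpy dtypes.
ITEM_SIZES = {"uint8": 1, "uint16": 2, "float32": 4, "float64": 8}

def array_bytes(rows, cols, dtype_name):
    """Bytes needed for one dense rows x cols array of the given dtype."""
    return rows * cols * ITEM_SIZES[dtype_name]

for name in ITEM_SIZES:
    mib = array_bytes(3858, 3743, name) / 2**20
    print(f"{name:8s} {mib:8.1f} MiB per array")
```

A single 3858 x 3743 array is modest even as float64, so when a 32-bit process runs out of address space it is more likely the total of several full-size arrays and temporary copies alive at once than any one array. Multiplying the per-array figure by the number of arrays kept around gives a rough upper bound to compare against the limit.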
