How to create a new image format?

I have read a lot of essays and articles about image compression algorithms. There are many of them, and I can only understand some because I'm a student who hasn't started high school yet. One article helped me a lot, in particular the part on page 3 about run-length coding. It's a very easy and helpful algorithm, but I don't know how to make a new image format. I am a Python developer, but I don't know how to make a new format with its own algorithm and program, like .jpeg, .jpg, .png, or .bmp.
(Sorry, I have only studied English for one year, so please excuse any grammar or vocabulary mistakes.)

Sure, you can make your own image file format. Choose a filename extension, define how it will be stored and write Python code to:
read the format from disk into a Numpy array, and
write an image contained in a Numpy array to disk
That way you will be interoperable with all the major image processing libraries such as OpenCV, scikit-image, PIL and Wand.
Have a look at how NetPBM works to get started with a simple format. Maybe look at the PCX format if you like the thought of RLE.
Read up on how to write binary to a file with Python.
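As a rough illustration of the steps above, here is a minimal sketch of a made-up grayscale format that uses run-length encoding. Everything about the format is an assumption invented for this example (the MYRL signature, the little-endian width/height header, the 255-pixel cap on a run); it is not an existing standard.

```python
import struct

MAGIC = b"MYRL"  # hypothetical 4-byte signature for this made-up format


def rle_encode(pixels):
    """Compress bytes into (count, value) pairs; a run holds at most 255 pixels."""
    out = bytearray()
    i = 0
    while i < len(pixels):
        value = pixels[i]
        run = 1
        while run < 255 and i + run < len(pixels) and pixels[i + run] == value:
            run += 1
        out += bytes((run, value))
        i += run
    return bytes(out)


def rle_decode(data):
    """Expand (count, value) pairs back into the original bytes."""
    out = bytearray()
    for i in range(0, len(data) - 1, 2):
        count, value = data[i], data[i + 1]
        out += bytes([value]) * count
    return bytes(out)


def encode_image(width, height, pixels):
    """Build the file contents: signature, width, height, then the RLE data."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match width * height")
    return MAGIC + struct.pack("<II", width, height) + rle_encode(pixels)


def decode_image(blob):
    """Parse file contents back into (width, height, pixels)."""
    if blob[:4] != MAGIC:
        raise ValueError("not a MYRL file")
    width, height = struct.unpack("<II", blob[4:12])
    pixels = rle_decode(blob[12:])
    if len(pixels) != width * height:
        raise ValueError("pixel data does not match the header")
    return width, height, pixels
```

To store an image, write the result of encode_image(...) to a file opened with "wb"; to load it, pass the bytes read from a file opened with "rb" to decode_image. The returned pixels can then be wrapped as a NumPy array with np.frombuffer(pixels, dtype=np.uint8).reshape(height, width).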

Related

Export PSD Layers to EXR in Python

I'm trying to write a program to read in a .psd file, split the layers into individual images (maintaining the original image's dimensions) and export them as EXR files.
I'm currently trying to use the OpenImageIO library to accomplish this, but the documentation isn't particularly clear on how this can be achieved in Python.
I've successfully managed to read the full .psd and export it to .exr, but nothing I've been trying seems to indicate that there is more than one layer (subimage) to interact with.
Is there:
something obvious that I'm missing, or
a better way to accomplish this?
Side note:
I have had some success using psd_tools2 but the images can't be exported as .exr nor are they the correct dimensions.

This is actually relatively straightforward, however there is one caveat in that it only seems to be supported for 8-bit psd files at the moment.
import OpenImageIO as oiio
sourcefile = '/path/to/sourcefile.psd'
buf = oiio.ImageBuf(sourcefile)
# Each PSD layer is exposed as a subimage; re-open the file at each one and write it out.
for layer in range(buf.nsubimages):
    buf.reset(sourcefile, subimage=layer)
    buf.write('/tmp/mylayer_{l}.exr'.format(l=layer))

Searching for data in a PDF

I've got a PDF file that I'm trying to obtain specific data from.
I've been able to parse the PDF via PyPDF2 into one long string, but searching for specific data is difficult because of (I assume) the formatting in the original PDF.
What I am looking to do is to retrieve specific known fields and the data that immediately follows each one (as formatted in the PDF), and then store these in separate variables.
The PDFs are bills and hence are all presented in the exact same way, with defined fields and images. So what I am looking to do is to extract these fields.
What would be the best way to achieve this?

I've got a PDF file that I'm trying to obtain specific data from.
In general, it is probably impossible (or extremely difficult), and the details (that you don't mention) are very important. Study the complex PDF specification in detail. Notice that PDF is (more or less accidentally) Turing complete (so your problem is undecidable in general, since it is equivalent to the halting problem).
For example, a normal human reader could read digits in the document whether they are stored as text, as a JPEG image, etc., and in practice many PDF documents contain that kind of data. Practically speaking, PDF is an output-only format designed for screen display and printing, not for extracting data from it.
You need to understand exactly how that PDF file was generated (with what exact software, from what actual data). That could take a lot of time (maybe several years of full-time reverse-engineering work) without help.
A much better approach is to contact the person or entity providing that PDF file and negotiate some way of accessing the actual data (or at least get a detailed explanation of how that particular PDF file is generated). For example, if the PDF file is computed from some database, you would do better to access that database directly.
Perhaps using metadata or comments in your PDF file might help in guessing how it was generated.
The source of the data might produce various kinds of PDF file. For example, my cheap scanner is able to produce PDF. But your program would have a hard time extracting numerical data from it (because that kind of PDF is essentially a wrapper around a pixelated image à la JPEG) and would need to deploy image recognition techniques (i.e. OCR) to do so.
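If the bills do contain real text rather than scanned images, one possible approach is to search the extracted string for the known labels and take whatever sits between one label and the next. The sketch below uses hypothetical label names (Invoice Number, Billing Date, Total Due) and assumes the labels survive text extraction at all, which depends on how the PDF was generated.

```python
import re

# Hypothetical field labels; replace them with the labels printed on the bills.
FIELDS = ["Invoice Number", "Billing Date", "Total Due"]


def extract_fields(text, labels=FIELDS):
    """Map each label found in text to the text between it and the next label."""
    hits = []
    for label in labels:
        # Extracted text often has odd spacing, so allow any whitespace between words.
        pattern = r"\s*".join(re.escape(word) for word in label.split())
        match = re.search(pattern, text)
        if match:
            hits.append((match.start(), match.end(), label))
    hits.sort()

    result = {}
    for index, (_start, end, label) in enumerate(hits):
        # The value runs until the next label begins, or to the end of the text.
        stop = hits[index + 1][0] if index + 1 < len(hits) else len(text)
        result[label] = text[end:stop].strip(" :\n\t")
    return result
```

The last field found swallows everything up to the end of the string, so in practice it may need an extra, more specific pattern (for example one that matches only an amount).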

I want to create a jpg resizer in pure python. What are good resources for understanding the jpg format?

From my current understanding, PNG is relatively easier to decode in Python than bitmap-based formats like JPG, and is already implemented in Python elsewhere. For my own purposes, though, I need the JPG format.
What are good resources for building a JPG library from scratch? At the moment I only wish to support resizing images, but this would presumably involve both encoding and decoding.
Edit: to make myself clearer, I am hoping that there is a high-level, design-oriented treatment of how to implement a JPG library in code: specifically, considerations when encoding and decoding, perhaps even pseudocode. Maybe it doesn't exist, but it is better to ask and stand on the shoulders of giants than to reinvent the wheel.

Use PIL; it already has high-level APIs for image handling.
If you say "I don't want to use PIL" (and remember, there are private/unofficial ports to 3.x), then I would say read the Wikipedia article on JPEG, as it describes the basics and also links to in-depth articles and descriptions of the JPEG format.
Once you have read over that, pull up the source code for PIL's JPEG handling to see what they are doing there (it is surprisingly simple stuff). The only thing they really import is Image, which is a class they made to hold the raw image data.

Is there any document for the Python win32com operations on Powerpoint?

I have got some samples showing how to open a presentation and access the slides and shapes. But I want to do some other operations (e.g. generate a thumbnail from a specified slide). What methods can I use? Is there any document illustrating all the functionality?

Not to discourage you, but my experience using COM from Python is that you won't find many examples.
I would be shocked (but happy) to see anybody post a big tutorial or reference on using PowerPoint from Python. Probably the best you'll find, which you've probably already found, is this article.
However, if you follow along through that article and some of the other Python+COM code around, you start to see the patterns of how VB and C# code converts to Python code using the same interfaces.
Once you understand that, your best source of information is probably the PowerPoint API reference on MSDN.

From looking at the samples Jeremiah pointed to, it looks like you'd start there then do something like this, assuming you wanted to export slide #42:
slide = Presentation.Slides(42)
slide.Export(file_name, "PNG", 1024, 768)
Substitute the full path and file name (path\filename.ext) of the file you want to export to for file_name; string.
Use PNG, JPG, GIF, WMF, EMF, TIF (not always a good idea from PowerPoint), etc; string
The next two numbers are the width and height (in pixels) at which to export the image; VBLong (signed 32-bit (4-byte) numbers ranging in value from -2,147,483,648 to 2,147,483,647)
I've petted pythons but never coded in them; this is my best guess as to syntax. Shouldn't be too much of a stretch to fix any errors.

How to edit raw PCM audio data without an audio library?

I'm interested in precisely extracting portions of a PCM WAV file, down to the sample level. Most audio modules seem to rely on platform-specific audio libraries. I want to make this cross-platform, and speed is not an issue. Are there any native Python audio modules that can do this?
If not, I'll have to interpret the PCM binary. While I'm sure I can dig up the PCM specs fairly easily, and raw formats are easy enough to walk, I've never actually dealt with binary data in Python before. Are there any good resources that explain how to do this? Specifically relating to audio would just be icing.

I read the question and the answers and I feel that I must be missing something completely obvious, because nobody mentioned the following two modules:
audioop: manipulate raw audio data (deprecated in Python 3.11 and removed in 3.13)
wave: read and write WAV files
Perhaps I come from a parallel universe and Guido's time machine is actually a space-time machine :)
Should you need example code, feel free to ask.
PS Assuming 48kHz sampling rate, a video frame at 24/1.001==23.976023976… fps is 2002 audio samples long, and at 25fps it's 1920 audio samples long.
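For what it's worth, sample-accurate extraction with the wave module could look roughly like the sketch below. The function names are made up for this example, and decode_16bit assumes 16-bit little-endian PCM; other sample widths would need different unpacking.

```python
import struct
import wave


def extract_samples(source, start, count):
    """Return the WAV parameters and `count` raw frames starting at frame `start`."""
    with wave.open(source, "rb") as wav:
        params = wav.getparams()
        wav.setpos(start)               # position is counted in frames, not bytes
        frames = wav.readframes(count)  # returns fewer frames if the file ends early
    return params, frames


def write_wav(target, params, frames):
    """Write raw frames to a new WAV file using the same audio parameters."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(params.nchannels)
        wav.setsampwidth(params.sampwidth)
        wav.setframerate(params.framerate)
        wav.writeframes(frames)


def decode_16bit(frames):
    """Unpack little-endian signed 16-bit samples (channels stay interleaved)."""
    count = len(frames) // 2
    return struct.unpack("<%dh" % count, frames[:count * 2])
```

With the numbers from the PS above, extract_samples("in.wav", 2 * 1920, 1920) would pull the audio belonging to the third video frame of a 25 fps clip at 48 kHz ("in.wav" being a placeholder file name).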

I've only written a PCM reader in C++ and Java, but the format itself is fairly simple. A decent description can be found here: http://ccrma.stanford.edu/courses/422/projects/WaveFormat/
Past that you should be able to just read it in (binary file reading, http://www.johnny-lin.com/cdat_tips/tips_fileio/bin_array.html) and just deal with the resulting array. You may need to use some bit shifting to get the alignments correct (https://docs.python.org/reference/expressions.html#shifting-operations) but depending on how you read it in, you might not need to.
All of that said, I'd still lean towards David's approach.
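As an example of where the bit shifting mentioned above comes in: 24-bit samples have no matching struct format code, so each sample has to be assembled from three bytes by hand. This is only a sketch, assuming packed little-endian signed 24-bit PCM; decode_24bit is a name made up for this example.

```python
def decode_24bit(pcm):
    """Decode packed little-endian signed 24-bit samples into a list of ints."""
    samples = []
    usable = len(pcm) - len(pcm) % 3  # ignore a trailing partial sample
    for i in range(0, usable, 3):
        low, mid, high = pcm[i], pcm[i + 1], pcm[i + 2]
        # Shift each byte into place: low byte first, high byte last.
        value = low | (mid << 8) | (high << 16)
        # Bit 23 is the sign bit; convert from two's complement if it is set.
        if value & 0x800000:
            value -= 1 << 24
        samples.append(value)
    return samples
```

The same result can be obtained per sample with int.from_bytes(chunk, "little", signed=True), which avoids the manual shifting.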

Is it really important that your solution be pure Python, or would you accept something that can work with native audio libraries on various platforms (so it's effectively cross-platform)? There are several examples of the latter at http://wiki.python.org/moin/PythonInMusic

Seems like a combination of open(..., "rb"), struct module, and some details about the wav/riff file format (probably better reference out there) will do the job.
Just curious, what do you intend on doing with the raw sample data?
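A minimal sketch of the open(..., "rb") plus struct approach might look like this. It assumes a canonical RIFF/WAVE file holding uncompressed PCM (format tag 1), and the function names are made up for this example; real files can contain extra chunks, which the loop simply skips.

```python
import struct


def parse_wav(data):
    """Return (channels, sample_rate, bits_per_sample, raw_pcm_bytes)."""
    riff, _size, wave_id = struct.unpack_from("<4sI4s", data, 0)
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")

    fmt = None
    pcm = None
    offset = 12
    while offset + 8 <= len(data):
        chunk_id, chunk_size = struct.unpack_from("<4sI", data, offset)
        body = offset + 8
        if chunk_id == b"fmt ":
            # format tag, channels, sample rate, byte rate, block align, bits per sample
            fmt = struct.unpack_from("<HHIIHH", data, body)
        elif chunk_id == b"data":
            pcm = data[body:body + chunk_size]
        # Chunks are padded to an even number of bytes.
        offset = body + chunk_size + (chunk_size & 1)

    if fmt is None or pcm is None:
        raise ValueError("missing fmt or data chunk")
    format_tag, channels, sample_rate, _byte_rate, _block_align, bits = fmt
    if format_tag != 1:
        raise ValueError("only uncompressed PCM (format tag 1) is handled")
    return channels, sample_rate, bits, pcm


def samples_16bit(pcm):
    """Decode little-endian signed 16-bit samples into a tuple of ints."""
    count = len(pcm) // 2
    return struct.unpack("<%dh" % count, pcm[:count * 2])
```

The input would come from something like open(path, "rb").read(). Slicing the returned PCM bytes at multiples of channels * bits // 8 then gives sample-accurate cuts.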

I was looking this up and I found this: http://www.swharden.com/blog/2009-06-19-reading-pcm-audio-with-python/
It requires NumPy (and matplotlib if you want to graph it).
import numpy
# Map the raw file as signed 16-bit samples without loading it all into memory.
data = numpy.memmap("test.pcm", dtype='h', mode='r')
print("VALUES:", data)
Check out the original author's site for more details.
