Is it possible to identify the graphics program from the image? - python

Is there a good way to identify (or at least approximate) the graphics program used to obtain a particular image? For instance, I want to know if there is a certain signature that these programs embed into an image. Any suggestions?
If not, is there a reference where I can find what metadata can be extracted from an image?

Certain image file formats do have metadata, but it is format dependent. Digital cameras usually write some of their information into the metadata; EXIF is what comes to mind. Images not acquired through a digital camera may or may not have relevant metadata, so you can't consider metadata of any sort to be a guaranteed reliable identifier. That's about as much as I can give as an answer, alas. I'm sure someone else may have more details.
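To illustrate, here's a minimal Pillow sketch (my own addition, not something the answer above guarantees for every file): it writes a tiny PNG carrying a "Software" text chunk (the field many editors fill in with their own name) and reads it back via `Image.open(...).info`. The program name is invented for the demo; for JPEGs you would inspect `Image.open(...).getexif()` instead.

```python
import io
from PIL import Image
from PIL.PngImagePlugin import PngInfo

# Write a tiny PNG with a "Software" text chunk, the way a graphics
# program might tag its output, then read the metadata back.
meta = PngInfo()
meta.add_text("Software", "ExamplePaint 1.0")  # invented program name

buf = io.BytesIO()
Image.new("RGB", (4, 4)).save(buf, "PNG", pnginfo=meta)
buf.seek(0)

info = Image.open(buf).info  # format-dependent metadata dict
print(info.get("Software"))  # prints: ExamplePaint 1.0
```

Remember that this field is optional and trivially editable, so its absence (or presence) proves nothing.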

Related

Breaking down 3D models into lines and curves

I'm working on a project to break down 3D models, but I'm quite lost. I hope you can help me.
I'm getting a 3D model from Autodesk BIM, and the format could be native or a generic CAD format (.stp, .igs, .x_t, .stl). Then I need to somehow "measure" the maximum dimensions to model a raw-material body, which will always have the shape of a huge panel. Once I have both bodies, I will take their difference to extract the solids I need to analyze; and from each of these bodies I need to extract the faces, and then the lines or curves of each face.
This sounds like something really easy to do in CAD software, but the idea is to automate the process. I was looking into OpenSCAD, but it seems to work only for modeling geometry and doesn't handle imported solids well. I'm leaving a picture of what I need to do in the link below.
So, any idea how I can do this? Which language and library can help with this project?
I can see this automation being possible with a few in-between steps:
OpenSCAD can handle differences well, so your "Extract Bodies" step seems plausible.
1.5 Before going further, you'll have to explain how you "filtered out" the cylinder. Will you do this manually? If you don't, it will be considered for analysis and you'll have a lot of faces as a result.
I don't think OpenSCAD provides you a vertex array. However, it can save to .STL, which is fairly easy to parse with the programming language of your choice; you'll have to study the .stl file structure a bit (this sounds much more frightening than it is: if you open an STL in a text editor you will probably immediately see what's happening).
Once you've parsed the file, you can calculate the lines with high-school math.
This is not an easy, GUI way to do what you ask, but if you have a few programming skills you'll have your automation, and depending on the number of your projects it may be worth it.
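For the parsing step, here's a minimal sketch of an ASCII-STL reader in plain Python (binary STL would need the `struct` module instead; the sample solid below is made up for the demo):

```python
# Minimal ASCII STL parser (stdlib only): collects one (v0, v1, v2)
# vertex triple per triangular facet. Real files also carry facet
# normals, ignored here for brevity.
def parse_ascii_stl(text):
    triangles, current = [], []
    for line in text.split("\n"):
        parts = line.split()
        if parts[:1] == ["vertex"]:
            current.append(tuple(float(x) for x in parts[1:4]))
            if len(current) == 3:
                triangles.append(tuple(current))
                current = []
    return triangles

stl = """solid demo
facet normal 0 0 1
  outer loop
    vertex 0 0 0
    vertex 1 0 0
    vertex 0 1 0
  endloop
endfacet
endsolid demo"""

print(parse_ascii_stl(stl))
```

From the triangle list you can then recover edges and face outlines with the "high-school math" mentioned above.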
I have been working on this project, and found that the library "trimesh" is better for solving this problem. Give it a shot and save some time.

What is a sensible way to store matrices (which represent images) either in memory or on disk, to make them available to a GUI application?

I am looking for some high level advice about a project that I am attempting.
I want to write a PyQt application (following the model-view pattern) to read in images from a directory one by one and process them. Typically there will be a few thousand .png images (each around 1 megapixel, 16-bit grayscale) in the directory. After being read in, the application will process the integer pixel values of each image in some way, and crucially the result will be a matrix of floats for each. Once processing is done, the user should be able to go back and explore any of the matrices they choose (or several at once), and possibly apply further processing.
My question is about a sensible way to store the matrices in memory and access them when needed. After reading in the raw .png files and obtaining the corresponding matrices of floats, I can see the following options for handling the results:
Simply store each matrix as a numpy array and have every one of them stored in a class attribute. That way they will all be easily accessible to the code when requested by the user, but will this be poor in terms of RAM required?
After processing each, write out the matrix to a text file, and read it back in from the text file when requested by the user.
I have seen examples (see here) of people using SQLite databases to store data for a GUI application (using MVC pattern), and then query the database when you need access to data. This seems like it would have the advantage that data is not stored in RAM by the "model" part of the application (like in option 1), and is possibly more storage-efficient than option 2, but is this suitable given that my data are matrices?
I have seen examples (see here) of people using something called HDF5 for storing application data, and that this might be similar to using a SQLite database? Again, suitable for matrices?
Finally, I see that PyQt has the classes QImage and QPixmap. Do these make sense for solving the problem I have described?
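For what it's worth, a variant of option 2 that avoids text files: a quick sketch (assuming NumPy, with an invented file name) that saves each processed matrix as a binary .npy file and reloads it memory-mapped, so the application does not have to hold every matrix in RAM at once.

```python
import os
import tempfile
import numpy as np

# Save one processed result as a binary .npy file: far smaller and
# faster than a text dump, and exact for floats.
outdir = tempfile.mkdtemp()
matrix = np.random.rand(100, 100).astype(np.float32)  # stand-in result
path = os.path.join(outdir, "frame_0001.npy")         # invented name
np.save(path, matrix)

# mmap_mode gives a lazy, memory-mapped view: pages are read from
# disk only as the GUI actually touches them.
reloaded = np.load(path, mmap_mode="r")
print(reloaded.shape, np.allclose(reloaded, matrix))  # (100, 100) True
```

A few thousand such files at ~4 MB each (1 Mpx of float32) is on the order of 10 GB on disk, which is worth checking against your storage budget before committing to this route.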
I am a little lost with all the options, and don't want to spend too much time investigating all of them in too much detail so would appreciate some general advice. If someone could offer comments on each of the options I have described (as well as letting me know if any can be ruled out in this situation) that would be great!
Thank you

Object recognition with CNN: what is the best way to train my model, photos or videos?

I aim to design an app that recognizes a certain type of object (let's say, a book) and that can say whether the input is effectively a book or not (binary classification).
For a better user experience, I would like the input to be a video rather than a picture: that way, the user won't have to deal with issues such as sharpness or centering of the object. They'll just have to make a "scan" of the object, without much consideration for the quality of a single image.
And here comes my problem: as I intend to create my training dataset from scratch (the object I want to detect being absent from existing datasets such as ImageNet), I was wondering whether videos are unsuitable for this type of binary classification and whether I should rather ask the user to take a good picture of the object.
On one hand, videos have the advantage of yielding a larger dataset than one created only from photos (though I can expand my picture dataset thanks to data augmentation), as it is easier to take a 10 s video of an object than to take 10x24 (more or less…) pictures of it.
But on the other hand, I fear the result will be less precise, as many frames in a video are redundant and the average quality might not be as good as that of a single, proper image.
Moreover, I do not intend to use the temporal properties of the video (in a scan, temporality is useless) but rather to work one frame at a time (as depicted in this article).
What is the proper way of constituting my dataset? I really would like to keep this "scan" for the user's comfort, so if images are more precise than videos for such a classification, is it possible to automatically extract a single good image from a "scan" and work directly on it?
Good question! The answer is: you should train your model on how you plan to use it. So if you ask the user to take photos, train it on photos. If you ask the user to film the object, train on frames extracted from video.
The images might seem blurry to you, but they won't be for a computer. It will just learn to detect "blurry books", but that's OK, that's what you want.
Of course this is not always the case. The image might become so blurry that the information about whether or not there is a book in the frame is no longer there. Where is the line? A general rule of thumb: if you can see it's a book, the computer will also see it. Since blurry images of books will usually still be recognizable as books, I think you could totally do it.
Creating "photos" (a single, sharp image) from a "scan" (blurrier frames from video) can be done; it's called super-resolution. But those models are pretty beefy, not something you would want to run on a mobile device.
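A much cheaper alternative to super-resolution, if you only need one usable image per scan, is to pick the sharpest frame with a classical focus measure (my suggestion, not something the answer above prescribes). A sketch using the variance of a discrete Laplacian, NumPy only, with synthetic frames standing in for real video:

```python
import numpy as np

# Classical focus measure: variance of a discrete Laplacian.
# Sharp frames have strong high-frequency content, so the Laplacian
# response varies a lot; flat/blurry frames score near zero.
def sharpness(frame):
    lap = (-4 * frame
           + np.roll(frame, 1, axis=0) + np.roll(frame, -1, axis=0)
           + np.roll(frame, 1, axis=1) + np.roll(frame, -1, axis=1))
    return lap.var()

rng = np.random.default_rng(0)
sharp = rng.random((64, 64))      # lots of high-frequency detail
blurry = np.full((64, 64), 0.5)   # no detail at all
frames = [blurry, sharp, blurry]  # stand-in for decoded video frames

best = max(range(len(frames)), key=lambda i: sharpness(frames[i]))
print(best)  # prints 1, the index of the detailed frame
```

This runs comfortably on a phone, and you could apply it per second of video to keep only the best frames for training or inference.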
On a completely unrelated note: try googling Transfer Learning! It will benefit you for sure :D.

Right way to equalize an MP3 file in Python

I'm currently working on a little Python script to equalize MP3 files.
I've read some docs about the MP3 file format (at https://en.wikipedia.org/wiki/ID3), and I've noticed that in the ID3v2 format there is a field for equalization (EQUA, EQU2).
Using the Python library mutagen I've tried to extract this information from the MP3, but the field isn't present.
What's the right way to equalize an MP3 file regardless of the ID3 version?
Thanks in advance. Creekorful
There are two high-level approaches you can take: modify the encoded audio stream, or put metadata on it describing the desired change. Modifying the audio stream is the most compatible, but generally less desirable. However, ID3v1 has no place for this metadata, only ID3v2.2 and up do.
Depending on what you mean by equalize, you might want equalization information stored in the EQA/EQUA/EQU2 frames, or a replay-gain volume adjustment stored in the RVA/RVAD/RVA2 frames. Mutagen supports the linked frames, i.e. all but EQA/EQUA. If you need those, it should be straightforward to add them from the information in the actual specification (see section 4.12 at http://id3.org/id3v2.4.0-frames). With tests, they could likely be contributed back to the project.
Note that Quod Libet, the player paired with Mutagen, has taken a preference for reading and storing replay gain information in a TXXX frame.

I want to create a jpg resizer in pure python. What are good resources for understanding the jpg format?

From my current understanding, PNG is relatively easy to decode compared to formats like JPEG, and pure-Python implementations of it already exist. For my own purposes, though, I need the JPEG format.
What are good resources for building a JPEG library from scratch? At the moment I only wish to support resizing of images, but this would presumably involve both encoding and decoding ops.
Edit: to make myself clearer: I am hoping there is a high-level, design-type treatment of how to implement a JPEG library in code: specifically, considerations when encoding/decoding, perhaps even pseudocode. Maybe it doesn't exist, but better to ask and stand on the shoulders of giants than to reinvent the wheel.
Use PIL; it already has high-level APIs for image handling.
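For the stated goal (resizing only), that route is a few lines. A sketch with Pillow, the maintained PIL fork (the in-memory JPEG here just stands in for a file on disk):

```python
import io
from PIL import Image

# Create a throwaway JPEG in memory; with a real file you would
# simply call Image.open("photo.jpg") instead.
src = Image.new("RGB", (800, 600), "navy")
buf = io.BytesIO()
src.save(buf, "JPEG")
buf.seek(0)

# Pillow handles the JPEG decode, the resampling, and the re-encode.
img = Image.open(buf)
resized = img.resize((400, 300), Image.LANCZOS)
print(resized.size)  # prints: (400, 300)
```

Writing the result back out is just `resized.save("out.jpg", quality=90)`.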
If you say "I don't want to use PIL" (and remember, there are private/unofficial ports to 3.x), then I would say read the Wikipedia article on JPEG, as it describes the basics and also links to in-depth articles/descriptions of the JPEG format.
Once you've read over that, pull up the source code for PIL's JPEG plugin to see what they are doing there (it is surprisingly simple stuff). About the only thing they really import is Image, a class they made to hold the raw image data.
