I was looking for a more flexible way to detect an incomplete JPEG than the common technique, which consists of searching for the EOI bytes at the end of the file. The idea is not to flag an image as incomplete when only a small percentage of the image data is missing. However, it seems this is not an easy thing.
The first idea was to try to detect which MCUs are corrupted or missing, but given that the image data is Huffman-coded, that is not an easy job.
The other option I had was to calculate the amount of content expected after the SOS marker; then, if the actual content does not match the expected amount, I can calculate how much image data is missing.
Can anybody help me with this? I have tried several tools like ImageMagick or PIL (Python), but I cannot find a proper solution for that.
Thanks.
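As a rough starting point for the marker-based idea in the question, here is a minimal sketch using only the Python standard library. It walks the segments up to the first SOS marker, reads the frame size from the SOF segment, and reports how many bytes follow the SOS header and whether the data ends with EOI. The name inspect_jpeg and the keys of the returned dict are invented for this sketch. It assumes a single-scan file with no fill bytes between segments. Note that the JPEG stream does not record the expected length of the entropy-coded data, so this yields a byte count, not a percentage of missing MCUs.

```python
import struct

def inspect_jpeg(data: bytes) -> dict:
    """Walk JPEG marker segments up to the first SOS (hypothetical helper)."""
    info = {"width": None, "height": None, "scan_bytes": 0, "has_eoi": False}
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in (0xC0, 0xC1, 0xC2) and pos + 9 <= len(data):
            # SOF payload: precision (1 byte), height (2), width (2), ...
            info["height"], info["width"] = struct.unpack(
                ">HH", data[pos + 5:pos + 9])
        if marker == 0xDA:
            # Everything after the SOS header is treated as scan data.
            scan_start = pos + 2 + length
            has_eoi = data.endswith(b"\xff\xd9")
            scan_end = len(data) - 2 if has_eoi else len(data)
            info["scan_bytes"] = max(0, scan_end - scan_start)
            info["has_eoi"] = has_eoi
            break
        pos += 2 + length
    return info
```

Comparing scan_bytes against the values seen for complete files of the same dimensions and quality settings would only give a heuristic estimate, since the compressed size depends on the image content.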
Related
I need to convert a lot of poor-quality scanned PDF tables to Excel tables. The only solution I can see is to train Tesseract or some other framework on pre-generated images (all tables in the PDFs are the same in most cases). Is it realistic to get a good solution, around 70-80% accuracy, in home conditions, and what would you advise? I would appreciate any advice other than ABBYY FineReader or similar solutions (tested on my dataset: the results were bad and there were few opportunities for automation).
All table structures need to be correct in the result for further manual work.
You should use a PDF parser for that.
Here's the parsed result using Parsio (https://parsio.io). It looks correct to me. You can export the parsed data to Sheets / Excel / CSV / Zapier.
When the input image is of very poor quality, the dirt tends to get in the way of text recognition. This is exacerbated when trying to recognise areas without dictionary entries, so number-only text can be the worst type of text to train on, given every twist and turn that bad scanning produces.
If the electronic source from before the manual stamp and scan is available, it might be possible to meld the text with the distorted image, but it's a highly manual task, defeating the aim.
The documents need to be rescanned by a trained operator with a good eye for detail. That, with an OCR scan device, will be faster than tuning images that are never likely to provide a reasonably trustworthy output. There are too many cases of numeric failures, which would make any single page worthless for reading or computations.
I recently scanned some accounts and spent more time checking and correcting than if they had been typed, but it needed to be a "legal" copy; clearly it was not, however, as I did it after the event.
The best result I could squeeze from Adobe PDF to Excel was "Pants"
Adjusting image contrast and reducing noise (by hand) gives some improvement.
There is some effect, but it is not obvious.
Image2word
I was asked this peculiar question today and I couldn't give a straight answer.
I have an image depicting base64 text. How can I convert this to text?
I tried this via pytesseract, but Tesseract has a language component that garbles the text. So I don't think that's the way to go. I tried researching a bit, but it seems it's not a common problem (to say the least). I've no clue how it could be useful, but for sure it's vexing!
What other things could I try?
What an interesting question. This task isn’t super irregular, however, as I’ve seen people extract plenty of jumbled words from images before. Extracting a long jumbled line of base64 text could prove to be more challenging. Some OCR tools I’ve seen used are:
opencv-python wrapper of OpenCV
pytesseract wrapper of Tesseract (As you stated)
More OCR wrappers I found other than the two popular ones: https://pythonrepo.com/repo/kba-awesome-ocr-python-computer-vision
For these to work, the image also needs to be of fairly good quality. If the base64 image is predictable and in a structured form, you could create your own reference images and compare them to the original to determine each character in the string, bypassing the need for OCR completely.
There are limitations to OCR, obviously, such as the fact that the image needs scaling, contrast, and alignment, and any small error can ruin the base64 text. I have never seen OCR used for such a thing before, so I’m unsure where to go past there, but I am positive you are on the right track!
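Since any small error can ruin the base64 text, one cheap safeguard is to validate whatever the OCR step returns before trusting it. The sketch below uses only the Python standard library; check_base64_candidate is a name made up for this example, and it assumes the standard base64 alphabet (A-Z, a-z, 0-9, +, /) with = padding.

```python
import base64
import binascii
import string

# Characters allowed in standard base64, including the padding sign.
B64_ALPHABET = set(string.ascii_letters + string.digits + "+/=")

def check_base64_candidate(text: str):
    """Return the decoded bytes, or None if the OCR output is not valid base64."""
    # OCR output usually contains line breaks and stray spaces; drop them.
    cleaned = "".join(text.split())
    stray = set(cleaned) - B64_ALPHABET
    if stray or len(cleaned) % 4 != 0:
        return None
    try:
        return base64.b64decode(cleaned, validate=True)
    except binascii.Error:
        return None

print(check_base64_candidate("aGVs\nbG8="))
```

This only catches structural damage. Confusions between characters that are all legal in base64, such as l, I and 1 or O and 0, still decode without error, so they need a different check (for example a known checksum of the decoded data, if one is available).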
I have read a lot of essays and articles about image compression algorithms. There are many algorithms, and I can only understand some of them because I'm a student and I haven't gone to high school yet. I read this article, which helped me a lot! (Article, page 3, the part on run-length coding.) It's a very easy and helpful algorithm, but I don't know how to make a new image format. I am a Python developer, but I don't know how to make a new format that has its own algorithm and program, like .jpeg, .jpg, .png, .bmp.
(Sorry, I have studied English for 1 year, so if I have made some mistakes in grammar or vocabulary, please excuse me.)
Sure, you can make your own image file format. Choose a filename extension, define how it will be stored and write Python code to:
read the format from disk into a Numpy array, and
write an image contained in a Numpy array to disk
That way you will be interoperable with all the major image processing libraries such as OpenCV, scikit-image, PIL, wand.
Have a look at how NetPBM works to get started with a simple format. Maybe look at the PCX format if you like the thought of RLE.
Read up on how to write binary to a file with Python.
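To make the steps above concrete, here is a minimal sketch of a made-up run-length format for 8-bit greyscale images. Everything about the layout is an assumption chosen for illustration: a 4-byte signature RLE1, the width and height as big-endian 32-bit integers, then (run length, pixel value) byte pairs with runs capped at 255. It works on plain bytes so it needs only the standard library.

```python
import struct

MAGIC = b"RLE1"  # hypothetical signature for this made-up format

def encode_rle(pixels: bytes, width: int, height: int) -> bytes:
    """Pack row-major 8-bit greyscale pixels into the made-up RLE1 layout."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match width * height")
    out = bytearray(MAGIC + struct.pack(">II", width, height))
    i = 0
    while i < len(pixels):
        value = pixels[i]
        run = 1
        # Extend the run while the value repeats; one byte holds at most 255.
        while i + run < len(pixels) and pixels[i + run] == value and run < 255:
            run += 1
        out += bytes((run, value))
        i += run
    return bytes(out)

def decode_rle(blob: bytes):
    """Return (width, height, pixels) from data written by encode_rle."""
    if blob[:4] != MAGIC:
        raise ValueError("not an RLE1 file")
    width, height = struct.unpack(">II", blob[4:12])
    pixels = bytearray()
    for k in range(12, len(blob) - 1, 2):
        run, value = blob[k], blob[k + 1]
        pixels += bytes([value]) * run
    if len(pixels) != width * height:
        raise ValueError("truncated or corrupt pixel data")
    return width, height, bytes(pixels)
```

Writing the result of encode_rle to a file opened in "wb" mode, and passing the contents of a file opened in "rb" mode to decode_rle, gives the disk side. If NumPy is available, the decoded bytes can be wrapped with numpy.frombuffer(pixels, dtype=numpy.uint8).reshape(height, width).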
Is there a good way to identify (or at least approximate) the graphics program used to obtain a particular image? For instance, I want to know if there is a certain signature that these programs embed into an image. Any suggestions?
If not, is there a reference where I can find what all meta-information can be extracted out of an image?
Certain image file formats do have meta-data. It is format dependent. Digital cameras usually write some of their information into the meta-data. EXIF is what comes to mind. Images not acquired through a digital camera may or may not have relevant meta-data, so you can't consider meta-data of any sort to be a guaranteed reliable identifier. That's about as much as I can give as an answer, alas. I'm sure someone else may have more details.
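As one format-specific illustration: the PNG specification defines a tEXt chunk with predefined keywords, one of which is Software, and EXIF has a comparable Software tag (0x0131). The sketch below, standard library only, lists the tEXt entries of a PNG held in memory. The function name png_text_chunks is invented for this example, it does not verify chunk CRCs, and it ignores the compressed zTXt and international iTXt variants. Many files carry no such entry, and the entry can be edited freely, so treat it as a hint only.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_text_chunks(data: bytes) -> dict:
    """Collect keyword/text pairs from the tEXt chunks of a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    found = {}
    pos = 8
    while pos + 8 <= len(data):
        # Each chunk: 4-byte length, 4-byte type, body, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tEXt" and b"\x00" in body:
            # tEXt body is: keyword, a NUL separator, then Latin-1 text.
            key, _, value = body.partition(b"\x00")
            found[key.decode("latin-1")] = value.decode("latin-1")
        if ctype == b"IEND":
            break
        pos += 12 + length
    return found
```

A typical call would be png_text_chunks(open(path, "rb").read()).get("Software"), where path is whatever file is being examined.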
I have got some samples showing how to open a presentation and access the slides and shapes. But I want to do some other operations (e.g. generate a thumbnail from a specified slide). What methods can I use? Is there any documentation illustrating all the functionality?
Not to discourage you, but my experience using COM from Python is that you won't find many examples.
I would be shocked (but happy to see) if anybody posted a big tutorial or reference using PowerPoint in Python. Probably the best you'll find, which you've probably already found, is this article
However, if you follow along through that article and some of the other Python+COM code around, you start to see the patterns of how VB and C# code converts to Python code using the same interfaces.
Once you understand that, your best source of information is probably the PowerPoint API reference on MSDN.
From looking at the samples Jeremiah pointed to, it looks like you'd start there then do something like this, assuming you wanted to export slide #42:
Slide = Presentation.Slides(42)
Slide.Export(FileName, "PNG", 1024, 768)
For FileName, substitute the full path\filename.ext of the file you want to export to; string.
Use PNG, JPG, GIF, WMF, EMF, TIF (not always a good idea from PowerPoint), etc; string
The next two numbers are the width and height (in pixels) at which to export the image; VBLong (signed 32-bit (4-byte) numbers ranging in value from -2,147,483,648 to 2,147,483,647)
I've petted pythons but never coded in them; this is my best guess as to syntax. Shouldn't be too much of a stretch to fix any errors.