I have a number of HDF5 files, and I want to convert the data inside them to JPG or PNG format.
This is my HDF5 files link.
These files are from the PICMUS challenge, and the only way I know of to open them is through the MATLAB code provided in this link.
However, I need to open these files in Python or at least convert the data inside them into PNG or JPG images so that I can use them in my Python project.
I tried different ways but none of them worked. Can anyone convert these files for me or at least tell me how to open them?
Please provide me a detailed solution.
Related
I have a bunch of p7m files (used to digitally sign some files, usually pdf files) and I would like some help to find a way to extract the content. I know how to iterate a process over the files in a folder using Python, I need help just with the extraction part.
I tried PyPDF2.PdfFileReader.decrypt(), but I get an "EOF marker not found" error, apparently because PyPDF2 cannot handle encrypted files.
I saw somebody used the mime library, but that is way above my level honestly.
Thank you
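A minimal sketch of one possible extraction route, assuming the OpenSSL command-line tool is installed and the .p7m files are binary (DER) envelopes; both are assumptions, and the function names build_extract_command and extract_folder are made up for illustration. It calls `openssl smime -verify` to write the wrapped content next to each .p7m file.

```python
import subprocess
from pathlib import Path


def build_extract_command(p7m_path, out_path):
    """Argument list that asks OpenSSL to unwrap one .p7m file."""
    return [
        "openssl", "smime", "-verify",
        "-noverify",        # do not verify the signer's certificate
        "-inform", "DER",   # assumption: binary (DER) envelope
        "-in", str(p7m_path),
        "-out", str(out_path),
    ]


def extract_folder(folder):
    """Unwrap every *.p7m file in folder, writing the payload next to it."""
    for p7m in Path(folder).glob("*.p7m"):
        target = p7m.with_suffix("")  # "doc.pdf.p7m" -> "doc.pdf"
        subprocess.run(build_extract_command(p7m, target), check=True)
```

If a file turns out to be Base64/PEM rather than DER, the "-inform" value would need to change accordingly.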
I want to create a PDF using Python 3.x.
The PDF should contain some text data that is stored in a .xlsx file, i.e., the program should read data from the .xlsx file and write it into the .pdf file.
In addition, the PDF should contain a passport-size PNG image.
I have come up with two basic ideas:
The first is to write a program that creates a text file containing all the data required for the PDF, along with the PNG image, and then converts that file into a PDF.
The second is to write a program that creates the PDF file directly, writes the data from the .xlsx file into it, and inserts the image as well.
I don't know whether these ideas can be used, or how. After some research on GFG, Stack Overflow, and similar sites, I got totally confused and ended up asking here.
I have tried modules such as PIL, FPDF, and reportlab, and I can successfully create a PDF file with either text or images, but I am unable to combine both in the same PDF file.
I am also unsure which idea I should implement.
What I need from you is an answer to a few questions:
Are the ideas I mentioned above (especially the second one) practically possible?
Can I write a program that puts data imported from a file and a PNG image into the same PDF? Which modules and functions would be used, and how?
Please provide the code with comments explaining what each function does.
I hope I will get the desired result soon. Meanwhile I will try to solve it out by myself.
I have thousands of .wfs (Windows script) files, and I am looking for a way to convert this huge number of .wfs files into .csv files using Python (Anaconda or Spyder).
Is there an I/O API similar to pandas that can read .wfs files with a function like read_excel? Or can Python read script files directly?
Or could I quickly convert these script files into a format Excel can read? (I know how to convert Excel files into CSV using Python.)
An example .wfs along with the converted csv is contained in the link below:
https://www.dropbox.com/sh/uwbajpubzuxn7g5/AABbD7W4pXFlxiIi1UAlHKTFa?dl=0
We get PDF files delivered to us daily and we need to get the images out. For example, I want to get the image back out of this PDF file using Python. Most PDF files we get are multipage, and we want to export each embedded image to a separate file. Most have JPEG files in them, but this one does not.
Object 5 is embedded as a zlib-compressed stream. I am pretty sure it is zlib-compressed because it is marked as FlateDecode and the stream starts with \x78\x9c, which is typical for zlib. You can see (part of) the hex dump here.
The question is: how do I 'deflate' it and save the resulting file?
Thank you for sharing your wisdom.
I searched everywhere and tried many things but couldn't get it to work. I did manage to decompress the data like this:
import zlib

with open("MDL1703140088.pdf", "rb") as f:
    pdf = f.read()

# Offsets were found by inspecting this particular file
image = zlib.decompress(pdf[640:69307])
640 is the position of the zlib header (b'x\x9c'), and 69307 is the position of what amounts to the stream footer in the PDF spec; b'\nendstream\n' is there. The details are in the spec, and some helpful Q&A can be found here. Omitting the end position is also allowed in this case, because decompress() seems to ignore trailing non-compressed data. You can verify this with:
decomp = zlib.decompressobj()
image = decomp.decompress(pdf[640:])
print(decomp.unused_data)  # starts with b'\nendstream\n'
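Rather than hard-coding offsets such as 640 and 69307, the stream boundaries can be searched for. The following is only a rough sketch: extract_flate_streams is a made-up helper name, and it assumes the byte sequences "stream" and "endstream" do not happen to occur inside the compressed data. It is a plain byte scan, not a real PDF parser.

```python
import re
import zlib


def extract_flate_streams(pdf_bytes):
    """Return the decompressed payload of every zlib stream that is found."""
    results = []
    # A stream body begins after the keyword "stream" plus an end-of-line.
    # The lookbehind keeps the pattern from matching inside "endstream".
    for match in re.finditer(rb"(?<!end)stream\r?\n", pdf_bytes):
        start = match.end()
        end = pdf_bytes.find(b"endstream", start)
        if end == -1:
            continue
        decomp = zlib.decompressobj()
        try:
            data = decomp.decompress(pdf_bytes[start:end])
        except zlib.error:
            continue  # not zlib data (for example an embedded JPEG), skip
        results.append(data)
    return results
```

Called as extract_flate_streams(pdf), it returns a list of byte strings, one per stream that zlib could decode.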
So far so good. But when I write image to a PNG file, no image viewer can read it. The decompressed data actually looks quite empty here and there. I tried attaching a PNG header, but no luck. Hey, it's too much...
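One likely reason a bare header does not help: a FlateDecode image stream in a PDF normally holds raw pixel samples, not a complete PNG file, and a PNG needs a signature, checksummed chunks, and a filter byte in front of every scanline. Here is a minimal sketch using only the standard library. It assumes 8-bit grayscale samples with no PDF predictor applied, which may not match this file. The names png_chunk and gray8_to_png are made up, and width and height would have to be read from /Width and /Height in the image dictionary.

```python
import struct
import zlib


def png_chunk(kind, body):
    """Build one PNG chunk: length, type, body, CRC over type + body."""
    crc = zlib.crc32(kind + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + kind + body + struct.pack(">I", crc)


def gray8_to_png(samples, width, height):
    """Wrap raw 8-bit grayscale samples (row-major) in a PNG container."""
    if len(samples) < width * height:
        raise ValueError("not enough sample bytes for the given size")
    # Every PNG scanline starts with a filter-type byte; 0 means "none".
    rows = b"".join(
        b"\x00" + samples[y * width:(y + 1) * width] for y in range(height)
    )
    # IHDR: width, height, bit depth 8, colour type 0 (grayscale),
    # compression method 0, filter method 0, no interlacing.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (b"\x89PNG\r\n\x1a\n"
            + png_chunk(b"IHDR", ihdr)
            + png_chunk(b"IDAT", zlib.compress(rows))
            + png_chunk(b"IEND", b""))
```

If the image dictionary declares a different bit depth, colour space, or a /DecodeParms predictor, the samples would need to be converted first, and this sketch does not attempt that.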
As I said earlier (strangely, my comment was removed by someone), you'd better use some other existing tool. If Acrobat is not an option for you, what about pdftopng (part of Xpdf)? pdftopng MDL1703140088.pdf . gave me a valid PNG file flawlessly. Command-line tools can of course be executed from Python, as you may know.
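Calling the tool from Python could look roughly like this. It is a sketch that assumes pdftopng is installed and on PATH; build_pdftopng_command and render_pdf_pages are made-up helper names. Note that pdftopng renders whole pages to PNG rather than extracting the embedded image objects themselves.

```python
import shutil
import subprocess


def build_pdftopng_command(pdf_path, png_root):
    """Return the argument list for: pdftopng PDF-file PNG-root."""
    return ["pdftopng", str(pdf_path), str(png_root)]


def render_pdf_pages(pdf_path, png_root):
    """Run pdftopng; raises CalledProcessError if the tool reports failure."""
    if shutil.which("pdftopng") is None:
        raise FileNotFoundError("pdftopng (Xpdf) was not found on PATH")
    subprocess.run(build_pdftopng_command(pdf_path, png_root), check=True)
```

For example, render_pdf_pages("MDL1703140088.pdf", "page") would pass "page" as the PNG root, so the output file names start with that prefix.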
I'm trying to work with files from this site:
NADP Website
The files are .e00 format. When I attempt to open them with GeoPandas, I get a message that they appear to be compressed.
If I try using e00conv or AVCE00 to decompress the files, then open them with GeoPandas, I get a FionaValueError, that no dataset has been found.
Any suggestions for how I can get these files opened so I can put them in a format I can use?
I can load the decompressed file using np.fromfile but then all I have is a vector.
I finally figured this out. In this instance, even though the .e00 format is not usually used to store raster files, these files are raster images. They open fine with rasterio.