I'm trying to work with files from this site:
NADP Website
The files are .e00 format. When I attempt to open them with GeoPandas, I get a message that they appear to be compressed.
If I try using e00conv or AVCE00 to decompress the files, then open them with GeoPandas, I get a FionaValueError, that no dataset has been found.
Any suggestions for how I can get these files opened so I can put them in a format I can use?
I can load the decompressed file using np.fromfile but then all I have is a vector.
I finally figured this out. In this instance, even though the .e00 format is not usually used to store raster file, these files are raster images. They open fine with rasterio.
Related
I wanted to create a pdf using Python 3x.
The pdf should have some text data which is stored in a .xlsx file i.e.., it should read data from .xlsx file and write into the .pdf file.
Along with that, the pdf should have a png image of passport size.
I have come up with two basic ideas which are:-
First one is by writing a program which create a text file in which all required data from the pdf will be written along with the png image. After that the program will convert it into a pdf file.
Second one is by writing a program which will create the pdf file and write the data from .xlsx file as well as insert the image too into the pdf file.
I don't know whether these ideas can be used or not and how it can be used but after going through some researches on GFG, Stack overflow..., I have got totally confused and ended up asking this problem on this platform.
I have tried some modules like PIL, FPDF, reportlab,.. and am successfully able to create a pdf file with either texts or images but unable to combine both in the same text file.
Also I am confused in deciding which idea I should implement.
What I need from you guys is the answer of few of my questions which are:-
Are the ideas I mentioned above(second one specially) practically possible?
Can I make a program which imports data from file as well as png image into the same pdf. What modules and functions will be used there and how.
Please provide the code with comments or defining/elaborating the work of function used.
I hope I will get the desired result soon. Meanwhile I will try to solve it out by myself.
I have a number of HDF5 files that I want to convert the data inside these files to JPG or PNG format.
This is my HDF5 files link.
These files are from the PICMUS challenge and the only way to open them is through the MATLAB codes provided in this link.
However, I need to open these files in Python or at least convert the data inside them into PNG or JPG images so that I can use them in my Python project.
I tried different ways but none of them worked. Can anyone convert these files for me or at least tell me how to open them?
Please provide me a detailed solution.
I'm trying to analyze genome data from a huge (1.75GB compressed) vcf file using Python. The technician suggested I use scikit-allel and gave me this link: http://alimanfoo.github.io/2017/06/14/read-vcf.html. I wasn't able to install the module on my computer; but I successfully installed it on a cluster which I access through vpn. There, I successfully opened the file and have been able to access the data. But I can only access the cluster through a command line interface, and that isn't as friendly as the Spyder I have on my computer; so I've been trying to bring the data back. The GitHub link says I can save the data into a npz file which I can read straight into Python's numpy; so I've been trying to do that.
First, I tried allel.vcf_to_npz('existing_name.vcf','new_name.npz',fields='calldata/GT') on the cluster. This created a (suspiciously small) new npz file on the cluster, which I downloaded. But when I opened up Spyder on my computer and typed genotypes=np.load('real_genotypes.npz'), no new variable called genotypes appeared in the Variable Explorer. Adding the line print(genotypes) produces <numpy.lib.npyio.NpzFile object at 0x00000__________>
Next, thinking that I should copy everything to be sure, I tried allel.vcf_to_npz('existing_name.vcf','new_name.npz',fields='*',overwrite=True)
This created a 2.10GB file. After a lengthy download, I tried the same thing, but got the same results: No new variable when I try to np.load the file, and <numpy.lib.npyio.NpzFile object at 0x000001DB0DEC7F88> when I ask to print it.
When I tried to Google search this problem, I saw this question: Load compressed data (.npz) from file using numpy.load. But my case looks different. I don't get an error message; I just get nothing. So what's wrong?
Thanks
We get PDF files delivered to us daily and we need to get the images out. For example, what I want to do is to get the image back out of this PDF file I have, with python. Most pdf files we get are multipage and we want to export each embedded image to separate files. Most have jpeg files in them, but his one does not.
Object 5 is embedded as a zlib compressed stream. I am pretty sure it is zlib compressed because it is marked as FlateDecode and the start of the stream is \x78\x9c which is typical for zlib. You can see (part of) the hex dump here
The question is, how do I 'deflate' it and save the resulting file.
Thank you for sharing your wisdom.
I searched everywhere and tried many things but couldn't get to work. I managed to decompress the data like this:
import zlib
with open("MDL1703140088.pdf", "rb") as f:
pdf = f.read()
image = zlib.decompress(pdf[640:69307])
640 is zlib header(b'x\x9c') position and 69307 is the position of something like footer of pdf spec. b'\nendstream\n' is there. Detail is in the spec and some helpful Q&A can be found here. But omitting the end position is allowed in this case because decompress() seems to ignore following non-compressed data. You can validate this by:
decomp = zlib.decompressobj()
image = decomp.decompress(pdf[640:])
print(decomp.unused_data) # starts from b'\nendstream\n
So far so good. But when I write image to a PNG file, it cannot be read by any image viewer. Actually decompressed data looks so quite empty here and there. I attached some PNG header, but no luck. Hey, it's too much...
As I said earlier (strangely my comment was removed by someone), you'd better use some other existing tools. If Acrobat is not your option, what about pdftopng (part of Xpdf)? pdftopng MDL1703140088.pdf . gave me a valid PNG file flawlessly. Obviously command-line tools can be executed in Python, as you may know.
I'm trying to work with WAV files using scipy.io.wavfile but the file I want to read has headers inside of it (NIST). I tried deleting the headers (that was dumb), I tried using other libraries (wave), custom functions found online but It still doesn't work. I get "Not a WAV file."
The .wav files are from mocha-timit, a speech training corpus.
can someone help me out?
Thanks.
Some things you could try:
(1) Use scikits.audiolab as in this question
(2) Convert the wav format from NIST format to the standard RIFF format with a tool like sndfile-convert from 'libsndfile' (You'll need to change the original file endings to .nist).
I got (2) to work on my own system and could read the files with scipy.io.wavfile.read