How to check if a file is a .gz file in Python

I am working on Python input/output and was given a CSV file that may be gzipped. If it is gzipped, I have to decompress it and then read it.
I tried reading the first two bytes like this:
def func(filename):
    fi = open(filename, "rb")
    byte1 = fi.read(1)
    byte2 = fi.read(1)
Then I will check whether byte1 and byte2 are equal to 0x1f and 0x8b; if they are, I decompress the file and print every line of it.
But when I run it, I get this error:
TypeError: 'NoneType' object is not iterable
I'm new to Python. Can anyone help?

Going by what you said in the comment ("that's all I have in the function"), I would assume the issue is that the function has no return statement, so it returns None. The caller then probably tries to iterate over the result of the function call, which raises the TypeError about 'NoneType'.

You can use endswith() to check whether a file in a folder has the .gz extension, then use the gzip module to decompress it and read its contents:
import os
import gzip

directory = r"C:\Directory_name"
for name in os.listdir(directory):
    if name.endswith(".gz"):
        print(name)
        # Build the full path instead of changing the working directory
        with gzip.open(os.path.join(directory, name), 'rb') as f:
            file_content = f.read()
Here the "file_content" variable will hold the decompressed data of your gzipped CSV file.
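The approach from the question, checking the first two bytes instead of trusting the file extension, could look roughly like the sketch below. The function names is_gzipped and read_lines are made up for illustration, and UTF-8 is an assumed encoding for the CSV data.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # the two bytes the question checks for


def is_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic number."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def read_lines(path):
    """Return the file's text lines, decompressing first if it is gzipped."""
    # gzip.open and open accept the same text-mode arguments here
    opener = gzip.open if is_gzipped(path) else open
    with opener(path, "rt", encoding="utf-8") as f:  # UTF-8 is an assumption
        return f.read().splitlines()
```

Note that both functions return a value, which avoids the 'NoneType' error discussed in the first answer: a caller can safely write `for line in read_lines(filename): print(line)`.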

Related

TypeError: argument should be integer or None, not '_io.TextIOWrapper'

f = open(r"C:\Users\raghu\3D Objects\myDJZ.py","w+")
f.read(f)
This is my code..
I don't understand what I am doing wrong
Your problem is that you pass the file object itself as the argument to read(). In addition, you open the file in "w+" mode, which truncates it, so there would be nothing left to read.
Your variable "f" is a file object, which supports input and output operations.
To read the contents of the file, use its read() method.
read() takes an optional parameter, which must be an integer. It tells read() at most how many characters (or bytes, in binary mode) to return, e.g.
file = open(<dir_to_some_file>, 'r')
text = file.read(25)
The variable text will contain the first 25 characters of the file content.
If you need to read all the contents of the file, just do not pass any parameter to the read method, e.g.
file = open(<dir_to_some_file>, 'r')
text = file.read()
To write to a file:
text_to_file = 'This will be written to file'
file = open(<dir_to_some_file>, 'w')
file.write(text_to_file)
Use this for reference:
https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files
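As a rough, self-contained illustration of the difference between read(n) and read(), the sketch below writes a scratch file and reads it back. The file name demo.txt and the temporary directory are assumptions made only for this demonstration.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

with open(path, "w") as f:
    f.write("This will be written to file")

with open(path, "r") as f:
    first = f.read(4)   # at most 4 characters
    rest = f.read()     # everything from the current position to the end

print(repr(first))
print(repr(rest))
```

The second call continues where the first one stopped, so rest does not include the four characters already returned.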
f.read(4) means you only want to read 4 characters (4 bytes in binary mode). Since you pass a TextIOWrapper instance to read(), you get this error because that object cannot be interpreted as an integer. As already mentioned, simply call f.read() without an argument.
edit: See other answer, also the mode is wrong

Python: how to pass a file from a zip to a function that reads data from that file

I have a zip-file that contains .nrrd type files. The pynrrd lib comes with a read function. How can I pull the .nrrd file from the zip and pass it to the nrrd.read() function?
I tried the following, but it gives this error at the nrrd.read() line:
TypeError was unhandled by user code, file() argument 1 must be encoded string without NULL bytes, not str
in_dir = r'D:\Temp\Slikvideo\JPEG\SV_4_1_mask'
zip_file = 'Annotated.mitk'
zf = zipfile.ZipFile(in_dir + '\\' + zip_file)
f_name = 'datafile.nrrd' # .nrrd file in zip
file_nrrd = zf.read(f_name) # pull the file from the zip
img_nrrd, options = nrrd.read(file_nrrd) # read the .nrrd image data from the file
I could write the file pulled from the .zip to disk, and then read it from disk with nrrd.read() but I am sure there is a better way.
I think your approach (writing the file to disk and reading it back) is a reasonable way to do it.
There is a similar question here:
Similar question
In addition:
I think the problem may be that you do not pass the mode argument when you call zipfile.ZipFile.
Try using:
zipfile.ZipFile(path, "r")
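The following works, because zf.extract() writes the member to disk (into the current working directory by default) and returns its path, which nrrd.read() can then open:
file_nrrd = zf.extract(f_name) # extract the file from the zip and get its path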
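If the extracted copy should not be left behind in the working directory, one option is to extract into a temporary directory. The sketch below is only an illustration: read_member_via_tempdir is a made-up helper name, and reader stands for any function that takes a file path, which is the role nrrd.read plays in the question.

```python
import tempfile
import zipfile


def read_member_via_tempdir(zip_path, member, reader):
    """Extract `member` into a temporary directory, call reader(path) on the
    extracted file, and return the result. The directory is removed afterwards."""
    with zipfile.ZipFile(zip_path) as zf, tempfile.TemporaryDirectory() as tmp:
        extracted_path = zf.extract(member, path=tmp)  # returns the path on disk
        return reader(extracted_path)
```

This assumes the reader loads all the data it needs before returning, because the extracted file no longer exists once the function has finished.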

unable to read the data from file in python

I am unable to read data from a file in Python. Below is the sample code and the error I am getting.
abc.txt contains the value 2015-05-07.
f = open("/opt/test/abc.txt","r")
f.read()
last_Exe_date = f.read()
f.close()
While reading the file (abc.txt) I get the error: TypeError: argument 1 must be string or read-only character buffer, not file. I am not able to read the value from the file (abc.txt) into last_Exe_date. Could you please correct me if I am wrong with the code?
When you read a file once, the cursor is at the end of the file and you won't get anything more by re-reading it. Read through the docs to understand it more. And use readline to read a file line by line.
Oh, and remove the semicolon at the end of your read calls...
The following should work fine:
f = open("/opt/test/abc.txt","r")
last_Exe_date = f.read()
f.close()
As was said, you had f.read() twice, so when you tried to store the contents into last_Exe_date, it would be empty.
You could also consider using the following method:
with open("/opt/test/abc.txt","r") as f:
    last_Exe_date = f.read()
This will ensure that the file is automatically closed afterwards.
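To see why the second read() comes back empty, here is a small sketch that uses a scratch file in a temporary directory as a stand-in for /opt/test/abc.txt (the stand-in path is an assumption for the demonstration only).

```python
import os
import tempfile

# Hypothetical stand-in for /opt/test/abc.txt
path = os.path.join(tempfile.mkdtemp(), "abc.txt")
with open(path, "w") as f:
    f.write("2015-05-07")

with open(path, "r") as f:
    first_read = f.read()    # consumes the whole file
    second_read = f.read()   # the position is already at the end of the file

print(repr(first_read))
print(repr(second_read))
```

The first call returns the date string, while the second returns an empty string, which is what ended up in last_Exe_date in the question's code.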

Write to file and save to ftp with python 2.6

I'm trying to store a file I create on an ftp server.
I've been able to create the temp file and store it as an empty file, but I haven't been able to write any data to the file before storing it.
Here is the partially working code:
from ftplib import FTP
import tempfile

# Log in to the server.
ftp = FTP(Integrate.ftp_site)
ftp.login(paths[0], paths[1])
ftp.cwd(paths[3])
f = tempfile.SpooledTemporaryFile()
# Throws an error.
f.write(bytes("hello", 'UTF-8'))
# No error, but doesn't work.
# f.write("hello")
# Also doesn't throw an error, and doesn't write anything to the file.
# f.write("hello".encode('UTF-8'))
file_name = "test.txt"
ftp.storlines("Stor " + file_name, f)
# Done.
f.close()
ftp.quit()
What am I doing wrong?
Thanks
Seeking!
To know where to read or write in the file (or file-like object), Python keeps a pointer to a location in the file. The documentation simply calls it "the file's current position". So, if you have a file with these lines in it:
hello world
how are you
You can read it with Python like in the following code. Note that the tell() function tells you the file's position.
>>> f = open('file.txt', 'r')
>>> f.tell()
0
>>> f.readline()
'hello world\n'
>>> f.tell()
12
Python is now twelve characters "into" the file. If you'd count the characters, that means it's right after the newline character (\n is a single character). Continuing to read from the file with readlines() or any other reading function will use this position to know where to start reading.
Writing to the file will also use and increment the position. This means that if, after writing to the file you read from the file, Python will start reading at the position it has saved (which is right after whatever you just wrote), not the beginning of the file.
The ftp.storlines() function reads from the file object with readline(), which likewise starts at the file's current position, that is, after whatever you wrote. You can solve this by seeking back to the start of the file before calling ftp.storlines(): use f.seek(0) to reset the file position to the very start of the file.
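The effect of the file position on a SpooledTemporaryFile can be checked without an FTP server. The sketch below only demonstrates the write, seek and read sequence; the upload itself is left as a comment because it needs a live connection.

```python
import tempfile

with tempfile.SpooledTemporaryFile() as f:   # binary mode ("w+b") by default
    f.write("hello\n".encode("UTF-8"))
    position_after_write = f.tell()          # just past what was written
    read_without_seek = f.readline()         # nothing left after that position

    f.seek(0)                                # rewind to the start of the file
    read_after_seek = f.readline()           # now the written line is returned
    # An upload such as ftp.storlines("STOR test.txt", f) would go right after
    # f.seek(0); it is omitted here because it requires a real FTP server.
```

Without the seek, readline() returns an empty bytes object, which is consistent with the empty file that ended up on the server in the question.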

Reading individual bz2 files from a tar file

I'm trying to read many bz2 files within a tar file, a file has the following structure:
2013-01.tar
01\01\00\X.json.bz2\X.json
01\01\02\X.json.bz2\X.json
I'm able to get the filenames as follows:
import tarfile
tar = tarfile.open(filepath, 'r')
tar_members_names = [filename for filename in tar.getnames()]
# Side question: How would I only return files and no directories?
Which returns a list of the .bz2 files. Now I'm trying to extract them (temporarily) using:
inner_filename = tar_members_names[0]
t_extract = tar.extractfile(inner_filename)
The following code to extract the json file returns an error, however. How would I go about retrieving the JSON files line by line?
import bz2
txt = bz2.BZ2File(t_extract)
TypeError: coercing to Unicode: need string or buffer, ExFileObject found
txt = bz2.decompress(t_extract)
TypeError: must be convertible to a buffer, not ExFileObject
I've been unable to figure out how to return a buffer from the tar file instead of the current ExFileObject (how to convert it to a buffer?), any suggestions are greatly appreciated.
BZ2File expects a file name as its first argument (this is the case in Python 2; from Python 3.3 on it also accepts a file object), and you pass a file object (i.e. an object which has the same API as what Python returns for open()).
To do what you want, read all the bytes from t_extract yourself, e.g. data = t_extract.read(), and call bz2.decompress(data), or use BZ2Decompressor to stream the data through it.
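Put together, the read-then-decompress approach might look like the following sketch. The generator name iter_bz2_member_lines is made up for illustration, and the JSON payload is assumed to be UTF-8 text. The isfile() check also covers the side question about skipping directories.

```python
import bz2
import tarfile


def iter_bz2_member_lines(tar_path):
    """Yield (member_name, line) for every line of every .bz2 file in the tar."""
    with tarfile.open(tar_path, "r") as tar:
        for member in tar.getmembers():
            # isfile() skips directories; the name check skips non-bz2 files
            if not (member.isfile() and member.name.endswith(".bz2")):
                continue
            compressed = tar.extractfile(member).read()   # raw bytes of the member
            text = bz2.decompress(compressed).decode("utf-8")  # UTF-8 is an assumption
            for line in text.splitlines():
                yield member.name, line
```

This loads each member fully into memory before decompressing it. For very large members, streaming the data through bz2.BZ2Decompressor in chunks would be the alternative mentioned above.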
