I'm using xlrd 0.9.4 and I would like to verify that the file I must open is valid.
To do this, I wrote this code in accordance with this question:
import xlrd
from xlrd import XLRDError

try:
    book = xlrd.open_workbook(file_path)
    print("Done")
except XLRDError:
    print("Wrong type of file.")
where file_path is the path of my file.
This works fine, but the problem is the following. First of all, I have a valid .xls file, so the script prints Done. Now, assume that the valid .xls file is renamed (including the extension), for example from test.xls to test.txt.
If I run the script, I get the same result (Done).
If I instead use a "real" .txt file (empty or with some text), the script prints Wrong type of file.
Does this happen because the "structure" of the file has not changed? Am I doing something wrong? Is there another type of exception that I can add to the except branch?
Thanks in advance
You can see how xlrd checks the file before reading it. The xlrd source defines the «magic» bytes at lines 18-19. The first bytes of the file are compared with this byte sequence at line 85. If they are not equal, an exception is raised. The file extension is not involved.
Signatures for different file types can be found there.
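To illustrate the idea of a signature check, here is a minimal sketch that compares the first bytes of a file with the OLE2 compound-document signature used by legacy .xls files. The names OLE2_SIGNATURE and looks_like_ole2 are made up for this example, and this is not xlrd's own logic: xlrd recognises more than this one signature, so treat it only as a demonstration of why renaming a file does not change the result.

```python
# Signature at the start of OLE2 compound documents (legacy .xls files).
OLE2_SIGNATURE = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"


def looks_like_ole2(path):
    """Return True if the file at `path` starts with the OLE2 signature.

    Hypothetical helper for illustration; the file extension is ignored.
    """
    with open(path, "rb") as handle:
        return handle.read(len(OLE2_SIGNATURE)) == OLE2_SIGNATURE
```

A renamed test.txt that is really an .xls file still passes this check, while an empty or plain-text file does not.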
I have a Python script that runs properly on my laptop, but when running on my raspberry pi, the following code does not seem to be working properly. Specifically, "TextFile.txt" is not being updated and/or saved.
openfile = open('/PATH/TextFile.txt','w')
for line in lines:
    if line.startswith(start):
        openfile.write(keep+'\n')
        print ("test 1")
    else:
        openfile.write(line)
        print ("test 2")
openfile.close()
I am seeing "test 1" and "test 2" in my output, so I know that the code is being reached, the paths are correct, etc.
It may be due to a permissions problem. I am running the script from the terminal by using:
usr/bin/python PATH/script.py
Python is owned by "root" and script.py is owned by "Michael".
My first guess:
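Does the file exist? Opening with 'w' creates the file if it is missing, but only if the directory exists and you are allowed to write there. You can also try 'w+', which creates the file as well and additionally allows reading: file = open('myfile.dat', 'w+')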
Additionally, manually opening and closing file handles is bad practice in Python. The with statement handles the opening and closing of the resource automatically for you:
with open("myfile.dat", "w+") as f:
    # do your calculations with the file object here
    for line in f:
        print(line)
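Applied to the loop from the question, the with statement might look like the sketch below. The function name rewrite_lines is hypothetical; lines, start and keep stand for the question's variables, whose real values were not shown.

```python
def rewrite_lines(path, lines, start, keep):
    """Write `lines` to `path`, replacing every line that begins with `start`.

    Hypothetical helper: `lines`, `start` and `keep` mirror the variables
    used in the question.
    """
    with open(path, "w") as out:
        for line in lines:
            if line.startswith(start):
                out.write(keep + "\n")
            else:
                out.write(line)
    # The file is closed here, even if an exception was raised in the loop.
```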
All, thank you for your input. I was able to figure out that it was writing to the new file, but it was overwriting it with the same text. The reason was that .startswith was returning False when I expected True. The misconception was due to the difference in how Windows and Unix treat newline characters (\n and \r).
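The post does not show how start was built, so the following is only a guess at the kind of mismatch involved: a prefix that carries a stray "\r" from a file with Windows line endings never matches a line that ends in a plain "\n". The values and the helper name strip_line_ending are made up for illustration.

```python
def strip_line_ending(text):
    """Remove any trailing carriage return or line feed characters."""
    return text.rstrip("\r\n")


# Hypothetical values: a prefix that picked up a "\r" from a file with
# Windows line endings, and a line read with a Unix line ending.
start = "name=\r"
line = "name=old value\n"

print(line.startswith(start))                     # False
print(line.startswith(strip_line_ending(start)))  # True
```

Stripping line endings from both sides before comparing makes the check behave the same on Windows and on the Raspberry Pi.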
Since your code is running, there should be a file somewhere.
You call "PATH/script.py", but there is "/PATH/TextFile.txt" in your program. Is the slash before PATH a mistake? Have you checked the path in your program is really where you are looking for the output file?
I want to detect if a file is being written to by another process before I start to read the contents of that file.
This is on Windows and I am using Python (2.7.x).
(By the way, the Python script is providing a service where it acts on files that are placed in a specified folder. It acts on the files as soon as they are detected and it deletes the files after having acted on them. So I don't want to start acting on a file that is only partially written.)
I have found empirically that trying to rename the file to the same name will fail if the file is being written to (by another process) and will succeed (as a null-op) if the file is not in use by another process.
Something like this:
import os

def isFileInUse(filePath):
    try:
        os.rename(filePath, filePath)
        return False
    except OSError:
        return True
I haven't seen anything documented about the behaviour of os.rename when source and destination are the same.
Does anyone know of something that might go wrong with what I am doing above?
I emphasize that I am looking for a solution that works in Windows,
and I note that os.access doesn't seem to work - even with os.W_OK it returns True even if the file is being written by another process.
One thing that is nice about the above solution (renaming to the same name) is that it is atomic - which is not true if I try to rename to a temp name, then rename back to the original name.
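For the folder-watching service described above, the rename check could be wrapped in a polling loop. This is a minimal sketch, assuming the empirically observed Windows behaviour from this question holds; the names is_file_in_use and wait_until_free and the timeout values are hypothetical. On other platforms a rename usually succeeds even while the file is open, so the check would not detect a writer there.

```python
import os
import time


def is_file_in_use(file_path):
    """Rename the file to itself; treat any OSError as "in use"."""
    try:
        os.rename(file_path, file_path)
        return False
    except OSError:
        return True


def wait_until_free(file_path, timeout=10.0, interval=0.5):
    """Poll until the rename check succeeds or `timeout` seconds pass.

    Returns True if the file became available, False on timeout.
    A missing file also counts as "in use" because the rename fails.
    """
    deadline = time.time() + timeout
    while is_file_in_use(file_path):
        if time.time() >= deadline:
            return False
        time.sleep(interval)
    return True
```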
Since you only want to read the file - why not just try to do it? Since this is the operation you are trying to do:
try:
    with open("file.txt", "r") as handle:
        content = handle.read()
except IOError as msg:
    pass  # error handling
This will try to read the content, and fail if the file is locked, or unreadable.
I see no reason to check if the file is locked if you just want to read from it - just try reading and see if that throws an exception.
I have the following code:
with open('EcoDocs TK pdfs.csv', 'rb') as pdf_in:
    pdflist = csv.reader(pdf_in, quotechar='"')
    for row in pdflist:
        if row[1].endswith(row[2]):  # check if file type is appended to file name
            pathname = ''.join(row[0:2])
        else:
            pathname = ''.join(row)
        if os.path.isfile(pathname):
            filehash = md5.md5(file(pathname).read()).hexdigest()
It reads in file paths, file names and file types from a csv file. It then checks whether the file type is appended to the file name before joining the file path and file name. It then checks whether the file exists before doing something with it. There are about 5000 file names in the csv file, but isfile only returns True for about half of them. I've manually checked that some of the files for which isfile returns False do exist. As all the data is read in, there shouldn't be any problems with escape characters or single backslashes, so I'm a bit stumped. Any ideas? An example of the csv file format is below, as well as an example of some of the pathname values that isfile can't find.
csv file-
c:\2dir\a. dir\d dir\lo dir\fu dir\wdir\5dir\,5_l B.xls,.xls
c:\2dir\a. dir\d dir\lo dir\fu dir\wdir\5dir\,5_l A.pdf,.pdf
pathname created-
c:\2dir\a. dir\d dir\lo dir\fu dir\wdir\5dir\5_l B.xls
c:\2dir\a. dir\d dir\lo dir\fu dir\wdir\5dir\5_l A.pdf
Thanks.
You can safely assume that os.path.isfile() works correctly. Here is my process to debug issues like this:
Add a print(pathname) before I use it.
Eyeball the output. Does anything look suspicious?
Copy the output to the clipboard, then press Win+R, type cmd, press Return, type dir, a space and a double quote, paste the path into the new command prompt, type a closing double quote, and press Return.
That checks whether the path is really correct (finds slight mistakes that eyeballing will miss). It also helps to validate the insane DOS naming conventions which are still enforced even on Windows.
If this also works, the next step is to check file and folder permissions: Make sure the user that runs the script actually has permissions to see and read the file.
EDIT Paths on Windows are ... complicated. An important detail, for example, is that "." is a very, very special character. The name "a.something very long" isn't valid in the command prompt because it demands that you have at most three characters after the last "." in a file name! You're just lucky that it doesn't demand that the name before the last dot is at most 8 characters.
Conclusion: You must be very, very, very careful with "strange characters" in file names and paths on Windows. The only characters which are safe are listed in this document.
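The debugging steps above can also be partly automated. The sketch below is an assumption about what would help here, not part of the original answer: a hypothetical helper, first_missing_component, walks up from the full path and reports the shortest prefix that does not exist, which narrows down which directory or file name is misspelled.

```python
import os


def first_missing_component(pathname):
    """Return the shortest prefix of `pathname` that does not exist.

    Returns None if the whole path exists. Hypothetical helper for
    narrowing down which part of a long path is wrong.
    """
    missing = None
    current = os.path.abspath(pathname)
    while not os.path.exists(current):
        missing = current
        parent = os.path.dirname(current)
        if parent == current:  # reached the root of the drive
            break
        current = parent
    return missing
```

Printing repr(pathname) next to the result makes trailing spaces and other invisible characters visible, which plain print output hides.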
I want to run this program, http://pymedia.org/tut/src/dump_video.py.html
It converts a video file to image files. I've installed all the modules. When I run it in Python IDLE, it prints the same Usage... text that appears at the end of the program. My video file is an .avi with the Xvid codec, which the pymedia page says is supported. I believe the program and my file aren't connected, but how do I pass my file (test.avi) to the program? I put the video file in the same folder as the program. The end of the page http://pymedia.org/tut/index.html says to run it from cmd, and I did, but I keep getting the same Usage message. It comes from the if statement at the end of the file. I have worked a little in Python, but never with functions, so please help.
Thanks!
The source code contains this Usage message:
print 'Usage: dump_video <file_name> <image_pattern> <format_number>\n<format_number> can be: RGB= 2'+\
'\n<image_patter> should include %d in the name. ex. test_%d.bmp.'+ \
'\nThe resulting image will be in a bmp format'
Per the comments, dump_video.py can be called this way:
dump_video.py myvideo.avi myvideo_%d.bmp 2
This will attempt to save the frames in myvideo.avi into BMP files of the form myvideo_%d.bmp, where %d will be replaced by numbers. I'm not sure what the last argument, the so-called "format number" 2, does.
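To show how the three command-line arguments reach a script and how the %d placeholder is filled in, here is a minimal sketch. It is not the pymedia script itself: parse_args and frame_name are hypothetical names, and the argument list is written out by hand instead of being read from sys.argv.

```python
def parse_args(argv):
    """Split an argv-style list into (file_name, image_pattern, format_number).

    Returns None when the argument count is wrong, which is the situation
    in which a script like this prints its Usage message.
    """
    if len(argv) != 4:
        return None
    return argv[1], argv[2], int(argv[3])


def frame_name(image_pattern, index):
    """Fill the %d placeholder in the pattern with a frame number."""
    return image_pattern % index


# What the script would see for:
#   dump_video.py myvideo.avi myvideo_%d.bmp 2
args = parse_args(["dump_video.py", "myvideo.avi", "myvideo_%d.bmp", "2"])
print(args)                       # ('myvideo.avi', 'myvideo_%d.bmp', 2)
print(frame_name(args[1], 3))     # myvideo_3.bmp
print(parse_args(["dump_video.py"]))  # None
```

Running a script from IDLE without arguments corresponds to the last case, which would explain why only the Usage text appears.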
import os

try:
    directoryListing = os.listdir(inputDirectory)
    # other code goes here; it iterates through the list of files in the directory
except WindowsError as winErr:
    print("Directory error: " + str(winErr))
This works fine, and I have tested that it doesn't choke and die when the directory doesn't exist, but I was reading in a Python book that I should be using "with" when opening files. Is there a preferred way to do what I am doing?
You are perfectly fine. The os.listdir function does not open files, so ultimately you are alright. You would use the with statement when reading a text file or similar.
An example of a with statement:
with open('yourtextfile.txt') as file:  # this is like file = open('yourtextfile.txt')
    lines = file.readlines()  # read all the lines in the file
# When the code inside the with statement is done, the file is closed automatically, which is why most people use this (no need for .close()).
What you are doing is fine. With is indeed the preferred way for opening files, but listdir is perfectly acceptable for just reading the directory.
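One possible refinement, offered as a sketch rather than a required change: WindowsError only exists on Windows and is a subclass of OSError, so catching OSError keeps the same behaviour there and also works on other platforms. The function name list_directory is hypothetical.

```python
import os


def list_directory(input_directory):
    """Return the entries of a directory, or an empty list if it cannot be read."""
    try:
        return os.listdir(input_directory)
    except OSError as err:  # WindowsError is a subclass of OSError
        print("Directory error: " + str(err))
        return []
```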