isfile not recognising files python 2.5 - python

I have the following code:
with open('EcoDocs TK pdfs.csv', 'rb') as pdf_in:
    pdflist = csv.reader(pdf_in, quotechar='"')
    for row in pdflist:
        if row[1].endswith(row[2]):  # check if file type is appended to file name
            pathname = ''.join(row[0:2])
        else:
            pathname = ''.join(row)
        if os.path.isfile(pathname):
            filehash = md5.md5(file(pathname).read()).hexdigest()
It reads file paths, file names and file types from a CSV file. It checks whether the file type is already appended to the file name before joining the file path and file name, and then checks that the file exists before doing something with it. There are about 5000 file names in the CSV file, but isfile only returns True for about half of them. I've manually checked that some of the files for which isfile returns False do exist. As all the data is read in from the file, there shouldn't be any problems with escape characters or single backslashes, so I'm a bit stumped. Any ideas? An example of the CSV file format is below, along with some of the pathnames that isfile can't find.
csv file-
c:\2dir\a. dir\d dir\lo dir\fu dir\wdir\5dir\,5_l B.xls,.xls
c:\2dir\a. dir\d dir\lo dir\fu dir\wdir\5dir\,5_l A.pdf,.pdf
pathname created-
c:\2dir\a. dir\d dir\lo dir\fu dir\wdir\5dir\5_l B.xls
c:\2dir\a. dir\d dir\lo dir\fu dir\wdir\5dir\5_l A.pdf
Thanks.

You can safely assume that os.path.isfile() works correctly. Here is my process to debug issues like this:
Add a print(pathname) before I use it.
Eyeball the output. Does anything look suspicious?
Copy the output to the clipboard, then press Win+R, type cmd, press Return, type dir, Space and ", paste the path into the new command prompt, type the closing " and press Return.
That checks whether the path is really correct (finds slight mistakes that eyeballing will miss). It also helps to validate the insane DOS naming conventions which are still enforced even on Windows.
If this also works, the next step is to check file and folder permissions: make sure the user that runs the script actually has permission to see and read the file.
EDIT Paths on Windows are ... complicated. An important detail, for example, is that "." is a very, very special character. The name "a.something very long" isn't valid in the command prompt because it demands that you have at most three characters after the last "." in a file name! You're just lucky that it doesn't demand that the name before the last dot is at most 8 characters.
Conclusion: You must be very, very, very careful with "strange characters" in file names and paths on Windows. The only characters which are safe are listed in this document.
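The first two debugging steps above can also be scripted. The following is a minimal Python 3 sketch, not part of the original answer: deepest_existing_prefix is a hypothetical helper that walks up a path until it reaches a component that exists, which shows where a path built from a CSV row stops matching the disk. Printing with repr() exposes stray whitespace or control characters that a plain print would hide.

```python
import os


def deepest_existing_prefix(pathname):
    """Return the longest leading part of pathname that exists, or '' if none does.

    Hypothetical helper for illustration; it is not from the original answer.
    """
    current = os.path.normpath(pathname)
    while current and not os.path.exists(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the root without finding anything
            return ""
        current = parent
    return current


if __name__ == "__main__":
    # Example pathname taken from the question; substitute your own.
    pathname = r"c:\2dir\a. dir\d dir\lo dir\fu dir\wdir\5dir\5_l B.xls"
    print(repr(pathname))                           # shows hidden characters, if any
    print(os.path.isfile(pathname))                 # the check that fails
    print(repr(deepest_existing_prefix(pathname)))  # where the path stops existing
```

If the returned prefix is shorter than the directory part of the pathname, the first component after it is the one to inspect by hand.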

Related

Take the path of a file by dropping it on batch or python

In my head this problem seems simple, but I can't for the life of me figure it out.
I want to use a function similar to os.replace() to move a file/folder from a location that can vary to a fixed one, while preserving its name.
I couldn't figure that out, and to make it slightly more difficult, I also want to be able to drop a file onto the batch/Python script and have the code detect the path of the file I dropped on it.
Sorry for the bad explanation. In short:
import os
initialfilepath = "The filepath of the file i drop onto the batch/python file"
finalfilepath = "Predetermined/file/path etc"
os.replace(initialfilepath, finalfilepath)  # However, I want to preserve the name of the file.
Any help would be greatly appreciated!
With a single line in a batch file you can handle all your dropped files:
CMD /k FOR %%s in (%*) do ECHO %%s
This example prints the path of every dropped file.
The parameter %* receives all the file pathnames.
The FOR command, as shown, splits the string on spaces and handles the items one by one.
If a file pathname contains spaces, it arrives in quotes as a single item.
Note that you don't need CMD /k, but it keeps the console open at the end.
If you just want to see the output before the window closes, insert PAUSE > nul at the end instead.
You can group the DO commands between ( ... ) and then break them across lines.
Consider putting @ECHO OFF at the beginning.
Now try it yourself :)
Type FOR /? to see more ways to manipulate parameters: %~nX, %~dpX, %~xX...
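For the Python half of the question, here is a minimal sketch of moving each dropped item into a fixed folder while keeping its name. It assumes the dropped paths reach the script as command-line arguments (for example because the batch file forwards %* to it, or because the system is set up to accept drops onto .py files). move_preserving_name is a hypothetical helper name, the destination is the placeholder from the question, and shutil.move is used in place of os.replace so that folders and moves across drives are also handled.

```python
import os
import shutil
import sys


def move_preserving_name(source_path, destination_dir):
    """Move source_path into destination_dir under its original base name."""
    # normpath drops a trailing separator so basename also works for folders.
    source = os.path.normpath(source_path)
    target = os.path.join(destination_dir, os.path.basename(source))
    shutil.move(source, target)
    return target


if __name__ == "__main__":
    destination = "Predetermined/file/path"  # placeholder from the question
    for dropped in sys.argv[1:]:             # one argument per dropped item
        print(move_preserving_name(dropped, destination))
```

The destination folder has to exist before the script runs; the sketch does not create it.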

What is the best way to read a JSON file and obtain the values without the invisible characters in Python?

I have a simple JSON file that I was supposed to use as a configuration file, it contains the default directories for whoever is running the script using their MacBooks:
{
"main_sheet_path": "/Users/jammer/Documents/Studios⁩/⁨CAT/⁨000-WeeklyReports⁩/2020/",
"reference_sheet_path": "/Users/jammer/Documents/DownloadedFiles/"
}
I read the JSON file and obtain the values using this code:
with open('reportconfig.json','r') as j:
    config_data = json.load(j)
main_sheet_path = str(config_data.get('main_sheet_path'))
reference_sheet_path = str(config_data.get('reference_sheet_path'))
I use the path to check for a source file's existence before doing anything with it:
source_file = 'source.xlsx'
source_file = main_sheet_path + filename
if not os.path.isfile(source_file) :
    print ('ERROR: Source file \'' + source_file + '\' NOT FOUND!')
    return
Note that the filename is inputted as a parameter when the script is run (there are multiple files, the script has to know which one to target).
The file is there for sure but the script never seems to "see" it so I get that "ERROR" that I printed in the above code. Why do I think there are invisible characters? Because when I copy and paste from what was printed in the "error" notice above into the terminal, the last few characters of the file name always gets substituted by some invisible characters and hitting backspace erases characters where the cursor isn't supposed to be.
How do I know for sure that the file is there and that my problem is with reading the JSON file and not in the Directory names or anywhere else in the code? Because I finally gave up on using a JSON config file and went with a configuration file like this instead:
#!/usr/local/bin/python3.7
# -*- coding: utf-8 -*-
file_paths = { "main_sheet_path": "/Users/jammer/Documents/Studios⁩/⁨CAT/⁨000-WeeklyReports⁩/2020/",
"reference_sheet_path": "/Users/jammer/Documents/DownloadedFiles/"
}
I then just import the file and obtain the values like this:
import reportconfig as cfg
main_sheet_path = cfg.file_paths['main_sheet_path']
reference_sheet_path = cfg.file_paths['reference_sheet_path']
...
This workaround works perfectly — I don't get the "error" that the file isn't there when it is and the rest of the script is executed as expected. When the file isn't there, I get the proper "error" I expect and copying-and-pasting the full path and filename from the "error message" gives me the complete file name and hitting the backspace erases the right characters (no funny behavior, no invisible characters).
But could anyone please tell me how to read the JSON file properly without getting those pesky invisible characters? I've spent hours trying to figure it out, including searching seemingly related questions on Stack Overflow, but couldn't find the answer. TIA!
I think there is just a typo in this code:
source_file = 'source.xlsx'
source_file = main_sheet_path + filename
Maybe filename is set to some other file that does not exist, which is why you get the error.
Try setting filename = 'source.xlsx'.
Maybe that will help.
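One way to test the invisible-character suspicion raised in the question is to print the loaded value with ascii(), which shows every non-ASCII character as an escape. The sketch below is an illustration only: it assumes the stray characters are Unicode format characters (category Cf), such as the directional isolates U+2068 and U+2069, and strip_format_chars is a hypothetical helper, not something from the question or the answer. The JSON text is a shortened stand-in for the real config file.

```python
import json
import unicodedata


def strip_format_chars(text):
    """Remove Unicode format characters (category Cf) from text.

    Cf also covers characters such as zero-width joiners and the BOM,
    so only use this where none of them are wanted in a path.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


# Stand-in config with two invisible characters written as JSON escapes.
raw = '{"main_sheet_path": "/Users/jammer/Documents/Studios\\u2069/\\u2068CAT/"}'
config_data = json.loads(raw)

path = config_data["main_sheet_path"]
print(ascii(path))   # invisible characters show up as \u.... escapes

clean = strip_format_chars(path)
print(ascii(clean))  # the same path with those characters removed
```

If ascii() shows no escapes in the real value, the config file is not the culprit and the filename argument is the next thing to check.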

WindowsError when using os.listdir and os.stat of the result

I have the following snippet of code that just gets the Timestamp of the file.
files_list = os.listdir(os.path.join(path, folder))
for files in files_list:
    stats = os.stat(os.path.join(path, folder, files))
Is it possible for me to ever get the error below? It seems counterintuitive that os.stat cannot find a file that was just returned by listdir, except of course for a race condition, which is not what I suspect in this case.
WindowsError: [Error 2] The system cannot find the file specified:
'\\\\sftp-server.domain.com\\homes\\server\\location\\FOLDER\\FILE.PDF'
I also wonder whether something like a domain lookup failure or a temporary network issue can cause this error. For example,
\\sftp-server.domain\\homes\\server\\location\\FOLDER
and
\\sftp-server.domain\\homes\\server\\location\\FOLDER\FILE
are just URL strings and have nothing to do with real file system traversal.
Presumably FOLDER and FILE are not actual names? Take a careful look at the file names reported by the WindowsError. If they contain question marks in the last component, you have an issue with Unicode file names. Specifically, when the directory contains a file name with Unicode characters not representable in the current code page (such as Japanese characters in a Western or Eastern European locale), os.listdir will return file names with unrepresentable Unicode characters converted to ?. Obviously, such essentially broken names cannot be passed to the IO functions such as open or os.stat.
To fix this, request Unicode file names from os.listdir by passing it the directory as a Unicode string. These will contain correct characters and can be passed to os.stat, which will internally call the wide API:
dirname = unicode(os.path.join(path, folder), 'mbcs')
file_list = os.listdir(dirname)
for filename in file_list:
    stats = os.stat(os.path.join(dirname, filename))
    # ...
The server was multi-threaded, and a single JavaScript method sent multiple Ajax requests for the same folder resource.
When the os.listdir request was processed first, this error occurred because listing takes a long time over SFTP. During that time, os.remove ran in another request and removed a file that had appeared in the os.listdir result. After making the os.listdir call a proper callback, it worked fine.
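Where the requests cannot be serialised, the loop itself can be made tolerant of files that disappear between the listing and the stat call. This is a minimal sketch of that idea rather than the fix described above; stat_existing is a hypothetical helper name.

```python
import errno
import os


def stat_existing(directory):
    """Return {name: stat result} for entries that still exist when stat runs."""
    results = {}
    for name in os.listdir(directory):
        full_path = os.path.join(directory, name)
        try:
            results[name] = os.stat(full_path)
        except OSError as exc:
            # Only swallow "no such file"; anything else is a real problem.
            if exc.errno != errno.ENOENT:
                raise
    return results
```

Entries removed by another request in the meantime are simply left out of the result instead of raising.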

Python: A way to detect a filetype attached to a string?

In IronPython 2.6*, I'm trying to build a function that "corrects" a string; I have two arguments, FILE and EXTN. The idea is for them to be concatenated as necessary later in the program, but you know some people don't read instructions and you're bound to have someone enter "FILE.*" as their FILE, which would mess everything up.
I'm looking for a way to take FILE, have my function detect and strip .* (any extension of any length) from FILE if .* exists; It doesn't need to be in the string, and the user will be entering the same extension into EXTN**, so it needs not be prepared, merely consistently stripped.
My current method has me passing FILE and EXTN separately, but it's not inconceivable to redo things to take FILE.EXTN and break that into FILE and EXTN if need be; I don't want to if I don't have to, though, as my program is built around the former system.
*A note regarding IronPython 2.6: I'm trying to avoid IronPython-specific code and keep things as simple as possible, for UNIX-WIN cross-compatibility's sake. So far, everything I've done works in Python 2.7 IDEs, but obviously will not work in Python 3.x.
**A note regarding EXTN: I want users to enter the proper extension into EXTN too, but as we know, we can't be sure of this, so the method for stripping .* from FILE must not automatically include EXTN as part of it.
Here is a snippet of code that may help as a reference for what I have so far. The FILE and EXTN variables have been added, and in practice, are pulled in through a middle-man program from an XML file into the script at run-time.
FILE = "test"
PATH = "C:\\"
EXTN = ".txt"
def CheckCorrect_FILE(srcFile):  # Check-corrects FILE
    # Meh, I got nothin'...
    pass
def CheckCorrect_PATH(srcPath):  # Check-corrects PATH
    if srcPath.endswith('\\') == False:
        srcPath = srcPath + "\\"
    else:
        srcPath = srcPath
    return srcPath
You can do this using os.path.splitext. The following will always remove an extension if one exists (and do nothing if it doesn't):
import os
FILE = os.path.splitext(FILE)[0]
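Wrapped into the function from the question, this could look like the sketch below. check_correct_file is a hypothetical stand-in for CheckCorrect_FILE. Note that os.path.splitext only removes the last extension and treats a leading dot as part of the name.

```python
import os


def check_correct_file(src_file):
    """Return src_file without its last extension, if it has one."""
    return os.path.splitext(src_file)[0]


print(check_correct_file("test.txt"))        # test
print(check_correct_file("test"))            # test
print(check_correct_file("archive.tar.gz"))  # archive.tar
print(check_correct_file(".bashrc"))         # .bashrc
```

Because the extension is stripped regardless of what EXTN contains, FILE + EXTN can be concatenated later without checking what the user typed.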

File names have a `hidden' m character prepended

I have a simple Python script which produces some data in a Neutron star mode. I use it to automate file names so I don't later forget the inputs. The script successfully saves the file as
some_parameters.txt
but when I then list the files in terminal I see
msome_parameters.txt
The file name without the "m" is still valid, and trying to access the file with the "m" returns
$ ls m*
No such file or directory
So I think the "m" has some special meaning, but numerous Google searches have not yielded an answer. While I can carry on without worrying, I would like to know the cause. Here is how I create the file in Python:
# chi,epsI etc are all floats. Make a string for the file name
file_name = "chi_%s_epsI_%s_epsA_%s_omega0_%s_eta_%s.txt" % (chi,epsI,epsA,omega0,eta)
# a.out is the compiled c file which outputs data
os.system("./a.out > %s" % (file_name) )
Any advice would be much appreciated. Usually I can find the answer already posted on Stack Overflow, but this time I'm really confused.
You have a file with some special characters in the name which is confusing the terminal output. What happens if you do ls -l or (if possible) use a graphical file manager - basically, find a different way of listing the files so you can see what's going on. Another possibility would be to do ls > some_other_filename and then look at the file with a hex editor.
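Since the files are created from Python anyway, another way of listing them is to print each name through ascii(), which writes control and non-ASCII characters as visible escapes instead of sending them to the terminal. This is a minimal Python 3 sketch; reveal_names is a hypothetical helper name.

```python
import os


def reveal_names(directory="."):
    """Return the escaped form of every file name in directory, sorted."""
    return [ascii(name) for name in sorted(os.listdir(directory))]


if __name__ == "__main__":
    for shown in reveal_names():
        print(shown)
```

Any name that prints with a backslash escape in it contains a character the terminal was not displaying literally.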
