Python does not recognise text files in a directory - python

The following piece of code works fine, reads all the text files in the specified directory:
files_ = glob.glob('D:\Test files\Case 1\*.txt')
But when I change the path to another directory, it gives me an empty list of files:
files_ = glob.glob('D:\Test files\Case 2\*.txt')
print files_  # prints []
Both directories contain a couple of text files. Text file names and sizes are different though.
It's really weird and I can't think of anything that would solve the problem. Has anyone faced this before?

You need to either escape your backslashes:
files_ = glob.glob('D:\\Test files\\Case 2\\*.txt')
Or specify that your string is a raw string (meaning backslashes should not be specially interpreted):
files_ = glob.glob(r'D:\Test files\Case 2\*.txt')
What happened to break your second glob is that \1 turned into the ASCII control character \x01. The error message contains a clue to that:
WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: 'D:\\Test files\\B1\x01rgb/*.*'
Notice how a \1 turned into the literal \x01. The reason your first directory worked is that you basically got lucky and didn't accidentally specify any special characters:
'\T'
Out[27]: '\\T'
'\B'
Out[28]: '\\B'
'\1'
Out[29]: '\x01'
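To see the difference without touching the file system, the three spellings can be compared directly. This is only a minimal sketch; the folder name "1 rgb" is a made-up example, chosen so that a digit follows the backslash.

```python
# Hypothetical Windows path whose last folder name starts with a digit.
escaped = 'D:\\Test files\\1 rgb\\*.txt'   # every backslash doubled
raw = r'D:\Test files\1 rgb\*.txt'         # raw string literal
broken = 'D:\\Test files\1 rgb\\*.txt'     # "\1" left unescaped on purpose

# The escaped and raw literals describe exactly the same characters.
same = (escaped == raw)

# In the broken literal, "\1" collapsed into the single control character \x01.
has_control_char = '\x01' in broken

print(same, has_control_char)
print(repr(broken))
```

Printing the repr of a path before handing it to glob is a quick way to spot a stray control character.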

Related

Python OSError: [Errno 22] Invalid argument

I am trying, in Python 3, to append some data to a file, like this:
prueba = open(streamingResultFile, "a")
... when I previously declare:
streamingResultFile = time.asctime().replace('  ', ' ').replace(' ', '_') + '.txt'
... to get a file whose name will be the current time and date, in this format:
Tue_Apr_4_03:08:55_2017.txt
But when I run it, I get the error in the title, complaining that the name of my file is not valid. If I use something else, like "hello.txt", it works.
Why can't I put that text as the name of my output file?
Check the allowed filename characters for your operating system.
For example, characters like \, :, >, ... are not allowed in Windows filenames.
See What characters are forbidden in Windows and Linux directory names? for details on forbidden characters in Windows/Linux filenames.
Regarding your specific problem: Replacing the colons : with other characters should solve the error.
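One possible way to avoid the colons from the start is to build the name with time.strftime instead of editing the output of time.asctime. The helper name timestamp_filename and the chosen format string are assumptions for illustration, not part of the original code.

```python
import time

def timestamp_filename(when=None):
    """Build a .txt file name from a time, using '-' instead of ':'."""
    if when is None:
        when = time.localtime()
    # strftime lets us choose the separators, so no colon ever appears.
    stamp = time.strftime('%a_%b_%d_%H-%M-%S_%Y', when)
    return stamp + '.txt'

name = timestamp_filename()
print(name)
```

The resulting string can be passed to open(name, "a") in the same way as before.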

how can one use os.listdir correctly on a network path?

The following code:
import os

def tema_get_file():
    logdir = 'T:\\'
    logfiles = sorted([f for f in os.listdir(logdir) if f.startswith('tms_int_calls-')])
    return logfiles[-1]
This runs fine, but I am trying to point logdir at a direct network path:
\\servername\path\folder
The drive T is a mapped drive. Originally, the files are on the C Drive.
As soon as I do that, I get the error message :
WindowsError: [Error 3] The system cannot find the path specified:
'\servername\path\folder/.'
I've tried :
"\\servername\\path\\folder" , "\\servername\\path\\folder\\"
and
r"\\servername\path\folder" , r"\\servername\path\folder\"
and
"\\\\servername\\path\\folder" , "\\\\servername\\path\\folder\\"
For me, both of the following work:
os.listdir(r'\\server\folder')
os.listdir('\\\\server\\folder')
In my experience, os.listdir(myUNCpath) could not handle a Windows UNC path unless the path string was defined by a literal like myUNCpath = "\\\\servername\\dir1\\dir2" or by a raw string like myUNCpath = r"\\servername\dir1\dir2". A string variable filled in any other way failed, apparently because listdir doubled the backslashes coming from the variable.
But what can you do if the UNC path string is read from an ini file or some other config file?
There is no way to edit it as a literal, nor can you turn it into a raw string by putting the r prefix in front of it.
As a workaround, I found that it is possible to split the whole UNC path string into its single components (to get rid of those damned backslash characters) and to recompose it using literal definitions, which puts the backslash characters back. Then the string works well. Incredible but true!
Here is my function to perform this workaround. The string returned by the function will work as expected if the path in the file is defined as
\\servername\dir1\dir2 (without an added backslash as an escape character)
...
myworkswellUNCPath = recomposeUNCpathstring(myUNCpath)
...
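def recomposeUNCpathstring(UNCstring):
    # Strip the leading double backslash, then split into components.
    pathstring1 = UNCstring.replace("\\\\", "").strip()
    pathComponents = pathstring1.split("\\")
    # Rebuild the path from literals, starting with the server name.
    pathstring = "\\\\" + pathComponents[0]
    # Append every remaining component, including the last one.
    for i in range(1, len(pathComponents)):
        pathstring = pathstring + "\\" + pathComponents[i]
    return pathstring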
Cheers
Stefan
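For comparison, here is a minimal sketch suggesting that a value read from a config file already contains its backslashes literally, since escape processing applies only to string literals in source code. The section name "paths" and the key "logdir" are hypothetical, and the ini text is embedded as a raw string purely so the example is self-contained.

```python
import configparser

# Hypothetical ini content; the UNC path is written with plain backslashes,
# exactly as it would appear in a file on disk.
INI_TEXT = r"""
[paths]
logdir = \\servername\dir1\dir2
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)
unc_path = parser["paths"]["logdir"]

print(unc_path)        # the characters actually stored in the string
print(repr(unc_path))  # repr doubles each backslash for display only
```

The doubled backslashes in the repr output are a display convention; the string itself starts with two backslashes, and its length confirms that.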

How to save a dataframe as a csv file with '/' in the file name

I want to save a dataframe to a .csv file with the name '123/123', but the name gets split into two parts if I just write df.to_csv('123/123.csv').
Does anyone know how to keep the slash in the name of the file?
You can't use a slash in the name of the file.
But you can use a Unicode character that looks like a slash, if your file system supports it: http://www.fileformat.info/info/unicode/char/2215/index.htm
"/" indicates the path to a child folder, so your filename says that the file 123.csv is in a folder named "123".
However, that does not make it entirely impossible, just very difficult; see this question:
https://superuser.com/questions/187469/how-would-i-go-about-creating-a-filename-with-invalid-characters-such-as
Alternatively, a character map can help you find a legal character that looks like a slash, in this case a division slash.
You cannot use any of these characters in a file name:
/:*?\"|
You can use a similar-looking Unicode character instead, such as the fraction slash (U+2044).
Example:
df.to_csv("123{0}123".format(u'\u2044'.encode('utf-8')))
It gives you the filename that you asked for.
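In Python 3, where strings are Unicode by default, the same idea needs no encode step. A minimal sketch, with slash_safe_name as a hypothetical helper name; pandas is left out so the snippet runs on its own.

```python
FRACTION_SLASH = '\u2044'  # looks like "/", but is not a path separator

def slash_safe_name(name):
    """Swap every "/" for the look-alike fraction slash."""
    return name.replace('/', FRACTION_SLASH)

filename = slash_safe_name('123/123') + '.csv'
print(filename)
# With pandas, the result would then be used as: df.to_csv(filename)
```

Whether the character displays well depends on the file system and the fonts in use.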

Getting files in a directory (directory has numbers in its name) in Python

I want to get files from a directory that has numbers in its name. I am using the script below, but it throws an error.
yesterday=140402
os.chdir("C:\pythonPrograms\04-03-2014")
for file in glob.glob("MY*"+str(yesterday)+".log"):
    print file
Error received:
WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: 'C:\\pythonPrograms\x04-03-2014'
Do I need to follow some convention while giving the path? The code works fine if I search in C:\pythonPrograms
"C:\pythonPrograms\04-03-2014"
The issue is the "\04". The \ character introduces an escape sequence (you may know \n for a new line), so "\04" is read as the single control character \x04. You can fix this by writing:
os.chdir(r"C:\pythonPrograms\04-03-2014")
This makes the string a raw string. Alternatively, you can escape each backslash with another backslash:
"C:\\pythonPrograms\\04-03-2014"

os.walk() not picking up my file names

I'm trying to use a python script to edit a large directory of .html files in a loop. I'm having trouble looping through the filenames using os.walk(). This chunk of code just turns the html files into strings that I can work with, but the script does not even enter the loop, as if the files don't exist. Basically it prints point1 but never reaches point2. The script ends without an error message. The directory is set up inside the folder called "amazon", and there is one level of 20 subfolders inside of it with 20 html files in each of those.
Oddly the code works perfectly on a neighboring directory that only contains .txt files, but it seems like it's not grabbing my .html files for some reason. Is there something I don't understand about the structure of the for root, dirs, filenames in os.walk() loop? This is my first time using os.walk, and I've looked at a number of other pages on this site to try to make it work.
import os
rootdir = 'C:\filepath\amazon'
print "point1"
for root, dirs, filenames in os.walk(rootdir):
    print "point2"
    for file in filenames:
        with open(os.path.join(root, file), 'r') as myfile:
            g = myfile.read()
        print g
Any help is much appreciated.
The backslash is used as an escape character. Either double each backslash, or use a raw string by putting the prefix r in front of the literal.
Example:
>>> 'C:\filepath\amazon'
'C:\x0cilepath\x07mazon'
>>> r'\x'
'\\x'
>>> '\x'
ValueError: invalid \x escape
Explanation: In Python, what does preceding a string literal with “r” mean?
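The silent exit has the same cause: by default os.walk ignores the error raised when the top directory cannot be listed, so a mangled path simply produces zero iterations. Here is a minimal sketch that surfaces the problem through the onerror callback; the directory name is a placeholder that is assumed not to exist.

```python
import os

# Placeholder path, assumed not to exist on the machine running this.
missing = os.path.join(os.getcwd(), 'no-such-directory-for-this-demo')

# Default behaviour: the listing error is swallowed and the loop never runs.
silent = list(os.walk(missing))

# With onerror, os.walk hands the OSError to our callback instead.
errors = []
reported = list(os.walk(missing, onerror=errors.append))

print(len(silent), len(reported), len(errors))
if errors:
    print(type(errors[0]).__name__)
```

Checking os.path.isdir(rootdir) before the loop is another simple guard.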
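You can avoid most manual slash handling by using os.path.join. On Windows the drive needs its own separator, because os.path.join('C:', 'filepath') yields the drive-relative path 'C:filepath':
rootdir = os.path.join('C:', os.sep, 'filepath', 'amazon')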
Your problem is that you're using backslashes in your path:
>>> rootdir = 'C:\filepath\amazon'
>>> rootdir
'C:\x0cilepath\x07mazon'
>>> print(rootdir)
C:
ilepathmazon
Because Python strings use the backslash to escape special characters, in your rootdir the \f represents an ASCII Form Feed character, and the \a represents an ASCII Bell character.
You can either use a raw string (note the r before the apostrophe) to avoid this:
>>> rootdir = r'C:\filepath\amazon'
>>> rootdir
'C:\\filepath\\amazon'
>>> print(rootdir)
C:\filepath\amazon
... or just use regular slashes, which work fine on Windows anyway:
>>> rootdir = 'C:/filepath/amazon'
>>> rootdir
'C:/filepath/amazon'
>>> print(rootdir)
C:/filepath/amazon
As Huu Nguyen points out, it's considered good practice to construct paths using os.path.join() when possible. On Windows, give the drive its own separator, since joining onto a bare 'C:' produces a path relative to the current directory on that drive:
>>> rootdir = os.path.join('C:', os.sep, 'filepath', 'amazon')
>>> rootdir
'C:\\filepath\\amazon'
>>> print(rootdir)
C:\filepath\amazon
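How os.path.join treats a drive letter can be checked on any platform with ntpath, the Windows implementation of os.path. A bare 'C:' has no root, so joining onto it gives a drive-relative path. A minimal sketch:

```python
import ntpath  # Windows flavour of os.path; importable on any OS

# A bare drive letter: the result is relative to the current directory on C:.
drive_relative = ntpath.join('C:', 'filepath', 'amazon')

# A drive with its root: the result is an absolute path.
absolute = ntpath.join('C:\\', 'filepath', 'amazon')

print(drive_relative)
print(absolute)
```

On Windows itself, os.path.join gives the same results.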
I had an issue that sounds similar to this with os.walk. The escape character (\) added to filepaths on Mac due to spaces in the path was causing the problem.
For example, the path:
/Volumes/MacHD/My Folder/MyFiles/...
when accessed via Terminal is shown as:
/Volumes/MacHD/My\ Folder/MyFiles/...
The solution was to read the path into a string and then create a new string with the escape characters removed, e.g.:
import os
# Ask user for directory tree to scan for master files
masterpathraw = raw_input("Specify directory of master files:")
# Clear escape characters from the path
masterpath = masterpathraw.replace('\\', '')
# Provide this path to os.walk
for fullpath, _, filenames in os.walk(masterpath):
    pass  # Do stuff
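A blanket replace also strips any backslash that is genuinely part of a name. If the input really is shell-escaped text pasted from Terminal, shlex.split from the standard library undoes the escaping by shell rules instead. A sketch, assuming the pasted text contains exactly one path:

```python
import shlex

pasted = r'/Volumes/MacHD/My\ Folder/MyFiles'   # as shown by Terminal

# shlex.split applies POSIX shell quoting rules and returns a list of words.
words = shlex.split(pasted)

# Fall back to the raw input if it did not parse as a single word.
masterpath = words[0] if len(words) == 1 else pasted

print(masterpath)
```

A path typed with unescaped spaces would split into several words, which is why the fallback keeps the original text in that case.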
