Python's os.walk() fails in Windows when there are long filenames - python

I use Python's os.walk() to list the files and directories under some directories, but some of the files have very long names (more than 300 characters). For those directories os.walk() returns nothing, and with onerror I get '[Error 234] More data is available'. I also tried using yield, but got nothing either, only 'Traceback: StopIteration'.
The OS is Windows and the code is simple. I tested with one directory: if it contains a file with a long name, the problem occurs; if I rename the long-named files to short names, the code returns the correct result.
I cannot change anything in these directories, so renaming or moving the long-named files is not an option.
Please help me solve the problem!
import os

def t(a):
    for root, dirs, files in os.walk(a):
        print root, dirs, files

t('c:/test/1')

On Windows, a full path is limited to MAX_PATH (260 characters) by default, and a single file or directory name to 255 characters, so the error you're seeing comes from Windows, not from Python: somehow you managed to create names that long, but now you can't read them. See this post for more details.

The only workaround I can think of is to map a drive letter to the deeply nested directory. This makes the path much shorter, e.g. z:\myfile.xlsx instead of c:\a\b\c\d\e\f\g\myfile.xlsx.
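Another workaround that is often suggested for long paths on Windows is the extended-length prefix \\?\, which asks the Windows API to skip the MAX_PATH check. The sketch below is an assumption about how that could be combined with os.walk, not something taken from the answer above. The names to_extended_path and walk_long are hypothetical helpers, the input is expected to be an absolute drive-letter path such as c:/test/1 (UNC shares are not handled), and on Python 2 the path would additionally have to be a unicode string.

```python
import ntpath
import os

EXTENDED_PREFIX = "\\\\?\\"  # the literal characters \\?\


def to_extended_path(path):
    """Return an absolute drive-letter path in extended-length form (hypothetical helper)."""
    if path.startswith(EXTENDED_PREFIX):
        return path  # already prefixed, leave untouched
    # Windows does not normalise prefixed paths, so convert to backslashes
    # and collapse "." / ".." components here before adding the prefix.
    return EXTENDED_PREFIX + ntpath.normpath(path)


def walk_long(top):
    """Walk `top` through its extended-length form (only meaningful on Windows)."""
    for root, dirs, files in os.walk(to_extended_path(top)):
        yield root, dirs, files
```

It would be used as `for root, dirs, files in walk_long('c:/test/1'): ...`. Note that every root yielded this way keeps the \\?\ prefix, so strip it before showing paths to users if that matters.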

Related

Incorrect file reading when using os.walk in python3

I am crawling through folders using the os.walk() method. In one of the folders there is a large number of files, around 100,000 of them. The files are named like p_123_456.zip, but they are read as p123456.zip. Indeed, when I open Windows Explorer to browse the folder, for the first several seconds the files look like p123456.zip, but then change their appearance to p_123_456.zip. This is a strange scenario.
Now, I can't use time.sleep() because all folders and files are read into Python variables in the loop line itself. Here is a snippet of the code:
for root, dirs, files in os.walk(srcFolder):
    os.chdir(root)
    for file in files:
        shutil.copy(file, storeFolder)
In the last line, I get a file not found exception, saying that the file p123456.zip does not exist. Has anyone run into this mysterious issue? Is there any way to bypass it? What is the cause? Thank you.
You don't seem to be concatenating the actual folder name with the filenames. Try changing your code to:
for root, dirs, files in os.walk(srcFolder):
    for file in files:
        shutil.copy(os.path.join(root, file), storeFolder)
os.chdir should be avoided like the plague. For one thing, if the change succeeds, the current directory is no longer the one from which you started os.walk, and then a second chdir into another folder given as a relative path will fail (either stopping your program or putting you in an unexpected folder).
Just add the folder name as prefixes, and don't try using chdir.
Moreover, as ShadowRanger points out in a comment above, os.walk is documented to break if you chdir during its iteration when it was given a relative path - https://docs.python.org/3/library/os.html#os.walk - and that is likely the root of the problem you had.
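To make the advice concrete, here is a minimal, self-contained sketch of the copy loop without any chdir. The function name copy_all_files is made up for this example, and it assumes that store_folder already exists and lies outside src_folder.

```python
import os
import shutil


def copy_all_files(src_folder, store_folder):
    """Copy every file under src_folder into store_folder (flat), without chdir."""
    copied = []
    for root, dirs, files in os.walk(src_folder):
        for name in files:
            # root is the directory that contains `name`; joining the two gives
            # a path that stays valid because the working directory never changes.
            source = os.path.join(root, name)
            copied.append(shutil.copy(source, store_folder))
    return copied
```

Because everything lands in one flat folder, two files with the same name in different subfolders would overwrite each other; that matches the original snippet's behaviour.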

Python - extract and modify a file path in all files in a directory in linux

I have .sh files and .json files that contain file paths pointing to a specific directory, but I have to keep changing those paths depending on where my Python script is run.
E.g. the content of one of my .sh files is
"cd /home/aswany/BotStudioInstallation/databricks/platform/databricksastro"
and I need to change the file path from Python code, because the following part of the path,
"/home/aswany/BotStudioInstallation/", keeps changing depending on where databricks is located.
I tried the following code:
replaceAll(str(self.currentdirectory) +
           "/databricks/platform/devsettings.json",
           "/home/holmes/BotStudioInstallation", self.currentdirectory)
and the function replaceAll is:
import fileinput
import sys

def replaceAll(self, file, searchExp, replaceExp):
    for line in fileinput.input(file, inplace=1):
        if searchExp in line:
            line = line.replace(searchExp, replaceExp)
        sys.stdout.write(line)
but the code above only replaces the line containing "home/holmes/BotStudioInstallation" with the current directory I am logged in to, and I cannot be sure that "home/holmes/BotStudioInstallation" is the only possibility: it keeps changing, e.g. "home/aswany/BotStudioInstallation", "home/dev3/BotStudioInstallation", etc. I thought of using a regular expression for this.
Please help me.
Not sure I 100% understood your issue, but maybe I can help nonetheless.
As pointed out by J.F. Sebastian, you can use relative paths and remove the base part of the path. Using ./databricks/platform/devsettings.json might be enough. This is by far the most elegant solution.
If for any reason it is not, you can store only the part of the directory you need to access, then append it to the base directory whenever you need it. That should allow you to deal with changes in the base directory. If the files will also be used by applications other than your own, though, that might not be an option.
dir = get_dir_from_json()
dir_with_base = self.currentdirectory + dir
Alternatively, though it is not an elegant solution, you can avoid regex by putting a fixed placeholder "pattern" in the files and always replacing that.
{
    "directory": "<<_replace_me_>>/databricks/platform"
}
Then you know you can always replace "<<_replace_me_>>" with the base directory.
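If editing the files to use a placeholder is not possible, the regular expression the question asks about could look roughly like the sketch below. It rests on an assumption that is not confirmed anywhere above: that the changing base always has the shape /home/<user>/BotStudioInstallation. The names BASE_PATTERN and rebase_paths are invented for this example.

```python
import re

# Assumption: the base always looks like /home/<user>/BotStudioInstallation,
# where <user> contains neither a slash nor whitespace.
BASE_PATTERN = re.compile(r"/home/[^/\s]+/BotStudioInstallation")


def rebase_paths(text, new_base):
    """Replace every /home/<user>/BotStudioInstallation prefix in text with new_base."""
    # A function is used as the replacement so that backslashes or "\1"-like
    # sequences in new_base are inserted literally instead of being interpreted.
    return BASE_PATTERN.sub(lambda match: new_base, text)
```

Inside the replaceAll loop from the question, the line could then be written back with sys.stdout.write(rebase_paths(line, replaceExp)) instead of calling line.replace with one fixed search string.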

full path name in the "root" variable returned in os.walk

I am using os.walk to run through a tree of directories, check for some input files, and then run a program if the proper inputs are there. I am having a problem because of the way os.walk evaluates the root variable in the loop:
for root, dirs, files in os.walk('.'):  # I use '.' because I want the walk to
                                        # start where I run the script. And it
                                        # will/can change
    if "input.file" in files:
        infile = os.path.join(root, "input.file")
        subprocess.check_output("myprog input.file", shell=True)
        # if an input file is found, store the path to it
        # and run the program
This is giving me an issue because the infile string looks like this
./path/to/input.file
When it needs to look like this for the program to be able to find it
/home/start/of/walk/path/to/input.file
I want to know if there is a better method, or a different way to use os.walk, such that I can leave the starting directory arbitrary but still use the full path to any files it finds for me. Thanks.
The program I am running is written by me in C++, and I suppose I could modify it as well, but that is not what I am asking about. To clarify, this question is about Python's os.walk and related topics, which is why there are no examples of my C++ code here.
Instead of passing "." to os.walk, convert it to an absolute path first with os.path.abspath("."). The starting directory is then absolute before the walk begins, so every root that os.walk yields is an absolute path too.
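A minimal sketch of that suggestion follows. The function name find_input_files and its default arguments are assumptions made for this example; "input.file" is the file name from the question.

```python
import os


def find_input_files(start=".", name="input.file"):
    """Yield the absolute path of every file called `name` below `start`."""
    # Resolve the starting point once, before walking; every root that
    # os.walk yields afterwards is then absolute as well.
    top = os.path.abspath(start)
    for root, dirs, files in os.walk(top):
        if name in files:
            yield os.path.join(root, name)
```

Each yielded path can be handed straight to the external program, for example with subprocess.check_output(["myprog", infile]), where myprog is the asker's own program.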

How to get the path of a program in python?

I'm writing a program that needs to open Chimera. I'm opening it with:
def generate_files_bat(filename):
    f = open(filename, 'w')
    text = """echo off
SET PATH=%PATH%;"C:\\Program Files (x86)\\Chimera 1.6.1\\bin"
chimera colpeps.cmd"""
    print >>f, text
    f.close()
But I need to find Chimera regardless of which computer the Python program is running on. Is there any way for the Python program to search for the path on any computer?
Generally speaking, I don't think it is such a good idea to search the path for a program. Imagine, for example, that two different versions were installed on the machine. Are you sure you would find the right one? Maybe a configuration file parsed with the standard module ConfigParser would be a better option.
Anyway, to go back to your question, in order to find a file or directory, you could try os.walk, which recursively walks through a directory tree.
Here is an example invoking os.walk from a generator, allowing you to collect either the first or all matching file names. Please note that the generator result is based only on the file name. If you require more advanced filtering (say, to keep only executable files), you will probably need something like os.stat() to extend the test.
import os

def fileInPath(name, root):
    for base, dirs, files in os.walk(root):
        if name in files:
            yield os.path.join(base, name)

print("Search for only one result:")
print(next(fileInPath("python", "/home/sylvain")))
print("Display all matching files:")
print([i for i in fileInPath("python", "/home/sylvain")])
There is which for Linux and where for Windows. They both give you the path to the executable, provided it lies in a directory that is 'searched' by the console (so on Windows it has to be in %PATH%).
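From Python 3.3 on, the standard library offers the same kind of lookup as which/where through shutil.which. The sketch below wraps it in a hypothetical helper, find_program, whose extra_dirs parameter is an assumption of this example: it lets the caller add directories (such as a default install location) to try when the program is not on PATH.

```python
import os
import shutil


def find_program(name, extra_dirs=()):
    """Return the full path of executable `name`, or None if it is not found."""
    # First look in the directories listed in the PATH environment variable.
    found = shutil.which(name)
    if found is None and extra_dirs:
        # Then try the caller-supplied directories (an assumption of this
        # sketch), joined with the platform's path separator.
        found = shutil.which(name, path=os.pathsep.join(extra_dirs))
    return found
```

For the question above, a call such as find_program("chimera", [r"C:\Program Files (x86)\Chimera 1.6.1\bin"]) would check PATH first and fall back to the install directory quoted in the question.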
There is a package called Unipath that does elegant, clean path calculations.
Have a look here for the AbstractPath constructor.
Example:
from unipath import Path
prom_dir = Path(__file__)

Distinguishing Files From Directories

So I'm sure this is a stupid question, but I've looked through Python's documentation and tried a couple of code snippets found via Google, and none of them has worked.
It seems like the following should work, but it returns "False" for everything.
In my directory /foo/bar I have 3 items: 1 Folder "[Folder]", 1 file "test" (no extension), and 1 file "test.py".
I'm looking for a script that can distinguish folders from files for a bunch of functions, but I can't figure out anything that works.
#!/usr/bin/python
import os, re
for f in os.listdir('/foo/bar'):
    print f, os.path.isdir(f)
Currently returns false for everything.
This is because listdir() returns only the names of the entries in /foo/bar, not their full paths. When you later call os.path.isdir() on one of these names, the OS interprets it relative to the current working directory, which is probably the directory your script is in, not /foo/bar, and which probably does not contain a directory of that name. A path that doesn't exist is not a directory, so isdir() returns False.
Use the complete pathname. Best way is to use os.path.join, e.g., os.path.isdir(os.path.join('/foo/bar', f)).
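As a small, self-contained illustration of that fix, the sketch below sorts the entries of a folder into directories and files. The function name split_entries is made up for this example.

```python
import os


def split_entries(folder):
    """Return (directories, files) found directly inside folder, each sorted."""
    directories, files = [], []
    for name in os.listdir(folder):
        # listdir() yields bare names, so rebuild the full path before testing.
        full_path = os.path.join(folder, name)
        if os.path.isdir(full_path):
            directories.append(name)
        else:
            files.append(name)
    return sorted(directories), sorted(files)
```

Calling split_entries('/foo/bar') works from any working directory, because each name is joined with the folder before isdir() sees it.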
You might want to use os.walk instead: http://docs.python.org/library/os.html#os.walk
When it returns the contents of the directory, it returns files and directories in separate lists, negating the need for checking.
So you could do:
import os
root, dirs, files = next(os.walk('/foo/bar'))
print 'directories:', dirs
print 'files:', files
I suppose that os.path.isdir(os.path.join('/foo/bar', f)) should work.
