Python os module path functions - python

From the documentation:
os.path.realpath(path)
Return the canonical path of the specified filename, eliminating any
symbolic links encountered in the path (if they are supported by the
operating system).
When I invoke this with an extant file's name, I get the path to it: /home/myhome/myproject.
When I invoke this with a 'nonsense.xxx' string argument, I still get a path to /home/myhome/myproject/nonsense.xxx. This is a little inconsistent because it looks like nonsense.xxx is taken to be a directory not a file (though it is neither: it does not exist).
When I invoke this with a null string file name, I still get a path to /home/myhome/myproject.
How can I account for this behaviour when the documentation says so little about realpath()? (I am using Python 2.5.)
Edit: Somebody suggested a way to test if files exist. My concern is not to test if files exist. My concern is to account for behaviour.

os.path isn't interested in whether or not the files exist. It is merely concerned with constructing paths.
realpath eliminates known symlinks from the equation, but directories that do not exist are assumed to be valid elements of a path regardless.

Rather than guess, just read the code! It's there in your python installation. Or browse here, it's only 14 lines minus comments.

Place test such as "os.path.isfile(x)", "x is not None" and "os.path.isdir(x)" before the call?

Related

Python - extract and modify a file path in all files in a directory in linux

I have files .sh files and .json files in which there are file paths given to point to a specific directory, but I should keep on changing the file path, depending on where my python scipt is run.
eg:content of one of my .sh file is
"cd /home/aswany/BotStudioInstallation/databricks/platform/databricksastro"
and I should change the file path via python code where the following path
"/home/aswany/BotStudioInstallation/" keep on changing depending on where databicks is located,
I tried the following code:
replaceAll(str(self.currentdirectory)+
"/databricks/platform/devsettings.json",
"/home/holmes/BotStudioInstallation",self.currentdirectory)
and function replaceAll is:
def replaceAll(self,file,searchExp,replaceExp):
for line in fileinput.input(file, inplace=1):
if searchExp in line:
line = line.replace(searchExp,replaceExp)
sys.stdout.write(line)
but above code only replaces a line
"home/holmes/BotStudioInstallation" to the current directory I am logged in,bt it cannot be sure that "home/holmes/BotStudioInstallation" is the only possibility it keep on changing like "home/aswany/BotStudioInstallation","home/dev3/BotStudioInstallation" etc ,I thought of regular expression for this.
please help me
Not sure I 100% understood your issue, but maybe I can help nonetheless.
As pointed out by J.F. Sebastian, you can use relative paths and remove the base part of the path. Using ./databricks/platform/devsettings.json might be enough. This is by far the most elegant solution.
If for any reason it is not, you can keep the directory you need to access, then append it to the base directory whenever you need it. That should allow you to deal with changes in the base directory. Though in the case the files will be used by other applications than your own, that might not be an option.
dir = get_dir_from_json()
dir_with_base = self.currentdirectory + dir
Alternatively, not an elegant solution though, without using regex you can use a "pattern" to always replace.
{
"directory": "<<_replace_me_>>/databricks/platform"
}
Then you know you can always replace "<<_replace_me_>>" with the base directory.

What directory does os.path.join start at?

I made a script in the past to mass rename any file greater than x characters in a directory. When I made that script I had a source directory which you would need to input manually. Any file that was over x characters in that directory would be stripped of it's extension, renamed, then the extension would be re added and it would use os.path.join to join the source and the newly created filename+ext. I'm now making another script and used os.path.join("Folder in the current dir", "file in that dir"). Because this worked I'm guessing that when os.path.join is called with just a foldername and no full path in it's first parameter it starts it's search from the directory that the script it was run in? Just wondering if this is correct.
os.path.join has nothing to do with any actual filesystem, and does not "start" anywhere. It simply joins two arbitrary paths, whether they exist or not.
What os.path.join does is to just join path elements the system-compatible way, taking into effect the particular directory separator character, etc., into account. It's a simple string manipulation tool.
So the returned result simply starts from whatever you give to it as the first argument.

normalize non-existing path using pathlib only

python has recently added the pathlib module (which i like a lot!).
there is just one thing i'm struggling with: is it possible to normalize a path to a file or directory that does not exist? i can do that perfectly well with os.path.normpath. but wouldn't it be absurd to have to use something other than the library that should take care of path related stuff?
the functionality i would like to have is this:
from os.path import normpath
from pathlib import Path
pth = Path('/tmp/some_directory/../i_do_not_exist.txt')
pth = Path(normpath(str(pth)))
# -> /tmp/i_do_not_exist.txt
but without having to resort to os.path and without having to type-cast to str and back to Path. also pth.resolve() does not work for non-existing files.
is there a simple way to do that with just pathlib?
is it possible to normalize a path to a file or directory that does not exist?
Starting from 3.6, it's the default behavior. See https://docs.python.org/3.6/library/pathlib.html#pathlib.Path.resolve
Path.resolve(strict=False)
...
If strict is False, the path is resolved as far as possible and any remainder is appended without checking whether it exists
As of Python 3.5: No, there's not.
PEP 0428 states:
Path resolution
The resolve() method makes a path absolute, resolving
any symlink on the way (like the POSIX realpath() call). It is the
only operation which will remove " .. " path components. On Windows,
this method will also take care to return the canonical path (with the
right casing).
Since resolve() is the only operation to remove the ".." components, and it fails when the file doesn't exist, there won't be a simple means using just pathlib.
Also, the pathlib documentation gives a hint as to why:
Spurious slashes and single dots are collapsed, but double dots ('..')
are not, since this would change the meaning of a path in the face of
symbolic links:
PurePath('foo//bar') produces PurePosixPath('foo/bar')
PurePath('foo/./bar') produces PurePosixPath('foo/bar')
PurePath('foo/../bar') produces PurePosixPath('foo/../bar')
(a naïve approach would make PurePosixPath('foo/../bar') equivalent to PurePosixPath('bar'), which is wrong if foo is a symbolic link to another directory)
All that said, you could create a 0 byte file at the location of your path, and then it'd be possible to resolve the path (thus eliminating the ..). I'm not sure that's any simpler than your normpath approach, though.
If this fits you usecase(e.g. ifle's directory already exists) you might try to resolve path's parent and then re-append file name, e.g.:
from pathlib import Path
p = Path()/'hello.there'
print(p.parent.resolve()/p.name)
Old question, but here is another solution in particular if you want POSIX paths across the board (like nix paths on Windows too).
I found pathlib resolve() to be broken as of Python 3.10, and this method is not currently exposed by PurePosixPath.
What I found worked was to use posixpath.normpath(). Also found PurePosixPath.joinpath() to be broken. I.E. It will not join ".." with "myfile.txt" as expected. It will return just "myfile.txt". But posixpath.join() works perfectly; will return "../myfile.txt".
Note this is in path strings, but easily back to pathlib.Path(my_posix_path) et al for an OOP container.
And easily transpose to Windows platform paths too by just constructing this way, as the module takes care of the platform independence for you.
Might be the solution for others with Python file path woes..

When isfile() isdir() is just not enough

filePath variable points to non-existent file (yet to be created later).
directoryPath variable points to non-existent directory (yet to be created later).
filePath="/VolumeA/DiskA/DirectoryA/textFile.txt"
directoryPath="/VolumeB/DiskB/DirectoryB"
Since both do not exist we can't use:
os.path.isfile()
os.path.isdir()
What would be a way to check/verify which variable most likely is pointing to file and which points to directory.
Because a file can have just about any character in its name, and because a directory can have just about any character in its name, if you have a string and no way to check with the OS, you have no way to tell which it is supposed to be.

how to find the target file's full(absolute path) of the symbolic link or soft link in python

when i give
ls -l /etc/fonts/conf.d/70-yes-bitmaps.conf
lrwxrwxrwx <snip> /etc/fonts/conf.d/70-yes-bitmaps.conf -> ../conf.avail/70-yes-bitmaps.conf
so for a symbolic link or soft link, how to find the target file's full(absolute path) in python,
If i use
os.readlink('/etc/fonts/conf.d/70-yes-bitmaps.conf')
it outputs
../conf.avail/70-yes-bitmaps.conf
but i need the absolute path not the relative path, so my desired output must be,
/etc/fonts/conf.avail/70-yes-bitmaps.conf
how to replace the .. with the actual full path of the parent directory of the symbolic link or soft link file.
os.path.realpath(path)
os.path.realpath returns the canonical path of the specified filename, eliminating any symbolic links encountered in the path.
As unutbu says, os.path.realpath(path) should be the right answer, returning the canonical path of the specified filename, resolving any symbolic links to their targets. But it's broken under Windows.
I've created a patch for Python 3.2 to fix this bug, and uploaded it to:
http://bugs.python.org/issue9949
It fixes the realpath() function in Python32\Lib\ntpath.py
I've also put it on my server, here:
http://www.burtonsys.com/ntpath_fix_issue9949.zip
Unfortunately, the bug is present in Python 2.x, too, and I know of no fix for it there.
http://docs.python.org/library/os.path.html#os.path.abspath
also joinpath() and normpath(), depending on whether you're in the current working directory, or you're working with things elsewhere. normpath() might be more direct for you.
Specifically:
os.path.normpath(
os.path.join(
os.path.dirname( '/etc/fonts/conf.d/70-yes-bitmaps.conf' ),
os.readlink('/etc/fonts/conf.d/70-yes-bitmaps.conf')
)
)
I recommend using pathlib library for filesystem operations.
import pathlib
x = pathlib.Path('lol/lol/path')
x.resolve()
Documentation for Path.resolve(strict=False): make the path absolute, resolving any symlinks. A new path object is returned.
On windows 10, python 3.5, os.readlink("C:\\Users\PP") where "C:\Users\PP" is a symbolic link (not a junction link) works.
It returns the absolute path to the directory.
This works on Ubuntu 16.04, python 3.5 as well.
The documentation says to use os.path.join():
The result may be either an absolute or relative pathname; if it is relative, it may be converted to an absolute pathname using os.path.join(os.path.dirname(path), result).

Categories

Resources