Case sensitive path comparison in python

I have to check whether a file is present at a particular path on Mac OS X.
There is a file called foo.F90 inside the directory.
But when I call os.path.exists(PATH_TO_foo.f90), it returns True and does not notice that the f90 I passed is lowercase while the file that exists has an uppercase F90.
I also tried open(PATH_TO_foo.f90, "r"), but that does not detect the mismatch either.
How do I get around this?

As some commenters have noted, Python doesn't really care about case in paths on case-insensitive filesystems, so none of the path comparison or manipulation functions will really do what you need.
However, you can indirectly test this with os.listdir(), which does give you the directory contents with their actual cases. Given that, you can test if the file exists with something like:
'foo.f90' in os.listdir('PATH_TO_DIRECTORY')
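To make the os.listdir() idea concrete, here is a minimal sketch. The helper name exists_with_exact_case is made up for illustration, and it only checks the case of the final path component, not of the parent directories.

```python
import os
import tempfile


def exists_with_exact_case(path):
    """Return True only if the last component of `path` matches a directory entry exactly."""
    directory, name = os.path.split(os.path.abspath(path))
    try:
        # os.listdir() reports names with the case stored on disk.
        return name in os.listdir(directory)
    except OSError:
        # The parent directory is missing or unreadable.
        return False


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "foo.F90"), "w").close()
    upper_found = exists_with_exact_case(os.path.join(tmp, "foo.F90"))
    lower_found = exists_with_exact_case(os.path.join(tmp, "foo.f90"))
```

Because the comparison is a plain string match against the listing, the result should be the same on case-sensitive and case-insensitive filesystems.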

This comes from the filesystem of the underlying operating system, not from Python. For example, Windows filesystems are case-insensitive by default, while Linux filesystems are typically case-sensitive. So if I run the same kind of check on a Linux system, a name with the wrong case does not match:
>>> os.path.exists('F90')
True
>>> os.path.exists('f90')
False  # on my Linux-based OS
If you still want to work around this, you can compare the file name against the directory listing:
if os.path.basename('PATH_TO_foo.f90') in os.listdir(os.path.dirname('PATH_TO_foo.f90')):
    pass  # do whatever you want to do

For anyone still struggling with this, the following snippet works for me.
from pathlib import Path
def path_exists_case_sensitive(p: Path) -> bool:
    """Check if path exists, enforce case sensitivity.
    Arguments:
        p: Path to check
    Returns:
        Boolean indicating if the path exists or not
    """
    # If it doesn't exist initially, return False
    if not p.exists():
        return False
    # Else loop over the path, checking each consecutive folder for
    # case sensitivity
    while True:
        # At root, p == p.parent --> break loop and return True
        if p == p.parent:
            return True
        # If string representation of path is not in parent directory, return False
        if str(p) not in map(str, p.parent.iterdir()):
            return False
        p = p.parent

So, since you do not have a case-sensitive filesystem, how about:
import os
if 'foo.F90' in os.listdir(PATH_TO_DIRECTORY):
    open(os.path.join(PATH_TO_DIRECTORY, 'foo.F90'))

If the path changes between systems or something, you could use:
import os, fnmatch
for file in os.listdir('.'):
    if fnmatch.fnmatch(file, 'foo.*'):
        print(file)
This will print every file named foo, whatever its extension.

I think this should work for you on Windows (it needs win32api from pywin32):
import os
def exists(path):
    import win32api
    if path.find('/') != -1:
        msg = 'Invalid path for Windows %s' % path
        raise Exception(msg)
    path = path.rstrip(os.sep)
    try:
        # Round-trip through the short name to get the path with its on-disk case
        fullpath = win32api.GetLongPathNameW(win32api.GetShortPathName(path))
    except Exception:
        return False
    return fullpath == path

Easy way to do this for Python 3.5+ using pathlib. Works for files and dirs:
from pathlib import Path
def exists_case_sensitive(path) -> bool:
    p = Path(path)
    if not p.exists():
        # If it already doesn't exist(),
        # we can skip iterdir().
        return False
    return p in p.parent.iterdir()

Related

How to find steam executable path in Python

What are the best ways to find the Steam install path (for example using registry, possible paths, and the Steam start-menu shortcut?)

This code works for a personal project on Windows.
import os
import winreg
import win32api
def read_reg(ep, p = r"", k = ''):
    try:
        key = winreg.OpenKeyEx(ep, p)
        value = winreg.QueryValueEx(key,k)
        if key:
            winreg.CloseKey(key)
        return value[0]
    except Exception as e:
        return None
    return None
Path1 = "{}\\Microsoft\\Windows\\Start Menu\\Programs\\Steam\\Steam.lnk".format(os.getenv('APPDATA'))
if os.path.exists(Path1):
    import sys
    import win32com.client
    shell = win32com.client.Dispatch("WScript.Shell")
    shortcut = shell.CreateShortCut(Path1)
    Path1Res = shortcut.Targetpath
else:
    Path1Res = False
Path2 = str(read_reg(ep = winreg.HKEY_LOCAL_MACHINE, p = r"SOFTWARE\Wow6432Node\Valve\Steam", k = 'InstallPath'))+r"\steam.exe"
Path3 = str(read_reg(ep = winreg.HKEY_LOCAL_MACHINE, p = r"SOFTWARE\Valve\Steam", k = 'InstallPath'))+r"\steam.exe"
if not os.path.exists(Path2):
    Path2 = None
if not os.path.exists(Path3):
    Path3 = None
PossiblePaths = [r"X:\Steam\steam.exe", r"X:\Program Files\Steam\steam.exe", r"X:\Program Files (x86)\Steam\steam.exe"]
ValidHardPaths = []
for Drive in win32api.GetLogicalDriveStrings().split('\000')[:-1]:
    Drive = Drive.replace(':\\', '')
    for path in PossiblePaths:
        path = path.replace("X", Drive)
        if os.path.exists(path):
            ValidHardPaths.append(path)
if len(ValidHardPaths) == 0:
    ValidHardPaths = ["None"]
print("Registry64: " + str(Path2)+"|"+ "Registry32: "+ str(Path3)+"|"+ "Start Menu Shortcut: "+ str(Path1Res)+"|"+ "Possible Locations: " + ', '.join(ValidHardPaths)+"|")
As I said earlier, this code was for a personal project, but I'll still try to explain it as best I can.
Method 1 (Start Menu shortcut): looks for the Steam Start Menu shortcut and, if it exists, reads the shortcut's target path (source: https://stackoverflow.com/a/571573/14132974).
Method 2 (registry): looks up the Steam registry path, reads the "InstallPath" value, appends 'steam.exe' to it, and then checks whether the resulting path exists. It does the same using the Steam32 registry path (source: https://tutorialexample.com/python-read-and-write-windows-registry-a-step-guide-python-tutorial/, https://github.com/NPBruce/valkyrie/issues/1056).
Method 3 (possible paths): fairly simple. There is a list of paths where Steam is likely to be installed; each one is tried on every drive in the system and kept if it exists (source: https://stackoverflow.com/a/827397/14132974).
Lastly, support:
If a method does not find a valid path, its result becomes 'None'. The 'possible paths' method also supports finding multiple paths.

Is it possible to mock os.scandir and its attributes?

for entry in os.scandir(document_dir):
    if os.path.isdir(entry):
        # some code goes here
    else:
        # else the file needs to be in a folder
        file_path = entry.path.replace(os.sep, '/')
I am having trouble mocking os.scandir and the path attribute within the else statement. I am not able to mock the mock object's property I created in my unit tests.
with patch("os.scandir") as mock_scandir:
    # mock_scandir.return_value = ["docs.json", ]
    # mock_scandir.side_effect = ["docs.json", ]
    # mock_scandir.return_value.path = PropertyMock(return_value="docs.json")
These are all the options I've tried. Any help is greatly appreciated.

It depends on what you really need to mock. The problem is that os.scandir returns entries of type os.DirEntry. One possibility is to use your own mock DirEntry and implement only the attributes that you need (in your example, only path). For your example, you also have to mock os.path.isdir. Here is a self-contained example of how you can do this:
import os
from unittest.mock import patch
def get_paths(document_dir):
    # example function containing your code
    paths = []
    for entry in os.scandir(document_dir):
        if os.path.isdir(entry):
            pass
        else:
            # else the file needs to be in a folder
            file_path = entry.path.replace(os.sep, '/')
            paths.append(file_path)
    return paths
class DirEntry:
    # minimal stand-in for os.DirEntry: only the path attribute is needed
    def __init__(self, path):
        self.path = path
@patch("os.scandir")
@patch("os.path.isdir")
def test_sut(mock_isdir, mock_scandir):
    mock_isdir.return_value = False
    mock_scandir.return_value = [DirEntry("docs.json")]
    assert get_paths("anydir") == ["docs.json"]
Depending on your actual code, you may have to do more.
If you want to patch more file system functions, consider using pyfakefs instead, which patches the whole file system. This would be overkill for a single test, but can be handy for a test suite that relies on file system functions.
Disclaimer: I'm a contributor to pyfakefs.
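As a variation on the hand-written DirEntry class, the entries can also be built with MagicMock. The sketch below is only an illustration: get_file_paths and make_entry are made-up names, and it assumes the code under test calls entry.is_dir() rather than os.path.isdir(entry), so that only os.scandir needs patching.

```python
import os
from unittest.mock import MagicMock, patch


def get_file_paths(document_dir):
    """Collect forward-slash paths of the non-directory entries."""
    paths = []
    for entry in os.scandir(document_dir):
        if not entry.is_dir():
            paths.append(entry.path.replace(os.sep, "/"))
    return paths


def make_entry(path, is_dir=False):
    """Build a stand-in for os.DirEntry exposing only what the code reads."""
    entry = MagicMock(spec=os.DirEntry)
    entry.path = path
    entry.name = os.path.basename(path)
    entry.is_dir.return_value = is_dir
    return entry


with patch("os.scandir") as mock_scandir:
    # The patched scandir hands back our fake entries instead of touching the disk.
    mock_scandir.return_value = [
        make_entry(os.path.join("docs", "a.json")),
        make_entry(os.path.join("docs", "sub"), is_dir=True),
    ]
    result = get_file_paths("docs")
```

Using spec=os.DirEntry means a typo in an attribute name raises AttributeError instead of silently returning another mock.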

How to search upwards through directories? Can I os.walk upwards to the root of the filesystem?

I am trying to search for a specific directory, starting from a given directory but going upwards, rather than down as in os.walk. For example, this function returns whether or not the given directory is the root of an Alire project - which just means it contains alire/*.toml:
''' Check if this directory contains a 'alire/*.toml' file '''
def is_alire_root(dir):
    dir = dir / "alire"
    if dir.is_dir():
        for x in dir.iterdir():
            if x.suffixes == [".toml"]:
                return True
        return False
    else:
        return False
So, given such a predicate that tells us whether we have found the directory we need, how would I search upwards from a given path, so that e.g.
os_walk_upwards(os.path.abspath("."), is_alire_root)
Will tell us if the current directory or any directories above it contain alire/*.toml? Although os_walk_upwards could be used for various searches, I am specifically looking for something that will work as a plugin in Gnatstudio.

For Python versions >= 3.4 we can use pathlib:
import os.path
from pathlib import Path
def is_alire_root(dir):
    (... as above ...)
''' Search upwards from path for a directory matching the predicate '''
def os_walk_upwards(directory_predicate, path=Path(os.path.abspath("."))):
    if directory_predicate(path):
        return True
    else:
        parent = path.parent
        if parent == path:
            return False # reached root of filesystem
        # recurse so that every ancestor is checked, not just the direct parent
        return os_walk_upwards(directory_predicate, parent)
print(os_walk_upwards(is_alire_root))
But Gnatstudio uses python 2.7.16, so that isn't going to work. Instead, use:
import os.path
''' Check if this directory contains a 'alire/*.toml' file '''
def is_alire_root(dir):
    dir = os.path.join(dir, "alire")
    if os.path.isdir(dir):
        for x in os.listdir(dir):
            if os.path.splitext(x)[1] == ".toml": # will also match e.g. *.tar.gz.toml
                return True
        return False
    else:
        return False
''' Check if this or any parent directories are alire_root directories '''
def os_walk_upwards(directory_predicate, path=os.path.abspath(".")):
    if directory_predicate(path):
        return True
    else:
        parent = os.path.dirname(path)
        if parent == path:
            return False # reached root of filesystem
        # recurse so that every ancestor is checked, not just the direct parent
        return os_walk_upwards(directory_predicate, parent)
print(os_walk_upwards(is_alire_root))
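On Python 3, the same upward search can also be written without recursion by iterating over Path.parents. This is just a sketch for comparison: find_upwards and has_alire_toml are made-up names, and unlike the functions above it returns the matching directory (or None) rather than a boolean.

```python
import tempfile
from pathlib import Path


def find_upwards(start, predicate):
    """Return the first directory, from `start` up to the root, that satisfies predicate."""
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if predicate(candidate):
            return candidate
    return None


def has_alire_toml(directory):
    """True if `directory` contains an alire/*.toml file."""
    alire = directory / "alire"
    return alire.is_dir() and any(alire.glob("*.toml"))


# Demonstration: the project root is two levels above the starting directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / "alire").mkdir()
    (root / "alire" / "project.toml").touch()
    nested = root / "src" / "deep"
    nested.mkdir(parents=True)
    found = find_upwards(nested, has_alire_toml)
    found_matches_root = (found == root)
```

Returning the directory instead of True makes the result usable as the project root by the caller.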

Check Contents of Python Package without Running it?

I would like a function that, given a name which caused a NameError, can identify Python packages which could be imported to resolve it.
That part is fairly easy, and I've done it, but now I have an additional problem: I'd like to do it without causing side-effects. Here's the code I'm using right now:
def necessaryImportFor(name):
    from pkgutil import walk_packages
    for package in walk_packages():
        if package[1] == name:
            return name
        try:
            if hasattr(__import__(package[1]), name):
                return package[1]
        except Exception as e:
            print("Can't check " + package[1] + " on account of a " + e.__class__.__name__ + ": " + str(e))
    print("No possible import satisfies " + name)
The problem is that this code actually __import__s every module. This means that every side-effect of importing every module occurs. When testing my code I found that side-effects that can be caused by importing all modules include:
Launching tkinter applications
Requesting passwords with getpass
Requesting other input or raw_input
Printing messages (import this)
Opening websites (import antigravity)
A possible solution that I considered would be finding the path to every module (how? It seems to me that the only way to do this is by importing the module then using some methods from inspect on it), then parsing it to find every class, def, and = that isn't itself within a class or def, but that seems like a huge PITA and I don't think it would work for modules which are implemented in C/C++ instead of pure Python.
Another possibility is launching a child Python instance which has its output redirected to devnull and performing its checks there, killing it if it takes too long. That would solve the first four bullets, and the fifth one is such a special case that I could just skip antigravity. But having to start up thousands of instances of Python in this single function seems a bit... heavy and inefficient.
Does anyone have a better solution I haven't considered? Is there a simple way of just telling Python to generate an AST or something without actually importing a module, for example?

So I ended up writing a few methods which can list everything from a source file, without importing the source file.
The ast module doesn't seem particularly well documented, so this was a bit of a PITA trying to figure out how to extract everything of interest. Still, after ~6 hours of trial and error today, I was able to get this together and run it on the 3000+ Python source files on my computer without any exceptions being raised.
def listImportablesFromAST(ast_):
    from ast import (Assign, ClassDef, FunctionDef, Import, ImportFrom, Name,
                     For, Tuple, TryExcept, TryFinally, With)
    if isinstance(ast_, (ClassDef, FunctionDef)):
        return [ast_.name]
    elif isinstance(ast_, (Import, ImportFrom)):
        return [name.asname if name.asname else name.name for name in ast_.names]
    ret = []
    if isinstance(ast_, Assign):
        for target in ast_.targets:
            if isinstance(target, Tuple):
                ret.extend([elt.id for elt in target.elts])
            elif isinstance(target, Name):
                ret.append(target.id)
        return ret
    # These two attributes cover everything of interest from If, Module,
    # and While. They also cover parts of For, TryExcept, TryFinally, and With.
    if hasattr(ast_, 'body') and isinstance(ast_.body, list):
        for innerAST in ast_.body:
            ret.extend(listImportablesFromAST(innerAST))
    if hasattr(ast_, 'orelse'):
        for innerAST in ast_.orelse:
            ret.extend(listImportablesFromAST(innerAST))
    if isinstance(ast_, For):
        target = ast_.target
        if isinstance(target, Tuple):
            ret.extend([elt.id for elt in target.elts])
        else:
            ret.append(target.id)
    elif isinstance(ast_, TryExcept):
        for innerAST in ast_.handlers:
            ret.extend(listImportablesFromAST(innerAST))
    elif isinstance(ast_, TryFinally):
        for innerAST in ast_.finalbody:
            ret.extend(listImportablesFromAST(innerAST))
    elif isinstance(ast_, With):
        if ast_.optional_vars:
            ret.append(ast_.optional_vars.id)
    return ret
def listImportablesFromSource(source, filename = '<Unknown>'):
    from ast import parse
    return listImportablesFromAST(parse(source, filename))
def listImportablesFromSourceFile(filename):
    with open(filename) as f:
        source = f.read()
    return listImportablesFromSource(source, filename)
The above code covers the titular question: How do I check the contents of a Python package without running it?
But it leaves you with another question: How do I get the path to a Python package from just its name?
Here's what I wrote to handle that:
class PathToSourceFileException(Exception):
    pass
class PackageMissingChildException(PathToSourceFileException):
    pass
class PackageMissingInitException(PathToSourceFileException):
    pass
class NotASourceFileException(PathToSourceFileException):
    pass
def pathToSourceFile(name):
    '''
    Given a name, returns the path to the source file, if possible.
    Otherwise raises an ImportError or subclass of PathToSourceFileException.
    '''
    from os.path import dirname, isdir, isfile, join
    if '.' in name:
        parentSource = pathToSourceFile('.'.join(name.split('.')[:-1]))
        path = join(dirname(parentSource), name.split('.')[-1])
        if isdir(path):
            path = join(path, '__init__.py')
            if isfile(path):
                return path
            raise PackageMissingInitException()
        path += '.py'
        if isfile(path):
            return path
        raise PackageMissingChildException()
    from imp import find_module, PKG_DIRECTORY, PY_SOURCE
    f, path, (suffix, mode, type_) = find_module(name)
    if f:
        f.close()
    if type_ == PY_SOURCE:
        return path
    elif type_ == PKG_DIRECTORY:
        path = join(path, '__init__.py')
        if isfile(path):
            return path
        raise PackageMissingInitException()
    raise NotASourceFileException('Name ' + name + ' refers to the file at path ' + path + ' which is not that of a source file.')
Tying the two bits of code together, I have this function:
def listImportablesFromName(name, allowImport = False):
    try:
        return listImportablesFromSourceFile(pathToSourceFile(name))
    except PathToSourceFileException:
        if not allowImport:
            raise
        return dir(__import__(name))
Finally, here's the implementation for the function that I mentioned I wanted in my question:
def necessaryImportFor(name):
    packageNames = []
    def nameHandler(name):
        packageNames.append(name)
    from pkgutil import walk_packages
    for package in walk_packages(onerror=nameHandler):
        nameHandler(package[1])
    # Suggestion: Sort package names by count of '.', so shallower packages are searched first.
    for package in packageNames:
        # Suggestion: just skip any package that starts with 'test.'
        try:
            if name in listImportablesFromName(package):
                return package
        except ImportError:
            pass
        except PathToSourceFileException:
            pass
    return None
And that's how I spent my Sunday.
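Note that the code above targets Python 2: ast.TryExcept and ast.TryFinally were merged into ast.Try in Python 3.3, and the imp module was removed in Python 3.12. Below is a rough Python 3 sketch of the same idea, not a port of the code above. The names top_level_names and source_path_for are made up, it only looks at top-level statements (it does not descend into if/for/try/with blocks), and importlib.util.find_spec imports parent packages when given a dotted name, so it is only free of side effects for top-level modules.

```python
import ast
import importlib.util


def top_level_names(source):
    """List names bound by top-level statements, without executing the source."""
    names = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.append(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                if alias.name != "*":
                    # "import a.b" binds "a"; "import a.b as c" binds "c".
                    names.append(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                # Only names being stored to count; this skips x in "x.attr = 1".
                names.extend(
                    n.id for n in ast.walk(target)
                    if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)
                )
    return names


def source_path_for(module_name):
    """Locate a module's .py file without importing it; None if there is no source."""
    spec = importlib.util.find_spec(module_name)
    if spec is None or not spec.origin or not spec.origin.endswith(".py"):
        return None
    return spec.origin


sample = (
    "import os.path as osp\n"
    "from json import loads\n"
    "X, (Y, Z) = 1, (2, 3)\n"
    "def f(): pass\n"
    "class C: pass\n"
)
names = top_level_names(sample)
```

Built-in and extension modules have no .py source, so source_path_for returns None for them and a caller would have to fall back to importing, as listImportablesFromName does with allowImport.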

What is NamedTemporaryFile useful for on Windows?

The Python module tempfile contains both NamedTemporaryFile and TemporaryFile. The documentation for the former says
Whether the name can be used to open the file a second time, while the named temporary file is still open, varies across platforms (it can be so used on Unix; it cannot on Windows NT or later)
What is the point of the file having a name if I can't use that name? If I want the useful (for me) behaviour of Unix on Windows, I've got to make a copy of the code and rip out all the bits that say if _os.name == 'nt' and the like.
What gives? Surely this is useful for something, since it was deliberately coded this way, but what is that something?

The documentation only says that you cannot open it a second time while it is still open. You can still use the name otherwise; just be sure to pass delete=False when creating the NamedTemporaryFile so that it persists after it is closed.
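A minimal sketch of that pattern, assuming you are happy to delete the file yourself once you are done with it:

```python
import os
import tempfile

# delete=False keeps the file on disk after close(), so the name stays usable.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("hello")
    tmp_name = tmp.name

try:
    # The first handle is closed by now, so reopening by name also works on Windows.
    with open(tmp_name) as reopened:
        content = reopened.read()
finally:
    os.unlink(tmp_name)  # cleanup is now the caller's job

still_exists = os.path.exists(tmp_name)
```

The name can be handed to another function or process between the two with blocks, which is usually the reason for wanting a named file in the first place.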

I use this:
import os, tempfile, gc
class TemporaryFile:
    def __init__(self, name, io, delete):
        self.name = name
        self.__io = io
        self.__delete = delete
    def __getattr__(self, k):
        return getattr(self.__io, k)
    def __del__(self):
        if self.__delete:
            try:
                os.unlink(self.name)
            except FileNotFoundError:
                pass
def NamedTemporaryFile(mode='w+b', bufsize=-1, suffix='', prefix='tmp', dir=None, delete=True):
    if not dir:
        dir = tempfile.gettempdir()
    name = os.path.join(dir, prefix + os.urandom(32).hex() + suffix)
    if mode is None:
        return TemporaryFile(name, None, delete)
    fh = open(name, "w+b", bufsize)
    if mode != "w+b":
        fh.close()
        fh = open(name, mode)
    return TemporaryFile(name, fh, delete)
def test_ntf_txt():
    x = NamedTemporaryFile("w")
    x.write("hello")
    x.close()
    assert os.path.exists(x.name)
    with open(x.name) as f:
        assert f.read() == "hello"
def test_ntf_name():
    x = NamedTemporaryFile(suffix="s", prefix="p")
    assert os.path.basename(x.name)[0] == 'p'
    assert os.path.basename(x.name)[-1] == 's'
    x.write(b"hello")
    x.seek(0)
    assert x.read() == b"hello"
def test_ntf_del():
    x = NamedTemporaryFile(suffix="s", prefix="p")
    assert os.path.exists(x.name)
    name = x.name
    del x
    gc.collect()
    assert not os.path.exists(name)
def test_ntf_mode_none():
    x = NamedTemporaryFile(suffix="s", prefix="p", mode=None)
    assert not os.path.exists(x.name)
    name = x.name
    f = open(name, "w")
    f.close()
    assert os.path.exists(name)
    del x
    gc.collect()
    assert not os.path.exists(name)
This works on all platforms: you can close it, open it up again, etc.
The mode=None feature is what you want: ask for a tempfile with mode=None and you get a UUID-style temp name with the dir/suffix/prefix that you want, without the file being created. The tests above show usage.
It's basically the same as NamedTemporaryFile, except that the file gets auto-deleted when the returned object is garbage collected, not on close.

You don't want to "rip out all the bits...". It's coded like that for a reason. It says you can't open it a SECOND time while it's still open. Don't. Just use it once, and throw it away (after all, it is a temporary file). If you want a permanent file, create your own.
"Surely this is useful for something, since it was deliberately coded this way, but what is that something". Well, I've used it to write emails to (in a binary format) before copying them to a location where our Exchange Server picks them up & sends them. I'm sure there are lots of other use cases.

I'm pretty sure the Python library writers didn't just decide to make NamedTemporaryFile behave differently on Windows for laughs. All those _os.name == 'nt' tests will be there because of platform differences between Windows and Unix. So my inference from that documentation is that on Windows a file opened the way NamedTemporaryFile opens it cannot be opened again while NamedTemporaryFile still has it open, and that this is due to the way Windows works.
