Python: generate secure temporary file name

In the context of writing unit tests for a backend class, I need a secure way to generate a temporary file name. My current approach is:
fp = tempfile.NamedTemporaryFile(delete=False)
fp.close()
with Backend(fp.name) as backend:
    ...run the test...
os.unlink(fp.name)
This is a bit awkward. Is there a standard library context manager that lets me achieve the same with:
with TempFileName() as name:
    with Backend(name) as backend:
        ...run the test...
Current Solution
It appears that no pre-made context manager exists. I am now using:
class TemporaryBackend(object):
    def __init__(self):
        self.fp = tempfile.NamedTemporaryFile(delete=False)
        self.fp.close()
        self.backend = Backend(self.fp.name)

    def __enter__(self):
        return self.backend

    def __exit__(self, exc_type, exc_value, traceback):
        self.backend.close()
        os.unlink(self.fp.name)
Which can then be used with:
with TemporaryBackend() as backend:
    ...run the test...
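For the more generic form asked about above, no ready-made TempFileName seems to exist in the standard library, but a small helper can be sketched with contextlib.contextmanager (temp_file_name is a made-up name; Backend is the class from the question):

import contextlib
import os
import tempfile

@contextlib.contextmanager
def temp_file_name(suffix=""):
    # create and immediately close a named temporary file, so only its name is used
    fp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    fp.close()
    try:
        yield fp.name
    finally:
        os.unlink(fp.name)

# usage, matching the desired API from the question:
# with temp_file_name() as name:
#     with Backend(name) as backend:
#         ...run the test...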

Rather than creating a temporary file, create a temporary directory which only you have access to. Once you have that, you can simply use an arbitrary string as the name of a file in that directory.
d = tempfile.mkdtemp()
tmp_name = "somefile.txt"
with Backend(os.path.join(d, tmp_name)) as backend:
    ... run test ...
os.remove(os.path.join(d, tmp_name))  # if necessary
os.rmdir(d)
Depending on your needs, you may just want a random string of characters:
with Backend(''.join(random.sample(string.ascii_lowercase, 8))) as backend:
    ... run test ...

The creation of unique file names relies on the ability of file systems to grant exclusive access to files. So one has to create an actual file, not just a file name.
Another way to get a place where you can create files safely is to create a temporary directory and put your files inside it. This would be my preferred way for test cases.
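Since Python 3.2, tempfile.TemporaryDirectory wraps exactly this pattern in a context manager; a minimal, self-contained sketch (the file name "somefile.txt" is arbitrary, as in the answer above):

import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    file_path = os.path.join(d, "somefile.txt")  # any name inside the private directory
    with open(file_path, "w") as f:
        f.write("test data")
    # ...run the test against file_path...
# the directory and everything in it is removed automatically on exit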

Related

Why FileNotFoundError on Path.rename while using Pyfakefs?

I wrote a test for a function that renames files from e.g. /videos/vid_youtube.mp4 to /videos/youtube/vid.mp4. The test patches the fs with Pyfakefs.
When the code actually renames the file, I get this error.
FileNotFoundError: [Errno 2] No such file or directory: '/home/user/code/project/test/DLV/videos/vid_youtube.mp4' -> '/home/user/code/project/test/DLV/videos/youtube/vid.mp4'
This is how I set up fakefs
def setUp(self) -> None:
    self.setUpPyfakefs()
    self.fs.create_dir(Path(Dirs.VIDEOS))  # /home/user/code/project/test/DLV/videos
    self.fs.create_file(Path(Dirs.VIDEOS / "vid_youtube.mp4"))
The code under test.
class Files:
    @staticmethod
    def rename_video_files():
        all_files = Collect.video_files()
        for files_for_type in all_files:
            for file in all_files[files_for_type]:
                path = Path(file)
                platform = Files.detect_platform(path)
                platform_dir = Path(Dirs.VIDEOS, platform)
                platform_dir.mkdir(exist_ok=True)
                new_name = path.stem.replace(f'_{platform}', '')
                new_path = Dirs.VIDEOS / platform / f'{new_name}{path.suffix}'
                old_path = Dirs.VIDEOS / path
                old_path.rename(new_path)  # throws FileNotFoundError
I debugged the test and the method under test and even passed the fake fs to rename_video_files(fakefs) to inspect the files and directories. All files and directories look correct.
What is going wrong here?
The problem here is most likely the static initialization of Dirs.VIDEOS. It is initialized at load time as a pathlib.Path and therefore won't be patched later, at the time you set up pyfakefs (the same problem would happen if you were to use unittest.patch for patching).
There are two ways to fix this:
adapt the code to not initialize the path statically
This could be done by statically defining the path as a str and converting it to a Path at run time, or by using a method to get the path instead of an attribute (e.g. Dirs.VIDEOS() instead of Dirs.VIDEOS); a sketch of this follows below.
adapt the test to reload the tested code
If the tested code is reloaded after pyfakefs has been initialized, it will be correctly patched. pyfakefs provides an argument to setUpPyfakefs that does that:
from pyfakefs.fake_filesystem_unittest import TestCase
from my_module import video_files
from my_module.video_files import Dirs, Files

class MyTest(TestCase):
    def setUp(self) -> None:
        self.setUpPyfakefs(modules_to_reload=[video_files])
        self.fs.create_dir(Path(Dirs.VIDEOS))  # /home/user/code/project/test/DLV/videos
        self.fs.create_file(Path(Dirs.VIDEOS / "vid_youtube.mp4"))
(under the assumption that your code under test is located in my_module/video_files.py)
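A minimal sketch of the first option (keeping a str and building the Path at run time); the names mirror the question, and the exact layout of Dirs is assumed:

from pathlib import Path

class Dirs:
    # the plain string can safely be built at import time
    VIDEOS_DIR = "/home/user/code/project/test/DLV/videos"

    @classmethod
    def videos(cls) -> Path:
        # the Path object is created at call time, after pyfakefs has patched pathlib
        return Path(cls.VIDEOS_DIR)

# the code under test would then use Dirs.videos() / "vid_youtube.mp4" instead of Dirs.VIDEOS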
Disclaimer:
I'm a contributor to pyfakefs.

Is there a "proper" way to load config files using ConfigParser? Or, is there an equivalent to logging.getLogger for config files?

I have a particularly large python project that has modules upon modules and classes that call other classes and so on. It's an organized mess.
I want to be able to just read the config file once and get/set key-value pairs out of it from any part of my program.
Right now, my setup looks like this: I have a module with these functions
def initialize():
    config_file = configparser.ConfigParser()
    config_file_path = os.path.join(Path(__file__).resolve().parents[2], 'config.ini')
    try:
        config_file.read(config_file_path)
    except FileNotFoundError:
        raise FileNotFoundError
    else:
        return config_file, config_file_path

def get_config_data(section, key):
    config_file, _ = initialize()
    return config_file[section][key]

def set_config_data(section, key, value):
    config_file, config_file_path = initialize()
    config_file.set(section, key, str(value))
    with open(config_file_path, 'w') as f:
        config_file.write(f)
Whenever I need a config key-value pair, I just import the module as CFG and use CFG.get_config_data(SECTION, KEY), which means I have to run initialize every single time I need something. I don't think that's ideal (or is it? I genuinely don't know).
Is there a "proper and standard" method for reading config files in large python projects? Something that I can just import and get in the beginning? Or there's nothing wrong with my set-up as it is?

Is it possible to mock os.scandir and its attributes?

for entry in os.scandir(document_dir):
    if os.path.isdir(entry):
        # some code goes here
    else:
        # else the file needs to be in a folder
        file_path = entry.path.replace(os.sep, '/')
I am having trouble mocking os.scandir and the path attribute within the else statement. I am not able to mock the mock object's property I created in my unit tests.
with patch("os.scandir") as mock_scandir:
# mock_scandir.return_value = ["docs.json", ]
# mock_scandir.side_effect = ["docs.json", ]
# mock_scandir.return_value.path = PropertyMock(return_value="docs.json")
These are all the options I've tried. Any help is greatly appreciated.
It depends on what you really need to mock. The problem is that os.scandir returns entries of type os.DirEntry. One possibility is to use your own mock DirEntry and implement only the methods that you need (in your example, only path). For your example, you also have to mock os.path.isdir. Here is a self-contained example of how you can do this:
import os
from unittest.mock import patch

def get_paths(document_dir):
    # example function containing your code
    paths = []
    for entry in os.scandir(document_dir):
        if os.path.isdir(entry):
            pass
        else:
            # else the file needs to be in a folder
            file_path = entry.path.replace(os.sep, '/')
            paths.append(file_path)
    return paths

class DirEntry:
    # stand-in for os.DirEntry; only the path attribute is needed by the code under test
    def __init__(self, path):
        self.path = path

@patch("os.scandir")
@patch("os.path.isdir")
def test_sut(mock_isdir, mock_scandir):
    mock_isdir.return_value = False
    mock_scandir.return_value = [DirEntry("docs.json")]
    assert get_paths("anydir") == ["docs.json"]
Depending on your actual code, you may have to do more.
If you want to patch more file system functions, you may consider using pyfakefs instead, which patches the whole file system. This would be overkill for a single test, but can be handy for a test suite relying on file system functions.
Disclaimer: I'm a contributor to pyfakefs.
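As a rough illustration of that alternative, the same get_paths function could be exercised with pyfakefs' pytest fixture fs instead of hand-written mocks (the directory and file names are invented, and a POSIX-style path is assumed):

# requires the pyfakefs pytest plugin, which provides the fs fixture
def test_get_paths_with_pyfakefs(fs):
    fs.create_file("/anydir/docs.json")  # build the fake file system
    assert get_paths("/anydir") == ["/anydir/docs.json"]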

Sharing Variables across new object instances

Background:
I am writing a module in order to set up an embedded system. In this context I need to load some modules and perform some system settings.
Context:
I have a parent class holding some general code (loading the config file, building the ssh connection, etc.) used by several child classes. One of them is the module class, which sets up the module and therefore uses, among other things, the ssh connection and the configuration file.
My goal is to share the configuration file and the connection with the next module that will be set up. For the connection it's just a waste to build and destroy it all the time, and for the configuration file, changes during setup can lead to undefined behavior.
Research/ approaches:
I tried using class variables, however they aren't passed when initiating a new module object.
Further, I tried using global variables, but since the parent class and the child classes are in different files, this won't work (yes, I can put them all in one file, but that would be a mess). Using a getter function from the file where I defined the global variable didn't work either.
I am aware of the 'builtin' solution from How to make a cross-module variable?, but I feel this would be a bit of an overkill...
Finally, I can keep the config file and the connection in a central script and pass them to each of the instances, but this will lead to loads of dependencies and I don't think it's a good solution.
So here is a bit of code with an example method to get some file paths. The code is set up according to approach 1 (class variables).
An example config file:
Files:
  Core:
    Backend:
      - 'file_1'
      - 'file_2'
Local:
  File_path:
    Backend: '/the/path/to'
The Parent class in setup_class.py
import os
import abc
import yaml

class setup(object):
    __metaclass__ = abc.ABCMeta
    configuration = []

    def check_for_configuration(self, config_file):
        if not self.configuration:
            with open(config_file, "r") as config:
                self.configuration = yaml.safe_load(config)

    def get_configuration(self):
        return self.configuration

    def _make_file_list(self, path, names):
        full_file_path = []
        for file_name in names:
            temp_path = path + '/' + file_name
            temp_path = temp_path.split('/')
            full_file_path.append(os.path.join(*temp_path))
        return full_file_path

    @abc.abstractmethod
    def install(self):
        raise NotImplementedError
The module class in module_class.py
from setup_class import setup
class module(setup):
def __init__(self, module_name, config_file = ''):
self.name = module_name
self.check_for_configuration(config_file)
def install(self):
self._backend()
def _backend(self):
files = self._make_file_list(
self.configuration['Local']['File_path']['Backend'],
self.configuration['Files'][self.name]['Backend'])
if files:
print files
And finally a test script:
from module_class import module
Analysis = module('Analysis',"./example.yml")
Core = module('Core', "./example.yml")
Core.install()
Now, when running the code, the config file is loaded every time a new module object is initiated. I would like to avoid this. Are there approaches I have not considered? What's the neatest way to achieve this?
Save your global values in a global dict, and refer to that inside your module.
cache = {}

class Cache(object):
    def __init__(self):
        global cache
        self.cache = cache
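Applied to the configuration file from the question, that could look roughly like this (module and function names are made up); every setup/module instance then shares one loaded dict:

# config_cache.py -- hypothetical module holding the shared state
import yaml

_cache = {}

def load_configuration(config_file):
    # the YAML file is parsed only once per path; later calls reuse the cached dict
    if config_file not in _cache:
        with open(config_file, "r") as config:
            _cache[config_file] = yaml.safe_load(config)
    return _cache[config_file]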

Creating a new TemporaryDirectory object doesn't create the directory

Case 1:
The directory 'C:\Users\jim\AppData\Local\Temp\tmp9lf9xalc' is created.
In [1]:
from tempfile import TemporaryDirectory
temp_dir = TemporaryDirectory()
temp_dir.name
Out [1]:
'C:\\Users\\jim\\AppData\\Local\\Temp\\tmp9lf9xalc'
Case 2:
The directory 'C:\Users\jim\AppData\Local\Temp\tmpm861vgbn' is NOT created.
In [2]:
from tempfile import TemporaryDirectory
temp_dir = TemporaryDirectory().name
temp_dir
Out [2]:
'C:\\Users\\jim\\AppData\\Local\\Temp\\tmpm861vgbn'
I don't understand why in Case 2 the directory is not created.
The source code of TemporaryDirectory is as follows. It's located at ..\Anaconda3\envs\my_env\Lib\tempfile.py
class TemporaryDirectory(object):
    """Create and return a temporary directory. This has the same
    behavior as mkdtemp but can be used as a context manager. For
    example:

        with TemporaryDirectory() as tmpdir:
            ...

    Upon exiting the context, the directory and everything contained
    in it are removed.
    """

    def __init__(self, suffix=None, prefix=None, dir=None):
        self.name = mkdtemp(suffix, prefix, dir)
        self._finalizer = _weakref.finalize(
            self, self._cleanup, self.name,
            warn_message="Implicitly cleaning up {!r}".format(self))

    @classmethod
    def _cleanup(cls, name, warn_message):
        _shutil.rmtree(name)
        _warnings.warn(warn_message, ResourceWarning)

    def __repr__(self):
        return "<{} {!r}>".format(self.__class__.__name__, self.name)

    def __enter__(self):
        return self.name

    def __exit__(self, exc, value, tb):
        self.cleanup()

    def cleanup(self):
        if self._finalizer.detach():
            _shutil.rmtree(self.name)
As the doc string says, you’re supposed to use with:
with TemporaryDirectory() as tmpdir:
    loc = tmpdir  # __enter__ returns the directory name as a str, not the object
    # ...
Then it knows when you’re done with the directory and removes it for you. As a backup, it also cleans up when the TemporaryDirectory object is destroyed, issuing a ResourceWarning because that behavior and its timing cannot be guaranteed across Python implementations.
This backup happens immediately (for CPython) in your second case, since you kept no reference to the TemporaryDirectory, so the directory is removed as soon as it is created.
Accessing .name and keeping only the string leaves no reference to the TemporaryDirectory instance itself, so it is garbage-collected, which triggers the cleanup. To fix it, first assign the object and then read .name from it, as in your first example.
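In other words, a fixed version of Case 2 only needs to keep the object alive; a minimal sketch:

from tempfile import TemporaryDirectory

temp_dir = TemporaryDirectory()   # keep a reference so the directory is not cleaned up yet
path = temp_dir.name              # the directory still exists here
# ... use path ...
temp_dir.cleanup()                # remove it explicitly (or rely on the finalizer)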
If the only thing you want is for a directory to be created and to get its name as a string, you can call the lower-level mkdtemp function, which simply creates the directory and returns its name.
It is then on you (or possibly the operating system, depending on where the temporary directory lands) to clean it up.
import shutil
from tempfile import mkdtemp

temp_dir = mkdtemp()
try:
    ...some stuff...
finally:
    shutil.rmtree(temp_dir)
