I am looking to use a .yaml file to manage several global parameters for a program. I would prefer to manage this from within a function, something like the code below. However, it seems globals().update() does not work when called inside a function. Additionally, given the need to load an indeterminate number of variables with unknown names, the basic global statement approach is not appropriate. Ideas?
.yaml
test:
  - 12
  - 13
  - 14
  - stuff:
      john
test2: yo
Python
import os
import yaml

def load_config():
    with open(os.path.join(os.getcwd(), {file}), 'r') as reader:
        vals = yaml.full_load(reader)
    globals().update(vals)
Desired output
load_config()
test
---------------
[12,13,14,{'stuff':'john'}]
test2
---------------
yo
What I get
load_config()
test
---------------
NameError: name 'test' is not defined
test2
---------------
NameError: name 'test2' is not defined
Please note: {file} is for you, the code is not actually written that way. Also note that I understand the use of global is not normally recommended, however it is what is required for the answer of this question.
You had {file} in your code; I've assumed that was intended to just be a string holding the actual filename. I certainly hope you weren't looking to .format() and then eval() this code? That would be a very bad and unsafe way to run code.
Just return the dictionary vals itself, and access it as needed:
import os
import yaml

def load_config(fn):
    with open(os.path.join(os.getcwd(), fn), 'r') as reader:
        # only returning the value, so doing it in one step:
        return yaml.full_load(reader)

cfg = load_config('test.yaml')
print(cfg)
print(cfg['test2'])
Output:
{'test': [12, 13, 14, {'stuff': 'john'}], 'test2': 'yo'}
yo
You should definitely never just update globals() with content from an external file. Use of globals() is only for very specific use cases anyway.
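If you want attribute-style access without touching globals(), one option is to wrap the parsed mapping in a types.SimpleNamespace. This is a minimal sketch of my own, not part of the original answer:

import types
import yaml

def load_config(fn):
    with open(fn, 'r') as reader:
        # wrap the parsed mapping so you can write cfg.test2 instead of cfg['test2']
        return types.SimpleNamespace(**yaml.full_load(reader))

cfg = load_config('test.yaml')
print(cfg.test2)  # yo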
Getting the exact desired output is just a matter of formatting the contents of the dictionary:
import os
import yaml

def load_config(fn):
    with open(os.path.join(os.getcwd(), fn), 'r') as reader:
        return yaml.full_load(reader)

def print_config(d):
    for k, v in d.items():
        print(f'{k}\n---------------\n{v}\n')

cfg = load_config('test.yaml')
print_config(cfg)
Which gives exactly the output you described.
Note that this is technically superfluous:
os.path.join(os.getcwd(), fn)
By default, file operations are executed on the current working directory, so you'd achieve the same with:
def load_config(fn):
    with open(fn, 'r') as reader:
        return yaml.full_load(reader)
If you wanted to open the file in the same folder as the script itself, consider this instead:
def load_config(fn):
    with open(os.path.join(os.path.dirname(__file__), fn), 'r') as reader:
        return yaml.full_load(reader)
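Equivalently, with pathlib (a stylistic variant of the same idea, not from the original answer):

from pathlib import Path
import yaml

def load_config(fn):
    # resolve fn relative to the directory containing this script
    with open(Path(__file__).resolve().parent / fn, 'r') as reader:
        return yaml.full_load(reader)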
Related
I have a function in Python that, when called for the first time, reads the content of a file into a list and checks whether or not an element is within that list.
def is_in_file(element, path):
    with open(path, 'r') as f:
        lines = [line.strip() for line in f.readlines()]
    return element in lines
When the function is called again, however, the content of the file should not be read again; the function should instead remember the value of lines from the first call.
Is there a way to preserve the context of a function when calling it again? I don't want to make lines global, so as not to litter the enclosing namespace. I guess it's quite similar to the use of a generator and the yield statement...
My opinion is that the correct way is to encapsulate this in a class. The path is set at instance creation, and method calls use the list of lines. That way you can even have different files at the same time:
class finder:
    def __init__(self, path):
        with open(path, 'r') as f:
            self.lines = [line.strip() for line in f]

    def is_in_file(self, element):
        return element in self.lines
That is not exactly what you have asked for, but is much more OO.
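For illustration, hypothetical usage with two files at once (the file names here are made up):

# two independent files open at the same time
words = finder('words.txt')
names = finder('names.txt')
print(words.is_in_file('hello'))
print(names.is_in_file('Alice'))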
Dirty hack: add an attribute to the function object and store the value there.
def is_in_file(element, path):
    if not hasattr(is_in_file, "__lines__"):
        with open(path, 'r') as f:
            setattr(is_in_file, "__lines__", [line.strip() for line in f.readlines()])
    return element in is_in_file.__lines__
You could save the lines in a keyword argument declared with a mutable default value:
def is_in_file(element, path, lines=[]):
    if lines:
        return element in lines
    with open(path, 'r') as f:
        lines += [line.strip() for line in f.readlines()]
    return element in lines
Caveat:
you must be sure that this function is only ever called with one file; if you call it with a second file, it will not open it and will continue to return values based on the first file opened.
A more flexible solution:
A more flexible approach might be to use a dictionary of lines, where each new file is opened once and stored using its path as the key; you can then call the function with different files and get the correct results while memoizing the contents.
def is_in_file(element, path, all_lines={}):
    try:
        return element in all_lines[path]
    except KeyError:
        with open(path, 'r') as f:
            all_lines[path] = [line.strip() for line in f.readlines()]
        return element in all_lines[path]
OO solution:
Create a class to encapsulate the content of a file, like what @SergeBallesta proposed; although it does not address exactly what you requested, it is likely the better solution in the long run.
Use the functools.lru_cache decorator to set up a helper function that reads in any given file only once and then stores the result.
from functools import lru_cache

@lru_cache(maxsize=1)
def read_once(path):
    with open(path) as f:
        print('reading {} ...'.format(path))
        return [line.strip() for line in f]

def in_file(element, path):
    return element in read_once(path)
Demo:
>>> in_file('3', 'file.txt')
reading file.txt ...
True
>>> in_file('3', 'file.txt')
True
>>> in_file('3', 'anotherfile.txt')
reading anotherfile.txt ...
False
>>> in_file('3', 'anotherfile.txt')
False
This has the serious advantage that in_file does not have to be called with the same file name every time.
You can adjust the maxsize argument to a higher number if you want more than one file to be cached at any given time.
Lastly: consider a set for the return value of read_once if membership tests are all you are interested in.
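A set-based variant might look like this (a sketch; the unbounded cache and the set comprehension are my choices, not the original answer's):

from functools import lru_cache

@lru_cache(maxsize=None)  # cache every file ever read
def read_once(path):
    with open(path) as f:
        # a set gives O(1) average-time membership tests
        return {line.strip() for line in f}

def in_file(element, path):
    return element in read_once(path)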
This answer proposes a class similar to Serge Ballesta's idea.
The difference is that it feels entirely like a function, because we use its __call__ method instead of dot notation to conduct the search.
In addition, you can add as many searchable files as you want.
Setup:
class in_file:
    def __init__(self):
        self.files = {}

    def add_path(self, path):
        with open(path) as f:
            self.files[path] = {line.strip() for line in f}

    def __call__(self, element, path):
        if path not in self.files:
            self.add_path(path)
        return element in self.files[path]

in_file = in_file()
Usage:
$ cat file1.txt
1
2
3
$ cat file2.txt
hello
$ python3 -i demo.py
>>> in_file('1', 'file1.txt')
True
>>> in_file('hello', 'file1.txt')
False
>>> in_file('hello', 'file2.txt')
True
I'm using pytest and want to test that a function writes some content to a file. So I have writer.py which includes:
MY_DIR = '/my/path/'

def my_function():
    with open('{}myfile.txt'.format(MY_DIR), 'w+') as file:
        file.write('Hello')
        file.close()
I want to test /my/path/myfile.txt is created and has the correct content:
import writer

class TestFile(object):
    def setup_method(self, tmpdir):
        self.orig_my_dir = writer.MY_DIR
        writer.MY_DIR = tmpdir

    def teardown_method(self):
        writer.MY_DIR = self.orig_my_dir

    def test_my_function(self):
        writer.my_function()
        # Test the file is created and contains 'Hello'
But I'm stuck with how to do this. Everything I try, such as something like:
import os
assert os.path.isfile('{}myfile.txt'.format(writer.MY_DIR))
Generates errors which lead me to suspect I'm not understanding or using tmpdir correctly.
How should I test this? (If the rest of how I'm using pytest is also awful, feel free to tell me that too!)
I've got a test to work by altering the function I'm testing so that it accepts a path to write to. This makes it easier to test. So writer.py is:
MY_DIR = '/my/path/'

def my_function(my_path):
    # This currently assumes the path to the file exists.
    with open(my_path, 'w+') as file:
        file.write('Hello')

if __name__ == '__main__':
    # guarded so that importing writer from the test doesn't trigger a write
    my_function(my_path='{}myfile.txt'.format(MY_DIR))
And the test:
import writer

class TestFile(object):
    def test_my_function(self, tmpdir):
        test_path = tmpdir.join('/a/path/testfile.txt')
        writer.my_function(my_path=test_path)
        assert test_path.read() == 'Hello'
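For comparison, a minimal sketch of the same test using pytest's built-in tmp_path fixture (a pathlib.Path; this variant is mine, not the answer's):

import writer

def test_my_function(tmp_path):
    test_path = tmp_path / 'testfile.txt'
    writer.my_function(my_path=str(test_path))
    assert test_path.read_text() == 'Hello'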
Currently, I have a file called utils.py where I keep all my functions and another file called main.py.
In my utils file, I have two functions that load and save to a JSON file, along with a bunch of other functions that edit the data.
def save_league(league_name, records):
    with open('%s.json' % league_name, 'w') as f:
        f.write(json.dumps(records))

def load_league(league_name):
    with open('%s.json' % league_name, 'r') as f:
        content = f.read()
    records = json.loads(content)
    return records
I am trying to add optional arguments for the save_league function by changing the function to:
def save_league(name=league_name, r=records):
    with open('%s.json' % name, 'w') as f:
        f.write(json.dumps(r))
This way the file will save just from save_league().
However, when I try to import a function with optional arguments in main.py, I get a name error because the default arguments are not set at the beginning.
NameError: name 'league_name' is not defined
Is it possible to import functions with optional args into another file, or do I have to combine the two files into one?
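Default argument values are evaluated once, when the def statement runs, which is why the import fails. A common workaround is to use None as a sentinel and resolve the real defaults at call time. A sketch, assuming league_name and records exist as module-level names by the time the function is actually called:

import json

def save_league(name=None, r=None):
    # defaults are resolved at call time, not at definition time,
    # so importing this module no longer raises NameError
    if name is None:
        name = league_name
    if r is None:
        r = records
    with open('%s.json' % name, 'w') as f:
        f.write(json.dumps(r))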
I would like to be able to use a list in a file to 'upload' code into the program.
NotePad file:
savelist = ["Example"]
namelist = ["Example2"]
Python Code:
with open("E:/battle_log.txt", 'rb') as f:
    gamesave = savelist[(name)](f)
    name1 = namelist[(name)](f)
    print("Welcome back " + name1 + "! I bet you missed this adventure!")
    f.close()
print savelist
print namelist
I would like this to be the output:
Example
Example2
It looks like you're trying to serialize a program state, then re-load it later! You should consider using a database instead, or even simply pickle:
import pickle

savelist = ["Example"]
namelist = ["Example2"]

# save data
obj_to_pickle = (savelist, namelist)
with open("path/to/savefile.pkl", 'wb') as p:
    pickle.dump(obj_to_pickle, p)

# load data
with open('path/to/savefile.pkl', 'rb') as p:
    obj_from_pickle = pickle.load(p)
savelist, namelist = obj_from_pickle
There are several options:
Save your notepad file with the .py extension and import it. As long as it contains valid python code, everything will be accessible
Load the text as a string and execute it (e.g., via eval())
Store the information in an easy-to-read configuration file (e.g., YAML) and parse it when you need it (see the sketch after this list)
Precompute the data and store it in a pickle file
The first two are risky if you don't have control over who provides the file, since someone could insert malicious code into it.
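For the third option, a minimal sketch (assuming a config.yaml holding the two lists; the names are illustrative):

import yaml

with open('config.yaml') as f:
    data = yaml.safe_load(f)

savelist = data['savelist']
namelist = data['namelist']
print(savelist[0], namelist[0])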
You could simply import it, as long as the file is in the same folder as the one your program is in. Note that only .py files can be imported, so you would first rename example.txt to example.py. Kinda like this:
import example
or:
from example import *
Then access it through one of two ways. The first one:
print(example.savelist[0])
print(example.namelist[0])
The second way (after the star import):
print(savelist[0])
print(namelist[0])
Is there a way of getting the docstring of a Python file if I have only the name of the file? For instance, I have a Python file named a.py. I know that it has a docstring (that being mandated beforehand), but I don't know its internal structure, i.e. whether it has any classes or a main etc. I hope I'm not forgetting something pretty obvious.
If I know it has a main function, I can do it this way using import:
filename = 'a.py'
foo = __import__(filename)
filedescription = inspect.getdoc(foo.main())
I can't just do it this way:
filename.__doc__ #it does not work
You should be doing...
foo = __import__('a')
mydocstring = foo.__doc__
or yet simpler...
import a
mydocstring = a.__doc__
If you want the docstring without importing (and therefore executing) the file, you can parse the source with the ast module:
import ast

filepath = "/tmp/test.py"
file_contents = ""
with open(filepath) as fd:
    file_contents = fd.read()

module = ast.parse(file_contents)
docstring = ast.get_docstring(module)
if docstring is None:
    docstring = ""
print(docstring)
And if you need the docstring of the module you are already in:
import sys
sys.modules[__name__].__doc__