I'm using pytest and want to test that a function writes some content to a file. So I have writer.py which includes:
MY_DIR = '/my/path/'

def my_function():
    with open('{}myfile.txt'.format(MY_DIR), 'w+') as file:
        file.write('Hello')
        file.close()
I want to test /my/path/myfile.txt is created and has the correct content:
import writer

class TestFile(object):

    def setup_method(self, tmpdir):
        self.orig_my_dir = writer.MY_DIR
        writer.MY_DIR = tmpdir

    def teardown_method(self):
        writer.MY_DIR = self.orig_my_dir

    def test_my_function(self):
        writer.my_function()
        # Test the file is created and contains 'Hello'
But I'm stuck with how to do this. Everything I try, such as something like:
import os
assert os.path.isfile('{}myfile.txt'.format(writer.MY_DIR))
generates errors which lead me to suspect I'm not understanding or using tmpdir correctly.
How should I test this? (If the rest of how I'm using pytest is also awful, feel free to tell me that too!)
I've got a test to work by altering the function I'm testing so that it accepts a path to write to. This makes it easier to test. So writer.py is:
MY_DIR = '/my/path/'

def my_function(my_path):
    # This currently assumes the path to the file exists.
    with open(my_path, 'w+') as file:
        file.write('Hello')

my_function(my_path='{}myfile.txt'.format(MY_DIR))
And the test:
import writer

class TestFile(object):

    def test_my_function(self, tmpdir):
        test_path = tmpdir.join('testfile.txt')
        writer.my_function(my_path=test_path)
        assert test_path.read() == 'Hello'
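If you would rather keep the original my_function() (the one that reads writer.MY_DIR itself), here is a hedged sketch using pytest's built-in monkeypatch and tmp_path fixtures; it assumes MY_DIR always ends with a slash, as in the example above:

import writer

def test_my_function_with_patched_dir(tmp_path, monkeypatch):
    # temporarily point the module-level constant at a pytest temp directory
    monkeypatch.setattr(writer, 'MY_DIR', str(tmp_path) + '/')
    writer.my_function()
    assert (tmp_path / 'myfile.txt').read_text() == 'Hello'

monkeypatch undoes the change after the test, so no teardown code is needed.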
I am looking to use a .yaml file to manage several global parameters for a program. I would prefer to manage this from within a function, something like the below. However, it seems globals().update() does not work when called inside a function. Additionally, given the need to load an indeterminate number of variables with unknown names, using the basic global approach is not appropriate. Ideas?
.yaml
test:
  - 12
  - 13
  - 14
  - stuff:
      john
test2: yo
Python
import os
import yaml

def load_config():
    with open(os.path.join(os.getcwd(), {file}), 'r') as reader:
        vals = yaml.full_load(reader)
        globals().update(vals)
Desired output
load_config()
test
---------------
[12,13,14,{'stuff':'john'}]
test2
---------------
yo
What I get
load_config()
test
---------------
NameError: name 'test' is not defined
test2
---------------
NameError: name 'test2' is not defined
Please note: {file} is a placeholder for you, the reader; the code is not actually written that way. Also note that I understand the use of global is not normally recommended; however, it is what is required for the answer to this question.
You had {file} in your code; I've assumed that was intended to just be a string of the actual filename. I certainly hope you weren't looking to .format() and then eval() this code? That would be a very bad and unsafe way to run code.
Just return the dictionary vals itself, and access it as needed:
import os
import yaml

def load_config(fn):
    with open(os.path.join(os.getcwd(), fn), 'r') as reader:
        # only returning the value, so doing it in one step:
        return yaml.full_load(reader)

cfg = load_config('test.yaml')
print(cfg)
print(cfg['test2'])
Output:
{'test': [12, 13, 14, {'stuff': 'john'}], 'test2': 'yo'}
yo
You should definitely never just update globals() with content from an external file. Use of globals() is only for very specific use cases anyway.
Getting the exact desired output is just a matter of formatting the contents of the dictionary:
import os
import yaml

def load_config(fn):
    with open(os.path.join(os.getcwd(), fn), 'r') as reader:
        return yaml.full_load(reader)

def print_config(d):
    for k, v in d.items():
        print(f'{k}\n---------------\n{v}\n')

cfg = load_config('test.yaml')
print_config(cfg)
Which gives exactly the output you described.
Note that this is technically superfluous:
os.path.join(os.getcwd(), fn)
By default, file operations are executed on the current working directory, so you'd achieve the same with:
def load_config(fn):
    with open(fn, 'r') as reader:
        return yaml.full_load(reader)
If you wanted to open the file in the same folder as the script itself, consider this instead:
def load_config(fn):
    with open(os.path.join(os.path.dirname(__file__), fn), 'r') as reader:
        return yaml.full_load(reader)
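As a side note, if what you were really after was attribute-style access to the config values rather than true globals, here is a small sketch using the standard library's types.SimpleNamespace; it assumes the same test.yaml as above:

import os
import yaml
from types import SimpleNamespace

def load_config(fn):
    with open(os.path.join(os.getcwd(), fn), 'r') as reader:
        # wrap the parsed dict so values can be read as attributes
        return SimpleNamespace(**yaml.full_load(reader))

cfg = load_config('test.yaml')
print(cfg.test)   # [12, 13, 14, {'stuff': 'john'}]
print(cfg.test2)  # yo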
I have a function that loads data from a data.json file, defined in models.py as follows:
import json
from pathlib import Path

def load_data():
    file_path = Path(__file__).parent / 'data.json'
    with open(file_path, 'r') as file:
        data = json.load(file)['data']
    return data

loaded_data = load_data()
I use loaded_data throughout all functions defined in models.py. The data.json file contains a JSON array.
My test_models.py is as follows:
import unittest
from unittest.mock import patch
from models import ...  # a list of functions to test

# For replacing models.load_data()
mock_data = []

def get_mock_data():
    return mock_data

@patch('models.load_data', side_effect=get_mock_data)
class TestRestaurantsModel(unittest.TestCase):
However, somehow the real models.load_data still gets executed. I know this because I changed the file_path to randomabc.json and got FileNotFoundError. How do I prevent the execution of models.load_data? I do not really need to mock models.load_data; I just need to prevent its execution during the test and assign mock data to models.loaded_data.
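For what it's worth, a minimal sketch of one way to hand the tests known data without touching load_data itself; it assumes data.json exists so that importing models still succeeds, and that the functions under test read models.loaded_data rather than calling load_data again:

import unittest
from unittest.mock import patch

import models

mock_data = []

class TestRestaurantsModel(unittest.TestCase):
    def setUp(self):
        # swap the module-level data for the duration of each test
        patcher = patch.object(models, 'loaded_data', mock_data)
        patcher.start()
        self.addCleanup(patcher.stop)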
import nltk
import pickle

input_file = open('file.txt', 'r')
input_datafile = open('newskills1.txt', 'r')

string = input_file.read()
fp = input_datafile.read().splitlines()

def extract_skills(string):
    skills = pickle.load(fp)
    skill_set = []
    for skill in skills:
        skill = '' + skill + ''
        if skill.lower() in string:
            skill_set.append(skill)
    return skill_set

if __name__ == '__main__':
    skills = extract_skills(string)
    print(skills)
I want to print the skills from the file, but pickle is not working here.
It shows the error:
_pickle.UnpicklingError: the STRING opcode argument must be quoted
The file containing the pickled data must be written and read as a binary file. See the documentation for examples.
Your extraction function should look like:
def extract_skills(path):
    with open(path, 'rb') as inputFile:
        skills = pickle.load(inputFile)
    return skills
Of course, you will need to dump your data into a file opened as binary as well:
def save_skills(path, skills):
    with open(path, 'wb') as outputFile:
        pickle.dump(skills, outputFile)
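For reference, a quick round-trip sketch under the same assumptions (the file name here is made up):

import pickle

skills_to_save = ['python', 'sql', 'excel']

# write the list out as binary pickle data
with open('skills.pickle', 'wb') as outputFile:
    pickle.dump(skills_to_save, outputFile)

# read it back, again in binary mode
with open('skills.pickle', 'rb') as inputFile:
    print(pickle.load(inputFile))  # ['python', 'sql', 'excel']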
Additionally, the logic of your main seems a bit flawed.
While the code that follows if __name__ == '__main__' is only executed when the script is run as the main module, the code outside that guard should only be static, i.e. definitions.
Basically, your script should not do anything, unless run as main.
Here is a cleaner version.
import pickle

def extract_skills(path):
    ...

def save_skills(path, skills):
    ...

if __name__ == '__main__':
    inputPath = "skills_input.pickle"
    outputPath = "skills_output.pickle"

    skills = extract_skills(inputPath)
    # Modify skills
    save_skills(outputPath, skills)
So I wrote a class in a Python script like:
#!/usr/bin/python
import sys
import csv

filepath = sys.argv[1]

class test(object):
    def __init__(self, filepath):
        self.filepath = filepath

    def method(self):
        list = []
        with open(self.filepath, "r") as table:
            reader = csv.reader(table, delimiter="\t")
            for line in reader:
                list.append(line)
If I call this script from the command line, how am I able to call method?
Usually I enter: $ python test.py test_file
Now I just need to know how to access the class function called "method".
You'd create an instance of the class, then call the method:
test_instance = test(filepath)
test_instance.method()
Note that in Python you don't have to create classes just to run code. You could just use a simple function here:
import sys
import csv

def read_csv(filepath):
    rows = []
    with open(filepath, "r") as table:
        reader = csv.reader(table, delimiter="\t")
        for line in reader:
            rows.append(line)
    return rows

if __name__ == '__main__':
    read_csv(sys.argv[1])
where I moved the function call to a __main__ guard so that you can also use the script as a module and import the read_csv() function for use elsewhere.
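For example, another script could then do something like the following (a sketch; it assumes the script above is saved as test.py, and data.tsv is just a made-up tab-separated file):

from test import read_csv

rows = read_csv("data.tsv")  # hypothetical tab-separated input
print(len(rows))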
Open Python interpreter from the command line.
$ python
Import your python code module, make a class instance and call the method.
>>> import test
>>> instance = test.test('test_file')
>>> instance.method()
I've got a Python program that reads from sys.stdin, so I can call it with ./foo.py < bar.png. How do I test this code from within another Python module? That is, how do I set stdin to point to the contents of a file while running the test script? I don't want to do something like ./test.py < test.png. I don't think I can use fileinput, because the input is binary, and I only want to handle a single file. The file is opened using Image.open(sys.stdin) from PIL.
You should generalise your script so that it can be invoked from the test script, in addition to being used as a standalone program. Here's an example script that does this:
#! /usr/bin/python
import sys

def read_input_from(file):
    print file.read(),

if __name__ == "__main__":
    if len(sys.argv) > 1:
        # filename supplied, so read input from that
        filename = sys.argv[1]
        file = open(filename)
    else:
        # no filename supplied, so read from stdin
        file = sys.stdin
    read_input_from(file)
If this is called with a filename, the contents of that file will be displayed. Otherwise, input read from stdin will be displayed. (Being able to pass a filename on the command line might be a useful improvement for your foo.py script.)
In the test script you can now invoke the function in foo.py with a file, for example:
#! /usr/bin/python
import foo
file = open("testfile", "rb")
foo.read_input_from(file)
Your function or class should accept a stream instead of choosing which stream to use.
Your main function will choose sys.stdin.
Your test method will probably choose a StringIO instance or a test file.
The program:
# foo.py
import sys
from PIL import Image

def foo(stream):
    im = Image.open(stream)
    # ...

def main():
    foo(sys.stdin)

if __name__ == "__main__":
    main()
The test:
# test.py
import StringIO, unittest
import foo

class FooTest(unittest.TestCase):
    def test_foo(self):
        input_data = "...."
        input_stream = StringIO.StringIO(input_data)
        # can use a test file instead:
        # input_stream = open("test_file", "rb")
        result = foo.foo(input_stream)
        # asserts on result

if __name__ == "__main__":
    unittest.main()
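If you are on Python 3, note that the StringIO module is gone and, since the input is binary image data, you would reach for io.BytesIO instead; a hedged sketch of the same test:

# test.py (Python 3 variant)
import io
import unittest

import foo

class FooTest(unittest.TestCase):
    def test_foo(self):
        # load the test image as bytes and wrap it in an in-memory binary stream
        with open("test_file", "rb") as f:
            input_stream = io.BytesIO(f.read())
        result = foo.foo(input_stream)
        # asserts on result

if __name__ == "__main__":
    unittest.main()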
A comp.lang.python post showed the way: Substitute a StringIO() object for sys.stdout, and then get the output with getvalue():
def setUp(self):
    """Set stdin and stdout."""
    self.stdin_backup = sys.stdin
    self.stdout_backup = sys.stdout
    self.output_stream = StringIO()
    sys.stdout = self.output_stream
    self.output_file = None

def test_standard_file(self):
    sys.stdin = open(EXAMPLE_PATH)
    foo.main()
    self.assertNotEqual(
        self.output_stream.getvalue(),
        '')

def tearDown(self):
    """Restore stdin and stdout."""
    sys.stdin = self.stdin_backup
    sys.stdout = self.stdout_backup
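On Python 3 the stdout half of this can also be handled by contextlib.redirect_stdout, which restores sys.stdout for you; a small sketch, assuming foo is already imported and sys.stdin has already been pointed at the test file as in test_standard_file above:

import io
from contextlib import redirect_stdout

buffer = io.StringIO()
with redirect_stdout(buffer):
    foo.main()
assert buffer.getvalue() != ''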
You can always monkey patch your stdin, but it is quite an ugly way to do it, so it is better to generalize your script as Richard suggested.
import sys
import StringIO

mockin = StringIO.StringIO()
mockin.write("foo")
mockin.flush()
mockin.seek(0)
setattr(sys, 'stdin', mockin)

def read_stdin():
    f = sys.stdin
    result = f.read()
    f.close()
    return result

print read_stdin()
Also, do not forget to restore stdin when tearing down your test.
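Reusing mockin and read_stdin from the snippet above, one hedged way to guarantee the restore is a try/finally (or the setUp/tearDown pair shown earlier):

real_stdin = sys.stdin
sys.stdin = mockin
try:
    result = read_stdin()
finally:
    # put the real stdin back no matter what happened above
    sys.stdin = real_stdin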