Say I want to grade some student python code using tests, something like (this is pseudo-code I wish I could write):
code = __import__("student_code")  # Import the code to be tested
grade = 100
for test in all_tests():        # Loop over the tests that were gathered
    good = perform(test, code)  # Perform the test individually on the code
    if not good:                # Do something if the code gives the wrong result
        grade -= 1
For that, I would like to use pytest (it makes good tests easy to write), but there are several things I don't know how to do:
how to run tests on external code? (here, the code imported from the student's file)
how to list all the tests available? (here all_tests())
how to run them individually on code? (here perform(test, code))
I couldn't find anything related to this use case (pytest.main() does not seem to do the trick anyhow...)
I hope you see my point, cheers!
EDIT
I finally found how to perform my 1st point (apply tests to external code). In the directory where you want to run the tests, create a conftest.py file with:
import imp  # Standard-library module (deprecated in Python 3; see the importlib note below)

import pytest

def pytest_addoption(parser):
    """Add a custom command-line option to py.test."""
    parser.addoption("--module", help="Code file to be tested.")

@pytest.fixture(scope='session')
def module(request):
    """Import code specified with the command-line custom option '--module'."""
    codename = request.config.getoption("--module")
    # Import module (standard __import__ does not support import by filename)
    try:
        code = imp.load_source('code', codename)
    except Exception:
        print("ERROR while importing {!r}".format(codename))
        raise
    return code
Then, gather your tests in a tests.py file, using the module fixture:
def test_sample(module):
    assert module.add(1, 2) == 3
Finally, run the tests with py.test tests.py --module student.py.
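(Side note for Python 3 users: the imp module is deprecated and removed in 3.12. The same fixture can be written with importlib instead; a sketch, using only the standard importlib API:)

import importlib.util

import pytest

@pytest.fixture(scope='session')
def module(request):
    """Import the code specified with the command-line option '--module'."""
    codename = request.config.getoption("--module")
    spec = importlib.util.spec_from_file_location('code', codename)
    code = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(code)  # runs the student file's top-level code
    return code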
I'm still working on points 2 and 3.
EDIT 2
I uploaded my (incomplete) take at this question:
https://gitlab.in2p3.fr/ycopin/pyTestExam
Help & contributions are welcome!
Very cool and interesting project. It's difficult to answer without knowing more.
Basically you should be able to do this by writing a custom plugin. Probably something you could place in a conftest.py in a test or project folder with your unittest subfolder and a subfolder for each student.
Probably would want to write two plugins:
One to allow weighting of tests (e.g. test_foo_10 and test_bar_5) and calculation of the final grade (e.g. 490/520); teamcity-messages is an example that uses the same hooks (a sketch follows after this list)
Another to allow distribution of tests to separate processes (xdist is an example)
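(For the weighting idea, a minimal conftest.py sketch, assuming a naming convention of test_<name>_<weight>; the convention and the output format are illustrative choices, not something teamcity-messages defines:)

import re

results = {}

def pytest_runtest_logreport(report):
    # record pass/fail once per test, at the 'call' phase
    if report.when == 'call':
        results[report.nodeid] = report.passed

def pytest_terminal_summary(terminalreporter):
    earned = total = 0
    for nodeid, passed in results.items():
        # take the weight from the trailing _<number>, defaulting to 1
        match = re.search(r'_(\d+)$', nodeid.split('::')[-1])
        weight = int(match.group(1)) if match else 1
        total += weight
        earned += weight if passed else 0
    terminalreporter.write_line("grade: {}/{}".format(earned, total))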
I know this is not a very complete answer, but I wanted to at least point out the last point. There is a very high probability that students will use overlapping module names, and those clash in a pytest world where tests are collected first and then run in a single process that tries not to re-import modules with a common name.
Even if you attempt to control for that, you will eventually have a student manipulate the global namespace in a sloppy way that causes another student's code to fail. You will, for that reason, need either a bash script that runs each student's file or a plugin that runs them in separate processes.
Make this use case a graded take-home exam and see what they come up with :-) ... naaah ... but you could ;-)
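(One low-tech way to get that separate-process isolation, sketched in Python rather than bash, reusing the --module option from the question's conftest.py; the students/ layout is an assumption:)

import pathlib
import subprocess

# run the shared test file once per student, each in a fresh interpreter,
# so module names and global state cannot leak between submissions
for student_file in sorted(pathlib.Path('students').glob('*.py')):
    print('==== grading', student_file.name, '====')
    subprocess.run(['py.test', 'tests.py', '--module', str(student_file)])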
I came up with something like this (where it is assumed that the sum function is the student code):
import unittest

score = 0

class TestAndGrade(unittest.TestCase):

    def test_correctness(self):
        self.assertEqual(sum([2, 2]), 4)
        global score; score += 6  # Increase score

    def test_edge_cases(self):
        with self.assertRaises(Exception):
            sum([2, 'hello'])
        global score; score += 1  # Increase score

    # Print the score
    @classmethod
    def tearDownClass(cls):
        global score
        print('\n\n-------------')
        print('| Score: {} |'.format(score))
        print('-------------\n')

# Run the tests
unittest.main()
Related
Probably related to: "globals and locals in python exec()", "Python 2 How to debug code injected by the exec block", and "How to get local variables updated, when using the `exec` call?"
I am trying to develop a test framework for our desktop applications which uses clickbot-like functions.
My goal was to enable non-programmers to write small scripts which could work as a test script. So my idea is to structure the test scripts by files like:
tests-folder
| -> 01-first-test.py
| -> 02-second-test.py
| ... etc
| -> fixture.py
And then just execute these scripts in alphabetical order. However, I also wanted to have fixtures which would define functions, classes and variables, and make them available to the different scripts without the scripts having to import that fixture explicitly. If that works, I could also use that approach for 2 or more directory levels.
I could get it working-ish with some hacking around, but I am not entirely convinced. I have a test_sequence.py which looks like this:
from pathlib import Path
from copy import deepcopy

from my_module.test import Failure

def run_test_sequence(test_dir: str) -> bool:
    error_occurred = False
    fixture = {
        'some_pre_defined_variable': 'this is available in all scripts and fixtures',
        'directory_name': test_dir,
    }

    # Check if fixture.py exists and load that first
    fixture_file = Path(test_dir) / 'fixture.py'
    if fixture_file.exists():
        with open(fixture_file.absolute(), 'r') as code:
            exec(code.read(), fixture, fixture)

    # Go over all files in the test sequence directory and execute them
    for test_file in sorted(Path(test_dir).iterdir()):
        if test_file.name == 'fixture.py':
            continue

        # Make a deepcopy, so scripts cannot influence one another
        fixture_copy = deepcopy(fixture)
        fixture_copy.update({
            'some_other_variable': 'this is available in all scripts but not in fixture'
        })

        try:
            with open(test_file.absolute(), 'r') as code:
                exec(code.read(), fixture_copy, fixture_copy)
        except Failure:
            error_occurred = True

    return error_occurred
This iterates over all files in the directory tests-folder and executes them in order (with fixture.py first). It also makes the local variables, functions and classes from fixture.py available to each test-script.
A test script could then just be arbitrary code that will be executed and if it raises my custom Failure exception, this will be noted as a failed test.
The whole sequence is started with a script that does
from my_module.test_sequence import run_test_sequence

if __name__ == '__main__':
    exit(run_test_sequence('tests-folder'))
This mostly works.
What it cannot do, and what leaves me unsatisfied with this approach:
I cannot debug the scripts themselves. Since the code is loaded as a string and then interpreted, breakpoints inside the test scripts are not recognized.
Calling fixture functions behaves weird. When I define a function in fixture.py like:
from my_hello_module import print_hello

def printer():
    print_hello()
I will receive a message during execution that print_hello is undefined because the variables/modules/etc. in the scope surrounding printer are lost.
Stack traces are useless. On failure it shows the stack trace, but of course it only shows my line which says exec(...) and the insides of that function, and none of the code that was loaded.
I am sure there are other drawbacks, that I have not found yet, but these are the most annoying ones.
I also tried to find a solution through __import__ but I couldn't get it to inject my custom locals or globals into the imported script.
Is there a solution that I am too inexperienced to find, or another built-in Python function that actually does what I am trying to do? Or is there no way to achieve this, and should I rather have each test script import the fixture and the file/directory names itself? I want those scripts to have as few dependencies and as little pythony code as possible. Ideally they are just:
from my_module.test import *

click(x, y, LEFT)
write('admin')
press('tab')
write('password')
press('enter')

if text_on_screen('Login successful'):
    succeed('Test successful')
else:
    fail('Could not login')
Additional note: I think I had the debugger working when I still used execfile, but that is not available in Python 3 environments.
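(For the debugging and stack-trace points, compiling the source with its real filename before exec() is the standard trick: tracebacks then name the script and line, and debuggers can map breakpoints back to the file. A sketch against the fixture_copy namespace from above:)

# compile with the script's real filename attached, so tracebacks and
# debuggers point at the test script instead of at a bare exec(...)
with open(test_file.absolute(), 'r') as code:
    compiled = compile(code.read(), str(test_file.absolute()), 'exec')
exec(compiled, fixture_copy, fixture_copy)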
I have the following project structure.
- README.rst
- LICENSE
- setup.py
- requirements.txt
- common_prj/__init__.py
- common_prj/common_lib.py
- common_prj/config.xml
- Project1/__init__.py
- Project1/some_app_m1.py
- Project1/some_app_m2.py
- Project1/some_app.py
- Project2/some_app1.py
I keep all my common classes in the common_prj/common_lib.py file. Each module file in Project1 imports common_prj/common_lib.py to access the common classes and to set up the project env using config.xml.
#import in some_app_m1.py
from common_prj import common_lib
#import in some_app_m2.py
from common_prj import common_lib
Imports in some_app.py
from Project1 import some_app_m1
from Project1 import some_app_m2
from common_prj import common_lib
###
<some functions>
<some functions>
###
if __name__ == '__main__':
With the above three imports it seems common_lib.py is getting executed multiple times, but if I keep common_lib.py in Project1 then I don't see this issue.
Please let me know how I can keep common_lib.py in the common package and call it from Project1 scripts without it executing multiple times. The purpose of common_lib.py is to share common classes with the scripts in Project2.
I have the below code in common_lib.py, and it keeps running again when classes are called from each module after import.
self.env = dict_env.get(int(input("Choose Database ENV for this execution : \n" + str(dict_env) + "\nSelect the numeric value => ")))
self.app = dict_app.get(int(input("Choose application for this execution : \n" + str(dict_app) + "\nSelect the numeric value => ")))
My question is why I am not facing this issue if I keep common_lib.py in Project1 rather than in common_prj. I added the above code to common_lib.py because I didn't want to repeat these lines and the ENV setup in all my app code. These are global env settings for all application scripts inside Project1.
output
Choose Database ENV for this execution :
{1: 'DEV', 2: 'SIT', 3: 'UAT', 4: 'PROD'}
Select the numeric value => 1
Choose application for this execution :
{1: 'app1', 2: 'app2', 3: 'app3', 4: 'app4', 5: 'app5', 6: 'app6'}
Select the numeric value => 1
Choose Database ENV for this execution : ### Here it is repeating again from common_lib.py
{1: 'DEV', 2: 'SIT', 3: 'UAT', 4: 'PROD'}
Select the numeric value => 1
Choose application for this execution : ### Here it is repeating again from common_lib.py
{1: 'app1', 2: 'app2', 3: 'app3', 4: 'app4', 5: 'app5', 6: 'app6'}
Select the numeric value => 1
Note: This is my first answer where I answer in-depth of the logistics of python. If I have said anything incorrect, please let me know or edit my answer (with a comment letting me know why). Please feel free to clarify any questions you may have too.
As the comments have discussed, if __name__ == '__main__': is exactly the way to go. You can check out the answers on the SO post for a full description, or check out the Python docs.
In essence, __name__ is a variable every Python file gets. Its behavior includes:
When a file is run directly, its __name__ is set to '__main__'.
When a file is imported by another (import foo), foo.__name__ evaluates to 'foo' instead.
Now, when importing a file, Python executes the new file completely, top to bottom. It compiles all the methods, variables, etc., and if it comes across a top-level call (e.g. hello_world()) it runs it. This happens whether you import the whole package or only a specific part of it.
Now, these two things are very important. If you have some trouble understanding the logic, you can take a look at some sample code I created.
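(A minimal two-file illustration of both behaviors; the file names are hypothetical:)

# foo.py
print('top-level code in foo runs on (first) import')

def greet():
    print('greet() runs only when called')

if __name__ == '__main__':
    # reached only by 'python foo.py', never by 'import foo'
    print('foo executed directly')

# main.py
import foo    # prints the top-level line; foo.__name__ == 'foo'
foo.greet()   # prints 'greet() runs only when called'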
Since you are importing from common_prj three times (from the two some_app_m* files and from some_app.py), you are triggering the top-level code in common_prj.
Although I'm unable to see your complete code, I'm assuming that common_lib.py instantiates its class at some point at the top level (at no indentation). This means that the two lines inside:
self.env = dict_env.get(int(input("Choose Database ENV for this execution : \n" + str(dict_env) + "\nSelect the numeric value => ")))
self.app = dict_app.get(int(input("Choose application for this execution : \n" + str(dict_app) + "\nSelect the numeric value => ")))
were also called.
So, in conclusion:
You imported common_prj 3 times
Which ran common_lib multiple times as well.
Solution:
This is where the __name__ part I talked about earlier comes into play.
Take all top-level calls that shouldn't run during importing, and place them under the conditional if __name__ == '__main__':.
This ensures the condition passes only when the common_prj file is run directly.
Again, if you didn't understand what I said, please feel free to ask or look at my sample code!
Edit: You can also just remove these top-level calls. Files that are meant to be imported use them only to make sure the functions work as intended. If you look at any of the packages that you import daily, you'll find that such calls are either inside the condition OR not in the file at all.
This is more of a question of library design, rather than the detailed mechanics of Python.
A library module (which is what common_prj is) should be designed to do nothing on import, only define functions, classes, constants etc. It can do things like pre-calculating constants, but it should be restricted to things that don't depend on anything external, produce the same results each time and don't involve much heavy computation. At that point, the code running three times will no longer be a problem.
If the library needs parameters, the caller should supply those, ideally in a way that allows multiple different sets of parameters to be used within the same program. Depending on the situation, the parameters can be a main object that manages the rest of the interaction, or parameters that are passed into each object constructor or function call, or a configuration object that's passed in. The library can provide a function to gather these parameters from the environment or user input, but it shouldn't be required; it should always be possible for the program to supply the parameters directly. This will also make tests easier to write.
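(Concretely, that might look like the sketch below; the class and function names are invented for illustration:)

# common_prj/common_lib.py -- definitions only; nothing runs on import
class Runner:
    def __init__(self, env, app):
        self.env = env
        self.app = app

def ask_user_for_params(dict_env, dict_app):
    """Optional helper: gather the parameters interactively."""
    env = dict_env[int(input("Choose Database ENV " + str(dict_env) + " => "))]
    app = dict_app[int(input("Choose application " + str(dict_app) + " => "))]
    return env, app

# Project1/some_app.py -- the entry point decides when (and whether) to ask
from common_prj import common_lib

if __name__ == '__main__':
    env, app = common_lib.ask_user_for_params(
        {1: 'DEV', 2: 'SIT', 3: 'UAT', 4: 'PROD'},
        {1: 'app1', 2: 'app2'})
    runner = common_lib.Runner(env, app)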
Side notes:
The name "common" is a bit generic; consider renaming the library after what it does, or splitting it into several libraries each of which does one thing?
Python will try to cache imported modules so it doesn't re-run them multiple times; this is probably why it only ran the code once in one situation but three times in another. This is another reason to have only definitions happen on import, so the details of the caching don't affect functionality.
It seems that after commenting out the line below in
Project1/__init__.py
the issue has been resolved. I was importing the same common_lib from __init__.py as well, which was causing this issue. Removing the entry below resolved it.
#from common_prj import commonlib
Please bear with me while I try to explain my predicament; I'm still a Python novice, so my terminology may not be correct. Also, I'm sorry for the inevitable long-windedness of this post, but I'll try to explain in as much relevant detail as possible.
A quick rundown:
I'm currently developing a suite of Selenium tests for a set of websites that are essentially the same in functionality, using py.test
Tests results are uploaded to TestRail, using the pytest plugin pytest-testrail.
Tests are tagged with the decorator @pytestrail.case(id) with a unique case ID
A typical test of mine looks like this:
@pytestrail.case('C100123')  # associates the function with the relevant TR case
@pytest.mark.usefixtures()
def test_login():
    pass  # test code goes here
As I mentioned before, I'm aiming to create one set of code that handles a number of our websites with (virtually) identical functionality, so a hardcoded decorator in the example above won't work.
I tried a data driven approach with a csv and a list of the tests and their case IDs in TestRail.
Example:
website1.csv:
Case ID | Test name
C100123 | test_login
website2.csv:
Case ID | Test name
C222123 | test_login
The code I wrote would use the inspect module to find the name of the test running, find the relevant test ID and put that into a variable called test_id:
import csv
import inspect

class trp(object):

    def __init__(self):
        pass

    with open(testcsv) as f:  # testcsv could be website1.csv or website2.csv
        reader = csv.reader(f)
        next(reader)  # skip header
        tests = [r for r in reader]

    def gettestcase(self):
        self.current_test = inspect.stack()[3][3]
        for row in trp.tests:
            if self.current_test == row[1]:  # row[1] is the test-name column
                self.test_id = row[0]
                print(self.test_id)
                return self.test_id, self.current_test

    def gettestid(self):
        self.gettestcase()
The idea was that the decorator would change dynamically based on the csv that I was using at the time.
@pytestrail.case(test_id)  # now a variable
@pytest.mark.usefixtures()
def test_login():
    trp.gettestid()
    # test code goes here
So if I ran test_login for website1, the decorator would look like:
@pytestrail.case('C100123')
and if I ran test_login for website2 the decorator would be:
@pytestrail.case('C222123')
I felt mighty proud of coming up with this solution by myself and tried it out... it didn't work. While the code does work by itself, I would get an exception because test_id is undefined. (I understand why: gettestcase is executed after the decorator, so of course it crashes.)
The only other way I can handle this is to apply the csv and testIDs before any test code is executed. My question is - how would I know how to associate the tests with their test IDs? What would an elegant, minimal solution to this be?
Sorry for the long winded question. I'll be watching closely to answer any questions if you need more explanation.
pytest is very good at doing all kinds of metaprogramming stuff for the tests. If I understand your question correctly, the code below will do the dynamic test marking with the pytestrail.case marker. In the project root dir, create a file named conftest.py and place this code in it:
import csv

from pytest_testrail.plugin import pytestrail

with open('website1.csv') as f:
    reader = csv.reader(f)
    next(reader)
    tests = [r for r in reader]

def pytest_collection_modifyitems(items):
    for item in items:
        for testid, testname in tests:
            if item.name == testname:
                item.add_marker(pytestrail.case(testid))
Now you don't need to mark the test with @pytestrail.case() at all - just write the rest of the code and pytest will take care of the marking:
def test_login():
    assert True
When pytest starts, the code above will read website1.csv and store the test IDs and names just as you did in your code. Before the test run starts, pytest_collection_modifyitems hook will execute, analyzing the collected tests - if a test has the same name as in csv file, pytest will add the pytestrail.case marker with the test ID to it.
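(If you need to switch between website1.csv and website2.csv per run, one possible extension is to pick the file via a command-line option in the same conftest.py; the option name --site-csv is an assumption for illustration, not part of pytest-testrail:)

import csv

from pytest_testrail.plugin import pytestrail

def pytest_addoption(parser):
    parser.addoption("--site-csv", default="website1.csv",
                     help="CSV file mapping TestRail case IDs to test names.")

def pytest_collection_modifyitems(config, items):
    # read the per-website mapping chosen on the command line
    with open(config.getoption("--site-csv")) as f:
        reader = csv.reader(f)
        next(reader)  # skip header
        tests = [r for r in reader]
    for item in items:
        for testid, testname in tests:
            if item.name == testname:
                item.add_marker(pytestrail.case(testid))

You would then run, e.g., py.test --site-csv=website2.csv.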
I believe the reason this isn't working as you would expect has to do with how Python reads and executes files. When Python starts executing, it reads in the linked Python file(s) and executes each line one by one, in turn. Things at the 'root' indentation level (function/class definitions, decorators, variable assignments, etc.) get run exactly once, as they are loaded in, and never again. In your case, the Python interpreter reads in the pytest-testrail decorator, then the pytest decorator, and finally the function definition, executing each one once, ever.
(Side note, this is why you should never use mutable objects as function argument defaults: Common Gotchas)
Given that you want to first deduce the current test name, then associate that with a test case ID, and finally use that ID with the decorator, I'm not sure that is possible with pytest-testrail's current functionality. At least, not possible without some esoteric and difficult to debug/maintain hack using descriptors or the like.
I think you realistically have one option: use a different TestRail client and update your pytest structure to use the new client. Two python clients I can recommend are testrail-python and TRAW (TestRail Api Wrapper)(*)
It will take more work on your part to create the fixtures for starting a run, updating results, and closing the run, but I think in the end you will have a more portable suite of tests and better results reporting.
(*) full disclosure: I am the creator/maintainer of TRAW, and also made significant contributions to testrail-python
Most test frameworks assume that "1 test = 1 Python method/function", and consider a test passed when the function executes without raising an assertion.
I'm testing a compiler-like program (a program that reads *.foo files and processes their contents), for which I want to execute the same test on many input (*.foo) files. IOW, my test looks like:
class Test(unittest.TestCase):

    def one_file(self, filename):
        # do the actual test

    def list_testcases(self):
        # essentially os.listdir('tests/') and filter *.foo files

    def test_all(self):
        for f in self.list_testcases():
            self.one_file(f)
My current code uses unittest from Python's standard library, i.e. one_file uses self.assert...(...) statements to check whether the test passes.
This works, in the sense that I do get a program which succeeds/fails when my code is OK/buggy, but I'm losing a lot of the advantages of the testing framework:
I don't get relevant reporting like "X failures out of Y tests", nor the list of passed/failed tests. (I'm planning to use such a system not only to test my own development but also to grade students' code as a teacher, so reporting is important for me.)
I don't get test independence. The second test runs in the environment left by the first, and so on. The first failure stops the test suite: test cases coming after a failure are not run at all.
I get the feeling that I'm abusing my test framework: there's only one test function, so the automatic test discovery of unittest sounds like overkill, for example. The same code could (should?) be written in plain Python with a basic assert.
An obvious alternative is to change my code to something like
class Test(unittest.TestCase):

    def one_file(self, filename):
        # do the actual test

    def test_file1(self):
        self.one_file("first-testcase.foo")

    def test_file2(self):
        self.one_file("second-testcase.foo")
Then I get all the advantages of unittest back, but:
It's a lot more code to write.
It's easy to "forget" a testcase, i.e. create a test file in
tests/ and forget to add it to the Python test.
I can imagine a solution where I would generate one method per testcase dynamically (along the lines of setattr(self, 'test_file' + str(n), ...)), to generate the code for the second solution without having to write it by hand. But that sounds really overkill for a use-case which doesn't seem so complex.
How could I get the best of both, i.e. automatic test-case discovery (list the tests/*.foo files), test independence, and proper reporting?
If you can use pytest as your test runner, then this is actually pretty straightforward using the parametrize decorator:
import glob

import pytest

all_files = glob.glob('some/path/*.foo')

@pytest.mark.parametrize('filename', all_files)
def test_one_file(filename):
    # do the actual test
This will also automatically name the tests in a useful way, so that you can see which files have failed:
$ py.test
================================== test session starts ===================================
platform darwin -- Python 3.6.1, pytest-3.1.3, py-1.4.34, pluggy-0.4.0
[...]
======================================== FAILURES ========================================
_____________________________ test_one_file[some/path/a.foo] _____________________________
filename = 'some/path/a.foo'
    @pytest.mark.parametrize('filename', all_files)
    def test_one_file(filename):
>       assert False
E       assert False
test_it.py:7: AssertionError
_____________________________ test_one_file[some/path/b.foo] _____________________________
filename = 'some/path/b.foo'
    @pytest.mark.parametrize('filename', all_files)
    def test_one_file(filename):
[...]
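(If you would rather keep the file discovery out of the test module, pytest's pytest_generate_tests hook in a conftest.py achieves the same parametrization; a sketch:)

# conftest.py
import glob

def pytest_generate_tests(metafunc):
    # parametrize any test function that declares a 'filename' argument
    if 'filename' in metafunc.fixturenames:
        metafunc.parametrize('filename', glob.glob('some/path/*.foo'))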
Here is a solution, although it might be considered not very beautiful... The idea is to dynamically create new functions, add them to the test class, and use the function names as arguments (e.g., filenames):
# imports
import inspect  # needed inside the generated test functions
import unittest

# test class
class Test(unittest.TestCase):

    # example test case
    def test_default(self):
        print('test_default')
        self.assertEqual(2, 2)

# set string for creating new function
func_string = """def test(cls):
    # get function name and use it to pass information
    filename = inspect.stack()[0][3]
    # print function name for demonstration purposes
    print(filename)
    # dummy check for demonstration purposes
    cls.assertEqual(type(filename), str)"""

# add new test for each item in list
for f in ['test_bla', 'test_blu', 'test_bli']:
    # set name of new function
    name = func_string.replace('test', f)
    # create new function
    exec(name)
    # add new function to test class
    setattr(Test, f, eval(f))

if __name__ == "__main__":
    unittest.main()
This correctly runs all four tests and returns:
> test_bla
> test_bli
> test_blu
> test_default
> Ran 4 tests in 0.040s
> OK
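(The exec/eval round-trip is not strictly necessary: the same effect can be had by building the test functions as closures and attaching them with setattr. A sketch of that variant:)

import unittest

class Test(unittest.TestCase):

    def test_default(self):
        self.assertEqual(2, 2)

def make_test(filename):
    # bind 'filename' in a closure instead of smuggling it through the name
    def test(self):
        self.assertEqual(type(filename), str)
    return test

for f in ['test_bla', 'test_blu', 'test_bli']:
    setattr(Test, f, make_test(f))

if __name__ == "__main__":
    unittest.main()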
Part of my assignment is to create tests for each function. This one's kinda long, but I am so confused. I put a link below this function so you can see what it looks like; the first code file is extremely long.
def load_profiles(profiles_file, person_to_friends, person_to_networks):
    '''(file, dict of {str : list of strs}, dict of {str : list of strs}) -> NoneType
    Update person to friends and person to networks dictionaries to include
    the data in the open file.'''
    # for updating the person_to_friends dict
    update_p_to_f(profiles_file, person_to_friends)
    # for updating the person_to_networks dict
    update_p_to_n(profiles_file, person_to_networks)
here's the whole code: http://shrib.com/8EF4E8Z3. I tested it through the main block and it works.
This is the text file (profiles_file) we were provided, which we are converting:
http://shrib.com/zI61fmNP
How do I run test cases for this through nose, and what kinds of test outcomes are there? Or am I not being specific enough?
import nose
import a3_functions

def test_load_profiles_

if __name__ == '__main__':
    nose.runmodule()
I got that far, but then I didn't know what I could test for the function.
Let's assume the code you wrote so far is in a module called "mycode".
Write a new module called testmycode. (i.e. create a python file called testmycode.py)
In there, import the module you want to test (mycode)
Write a function called testupdate().
In that function, first write a text file (with file.write) that you expect to be valid. Then let update_p_to_f process it. Verify that it did what you expect, using assert. This is a test for reading a text file.
Then you can write a second function called testupdate_write(), where you let your code write to a file -- then verify that what it wrote is correct.
To run the tests, use (on the commandline)
nosetests -sx testmycode.py
Which will load testmycode and run all functions it finds there that start with test.
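(Putting those steps together, testmycode.py could start out like the sketch below. The profile text and the expected dictionary contents are placeholders, since the real file format is only in the linked assignment; fill both in from a small known-good snippet of the provided profiles file:)

# testmycode.py -- run with: nosetests -sx testmycode.py
import mycode  # the module under test

def testupdate():
    # step 1: write a small profiles file whose contents we control
    with open('tiny_profiles.txt', 'w') as f:
        f.write('<two or three profiles in the real format>\n')  # placeholder

    # step 2: let the code under test process it
    person_to_friends = {}
    with open('tiny_profiles.txt') as profiles_file:
        mycode.update_p_to_f(profiles_file, person_to_friends)

    # step 3: assert exactly what those profiles should produce, e.g.
    # assert person_to_friends == {'Some Name': ['Friend A', 'Friend B']}
    assert person_to_friends  # at minimum: the dict was populated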
You probably want to test both the overall output of your program is correct, and that individual parts of your program are correct.
@j13r has already covered how to test the overall correctness of your program for a full run.
You mention that you have four helper functions. You can write tests for these separately.
Testing smaller pieces of your code is helpful because you can test each piece in more numerous and more specific ways than if you only test the whole thing.
The unittest module is a framework for performing tests.