How to unittest a class that reads a command line arg? - python

I want to write unit tests for a Python script. The script itself runs correctly.
The script consists of one class, which reads a value from the command line into a class variable.
When I import the class into my test file, I get the error 'list index out of range' at the point where the class reads from sys.argv[].
I'm new to testing, in Python and in general. I've read a lot of doc and SO pages about this over the last couple of days.
Here's the code:
File bongo.py --
import sys
class Bongo:
    my_int = int(sys.argv[1])
    def __init__(self, n):
        self.n = n
    def get_sum(self):
        return self.n + Bongo.my_int
if __name__ == '__main__':
    bongo = Bongo(5)
    m = bongo.get_sum()
    print('bongo.get_sum() returns {}'.format(m))
File bongo_test.py --
import unittest
from bongo import Bongo
class TestBongoMethods(unittest.TestCase):
    def setUp(self):
        self.bongo = Bongo(10)
        self.bongo.my_int = 5
    def test_get_n(self):
        self.assertEqual(self.bongo.get_sum(), 15)
if __name__ == '__main__':
    unittest.main()
The output of running python bongo_test.py is
Traceback (most recent call last):
  File "bongo_test.py", line 2, in <module>
    from bongo import Bongo
  File "/home/me/class_for_testing/bongo.py", line 4, in <module>
    class Bongo:
  File "/home/me/class_for_testing/bongo.py", line 6, in Bongo
    my_int = int(sys.argv[1])
IndexError: list index out of range
I've tried just about everything I can think of; I've been using Py 2.7 in PyCharm Pro 2016.1,
but the results are no different using Py 3.4 or running from the command line.
Can this be done using unittest? Or do I need something else?
Any help would be much appreciated!

Generally, unit testing is helpful in figuring out the boundaries of your classes. If your class is hard to unit test, you should think: "how can I make it easier to test?" Almost always this leads to better structured classes.
One of these symptoms is that you have external dependencies, like sys.argv, that are hard to control. Generally you want to inject dependencies like this into a class. This is one of the advantages of object-oriented programming -- that you can isolate dependencies and minimize the complexity of the objects themselves.
The __main__ block is perfect for command-line specific logic, like sys.argv. I would recommend reading the variable there, and passing it into the constructor of Bongo. Then you can unittest Bongo as a class that takes two variables, and have it be ignorant of sys.argv.
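A minimal sketch of that refactoring, not the asker's final code. It assumes the second constructor parameter is called my_int (borrowed from the question's class variable) and uses a hypothetical main(argv) helper as the only place where the command line is read:

```python
import unittest


class Bongo:
    """Knows nothing about sys.argv; both values arrive through the constructor."""

    def __init__(self, n, my_int):
        self.n = n
        self.my_int = my_int

    def get_sum(self):
        return self.n + self.my_int


def main(argv):
    """Command-line glue (hypothetical helper): the only place argv is read."""
    bongo = Bongo(5, int(argv[1]))
    m = bongo.get_sum()
    print('bongo.get_sum() returns {}'.format(m))
    return m


class TestBongoMethods(unittest.TestCase):
    def test_get_sum(self):
        # No command-line argument is needed to construct the object.
        self.assertEqual(Bongo(10, 5).get_sum(), 15)

    def test_main_with_fake_argv(self):
        # A plain list stands in for sys.argv.
        self.assertEqual(main(['bongo.py', '7']), 12)
```

With this layout, the if __name__ == '__main__': block of bongo.py would only need to call main(sys.argv), and importing the module no longer touches sys.argv at all.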

For command-line arguments I suggest the argparse package (included in the standard library).
The __main__ part will contain the argument parsing and the call to the main method.
You can then test both separately:
argument parsing
business logic with given parameters
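One possible way to lay this out is sketched below. The function names parse_args, get_sum and main are assumptions made for illustration. The sketch relies on the fact that ArgumentParser.parse_args accepts an explicit list, so a test never has to modify sys.argv:

```python
import argparse


def parse_args(argv=None):
    """Parse command-line arguments; argv=None makes argparse use sys.argv[1:]."""
    parser = argparse.ArgumentParser(description='Add my_int to a fixed number.')
    parser.add_argument('my_int', type=int, help='integer to add')
    return parser.parse_args(argv)


def get_sum(n, my_int):
    """Business logic: plain values in, plain value out."""
    return n + my_int


def main(argv=None):
    args = parse_args(argv)
    return get_sum(5, args.my_int)


# Each piece can be exercised on its own, without touching sys.argv:
assert parse_args(['7']).my_int == 7
assert get_sum(10, 5) == 15
assert main(['7']) == 12
```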


Python Best Practice. Call commandline python file from another python file

I retrieved a python project from some git repo. To run this project, there is a file that must be launched by command line with the correct arguments. Here is an example :
#! /usr/bin/env python
import argparse
parser = argparse.ArgumentParser(description='Description')
parser.add_argument('arg1')
parser.add_argument('arg2')
# %%
def _main(args):
    # Execute the code using args
    pass
if __name__ == '__main__':
    _main(parser.parse_args())
I want to use this code in my own project, and so, call the main function from another python file, using a set of predefined arguments.
I have found different ways of doing so, but I don't know which one is the right way.
Calling the file using the os package, but that seems like bad practice to me.
Refactoring the file so that the main function takes the wanted parameters (and getting rid of the args object), but that means the command-line call would not work anymore.
Other?
Import otherprogram into your own program.
Call otherprogram._main() from the appropriate point in your own code, passing it an argparse.Namespace instance.
You can build that using your own argparse calls if you need to, or just construct the values some other way.
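As a sketch of the "build it with your own argparse calls" option: passing an explicit list to parse_args() produces the same kind of Namespace without reading sys.argv. The build_parser helper and the stand-in _main below are made up for illustration; in practice _main would be the one imported from the other program.

```python
import argparse


def build_parser():
    # Same two positional arguments as the script in the question.
    parser = argparse.ArgumentParser(description='Description')
    parser.add_argument('arg1')
    parser.add_argument('arg2')
    return parser


def _main(args):
    # Stand-in for otherprogram._main; it only reads attributes from args.
    return (args.arg1, args.arg2)


# An explicit list replaces the real command line.
args = build_parser().parse_args(['test1', 'test2'])
result = _main(args)
print(result)
```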
You can actually call the main method as long as you make sure args is an object with the correct attributes. Consider this:
import argparse
def _main(args):
    print(args.arg1)
_main(argparse.Namespace(arg1='test1', arg2='test2'))
In another file you can do this:
from some_git_repo import _main
from argparse import Namespace
_main(Namespace(arg1='test1', arg2='test2'))

using import inside class

I am completely new to the python class concept. After searching for a solution for some days, I hope I will get help here:
I want a python class where I import a function and use it there. The main code should be able to call the function from the class. for that I have two files in the same folder.
Thanks to @cdarke, @DeepSpace and @MosesKoledoye, I fixed the mistake, but sadly that wasn't it.
I still get the error:
test 0
Traceback (most recent call last):
  File "run.py", line 3, in <module>
    foo.doit()
  File "/Users/ls/Documents/Entwicklung/RaspberryPi/test/test.py", line 8, in doit
    self.timer(5)
  File "/Users/ls/Documents/Entwicklung/RaspberryPi/test/test.py", line 6, in timer
    zeit.sleep(2)
NameError: global name 'zeit' is not defined
@wombatz had the right tip:
it must be self.zeit.sleep(2) or Test.zeit.sleep(2). The import could also be done above the class declaration.
test.py
class Test:
    import time as zeit
    def timer(self, count):
        for i in range(count):
            print("test "+str(i))
            self.zeit.sleep(2)  # self is important; otherwise, move the import above the class declaration
    def doit(self):
        self.timer(5)
and
run.py
from test import Test
foo = Test()
foo.doit()
When I run python run.py I get this error:
test 0
Traceback (most recent call last):
  File "run.py", line 3, in <module>
    foo.doit()
  File "/Users/ls/Documents/Entwicklung/RaspberryPi/test/test.py", line 8, in doit
    self.timer(5)
  File "/Users/ls/Documents/Entwicklung/RaspberryPi/test/test.py", line 6, in timer
    sleep(2)
NameError: global name 'sleep' is not defined
What I understand from the error is that the import in the class is not recognized. But how can I make the import inside the class be recognized?
Everything defined inside the namespace of a class has to be accessed from that class. That holds for methods, variables, nested classes and everything else including modules.
If you really want to import a module inside a class you must access it from that class:
class Test:
    import time as zeit
    def timer(self):
        self.zeit.sleep(2)
        # or Test.zeit.sleep(2)
But why would you import the module inside the class anyway? I can't think of a use case for that, apart from wanting to put it into that namespace.
You really should move the import to the top of the module. Then you can call zeit.sleep(2) inside the class without prefixing self or Test.
Also, you should not use non-English identifiers like zeit. People who only speak English should be able to read your code.
sleep is not a Python builtin, and the name, as written, does not reference any object. So Python has rightly raised a NameError.
You intend to:
import time as zeit
zeit.sleep(2)
And move import time as zeit to the top of the module.
The time module aliased as zeit is probably not appearing in your module's global symbol table because it was imported inside a class.
You want time.sleep. You can also use:
from time import sleep
Edit: Importing within class scope issues explained here.
You're almost there! sleep is a function within the time module. This means that the name sleep doesn't exist unless it's accessed through time or you define it on your own. Since you didn't define it on your own, you can access it by running time.sleep(2).
In your specific example, you used:
import time as zeit
you'll have to run:
zeit.sleep(2)
Alternatively, you can import sleep directly from time, by running:
from time import sleep
sleep(2)
Good luck!
You can read more about the time module here: https://docs.python.org/2/library/time.html
You can learn more about imports here: https://docs.python.org/3/reference/import.html
and I highly recommend learning about namespace in python, here: https://bytebaker.com/2008/07/30/python-namespaces/
I agree with @Wombatz on his solution, but I do not have enough reputation to comment on his answer.
One use case that I have found for importing a module within a class is when I want to initialize a class from a config file.
Say my config file is
config.py
__all__ = ['logfile', ... ]
logfile = 'myevent.log'
...
And in my main module
do_something.py
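class event():
    from config import *  # Python 2 only: Python 3 rejects import * outside module level
    def __init__(self):
        try : self.logfile
        except AttributeError: self.logfile = './generic_event.log'  # a missing attribute raises AttributeError, not NameError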
Now the advantage of this scheme is that we do not need to import logfile in the global namespace if it is not needed
Whereas, importing at the beginning of do_something.py, I will have to use globals inside the class, which is a little ugly in my opinion.
It's probably a bit late, but I agree with the idea of not polluting the module-level namespace (of course this can probably be remedied with a better design of the module, plus 'explicit is better than implicit' anyway).
Here is what I would do. The basic idea is this: import is an implicit assignment in which an entire module object gets assigned to a single name. Thus:
class Test:
    def __init__(self):
        import time as zeit
        self.zeit = zeit # This line binds the module object to an attribute of the instance being created
    def timer(self, count):
        for i in range(count):
            print("test "+str(i))
            self.zeit.sleep(2) # This relies on the `zeit` attribute of the instance
    def doit(self):
        self.timer(5)
import importlib.util  # importlib.util is a submodule and has to be imported explicitly
class importery():
    def __init__(self, y, z):
        self.my_name = y
        self.pathy = z
        # Load the module object directly from a file path
        self.spec = importlib.util.spec_from_file_location(self.my_name, self.pathy)
        x = importlib.util.module_from_spec(self.spec)
        self.spec.loader.exec_module(x)
        print(dir(x))
        root = x.Tk()
        root.mainloop()
pathy = r'C:\Users\mine\Desktop\python310\Lib\tkinter\__init__.py'
importery('tk', pathy)
There is a 'time and a place' to do this type of black magic, thankfully very rare times and places. Of the few I've found I've normally been able to use subprocess to get some other flavor of python to do my dirty work, but that is not always an option.
Now, I have 'used' this in blender when I've needed to have conflicting versions of a module loaded at the same time. This is not a good way to do things and really should be a last resort.
If you are a blender user and you happen to decide to commit this sin, I suggest doing so in a clean version of blender, install a like version of python next to it to use that to do your pip installs with, and please make sure you have added your config folder to your blender folder, else this black magic may come back to bite you in the arse later.

Breaking a single python .py file into multiple files with inter-dependencies

I want to split a large Python module I wrote into multiple files within a directory, where each file is a function that may or may not have dependencies on other functions within the module. Here's a simple example of what I came up with:
First, here's a self-contained .py module
#[/pie.py]
def getpi():
    return pi()
def pi():
    return 3.1416
Obviously, this works fine when importing and calling either function. So now I split it into different files, with an __init__.py file to wrap it all up:
#[/pie/__init__.py]
from getpi import *
from pi import *
__all__=['getpi','pi']
#[/pie/getpi.py]
def getpi():
    return pi()
#[/pie/pi.py]
def pi():
    return 3.1416
Because getpi() has a dependency with pi(), calling it as currently structured raises an exception:
>>> import pie
>>> pie.getpi()
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    pie.getpi()
  File "C:\python\pie\getpi.py", line 2, in getpi
    return pi()
NameError: global name 'pi' is not defined
And so to fix this issue, my current solution is to write __init__.py like so:
#[/pie/__init__.py]
import os as _os
__all__ = []
for _f in _os.listdir(__path__[0]):
    if not _f == '__init__.py' and _f.endswith('.py'):
        execfile('%s\\%s'%(__path__[0],_f))
        __all__.append(_os.path.splitext(_f)[0])
So now it works fine:
>>> import pie
>>> pie.getpi()
3.1416
So now everything works as if it were all contained in a single .py file. __init__.py can contain all the high-level imports (numpy, os, sys, glob...) that the individual functions need.
Structuring a module this way feels "right" to me. New functions are loaded automatically at the next import (no need to edit __init__.py each time). It lets me see at a glance which functions are meant to be used just by looking at what's in the directory, plus it keeps everything nicely sorted alphabetically.
The only negative I can see at this time is that only __init__.py gets byte-compiled and not any of the other .py files. But loading speed hasn't been an issue, so I don't mind. Also, I do realize this might cause an issue with packaging, but that's also something I don't mind because our scripts get distributed via our own revision control system.
Is this an acceptable way of structuring a Python module? And if not, what would be the correct way to achieve what I've done?
The "correct" way would be to import the necessary modules where they are needed:
# pi.py
def pi(): return 3.1416
# getpi.py
from .pi import pi
def getpi(): return pi()
# __init__.py
from .pi import *
from .getpi import *
Make sure you don't have cyclic dependencies. These are bad in any case, but you can avoid them by abstracting up to the necessary level.

Understanding the main method of python [duplicate]

I am new to Python, but I have experience in other OOP languages. My course does not explain the main method in python.
Please tell me how the main method works in Python. I am confused because I am trying to compare it to Java.
def main():
    # display some lines
    pass
if __name__ == "__main__": main()
How is main executed, and why do I need this strange if to execute main? My code terminates without output when I remove the if.
The minimal code -
class AnimalActions:
    def quack(self): return self.strings['quack']
    def bark(self): return self.strings['bark']
class Duck(AnimalActions):
    strings = dict(
        quack = "Quaaaaak!",
        bark = "The duck cannot bark.",
    )
class Dog(AnimalActions):
    strings = dict(
        quack = "The dog cannot quack.",
        bark = "Arf!",
    )
def in_the_doghouse(dog):
    print(dog.bark())
def in_the_forest(duck):
    print(duck.quack())
def main():
    donald = Duck()
    fido = Dog()
    print("- In the forest:")
    for o in ( donald, fido ):
        in_the_forest(o)
    print("- In the doghouse:")
    for o in ( donald, fido ):
        in_the_doghouse(o)
if __name__ == "__main__": main()
The Python approach to "main" is almost unique to the language(*).
The semantics are a bit subtle. The __name__ identifier is bound to the name of any module as it's being imported. However, when a file is being executed then __name__ is set to "__main__" (the literal string: __main__).
This is almost always used to separate the portion of code which should be executed from the portions of code which define functionality. So Python code often contains a line like:
#!/usr/bin/env python
from __future__ import print_function
import this, that, other, stuff
class SomeObject(object):
    pass
def some_function(*args,**kwargs):
    pass
if __name__ == '__main__':
    print("This only executes when %s is executed rather than imported" % __file__)
Using this convention one can have a file define classes and functions for use in other programs, and also include code to evaluate only when the file is called as a standalone script.
It's important to understand that all of the code above the if __name__ line is being executed, evaluated, in both cases. It's evaluated by the interpreter when the file is imported or when it's executed. If you put a print statement before the if __name__ line then it will print output every time any other code attempts to import that as a module. (Of course, this would be anti-social. Don't do that).
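To see both cases side by side, here is a small sketch that is not part of the original answer. It writes a throwaway module (the file name demo_name_guard.py and its variable names are made up for illustration), imports it once, and then executes the same file via runpy.run_path with run_name='__main__', which approximates running it as a script:

```python
import importlib
import os
import runpy
import sys
import tempfile

SOURCE = '''
seen_name = __name__          # top-level code: runs in both cases
ran_main_block = False
if __name__ == '__main__':
    ran_main_block = True     # runs only when executed as a script
'''

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo_name_guard.py')
    with open(path, 'w') as f:
        f.write(SOURCE)

    # Case 1: import the file as a module.
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        imported = importlib.import_module('demo_name_guard')
    finally:
        sys.path.remove(tmp)

    # Case 2: execute the same file roughly the way "python demo_name_guard.py" would.
    executed = runpy.run_path(path, run_name='__main__')

print(imported.seen_name, imported.ran_main_block)        # demo_name_guard False
print(executed['seen_name'], executed['ran_main_block'])  # __main__ True
```

In both cases the top-level assignments run; only the guarded block differs.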
I, personally, like these semantics. It encourages programmers to separate functionality (definitions) from function (execution) and encourages re-use.
Ideally almost every Python module can do something useful if called from the command line. In many cases this is used for managing unit tests. If a particular file defines functionality which is only useful in the context of other components of a system then one can still use __name__ == "__main__" to isolate a block of code which calls a suite of unit tests that apply to this module.
(If you're not going to have any such functionality or unit tests, then it's best to ensure that the file mode is NOT executable.)
Summary: if __name__ == '__main__': has two primary use cases:
Allow a module to provide functionality for import into other code while also providing useful semantics as a standalone script (a command line wrapper around the functionality)
Allow a module to define a suite of unit tests which are stored with (in the same file as) the code to be tested and which can be executed independently of the rest of the codebase.
It's fairly common to def main(*args) and have if __name__ == '__main__': simply call main(*sys.argv[1:]) if you want to define main in a manner that's similar to some other programming languages. If your .py file is primarily intended to be used as a module in other code, then you might def test_module() and call test_module() in your if __name__ == '__main__': suite.
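A minimal sketch of the main(*args) convention. What main does with its arguments is invented here, and it returns the argument count only so that the sketch is easy to check:

```python
import sys


def main(*args):
    """Receive the command-line arguments as ordinary positional parameters."""
    for position, value in enumerate(args, start=1):
        print('argument {}: {}'.format(position, value))
    return len(args)


if __name__ == '__main__':
    # sys.argv[0] is the script name, so only the remaining items are forwarded.
    main(*sys.argv[1:])
```

Because main takes plain arguments, other code (or a test) can call main('a', 'b') directly after importing the module.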
(Ruby also implements a similar feature if __file__ == $0).
In Python, execution does NOT have to begin at main. The first line of "executable code" is executed first.
def main():
    print("main code")
def meth1():
    print("meth1")
meth1()
if __name__ == "__main__": main() ## with if
Output -
meth1
main code
More on main() - http://ibiblio.org/g2swap/byteofpython/read/module-name.html
A module's __name__
Every module has a name and statements in a module can find out the name of its module. This is especially handy in one particular situation - As mentioned previously, when a module is imported for the first time, the main block in that module is run. What if we want to run the block only if the program was used by itself and not when it was imported from another module? This can be achieved using the name attribute of the module.
Using a module's __name__
#!/usr/bin/python
# Filename: using_name.py
if __name__ == '__main__':
    print('This program is being run by itself')
else:
    print('I am being imported from another module')
Output -
$ python using_name.py
This program is being run by itself
$ python
>>> import using_name
I am being imported from another module
>>>
How It Works -
Every Python module has its __name__ defined, and if this is __main__, it implies that the module is being run standalone by the user and we can take appropriate actions.
Python does not have a defined entry point like Java, C, C++, etc. Rather it simply executes a source file line-by-line. The if statement allows you to create a main function which will be executed if your file is loaded as the "Main" module rather than as a library in another module.
To be clear, this means that the Python interpreter starts at the first line of a file and executes it. Executing lines like class Foobar: and def foobar() creates either a class or a function and stores them in memory for later use.
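A small sketch of this point (the function names are made up): because def is itself an executed statement, a function does not exist until the interpreter has reached its definition.

```python
def call_too_early():
    # The name 'later' is looked up only when this function is called.
    return later()


try:
    call_too_early()
    first_outcome = 'worked'
except NameError:
    # The 'def later' statement below has not been executed yet.
    first_outcome = 'NameError'


def later():
    return 'defined now'


# By this point the def statement has run, so the same call succeeds.
second_outcome = call_too_early()
print(first_outcome, '/', second_outcome)
```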
If you import the module (.py) file you are creating now from another python script it will not execute the code within
if __name__ == '__main__':
    ...
If you run the script directly from the console, it will be executed.
Python does not use or require a main() function. Any code that is not protected by that guard will be executed upon execution or importing of the module.
This is expanded upon a little more at python.berkely.edu

Python NameError when attempting to use a user-defined class

I'm getting a weird instance of a NameError when attempting to use a class I wrote. In a directory, I have the following file structure:
dir/
    ReutersParser.py
    test.py
    reut-xxx.sgm
Where my custom class is defined in ReutersParser.py and I have a test script defined in test.py.
The ReutersParser looks something like this:
from sgmllib import SGMLParser
class ReutersParser(SGMLParser):
    def __init__(self, verbose=0):
        SGMLParser.__init__(self, verbose)
    ... rest of parser
if __name__ == '__main__':
    f = open('reut2-short.sgm')
    s = f.read()
    p = ReutersParser()
    p.parse(s)
It's a parser to deal with SGML files of Reuters articles. The test works perfectly. Anyway, I'm going to use it in test.py, which looks like this:
from ReutersParser import ReutersParser
def main():
    parser = ReutersParser()
if __name__ == '__main__':
    main()
When it gets to that parser line, I'm getting this error:
Traceback (most recent call last):
  File "D:\Projects\Reuters\test.py", line 34, in <module>
    main()
  File "D:\Projects\Reuters\test.py", line 19, in main
    parser = ReutersParser()
  File "D:\Projects\Reuters\ReutersParser.py", line 38, in __init__
    SGMLParser.__init__(self, verbose)
NameError: global name 'sgmllib' is not defined
For some reason, when I try to use my ReutersParser in test.py, it throws an error that says it cannot find sgmllib, which is a built-in module. I'm at my wits' end trying to figure out why the import won't work.
What's causing this NameError? I've tried importing sgmllib in my test.py and that works, so I don't understand why it can't find it when trying to run the constructor for my ReutersParser.
Your problem is not your code, but what you run it in. If you read the error and the code it displays closely:
  File "D:\Projects\Reuters\ReutersParser.py", line 38, in __init__
    SGMLParser.__init__(self, verbose)
NameError: global name 'sgmllib' is not defined
you'll notice there's no reference to 'sgmllib' on the line that Python thinks produced this error. That means one of two things: either the error didn't originate there (and Python is quite confused), or the code that's being displayed is not the code that is being executed. The latter is quite common when you, for example, run your code in an IDE that doesn't restart the Python interpreter between code executions. It will execute your old code, but when displaying the traceback will show the new code. I'm guessing you did sgmllib.SGMLParser.__init__(self, verbose) on that line at some point in the past.
The reason it was fixed by renaming the class is probably that you did something -- like editing the code -- that caused the IDE to either restart the interpreter, properly clean it up or (by accident) reloaded the right module the right way for it to see the new code. Since you name your module after your class (which is bad style, by the way) I assume you renamed your module when you renamed your class, and so your IDE picked up the new code this time. Until the next time the same thing happens, of course.
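The stale-code effect can be reproduced outside an IDE. The sketch below is an illustration under assumptions, not a reconstruction of what happened in the question: it uses a throwaway module named stale_demo (hypothetical), edits the file while the interpreter keeps running, and then calls importlib.reload to pick up the edit.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # keep the demo free of cached .pyc files

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'stale_demo.py')
    with open(path, 'w') as f:
        f.write("def version():\n    return 'old'\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import stale_demo
        before_edit = stale_demo.version()

        # Edit the file on disk while the interpreter keeps running.
        with open(path, 'w') as f:
            f.write("def version():\n    return 'new'\n")

        # The already-imported module object still holds the old code.
        after_edit = stale_demo.version()

        # Re-executing the module source makes the edit visible.
        importlib.reload(stale_demo)
        after_reload = stale_demo.version()
    finally:
        sys.path.remove(tmp)
        sys.modules.pop('stale_demo', None)

print(before_edit, after_edit, after_reload)  # old old new
```

A traceback, by contrast, reads its source lines from the file on disk, which is why it can display new code while the interpreter is still running the old version.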
