Common Python scripts are getting called multiple times - python

I have the following project structure.
- README.rst
- LICENSE
- setup.py
- requirements.txt
- common_prj/__init__.py
- common_prj/common_lib.py
- common_prj/config.xml
- Project1/__init__.py
- Project1/some_app_m1.py
- Project1/some_app_m2.py
- Project1/some_app.py
- Project2/some_app1.py
All of my common classes live in the common_prj/common_lib.py file. Each module in Project1 imports common_prj/common_lib.py to access the common classes and to set up the project environment using config.xml.
#import in some_app_m1.py
from common_prj import common_lib
#import in some_app_m2.py
from common_prj import common_lib
# imports in some_app.py
from Project1 import some_app_m1
from Project1 import some_app_m2
from common_prj import common_lib
###
<some functions>
<some functions>
###
if __name__ == '__main__':
With the above three imports, it seems common_lib.py is getting executed multiple times, but if I keep common_lib.py in Project1 then I don't see this issue.
Please let me know how I can keep common_lib.py in the common package and call it from the Project1 scripts without executing it multiple times. The purpose of common_lib.py is to share common classes with the scripts in Project2.
I have the below code in common_lib.py, and these prompts repeat for each module that imports it.
self.env = dict_env.get(int(input("Choose Database ENV for this execution : \n" + str(dict_env) + "\nSelect the numeric value => ")))
self.app = dict_app.get(int(input("Choose application for this execution : \n" + str(dict_app) + "\nSelect the numeric value => ")))
My question is why I am not facing this issue if I keep common_lib.py in Project1 rather than in common_prj. I added the above code to common_lib.py because I didn't want to repeat these lines and the ENV setup in all of my app code. These are global environment settings for all application scripts inside Project1.
Output:
Choose Database ENV for this execution :
{1: 'DEV', 2: 'SIT', 3: 'UAT', 4: 'PROD'}
Select the numeric value => 1
Choose application for this execution :
{1: 'app1', 2: 'app2', 3: 'app3', 4: 'app4', 5: 'app5', 6: 'app6'}
Select the numeric value => 1
Choose Database ENV for this execution : ### Here it is repeating again from common_lib.py
{1: 'DEV', 2: 'SIT', 3: 'UAT', 4: 'PROD'}
Select the numeric value => 1
Choose application for this execution : ### Here it is repeating again from common_lib.py
{1: 'app1', 2: 'app2', 3: 'app3', 4: 'app4', 5: 'app5', 6: 'app6'}
Select the numeric value => 1

Note: This is my first answer going in depth on the internals of Python. If I have said anything incorrect, please let me know or edit my answer (with a comment letting me know why). Please feel free to ask clarifying questions too.
As the comments have discussed, if __name__ == '__main__': is exactly the way to go. You can check out the answers on the linked SO post for a full description, or check out the Python docs.
In essence, __name__ is a variable that Python sets for every module. Its behavior includes:
When a file is run directly, its __name__ is set to '__main__'.
When a file is imported by another program (import foo), foo.__name__ is set to 'foo'.
Now, when a module is imported for the first time, Python executes the file from top to bottom. It defines all the methods, variables, etc., and it also runs any top-level statements: if Python comes across a function call (i.e. hello_world()) at the top level, it will run it. This happens whether you import the whole module or just a specific name from it.
Now, these two things are very important. If you have trouble understanding the logic, you can take a look at some sample code I created, or at the snippet below.
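For instance, here is a minimal illustration (the file names and the hello_world function are hypothetical):

# greet.py
print("greet.py is executing, __name__ =", __name__)

def hello_world():
    print("hello world")

hello_world()  # top-level call: this runs on import as well

# main.py
import greet  # executing greet.py prints its banner and "hello world"
print("main.py, __name__ =", __name__)  # '__main__' when run directly

Running python main.py prints the two lines from greet.py first, even though main.py never calls anything from it.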
Since you are importing from common_prj in three places (the two some_app_m* files and some_app.py), the top-level code in common_lib can end up running multiple times.
Although I'm unable to see your complete code, I'm assuming that common_lib.py instantiates this class somewhere at the top level (at no indentation). This means that the two input() calls inside:
self.env = dict_env.get(int(input("Choose Database ENV for this execution : \n" + str(dict_env) + "\nSelect the numeric value => ")))
self.app = dict_app.get(int(input("Choose application for this execution : \n" + str(dict_app) + "\nSelect the numeric value => ")))
were also called.
So, in conclusion:
You imported common_prj from three places,
which ran common_lib's top-level code as well.
Solution:
This is where the __name__ part I talked about earlier comes into play.
Take all callable functions that shouldn't be called during importing, and place them under the conditional if __name__ == '__main__':
This will ensure that the condition passes only when common_lib.py is run directly, not when it is imported.
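For example, a minimal sketch of what common_lib.py could look like with the guard (the class and function names here are invented, and the dictionaries are abbreviated):

# common_lib.py
dict_env = {1: 'DEV', 2: 'SIT', 3: 'UAT', 4: 'PROD'}
dict_app = {1: 'app1', 2: 'app2'}  # abbreviated

class CommonLib:
    def __init__(self, env, app):
        self.env = env
        self.app = app

def prompt_for_settings():
    # Prompts only when explicitly called, never as a side effect of import
    env = dict_env.get(int(input("Choose Database ENV for this execution : \n" + str(dict_env) + "\nSelect the numeric value => ")))
    app = dict_app.get(int(input("Choose application for this execution : \n" + str(dict_app) + "\nSelect the numeric value => ")))
    return CommonLib(env, app)

if __name__ == '__main__':
    # Runs only when common_lib.py is executed directly
    lib = prompt_for_settings()

Importing common_lib then defines everything but prompts nothing; each program calls prompt_for_settings() (or constructs CommonLib directly) exactly once, at its own entry point.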
Again, if you didn't understand what I said, please feel free to ask or look at my sample code!
Edit: You can also just remove these top-level calls. Files that are meant to be imported sometimes include them to check that the functions work as intended, but if you look at any of the packages you import daily, you'll find the calls are either under the condition or not in the file at all.

This is more of a question of library design, rather than the detailed mechanics of Python.
A library module (which is what common_prj is) should be designed to do nothing on import, only define functions, classes, constants etc. It can do things like pre-calculating constants, but it should be restricted to things that don't depend on anything external, produce the same results each time and don't involve much heavy computation. At that point, the code running three times will no longer be a problem.
If the library needs parameters, the caller should supply those, ideally in a way that allows multiple different sets of parameters to be used within the same program. Depending on the situation, the parameters can be a main object that manages the rest of the interaction, or parameters that are passed into each object constructor or function call, or a configuration object that's passed in. The library can provide a function to gather these parameters from the environment or user input, but it shouldn't be required; it should always be possible for the program to supply the parameters directly. This will also make tests easier to write.
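As a rough sketch of that design (all names here are illustrative, not taken from the original code):

# common_lib.py -- defines things, does nothing on import
class AppContext:
    """The parameters every application script needs."""
    def __init__(self, env, app):
        self.env = env
        self.app = app

def context_from_input():
    """Optional convenience: gather the parameters interactively."""
    dict_env = {1: 'DEV', 2: 'SIT', 3: 'UAT', 4: 'PROD'}
    env = dict_env[int(input("Choose Database ENV => "))]
    app = input("Choose application => ")
    return AppContext(env, app)

# some_app.py -- the program decides where its parameters come from
from common_prj.common_lib import AppContext, context_from_input

if __name__ == '__main__':
    ctx = context_from_input()         # gathered once at the entry point...
    # ctx = AppContext('DEV', 'app1')  # ...or supplied directly, e.g. in tests

This keeps the library passive and makes it trivial to run the same code with several different parameter sets in one program.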
Side notes:
The name "common" is a bit generic; consider renaming the library after what it does, or splitting it into several libraries, each of which does one thing.
Python caches imported modules (in sys.modules) so it doesn't re-run them multiple times; this is probably why the code only ran once in one situation but three times in another. This is another reason to have only definitions happen on import, so the details of the caching don't affect functionality.
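A quick way to see the caching in action (file names hypothetical):

# noisy.py
print("noisy.py top-level code is running")

# main.py
import noisy   # first import: executes noisy.py and prints the message
import noisy   # already cached in sys.modules: prints nothing

import sys
print('noisy' in sys.modules)  # True

Note that the cache is keyed by module name, so the same file reached under two different names (say, once as common_prj.common_lib and once as a bare common_lib through a different sys.path entry) will execute twice.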

It seems that after commenting out the line below in Project1/__init__.py the issue has been resolved. I was importing the same common_lib from __init__.py as well, which was causing the issue. Removing the entry below fixed it.
#from common_prj import commonlib

Calling multiple python scripts from python with a predefined environment

Probably related to globals and locals in python exec(), Python 2 How to debug code injected by the exec block, and How to get local variables updated, when using the `exec` call?
I am trying to develop a test framework for our desktop applications which uses click-bot-like functions.
My goal was to enable non-programmers to write small scripts which could work as a test script. So my idea is to structure the test scripts by files like:
tests-folder
| -> 01-first-test.py
| -> 02-second-test.py
| ... etc
| -> fixture.py
And then just execute these scripts in alphabetical order. However, I also want fixtures which would define functions, classes and variables, and make them available to the different scripts without requiring the scripts to import the fixture explicitly. If that works, I could also extend the approach to two or more directory levels.
I could get it working-ish with some hacking around, but I am not entirely convinced. I have a test_sequence.py which looks like this:
from pathlib import Path
from copy import deepcopy

from my_module.test import Failure

def run_test_sequence(test_dir: str):
    error_occured = False
    fixture = {
        'some_pre_defined_variable': 'this is available in all scripts and fixtures',
        'directory_name': test_dir,
    }

    # Check if fixture.py exists and load that first
    fixture_file = Path(test_dir) / 'fixture.py'
    if fixture_file.exists():
        with open(fixture_file.absolute(), 'r') as code:
            exec(code.read(), fixture, fixture)

    # Go over all files in the test sequence directory and execute them
    for test_file in sorted(Path(test_dir).iterdir()):
        if test_file.name == 'fixture.py':
            continue

        # Make a deepcopy, so scripts cannot influence one another
        fixture_copy = deepcopy(fixture)
        fixture_copy.update({
            'some_other_variable': 'this is available in all scripts but not in fixture',
        })

        try:
            with open(test_file.absolute(), 'r') as code:
                exec(code.read(), fixture_copy, fixture_copy)
        except Failure:
            error_occured = True

    return error_occured
This iterates over all files in the directory tests-folder and executes them in order (with fixture.py first). It also makes the local variables, functions and classes from fixture.py available to each test-script.
A test script could then just be arbitrary code that will be executed and if it raises my custom Failure exception, this will be noted as a failed test.
The whole sequence is started with a script that does
from my_module.test_sequence import run_test_sequence

if __name__ == '__main__':
    exit(run_test_sequence('tests-folder'))
This mostly works.
What it cannot do, and what leaves me unsatisfied with this approach:
I cannot debug the scripts themselves. Since the code is loaded as a string and then interpreted, breakpoints inside the test scripts are not recognized.
Calling fixture functions behaves weird. When I define a function in fixture.py like:
from my_hello_module import print_hello

def printer():
    print_hello()
I will receive a message during execution that print_hello is undefined because the variables/modules/etc. in the scope surrounding printer are lost.
Stacktraces are useless. On failure it shows the stacktrace, but of course it only shows my line which says `exec(...)` and the insides of that function, none of the code that was loaded.
I am sure there are other drawbacks, that I have not found yet, but these are the most annoying ones.
I also tried to find a solution through __import__ but I couldn't get it to inject my custom locals or globals into the imported script.
Is there a solution that I am too inexperienced to find, or another builtin Python function that actually does what I am trying to do? Or is there no way to achieve this, and should I rather have each test script import the fixture and the file/directory names itself? I want those scripts to have as few dependencies and as little pythony code as possible. Ideally they are just:
from my_module.test import *

click(x, y, LEFT)
write('admin')
press('tab')
write('password')
press('enter')

if text_on_screen('Login successful'):
    succeed('Test successful')
else:
    fail('Could not login')
Additional note: I think I had the debugger working when I still used execfile, but it is not available in python3 environments.
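For what it's worth, the traceback and breakpoint problems stem from exec() labelling the compiled code as '<string>'. Two possible mitigations, sketched with the same variables as the code above (an untested assumption, not part of the original framework):

# Option 1: compile with the real filename. Since the file exists on disk,
# tracebacks show actual source lines and debuggers can map breakpoints to it.
source = test_file.read_text()
code_obj = compile(source, str(test_file), 'exec')
exec(code_obj, fixture_copy, fixture_copy)

# Option 2: runpy can execute a file with injected globals.
import runpy
result_globals = runpy.run_path(str(test_file), init_globals=fixture_copy)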

Use function from another file python

I have 2 files: 1 file with the classes and 1 file with the 'interface'.
In the first file I have:
from second_file import *

class Catalog:
    def ListOfBooks(self):
        more_20 = input()
        # If press 1 starts showing from 20 until 40
        if more_20 == '1':
            for item in C[20:41:1]:
                print("ID:", item['ID'], "Title:", item['title'], " Author: ", item['author'])
        elif more_20 == '2':
            return librarian_option()

test = Catalog()
test.ListOfBooks()
What I am trying to achieve: when the user presses 2, I want to go back to the function in my other file.
Second file:
def librarian_option():
.......
I don't want to use globals, and I have read that librarian_option() is in the scope of the second file and that's why I can't call it directly. I can't find a solution to this.
I get the following error:
NameError: name 'librarian_option' is not defined
Have you tried the following? It's best practice to be explicit instead of using the * wildcard:
from second_file import librarian_option
Make sure second_file.py is in the same directory.
Note: this is not an answer, but an example to help improve the question with more details.
You need to reduce your problem to the minimum amount of code (and actions) necessary to reproduce it. You also need to provide how exactly you are running your script, and what versions (of Python and your OS) you are using.
For example, I have created the following two scripts (named exactly as shown):
first_file.py:

from second_file import *

class Catalog:
    def ListOfBooks(self):
        return librarian_option()

test = Catalog()
a = test.ListOfBooks()
print(a)

second_file.py:

def librarian_option():
    return 1
These two files are located in the same (random) directory on my computer (macOS). I run this as follows:
python3.7 first_file.py
and my output is
1
Hence, I can't reproduce your problem.
See if you can still produce your problem with such simplified scripts (i.e., no extra functions or classes, no __init__.py file, etc). You probably want to do this in a temporary directory elsewhere on your system.
If your problem goes away, slowly build back up to where it reappears again. By then, possibly, you've discovered the actual problem as well. If you then don't understand why this (last) change you made caused the problem, feel free to update your question with all the new information.

Python script entry point: How to call "__main__2"?

I have inherited a python script which appears to have multiple distinct entry points. For example:
if __name__ == '__main__1':
    ... Do stuff for option 1
if __name__ == '__main__2':
    ... Do stuff for option 2
etc
Google has turned up a few other examples of this syntax (e.g. here) but I'm still no wiser on how to use it.
So the question is: How can I call a specific entry point in a python script that has multiple numbered __main__ sections?
Update:
I found another example of it here, where the syntax appears to be related to a specific tool.
https://github.com/brython-dev/brython/issues/163
The standard docs mention only __main__ as a reserved module name. After looking at your sample, I notice that every __main__N section seems separate: it does its own imports and performs some enclosed functionality. My suspicion is that the developer wanted to quickly swap functionalities and didn't bother to use command-line arguments for that, opting instead to rename '__main__2' to '__main__' as needed.
This is by no means proven, though - any chance of contacting the one who wrote this in the first place?
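If the goal really was just to pick one of the blocks at run time, the conventional replacement is a command-line argument; a minimal sketch (the option functions are invented):

import argparse

def option1():
    print("doing stuff for option 1")

def option2():
    print("doing stuff for option 2")

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--option', type=int, choices=[1, 2], default=1)
    args = parser.parse_args()
    {1: option1, 2: option2}[args.option]()

Run it as python script.py --option 2 instead of renaming '__main__2' to '__main__'.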

Use pytest to test and grade student code

Say I want to grade some student python code using tests, something like (this is pseudo-code I wish I could write):
code = __import__("student_code")  # Import the code to be tested
grade = 100

for test in all_tests():        # Loop over the tests that were gathered
    good = perform(test, code)  # Perform the test individually on the code
    if not good:                # Do something if the code gives the wrong result
        grade -= 1
For that, I would like to use pytest (easy to write good tests), but there are many things I don't know how to do:
how to run tests on external code? (here the code imported from student's code)
how to list all the tests available? (here all_tests())
how to run them individually on code? (here perform(test, code))
I couldn't find anything related to this use case (pytest.main() does not seem to do the trick anyhow...)
I hope you see my point, cheers!
EDIT
I finally found how to perform my 1st point (apply tests on external code). In the repository where you want to perform the tests, generate a conftest.py file with:
import imp  # standard library; supports import by filename (deprecated in Python 3)
import pytest

def pytest_addoption(parser):
    """Add a custom command-line option to py.test."""
    parser.addoption("--module", help="Code file to be tested.")

@pytest.fixture(scope='session')
def module(request):
    """Import code specified with the command-line custom option '--module'."""
    codename = request.config.getoption("--module")
    # Import module (standard __import__ does not support import by filename)
    try:
        code = imp.load_source('code', codename)
    except Exception:
        print("ERROR while importing {!r}".format(codename))
        raise
    return code
Then, gather your tests in a tests.py file, using the module fixture:
def test_sample(module):
    assert module.add(1, 2) == 3
Finally, run the tests with py.test tests.py --module student.py.
I'm still working on points 2 and 3.
EDIT 2
I uploaded my (incomplete) take at this question:
https://gitlab.in2p3.fr/ycopin/pyTestExam
Help & contributions are welcome!
Very cool and interesting project. It's difficult to answer without knowing more.
Basically you should be able to do this by writing a custom plugin. Probably something you could place in a conftest.py in a test or project folder with your unittest subfolder and a subfolder for each student.
Probably would want to write two plugins:
One to allow weighting of tests (e.g. test_foo_10 and test_bar_5) and calculation of a final grade (e.g. 490/520) (teamcity-messages is an example that uses the same hooks)
Another to allow distribution of test to separate processes. (xdist as an example)
I know this is not a very complete answer, but I wanted to at least point out the last point. Since there is a very high probability that students would be using overlapping module names, they would clash in a pytest world, where tests are collected first and then run in a process that attempts not to re-import modules with a common name.
Even if you attempt to control for that, you will eventually have a student manipulate the global namespace in a sloppy way that could cause another student's code to fail. You will, for that reason, need either a bash script that runs each student's file or a plugin that runs them in separate processes.
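A minimal sketch of that per-process idea, reusing the --module option from the conftest.py above (the submissions folder name is invented):

import subprocess
import sys
from pathlib import Path

# One fresh interpreter per student file, so student modules can never
# clash with one another in sys.modules.
for student_file in sorted(Path('submissions').glob('*.py')):
    result = subprocess.run(
        [sys.executable, '-m', 'pytest', 'tests.py', '--module', str(student_file)])
    print(student_file.name, '-> pytest exit code', result.returncode)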
Make this use case a graded take-home exam and see what they come up with :-) ... naaah ... but you could ;-)
I came up with something like this (where it is assumed that the sum function is the student code):
import unittest

score = 0

class TestAndGrade(unittest.TestCase):
    def test_correctness(self):
        self.assertEqual(sum([2, 2]), 4)
        global score; score += 6  # Increase score

    def test_edge_cases(self):
        with self.assertRaises(Exception):
            sum([2, 'hello'])
        global score; score += 1  # Increase score

    # Print the score
    @classmethod
    def tearDownClass(cls):
        global score
        print('\n\n-------------')
        print('| Score: {} |'.format(score))
        print('-------------\n')

# Run the tests
unittest.main()

Common variables in modules

I have three python files, let's call them master.py, slave1.py and slave2.py. Now slave1.py and slave2.py do not have any functions, but are required to do two different things using the same input (say the variable inp).
What I'd like to do is to call both the slave programs from master, and specify the one input variable inp in master, so I don't have to do it twice. Also so I can change the outputs of both slaves in one master program etc.
I'd like to keep the code of both slave1.py and slave2.py separate, so I can debug them individually if required, but when I try to do
#! /usr/bin/python
# This is master.py
import slave1
import slave2
inp = #some input
both slave1 and slave2 run before I can change the input. As I understand it, the way python imports modules is to execute them first. But is there some way to delay executing them so I can specify the common input? Or any other way to specify the input for both files from one place?
EDIT: slave1 and slave2 perform two different simulations being given a particular initial condition. Since the output of the two are the same, I'd like to display them in a similar manner, as well as have control over which files to write the simulated data to. So I figured importing both of them into a master file was the easiest way to do that.
Write the code in your slave modules as functions, import the functions, then call the functions from master with whatever input you need. If you need to have more stateful information, consider constructing an object.
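A rough sketch of the object variant (names invented):

# slave1.py
class Simulation:
    def __init__(self, inp):
        self.inp = inp      # the common input, supplied by the caller
        self.result = None

    def run(self):
        # the actual simulation using self.inp goes here
        self.result = self.inp
        return self.result

# master.py would then do:
# from slave1 import Simulation
# sim = Simulation(inp)
# print(sim.run())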
You can do imports at any time:
inp = #some input
import slave1
import slave2
Note that this is generally considered bad design - you would be better off making the modules contain a function, rather than just having the code run when you import the module.
It looks like the architecture of your program is not really optimal. I think you have two files that execute immediately when you run them with python slave1.py. That is nice for scripting, but when you import them you run into trouble, as you have experienced.
Best is to wrap your code in the slave files in a function (as suggested by #sr2222) and call these explicitly from master.py:
slave1.py / slave2.py:

def run(inp):
    # your code

master.py:

import slave1, slave2

inp = "foo"
slave1.run(inp)
slave2.run(inp)
If you still want to be able to run the slaves independently you could add something like this at the end:
if __name__ == "__main__":
    inp = "foo"
    run(inp)
