Write Python code that executes Python scripts - python

I have a Python file (sql_script.py) with some methods to add/modify data in a SQL database, say:
import_data_into_specifications_table
import_data_into_linkage_table
truncate_linkage_table
....(do_other_stuff_on_db)
connect_db
Sometimes I have to call only one of the methods, other times several of them.
Until now, what I did was modify the main block according to what I needed to do:
if __name__ == '__main__':
    conn = connect_db()
    import_data_into_specifications_table(conn=conn)
    import_data_into_linkage_table(conn=conn)
    conn.close()
But I find this bad practice, as I always have to remember to remove the main block before committing the code.
A possible option could be to write an external Python file, say launch_sql_script.py, in which I write all possible combinations of methods I have to run, say:
def import_spec_and_linkage():
    conn = connect_db()
    import_data_into_specifications_table(conn=conn)
    import_data_into_linkage_table(conn=conn)
    conn.close()

...

if __name__ == '__main__':
    import_spec_and_linkage()
It can be useful to version this file, but I will still need to modify the main code according to what I need to do.
Do you think this is a good practice? Do you have any other suggestions?

The simplest way is to use the program arguments mechanism: describe the intended action as an argument when executing the script.
Take a peek at sys.argv.
Here is a sketch:
def meow():
    print("Meow!")

def bark():
    print("Bark!")

def moo():
    print("Moo!")

actions = {
    "meow": meow,
    "bark": bark,
    "moo": moo,
}

from sys import argv
actions[argv[1]]()
If you're going to parse more sophisticated program arguments, check out the argparse library.
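For instance, here is the same dispatch written with argparse instead of raw sys.argv (a sketch reusing the action names above):
import argparse

def meow():
    print("Meow!")

def bark():
    print("Bark!")

actions = {"meow": meow, "bark": bark}

parser = argparse.ArgumentParser(description="Run one of the predefined actions")
# restrict the positional argument to the known action names
parser.add_argument("action", choices=actions.keys())
args = parser.parse_args()
actions[args.action]()
Now an unknown action yields a usage error for free instead of a KeyError.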

Option 1: Separate them into individual scripts and run each from the command line
# import_data_into_specifications_table.py
if __name__ == '__main__':
    conn = connect_db()  # import from a shared file
    import_data_into_specifications_table(conn=conn)
# in bash
$ python import_data_into_specifications_table.py
Option 2: Write one file that parses command-line arguments
# my_sql_script.py
if __name__ == '__main__':
    conn = connect_db()
    if args.spec_table:  # use argparse to get these
        import_data_into_specifications_table(conn=conn)
    if args.linkage_table:
        import_data_into_linkage_table(conn=conn)
    ...
# in bash
$ python my_sql_script.py --spec_table --linkage_table
I would favour option 2 if the order of the operations doesn't matter or is always constant. If there are many permutations, I would go with option 1.
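Fleshing option 2 out into a runnable sketch (assuming connect_db and the two import functions live in the question's sql_script.py):
# my_sql_script.py
import argparse

from sql_script import (
    connect_db,
    import_data_into_specifications_table,
    import_data_into_linkage_table,
)

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Run selected DB operations')
    parser.add_argument('--spec_table', action='store_true',
                        help='import data into the specifications table')
    parser.add_argument('--linkage_table', action='store_true',
                        help='import data into the linkage table')
    args = parser.parse_args()

    conn = connect_db()
    try:
        if args.spec_table:
            import_data_into_specifications_table(conn=conn)
        if args.linkage_table:
            import_data_into_linkage_table(conn=conn)
    finally:
        conn.close()
The try/finally ensures the connection is closed even if one of the imports fails.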

exec() python command stops the whole execution

I am trying to run a script that sequentially changes some parameters in a config file (MET_config_EEv40.cfg) and runs a script ('IS_MET_EEv40_RAW.py') that retrieves these new config parameters:
import os
import sys
import configparser

config_filename = os.getcwd() + '/MET_config_EEv40.cfg'

parser = configparser.ConfigParser()
parser.read('MET_config_EEv40.cfg')
parser.set('RAW', 'product', 'ERA')
parser.set('RAW', 'gee_product', 'ECMWF/ERA5_LAND/HOURLY')
parser.set('RAW', 'indicator', 'PRCP')
parser.set('RAW', 'resolution', '11110')

with open('MET_config_EEv40.cfg', 'w') as configfile:
    parser.write(configfile)

## execute file
os.system(exec(open('IS_MET_EEv40_RAW.py').read()))
#exec(open('IS_MET_EEv40_RAW.py').read())
print('I am here')
After this execution, I get the output of my script as expected:
Period of Reference: 2005 - 2019
Area of Interest: /InfoSequia/GIS/ink/shp_basin_wgs84.shp
Raw data is up to date. No new dates available in raw data
Press any key to continue . . .
But it never prints the end line: I am here, which means the whole program terminates after the script executes. That is not what I want, as I would like to be able to change some other config parameters and run the script again.
That output is shown because of this line of the code:
if (delta.days <= 1):
    sys.exit('Raw data is up to date. No new dates available in raw data')
So could it be that sys.exit() is ending both processes? Any ideas for replacing sys.exit() inside the code to avoid this?
I'm executing this file from a .bat file that contains the following:
@echo OFF
docker exec container python MET/PRCPmain.py
pause
exec(source, globals=None, locals=None, /) does
Execute the given source in the context of globals and locals.
So
import sys
exec("sys.exit(0)")
print("after")
is the same as writing
import sys
sys.exit(0)
print("after")
which obviously terminates and does not print after.
exec has an optional globals argument, which you can use to provide your own alternative to sys, for example:
class MySys:
    def exit(self, *args):
        pass

exec("sys.exit(0)", {"sys": MySys()})
print("after")
which does output
after
as it uses exit from the MySys instance. If your code makes use of other things from sys and you want them to work normally, you would need methods in the MySys class mimicking those sys functions.
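Alternatively, since sys.exit() just raises SystemExit, you can catch that around the exec call instead of stubbing sys (a minimal sketch applied to the script from the question):
# run the child script but survive its sys.exit() calls
with open('IS_MET_EEv40_RAW.py') as f:
    source = f.read()

try:
    exec(source)
except SystemExit as e:
    # the child called sys.exit(); its message (if any) is in e.code
    print('child script exited early:', e.code)

print('I am here')  # now always reached
This also keeps working when the executed script does its own import sys, which would rebind the stubbed name back to the real module.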

One script to run multiple scripts

I am trying to simplify a workflow that requires several individual scripts to be run. So far, I have been able to write a script that runs the other scripts, but there is one issue I can't seem to resolve. Each of the sub-scripts requires a file path, and one argument within the path needs to be changed depending on who runs the scripts. Currently, I have to open each sub-script and manually change this argument.
Is it possible to set this argument as a variable in the parent script and pass it to the sub-scripts? Then it would only need to be set once and would no longer have to be updated in each sub-script.
So far I have.....
import os

def driver(path: str):
    path_base = path
    path_use = os.path.join(path_base, 'docs', 'analysis', 'forecast')
    file_cash = os.path.join(path_use, 'cash.py')
    file_cap = os.path.join(path_use, 'cap.py')
    exec(open(file_cash).read())
    exec(open(file_cap).read())
    return

if __name__ == '__main__':
    driver(path=r'c:\users\[username]')
I would like to set path=r'c:\users\[username]' and then pass that to cash.py and cap.py.
Instead of trying to replicate the behaviour of the import statement, you should directly import these sub-scripts and pass the values you need them to use as function/method arguments. To load a script from a specific file path, you can use importlib.util.spec_from_file_location(), like this:
main.py
import os
import importlib.util

def load_module(name, path):
    # load a module object from an explicit file path
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

def driver(path: str):
    path_use = os.path.join(path, 'docs', 'analysis', 'forecast')
    cash = load_module('cash', os.path.join(path_use, 'cash.py'))
    cap = load_module('cap', os.path.join(path_use, 'cap.py'))
    cash.cash("some_arg")
    cap.cap("some_other_arg")

if __name__ == '__main__':
    driver(path=r'c:\users\[username]')
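For this to work, each sub-script has to expose its logic as a function instead of running it at import time; cash.py might then look like this (a sketch, the body is hypothetical):
# cash.py
def cash(path):
    # hypothetical body: work with the path handed down by the parent script
    print('building cash forecast from', path)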

Python unit testing execute main() before test cases

I'm fairly new to unit testing in general and I'm trying to generate unit tests for a script I'm developing in Python. The script is designed to take in a data file and several command line arguments, and then run several functions in order while updating a database. For example, here is how my script would run:
import argparse

def main(args):
    file_info = get_file_info(args)
    process1(args, file_info)
    process2(args, file_info)

def get_file_info(args):
    file_info = {'path': args.file_path, ...}
    ...
    return file_info

def process1(args, file_info):
    # update database1
    ...

def process2(args, file_info):
    # update database2
    ...

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Script to process data files')
    parser.add_argument('-f', dest='file_path',
                        help='The absolute file_path to the source file')
    args = parser.parse_args()
    main(args)
As you can see, process1 and process2 are meant to modify the database rather than return a value, so the only way to check whether they worked properly is to query the database and see if it holds the correct values. I can do this easily in a unit test, but I want to run similar unit tests against separate calls to the main function. So I want my unit tests to look like this:
import unittest
import myScript

class Test_Load(unittest.TestCase):
    test_args_1 = ['-f', r'C:\file1.txt']
    myScript.main(test_args_1)

    def test_get_file_info_1(self):
        test_file_info = {'foo': 'bar'}
        self.assertDictEqual(test_file_info, myScript.get_file_info(test_args_1))

    def test_process1_1(self):
        # get list from sql query on database
        test_db_info = {'load_time': '1997-07-16T19:20:30', 'file_size': 512}
        self.assertDictEqual(test_db_info, sql_query)

    def test_process2_1(self):
        # get list from sql query on database
        test_db_info = {'foo': 'bar'}
        self.assertDictEqual(test_db_info, sql_query)

    test_args_2 = ['-f', r'C:\file2.txt']
    myScript.main(test_args_2)

    def test_get_file_info_2(self):
        test_file_info = {'foo': 'bar'}
        self.assertDictEqual(test_file_info, myScript.get_file_info(test_args_2))

    def test_process1_2(self):
        # get list from sql query on database
        test_db_info = {'load_time': '2017-02-08T12:00:00', 'file_size': 1024}
        self.assertDictEqual(test_db_info, sql_query)

    def test_process2_2(self):
        # get list from sql query on database
        test_db_info = {'foo': 'foobar'}
        self.assertDictEqual(test_db_info, sql_query)

    ...

if __name__ == '__main__':
    unittest.main()
The get_file_info() function returns a dictionary, so I should have no problem testing that.
So I want to call my script's main and then run some tests on the expected results. Am I able to call my script's main like this in the Test_Load class? And is there a better way to group these unit tests so they can be run independently of each other?
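For reference, unittest's setUpClass hook runs once before all the tests in a class, which keeps the main() call out of the class body; a minimal sketch, assuming myScript.main accepts the same argument list as above:
import unittest
import myScript

class TestLoadFile1(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # runs once, before any test method in this class
        cls.test_args = ['-f', r'C:\file1.txt']
        myScript.main(cls.test_args)

    def test_get_file_info(self):
        expected = {'foo': 'bar'}
        self.assertDictEqual(expected, myScript.get_file_info(self.test_args))
Giving each data file its own TestCase subclass then lets the two groups run independently of each other.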

Count and return number of nosetests

I have a set of unit test scripts saved in the pwd. I would like to be able to count the number of unit tests (nosetests) that would be executed, without actually executing them, and return that number as a Python variable, like this:
>>> number_of_unit_tests = count_unit_tests('.')
>>> number_of_unit_tests
400
I know I can collect from the command line like this:
nosetests --collect-only
But is it possible to do this from within a script?
You can run any nose command from a Python script, as described in basic nose usage; the only trick is extracting the number of tests. I took a look at the functional tests in nose and figured something like this should work, though you might be able to trim it down further:
import sys
import unittest
from cStringIO import StringIO

import nose
from nose.result import _TextTestResult

class TestRunner(unittest.TextTestRunner):
    def _makeResult(self):
        self.result = _TextTestResult(
            self.stream, self.descriptions, self.verbosity)
        return self.result

def count_unit_tests(module_name):
    stream = StringIO()
    runner = TestRunner(stream=stream)
    result = nose.run(
        testRunner=runner,
        argv=[sys.argv[0],
              module_name,
              '-s',
              '-v',
              '--collect-only'
              ]
    )
    return runner.result.testsRun

if __name__ == '__main__':
    print count_unit_tests('.')
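For what it's worth, if you use pytest instead of nose, the same count can be obtained by handing a small plugin object to pytest.main() in collect-only mode (a sketch; switching to pytest is an assumption, not part of the original answer):
import pytest

class _Counter:
    # tiny plugin that records how many tests were collected
    count = 0

    def pytest_collection_modifyitems(self, items):
        self.count = len(items)

def count_unit_tests(path):
    counter = _Counter()
    # --collect-only gathers tests without running them; -q keeps output terse
    pytest.main(['--collect-only', '-q', path], plugins=[counter])
    return counter.count

if __name__ == '__main__':
    print(count_unit_tests('.'))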

Why are environment variables empty in Flask apps?

I have a Flask app (my_app) that calls a function in a different file (my_functions):
my_app.py:
from flask import Flask, render_template
from my_functions import my_function

app = Flask(__name__)

@app.route('/')
def index():
    my_function()
    return render_template('index.html')
my_functions.py:
def my_function():
    try:
        import my_lib
    except:
        print("my_lib not found in system!")
    # do stuff...

if __name__ == "__main__":
    my_function()
When I execute my_functions.py directly (i.e., python my_functions.py), "my_lib" is imported without error; however, when I execute the Flask app (i.e., python my_app.py), I get an import error for "my_lib".
When I print the LD_LIBRARY_PATH variable at the beginning of each file:
print(os.environ['LD_LIBRARY_PATH'])
I get the correct value when calling my_functions.py, but no value (empty) when calling my_app.py. Trying to set this value at the beginning of my_app.py has no effect:
os.environ['LD_LIBRARY_PATH'] = '/usr/local/lib'
Questions:
(1) Why is 'LD_LIBRARY_PATH' empty when called within the Flask app?
(2) How do I set it?
Any help appreciated.
LD_LIBRARY_PATH is cleared when executing the flask app, likely for security reasons as Mike suggested.
To get around this, I use subprocess to make a call directly to an executable:
import subprocess
call_str = "executable_name -arg1 arg1_value -arg2 arg2_value"
subprocess.call(call_str, shell=True, stderr=subprocess.STDOUT)
Ideally the program should be able to use the python bindings, but for now calling the executable works.
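If the goal is specifically to hand LD_LIBRARY_PATH to the child process, subprocess also accepts an explicit environment, which sidesteps the cleared variable (a sketch; the executable name and library path are placeholders):
import os
import subprocess

# start from the current environment and add what the child needs
env = os.environ.copy()
env['LD_LIBRARY_PATH'] = '/usr/local/lib'

subprocess.call(
    ['executable_name', '-arg1', 'arg1_value', '-arg2', 'arg2_value'],
    env=env,
    stderr=subprocess.STDOUT,
)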
