How do I make a module think its name is __main__? [duplicate] - python

I have a python script (actually, lots of scripts) with code that is executed only when this module is run as the main script:
if __name__ == '__main__':
    print("I am the main script")
But now I want a testing script to load them as modules, so that it can then poke in their internal state. Rewriting (to turn the code block into a function) is not an option. How do I import a module in such a way that it thinks its name is __main__? I'm sure I've seen this done before, with the help of some import library or other, but it's not coming up in my searches.

You'd have to bypass the import machinery and use exec instead:
import imp
main = imp.new_module('__main__')

with open(module_filename, 'r') as source:
    exec(source.read(), vars(main))
(Note: the imp module is deprecated since Python 3.4 and was removed in 3.12; on modern Python, types.ModuleType('__main__') builds the same empty module object.)
Demo:
>>> source = '''\
... if __name__ == '__main__':
...     print("I am the main script")
... '''
>>> import imp
>>> main = imp.new_module('__main__')
>>> exec(source, vars(main))
I am the main script
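For what it's worth, the standard library's runpy module wraps this same pattern; a minimal sketch (the module name 'mymodule' is a placeholder):
import runpy

# Executes the module as if it were run as a script: run_name sets the
# module's __name__, so the __main__ guard fires. Returns the resulting
# module globals, which the test can then inspect.
state = runpy.run_module('mymodule', run_name='__main__')

# runpy.run_path('path/to/script.py', run_name='__main__') does the same
# for a file path instead of a module name.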
Rather than go this route, consider creating a function you call from the __main__ guard instead, so you can just import that function for testing:
def main():
    print("I am the main script")

if __name__ == '__main__':
    main()
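The testing script can then simply do (module name hypothetical):
import mymodule
mymodule.main()  # runs the same code the __main__ guard would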

Related

NameError when putting variable declaration in if __name__ == '__main__': [duplicate]

I have a Python file named main.py. I am running it on Python 3.9.13 on Windows.
import uvicorn
from fastapi import FastAPI

app = FastAPI()

@app.post('/c')
async def c(b: str):
    print(a)

if __name__ == '__main__':
    a = load_embeddings('embeddings')
    uvicorn.run('main:app', host='127.0.0.1', port=80)
Running this, then invoking POST /c, causes a 500 error with NameError: name 'a' is not defined.
However, it seems obvious that a will be defined before the server is run. If I move a outside of the if __name__ == '__main__': block, it works, but load_embeddings then runs multiple times (exactly 3) for unknown reasons. Since load_embeddings takes a long time for me, I do not want the duplicate execution.
I am looking for either of these as a solution to my issue: stop whatever is outside if __name__ == '__main__': from executing multiple times, OR make a globally visible when it is defined under if __name__ == '__main__':.
Note: variable names are intentionally renamed for ease of reading. Please do not advise me on coding style/naming conventions. I know the community is helpful, but that's not the point here, thanks.
You can resolve the issue by moving the loading of a inside the c function, with a check so the embeddings are loaded only once. A module-level flag keeps track of whether the embeddings have been loaded or not.
Here is an example:
import uvicorn
from fastapi import FastAPI

app = FastAPI()

EMBEDDINGS_LOADED = False
a = None

def load_embeddings(filename):
    # Load embeddings code here
    ...

@app.post('/c')
async def c(b: str):
    global EMBEDDINGS_LOADED, a
    if not EMBEDDINGS_LOADED:
        a = load_embeddings('embeddings')
        EMBEDDINGS_LOADED = True
    print(a)

if __name__ == '__main__':
    uvicorn.run('main:app', host='127.0.0.1', port=80)
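For context: uvicorn.run('main:app', ...) imports main again by its import string, and in that imported copy __name__ is 'main', not '__main__', so the guarded assignment never runs in the serving process. An alternative that loads once per worker process is a startup hook; a minimal sketch, assuming load_embeddings as in the question (newer FastAPI versions prefer a lifespan handler over on_event):
import uvicorn
from fastapi import FastAPI

app = FastAPI()
a = None  # populated once per worker at startup

def load_embeddings(filename):
    # Load embeddings code here
    ...

@app.on_event('startup')
async def load_once():
    # Runs once per worker process, after import but before any request.
    global a
    a = load_embeddings('embeddings')

@app.post('/c')
async def c(b: str):
    print(a)

if __name__ == '__main__':
    uvicorn.run('main:app', host='127.0.0.1', port=80)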

Python Multiprocessing workflow troubleshooting [duplicate]

I am trying my very first formal Python program using threading and multiprocessing, on a Windows machine. I am unable to launch the processes, though, with Python giving the following message. The thing is, I am not launching my threads in the main module; the threads are handled in a separate module, inside a class.
EDIT: By the way, this code runs fine on Ubuntu, just not on Windows.
RuntimeError:
Attempt to start a new process before the current process
has finished its bootstrapping phase.
This probably means that you are on Windows and you have
forgotten to use the proper idiom in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce a Windows executable.
My original code is pretty long, but I was able to reproduce the error in an abridged version. It is split into two files: the first is the main module and does very little other than import the module which handles processes/threads and call a method. The second module is where the meat of the code is.
testMain.py:
import parallelTestModule
extractor = parallelTestModule.ParallelExtractor()
extractor.runInParallel(numProcesses=2, numThreads=4)
parallelTestModule.py:
import multiprocessing
from multiprocessing import Process
import threading

class ThreadRunner(threading.Thread):
    """ This class represents a single instance of a running thread"""

    def __init__(self, name):
        threading.Thread.__init__(self)
        self.name = name

    def run(self):
        print(self.name, '\n')

class ProcessRunner:
    """ This class represents a single instance of a running process """

    def runp(self, pid, numThreads):
        mythreads = []
        for tid in range(numThreads):
            name = "Proc-" + str(pid) + "-Thread-" + str(tid)
            th = ThreadRunner(name)
            mythreads.append(th)
        for i in mythreads:
            i.start()
        for i in mythreads:
            i.join()

class ParallelExtractor:
    def runInParallel(self, numProcesses, numThreads):
        myprocs = []
        prunner = ProcessRunner()
        for pid in range(numProcesses):
            pr = Process(target=prunner.runp, args=(pid, numThreads))
            myprocs.append(pr)
        # if __name__ == 'parallelTestModule':  # This didn't work
        # if __name__ == '__main__':  # This obviously doesn't work
        #     multiprocessing.freeze_support()  # added after seeing the error, to no avail
        for i in myprocs:
            i.start()
        for i in myprocs:
            i.join()
On Windows the subprocesses will import (i.e. execute) the main module at start. You need to insert an if __name__ == '__main__': guard in the main module to avoid creating subprocesses recursively.
Modified testMain.py:
import parallelTestModule

if __name__ == '__main__':
    extractor = parallelTestModule.ParallelExtractor()
    extractor.runInParallel(numProcesses=2, numThreads=4)
Try putting your code inside a main function in testMain.py
import parallelTestModule

if __name__ == '__main__':
    extractor = parallelTestModule.ParallelExtractor()
    extractor.runInParallel(numProcesses=2, numThreads=4)
See the docs:
"For an explanation of why (on Windows) the if __name__ == '__main__'
part is necessary, see Programming guidelines."
which say
"Make sure that the main module can be safely imported by a new Python
interpreter without causing unintended side effects (such a starting a
new process)."
... by using if __name__ == '__main__'
Though the earlier answers are correct, there's a small complication it would help to remark on.
In case your main module imports another module in which global variables or class member variables are defined and initialized to (or using) some new objects, you may have to condition that import in the same way:
if __name__ == '__main__':
    import my_module
As @Ofer said, when you are using other libraries or modules, you should import all of them inside the if __name__ == '__main__': block.
So, in my case, it ended up like this:
if __name__ == '__main__':
    import librosa
    import os
    import pandas as pd

    run_my_program()
Hello, here is my structure for multiprocessing:
from multiprocessing import Process
import time

start = time.perf_counter()

def do_something(time_for_sleep):
    print(f'Sleeping {time_for_sleep} second...')
    time.sleep(time_for_sleep)
    print('Done Sleeping...')

p1 = Process(target=do_something, args=[1])
p2 = Process(target=do_something, args=[2])

if __name__ == '__main__':
    p1.start()
    p2.start()
    p1.join()
    p2.join()

    finish = time.perf_counter()
    print(f'Finished in {round(finish - start, 2)} second(s)')
You don't have to put the imports inside the if __name__ == '__main__': block; just put the code you wish to run inside it.
In YOLOv5, with Python 3.8.5:
if __name__ == '__main__':
    from yolov5 import train
    train.run()
In my case it was a simple bug in the code: using a variable before it was created. Worth checking that out before trying the above solutions. Why I got this particular error message, Lord knows.
The solution below should work for both Python multiprocessing and PyTorch multiprocessing.
As other answers mentioned, the fix is to add if __name__ == '__main__':, but I faced several issues in identifying where to put it, because I am using several scripts and modules. As soon as I could call my first function inside main, everything before it started to create multiple processes (not sure why).
Putting the guard at the very first line (even before the imports) worked; guarding only the call to the first function returned a timeout error. The below is the first file of my code; multiprocessing is used after calling several functions, but putting the guard first seems to be the only fix here.
if __name__ == '__main__':
    from mjrl.utils.gym_env import GymEnv
    from mjrl.policies.gaussian_mlp import MLP
    from mjrl.baselines.quadratic_baseline import QuadraticBaseline
    from mjrl.baselines.mlp_baseline import MLPBaseline
    from mjrl.algos.npg_cg import NPG
    from mjrl.algos.dapg import DAPG
    from mjrl.algos.behavior_cloning import BC
    from mjrl.utils.train_agent import train_agent
    from mjrl.samplers.core import sample_paths
    import os
    import json
    import mjrl.envs
    import mj_envs
    import time as timer
    import pickle
    import argparse
    import numpy as np

    # ===============================================================================
    # Get command line arguments
    # ===============================================================================
    parser = argparse.ArgumentParser(description='Policy gradient algorithms with demonstration data.')
    parser.add_argument('--output', type=str, required=True, help='location to store results')
    parser.add_argument('--config', type=str, required=True, help='path to config file with exp params')
    args = parser.parse_args()
    JOB_DIR = args.output
    if not os.path.exists(JOB_DIR):
        os.mkdir(JOB_DIR)
    with open(args.config, 'r') as f:
        job_data = eval(f.read())
    assert 'algorithm' in job_data.keys()
    assert any([job_data['algorithm'] == a for a in ['NPG', 'BCRL', 'DAPG']])
    job_data['lam_0'] = 0.0 if 'lam_0' not in job_data.keys() else job_data['lam_0']
    job_data['lam_1'] = 0.0 if 'lam_1' not in job_data.keys() else job_data['lam_1']
    EXP_FILE = JOB_DIR + '/job_config.json'
    with open(EXP_FILE, 'w') as f:
        json.dump(job_data, f, indent=4)

    # ===============================================================================
    # Train Loop
    # ===============================================================================
    e = GymEnv(job_data['env'])
    policy = MLP(e.spec, hidden_sizes=job_data['policy_size'], seed=job_data['seed'])
    baseline = MLPBaseline(e.spec, reg_coef=1e-3, batch_size=job_data['vf_batch_size'],
                           epochs=job_data['vf_epochs'], learn_rate=job_data['vf_learn_rate'])

    # Get demonstration data if necessary and behavior clone
    if job_data['algorithm'] != 'NPG':
        print("========================================")
        print("Collecting expert demonstrations")
        print("========================================")
        demo_paths = pickle.load(open(job_data['demo_file'], 'rb'))

        ########################################################################################
        demo_paths = demo_paths[0:3]
        print(job_data['demo_file'], len(demo_paths))
        for d in range(len(demo_paths)):
            feats = demo_paths[d]['features']
            feats = np.vstack(feats)
            demo_paths[d]['observations'] = feats
        ########################################################################################

        bc_agent = BC(demo_paths, policy=policy, epochs=job_data['bc_epochs'], batch_size=job_data['bc_batch_size'],
                      lr=job_data['bc_learn_rate'], loss_type='MSE', set_transforms=False)
        in_shift, in_scale, out_shift, out_scale = bc_agent.compute_transformations()
        bc_agent.set_transformations(in_shift, in_scale, out_shift, out_scale)
        bc_agent.set_variance_with_data(out_scale)

        ts = timer.time()
        print("========================================")
        print("Running BC with expert demonstrations")
        print("========================================")
        bc_agent.train()
        print("========================================")
        print("BC training complete !!!")
        print("time taken = %f" % (timer.time() - ts))
        print("========================================")

        # if job_data['eval_rollouts'] >= 1:
        #     score = e.evaluate_policy(policy, num_episodes=job_data['eval_rollouts'], mean_action=True)
        #     print("Score with behavior cloning = %f" % score[0][0])

    if job_data['algorithm'] != 'DAPG':
        # We throw away the demo data when training from scratch or fine-tuning with RL without explicit augmentation
        demo_paths = None

    # ===============================================================================
    # RL Loop
    # ===============================================================================
    rl_agent = DAPG(e, policy, baseline, demo_paths,
                    normalized_step_size=job_data['rl_step_size'],
                    lam_0=job_data['lam_0'], lam_1=job_data['lam_1'],
                    seed=job_data['seed'], save_logs=True)

    print("========================================")
    print("Starting reinforcement learning phase")
    print("========================================")
    ts = timer.time()
    train_agent(job_name=JOB_DIR,
                agent=rl_agent,
                seed=job_data['seed'],
                niter=job_data['rl_num_iter'],
                gamma=job_data['rl_gamma'],
                gae_lambda=job_data['rl_gae'],
                num_cpu=job_data['num_cpu'],
                sample_mode='trajectories',
                num_traj=job_data['rl_num_traj'],
                num_samples=job_data['rl_num_samples'],
                save_freq=job_data['save_freq'],
                evaluation_rollouts=job_data['eval_rollouts'])
    print("time taken = %f" % (timer.time() - ts))
I ran into the same problem. @Ofer's method is correct, but there are some details to pay attention to. The following is the successfully debugged code I modified, for your reference:
if __name__ == '__main__':
    import matplotlib.pyplot as plt
    import numpy as np

    def imgshow(img):
        img = img / 2 + 0.5
        np_img = img.numpy()
        plt.imshow(np.transpose(np_img, (1, 2, 0)))
        plt.show()

    dataiter = iter(train_loader)
    images, labels = dataiter.next()
    imgshow(torchvision.utils.make_grid(images))
    print(' '.join('%5s' % classes[labels[i]] for i in range(4)))
For the record, I don't have a subroutine; I just have a main program, but I had the same problem as you. This demonstrates that when importing a Python library partway through a program, we should wrap the import with:
if __name__ == '__main__':
I tried the tricks mentioned above on the following very simple code, but I still cannot stop it from resetting on any of my Windows machines, with Python 3.8/3.10. I would very much appreciate it if you could tell me where I am wrong.
print('script reset')

def do_something(inp):
    print('Done!')

if __name__ == '__main__':
    from multiprocessing import Process, get_start_method
    print('main reset')
    print(get_start_method())
    Process(target=do_something, args=[1]).start()
    print('Finished')
output displays:
script reset
main reset
spawn
Finished
script reset
Done!
Update:
As far as I understand it, you are not preventing either the script containing the __main__ guard or the .start() call from re-running the script (which doesn't happen on Linux); rather, you are suggesting workarounds so that we don't see the reset. One has to keep all imports minimal and put them in each function separately, but it is still slow relative to Linux.
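For what it's worth, this re-import is how the spawn start method works by design: each child starts a fresh interpreter and imports the parent module, so any unguarded top-level code runs again in every child. The usual pattern is to keep module scope down to cheap definitions and move everything else behind the guard; a minimal sketch along those lines:
from multiprocessing import Process

def do_something(inp):
    print('Done!')

def main():
    # Children re-import this module, but only the parent executes main(),
    # because in the children __name__ is not '__main__'.
    p = Process(target=do_something, args=[1])
    p.start()
    p.join()
    print('Finished')

if __name__ == '__main__':
    main()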

if __name__ == '__main__' function call

I am trying to work around a problem I have encountered in a piece of code I need to build on. I have a python module that I need to be able to import and pass arguments that will then be parsed by the main module. What I have been given looks like this:
#main.py
if __name__ == '__main__':
    sys.argv  # pass arguments if given and whatnot
    # Do stuff...
What I need is to add a main() function that can take argument(s) and parse them and then pass them on like so:
#main.py with def main()
def main(args):
    # parse args
    return args

if __name__ == '__main__':
    main(sys.argv)  # pass arguments if given and whatnot
    # Do stuff...
To sum up: I need to import main.py and pass in arguments that are parsed by the main() function, and then give the returned information to the if __name__ == '__main__': part.
EDIT
To clarify what I am doing
#hello_main.py
import main
print(main.main("Hello, main"))
ALSO, I want to still be able to call main.py from a shell via:
$: python main.py "Hello, main"
thus preserving the __name__ == '__main__' behaviour.
Is what I am asking even possible? I have been spending the better part of today researching this issue, because I would like, if at all possible, to preserve the main.py module I have been given.
Thanks,
dmg
Within a module file you can write if __name__ == "__main__" to get specific behaviour when calling that file directly, e.g. via shell:
#mymodule.py
import sys

def func(args):
    return 2 * args

# This only happens when mymodule.py is called directly:
if __name__ == "__main__":
    double_args = func(sys.argv[1])
    print("In mymodule:", double_args)
One can then still use the function when importing it into another file:
#test.py
import mymodule
print("In test:", mymodule.func("test "))
Thus, calling python test.py will print "In test: test test ", while calling python mymodule.py hello will print "In mymodule: hellohello".
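A common idiom that satisfies both uses — importing main() from another script and calling the module from the shell — is to give main() an optional argv parameter; a sketch (the parsing is a placeholder):
# main.py
import sys

def main(argv=None):
    # Fall back to the real command line only when no arguments were passed in.
    if argv is None:
        argv = sys.argv[1:]
    # parse argv here; this placeholder just returns it unchanged
    return argv

if __name__ == '__main__':
    print(main())
    # Do stuff...
Then hello_main.py can call main.main(["Hello, main"]), and python main.py "Hello, main" still works unchanged.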

Importing values in config.py

I wanted to mix the config.py approach and ConfigParser to set some default values in config.py, which could be overridden by the user via a file in their home folder:
import ConfigParser
import os

CACHE_FOLDER = 'cache'
CSV_FOLDER = 'csv'

def main():
    cp = ConfigParser.ConfigParser()
    cp.readfp(open('defaults.cfg'))
    cp.read(os.path.expanduser('~/.python-tools.cfg'))

    CACHE_FOLDER = cp.get('folders', 'cache_folder')
    CSV_FOLDER = cp.get('folders', 'csv_folder')

if __name__ == '__main__':
    main()
When running this module I can see the value of CACHE_FOLDER being changed. However when in another module I do the following:
import config

def main():
    print config.CACHE_FOLDER
This will print the original value of the variable ('cache').
Am I doing something wrong ?
The main function in the code you show only gets run when that module is run as a script (due to the if __name__ == '__main__' block). If you want it to run any time the module is loaded, you should get rid of that restriction. If there's extra code in main that actually does something useful besides setting up the configuration, split that part out from the setup code:
def setup():
    ...  # the configuration stuff from main in the question

def main():
    ...  # other stuff to be done when run as a script

setup()  # called unconditionally, so it will run if you import this module

if __name__ == "__main__":
    main()  # this is called only when the module is run as a script
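One more detail worth noting for the question as posted: even when setup() runs on import, the assignments inside it create function locals unless the names are declared global, so the module-level defaults would still be unchanged. A sketch of the corrected setup, keeping the file layout from the question:
import os
import ConfigParser  # configparser on Python 3

CACHE_FOLDER = 'cache'
CSV_FOLDER = 'csv'

def setup():
    # Without this global statement, the two assignments below would
    # create locals and leave the module-level defaults untouched.
    global CACHE_FOLDER, CSV_FOLDER
    cp = ConfigParser.ConfigParser()
    cp.readfp(open('defaults.cfg'))
    cp.read(os.path.expanduser('~/.python-tools.cfg'))
    CACHE_FOLDER = cp.get('folders', 'cache_folder')
    CSV_FOLDER = cp.get('folders', 'csv_folder')

setup()  # runs on import, so importers see the overridden values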

How to run multiple test case from python nose test

I am a newbie in the process of learning Python, currently working on an automation project.
I have N test cases that need to be run, and from my reading material people suggest I use nosetests.
What is the way to run multiple test cases using nosetests?
And is the following the correct approach for doing it?
import threading
import time
import logging
import GLOBAL
import os
from EPP import EPP
import Queue
import unittest

global EPP_Queue
from test1 import test1
from test2 import test2

logging.basicConfig(level=logging.DEBUG,
                    format='(%(threadName)-10s) %(message)s',
                    )

class all_test(threading.Thread, unittest.TestCase):

    def cleanup():
        if os.path.exists("/dev/epp_dev"):
            os.unlink("/dev/epp_dev")
        print "starts here"
        server_ip = '192.168.10.15'
        EppQueue = Queue.Queue(1)
        EPP = threading.Thread(name='EPP', target=EPP,
                               args=('192.168.10.125', 54321, '/dev/ttyS17',
                                     EppQueue,))
        EPP.setDaemon(True)
        EPP.start()
        time.sleep(5)
        suite1 = unittest.TestLoader().loadTestsFromTestCase(test1)
        suite2 = unittest.TestLoader().loadTestsFromTestCase(test2)
        return unittest.TestSuite([suite1, suite2])
        print "final"
        raw_input("keyy")

def main():
    unittest.main()

if __name__ == '__main__':
    main()
Read
http://ivory.idyll.org/articles/nose-intro.html.
Download the package
http://darcs.idyll.org/~t/projects/nose-demo.tar.gz
Follow the instructions provided in the first link.
nosetests, when run from the command line as 'nosetests' or 'nosetests-2.6', will recursively hunt for tests in the directory you execute it in.
So if you have a directory holding N tests, just execute it in that directory; they will all be executed.
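As a minimal illustration, nose discovers modules and functions whose names match test*; dropping files like the following into a directory and running nosetests there runs them all (names hypothetical):
# test_example.py -- discovered automatically because the name starts with test
def test_addition():
    assert 1 + 1 == 2

def test_upper():
    assert 'abc'.upper() == 'ABC'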
