I have inherited a python script which appears to have multiple distinct entry points. For example:
if __name__ == '__main__1':
... Do stuff for option 1
if __name__ == '__main__2':
... Do stuff for option 2
etc
Google has turned up a few other examples of this syntax (e.g. here) but I'm still no wiser on how to use it.
So the question is: How can I call a specific entry point in a python script that has multiple numbered __main__ sections?
Update:
I found another example of it here, where the syntax appears to be related to a specfic tool.
https://github.com/brython-dev/brython/issues/163
The standard doc mentions only main as a reserved module namespace. After looking at your sample I notice that every main method seems separate, does its imports, performs some enclosed functionality. My suspicion is that the developer wanted to quickly swap functionalities and didn't bother to use command line arguments for that, opting instead to swap 'main2' to 'main' as needed.
This is by no means proven, though - any chance of contacting the one who wrote this in the first place?
Related
I need to run a .tcl file via command line which get invoked with a Python script. However, a single line in that .tcl file needs to change based on input from the user. For example:
info = input("Prompt for the user: ")
Now I need the string contained in info to replace one of the lines in .tcl file.
Rewriting the script is one of the trickier options to pick. It makes things harder to audit and it is tremendously easy to make a mess of. It's not recommended at all unless you take special steps, such as factoring out the bit you set into its own file:
File that you edit, e.g., settings.tcl (simple enough that it is pretty trivial to write and you can rewrite the whole lot each time without making a mess of it)
set value "123"
Use of that file:
set value 0
if {[file readable settings.tcl]} {
source settings.tcl
}
puts "value is $value"
More sophisticated versions of that are possible with safe interpreters and language profiling… but they're only really needed when the settings and the code are in different trust domains.
That said, there are other approaches that are usually easier. If you are invoking the Tcl script by running a subprocess, the easiest ways to pass an arbitrary parameter are to use one of:
A command line argument. These can be read on the Tcl side from the $argv global, which holds a list of all arguments after the script name. (The lindex and lassign commands tend to be useful here, e.g., set value [lindex $argv 0].)
An environment variable. These can be read on the Tcl side from the env global array, e.g., set value $env(MyVarName)
On standard input. A line can be read from that on the Tcl side using set line [gets stdin].
In more complex cases, you'd pass values in their own files, or by writing them into something like an SQLite database, or… well, there's lots of options.
If on the other hand the Tcl interpreter is in the same process, pass the values by setting the variables in it before asking for the script to run. (Tcl has almost no true globals — environment variables are a special exception, and only because the OS forces it upon us — so everything is specific to the interpreter context.)
Specifically, if you've got a Tcl instance object from tkinter (Tk is a subclass of that) then you can do:
import tkinter
interp = tkinter.Tcl()
interp.call("set", "value", 123)
interp.eval("source program.tcl")
# Or interp.call("source", "program.tcl")
That has the advantage of doing all the quoting for you.
I have the following project structure.
- README.rst
- LICENSE
- setup.py
- requirements.txt
- common_prj/__init__.py
- common_prj/common_lib.py
- common_prj/config.xml
- Project1/__init__.py
- Project1/some_app_m1.py
- Project1/some_app_m2.py
- Project1/some_app.py
- Project2/some_app1.py
I am having all my common classes in common_prj/common_lib.py file. Now each module file from Project1 is calling common_prj/common_lib.py to access common classes and setup project env using config.xml
#import in some_app_m1.py
from common_prj import common_lib
#import in some_app_m2.py
from common_prj import common_lib
Imports in some_app.py
from Project1 import some_app_m1
from Project1 import some_app_m2
from common_prj import common_lib
###
<some functions>
<some functions>
###
if __name__ == '__main__':
With the above three import it seems common_lib.py is getting executed multiple time , but if I keep the common_lib.py in Project1 then I don't see this issue.
Please let me now how can I keep common_lib.py in common package and call that from Project1 scripts without executing it multiple time. The purpose of common_lib.py is to share common classes with scripts in Project2.
I have the below code in common_lib.py which is repeating for calling classes from each module after import.
self.env = dict_env.get(int(input("Choose Database ENV for this execution : \n" + str(dict_env) + "\nSelect the numeric value => ")))
self.app = dict_app.get(int(input("Choose application for this execution : \n" + str(dict_app) + "\nSelect the numeric value => ")))
My question why I am not facing this issue if I keep common_lib.py in Project1 rather common_prj. I added the above code in common_lib.py because I don't wanted to repeat these lines and ENV setup in my all app code. These are global env settings for all application scripts code inside Project1.
output
Choose Database ENV for this execution :
{1: 'DEV', 2: 'SIT', 3: 'UAT', 4: 'PROD'}
Select the numeric value => 1
Choose application for this execution :
{1: 'app1', 2: 'app2', 3: 'app3', 4: 'app4', 5: 'app5', 6: 'app6'}
Select the numeric value => 1
Choose Database ENV for this execution : ### Here it is repeating again from common_lib.py
{1: 'DEV', 2: 'SIT', 3: 'UAT', 4: 'PROD'}
Select the numeric value => 1
Choose application for this execution : ### Here it is repeating again from common_lib.py
{1: 'app1', 2: 'app2', 3: 'app3', 4: 'app4', 5: 'app5', 6: 'app6'}
Select the numeric value => 1
Note: This is my first answer where I answer in-depth of the logistics of python. If I have said anything incorrect, please let me know or edit my answer (with a comment letting me know why). Please feel free to clarify any questions you may have too.
As the comments have discussed if __name__ == '__main__': is exactly the way to go. You can check out the answers on the SO post for a full description or check out the python docs.
In essence, __name__ is a variable for a python file. Its behavior includes:
When used in a file, its value is defaulted to '__main__'.
When a file imports another program. (import foo). foo.__name__ will return 'foo' as its value.
Now, when importing a file, python compiles the new file completely. This means that it will run everything that can be run. Python compiles all the methods, variables, etc. As it's compiling, if python comes across a callable function (i.e hello_world()) it will run it. This happens even if you're trying to import a specific part of the package or not.
Now, these two things are very important. If you have some trouble understanding the logic, you can take a look at some sample code I created.
Since you are importing from common_prj 3 times (from the some_app_m* file and my_app.py file), you are running the program in common_prj three times (This is also what makes python one of the slower programming languages).
Although I'm unable to see your complete code, I'm assuming that common_prj.py calls common_lib() at some point in your code (at no indentation level). This means that the two methods inside:
self.env = dict_env.get(int(input("Choose Database ENV for this execution : \n" + str(dict_env) + "\nSelect the numeric value => ")))
self.app = dict_app.get(int(input("Choose application for this execution : \n" + str(dict_app) + "\nSelect the numeric value => ")))
were also called.
So, in conclusion:
You imported common_prj 3 times
Which ran common_lib multiple times as well.
Solution:
This is where the __name__ part I talked about earlier comes into play
Take all callable functions that shouldn't be called during importing, and place them under the conditional of if __name__ == '__name__':
This will ensure that only if the common_prj file is run then the condition will pass.
Again, if you didn't understand what I said, please feel free to ask or look at my sample code!
Edit: You can also just remove these callable functions. Files that are meant to be imported use them to make sure that the functions work as intended. If you look at any of the packages that you import daily, you'll either find that the methods are called in the condition OR not in the file at all.
This is more of a question of library design, rather than the detailed mechanics of Python.
A library module (which is what common_prj is) should be designed to do nothing on import, only define functions, classes, constants etc. It can do things like pre-calculating constants, but it should be restricted to things that don't depend on anything external, produce the same results each time and don't involve much heavy computation. At that point, the code running three times will no longer be a problem.
If the library needs parameters, the caller should supply those, ideally in a way that allows multiple different sets of parameters to be used within the same program. Depending on the situation, the parameters can be a main object that manages the rest of the interaction, or parameters that are passed into each object constructor or function call, or a configuration object that's passed in. The library can provide a function to gather these parameters from the environment or user input, but it shouldn't be required; it should always be possible for the program to supply the parameters directly. This will also make tests easier to write.
Side notes:
The name "common" is a bit generic; consider renaming the library after what it does, or splitting it into several libraries each of which does one thing?
Python will try to cache imported modules so it doesn't re-run them multiple times; this is probably why it only ran the code once in one situation but three times in another. This is another reason to have only definitions happen on import, so the details of the caching don't affect functionality.
It seems after commenting the below in
Project1/init.py
the issue has been resolved. I was calling the same common_lib from init.py as well which was causing this issue . Removing the below entry resolved the issue.
#from common_prj import commonlib
My python scripts often contain "executable code" (functions, classes, &c) in the first part of the file and "test code" (interactive experiments) at the end.
I want python, py_compile, pylint &c to completely ignore the experimental stuff at the end.
I am looking for something like #if 0 for cpp.
How can this be done?
Here are some ideas and the reasons they are bad:
sys.exit(0): works for python but not py_compile and pylint
put all experimental code under def test():: I can no longer copy/paste the code into a python REPL because it has non-trivial indent
put all experimental code between lines with """: emacs no longer indents and fontifies the code properly
comment and uncomment the code all the time: I am too lazy (yes, this is a single key press, but I have to remember to do that!)
put the test code into a separate file: I want to keep the related stuff together
PS. My IDE is Emacs and my python interpreter is pyspark.
Use ipython rather than python for your REPL It has better code completion and introspection and when you paste indented code it can automatically "de-indent" the pasted code.
Thus you can put your experimental code in a test function and then paste in parts without worrying and having to de-indent your code.
If you are pasting large blocks that can be considered individual blocks then you will need to use the %paste or %cpaste magics.
eg.
for i in range(3):
i *= 2
# with the following the blank line this is a complete block
print(i)
With a normal paste:
In [1]: for i in range(3):
...: i *= 2
...:
In [2]: print(i)
4
Using %paste
In [3]: %paste
for i in range(10):
i *= 2
print(i)
## -- End pasted text --
0
2
4
In [4]:
PySpark and IPython
It is also possible to launch PySpark in IPython, the enhanced Python interpreter. PySpark works with IPython 1.0.0 and later. To use IPython, set the IPYTHON variable to 1 when running bin/pyspark:1
$ IPYTHON=1 ./bin/pyspark
Unfortunately, there is no widely (or any) standard describing what you are talking about, so getting a bunch of python specific things to work like this will be difficult.
However, you could wrap these commands in such a way that they only read until a signifier. For example (assuming you are on a unix system):
cat $file | sed '/exit(0)/q' |sed '/exit(0)/d'
The command will read until 'exit(0)' is found. You could pipe this into your checkers, or create a temp file that your checkers read. You could create wrapper executable files on your path that may work with your editors.
Windows may be able to use a similar technique.
I might advise a different approach. Separate files might be best. You might explore iPython notebooks as a possible solution, but I'm not sure exactly what your use case is.
Follow something like option 2.
I usually put experimental code in a main method.
def main ():
*experimental code goes here *
Then if you want to execute the experimental code just call the main.
main()
With python-mode.el mark arbitrary chunks as section - for example via py-sectionize-region.
Than call py-execute-section.
Updated after comment:
python-mode.el is delivered by melpa.
M-x list-packages RET
Look for python-mode - the built-in python.el provides 'python, while python-mode.el provides 'python-mode.
Developement just moved hereto: https://gitlab.com/python-mode-devs/python-mode
I think the standard ('Pythonic') way to deal with this is to do it like so:
class MyClass(object):
...
def my_function():
...
if __name__ == '__main__':
# testing code here
Edit after your comment
I don't think what you want is possible using a plain Python interpreter. You could have a look at the IEP Python editor (website, bitbucket): it supports something like Matlab's cell mode, where a cell can be defined with a double comment character (##):
## main code
class MyClass(object):
...
def my_function():
...
## testing code
do_some_testing_please()
All code from a ##-beginning line until either the next such line or end-of-file constitutes a single cell.
Whenever the cursor is within a particular cell and you strike some hotkey (default Ctrl+Enter), the code within that cell is executed in the currently running interpreter. An additional feature of IEP is that selected code can be executed with F9; a pretty standard feature but the nice thing here is that IEP will smartly deal with whitespace, so just selecting and pasting stuff from inside a method will automatically work.
I suggest you use a proper version control system to keep the "real" and the "experimental" parts separated.
For example, using Git, you could only include the real code without the experimental parts in your commits (using add -p), and then temporarily stash the experimental parts for running your various tools.
You could also keep the experimental parts in their own branch which you then rebase on top of the non-experimental parts when you need them.
Another possibility is to put tests as doctests into the docstrings of your code, which admittedly is only practical for simpler cases.
This way, they are only treated as executable code by the doctest module, but as comments otherwise.
I would like R scripts to have a main() function that gets executed while in interactive mode. But the main() function should not be executed while sourcing the file.
There is already a question about this and a very good answer suggests using the interactive() function. However this doesn't work for me. I don't have enough reputation points to comment or answer in that question. So I ask the question here again.
I write this in script_1.r
if(interactive()){
your main code here
}
If I use knitr to waive a html or pdf document, sourcing the script. This code under if(interactive()) won't be executed. This is good for me, that's what I want.
My problem is that if I source("script_1.r") from script_2.r in interactive mode, it will still run the code under this if(interactive()) part.
The best way to get the kind of control you're looking for is to use options.
For instance, 'script.r' would look like this:
main <- function() {
message('main!')
}
if (getOption('run.main', default=TRUE)) {
main()
}
If you are sourcing the file in interactive mode and don't want main to execute, simply call options(run.main=FALSE) before you call source. If you are using the script with knitr and you want main to execute, don't set the option, and it will default to TRUE. Or if you don't want the main to run with knitr, call options(run.main=FALSE) before calling it.
As you’ve noticed, no, it’s not the same thing. if(interactive()) does exactly what the name says – it tests whether the code is run in an interactive shell. No more, no less.
There’s no direct equivalent of if __name__ == '__main__' from Python in R, since R has no concept of modules in the same way as Python, and source’d code is just executed directly.
You could write your own source command to replace the default one and perform the necessary check, however.
That said, the question you’ve linked does contain an answer which presents a workaround and essentially replicates Python’s functionality. However, this doesn’t seem to be what you want since it won’t work as you expect when invoked by Knitr.
Another way I have found that will work as you described without the need to use and change options is:
if (sys.nframe() == 0) {
# ... do main stuff
}
sys.nframe() is equal to 0 when run from the interactive terminal or using Rscript.exe, in which case the main code will run. Otherwise when the code is sourced, sys.nframe() is equal to 4 (in my case, not sure how it works exactly) which will prevent the main code from running.
source
In my python file, I have made a GUI widget that takes some inputs from user. I have imported a python module in my python file that takes some input using raw_input(). I have to use this module as it is, I have no right to change it. When I run my python file, it ask me for the inputs (due to raw_input() of imported module). I want to use GUI widget inputs in that place.
How can I pass the user input (that we take from widget) as raw_input() of imported module?
First, if importing it directly into your script isn't actually a requirement (and it's hard to imagine why it would be), you can just run the module (or a simple script wrapped around it) as a separate process, using subprocess or pexpect.
Let's make this concrete. Say you want to use this silly module foo.py:
def bar():
x = raw_input("Gimme a string")
y = raw_input("Gimme another")
return 'Got two strings: {}, {}'.format(x, y)
First write a trivial foo.wrapper.py:
import foo
print(foo.bar())
Now, instead of calling foo.do_thing() directly in your real script, run foo_wrapper as a child process.
I'm going to assume that you already have the input you want to send it in a string, because that makes the irrelevant parts of the answer simpler (in fact, it makes them possible—if you wanted to use some GUI code for that, there's really no way I could show you how unless you first tell us which GUI library you're using).
So:
foo_input = 'String 1\nString 2\n'
with subprocess.Popen([sys.executable, 'foo_wrapper.py'],
stdin=subprocess.PIPE, stdout=subprocess.PIPE) as p:
foo_output, _ = p.communicate(foo_input)
Of course in real life you'll want to use an appropriate path for foo_wrapper.py instead of assuming that it's in the current working directory, but this should be enough to illustrate the idea.
Meanwhile, if "I have no right to change it" just means "I don't (and shouldn't) have checkin rights to the foo project's github site or the relevant subtree on our company's P4 server" or whatever, there's a really easy answer: Fork it, and change the fork.
Even if it's got a weak copyleft license like LGPL: fork it, change the fork, publish your fork under the same license as the original, then use your fork.
If you're depending on the foo package being installed on every target system, and can't depend on your replacement foo being installed instead, that's a bit more of a problem. But if the function or method that actually calls raw_input is just a small fraction of the actual code in foo, you can fix that by monkeypatching foo at runtime.
And that leads to the last-ditch possibility: You can always monkeypatch raw_input itself.
Again, I'm going to assume that you already have the input you need to give it to make things simpler.
So, first you write a replacement function:
foo_input = ['String 1\n', 'String 2\n']
def fake_raw_input(prompt):
global foo_input
return foo_input.pop()
Now, there are two ways you can patch this in. Usually, you want to do this:
import foo
foo.raw_input = fake_raw_input
This means any code in foo that calls raw_input will see the function you crammed into its module globals instead of the normal builtin. Unless it does something really funky (like looking up the builtin directly and copying it to a local variable or something), this is the answer.
If you need to handle one of those really funky edge cases, and you don't mind doing something questionable, you can do this:
import __builtin__
__builtin__.raw_input = fake_raw_input
You must do this before the first import foo anywhere in your problem. Also, it's not clear whether this is intentionally guaranteed to work, accidentally guaranteed to work (and should be fixed in the future), or not guaranteed to work. But it does work (at least for CPython 2.5-2.7, which is what you're probably using).