I was doing some troubleshooting and I was curious if it is possible to run a Python script interactively, change a function defined in the script, save the file, then have the interactive shell recognize the changes. Here is an example of what I am doing currently:
my_script.py:
def dummy_func():
    print('Something')

def main():
    dummy_func()

if __name__ == '__main__':
    main()
I go to my terminal and run:
>python -i my_script.py
Something
>>>
If I go back to my_script.py in my editor and make the following change:
def dummy_func():
    print('Something else')
Then go back to the terminal (which is still open) and re-run the updated function:
>>>dummy_func()
Something
>>>
Is it possible to do something to instead get the following behavior?:
>>>dummy_func()
Something else
>>>
I know it is possible to reload modules using importlib and reload but as far as I can tell that does not apply here since I am not importing anything.
I think this may be distinct from How do I unload (reload) a Python module?. I am asking if there is a way to reload the current file you are running interactively through the python shell, while that question is asking about reloading a module you have imported into another python script.
From what I can find, the short answer is:
No, normally the Python interpreter does not recognize changes to a file once that file has been parsed, analyzed, and fed into the interpreter.
What you should do instead, apparently, is use your .py file as a module: import it into another .py file, then run that new file. This allows the first file to be reloaded through the interactive interpreter. Here's an example:
from importlib import reload  # Python 3.4+ only.
import foo

while True:
    # Do some things.
    if is_changed(foo):
        foo = reload(foo)
I am still a little fuzzy on the details, but maybe someone can help fill those in. As far as I can tell from the sources I linked below, the interpreter basically takes some steps to load your program from the saved Python file into memory (glossing over a lot of details). Once this process has been performed, the interpreter does not perform it again unless you explicitly ask it to, for example by using importlib's reload() function to perform the process again.
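Concretely, for the original example, the workflow looks something like this (a sketch; it assumes you start a bare interactive session in the same directory as my_script.py):

# $ python        (instead of: python -i my_script.py)
import importlib
import my_script                 # importing does not run main(), thanks to the __name__ guard

my_script.dummy_func()           # prints: Something
# ...edit my_script.py in your editor and save...
importlib.reload(my_script)      # re-executes the module's code
my_script.dummy_func()           # prints: Something else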
Sources:
How do I unload (reload) a Python module? (quoted above)
A Python Interpreter Written in Python:
This link has a lot more information about how the interpreter works, and I found this section particularly helpful:
Real Python Bytecode
At this point, we'll abandon our toy instruction
sets and switch to real Python bytecode. The structure of bytecode is
similar to our toy interpreter's verbose instruction sets, except that
it uses one byte instead of a long name to identify each instruction.
To understand this structure, we'll walk through the bytecode of a
short function. Consider the example below:
>>> def cond():
...     x = 3
...     if x < 5:
...         return 'yes'
...     else:
...         return 'no'
...
Python exposes a boatload of its internals at run time, and we can access them right from the REPL. For the function object cond, cond.__code__ is the code object associated with it, and cond.__code__.co_code is the bytecode.
There's almost never a good reason to use these attributes directly
when you're writing Python code, but they do allow us to get up to all
sorts of mischief—and to look at the internals in order to understand
them.
>>> cond.__code__.co_code # the bytecode as raw bytes
b'd\x01\x00}\x00\x00|\x00\x00d\x02\x00k\x00\x00r\x16\x00d\x03\x00Sd\x04\x00Sd\x00\x00S'
>>> list(cond.__code__.co_code) # the bytecode as numbers
[100, 1, 0, 125, 0, 0, 124, 0, 0, 100, 2, 0, 107, 0, 0, 114, 22, 0, 100, 3, 0, 83,
100, 4, 0, 83, 100, 0, 0, 83]
When we just print the bytecode, it
looks unintelligible—all we can tell is that it's a series of bytes.
Luckily, there's a powerful tool we can use to understand it: the dis
module in the Python standard library.
dis is a bytecode disassembler. A disassembler takes low-level code
that is written for machines, like assembly code or bytecode, and
prints it in a human-readable way. When we run dis.dis, it outputs an explanation of the bytecode it is passed.
>>> dis.dis(cond)
  2           0 LOAD_CONST               1 (3)
              3 STORE_FAST               0 (x)

  3           6 LOAD_FAST                0 (x)
              9 LOAD_CONST               2 (5)
             12 COMPARE_OP               0 (<)
             15 POP_JUMP_IF_FALSE       22

  4          18 LOAD_CONST               3 ('yes')
             21 RETURN_VALUE

  6     >>   22 LOAD_CONST               4 ('no')
             25 RETURN_VALUE
             26 LOAD_CONST               0 (None)
             29 RETURN_VALUE
What does all this mean? Let's look at the first instruction LOAD_CONST as an example. The number in the
first column (2) shows the line number in our Python source code. The
second column is an index into the bytecode, telling us that the
LOAD_CONST instruction appears at position zero. The third column is
the instruction itself, mapped to its human-readable name. The fourth
column, when present, is the argument to that instruction. The fifth
column, when present, is a hint about what the argument means.
How does the Python Runtime actually work?:
With Python, it uses an interpreter rather than a compiler. An
interpreter works in exactly the same way as a compiler, with one
difference: instead of code generation, it loads the output in-memory
and executes it directly on your system. (The exact details of how
this happens can vary wildly between different languages and different
interpreters.)
importlib — The implementation of import:
When reload() is executed:
Python module’s code is recompiled and the module-level code
re-executed, defining a new set of objects which are bound to names in
the module’s dictionary by reusing the loader which originally loaded
the module. The init function of extension modules is not called a
second time.
Again, please let me know if I need to edit this answer to follow etiquette.
Related
I have an AI program written in Python. I want to deploy the script on Ubuntu/Windows machine without exposing the source script.
How can I encrypt the Python script so that it is irreversibly encoded but can be used as usual (say by calling python <script_name>.py from the terminal)?
Obfuscate Python Scripts
First, compile the Python source file to code objects. Then iterate over the code objects, wrapping the bytecode of each one in the following format:
0    JUMP_ABSOLUTE            n = 3 + len(bytecode)
3    ...
     ... here is the obfuscated bytecode
     ...
n    LOAD_GLOBAL              ? (__pyarmor__)
n+3  CALL_FUNCTION            0
n+6  POP_TOP
n+7  JUMP_ABSOLUTE            0
Save the obfuscated code objects as .pyc or .pyo files, so they can be run or imported by a normal Python interpreter.
Run or Import Obfuscated Python Scripts
Those obfuscated files (.pyc or .pyo) can be run and imported in the normal way by a common Python interpreter. But when one of those code objects is called for the first time, the wrapped bytecode described in the section above works as follows:
The first op is JUMP_ABSOLUTE, which jumps to offset n.
At offset n, the instruction calls a PyCFunction. This function restores the obfuscated bytecode between offsets 3 and n and places the original bytecode at offset 0.
After that function call, the last instruction jumps to offset 0. The real bytecode is now executed.
There is a tool, Pyarmor, that obfuscates Python scripts this way.
Real encryption is not really possible, but through obfuscation and compression you can make it pretty hard for anyone to understand or reuse the source code. You can take a look at this:
https://liftoff.github.io/pyminifier/
You can also try compiling your python script to byte-code, but all these processes are still semi-reversible if someone really wants to reverse-engineer your code.
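A minimal sketch of the byte-compiling route with the standard py_compile module (the file names are just placeholders):

import py_compile

# Byte-compile the script; the resulting .pyc can be shipped without the .py.
# Note this is weak protection at best: decompilers can recover readable source.
py_compile.compile('myscript.py', cfile='myscript.pyc')

The result can then be run directly with python myscript.pyc.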
I'm trying to access some Fortran subroutines using F2PY, but I've run into the following problem during consecutive calls from IPython. Take this minimal Fortran code (I hope I didn't code anything stupid; my Fortran is a bit rusty..):
! test.f90
module mod
    integer i
contains
    subroutine foo
        i = i + 1
        print *, i
    end subroutine foo
end module mod
If I compile this using F2PY (f2py3.5 -c -m test test.f90), import it in Python and call it twice:
# run.py
import test
test.mod.foo()
test.mod.foo()
The resulting output is:
$ python run.py
1
2
So on every call of foo(), i is incremented, which is supposed to happen. But between different calls of run.py (either from the command line or IPython interpreter), everything should be "reset", i.e. the printed counter should start from 1 for every call. This happens when calling run.py from the command line, but if I call the script multiple times from IPython, i keeps increasing:
In [1]: run run.py
1
2
In [2]: run run.py
3
4
I know that there are lots of posts showing how to reload imports (using autoreload in IPython, importlib.reload(), ...), but none of them seem to work for this example. Is there a way to force a clean reload/import?
Some side notes: (1) The Fortran code that I'm trying to access is quite large, old and messy, so I'd prefer not to change anything in there; (2) I could easily do test.mod.i = something in between calls, but the real Fortran code is too complex for such solutions; (3) I'd really prefer a solution which I can put in the Python code over e.g. settings (autoreload, ..) which I have to manually put in the IPython interpreter (forget it once and ...)
If you can slightly change your Fortran code, you may be able to reset i without re-importing (probably faster, too).
The change introduces i as a common-block variable so it can be reset from outside. Your changed Fortran code will look like this:
! test.f90
module mod
    common /set1/ i
contains
    subroutine foo
        common /set1/ i
        i = i + 1
        print *, i
    end subroutine foo
end module mod
Reset the variable i from Python as below:
import test

test.mod.foo()
test.mod.foo()
test.set1.i = 0  # reset here
test.mod.foo()
This should produce the following result:
python run.py
1
2
1
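If changing the Fortran source were not an option, another approach (not from the answer above, just a sketch) would be to run the script in a fresh interpreter each time, since the extension module's state lives in the process:

import subprocess
import sys

def clean_run(script='run.py'):
    # Each call starts a new interpreter, so the F2PY extension module
    # is re-initialized from scratch and the counter starts at 1 again.
    subprocess.run([sys.executable, script], check=True)

clean_run()  # prints 1, 2
clean_run()  # prints 1, 2 again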
In my Python file, I have made a GUI widget that takes some inputs from the user. I have also imported a Python module that takes some input using raw_input(). I have to use this module as-is; I have no right to change it. When I run my file, it asks me for the inputs (due to the raw_input() of the imported module). I want to use the GUI widget inputs in their place.
How can I pass the user input (taken from the widget) to the raw_input() of the imported module?
First, if importing it directly into your script isn't actually a requirement (and it's hard to imagine why it would be), you can just run the module (or a simple script wrapped around it) as a separate process, using subprocess or pexpect.
Let's make this concrete. Say you want to use this silly module foo.py:
def bar():
    x = raw_input("Gimme a string")
    y = raw_input("Gimme another")
    return 'Got two strings: {}, {}'.format(x, y)
First write a trivial foo_wrapper.py:
import foo
print(foo.bar())
Now, instead of calling foo.bar() directly in your real script, run foo_wrapper as a child process.
I'm going to assume that you already have the input you want to send it in a string, because that makes the irrelevant parts of the answer simpler (in fact, it makes them possible—if you wanted to use some GUI code for that, there's really no way I could show you how unless you first tell us which GUI library you're using).
So:
import subprocess
import sys

foo_input = 'String 1\nString 2\n'
p = subprocess.Popen([sys.executable, 'foo_wrapper.py'],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE)
foo_output, _ = p.communicate(foo_input)
Of course in real life you'll want to use an appropriate path for foo_wrapper.py instead of assuming that it's in the current working directory, but this should be enough to illustrate the idea.
Meanwhile, if "I have no right to change it" just means "I don't (and shouldn't) have checkin rights to the foo project's github site or the relevant subtree on our company's P4 server" or whatever, there's a really easy answer: Fork it, and change the fork.
Even if it's got a weak copyleft license like LGPL: fork it, change the fork, publish your fork under the same license as the original, then use your fork.
If you're depending on the foo package being installed on every target system, and can't depend on your replacement foo being installed instead, that's a bit more of a problem. But if the function or method that actually calls raw_input is just a small fraction of the actual code in foo, you can fix that by monkeypatching foo at runtime.
And that leads to the last-ditch possibility: You can always monkeypatch raw_input itself.
Again, I'm going to assume that you already have the input you need to give it to make things simpler.
So, first you write a replacement function:
foo_input = ['String 1', 'String 2']

def fake_raw_input(prompt):
    # raw_input returns a line without its trailing newline, so the canned
    # answers are stored without '\n'; pop from the front to keep the order.
    return foo_input.pop(0)
Now, there are two ways you can patch this in. Usually, you want to do this:
import foo
foo.raw_input = fake_raw_input
This means any code in foo that calls raw_input will see the function you crammed into its module globals instead of the normal builtin. Unless it does something really funky (like looking up the builtin directly and copying it to a local variable or something), this is the answer.
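For example, with the foo module from above, the patch and a call look like this (a sketch):

import foo

foo.raw_input = fake_raw_input  # patch before any call into foo
print(foo.bar())                # -> Got two strings: String 1, String 2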
If you need to handle one of those really funky edge cases, and you don't mind doing something questionable, you can do this:
import __builtin__
__builtin__.raw_input = fake_raw_input
You must do this before the first import foo anywhere in your program. Also, it's not clear whether this is intentionally guaranteed to work, accidentally guaranteed to work (and should be fixed in the future), or not guaranteed to work. But it does work (at least for CPython 2.5-2.7, which is what you're probably using).
Programming in C I used to have code sections only used for debugging purposes (logging commands and the like). Those statements could be completely disabled for production by using #ifdef pre-processor directives, like this:
#ifdef MACRO
controlled text
#endif /* MACRO */
What is the best way to do something similar in python?
If you just want to disable logging methods, use the logging module. If the log level is set to exclude, say, debug statements, then logging.debug will be very close to a no-op (it just checks the log level and returns without interpolating the log string).
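A minimal sketch of that:

import logging

logging.basicConfig(level=logging.WARNING)  # excludes DEBUG and INFO

# This call returns almost immediately: the level check fails before the
# message is interpolated. (Argument expressions are still evaluated at the
# call site, so keep them cheap.)
logging.debug('value: %s', 42)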
If you want to actually remove chunks of code at bytecode compile time conditional on a particular variable, your only option is the rather enigmatic __debug__ global variable. This variable is set to True unless the -O flag is passed to Python (or PYTHONOPTIMIZE is set to something nonempty in the environment).
If __debug__ is used in an if statement, the if statement is actually compiled into only the True branch. This particular optimization is as close to a preprocessor macro as Python ever gets.
Note that, unlike macros, your code must still be syntactically correct in both branches of the if.
To show how __debug__ works, consider these two functions:
def f():
    if __debug__: return 3
    else: return 4

def g():
    if True: return 3
    else: return 4
Now check them out with dis:
>>> dis.dis(f)
  2           0 LOAD_CONST               1 (3)
              3 RETURN_VALUE
>>> dis.dis(g)
  2           0 LOAD_GLOBAL              0 (True)
              3 JUMP_IF_FALSE            5 (to 11)
              6 POP_TOP
              7 LOAD_CONST               1 (3)
             10 RETURN_VALUE
        >>   11 POP_TOP

  3          12 LOAD_CONST               2 (4)
             15 RETURN_VALUE
             16 LOAD_CONST               0 (None)
             19 RETURN_VALUE
As you can see, only f is "optimized".
It is important to understand that in Python, def and class are regular executable statements...
import os

if os.name == "posix":
    def foo(x):
        return x * x
else:
    def foo(x):
        return x + 42
...
so to do what you do with the preprocessor in C and C++, you can use the regular Python language.
The Python language is fundamentally different from C and C++ on this point because there is no concept of "compile time"; the only two phases are "parse time" (when the source code is read in) and "run time", when the parsed code (normally mostly composed of definition statements, but really arbitrary Python code) is executed.
I am using the term "parse time" even though, technically, when the source code is read in the transformation is a full compilation to bytecode; the point is that the semantics of C and C++ compilation are different. For example, the definition of a function happens during compilation in C and C++, while it happens at runtime in Python.
Even the equivalent of C and C++'s #include (which in Python is import) is a regular statement executed at run time, not at compile (parse) time, so it can be placed inside a regular Python if. It is quite common, for example, to have an import inside a try block that provides alternate definitions for some functions if a specific optional Python library is not present on the system.
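A minimal sketch of that optional-import pattern (NumPy here is just an illustration):

try:
    import numpy as np
    def dot(a, b):  # fast path when the optional library is available
        return float(np.dot(a, b))
except ImportError:
    def dot(a, b):  # pure-Python fallback with the same interface
        return sum(x * y for x, y in zip(a, b))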
Finally, note that in Python you can even create new functions and classes at runtime from scratch by using exec; they do not need to be present in your source code at all. You can also assemble those objects directly in code, because classes and functions are just regular objects (this is normally done only for classes, however).
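A tiny sketch of creating a function at runtime with exec:

src = "def triple(x):\n    return 3 * x\n"
namespace = {}
exec(src, namespace)            # compiles and runs the definition statement
print(namespace['triple'](7))   # 21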
There are some tools that instead try to treat def and class definitions and import statements as "static", for example to do static analysis of Python code and generate warnings on suspicious fragments, or to create a self-contained deployable package that doesn't depend on a specific Python installation being present on the system to run the program. All of them, however, need to account for the fact that Python is more dynamic than C or C++ in this area, and they allow adding exceptions for where the automatic analysis would fail.
Here is an example that I use to distinguish between Python 2 & 3 for my Python Tk programs:
import sys

if sys.version_info[0] == 3:
    from tkinter import *
    from tkinter import ttk
else:
    from Tkinter import *
    import ttk

""" rest of your code """
Hope that is a useful illustration.
As far as I am aware, you have to use actual if statements. There is no preprocessor, so there is no analogue to preprocessor directives.
Edit: Actually, it looks like the top answer to this question will be more illuminating: How would you do the equivalent of preprocessor directives in Python?
Supposedly there is a special variable __debug__ which, when used with an if statement, will be evaluated once and then not evaluated again during execution.
There is no direct equivalent that I'm aware of, so you might want to zoom-out and reconsider the problems that you used to solve using the preprocessor.
If it's just diagnostic logging you're after then there is a comprehensive logging module which should cover everything you wanted and more.
http://docs.python.org/library/logging.html
What else do you use the preprocessor for? Test configurations? There's a config module for that.
http://docs.python.org/library/configparser.html
Anything else?
If you are using #ifdef to check for variables that may have been defined in the scope above the current file, you can use exceptions. For example, I have scripts that I want to run differently from within ipython vs outside ipython (show plots vs save plots, for example). So I add
ipy = False
try:
    ipy = __IPYTHON__
except NameError:
    pass
This leaves me with a variable ipy, which tells me whether or not __IPYTHON__ was declared in a scope above my current script. This is the closest parallel I know of for an #ifdef function in Python.
For ipython, this is a great solution. You could use similar constructions in other contexts, in which a calling script sets variable values and the inner scripts check accordingly. Whether or not this makes sense, of course, would depend on your specific use case.
If you're working on Spyder, you probably only need this:
try:
    print(x)
except NameError:
    # code to run under ifndef
    x = "x is defined now!"
    # other code
The first time you run your script, you'll run the code under # code to run under ifndef; the second time, you'll skip it.
Hope it works:)
This can be achieved by passing a command line argument, as below:
import sys

my_macro = 0
if len(sys.argv) > 1:
    for x in sys.argv:
        if x == "MACRO":
            my_macro = 1

if my_macro == 1:
    pass  # controlled text goes here
Try running the following script and observe the results after this:
python myscript.py MACRO
Hope this helps.
My Python code is interlaced with lots of function calls used for (debugging|profiling|tracing etc.), for example:
import logging
logging.root.setLevel(logging.DEBUG)
logging.debug('hello')

j = 0
for i in range(10):
    j += i
    logging.debug('i %d j %d' % (i, j))
print(j)
logging.debug('bye')
I want to #define these resource-consuming functions out of the code, something like the C equivalent:
#define logging.debug(val)
Yes, I know the logging module's log level mechanism can be used to mask out log calls below a set level. But I'm asking for a general way to have the Python interpreter skip functions (that take time to run even if they don't do much).
One idea is to redefine the functions I want to comment out as empty functions:
def lazy(*args): pass
logging.debug = lazy
The above idea still calls a function, though, and may create a myriad of other problems.
Python does not have a preprocessor, although you could run your python source through an external preprocessor to get the same effect - e.g. sed "/logging.debug/d" will strip out all the debug logging commands. This is not very elegant though - you will end up needing some sort of build system to run all your modules through the preprocessor and perhaps create a new directory tree of the processed .py files before running the main script.
Alternatively if you put all your debug statements in an if __debug__: block they will get optimised out when python is run with the -O (optimise) flag.
As an aside, I checked the code with the dis module to ensure that it did get optimised away. I discovered that both
if __debug__: doStuff()
and
if 0: doStuff()
are optimised, but
if False: doStuff()
is not. This is because False is a regular Python object, and you can in fact do this:
>>> False = True
>>> if False: print "Illogical, captain"
Illogical, captain
Which seems to me a flaw in the language - hopefully it is fixed in Python 3.
Edit:
This is fixed in Python 3: Assigning to True or False now gives a SyntaxError.
Since True and False are constants in Python 3, it means that if False: doStuff() is now optimised:
>>> def f():
...     if False: print("illogical")
...
>>> dis.dis(f)
  2           0 LOAD_CONST               0 (None)
              3 RETURN_VALUE
Although I think the question is perfectly clear and valid (notwithstanding the many responses that suggest otherwise), the short answer is "there's no support in Python for this".
The only potential solution other than the preprocessor suggestion would be to use some bytecode hacking. I won't even begin to imagine how this should work in terms of the high-level API, but at a low level you could imagine examining code objects for particular sequences of instructions and re-writing them to eliminate them.
For example, look at the following two functions:
>>> def func():
...     if debug:  # analogous to if __debug__:
...         foo
>>> dis.dis(func)
  2           0 LOAD_GLOBAL              0 (debug)
              3 JUMP_IF_FALSE            8 (to 14)
              6 POP_TOP

  3           7 LOAD_GLOBAL              1 (foo)
             10 POP_TOP
             11 JUMP_FORWARD             1 (to 15)
        >>   14 POP_TOP
        >>   15 LOAD_CONST               0 (None)
             18 RETURN_VALUE
Here you could scan for the LOAD_GLOBAL of debug, and eliminate it and everything up to the JUMP_IF_FALSE target.
This one is the more traditional C-style debug() function that gets nicely obliterated by a preprocessor:
>>> def func2():
...     debug('bar', baz)
>>> dis.dis(func2)
  2           0 LOAD_GLOBAL              0 (debug)
              3 LOAD_CONST               1 ('bar')
              6 LOAD_GLOBAL              1 (baz)
              9 CALL_FUNCTION            2
             12 POP_TOP
             13 LOAD_CONST               0 (None)
             16 RETURN_VALUE
Here you would look for LOAD_GLOBAL of debug and wipe everything up to the corresponding CALL_FUNCTION.
Of course, both of those descriptions of what you would do are far simpler than what you'd really need for all but the most simplistic patterns of use, but I think it would be feasible. Would make a cute project, if nobody's already done it.
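As a first step in that direction, here is a read-only sketch that merely locates the pattern using dis.get_instructions (Python 3.4+); actually rewriting co_code is far more version-specific and is left out:

import dis

def func2():
    debug('bar', baz)  # never called, so the undefined names are harmless

# Scan the instruction stream for loads of the global name 'debug'.
for ins in dis.get_instructions(func2):
    if ins.opname == 'LOAD_GLOBAL' and ins.argval == 'debug':
        print('debug loaded at offset', ins.offset)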
Well, you can always implement your own simple preprocessor that does the trick. Or, even better, you can use an already existing one. Say http://code.google.com/p/preprocess/
Use a module scoped variable?
from config_module import debug_flag
and use this "variable" to gate access to the logging function(s). You would build yourself a logging module that uses the debug_flag to gate the logging functionality.
I think that completely avoiding the call to a function is not possible, as Python works in a different way than C. The #define takes place in the preprocessor, before the code is compiled. In Python, there's no such thing.
If you want to completely remove the calls to debug in a working environment, I think the only way is to actually change the code before execution. With a script run prior to execution you could comment/uncomment the debug lines.
Something like this:
File logging.py
# Main module
def log():
    print 'logging'

def main():
    log()
    print 'Hello'
    log()
File call_log.py
import re

# To log or not to log, that's the question
log = True

# Change the logging
with open('logging.py') as f:
    new_data = []
    for line in f:
        if not log and re.match(r'\s*log.*', line):
            # Comment out
            line = '#' + line
        if log and re.match(r'#\s*log.*', line):
            # Uncomment
            line = line[1:]
        new_data.append(line)

# Save the file with the adequate log level
with open('logging.py', 'w') as f:
    f.write(''.join(new_data))

# Call the module
import logging
logging.main()
Of course, it has its problems, especially if there are a lot of modules and they are complex, but it could be usable if you absolutely need to avoid calling a function.
Before you do this, have you profiled to verify that the logging is actually taking a substantial amount of time? You may find that you spend more time trying to remove the calls than you save.
Next, have you tried something like Psyco? If you've got things set up so logging is disabled, then Psyco may be able to optimise away most of the overhead of calling the logging function, noticing that it will always return without action.
If you still find logging taking an appreciable amount of time, you might then want to look at overriding the logging function inside critical loops, possibly by binding a local variable to either the logging function or a dummy function as appropriate (or by checking for None before calling it).
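A sketch of that local-binding trick for a hot loop:

import logging

def process(items, debug_enabled=False):
    # Bind the logger (or None) to a local name once, outside the loop,
    # so the loop body pays only a fast local lookup and a None check.
    log = logging.debug if debug_enabled else None
    total = 0
    for item in items:
        total += item
        if log is not None:
            log('running total %d', total)
    return total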
Define a function that does nothing, i.e.
def nuzzing(*args, **kwargs): pass
Then just override all the functions you want to get rid of with your function, à la
logging.debug = nuzzing
I like the 'if __debug__' solution, except that putting it in front of every call is a bit distracting and ugly. I had the same problem and overcame it by writing a script that automatically parses your source files and replaces logging statements with pass statements (and commented-out copies of the logging statements). It can also undo this conversion.
I use it when I deploy new code to a production environment when there are lots of logging statements which I don't need in a production setting and they are affecting performance.
You can find the script here: http://dound.com/2010/02/python-logging-performance/
You can't skip function calls. You could redefine these as empty though, e.g. by creating another logging object that provides the same interface, but with empty functions.
But by far the cleanest approach is to ignore the low priority log messages (as you suggested):
logging.root.setLevel(logging.CRITICAL)
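The "empty logging object" idea mentioned above could look like this (a sketch covering only the methods a caller actually uses):

class NullLogger(object):
    """Same interface as the logger calls above, but does nothing."""
    def debug(self, *args, **kwargs): pass
    def info(self, *args, **kwargs): pass
    def critical(self, *args, **kwargs): pass

import logging
log = logging.getLogger(__name__)  # the real logger...
# log = NullLogger()               # ...or swap in the no-op version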