Reload import in IPython - python

I'm trying to access some Fortran subroutines using F2PY, but I've run into the following problem during consecutive calls from IPython. Take this minimal Fortran code (hope that I didn't code anything stupid; my Fortran is a bit rusty...):
! test.f90
module mod
  integer i
contains
  subroutine foo
    i = i+1
    print*,i
  end subroutine foo
end module mod
If I compile this using F2PY (f2py3.5 -c -m test test.f90), import it in Python and call it twice:
# run.py
import test
test.mod.foo()
test.mod.foo()
The resulting output is:
$ python run.py
1
2
So on every call of foo(), i is incremented, which is supposed to happen. But between different calls of run.py (either from the command line or IPython interpreter), everything should be "reset", i.e. the printed counter should start from 1 for every call. This happens when calling run.py from the command line, but if I call the script multiple times from IPython, i keeps increasing:
In [1]: run run.py
1
2
In [2]: run run.py
3
4
I know that there are lots of posts showing how to reload imports (using autoreload in IPython, importlib.reload(), ...), but none of them seem to work for this example. Is there a way to force a clean reload/import?
Some side notes: (1) The Fortran code that I'm trying to access is quite large, old and messy, so I'd prefer not to change anything in there; (2) I could easily do test.mod.i = something in between calls, but the real Fortran code is too complex for such solutions; (3) I'd really prefer a solution which I can put in the Python code over e.g. settings (autoreload, ..) which I have to manually put in the IPython interpreter (forget it once and ...)

If you can slightly change your Fortran code, you may be able to reset without re-importing (probably faster too).
The change consists of putting i in a common block and resetting it from outside. Your changed Fortran code will look like this:
! test.f90
module mod
  common /set1/ i
contains
  subroutine foo
    common /set1/ i
    i = i+1
    print*,i
  end subroutine foo
end module mod
Reset the variable i from Python as below:
import test
test.mod.foo()
test.mod.foo()
test.set1.i = 0 #reset here
test.mod.foo()
This should produce the result as follows:
python run.py
1
2
1
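If changing the Fortran source is not an option, another approach (a different technique from the common-block reset above) is to run the script in a fresh interpreter each time, since a compiled extension module generally keeps its state for as long as the process that loaded it stays alive. The sketch below is only an illustration, not F2PY functionality: run_fresh is a hypothetical helper name, and the inline demo script is a stand-in for run.py.

```python
import subprocess
import sys


def run_fresh(interpreter_args):
    """Run Python in a brand-new process and return what it printed.

    interpreter_args is the list of arguments passed to the interpreter,
    e.g. ["run.py"] or ["-c", "print(1)"].
    """
    result = subprocess.run(
        [sys.executable, *interpreter_args],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child exits with an error
    )
    return result.stdout


# Stand-in for run.py (assumption): module-level state that is changed on use.
demo_script = "i = 0\ni += 1\nprint(i)"

first = run_fresh(["-c", demo_script])
second = run_fresh(["-c", demo_script])  # new process, so the state starts over
print(first, end="")
print(second, end="")
```

From IPython, calling run_fresh(["run.py"]) instead of run run.py would start from a clean process on every call, at the cost of paying the interpreter start-up and import time each time.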

Related

Python Debugger (pdb): Navigating through multi-module code using pdb

I have a code separated into three modules; the modules are the following:
1) run_module.py
2) module1.py
3) module2.py
I have an if __name__ == '__main__': statement inside run_module.py and I am very comfortable using my CLI to do python -m pdb run_module.py and then set a breakpoint inside run_module.py using the (pdb) b linenumber format.
My question is this: How can I set breakpoints inside module1.py or module2.py from the CLI; ie. not intervening directly into the module1.py and module2.py scripts and typing import pdb; pdb.set_trace()?
Any insight would be extremely appreciated. Many thanks in advance.
pdb, like gdb, or trepan3k has a break command:
(Pdb) help break
b(reak) ([file:]lineno | function) [, condition]
With a line number argument, set a break there in the current
file. With a function name, set a break at first executable line
of that function. Without argument, list all breaks. If a second
argument is present, it is a string specifying an expression
which must evaluate to true before the breakpoint is honored.
The line number may be prefixed with a filename and a colon,
to specify a breakpoint in another file (probably one that
hasn't been loaded yet). The file is searched for on sys.path;
the .py suffix may be omitted.
But when you do this there are some things you should be aware of.
If you specify breakpoints by filename and line number, it is possible to get an error message. For example:
(Pdb) break foo.py
*** The specified object 'foo.py' is not a function or was not found along sys.path.
Let's try to understand what the message says. foo.py clearly isn't a function. It is a file. Is it in sys.path?
(Pdb) import sys
(Pdb) sys.path
['', '/usr/lib/python3.6', ...]
No. OK, well, how about if I give the file name as an absolute path?
(Pdb) break /tmp/bug.py:2
Breakpoint 1 at /tmp/bug.py:2
OK. That works. But again there is a caveat: is it possible to stop at that line in that file? Watch this:
(Pdb) break /etc/hosts:1
Breakpoint 2 at /etc/hosts:1
/etc/hosts is a file, but it is not a Python program. And while as we saw before pdb warns if the file is missing, it doesn't check whether the file is a Python file.
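The same behaviour can be reproduced without an interactive session. The following sketch goes one level below the break command and calls set_break, which pdb.Pdb inherits from bdb.Bdb; it returns None when the breakpoint is accepted and an error string otherwise. The helper name try_break and the temporary text file (a stand-in for /etc/hosts) are assumptions made for this illustration.

```python
import os
import pdb
import tempfile


def try_break(path, lineno):
    """Ask pdb's breakpoint machinery to accept path:lineno.

    Returns None on success, or the error string reported by bdb.
    """
    debugger = pdb.Pdb(readrc=False)  # do not read any .pdbrc file
    try:
        return debugger.set_break(path, lineno)
    finally:
        # Breakpoints are tracked globally, so remove what was just added.
        debugger.clear_all_breaks()


# Hypothetical stand-in for /etc/hosts: a text file that is not Python.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("127.0.0.1 localhost\n")
    not_python = handle.name

accepted = try_break(not_python, 1)   # the line exists, so it is accepted
rejected = try_break(not_python, 99)  # no such line, so a message comes back
os.remove(not_python)

print(accepted)
print(rejected)
```

At this level only the existence of the line is checked, which is why a file that is not Python at all is still accepted.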
If instead you run this with trepan3k it prints out a nasty traceback (which I'll fix at some point), but at least it gives some inkling that something is wrong:
(Nasty traceback)...
ValueError: path /etc/hosts must point to a Python source that can be compiled, or Python bytecode (.pyc, .pyo)
[The traceback has been removed in the github source and will not appear in version 1.2.9. In version 1.2.8 of trepan2 this is gone too.]
Also, pdb isn't that smart about knowing whether line 2 has code that can be stopped at; pdb will warn about blank lines or comment lines, but anything more sophisticated than that, like some random text, and no dice.
Again trepan3k is a little better about this:
(trepan3k) break /tmp/bug.py:2
** File /tmp/bug.py is not stoppable at line 2.
The last caveat when working with breakpoints is that a breakpoint might not be hit because the code never gets there. But you have this also with pdb.set_trace(), so I imagine that kind of behavior is less of a surprise.
One other difference of note between the breakpoints of trepan3k and pdb: in pdb, when you set a breakpoint on a function, you are really setting a breakpoint on the first line recorded for that function. That is, you have already entered the function.
In trepan3k you are setting it on the function call, which is independent of where it is located. Note the different syntax for functions: they have trailing parentheses after the name, e.g. foo() vs foo.
A method name given in the context of the object it is a method of is also possible, and I use that a lot. Here is an example:
trepan3k /tmp/pathtest.py
(/tmp/pathtest.py:1): <module>
-> 1 from pathlib import Path
(trepan3k) n
(/tmp/pathtest.py:2 #12): <module>
-- 2 mypath = Path(__file__)
(trepan3k) n
(/tmp/pathtest.py:3 #20): <module>
-- 3 print(mypath.match("p.*"))
(trepan3k) break mypath.match()
Breakpoint 1 set at line 972 of file /home/test/.pyenv/versions/3.8.10/lib/python3.8/pathlib.py
(trepan3k) c
(/home/test/.pyenv/versions/3.8.10/lib/python3.8/pathlib.py:972): match
xx 972 def match(self, path_pattern):
(trepan3k) path_pattern
'p.*'
(trepan3k) info pc
PC offset is -1.

Cython magic command doesn't print in jupyter notebook [duplicate]

I am new to Cython (I only use it for a little homework for now).
I used the following code in a Jupyter notebook to get a general idea of it.
%load_ext Cython
%%cython
def cfunc(int n):
    cdef int a = 0
    for i in range(n):
        a += i
    return a
print(cfunc(10))
However, it only prints out the result 45 once. When I run the cell again, it doesn't show 45 anymore.
Are there any problems with the code? How can I make the cell print out 45 just the same as normal Python code? Thanks.
When running %%cython-magic a lot happens under the hood. One can see parts of it when calling the magic in verbose mode, i.e. %%cython --verbose:
1) A file called _cython_magic_b599dcf313706e8c6031a4a7058da2a2.pyx is generated. b599dcf313706e8c6031a4a7058da2a2 is the sha1-hash of the %%cython-cell, which is needed, for example, to be able to reload a %%cython-cell (see this SO-post).
2) This file is cythonized and built into a C extension called _cython_magic_b599dcf313706e8c6031a4a7058da2a2.
3) This extension gets imported (this is the moment your code prints 45), and everything from this module is added to the global namespace.
When you execute the cell again, none of the above happens: given the sha-hash, the machinery can see that this cell was already executed and loaded, so there is nothing to be done. Only when the content of the cell, and thus its hash, changes is the cache bypassed and the 3 steps above executed again.
To enforce that the steps above are performed, one has to pass the --force (or -f) option to the %%cython-magic-cell, i.e.:
%%cython --force
...
# 45 is printed
However, because building the extension anew is quite time-consuming, one would probably prefer the following:
%%cython
def cfunc(int n):
    cdef int a = 0
    for i in range(n):
        a += i
    return a
# put the code of __main__ into a function
def cython_main():
    print(cfunc(10))
# execute the old main
cython_main()
Calling cython_main() in a new cell now re-evaluates it the same way normal Python code would.
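The caching behaviour described in this answer can be modelled with a small toy in plain Python. This is only a sketch of the idea, not the actual implementation of the %%cython magic: run_cell and _loaded_modules are hypothetical names, exec stands in for the cythonize, build and import steps, and the hash covers only the cell text.

```python
import hashlib

# Hypothetical cache: generated module name -> namespace of the executed cell.
_loaded_modules = {}


def run_cell(source, force=False):
    """Execute source only if this exact text has not been run before.

    Returns True if the cell body was executed, False if the cache was used.
    """
    digest = hashlib.sha1(source.encode("utf-8")).hexdigest()
    module_name = "_cython_magic_" + digest
    if force or module_name not in _loaded_modules:
        namespace = {}
        exec(source, namespace)  # stands in for cythonize + build + import
        _loaded_modules[module_name] = namespace
        return True
    return False


cell = "print(sum(range(10)))"
run_cell(cell)              # prints 45: first execution of this cell text
run_cell(cell)              # prints nothing: same hash, cached module is reused
run_cell(cell, force=True)  # prints 45 again: the cache is bypassed
```

Changing even one character of the cell text changes the hash, so the cell body runs again, which matches the behaviour described for edited cells.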

Line Profiling inner function with Cython

I've had pretty good success using this answer to profile my Cython code, but it doesn't seem to work properly with nested functions. In this notebook you can see that the profile doesn't appear when the line profiler is used on a nested function. Is there a way to get this to work?
tl;dr:
This seems to be an issue with Cython. There's a hackish way that does the trick but isn't reliable; you could use it for one-off cases until this issue has been fixed*.
Change the line_profiler source:
I can't be 100% sure about this, but it is working. What you need to do is download the source for line_profiler and go fiddle around in python_trace_callback. After the code object is obtained from the current frame of execution (code = <object>py_frame.f_code), add the following:
if what == PyTrace_LINE or what == PyTrace_RETURN:
    code = <object>py_frame.f_code
    # Add entry for code object with different address if and only if it doesn't already
    # exist **but** the name of the function is in the code_map
    if code not in self.code_map and code.co_name in {co.co_name for co in self.code_map}:
        # iterate over a snapshot of the keys, since the dict is modified in the loop
        for co in list(self.code_map):
            # make condition as strict as necessary
            cond = co.co_name == code.co_name and co.co_code == code.co_code
            if cond:
                del self.code_map[co]
                self.code_map[code] = {}
This will replace the code object in self.code_map with the one currently executing that matches its name and co.co_code contents. co.co_code is b'' for Cython, so in essence it matches Cython functions with that name. This is where it can be made more robust by matching more attributes of a code object (for example, the filename).
You can then proceed to build it with python setup.py build_ext and install it with sudo python setup.py install. I'm currently building it with python setup.py build_ext --inplace in order to work with it locally, and I'd suggest you do too. If you do build it with --inplace, make sure you navigate to the folder containing the source for line_profiler before importing it.
So, in the folder containing the built shared library for line_profiler I set up a cyclosure.pyx file containing your functions:
def outer_func(int n):
    def inner_func(int c):
        cdef int i
        for i in range(n):
            c+=i
        return c
    return inner_func
And an equivalent setup_cyclosure.py script in order to build it:
from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize
from Cython.Compiler.Options import directive_defaults
directive_defaults['binding'] = True
directive_defaults['linetrace'] = True
extensions = [Extension("cyclosure", ["cyclosure.pyx"], define_macros=[('CYTHON_TRACE', '1')])]
setup(name = 'Testing', ext_modules = cythonize(extensions))
As previously, the build was performed with python setup_cyclosure.py build_ext --inplace.
Launching your interpreter from the current folder and issuing the following yields the wanted results:
>>> import line_profiler
>>> from cyclosure import outer_func
>>> f = outer_func(5)
>>> prof = line_profiler.LineProfiler(f)
>>> prof.runcall(f, 5)
15
>>> prof.print_stats()
Timer unit: 1e-06 s
Total time: 1.2e-05 s
File: cyclosure.pyx
Function: inner_func at line 2
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     2                                           def inner_func(int c):
     3                                               cdef int i
     4         1            5      5.0     41.7      for i in range(n):
     5         5            6      1.2     50.0          c+=i
     6         1            1      1.0      8.3      return c
Issue with IPython %%cython:
Trying to run this from IPython results in an unfortunate situation. While executing, the code object doesn't store the path to the file where it was defined; it simply stores the filename. Since I simply drop the code object into the self.code_map dictionary, and since code objects have read-only attributes, we lose the file path information when using it from IPython (because it stores the files generated from %%cython in a temporary directory).
Because of that, you do get the profiling statistics for your code, but the Line Contents column stays empty. One might be able to forcefully copy the filenames between the two code objects in question, but that's another issue altogether.
*The Issue:
The issue here is that, for some reason, when dealing with nested and/or enclosed functions, there's an abnormality with the address of the code object when it is created and while it is being interpreted in one of Python's frames. The issue you were facing was caused by the following condition not being satisfied:
if code in self.code_map:
Which was odd. Creating your function in IPython and adding it to the LineProfiler did indeed add it to the self.code_map dictionary:
prof = line_profiler.LineProfiler(f)
prof.code_map
Out[16]: {<code object inner_func at 0x7f5c65418f60, file "/home/jim/.cache/ipython/cython/_cython_magic_1b89b9cdda195f485ebb96a104617e9c.pyx", line 2>: {}}
When the time came to actually test the previous condition though, and the current code object was snatched from the current execution frame with code = <object>py_frame.f_code, the address of the code object was different:
# this was obtained with a basic print(code) in _line_profiler.pyx
code object inner_func at 0x7f7a54e26150
indicating it was re-created. This only happens with Cython and when a function is defined inside another function. Either this or something that I am completely missing.

Python Debugging Using Pdb

I'm using an interactive graphical Python debugger with ipdb under the hood (Canopy's graphical debugger). The script I am working on has multiple imported modules and several calls to their respective functions. Whenever I attempt a debugging run, execution gets stuck somewhere within a call to an imported module's function (specifically subprocess). My two main questions are:
1) Does running in debug mode slow things down considerably? Is the code not actually stuck, but just running at a painfully slow rate?
2) Is there a way to completely pass over bits of code and run them as if I were not even debugging? I want to prevent the debugger from diving into subprocess and just execute it as if it were a normal run.
I might toss the graphical debugger and do everything from a terminal, but I would like to avoid that if I can because the graphical interface is really convenient and saves a lot of typing.
import pdb
a = "aaa"
pdb.set_trace()
b = "bbb"
c = "ccc"
final = a + b + c
print final
When you run the code, the debugger starts and execution stops right after a = "aaa":
$ python abc.py
(Pdb) p a
'aaa'
(Pdb)
Thanks, Shashi
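On the second question (keeping the debugger out of subprocess), the standard library's pdb.Pdb class accepts a skip argument: a list of glob-style module name patterns whose frames the debugger will not step into. The sketch below only builds such a debugger and checks which module names it would skip; whether a graphical front end such as Canopy's exposes this option is an assumption that would need checking.

```python
import pdb

# Frames that originate in modules matching these patterns are not stepped into.
debugger = pdb.Pdb(skip=["subprocess", "subprocess.*"], readrc=False)

# is_skipped_module is inherited from bdb.Bdb and applies the patterns above.
skips_subprocess = debugger.is_skipped_module("subprocess")
skips_main = debugger.is_skipped_module("__main__")
print(skips_subprocess)
print(skips_main)

# To actually start debugging at some point in a script, call:
#     debugger.set_trace()
```

With a debugger built this way, stepping with s at a call into subprocess behaves like n: the call runs to completion and control returns to your own code.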

Is this a name-value binding issue during module load time by python interpreter?

I wrote a small program in sample.py file,
a=6
def f1():
    a=3
    return f2()
#def f2(): commented intentionally
#    b = a*a
#    return b
and loaded it as the __main__ module using the command
$ python -i sample.py
But I see that the interpreter does not check the binding of f2 while loading the module.
The interpreter only realises that the name f2 is not bound to a value when I call
>>> f1()
In another example, the interpreter does check the binding of the name big1: loading the file as the __main__ module with python -i sample.py throws an error saying big1 is not defined.
big, small = 3, -4
while big > 2 or small < -2:
    big, small = -small - 1, -big + 1
print(big1)
My questions are:
1) Why does the Python interpreter not make an effort to check whether all names are bound while loading sample.py as the __main__ module, and instead defer the check until the name is used, as in f1()?
2) Do you think this creates complexity in large projects, because the test cases must exercise every name before the code goes to production?
Consider this example:
def f1():
    name = raw_input()
    exec "def {}(): print 'I am a function'".format(name)
    f2()
If a user enters "f2" when prompted, then f2() will execute correctly, but not otherwise.
This is a perfectly legal python program, and the line f2() may or may not execute correctly, dependent completely on user input. So python cannot determine at module load time whether execution of this code will result in a name error or not.
Addressing some additional confusion:
It might seem that python does sometimes do static name-checking. I.e., if we have file1.py:
def foo():
    print x1
and you do python file1.py, no error will be printed. The function foo has been defined, but not run. We could also do python -i file1.py, which will open up an interpreter. If in the interpreter we then type foo(), now the code will run, and we'll get the lookup error for x1.
Consider a second file file2.py:
print x1
Here, there is no function being defined. We're simply running a print statement at the top-level. Doing python file2.py will cause a lookup error for x1, since running file2.py constitutes actually running that print statement, as opposed to only defining a function that would run the print statement when called.
So it's not that python sometimes does static name-checking -- python will only throw a name error when code involving that name is actually run. It's just that when code gets run depends on where it is (in a function vs being top-level).
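A short runnable illustration of the same point, written for Python 3 (the examples above use Python 2 syntax). The function names mirror the question; first_result and second_result are names introduced here only for the demonstration.

```python
def f1():
    # The name f2 is looked up only when this line actually runs.
    return f2()


# Defining f1 raised no error, even though f2 does not exist yet.
try:
    f1()
except NameError as error:
    first_result = "NameError: {}".format(error)

# Bind the name f2 afterwards; f1 itself is left unchanged.
def f2():
    return 6 * 6


second_result = f1()  # the lookup now succeeds
print(first_result)
print(second_result)
```

The first call fails and the second succeeds with the very same f1, because the lookup of f2 happens at call time rather than when the module is loaded.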
