Python Debugger (pdb): Navigating through multi-module code using pdb - python

I have code split across three modules:
1) run_module.py
2) module1.py
3) module2.py
I have an if __name__ == '__main__': statement inside run_module.py, and I am very comfortable using my CLI to run python -m pdb run_module.py and then setting a breakpoint inside run_module.py using the (Pdb) b linenumber format.
My question is this: how can I set breakpoints inside module1.py or module2.py from the CLI, i.e. without editing module1.py and module2.py directly and typing import pdb; pdb.set_trace()?
Any insight would be extremely appreciated. Many thanks in advance.

pdb, like gdb or trepan3k, has a break command:
(Pdb) help break
b(reak) ([file:]lineno | function) [, condition]
With a line number argument, set a break there in the current
file. With a function name, set a break at first executable line
of that function. Without argument, list all breaks. If a second
argument is present, it is a string specifying an expression
which must evaluate to true before the breakpoint is honored.
The line number may be prefixed with a filename and a colon,
to specify a breakpoint in another file (probably one that
hasn't been loaded yet). The file is searched for on sys.path;
the .py suffix may be omitted.
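To see the file:lineno form work end to end without typing at a prompt, here is a minimal sketch that scripts a pdb session through the stdin and stdout arguments of pdb.Pdb. The module name module1_demo, the function double, and the temporary directory are made up for illustration; only the break, continue, and p commands and the file:lineno syntax come from pdb itself.

```python
import io
import os
import pdb
import sys
import tempfile
import textwrap


def run_scripted_session():
    """Set a breakpoint in another module by file:lineno and return pdb's output."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for module1.py from the question.
        module_path = os.path.join(tmp, "module1_demo.py")
        with open(module_path, "w") as handle:
            handle.write(textwrap.dedent("""\
                def double(value):
                    result = value * 2
                    return result
                """))

        # The same commands one would type at the (Pdb) prompt.
        commands = io.StringIO(
            f"break {module_path}:3\n"  # breakpoint in the other file
            "continue\n"                # run until it is hit
            "p result\n"                # inspect a local variable there
            "continue\n"                # let the program finish
        )
        output = io.StringIO()
        debugger = pdb.Pdb(stdin=commands, stdout=output,
                           readrc=False, nosigint=True)

        sys.path.insert(0, tmp)
        try:
            debugger.run("import module1_demo; module1_demo.double(21)", {})
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("module1_demo", None)
            debugger.clear_all_breaks()
        return output.getvalue()


session_output = run_scripted_session()
print(session_output)
```

The module under test is never edited: the breakpoint is set purely from the debugger side, which is what the question asks for.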
But when you do this there are some things you should be aware of.
If you specify breakpoints by filename and line number, it is possible to get an error message. For example:
(Pdb) break foo.py
*** The specified object 'foo.py' is not a function or was not found along sys.path.
Let's try to understand what the message says. foo.py clearly isn't a function. It is a file. Is it in sys.path?
(Pdb) import sys
(Pdb) sys.path
['', '/usr/lib/python3.6', ...]
No. OK, well, how about if I give the file name as an absolute path?
(Pdb) break /tmp/bug.py:2
Breakpoint 1 at /tmp/bug.py:2
OK, that works. But again there is a caveat: is it possible to stop at that line in that file? Watch this:
(Pdb) break /etc/hosts:1
Breakpoint 2 at /etc/hosts:1
/etc/hosts is a file, but it is not a Python program. And while as we saw before pdb warns if the file is missing, it doesn't check whether the file is a Python file.
If instead you run this with trepan3k it prints out a nasty traceback (which I'll fix at some point), but at least it gives some inkling that something is wrong:
(Nasty traceback)...
ValueError: path /etc/hosts must point to a Python source that can be compiled, or Python bytecode (.pyc, .pyo)
[The traceback has been removed in the github source and will not appear in version 1.2.9. In version 1.2.8 of trepan2 this is gone too.]
Also, pdb isn't that smart about knowing whether line 2 has code that can be stopped at. pdb will warn about blank lines or comment lines, but for anything more sophisticated than that, like some random text, no dice.
Again trepan3k is a little better about this:
(trepan3k) break /tmp/bug.py:2
** File /tmp/bug.py is not stoppable at line 2.
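For readers who want a similar check in plain Python, the sketch below compiles the source and collects the line numbers that actually carry bytecode. This is an assumption about one reasonable way to decide "stoppable", not a description of how trepan3k implements it; the helper names stoppable_lines and can_stop_at and the SAMPLE text are invented.

```python
import dis
import types


def stoppable_lines(source, filename="<demo>"):
    """Return the set of line numbers that carry bytecode in `source`."""
    lines = set()
    pending = [compile(source, filename, "exec")]
    while pending:
        code = pending.pop()
        for _offset, lineno in dis.findlinestarts(code):
            if lineno:  # skip None and the synthetic line 0
                lines.add(lineno)
        # Nested functions and classes are stored as code objects in co_consts.
        pending.extend(c for c in code.co_consts
                       if isinstance(c, types.CodeType))
    return lines


def can_stop_at(source, lineno):
    """False for blank/comment lines and for text that is not Python at all."""
    try:
        return lineno in stoppable_lines(source)
    except SyntaxError:
        return False


SAMPLE = (
    "x = 1\n"
    "# a comment line\n"
    "\n"
    "def f():\n"
    "    return x\n"
)

found = stoppable_lines(SAMPLE)
print(sorted(found))
print(can_stop_at("127.0.0.1 localhost\n", 1))
```

Because the check starts with compile(), a file such as /etc/hosts is rejected outright instead of silently accepting a breakpoint.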
The last caveat when working with breakpoints is that a breakpoint might not be hit because the code never gets there. But you have this with pdb.set_trace() as well, so I imagine that kind of behavior is less of a surprise.
There is one other difference of note between the breakpoints of trepan3k and pdb. In pdb, when you set a breakpoint on a function, you are really setting a breakpoint on the first line recorded for that function. That is, you have already entered the function.
In trepan3k you are setting it on the function call, which is independent of where the function is located. Note the different syntax for functions: they have trailing parentheses after the name, e.g. foo() vs foo.
A method name given in the context of the object it is a method of is also possible, and I use that a lot. Here is an example:
trepan3k /tmp/pathtest.py
(/tmp/pathtest.py:1): <module>
-> 1 from pathlib import Path
(trepan3k) n
(/tmp/pathtest.py:2 #12): <module>
-- 2 mypath = Path(__file__)
(trepan3k) n
(/tmp/pathtest.py:3 #20): <module>
-- 3 print(mypath.match("p.*"))
(trepan3k) break mypath.match()
Breakpoint 1 set at line 972 of file /home/test/.pyenv/versions/3.8.10/lib/python3.8/pathlib.py
(trepan3k) c
(/home/test/.pyenv/versions/3.8.10/lib/python3.8/pathlib.py:972): match
xx 972 def match(self, path_pattern):
(trepan3k) path_pattern
'p.*'
(trepan3k) info pc
PC offset is -1.

Related

Reload import in IPython

I'm trying to access some Fortran subroutines using F2PY, but I've run into the following problem during consecutive calls from IPython. Take this minimal Fortran code (hope that I didn't code anything stupid; my Fortran is a bit rusty..):
! test.f90
module mod
integer i
contains
subroutine foo
i = i+1
print*,i
end subroutine foo
end module mod
If I compile this using F2PY (f2py3.5 -c -m test test.f90), import it in Python and call it twice:
# run.py
import test
test.mod.foo()
test.mod.foo()
The resulting output is:
$ python run.py
1
2
So on every call of foo(), i is incremented, which is supposed to happen. But between different calls of run.py (either from the command line or IPython interpreter), everything should be "reset", i.e. the printed counter should start from 1 for every call. This happens when calling run.py from the command line, but if I call the script multiple times from IPython, i keeps increasing:
In [1]: run run.py
1
2
In [2]: run run.py
3
4
I know that there are lots of posts showing how to reload imports (using autoreload in IPython, importlib.reload(), ...), but none of them seem to work for this example. Is there a way to force a clean reload/import?
Some side notes: (1) The Fortran code that I'm trying to access is quite large, old and messy, so I'd prefer not to change anything in there; (2) I could easily do test.mod.i = something in between calls, but the real Fortran code is too complex for such solutions; (3) I'd really prefer a solution which I can put in the Python code over e.g. settings (autoreload, ..) which I have to manually put in the IPython interpreter (forget it once and ...)
If you can slightly change your Fortran code, you may be able to reset without re-importing (probably faster too).
The change is to declare i in a common block and reset it from outside. Your changed Fortran code will look like this:
! test.f90
module mod
common /set1/ i
contains
subroutine foo
common /set1/ i
i = i+1
print*,i
end subroutine foo
end module mod
Reset the variable i from Python as below:
import test
test.mod.foo()
test.mod.foo()
test.set1.i = 0 #reset here
test.mod.foo()
This should produce the result as follows:
python run.py
1
2
1
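If the Fortran source really cannot be touched, another option that follows from the observation in the question (a fresh process always starts counting from 1) is to run each call in its own interpreter. The sketch below is only an illustration under that assumption: SCRIPT is a pure-Python stand-in for run.py with a module-level counter, not the F2PY module, and run_fresh is an invented helper.

```python
import subprocess
import sys

# Pure-Python stand-in for run.py: state that lives at module level.
SCRIPT = """
counter = 0
def foo():
    global counter
    counter += 1
    print(counter)
foo()
foo()
"""


def run_fresh(script_source):
    """Run the script in a brand-new interpreter and return its printed lines."""
    completed = subprocess.run(
        [sys.executable, "-c", script_source],
        capture_output=True, text=True, check=True,
    )
    return completed.stdout.split()


first = run_fresh(SCRIPT)
second = run_fresh(SCRIPT)
print(first, second)
```

The cost is process start-up time on every call, so this only makes sense when the calls are coarse-grained.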

Python PDB inconsistency with command outputs - next statement not outputting

I am trying to step through a python script for some code I wrote over a year ago. The way I recall, when you write a pdb.set_trace() line, the program will halt its execution at that point and print the next line that will be executed. Entering 'n' or 's' will advance the execution by a line (and maybe step into a function, not too relevant to my problem) AND print the next line to be executed again.
Now, I have one program "example1.py" and it is not printing the next statement to be executed. I just get some output like the following
./example1.py
> /home/some/example1.py(44)create()
(Pdb) s
> /home/some/example1.py(45)create()
(Pdb) s
--Call--
> /home/some/example1.py(90)parse()
But when I try this same thing with a different Python program, "example2.py", one that I wrote more recently, I get the output I expect (the next statement to be executed).
> /home/some/example2.py(86)random_update()
-> DATE_LIMIT=1
(Pdb) n
> /home/some/example2.py(87)random_update()
-> FILE_LIMIT=120
(Pdb) n
> /home/some/example2.py(89)random_update()
-> n_dates=0
I have no idea what could be a possible cause for this. Could import statements interfere with pdb's execution?
UPDATE:
So I set my trace breakpoint before changing to a directory that is outside my home directory. When I do this, I get the output that I expect. I noticed that this directory's group owner was root so I changed it to my user. This didn't resolve my problem, but now I know it has to do with the program execution location.
I figured it out. If I set my trace before a line that changes the current directory to a directory where the program is not located, I was getting the output I expected. Once I set the trace after this directory change, I was no longer getting statement execution output.
To resolve this, I executed the program with a full path
python /home/name/home/some/example1.py
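One plausible reading of this behavior, offered as an assumption rather than a confirmed diagnosis, is that the source text is looked up by a relative filename, and that lookup stops resolving once the working directory changes. The sketch below reproduces such a lookup with linecache, the module pdb uses to fetch source lines; the file name example1_chdir_demo.py and its content are invented.

```python
import linecache
import os
import tempfile


def fetch_line(filename, lineno):
    """Look up a source line by filename, ignoring any stale cache entry."""
    linecache.clearcache()
    return linecache.getline(filename, lineno)


def demo():
    original_cwd = os.getcwd()
    script_name = "example1_chdir_demo.py"  # hypothetical script
    with tempfile.TemporaryDirectory() as script_dir, \
            tempfile.TemporaryDirectory() as other_dir:
        absolute = os.path.join(script_dir, script_name)
        with open(absolute, "w") as handle:
            handle.write("DATE_LIMIT = 1\n")
        try:
            os.chdir(script_dir)
            before = fetch_line(script_name, 1)          # relative path, right cwd
            os.chdir(other_dir)
            after_relative = fetch_line(script_name, 1)  # relative path, wrong cwd
            after_absolute = fetch_line(absolute, 1)     # full path still resolves
        finally:
            os.chdir(original_cwd)
    return before, after_relative, after_absolute


before, after_relative, after_absolute = demo()
print(repr(before), repr(after_relative), repr(after_absolute))
```

An empty string from the relative lookup corresponds to the missing "->" line in the debugger output, and the absolute path corresponds to the fix of launching the program with its full path.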

Enabling execution of multi-line statements within the Python's debugger(pdb) conveniently

Running !import code; code.interact(local=vars()) inside the pdb prompt allows you to input multiline statements (e.g. a class definition) inside the debugger (source). Is there any way to omit having to copy paste/type that full line each time?
I was thinking about Conque for vim and setting something like :noremap ,d i!import code; code.interact(local=vars())<Esc> but editing anything outside of insert mode doesn't seem to have any effect on the prompt.
PDB reads in a .pdbrc when it starts. From the Python docs:
If a file .pdbrc exists in the user’s home directory or in the current directory, it is read in and executed as if it had been typed at the debugger prompt. This is particularly useful for aliases. If both files exist, the one in the home directory is read first and aliases defined there can be overridden by the local file.
So try creating that file and putting that command in there as is.
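As a rough illustration of the alias route the docs mention, the sketch below writes a .pdbrc into a temporary working directory and lets pdb load it. The alias names ml and sq are made up; ml holds the command from the question and is only defined, not run, while the harmless sq alias is exercised to show that aliases from the file are live at the prompt.

```python
import io
import os
import pdb
import tempfile

RC_LINES = (
    # The command from the question, under the invented alias name `ml`.
    "alias ml !import code; code.interact(local=vars())\n"
    # A harmless alias used to check that the file was read.
    "alias sq p %1 * %1\n"
)


def run_with_local_pdbrc():
    original_cwd = os.getcwd()
    with tempfile.TemporaryDirectory() as workdir:
        with open(os.path.join(workdir, ".pdbrc"), "w") as handle:
            handle.write(RC_LINES)
        output = io.StringIO()
        try:
            os.chdir(workdir)
            # readrc=True (the default) makes pdb load ~/.pdbrc and ./.pdbrc.
            debugger = pdb.Pdb(stdin=io.StringIO("sq 7\ncontinue\n"),
                               stdout=output, readrc=True, nosigint=True)
            debugger.run("pass", {})
        finally:
            os.chdir(original_cwd)
        return dict(debugger.aliases), output.getvalue()


aliases, session_output = run_with_local_pdbrc()
print(aliases["ml"])
print(session_output)
```

With such a file in place, typing ml at the (Pdb) prompt would expand to the full code.interact line. Python 3.2 and later also provide a built-in interact command at the pdb prompt.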

Breakpoint-induced interactive debugging of Python with IPython

Say I have an IPython session, from which I call some script:
> run my_script.py
Is there a way to induce a breakpoint in my_script.py from which I can inspect my workspace from IPython?
I remember reading that in previous versions of IPython one could do:
from IPython.Debugger import Tracer
def my_function():
    x = 5
    Tracer()
    print 5
but the submodule Debugger does not seem to be available anymore.
Assuming that I have an IPython session open already: how can I stop my program a location of my choice and inspect my workspace with IPython?
In general, I would prefer solutions that do not require me to pre-specify line numbers, since I would like to possibly have more than one such call to Tracer() above and not have to keep track of the line numbers where they are.
The Tracer() still exists in ipython in a different module. You can do the following:
from IPython.core.debugger import Tracer
def my_function():
    x = 5
    Tracer()()
    print(5)
Note the additional pair of call parentheses after Tracer().
edit: For IPython 6 onwards Tracer is deprecated so you should use set_trace() instead:
from IPython.core.debugger import set_trace
def my_function():
    x = 5
    set_trace()
    print(5)
You can run it and set a breakpoint at a given line with:
run -d -b12 myscript
Here -b12 sets a breakpoint at line 12. When you enter this command, you'll immediately drop into pdb, and you'll need to enter c to execute up to that breakpoint.
This is the version using the set_trace() method instead of the deprecated Tracer() one.
from IPython.core.debugger import Pdb
def my_function():
    x = 5
    Pdb().set_trace()
    print(5)
Inside the IPython shell, you can do
from IPython.core.debugger import Pdb
pdb = Pdb()
pdb.runcall(my_function)
for example, or do the normal pdb.set_trace() inside your function.
With Python 3 (v3.7+), there's the new breakpoint() function. You can modify its behaviour so it'll call IPython's debugger for you.
Basically you can set an environment variable that points to a debugger function. (If you don't set the variable, breakpoint() defaults to calling pdb.)
To set breakpoint() to call ipython's debugger, set the environment variable (in your shell) like so:
# for bash/zsh users
export PYTHONBREAKPOINT='IPython.core.debugger.set_trace'
# powershell users
$env:PYTHONBREAKPOINT='IPython.core.debugger.set_trace'
(Note: if you want to set the environment variable permanently, you'll need to modify your shell profile or system preferences.)
You can write:
def my_function():
    x = 5
    breakpoint()
    print(5)
And it'll break into IPython's debugger for you. I think it's handier than having to write from IPython.core.debugger import set_trace and then call set_trace().
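To show how PYTHONBREAKPOINT is resolved without requiring IPython, the sketch below points the variable at a recording function instead of a real debugger. The module name demo_debug_hooks and the function record_stop are invented stand-ins for IPython.core.debugger.set_trace; the variable is set from inside the process here only to keep the example self-contained.

```python
import os
import sys
import types

stops = []


def record_stop(*args, **kwargs):
    """Hypothetical stand-in for a debugger entry point such as set_trace."""
    stops.append("stopped")


# Make the stand-in importable under a made-up dotted name.
hooks = types.ModuleType("demo_debug_hooks")
hooks.record_stop = record_stop
sys.modules["demo_debug_hooks"] = hooks


def my_function():
    x = 5
    breakpoint()
    return x


saved_env = os.environ.get("PYTHONBREAKPOINT")
saved_hook = sys.breakpointhook
# The default hook is the one that consults PYTHONBREAKPOINT on each call.
sys.breakpointhook = sys.__breakpointhook__
try:
    os.environ["PYTHONBREAKPOINT"] = "demo_debug_hooks.record_stop"
    my_function()  # breakpoint() imports and calls record_stop
    os.environ["PYTHONBREAKPOINT"] = "0"
    my_function()  # "0" turns breakpoint() into a no-op
finally:
    sys.breakpointhook = saved_hook
    if saved_env is None:
        os.environ.pop("PYTHONBREAKPOINT", None)
    else:
        os.environ["PYTHONBREAKPOINT"] = saved_env

print(stops)
```

The value "0" is handy for leaving breakpoint() calls in the code while temporarily silencing them.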
I have always had the same question, and the best workaround I have found, which is pretty hacky, is to add a line that will break my code, like so:
...
a = 1+2
STOP
...
Then when I run that code it will break, and I can do %debug to go there and inspect. You can also turn on %pdb to always go to the point where your code breaks, but this can be bothersome if you don't want to inspect everywhere and every time your code breaks. I would love a more elegant solution.
I see a lot of options here, but maybe not the following simple option.
Fire up ipython in the directory where my_script.py is.
Turn the debugger on if you want the code to go into debug mode when it fails. Type %pdb.
In [1]: %pdb
Automatic pdb calling has been turned ON
Next type
In [2]: %run -d ./my_script.py
*** Blank or comment
*** Blank or comment
NOTE: Enter 'c' at the ipdb> prompt to continue execution.
> c:\users\c81196\lgd\mortgages-1\nmb\lgd\run_lgd.py(2)<module>()
1 # system imports
----> 2 from os.path import join
Now you can set a breakpoint wherever you want it.
Type b 100 to have a breakpoint at line 100, or b whatever.py:102 to have a breakpoint at line 102 in whatever.py.
For instance:
ipdb> b 100
Then continue execution:
ipdb> c
Once the code fails, or reaches the breakpoint you can start using the full power of the python debugger pdb.
Note that pdb also allows the setting of a breakpoint at a function.
b(reak) [([filename:]lineno | function) [, condition]]
So you do not necessarily need to use line numbers.

How can I get the name/file of the script from sitecustomize.py?

When I run any Python script, I would like to see the script's filename appear in the Windows command line window's titlebar. For example, if I run a script called "mytest.py", I want to see "mytest" in the titlebar. I would like this to be automatic, so I don't have to add code to every one of my scripts.
Currently I'm attempting to do this with sitecustomize.py, because when Python is run, including from double-clicking a Python script, sitecustomize is imported before the script runs.
I've tried getting __main__'s __file__ and sys.argv, but sitecustomize doesn't see either:
file sitecustomize.py:
import __main__, sys
print "hasattr __main__.__file__:", hasattr(__main__, "__file__")
print "hasattr sys.argv:", hasattr(sys, "argv")
print "-" * 60
file mytest.py:
import sys
print "__file__ is:", __file__
print "sys.argv is:", sys.argv
raw_input() # don't end the script immediately
output:
hasattr __main__.__file__: False
hasattr sys.argv: False
------------------------------------------------------------
__file__ is: C:\Documents and Settings\Owner\Desktop\mytest.py
sys.argv is: ['C:\\Documents and Settings\\Owner\\Desktop\\mytest.py']
I'm glad you asked! I now have it working for my scripts, and it's pretty cool.
Here's the code:
import sys
import time
from ctypes import windll
class SetTitle(object):
    def __del__(self):
        time.sleep(1)
        command = ' '.join(sys.argv)
        windll.kernel32.SetConsoleTitleA(command)
sys.argv = SetTitle()
This is for Python 2.x -- for 3.x you need to change SetConsoleTitleA to SetConsoleTitleW (last letter changes from A to W).
How it works: since the sys.argv object does not yet exist, I create an object and assign it to sys.argv; then, when Python assigns the actual argv to sys.argv, my object is tossed, and the __del__ method is called; the __del__ method is then able to access the real argv and set the title bar accordingly. I put the 1 second sleep in just to avoid any possible race conditions, but I'm not sure it's necessary. If you don't want to see all the command-line args, you can pre-process command any way you like.
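The mechanism can be tried on any platform with a small stand-in. The sketch below is an illustration that relies on CPython's reference counting (other interpreters may run __del__ later); fake_sys and Placeholder are invented names, and nothing here touches the real sys.argv or the console title.

```python
import types

events = []
fake_sys = types.SimpleNamespace()  # stand-in for the sys module


class Placeholder:
    def __del__(self):
        # By the time this runs, the attribute already holds the new value.
        events.append(list(fake_sys.argv))


fake_sys.argv = Placeholder()
# Rebinding the attribute drops the last reference to the placeholder,
# so CPython calls __del__ right away.
fake_sys.argv = ["mytest.py", "--flag"]

print(events)
```

The design choice worth noting is that the work happens in __del__, after the replacement value is in place, which is what lets the original trick read the real argument list.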
My thanks to the folks on python-win32 mailing list, and Thomas Heller in particular, for helping with the 'set title' portion of this question.
"When I run any Python script, I would like to see the script's filename appear in the Windows command line window's titlebar. For example, if I run a script called "mytest.py", I want to see "mytest" in the titlebar. I would like this to be automatic, so I don't have to add code to every one of my scripts."
I think you should add this functionality to all your scripts through a module and not by hacking it into sitecustomize.py. Also, even if you still want to go the sitecustomize route, you will need to pass __file__ from your script, which means you will not get around adding some code to all your scripts.
What you certainly can do is put that code into a module and then import it in all your Python scripts. As I mentioned above, you need to pass __file__ from your main script, otherwise you will get the module's filename. Also, there is no need to import __main__ to retrieve __file__.
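A minimal sketch of such a helper module follows. The function names title_from_script and set_console_title are invented, and the use of SetConsoleTitleW assumes Python 3 on Windows, in line with the A/W note in the other answer; on other platforms the function only computes the title.

```python
import os
import sys


def title_from_script(script_path):
    """Turn a path such as .../mytest.py into the bare name 'mytest'."""
    return os.path.splitext(os.path.basename(script_path))[0]


def set_console_title(script_path):
    """Set the console title on Windows and return the title that was used."""
    title = title_from_script(script_path)
    if sys.platform == "win32":
        import ctypes  # imported lazily, only needed on Windows
        ctypes.windll.kernel32.SetConsoleTitleW(title)
    return title


example_path = os.path.join("some", "dir", "mytest.py")
print(title_from_script(example_path))
```

Each script would then start by importing the helper and calling set_console_title(__file__), which is the one line of per-script code this answer says cannot be avoided.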
