I am running a bunch of code all at once in python by copying it from my editor and pasting it into python. This code includes nested for loops. I am doing some web scraping and the program quits at different times. I suspect that this is because it doesn't have time to load. I get the following error (once again - the program scrapes different amounts of text each time):
Traceback (most recent call last):
  File "<stdin>", line 35, in <module>
IndexError: list index out of range
First, what does line 35 refer to? Is this the place in the relevant inner for-loop?
Second, I think that the error might be caused by a line of code using selenium like this:
driver.find_elements_by_class_name("button")[j-1].click()
In this case, how can I handle this error? What is some example code, with either explicit waits or exception handling, that would address the issue?
It means that index j-1 doesn't exist for some value of j, most likely because j-1 is greater than or equal to the number of elements in the list returned at that moment.
You can wrap your code in a try block and catch the IndexError exception like this:
try:
    # your code here
except IndexError:
    # handle the error here
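As a rough sketch of the "explicit wait" idea, the helper below keeps re-querying a list until the wanted index exists or a timeout expires. The names wait_for_item, get_items, timeout and poll_interval are made up for this illustration and are not part of Selenium; get_items stands in for any callable that returns the current list of elements.

```python
import time


def wait_for_item(get_items, index, timeout=10.0, poll_interval=0.5):
    """Poll get_items() until it has an element at `index`, then return it.

    Raises TimeoutError if the element does not appear within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        items = get_items()
        if 0 <= index < len(items):
            return items[index]
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "no item at index {} after {} seconds".format(index, timeout)
            )
        time.sleep(poll_interval)


# Possible use with the line from the question (driver and j come from there):
# wait_for_item(lambda: driver.find_elements_by_class_name("button"), j - 1).click()
```

If the page really is still loading, this turns an intermittent IndexError into either a successful click or a clear TimeoutError after a known delay.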
An IndexError happens when you try to access an index of a list that does not exist. For example:
>>> a = [1, 2, 3]
>>> print(a[10])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
It's difficult to say how you should handle the error without more detail.
When working with code snippets, it's convenient to have them open in a text editor and either
only copy-paste into a console the part you're currently working on so that all the relevant variables are in the local namespace that you can explore from the console, or
copy-paste a moderate-to-large chunk as a whole with automatic post-mortem debugging enabled, e.g. with the "Automatically start the debugger on an exception" ActiveState recipe or IPython's %pdb magic, or
run the script as a whole under a debugger, e.g. with -m pdb, IPython's %run, or an IDE.
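For the second option, a minimal sketch of an automatic post-mortem hook could look like the following. It is written in the spirit of the recipe mentioned above, not copied from it, and the function names post_mortem_hook and install_post_mortem_hook are invented here.

```python
import pdb
import sys
import traceback


def post_mortem_hook(exc_type, exc_value, exc_tb):
    """Print the traceback, then open pdb at the frame that raised."""
    if hasattr(sys, "ps1") or not sys.stderr.isatty():
        # Interactive prompt or no terminal attached: keep the default behavior.
        sys.__excepthook__(exc_type, exc_value, exc_tb)
    else:
        traceback.print_exception(exc_type, exc_value, exc_tb)
        pdb.post_mortem(exc_tb)


def install_post_mortem_hook():
    """Make uncaught exceptions go through post_mortem_hook."""
    sys.excepthook = post_mortem_hook
```

Calling install_post_mortem_hook() at the top of a script is enough; the hook only starts the debugger when the script runs in a real terminal.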
I cannot run the selected block of the code in VS Code.
Given the code that works well if I run it as a whole
import numpy as np
x = np.arange(5)
print(x)
if I select the line print(x) and press Shift+Enter, it yields
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined
It looks like the objects are erased from the memory as soon as the compilation is over.
Could somebody explain what is the reason and how to tackle this problem?
Thank you!
As you already noticed, objects from a previous run are not kept in memory: every execution of the code starts from a clean state.
When you run just the print statement, it is as if you ran print(x) in a new file without defining x.
To my knowledge, this can't be changed, because the Python interpreter works that way: it creates a temporary file with the selected code and runs that. The objects are not defined in that file, and thus it raises an exception. To avoid the error, select the lines that define x together with print(x) (here, all three lines) and run them as one selection.
Background
Consider the following minimal example:
When I save the following script and run it from terminal,
import time
time.sleep(5)
raise Exception
the code will raise an error after sleeping five seconds, leaving the following traceback.
Traceback (most recent call last):
  File "test/minimal_error.py", line 4, in <module>
    raise Exception
Exception
Now, say, I run the script, and during the 5-second-sleep, I add a line in the middle.
import time
time.sleep(5)
a = 1
raise Exception
After the python interpreter wakes up from the sleep and reaches the next line, raise Exception, it will raise the error, but it leaves the following traceback.
Traceback (most recent call last):
  File "test/minimal_error.py", line 4, in <module>
    a = 1
Exception
So the obvious problem is that it doesn't print the actual code that caused the error. Although it gives the correct line number (correctly reflecting the version of the script that is running, while understandably useless) and a proper error message, I can't really know what piece of code actually caused the error.
In real practice, I implement one part of a program, run it to see if that part is doing fine, and while it is still running, I move on to the next thing I have to implement. And when the script throws an error, I have to find which actual line of code caused the error. I usually just read the error message and try to deduce the original code that caused it. Sometimes it isn't easy to guess, so I copy the script to the clipboard, roll back the code by undoing what I've written after running the script, check the line that caused the error, and paste back from the clipboard.
Question
Is there any understandable reason why the interpreter shows a = 1, which is line 4 of the "current" version of the code, instead of raise Exception, which is line 4 of the "running" version of the code? If the interpreter knows "line 4" caused the error and the error message is "Exception", why can't it say the command raise Exception raised it?
I'm not really sure if this question is on-topic here, but I don't think I can conclude it off-topic from what the help center says. It is about "[a] software [tool] commonly used by programmers" (the Python interpreter) and is "a practical, answerable problem that is unique to software development," I think. I don't think it's opinion-based, because there should be a reason for this choice of implementation.
(Observed the same in Python 2.7.16, 3.6.8, 3.7.2, and 3.7.3, so it doesn't seem to be version-specific, but a thing that just happens in Python.)
The immediate reason is that Python re-opens the file and reads the specified line again to print it in error messages. So why would it need to do that when it already read the file in the beginning? Because it doesn't keep the source code in memory, only the generated byte code.
In fact, Python will never hold the entire contents of the source file in memory at one time. Instead the lexer will read from the file and produce one token at a time, which the parser then parses and turns into byte code. Once the parser is done with a token, it's gone.
So the only way to get back at the original source code is to open the source file again.
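The effect can be reproduced in a few lines. The sketch below is only an illustration: it uses the traceback module, which looks source lines up from disk through linecache, rather than the interpreter's built-in printer, and stale_source_demo is a made-up name. It compiles one version of a file, overwrites the file, and then asks for the source text of the failing line.

```python
import os
import tempfile
import traceback


def stale_source_demo():
    """Return (line number, source text) reported for an error raised by
    code whose file was edited after it was compiled."""
    original = "import time\nraise Exception\n"
    edited = "import time\na = 1\nraise Exception\n"
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "minimal_error.py")
        with open(path, "w") as f:
            f.write(original)
        # Compile the "running" version; only the code object is kept.
        code = compile(original, path, "exec")
        # Edit the file on disk while the old code object is still in use.
        with open(path, "w") as f:
            f.write(edited)
        try:
            exec(code, {})
        except Exception as exc:
            # The source text is read from the file as it is now.
            frame = traceback.extract_tb(exc.__traceback__)[-1]
            return frame.lineno, frame.line
```

The line number comes from the compiled code, while the text comes from the edited file, which is the mismatch described in the question.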
I think it is a classic problem, which is described here.
sleep uses an OS system call to pause execution of the current thread.
I am wondering if it is possible to edit/customize the behavior and printout of built-in errors in Python. For example, if I type:
>>> a = 1
>>> print A
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'A' is not defined
I want the output to instead be:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'A' is not defined. Check capitalization.
Moreover, I want this to occur at a global level, for ALL FUTURE CODE, without having to explicitly include an exception in my code. If such a change is possible, I would assume this needs to be done at the very source or library-file level of Python. However, I am not sure where exactly to look to know if this is even possible.
I am using Python 2.7 on both Ubuntu and OSX, so help on either system would be appreciated.
(My apologies in advance if this is covered elsewhere, but searching for threads on "changing Python error messages" generally gave me topics on Exceptions, which is not necessarily my interest here. If anyone can point me to a page on this though, I'd greatly appreciate it.)
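Yes, there is a way to do what you want!
traceback.py is the standard-library module that formats and prints tracebacks (the error output that you see); it does not detect the errors itself.
You can find this file in the library folder of your Python installation.
In that file you can change how the output is formatted when an error occurs. Note that the message text itself (for example "name 'A' is not defined") is produced by the interpreter, and the default handler for uncaught exceptions may not go through traceback.py, so check that your change actually takes effect.
Please tell me if this helped you!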
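An approach that avoids editing library files is to replace sys.excepthook, which is called for every uncaught exception. The following is a minimal sketch in Python 3 syntax; format_with_hint and hinting_excepthook are names invented for this example, and the hint text is the one from the question.

```python
import sys
import traceback


def format_with_hint(exc_type, exc_value, exc_tb):
    """Format the usual traceback, adding a hint to NameError messages."""
    lines = traceback.format_exception(exc_type, exc_value, exc_tb)
    if issubclass(exc_type, NameError):
        lines[-1] = lines[-1].rstrip("\n") + ". Check capitalization.\n"
    return "".join(lines)


def hinting_excepthook(exc_type, exc_value, exc_tb):
    """Replacement for sys.excepthook that prints the hinted traceback."""
    sys.stderr.write(format_with_hint(exc_type, exc_value, exc_tb))
```

Setting sys.excepthook = hinting_excepthook in a startup file (for example one named by the PYTHONSTARTUP environment variable for interactive sessions) would apply it without touching each script.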
In my experience programming with Java, I have become quite fond of the stack traces it generates when my code goes awry, but I feel that the traces generated by python are a bit lacking by comparison. For example, a trace in java might look like this:
java.lang.RuntimeException
    at test.package.Example.c(Example.java:20)
    at test.package.Example.b(Example.java:15)
    at test.package.Example.a(Example.java:10)
Whereas a python trace might look like this:
Traceback (most recent call last):
  File "example.py", line 10, in <module>
    a()
  File "example.py", line 2, in a
    b()
  File "example.py", line 5, in b
    c()
  File "example.py", line 8, in c
    raise Exception
Exception
While both of these traces convey basically the same information, I personally find that the trace from java is easier to follow.
Is there a means to change the format python uses for printing its stack traces, or would that sort of change require me to create a custom exception handler at the root of my program?
Using the traceback module:
import traceback
try:
    x = 1/0
except Exception as e:
    print(e)
    traceback.print_exc()
There is a means to change the format Python uses for its stack traces, and that is to write your own formatter. There is only one built-in format.
You can assign your own function to sys.excepthook and it will act as a top-level exception handler that receives exceptions that are about to propagate uncaught and cause the program to exit. There you can make use of the traceback object to format things however you like. Triptych's answer shows how to use the traceback module to get the info for each stack frame. extract_tb returns, for each stack frame, a 4-tuple of the filename, line number, function name, and source text of the offending line, so if you don't want to display the source text you can just throw that away and concatenate the rest. But you'll have to do the work of constructing whatever output you want to see.
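As an illustration of that approach, here is a small Python 3 sketch that prints frames innermost first, roughly in the Java layout from the question. The names format_java_style and java_style_excepthook are invented for this example, and the exact layout is only one possible choice.

```python
import os
import sys
import traceback


def format_java_style(exc):
    """Format exc roughly like a Java stack trace, innermost frame first."""
    header = type(exc).__name__
    if str(exc):
        header += ": " + str(exc)
    lines = [header]
    for frame in reversed(traceback.extract_tb(exc.__traceback__)):
        lines.append(
            "    at {}({}:{})".format(
                frame.name, os.path.basename(frame.filename), frame.lineno
            )
        )
    return "\n".join(lines)


def java_style_excepthook(exc_type, exc_value, exc_tb):
    """Top-level handler: print uncaught exceptions in the compact format."""
    print(format_java_style(exc_value), file=sys.stderr)
```

Assigning sys.excepthook = java_style_excepthook at program start would route every uncaught exception through this formatter; the source text of each line is simply not printed.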
If you really want to, you can reformat exception tracebacks with the traceback.extract_tb method.
ref: https://docs.python.org/2/library/traceback.html#traceback.extract_tb
I'm working on a Python library used by third-party developers to write extensions for our core application.
I'd like to know if it's possible to modify the traceback when raising exceptions, so the last stack frame is the call to the library function in the developer's code, rather than the line in the library that raised the exception. There are also a few frames at the bottom of the stack containing references to functions used when first loading the code that I'd ideally like to remove too.
Thanks in advance for any advice!
You can remove the top of the traceback easily by re-raising with the tb_next element of the traceback (Python 2 syntax):
import sys
try:
    ...  # call into the library here
except:
    ei = sys.exc_info()
    # three-argument raise: exception type, value, traceback
    raise ei[0], ei[1], ei[2].tb_next
tb_next is a read-only attribute (in Python 2; it became writable in Python 3.7), so I don't know of a way to remove stuff from the bottom. You might be able to screw with the properties mechanism to allow access to the property, but I don't know how to do that.
Take a look at what jinja2 does here:
https://github.com/mitsuhiko/jinja2/blob/5b498453b5898257b2287f14ef6c363799f1405a/jinja2/debug.py
It's ugly, but it seems to do what you need done. I won't copy-paste the example here because it's long.
Starting with Python 3.7, you can instantiate a new traceback object and use the .with_traceback() method when raising. Here's some demo code that obtains the caller's frame, using either sys._getframe(1) or a more robust alternative, and raises an AssertionError while making your debugger believe the error occurred at the myassert(False) call; starting from the caller's frame omits the top stack frame.
What I should add is that while this looks fine in the debugger, the console behavior unveils what this is really doing:
Traceback (most recent call last):
  File ".\test.py", line 35, in <module>
    myassert_false()
  File ".\test.py", line 31, in myassert_false
    myassert(False)
  File ".\test.py", line 26, in myassert
    raise AssertionError().with_traceback(back_tb)
  File ".\test.py", line 31, in myassert_false
    myassert(False)
AssertionError
Rather than removing the top of the stack, I have added a duplicate of the second-to-last frame.
Anyway, I focus on how the debugger behaves, and it seems this one works correctly:
"""Modify traceback on exception.
See also https://github.com/python/cpython/commit/e46a8a
"""

import sys
import types


def myassert(condition):
    """Throw AssertionError with modified traceback if condition is False."""
    if condition:
        return

    # This function ... is not guaranteed to exist in all implementations of Python.
    # https://docs.python.org/3/library/sys.html#sys._getframe
    # back_frame = sys._getframe(1)
    try:
        raise AssertionError
    except AssertionError:
        traceback = sys.exc_info()[2]
        back_frame = traceback.tb_frame.f_back
    back_tb = types.TracebackType(tb_next=None,
                                  tb_frame=back_frame,
                                  tb_lasti=back_frame.f_lasti,
                                  tb_lineno=back_frame.f_lineno)
    raise AssertionError().with_traceback(back_tb)


def myassert_false():
    """Test myassert(). Debugger should point at the next line."""
    myassert(False)


if __name__ == "__main__":
    myassert_false()
You might also be interested in PEP 3134, which is implemented in Python 3 and allows you to tack one exception/traceback onto an upstream exception.
This isn't quite the same thing as modifying the traceback, but it would probably be the ideal way to convey the "short version" to library users while still having the "long version" available.
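A short sketch of what that chaining looks like in practice follows. LibraryError, _parse_port and load_port are hypothetical names standing in for a library's public exception type, an internal helper, and a public function.

```python
class LibraryError(Exception):
    """Hypothetical public exception type of the library."""


def _parse_port(text):
    # Internal helper; raises ValueError on bad input.
    return int(text)


def load_port(text):
    """Public function: hide the internal error behind LibraryError."""
    try:
        return _parse_port(text)
    except ValueError as exc:
        # "from exc" stores the original exception on __cause__, so the
        # long version stays available below the short one.
        raise LibraryError("invalid port: {!r}".format(text)) from exc
```

The default traceback printer shows both exceptions, joined by a note that the first was the direct cause of the second, while code that catches LibraryError can still inspect the original through its __cause__ attribute.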
What about not changing the traceback? The two things you request can both be done more easily in a different way.
If the exception from the library is caught in the developer's code and a new exception is raised instead, the original traceback will of course be tossed. This is how exceptions are generally handled... if you just allow the original exception to be raised but you munge it to remove all the "upper" frames, the actual exception won't make sense since the last line in the traceback would not itself be capable of raising the exception.
To strip out the last few frames, you can request that your tracebacks be shortened... things like traceback.print_exception() take a "limit" parameter which you could use to skip the last few entries.
That said, it should be quite possible to munge the tracebacks if you really need to... but where would you do it? If in some wrapper code at the very top level, then you could simply grab the traceback, take a slice to remove the parts you don't want, and then use functions in the "traceback" module to format/print as desired.
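The "grab the traceback, take a slice, format it" idea could be sketched like this in Python 3. The function name format_trimmed and its skip_outer and skip_inner parameters are invented for the illustration.

```python
import traceback


def format_trimmed(exc, skip_outer=0, skip_inner=0):
    """Format exc's traceback without the first `skip_outer` (outermost)
    and the last `skip_inner` (innermost) frames."""
    frames = traceback.extract_tb(exc.__traceback__)
    kept = frames[skip_outer:len(frames) - skip_inner]
    lines = ["Traceback (most recent call last):\n"]
    lines += traceback.format_list(kept)
    lines += traceback.format_exception_only(type(exc), exc)
    return "".join(lines)
```

In wrapper code at the very top level, skip_outer would drop the loader frames and skip_inner the frames inside the library, leaving the developer's own call as the last entry. This only changes what is printed; the exception object and its real traceback are left untouched.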
For python3, here's my answer. Please read the comments for an explanation:
def pop_exception_traceback(exception, n=1):
    # Takes an exception, mutates it, then returns it.
    # Often when writing my repl, tracebacks will contain an annoying level of function calls (including the 'exec' that ran the code).
    # This function pops 'n' levels off of the stack trace generated by exception.
    # For example, if print_stack_trace(exception) originally printed:
    #   Traceback (most recent call last):
    #     File "<string>", line 2, in <module>
    #     File "<string>", line 2, in f
    #     File "<string>", line 2, in g
    #     File "<string>", line 2, in h
    #     File "<string>", line 2, in j
    #     File "<string>", line 2, in k
    # then print_stack_trace(pop_exception_traceback(exception, 3)) would print:
    #     File "<string>", line 2, in h
    #     File "<string>", line 2, in j
    #     File "<string>", line 2, in k
    # (It popped the first 3 levels, i.e. <module>, f and g, off the traceback.)
    for _ in range(n):
        exception.__traceback__ = exception.__traceback__.tb_next
    return exception
This code might be of interest for you.
It takes a traceback and removes the first file, which should not be shown. Then it simulates the Python behavior:
Traceback (most recent call last):
will only be shown if the traceback contains more than one file.
This looks exactly as if my extra frame was not there.
Here is my code, assuming there is a string text:
import sys
import traceback

try:
    exec(text)
except:
    # we want to format the exception as if no frame was on top.
    exp, val, tb = sys.exc_info()
    listing = traceback.format_exception(exp, val, tb)
    # remove the entry for the first frame
    del listing[1]
    files = [line for line in listing if line.startswith("  File")]
    if len(files) == 1:
        # only one file, remove the header.
        del listing[0]
    print("".join(listing), file=sys.stderr)
    sys.exit(1)