This is a bit of an odd question; it came up in the context of a tool that exposes a Python API, which we spend a lot of time querying interactively from the REPL. The particular idiom causing issues is something like this:
for var in slow_generator_of_giant_list():
    stats = update(stats, var)
print stats
To enter this at the REPL, I can type this:
>>> for var in slow_generator_of_giant_list():
...     stats = update(stats, var)
...
If I now attempt to type the print, I get a syntax error due to improper indentation. (Or else I put the print inside the loop and do it on every iteration.)
But if I hit enter to go to the next line, the loop runs immediately, and I have to wait for it to finish, or type the print command in the face of possible output coming at me, etc.
Obviously I can define a function containing the above, and it might be worth saving into a file anyway, but in the general case we're constructing these on the fly, and it would be nice to have a way to "schedule" a command to run after the end of a loop from the REPL. In a language with block delimiters, I could of course put it after the ending delimiter (and any necessary statement separator). But my coworkers and I were stumped trying to do something similar here.
Is there perhaps an ugly abuse of Pythonic syntax that will do the trick that my coworkers and I couldn't think of? Or a recommended way to avoid the problem while still making it easy to throw together ad hoc interactive queries?
Thanks for any pointers.
Not beautiful, but this should work:
>>> mygen = slow_generator_of_giant_list()
>>> try:
...     while True: stats = update(stats, mygen.next())
... except StopIteration:
...     print stats
...
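Another option, not taken from the answer above, is the loop's own else clause: a for/else is a single compound statement, so the REPL accepts it in one go and runs the else body once the loop finishes without a break. The sketch below uses Python 3 syntax, and the two helper functions are hypothetical stand-ins for the real ones in the question.

```python
def slow_generator_of_giant_list():
    # Hypothetical stand-in for the real, slow data source.
    for value in range(5):
        yield value

def update(stats, var):
    # Hypothetical stand-in: keep a running total.
    return stats + var

stats = 0
for var in slow_generator_of_giant_list():
    stats = update(stats, var)
else:
    # Runs once, after the loop ends without hitting `break`.
    print(stats)
```

At the interactive prompt, the else line is typed at the same indentation level as the for line, before the blank line that ends the block.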
I would just say that you would find it easier not to use the interactive shell for this.
It's not much effort to save a file and run it. You only have to keep it around for as long as you use it.
I actually found this approach useful when answering on SO. I keep a file open in my text editor with a terminal in the right directory, and just use it as a scratchpad for mocking up answers.
Okay, so let me just say beforehand: I am new to Python. I was just experimenting with IDLE and then I had this weird "crash". I put "crash" inside speech marks because I'm not sure if it qualifies as a crash, as rather than the program just crashing the way a normal program would in Windows, it still runs, but whenever I press enter and try and get it to accept new text it doesn't do anything. E.g. if you try and type "print('a')" and then hit enter it just goes to the next line (and doesn't print 'a'). I tried to make a simple function which converted an integer to a string where each character in the string was either a '1' or a '0', forming the binary number representing said (unsigned) integer.
>>> def int_to_str(int_in):
        str_out=''
        bit_val=1<<int_in.bit_length()
        while(int_in>0):
            if(int_in>bit_val):
                str_out+='1'
                int_in-=bit_val
            else:
                str_out+='0'
            bit_val>>=1
        return str_out
>>> print('a')
print('c')
Basically, it becomes completely unresponsive to my input, and allows me to edit/change "print('a')" even though I shouldn't be able to if it had actually "accepted" my input. Why is this? What have I done wrong/messed up?
Also, I made sure it isn't something else I was previously messing around with by closing the shell, opening it again and only putting in said code for the "int_to_str" function, and I haven't changed any settings or imported any modules beforehand or anything like that (in case it matters).
EDIT: I tried reinstalling, and that helped a bit in that I can now do other stuff fine, but the moment I try to use the "int_to_str()" function, it has this same weird behaviour of not accepting/interpreting any more user input.
Your while loop never terminates; you need to rearrange your logic. Printing variables can be an effective debugging tool, like this:
>>> def int_to_str(int_in):
        str_out=''
        bit_val=1<<int_in.bit_length()
        while(int_in>0):
            print(int_in, bit_val)
            if(int_in>bit_val):
                str_out+='1'
                int_in-=bit_val
            else:
                str_out+='0'
            bit_val>>=1
        return str_out
If your program seems to be running for too long, you can stop it with Ctrl-C.
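For reference, here is one possible rearrangement that does terminate. It is only a sketch, not the asker's intended design: it starts at the highest set bit, compares with >= so that an exact match is subtracted, and loops on bit_val rather than int_in. In the original, the strict > means a value equal to bit_val is never subtracted, so bit_val eventually reaches 0 while int_in is still positive.

```python
def int_to_str(int_in):
    """Return the binary digits of a non-negative integer as a string."""
    if int_in == 0:
        return '0'
    str_out = ''
    # Start at the highest set bit, not one position above it.
    bit_val = 1 << (int_in.bit_length() - 1)
    # Loop over bit positions, so the loop ends when bit_val shifts down to 0.
    while bit_val > 0:
        if int_in >= bit_val:
            str_out += '1'
            int_in -= bit_val
        else:
            str_out += '0'
        bit_val >>= 1
    return str_out

print(int_to_str(5))
```

The result can be checked against the built-in bin(), which returns the same digits with a '0b' prefix.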
Is there a neat way to inject failures in a Python script? I'd like to avoid sprinkling the source code with stuff like:
failure_ABC = True
failure_XYZ = True
def inject_failure_ABC():
    raise Exception('ha! a fake error')
def inject_failure_XYZ():
    # delete some critical file
    pass
# some real code
if failure_ABC:
    inject_failure_ABC()
# some more real code
if failure_XYZ:
    inject_failure_XYZ()
# even more real code
Edit:
I have the following idea: insert "failure points" as specially-crafted comments. Then write a simple parser that will be called before the Python interpreter and will produce the actual instrumented Python script with the actual failure code. E.g.:
#!/usr/bin/parser_script_producing_actual_code_and_calls python
# some real code
# FAIL_123
if foo():
    # FAIL_ABC
    execute_some_real_code()
else:
    # FAIL_XYZ
    execute_some_other_real_code()
Anything starting with FAIL_ is considered as a failure point by the script, and depending on a configuration file the failure is enabled/disabled. What do you think?
You could use a mocking library, for example unittest.mock; many third-party ones exist as well. You can then mock some object used by your code so that it throws your exception or behaves in whatever way you want it to.
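As a rough illustration of the mocking approach, the sketch below uses unittest.mock.patch with side_effect to make the built-in open() raise. The read_config function and the file name are hypothetical, made up only for this example.

```python
from unittest import mock

def read_config(path):
    # Hypothetical production code that depends on the built-in open().
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return None  # the error-handling branch we want to exercise

# Replace open() with a mock that raises, so no real file is touched.
with mock.patch('builtins.open', side_effect=OSError('injected failure')):
    result = read_config('settings.ini')

print(result)
```

The production code contains no failure flags at all; the failure is injected from the test side and disappears when the with block ends.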
When testing error handling, the best approach is to isolate the code that can throw errors in a new method which you can override in a test:
class ToTest:
    def foo(...):
        try:
            self.bar() # We want to test the error handling in foo()
        except:
            ....
    def bar(self):
        ... production code ...
In your test case, you can extend ToTest and override bar() with code that throws the exceptions that you want to test.
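A runnable sketch of that override pattern might look like the following. The return values, the choice of IOError and the FailingToTest name are assumptions made for illustration; the original leaves those details elided.

```python
class ToTest:
    def foo(self):
        try:
            self.bar()
        except IOError:
            return 'recovered'  # hypothetical error handling
        return 'ok'

    def bar(self):
        pass  # stands in for the production code


class FailingToTest(ToTest):
    def bar(self):
        # Test-only override that injects the failure.
        raise IOError('injected failure')


print(ToTest().foo())
print(FailingToTest().foo())
```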
EDIT You should really consider splitting large methods into smaller ones. It will make the code easier to test, to understand and to maintain. Have a look at Test Driven Development for some ideas how to change your development process.
Regarding your idea to use "Failure Comments". This looks like a good solution. There is one small problem: You will have to write your own Python parser because Python doesn't keep comments when it produces bytecode.
So you can either spend a couple of weeks to write this or a couple of weeks to make your code easier to test.
There is one difference, though: If you don't go all the way, the parser will be useless. Also, the time spent won't have improved one bit of your code. Most of the effort will go into the parser and tools. So after all that time, you will still have to improve the code, add failure comments and write the tests.
With refactoring the code, you can stop whenever you want but the time spent so far will be meaningful and not wasted. Your code will start to get better with the first change you make and it will keep improving.
Writing a complex tool takes time, and it will have its own bugs, which you will need to fix or work around. None of this will improve your situation in the short term, and you have no guarantee that it will improve it in the long term.
If you only want to stop your code at some point and fall back to the interactive interpreter, you can use:
assert 1==0
But this only works if you do not run python with -O, which strips assert statements.
Edit
Actually, my first answer was too quick, without really understanding what you want to do, sorry.
Maybe your code already becomes more readable if you parameterize through parameters, not through variable/function suffixes. Something like
failure = {"ABC": False, "XYZ":False}
#Do something, maybe set failure
def inject_failure(failure):
    if not any(failure.values()):
        return
    if failure["ABC"]:
        raise Exception('ha! a fake error')
    elif failure["XYZ"]:
        # delete some critical file
        pass
inject_failure(failure)
I'm looking at several cases where it would be far, far, far easier to accept nearly-raw code. So,
What's the worst you can do with an expression if you can't lambda, and how?
What's the worst you can do with executed code if you can't use import and how?
(can't use X == string is scanned for X)
Also, B is unnecessary if someone can think of an expr such that, given d = {key:value,...}:
expr.format(key) == d[key]
Without changing the way the format looks.
The worst you can do with an expression is on the order of
__import__('os').system('rm -rf /')
if the server process is running as root. Otherwise, you can fill up memory and crash the process with
2**2**1024
or bring the server to a grinding halt by executing a shell fork bomb:
__import__('os').system(':(){ :|:& };:')
or execute a temporary (but destructive enough) fork bomb in Python itself:
[__import__('os').fork() for i in xrange(2**64) for x in range(i)]
Scanning for __import__ won't help, since there's an infinite number of ways to get to it, including
eval(''.join(['__', 'im', 'po', 'rt', '__']))
getattr(__builtins__, '__imp' + 'ort__')
getattr(globals()['__built' 'ins__'], '__imp' + 'ort__')
Note that the eval and exec functions can also be used to create any of the above in an indirect way. If you want safe expression evaluation on a server, use ast.literal_eval.
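A short sketch of how ast.literal_eval behaves: it accepts plain literals and raises ValueError for anything that involves a call or a name lookup. The input strings here are made up for illustration.

```python
import ast

# Plain literals (numbers, strings, lists, dicts, ...) are accepted.
value = ast.literal_eval("{'key': [1, 2, 3]}")
print(value)

# Anything involving a function call or name lookup is rejected.
try:
    ast.literal_eval("__import__('os').system('echo hi')")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```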
Arbitrary Python code?
Opening, reading, writing, creating files on the partition. Including filling up all the disk space.
Infinite loops that put load on the CPU.
Allocating all the memory.
Doing things that are in pure Python modules without importing them by copy/pasting their code into the expression (messing with built in Python internals and probably finding a way to access files, execute them or import modules).
...
No amount of whitelisting or blacklisting is going to keep people from getting to dangerous parts of Python. You mention running in a sandbox where "open" is not defined, for example. But I can do this to get it:
real_open = getattr(os, "open")
and if you say I won't have os, then I can do:
real_open = getattr(sys.modules['os'], "open")
or
real_open = random.__builtins__['open']
etc, etc, etc. Everything is connected, and the real power is in there somewhere. Bad guys will find it.
I find myself adding debugging "print" statements quite often -- stuff like this:
print("a_variable_name: %s" % a_variable_name)
How do you all do that? Am I being neurotic in trying to find a way to optimize this? I may be working on a function and put in a half-dozen or so of those lines, figure out why it's not working, and then cut them out again.
Have you developed an efficient way of doing that?
I'm coding Python in Emacs.
Sometimes a debugger is great, but sometimes using print statements is quicker, and easier to set up and use repeatedly.
This may only be suitable for debugging with CPython (since not all Pythons implement inspect.currentframe and inspect.getouterframes), but I find this useful for cutting down on typing:
In utils_debug.py:
import inspect
def pv(name):
    record=inspect.getouterframes(inspect.currentframe())[1]
    frame=record[0]
    val=eval(name,frame.f_globals,frame.f_locals)
    print('{0}: {1}'.format(name, val))
Then in your script.py:
from utils_debug import pv
With this setup, you can replace
print("a_variable_name: %s" % a_variable_name)
with
pv('a_variable_name')
Note that the argument to pv should be the string (variable name, or expression), not the value itself.
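As a side note that is not part of the answer above: on Python 3.8 and later, f-strings accept a trailing = inside the braces, which prints the expression text along with its value and covers much the same need without a helper. A minimal sketch, with a made-up variable:

```python
a_variable_name = 42

# The "=" specifier expands to "<expression>=<repr of value>".
message = f"{a_variable_name=}"
print(message)
```

Unlike pv, this takes the expression itself rather than a string, so there is no eval and no frame inspection involved.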
To remove these lines using Emacs, you could
C-x ( # start keyboard macro
C-s pv('
C-a
C-k # change this to M-; if you just want to comment out the pv call
C-x ) # end keyboard macro
Then you can call the macro once with C-x e
or a thousand times with C-u 1000 C-x e
Of course, you have to be careful that you do indeed want to remove all lines containing pv(' .
Don't do that. Use a decent debugger instead. The easiest way to do that is to use IPython and either to wait for an exception (the debugger will set off automatically), or to provoke one by running an illegal statement (e.g. 1/0) at the part of the code that you wish to inspect.
I came up with this:
Python string interpolation implementation
I'm just testing it and it's proving handy for me while debugging.
Sometimes I'll be working with, say, a list of thousands of items in IDLE, and accidentally print it out to the shell. When this happens, it crashes or at least very significantly slows down IDLE. As you can imagine, this is extremely inconvenient.
Is there a way to make it, rather than printing the entire thing, just give me a summarised [1, 2, ...] output?
Any help would be much appreciated.
As above, try a custom print function like:
def my_print(obj):
    if hasattr(obj, '__len__') and len(obj) > 100:
        print '... omitted object of %s with length %d ...' % (type(obj), len(obj))
    else: print obj
Use IPython as shell instead.
You could use a custom print function.
In Python 3, since print is a function, you should be able to "override" it. (I don't have it installed so I can't try it out to make sure.) Probably not recommended for real applications but if you're just trying things out, it would be okay I suppose.
It would go something like:
def myprint(*args):
    # write the function as described by other people
print = myprint
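One way to fill in such a function on Python 3 is the standard library's reprlib module, which abbreviates long containers with "...". This is a sketch under that assumption, and the name summarising_print is made up for illustration.

```python
import builtins
import reprlib

def summarising_print(*args, **kwargs):
    # reprlib.repr() abbreviates long containers and strings with "...".
    # Call builtins.print explicitly so that rebinding the name `print`
    # to this function later does not cause infinite recursion.
    builtins.print(*(reprlib.repr(arg) for arg in args), **kwargs)

summary = reprlib.repr(list(range(1000)))
summarising_print(list(range(1000)))
```

In the shell, print = summarising_print would then route every call through it. Note that reprlib.repr() shows strings with their quotes, so the output of ordinary text differs slightly from the built-in print.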
The Squeezer extension for IDLE was written to do just this. From the description on PyPI:
IDLE can hang if very long output is printed. To avoid this, the Squeezer
extensions catches any output longer than 80 lines of text (configurable) and
displays a rectangular box instead:
Squeezer and many other IDLE extensions are included in IdleX.