Chained conditions that require ordering to not produce errors - python

Whenever I chain conditions in Python (or any other language, to be honest) I stumble upon this question, and it kicks me out of the productive "zone".
When I chain conditions I can, by ordering them correctly, check conditions that would produce an error if the earlier conditions were not checked first.
As an example, let's assume the following snippet:
if "attr" in some_dictionary and some_value in some_dictionary["attr"]:
    print("whooohooo")
If the first condition weren't in the first place, or were absent entirely, the second condition might produce a KeyError.
I do this pretty often simply to save space in the code, but I have always wondered whether this is good style, whether it comes with a risk, or whether it is simply "pythonic".

A more Pythonic way is to "ask for forgiveness rather than permission". In other words, use a try-except block:
try:
    if some_value in some_dictionary["attr"]:
        print("Woohoo")
except KeyError:
    pass
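A third option worth noting (a sketch, not from the original answer) sidesteps both the guard and the exception by using dict.get with a default, so the membership test is always safe:

```python
some_dictionary = {"attr": ["a", "b"]}  # hypothetical data for illustration
some_value = "a"

# dict.get returns the default (here an empty tuple) when the key is missing,
# so the membership test cannot raise a KeyError.
if some_value in some_dictionary.get("attr", ()):
    print("Woohoo")
```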

Python evaluates boolean operators lazily, left to right, which is reflected in these kinds of checks. The behavior is called short-circuit evaluation. One thing I often do is:
def do(condition_check=None):
    if condition_check is not None and condition_check():
        ...  # do stuff
Now, many people will argue that try: except: is more appropriate. This really depends on the use case!
if expressions are faster when the check is likely to fail, so use them when you know what is happening.
try expressions are faster when the check is likely to succeed, so use them to safeguard against exceptional circumstances.
if is explicit, so you know precisely what you are checking. Use it if you know what is happening, i.e. strongly typed situations.
try is implicit, so you only have to care about the outcome of a call. Use it when you don't care about the details, i.e. in weakly typed situations.
if works in a well-defined scope - namely right where you are performing the check. Use it for nested relations, where you want to check the top-most one.
try works on the entire contained call stack - an exception may be thrown several function calls deeper. Use it for flat or well-defined calls.
Basically, if is a precision tool, while try is a hammer - sometimes you need precision, and sometimes you just have nails.
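Short-circuit evaluation can be observed directly. A small sketch (the helper function is invented for illustration) records which operands of an `and` expression actually get evaluated:

```python
calls = []

def record(name, value):
    # Record that this operand was evaluated, then return its value.
    calls.append(name)
    return value

# "and" stops at the first falsy operand, so the second operand never runs.
result = record("first", False) and record("second", True)
print(calls)   # only the first operand was evaluated
print(result)  # the falsy value itself is returned
```

This is exactly why `"attr" in some_dictionary and some_value in some_dictionary["attr"]` is safe: the lookup on the right is never evaluated when the key test on the left fails.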

Related

string to (possible) integer. Error Handling: Filter with regex or try...catch?

Sitting before this problem, I'm not sure which path I should choose.
I get string inputs representing an ID.
Most of the time it will be a meaningful keyname used to look up the actual ID in a table/dict.
But direct integer ID input should be possible as well, and these will also come as a string.
I'm not sure if this is more a style question or if there is a strong leaning towards one option.
Option 1:
Use a regexp, check for integer,
if true convert.
Option 2:
try:
    ID = string.tointeger()
catch:
    ID = table[string]
I'm leaning towards the try...catch option as it looks cleaner. But I'm not sure whether avoiding error handling via regex would actually be the cleaner way deep down and should be preferred.
Having no deeper knowledge about try...catch: is it actually pretty smooth ("use it if you like") or a hiccup ("avoid if you can")?
"He who would sacrifice correctness for performance deserves neither[!?]."
TL;DR:
try...catch can be very fast, and it looks cleaner than code trying to work around errors. On the other hand, it might be harder for others to follow when the guarded blocks are big and things happen behind the scenes that you possibly don't want.
This is of course language dependent
regexp can be very slow and a cheap filter before it can save you.
--
So trying to answer this myself now:
I hoped for a general answer but in the end of course it is language dependent.
While doing research I often found the argument that it is expensive: creating the exception object, collecting the call stack... Sure, that makes sense. But often there was no further explanation, and it felt more like a stigma or superstition: try...catch is bad.
From my tests (C++): the try...catch method was the fastest overall, a mere 3% drop in execution speed. Two simple "Does the string start with digits?" regexps ("^-?\d+" for integer and float) were a 50% drop, and as I do some substring analysis after these, the 50%^x became noticeable.
In the end I stumbled over Bjarne Stroustrup's (creator of C++) own FAQ:
http://www.stroustrup.com/bs_faq2.html#exceptions-why
What good can using exceptions do for me? The basic answer is: Using
exceptions for error handling makes your code simpler, cleaner, and
less likely to miss errors. But what's wrong with "good old errno and
if-statements"? The basic answer is: Using those, your error handling
and your normal code are closely intertwined. That way, your code gets
messy and it becomes hard to ensure that you have dealt with all
errors (think "spaghetti code" or a "rat's nest of tests").
[...]
Common objections to the use of exceptions:
but exceptions are expensive!: Not really. Modern C++ implementations
reduce the overhead of using exceptions to a few percent (say, 3%) and
that's compared to no error handling. Writing code with error-return
codes and tests is not free either. As a rule of thumb, exception
handling is extremely cheap when you don't throw an exception. It
costs nothing on some implementations. All the cost is incurred when
you throw an exception: that is, "normal code" is faster than code
using error-return codes and tests. You incur cost only when you have
an error.
Well, in the end I decided to use a simple manual filter (is the first char in [-0-9]?) before the regexp. This might be 10% slower than try...catch, but it does not throw errors 80% of the time. Still good performance and nice code :)
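Translated to Python, that cheap first-character filter might look like the following sketch (the table and function names are invented for illustration):

```python
lookup_table = {"alpha": 1, "beta": 2}  # hypothetical keyname -> ID table

def resolve_id(text, table):
    # Cheap filter: only attempt int() when the string can plausibly be an
    # integer, so the common keyname case never raises at all.
    if text[:1] in "-0123456789":
        try:
            return int(text)
        except ValueError:
            pass  # e.g. "-abc" slips past the one-character filter
    # Fall back to the keyname lookup.
    return table[text]

print(resolve_id("42", lookup_table))
print(resolve_id("alpha", lookup_table))
```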
From the Python glossary: https://docs.python.org/3/glossary.html#term-eafp
EAFP
Easier to ask for forgiveness than permission. This common Python
coding style assumes the existence of valid keys or attributes and
catches exceptions if the assumption proves false. This clean and fast
style is characterized by the presence of many try and except
statements. The technique contrasts with the LBYL style common to many
other languages such as C.
LBYL
Look before you leap. This coding style explicitly tests for pre-conditions before making calls or lookups. This style contrasts
with the EAFP approach and is characterized by the presence of many if
statements.
In a multi-threaded environment, the LBYL approach can risk introducing a race condition between “the looking” and “the leaping”.
For example, the code, if key in mapping: return mapping[key] can fail
if another thread removes key from mapping after the test, but before
the lookup. This issue can be solved with locks or by using the EAFP
approach.
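The race condition the glossary describes can be made concrete. In this single-threaded sketch both styles behave the same, but only the EAFP version performs the "check" and the lookup as one indivisible operation:

```python
mapping = {"key": "value"}

# LBYL: between this membership test and the lookup below, another thread
# could delete the key, and the lookup would then raise KeyError.
if "key" in mapping:
    lbyl_result = mapping["key"]

# EAFP: the lookup itself is the check, so there is no window in which
# another thread can invalidate the assumption.
try:
    eafp_result = mapping["key"]
except KeyError:
    eafp_result = None
```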

Most pythonic way to call dependent methods

I have a class with a few methods; each one sets some internal state, and usually requires some other method to be called first, to set the stage.
Typical invocation goes like this:
c = MyMysteryClass()
c.connectToServer()
c.downloadData()
c.computeResults()
In some cases only connectToServer() and downloadData() will be called (or even just connectToServer() alone).
The question is: how should those methods behave when they are called in wrong order (or, in other words, when the internal state is not yet ready for their task)?
I see two solutions:
They should throw an exception
They should call correct previous method internally
Currently I'm using second approach, as it allows me to write less code (I can just write c.computeResults() and know that two other methods will be called if necessary). Plus, when I call them multiple times, I don't have to keep track of what was already called and so I avoid multiple reconnecting or downloading.
On the other hand, first approach seems more predictable from the caller perspective, and possibly less error prone.
And of course, there is a possibility for a hybrid solution: throw an exception, and add another layer of methods with internal state detection and proper calling of previous ones. But that seems to be a bit of an overkill.
Your suggestions?
They should throw an exception. As said in the Zen of Python: Explicit is better than implicit. And, for that matter, Errors should never pass silently. Unless explicitly silenced. If the methods are called out of order that's a programmer's mistake, and you shouldn't try to fix that by guessing what they mean. You might accidentally cover up an oversight in a way that looks like it works, but is not actually an accurate reflection of the programmer's intent. (That programmer may be future you.)
If these methods are usually called immediately one after another, you could consider collating them by adding a new method that simply calls them all in a row. That way you can use that method and not have to worry about getting it wrong.
Note that classes that handle internal state in this way are sometimes called for but are often not, in fact, necessary. Depending on your use case and the needs of the rest of your application, you may be better off doing this with functions and actually passing connection objects, etc. from one method to another, rather than using a class to store internal state. See for instance Stop Writing Classes. This is just something to consider and not an imperative; plenty of reasonable people disagree with the theory behind Stop Writing Classes.
You should use exceptions. It is good programming practice to raise exceptions to make your code easier to understand, for the following reasons:
What you describe fits the literal description of an "exception": it is an exception to normal proceedings.
If you build in some kind of workaround, you will likely end up with "spaghetti code" = BAD.
When you, or someone else, goes back and reads this code later, it will be difficult to understand if you do not provide the hint that executing these methods out of order is an exception.
Here's a good source:
http://jeffknupp.com/blog/2013/02/06/write-cleaner-python-use-exceptions/
As my CS professor always said "Good programmers can write code that computers can read, but great programmers write code that humans and computers can read".
I hope this helps.
If it's possible, you should make the dependencies explicit.
For your example:
c = MyMysteryClass()
connection = c.connectToServer()
data = c.downloadData(connection)
results = c.computeResults(data)
This way, even if you don't know how the library works, there's only one order the methods could be called in.
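A minimal sketch of the throw-an-exception approach (class and method names follow the question, state flags and payload are invented for illustration):

```python
class MysteryClass:
    """Each method checks that its prerequisite state exists and raises
    RuntimeError otherwise, making out-of-order calls fail loudly."""

    def __init__(self):
        self._connected = False
        self._data = None

    def connect_to_server(self):
        self._connected = True  # real code would open a connection here

    def download_data(self):
        if not self._connected:
            raise RuntimeError("call connect_to_server() before download_data()")
        self._data = [1, 2, 3]  # placeholder payload

    def compute_results(self):
        if self._data is None:
            raise RuntimeError("call download_data() before compute_results()")
        return sum(self._data)
```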

python isinstance vs hasattr vs try/except: What is better?

I am trying to figure out the tradeoffs between different approaches of determining whether or not you can perform action do_stuff() on object obj. As I understand, there are three ways of determining if this is possible:
# Way 1
if isinstance(obj, Foo):
    obj.do_stuff()

# Way 2
if hasattr(obj, 'do_stuff'):
    obj.do_stuff()

# Way 3
try:
    obj.do_stuff()
except:
    print('Do something else')
Which is the preferred method (and why)?
I believe that the last method is generally preferred by Python coders because of a motto taught in the Python community: "Easier to ask for forgiveness than permission" (EAFP).
In a nutshell, the motto means to avoid checking if you can do something before you do it. Instead, just run the operation. If it fails, handle it appropriately.
Also, the third method has the added advantage of making it clear that the operation should work.
With that said, you really should avoid using a bare except like that. Doing so will capture any/all exceptions, even the unrelated ones. Instead, it is best to capture exceptions specifically.
Here, you will want to capture for an AttributeError:
try:
    obj.do_stuff()  # Try to invoke do_stuff
except AttributeError:
    print('Do something else')  # If unsuccessful, do something else
Checking with isinstance runs counter to the Python convention of using duck typing.
hasattr works fine, but is Look Before you Leap instead of the more Pythonic EAFP.
Your implementation of way 3 is dangerous, since it catches any and all errors, including those raised by the do_stuff method. You could go with the more precise:
try:
    _ds = obj.do_stuff
except AttributeError:
    print('Do something else')
else:
    _ds()
But in this case, I'd prefer way 2 despite the slight overhead - it's just way more readable.
The correct answer is 'neither'
hasattr delivers functionality; however, it is possibly the worst of all options.
We use the object-oriented nature of Python because it works. OO analysis is never accurate and often confusing, but we use class hierarchies because we know they help people do better work faster. People grasp objects, and a good object model helps coders change things more quickly and with fewer errors. The right code ends up clustered in the right places. The objects:
Can just be used without considering which implementation is present
Make it clear what needs to be changed and where
Isolate changes to some functionality from changes to some other functionality – you can fix X without fearing you will break Y
hasattr vs isinstance
Having to use isinstance or hasattr at all indicates the object model is broken or we are using it incorrectly. The right thing to do is to fix the object model or change how we are using it.
These two constructs have the same effect, and in the imperative "I need the code to do this" sense they are equivalent. Structurally there is a huge difference. On meeting this method for the first time (or after some months of doing other things), isinstance conveys a wealth more information about what is actually going on and what else is possible. hasattr does not "tell" you anything.
A long history of development led us away from FORTRAN and code with loads of "who am I" switches. We choose to use objects because we know they help make the code easier to work with. By choosing hasattr we deliver functionality, but nothing is fixed; the code is more broken than it was before we started. When adding or changing this functionality in the future, we will have to deal with code that is unevenly grouped and has at least two organising principles: some of it is where it "should be" and the rest is randomly scattered in other places. There is nothing to make it cohere. This is not one bug but a minefield of potential mistakes scattered over any execution path that passes through your hasattr.
So if there is any choice, the order is:
1. Use the object model, or fix it, or at least work out what is wrong with it and how to fix it
2. Use isinstance
3. Don't use hasattr
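A minimal sketch of what "use the object model" can look like (class names are invented for illustration): give every variant a common base class, so callers simply call the method with no hasattr or isinstance probes at all.

```python
class Shape:
    # Every subclass is expected to provide do_stuff, so callers never
    # need to probe for the method's existence.
    def do_stuff(self):
        raise NotImplementedError

class Circle(Shape):
    def do_stuff(self):
        return "circle stuff"

class Square(Shape):
    def do_stuff(self):
        return "square stuff"

# Polymorphic dispatch: no type checks anywhere.
for obj in (Circle(), Square()):
    print(obj.do_stuff())
```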

Is it bad nesting try/catch statements?

My case right now:
try:
    try:
        condition
    catch:
        try:
            condition
        catch:
            ...
catch:
    major failure
Is it bad to have the code like that? Does it clutter too much, or what are the implications of something like that?
No, that's somewhat common (except the keyword is except rather than catch). It depends on what you need to do and the design.
What IS bad, that I see too much of, is catching top-level Exception class, rather than something more specific (e.g. KeyError). Or raising the same.
I wouldn't just issue a verdict and claim "it's bad", because sometimes you may need it. Python sometimes deliberately throws exceptions instead of letting you ask (does this ...?) [the EAFP motto], and in some cases nesting of try/catch is useful, when it makes sense with the logical flow of the code.
But my guess is that most times you don't. So a better question in your case would be to present a specific use case where you think you need such code.
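In valid Python syntax, the asker's shape might look like the following sketch (the function and its fallback strategy are invented for illustration): the outer handler catches the "major failure", while the inner one handles the specific, recoverable failure of the first strategy.

```python
def parse(value, fallback_table):
    try:
        try:
            # First strategy: treat the value as a plain integer.
            return int(value)
        except ValueError:
            # Recoverable failure: fall back to a table lookup. A KeyError
            # raised here propagates to the outer handler.
            return fallback_table[value]
    except KeyError:
        # "major failure": neither strategy could resolve the value.
        raise RuntimeError(f"cannot resolve {value!r}")

print(parse("7", {}))
print(parse("x", {"x": 99}))
```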

Check if something is a list

What is the easiest way to check if something is a list?
A method doSomething has the parameters a and b. In the method, it will loop through the list a and do something. I'd like a way to make sure a is a list, before looping through - thus avoiding an error or the unfortunate circumstance of passing in a string then getting back a letter from each loop.
This question must have been asked before - however my googles failed me. Cheers.
To enable more usecases, but still treat strings as scalars, don't check for a being a list, check that it isn't a string:
if not isinstance(a, str):
    ...
Typechecking hurts the generality, simplicity, and maintainability of your code. It is seldom used in good, idiomatic Python programs.
There are two main reasons people want to typecheck:
To issue errors if the caller provides the wrong type.
This is not worth your time. If the user provides an incompatible type for the operation you are performing, an error will already be raised when the incompatible operation is attempted. It is worrisome that this might not happen immediately, but it typically doesn't take long at all, and this approach results in code that is more robust, simple, efficient, and easier to write.
Oftentimes people insist on this with the hope of catching all the dumb things a user can do. If a user is willing to do arbitrarily dumb things, there is nothing you can do to stop them. Typechecking mainly has the potential of locking out a user who comes in with their own types that are drop-in replacements for the ones expected, or who recognizes that your function should actually be polymorphic and provides something different that can accept the same operation.
If I had a big system where lots of things made by lots of people had to fit together right, I would use a system like zope.interface to test that everything fits together correctly.
To do different things based on the types of the arguments received.
This makes your code worse because your API is inconsistent. A function or method should do one thing, not fundamentally different things. This ends up being a feature not usually worth supporting.
One common scenario is to have an argument that can either be a foo or a list of foos. A cleaner solution is simply to accept a list of foos. Your code is simpler and more consistent. If it's an important, common use case only to have one foo, you can consider having another convenience method/function that calls the one that accepts a list of foos and lose nothing. Providing the first API would not only have been more complicated and less consistent, but it would break when the types were not the exact values expected; in Python we distinguish between objects based on their capabilities, not their actual types. It's almost always better to accept an arbitrary iterable or a sequence instead of a list and anything that works like a foo instead of requiring a foo in particular.
As you can tell, I do not think either reason is compelling enough to typecheck under normal circumstances.
I'd like a way to make sure a is a list, before looping through
Document the function.
Usually it's considered bad style to perform type-checks in Python, but try:
if isinstance(a, list):
    ...
(I think you may also check whether a.__iter__ exists.)
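Putting the "accept any iterable, but treat strings as scalars" advice together, a sketch (the function and its behavior are invented for illustration) might look like:

```python
def do_something(a, b):
    # Treat a lone string as a scalar rather than iterating per character,
    # but accept any other iterable (list, tuple, generator, ...).
    if isinstance(a, str):
        a = [a]
    return [item + b for item in a]

print(do_something(["x", "y"], "!"))
print(do_something("x", "!"))  # one item, not one per character
```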
