How to do a ?. None check - python

I know this is in the pipeline for Java 8 or 9 but I think there must be a way to do this in python. Say for example I am writing a complex expression and cannot be bothered to add null checks at all levels (example below)
post_code = department.parent_department.get('sibling').employees.get('John').address.post_code
I don't want to worry about several intermediate values being None. For example, if the parent_department does not have a sibling key, I want to short-circuit and have None assigned to post_code. Something like
post_code = department?.parent_department?.get('sibling')?.employees?.get('John')?.address?.post_code
Can this be done in Python 2.7.1? I know this means more trouble while debugging but assume I have done all pre-checks and if any value is null it means an internal error so it is enough if I just get the error trace that the particular line failed.
Here is a more verbose way. I just need a one-liner that does not throw random exceptions
def get_post_code(department):
    if department is None:
        return None
    if department.parent_department is None:
        return None
    if department.parent_department.get('sibling') is None:
        return None
    # ... more checks...
    return department.parent_department.get('sibling').employees.get('John').address.post_code

If you want post_code to be None then catch the exceptions raised by trying to access non-existing items:
try:
    post_code = department.parent_department.get('sibling').employees.get('John').address.post_code
except (AttributeError, KeyError):
    post_code = None

Actually one valid answer for this is to start thinking in terms of Monads (may-be monads) to chain these functions. A very primitive tutorial is at https://github.com/dustingetz/dustingetz.github.com/blob/master/_posts/2012-04-07-dustins-awesome-monad-tutorial-for-humans-in-python.md
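A minimal sketch of that idea, assuming a small hypothetical wrapper class named Maybe (it is not a standard-library type and not taken from the linked tutorial). Each attribute access or call is forwarded to the wrapped value, and once the value is None the chain keeps producing an empty Maybe instead of raising. It is written for Python 3; the SimpleNamespace objects only stand in for the department structure from the question.

```python
from types import SimpleNamespace


class Maybe:
    """Hypothetical None-propagating wrapper (an illustration, not a library API)."""

    def __init__(self, value=None):
        self._value = value

    def __getattr__(self, name):
        # Only reached for names not found on Maybe itself (so not _value or unwrap).
        if self._value is None:
            return Maybe(None)
        return Maybe(getattr(self._value, name, None))

    def __call__(self, *args, **kwargs):
        # Calling an empty Maybe stays empty instead of raising TypeError.
        if self._value is None:
            return Maybe(None)
        return Maybe(self._value(*args, **kwargs))

    def unwrap(self, default=None):
        return default if self._value is None else self._value


# Stand-in data shaped like the expression in the question.
address = SimpleNamespace(post_code="AB1 2CD")
john = SimpleNamespace(address=address)
sibling = SimpleNamespace(employees={"John": john})
department = SimpleNamespace(parent_department={"sibling": sibling})

post_code = (
    Maybe(department).parent_department.get("sibling")
    .employees.get("John").address.post_code.unwrap()
)
missing = (
    Maybe(department).parent_department.get("cousin")
    .employees.get("John").address.post_code.unwrap()
)
```

Here post_code holds the real value, while missing is None because the 'cousin' lookup came back empty. The trade-off is the one the question already mentions: a None anywhere in the chain is silently absorbed, so you no longer learn which step failed.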

Related

How to do elegant error handling in Python

I'm scraping LinkedIn using Selenium. This is a very brittle task and exceptions are raised often. I want to find an elegant way to handle errors. The internet has the usual try/except, but it's clunky... See the code below:
try:
    WebDriverWait(driver, 20).until(EC.element_to_be_clickable(job))
    job_title = job.find_element(By.CLASS_NAME, "base-search-card__title").text
    company = job.find_element(By.CLASS_NAME, "base-search-card__subtitle").text
    location = job.find_element(By.CLASS_NAME, "job-search-card__location").text
except :
    print("Boom Boom")
If any of the find_element calls throws, the except part runs and the rest of the code in the try block is skipped. I'd like a scenario where a single failure doesn't abort the other lookups, i.e. if one fails I get an empty string back for it. I can wrap everything in a function and do something like this:
def extract_job_title(job):
    try:
        return job.find_element(By.CLASS_NAME, "base-search-card__title").text
    except:
        return ""
and have:
job_title = extract_job_title(job)
but that is also clunky... I want something like I would have in Swift. Something like this:
let job_title = try? job.find_element(By.CLASS_NAME, "base-search-card__title").text ?? ""
Does something similar to Swift exist and if not can anyone else see a way of making things "nicer" other than using functions?
Generally, if you have a lot of very repetitive code with only small differences, you can probably extract that into a function or loop. E.g.:
searches = {'title': 'base-search-card__title', ...}
data = {}
for item, cls in searches.items():
    try:
        data[item] = job.find_element(By.CLASS_NAME, cls).text
    except:
        pass
This would also extract data into one dict, which seems more logical than a bunch of separate variables.
Alternatively, extracted into a function:
def get(job, cls):
    try:
        return job.find_element(By.CLASS_NAME, cls).text
    except:
        return None
job_title = get(job, 'base-search-card__title')
Note that you should intercept a specific exception in except, not any and all exceptions.
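As a rough sketch of that advice, here is a generic helper that behaves a little like Swift's `try? ... ?? default`. The name attempt and its parameters are made up for this illustration. It only swallows the exception types you list, so anything unexpected still surfaces. In the Selenium case the type to list would most likely be selenium.common.exceptions.NoSuchElementException; the example below uses a plain dict and KeyError so it runs without a browser.

```python
def attempt(func, default=None, exceptions=(Exception,)):
    """Call func(); return default if it raises one of the listed exceptions."""
    try:
        return func()
    except exceptions:
        return default


# Stand-in for a scraped job card: the 'company' field is missing.
record = {"title": "Engineer"}

title = attempt(lambda: record["title"], default="", exceptions=(KeyError,))
company = attempt(lambda: record["company"], default="", exceptions=(KeyError,))
```

The lambda delays the lookup so that it happens inside the helper's try block. An exception that is not listed, such as a ZeroDivisionError here, propagates as usual.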
Alternatively, turn the entire thing on its head and only evaluate the elements that actually exist, and sort them into the right bins. Something along the lines of:
classes = {'base-search-card__title': 'title', ...}
data = {}
for elem in job.some_broader_find_query():
    data[classes[elem.class_name]] = elem.text

How to handle a specific condition and test for it

I have the following code.
MAXIMUM_DAYS = 7
def foo(start_date = None, end_date = None):
    if abs((parse(start_date) - parse(end_date)).days) > MAXIMUM_DAYS:
        return ?
How do I write the function in such a way that it exits/errors and I can test for that behavior?
I see the AssertRaises method in unittest. Is it as simple as creating a custom exception and raising it in the function, then testing for it in the unit test?
If so, are there standards about how and where to write the exception?
Is there a good example of this somewhere in the documentation or online?
Thanks,
Shawn
You are on the right track: raising an exception is a common technique for what you have in mind. assert raises an exception too, but it is not typically used in production code because assert statements can be disabled on the command line (with -O). Instead, for your case raising a ValueError would be a good choice:
MAXIMUM_DAYS = 7
def foo(start_date = None, end_date = None):
    if abs((parse(start_date) - parse(end_date)).days) > MAXIMUM_DAYS:
        raise ValueError(f'Date range exceeds max: {MAXIMUM_DAYS}')
Caveat... my assumption is that you are asking how to raise an exception if the value of your calculation is greater than MAXIMUM_DAYS. Based upon that assumption here is my answer.
Start by defining your exception:
class GreaterThanMaximumDays(Exception):
    pass
Then simply in your code
MAXIMUM_DAYS = 7
def foo(start_date = None, end_date = None):
    start_date_value = parse(start_date)
    end_date_value = parse(end_date)
    # Compare whole days; a timedelta cannot be compared with an int directly.
    date_difference = abs((start_date_value - end_date_value).days)
    if date_difference > MAXIMUM_DAYS:
        raise GreaterThanMaximumDays()
    return date_difference
Your guess that it is as simple as raising a custom exception and testing for it with assertRaises is correct.
Personally, I like defining my own exceptions so this way I can easily test for its existence in my exception handling. I find it makes my code easier to read and understand exactly what I expect to be valid.
I pulled out the parsed values into their own variables. This simply makes debugging easier if you need to inspect the returned values for any reason at all.
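To show the testing side, here is a minimal sketch of a unittest case using assertRaises. The question does not say where parse comes from, so date.fromisoformat from the standard library stands in for it here; that substitution, the test class name, and the sample dates are all assumptions.

```python
import unittest
from datetime import date

MAXIMUM_DAYS = 7


class GreaterThanMaximumDays(Exception):
    pass


def foo(start_date, end_date):
    # date.fromisoformat stands in for the question's parse() (an assumption).
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    difference = abs((start - end).days)
    if difference > MAXIMUM_DAYS:
        raise GreaterThanMaximumDays(f"Date range exceeds max: {MAXIMUM_DAYS}")
    return difference


class FooTest(unittest.TestCase):
    def test_range_within_limit(self):
        self.assertEqual(foo("2024-01-01", "2024-01-05"), 4)

    def test_range_too_long(self):
        # The block passes only if the expected exception is raised inside it.
        with self.assertRaises(GreaterThanMaximumDays):
            foo("2024-01-01", "2024-01-31")
```

Saved in a test module, this can be run with python -m unittest. The same pattern works with ValueError in place of the custom exception.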

In case of an exception during a loop: How to return the intermediate result before passing on the exception?

def store(self) -> list:
    result = []
    for url in self.urls():
        if url.should_store():
            stored_url = self.func_that_can_throw_errors(url)
            if stored_url: result.append(stored_url)
    return result
Preface: not actual method names. Silly names chosen to emphasize
During the loop errors may occur. In that case, I desire the intermediate result to be returned by store() and still raise the original exception for handling in a different place.
Doing something like
try:
    <accumulating results ... might break>
except Exception:
    return result
    raise
sadly doesn't do the trick, since trivially the raise statement won't be reached (and thus an empty list gets returned).
Do you guys have recommendations on how not to lose the intermediate result?
Thanks a lot in advance - Cheers!
It is not possible as you imagine it. You can't raise an exception and return a value.
So I think what you are asking for is a workaround. I see two possibilities:
Return a flag or the exception along with the actual return value.
Return a flag:
except Exception:
    return result, False
where False is the flag telling the caller that something went wrong.
Return the exception:
except Exception as e:
    return result, e
Since it appears that store is a method of some class, you could raise the exception and retrieve the intermediary result with a second call like so:
def store(self):
    result = []
    try:
        pass  # something
    except Exception:
        self.intermediary_result = result
        raise
def retrieve_intermediary(self):
    return self.intermediary_result
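A runnable sketch of the second possibility, under stated assumptions: the class UrlStore, its _store_one method and the sample URLs are hypothetical placeholders, just like the names in the question. The partial list is saved on the instance before the original exception is re-raised.

```python
class UrlStore:
    """Hypothetical class; the names are placeholders, as in the question."""

    def __init__(self, urls):
        self._urls = urls
        self.intermediary_result = []

    def _store_one(self, url):
        # Stand-in for the call that can fail.
        if url.startswith("bad"):
            raise ValueError(f"cannot store {url}")
        return url.upper()

    def store(self):
        result = []
        try:
            for url in self._urls:
                result.append(self._store_one(url))
        except Exception:
            # Keep what was collected so far, then let the error propagate.
            self.intermediary_result = result
            raise
        return result


url_store = UrlStore(["a", "b", "bad-url", "c"])
try:
    url_store.store()
except ValueError:
    partial = url_store.intermediary_result
```

The caller still sees the original exception with its traceback, and partial contains the items stored before the failure. One thing to watch with this design is that intermediary_result is only meaningful right after a failed call.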
The best answer I can come up with given my limited knowledge of Python would be to always return a pair, where the first part of the pair is the result and the second part of the pair is an optional exception value.
def store(self) -> tuple:
    '''
    TODO: Insert documentation here.
    If an error occurs during the operation, a partial list of results along with
    the exception value will be returned.
    :return: A tuple of (list of results, exception). The exception part may be None.
    '''
    result = []
    for url in self.urls():
        if url.should_store():
            try:
                stored_url = self.func_that_can_throw_errors(url)
            except Exception as e:
                return result, e
            if stored_url: result.append(stored_url)
    return result, None
That said, as you have mentioned, if you have this call multiple places in your code, you would have to be careful to change it in all relevant places as well as possibly change the handling. Type checking might be helpful there, though I only have very limited knowledge of Python's type hints.
Meanwhile I had the idea to just use an accumulator which appears to be the 'quickest' fix for now with the least amount of changes in the project where store() is called.
The (intermediate) result is not needed everywhere (let's say it's optional). So...
I'd like to share that with you:
def store(self, result_accu=None) -> list:
    if result_accu is None:
        result_accu = []
    for url in self.urls():
        if url.should_store():
            stored_url = self.func(url)
            if stored_url: result_accu.append(stored_url)
    return result_accu
Still returning a list but alongside the intermediate result is accessible by reference on the accu list.
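A small sketch of how a caller might use the accumulator when an error occurs. To keep it self-contained it uses a free function store_all and a stand-in process function instead of the real method; both names and the sample inputs are assumptions made for the example.

```python
def store_all(urls, process, result_accu=None):
    """Hypothetical free-function version of store(); process may raise."""
    if result_accu is None:
        result_accu = []
    for url in urls:
        stored = process(url)
        if stored:
            result_accu.append(stored)
    return result_accu


def process(url):
    # Stand-in for the real storing call; fails on one input for the demo.
    if url == "broken":
        raise RuntimeError(f"cannot store {url}")
    return url.upper()


collected = []
error = None
try:
    store_all(["a", "b", "broken", "c"], process, collected)
except RuntimeError as exc:
    error = exc
```

Because the caller owns the collected list, it still holds the items appended before the failure, while the exception reaches the except block unchanged. Callers that do not care about partial results can simply omit the third argument.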
Making the parameter optional lets most call sites in the project stay as they are, since the result is not needed everywhere.
store() is more of a command in which most of the work on data integrity is already done. The result is nice-to-have for now.
But you guys also made me notice that there's work to do in order to process the intermediate result anyway. Thanks! #attalos #MelvinWM

Python OO: how to stop procedure flow in an entire class with `return` statement?

When I run this code:
from nltk import NaiveBayesClassifier,classify
import USSSALoader
import random
class genderPredictor():
    def getFeatures(self):
        if self._loadNames() != None:
            maleNames,femaleNames=self._loadNames()
        else:
            print "There is no training file."
            return
        featureset = list()
        for nameTuple in maleNames:
            features = self._nameFeatures(nameTuple[0])
            featureset.append((features,'M'))
        for nameTuple in femaleNames:
            features = self._nameFeatures(nameTuple[0])
            featureset.append((features,'F'))
        return featureset
    def trainAndTest(self,trainingPercent=0.80):
        featureset = self.getFeatures()
        random.shuffle(featureset)
        name_count = len(featureset)
        cut_point=int(name_count*trainingPercent)
        train_set = featureset[:cut_point]
        test_set = featureset[cut_point:]
        self.train(train_set)
        return self.test(test_set)
    def classify(self,name):
        feats=self._nameFeatures(name)
        return self.classifier.classify(feats)
    def train(self,train_set):
        self.classifier = NaiveBayesClassifier.train(train_set)
        return self.classifier
    def test(self,test_set):
        return classify.accuracy(self.classifier,test_set)
    def getMostInformativeFeatures(self,n=5):
        return self.classifier.most_informative_features(n)
    def _loadNames(self):
        return USSSALoader.getNameList()
    def _nameFeatures(self,name):
        name=name.upper()
        return {
            'last_letter': name[-1],
            'last_two' : name[-2:],
            'last_is_vowel' : (name[-1] in 'AEIOUY')
        }
if __name__ == "__main__":
    gp = genderPredictor()
    accuracy=gp.trainAndTest()
and self._loadNames() returns None, I get this error (from the imported random module):
shuffle C:\Python27\lib\random.py 285
TypeError: object of type 'NoneType' has no len()
This happened because, although I put a return statement in getFeatures(self), the flow continues into the next class method (trainAndTest(self,trainingPercent=0.80)), which calls the random module (random.shuffle(featureset)).
So, I'd like to know: how do I stop the procedure flow not only in the getFeatures(self) method, but in the entire class that contains it?
By the way, thanks Stephen Holiday for sharing the code.
"This happened because, although I put a return statement in getFeatures(self), the flow continues into the next class method (trainAndTest(self,trainingPercent=0.80)), which calls the random module (random.shuffle(featureset))."
An important thing to remember is that None is a perfectly valid value. The return statement in your getFeatures() is doing exactly what it is told and returning the valid value. Only an exceptional situation, or you explicitly, will stop that flow.
Instead of asking how you can "return from the class", what you might want to look into is checking the return values of the functions you call and making sure they are what you expect before you proceed. There are two places you could do this:
def trainAndTest(self,trainingPercent=0.80):
    featureset = self.getFeatures()
    ...
def _loadNames(self):
    return USSSALoader.getNameList()
In the first spot, you could check if featureset is None, and react if it is None.
In the second spot, instead of blindly returning, you could check it first and react there.
Secondly, you have the option of raising exceptions. An exception signals that the code has encountered an error and can't continue. It is then the responsibility of the calling function to either handle it or let it ride up the chain. If nothing handles the exception, your application will crash. As you can see, you are getting an exception raised from the random module because you are allowing a None to make its way into the shuffle call.
names = USSSALoader.getNameList()
if names is None:
    # raise an exception?
    # do something else?
    # ask the user to do something?
    pass  # placeholder until one of the options above is chosen
The question at that point is, what do you want your program to do at the moment it gets a None instead of a valid list? Do you want an exception similar to the one being raised by random, but more helpful and specific to your application? Or maybe you just want to call some other method that gets a default list. Is not having the names list even a situation where your application can do anything other than exit? That would be an unrecoverable situation.
names = USSSALoader.getNameList()
if names is None:
    raise ValueError("USSSALoader didn't return any "
                     "valid names! Can't continue!")
Update
From your comment, I wanted to add the specific handling you wanted. Python has a handful of built in exception types to represent various circumstances. The one you would most likely want to raise is an IOError, indicating that the file could not be found. I assume "file" means whatever file USSSALoader.getNameList() needs to use and can't find.
names = USSSALoader.getNameList()
if names is None:
    raise IOError("No USSSALoader file found")
At this point, unless some function higher up the calling chain handles it, your program will terminate with a traceback error.
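To make the propagation concrete, here is a simplified Python 3 sketch. The functions mirror the methods in the question but are stand-ins: loader is a hypothetical callable playing the role of USSSALoader.getNameList, and the feature extraction is reduced to tagging names. The exception raised in the innermost function skips the remaining code in every caller until something catches it.

```python
def load_names(loader):
    names = loader()
    if names is None:
        raise IOError("No training file found")
    return names


def get_features(loader):
    male_names, female_names = load_names(loader)
    return [(name, "M") for name in male_names] + [(name, "F") for name in female_names]


def train_and_test(loader):
    featureset = get_features(loader)
    # Never reached when load_names() raised, so no None sneaks in here.
    return len(featureset)


def main(loader):
    try:
        return train_and_test(loader)
    except IOError as exc:
        return f"aborted: {exc}"


failed = main(lambda: None)
succeeded = main(lambda: (["Al"], ["Bea"]))
```

With the missing file, main returns the "aborted" message and none of the intermediate functions needs its own None check. With valid data the normal result comes back.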
There is nothing like "return from the entire class". You need to organize your code so that return values are valid in the functions that get them. Those functions can test the value to determine what to do next. The class boundaries have no effect on program flow, just the namespacing of methods.
Generally what you would do here is check for validity after you call the function, e.g.:
featureset = self.getFeatures()
if not featureset:
    # You could log an error message if you expected to get something, etc.
    return

How to convert Nonetype to int or string?

I've got a value x that is usually a number but could be None (a NoneType value). I want to divide it by a number, but Python raises:
TypeError: int() argument must be a string or a number, not 'NoneType'
How can I solve this?
int(value or 0)
This will use 0 whenever you provide a value that Python considers false, such as None, 0, [], "", etc. Since 0 is itself false, you should only use 0 as the alternative value (otherwise you will find your 0s turning into that value).
int(0 if value is None else value)
This replaces only None with 0. Since we are testing for None specifically, you can use some other value as the replacement.
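The difference between the two forms shows up as soon as the replacement is not 0. A short sketch, with helper names invented for the comparison:

```python
def to_int_falsy(value, default=0):
    # Replaces every falsy value (None, 0, "", [], ...) with the default.
    return int(value or default)


def to_int_none_only(value, default=0):
    # Replaces only None; other values go to int() unchanged.
    return int(default if value is None else value)


from_none = to_int_falsy(None)                # None becomes 0
from_text = to_int_falsy("7")                 # "7" becomes 7
surprise = to_int_falsy(0, default=5)         # a real 0 is replaced by 5
kept_zero = to_int_none_only(0, default=5)    # a real 0 stays 0
replaced = to_int_none_only(None, default=5)  # only None is replaced
```

Note that to_int_none_only("") still raises ValueError, because an empty string is passed straight to int().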
In one of the comments, you say:
Somehow I got an Nonetype value, it supposed to be an int, but it's now a Nonetype object
If it's your code, figure out how you're getting None when you expect a number and stop that from happening.
If it's someone else's code, find out the conditions under which it gives None and determine a sensible value to use for that, with the usual conditional code:
result = could_return_none(x)
if result is None:
    result = DEFAULT_VALUE
...or even...
if x == THING_THAT_RESULTS_IN_NONE:
    result = DEFAULT_VALUE
else:
    result = could_return_none(x) # But it won't return None, because we've restricted the domain.
There's no reason to automatically use 0 here — solutions that depend on the "false"-ness of None assume you will want this. The DEFAULT_VALUE (if it even exists) completely depends on your code's purpose.
A common "Pythonic" way to handle this kind of situation is known as EAFP for "It's easier to ask forgiveness than permission". Which usually means writing code that assumes everything is fine, but then wrapping it with a try...except block to handle things—just in case—it's not.
Here's that coding style applied to your problem:
try:
    my_value = int(my_value)
except TypeError:
    my_value = 0 # or whatever you want to do
answer = my_value / divisor
Or perhaps the even simpler and slightly faster:
try:
    answer = int(my_value) / divisor
except TypeError:
    answer = 0
The inverse and more traditional approach, known as LBYL ("Look before you leap"), is what #Soviut and some of the others have suggested. For additional coverage of this topic, see my answer and associated comments to the question "Determine whether a key is present in a dictionary" elsewhere on this site.
One potential problem with EAFP is that it can hide the fact that something is wrong with some other part of your code or third-party module you're using, especially when the exceptions frequently occur (and therefore aren't really "exceptional" cases at all).
You can use the "or" operator too (it works this way in both Python 2 and Python 3):
foo = bar or 0
foo2 = bar or ""
That TypeError only appears when you try to pass int() None (which is the only NoneType value, as far as I know). I would say that your real goal should not be to convert NoneType to int or str, but to figure out where/why you're getting None instead of a number as expected, and either fix it or handle the None properly.
I've successfully used int(x or 0) for this type of error, so long as None should equate to 0 in the logic.
Note that this will also resolve to 0 in other cases where testing x returns False. e.g. empty list, set, dictionary or zero length string.
Sorry, Kindall already gave this answer.
This can happen if you forget to return a value from a function: it then returns None. Look at all places where you are assigning to that variable, and see if one of them is a function call where the function lacks a return statement.
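A tiny illustration of that failure mode; both function names are made up for the example:

```python
def average_missing_return(values):
    total = sum(values)
    total / len(values)  # computed, but never returned


def average(values):
    return sum(values) / len(values)


broken = average_missing_return([1, 2, 3])
fixed = average([1, 2, 3])
```

Here broken is None, because a function that ends without a return statement returns None implicitly, while fixed is 2.0.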
You should check to make sure the value is not None before trying to perform any calculations on it:
my_value = None
if my_value is not None:
    print int(my_value) / 2
Note: my_value was intentionally set to None to prove the code works and that the check is being performed.
I was having the same problem using the python email functions.
Below is the code I used to retrieve the email subject into a variable. This works fine for most emails and the variable is populated. If you receive an email from Yahoo or the like and the sender did not fill out the subject line, Yahoo does not create a subject header in the email and the lookup returns None. Martineau provided a correct answer, as did Soviut. IMO Soviut's answer is more concise from a programming standpoint, though not necessarily from a Python one.
Here is some code to show the technique:
import sys, email, email.Utils
afile = open(sys.argv[1], 'r')
m = email.message_from_file(afile)
subject = m["subject"]
# Soviut's Concise test for unset variable.
if subject is None:
    subject = "[NO SUBJECT]"
# Alternative way to test for No Subject created in email (Thanks for NoneThing Yahoo!)
try:
    if len(subject) == 0:
        subject = "[NO SUBJECT]"
except TypeError:
    subject = "[NO SUBJECT]"
print subject
afile.close()
In some situations it is helpful to have a function to convert None to int zero:
def nz(value):
    '''
    Convert None to int zero else return value.
    '''
    if value is None:
        return 0
    return value
