Python: Clean Code and Performance on For Loops

I am having trouble applying clean code principles to this code section.
Clean code principles say that a function should do one thing and do it well.
But in this case I do not know how to refactor this function according to those principles without decreasing performance.
for netobject in netObjectList:
    for key, value in netobject.getObjectParams().items():
        if value.getValue() == '' and key != 'comment':
            Errorhandler().error_on_pos(value.getRow(), value.getCol(),
                                        str(key) + ' is missing')
        else:
            # do syntax check on values in the dictionary
            NValidator.checkSyntaxOfValue(key, value)
A netobject contains a dictionary with all its parameters.
So I know that this function does more than one thing, and I think it is quite hard to unit test.
But if I check syntax and missing parameters separately, I have to iterate twice over all network objects bundled in netObjectList. On the other hand, checking syntax and missing parameters in one function is not clean code.
I very often have these inner conflicts when I am writing code.
Do you have some tips or suggestions to fix these problems?
If you need more description for that code section, please let me know.
Errorhandler is just a way to print out my errors if a missing parameter is found.
The function checkSyntaxOfValue(key, value) calls suitable check methods depending on the key of the bundled parameters. I think there may also be a better way to handle the syntax checks, but I do not know one.

As far as I'm concerned, your code IS "doing one thing": validating your netobjects' values. The fact that the validation is done in two steps (first checking if there's a value where expected, then checking the value's 'syntax') is an implementation detail.
Now what can be questioned is whether checking for empty values is distinct from checking the value's syntax, i.e. if an empty value is not a valid syntax for a given key, shouldn't the check be a responsibility of NValidator.checkSyntaxOfValue?
Also, if for whatever reason you really do have to keep the tests (empty value / value syntax) distinct, you should probably encapsulate the emptiness test in a distinct function so you have the same level of abstraction for both tests, i.e.:
def checkForNotEmptyValue(key, value):
    if key == 'comment':
        # we don't care about empty values here
        return True
    if value.getValue() == '':
        Errorhandler().error_on_pos(
            value.getRow(),
            value.getCol(),
            '{} is missing'.format(key)
        )
        return False
    return True
and then
for netobject in netObjectList:
    for key, value in netobject.getObjectParams().items():
        if checkForNotEmptyValue(key, value):
            NValidator.checkSyntaxOfValue(key, value)
NB: this is just an example based on your code snippet and question - I don't have enough context to tell whether it's the "best" solution here.
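To address the unit-testing concern from the question while keeping a single pass, one option is to have each check return an error message (or None) instead of printing, and let the loop decide what to do with the results. The sketch below is only an illustration: Param, check_not_empty, check_syntax and validate are hypothetical stand-ins, not the asker's real classes, and the 'port' syntax rule is invented for the demo.

```python
from collections import namedtuple

# Hypothetical stand-in for the asker's value objects (value plus source position).
Param = namedtuple("Param", "value row col")


def check_not_empty(key, param):
    """Return an error message if a required value is empty, else None."""
    if key != "comment" and param.value == "":
        return "{} is missing".format(key)
    return None


def check_syntax(key, param):
    """Invented rule for the demo: a 'port' value must consist of digits."""
    if key == "port" and not param.value.isdigit():
        return "{} has invalid syntax".format(key)
    return None


def validate(net_objects):
    """Single pass over all objects; returns a list of (row, col, message)."""
    errors = []
    for params in net_objects:
        for key, param in params.items():
            # The syntax check only runs when the emptiness check passes.
            message = check_not_empty(key, param) or check_syntax(key, param)
            if message:
                errors.append((param.row, param.col, message))
    return errors


objects = [
    {"name": Param("", 1, 5), "comment": Param("", 1, 20)},
    {"port": Param("80a", 2, 5)},
]
errors = validate(objects)
print(errors)
```

Each check can now be tested on its own with a single Param, and the loop still visits every parameter only once.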
As a last word: don't go over the top with "best practices" either. Writing clean code IS important, but "clean" is a relative notion, and trying to strictly follow all "golden rules" is downright impossible (especially since no one really agrees on those rules xD).
Actually, the point of "golden rules" etc. is to make you aware of what problems you might have if you don't follow the rule. Once you know, it's your responsibility to decide when the rule applies and when it doesn't. It's like raising children: at first you just forbid them to cross the road alone because they are too young to really understand the dangers. Once they are mature enough to get the point, the rule becomes useless.

Let me comment on your code first:
for netobject in netObjectList:
    for key, value in netobject.getObjectParams().items():
        # Below you seem to be assuming that the first item must have 'comment' as its key?
        # What if the key 'comment' actually exists? If you iterate like this you will be
        # throwing errors until you have iterated through all the items
        # (and either hit the required key or not).
        if value.getValue() == '' and key != 'comment':
            Errorhandler().error_on_pos(value.getRow(), value.getCol(), str(key) + ' is missing')
        else:
            NValidator.checkSyntaxOfValue(key, value)
So, I'd choose a different approach. If you definitely want to work with item having key 'comment'
(and assuming here that your keys are unique - ie no multiple 'comment' keys in a single netobject object) you could do this instead:
for netobject in netObjectList:
    try:
        # try to get the value directly
        # (of course you could use .keys() instead of .items() to evaluate existing keys first)
        value = netobject.getObjectParams()['comment']
        NValidator.checkSyntaxOfValue('comment', value)
    except KeyError:
        # Due to the KeyError we now know for sure that the key doesn't exist,
        # so we can handle the exception, or report our own error such as:
        Errorhandler().error_on_pos(...)
This example will either report exactly one genuine error (i.e. no false positives), or it will go on to the syntax check.
EDIT:
Decided to illustrate with better example:
Here I'm first emulating your netobject
class netobject_class(object):
    def __init__(self, datadict):
        self.data = datadict
    def getObjectParams(self):
        return self.data
    def __str__(self):
        return "I'm a netobject_class instance (hash=" + str(self.__hash__()) + ") containing a data Dict"
Now we create the netObjectList which contains 3 sample netobject_class instances. Note that second Dict has no 'comment' key.
netObjectList = [
    netobject_class({'comment': 'comment1'}),
    netobject_class({'somekey': 'somevalue', 'somekey2': 'somevalue2'}),
    netobject_class({'comment': 'comment2'})
]
Now we run the code:
print("-" * 40)
for netobject in netObjectList:
    # print("Working with netobject: %s" % netobject)
    try:
        # try to get the value directly
        # (of course you could use .keys() instead of .items() to evaluate existing keys first)
        value = netobject.getObjectParams()['comment']
        print("****** value='%s' # aka comment" % value)
        # NValidator.checkSyntaxOfValue('comment', value)
    except KeyError as e:
        # Due to the KeyError we now know for sure that the key doesn't exist,
        # so we can handle the exception, or report our own error such as:
        print('Error: key', str(e), 'does not exist! Skipping....')
    print("-" * 40)
And this produces:
----------------------------------------
****** value='comment1' # aka comment
----------------------------------------
Error: key 'comment' does not exist! Skipping....
----------------------------------------
****** value='comment2' # aka comment
----------------------------------------

Related

Given two possible dictionary keys, how to take the one actually set

I am receiving from an API a dictionary, but the problem is that sometimes I get {value: test} and others {key: test}. I am using a try/except block to take the one set:
try:
    var = received_dict['value']
except KeyError:
    var = received_dict['key']
Is there a better way to do that in Python 3?

received_dict.get('value', received_dict.get('key', False))

This will still throw an error if neither key is present, which is almost certainly the desired behavior. Be aware, though, that the default argument is evaluated before get is called, so it also raises KeyError when only 'value' is present:
received_dict.get('value', received_dict['key'])
However, Python generally follows the "easier to ask for forgiveness than permission" (EAFP) mentality. If one of the keys occurs more frequently than the other, how you're currently doing it may be best. If it really is 50/50, however, using get might be a cleaner way to go.
https://docs.python.org/3.4/glossary.html (See EAFP)
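One detail worth checking in the two-argument form: Python evaluates the default passed to get before get itself runs, so received_dict['key'] is looked up even when 'value' is present. A small sketch of the effect follows; the sample dictionaries and the pick helper are made up for illustration.

```python
def pick(received_dict):
    """Return the entry under 'value', falling back to 'key'.

    Raises KeyError only if neither key is present.
    """
    if "value" in received_dict:
        return received_dict["value"]
    return received_dict["key"]


only_value = {"value": "test"}

# The default argument is evaluated eagerly, so this raises KeyError
# even though 'value' is present.
try:
    only_value.get("value", only_value["key"])
    eager_lookup_failed = False
except KeyError:
    eager_lookup_failed = True

print(eager_lookup_failed)     # True
print(pick(only_value))        # test
print(pick({"key": "other"}))  # other
```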

To be honest your approach seems good. If you want a bit more control over the different cases you can do this:
values = [received_dict[key] for key in ["value", "key"] if key in received_dict]
if not values:
    raise KeyError("No 'value' or 'key' keys exist in dictionary")
elif len(values) == 2:
    raise KeyError("Both 'value' and 'key' keys exist in dictionary")
else:
    var = values[0]
The only advantage of this approach is that it throws an error also when both keys exist. If you don't care about the "both exist" error case and you don't mind a different type of error being raised when the "none exist" error case happens, then you can do the code below to have a one liner:
var = [received_dict[key] for key in ["value","key"] if key in received_dict.keys()][0]

using get function if key must exist in dictionary?

I have a question about using the dictionary get method in Python. I understand that get can provide a default value if a key does not exist. But what if we know that the key must exist, as in the following code? Should we still use get, or can we just use dict[key] to get the value? Does this mean get can replace dict[key]?
value = 'default'
dict_get = dict(key='value')
def test_get(dict_get):
    return dict_get.get('key', 'default_value')
dict_get.get('key')
dict_get['key']

If the key must exist, you should use yourdict[key], i.e. the __getitem__ method.
If for some reason the key does not exist, you want your program to crash with a KeyError, because clearly there's something wrong with your program that needs to be fixed.
If the key should exist, but may not due to reasons other than faulty program logic, you can take a more defensive approach. For example, if a user is queried to input a valid key but fails to do so, you could fall back to a default value using dict.get or ask the user again.
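A minimal sketch of the two situations described above. The settings and colors dictionaries and the lookup_color helper are hypothetical examples, not part of the question.

```python
settings = {"theme": "dark"}  # internal data: 'theme' must exist

# Key must exist: index directly so a missing key fails loudly with KeyError.
theme = settings["theme"]

colors = {"red": "#ff0000", "green": "#00ff00"}


def lookup_color(name, default="#000000"):
    """User-supplied key: fall back to a default instead of crashing."""
    return colors.get(name, default)


print(theme)                   # dark
print(lookup_color("red"))     # #ff0000
print(lookup_color("purple"))  # #000000
```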

dict.get will return None if the key doesn't exist. In some applications you just want to raise an exception there instead, which is what dict[key] does. For example:
scores = {"Larry": 5}
bob_score = scores.get("Bob")  # None
if bob_score:
    print("You're doing great Bob!")
else:
    print("Too bad Bob, you're out.")
Our code assumes that the key Bob is in the dictionary and maps to an integer. Bob isn't in the dictionary, but because None is a falsy value we still get a plausible-looking answer from this code, even though that conclusion is based on a mistake. It would be much better for an exception to be raised here.
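To make the difference concrete, here is a small sketch built on the same scores dictionary. It also shows a second pitfall of the truthiness test: a legitimate score of 0 is treated the same as a missing entry. The player Moe and the describe helper are added purely for illustration.

```python
scores = {"Larry": 5, "Moe": 0}


def describe(name):
    """Index directly so that an unknown name raises KeyError."""
    score = scores[name]
    return "{} has {} points".format(name, score)


# .get() hides the missing key, and a truthiness test cannot tell
# "missing" (None) apart from a real score of 0.
print(bool(scores.get("Bob")))  # False: Bob is missing
print(bool(scores.get("Moe")))  # False: Moe exists with score 0

print(describe("Moe"))          # Moe has 0 points
try:
    describe("Bob")
except KeyError as exc:
    print("unknown player:", exc)  # unknown player: 'Bob'
```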

Too many if statements

I have some topic to discuss. I have a fragment of code with 24 ifs/elifs. Operation is my own class that represents functionality similar to Enum. Here is a fragment of code:
if operation == Operation.START:
    strategy = strategy_objects.StartObject()
elif operation == Operation.STOP:
    strategy = strategy_objects.StopObject()
elif operation == Operation.STATUS:
    strategy = strategy_objects.StatusObject()
(...)
I have concerns from a readability point of view. Is it better to change it into 24 classes and use polymorphism? I am not convinced that it will make my code more maintainable... On the one hand those ifs are pretty clear and shouldn't be hard to follow; on the other hand there are too many ifs.
My question is rather general, however I'm writing code in Python so I cannot use constructions like switch.
What do you think?
UPDATE:
One important thing is that StartObject(), StopObject() and StatusObject() are constructors and I wanted to assign an object to strategy reference.

You could possibly use a dictionary. Dictionaries store references, which means functions are perfectly viable to use, like so:
operationFuncs = {
    Operation.START: strategy_objects.StartObject,
    Operation.STOP: strategy_objects.StopObject,
    Operation.STATUS: strategy_objects.StatusObject,
    (...)
}
It's good to have a default operation just in case, so when you run it use a try except and handle the exception (ie. the equivalent of your else clause)
try:
    strategy = operationFuncs[operation]()
except KeyError:
    strategy = strategy_objects.DefaultObject()
Alternatively use a dictionary's get method, which allows you to specify a default if the key you provide isn't found.
strategy = operationFuncs.get(operation, strategy_objects.DefaultObject)()
Note that you don't include the parentheses when storing them in the dictionary, you just use them when calling your dictionary. Also this requires that Operation.START be hashable, but that should be the case since you described it as a class similar to an ENUM.

Python's equivalent to a switch statement is to use a dictionary. Essentially you can store the keys like you would the cases and the values are what would be called for that particular case. Because functions are objects in Python you can store those as the dictionary values:
operation_dispatcher = {
    Operation.START: strategy_objects.StartObject,
    Operation.STOP: strategy_objects.StopObject,
}
Which can then be used as follows:
try:
    strategy = operation_dispatcher[operation]  # fetch the strategy
except KeyError:
    strategy = default  # this deals with the else-case (if you have one)
strategy()  # call if needed
Or more concisely:
strategy = operation_dispatcher.get(operation, default)
strategy() #call if needed
This can potentially scale a lot better than having a mess of if-else statements. Note that if you don't have an else case to deal with you can just use the dictionary directly with operation_dispatcher[operation].
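For reference, here is a self-contained sketch of the dictionary dispatch. Operation is modelled with the standard enum.Enum, and StartObject, StopObject and DefaultObject are empty stand-ins for the asker's strategy_objects classes. All of these, along with the make_strategy helper, are assumptions made so the example runs on its own.

```python
from enum import Enum


class Operation(Enum):
    START = "start"
    STOP = "stop"
    STATUS = "status"


# Empty stand-ins for the classes in strategy_objects.
class StartObject:
    pass


class StopObject:
    pass


class DefaultObject:
    pass


# Store the classes themselves (no parentheses); call them after lookup.
operation_dispatcher = {
    Operation.START: StartObject,
    Operation.STOP: StopObject,
}


def make_strategy(operation):
    """Look up the class for an operation and instantiate it."""
    strategy_class = operation_dispatcher.get(operation, DefaultObject)
    return strategy_class()


print(type(make_strategy(Operation.START)).__name__)   # StartObject
print(type(make_strategy(Operation.STATUS)).__name__)  # DefaultObject
```

Adding a 25th operation then means adding one dictionary entry rather than another elif branch.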

You could try something like this.
For instance:
def chooseStrategy(op):
    return {
        Operation.START: strategy_objects.StartObject,
        Operation.STOP: strategy_objects.StopObject,
    }.get(op, strategy_objects.DefaultValue)
Call it like this
strategy = chooseStrategy(operation)()
This method has the benefit of providing a default value (like a final else statement). Of course, if you only need to use this decision logic in one place in your code, you can always use strategy = dictionary.get(op, default) without the function.

Starting with Python 3.10, you can use the match statement:
match i:
    case 1:
        print("First case")
    case 2:
        print("Second case")
    case _:
        print("Didn't match a case")
https://pakstech.com/blog/python-switch-case/

You can use some introspection with getattr:
strategy = getattr(strategy_objects, "%sObject" % operation.capitalize())()
Let's say the operation is "STATUS", it will be capitalized as "Status", then prepended to "Object", giving "StatusObject". The StatusObject method will then be called on the strategy_objects, failing catastrophically if this attribute doesn't exist, or if it's not callable. :) (I.e. add error handling.)
The dictionary solution is probably more flexible though.
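A sketch of the getattr approach with the error handling added. Here strategy_objects is replaced by a types.SimpleNamespace holding two empty stand-in classes, and make_strategy is a hypothetical helper; the operation is assumed to be a plain string such as "STATUS".

```python
from types import SimpleNamespace


class StartObject:
    pass


class StatusObject:
    pass


# Stand-in for the strategy_objects module.
strategy_objects = SimpleNamespace(
    StartObject=StartObject,
    StatusObject=StatusObject,
)


def make_strategy(operation):
    """Build e.g. 'StatusObject' from 'STATUS' and instantiate it."""
    class_name = "{}Object".format(operation.capitalize())
    strategy_class = getattr(strategy_objects, class_name, None)
    if not callable(strategy_class):
        raise ValueError("unknown operation: {!r}".format(operation))
    return strategy_class()


print(type(make_strategy("STATUS")).__name__)  # StatusObject
try:
    make_strategy("reboot")
except ValueError as exc:
    print(exc)  # unknown operation: 'reboot'
```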

If the Operation.START, etc are hashable, you can use dictionary with keys as the condition and the values as the functions to call, example -
d = {Operation.START: strategy_objects.StartObject,
     Operation.STOP: strategy_objects.StopObject,
     Operation.STATUS: strategy_objects.StatusObject}
And then you can do this dictionary lookup and call the function , Example -
d[operation]()

Here is a bastardized switch/case done using dictionaries:
For example:
# define the function blocks
def start():
    return strategy_objects.StartObject()
def stop():
    return strategy_objects.StopObject()
def status():
    return strategy_objects.StatusObject()
# map the inputs to the function blocks
options = {"start": start,
           "stop": stop,
           "status": status,
           }
Then the equivalent switch block is invoked:
strategy = options["start"]()  # or any other key of options

Do you have to check if an array element exists (not null string) in Python3?

If an associative array exists in Python3, is it a waste to check if an element of it exists rather than just using it?
Should you:
if 'name' in array and not array['name'].startswith('something'):
Or should you just:
if not array['name'].startswith('something'):
... And will Python3 handle it "for" you?

You can do -
if not array.get('name', 'something').startswith('something'):
get() function returns the second value by default if the key ( name ) is not found.
So in the above case, this would return 'something' if the key is not found in the dictionary, and because of the .startswith() and the not, the complete conditional expression would evaluate to False, just as it would in OP's first example.

The answer really depends on where does your array come from. Also on what can you do as a result of the check.
For example if it comes from the user, then yes, you definitely should check if name exists, so that you can respond with a better error message than some big stacktrace.... KeyError: "name".
If it's an internal structure you created and it's not exposed for user modifications, or if it's exposed via API but the expectation is that nobody should touch it, then don't bother checking. If it's missing, then it's an internal exception the developers should see and fix.
Of course there may be a situation where you don't really care if it was provided or not, because you have a reasonable fallback. Then a default value like in Anand's answer is a good solution too. array.get('name', some_default)

Python won't handle it for you. You'll have to do some work in both cases. Try to write code that makes sense to you later, because you'll write it once and read it (and maintain it) many times over.
You can do it two ways, pick the one that makes the most sense to you:
obj = array.get('name')
if obj and not obj.startswith('something'):
    pass
In your second option, you'll have to catch the exception:
try:
    if not array['name'].startswith('something'):
        pass
except KeyError:
    print('name does not exist in array')

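Just do ...
if not array.get('name', "").startswith('something'):
If name exists then it returns array['name'], otherwise it returns the empty string. Note that the condition is then True when the key is missing, because an empty string does not start with 'something'.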
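The variants in this thread behave differently when the key is missing, which is easy to overlook. The sketch below evaluates each condition for a dictionary with and without 'name'. The function names and sample data are made up for illustration.

```python
def explicit_check(array):
    return "name" in array and not array["name"].startswith("something")


def get_with_matching_default(array):
    return not array.get("name", "something").startswith("something")


def get_with_empty_default(array):
    return not array.get("name", "").startswith("something")


present = {"name": "other"}
missing = {}

for check in (explicit_check, get_with_matching_default, get_with_empty_default):
    print(check.__name__, check(present), check(missing))
# explicit_check True False
# get_with_matching_default True False
# get_with_empty_default True True
```

So the choice of default decides whether a missing key counts as "starts with something" or not.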

Python idiomatic unpacking assignment or False

If function returns a two value list or tuple on success or False on failure, how can I best unpack the return list into two variables while also checking for False?
def get_key_value():
    if (cond != True):
        return False
    return [val1, val2]
# Call it
# How can I also check for False while unpacking?
key, value = get_key_value()

Converting @Felix Kling's great comment into an answer.
If not being able to find a (key, value) pair indicates some kind of system failure, it would be better to throw an exception. If your failure doesn't really fall into any of the standard exceptions, you should build a new exception type of your own.
The cond != True is better written as not cond. Also it's better to not create a list if it's not necessary.
class DataNotFound(Exception):
    pass
def get_key_value():
    if not cond:
        raise DataNotFound("Couldn't find it!")
    return val1, val2
try:
    key, value = get_key_value()
except DataNotFound:
    # handle the failure somehow
    key, value = 'ERROR', 'ERROR'

This falls under the "Easier to Ask for Forgiveness than Permission" policy of Python. I call the function outside the try block so that a TypeError raised inside get_key_value itself is not caught by accident, in case there's some other unforeseen problem.
data = get_key_value()
try:
    key, value = data
except TypeError:
    # handle the failure somehow
    key, value = 'ERROR', 'ERROR'

I don't think there is an idiomatic way to do this -- not least because a function that behaves that way is itself unidiomatic. If you have to do it, I suggest you simply make use of the fact that your 2-element list or tuple is a "truthy" rather than a "falsy" value (this isn't Python terminology but it's useful):
pair_or_false = get_key_value()
if pair_or_false:
    key, value = pair_or_false
else:
    pass  # handle failure in whatever way
The obvious alternative is to treat the not-found case as an exception:
try:
    key, value = get_key_value()
except TypeError:
    pass  # deal with not-found case
but if there's any possibility at all that something other than the unsuccessful unpacking could raise a TypeError then you run the risk of masking a genuine error that way.

You're running into problems because you're mixing return types. Just because you can doesn't mean you should.
Although I agree with the others here that an exception is one appropriate way to go, it may depend on whether you expect to find a valid key & value most of the time. If so, use an exception (something like KeyError) to indicate that the function failed. But if you expect it to fail at a high rate, you may not want the exception overhead. In that case, return something like [None, None] from get_key_value and then your calling code would look like:
key, value = get_key_value()
if key is not None:
    pass  # take action
else:
    pass  # handle the error appropriately
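A variation on the same idea, sketched under the assumption that a missing pair is an ordinary outcome rather than an error: return None instead of False and test for it explicitly before unpacking. The lookup table and the find_pair and describe names are hypothetical.

```python
lookup = {"a": 1, "b": 2}


def find_pair(key):
    """Return a (key, value) tuple, or None when the key is unknown."""
    if key not in lookup:
        return None
    return key, lookup[key]


def describe(key):
    result = find_pair(key)
    if result is None:
        return "not found"
    found_key, value = result
    return "{}={}".format(found_key, value)


print(describe("a"))  # a=1
print(describe("z"))  # not found
```

Testing with "is None" keeps a valid pair whose key or value happens to be falsy (such as 0 or an empty string) from being mistaken for a failure.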
