Refactoring a huge function into a set of smaller functions

In terms of clean code, how should a function that has nested for loops, if-else statements, and while loops be refactored? What would be the ideal, clean structure for a big function like this? Is it acceptable to break a function like this up to nested functions?
def main():
try:
for
if
for
while
for
for
if
for
if
else
if
else
if
except:
if __name__ == "__main__":
    main()

Only nest loops inside loops if you really need to; otherwise avoid nesting them (for algorithmic performance reasons).
Follow the advice in Omri's answer: identify each step you are doing, give each step a clear name, and extract the step into its own function (which you call from the original function to perform that step).
This is different from nesting functions, which is done for different reasons.
You are just making calls to helper functions placed somewhere else (not nested in your function).
Do not surround everything in a try block, and avoid the catch-all bare except:. Wrap only the specific statement (or the few statements) that can cause trouble, and list in the except clause only the error or error categories you expect there.
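A minimal sketch of the advice above, using a made-up task (summing the numeric fields of comma-separated lines). The task, the function names and the data format are assumptions for illustration only; they are not taken from the question.

```python
def parse_line(line):
    """Split one comma-separated line into stripped fields."""
    return [field.strip() for field in line.split(",")]


def to_number(field):
    """Convert one field to float; only the conversion is guarded."""
    try:
        return float(field)
    except ValueError:  # the one error expected here, not a bare except
        return None


def sum_line(line):
    """Sum the numeric fields of a line, skipping non-numeric ones."""
    numbers = (to_number(field) for field in parse_line(line))
    return sum(n for n in numbers if n is not None)


def main(lines):
    # The top-level function now reads as a short list of named steps.
    return [sum_line(line) for line in lines]


if __name__ == "__main__":
    print(main(["1, 2, x", "3.5, 4"]))
```

Each helper is a module-level function with one purpose, and the try block covers a single statement.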

It is mainly opinion-based and depends on the code itself. A good rule of thumb is that each function needs to have one logical purpose.

Related

Calling non-pure function in list comprehension

I have the following code (simplified):
def send_issue(issue):
    message = bot.send_issue(issue)
    return message

def send_issues(issues):
    return [send_issue(issue) for issue in issues]
As you see, send_issues and send_issue are non-pure functions. Is it considered good practice (and Pythonic) to call non-pure functions in list comprehensions? The reason I want to do this is that it is convenient. The reason against it is that when you see a list comprehension, you expect the code to just generate the list and nothing more, but that's not the case here.
UPD:
I actually want to create and return the list, contrary to this question.

The question here is - Do you really need to create the list?
If this is so it is okay but not the best design.
It is good practice for a function to do only one thing especially if it has a side effect like I/O.
In your case, the function is creating and sending a message.
To fix this you can create a function that is sending the message and a function which is generating the message.
It is better to write it as:
msgs = [bot.get_message(issue) for issue in issues]
for msg in msgs:
    bot.send(msg)
This is clearer and widens the use of the API while keeping the side effect isolated.
If you don't want to create another function, you can at least use map, since it says "map this function to every element".
list(map(bot.send_issue, issues))  # map returns a list in Python 2; Python 3 needs list() to run the calls
Also, the function send_issue is not needed because it just wraps the bot.send_issue.
Adding such functions is only making the code noisy which is not a good practice.
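A small self-contained sketch of the split suggested above. FakeBot is a hypothetical stand-in so the example can run; its get_message and send methods mirror the names used in this answer and are assumptions about the API, not a real library.

```python
class FakeBot:
    """Hypothetical stand-in for the real bot; records what would be sent."""

    def __init__(self):
        self.sent = []

    def get_message(self, issue):
        # Pure step: builds the message text without any I/O.
        return "Issue: {}".format(issue)

    def send(self, message):
        # Side effect: a real bot would perform network I/O here.
        self.sent.append(message)


def send_issues(bot, issues):
    # The comprehension only builds values.
    messages = [bot.get_message(issue) for issue in issues]
    # The side effect lives in a plain loop, where a reader expects it.
    for message in messages:
        bot.send(message)
    return messages


if __name__ == "__main__":
    print(send_issues(FakeBot(), ["crash on start", "typo in docs"]))
```

The list is still created and returned, but the comprehension itself no longer hides any I/O.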

Using nested def for organizing code

I keep nesting def statements as a way of grouping/organizing code, but after doing some reading I think I'm abusing it.
Is it kosher to do something like this?
def generate_random_info():
    def random_name():
        return numpy.random.choice(['foo', 'bar'])

    def random_value():
        return numpy.random.rand()

    return {'name': random_name(), 'value': random_value()}

There is nothing wrong with it per se. But you should consider one thing when you use structures like this: random_name and random_value are functions that are redefined every time you call generate_random_info(). That might not be a problem for those particular functions, especially if you won’t call generate_random_info() very often, but it is overhead that can be avoided.
So you should probably move those function definitions outside of the generate_random_info function. Or, since those inner functions don’t do much themselves, and you just call them directly, just inline them:
def generate_random_info():
    return {
        'name': numpy.random.choice(['foo', 'bar']),
        'value': numpy.random.rand()
    }
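The redefinition mentioned above can be observed directly. This is a minimal sketch with made-up function names; it only shows that a new function object is built on every call of the enclosing function, not how much time that costs.

```python
def make_nested():
    def helper():  # this def statement runs again on every call
        return 42
    return helper


def module_level_helper():  # defined once, when the module is loaded
    return 42


def get_module_level():
    return module_level_helper


first = make_nested()
second = make_nested()
print(first is second)                           # False: two distinct function objects
print(first.__code__ is second.__code__)         # True: the body is compiled only once
print(get_module_level() is get_module_level())  # True: same object every time
```

So the per-call cost is creating a function object, which is usually small; whether it matters depends on how often the outer function is called.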

Unless you are planning on reusing the same chunk of code repeatedly throughout a single function and that function only, I would avoid creating those functions just for the sake of doing it. I'm not an expert on how the code is working on the computational level, but I would think that creating a function is more intensive than simply using that line as you have it now, especially if you're only going to use that function once.

Using pass on a non necessary else statement

I could not find anything in the PEP 8 documentation about whether I should use pass in code for aesthetic reasons. In the example below, should I keep those else clauses or can I remove them? So far, the main reason I have kept them is the mantra "Explicit is better than implicit."
if fields:
    for i in foo:
        if i == 'something':
            print "something"
        else:
            pass
else:
    pass

Yes, you can/should remove them because they do nothing.
The Python community teaches "explicit is better than implicit" as long as the explicit code does something useful. Those else: pass's however contribute nothing positive to the code. Instead, all they do is pointlessly consume two lines each.

You can safely remove those as there's no point in keeping code around that serves no purpose:
if fields:
    for i in foo:
        if i == 'something':
            print "something"

An else: pass is dead code and you should remove it. It adds unnecessary noise, and the code is clearer and easier to understand without it.

I can think of a few cases where pass may be useful (the latter two are temporary stubs):
When you want to ignore an acceptable exception.
When you need to insert a breakpoint at the end of a function when debugging.
As a filler in a function whose implementation you want to postpone.
I cannot imagine any other case where I would use pass.
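Two of the cases listed above, sketched with hypothetical names (discard_key and export_report are invented for illustration):

```python
def discard_key(mapping, key):
    """Delete key from mapping; a missing key is acceptable."""
    try:
        del mapping[key]
    except KeyError:
        pass  # deliberately ignored: there is nothing to delete


def export_report():
    pass  # stub: implementation postponed


settings = {"debug": True}
discard_key(settings, "debug")
discard_key(settings, "verbose")  # no error although the key is absent
print(settings)
```

In both places pass is the only statement in the block, so it cannot simply be deleted the way a trailing else: pass can.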
EDIT:
In some cases, when implementing an if-elif-else chain where a common condition requires no action and rarer conditions do require specific actions, you may put pass after the first if for the sake of execution efficiency:
if <some common condition>:
    pass
elif <rare condition>:
    <do something>
elif <another rare condition>:
    <do something else>
else:
    <do other stuff>

The thing about else is that it is not just a part of if statements; it appears in try statements and for loops too. You don't see else being used (in this context) in those areas, do you?
try:
    raw_input("say my name")
except:
    print "Heisenberg"
# Meh, this is not needed.
else:
    pass
If we are looping over something and checking for some condition (with the if), then an else would add unnecessary lines.
Here's a loop for finding a folder:
for path in pathlist:
    if os.path.isdir(path):
        print "Found a folder, yay!"
        break
    else:
        continue
Clearly, the else branch is executed on every iteration that does not match, and it is pointless. This could be avoided, as implied in PEP 8 itself:
But most importantly: know when to be inconsistent -- sometimes the style guide just doesn't apply. When in doubt, use your best judgment. Look at other examples and decide what looks best. And don't hesitate to ask!
When applying the guideline would make the code less readable, even for someone who is used to reading code that follows this PEP.
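For comparison, the folder-finding loop above can be written without the redundant else branch. This is only a sketch; the function name first_folder is an assumption, and it returns the match instead of printing.

```python
import os


def first_folder(pathlist):
    """Return the first path that is an existing directory, or None."""
    for path in pathlist:
        if os.path.isdir(path):
            return path  # stop at the first match
    return None  # reached only if no path was a directory


if __name__ == "__main__":
    print(first_folder(["definitely-missing-path", os.getcwd()]))
```

The loop moves on to the next item by itself, so no else: continue is needed.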

else vs return to use in a function to prematurely stop processing

[Edit] changed return 0 to return. Side effects of being a Python n00b. :)
I'm defining a function in which I do some 20 lines of processing. Before processing, I need to check whether a certain condition is met. If so, I should bypass all processing. I have defined the function this way:
def test_function(self, inputs):
    if inputs == 0:
        <Display Message box>
        return
    <20 line logic here>
Note that the 20-line logic does not return any value, and I'm not using the 0 returned in the first 'if'.
I want to know if this is better than the code below (in terms of performance, readability, or anything else), because the above version looks good to me as it has one indentation level less:
def test_function(self, inputs):
    if inputs == 0:
        <Display Message box>
    else:
        <20 line logic here>

In general, it improves code readability to handle failure conditions as early as possible. Then the meat of your code doesn't have to worry about these, and the reader of your code doesn't have to consider them any more. In many cases you'd be raising exceptions, but if you really want to do nothing, I don't see that as a major problem even if you generally hew to the "single exit point" style.
But why return 0 instead of just return, since you're not using the value?

First, you can use return without anything after it; you don't have to force a return 0.
As for performance, this question seems to prove you won't notice any difference (unless you're really unlucky ;) )

In this context, I think it's important to know why inputs can't be zero. Typically, I think the way most programs handle this is to raise an exception if a bad value is passed. Then the exception can be handled (or not) in the calling routine.
You'll often see it written "Better to ask forgiveness" as opposed to "Look before you leap". Of course, if you're often passing 0 into the function, then the try / except clause could get expensive (try is cheap, except is not).
If you're set on "looking before you leap", I would probably use the first form to keep indentation down.
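The two approaches discussed above, side by side. This is a sketch only: the doubling stands in for the 20 lines of processing, and the function names are made up.

```python
def process(inputs):
    """Look before you leap: a guard clause with an early return."""
    if inputs == 0:
        return None  # bypass all processing
    return inputs * 2  # placeholder for the real processing


def process_strict(inputs):
    """Treat 0 as an error and let the caller decide what to do."""
    if inputs == 0:
        raise ValueError("inputs must be non-zero")
    return inputs * 2


try:
    process_strict(0)
except ValueError as error:
    print("rejected:", error)  # the caller handles the bad value
```

With the early return, the main body stays at the function's top indentation level in both variants.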

I doubt the performance is going to be significantly different in either case. Like you I would tend to lean more toward the first method for readability.
In addition to the smaller indentation (which doesn't really matter much IMO), it removes the need to read any further when inputs == 0.
In the second method one might assume that there is additional processing after the if/else statement, whereas the first one makes it obvious that the method is complete upon that condition.
It really just comes down to personal preference though, you will see both methods used in practice.

Your second example will return after it displays the message box in this case.
I prefer to "return early" as in my opinion it leads to better readability. But then again, most of my returns that happen prior to the actual end of the function tend to be more around short circuiting application logic if certain conditions are not met.

function in python

In a function, I need to perform some logic that requires me to call a function inside a function. This is what I did:
def dfs(problem):
    stack.push(bache)
    search(root)
    while stack.isEmpty() != 0:
    def search(vertex):
        closed.add(vertex)
        for index in sars:
            stack.push(index)
            return stack
In the function dfs I am calling search(root). Is this the correct way to do it?
I am getting an error: local variable 'search' referenced before assignment

There are many mysterious bug-looking aspects in your code. The wrong order of definition (assuming you do need the search function to be a nested one) and the syntax error from the empty while loop have already been observed, but there are more...:
def dfs(problem):
    stack.push(bache)
    search(root)
what's bache, what's stack, what's root? If they're all global variables, then you're overusing globals -- and apparently nowhere ever using the argument problem (?!).
    while stack.isEmpty() != 0:
what's this weird-looking method isEmpty? IOW, what type is stack (clearly not a Python list, and that's weird enough, since they do make excellent LIFO stacks;-)...? And what's ever going to make it empty...?
    def search(vertex):
        closed.add(vertex)
...don't tell me: closed is yet another global? Presumably a set? (I remember from a few of your Qs back that you absolutely wanted to have a closed dict, not set, even though I suggested that as a possibility...)
        for index in sars:
...and what's sars?!
            stack.push(index)
            return stack
what a weird "loop" -- one that executes exactly once, altering a global variable, and then immediately returns that global variable (?) without doing any of the other steps through the loop. Even if this is exactly what you mean (push the first item of sars, period) I don't recommend hiding it in a pseudo-loop -- it seriously looks like a mystery bug just waiting to happen;-).

You need to de-indent your search function. The way you have it set up right now you are defining your search function as a part of the completion of your dfs call. Also, encapsulation in a class would help.

That's the wrong order. Try this:
def dfs(problem):
    def search(vertex):
        closed.add(vertex)
        for index in sars:
            stack.push(index)
            return stack
    stack.push(bache)
    search(root)
    while stack.isEmpty() != 0:
        ...  # loop body not shown in the question

Either define search before you call it, or define it outside of dfs.

You have to define the function before using it.
root doesn't seem to be available in your scope. Make sure it's reachable.

Your while loop has no body. That is probably causing problems parsing the code. I would also suggest putting the local function definition before it is used, so it is easier to follow.
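Putting the points from the answers together, a runnable sketch might look like the following. It is not the asker's algorithm: the graph dictionary, the helper name expand and the returned visit order are all assumptions made for illustration, and a plain list is used as the stack.

```python
def dfs(graph, root):
    """Iterative depth-first search; returns vertices in visit order."""
    closed = set()   # vertices already expanded
    order = []       # visit order, returned to the caller
    stack = [root]   # a plain list works as a LIFO stack

    def expand(vertex):
        # Nested helper, defined before its first use in the loop below.
        closed.add(vertex)
        order.append(vertex)
        # Push neighbours in reverse so the first neighbour is popped first.
        for neighbour in reversed(graph.get(vertex, [])):
            if neighbour not in closed:
                stack.append(neighbour)

    while stack:  # the loop has a body and ends when the stack is empty
        vertex = stack.pop()
        if vertex not in closed:
            expand(vertex)
    return order


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(dfs(graph, "a"))  # ['a', 'b', 'd', 'c']
```

All state is local to dfs, so no globals are needed, and the nested function is defined before the first call.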
