Calling non-pure function in list comprehension - python

I have the following code (simplified):
def send_issue(issue):
    message = bot.send_issue(issue)
    return message
def send_issues(issues):
    return [send_issue(issue) for issue in issues]
As you can see, send_issues and send_issue are non-pure functions. Is it considered good practice (and Pythonic) to call non-pure functions in list comprehensions? The reason I want to do this is that it is convenient. The reason against it is that when you see a list comprehension, you expect the code to just generate the list and nothing more, but that's not the case here.
UPD:
I actually want to create and return the list, contrary to this question.

The question here is: do you really need to create the list?
If so, this is okay, but it is not the best design.
It is good practice for a function to do only one thing, especially if it has a side effect such as I/O.
In your case, the function both creates and sends a message.
To fix this, you can write one function that generates the message and another that sends it.
It is better to write it as:
msgs = [bot.get_message(issue) for issue in issues]
for msg in msgs:
    bot.send(msg)
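This is clearer and makes the API more widely usable while keeping the side effect isolated.
If you don't want to create another function, you can at least use map, since it says "map this function over every element". Note that map returns a list in Python 2, but in Python 3 it returns a lazy iterator, so nothing is sent until the result is consumed; wrapping it in list() forces the calls.
list(map(lambda issue: bot.send_issue(issue), issues))  # list() forces evaluation in Python 3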
Also, the function send_issue is not needed because it just wraps the bot.send_issue.
Adding such functions is only making the code noisy which is not a good practice.
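A minimal sketch of why laziness matters for side effects, assuming Python 3. FakeBot is a hypothetical stand-in for the bot in the question (it only records what it is asked to send), and this send_issues takes the bot as a parameter purely for illustration:

```python
# Hypothetical stand-in for the bot in the question; it only records what it "sends".
class FakeBot:
    def __init__(self):
        self.sent = []

    def send_issue(self, issue):
        self.sent.append(issue)
        return "message for {}".format(issue)


bot = FakeBot()
issues = ["issue-1", "issue-2"]

# In Python 3, map() builds a lazy iterator: no call has happened yet.
lazy = map(bot.send_issue, issues)
sent_before = list(bot.sent)

# Consuming the iterator is what triggers the side effects.
messages = list(lazy)
sent_after = list(bot.sent)


# An explicit loop makes the side effect obvious and still returns the list.
def send_issues(bot, issues):
    messages = []
    for issue in issues:
        messages.append(bot.send_issue(issue))
    return messages
```

Here sent_before is empty and sent_after holds both issues, so with map the sending happens only when the result is consumed. The explicit loop avoids that surprise at the cost of a few more lines.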

Related

Using a method to create multiple, new instances of a generator

I've been learning about generators in Python recently and have a question. I've used iterators before when learning Java, so I know how they basically work.
So I understand what's going on here in this question: Python for loop and iterator behavior
Essentially, once the for loop has traversed the iterator, it stops there, so another for loop would resume the iterator at its end (and result in nothing being printed out). That is clear to me.
I'm also aware of the tee method from itertools, which lets me "duplicate" a generator. I found this to be helpful when I want to check if a generator is empty before doing anything to it (as I can check whether the duplicate in list form is empty).
In some code I'm writing, I need to create many of the same generators at different points throughout the code, so my line of thought was: why don't I write a method that makes a generator? So every time I need a new one, I can call that method. Maybe my misunderstanding has to do with this "generator creation" process, but that seemed right to me.
Here is the code I'm using. When I first call the method and duplicate it using tee, everything works fine, but then once I call it again after looping through it, the method returns an empty generator. Does this "using a method" workaround not work?
from itertools import tee
node_list = []
generate_hash_2, temp = tee(generate_nodes(...))
for node in list(temp):
    node_list.append(...)
print("Generate_hash_2:{}".format(node_list))
for node in generate_hash_2:
    if node.hash_value == x:
        print(x)
node_list2 = []
generate_hash_3, temp2 = tee(generate_nodes(...))  # exact same parameters as before
for node in list(temp2):
    node_list2.append(...)
print("Generate_hash_3:{}".format(node_list2))
def generate_nodes(nodes, type):
    for node in nodes:
        if isinstance(node.type, type):
            yield node
Please ignore the poor variable name choices. The first list (built from temp) prints out fine, but the second (built from temp2) prints out an empty list, despite the methods taking identical parameters :( Note that the inside of the for loop doesn't modify any of the items or anything. Any help or explanation would be great!
For a sample XML file, I have this:
<top></top>
For a sample output, I'm getting:
Generate_hash_2:["XPath:/*[name()='top'][1], Node Type:UnknownNode, Tag:top, Text:"]
Generate_hash_3:[]
If you are interested in helping me understand this further, I've been writing these methods to get an understanding of the files in here: https://github.com/mmoosstt/XmlXdiff/tree/master/lib/diffx , specifically the differ.py file
The code in that file constantly calls the _gen_dx_nodes() method (with the same parameters), which is a method that creates a generator. But the code's generator never "ends" and forces the writer to do something to reset it. So I'm confused why this happens to me (because I've been running into my problem when calling that method from different methods in succession). I've also been using the same test cases so I'm pretty lost here on how to fix this. Any help would be great!
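A minimal sketch of the behavior described above, using a hypothetical keep_even generator function rather than the asker's generate_nodes. It shows that each call to a generator function returns a fresh generator. It also shows one possible way to get an empty second result, which is an assumption and not confirmed by the code shown: passing in an input that is itself a one-shot iterator.

```python
from itertools import tee


def keep_even(numbers):
    # Hypothetical generator function, standing in for generate_nodes.
    for n in numbers:
        if n % 2 == 0:
            yield n


data = [1, 2, 3, 4]

# Each call creates a new, independent generator over the same list.
first = list(keep_even(data))
second = list(keep_even(data))

# tee() splits one iterator into two that can be consumed separately.
a, b = tee(keep_even(data))
from_a = list(a)
from_b = list(b)

# If the input is itself a one-shot iterator, the first pass uses it up,
# so a second generator built on it has nothing left to yield.
one_shot = iter(data)
third = list(keep_even(one_shot))
fourth = list(keep_even(one_shot))
```

With a list as input, first and second are identical. With the shared iterator, third has the values and fourth is empty, even though the calls look the same.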

Function calls in a sequence

I am writing a program that must solve a task and the task has many points, so I made one function for each point.
In the main function, I am calling the functions (which all return a value) in the following way:
result = funcD(funcC(funcB(funcA(parameter))))
Is this way of chaining function calls right and optimal, or is there a better way?
First, as everyone else said, your implementation is totally valid, and separating it into multiple lines is a good idea to improve readability.
However, if there are more than 4 functions, there is a way to make your code simpler.
def chain_func(parameter, *functions):
    for func in functions:
        parameter = func(parameter)
    return parameter
This relies on the fact that in Python a function can be passed around like any other variable and called from another function.
To use it, simply call chain_func(parameter, funcA, funcB, funcC, funcD).
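A small runnable sketch of the idea, with hypothetical steps (add_one, double, square, to_text) standing in for funcA to funcD. The functools.reduce variant at the end is just another way to express the same left-to-right pipeline, not something the answer above proposes:

```python
from functools import reduce


def chain_func(parameter, *functions):
    # Feed the result of each function into the next one, left to right.
    for func in functions:
        parameter = func(parameter)
    return parameter


# Hypothetical steps, standing in for funcA..funcD from the question.
def add_one(x):
    return x + 1


def double(x):
    return x * 2


def square(x):
    return x * x


def to_text(x):
    return "result={}".format(x)


nested = to_text(square(double(add_one(3))))
chained = chain_func(3, add_one, double, square, to_text)

# The same left-to-right pipeline expressed with functools.reduce.
steps = [add_one, double, square, to_text]
reduced = reduce(lambda value, func: func(value), steps, 3)
```

All three spellings produce the same value. The chained forms list the functions in the order they run, which is the reverse of how they appear in the nested call.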
There's nothing really wrong with that way. You could improve readability by instead calling them like this:
resultA = funcA(parameter)
resultB = funcB(resultA)
resultC = funcC(resultB)
resultD = funcD(resultC)
But that's really just a matter of personal preference and style.
If what they do and what they return is fixed, then the dependency between them is fixed too, so you have no other way than to call them in this order. Otherwise, there is no way of telling without knowing exactly what they do.
Whether you pin a reference to the partial results:
result1 = funcA(parameter)
#...
result = funcD(result3)
or call them as you've presented in your question doesn't make a significant difference.

When to type-check a function's arguments?

I'm asking about situations where if a wrong type of argument is passed to the function, it could:
Blow up the whole thing.
Return unexpected results.
Return nothing.
For instance, the function below expects the argument name to be a string. It would throw an exception for any other type that doesn't have a startswith method.
def fruits(name):
    if name.startswith('O'):
        print('Is it Orange?')
There are other cases where a function could halt or cause damage to the system if execution proceeds without type-checking. Whenever there are a lot of functions or functions with a lot of arguments, type checking is tedious and makes the code unreadable. So, is there a standard for doing this? As to 'how to type check' - there are plenty of examples here on stackexchange, but I couldn't find any about where it would be appropriate to do so.
Another example would be:
def fruits(names):
    with open('important_file.txt', 'r+') as fil:
        for name in names:
            if name in fil:
                pass  # Edit the file
Here if the name is a string each character in it will influence the editing of the file. If it is any other iterable, each element provided by it would influence the editing. Both of these could produce different results.
So, when should we type-check an argument, and when should we not?
The answer off the top of my head would be: it depends where the input comes from.
If the functions are class methods that get invoked internally or things like that, you can assume the inputs are valid, because you wrote them!
For example
def add(x, y):
    return x + y
def multiply(a, b):
    product = 0
    for i in range(a):
        product = add(product, b)
    return product
In my add function, I could check that there is a + operator for the parameters x and y. But since I wrote the multiply function, and that is the only function that uses add, it is safe to assume the inputs will be int because that's how I wrote it. Now that argument stands on shaky ground for large code bases where you (hopefully) have shared code, so you can't be sure people don't misuse your functions. But that's why you comment them well to describe the correct use of said function.
If it has to read from a file, get user input, etc, then you may want to do some validation first.
I almost never do type checking in Python. In accordance with Pythonic philosophy, I assume that I and other programmers are adults capable of reading the code (or at least the documentation) and using it properly. I assume that we test our code before we let it destroy something important. After all, in most cases, if you do something wrong you'll just see an error, and Python's error messages are quite informative most of the time.
The only occasion when I sometimes check types is when I want my function to behave differently depending on the argument's type. But although I sometimes feel compelled to do this, I don't consider it a good practice.
Most often it happens when my function iterates over a list of strings and I fear (or want) I could get a single string passed into it by accident - this won't throw an error at once because unfortunately string is an iterable too.
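A minimal sketch of that last case, assuming Python 3. shout_all is a hypothetical helper, not something from the question; it shows the single type check that guards against a lone string being iterated character by character:

```python
def shout_all(names):
    # Hypothetical helper: upper-case every name in an iterable of strings.
    # A lone string is itself iterable (character by character), so wrap it
    # in a list to treat it as a single name instead.
    if isinstance(names, str):
        names = [names]
    return [name.upper() for name in names]


# Without the guard, a single string silently becomes a list of characters.
without_guard = [ch.upper() for ch in "kiwi"]

with_guard = shout_all("kiwi")
from_list = shout_all(["kiwi", "plum"])
```

Without the check, no error is raised and the function simply returns the wrong thing, which is why this is one of the few places where an explicit isinstance test can be worth it.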

What's the pythonic way of conditional variable initialization?

Due to the scoping rules of Python, all variables once initialized within a scope are available thereafter. Since conditionals do not introduce new scope, constructs in other languages (such as initializing a variable before that condition) aren't necessarily needed. For example, we might have:
def foo(optionalvar=None):
    # some processing, resulting in...
    message = get_message()
    if optionalvar is not None:
        # some other processing, resulting in...
        message = get_other_message()
    # ... rest of function that uses message
or, we could have instead:
def foo(optionalvar=None):
    if optionalvar is None:
        # processing, resulting in...
        message = get_message()
    else:
        # other processing, resulting in...
        message = get_other_message()
    # ... rest of function that uses message
Of course, the get_message and get_other_message functions might be many lines of code and are basically irrelevant (you can assume that the state of the program after each path is the same); the goal here is making message ready for use beyond this section of the function.
I've seen the latter construct used several times in other questions, such as:
https://stackoverflow.com/a/6402327/18097
https://stackoverflow.com/a/7382688/18097
Which construct would be more acceptable?
Python also has a conditional expression, which you can use here:
message = get_other_message() if optionalvar else get_message()
Or, if you want to compare strictly with None:
message = get_other_message() if optionalvar is not None else get_message()
Unlike the first example you posted, this doesn't call get_message() unnecessarily.
In general, the second approach is better and more generic because it doesn't involve calling get_message unconditionally. That may be okay if the function is not resource-intensive, but consider a search function:
def search(engine):
    results = get_from_google()
    if engine == 'bing':
        results = get_from_bing()
Obviously this is not good. I can't think of such a bad scenario for the second case, so an approach that goes through all the options and finally falls back to the default is best, e.g.
def search(engine):
    if engine == 'bing':
        results = get_from_bing()
    else:
        results = get_from_google()
I think it's more pythonic to not set an explicit rule about this, and instead just keep to the idea that smallish functions are better (in part because it's possible to keep in your mind just when new names are introduced).
I suppose though that if your conditional tests get much more complicated than an if/else you may run the risk of all of them failing and you later using an undefined name, resulting in a possible runtime error, unless you are very careful. That might be an argument for the first style, when it's possible.
The answer depends on if there are side effects of get_message() which are wanted.
In most cases clearly the second one wins, because the code which produces the unwanted result is not executed. But if you need the side effects, you should choose the first version.
It might be better (read: safer) to initialize your variable outside the conditions. If you later add or remove conditions, code that uses message further down might hit an unassigned variable (inside a function, Python raises UnboundLocalError in that case).
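A minimal sketch of that risk, using hypothetical functions pick_message and pick_message_safe. The first leaves message unbound when no branch matches; the second assigns a default before the conditions:

```python
def pick_message(kind):
    # Hypothetical example: no branch assigns `message` for unknown kinds.
    if kind == "short":
        message = "hi"
    elif kind == "long":
        message = "hello there"
    return message


def pick_message_safe(kind):
    # Assigning a default first guarantees the name is always bound.
    message = "default"
    if kind == "short":
        message = "hi"
    elif kind == "long":
        message = "hello there"
    return message


try:
    pick_message("other")
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"

safe = pick_message_safe("other")
```

The failure only shows up at run time and only for inputs that miss every branch. A trailing else that assigns or raises gives the same safety without the up-front assignment.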

function in python

In a function, I need to perform some logic that requires me to call a function inside a function. This is what I did:
def dfs(problem):
    stack.push(bache)
    search(root)
    while stack.isEmpty() != 0:
    def search(vertex):
        closed.add(vertex)
        for index in sars:
            stack.push(index)
            return stack
In the function dfs, I am calling search(root). Is this the correct way to do it?
I am getting an error: local variable 'search' referenced before assignment
There are many mysterious bug-looking aspects in your code. The wrong order of definition (assuming you do need the search function to be a nested one) and the syntax error from the empty while loop have already been observed, but there are more...:
def dfs(problem):
    stack.push(bache)
    search(root)
what's bache, what's stack, what's root? If they're all global variables, then you're overusing globals -- and apparently nowhere ever using the argument problem (?!).
    while stack.isEmpty() != 0:
what's this weird-looking method isEmpty? IOW, what type is stack (clearly not a Python list, and that's weird enough, since they do make excellent LIFO stacks;-)...? And what's ever going to make it empty...?
    def search(vertex):
        closed.add(vertex)
...don't tell me: closed is yet another global? Presumably a set? (I remember from a few of your Qs back that you absolutely wanted to have a closed dict, not set, even though I suggested that as a possibility...)
        for index in sars:
...and what's sars?!
            stack.push(index)
            return stack
what a weird "loop" -- one that executes exactly once, altering a global variable, and then immediately returns that global variable (?) without doing any of the other steps through the loop. Even if this is exactly what you mean (push the first item of sars, period) I don't recommend hiding it in a pseudo-loop -- it seriously looks like a mystery bug just waiting to happen;-).
You need to de-indent your search function. The way you have it set up right now you are defining your search function as a part of the completion of your dfs call. Also, encapsulation in a class would help.
That's the wrong order. Try this:
def dfs(problem):
    def search(vertex):
        closed.add(vertex)
        for index in sars:
            stack.push(index)
            return stack
    stack.push(bache)
    search(root)
    while stack.isEmpty() != 0:
        pass  # loop body was missing in the question
Either define search before you call it, or define it outside of dfs.
You have to define the function before using it.
root doesn't seem to be available in your scope, so make sure it's reachable.
You don't have a body for your while loop. That is probably causing problems parsing the code. I would also suggest putting the local function definition before it is used, so it is easier to follow.
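Putting the points above together, here is one possible sketch of an iterative depth-first search with the nested helper defined before it is called. It is not the asker's intended algorithm: the graph argument (a dict mapping each vertex to a list of neighbours) and the order list are assumptions made for illustration, and a plain list replaces the unknown stack type:

```python
def dfs(graph, root):
    # Sketch only: `graph` is a hypothetical dict mapping each vertex to a
    # list of its neighbours; it is not part of the original question.
    closed = set()   # vertices already visited
    order = []       # visit order, returned to the caller
    stack = [root]   # a plain list works as a LIFO stack

    # The nested helper is defined before the loop that calls it.
    def search(vertex):
        closed.add(vertex)
        order.append(vertex)
        # Push neighbours in reverse so the first listed is explored first.
        for neighbour in reversed(graph.get(vertex, [])):
            if neighbour not in closed:
                stack.append(neighbour)

    while stack:  # an empty list is falsy, so this stops when nothing is left
        vertex = stack.pop()
        if vertex not in closed:
            search(vertex)
    return order


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
visited = dfs(graph, "a")
```

Everything the helper touches (closed, order, stack) lives inside dfs, so no globals are needed, and the while loop has a body that eventually empties the stack.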
