This question already has answers here:
Python 3 Map function is not Calling up function
(3 answers)
Closed 2 years ago.
I have a function that takes an object as an argument and calls a method of that object. The method simply prints something out and doesn't return anything.
def func(someobject):
    someobject.method()
Now I have a list of objects that I want to pass to the "func" function. I already tried using the map function like this:
list = [object1, object2, object3]
map(func, list)
However, it only works when I do:
tuple(map(func, list))
Later I want the method to communicate with an API, so my goal is to use multiprocessing to speed up the whole process, but I can't even get it right in the simple case.
Excuse me if I made a rookie mistake; I'm quite new to Python and programming in general.
map returns an iterator, so it has to be evaluated to actually get the values out of it. Wrapping it in tuple or list is a common way to do so.
It does this for efficiency: you can map over a massive structure while consuming only one element from the iterator at a time, if you want, e.g.
bar = map(f, foo)
for x in bar:
    baz(x)
You can also feed one generator into another to create more efficient pipelines.
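For example, here is a sketch of a small lazy pipeline (the data and functions are made up for illustration):

records = ['  1', '2 ', ' 3']
cleaned = map(str.strip, records)               # lazy: nothing has run yet
numbers = map(int, cleaned)                     # chained onto the first iterator
evens = filter(lambda n: n % 2 == 0, numbers)   # still lazy
print(list(evens))                              # [2] -- everything runs here, one element at a time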
map works by itself; my guess is that you are confused because it returns an iterator instead of an actual list or tuple. This has been the default behavior since Python 3, so if you are following an old tutorial it can look like a discrepancy in behavior.
FWIW, iterators are, in general, better if your dataset is huge and you don't want to load all items in memory at the same time. You can already use map without the explicit conversion like so:
for spam in map(func, _lst):
    print(spam)
gen = map(func, liste)
gives you an iterator. You can access the individual elements by
el = next(gen)
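One caveat worth adding: once the iterator is exhausted, next raises StopIteration unless you give it a default. A short sketch reusing the names above:

gen = map(func, liste)
el = next(gen)        # first element
el = next(gen, None)  # next element, or None once the iterator is exhausted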
"however it only works when i do"
map() works every time. If you print its bare result, you will see this:
print(map(func, list)) # <map object at 0x7f6c49ea4880>
The result is an iterator (a map object), so you have to convert it into something like a tuple or list to force evaluation.
Related
Hi, I'm trying to wrap my head around the concept of generators in Python, specifically while using spaCy.
As far as I understand, a generator runs only once, and nlp.pipe(list) returns a generator to use the machine efficiently.
And the generator worked as I predicted, like below:
matches = ['one', 'two', 'three']
docs = nlp.pipe(matches)
type(docs)
for x in docs:
    print(x)
# First iteration, worked
one
two
three
for x in docs:
    print(x)
# Nothing is printed this time
But a strange thing happened when I tried to make a list using the generator:
for things in nlp.pipe(example1):
    print(things)
# First iteration prints things
a is something
b is other thing
c is new thing
d is extra
for things in nlp.pipe(example1):
    print(things)
# Second iteration prints things again!
a is something
b is other thing
c is new thing
d is extra
Why does this generator run indefinitely? I tried several times, and it seems like it never gets exhausted.
Thank you
I think you're confused because the term "generator" can be used to mean two different things in Python.
The first thing it can mean is a "generator object", which is a kind of iterator. The docs variable you created in your first example is a reference to one of these. A generator object can only be iterated once; after that it's exhausted, and you'll need to create another one if you want to iterate again.
The other thing "generator" can mean is a "generator function". A generator function is a function that returns a generator object when you call it. Indeed, the term "generator" is sometimes sloppily used for functions that return iterators generally, even when that's not technically correct. A real generator function is implemented using the yield keyword, but from the caller's perspective, it doesn't really matter how the function is implemented, just that it returns some kind of iterator.
I don't know anything about the library you're using, but it seems like nlp.pipe returns an iterator, so in the loosest sense it can be called a generator function. The iterator it returns is (presumably) the generator object.
Generator objects are single-use, like all iterators are supposed to be. Generator functions on the other hand, can be called as many times as you find appropriate (some might have side effects). Each time you call the generator function, you'll get a new generator object. This is why your second code block works, as you're calling nlp.pipe once for each loop, rather than iterating on the same iterator for both loops.
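A minimal sketch of the distinction, using a made-up generator function (count_up is just an illustration, not part of spaCy):

def count_up(n):
    # A generator function: each call returns a fresh generator object.
    for i in range(n):
        yield i

gen = count_up(3)
print(list(gen))          # [0, 1, 2]
print(list(gen))          # [] -- this generator object is exhausted
print(list(count_up(3)))  # [0, 1, 2] -- a new call creates a new object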
for things in nlp.pipe(example1) creates a new instance of the nlp.pipe() generator (i.e. an iterator).
If you had assigned the generator to a variable and used the variable multiple times, then you would have seen the effect that you were expecting:
pipeGen = nlp.pipe(example1)
for things in pipeGen:
    print(things)
# First iteration will print things
for things in pipeGen:
    print(things)
# Second iteration will print nothing
In other words nlp.pipe() returns a NEW iterator whereas pipeGen IS an iterator.
I have a list of objects and they have a method called process. In Python 2 one could do this
map(lambda x: x.process(), my_object_list)
In Python 3 this will not work because map doesn't call the function until the iterable is traversed. One could do this:
list(map(lambda x: x.process(), my_object_list))
But then you waste memory with a throwaway list (an issue if the list is big). I could also use a 2-line explicit loop. But this pattern is so common for me that I don't want to, or think I should need to, write a loop every time.
Is there a more idiomatic way to do this in Python 3?
Don't use map or a list comprehension where a simple for loop will do:
for x in list_of_objs:
    x.process()
It's not significantly longer than any function you might use to abstract it, but it is significantly clearer.
Of course, if process returns a useful value, then by all means, use a list comprehension.
results = [x.process() for x in list_of_objs]
or map:
results = list(map(lambda x: x.process(), list_of_objs))
There is a function available that makes map a little less clunky, especially if you want to reuse the caller:
from operator import methodcaller
processor = methodcaller('process')
results = list(map(processor, list_of_objs))
more_results = list(map(processor, another_list_of_objs))
If you are looking for a good name for a function to wrap the loop, Haskell has a nice convention: a function name ending with an underscore discards its "return value". (Actually, it discards the result of a monadic action, but I'd rather ignore that distinction for the purposes of this answer.)
def map_(f, *args):
    for f_args in zip(*args):
        f(*f_args)
# Compare:
list(map(f, [1, 2, 3]))  # -- the list [f(1), f(2), f(3)] is built, then thrown away
map_(f, [1, 2, 3])       # -- the list of return values is never built
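A quick usage sketch of map_, with a made-up side-effecting function:

def report(x):
    print('processing', x)

map_(report, [1, 2, 3])  # prints three lines; no list of results is built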
Since you're looking for a Pythonic solution, why even bother trying to adapt map(lambda x: x.process(), my_object_list) for Python 3? Isn't a simple for loop enough?
for x in my_object_list:
    x.process()
I mean, this is concise and readable, and it avoids creating an unnecessary list if you don't need the return values.
If I try to modify the 'board' list in-place in the way below, it doesn't work; it seems like it generates some new 'board' instead of modifying it in place.
def func(self, board):
    """
    :type board: List[List[str]]
    """
    board = [['A' for j in range(len(board[0]))] for i in range(len(board))]
    return
I have to do something like this to modify it in place. What's the reason? Thanks.
for i in range(len(board)):
    for j in range(len(board[0])):
        board[i][j] = 'A'
You seem to understand the difference between these two cases, and want to know why Python makes you handle them differently?
"I have to do something like this to modify it in-place, what's the reason?"
Creating a new copy is something that has a value. So it makes sense for it to be an expression. In fact, list comprehensions would be useless if they weren't expressions.
Mutating a list in-place isn't something that has a value. So, there's no reason to make it an expression, and in fact, it would be weird to do so. Sure, you could come up with some kind of value (like, say, the list being mutated). But that would be at odds with everything else in the design of Python: spam.append(eggs) doesn't return spam, it returns nothing. spam = eggs doesn't have a value. And so on.
Secondarily, the comprehension style feeds very well into the iterable paradigm, which is fundamental to Python. For example, notice that you can turn a list comprehension into a generator comprehension (which gives you a lazy iterator over values that are computed on demand) just by changing the […] to (…). What useful equivalent could there be for mutation?
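A quick illustration with a made-up computation:

squares_list = [x * x for x in range(1000)]  # builds the full list immediately
squares_gen = (x * x for x in range(1000))   # lazy; values are computed on demand
print(sum(squares_gen))                      # consumes the generator one value at a time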
Making the transforming-copy more convenient also encourages people to use a non-mutating style, which often leads to better answers for many problems. When you want to know how to avoid writing three lines of nested statement to mutate some global, the answer is to stop mutating that global and instead pass in a parameter and return the new value.
Also, the syntax was copied from Haskell, where there is no mutation.
But of course all those "often" and "usually" don't mean "never". Sometimes (unless you're designing a language with no mutation), you need to do things in-place. That's why we have list.sort as well as sorted. (And a lot of work has gone into optimizing the hell out of list.sort; it's not just an afterthought.)
Python doesn't stop you from doing it. It just doesn't bend over quite as far to make it easy as it does for copying.
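The sorting pair mentioned above shows the split clearly:

nums = [3, 1, 2]
print(sorted(nums))  # [1, 2, 3] -- an expression: returns a new list
print(nums)          # [3, 1, 2] -- the original is untouched
print(nums.sort())   # None      -- mutates in place and returns nothing useful
print(nums)          # [1, 2, 3]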
That is not modifying it in place. The list comprehension syntax [x for y in z] creates a new list; the original list is not modified by this syntax. Making the name inside the function point to a new list won't change which list the name outside the function is pointing to.
In other words, when calling a function Python passes a reference to the object, not the name, so there is no easy way to change which object the variable name outside the function is referring to.
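A minimal sketch of the difference (the function names here are made up for illustration):

def rebind(board):
    board = [['A']]    # rebinds the local name; the caller's list is untouched

def mutate(board):
    board[0][0] = 'A'  # mutates the shared object; the caller sees the change

b = [['x']]
rebind(b)
print(b)  # [['x']]
mutate(b)
print(b)  # [['A']]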
I see that using list comprehension provides a very simple way to create new lists in Python.
However, if instead of creating a new list I just want to call a void function for each argument in a list without expecting any sort of return value, should I use a list comprehension or just a for loop to iterate? Does the simplicity of the code justify creating a new throwaway list for each set of iterations? Even if this added cost is negligible in small programs, does it make sense in large-scale programs/production?
Thanks!
List comprehensions are the wrong way if you don't actually need a list. Use this instead:
for i in seq:
    some_function(i)
This is both more efficient and more expressive than using:
[some_function(i) for i in seq]
Note that there is something similar that doesn't work (and in particular it's not a tuple comprehension):
(some_function(i) for i in seq)
because that only creates an iterator; nothing is called until it is iterated. If you would otherwise pass around a list that only gets iterated once, though, passing such an iterator around is a much better solution.
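You can see that laziness directly; a sketch using the names from above (some_function and seq stand in for your own function and sequence):

g = (some_function(i) for i in seq)  # no calls have happened yet
for _ in g:                          # the calls happen only as we iterate
    pass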
for x in lst: f(x)
looks about equally short (it's actually one character shorter) as
[f(x) for x in lst]
Or is that not what you were trying to do?
There are more possible solutions for calling a function on every member of a list:
numpy can vectorize functions
import numpy as np
def func(i):
    print(i)
v_func = np.vectorize(func)
v_func(['one', 'two', 'three'])
Python has a builtin map function that maps a function over every member of an iterable:
def func(i):
    print(i)

list(map(func, ['one', 'two', 'three']))  # in Python 3, map must be consumed for the calls to run
Are you asking if it is inefficient to create a list you don't need? Put that way, the answer should be obvious.
(To satisfy the answer police: yes, it is less efficient to create a list you don't need.)
This question already has answers here:
Is it Pythonic to use list comprehensions for just side effects?
(7 answers)
Closed 4 months ago.
What is the preferred way to tell someone "I want to apply func to each element in iterable for its side-effects"?
Option 1... clear, but two lines.
for element in iterable:
    func(element)
Option 2... even more lines, but could be clearer.
def walk_for_side_effects(iterable):
    for element in iterable:
        pass

walk_for_side_effects(map(func, iterable))  # Assuming Python 3's map.
Option 3... builds up a list, but this how I see everyone doing it.
[func(element) for element in iterable]
I'm liking Option 2; is there a function in the standard library that is already the equivalent?
Avoid the temptation to be clever. Use option 1; its intent is clear and unambiguous: you are applying the function func() to each and every element in the iterable.
Option 2 just confuses everyone, looking for what walk_for_side_effects is supposed to do (it certainly puzzled me until I realized you needed to iterate over map() in Python 3).
Option 3 should be used when you actually get results from func(), never for the side effects. Smack anyone doing that just for the side-effects. List comprehensions should be used to generate a list, not to do something else. You are instead making it harder to comprehend and maintain your code (and building a list for all the return values is slower to boot).
This has been asked many times, e.g., here and here. But it's an interesting question, though. List comprehensions are meant to be used for something else.
Other options include
use map() - basically the same as your sample
use filter() - if your function returns None, you will get an empty list
Just a plain for-loop
The plain loop is the preferable way to do it: it is semantically correct in this case, while all the other ways, including the list comprehension, abuse concepts for their side effects.
In Python 3.x, map() and filter() return lazy iterators and thus do nothing until you iterate over them. So we'd need, e.g., list(map(...)), which makes it even worse.
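If you really want a one-liner that runs the side effects without keeping results, the itertools documentation's "consume" recipe exhausts an iterator using collections.deque with maxlen=0; a sketch:

from collections import deque

deque(map(func, iterable), maxlen=0)  # runs func on every element, stores nothing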