Is iteration better, or are lambda functions better, with respect to processing time, memory usage, and other factors?
For example:
x = [10, 20, 30]
for y in x:
    if y>10 or y<20:
        print(y)
Is this better, or is a lambda function better?
I want the answer with respect to processing time, memory usage, or any other comparisons.
Iterators and lambdas are two completely different things. A lambda is a simple inline function and an iterator is an object which returns successive objects. There are two major problems with your example: You are testing x instead of y and all values in x will pass for y>10 or y<20. So, correcting those, your example could be written using an iterator and a lambda like this:
for value in filter(lambda y: y < 10 or y > 20, x):
    print(value)
There are several ways you could do this, but in terms of performance it depends on what data you're processing, how you're processing it and how much of it you're processing. See http://wiki.python.org/moin/PythonSpeed/PerformanceTips for a useful guide.
For your case the classic loop is obviously better since you don't want to create a new list or generator.
Not creating such an object makes it more memory-efficient and not calling a function for each element makes it more performant.
I find the list comprehension notation much easier to read than the functional notation, especially as the complexity of the expression to be mapped increases. In addition, the list comprehension executes much faster than the solution using map and lambda. This is because calling a lambda function creates a new stack frame while the expression in the list comprehension is evaluated without creating a new stack frame. >> http://python-history.blogspot.com/2010/06/from-list-comprehensions-to-generator.html
In other words, whenever you have a choice between a lambda and a loop/comprehension/generator, use the latter. I guess the most Pythonic way to write your example would be something like
print([y for y in x if y < 20])
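If you want to check the performance claims above on your own data, a rough sketch with the standard timeit module could look like the following. The helper names (with_loop, with_filter_lambda, with_comprehension) and the input size are made up for illustration, and each helper builds a list rather than printing so the three are comparable. Timings depend on the machine and the Python version, so treat the numbers as indicative only.

```python
import timeit

x = list(range(1000))

def with_loop(values):
    # Plain loop: the condition is evaluated inline, no call per element.
    out = []
    for y in values:
        if y < 10 or y > 20:
            out.append(y)
    return out

def with_filter_lambda(values):
    # filter + lambda: one Python-level function call per element.
    return list(filter(lambda y: y < 10 or y > 20, values))

def with_comprehension(values):
    # List comprehension: the condition is evaluated inline as well.
    return [y for y in values if y < 10 or y > 20]

# All three approaches select the same values.
assert with_loop(x) == with_filter_lambda(x) == with_comprehension(x)

for fn in (with_loop, with_filter_lambda, with_comprehension):
    seconds = timeit.timeit(lambda: fn(x), number=1000)
    print(f"{fn.__name__}: {seconds:.4f}s")
```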
I'm messing with python, and I'm looking for a way to replicate a for loop in a lambda.
Basically, I'd like to convert the function below to a lambda doing the same thing :
def basicForLoop(x):
    for i in range(x):
        print(i)
basicForLoop(100)
For now, I've managed to do it by using recursion and incrementing the value on each recursive call:
(lambda f: lambda x: f(f,0, x))(lambda f,current,max: print(current) or f(f, current+1, max) if current <= max else None)(100)
This works rather well, but it hits the maximum recursion depth as soon as the number gets too big, so I'm looking for a way to rearrange this lambda so that it can be used without worrying about the recursion depth, making it truly equivalent to the original function.
EDIT: I'm looking for a way to do this while keeping the loop logic directly inside the lambda. Delegating the loop to another function like map, join, ... isn't what I'm looking for.
PS. I know very well that this is an abomination that should never be used in reality but I'm just curious about it.
I'm pretty sure this is impossible.
So I'm assuming you want to keep pretty much all of the logic handled by your lambdas, not even using a range. If that's the case, you're not going to get a stack-safe solution in Python. In other languages you could use something called "Tail Recursion", which allows the interpreter/compiler to collapse some recursive calls down to a single stack frame, but Python does not support that.
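To see the lack of tail-call elimination in practice, here is a small sketch. It keeps the same shape as the recursive lambda from the question but appends to a list instead of printing; make_loop and the variable names are hypothetical, introduced only for this illustration.

```python
import sys

def make_loop(out):
    # Same self-passing recursion as in the question: the lambda receives
    # itself as `f` and calls itself once per iteration.
    step = lambda f, current, limit: (
        out.append(current) or f(f, current + 1, limit)
        if current <= limit else None
    )
    return lambda limit: step(step, 0, limit)

small = []
make_loop(small)(5)
print(small)  # [0, 1, 2, 3, 4, 5]

# Each iteration costs one stack frame, so a large bound exhausts the stack.
big = []
try:
    make_loop(big)(sys.getrecursionlimit() * 2)
    hit_limit = False
except RecursionError:
    hit_limit = True
print(hit_limit)
```

Raising the limit with sys.setrecursionlimit only moves the ceiling; it does not make the recursion stack-safe.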
I don't think you can make this use fewer stack frames, either. Rewriting and re-formatting a bit, and adding explicit names and more print statements:
buildRecursive = (lambda g:
    print("Running 1st") or
    (lambda x:
        print("Running 2nd") or
        g(g, 0, x))
)
entry = buildRecursive(lambda f, current, max:
    print("Running 3rd") or
    print(current) or f(f, current+1, max) if current <= max else None)
entry(100)
This should be equivalent to what you have. This has print statements as the first operation of every call, and you can see that you're only running the 3rd one repeatedly. Essentially, you're generating as few stack frames per iteration as possible, given the constraints as I understand them.
As an aside, after some reading I understand why you're doing the or thing, but coming from other languages, that is downright hideous. It might be the Python way of doing things, but it's a pretty awful way of sequencing operations, especially because of short-circuiting logic: if you try to chain operations where the first one doesn't return None (or another falsy value), the second one never runs and your code will mysteriously break. I would suggest using a tuple instead - (firstOp, secondOp) - but I think that will have memory or performance implications as Python will actually build the resulting value.
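The short-circuiting problem is easy to reproduce. In this sketch, first and second are hypothetical stand-ins for two operations you want to run in sequence; first returns a truthy value, unlike print, which returns None.

```python
calls = []

def first():
    calls.append("first")
    return 1  # truthy, unlike print(), which returns None

def second():
    calls.append("second")
    return 2

# Chaining with `or` short-circuits: second() is never called.
result_or = first() or second()
calls_after_or = list(calls)

calls.clear()

# A tuple evaluates every element left to right; [-1] picks the last value.
result_tuple = (first(), second())[-1]
calls_after_tuple = list(calls)

print(result_or, calls_after_or)
print(result_tuple, calls_after_tuple)
```

The tuple form always runs both operations, at the cost of building a small temporary tuple.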
You might define your own infix operator which will evaluate both left and right operands in order and return the second (or the first... can you feel functional programming calling yet?). However in Python I think this will result in additional stack frames, as the operator will produce its own totally trivial stack frame.
Have you explored languages other than Python? If not I'd say it's time.
You could do something like that:
x = lambda x: print("\n".join(map(str, list(range(1,x+1)))))
x(100)
Edit:
You can do it like that:
x = lambda x: print(*range(1,x+1), sep='\n')
x(100)
Think about a function that I'm calling for its side effects, not return values (like printing to screen, updating GUI, printing to a file, etc.).
def fun_with_side_effects(x):
    ...side effects...
    return y
Now, is it Pythonic to use list comprehensions to call this func:
[fun_with_side_effects(x) for x in y if (...conditions...)]
Note that I don't save the list anywhere
Or should I call this func like this:
for x in y:
    if (...conditions...):
        fun_with_side_effects(x)
Which is better and why?
It is very anti-Pythonic to do so, and any seasoned Pythonista will give you hell over it. The intermediate list is thrown away after it is created, and it could potentially be very, very large, and therefore expensive to create.
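To get a feel for the cost of the throwaway list, you can compare the container sizes with sys.getsizeof. This is only a rough sketch: the function body is a stand-in, the input size is arbitrary, and getsizeof measures the list object itself, not whatever the elements reference.

```python
import sys

def fun_with_side_effects(x):
    # Stand-in: a real function would print, write to a file, etc.
    return None

items = range(100_000)

# The comprehension calls the function and stores every return value.
throwaway_list = [fun_with_side_effects(x) for x in items]

# The generator expression stores nothing (and has not called anything yet).
lazy_generator = (fun_with_side_effects(x) for x in items)

list_bytes = sys.getsizeof(throwaway_list)
generator_bytes = sys.getsizeof(lazy_generator)
print(list_bytes, generator_bytes)
```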
You shouldn't use a list comprehension, because as people have said that will build a large temporary list that you don't need. The following two methods are equivalent:
consume(side_effects(x) for x in xs)
for x in xs:
    side_effects(x)
with the definition of consume from the recipes section of the itertools documentation:
import collections
from itertools import islice

def consume(iterator, n=None):
    "Advance the iterator n-steps ahead. If n is None, consume entirely."
    # Use functions that consume iterators at C speed.
    if n is None:
        # feed the entire iterator into a zero-length deque
        collections.deque(iterator, maxlen=0)
    else:
        # advance to the empty slice starting at position n
        next(islice(iterator, n, n), None)
Of course, the latter is clearer and easier to understand.
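As a quick check that the two forms really do the same work, the sketch below uses the zero-length deque directly (the n=None branch of consume) and compares it against the plain loop. The log list and the side_effects body are hypothetical stand-ins.

```python
import collections

log = []

def side_effects(x):
    # Stand-in side effect: record that we were called.
    log.append(x)

# Drain a generator expression without keeping any results.
collections.deque((side_effects(x) for x in range(5)), maxlen=0)
via_deque = list(log)

log.clear()

# The explicit loop performs the same calls in the same order.
for x in range(5):
    side_effects(x)
via_loop = list(log)

print(via_deque == via_loop)
```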
List comprehensions are for creating lists. And unless you are actually creating a list, you should not use list comprehensions.
So I would go for the second option: just iterate over the list and call the function when the conditions apply.
Second is better.
Think of the person who would need to understand your code. You can get bad karma easily with the first :)
You could take a middle path between the two by using filter(). Consider the example:
y = [1, 2, 3, 4, 5, 6]
def func(x):
    print("call with %r" % x)
for x in filter(lambda x: x > 3, y):
    func(x)
Depends on your goal.
If you are trying to do some operation on each object in a list, the second approach should be adopted.
If you are trying to generate a list from another list, you may use list comprehension.
Explicit is better than implicit.
Simple is better than complex. (Python Zen)
You can do
for z in (fun_with_side_effects(x) for x in y if (...conditions...)): pass
but it's not very pretty.
Using a list comprehension for its side effects is ugly, non-Pythonic, inefficient, and I wouldn't do it. I would use a for loop instead, because a for loop signals a procedural style in which side-effects are important.
But, if you absolutely insist on using a list comprehension for its side effects, you should avoid the inefficiency by using a generator expression instead. If you absolutely insist on this style, do one of these two:
any(fun_with_side_effects(x) and False for x in y if (...conditions...))
or:
all(fun_with_side_effects(x) or True for x in y if (...conditions...))
These are generator expressions, and they do not generate a random list that gets tossed out. I think the all form is perhaps slightly more clear, though I think both of them are confusing and shouldn't be used.
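To show why the "and False" / "or True" guards matter, here is a small sketch. The log list, the sample data, and the condition are made up for illustration; the function returns its argument so that some return values are truthy and one (0) is falsy.

```python
log = []

def fun_with_side_effects(x):
    log.append(x)
    return x  # may be truthy or falsy (0 is falsy)

y = [0, 1, 2, 3]

# `and False` makes every element falsy, so any() never stops early.
any(fun_with_side_effects(x) and False for x in y if x != 2)
log_any = list(log)
log.clear()

# `or True` makes every element truthy, so all() never stops early.
all(fun_with_side_effects(x) or True for x in y if x != 2)
log_all = list(log)
log.clear()

# Without the guard, any() stops at the first truthy return value.
any(fun_with_side_effects(x) for x in y)
log_unguarded = list(log)

print(log_any, log_all, log_unguarded)
```

The unguarded call silently skips the remaining elements, which is one more reason to prefer the plain for loop.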
I think this is ugly and I wouldn't actually do it in code. But if you insist on implementing your loops in this fashion, that's how I would do it.
I tend to feel that list comprehensions and their ilk should signal an attempt to use something at least faintly resembling a functional style. Putting things with side effects that break that assumption will cause people to have to read your code more carefully, and I think that's a bad thing.
I have a list of objects and they have a method called process. In Python 2 one could do this
map(lambda x: x.process(), my_object_list)
In Python 3 this will not work because map doesn't call the function until the iterable is traversed. One could do this:
list(map(lambda x: x.process(), my_object_list))
But then you waste memory with a throwaway list (an issue if the list is big). I could also use a 2-line explicit loop. But this pattern is so common for me that I don't want to, or think I should need to, write a loop every time.
Is there a more idiomatic way to do this in Python 3?
Don't use map or a list comprehension where a simple for loop will do:
for x in list_of_objs:
    x.process()
It's not significantly longer than any function you might use to abstract it, but it is significantly clearer.
Of course, if process returns a useful value, then by all means, use a list comprehension.
results = [x.process() for x in list_of_objs]
or map:
results = list(map(lambda x: x.process(), list_of_objs))
There is a function available that makes map a little less clunky, especially if you would reuse the caller:
from operator import methodcaller
processor = methodcaller('process')
results = list(map(processor, list_of_objs))
more_results = list(map(processor, another_list_of_objs))
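For a runnable version of the methodcaller idea, here is a minimal sketch. The Task class is hypothetical and exists only to have something with a process method; the second call shows that methodcaller can also carry arguments for the method.

```python
from operator import methodcaller

class Task:
    """Hypothetical object with a process() method that returns a value."""
    def __init__(self, name):
        self.name = name

    def process(self):
        return self.name.upper()

processor = methodcaller('process')
results = list(map(processor, [Task("a"), Task("b")]))
print(results)

# Extra arguments are passed along to the method on each call.
padded = list(map(methodcaller('ljust', 3, '.'), ["x", "yz"]))
print(padded)
```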
If you are looking for a good name for a function to wrap the loop, Haskell has a nice convention: a function name ending with an underscore discards its "return value". (Actually, it discards the result of a monadic action, but I'd rather ignore that distinction for the purposes of this answer.)
def map_(f, *args):
    for f_args in zip(*args):
        f(*f_args)
# Compare:
map(f, [1,2,3]) # -- return value of [f(1), f(2), f(3)] is ignored
map_(f, [1,2,3]) # list of return values is never built
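A short usage sketch of map_ with more than one iterable follows; the definition is repeated so the snippet runs on its own, and the seen list is just an illustrative way to observe the side effects.

```python
def map_(f, *args):
    # Call f on each group of arguments, discarding the return values.
    for f_args in zip(*args):
        f(*f_args)

seen = []

# With two iterables, f receives one element from each per call.
map_(lambda a, b: seen.append(a + b), [1, 2, 3], [10, 20, 30])
print(seen)

# map_ itself returns nothing, so there is no list to throw away.
returned = map_(print, [])
print(returned)
```

Like zip, this stops at the shortest input when the iterables differ in length.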
Since you're looking for a Pythonic solution, why would you even bother trying to adapt map(lambda x: x.process(), my_object_list) for Python 3?
Isn't a simple for loop enough?
for x in my_object_list:
    x.process()
I mean, this is concise, readable, and avoids creating an unnecessary list if you don't need return values.
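The laziness of map in Python 3 that the question describes can be checked directly. In this sketch the Item class and the calls list are hypothetical; process just records that it ran.

```python
calls = []

class Item:
    """Hypothetical object whose process() has a visible side effect."""
    def __init__(self, n):
        self.n = n

    def process(self):
        calls.append(self.n)

my_object_list = [Item(1), Item(2), Item(3)]

# In Python 3, map returns a lazy iterator: nothing has been called yet.
lazy = map(lambda x: x.process(), my_object_list)
calls_before = list(calls)

# The calls only happen once the iterator is traversed.
for _ in lazy:
    pass
calls_after = list(calls)

print(calls_before, calls_after)
```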
Is there a faster/smarter way to perform operations on every element of a numpy array? What I specifically have is a list of datetime objects like, e.g.:
hh = np.array( [ dt.date(2000, 1, 1), dt.date(2001, 1, 1) ] )
To get a list of years from that, this is what I do at the moment:
years = np.array( [ x.year for x in hh ] )
Is there a smarter way to do this? I'm thinking something like
hh.year
which obviously doesn't work.
I have a script in which I constantly need different variations of a (much longer) array (year, month, hours...). Of course I could always just define a separate array for everything, but it feels like there should be a more elegant solution.
If you evaluate a Python expression for each element, it doesn't matter whether the iteration is done in C++ or Python. What carries weight is the cost of the Python expression evaluated inside the loop. This means: even if your in-loop expression takes only 1 microsecond (a very simple expression), that will still weigh significantly more than the difference between a Python iteration and a C++ iteration (there is "marshalling" between C++ and PyObjects, and that applies to Python functions as well).
For that reason, calling vectorize is, under the hood, done in Python: what is called inside is Python code. The idea behind vectorize is not performance, but code readability and ease of iteration: vectorize performs introspection (of the function's parameters) and serves well for N-dimensional iterations (i.e. a lambda x,y: x+y automagically serves to iterate in two dimensions).
So: no, there's no "fast" way to iterate python code. The final speed that matters is the speed of your inner python code.
Edit: your desired hh.year looks like the equivalent of hh*.year in Groovy, but even there it is, under the hood, the same as an explicit iteration. Comprehensions are the fastest (and equivalent) way in Python. The real pity is being forced to write:
years = np.array( [ x.year for x in hh ] )
(which forces you to create another, probably huge, list) instead of letting you use any type of iterator:
years = np.array( x.year for x in hh )
Edit (suggestion by @Jaime): You can't construct an array from an iterator with np.array. For that, you must use:
np.fromiter((x.year for x in hh), dtype=int, count=len(hh))
which saves you the time and memory of building an intermediate list. Passing count like this works for any sequence whose length is known up front (which is your case) and avoids creating the inner list, but it does not carry over to generators of unknown length, which you may need in future cases.
You can use numpy.vectorize.
Doing some benchmarking, performance is pretty similar (vectorize slightly slower than a list comprehension), and in my opinion numpy.vectorize(lambda j: j.year)(hh) (or something similar) doesn't look super elegant.
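For reference, here are the three variants discussed above side by side on the two-element array from the question. It is only a sketch to confirm they agree; it assumes NumPy is installed and imported as np and datetime as dt, as in the question, and it makes no claim about relative speed.

```python
import datetime as dt
import numpy as np

hh = np.array([dt.date(2000, 1, 1), dt.date(2001, 1, 1)])

# 1. List comprehension: builds an intermediate Python list first.
years_list = np.array([x.year for x in hh])

# 2. np.fromiter: fills the array straight from a generator expression.
#    count is optional; it lets NumPy allocate the result once.
years_iter = np.fromiter((x.year for x in hh), dtype=int, count=len(hh))

# 3. np.vectorize: convenient, but still a Python-level call per element.
years_vec = np.vectorize(lambda d: d.year)(hh)

print(years_list, years_iter, years_vec)
```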