perform multiple for loops simultaneously - python

Is it possible to perform multiple loops simultaneously in python.
Like(syntax error, of course):
for a,b in list_of_a,list_of_b:
//do some thing
By simultaneously, I am not meaning the thread or process sense.
I mean, they share the same index or cursor during the iteration.
What I can think of achieving that is:
Use a int variable to act as a shared cursor
put them in a list of tuples and iterate the tuple-list. But creating the list is laborious
I am just wondering if there some built-in functions or simpler syntax to achieve that.

for a,b in zip(list_of_a, list_of_b):
# Do some thing
If you're using Python 2.x, are worried about performance, and/or using iterators instead of lists, consider itertools.izip instead of zip.
In Python 3.x, zip replaces itertools.izip; use list(zip(..)) to get the old (2.x) behavior of zip returning a list.

import itertools
for a, b in itertools.izip(list_a, list_b):
# ...

Related

Renaming files with map function [duplicate]

Think about a function that I'm calling for its side effects, not return values (like printing to screen, updating GUI, printing to a file, etc.).
def fun_with_side_effects(x):
...side effects...
return y
Now, is it Pythonic to use list comprehensions to call this func:
[fun_with_side_effects(x) for x in y if (...conditions...)]
Note that I don't save the list anywhere
Or should I call this func like this:
for x in y:
if (...conditions...):
fun_with_side_effects(x)
Which is better and why?
It is very anti-Pythonic to do so, and any seasoned Pythonista will give you hell over it. The intermediate list is thrown away after it is created, and it could potentially be very, very large, and therefore expensive to create.
You shouldn't use a list comprehension, because as people have said that will build a large temporary list that you don't need. The following two methods are equivalent:
consume(side_effects(x) for x in xs)
for x in xs:
side_effects(x)
with the definition of consume from the itertools man page:
def consume(iterator, n=None):
"Advance the iterator n-steps ahead. If n is none, consume entirely."
# Use functions that consume iterators at C speed.
if n is None:
# feed the entire iterator into a zero-length deque
collections.deque(iterator, maxlen=0)
else:
# advance to the empty slice starting at position n
next(islice(iterator, n, n), None)
Of course, the latter is clearer and easier to understand.
List comprehensions are for creating lists. And unless you are actually creating a list, you should not use list comprehensions.
So I would got for the second option, just iterating over the list and then call the function when the conditions apply.
Second is better.
Think of the person who would need to understand your code. You can get bad karma easily with the first :)
You could go middle between the two by using filter(). Consider the example:
y=[1,2,3,4,5,6]
def func(x):
print "call with %r"%x
for x in filter(lambda x: x>3, y):
func(x)
Depends on your goal.
If you are trying to do some operation on each object in a list, the second approach should be adopted.
If you are trying to generate a list from another list, you may use list comprehension.
Explicit is better than implicit.
Simple is better than complex. (Python Zen)
You can do
for z in (fun_with_side_effects(x) for x in y if (...conditions...)): pass
but it's not very pretty.
Using a list comprehension for its side effects is ugly, non-Pythonic, inefficient, and I wouldn't do it. I would use a for loop instead, because a for loop signals a procedural style in which side-effects are important.
But, if you absolutely insist on using a list comprehension for its side effects, you should avoid the inefficiency by using a generator expression instead. If you absolutely insist on this style, do one of these two:
any(fun_with_side_effects(x) and False for x in y if (...conditions...))
or:
all(fun_with_side_effects(x) or True for x in y if (...conditions...))
These are generator expressions, and they do not generate a random list that gets tossed out. I think the all form is perhaps slightly more clear, though I think both of them are confusing and shouldn't be used.
I think this is ugly and I wouldn't actually do it in code. But if you insist on implementing your loops in this fashion, that's how I would do it.
I tend to feel that list comprehensions and their ilk should signal an attempt to use something at least faintly resembling a functional style. Putting things with side effects that break that assumption will cause people to have to read your code more carefully, and I think that's a bad thing.

python (3.5) list comprehension vs foor loop over void funcs [duplicate]

Think about a function that I'm calling for its side effects, not return values (like printing to screen, updating GUI, printing to a file, etc.).
def fun_with_side_effects(x):
...side effects...
return y
Now, is it Pythonic to use list comprehensions to call this func:
[fun_with_side_effects(x) for x in y if (...conditions...)]
Note that I don't save the list anywhere
Or should I call this func like this:
for x in y:
if (...conditions...):
fun_with_side_effects(x)
Which is better and why?
It is very anti-Pythonic to do so, and any seasoned Pythonista will give you hell over it. The intermediate list is thrown away after it is created, and it could potentially be very, very large, and therefore expensive to create.
You shouldn't use a list comprehension, because as people have said that will build a large temporary list that you don't need. The following two methods are equivalent:
consume(side_effects(x) for x in xs)
for x in xs:
side_effects(x)
with the definition of consume from the itertools man page:
def consume(iterator, n=None):
"Advance the iterator n-steps ahead. If n is none, consume entirely."
# Use functions that consume iterators at C speed.
if n is None:
# feed the entire iterator into a zero-length deque
collections.deque(iterator, maxlen=0)
else:
# advance to the empty slice starting at position n
next(islice(iterator, n, n), None)
Of course, the latter is clearer and easier to understand.
List comprehensions are for creating lists. And unless you are actually creating a list, you should not use list comprehensions.
So I would got for the second option, just iterating over the list and then call the function when the conditions apply.
Second is better.
Think of the person who would need to understand your code. You can get bad karma easily with the first :)
You could go middle between the two by using filter(). Consider the example:
y=[1,2,3,4,5,6]
def func(x):
print "call with %r"%x
for x in filter(lambda x: x>3, y):
func(x)
Depends on your goal.
If you are trying to do some operation on each object in a list, the second approach should be adopted.
If you are trying to generate a list from another list, you may use list comprehension.
Explicit is better than implicit.
Simple is better than complex. (Python Zen)
You can do
for z in (fun_with_side_effects(x) for x in y if (...conditions...)): pass
but it's not very pretty.
Using a list comprehension for its side effects is ugly, non-Pythonic, inefficient, and I wouldn't do it. I would use a for loop instead, because a for loop signals a procedural style in which side-effects are important.
But, if you absolutely insist on using a list comprehension for its side effects, you should avoid the inefficiency by using a generator expression instead. If you absolutely insist on this style, do one of these two:
any(fun_with_side_effects(x) and False for x in y if (...conditions...))
or:
all(fun_with_side_effects(x) or True for x in y if (...conditions...))
These are generator expressions, and they do not generate a random list that gets tossed out. I think the all form is perhaps slightly more clear, though I think both of them are confusing and shouldn't be used.
I think this is ugly and I wouldn't actually do it in code. But if you insist on implementing your loops in this fashion, that's how I would do it.
I tend to feel that list comprehensions and their ilk should signal an attempt to use something at least faintly resembling a functional style. Putting things with side effects that break that assumption will cause people to have to read your code more carefully, and I think that's a bad thing.

iterating over a single list in parallel in python

The objective is to do calculations on a single iter in parallel using builtin sum & map functions concurrently. Maybe using (something like) itertools instead of classic for loops to analyze (LARGE) data that arrives via an iterator...
In one simple example case I want to calculate ilen, sum_x & sum_x_sq:
ilen,sum_x,sum_x_sq=iterlen(iter),sum(iter),sum(map(lambda x:x*x, iter))
But without converting the (large) iter to a list (as with iter=list(iter))
n.b. Do this using sum & map and without for loops, maybe using the itertools and/or threading modules?
def example_large_data(n=100000000, mean=0, std_dev=1):
for i in range(n): yield random.gauss(mean,std_dev)
-- edit --
Being VERY specific: I was taking a good look at itertools hoping that there was a dual function like map that could do it. For example: len_x,sum_x,sum_x_sq=itertools.iterfork(iter_x,iterlen,sum,sum_sq)
If I was to be very very specific: I am looking for just one answer, python source code for the "iterfork" procedure.
You can use itertools.tee to turn your single iterator into three iterators which you can pass to your three functions.
iter0, iter1, iter2 = itertools.tee(input_iter, 3)
ilen, sum_x, sum_x_sq = count(iter0),sum(iter1),sum(map(lambda x:x*x, iter2))
That will work, but the builtin function sum (and map in Python 2) is not implemented in a way that supports parallel iteration. The first function you call will consume its iterator completely, then the second one will consume the second iterator, then the third function will consume the third iterator. Since tee has to store the values seen by one of its output iterators but not all of the others, this is essentially the same as creating a list from the iterator and passing it to each function.
Now, if you use generator functions that consume only a single value from their input for each value they output, you might be able to make parallel iteration work using zip. In Python 3, map and zip are both generators. The question is how to make sum into a generator.
I think you can get pretty much what you want by using itertools.accumulate (which was added in Python 3.2). It is a generator that yields a running sum of its input. Here's how you could make it work for your problem (I'm assuming your count function was supposed to be an iterator-friendly version of len):
iter0, iter1, iter2 = itertools.tee(input_iter, 3)
len_gen = itertools.accumulate(map(lambda x: 1, iter0))
sum_gen = itertools.accumulate(iter1)
sum_sq_gen = itertools.accumulate(map(lambda x: x*x, iter2))
parallel_gen = zip(len_gen, sum_gen, sum_sq_gen) # zip is a generator in Python 3
for ilen, sum_x, sum_x_sq in parallel_gen:
pass # the generators do all the work, so there's nothing for us to do here
# ilen_x, sum_x, sum_x_sq have the right values here!
If you're using Python 2, rather than 3, you'll have to write your own accumulate generator function (there's a pure Python implementation in the docs I linked above), and use itertools.imap and itertools.izip rather than the builtin map and zip functions.

When does using list comprehension in Python become inefficient?

I see that using list comprehension provides a very simple way to create new lists in Python.
However, if instead of creating a new list I just want to call a void function for each argument in a list without expecting any sort of return value, should I use list comprehension or just use a for loop to iterate? Does the simplicity in the code justify creating a new list (even if it remains empty) for each set of iterations? Even if this added cost is negligible in small programs, does it make sense to do it in large-scale programs/production?
Thanks!
List comprehensions are the wrong way if you don't actually need a list. Use this instead:
for i in seq:
some_function(i)
This is both more efficient and more expressive than using:
[some_function(i) for i in seq]
Note that there is something similar that doesn't work (and in particular it's not a tuple comprehension):
(some_function(i) for i in seq)
because that only creates an iterator. If you actually pass a list around that only gets iterated once, passing such an iterator around is a much better solution though.
for x in lst: f(x)
looks about equally short (it's actually one character shorter) as
[f(x) for x in lst]
Or is that not what you were trying to do?
There are more possible solutions for calling a funcion on every member of a list:
numpy can vectorize functions
import numpy as np
def func(i):
print i
v_func = np.vectorize(func)
v_func(['one', 'two', 'three'])
python has a builtin map function, that maps a function on every member of an iterable
def func(i):
print i
map(func, ['one', 'two', 'three'])
Are you asking if it is inefficient to create a list you don't need? Put that way, the answer should be obvious.
(To satisfy the answer police: yes, it is less efficient to create a list you don't need.)

Learning Python from Ruby; Differences and Similarities

I know Ruby very well. I believe that I may need to learn Python presently. For those who know both, what concepts are similar between the two, and what are different?
I'm looking for a list similar to a primer I wrote for Learning Lua for JavaScripters: simple things like whitespace significance and looping constructs; the name of nil in Python, and what values are considered "truthy"; is it idiomatic to use the equivalent of map and each, or are mumble somethingaboutlistcomprehensions mumble the norm?
If I get a good variety of answers I'm happy to aggregate them into a community wiki. Or else you all can fight and crib from each other to try to create the one true comprehensive list.
Edit: To be clear, my goal is "proper" and idiomatic Python. If there is a Python equivalent of inject, but nobody uses it because there is a better/different way to achieve the common functionality of iterating a list and accumulating a result along the way, I want to know how you do things. Perhaps I'll update this question with a list of common goals, how you achieve them in Ruby, and ask what the equivalent is in Python.
Here are some key differences to me:
Ruby has blocks; Python does not.
Python has functions; Ruby does not. In Python, you can take any function or method and pass it to another function. In Ruby, everything is a method, and methods can't be directly passed. Instead, you have to wrap them in Proc's to pass them.
Ruby and Python both support closures, but in different ways. In Python, you can define a function inside another function. The inner function has read access to variables from the outer function, but not write access. In Ruby, you define closures using blocks. The closures have full read and write access to variables from the outer scope.
Python has list comprehensions, which are pretty expressive. For example, if you have a list of numbers, you can write
[x*x for x in values if x > 15]
to get a new list of the squares of all values greater than 15. In Ruby, you'd have to write the following:
values.select {|v| v > 15}.map {|v| v * v}
The Ruby code doesn't feel as compact. It's also not as efficient since it first converts the values array into a shorter intermediate array containing the values greater than 15. Then, it takes the intermediate array and generates a final array containing the squares of the intermediates. The intermediate array is then thrown out. So, Ruby ends up with 3 arrays in memory during the computation; Python only needs the input list and the resulting list.
Python also supplies similar map comprehensions.
Python supports tuples; Ruby doesn't. In Ruby, you have to use arrays to simulate tuples.
Ruby supports switch/case statements; Python does not.
Ruby supports the standard expr ? val1 : val2 ternary operator; Python does not.
Ruby supports only single inheritance. If you need to mimic multiple inheritance, you can define modules and use mix-ins to pull the module methods into classes. Python supports multiple inheritance rather than module mix-ins.
Python supports only single-line lambda functions. Ruby blocks, which are kind of/sort of lambda functions, can be arbitrarily big. Because of this, Ruby code is typically written in a more functional style than Python code. For example, to loop over a list in Ruby, you typically do
collection.each do |value|
...
end
The block works very much like a function being passed to collection.each. If you were to do the same thing in Python, you'd have to define a named inner function and then pass that to the collection each method (if list supported this method):
def some_operation(value):
...
collection.each(some_operation)
That doesn't flow very nicely. So, typically the following non-functional approach would be used in Python:
for value in collection:
...
Using resources in a safe way is quite different between the two languages. Here, the problem is that you want to allocate some resource (open a file, obtain a database cursor, etc), perform some arbitrary operation on it, and then close it in a safe manner even if an exception occurs.
In Ruby, because blocks are so easy to use (see #9), you would typically code this pattern as a method that takes a block for the arbitrary operation to perform on the resource.
In Python, passing in a function for the arbitrary action is a little clunkier since you have to write a named, inner function (see #9). Instead, Python uses a with statement for safe resource handling. See How do I correctly clean up a Python object? for more details.
I, like you, looked for inject and other functional methods when learning Python. I was disappointed to find that they weren't all there, or that Python favored an imperative approach. That said, most of the constructs are there if you look. In some cases, a library will make things nicer.
A couple of highlights for me:
The functional programming patterns you know from Ruby are available in Python. They just look a little different. For example, there's a map function:
def f(x):
return x + 1
map(f, [1, 2, 3]) # => [2, 3, 4]
Similarly, there is a reduce function to fold over lists, etc.
That said, Python lacks blocks and doesn't have a streamlined syntax for chaining or composing functions. (For a nice way of doing this without blocks, check out Haskell's rich syntax.)
For one reason or another, the Python community seems to prefer imperative iteration for things that would, in Ruby, be done without mutation. For example, folds (i.e., inject), are often done with an imperative for loop instead of reduce:
running_total = 0
for n in [1, 2, 3]:
running_total = running_total + n
This isn't just a convention, it's also reinforced by the Python maintainers. For example, the Python 3 release notes explicitly favor for loops over reduce:
Use functools.reduce() if you really need it; however, 99 percent of the time an explicit for loop is more readable.
List comprehensions are a terse way to express complex functional operations (similar to Haskell's list monad). These aren't available in Ruby and may help in some scenarios. For example, a brute-force one-liner to find all the palindromes in a string (assuming you have a function p() that returns true for palindromes) looks like this:
s = 'string-with-palindromes-like-abbalabba'
l = len(s)
[s[x:y] for x in range(l) for y in range(x,l+1) if p(s[x:y])]
Methods in Python can be treated as context-free functions in many cases, which is something you'll have to get used to from Ruby but can be quite powerful.
In case this helps, I wrote up more thoughts here in 2011: The 'ugliness' of Python. They may need updating in light of today's focus on ML.
My suggestion: Don't try to learn the differences. Learn how to approach the problem in Python. Just like there's a Ruby approach to each problem (that works very well givin the limitations and strengths of the language), there's a Python approach to the problem. they are both different. To get the best out of each language, you really should learn the language itself, and not just the "translation" from one to the other.
Now, with that said, the difference will help you adapt faster and make 1 off modifications to a Python program. And that's fine for a start to get writing. But try to learn from other projects the why behind the architecture and design decisions rather than the how behind the semantics of the language...
I know little Ruby, but here are a few bullet points about the things you mentioned:
nil, the value indicating lack of a value, would be None (note that you check for it like x is None or x is not None, not with == - or by coercion to boolean, see next point).
None, zero-esque numbers (0, 0.0, 0j (complex number)) and empty collections ([], {}, set(), the empty string "", etc.) are considered falsy, everything else is considered truthy.
For side effects, (for-)loop explicitly. For generating a new bunch of stuff without side-effects, use list comprehensions (or their relatives - generator expressions for lazy one-time iterators, dict/set comprehensions for the said collections).
Concerning looping: You have for, which operates on an iterable(! no counting), and while, which does what you would expect. The fromer is far more powerful, thanks to the extensive support for iterators. Not only nearly everything that can be an iterator instead of a list is an iterator (at least in Python 3 - in Python 2, you have both and the default is a list, sadly). The are numerous tools for working with iterators - zip iterates any number of iterables in parallel, enumerate gives you (index, item) (on any iterable, not just on lists), even slicing abritary (possibly large or infinite) iterables! I found that these make many many looping tasks much simpler. Needless to say, they integrate just fine with list comprehensions, generator expressions, etc.
In Ruby, instance variables and methods are completely unrelated, except when you explicitly relate them with attr_accessor or something like that.
In Python, methods are just a special class of attribute: one that is executable.
So for example:
>>> class foo:
... x = 5
... def y(): pass
...
>>> f = foo()
>>> type(f.x)
<type 'int'>
>>> type(f.y)
<type 'instancemethod'>
That difference has a lot of implications, like for example that referring to f.x refers to the method object, rather than calling it. Also, as you can see, f.x is public by default, whereas in Ruby, instance variables are private by default.

Categories

Resources