<Python> Two iterating variables in a for loop [duplicate] - python

This question already has answers here:
How are tuples unpacked in for loops?
(8 answers)
Closed 7 years ago.
Let me start off with the generic "I'm a beginner with Python" introduction.
Hi, I'm a beginner with Python so please keep responses as closely aligned with plain English as possible :)
I keep running into these for loops where there are two iterating variables in a for loop. This is highly confusing to me because I just got my head wrapped around the basic concept of for loops. That is, your iterating variable runs through the for loop one piece at a time, line by line (in most cases). So what, then, do two iterating variables do? I have some speculations but I'd like correct answers to put my thinking in the right direction.
Would someone type how the for loops would be read (in speaking terms) and explain what exactly is happening?
>>> elements = ('foo', 'bar', 'baz')
>>> for elem in elements:
...     print elem
...
foo
bar
baz
>>> for count, elem in enumerate(elements):
...     print count, elem
...
0 foo
1 bar
2 baz

In your code, you use the enumerate() function. You can imagine that it does the following:
>>> print list(enumerate(["a", "b", "c"]))
[(0, 'a'), (1, 'b'), (2, 'c')]
(The list() call is only there so that the pairs can be displayed.)
enumerate() turns a list of elements into a list* of tuples, where each tuple consists of the index of the element and the element itself. Tuples are similar to lists.
*In fact it returns an iterator, not a list, but that shouldn't bother you too much.
But how do the two iteration variables work?
Let me illustrate this with a few examples.
In the first example, we iterate over each element in a list of integers.
simple_list = [1,2,3,4]
for x in simple_list:
    print x
But we can also iterate over each element in a list of tuples of integers:
tuple_list = [(1,5),(2,6),(3,7),(4,8)]
for x, y in tuple_list:
    print x, y
    print x+y
Here, each element of the list is a tuple of two integers. When we write for tup in tuple_list, then tup is populated on each iteration with one tuple of integers from the list.
Now, because Python is awesome, we can "catch" the tuple of two integers into two different integer variables by writing (x,y) (or x,y, which is the same) in place of tup.
Does this make sense?
Analogously, we can iterate over 3-tuples:
tuple_list = [(1,5,10),(2,6,11),(3,7,12),(4,8,69)]
for x, y, z in tuple_list:
    print x, y, z
    print x+y-z
All of this works. Or should work (haven't run the code).
And of course,
tuple_list = [(1,5),(2,6),(3,7),(4,8)]
for tup in tuple_list:
    print tup
also works.
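As a rough sketch of what the two-variable form is doing (written for Python 3, so print is a function; the variable names are only illustrative), the loop below unpacks each tuple in the for line, and the second loop does the same unpacking by hand:

```python
# "for x, y in pairs" is shorthand for taking one tuple per iteration
# and unpacking it into two names.
pairs = [(1, 5), (2, 6), (3, 7), (4, 8)]

sums_unpacked = []
for x, y in pairs:            # read: "for each x and y in pairs"
    sums_unpacked.append(x + y)

sums_manual = []
for tup in pairs:             # the same loop, unpacking by hand
    x, y = tup
    sums_manual.append(x + y)

# enumerate() produces (index, element) pairs, which is why two
# loop variables can be used with it.
indexed = list(enumerate(("foo", "bar", "baz")))

print(sums_unpacked)
print(indexed)
```

Both loops build the same list; the two-variable form just saves the explicit unpacking line.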

Related

Why does a list comprehension over a zip() call return a list containing the zip object instead of a list of zip()'s return values?

Playing around with python3 REPL and noticed the following:
Why are print( [zip([1,2,3], [3,1,4])] ) and print( list(zip([1,2,3], [3,1,4])) ) different?
The first returns [<zip object at 0xblah>] and the second returns [(1,3), (2,1), (3,4)].
Trying to understand why the list comprehension in the first statement doesn’t give me the result that the list() constructor gives - I think I'm confused about the difference between list comprehension and list() and would appreciate insight into what's happening under the hood.
Searching gives me this question on lists and tuples which doesn't answer my question.
Edit: A suggested question on The zip() function in Python 3 is very helpful background, but does not address the confusion in my question about the difference between a list comprehension and a list literal, so I prefer the submitted answer below as more complete.
The first statement is not a list comprehension; a list comprehension would give you the same result as list(). It is just a list literal containing a single zip object.
This would be a list comprehension:
[value for value in zip([1,2,3], [3,1,4])]
The above will print the same as list(zip([1, 2, 3], [3, 1, 4])).
In general, [something] means: A list with one element: something.
On the other hand, list(something) means: Iterate over the values in something, and make a list from the result. You can see the difference for example by putting primitive objects inside it, like a number:
>>> [2]
[2]
>>> list(2)
TypeError: 'int' object is not iterable
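A small Python 3 sketch of the three forms side by side may help; the variable names are only illustrative:

```python
# A list literal wraps the zip object itself: one element, not yet iterated.
pairs_literal = [zip([1, 2, 3], [3, 1, 4])]

# list() iterates over the zip object and collects what it yields.
pairs_list = list(zip([1, 2, 3], [3, 1, 4]))

# A list comprehension also iterates, so it matches list().
pairs_comp = [pair for pair in zip([1, 2, 3], [3, 1, 4])]

print(len(pairs_literal))
print(pairs_list)
print(pairs_comp == pairs_list)
```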

Confused about python, lists/generator objects [duplicate]

This question already has an answer here:
Why a generator object is obtained instead of a list
(1 answer)
Closed 3 years ago.
Not sure if this has been asked before but I couldn't find a proper, clear explanation.
I had a concern about something related to python syntax.
While practicing some Python, I intuitively assumed this would print all the elements of the list list1.
But it doesn't seem to do so, why would that be?
I could obviously print it in many other ways; but I fail to understand the inherent python logic at play here.
list1 = [1,2,3,4]
print(list1[i] for i in range(len(list1)))
I expected the output to be '[1, 2, 3, 4]', but it instead prints a generator object.
You need to surround list1[i] for i in range(len(list1)) with [] to indicate that it's a list. Although list1 is a list, you are printing a generator expression, which evaluates to a generator object (a type of iterable that produces its values lazily). Unless you convert the generator to a list, print won't show a list. (The same expression written inside square brackets is called a list comprehension.)
Even if you did do this, it would still print [1, 2, 3, 4] rather than 1 2 3 4. You would need to do [print(list1[i], end=" ") for i in range(len(list1))] for that to work. There are far better ways of doing this: see donkopotamus's answer.
The expression (list1[i] for i in range(len(list1))) defines a generator object. So that is what is printed.
If you wish to print a list, then make it a list comprehension rather than a generator, and print that:
print( [list1[i] for i in range(len(list1))] )
Alternatively, you could force evaluation of the generator into a tuple (or list or set) by passing the generator to the appropriate type, e.g.
print(tuple(list1[i] for i in range(len(list1))))
In order to get the specific output you intended (space separated) of 1 2 3 4 you could use str.join in the following way (each number is converted with str() first, because join only accepts strings):
>>> list1 = [1, 2, 3, 4]
>>> print(" ".join(str(list1[i]) for i in range(len(list1))))
1 2 3 4
or unpack the list into print (this will not work in python 2, as in python 2 print is not a function)
>>> print(*(list1[i] for i in range(len(list1))))
1 2 3 4
(list1[i] for i in range(len(list1)))
is indeed a generator object, equivalent to simply
(x for x in list1)
You're passing that generator to print as a single argument, so print simply prints it: it does not extract the elements from it.
Alternatively, you can unpack it as you pass it to print:
print(*(list1[i] for i in range(len(list1))))
This will pass each element of the generated sequence to print as a separate argument, so they should each get printed.
If you simply meant to print your list, any of the following would have worked:
print(list1)
print([list1[i] for i in range(len(list1))])
print([x for x in list1])
The use of square brackets makes a list comprehension rather than a generator expression.
There are things called list comprehensions and generator expressions in Python. They are awesome tools; you can find more info by googling.
Basically, you can make a list on the fly in python.
list1 = [1,2,3,4]
squared_list = [i*i for i in list1]
would return a list with all the items squared. However, if we say
squared_gen_list = (i*i for i in list1)
this returns what is known as a generator object. That is what is happening in your case, as you can see from the syntax and so you are just printing that out. Hope that clears up the confusion.
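To make the difference concrete, here is a short Python 3 sketch (names are illustrative). It also shows one thing worth knowing about generators: they can only be consumed once.

```python
list1 = [1, 2, 3, 4]

squared_list = [i * i for i in list1]   # list comprehension: built immediately
squared_gen = (i * i for i in list1)    # generator expression: nothing computed yet

first_pass = list(squared_gen)          # consuming the generator produces the values
second_pass = list(squared_gen)         # the generator is now exhausted

# str.join needs strings, so convert each number first.
joined = " ".join(str(x) for x in list1)

print(squared_list)
print(first_pass, second_pass)
print(joined)
```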

How do you convert a list of strings to separate strings in Python 3?

I want to know if you have a list of strings such as:
l = ['ACGAAAG', 'CAGAAGC', 'ACCTGTT']
How do you convert it to:
O = 'ACGAAAG'
P = 'CAGAAGC'
Q = 'ACCTGTT'
Can you do this without knowing the number of items in a list? You have to store them as variables.
(The variables don't matter.)
Welcome to SE!
Structure Known
If you know the structure of the list, then you might simply unpack it:
O, P, Q = l
Structure Unknown
Unpack your list using a for loop. Do your work on each string inside the loop. For the below, I am simply printing each one:
for element in l:
    print(element)
Good luck!
If you don't know the number of items beforehand, a list is the right structure to keep the items in.
You can, though, cut off the first few known items, and leave the unknown tail as a list:
a, b, *rest = ["ay", "bee", "see", "what", "remains"]
print("%r, %r, rest is %r" % (a, b, rest))
a,b,c = my_list
This will work as long as the number of elements in the list is equal to the number of variables you want to unpack into. It actually works with any iterable: tuple, list, set, etc.
If the list is longer, you can always access the first 3 elements, if that is what you want:
a = my_list[0]
b = my_list[1]
c = my_list[2]
or in one line
a, b, c = my_list[0], my_list[1], my_list[2]
Even better, with slice notation you can get a sublist of the right size with the first 3 elements:
a, b, c = my_list[:3]
Those would work as long as the list has at least 3 elements, or as many as the number of variables you want.
You can also use the extended unpacking notation:
a, b, c, *the_rest = my_list
the_rest would be a list with everything else in the list other than the first 3 elements; again, the list needs to be of size 3 or more.
And that pretty much covers all the ways to extract a certain number of items.
Now, depending on what you are going to do with those items, you may be better off with a regular loop:
for item in my_list:
    # do something with the current item, like printing it
    print(item)
In each iteration, item takes the value of one element in the list, so you can do what you need one item at a time.
If what you want is to take 3 items at a time in each iteration, there are several ways to do it, for example:
for i in range(0, len(my_list) - 2, 3):
    a, b, c = my_list[i:i+3]
    print(a, b, c)
there are more fun construct like
it = [iter(my_list)]*3
for a,b,c in zip(*it):
    print(a,b,c)
and other with the itertools module.
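One possible itertools-based version is sketched below. The helper name chunks is made up for this illustration, and unlike the zip trick above it keeps a shorter final group instead of dropping leftover items:

```python
from itertools import islice

def chunks(items, size):
    """Yield successive lists of up to `size` items (hypothetical helper)."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))  # take the next `size` items, if any
        if not chunk:                   # iterator exhausted
            return
        yield chunk

my_list = [1, 2, 3, 4, 5, 6, 7]
groups = list(chunks(my_list, 3))
print(groups)
```

Because the last group can be shorter than 3, unpacking it with a, b, c = chunk would fail; check the length first if you need exactly three names.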
But you said something interesting: "so that every term is assigned to a variable". That is the wrong approach. You don't want an unknown number of variables running around; that gets messy very fast. Work with the list instead. If you want to do some work with each element in it, there are plenty of ways of doing it, like a list comprehension:
my_new_list = [ some_fun(x) for x in my_list ]
or in the old way
my_new_list = []
for x in my_list:
    my_new_list.append( some_fun(x) )
or, if you need to work with more than 1 item at a time, combine that with some of the above.
I do not know if your use case requires the strings to be stored in different variables. It usually is a bad idea.
But if you do need it, then you can use exec builtin which takes the string representation of a python statement and executes it.
list_of_strings = ['ACGAAAG', 'CAGAAGC', 'ACCTGTT']
# Dynamically generate variable names equivalent to the column names in an Excel sheet (A, B, C, ..., Z, AA, AB, ..., AAA, ...)
variable_names = ['A', 'B', 'C']  # in this specific case
for vn, st in zip(variable_names, list_of_strings):
    exec('{} = "{}"'.format(vn, st))
Test it out: print(A, B, C) will output the three strings, and you can use A, B and C as variables in the rest of the program.
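If named access is what you are after, a dictionary is a commonly suggested alternative to exec. This is only a sketch, and the names O, P, Q are taken from the question purely for illustration:

```python
list_of_strings = ["ACGAAAG", "CAGAAGC", "ACCTGTT"]
variable_names = ["O", "P", "Q"]  # illustrative names from the question

# Pair each name with its string; no new variables are created.
named = dict(zip(variable_names, list_of_strings))

print(named["P"])
for name, value in named.items():
    print(name, value)
```

This keeps the "variables" in one place, works for any number of items, and avoids executing generated code.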

Get related dictionaries from lists

I have two lists of different dictionaries (ListA and ListB).
All dictionaries in listA have field "id" and "external_id"
All dictionaries in listB have field "num" and "external_num"
I need to get all pairs of dictionaries where value of external_id = num and value of external_num = id.
I can achieve that using this code:
for dictA in ListA:
    for dictB in ListB:
        if dictA["id"] == dictB["external_num"] and dictA["external_id"] == dictB["num"]:
            pass  # dictA and dictB are a matching pair
But I have seen many beautiful Python expressions, and I guess it is possible to get that result in a more Pythonic style, isn't it?
I imagine something like:
res = [A, B for A, B in listA, listB if A['id'] == B['extnum'] and A['ext'] == B['num']]
You are pretty close, but you aren't telling Python how you want to connect the two lists to get the pairs of dictionaries A and B. (Each result also has to be written as a parenthesised tuple, (A, B), or the comprehension is a syntax error.)
If you want to compare all dictionaries in ListA to all in ListB, you need itertools.product:
from itertools import product
res = [(A, B) for A, B in product(ListA, ListB) if ...]
Alternatively, if you want pairs at the same indices, use zip:
res = [(A, B) for A, B in zip(ListA, ListB) if ...]
If you don't need the whole list building at once, note that you can use itertools.ifilter to pick the pairs you want:
from itertools import ifilter, product
for A, B in ifilter(lambda (A, B): ...,
                    product(ListA, ListB)):
    # do whatever you want with A and B
    pass
(if you do this with zip, use itertools.izip instead to maximise performance).
Notes on Python 3.x:
zip and filter no longer return lists, therefore itertools.izip and itertools.ifilter no longer exist (just as range has pushed out xrange) and you only need product from itertools; and
lambda (A, B): is no longer valid syntax; you will need to write the filtering function to take a single tuple argument lambda t: and e.g. replace A with t[0].
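Putting those Python 3 notes together, the filtering version might look like the sketch below. The sample data and the function name is_match are invented for illustration; the field names come from the question:

```python
from itertools import product

# Hypothetical sample data using the field names from the question
ListA = [{"id": 1, "external_id": 10}, {"id": 2, "external_id": 20}]
ListB = [{"num": 10, "external_num": 1}, {"num": 20, "external_num": 99}]

def is_match(pair):
    # pair is a single (A, B) tuple; unpack it inside the function,
    # since Python 3 no longer allows tuple parameters.
    A, B = pair
    return A["id"] == B["external_num"] and A["external_id"] == B["num"]

# The built-in filter is lazy in Python 3; list() forces it here.
matches = list(filter(is_match, product(ListA, ListB)))

for A, B in matches:
    print(A, B)
```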
Firstly, for code clarity, I actually would probably go with your first option - I don't think using for loops is particularly un-Pythonic, in this case. However, if you want to try using a list comprehension, there are a few things to be aware of:
Each item returned by the list comprehension needs to be just a singular item. Trying to return A, B is going to give you a SyntaxError. However, you can return either a list or a tuple (or anything else, that is a single object), so something like res = [(A,B) for...] would start working.
Another concern is how you're iterating over these lists - from your first snippet of code, it appears you don't make any assumptions about these lists lining up, meaning: you seem to be OK if the 2nd item in listA matches the 14th item in listB, so long as they match on the appropriate fields. That's perfectly reasonable, but just be aware that means you will need two for loops no matter how you try to do it*. And you still need your comparisons. So, as a list comprehension, you might try:
res = [(A, B) for A in listA for B in listB if A['id']==B['extnum'] and A['extid']==B['num']]
Then, in res, you'll have 0 or more tuples, and each tuple will contain the respective dictionaries you're interested in. To use them:
for tup in res:
    A = tup[0]
    B = tup[1]
    #....
or more concisely (and Pythonically):
for A,B in res:
    #...
since Python is smart enough to know that it's yielding an item (the tuple) that has 2 elements, and so it can directly assign them to A and B.
EDIT:* In retrospect, it isn't completely true that you need two for loops, and if your lists are big enough, it may be helpful, performance-wise, to make an intermediate dictionary such as this:
# make a dictionary with key=tuple, value=dictionary
interim = {(A['id'], A['extid']): A for A in listA}
for B in listB:
    tup = (B['extnum'], B['num'])  # order matters! match-up with A
    if tup in interim:
        A = interim[tup]
        print(A, B)
And if the id-extid pair is not expected to be unique across all items in listA, then you'd want to look into collections.defaultdict with a list... but I'm not sure this still fits in the 'more Pythonic' category.
I realize this is likely overkill for the question you asked, but I couldn't let my 'two for loops' statement stand, since it's not entirely true.
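For the non-unique case mentioned above, a defaultdict-based version could look roughly like this. It is a sketch only: the sample data (including the "tag" field) is invented, and the abbreviated key names follow this answer rather than the question:

```python
from collections import defaultdict

# Hypothetical sample data; two entries in listA share the same (id, extid) pair
listA = [
    {"id": 1, "extid": 10},
    {"id": 1, "extid": 10, "tag": "duplicate"},
    {"id": 2, "extid": 20},
]
listB = [{"num": 10, "extnum": 1}, {"num": 99, "extnum": 2}]

# key = (id, extid), value = list of every dictionary in listA with that key
interim = defaultdict(list)
for A in listA:
    interim[(A["id"], A["extid"])].append(A)

res = []
for B in listB:
    key = (B["extnum"], B["num"])    # order matters: must line up with A's key
    for A in interim.get(key, []):   # .get avoids adding empty entries
        res.append((A, B))

print(res)
```

This makes one pass over each list instead of comparing every A with every B.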

Python list index splitting and manipulation

My question seems simple, but for a novice to python like myself this is starting to get too complex for me to get, so here's the situation:
I need to take a list such as:
L = [(a, b, c), (d, e, d), (etc, etc, etc), (etc, etc, etc)]
and make each index an individual list so that I may pull elements from each index specifically. The problem is that the list I am actually working with contains hundreds of indices such as the ones above and I cannot make something like:
L_new = list(L['insert specific index here'])
for each one as that would mean filling up the memory with hundreds of lists corresponding to individual indices of the first list and would be far too time and memory consuming from my point of view. So my question is this: how can I separate those indices and then pull individual parts from them without needing to create hundreds of individual lists (at least to the point where I won't need hundreds of individual lines to create them)?
I might be misreading your question, but I'm inclined to say that you don't actually have to do anything to be able to index your tuples. See my comment, but: L[0][0] will give "a", L[0][1] will give "b", L[2][1] will give "etc" etc...
If you really want a clean way to turn this into a list of lists you could use a list comprehension:
cast = [list(entry) for entry in L]
In response to your comment: if you want to access across dimensions I would suggest list comprehension. For your comment specifically:
crosscut = [entry[0] for entry in L]
In response to comment 2: This is largely a part of a really useful operation called slicing. Specifically to do the referenced operation you would do this:
multiple_index = [entry[0:3] for entry in L]
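The indexing and comprehension ideas above can be tried out with a small Python 3 sketch. The sample data is made up, and the zip(*L) line is an extra technique not mentioned in the answer, which transposes the rows into columns in one step:

```python
# Hypothetical sample data standing in for the question's list of tuples
L = [("a", "b", "c"), ("d", "e", "f"), ("g", "h", "i")]

second_of_first = L[0][1]                  # direct indexing, no conversion needed
first_items = [entry[0] for entry in L]    # first element of every tuple
first_two = [entry[0:2] for entry in L]    # slice of every tuple

# Assumption: if you want every "column" at once, zip(*L) transposes the rows.
columns = list(zip(*L))

print(second_of_first)
print(first_items)
print(first_two)
print(columns)
```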
Depending on your readability preferences there are actually a number of possibilities here:
list_of_lists = []
for sublist in L:
    list_of_lists.append(list(sublist))
# or, as a generator function if you want lazy evaluation:
def as_lists(L):
    for entry in L:
        # each tuple is converted only when it is requested
        yield list(entry)
What you have there is a list of tuples; access them like a list of lists:
L[3][2]
will get the third element from the fourth tuple in your list L (indices start at 0).
Two ways of using inner lists:
for index, sublist in enumerate(L):
    # do something with sublist
    pass
or with an iterator
iterator = iter(L)
sublist = next(iterator) # <-- yields the first sublist
In both cases, the elements of sublist can be reached via
direct index:
sublist[2]
or iteration:
iterator = iter(sublist)
next(iterator) # <-- yields first elem of sublist
for elem in sublist:
    # do something with my elem
    pass
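A brief sketch of stepping through an iterator by hand, using the built-in next() (the sample data is invented for illustration):

```python
# Hypothetical sample data
L = [("a", "b", "c"), ("d", "e", "f")]

iterator = iter(L)
first = next(iterator)             # first tuple
second = next(iterator)            # second tuple
exhausted = next(iterator, None)   # a default avoids StopIteration at the end

print(first, second, exhausted)
```

In everyday code a plain for loop does this stepping for you; calling next() directly is mostly useful when you want just the first item or need finer control.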
