I'm a bit bewildered by the "in" keyword in python.
If I take a sample list of tuples:
data = [
    (5, 1, 9.8385465),
    (10, 1, 8.2087544),
    (15, 1, 7.8788187),
    (20, 1, 7.5751283)
]
I can do two different "for - in" loops and get different results:
for G,W,V in data:
    print G,W,V
This prints each set of values on a line, e.g. 5 1 9.8385465
for i in data:
    print i
This prints the whole tuple, e.g. (5, 1, 9.8385465)
How does python "know" that by providing one variable I want to assign the tuple to a variable, and that by providing three variables I want to assign each value from the tuple to one of those variables?
According to the for compound statement documentation:
Each item in turn is assigned to the target list using the standard
rules for assignments...
Those "standard rules" are in the assignment statement documentation, specifically:
Assignment of an object to a target list is recursively defined as follows.
If the target list is a single target: The object is assigned to that target.
If the target list is a comma-separated list of targets: The object must be an iterable with the same number of items as there are targets in the target list, and the items are assigned, from left to right, to the corresponding targets.
So this different behaviour, depending on whether you assign to a single target or a list of targets, is baked right into Python's fundamentals, and applies wherever assignment is used.
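As a rough sketch of those rules applied to the loop from the question (written in Python 3 syntax; the names rows, unpacked, first, g, w and v are only illustrative and not from the original post):

```python
data = [
    (5, 1, 9.8385465),
    (10, 1, 8.2087544),
    (15, 1, 7.8788187),
    (20, 1, 7.5751283),
]

# Single target: each tuple is bound to the name as a whole.
rows = []
for i in data:
    rows.append(i)

# Comma-separated targets: each tuple is unpacked, left to right.
unpacked = []
for G, W, V in data:
    unpacked.append((G, W, V))

# The same rule drives a plain assignment statement.
first = data[0]
g, w, v = data[0]

print(rows[0])  # the whole tuple
print(g, w, v)  # the three separate values
```

The for loop and the assignment statement go through the same target-list logic, so neither needs to "guess" what you meant.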
This isn't really a feature of the in keyword, but of the Python language. The same works with assignment.
x = (1, 2, 3)
print(x)
# (1, 2, 3)
a, b, c = (1, 2, 3)
print(a)
# 1
print(b)
# 2
print(c)
# 3
So to answer your question, it's more that Python knows what to do with assignments when you either:
assign a tuple to a variable, or
assign a tuple to a number of variables equal to the number of items in the tuple
It's called tuple unpacking, and has nothing to do with the in keyword.
On each iteration the for loop produces a single item (a tuple in this case), and that tuple is then assigned: to a single name in the second case, or unpacked into multiple names in the first case.
If you try specifying the incorrect number of variables:
for G,W in data:
    print G,W
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: too many values to unpack
Iterators notionally return a single item (the for a in b syntax).
If you specify more than one target for a, each item produced must itself be iterable and contain the same number of items as your list of variables. This is checked when the loop runs, not at compile time: Python raises a ValueError if you write for a, b in c and the elements of c do not contain exactly two items.
So in short it "knew" because that is what you told it. It will explode at runtime if you lied ...
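A small sketch of that runtime failure, in Python 3 syntax. The shortened data list and the name error_message are only for illustration, and the exact wording of the message differs between Python versions, so it is not reproduced here:

```python
data = [(5, 1, 9.8385465), (10, 1, 8.2087544)]

error_message = None
try:
    for G, W in data:  # two targets, but each item holds three values
        pass
except ValueError as exc:
    # The mismatch is only detected when the first item is unpacked.
    error_message = str(exc)

print(error_message)
```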
Related
I was researching about python codegolf and saw someone use the unpacking operator in a strange way:
*s,='abcde'
I know that the unpacking operator basically iterates over a sequence. So I know that
s=[*'abcde']
will "unpack" the abcde string and save ['a', 'b', 'c', 'd', 'e'] in variable s.
Can someone explain as thoroughly as possible how the
*s,='abcde'
statement works? I know it does the same thing as s=[*'abcde'] but it accomplishes it in a different way. Why is the unpacking operator on the variable instead of the string? Why is there a comma right after the variable name?
This is iterable unpacking. You may have seen it in other places, where it assigns values to multiple variables from a single expression:
a, b, c = [1, 2, 3]
The syntax also allows a * on one of the targets, to indicate that this variable should be a list containing the elements from the iterable that weren't explicitly assigned to another variable.
a, *b, c = [1, 2, 3, 4, 5]
print(b)
# [2, 3, 4]
So, what's going on in your example? There's only a single variable name being assigned to, so it's going to take all the items not assigned to another variable, which in this case is all of them. If you try just
*s='abcde'
you'll get
SyntaxError: starred assignment target must be in a list or tuple
Which is why that comma is there, as a trailing comma is how you indicate a single-value tuple.
The trailing comma is required only to create a single tuple (a.k.a. a singleton); it is optional in all other cases. A single expression without a trailing comma doesn’t create a tuple, but rather yields the value of that expression.
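To make the comparison concrete, here is a short sketch of the equivalent spellings (the names t, u, head and rest are illustrative, not from the question):

```python
*s, = 'abcde'          # target list is a one-element tuple: (*s,)
[*t] = 'abcde'         # same assignment, written with a list-style target
u = [*'abcde']         # unpacking on the right-hand side instead
head, *rest = 'abcde'  # the starred name collects whatever is left over

print(s)           # ['a', 'b', 'c', 'd', 'e']
print(head, rest)  # a ['b', 'c', 'd', 'e']
```

In the first two lines the star sits on the assignment target, so it collects items; in the third it sits inside a list display, so it spreads them. The end result for s, t and u is the same list.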
I am having trouble understanding why one of the following lines produces a generator and another a tuple.
How exactly, and why, is a generator created in the second line, while a tuple is produced in the third?
sample_list = [1, 2, 3, 4]
generator = (i for i in sample_list)
tuple_ = (1, 2, 3, 4)
print type(generator)
<type 'generator'>
print type(tuple_)
<type 'tuple'>
Is it because a tuple is an immutable object, so when I try to unpack the list inside (), Python can't create the tuple because it would have to modify it?
You can imagine tuples as being created when you hardcode the values, while generators are created where you provide a way to create the objects.
This works since there is no way (1,2,3,4) could be a generator. There is nothing to generate there, you just specified all the elements, not a rule to obtain them.
In order for your generator to be a tuple, the expression (i for i in sample_list) would have to be a tuple comprehension. Python has no tuple comprehension syntax; if you want a tuple, pass the expression to the constructor, as in tuple(i for i in sample_list).
Thus, the syntax for what should have been a tuple comprehension has been reused for generators.
Parentheses are used for three different things: grouping, tuple literals, and function calls. Compare (1 + 2) (an integer) and (1, 2) (a tuple). In the generator assignment, the parentheses are for grouping; in the tuple assignment, the parentheses are a tuple literal. Parentheses represent a tuple literal when they contain a comma and are not used for a function call.
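A quick sketch that puts these cases side by side, in Python 3 syntax (the names gen, tup, grouped and single are illustrative only):

```python
sample_list = [1, 2, 3, 4]

gen = (i for i in sample_list)       # generator expression: lazy, nothing computed yet
tup = tuple(i for i in sample_list)  # tuple built by consuming a generator expression
grouped = (1 + 2)                    # parentheses only group: this is an int
single = (1,)                        # the comma is what makes a tuple

print(type(gen).__name__)      # generator
print(type(tup).__name__)      # tuple
print(type(grouped).__name__)  # int
print(type(single).__name__)   # tuple
```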
Imagine the following function:
def getMinAndMax(numbers):
    return min(numbers), max(numbers)
What will happen if I do this?
num = getMinAndMax([1,2,3,4,5])
Will num assume the value of the first item in the tuple, min, or will something else happen? I know I can just try it, but I'm looking for some defined Python behavior here.
Your function returns the two-element tuple min([1, 2, 3, 4, 5]), max([1, 2, 3, 4, 5]) which evaluates to 1, 5. So the statement
num = getMinAndMax([1,2,3,4,5])
will bind the name num to the tuple (1, 5) and you can access the individual values as num[0] and num[1]. Python does allow you, though, to use what's usually referred to as an unpacking assignment, which looks like this:
nmin, nmax = getMinAndMax([1, 2, 3, 4, 5])
That binds each name to a succeeding element of the tuple on the right-hand side and allows you to use the values without indexing. If you need a tuple of the results your formulation is simplest, though of course the expression (nmin, nmax) will re-create it from the second one.
num will be a tuple. The value of num will be equal to (1,5) in your example. Python does not check types by default, so you can safely assign whatever value of whatever type you want to whatever variable.
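Both forms from the answers above, side by side, as a small runnable sketch in Python 3 syntax:

```python
def getMinAndMax(numbers):
    # The comma in the return statement builds a two-element tuple.
    return min(numbers), max(numbers)

num = getMinAndMax([1, 2, 3, 4, 5])         # one target: the whole tuple
nmin, nmax = getMinAndMax([1, 2, 3, 4, 5])  # two targets: unpacked

print(num)         # (1, 5)
print(num[0])      # 1
print(nmin, nmax)  # 1 5
```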
I know that there are certain "special" methods of various objects that represent operations that would normally be performed with operators (e.g. int.__add__ for +, object.__eq__ for ==, etc.), and that one of them is list.__setitem__, which can assign a value to a list element. However, I need a function that can assign a list into a slice of another list.
Basically, I'm looking for the expression equivalent of some_list[2:4] = [2, 3].
The line
some_list[2:4] = [2, 3]
will also call list.__setitem__(). Instead of an index, it will pass a slice object though. The line is equivalent to
some_list.__setitem__(slice(2, 4), [2, 3])
It depends on the version of Python. For 3.2, __setitem__ does the job:
Note Slicing is done exclusively with the following three methods. A call like
a[1:2] = b
is translated to
a[slice(1, 2, None)] = b
and so forth. Missing slice items are always filled in with None.
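As a sketch of the expression forms, assuming Python 3: besides calling __setitem__ directly, the standard library's operator.setitem accepts a slice object as the index. The starting list contents and the name other are made up for illustration:

```python
import operator

some_list = [0, 1, 9, 9, 4]
# What the statement some_list[2:4] = [2, 3] does under the hood.
some_list.__setitem__(slice(2, 4), [2, 3])

other = [0, 1, 9, 9, 4]
# Function form, avoiding the direct dunder call.
operator.setitem(other, slice(2, 4), [2, 3])

print(some_list)  # [0, 1, 2, 3, 4]
print(other)      # [0, 1, 2, 3, 4]
```

Both calls return None, like the assignment statement they stand in for, so they are only useful for their side effect.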
Why do these two operations (append() and +, respectively) give different results?
>>> c = [1, 2, 3]
>>> c
[1, 2, 3]
>>> c += c
>>> c
[1, 2, 3, 1, 2, 3]
>>> c = [1, 2, 3]
>>> c.append(c)
>>> c
[1, 2, 3, [...]]
>>>
In the last case there's actually an infinite recursion. c[-1] and c are the same. Why is it different with the + operation?
To explain "why":
The + operation adds the array elements to the original array. The array.append operation inserts the array (or any object) into the end of the original array, which results in a reference to self in that spot (hence the infinite recursion in your case with lists, though with arrays, you'd receive a type error).
The difference here is that the + operation behaves specially when you add an array (it's overloaded like others, see this chapter on sequences): it concatenates the elements. The append method, however, does literally what you ask: it appends the object on the right-hand side that you give it (the array or any other object), instead of taking its elements.
An alternative
Use extend() if you want to use a function that acts similar to the + operator (as others have shown here as well). It's not wise to do the opposite: to try to mimic append with the + operator for lists (see my earlier link on why). More on lists below:
Lists
[edit] Several commenters have suggested that the question is about lists and not about arrays. The question has changed, though I should've included this earlier.
Most of the above about arrays also applies to lists:
The + operator concatenates two lists together. The operator will return a new list object.
list.append does not append one list to another, but appends a single object (which here is a list) at the end of your current list. Adding c to itself, therefore, leads to infinite recursion.
As with arrays, you can use list.extend to extend a list with another list (or iterable). This will change your current list in situ, as opposed to +, which returns a new list.
Little history
For fun, a little history: the birth of the array module in Python in February 1993. It might surprise you, but arrays were added way after sequences and lists came into existence.
The concatenation operator + is a binary infix operator which, when applied to lists, returns a new list containing all the elements of each of its two operands. The list.append() method is a mutator on list which appends its single object argument (in your specific example the list c) to the subject list. In your example this results in c appending a reference to itself (hence the infinite recursion).
An alternative to '+' concatenation
The list.extend() method is also a mutator method which concatenates its sequence argument with the subject list. Specifically, it appends each of the elements of sequence in iteration order.
An aside
Being an operator, + returns the result of the expression as a new value. Being a non-chaining mutator method, list.extend() modifies the subject list in-place and returns nothing.
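A minimal sketch of that difference in return values (the names a, b and result are illustrative):

```python
a = [1, 2, 3]

b = a + a                  # operator: builds and returns a new list
result = a.extend([4, 5])  # mutator: changes a in place, returns None

print(b)       # [1, 2, 3, 1, 2, 3]
print(b is a)  # False
print(result)  # None
print(a)       # [1, 2, 3, 4, 5]
```

This is why writing a = a.extend(...) is a common mistake: it rebinds a to None.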
Arrays
I've added this because of the potential confusion that Abel's answer above may cause by mixing the discussion of lists, sequences and arrays.
Arrays were added to Python after sequences and lists, as a more efficient way of storing arrays of integral data types. Do not confuse arrays with lists. They are not the same.
From the array docs:
Arrays are sequence types and behave very much like lists, except that the type of objects stored in them is constrained. The type is specified at object creation time by using a type code, which is a single character.
append appends a single element to a list. If you want to extend the list with the contents of another list, you need to use extend.
>>> c = [1, 2, 3]
>>> c.extend(c)
>>> c
[1, 2, 3, 1, 2, 3]
Python lists are heterogeneous; that is, the elements in the same list can be any type of object. The expression c.append(c) appends the object c, whatever it may be, to the list. In this case it makes the list itself a member of the list.
The expression c += c extends c with its own elements and binds the result to the variable c (for lists, += works in place, much like extend()). The overloaded + operator, by contrast, is defined on lists to create a new list whose contents are the elements in the first list and the elements in the second list.
So these are really just different expressions used to do different things by design.
The method you're looking for is extend(). From the Python documentation:
list.append(x)
Add an item to the end of the list; equivalent to a[len(a):] = [x].
list.extend(L)
Extend the list by appending all the items in the given list; equivalent to a[len(a):] = L.
list.insert(i, x)
Insert an item at a given position. The first argument is the index of the element before which to insert, so a.insert(0, x) inserts at the front of the list, and a.insert(len(a), x) is equivalent to a.append(x).
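The slice-assignment equivalences quoted above can be checked directly. A short sketch, with a and b as illustrative names:

```python
a = [1, 2, 3]
a[len(a):] = [4]     # same effect as a.append(4)
a[len(a):] = [5, 6]  # same effect as a.extend([5, 6])

b = [1, 2, 3]
b.append(4)
b.extend([5, 6])
b.insert(len(b), 7)  # same effect as b.append(7)

print(a)  # [1, 2, 3, 4, 5, 6]
print(b)  # [1, 2, 3, 4, 5, 6, 7]
```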
you should use extend()
>>> c=[1,2,3]
>>> c.extend(c)
>>> c
[1, 2, 3, 1, 2, 3]
other info: append vs. extend
See the documentation:
list.append(x)
Add an item to the end of the list; equivalent to a[len(a):] = [x].
list.extend(L)
Extend the list by appending all the items in the given list; equivalent to a[len(a):] = L.
c.append(c) "appends" c to itself as an element. Since a list is a reference type, this creates a recursive data structure.
c += c is equivalent to extend(c), which appends the elements of c to c.
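Both claims can be checked with identity tests. A sketch, assuming CPython list semantics; the names d, alias, is_self and same_object are illustrative:

```python
c = [1, 2, 3]
c.append(c)           # the list now contains a reference to itself
is_self = c[-1] is c  # the last element is the very same object

d = [1, 2, 3]
alias = d
d += d                    # in-place: extends the existing list object
same_object = alias is d  # the name still refers to the original object

print(len(c), is_self)  # 4 True
print(d, same_object)   # [1, 2, 3, 1, 2, 3] True
```

The [...] shown in the interpreter is just how repr() displays the self-reference; the list itself has four elements.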