Python tuple vs generator

I am having trouble understanding why one of the following lines produces a generator and the other a tuple.
How exactly, and why, is a generator created in the second line, while the third one produces a tuple?
sample_list = [1, 2, 3, 4]
generator = (i for i in sample_list)
tuple_ = (1, 2, 3, 4)
print type(generator)
<type 'generator'>
print type(tuple_)
<type 'tuple'>
Is it because a tuple is an immutable object, so when I try to unpack the list inside (), Python can't create the tuple because it would have to modify it?

You can imagine tuples as being created when you hardcode the values, while generators are created where you provide a way to create the objects.
This works since there is no way (1,2,3,4) could be a generator. There is nothing to generate there, you just specified all the elements, not a rule to obtain them.
In order for your generator to be a tuple, the expression (i for i in sample_list) would have to be a tuple comprehension. Python has no tuple comprehension syntax; to build a tuple from such an expression, pass it to the constructor, as in tuple(i for i in sample_list).
Thus, the syntax for what should have been a tuple comprehension has been reused for generators.

Parentheses are used for three different things: grouping, tuple literals, and function calls. Compare (1 + 2) (an integer) and (1, 2) (a tuple). In the generator assignment, the parentheses are for grouping; in the tuple assignment, the parentheses are a tuple literal. Parentheses represent a tuple literal when they contain a comma and are not used for a function call.
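To make the distinction concrete, here is a small sketch (written for Python 3, which is an assumption, since the question's own snippets use Python 2 print statements). The names gen, tup, tup_from_gen, grouped and single are only for illustration:

```python
sample_list = [1, 2, 3, 4]

# Generator expression: a rule for producing items, evaluated lazily.
gen = (i for i in sample_list)

# Tuple literal: every element is written out, and the commas make the tuple.
tup = (1, 2, 3, 4)

# Passing a generator expression to tuple() builds the tuple explicitly.
tup_from_gen = tuple(i for i in sample_list)

# Parentheses without a comma are plain grouping.
grouped = (1 + 2)

# A one-item tuple needs the trailing comma.
single = (3,)

print(type(gen).__name__, type(tup).__name__)
print(tup_from_gen == tup)
print(type(grouped).__name__, type(single).__name__)
```

The tuple() call is the usual way to get what a "tuple comprehension" would give you.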

Related

Why is my code output returned as a list? [duplicate]

I was researching about python codegolf and saw someone use the unpacking operator in a strange way:
*s,='abcde'
I know that the unpacking operator basically iterates over a sequence. So I know that
s=[*'abcde']
will "unpack" the abcde string and save ['a', 'b', 'c', 'd', 'e'] in variable s.
Can someone explain as thoroughly as possible how the
*s,='abcde'
statement works? I know it does the same thing as s=[*'abcde'] but it accomplishes it in a different way. Why is the unpacking operator on the variable, instead of the string? Why is there a comma right after the variable name?
This is iterable unpacking. You may have seen it used elsewhere to assign values to multiple variables from a single expression:
a, b, c = [1, 2, 3]
This syntax includes a * to indicate that this variable should be a list containing the elements from the iterable that weren't explicitly assigned to another variable.
a, *b, c = [1, 2, 3, 4, 5]
print(b)
# [2, 3, 4]
So, what's going on in your example? There's only a single variable name being assigned to, so it's going to take all the items not assigned to another variable, which in this case is all of them. If you try just
*s='abcde'
you'll get
SyntaxError: starred assignment target must be in a list or tuple
Which is why that comma is there, as a trailing comma is how you indicate a single-value tuple.
The trailing comma is required only to create a single tuple (a.k.a. a singleton); it is optional in all other cases. A single expression without a trailing comma doesn’t create a tuple, but rather yields the value of that expression.
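A few variations on the same idea may help. This is a sketch for Python 3 (starred targets do not exist in Python 2); the variable names are arbitrary:

```python
# The target list is the one-item tuple (*s,), so s collects everything.
*s, = 'abcde'

# The starred name takes whatever the other targets leave over.
first, *rest = 'abcde'
*init, last = 'abcde'

# A list-style target also satisfies "must be in a list or tuple",
# so no trailing comma is needed here.
[*t] = 'abcde'

print(s)
print(first, rest)
print(init, last)
print(t == s)
```

In every case the starred name ends up bound to a list, even though the right-hand side is a string.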

Document of Functions that accept iterators

I was having trouble with a project and was later able to successfully complete it. However, while running through some code written by someone else, I noticed they were able to utilize an iterator (for loop) within the join-function.
example:
' '.join(x for x in name.split('*'))
I thought this was awesome as it helped me cut down lines of code from my original draft.
So my question is: Are there any documents that have a list of functions that accept iterators?
I could be mistaken here, but I think what you are calling an iterator is in fact a comprehension-style expression in Python (strictly speaking, the one in your example is a generator expression). It does produce an iterable, but it seems you are impressed not so much by the fact that you can pass an iterable to the join function as by the fact that you can put what looks like flow control inline. Again, tell me if I'm wrong about this.
This syntax comes in two forms: wrapped in parentheses it is a generator expression (which produces a generator), and wrapped in square brackets it is a list comprehension (which produces a list). To see the difference between these two, type the following in a Python shell:
>>> (x for x in 'cool')
<generator object <genexpr> at 0x03980990>
>>> [x for x in 'cool']
['c', 'o', 'o', 'l']
I would imagine it is obvious how you can work with a list, but if you want to learn more about how generators work, you might want to check this out.
Also, the fun doesn't end there with list comprehensions. The possibilities are endless.
>>> [x for x in [1,5,4,7,8,2,6,3] if x > 3]
[5, 4, 7, 8, 6]
>>> [(x,y) for x in range(3) for y in range(3)]
[(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1), (2, 2)]
To learn more about list comprehensions in general, try here.
They're called generators, and they work in many places that accept lists or tuples. The generic term for all three is iterable. But it depends on what the code in question does. If it just iterates then a generator will work. If it tries to get the len() or access items by index, it won't.
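A quick sketch of that difference, for Python 3. The value of name is made up for the example:

```python
name = 'Ada*Lovelace'  # hypothetical input, not from the question

# join() only iterates over its argument, so a generator is fine.
words = (part for part in name.split('*'))
joined = ' '.join(words)

# len() needs a sized container, which a generator is not.
try:
    len(part for part in name.split('*'))
    has_len = True
except TypeError:
    has_len = False

# A generator can be consumed only once: join() already exhausted it.
leftover = list(words)

print(joined)
print(has_len)
print(leftover)
```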
There isn't a list of functions that accept generators or iterables, no; nobody organizes documentation that way.
Technically, the argument to str.join() in your example is called a "generator expression". A generator expression evaluates to an iterable object (note that an iterable is not necessarily an iterator, but iterators are iterable).
I assume your question really was about "functions that accept generator expressions". If yes, the answer is above: any function that expects an iterable, since arguments are eval'd before being passed so the generator expression is turned into an iterable before the function is actually called.
Note that there's a distinction to be made between iterables and "sequence types" (strings, tuples, lists etc.): the latter are indeed iterable, but they have some other specific properties too (i.e. they have a length, support indexing, and can be iterated more than once), so not all functions expecting a sequence will work with non-sequence iterables. But this is usually documented.
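If you want to check which category an object falls into, the abstract base classes in collections.abc can be used with isinstance. A small sketch for Python 3, with made-up sample data:

```python
from collections.abc import Iterable, Iterator, Sequence

names = ['a', 'b', 'c']               # hypothetical sample data
gen = (n.upper() for n in names)      # generator expression

list_is_iterable = isinstance(names, Iterable)   # lists can be looped over
list_is_iterator = isinstance(names, Iterator)   # but a list is not its own iterator
gen_is_iterator = isinstance(gen, Iterator)      # a generator is an iterator
gen_is_sequence = isinstance(gen, Sequence)      # no len(), no indexing
str_is_sequence = isinstance('abc', Sequence)
set_is_sequence = isinstance({1, 2}, Sequence)   # iterable, yet not a sequence

first_pass = list(gen)
second_pass = list(gen)               # the generator is already exhausted

print(list_is_iterable, list_is_iterator)
print(gen_is_iterator, gen_is_sequence)
print(str_is_sequence, set_is_sequence)
print(first_pass, second_pass)
```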

How does python "know" what to do with the "in" keyword?

I'm a bit bewildered by the "in" keyword in python.
If I take a sample list of tuples:
data = [
    (5, 1, 9.8385465),
    (10, 1, 8.2087544),
    (15, 1, 7.8788187),
    (20, 1, 7.5751283)
]
I can do two different "for - in" loops and get different results:
for G,W,V in data:
    print G,W,V
This prints each set of values on one line, e.g. 5 1 9.8385465
for i in data:
    print i
This prints the whole tuple, e.g. (5, 1, 9.8385465)
How does python "know" that by providing one variable I want to assign the tuple to a variable, and that by providing three variables I want to assign each value from the tuple to one of those variables?
According to the for compound statement documentation:
Each item in turn is assigned to the target list using the standard
rules for assignments...
Those "standard rules" are in the assignment statement documentation, specifically:
Assignment of an object to a target list is recursively defined as
follows.
If the target list is a single target: The object is assigned to that target.
If the target list is a comma-separated list of targets: The object must be an iterable with the same number of items as there are targets
in the target list, and the items are assigned, from left to right, to
the corresponding targets.
So this different behaviour, depending on whether you assign to a single target or a list of targets, is baked right into Python's fundamentals, and applies wherever assignment is used.
This isn't really a feature of the in keyword, but of the Python language. The same works with assignment.
x = (1, 2, 3)
print(x)
# (1, 2, 3)
a, b, c = (1, 2, 3)
print(a)
# 1
print(b)
# 2
print(c)
# 3
So to answer your question, it's more that Python knows what to do with assignments when you either:
assign a tuple to a variable, or
assign a tuple to a number of variables equal to the number of items in the tuple
It's called tuple unpacking, and has nothing to do with the in keyword.
The for loop returns the single thing (a tuple in this case), and then that tuple gets assigned -- to a single item in the second case, or multiple items in the first case.
If you try specifying the incorrect number of variables:
for G,W in data:
    print G,W
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: too many values to unpack
Iterators notionally return a single item (the for a in b syntax).
If you specify more than one target for a, Python expects each returned value to be itself iterable and to contain the same number of items as your list of variables. It raises an error at runtime if you write for a, b in c and the elements of c do not contain exactly two items.
So in short it "knew" because that is what you told it. It will explode at runtime if you lied ...
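The same behaviour in Python 3 syntax, as a sketch (the question's snippets are Python 2; the data here is a shortened copy of the question's list):

```python
data = [
    (5, 1, 9.8385465),
    (10, 1, 8.2087544),
]

# One target: each tuple is assigned whole.
whole = [row for row in data]

# Three targets: each tuple is unpacked item by item.
unpacked = []
for g, w, v in data:
    unpacked.append(g + w)

# Two targets for three-item tuples: the unpacking fails at runtime.
try:
    for g, w in data:
        pass
    message = None
except ValueError as exc:
    message = str(exc)

print(whole[0])
print(unpacked)
print(message)
```

The error only appears when the loop actually runs, which is the "explode at runtime" part.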

What does the list() function do in Python?

I know that the list() constructor creates a new list but what exactly are its characteristics?
What happens when you call list((1,2,3,4,[5,6,7,8],9))?
What happens when you call list([[[2,3,4]]])?
What happens when you call list([[1,2,3],[4,5,6]])?
From what I can tell, calling the constructor list removes the most outer braces (tuple or list) and replaces them with []. Is this true? What other nuances does list() have?
list() converts the iterable passed to it to a list. If the iterable is already a list then a shallow copy is returned, i.e. only the outermost container is new; the objects inside it are still the same.
>>> t = (1,2,3,4,[5,6,7,8],9)
>>> lst = list(t)
>>> lst[4] is t[4] #outermost container is now a list() but inner items are still same.
True
>>> lst1 = [[[2,3,4]]]
>>> id(lst1)
140270501696936
>>> lst2 = list(lst1)
>>> id(lst2)
140270478302096
>>> lst1[0] is lst2[0]
True
Python has a well-established documentation set for every release version, readable at https://docs.python.org/. The documentation for list() states that list() is merely a way of constructing a list object, of which these are the listed ways:
Using a pair of square brackets to denote the empty list: []
Using square brackets, separating items with commas: [a], [a, b, c]
Using a list comprehension: [x for x in iterable]
Using the type constructor: list() or list(iterable)
The list() function accepts any iterable as its argument, and the return value is a list object.
Further reading: https://docs.python.org/3.4/library/stdtypes.html#typesseq-list
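As a rough illustration of "any iterable", here is a sketch for Python 3 (3.7 or later is assumed for the dict key order). The variable names are invented for the example:

```python
from_tuple = list((1, 2, 3, 4, [5, 6, 7, 8], 9))
from_string = list('abc')                    # a string yields its characters
from_range = list(range(3))
from_dict = list({'a': 1, 'b': 2})           # iterating a dict yields its keys
from_gen = list(x * x for x in range(4))     # a generator expression works too

# Passing a list gives a new outer list holding the same inner objects.
original = [[1, 2, 3], [4, 5, 6]]
copied = list(original)

print(from_tuple)
print(from_string, from_range, from_dict, from_gen)
print(copied is original, copied[0] is original[0])
```

Only one level is ever iterated, so nested lists are carried over as they are.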
Yes it is true.
It's very simple. list() takes an iterable object as input and adds its elements to a newly created list. The elements can be anything. An element can itself be another list or iterable object, and it will be added to the new list as it is, i.e. no nested processing will happen.
You said: "From what I can tell, calling the constructor list removes the most outer braces (tuple or list) and replaces them with []. Is this true?"
IMHO, this is not a good way to think about what list() does. True, square brackets [] are used to write a list literal, and are used when you tell a list to represent itself as a string, but ultimately, that's just notation. It's better to think of a Python list as a particular kind of container object with certain properties, eg it's ordered, indexable, iterable, mutable, etc.
Thinking of the list() constructor in terms of it performing a transformation on the kind of brackets of a tuple that you pass it is a bit like saying adding 3 to 6 turns the 6 upside down to make 9. It's true that a '9' glyph looks like a '6' glyph turned upside down, but that's got nothing to do with what happens on the arithmetic level, and it's not even true of all fonts.
aTuple = (123, 'xyz', 'zara', 'abc')
aList = list(aTuple)
print "List elements : ", aList
When we run the above program, it produces the following result:
List elements : [123, 'xyz', 'zara', 'abc']
It is another way to create a list in python. How convenient!
Your question is vague, but the output is shown below. list() doesn't "replace" the outer braces; it creates a list, a data structure that holds any values in order (one after the other), and lists can be nested inside one another. You can add elements with append (or with insert, at a specified index) and remove them with pop. Tuples, on the other hand, are immutable: once created they cannot grow, shrink, or have items reassigned.
WHEN:
list((1,2,3,4,[5,6,7,8],9))
RETURNS:
[1, 2, 3, 4, [5, 6, 7, 8], 9]
WHEN:
list([[[2,3,4]]])
RETURNS:
[[[2, 3, 4]]]
WHEN:
list([[1,2,3],[4,5,6]])
RETURNS:
[[1, 2, 3], [4, 5, 6]]

Python append() vs. + operator on lists, why do these give different results?

Why do these two operations (append() and +) give different results?
>>> c = [1, 2, 3]
>>> c
[1, 2, 3]
>>> c += c
>>> c
[1, 2, 3, 1, 2, 3]
>>> c = [1, 2, 3]
>>> c.append(c)
>>> c
[1, 2, 3, [...]]
>>>
In the last case there's actually an infinite recursion. c[-1] and c are the same. Why is it different with the + operation?
To explain "why":
The + operation adds the array elements to the original array. The array.append operation inserts the array (or any object) into the end of the original array, which results in a reference to self in that spot (hence the infinite recursion in your case with lists, though with arrays, you'd receive a type error).
The difference here is that the + operation acts specifically when you add an array (it's overloaded like others, see this chapter on sequences) by concatenating the elements. The append method, however, does literally what you ask: it appends the object on the right-hand side that you give it (the array or any other object), instead of taking its elements.
An alternative
Use extend() if you want to use a function that acts similar to the + operator (as others have shown here as well). It's not wise to do the opposite: to try to mimic append with the + operator for lists (see my earlier link on why). More on lists below:
Lists
[edit] Several commenters have suggested that the question is about lists and not about arrays. The question has changed, though I should've included this earlier.
Most of the above about arrays also applies to lists:
The + operator concatenates two lists together. The operator will return a new list object.
list.append does not append one list to another, but appends a single object (which here is a list) at the end of your current list. Adding c to itself, therefore, leads to infinite recursion.
As with arrays, you can use list.extend to extend a list with another list (or iterable). This will change your current list in situ, as opposed to +, which returns a new list.
Little history
For fun, a little history: the birth of the array module in Python in February 1993. It might surprise you, but arrays were added way after sequences and lists came into existence.
The concatenation operator + is a binary infix operator which, when applied to lists, returns a new list containing all the elements of each of its two operands. The list.append() method is a mutator on list which appends its single object argument (in your specific example the list c) to the subject list. In your example this results in c appending a reference to itself (hence the infinite recursion).
An alternative to '+' concatenation
The list.extend() method is also a mutator method which concatenates its sequence argument with the subject list. Specifically, it appends each of the elements of sequence in iteration order.
An aside
Being an operator, + returns the result of the expression as a new value. Being a non-chaining mutator method, list.extend() modifies the subject list in-place and returns nothing.
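A short sketch of that aside, with made-up lists a and b:

```python
a = [1, 2, 3]
b = [4, 5]

# + builds and returns a new list; neither operand changes.
combined = a + b
a_after_plus = list(a)    # snapshot of a, still [1, 2, 3]

# extend() changes a in place and returns None.
result = a.extend(b)

print(combined, a_after_plus)
print(result, a)
print(combined is a)
```

Because extend() returns None, writing a = a.extend(b) would lose the list.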
Arrays
I've added this due to the potential confusion which Abel's answer above may cause by mixing the discussion of lists, sequences and arrays.
Arrays were added to Python after sequences and lists, as a more efficient way of storing arrays of integral data types. Do not confuse arrays with lists. They are not the same.
From the array docs:
Arrays are sequence types and behave very much like lists, except that the type of objects stored in them is constrained. The type is specified at object creation time by using a type code, which is a single character.
append appends a single element to a list. If you want to extend the list with the elements of another list, you need to use extend.
>>> c = [1, 2, 3]
>>> c.extend(c)
>>> c
[1, 2, 3, 1, 2, 3]
Python lists are heterogeneous, that is, the elements in the same list can be any type of object. The expression c.append(c) appends the object c, whatever it may be, to the list. In this case it makes the list itself a member of the list.
The expression c += c extends c in place with its own elements (for lists, += behaves like extend()). The plain + operator, by contrast, is defined on lists to create a new list whose contents are the elements of the first list followed by the elements of the second list.
So these are really just different expressions used to do different things by design.
The method you're looking for is extend(). From the Python documentation:
list.append(x)
Add an item to the end of the list; equivalent to a[len(a):] = [x].
list.extend(L)
Extend the list by appending all the items in the given list; equivalent to a[len(a):] = L.
list.insert(i, x)
Insert an item at a given position. The first argument is the index of the element before which to insert, so a.insert(0, x) inserts at the front of the list, and a.insert(len(a), x) is equivalent to a.append(x).
You should use extend():
>>> c=[1,2,3]
>>> c.extend(c)
>>> c
[1, 2, 3, 1, 2, 3]
other info: append vs. extend
See the documentation:
list.append(x)
Add an item to the end of the list; equivalent to a[len(a):] = [x].
list.extend(L)
Extend the list by appending all the items in the given list; equivalent to a[len(a):] = L.
c.append(c) "appends" c to itself as an element. Since a list is a reference type, this creates a recursive data structure.
c += c is equivalent to extend(c), which appends the elements of c to c.
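Here is a sketch that puts the three cases side by side. The alias names are invented for the example; they exist only to show whether the original list object was modified:

```python
# += on a list extends the same object in place.
c = [1, 2, 3]
alias_c = c
c += c
same_object_after_iadd = alias_c is c

# d = d + d builds a new list and rebinds the name d.
d = [1, 2, 3]
alias_d = d
d = d + d
same_object_after_plus = alias_d is d

# append(e) stores a reference to the list inside itself.
e = [1, 2, 3]
e.append(e)
last_is_self = e[-1] is e

print(c, same_object_after_iadd)
print(d, alias_d, same_object_after_plus)
print(len(e), last_is_self)
print(repr(e))
```

The [...] in the repr of e is how Python displays a list that contains itself, rather than recursing forever while printing.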
