How the zip() function works in Python - python

I want to ask a question about zip() in python.
Consider the following simple code to demonstrate zip().
a = ['1', '2', '3']
b = ['a', 'b', 'c']
for key, value in zip(a, b):
    print(key + value)
I know that the following output is produced:
1a
2b
3c
where the corresponding elements of the two lists are concatenated.
As a beginner in Python3, I understand the following about zip():
zip() creates a zip object (related to OOP), whose contents can be shown using list():
my_zip = zip(a, b)
print(my_zip)
print(list(my_zip))
>>> <zip object at 0xsomelocation>
>>>[('1', 'a'), ('2', 'b'), ('3', 'c')]
such that the zip object is a list of tuples.
My confusion is in this line from the original block of code, which I don't really understand:
for key, value in zip(a, b):
My interpretation is that as we are looping through our zip object, which has some innate __next__() method called on by our for loop, we loop through each tuple in turn.
For our first iteration of the loop, we get:
('1', 'a')
and python assigns '1' and 'a' to our variables key and value respectively. This is repeated until the end of the list dimensions i.e. 3 times.
Is this the correct interpretation of what is happening in our code?

such that the zip object is a list of tuples.
zip() doesn't return a list of tuples. It returns an iterator of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables.
The iterator stops when the shortest input iterable is exhausted. With a single iterable argument, it returns an iterator of 1-tuples. With no arguments, it returns an empty iterator.
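A minimal sketch of the behaviour described above, reusing the lists from the question. The variable names (pairs, first, remaining and so on) are only illustrative.

```python
a = ['1', '2', '3']
b = ['a', 'b', 'c']

pairs = zip(a, b)

# A zip object is its own iterator: iter() hands back the same object.
is_iterator = iter(pairs) is pairs

first = next(pairs)        # consumes the first tuple
remaining = list(pairs)    # consumes whatever is left
after = list(pairs)        # empty: the iterator is now exhausted

# zip() stops as soon as the shortest input runs out.
short = list(zip([1, 2, 3], 'ab'))

# One argument gives 1-tuples, no arguments gives an empty iterator.
single = list(zip('xy'))
empty = list(zip())

print(first, remaining, after, short, single, empty)
```

Because the zip object is consumed as it is read, calling list() on it a second time gives an empty list, which a real list of tuples would not do.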
and python assigns '1' and 'a' to our variables key and value respectively. This is repeated until the end of the list dimensions i.e. 3 times.
Yes, the rest of your interpretation is correct.
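To make that interpretation concrete, here is a rough hand-written equivalent of the for loop from the question. It is only a sketch of what the loop does conceptually; the names iterator, pair and output are made up for illustration.

```python
a = ['1', '2', '3']
b = ['a', 'b', 'c']

output = []
iterator = iter(zip(a, b))       # the for loop calls iter() once
while True:
    try:
        pair = next(iterator)    # then next() on every pass, e.g. ('1', 'a')
    except StopIteration:        # raised when the zip object is exhausted
        break
    key, value = pair            # tuple unpacking into the two loop variables
    output.append(key + value)

print(output)
```

The loop body runs three times because the zip object yields three tuples before raising StopIteration.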
BONUS:
zip() should only be used with unequal length inputs when you don’t care about trailing, unmatched values from the longer iterables. If those values are important, use itertools.zip_longest() instead.
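A short sketch of the difference, using two made-up lists of unequal length (names and numbers are illustrative only):

```python
from itertools import zip_longest

names = ['a', 'b', 'c']
numbers = [1, 2]

# zip() silently drops the unmatched 'c'.
truncated = list(zip(names, numbers))

# zip_longest() pads the shorter input, with None by default.
padded = list(zip_longest(names, numbers))

# The padding value can be chosen with fillvalue.
padded_custom = list(zip_longest(names, numbers, fillvalue=0))

print(truncated, padded, padded_custom)
```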

Related

Comparing lists by min function

I was trying to compare several lists and get the shortest one with the built-in min() function (I know this isn't what min() is made for, but I was just experimenting), and I got some results that left me unsure what the output is based on:
min(['1','0','l'], ['1', '2'])
>>> ['1', '0', 'l']
min(['1','2','3'], ['1', '2'])
>>> ['1', '2']
min(['1', '2'], ['4'])
>>> ['1', '2']
min(['1', 'a'], ['0'])
>>> ['0']
min(['1', 'a'], ['100000000'])
>>> ['1', 'a']
I don't know what the output is based on, and I hope someone can clarify this behavior.
The min() function takes a keyword argument, key, which you can use to specify the value to compare by. It is a function that is called with each argument (in your case, each list) and returns the value used for the comparison.
So to get the shortest list, you can use this code:
min(['1','0','l'], ['1', '2'], key=len)
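As a small sketch of the key argument in both call styles (the variable names are illustrative):

```python
lists = [['1', '0', 'l'], ['1', '2'], ['4']]

# Single iterable argument: the shortest list inside it.
shortest = min(lists, key=len)

# Several positional arguments: the shortest of those.
shortest_of_two = min(['1', '0', 'l'], ['1', '2'], key=len)

# Without key, the lists themselves are compared instead of their lengths.
smallest = min(['1', '0', 'l'], ['1', '2'])

print(shortest, shortest_of_two, smallest)
```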
Regarding your code and how the min() function determines the result:
You can look at your list like a string (which is just a list of characters in theory). If you'd compare a string to another one, you'd look at the letters from left to right and order them by their leftmost letters. For example abc < acb because b comes before c in the alphabet (and a=a so we can ignore the first letter).
With lists it's the same. It will go through the items from left to right and compare them until it finds the first one which is not equal in both lists. The smaller one of those is then used to determine the "smaller" list.
min finds the 'smallest' of the lists using the comparison operators they provide. For lists, that is lexicographical order: of two lists, the one with the larger element at the first index where they differ is the larger list. If one list is a prefix of the other, the shorter one is the smaller.
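The left-to-right comparison described above can be sketched as a small function. list_less_than is a hypothetical helper written for illustration; it mimics what the built-in < does for lists and is not how CPython implements it.

```python
def list_less_than(left, right):
    """Illustrative re-implementation of `left < right` for lists."""
    for x, y in zip(left, right):
        if x != y:
            # The first differing pair decides the result.
            return x < y
    # No difference found in the common part: the shorter list is smaller.
    return len(left) < len(right)


print(list_less_than(['1', '0', 'l'], ['1', '2']))   # decided by '0' < '2'
print(list_less_than(['1', '2'], ['1', '2', '3']))   # prefix case
```

With this rule, min() simply returns whichever argument compares as smaller, which accounts for the outputs in the question.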
You can check what a built-in function does in the documentation.
As you can see, the min() function can be called in two ways:
min(iterable, *[, key, default]): returns the smallest item in an iterable such as a list.
min(arg1, arg2, *args[, key]): which is what you are currently using. It returns the smallest of the arguments. When comparing two lists to see which one is smaller, Python looks for the first index at which the two lists hold different values, e.g.
a = [3,5,1]
b = [3,3,1]
result = a > b  # True
Here the first index at which the lists differ is index 1, so the comparison is 5 > 3 (which is True).
Using this logic, min() returns the list that has the smaller element at the first index where the lists differ.
See lexicographical order.
If the elements are characters (strings), they are also compared lexicographically, by character code, and so
>>> 'a' < 'b'
True
>>> '1' < '2'
True
>>> 'a' < 'A'
False
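A quick sketch of why those comparisons come out that way: single characters compare by their code points, which ord() exposes, and longer strings are compared character by character. The variable names are illustrative.

```python
# Code points of a digit, an uppercase and a lowercase letter.
digit, upper, lower = ord('1'), ord('A'), ord('a')

# Sorting strings therefore puts digits first, then uppercase, then lowercase.
ordered = sorted(['a', 'A', '1'])

# A string that is a prefix of another compares as smaller.
prefix_is_smaller = '1' < '100000000'

# That is what decides the last example in the question.
result = min(['1', 'a'], ['100000000'])

print(digit, upper, lower, ordered, prefix_is_smaller, result)
```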
From the documentation:
Docstring:
min(iterable, *[, default=obj, key=func]) -> value
min(arg1, arg2, *args, *[, key=func]) -> value
With a single iterable argument, return its smallest item. The
default keyword-only argument specifies an object to return if
the provided iterable is empty.
With two or more arguments, return the smallest argument.
So, for example, since 5 < 6 at the first index:
IN: min([5,4,3], [6])
OUT: [5, 4, 3]
As @Tim Woocker wrote, you should pass a function as the key argument to specify what you want to compare.

Python: about sort

I noticed that the results of the following two snippets are different. One is a sorted list, while the other looks like a sorted dictionary. I can't figure out why adding .items() makes this difference:
aa={'a':1,'d':2,'c':3,'b':4}
bb=sorted(aa,key=lambda x:x[0])
print(bb)
#['a', 'b', 'c', 'd']
aa={'a':1,'d':2,'c':3,'b':4}
bb=sorted(aa.items(),key=lambda x:x[0])
print(bb)
# [('a', 1), ('b', 4), ('c', 3), ('d', 2)]
The first version implicitly sorts the keys in the dictionary, and is equivalent to sorting aa.keys(). The second version sorts the items, that is: a list of tuples of the form (key, value).
When you iterate over a dictionary, you get its keys, not (key, value) pairs. sorted() accepts any iterable, which is why you see a difference.
You can verify this by printing while iterating over the dict:
aa={'a':1,'d':2,'c':3,'b':4}
for key in aa:
    print(key)
for key in aa.keys():
    print(key)
Both of the for loops above print the same values.
In the second example, the items() method applied to a dictionary returns an iterable collection of tuples (dictionary_key, dictionary_value). That collection is then sorted.
In the first example, the dictionary is iterated as a collection of its keys. (Note: with key=lambda x: x[0], only the first character of each key is used for comparison while sorting, which is probably NOT what you want.)
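The points made in these answers can be checked with a short sketch. The words dictionary is a made-up example, added only to show the first-character pitfall with multi-character keys.

```python
aa = {'a': 1, 'd': 2, 'c': 3, 'b': 4}

# Sorting the dict itself sorts its keys.
keys_sorted = sorted(aa)
same_as_keys = keys_sorted == sorted(aa.keys())

# Sorting the items gives a list of (key, value) tuples, ordered by key.
by_key = sorted(aa.items())

# To order by value instead, pick the second element of each tuple.
by_value = sorted(aa.items(), key=lambda item: item[1])

# Pitfall: on plain keys, x[0] is only the first character of the key.
words = {'avocado': 1, 'apple': 2, 'banana': 3}
first_char_only = sorted(words, key=lambda k: k[0])
full_comparison = sorted(words)

print(keys_sorted, by_key, by_value, first_char_only, full_comparison)
```

In the last two lines, 'avocado' and 'apple' tie on their first character, so the stable sort keeps them in insertion order, while the plain sorted(words) compares the whole strings.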

Popping first element from a Python tuple

Is there any way to pop the first element from a Python tuple?
For example, for
('A', 'B', 'C')
I would like to pop off the 'A' and have a tuple containing 'B' and 'C'.
Since tuples are immutable I understand that I need to copy them to a new tuple. But how can I filter out only the first element of the tuple?
With this tuple
x = ('A','B','C')
you can get a tuple containing all but the first element using a slice:
x[1:]
Result:
('B','C')
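As a sketch, here is the slice next to an alternative that also captures the removed element, using starred unpacking. The names first, others and others_tuple are illustrative.

```python
x = ('A', 'B', 'C')

# Slicing builds a new tuple; x itself is left unchanged.
rest = x[1:]

# Starred unpacking "pops" the first element and collects the rest in a list.
first, *others = x
others_tuple = tuple(others)   # convert back if a tuple is needed

print(first, rest, others_tuple, x)
```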

What is this Python magic?

If you do this {k:v for k,v in zip(*[iter(x)]*2)} where x is a list of whatever, you'll get a dictionary with all the odd elements as keys and even ones as their values. woah!
>>> x = [1, "cat", "hat", 35,2.5, True]
>>> d = {k:v for k,v in zip(*[iter(x)]*2)}
>>> d
{1: 'cat', 'hat': 35, 2.5: True}
I have a basic understanding of how dictionary comprehensions work, how zip works, how * extracts arguments, how [iter(x)]*2 concatenates two copies of the list, and so I was expecting a one-to-one correspondence like {1: 1, "cat": "cat" ...}.
What's going on here?
This is an interesting little piece of code for sure! The main thing it utilizes that you might not expect is that objects are, in effect, passed by reference (they're actually passed by assignment, but hey). iter() constructs an object, so "copying" it (using multiplication on a list, in this case) doesn't create a new one, but rather adds another reference to the same one. That means you have a list where l[0] is an iterator, and l[1] is the same iterator - accessing them both accesses the very same object.
Every time the next element of the iterator is accessed, it continues where it last left off. Since elements are accessed alternately between the first and second elements of the tuples that zip() creates, the single iterator's state is advanced across both elements in the tuple.
After that, the dictionary comprehension simply consumes these pair tuples as they expand to k, v - as they would in any other dictionary comprehension.
iter(x) creates an iterator over the iterable (a list or similar) x. [iter(x)]*2 then builds a list that contains the same iterator twice; the iterator itself is not copied. This means that if I ask one of them for a value, the other (which is the same object) advances as well.
zip() now gets the two iterators (which are the same) as two arguments via the zip(* ... ) syntax. It then produces pairs from the two arguments it got: it asks the first iterator for a value (and receives x[0]), then it asks the other iterator for a value (and receives x[1]), then it forms a pair of the two values and puts that in its output. It does this repeatedly until the iterators are exhausted, forming a pair of x[2] and x[3], then a pair of x[4] and x[5], etc.
These pairs are then passed to the dictionary comprehension, which turns them into the key/value entries of a dictionary.
Easier to read might be this:
{ k: v for (k, v) in zip(x[::2], x[1::2]) }
But that might not be as efficient.
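The shared-iterator behaviour described in the answers can be checked directly. This is a sketch using the list from the question; the names iterators, pairs and independent are illustrative.

```python
x = [1, "cat", "hat", 35, 2.5, True]

# Multiplying the list repeats the reference; there is still one iterator.
iterators = [iter(x)] * 2
same_object = iterators[0] is iterators[1]

# zip() pulls from "both" arguments in turn, so it walks x two items at a time.
pairs = list(zip(*iterators))

# The same idea written without the comprehension.
d = dict(zip(*[iter(x)] * 2))

# Two separate iterators give the one-to-one pairing the question expected.
independent = list(zip(iter(x), iter(x)))

print(same_object, pairs, d, independent[:2])
```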

How does the following map function work?

>>> uneven = [['a','b','c'],['d','e'],['g','h','i']]
>>> map(None, *uneven)
[('a', 'd', 'g'), ('b', 'e', 'h'), ('c', None, 'i')]
The code above can be used to find the transpose of a matrix.
However, I am unable to understand how it works.
When using the * operator, the list is unpacked into positional arguments for map. This is what you're actually running:
>>> map(None, ['a','b','c'], ['d','e'], ['g','h','i'])
When you pass multiple iterables to map, the function (in this case None) is called with one item from each iterable in parallel. It processes 'a', 'd', 'g' first, and so on.
Edit:
As pointed out by Jon below, when you pass in None as the map function, it gets special cased to be the identity function, i.e. lambda id: id. This special casing of None's use in map has been removed in Python 3.
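Since map(None, ...) only works in Python 2, a Python 3 sketch that produces the same padded result for this input can use itertools.zip_longest. This is offered as one possible equivalent and is not part of the original answer.

```python
from itertools import zip_longest

uneven = [['a', 'b', 'c'], ['d', 'e'], ['g', 'h', 'i']]

# * unpacks the outer list into three positional arguments;
# zip_longest pads the short row with None, as map(None, ...) did.
transposed = list(zip_longest(*uneven))

# Plain zip() would stop at the shortest row instead.
truncating = list(zip(*uneven))

print(transposed)
print(truncating)
```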
From the documentation of map:
map(function, sequence[, sequence, ...]) -> list
If more than one sequence is given, the function is called with an argument list consisting of the corresponding item of each sequence, substituting None for missing values when not all sequences have the same length. If the function is None, return a list of the items of the sequence
Unpacking the list of sequences with the * operator therefore zips them together according to the position of the items in each sequence.
