What is this Python magic? - python

If you do this {k:v for k,v in zip(*[iter(x)]*2)} where x is a list of whatever, you'll get a dictionary with all the odd elements as keys and even ones as their values. woah!
>>> x = [1, "cat", "hat", 35,2.5, True]
>>> d = {k:v for k,v in zip(*[iter(x)]*2)}
>>> d
{1: 'cat', 'hat': 35, 2.5: True}
I have a basic understanding of how dictionary comprehensions work, how zip works, how * extracts arguments, how [iter(x)]*2 concatenates two copies of the list, and so I was expecting a one-to-one correspondence like {1: 1, "cat": "cat" ...}.
What's going on here?

This is an interesting little piece of code for sure! The main thing it utilizes that you might not expect is that objects are, in effect, passed by reference (they're actually passed by assignment, but hey). iter() constructs an object, so "copying" it (using multiplication on a list, in this case) doesn't create a new one, but rather adds another reference to the same one. That means you have a list where l[0] is an iterator, and l[1] is the same iterator - accessing them both accesses the very same object.
Every time the next element of the iterator is accessed, it continues where it last left off. Since elements are accessed alternately between the first and second elements of the tuples that zip() creates, the single iterator's state is advanced across both elements in the tuple.
After that, the dictionary comprehension simply consumes these pair tuples as they expand to k, v - as they would in any other dictionary comprehension.
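A minimal sketch of the shared-iterator behaviour described above, reusing the list from the question. The variable names (iterators, same_object, and so on) are only illustrative.

```python
x = [1, "cat", "hat", 35, 2.5, True]

# A list that holds the same iterator object twice (not two iterators).
iterators = [iter(x)] * 2
same_object = iterators[0] is iterators[1]

# Pulling values alternately from "both" entries walks straight through x,
# because there is only one underlying position to advance.
first = next(iterators[0])
second = next(iterators[1])
third = next(iterators[0])

# zip() performs the same alternation, so it emits consecutive pairs.
pairs = list(zip(*[iter(x)] * 2))

print(same_object, first, second, third)
print(pairs)
```

Here same_object is True, and the three next() calls return 1, "cat" and "hat" in that order, which is exactly the alternation zip() relies on.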

iter(x) creates an iterator over the iterable x (a list or similar). [iter(x)]*2 then builds a list that holds that same iterator twice; the iterator itself is not copied. This means that if I ask one of them for a value, the other (which is the same object) advances as well.
zip() now receives the two iterators (which are the same) as two arguments via the zip(*...) syntax. It produces pairs from its two arguments: it asks the first iterator for a value (and receives x[0]), then asks the other for a value (and receives x[1]), and then forms a pair of the two values. It repeats this until the iterators are exhausted, so it next forms a pair of x[2] and x[3], then a pair of x[4] and x[5], and so on.
These pairs are then passed to the dictionary comprehension, which turns each pair into a key/value entry of the dictionary.
Easier to read might be this:
{ k: v for (k, v) in zip(x[::2], x[1::2]) }
But that might not be as efficient.
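As a quick check, the two spellings can be compared side by side. This is only a sketch using the question's data. The slicing version builds two intermediate lists (x[::2] and x[1::2]) before zipping, whereas the iterator version does not, which is the likely source of the efficiency difference.

```python
x = [1, "cat", "hat", 35, 2.5, True]

by_iterator = {k: v for k, v in zip(*[iter(x)] * 2)}
by_slicing = {k: v for k, v in zip(x[::2], x[1::2])}

# With an odd number of elements, both versions silently drop the last one,
# because zip() stops as soon as either argument runs out.
odd = x + ["leftover"]
odd_by_iterator = {k: v for k, v in zip(*[iter(odd)] * 2)}
odd_by_slicing = {k: v for k, v in zip(odd[::2], odd[1::2])}

print(by_iterator == by_slicing)
print(odd_by_iterator == odd_by_slicing)
```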

Related

how to remove duplicate list in the values of a dictionary

i have a dictionary
dictionary = {
1:[[1,2],[3,4],[5,6],[7,8],[1,2]],
2:[[5,6],[7,8],[1,2]],
3:[[3,4],[5,6],[3,4]]
}
How can i remove duplicate list in each value of the dictionary?
output = {
1:[[3,4],[5,6],[7,8],[1,2]],
2:[[5,6],[7,8],[1,2]],
3:[[3,4],[5,6]]
}
How can i remove all duplicates?
output = [[1,2],[3,4],[5,6],[7,8]]
i have tried doing for loops, like so:
for i in dictionary.values():
    for j in i:
        for k in i:
            if j == k:
                i.remove(k)
but i'm just a beginner so i'm not getting any results...
The usual way to do this is to leverage a set, which is like a dictionary that has only keys and no values. Dictionaries (and sets) rely on their keys to be "hashable," which means that you can feed the key through some hash function and get the same result every time. In Python you can call this hash function with hash(some_object), which internally invokes some_object.__hash__().
The problem with this approach is that lists are not hashable. None of Python's built-in mutable containers (things you can change with methods like list.append, set.add, or dict.update) are. This means you must either check equality by hand, or convert each list into some form that is hashable, use the set, and then convert it back. I think the latter is probably your best bet.
To that end, let's use a tuple. Tuples are just like lists except that they are not idiomatically homogeneous (so mixing types is common, not just technically allowed) and their order has semantic meaning. Consider an ordered pair on a plane: it would matter deeply if the order flipped, since (1, 4) is not the same point as (4, 1). Unlike lists, however, tuples are immutable, and they are hashable as long as everything they contain is hashable.
d = {1: [[1,2],[3,4],[5,6],[7,8],[1,2]],
2: [[5,6],[7,8],[1,2]],
3: [[3,4],[5,6],[3,4]]}
# we'll use a set comprehension here because it's concise
uniques = {tuple(sublst) for lst in d.values() for sublst in lst}
result = [list(tup) for tup in uniques] # then just change them back to lists
Note that the conversion to set and back does lose all ordering. If ordering is important then you'll have to do something like iterate through every sub list, convert it to tuple, check to see if it's already been seen, and if not add it to the seen set and append it to your final list.
d = {1: [[1,2],[3,4],[5,6],[7,8],[1,2]],
2: [[5,6],[7,8],[1,2]],
3: [[3,4],[5,6],[3,4]]}
seen = set()
result = []
for lst in d.values():
    for sublst in lst:
        tup = tuple(sublst)
        if tup not in seen:
            seen.add(tup)
            result.append(sublst)
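The question also asks for duplicates to be removed within each value separately. A possible sketch applies the same seen-set idea per key. The helper name dedupe is made up for this example, and it keeps the first occurrence of each sublist, so the order for key 1 differs slightly from the sample output in the question.

```python
d = {1: [[1, 2], [3, 4], [5, 6], [7, 8], [1, 2]],
     2: [[5, 6], [7, 8], [1, 2]],
     3: [[3, 4], [5, 6], [3, 4]]}

def dedupe(lists):
    """Return the sublists in order, skipping any that were already seen."""
    seen = set()
    unique = []
    for sublst in lists:
        tup = tuple(sublst)      # lists are unhashable, tuples are not
        if tup not in seen:
            seen.add(tup)
            unique.append(sublst)
    return unique

# Build a new dictionary; the original d is left untouched.
per_key = {key: dedupe(lists) for key, lists in d.items()}
print(per_key)
```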

Are dict comprehensions evaluated incrementally in Python?

I'd have assumed the results of purge and purge2 would be the same in the following code (remove duplicate elements, keeping the first occurrences and their order):
def purge(a):
    l = []
    return (l := [x for x in a if x not in l])
def purge2(a):
    d = {}
    return list(d := {x: None for x in a if x not in d})
t = [2,5,3,7,2,6,2,5,2,1,7]
print(purge(t), purge2(t))
But it looks like with dict comprehensions, unlike with lists, the value of d is built incrementally. Is this what's actually happening? Do I correctly infer the semantics of dict comprehensions from this sample code and their difference from list comprehensions? Does it work only with comprehensions, or also with other right-hand sides referring to the dictionary being assigned to (e.g. comprehensions nested inside other expressions, something involving iterators, comprehensions of types other than dict)? Where is it specified and full semantics can be consulted? Or is it just an undocumented behaviour of the implementation, not to be relied upon?
There's nothing "incremental" going on here. The walrus operator doesn't assign to the variable until the dictionary comprehension completes. if x not in d is referring to the original empty dictionary, not the dictionary that you're building with the comprehension, just as the version with the list comprehension is referring to the original l.
The reason the duplicates are filtered out is simply that dictionary keys are always unique. Assigning to a key that already exists doesn't add a second entry; the key keeps its original position and only its value is overwritten. It's the same as if you'd written:
return {2: None, 2: None}
you'll just get {2: None}.
So your function can be simplified to
def purge2(a):
    return list({x: None for x in a})
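To see this concretely, here is a small sketch that runs both functions from the question (it needs Python 3.8+ for the walrus operator). The dict.fromkeys line is an alternative not mentioned above; it relies on the same key-uniqueness property.

```python
def purge(a):
    l = []
    # `x not in l` always tests against the original empty list,
    # so nothing is ever filtered out.
    return (l := [x for x in a if x not in l])

def purge2(a):
    d = {}
    # `x not in d` also tests against the original empty dict;
    # duplicates vanish only because dict keys are unique.
    return list(d := {x: None for x in a if x not in d})

t = [2, 5, 3, 7, 2, 6, 2, 5, 2, 1, 7]
list_result = purge(t)
dict_result = purge2(t)
fromkeys_result = list(dict.fromkeys(t))

print(list_result)
print(dict_result)
print(fromkeys_result)
```

The list version returns every element unchanged, while both dict-based versions keep only the first occurrence of each value, in order.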

Accessing elements of a list

I have a list of strings, and I am calling a function on each string, which returns a string. I want to update the string in the list with the returned value. How can I do that?
for i in list:
    func(i)
The function func() returns a string. I want to update the list with this string. How can it be done?
If you need to update your list in place (not create a new list to replace it), you'll need to get the index that corresponds to each item you get from your loop. The easiest way to do that is to use the built-in enumerate function:
for index, item in enumerate(lst):
    lst[index] = func(item)
You can reconstruct the list with a list comprehension like this
list_of_strings = [func(str_obj) for str_obj in list_of_strings]
Or, you can use the builtin map function like this
list_of_strings = map(func, list_of_strings)
Note : If you are using Python 3.x, then you need to convert the map object to a list, explicitly, like this
list_of_strings = list(map(func, list_of_strings))
Note 1: You don't have to worry about the old list and its memory. When you make the variable list_of_strings refer to a new list by assigning to it, the reference count of the old list drops by 1. Once the reference count reaches 0, the old list is garbage collected automatically.
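The practical difference between updating in place and rebinding the name shows up when something else still refers to the list. In this sketch func is a stand-in (upper-casing is just an illustration; the question does not say what func does) and alias is a hypothetical second reference.

```python
def func(s):
    # Stand-in for the question's func(); any str -> str function would do.
    return s.upper()

# In-place update: every reference to the list sees the new strings.
lst = ["a", "b"]
alias = lst
for index, item in enumerate(lst):
    lst[index] = func(item)
in_place_alias = list(alias)

# Rebinding: lst now names a new list, alias still names the old one.
lst = ["a", "b"]
alias = lst
lst = [func(item) for item in lst]

print(in_place_alias, lst, alias)
```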
First, don't call your lists list (that's the built-in list constructor).
The most Pythonic way of doing what you want is a list comprehension:
lst = [func(i) for i in lst]
or you can build the new list with an explicit loop:
lst2 = []
for i in lst:
    lst2.append(func(i))
and you can even mutate the list in place:
for n, i in enumerate(lst):
    lst[n] = func(i)
Note: most programmers will be confused by calling the list item i in the loop above, since i is normally used as a loop index counter. I'm just using it here for consistency.
You should get used to the first version though; it's much easier to understand when you come back to the code six months from now.
Later you might also want to use a generator...
g = (func(i) for i in lst)
lst = list(g)
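You can use map() to do that. In Python 3, map() returns an iterator, so wrap it in list() and assign the result back:
lst = list(map(func, lst))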

Python list index splitting and manipulation

My question seems simple, but for a novice to python like myself this is starting to get too complex for me to get, so here's the situation:
I need to take a list such as:
L = [(a, b, c), (d, e, d), (etc, etc, etc), (etc, etc, etc)]
and make each index an individual list so that I may pull elements from each index specifically. The problem is that the list I am actually working with contains hundreds of indices such as the ones above and I cannot make something like:
L_new = list(L['insert specific index here'])
for each one, as that would mean filling up the memory with hundreds of lists corresponding to individual indices of the first list, and would be far too time- and memory-consuming from my point of view. So my question is this: how can I separate those indices and then pull individual parts from them without needing to create hundreds of individual lists (at least to the point where I won't need hundreds of individual lines to create them)?
I might be misreading your question, but I'm inclined to say that you don't actually have to do anything to be able to index your tuples. See my comment, but: L[0][0] will give "a", L[0][1] will give "b", L[2][1] will give "etc" etc...
If you really want a clean way to turn this into a list of lists you could use a list comprehension:
cast = [list(entry) for entry in L]
In response to your comment: if you want to access across dimensions I would suggest list comprehension. For your comment specifically:
crosscut = [entry[0] for entry in L]
In response to comment 2: This is largely a part of a really useful operation called slicing. Specifically to do the referenced operation you would do this:
multiple_index = [entry[0:3] for entry in L]
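Putting the pieces of this answer together with concrete data (the letters here are placeholders, since the question only shows bare names). The zip(*L) line is an extra not mentioned above; it transposes the list so that each "column" becomes its own tuple.

```python
L = [("a", "b", "c"), ("d", "e", "f"), ("g", "h", "i")]

single = L[0][1]                          # second element of the first tuple
crosscut = [entry[0] for entry in L]      # first element of every tuple
first_two = [entry[0:2] for entry in L]   # a slice of every tuple
columns = list(zip(*L))                   # all "columns" at once

print(single)
print(crosscut)
print(first_two)
print(columns)
```

None of these require converting the tuples to lists first; indexing and slicing work on tuples directly.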
Depending on your readability preferences there are actually a number of possibilities here:
# Eagerly build a list of lists:
list_of_lists = []
for sublist in L:
    list_of_lists.append(list(sublist))
# Or convert lazily with a generator function:
def iter_as_lists(L):
    for sublist in L:
        yield list(sublist)
What you have there is a list of tuples; you can index them just like a list of lists:
L[3][2]
will get the third element of the fourth tuple in your list L, since indices start at 0.
Two ways of using the inner lists:
for index, sublist in enumerate(L):
    # do something with sublist
    pass
or with an iterator
iterator = iter(L)
sublist = next(iterator) # <-- returns the first sublist
In both cases, sublist elements can be reached via
direct index
sublist[2]
iteration
iterator = iter(sublist)
next(iterator) # <-- returns the first elem of sublist
for elem in sublist:
    # do something with my elem
    pass

Python construct a dictionary data type?

I'm new to Python, and just ran into this statement
data = dict( (k, v) for k, v in data.items() if v != 'null')
I don't really understand what they are doing here to construct a dict. Could you explain it a bit to me? Why use a for loop in dict(), and why does the if come after? I didn't see anything like this in the Python docs.
Thanks guys
The code uses the dict constructor to create a new dictionary. The constructor can take an iterable of key, value pairs to initialise the new dictionary with. As others have pointed out, the example code has a generator expression that creates this iterable of key, value pairs.
The generator expression acts a little bit like a list and could be re-written like this:
mylist = []
for k, v in data.items():
    if v != 'null':
        mylist.append((k, v))
But it never actually creates a list, it just yields each value in turn as it is processed by the dict constructor.
As for why the if comes after the loop, this is the syntax chosen by the python developers, so you'd have to ask them. But notice in my re-written generator expression that the if statement is inside (i.e. after) the for statement.
I've linked already to the section on generator expressions in the python documentation but at unkulunkulu's request, here's a couple more:
Carl Groner's Introduction to List Comprehensions
Fredrik Haard's How to (Effectively) Explain List Comprehensions
The argument to dict() is a generator expression that yields tuples consisting of key, value pairs (i.e., the (k, v)) drawn from data.items(). The dict() built-in function can automatically construct a dictionary object from a list or sequence of such tuples, e.g.:
>>> kvs = [('a', 1), ('b', 2)]
>>> dict(kvs)
{'a': 1, 'b': 2}
The if v != 'null' qualifier instructs the generator to ignore/skip over those elements whose value (that is, the second item in the tuple) equals 'null' (more precisely, it only yields those pairs for which the value is not equal to 'null').
For a much more detailed explanation of generator expressions, see PEP 289.
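A small runnable sketch of the statement from the question, with made-up sample data (the keys and values here are not from the original). It also shows the equivalent dict comprehension, which gives the same result without calling dict() explicitly.

```python
# Hypothetical input: one entry carries the string 'null' as its value.
data = {"name": "Ann", "email": "null", "age": 30}

# dict() consumes the (k, v) tuples yielded by the generator expression.
from_generator = dict((k, v) for k, v in data.items() if v != 'null')

# The same filter written as a dict comprehension.
from_comprehension = {k: v for k, v in data.items() if v != 'null'}

print(from_generator)
print(from_generator == from_comprehension)
```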
