Why does set() behave so unintuitively in Python? [duplicate] - python

This question already has answers here:
Converting a list to a set changes element order
(16 answers)
Closed 2 years ago.
I don't understand why set() works the way it does...
Let's say we have two lists:
a = [1,2,-1,20,6,210,1, -11.4, 2]
b = [1,2,-1,20,6,210,1,-11.4, 2, "a"]
When I run set() on the list of numerics, a, I get a set of unique numerics ordered from smallest to largest. Ok great, that seems intuitive! Haven't found any exceptions yet:
set(a)
Out: {-11.4, -1, 1, 2, 6, 20, 210}
What happens if I throw a character in like with list b? Weirdness. The negatives are out of order and so is 6.
set(b)
Out: {-1, -11.4, 1, 2, 20, 210, 6, 'a'}
It gets worse though. What if I try to turn those sets back into lists? Pure chaos.
list(set(a))
Out: [1, 2, 6, 210, 20, -11.4, -1]
list(set(b))
Out: [1, 2, 6, 'a', 210, 20, -11.4, -1]
As you can see, these lists do contain only unique values, but they have failed to preserve much semblance of the order of the original lists.
What's going on here and why?

The set type in Python is unordered. Its iteration order can look sorted because of implementation details, but that is not guaranteed. If you need an ordered representation, use something like sorted(set(input_sequence)), which returns a sorted list after removing the duplicates. Note that sorting a list whose elements are not mutually comparable is not supported without a custom key function (so you can't sort ['a', 1] out of the box).
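As a minimal sketch of that advice, using the lists a and b from the question: sorting the set gives a deterministic order for the numeric list, while the mixed list needs a key function. The key used here (numbers first, strings last) is only an illustrative choice, not something the answer prescribes.

```python
a = [1, 2, -1, 20, 6, 210, 1, -11.4, 2]
b = a + ["a"]

# Deduplicate, then sort: the result no longer depends on set iteration order.
unique_sorted = sorted(set(a))
print(unique_sorted)  # [-11.4, -1, 1, 2, 6, 20, 210]

# Numbers and strings cannot be compared with "<" in Python 3.
try:
    sorted(set(b))
except TypeError as exc:
    print("cannot sort mixed types:", exc)

# Assumed ordering for illustration: numbers first, then strings.
# The first tuple item separates the two groups, so a number is never
# compared directly with a string.
mixed_sorted = sorted(set(b), key=lambda v: (isinstance(v, str), v))
print(mixed_sorted)  # [-11.4, -1, 1, 2, 6, 20, 210, 'a']
```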

Related

Why list(map(func, set)) always gives a sorted list in Python?

While learning about map(function, *iterables) function, I found that it applies the function with n number of arguments where n is the number of supplied iterables as args to the map() function. So, I tried and it's working as expected.
Another point is lists in Python maintain the insertion order while sets are unordered. Apart from having all these details, I'm unable to understand that -
Why does applying map() on a set and then converting the resultant map object to a list give me a sorted list while applying map() on a list and then converting the map object to a list gives me a list with the same order as it was in the supplied list?
This is my source code:
l1 = [1, 2, 4, 5, 3]
s1 = {6, 7, 9, 8, 0}
print(list(map(lambda num: num * 2, l1)))
print(list(map(lambda num: num * 2, s1)))
and this is the output:
[2, 4, 8, 10, 6]
[0, 12, 14, 16, 18]
In case of:
print(list(map(lambda num: num * 2, s1)))
each time I run the program, I'm getting the same sorted list but if I print(s1) I'm getting different orders of s1. Shouldn't I get a list with random order of doubled numbers each time I run the program?
P.S.
Another user suggested a possible duplicate, order-of-unordered-python-sets. But, if possible, I seek an updated answer to this question that covers the implementation of sets in current versions of Python (as of now, the current version is 3.10.7).
The order-of-unordered-python-sets question definitely gives a basis for understanding the reason behind this problem. But please provide your updated answers here so that they stay relevant in the coming years. Otherwise, you can definitely vote to close it.
You picked a very bad sample since {6, 7, 9, 8, 0} is printed as {0, 6, 7, 8, 9} which appears to be sorted, but it is not, it just happens to be the same order as something that is sorted. Pick {16, 6, 7, 9, 8, 0} instead and you will see {0, 16, 6, 7, 8, 9} in the output, obviously not sorted or ordered.
(Unordered) sets do have an order that you MUST NOT rely on, even / especially if you think you found order in it. Any order you see is purely by chance.
Sets are unordered (not randomly ordered). In your case, it was just a coincidence that your specific series of elements was printed in order. If you run the same code with different set elements, they may end up out of order.
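One hedged way to see where the apparent order comes from: in CPython (an implementation detail, not a language guarantee), a small non-negative integer is its own hash value, and the hash influences where the element lands in the set's internal table. Strings, by contrast, are hashed differently in each process by default, so sets of strings can print in a different order from run to run. The sketch below only inspects the hashes and then imposes an order explicitly, which is the reliable option.

```python
s1 = {6, 7, 9, 8, 0}

# CPython implementation detail (an assumption, not a language guarantee):
# a small non-negative int is its own hash value.
hashes = {n: hash(n) for n in s1}
print(hashes)

# If the output order matters, sort explicitly instead of relying on the set.
doubled = sorted(map(lambda num: num * 2, s1))
print(doubled)  # [0, 12, 14, 16, 18]
```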

How to get all possible combinations of pairs with unique elements [duplicate]

This question already has answers here:
How to get all possible combinations of a list’s elements?
(32 answers)
Closed 6 months ago.
I have a list with an even number N of elements and would like to get a list of lists of all possible and unique pairs, e.g.:
list = [1, 2, 3, 4, 5, 6]
result: [[(1,2), (3,4), (5,6)], [(1,2), (3,5), (4,6)] ...
So each list I get as the result should have N/2 elements (pairs with unique numbers). This question seemed similar to my problem, although the answer gives the lists with 2 combinations only and it doesn't work for N > 4; not sure if it's possible to rework this solution for my purposes.
I suppose that one possible option is to:
iterate through each possible order of N numbers (123456, 123465 ... 654321)
create a list of pairs for each following 2 elements ([1,2,3,4,5,6] -> [(12), (34), (56)])
sort those pairs and eliminate duplicates
But it feels that there should be a more elegant solution, would be grateful for help :)
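Unless I'm mistaken,
import itertools
l = [1, 2, 3, 4, 5, 6]
x = list(itertools.combinations(itertools.combinations(l, 2), 3))
gives you a superset of what you want: every group of three pairs, including groups in which an element repeats, such as ((1, 2), (1, 3), (1, 4)). You would still need to filter those out.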
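An alternative that never produces overlapping pairs is to build each pairing recursively: pair the first element with each possible partner, then pair up whatever remains. The function name pairings is made up for this sketch, and it assumes the input has an even number of distinct elements.

```python
def pairings(items):
    """Yield every way to split items into disjoint pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        # Everything except the chosen partner still needs to be paired.
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


result = list(pairings([1, 2, 3, 4, 5, 6]))
print(len(result))  # 15
print(result[0])    # [(1, 2), (3, 4), (5, 6)]
print(result[1])    # [(1, 2), (3, 5), (4, 6)]
```

For six elements this yields 15 pairings, whereas nesting itertools.combinations yields all 455 groups of three pairs.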

Efficient way to remove all items from another list [duplicate]

This question already has answers here:
Remove all the elements that occur in one list from another
(13 answers)
How do I subtract one list from another?
(16 answers)
Closed 2 years ago.
I have a list of items. I also have another list of items (subset of original list) I want to be removed from this list.
myitems = [1, 1, 2, 3, 3, 4, 5]
items_to_remove = [1, 4]
The output of this should be [2, 3, 3, 5]
What is the most efficient way to remove all items in items_to_remove from myitems?
My current code is:
for item in items_to_remove:
    myitems = list(filter((item).__ne__, myitems))
Because my actual use case has lots of items to be removed I am trying to find a more efficient way to do this.
The most efficient way is to create a set of the items to be removed, then use the set to filter the first list:
s = set(items_to_remove)
result = [x for x in myitems if x not in s]
With the sample list values, this produces the desired result:
[2, 3, 3, 5]
This solution has O(l1+l2) time complexity, where l1 and l2 are the two list lengths.
Note that some of the answers in the duplicate posts skipped the set creation, and just tested for membership directly in the second list. While correct, this has a serious negative impact on performance if the second list is large, with the performance being O(l1*l2) where l1 and l2 are the two list lengths. So unless the second list is very small, you definitely want to convert it to a set first.
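A rough way to check this on your own machine is to time both variants. This is only a sketch: the function names and list sizes are made up for illustration, and the absolute timings will vary.

```python
import timeit


def remove_with_set(items, to_remove):
    """Filter using a set: one hash lookup per element of items."""
    s = set(to_remove)
    return [x for x in items if x not in s]


def remove_with_list(items, to_remove):
    """Filter using the list directly: one linear scan per element of items."""
    return [x for x in items if x not in to_remove]


# Hypothetical sizes chosen only to make the difference visible.
myitems = list(range(5_000)) * 2
items_to_remove = list(range(0, 5_000, 10))

for func in (remove_with_set, remove_with_list):
    seconds = timeit.timeit(lambda: func(myitems, items_to_remove), number=1)
    print(f"{func.__name__}: {seconds:.4f} s")
```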

How does set() remove duplicates from a list [duplicate]

This question already has answers here:
Removing duplicates in lists
(56 answers)
'order' of unordered Python sets
(5 answers)
Closed 3 years ago.
I tried to remove duplicates from a list in Python 3 by converting it into a set with set(). However, I wanted the result to end up in a certain order. After converting the list, I noticed that the resulting set was not in the order I expected.
data = [3, 6, 3, 4, 4, 3]
my_set = set(data)
print(my_set)
The resulting set is: {3, 4, 6}
I expected set() to kind of iterate over the given list from 0 to n, keeping the first instance of every integer it encounters. However the resulting set seems to be ordered in a different way.
I was unable to find anything about this in the python documentation, or here on stack overflow. Is it known how the set() method orders the elements in the given datastructure when converting it to a set?
The concept of order simply does not exist for sets in Python, which is why you cannot expect the elements to be shown in any particular order. Here is an example of creating a list without duplicates that keeps the order of the original list (it works because dicts preserve insertion order as of Python 3.7).
data = [3, 6, 3, 4, 4, 3]
without_duplicates = list(dict.fromkeys(data))
>>> without_duplicates
[3, 6, 4]
set objects are not ordered by key or by insertion order in Python... you can however get what you want by building the result you are looking for explicitly:
res = []
seen = set()
for x in data:
    if x not in seen:
        seen.add(x)
        res.append(x)
print(res)

List cherry picking slicing method [duplicate]

This question already has answers here:
Access multiple elements of list knowing their index [duplicate]
(11 answers)
Closed 5 years ago.
It's a fairly simple problem to articulate, but I'm not 100% sure that I have my lingo right. Nonetheless, conceptually "cherry picking" is a suitable description of the slice I have in mind, because I simply want to access (cherry pick from all elements) two distant elements of a list. I tried this:
my_list[2,7]
So I was expecting it to return only 2 elements, but instead I got the error:
list indices must be integers, not tuples.
I searched this error, but I found it was actually a very general error and none of the problems that instigated this error were actually for my type of problem.
I suppose I could extract the elements 1 at a time and then merge them, but my gut tells me there is a more "pythonic" way about this.
Also a slightly more complicated form of this problem I ran into was building a new list from an existing list of lists:
new_list = []
for i in range(len(my_list)):
    new_list.append(my_list[i][2,7])
Normally I would just use operator.itemgetter for this:
>>> my_list = list(range(10))
>>> import operator
>>> list(operator.itemgetter(2, 7)(my_list))
[2, 7]
It also allows getting an arbitrary amount of list elements by their indices.
However, you can always use NumPy (an external package) and its integer array indexing for this (it won't work for normal Python lists, just for NumPy arrays):
>>> import numpy as np
>>> my_arr = np.array(my_list)
>>> my_arr[[2, 7]]
array([2, 7])
In [1]: myList = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [2]: myList[2:8:5]
Out[2]: [2, 7]
The general form is myList[start:end:stride]. Note that this only works when the indices you want are evenly spaced.
Hope this helps.
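For the list-of-lists variant from the question, a small sketch under the assumption that every inner list has at least eight elements: the same itemgetter can be reused per row, or a nested comprehension does the job without any import. The sample data in rows is invented for the example.

```python
from operator import itemgetter

# Hypothetical sample data: a list of lists.
rows = [list(range(10)), list(range(10, 20))]

# Reuse one itemgetter for every row; it returns a tuple, so convert to a list.
pick = itemgetter(2, 7)
new_list = [list(pick(row)) for row in rows]
print(new_list)  # [[2, 7], [12, 17]]

# Equivalent nested comprehension without operator.itemgetter.
indices = (2, 7)
by_comprehension = [[row[i] for i in indices] for row in rows]
print(by_comprehension)  # [[2, 7], [12, 17]]
```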
