How does set() remove duplicates from a list [duplicate] - python

This question already has answers here:
Removing duplicates in lists
(56 answers)
'order' of unordered Python sets
(5 answers)
Closed 3 years ago.
I tried to remove duplicates from a list in Python 3 by converting it to a set with set(). However, I wanted the result to have a certain order. After converting the list, I noticed that the resulting set was not in the order I expected.
data = [3, 6, 3, 4, 4, 3]
my_set = set(data)
print(my_set)
The resulting set is: {3, 4, 6}
I expected set() to iterate over the given list from index 0 to n, keeping the first instance of every integer it encounters. However, the resulting set seems to be ordered in a different way.
I was unable to find anything about this in the Python documentation or here on Stack Overflow. Is it known how set() orders the elements of the given data structure when converting it to a set?

The concept of order simply does not exist for sets in Python, which is why you cannot expect the elements to be shown in any particular order. Here is an example that creates a list without duplicates in the same order as the original list. It relies on dict preserving insertion order, which is guaranteed since Python 3.7.
>>> data = [3, 6, 3, 4, 4, 3]
>>> without_duplicates = list(dict.fromkeys(data))
>>> without_duplicates
[3, 6, 4]

set objects are not ordered by key or by insertion order in Python. You can, however, get what you want by building the result explicitly:
res = []
seen = set()
for x in data:
    if x not in seen:
        seen.add(x)
        res.append(x)
print(res)
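The two answers above can be checked against each other. The following is a minimal sketch, assuming all elements are hashable; the function name dedupe_keep_order is made up for illustration and is not part of the standard library.

```python
def dedupe_keep_order(items):
    """Return the first occurrence of each element, in input order."""
    seen = set()   # elements already emitted; lookups are O(1) on average
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


data = [3, 6, 3, 4, 4, 3]
print(dedupe_keep_order(data))                               # [3, 6, 4]
print(dedupe_keep_order(data) == list(dict.fromkeys(data)))  # True
```

Both approaches keep the first occurrence of each value, so they produce the same list.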

Related

How do I create a remove every other element function? [duplicate]

This question already has answers here:
How to remove items from a list while iterating?
(25 answers)
Closed 7 months ago.
I have written a script
def remove_every_other(my_list):
    # Your code here!
    rem = False
    for i in my_list:
        if rem:
            my_list.remove(i)
        rem = not rem
    return my_list
It is supposed to remove every other element from the list.
When I input [1,2,3,4,5,6,7,8,9,10], it returns [1,3,4,6,7,9,10], which seems very strange to me. Also, if I input ["Yes","No","Yes","No","Yes"], it outputs ["Yes","No","Yes","No"].
I have tried to figure it out for hours but couldn't get it to work. I am a beginner.
You could just use slicing for that. Or do you want to do it explicitly in a loop? For an explanation of the syntax, you can follow the link. Basically you take the full list (no start or end defined) with a step-value of 2 (every other element). As others have pointed out, you run into problems if you're modifying a list that you're iterating over, thus the unexpected behavior.
input_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
output_list = input_list[::2]
print(output_list)
This returns a copy of the list with every other element removed:
[1, 3, 5, 7, 9]
Since .remove() is called on the same list that you are iterating over, every removal shifts the remaining elements one position to the left while the loop's internal index keeps advancing, so some elements are skipped.
Instead, we can append the items you want to keep to another list, which leaves the original list untouched.
def remove_every_other(my_list):
    morphed_list = []
    rem = True
    for i in my_list:
        if rem:
            morphed_list.append(i)
        rem = not rem
    return morphed_list
In general, it is not a good idea to modify the list you're iterating over. You can read more about it here: https://stackoverflow.com/questions/10812272/modifying-a-list-while-iterating-over-it-why-not
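To see which elements get skipped, the loop from the question can be replayed while recording every value it visits. This is only an illustrative sketch; trace_removal is a hypothetical helper name, and it works on a copy so the input is not changed.

```python
def trace_removal(items):
    """Replay the loop from the question and record which values it visits."""
    items = list(items)  # copy, so the caller's list is left alone
    visited = []
    rem = False
    for value in items:
        visited.append(value)
        if rem:
            items.remove(value)  # shifts every later element one slot to the left
        rem = not rem
    return visited, items


visited, result = trace_removal(range(1, 11))
print(visited)  # [1, 2, 4, 5, 7, 8, 10]; 3, 6 and 9 are never visited
print(result)   # [1, 3, 4, 6, 7, 9, 10]

# In-place alternative without a loop: delete every second element by slice.
numbers = list(range(1, 11))
del numbers[1::2]
print(numbers)  # [1, 3, 5, 7, 9]
```

If the original list object has to be modified rather than replaced, the slice deletion at the end does that without iterating over the list being changed.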

Why does set() behave so unintuitively in Python? [duplicate]

This question already has answers here:
Converting a list to a set changes element order
(16 answers)
Closed 2 years ago.
I don't understand why set() works the way it does...
Let's say we have two lists:
a = [1,2,-1,20,6,210,1, -11.4, 2]
b = [1,2,-1,20,6,210,1,-11.4, 2, "a"]
When I run set() on the list of numerics, a, I get a set of unique numerics ordered from smallest to largest. Ok great, that seems intuitive! Haven't found any exceptions yet:
set(a)
Out: {-11.4, -1, 1, 2, 6, 20, 210}
What happens if I throw a character in like with list b? Weirdness. The negatives are out of order and so is 6.
set(b)
Out: {-1, -11.4, 1, 2, 20, 210, 6, 'a'}
It gets worse though. What if I try to turn those sets back into lists? Pure chaos.
list(set(a))
Out: [1, 2, 6, 210, 20, -11.4, -1]
list(set(b))
Out: [1, 2, 6, 'a', 210, 20, -11.4, -1]
As you can see, these lists indeed contain only unique values, but they have failed to preserve much semblance of the order of the original lists.
What's going on here and why?
The set type in Python is not ordered. It can appear ordered depending on the implementation, but this is not guaranteed. If you need an ordered representation, you should use something like sorted(set(input_sequence)), which returns a sorted list after removing the duplicates. Note that sorting a list whose elements are not mutually comparable is not supported without a custom key function (so you can't sort ['a', 1] out of the box).
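A small sketch of that advice, using the lists from the question. The key function mixed_key is an assumption made up for this example: it puts numbers before strings, which is just one of many possible orderings for mixed types.

```python
a = [1, 2, -1, 20, 6, 210, 1, -11.4, 2]
b = a + ["a"]

print(sorted(set(a)))  # [-11.4, -1, 1, 2, 6, 20, 210]

# Mixed numbers and strings cannot be compared with "<" in Python 3.
try:
    sorted(set(b))
except TypeError as exc:
    print("cannot sort:", exc)


def mixed_key(value):
    """Sort key: numbers first (by value), then strings (alphabetically)."""
    return (isinstance(value, str), value)


print(sorted(set(b), key=mixed_key))  # [-11.4, -1, 1, 2, 6, 20, 210, 'a']
```

The tuple key works because values of different kinds differ in the first tuple element, so Python never has to compare a number with a string.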

Efficient way to remove all items from another list [duplicate]

This question already has answers here:
Remove all the elements that occur in one list from another
(13 answers)
How do I subtract one list from another?
(16 answers)
Closed 2 years ago.
I have a list of items. I also have another list of items (a subset of the original list) that I want removed from this list.
myitems = [1, 1, 2, 3, 3, 4, 5]
items_to_remove = [1, 4]
The output of this should be [2, 3, 3, 5]
What is the most efficient way to remove all items in items_to_remove from myitems?
My current code is:
for item in items_to_remove:
    myitems = list(filter((item).__ne__, myitems))
Because my actual use case has lots of items to be removed, I am trying to find a more efficient way to do this.
The most efficient way is to create a set of the items to be removed, then use the set to filter the first list:
s = set(items_to_remove)
result = [x for x in myitems if x not in s]
With the sample list values, this produces the desired result:
[2, 3, 3, 5]
This solution has O(l1+l2) time complexity on average, where l1 and l2 are the two list lengths, because a membership test on a set takes constant time on average.
Note that some of the answers in the duplicate posts skipped the set creation and just tested for membership directly in the second list. While correct, this has a serious negative impact on performance if the second list is large, since the running time becomes O(l1*l2). So unless the second list is very small, you definitely want to convert it to a set first.
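The difference can be observed with a rough timing comparison. This is only a sketch: the function names, the list sizes, and the random test data are assumptions chosen for illustration, and the measured times will vary from machine to machine.

```python
import random
import timeit


def remove_with_set(items, to_remove):
    """Filter using a set: one O(1) average lookup per item."""
    blocked = set(to_remove)
    return [x for x in items if x not in blocked]


def remove_with_list(items, to_remove):
    """Filter using the list directly: one O(len(to_remove)) scan per item."""
    return [x for x in items if x not in to_remove]


print(remove_with_set([1, 1, 2, 3, 3, 4, 5], [1, 4]))  # [2, 3, 3, 5]

# Larger, made-up inputs to make the difference measurable.
random.seed(0)
big_items = [random.randrange(2_000) for _ in range(2_000)]
big_remove = list(range(0, 2_000, 2))

for func in (remove_with_set, remove_with_list):
    seconds = timeit.timeit(lambda: func(big_items, big_remove), number=1)
    print(f"{func.__name__}: {seconds:.4f} s")
```

Both functions return the same list; only the cost of each membership test differs.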

Why doesn't list[:][0] get me the first row of the list? [duplicate]

This question already has answers here:
Understanding slicing
(38 answers)
Closed 6 years ago.
For the following:
list=[[2, 3, 5], [7, 8, 9]]
Why does [list[0][0], list[1][0]] represent the first row ([2, 7]), but the command list[:][0] or list[0:2][0] returns the first column ([2, 3, 5])?
The way I see it list[:][0] should get all the possible values for the first parameter (that is 0 and 1) and 0 for the second, meaning it would return the first row. Instead what it does is return the first column and I can't understand why.
In Python, slicing a list with the [a:b:c] syntax creates a new list. That is,
list = [1,2,3]
print(list[:])
prints a list, not a single element.
Therefore, when you say list[:][0], you are making a copy of the original list (list[:]) and then accessing item 0 within it.
As you know, item 0 of the original list (list[0]) is another list.
I think you want:
[sl[0] for sl in list]
Elaboration:
This is called a "comprehension." It is a compact syntax for building lists, sets, and dicts by processing or filtering other iterables: [..] builds a list and {..} builds a set or dict, with an expression inside involving a for and optionally an if. The same form inside (..) is a generator expression, not a tuple. Naturally, you can have multiples of the for (like [x for y in z for x in y]) and multiples of the if.
In your case, it's pretty obvious you want a list, so use []. You want to make the list by taking the first item from each sublist, so write [sl[0] for sl in list].
Here's a more-detailed article: http://carlgroner.me/Python/2011/11/09/An-Introduction-to-List-Comprehensions-in-Python.html
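The difference between the two expressions can be checked directly. In this sketch the variable is named grid instead of list, which is a deliberate change so that the built-in list type is not shadowed.

```python
grid = [[2, 3, 5], [7, 8, 9]]

# Slicing copies the outer list; indexing the copy gives the first sublist.
print(grid[:][0])                # [2, 3, 5]
print(grid[:][0] is grid[0])     # True: a shallow copy shares the sublists

# Taking item 0 of every sublist needs a loop or a comprehension.
print([sl[0] for sl in grid])    # [2, 7]

# Nested comprehension: the for clauses read left to right, outer loop first.
print([x for sl in grid for x in sl])  # [2, 3, 5, 7, 8, 9]

# zip(*grid) groups the items position by position.
print([list(group) for group in zip(*grid)])  # [[2, 7], [3, 8], [5, 9]]
```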

Lists : printing number only if its not a duplicate [duplicate]

This question already has answers here:
Removing duplicates in lists
(56 answers)
Closed 8 years ago.
Given a list of n numbers, how can I print each element except the duplicates?
d = [1,2,3,4,1,2,5,6,7,4]
For example, from this list I want to print: 1,2,3,4,5,6,7
Since order doesn't matter, you can simply do:
>>> print(list(set(d)))
[1, 2, 3, 4, 5, 6, 7]
It would be helpful to read about sets.
If the order does not matter:
print(set(d))
If the type matters (you want a list):
print(list(set(d)))
If the order matters:
def unique(d):
    d0 = set()  # values already yielded
    for i in d:
        if i not in d0:
            yield i
            d0.add(i)
print(list(unique(d)))  # unique() is a generator, so build a list before printing
All you have to do is:
create an empty array.
get each of the list's elements in turn.
if the element already exists in the array, skip it.
if it does not exist, print it and add it to the array.
