Shouldn't the second snippet produce the same output as the first? - Python

Why does the second snippet give different output from the first one?
Using a for loop:
my_list = [1, 2, 3, 4, 2, 6, 2, 2, 7, 3, 8, 2]
uniques = []
for item in my_list:
    if item not in uniques:
        uniques.append(item)
print(uniques)
Output:
[1, 2, 3, 4, 6, 7, 8]
Using list comprehension:
my_list = [1, 2, 3, 4, 2, 6, 2, 2, 7, 3, 8, 2]
uniques = []
uniques = [item for item in my_list if item not in uniques]
print(uniques)
Output:
[1, 2, 3, 4, 2, 6, 2, 2, 7, 3, 8, 2]

The expression [item for item in my_list if item not in uniques] computes a new list based on the comprehension all at once. It then assigns the result to the name uniques. During the time the comprehension is running, uniques is an empty list, so the test if item not in uniques always returns True.
In the first version, uniques is referencing a list that is being actively updated, so it is able to meaningfully check for items already in the list.
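To make the evaluation order concrete, here is a small sketch (the names result and still_empty are introduced only for illustration). The comprehension is fully evaluated against the still-empty list before the name uniques is rebound:

```python
my_list = [1, 2, 3, 4, 2, 6, 2, 2, 7, 3, 8, 2]
uniques = []

# The right-hand side is evaluated first; uniques is [] the whole time,
# so "item not in uniques" is True for every item.
result = [item for item in my_list if item not in uniques]
still_empty = (uniques == [])  # the comprehension never touched uniques

# Only now is the name rebound to the finished list.
uniques = result
print(still_empty)         # True
print(uniques == my_list)  # True
```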
As an aside, this is a very inefficient way to remove duplicates, because every time you evaluate item not in uniques, the list is searched in linear time. A better alternative would be to use a set, which does average constant-time lookups using a hash table.
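As a minimal sketch of a hash-based alternative that also keeps the original order, dict.fromkeys can be used. This is a swapped-in technique, not the one from the question, and it relies on dicts preserving insertion order, which is guaranteed from Python 3.7 on:

```python
my_list = [1, 2, 3, 4, 2, 6, 2, 2, 7, 3, 8, 2]

# Dict keys are unique and hashed, so later duplicates are ignored.
# Keys keep their insertion order in Python 3.7+.
uniques = list(dict.fromkeys(my_list))
print(uniques)  # [1, 2, 3, 4, 6, 7, 8]
```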

Related

Python list loops

In this code I'm trying to delete every repeated element in the list so that all of the elements are unique. When I run it, I get an error:
myList = [1, 2, 4, 4, 1, 4, 2, 6, 2, 9]
repeat = 0
for i in range(len(myList)-1):
    for j in range(len(myList)-1):
        if myList[i]== myList[j]:
            repeat+=1
            if repeat>1:
                del myList[j]
print("The list with unique elements only:")
print(myList)
The error that appears is:
Traceback (most recent call last):
File "main.py", line 8, in <module>
if myList[i]== myList[j]:
IndexError: list index out of range
Why does that happen, and how can I solve it?
It is a bad idea to modify a list while looping over it, because the indices the loop uses no longer match the list once elements are deleted.
May I suggest these two solutions to your problem.
The first one uses set (note that a set does not preserve the original order of the elements):
myList = [1, 2, 4, 4, 1, 4, 2, 6, 2, 9]
myList = list(set(myList))
print("The list with unique elements only:")
print(myList)
The other solution uses another list:
myList = [1, 2, 4, 4, 1, 4, 2, 6, 2, 9]
uniques = []
for number in myList:
    if number not in uniques:
        uniques.append(number)
print("The list with unique elements only:")
print(uniques)
You can convert the list to a set, which automatically removes all repeated elements:
a = [1, 2, 4, 4, 1, 4, 2, 6, 2, 9]
unique_list = list(set(a))
print(unique_list)
Note: the set is converted back to a list.
What is happening here is that you are deleting some elements from your list, making it shorter.
Since your for loops run for the length of your original list, you will eventually try to access an index that no longer exists. This causes the "list index out of range" error.
To see this for yourself, you can add a print statement, like so:
myList = [1, 2, 4, 4, 1, 4, 2, 6, 2, 9]
repeat = 0
for i in range(len(myList)-1):
    for j in range(len(myList)-1):
        print(i,j,len(myList))
        if myList[i]== myList[j]:
            repeat+=1
            if repeat>1:
                del myList[j]
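The underlying reason can be sketched in isolation (the names items, indices and last_index are made up for this illustration): range(len(...)) is evaluated once, before the loop starts, and does not shrink when the list does.

```python
items = [10, 20, 30, 40]
indices = range(len(items))  # evaluated once: range(0, 4)

del items[0]                 # the list now has only 3 elements

last_index = indices[-1]     # still 3, the last index of the ORIGINAL list
try:
    items[last_index]
    error = None
except IndexError as exc:
    error = str(exc)

print(len(items), last_index, error)  # 3 3 list index out of range
```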
The set data type in Python can be used to remove duplicates. Whenever you need only the unique values of an iterable, you can convert it to a set, which discards all duplicate values. For example:
lis=[2,2,3,4]
l=set(lis)
print(l)
Output:
{2, 3, 4}
It can be converted back into the list:
lis=[2,2,3,4]
l=set(lis)
print(l)
l=list(l)
print(l)
Output:
{2, 3, 4}
[2, 3, 4]
Similarly:
myList = [1, 2, 4, 4, 1, 4, 2, 6, 2, 9]
s=set(myList)
l=list(s)
print(l)
Output:
[1, 2, 4, 6, 9]
Frozen sets can also be used for this purpose. However, a frozen set is immutable: unlike a regular set, it cannot have elements added or removed after creation.
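A short sketch of the frozenset variant, using the list from the question (the names fs, unique_sorted and can_add are only for illustration):

```python
myList = [1, 2, 4, 4, 1, 4, 2, 6, 2, 9]

fs = frozenset(myList)      # duplicates are dropped, as with set
unique_sorted = sorted(fs)  # sorted() returns a list in ascending order
print(unique_sorted)        # [1, 2, 4, 6, 9]

# A frozenset has no add() or remove() methods.
can_add = hasattr(fs, "add")
print(can_add)              # False
```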
Hope this was helpful!

"for" loop and "if" condition for list creation in python

source=[1,2,3,4,2,3,5,6]
dst=[]
for item in source:
    if item not in dst:
        dst.append(item)
print(dst) # [1,2,3,4,5,6]
Can I simplify the code above to something like this?
dst=[item for item in [1,2,3,4,2,3,5,6] if item not in 'this array']
Thanks
No, list comprehensions cannot be self-referential.
You seem to want to remove duplicates from a list. See this and this questions for a boatload of approaches to this problem.
A set is probably what you are looking for, since you cannot refer to this array while it's being created:
>>> source = [1,2,3,4,2,3,5,6]
>>> set(source)
{1, 2, 3, 4, 5, 6}
If you do want to keep original order, though, you can keep track of what you have already added to dst with a set (seen):
>>> source = [1,2,3,4,2,3,5,6]
>>> seen = set()
>>> dst = []
>>> for i in source:
...     if i not in seen:
...         dst.append(i)
...         seen.add(i)
...
>>> dst
[1, 2, 3, 4, 5, 6]
You can't reference dst from within the list comprehension, but you can check the current item against the previously iterated items in source by slicing it on each iteration:
source = [1, 2, 3, 4, 2, 3, 5, 6]
dst = [item for i, item in enumerate(source)
       if item not in source[0:i]]
print(dst) # [1, 2, 3, 4, 5, 6]
If using if and for is a requirement, how about this (with dst = [] defined beforehand)?
[dst.append(item) for item in source if item not in dst]
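A runnable sketch of that one-liner, to show what it actually produces (the name throwaway is hypothetical). The comprehension is used only for its side effect on dst, and its own value is a list of None:

```python
source = [1, 2, 3, 4, 2, 3, 5, 6]
dst = []  # must be created first, otherwise the comprehension raises NameError

# list.append returns None, so the comprehension itself evaluates
# to a list of None values, one per appended item.
throwaway = [dst.append(item) for item in source if item not in dst]

print(dst)        # [1, 2, 3, 4, 5, 6]
print(throwaway)  # [None, None, None, None, None, None]
```

Because the result list is discarded, a plain for loop expresses the same thing more directly.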
Well, instead of creating a new list, you can modify your existing list with a list comprehension, as shown below:
In [1]: source
Out[1]: [1, 9, 2, 5, 6, 6, 4, 1, 4, 11]
In [2]: [ source.pop(i) for i in range(len(source))[::-1] if source.count(source[i]) > 1 ]
Out[2]: [4, 1, 6]
In [3]: source
Out[3]: [1, 9, 2, 5, 6, 4, 11]
As another approach, you can first get the unique values with set and then sort them by their first index in source, as follows:
source = [1, 9, 2, 5, 6, 6, 4, 1, 4, 11]
d = list(set(source))
d.sort(key=source.index)
print(d) # [1, 9, 2, 5, 6, 4, 11]

Removing the duplicate entries from a list by editing the list

I have a list arr = [1,3,4,5,2,3,4,2,5,7,3,8,1,9,6,2,1,2,1,3,4,3,4,6,9]
I want to remove the duplicate values so that the original list contains a single instance of each element. I do not want to create an extra list and append elements to it, and I do not want to use the built-in set.
I tried the following code:
l = len(arr)
for x in range(l):
    for y in range(x+1,l):
        if arr[x] == arr[y]:
            del arr[y]
The above code throws the error
"IndexError: list index out of range"
What I understand is that deleting values changes the size of the list, which causes the error. So I made the changes below, but it still fails with the same error:
l = len(arr)
for x in range(l):
    for y in range(x+1,l):
        if arr[x] == arr[y]:
            t = y
            del arr[y]
            y = t - 1
Can someone help me out with this?
Thanks in Advance.
You are trying to make the code more efficient by caching the length of the list in the local variable l. However, that is not helpful because the list is being trimmed inside the loop, and you are not keeping the cached length variable in sync.
for index in range(len(arr)-1,0,-1):
    if arr[index] in arr[:index]:
        del arr[index]
By going backwards through the array and looking for earlier occurrences of each element, you can avoid having to worry about the length of the list changing all the time.
This method also preserves the order in which elements occur in the original array. Note the instruction is to only remove duplicates (a.k.a. subsequent occurrences).
For example, the list [9,3,4,3,5] should reduce to [9,3,4,5], as the second occurrence of 3 is considered a duplicate and should be removed.
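The backward loop above can be wrapped in a small function and checked against that example. This is only a sketch: the function name dedupe_in_place is made up here, and note that arr[:index] still builds temporary slices even though no separate result list is kept.

```python
def dedupe_in_place(arr):
    """Remove later duplicates from arr in place, keeping first occurrences."""
    # Walk from the last index down to 1. Deleting at `index` only shifts
    # elements to the right of it, and those have already been visited.
    for index in range(len(arr) - 1, 0, -1):
        if arr[index] in arr[:index]:
            del arr[index]


example = [9, 3, 4, 3, 5]
dedupe_in_place(example)
print(example)  # [9, 3, 4, 5]
```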
How about this approach:
>>> set(arr)
set([1, 2, 3, 4, 5, 6, 7, 8, 9]) #Just to compare it with the results below.
>>> arr = [1,3,4,5,2,3,4,2,5,7,3,8,1,9,6,2,1,2,1,3,4,3,4,6,9]
>>> arr.sort()
>>> arr
[1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6, 6, 7, 8, 9, 9]
>>> for i in arr:
...     while arr.count(i) > 1:
...         arr.remove(i)  # remove by value; del arr[i] would treat the value i as an index
...
>>> arr
[1, 2, 3, 4, 5, 6, 7, 8, 9]
Another approach is to find, after sorting your list, the length of the sublist to delete for each number:
>>> arr = [1,3,4,5,2,3,4,2,5,7,3,8,1,9,6,2,1,2,1,3,4,3,4,6,9]
>>> arr.sort()
>>> arr
[1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6, 6, 7, 8, 9, 9]
>>> for i,j in enumerate(arr):
...     del arr[i+1:i+arr.count(j)]
...
>>> arr
[1, 2, 3, 4, 5, 6, 7, 8, 9]

Removing duplicates using set

Basically, I want to do this by iterating from a list into a set and then printing it back as a list. The problem is that I can't call set.add(item) inside the loop. set.add() was perfectly fine when adding one value outside a loop, but I can't get it to work inside a loop.
Using this function I am able to remove duplicates.
def remove_duplicates(numbers):
    lst = []
    for i in numbers:
        if i not in lst:
            lst.append(i)
    return lst
However, I want to be able to do something like this.
Here is how far I was able to come.
lst = { }
lsto = [1, 1, 1, 2, 3, 4, 1, 2, 5, 7, 5]
for item in lsto:
    lst.add(item)
print(lst)
Thanks in advance!
I think you mean this,
>>> lsto = [1, 1, 1, 2, 3, 4, 1, 2, 5, 7, 5]
>>> list(set(lsto))
[1, 2, 3, 4, 5, 7]
set(lsto) turns the iterable lsto into a set, which in turn removes the duplicate elements. Turning the set back into a list gives you a list at the end.
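As for why the loop in the question fails: { } creates an empty dict, not a set, and a dict has no add method. An empty set has to be written as set(). A minimal sketch of the same loop with that one change (the names unique, result and empty_braces_type are only for illustration):

```python
lsto = [1, 1, 1, 2, 3, 4, 1, 2, 5, 7, 5]

empty_braces_type = type({}).__name__  # 'dict', not 'set'

unique = set()           # an empty set must be created with set()
for item in lsto:
    unique.add(item)     # adding an element that is already present is a no-op

result = sorted(unique)  # sorted list of the unique values
print(empty_braces_type, result)  # dict [1, 2, 3, 4, 5, 7]
```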
To match the first logic and keep order, you can use an OrderedDict:
from collections import OrderedDict
lsto = [1, 1, 1, 2, 3, 4, 1, 2, 5, 7, 5]
print(list(OrderedDict.fromkeys(lsto)))
[1, 2, 3, 4, 5, 7]
The set by chance gives you the same order but sets are unordered collections so you cannot rely on getting any order.

Compare each element in a list to all others

Is there a way to compare all elements of a list (e.g. [4, 3, 2, 1, 4, 3, 2, 1, 4]) to all others and return, for each element, the number of other elements it is different from (i.e., for the list above, [6, 7, 7, 7, 6, 7, 7, 7, 6])? I will then need to add up the numbers from this list.
from collections import Counter
li = [4, 3, 2, 1, 4, 3, 2, 1, 4]
c = Counter(li)
print(c)
length = len(li)
print([length - c[el] for el in li])
Creating c before evaluating [length - c[el] for el in li] is better than calling count(i) for each element i of the list, because count() would repeat the same count several times (once for every occurrence of a given element).
By the way, another way to write it:
list(map(lambda x: length - c[x], li))
You can get a similar count with the list's count() method and subtract it from the total number of elements.
This can be done in one line with a list comprehension:
>>> l = [4, 3, 2, 1, 4, 3, 2, 1, 4]
>>> [ len(l)-l.count(i) for i in l ]
[6, 7, 7, 7, 6, 7, 7, 7, 6]
For Python 2.7:
test = [4, 3, 2, 1, 4, 3, 2, 1, 4]
length = len(test)
print [length - test.count(x) for x in test]
You could just use the sum function, along with a generator expression.
>>> l = [4, 3, 2, 1, 4, 3, 2, 1, 4]
>>> length = len(l)
>>> print(sum(length - l.count(i) for i in l))
60
The good thing about a generator expression is that you don't create an actual list in memory, but functions like sum can still iterate over them and produce the desired result. Note, however, that once you iterate over a generator once, you can't iterate over it again.
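The one-shot behaviour can be seen in a short sketch (the names gen, first_total and second_total are introduced here for illustration): summing the same generator a second time yields 0, because nothing is left to iterate over.

```python
l = [4, 3, 2, 1, 4, 3, 2, 1, 4]
length = len(l)

gen = (length - l.count(i) for i in l)
first_total = sum(gen)   # consumes the generator
second_total = sum(gen)  # the generator is already exhausted

print(first_total, second_total)  # 60 0
```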
