Python arranging a list to include duplicates - python

I have a list in Python that is similar to:
x = [1,2,2,3,3,3,4,4]
Is there a way, using pandas or a list comprehension, to rearrange the list like this, similar to a queue system:
x = [1,2,3,4,2,3,4,3]

It is possible by using cumcount:
import pandas as pd
s = pd.Series(x)
s.index = s.groupby(s).cumcount()
s.sort_index()
Out[11]:
0 1
0 2
0 3
0 4
1 2
1 3
1 4
2 3
dtype: int64

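If you split your list into one separate list for each value (itertools.groupby), you can then use the roundrobin recipe from the itertools documentation to get this behavior. roundrobin is a recipe rather than a function exported by itertools, so it has to be copied into your code. Each group must be materialized with list(), because groupby invalidates a group as soon as it advances to the next one:
from itertools import groupby
x = [1, 2, 2, 3, 3, 3, 4, 4]
list(roundrobin(*(list(g) for _, g in groupby(x))))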
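A minimal, self-contained sketch of the groupby plus round-robin idea follows. The roundrobin defined here is a simplified stand-in written for this example, not the exact recipe from the itertools documentation, and it assumes equal values are already adjacent in x (sort the list first if they are not).

```python
from itertools import groupby


def roundrobin(*iterables):
    """Yield one item from each iterable in turn until all are exhausted."""
    iterators = [iter(it) for it in iterables]
    while iterators:
        still_active = []
        for it in iterators:
            try:
                value = next(it)
            except StopIteration:
                continue  # this iterable is used up, so drop it
            still_active.append(it)
            yield value
        iterators = still_active


x = [1, 2, 2, 3, 3, 3, 4, 4]
# list(g) copies each group before groupby moves on and invalidates it.
groups = [list(g) for _, g in groupby(x)]
result = list(roundrobin(*groups))
print(result)  # [1, 2, 3, 4, 2, 3, 4, 3]
```

Passing the lazy group objects directly would not work, since only the most recently produced group of a groupby is still readable.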

If I'm understanding you correctly, you want to retain all duplicates, but then have the list arranged in an order where you create what are in essence separate lists of unique values, but they're all concatenated into a single list, in order.
I don't think this is possible in a listcomp, and nothing's occurring to me for getting it done easily/quickly in pandas.
But the straightforward algorithm is:
Create a different list for each set of unique values: for i in x: if i is not in list1, add it to list1; else if it is not in list2, add it to list2; else if it is not in list3, add it to list3; and so on. There's certainly a way to do this with recursion, if it's an unpredictable number of lists.
Evaluate the lists based on their values, to determine the order in which you want to have them listed in the final list. It's unclear from your post exactly what order you want them to be in. Querying by the value in the 0th position could be one way. Evaluating the entire lists as >= each other is another way.
Once you have that set of lists and their orders, it's straightforward to concatenate them in order, in the final list.
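The steps above can be sketched as follows, using a growing list of lists instead of recursion. The function name arrange_in_rounds is made up for this example, and the lists are simply concatenated in the order they were created, which is one possible reading of the desired ordering.

```python
def arrange_in_rounds(values):
    """Place each value in the first list that does not contain it yet,
    then concatenate those lists in creation order."""
    rounds = []
    for value in values:
        for current in rounds:
            if value not in current:
                current.append(value)
                break
        else:
            # Every existing list already holds this value: start a new one.
            rounds.append([value])
    return [value for current in rounds for value in current]


print(arrange_in_rounds([1, 2, 2, 3, 3, 3, 4, 4]))  # [1, 2, 3, 4, 2, 3, 4, 3]
```

The membership test on each list makes this quadratic in the worst case, which is fine for short lists.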

Essentially, what you want is a pattern: the order in which unique numbers are first encountered while traversing the list x. For example, if x = [4,3,1,3,5] then the pattern is 4 3 1 5, and it can be used to refill x so that the output is [4,3,1,5,3].
from collections import defaultdict
x = [1,2,2,3,3,3,4,4]
counts_dict = defaultdict(int)
for p in x:
    counts_dict[p]+=1
i =0
while i < len(x):
    for p,cnt in counts_dict.items():
        if i < len(x):
            if cnt > 0:
                x[i] = p
                counts_dict[p]-=1
                i+=1
            else:
                continue
        else:
            # we have placed all the 'p'
            break
print(x) # [1, 2, 3, 4, 2, 3, 4, 3]
Note: dicts preserve insertion order in Python 3.7+ (and in CPython 3.6 as an implementation detail); I am assuming that you are using Python 3.6+.
This is what I thought of doing at first, but it fails in some cases:
'''
x = [3,7,7,7,4]
i = 1
while i < len(x):
    if x[i] == x[i-1]:
        x.append(x.pop(i))
        i = max(1,i-1)
    else:
        i+=1
print(x) # [1, 2, 3, 4, 2, 3, 4, 3] for x = [1,2,2,3,3,3,4,4]
# x = [2,2,3,3,3,4,4]
# output [2, 3, 4, 2, 3, 4, 3]
# x = [3,7,1,7,4]
# output [3, 7, 1, 7, 4]
# x = [3,7,7,7,4]
# output: times out (the loop never terminates)
'''

Related

How do I move list elements to another list if a condition is met?

I have a list of values and I want to move certain (or all) values to another list if they exist in a reference list.
x = [2,3,4,5,6,7,8] # list of values
ref = [2,3,4,5,6,7,8] # reference list
result = [x.pop(i) for i, v in enumerate(x) if v in ref]
But because of popping the current index, it ends up giving every other value instead. Is there a nice straightforward way to do this?
What I want at the end of this example is x=[] and result=[2,3,4,5,6,7,8], but ref doesn't need to contain all elements of x, this was just for an example. In another case it might be:
x = [2,3,4,5,6,7,8] # list of values
ref = [2,6,7] # reference list
So then I want x = [3,4,5,8] and result = [2,6,7]
In general, we should avoid modifying objects while iterating over them.
For this problem, we could generate result and filter x in two steps using comprehensions (avoiding appending to lists) as in the following example.
result, x = [v for v in x if v in ref], [v for v in x if v not in ref]
You could do it the old-fashioned way, with a while loop and a pointer into x:
x = [2, 3, 4, 5, 6, 7, 8]
ref = [2, 6, 7]
result = []
i = 0
while i < len(x):
    if x[i] in ref:
        result.append(x.pop(i))
    else:
        i += 1
print(x)
print(result)
Output:
[3, 4, 5, 8]
[2, 6, 7]
You can simply iterate from the end to the start to avoid pop() changing the list size while iterating. Just call reverse() on your new list after running your loop if the order of the list matters.
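A short sketch of the reverse-iteration idea, reusing the x, ref and result names from the question. Walking the indices from the end means a pop() only shifts elements that have already been visited.

```python
x = [2, 3, 4, 5, 6, 7, 8]  # list of values
ref = [2, 6, 7]            # reference list
result = []

# Visit indices from last to first so pop() never skips an element.
for i in range(len(x) - 1, -1, -1):
    if x[i] in ref:
        result.append(x.pop(i))

# Items were collected back to front; restore the original order.
result.reverse()

print(x)       # [3, 4, 5, 8]
print(result)  # [2, 6, 7]
```

If ref is large, converting it to a set first would make each membership test cheaper.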

Find unique elements part of each list amongst unknown number of lists

I am trying to find an efficient solution to the following problem:
I have an unknown number of lists, each with different but also overlapping elements. I would like to find the elements unique to each list and output them separately.
For example if I have 3 lists:
a = [1,2,3,4]
b = [2,5,6,7]
c = [3,6,8,9]
This would result in an output of (I am not trying to find the unique elements only):
a --> [1,4]
b --> [5,7]
c --> [8,9]
Assuming that one list gets generated sequentially. I was thinking of using sets but believe that this can be solved when each list gets generated.
Here is a simple solution in O(N) where N is the total number of elements.
The key idea is to count, for each element, how many times it appears across all the lists. Then you can filter each list by keeping only the elements that appear once.
from collections import Counter
a = [1,2,3,4]
b = [2,5,6,7]
c = [3,6,8,9]
# Count how many times each element appears.
counter = Counter()
for l in [a,b,c]:
    counter.update(l)
print(counter)
# If an element appears only once, it is a unique element!
for l in [a,b,c]:
    print(*filter(lambda x: counter[x]==1, l))
And the output is:
Counter({2: 2, 3: 2, 6: 2, 1: 1, 4: 1, 5: 1, 7: 1, 8: 1, 9: 1})
1 4
5 7
8 9
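The counting approach generalizes to any number of lists. The sketch below wraps it in a function whose name, unique_per_list, is invented for this example. It differs from the code above in one assumed detail: each list is counted through set(), so a value repeated inside a single list still counts as belonging to one list only.

```python
from collections import Counter


def unique_per_list(lists):
    """For each list, keep the elements that occur in no other list."""
    lists_containing = Counter()
    for lst in lists:
        # set() makes a value count once per list, even if repeated in it.
        lists_containing.update(set(lst))
    return [[v for v in lst if lists_containing[v] == 1] for lst in lists]


a = [1, 2, 3, 4]
b = [2, 5, 6, 7]
c = [3, 6, 8, 9]
print(unique_per_list([a, b, c]))  # [[1, 4], [5, 7], [8, 9]]
```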
Use set.difference(), which returns a set containing the items that exist only in set x and not in set y.
Example:
a = [1,2,3,4]
b = [2,5,6,7]
c = [3,6,8,9]
abc = list(set(a).difference(b).difference(c))
bca = list(set(b).difference(c).difference(a))
cab = list(set(c).difference(a).difference(b))
print(abc)
print(bca)
print(cab)
Output:
[1, 4]
[5, 7]
[8, 9]
You can use a dict that stores the number of times each number is seen and use that to generate a set that lists are compared against. With the dict it means that you don't then need to compare every new list to all other lists again (but duplicate_numbers will need to be redefined).
tracker_dict = dict()
duplicate_numbers = set()
a = [1,2,3,4]
b = [2,5,6,7]
c = [3,6,8,9]
# Get count of all numbers in all lists
all_lists = [a, b, c]
for l in all_lists:
    for item in l:
        tracker_dict[item] = tracker_dict.get(item, 0) + 1
# Store all duplicate numbers in a set
duplicate_numbers = set([num for num in tracker_dict if tracker_dict[num] > 1])
# Get new lists
new_a = [i for i in a if i not in duplicate_numbers]
# With a new list that is defined afterwards
d = [1, 4, 5, 1]
# Update the tracker_dict and duplicate_numbers set
for item in d:
    tracker_dict[item] = tracker_dict.get(item, 0) + 1
duplicate_numbers = set([num for num in tracker_dict if tracker_dict[num] > 1])
new_d = [i for i in d if i not in duplicate_numbers]
# This does not affect previously processed lists however

Unable to reset counters in for loop

I am trying to amend a list of integers so that every two adjacent duplicate integers are replaced by a single integer with twice the value. Here is an example:
a = [1, 1, 2, 3] -> [2, 2, 3] -> [4, 3]
also: b = [2, 3, 3, 6, 9] -> [2, 6, 6, 9] -> [2, 12, 9]
I am using the code below to achieve this. Unfortunately, every time I find a match, my index skips the next match.
user_input = [int(a) for a in input().split()]
for index, item in enumerate(user_input):
    while len(user_input)-2 >= index:
        if item == user_input[index + 1]:
            del user_input[index]
            del user_input[index]
            item += item
            user_input.insert(index,item)
        break
print(*user_input)
In Python, you should never modify a container object while you are iterating over it. There are some exceptions if you know what you are doing, but you certainly should not change the size of the container object. That is what you are trying to do and that is why it fails.
Instead, use a different approach. Iterate over the list but construct a new list. Modify that new list as needed. Here is code that does what you want. This builds a new list named new_list and either changes the last item(s) in that list or appends a new item. The original list is never changed.
user_input = [int(a) for a in input().split()]
new_list = []
for item in user_input:
    while new_list and (item == new_list[-1]):
        new_list.pop()
        item *= 2
    new_list.append(item)
print(*new_list)
This code passes the two examples you gave. It also passes the example [8, 4, 2, 1, 1, 7] which should result in [16, 7]. My previous version did not pass that last test but this new version does.
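For testing without keyboard input, the same stack-style approach can be wrapped in a function. The name merge_adjacent_duplicates is hypothetical; the logic is the one from the code above.

```python
def merge_adjacent_duplicates(values):
    """Repeatedly replace two equal neighbours with one doubled value."""
    merged = []
    for item in values:
        # While the new item equals the last kept item, merge them and
        # re-check the doubled value against the new last item.
        while merged and item == merged[-1]:
            merged.pop()
            item *= 2
        merged.append(item)
    return merged


print(merge_adjacent_duplicates([1, 1, 2, 3]))        # [4, 3]
print(merge_adjacent_duplicates([2, 3, 3, 6, 9]))     # [2, 12, 9]
print(merge_adjacent_duplicates([8, 4, 2, 1, 1, 7]))  # [16, 7]
```

Because the input list is only read and the result is built separately, no index can be skipped by a deletion.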
Check if this works, Rory!
import copy
user_input = [1,1,2,3]
res = []
# Note: this makes a single left-to-right pass over the list.
while res!=user_input:
    a = user_input.pop(0)
    if len(user_input)!=0:
        b = user_input.pop(0)
        if a==b:
            user_input.insert(0,a+b)
        else:
            res.append(a)
            user_input.insert(0,b)
    else:
        res.append(a)
        user_input = copy.deepcopy(res)
print(user_input) # [4, 3]
You can use itertools.groupby and recursion:
from itertools import groupby
Check for same consecutive elements:
def same_consec(lst):
    return any(len(list(g)) > 1 for _, g in groupby(lst))
Replace consecutive same elements:
def replace_consec(lst):
    if same_consec(lst):
        lst = [k * 2 if len(list(g)) > 1 else k for k, g in groupby(lst)]
        return replace_consec(lst)
    else:
        return lst
Usage:
>>> a = [8, 4, 2, 1, 1, 7]
>>> replace_consec(a)
[16, 7]

Writing Python code that works like the reverse() function

I'm looking to break down the reverse() function and write it out in code for practice. I eventually figured out how to do it (step thru the original list backwards and append to the new 'reversed' list) but wondering why this doesn't work.
def reverse(list):
    newlist = []
    index = 0
    while index < len(list):
        newlist[index] = list[(len(list)) - 1 - index]
        index = index + 1
    return newlist
list = [1, 2, 3, 4, 5]
print(reverse(list))
In Python, you cannot access or update an element of a list if its index does not exist yet; valid indices run from 0 to the length of the list - 1 (or their negative equivalents).
In your case, you are trying to assign to the element at index 0, but the list is empty, so it has no index 0. That is why it fails with the error
IndexError: list assignment index out of range
Instead, you can use the append method, like this
newlist.append(list[(len(list)) - 1 - index])
Apart from that, you can use the range function to count backwards, like this
for index in range(len(list) - 1, -1, -1):
    newlist.append(list[index])
You don't even have to update the index yourself; the for loop takes care of it.
As suggested by @abarnert, you can actually iterate the list and add the elements at the beginning every time, like this
>>> def reverse(mylist):
...     result = []
...     for item in mylist:
...         result.insert(0, item)
...     return result
...
>>> reverse([1, 2, 3, 4, 5])
[5, 4, 3, 2, 1]
If you want to create a new reversed list, you may not have to write a function on your own, instead you can use the slicing notation to create a new reversed list, like this
>>> mylist = [1, 2, 3, 4, 5]
>>> mylist[::-1]
[5, 4, 3, 2, 1]
but this doesn't change the original object.
>>> mylist = [1, 2, 3, 4, 5]
>>> mylist[::-1]
[5, 4, 3, 2, 1]
>>> mylist
[1, 2, 3, 4, 5]
if you want to change the original object, just assign the slice back to the slice of the original object, like this
>>> mylist
[1, 2, 3, 4, 5]
>>> mylist[:] = mylist[::-1]
>>> mylist
[5, 4, 3, 2, 1]
Note: reversed actually returns a reverse iterator object, not a list. So it doesn't build the entire reversed list; instead, it returns elements one by one when iterated (via the iterator protocol).
>>> reversed([1, 2, 3, 4, 5])
<list_reverseiterator object at 0x7fdc118ba978>
>>> for item in reversed([1, 2, 3, 4, 5]):
...     print(item)
...
5
4
3
2
1
So, you might want to make it a generator function, like this
>>> def reverse(mylist):
...     for index in range(len(mylist) - 1, -1, -1):
...         yield mylist[index]
...
>>> reverse([1, 2, 3, 4, 5])
<generator object reverse at 0x7fdc118f99d8>
So the reverse function returns a generator object. If you want a list, then you can create one with list function, like this
>>> list(reverse([1, 2, 3, 4, 5]))
[5, 4, 3, 2, 1]
If you are just going to process it one by one, then iterate over it with a for loop, like this
>>> for i in reverse([1, 2, 3, 4, 5]):
...     print(i)
...
5
4
3
2
1
First off, don't override built-ins (list in your case). Second, newlist has a length of 0 and therefore cannot be assigned to by index.
def reverse(mylist):
    newlist = [0] * len(mylist)
    index = 0
    while index < len(mylist):
        newlist[index] = mylist[(len(mylist)) - 1 - index]
        index = index + 1
    return newlist
mylist = [1, 2, 3, 4, 5]
print(reverse(mylist))
You can create a list with the same length as your input list like so:
newlist = [0] * len(mylist)
You need to use list.append. Assigning to newlist[0] is a valid operation if the list has at least one element in it, but newlist is empty in the very first iteration. Also, list is not a good name for a variable, as there is a Python built-in type with the same name:
def reverse(lst):
    newlist = []
    index = 0
    while index < len(lst):
        newlist.append(lst[len(lst) - 1 - index])
        index += 1
    return newlist
mylist = [1, 2, 3, 4, 5]
print(reverse(mylist))
You can't assign to an arbitrary index for a 0-length list. Doing so raises an IndexError. Since you're assigning the elements in order, you can just do an append instead of an assignment to an index:
newlist.append(l[(len(l)) - 1 - index])
Append modifies the list and increases its length automatically.
Another way to get your original code to work would be to change the initialization of newlist so that it has sufficient length to support your index operations:
newlist = [None for _ in range(len(l))]
I would also like to note that it's not a good idea to name things after built-in types and functions. Doing so shadows the functionality of the built-ins.
To write the function you're trying to write, see thefourtheye's answer.
But that isn't how reverse works, or what it does. Instead of creating a new list, it modifies the existing list in-place.
If you think about it, that's pretty easy: just go through half the indices, for each index N, swap the Nth from the left and the Nth from the right.*
So, sticking with your existing framework:
def reverse(lst):
    index = 0
    while index < len(lst) // 2:
        lst[index], lst[len(lst) - 1 - index] = lst[len(lst) - 1 - index], lst[index]
        index = index + 1
As a side note, using while loops like this is almost always a bad idea. If you want to loop over a range of numbers, just use for index in range(len(lst)):. Besides reducing three lines of code to one and making it more obvious what you're doing, it removes multiple places where you could make a simple but painful-to-debug mistake.
Also, note that in most cases, in Python, it's easier to use a negative index to mean "from the right edge" than to do the math yourself, and again it will usually remove a possible place you could easily make a painful mistake. But in this particular case, it might not actually be any less error-prone…
* You do have to make sure you think through the edge cases. It doesn't matter whether for odd lists you swap the middle element with itself or not, but just make sure you don't round the wrong way and go one element too far or too short. Which is a great opportunity to learn about how to write good unit tests…
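Putting those side notes together, here is a sketch of the in-place swap written with a for loop and negative indices. The name reverse_in_place is chosen for this example only; like list.reverse(), it modifies its argument and returns None.

```python
def reverse_in_place(lst):
    """Reverse lst in place by swapping mirrored pairs of elements."""
    # len(lst) // 2 rounds down, so the middle element of an odd-length
    # list is left where it is.
    for i in range(len(lst) // 2):
        # lst[-1 - i] is the i-th element counted from the right edge.
        lst[i], lst[-1 - i] = lst[-1 - i], lst[i]


odd = [1, 2, 3, 4, 5]
even = [1, 2, 3, 4]
reverse_in_place(odd)
reverse_in_place(even)
print(odd)   # [5, 4, 3, 2, 1]
print(even)  # [4, 3, 2, 1]
```

Checking an odd-length, an even-length, an empty and a one-element list covers the edge cases mentioned in the footnote.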
You could also try this:
def reverse(lst):
    newList = []
    countList = len(lst) - 1
    for x in range(countList,-1,-1):
        newList.append(lst[x])
    return newList
def main():
    lst = [9,8,7,6,5,4,2]
    print(reverse(lst))
main()

Extract elements of list at odd positions

So I want to create a list which is a sublist of some existing list.
For example,
L = [1, 2, 3, 4, 5, 6, 7], I want to create a sublist li such that li contains all the elements in L at odd positions.
I can do it with
L = [1, 2, 3, 4, 5, 6, 7]
li = []
count = 0
for i in L:
    if count % 2 == 1:
        li.append(i)
    count += 1
but I want to know if there is another way to do the same thing efficiently and in fewer steps.
Solution
Yes, you can:
l = L[1::2]
And this is all. The result will contain the elements placed on the following positions (0-based, so first element is at position 0, second at 1 etc.):
1, 3, 5
so the result (actual numbers) will be:
2, 4, 6
Explanation
The [1::2] at the end is just a notation for list slicing. Usually it is in the following form:
some_list[start:stop:step]
If start were omitted, the default (0) would be used, so selection would begin with the first element (at position 0, because indexes are 0-based). Here start is 1, so selection begins with the second element.
Because the second argument (stop) is omitted, the default is used (the end of the list). So the list is traversed from the second element to the end.
We also provided the third argument (step), which is 2. This means that one element is selected, the next is skipped, and so on.
So, to sum up, in this case [1::2] means:
1. take the second element (which, by the way, is at an odd position, if you judge from the index),
2. skip one element (because step=2, as opposed to the default step=1),
3. take the next element,
4. repeat steps 2-3 until the end of the list is reached.
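A few slices of the same list, as a small illustration of how start, stop and step interact. The variable names are chosen for this example.

```python
L = [1, 2, 3, 4, 5, 6, 7]

odd_positions = L[1::2]    # start at index 1, run to the end, step 2
even_positions = L[::2]    # start defaults to 0
bounded = L[1:5:2]         # stop at index 5 (exclusive): indices 1 and 3
same_as_first = L[slice(1, None, 2)]  # explicit slice object, None = default

print(odd_positions)   # [2, 4, 6]
print(even_positions)  # [1, 3, 5, 7]
print(bounded)         # [2, 4]
```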
EDIT: @PreetKukreti gave a link for another explanation on Python's list slicing notation. See here: Explain Python's slice notation
Extras - replacing counter with enumerate()
In your code, you explicitly create and increase the counter. In Python this is not necessary, as you can enumerate through some iterable using enumerate():
for count, i in enumerate(L):
    if count % 2 == 1:
        li.append(i)
The above serves exactly the same purpose as the code you were using:
count = 0
for i in L:
    if count % 2 == 1:
        li.append(i)
    count += 1
More on emulating for loops with counter in Python: Accessing the index in Python 'for' loops
For the odd positions, you probably want:
>>> list_ = list(range(10))
>>> print(list_[1::2])
[1, 3, 5, 7, 9]
I like List comprehensions because of their Math (Set) syntax. So how about this:
L = [1, 2, 3, 4, 5, 6, 7]
odd_numbers = [y for x,y in enumerate(L) if x%2 != 0]
even_numbers = [y for x,y in enumerate(L) if x%2 == 0]
Basically, if you enumerate over a list, you'll get the index x and the value y. What I'm doing here is putting the value y into the output list (even or odd) and using the index x to find out if that point is odd (x%2 != 0).
You can also use itertools.islice if you don't need to create a list but just want to iterate over the odd/even elements:
import itertools
L = [1, 2, 3, 4, 5, 6, 7]
li = itertools.islice(L, 1, len(L), 2)
You can make use of the bitwise AND operator &:
>>> x = [1, 2, 3, 4, 5, 6, 7]
>>> y = [i for i in x if i&1]
>>> y
[1, 3, 5, 7]
This will give you the odd elements in the list. Now to extract the elements at odd indices you just need to change the above a bit:
>>> x = [10, 20, 30, 40, 50, 60, 70]
>>> y = [j for i, j in enumerate(x) if i&1]
>>> y
[20, 40, 60]
Explanation
The bitwise AND operator is used with 1, and the reason it works is that an odd number written in binary must have 1 as its last (least significant) digit. Let's check:
23 = 1 * (2**4) + 0 * (2**3) + 1 * (2**2) + 1 * (2**1) + 1 * (2**0) = 10111
14 = 1 * (2**3) + 1 * (2**2) + 1 * (2**1) + 0 * (2**0) = 1110
Since 1 in binary has only its last digit set, the AND operation with 1 returns 1 if and only if the value is odd.
Check the Python Bitwise Operator page for more.
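The parity check can be confirmed directly in the interpreter. This small sketch uses the same two numbers as above; the helper name is_odd is made up for the example.

```python
def is_odd(n):
    """Return True when the least significant bit of n is set."""
    return n & 1 == 1


for n in (23, 14):
    # bin() shows the binary digits; the last digit decides the parity.
    print(n, bin(n), n & 1)

x = [10, 20, 30, 40, 50, 60, 70]
at_odd_indices = [value for index, value in enumerate(x) if is_odd(index)]
print(at_odd_indices)  # [20, 40, 60]
```

For plain lists, the slice x[1::2] gives the same result with less work; the bitwise test is mainly useful when the index is already being computed for another reason.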
P.S: You can tactically use this method if you want to select odd and even columns in a dataframe. Let's say x and y coordinates of facial key-points are given as columns x1, y1, x2, etc... To normalize the x and y coordinates with width and height values of each image you can simply perform:
for i in range(df.shape[1]):
    if i&1:
        df.iloc[:, i] /= heights
    else:
        df.iloc[:, i] /= widths
This is not exactly related to the question but for data scientists and computer vision engineers this method could be useful.
