Is there an easy way to split the list l below into 3 lists? I want to cut the list wherever the sequence starts over, so every list should start with 1.
l = [1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4]
l1 = [1, 2, 3, 4, 5]
l2 = [1, 2, 3, 4]
l3 = [1, 2, 3, 4]
My original thought was to look at the lead value and implement a condition inside a for loop that would cut the list when x.lead < x. But how do I use lead and lag with lists in Python?
NumPy solution
import numpy as np
l = [1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4]
parts = [list(i) for i in np.split(l,np.flatnonzero(np.diff(l)-1)+1)]
print(parts)
output
[[1, 2, 3, 4, 5], [1, 2, 3, 4], [1, 2, 3, 4]]
Explanation: numpy.diff gives the differences between adjacent elements. Subtracting 1 turns every ordinary step of +1 into 0, so numpy.flatnonzero returns the positions where the difference is anything other than 1. Adding 1 to those positions (the numpy.diff output is one element shorter than its input) gives the indices to pass to numpy.split. Finally, each piece is converted with list, as otherwise you would end up with numpy arrays.
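The same three steps can be followed without NumPy to see the intermediate values. This is only a sketch that mirrors the explanation above; the names diffs, breaks and bounds are made up for illustration, and like the NumPy version it assumes every run increases by exactly 1.

```python
l = [1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4]

# Step 1: differences between adjacent elements (one shorter than l).
diffs = [b - a for a, b in zip(l, l[1:])]

# Step 2: positions where the step is not +1, shifted by one so that
# they point at the first element of the next run.
breaks = [i + 1 for i, d in enumerate(diffs) if d != 1]

# Step 3: cut the list at those positions.
bounds = [0] + breaks + [len(l)]
parts = [l[start:stop] for start, stop in zip(bounds, bounds[1:])]

print(breaks)
print(parts)
```

Here breaks plays the role of the index array that the NumPy answer passes to numpy.split.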
What about this:
l = [1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4]
one_indices = [i for i, e in enumerate(l) if e == 1]
slices = []
for count, item in enumerate(one_indices):
    if count == len(one_indices) - 1:
        slices.append((item, None))
    else:
        slices.append((item, one_indices[count + 1]))
sequences = [l[x[0] : x[1]] for x in slices]
print(sequences)
Out:
[[1, 2, 3, 4, 5], [1, 2, 3, 4], [1, 2, 3, 4]]
Another way without numpy,
l= [1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4]
start = 0
newlist = []
for i, v in enumerate(l):
    if i != 0 and v == 1:
        newlist.append(l[start:i])
        start = i
newlist.append(l[start:])  # the last run extends to the end of the list
print(newlist)
Working Demo: https://rextester.com/RYCV85570
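The idea from the question, cutting wherever the next value is smaller than the current one, can also be written with plain lists by pairing each element with its predecessor. The following is a minimal sketch; the function name split_on_drop is hypothetical, and it assumes a new run starts whenever a value is smaller than the one before it.

```python
def split_on_drop(values):
    """Split values into runs, starting a new run whenever a value drops."""
    if not values:
        return []
    runs = [[values[0]]]
    # zip(values, values[1:]) yields (previous, current) pairs,
    # which stands in for "lag" and "lead" on a plain list.
    for prev, cur in zip(values, values[1:]):
        if cur < prev:
            runs.append([cur])      # the sequence started over
        else:
            runs[-1].append(cur)    # still inside the current run
    return runs


l = [1, 2, 3, 4, 5, 1, 2, 3, 4, 1, 2, 3, 4]
print(split_on_drop(l))
```

Unlike the answers that look for the value 1, this version does not care what value each run starts with.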
I want to weave two lists and output all the possible results.
For example,
input: two lists l1 = [1, 2], l2 = [3, 4]
output: [1, 2, 3, 4], [1, 3, 2, 4], [1, 3, 4, 2], [3, 1, 2, 4], [3, 1, 4, 2], [3, 4, 1, 2]
Note: I need to keep the order in each list (e.g. 1 is always before 2, and 3 is always before 4)
The way I am solving this is by removing the head from one list, recursing, and then doing the same thing with the other list. The code is below:
all_possibles = []
def weaveLists(first, second, added):
    if len(first) == 0 or len(second) == 0:
        res = added[:]
        res += first[:]
        res += second[:]
        all_possibles.append(res)
        return
    cur1 = first[0]
    added.append(cur1)
    first = first[1:]
    weaveLists(first, second, added)
    added = added[:-1]
    first = [cur1] + first
    cur2 = second[0]
    added.append(cur2)
    second = second[1:]
    weaveLists(first, second, added)
    added = added[:-1]
    second = [cur2] + second
weaveLists([1, 2], [3, 4], [])
print(all_possibles)
The result I got is:
[[1, 2, 3, 4], [1, 3, 2, 4], [1, 3, 4, 2], [1, 3, 1, 2, 4], [1, 3, 1, 4, 2], [1, 3, 1, 4, 1, 2]]
I couldn't figure out why, for the last three lists, the leading 1 from the first list is not removed.
Can anyone help? Thanks!
The reason you get those unexpected results is that you mutate added at this place:
added.append(cur1)
...this will affect the caller's added list (unintentionally). While the "undo" operation is not mutating the list:
added = added[:-1]
This creates a new list, and therefore this "undo" action does not roll back the change in the list of the caller.
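The difference between mutating a list and rebinding a name can be seen in isolation. This is a small illustrative sketch; the function and variable names are made up and are not part of the question's code.

```python
def append_in_place(items):
    items.append(99)        # mutates the list object the caller passed in


def drop_last_by_rebinding(items):
    items = items[:-1]      # builds a new list; only the local name changes
    return items


caller_list = [1, 2]
append_in_place(caller_list)
after_append = list(caller_list)

local_result = drop_last_by_rebinding(caller_list)

print(after_append)     # the caller sees the appended value
print(caller_list)      # unchanged by the "undo": 99 is still there
print(local_result)     # only the returned copy lacks the last item
```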
The easy fix is to replace the call to append with:
added = added + [cur1]
And the same should happen in the second block.
It is easier if you pass the new values for the recursive call on-the-fly, and replace those two code blocks with just:
weaveLists(first[1:], second, added + [first[0]])
weaveLists(first, second[1:], added + [second[0]])
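Putting that fix together, a complete version might look like the sketch below. Two things are assumptions of this sketch rather than part of the answer: the function is renamed weave_lists, and it returns the results instead of appending to the global all_possibles.

```python
def weave_lists(first, second, added=None):
    """Return every interleaving of first and second that keeps each list's order."""
    added = [] if added is None else added
    if not first or not second:
        # One list is exhausted: the rest of the other can only go at the end.
        return [added + first + second]
    # Every call receives fresh lists, so there is nothing to undo afterwards.
    return (weave_lists(first[1:], second, added + [first[0]])
            + weave_lists(first, second[1:], added + [second[0]]))


result = weave_lists([1, 2], [3, 4])
for woven in result:
    print(woven)
```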
Here is another way to do it: we generate the possible indices of the items of the first list inside the weaved list, and fill the list accordingly.
We can generate the indices with itertools.combinations: it's the combinations of the indices of the weaved list, taking len(first_list) of them each time.
from itertools import combinations
def weave(l1, l2):
    total_length = len(l1) + len(l2)
    # indices at which to put items from l1 in the weaved output
    for indices in combinations(range(total_length), r=len(l1)):
        out = []
        it1 = iter(l1)
        it2 = iter(l2)
        for i in range(total_length):
            if i in indices:
                out.append(next(it1))
            else:
                out.append(next(it2))
        yield out
Sample run:
l1 = [1, 2]
l2 = [3, 4]
for w in weave(l1, l2):
    print(w)
[1, 2, 3, 4]
[1, 3, 2, 4]
[1, 3, 4, 2]
[3, 1, 2, 4]
[3, 1, 4, 2]
[3, 4, 1, 2]
Another sample run with a longer list:
l1 = [1, 2]
l2 = [3, 4, 5]
for w in weave(l1, l2):
    print(w)
[1, 2, 3, 4, 5]
[1, 3, 2, 4, 5]
[1, 3, 4, 2, 5]
[1, 3, 4, 5, 2]
[3, 1, 2, 4, 5]
[3, 1, 4, 2, 5]
[3, 1, 4, 5, 2]
[3, 4, 1, 2, 5]
[3, 4, 1, 5, 2]
[3, 4, 5, 1, 2]
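Because each result corresponds to one choice of positions for the items of the first list, the number of results should equal the binomial coefficient C(len(l1) + len(l2), len(l1)). The sketch below checks this for the two sample runs; it restates weave in a compact form so that it is self-contained, and it assumes Python 3.8+ for math.comb.

```python
from itertools import combinations
from math import comb


def weave(l1, l2):
    total_length = len(l1) + len(l2)
    for indices in combinations(range(total_length), len(l1)):
        chosen = set(indices)          # positions reserved for items of l1
        it1, it2 = iter(l1), iter(l2)
        yield [next(it1) if i in chosen else next(it2)
               for i in range(total_length)]


short = list(weave([1, 2], [3, 4]))
longer = list(weave([1, 2], [3, 4, 5]))

print(len(short), comb(4, 2))
print(len(longer), comb(5, 2))
```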
I want to create a list that contains the dates of one week, with each date appearing 4 times, like this:
[1-jan-2018, 1-jan-2018, 1-jan-2018, 1-jan-2018, 2-jan-2018, 2-jan-2018, 2-jan-2018, 2-jan-2018, 3-jan-2018, 3-jan-2018, 3-jan-2018, 3-jan-2018, 4-jan-2018, 4-jan-2018, 4-jan-2018, 4-jan-2018, 5-jan-2018, 5-jan-2018, 5-jan-2018, 5-jan-2018, 6-jan-2018, 6-jan-2018, 6-jan-2018, 6-jan-2018,7-jan-2018, 7-jan-2018, 7-jan-2018, 7-jan-2018]
I'm not sure how to do it, but here is my attempt:
import pandas as pd
timeSeries = list(pd.date_range(start='1/1/2020', end='7/1/2020'))
print(timeSeries)
This will just create the list of dates of one week but I want the answer in the above format. Can someone please help?
How to duplicate items in a list
One solution is to build, for each element of your original list, a small list containing that element repeated N times. In this example, I duplicate each element four times:
old_list = [1,2,3,4]
# [i,i,i,i] will clone each item four times.
new_list = list([i,i,i,i] for i in old_list)
# new_list = [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]]
But now you have a list of lists, so you need to turn that result into a single flat list; this operation is called flattening. In Python, you can do this with itertools.chain.
import itertools
old_list = [1,2,3,4]
# [i,i,i,i] will clone each item four times.
new_list = list([i,i,i,i] for i in old_list)
# new_list = [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]]
new_list_flatten = list(itertools.chain(*new_list))
# new_list_flatten = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4]
You can avoid the * unpacking by calling itertools.chain.from_iterable:
import itertools
old_list = [1,2,3,4]
# [i,i,i,i] will clone each item four times.
new_list = list([i,i,i,i] for i in old_list)
# new_list = [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]]
new_list_flatten = list(itertools.chain.from_iterable(new_list))
# new_list_flatten = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4]
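If the number of repetitions should be a parameter rather than a hard-coded [i,i,i,i], itertools.repeat can produce the copies. This is a sketch; the helper name repeat_each is hypothetical.

```python
from itertools import chain, repeat


def repeat_each(items, times):
    """Return a flat list in which every item appears `times` times in a row."""
    return list(chain.from_iterable(repeat(item, times) for item in items))


old_list = [1, 2, 3, 4]
new_list_flatten = repeat_each(old_list, 4)
print(new_list_flatten)
```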
Therefore, this code does what you want:
import itertools
import pandas as pd
# ISO dates avoid ambiguity: '7/1/2020' is parsed month-first, i.e. as 1 July 2020
time_series = list(pd.date_range(start='2020-01-01', end='2020-01-07'))
newseries = list(itertools.chain.from_iterable((i, i, i, i) for i in time_series))
print(newseries)
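For comparison, here is a sketch of the same idea using only the standard library, with a nested comprehension instead of itertools. It assumes the week from the question's example (1 to 7 January 2018) and produces datetime.date objects rather than pandas timestamps.

```python
from datetime import date, timedelta

start = date(2018, 1, 1)

# The seven consecutive days of the week.
week = [start + timedelta(days=offset) for offset in range(7)]

# The inner loop emits each day four times before moving to the next day.
repeated = [day for day in week for _ in range(4)]

print(len(repeated))
print(repeated[:5])
```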
I was attempting to remove all duplicated numbers in a list.
I was trying to understand what is wrong with my code.
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
for x in numbers:
    if numbers.count(x) >= 2:
        numbers.remove(x)
print(numbers)
The result I got was:
[1, 1, 6, 5, 2, 3]
I guess the idea is to write the code yourself without using library functions. In that case I would still suggest using an additional set to remember the items you have already seen, so that you go over the list only once:
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
seen = set()
unique = []
for x in numbers:
    if x not in seen:
        seen.add(x)
        unique.append(x)  # keep the first occurrence, in original order
numbers = unique
print(numbers)
If you want to keep your own code, the problem is that you modify the collection inside the for loop that iterates over it, which is a big no-no in most programming languages. Python allows it, but the problem and its solution are already described in this answer: How to remove items from a list while iterating?:
Note: There is a subtlety when the sequence is being modified by the loop (this can only occur for mutable sequences, i.e. lists). An internal counter is used to keep track of which item is used next, and this is incremented on each iteration. When this counter has reached the length of the sequence the loop terminates. This means that if the suite deletes the current (or a previous) item from the sequence, the next item will be skipped (since it gets the index of the current item which has already been treated). Likewise, if the suite inserts an item in the sequence before the current item, the current item will be treated again the next time through the loop. This can lead to nasty bugs that can be avoided by making a temporary copy using a slice of the whole sequence, e.g.,
for x in a[:]:
    if x < 0: a.remove(x)
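The skipping described in the quoted note is easy to reproduce. The sketch below runs the same loop twice, once over the list itself and once over a slice copy; the sample values are chosen only for illustration.

```python
a = [-1, -2, 3, -4]
for x in a:            # iterating over the list that is being modified
    if x < 0:
        a.remove(x)    # after removing -1, the loop skips over -2

b = [-1, -2, 3, -4]
for x in b[:]:         # iterating over a temporary copy
    if x < 0:
        b.remove(x)

print(a)   # -2 survives because it was never visited
print(b)   # all negative values are gone
```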
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
Using a shallow copy of the list:
for x in numbers[:]:
    if numbers.count(x) >= 2:
        numbers.remove(x)
print(numbers) # [1, 6, 5, 2, 3]
Alternatives:
Preserving the order of the list:
Using dict.fromkeys()
print(list(dict.fromkeys(numbers).keys())) # [1, 6, 5, 2, 3]
Using more_itertools.unique_everseen(iterable, key=None):
from more_itertools import unique_everseen
print(list(unique_everseen(numbers))) # [1, 6, 5, 2, 3]
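Using pandas.unique (the list is wrapped in a Series because recent pandas versions warn about plain list input):
import pandas as pd
print(pd.unique(pd.Series(numbers)).tolist()) # [1, 6, 5, 2, 3]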
Using collections.OrderedDict([items]):
from collections import OrderedDict
print(list(OrderedDict.fromkeys(numbers))) # [1, 6, 5, 2, 3]
Using itertools.groupby(iterable[, key]) (this collapses only consecutive duplicates, which is enough for this list):
from itertools import groupby
print([k for k,_ in groupby(numbers)]) # [1, 6, 5, 2, 3]
Ignoring the order of the list:
Using numpy.unique:
import numpy as np
print(np.unique(numbers).tolist()) # [1, 2, 3, 5, 6]
Using set():
print(list(set(numbers))) # e.g. [1, 2, 3, 5, 6] (set order is not guaranteed)
Using frozenset([iterable]):
print(list(frozenset(numbers))) # e.g. [1, 2, 3, 5, 6] (set order is not guaranteed)
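One caveat about the itertools.groupby entry in the list above: groupby only merges equal items that sit next to each other. The sketch below uses a different, made-up list in which duplicates are scattered, and compares the result with dict.fromkeys.

```python
from itertools import groupby

scattered = [1, 2, 1, 1, 3, 2]

# groupby starts a new group every time the value changes,
# so the later 1 and 2 are reported again.
collapsed = [key for key, _ in groupby(scattered)]

# dict.fromkeys keeps only the first occurrence of each value.
ordered_unique = list(dict.fromkeys(scattered))

print(collapsed)
print(ordered_unique)
```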
Why don't you simply use a set:
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
numbers = list(set(numbers))
print(numbers)
Before anything else, the first advice I can give is to never modify a list while you are looping over it. All kinds of wacky stuff happens. The logic of your code is fine otherwise (I recommend reading the other answers though; there's an easier way to do this with a set, which pretty much handles the deduplication for you).
Instead of removing numbers from the list you are looping over, loop over a clone of it, made with slicing directly in the for statement.
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
for x in numbers[:]:
    if numbers.count(x) >= 2:
        numbers.remove(x)
    print(numbers)
print("Final")
print(numbers)
The key there is numbers[:], which returns a copy of the list. Here's the printed output:
[1, 1, 1, 6, 5, 5, 2, 3]
[1, 1, 6, 5, 5, 2, 3]
[1, 6, 5, 5, 2, 3]
[1, 6, 5, 5, 2, 3]
[1, 6, 5, 5, 2, 3]
[1, 6, 5, 2, 3]
[1, 6, 5, 2, 3]
[1, 6, 5, 2, 3]
[1, 6, 5, 2, 3]
Final
[1, 6, 5, 2, 3]
Leaving a placeholder here until I figure out how to explain why in your particular case it's not working, like the actual step by step reason.
Another way to solve this, making use of the beautiful language that is Python, is through a comprehension and a set.
Why a set? Because by definition its elements are unique, so even if you try to put in several equal elements, they won't be repeated in the set. Cool, right?
A comprehension is syntactic sugar for looping in one line. Get used to it with Python: you'll either use it a lot or see it a lot :)
A comprehension iterates over an iterable and returns each item. In the code below, which is a set comprehension (note the curly braces), x represents each number in numbers, and x is returned to become part of the set. Because the set handles duplicates... voilà, your code is done.
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
numbers_as_set = {x for x in numbers}
print(numbers_as_set)
This seems like homework but here is a possible solution:
import numpy as np
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
filtered = np.unique(numbers).tolist()  # tolist() gives plain Python ints
print(filtered)
#[1, 2, 3, 5, 6]
This solution does not preserve the ordering. If you also need the ordering, use:
filtered_with_order = list(dict.fromkeys(numbers))
Why don't you use fromkeys?
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
numbers = list(dict.fromkeys(numbers))
Output: [1,6,5,2,3]
The flow is as follows.
The list is [1, 1, 1, 1, 6, 5, 5, 2, 3] and the index is 0.
x is 1. numbers.count(1) is 4, so the 1 at index 0 is removed.
The list becomes [1, 1, 1, 6, 5, 5, 2, 3], but the index is incremented to 1.
x is 1. numbers.count(1) is 3, so one more 1 is removed (list.remove always removes the first occurrence).
The list becomes [1, 1, 6, 5, 5, 2, 3], but the index is incremented to 2.
x is now 6, so the two remaining 1's are never visited again.
etc...
So that's why there are two 1's.
Please correct me if I am wrong. Thanks!
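The walkthrough above can be reproduced by simulating the loop's internal counter with an explicit index. This is only an illustrative sketch: the while loop and the trace list are stand-ins for what the for loop does internally, not how one would normally write it.

```python
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
trace = []

index = 0
while index < len(numbers):      # the length is re-checked on every pass
    x = numbers[index]
    if numbers.count(x) >= 2:
        numbers.remove(x)        # removes the first occurrence of x
    trace.append((index, x, list(numbers)))
    index += 1                   # the index advances even after a removal

for step in trace:
    print(step)                  # (index, x, list after this step)
print(numbers)
```

Only six values are ever visited, and two of the original 1's are among the items that get skipped.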
A fancy method is to use collections.Counter:
>>> from collections import Counter
>>> numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
>>> c = Counter(numbers)
>>> list(c.keys())
[1, 6, 5, 2, 3]
This method has linear time complexity (O(n)) and relies on a well-optimized standard-library class. Since Python 3.7, the keys come back in first-seen order.
You can try:
from more_itertools import unique_everseen
items = [1, 1, 1, 1, 6, 5, 5, 2, 3]
list(unique_everseen(items))
or
>>> from collections import OrderedDict
>>> items = [1, 1, 1, 1, 6, 5, 5, 2, 3]
>>> list(OrderedDict.fromkeys(items))
[1, 6, 5, 2, 3]
You can find more here:
How do you remove duplicates from a list whilst preserving order?
I have a list [[1, 2, 7], [1, 2, 3], [1, 2, 3, 7], [1, 2, 3, 5, 6, 7]] and I need [1,2,3,7] as the final result (this is a kind of reverse engineering). One approach is to check pairwise intersections:
# dlist1 is the list of lists shown above
dlistlen = len(dlist1)
i = 0
while i < dlistlen:
    j = i + 1
    while j < dlistlen:
        il = dlist1[i]
        jl = dlist1[j]
        tmp = list(set(il) & set(jl))
        print(tmp)
        # print(i, j)
        j = j + 1
    i = i + 1
This gives me the following output:
[1, 2]
[1, 2, 7]
[1, 2, 7]
[1, 2, 3]
[1, 2, 3]
[1, 2, 3, 7]
[]
Looks like I am close to getting [1,2,3,7] as my final answer, but I can't figure out how. Please note that the very first list ([[1, 2, 7], [1, 2, 3], [1, 2, 3, 7], [1, 2, 3, 5, 6, 7]]) may contain more items, leading to one more final answer besides [1,2,3,7]. But as of now, I need to extract only [1,2,3,7].
Please note, this is not homework; I am creating my own clustering algorithm that fits my needs.
You can use the Counter class to keep track of how often elements appear.
>>> from itertools import chain
>>> from collections import Counter
>>> l = [[1, 2, 7], [1, 2, 3], [1, 2, 3, 7], [1, 2, 3, 5, 6, 7]]
>>> #use chain(*l) to flatten the lists into a single list
>>> c = Counter(chain(*l))
>>> print c
Counter({1: 4, 2: 4, 3: 3, 7: 3, 5: 1, 6: 1})
>>> #sort keys in order of descending frequency
>>> sortedValues = sorted(c.keys(), key=lambda x: c[x], reverse=True)
>>> #show the four most common values
>>> print sortedValues[:4]
[1, 2, 3, 7]
>>> #alternatively, show the values that appear in more than 50% of all lists
>>> print [value for value, freq in c.iteritems() if float(freq) / len(l) > 0.50]
[1, 2, 3, 7]
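The session above uses Python 2 syntax (print statements and iteritems). A Python 3 sketch of the same idea follows. Two choices here are assumptions of the sketch: the results are passed through sorted so that the output does not depend on dictionary ordering, and each value is assumed to appear at most once per inner list, so a count equals the number of lists containing it.

```python
from collections import Counter
from itertools import chain

lists = [[1, 2, 7], [1, 2, 3], [1, 2, 3, 7], [1, 2, 3, 5, 6, 7]]

# Flatten the inner lists and count how often each value appears.
counts = Counter(chain.from_iterable(lists))

# The four most common values.
top_four = sorted(value for value, _ in counts.most_common(4))

# Alternatively, the values that appear in more than 50% of the lists.
majority = sorted(value for value, freq in counts.items()
                  if freq / len(lists) > 0.5)

print(top_four)
print(majority)
```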
It looks like you're trying to find the largest intersection between any two of the lists. This will do that:
from itertools import combinations
# convert all list elements to sets for speed
dlist = [set(x) for x in dlist]
intersections = (x & y for x, y in combinations(dlist, 2))
longest_intersection = max(intersections, key=len)
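Applied to the list from the question, this approach can be run end to end as in the sketch below. The variable dlist is filled in with the question's data here, and sorted is used only to turn the resulting set into a predictable list.

```python
from itertools import combinations

dlist = [[1, 2, 7], [1, 2, 3], [1, 2, 3, 7], [1, 2, 3, 5, 6, 7]]

# Convert each inner list to a set so that & computes the intersection.
sets = [set(x) for x in dlist]

# Intersect every pair of sets and keep the largest result.
intersections = (x & y for x, y in combinations(sets, 2))
longest_intersection = max(intersections, key=len)

print(sorted(longest_intersection))
```

If several pairs tie for the largest intersection, max returns the first one it encounters.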