This question already has an answer here:
Create index list for np.split from the list that already has number for each section
(1 answer)
Closed 3 years ago.
Let's say I've got an array [0, 1, 2, 3, 4, 5, 6, 7] and a tuple: (3, 3, 2).
I'm looking for a way to split my array to 3 array based on my tuple data:
[0, 1, 2]
[3, 4, 5]
[6, 7]
I can write a simple code like this to get what I want, however I'm looking for a correct and pythonic way to do this:
I used lists for simplicity.
a = [0, 1, 2, 3, 4, 5, 6, 7]
b = (3, 3, 2)
pointer = 0
for i in b:
lst = []
for j in range(i):
lst.append(a[pointer])
pointer += 1
print(lst)
Or this one:
a = [0, 1, 2, 3, 4, 5, 6, 7]
b = (3, 3, 2)
pointer = 0
for i in b:
lst = a[pointer:pointer+i]
pointer += i
print(lst)
Results:
[0, 1, 2]
[3, 4, 5]
[6, 7]
you can use the split method of numpy
import numpy as np
a = [0, 1, 2, 3, 4, 5, 6, 7]
b = (3, 3, 2)
c = np.split(a, np.cumsum(b)[:-1])
for r in c:
print(r)
np.split(a, b) splits a by the indices in b along a given axis(0 by default).
If you don't want to modify your input list, you can use an iterator and the itertools module.
>>> from itertools import islice
>>> a = [0, 1, 2, 3, 4, 5, 6, 7]
>>> b = (3, 3, 2)
>>> i = iter(a)
>>> [list(islice(i, x)) for x in b]
[[0, 1, 2], [3, 4, 5], [6, 7]]
In the first step you create an iterator, which starts at the first element of a. Then you iterate in a list comprehension over your numbers in b and in each step you pull accordingly many elements from the iterator and store them in your result list.
One simpler way is this:
a = [0, 1, 2, 3, 4, 5, 6, 7]
b = (3, 3, 2)
for ind in b:
print(a[:ind])
a = a[ind:]
It loops through slice sizes in b while shortening the original array every time. You can easily append the resulting slices as sublists if you need them for something else. It's almost like one of your solutions except it doesn't use any extra variables and iterates directly through elements of b.
Also, I wouldn't call variables a and b - surely not in this case where variables have clear meanings that you can express through their names. More meaningful names lessen bugs number and make code more clear, becomes a real difference with larger/more complex code. I'd call a at least in_list and b slices, but with more context this could be better.
The most "concise" syntax would be :
ex_array = [0, 1, 2, 3, 4, 5, 6, 7]
extuple = (3, 3, 2)
result = [ex_array[sum(extuple[:iii]):sum(extuple[:iii])+extuple[iii]] for iii in range(len(extuple))]
result would be a list of the expected sub-lists
Re-using the pairwise function from Compare two adjacent elements in same list, you could also:
from itertools import accumulate
from more_itertools import pairwise
a = [0, 1, 2, 3, 4, 5, 6, 7]
b = (3, 3, 2)
[a[slice(*s)] for s in pairwise(accumulate((0,)+b))]
That begin said, the np.split answer is probably faster (and easier to read).
Related
I was attempting to remove all duplicated numbers in a list.
I was trying to understand what is wrong with my code.
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
for x in numbers:
if numbers.count(x) >= 2:
numbers.remove(x)
print(numbers)
The result I got was:
[1, 1, 6, 5, 2, 3]
I guess the idea is to write code yourself without using library functions. Then I would still suggest to use additional set structure to store your previous items and go only once over your array:
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
unique = set()
for x in numbers:
if x not in unique:
unique.add(x)
numbers = list(unique)
print(numbers)
If you want to use your code then the problem is that you modify collection in for each loop, which is a big NO NO in most programming languages. Although Python allows you to do that, the problem and solution are already described in this answer: How to remove items from a list while iterating?:
Note: There is a subtlety when the sequence is being modified by the loop (this can only occur for mutable sequences, i.e. lists). An internal counter is used to keep track of which item is used next, and this is incremented on each iteration. When this counter has reached the length of the sequence the loop terminates. This means that if the suite deletes the current (or a previous) item from the sequence, the next item will be skipped (since it gets the index of the current item which has already been treated). Likewise, if the suite inserts an item in the sequence before the current item, the current item will be treated again the next time through the loop. This can lead to nasty bugs that can be avoided by making a temporary copy using a slice of the whole sequence, e.g.,
for x in a[:]:
if x < 0: a.remove(x)
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
Using a shallow copy of the list:
for x in numbers[:]:
if numbers.count(x) >= 2:
numbers.remove(x)
print(numbers) # [1, 6, 5, 2, 3]
Alternatives:
Preserving the order of the list:
Using dict.fromkeys()
print(list(dict.fromkeys(numbers).keys())) # [1, 6, 5, 2, 3]
Using more_itertools.unique_everseen(iterable, key=None):
from more_itertools import unique_everseen
print(list(unique_everseen(numbers))) # [1, 6, 5, 2, 3]
Using pandas.unique:
import pandas as pd
print(pd.unique(numbers).tolist()) # [1, 6, 5, 2, 3]
Using collections.OrderedDict([items]):
from collections import OrderedDict
print(list(OrderedDict.fromkeys(numbers))) # [1, 6, 5, 2, 3]
Using itertools.groupby(iterable[, key]):
from itertools import groupby
print([k for k,_ in groupby(numbers)]) # [1, 6, 5, 2, 3]
Ignoring the order of the list:
Using numpy.unique:
import numpy as np
print(np.unique(numbers).tolist()) # [1, 2, 3, 5, 6]
Using set():
print(list(set(numbers))) # [1, 2, 3, 5, 6]
Using frozenset([iterable]):
print(list(frozenset(numbers))) # [1, 2, 3, 5, 6]
Why don't you simply use a set:
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
numbers = list(set(numbers))
print(numbers)
Before anything, the first advice I can give is to never edit over an array that you are looping. All kinds of wacky stuff happens. Your code is fine (I recommend reading other answers though, there's an easier way to do this with a set, which pretty much handles the duplication thing for you).
Instead of removing number from the array you are looping, just clone the array you are looping in the actual for loop syntax with slicing.
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
for x in numbers[:]:
if numbers.count(x) >= 2:
numbers.remove(x)
print(numbers)
print("Final")
print(numbers)
The answer there is numbers[:], which gives back a clone of the array. Here's the print output:
[1, 1, 1, 6, 5, 5, 2, 3]
[1, 1, 6, 5, 5, 2, 3]
[1, 6, 5, 5, 2, 3]
[1, 6, 5, 5, 2, 3]
[1, 6, 5, 5, 2, 3]
[1, 6, 5, 2, 3]
[1, 6, 5, 2, 3]
[1, 6, 5, 2, 3]
[1, 6, 5, 2, 3]
Final
[1, 6, 5, 2, 3]
Leaving a placeholder here until I figure out how to explain why in your particular case it's not working, like the actual step by step reason.
Another way to solve this making use of the beautiful language that is Python, is through list comprehension and sets.
Why a set. Because the definition of this data structure is that the elements are unique, so even if you try to put in multiple elements that are the same, they won't appear as repeated in the set. Cool, right?
List comprehension is some syntax sugar for looping in one line, get used to it with Python, you'll either use it a lot, or see it a lot :)
So with list comprehension you will iterate an iterable and return that item. In the code below, x represents each number in numbers, x is returned to be part of the set. Because the set handles duplicates...voila, your code is done.
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
nubmers_a_set = {x for x in numbers }
print(nubmers_a_set)
This seems like homework but here is a possible solution:
import numpy as np
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
filtered = list(np.unique(numbers))
print(filtered)
#[1, 2, 3, 5, 6]
This solution does not preserve the ordering. If you need also the ordering use:
filtered_with_order = list(dict.fromkeys(numbers))
Why don't you use fromkeys?
numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
numbers = list(dict.fromkeys(numbers))
Output: [1,6,5,2,3]
The flow is as follows.
Now the list is [1, 1, 1, 1, 6, 5, 5, 2, 3] and Index is 0.
The x is 1. The numbers.count(1) is 4 and thus the 1 at index 0 is removed.
Now the numbers list becomes [1, 1, 1, 6, 5, 5, 2, 3] but the Index will +1 and becomes 1.
The x is 1. The numbers.count(1) is 3 and thus the 1 and index 1 is removed.
Now the numbers list becomes [1, 1, 6, 5, 5, 2, 3] but the Index will +1 and becomes 2.
The x will be 6.
etc...
So that's why there are two 1's.
Please correct me if I am wrong. Thanks!
A fancy method is to use collections.Counter:
>>> from collections import Counter
>>> numbers = [1, 1, 1, 1, 6, 5, 5, 2, 3]
>>> c = Counter(numbers)
>>> list(c.keys())
[1, 6, 5, 2, 3]
This method have a linear time complexity (O(n)) and uses a really performant library.
You can try:
from more_itertools import unique_everseen
items = [1, 1, 1, 1, 6, 5, 5, 2, 3]
list(unique_everseen(items))
or
from collections import OrderedDict
>>> items = [1, 1, 1, 1, 6, 5, 5, 2, 3]
>>> list(OrderedDict.fromkeys(items))
[1, 2, 0, 3]
more you can find here
How do you remove duplicates from a list whilst preserving order?
This question already has answers here:
Compact way to assign values by slicing list in Python
(5 answers)
Closed 2 years ago.
Is there a fast way to get the 1st, 3rd and 5th element from an array in Python like a[0,2,4]? Thanks.
Using operator.itemgetter:
>>> lst = [1,2,3,4,5,6,7]
>>> import operator
>>> get135 = operator.itemgetter(0, 2, 4)
>>> get135(lst)
(1, 3, 5)
You could just do this, a simple method with no imports necessary:
>>> a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
>>> [a[i] for i in (0, 2, 4)]
[1, 3, 5]
Slicing is the simplest way to do this. You'll want to slice it with [0:5:2].
>>> range(100)[0:5:2]
[0, 2, 4]
This is the equivalent of saying "Starting from element 0, up to (but not including) element 5, give me every 2nd element."
You can use slicing to get this.
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
d = a[0:5:2]
print d
[1, 3, 5]
If you want to generalize to every other entry you would use
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
b = a[::2]
print b
[1, 3, 5, 7, 9]
You can use ,
Slicing operation on list.
>>> a=[i for i in range(10)]
>>> a[::2]
Ouput:
[0, 2, 4, 6, 8]
perhaps:
[list[0], list[2], list[4]]
Is there a way to compare all elements of a list (ie one such as [4, 3, 2, 1, 4, 3, 2, 1, 4]) to all others and return, for each element, the number of other elements it is different from (ie, for the list above [6, 7, 7, 7, 6, 7, 7, 7, 6])? I then will need to add the numbers from this list.
li = [4, 3, 2, 1, 4, 3, 2, 1, 4]
from collections import Counter
c = Counter(li)
print c
length = len(li)
print [length - c[el] for el in li]
Creating c before executing [length - c[el] for el in li] is better than doing count(i) for each element i of the list, because that means that count() do the same count several times (each time it encounters a given element, it counts it)
By the way, another way to write it:
map(lambda x: length-c[x] , li)
You can get similar counter with count() method.
And subtract the total number.
Do it in one line with a comprehension list.
>>> l = [4, 3, 2, 1, 4, 3, 2, 1, 4]
>>> [ len(l)-l.count(i) for i in l ]
[6, 7, 7, 7, 6, 7, 7, 7, 6]
For Python 2.7:
test = [4, 3, 2, 1, 4, 3, 2, 1, 4]
length = len(test)
print [length - test.count(x) for x in test]
You could just use the sum function, along with a generator expression.
>>> l = [4, 3, 2, 1, 4, 3, 2, 1, 4]
>>> length = len(l)
>>> print sum(length - l.count(i) for i in l)
60
The good thing about a generator expression is that you don't create an actual list in memory, but functions like sum can still iterate over them and produce the desired result. Note, however, that once you iterate over a generator once, you can't iterate over it again.
I have List = [0, 1, 2, 3] and I want to assign all of them a number say, 5: List = [5, 5, 5, 5]. I know I can do List = [5] * 4 easily. But I also want to be able to assign say, 6 to the last 3 elements.
List = [5, 5, 5, 5]
YourAnswer(List, 6, 1, 3)
>> [5, 6, 6, 6]
where YourAnswer is a function.
Anything similar would be helpful. Prefer to avoid for-loop. But if there is no known possibility, please tell me.
You can actually use the slice assignment mechanism to assign to part of the list.
>>> some_list = [0, 1, 2, 3]
>>> some_list[:1] = [5]*1
>>> some_list[1:] = [6]*3
>>> some_list
[5, 6, 6, 6]
This is beneficial if you are updating the same list, but in case you want to create a new list, you can simply create a new list based on your criteria by concatenation
>>> [5]*1 + [6]*3
[5, 6, 6, 6]
You can wrap it over to a function
>>> def YourAnswer(lst, repl, index, size):
lst[index:index + size] = [repl] * size
>>> some_list = [0, 1, 2, 3]
>>> YourAnswer(some_list, 6, 1, 3)
>>> some_list
[0, 6, 6, 6]
In [138]: def func(lis,x,y,z):
.....: return lis[:y]+[x]*z
.....:
In [139]: lis=[5,5,5,5]
In [140]: func(lis,6,1,3)
Out[140]: [5, 6, 6, 6]
you can use + to implement it
def createList(val1, n1, val2, n2):
return [val1] * n1 + [val2]*n2
This question already has answers here:
Repeating elements of a list n times
(14 answers)
Closed 5 months ago.
I want to write a function that reads a list [1,5,3,6,...]
and gives [1,1,5,5,3,3,6,6,...].
Any idea how to do it?
>>> a = range(10)
>>> [val for val in a for _ in (0, 1)]
[0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]
N.B. _ is traditionally used as a placeholder variable name where you do not want to do anything with the contents of the variable. In this case it is just used to generate two values for every time round the outer loop.
To turn this from a list into a generator replace the square brackets with round brackets.
>>> a = [1, 2, 3]
>>> b = []
>>> for i in a:
b.extend([i, i])
>>> b
[1, 1, 2, 2, 3, 3]
or
>>> [a[i//2] for i in range(len(a)*2)]
[1, 1, 2, 2, 3, 3]
numpy.repeat does what you want:
import numpy as np
yourList = [1,5,3,6]
n = 2
list(np.repeat(yourList, n))
result:
[1, 1, 5, 5, 3, 3, 6, 6]
If you don't mind using numpy arrays you can also omit the list() call in the last line.
If you already have the roundrobin recipe described in the documentation for itertools—and it is quite handy—then you can just use
roundrobin(my_list, my_list)
I would use zip and itertools.chain.
>>> import itertools
>>> l = [1,5,3,6,16]
>>> list(itertools.chain(*zip(l,l)))
[1, 1, 5, 5, 3, 3, 6, 6, 16, 16]
Note: I only used list to consume the generator to make it fit for printing. You probably don't need the list call in your code...
It is possible use list multiplication. Case you need each list member together just use sorted method.
>>> lst = [1,2,3,4]
>>> sorted(lst*2)
[1,1,2,2,3,3,4,4]
With a little slicing...
>>> a = [3, 1, 4, 1, 5]
>>> a[:0] = a[::2] = a[1::2] = a[:]
>>> a
[3, 3, 1, 1, 4, 4, 1, 1, 5, 5]
I would use
import itertools
foo = [1, 5, 3, 6]
new = itertools.chain.from_iterable([item, item] for item in foo)
new will be an iterator that lazily iterates over the duplicated items. If you need the actual list computed, you can do list(new) or use one of the other solutions.
One can use zip and flat the list
a = [3, 1, 4, 1, 5]
sum(zip(a,a), ()) # (3, 3, 1, 1, 4, 4, 1, 1, 5, 5)
The output is a tuple, but conversion to a list is easy.
Regarding flatting a tuple with sum see https://stackoverflow.com/a/952946/11769765 and python: flat zip.
For as much as Guido dislikes the functional operators, they can be pretty darned handy:
>>> from operator import add
>>> a = range(10)
>>> b = reduce(add, [(x,x) for x in a])
For a more general approach you could go with a list comprehension and a factor term.
Example
sample_list = [1,2,3,4,5]
factor = 2
new_list = [entry for entry in sample_list for _ in range(factor)]
Out:
>>> new_list
[1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
Changing the factor variable will change how many entry of each item in the list you will have in the new list.
You could also wrap it up in a function:
def multiply_list_entries(list_, factor = 1):
list_multiplied = [entry for entry in list_ for _ in range(factor)]
return list_multiplied
>>> multiply_list_entries(sample_list, factor = 3)
[1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5]
ls1=[1,2,3]
ls2=[]
for i in ls1:
ls2.append(i)
ls2.append(i)
This code duplicates each elements in ls1
the result ls2 --> [1,1,2,2,3,3]