This question already has answers here:
Combining two sorted lists in Python
(22 answers)
Closed 6 months ago.
I have a function insarrintomain which takes 2 arguments. The first one is main list, the second one is an insert list. I need to create a new list, where I will have all numbers from both arrays in increasing order. For example: main is [1, 2, 3, 4, 8, 9, 12], ins is [5, 6, 7, 10]. I should get [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12]
Here is my code:
def insarrintomain(main, ins):
    arr = []
    c = 0
    for i, el in enumerate(main):
        if c < len(ins):
            if el > ins[c]:
                for j, ins_el in enumerate(ins):
                    if ins_el < el:
                        c += 1
                        arr.append(ins_el)
                    else:
                        break
            else:
                arr.append(el)
        else:
            arr.append(el)
    return arr
What did I miss?
Why not simply:
new_array = main + ins
new_array.sort()
The pythonic way of solving this problem is something like this:
def insarrintomain(main, ins):
    new_list = main + ins
    new_list.sort()
    return new_list
In Python readability counts.
This code is pythonic because it’s easy to read: the function takes two lists, concatenates them into one new list, sorts the result and returns it.
Another reason why this code is pythonic is that it uses built-ins. There is no need to reinvent the wheel: someone has already needed to concatenate two lists, or to sort one. Built-ins such as sort have been optimised for decades and are mostly written in C. There is little chance of beating them with pure Python.
Let’s analyse the implementation from @RiccardoBucco.
That is perfect C code. You can barely understand what is happening without comments. The algorithm is the best possible for our case (it exploits the existing ordering of the lists), and if you can find an implementation of that algorithm in the standard library you should substitute sort with it.
But this is Python, not C. Solving your problem from scratch and not by using built-ins results in an uglier and slower solution.
You can verify this by running the following script and seeing how much time each implementation needs:
import time
long_list = list(range(100000))
def insarrintomain(main, ins):
    # insert here the code you want to test
    return new_list
start = time.perf_counter()
_ = insarrintomain(long_list, long_list)
stop = time.perf_counter()
print(stop - start)
On my computer my implementation took nearly 0.003 seconds, while the C-style implementation from @RiccardoBucco needed 0.055 seconds.
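A single perf_counter measurement can be noisy. As a rough sketch, the standard timeit module can repeat each measurement and keep the best run. The names sort_based, merge_based and best_time are made up for this illustration, and the numbers will differ from machine to machine:

```python
import timeit


def sort_based(main, ins):
    # Concatenate, then let the built-in sort do the work.
    return sorted(main + ins)


def merge_based(main, ins):
    # Hand-written two-pointer merge of two already sorted lists.
    i = j = 0
    arr = []
    while i < len(main) and j < len(ins):
        if main[i] < ins[j]:
            arr.append(main[i])
            i += 1
        else:
            arr.append(ins[j])
            j += 1
    arr.extend(main[i:])  # at most one of these two tails is non-empty
    arr.extend(ins[j:])
    return arr


def best_time(func, data, repeats=5):
    # Run func(data, data) `repeats` times and return the fastest run in seconds.
    return min(timeit.repeat(lambda: func(data, data), number=1, repeat=repeats))


if __name__ == "__main__":
    long_list = list(range(100000))
    for func in (sort_based, merge_based):
        print(func.__name__, best_time(func, long_list))
```

Taking the minimum over several runs reduces the influence of whatever else the machine happens to be doing.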
A simple solution would be:
def insarrintomain(main, ins):
    return sorted(main + ins)
But this solution is clearly not optimal (the complexity is high, as we are not using the fact that the input arrays are already sorted). Specifically, the complexity here is O(k * log(k)), where k is the sum of n and m (n is the length of main and m is the length of ins).
A better solution:
def insarrintomain(main, ins):
    i = j = 0
    arr = []
    while i < len(main) and j < len(ins):
        if main[i] < ins[j]:
            arr.append(main[i])
            i += 1
        else:
            arr.append(ins[j])
            j += 1
    while i < len(main):
        arr.append(main[i])
        i += 1
    while j < len(ins):
        arr.append(ins[j])
        j += 1
    return arr
Example:
>>> insarrintomain([1, 2, 3, 4, 8, 9, 12], [5, 6, 7, 10])
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12]
This solution is much faster for big arrays (O(k), where k is the sum of n and m, n is the length of main and m is the length of ins).
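For what it's worth, the standard library already ships a merge for sorted inputs, heapq.merge. A minimal sketch, where merge_sorted is just an illustrative name and both inputs are assumed to be sorted in increasing order:

```python
from heapq import merge


def merge_sorted(main, ins):
    # heapq.merge returns a lazy iterator over the merged values,
    # so list() is needed to materialise the result.
    return list(merge(main, ins))


print(merge_sorted([1, 2, 3, 4, 8, 9, 12], [5, 6, 7, 10]))
```

Whether this beats sorted(main + ins) in practice is worth measuring rather than assuming.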
Related
I need to find the subsets of a set L = [0, 3, 4, 6, 9, 11, 12, 13].
I saw online the following solution:
def powerset(s):
    x = len(s)
    masks = [1 << i for i in range(x)]
    for i in range(1 << x):
        yield [ss for mask, ss in zip(masks, s) if i & mask]
print(list(powerset(L)))
However, this code will also return my full set L and an empty set which I do not want.
I was looking for a way to do this directly in python without using any additional packages.
Here is a pretty simple solution. I did it as a list of sets, but you could easily swap to a list of tuples if you'd rather.
def subsets(r):
    res = []
    for i in range(1, 2**len(r)-1):
        key = format(i, 'b').zfill(len(r))  # one binary digit per element, for any list length
        res.append({x for j, x in enumerate(r) if key[~j] == '1'})
    return res
L = [0, 3, 4, 6, 9, 11, 12, 13]
print(subsets(L))
Edit: I just realised I pretty much just replicated the solution you already had, and probably in a less efficient way. Oh well I will leave this up as it does answer the question.
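Another way to skip the empty set and the full set, sketched here with itertools from the standard library (no third-party package needed): only generate combinations of sizes 1 to len(items) - 1. The function name proper_nonempty_subsets is made up for this example.

```python
from itertools import chain, combinations


def proper_nonempty_subsets(items):
    items = list(items)
    # Sizes 1 .. len(items) - 1 leave out the empty set and the full set.
    sizes = range(1, len(items))
    return chain.from_iterable(combinations(items, size) for size in sizes)


L = [0, 3, 4, 6, 9, 11, 12, 13]
subsets = list(proper_nonempty_subsets(L))
print(len(subsets))  # 2**8 - 2 = 254
```

Each subset comes back as a tuple; wrap it in set() or list() if another type is preferred.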
I wrote this code:
arr = [1, 2, 3, 4, 5, 6]
arr1 = []
count = 0
arr.append(0) # I forgot to write this line.
for i in range(0, len(arr)):
    count = sum(arr)
    arr.remove(arr[0])
    arr1.append(count)
print(arr1)
The output is:
[20, 20, 19, 16, 10, 0]
But I have a small problem: the execution time is a little too long for large lists.
Can you tell me if there is another way to write it?
Thanks!
I'd suggest itertools.accumulate
from itertools import accumulate
from operator import add
result = list(accumulate(reversed(arr), add))[::-1]
In a few tests it performed much better. For a list of 20k numbers:
accumulate 0:00:00.004001
question 0:00:05.340281
In Python, removing the first element of a list requires all the subsequent elements to be moved and therefore takes O(n). Since you're doing that n times (with n being the length of your array), the overall time complexity of your solution is O(n^2).
Any solution that only runs in O(n) time complexity should be fine for you. Here is one that doesn't require any external imports and is similar in style to your original solution:
arr = [1, 2, 3, 4, 5, 6]
total = sum(arr)
sums = [total]
for to_remove in arr[:-1]:
    total -= to_remove
    sums.append(total)
print(sums)
[21, 20, 18, 15, 11, 6]
It is not perfect but a little faster:
import timeit
def function():
    arr = [1, 2, 3, 4, 5, 6]
    arr1 = []
    count = 0
    for i in range(0, len(arr)):
        count = sum(arr)
        arr.remove(arr[0])
        arr1.append(count)
    return arr1
print(timeit.timeit(function))
Time: 1.9620031519999657
import timeit
def function():
    arr = [1, 2, 3, 4, 5, 6]
    arr1 = [sum(arr)]
    i = 0
    while i < len(arr)-1:
        arr1.append(arr1[i] - arr[i])
        i = i + 1
    arr1.append(arr1[-1]-arr[-1])
    return arr1
print(timeit.timeit(function))
Time: 1.408351424999978
If you are ever time-conscious, try to use libraries like numpy.
In numpy this is very easy and efficient.
import numpy as np
a = np.arange(7) # since you also want to include 6 in array, 0, 1, 2, ..., 5, 6
np.cumsum(a[::-1])[::-1]
You would need to install numpy separately. If you don't know how, you can install numpy with:
pip install numpy
This question already has answers here:
How to find the cumulative sum of numbers in a list?
(25 answers)
Closed 6 years ago.
I have an old list and I want to sum up every single element to a new list:
lst_old = [1, 2, 3, 4, 5]
lst_new = [1, 3, 6, 10, 15]
Is there an elegant way to implement that in Python 3 with short and fast code? Apart from sum() which only prints last element I couldn't find a proper solution for my problem.
You can use itertools.accumulate, eg:
from itertools import accumulate
lst_old = [1, 2, 3, 4, 5]
lst_new = list(accumulate(lst_old))
# [1, 3, 6, 10, 15]
itertools.accumulate, as mentioned by Jon Clements, is the best way to do it. However, in case you want an explicit way to do it, here goes:
lst_old = [1, 2, 3, 4, 5]
total = 0
lst_new = []
for item in lst_old:
    total += item
    lst_new.append(total)
The main advantage of this is that you can wrap it in a function, and any transformation or validation that needs to be performed can be added.
For example, let's say you want the function to stop once the running total passes a limit; the following will help:
def accumulate_items(lst_old, limit=0):
    total = 0  # named total so the built-in sum() is not shadowed
    output_list = []
    for item in lst_old:
        total += item
        output_list.append(total)
        if limit and total > limit:
            break
    return output_list
The flexibility is nearly limitless if any transformation or operation needs to be done here. Have a pre-condition that needs to be set? Go ahead. Have a post-condition that needs to be tested? Go ahead. Have a pretty huge list and want a generator-based solution just like itertools.accumulate? Go ahead. Need to add validation or exception handling? Go ahead.
However, no transformation and you simply want to accumulate? The previous answer is the best. Using sum with list slices is pretty slow, as the complexity becomes quadratic.
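The generator-based variant mentioned above could look roughly like this. It is only a sketch: running_totals and its limit parameter are hypothetical names, and the stopping rule mirrors the list version (the first total that exceeds the limit is still produced).

```python
def running_totals(values, limit=None):
    # Yield cumulative sums one at a time instead of building a list.
    total = 0
    for value in values:
        total += value
        yield total
        if limit is not None and total > limit:
            return  # stop right after the first total that passes the limit


print(list(running_totals([1, 2, 3, 4, 5])))
print(list(running_totals([1, 2, 3, 4, 5], limit=5)))
```

Because nothing is stored, this also works on inputs that are themselves lazy iterables.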
You can take the sum of a slice of a list using listname[start:end], with both start and end as optional arguments (defaulting to the beginning and end of the list):
lst_old = [1, 2, 3, 4, 5]
lst_new = []
for i, num in enumerate(lst_old):
    index = i+1
    var = sum(lst_old[:index])
    print(var)
    lst_new.append(var)
This question already has answers here:
Remove all the elements that occur in one list from another
(13 answers)
Closed 6 years ago.
I am looking for a way to remove all values within a list from another list.
Something like this:
a = range(1,10)
a.remove([2,3,7])
print a
a = [1,4,5,6,8,9]
>>> a = range(1, 10)
>>> [x for x in a if x not in [2, 3, 7]]
[1, 4, 5, 6, 8, 9]
I was looking for a fast way to do this, so I ran some experiments with the suggested approaches. I was surprised by the results, so I want to share them with you.
Experiments were done using the pythonbenchmark tool, with:
a = range(1,50000) # Source list
b = range(1,15000) # Items to remove
Results:
def comprehension(a, b):
    return [x for x in a if x not in b]
5 tries, average time 12.8 sec
def filter_function(a, b):
    return filter(lambda x: x not in b, a)
5 tries, average time 12.6 sec
def modification(a,b):
    for x in b:
        try:
            a.remove(x)
        except ValueError:
            pass
    return a
5 tries, average time 0.27 sec
def set_approach(a,b):
    return list(set(a)-set(b))
5 tries, average time 0.0057 sec
I also made another measurement with bigger input sizes for the last two functions:
a = range(1,500000)
b = range(1,100000)
And the results:
For modification (remove method) - average time is 252 seconds
For set approach - average time is 0.75 seconds
So you can see that the approach with sets is significantly faster than the others. Note that it does not keep duplicate items or the original order, but if you don't need that, it's the one for you.
There is almost no difference between the list comprehension and the filter function. Using remove is ~50 times faster, but it modifies the source list.
The best choice is using sets: it's more than 1000 times faster than the list comprehension!
If you don't have repeated values, you could use set difference.
x = set(range(10))
y = x - set([2, 3, 7])
# y = set([0, 1, 4, 5, 6, 8, 9])
and then convert back to list, if needed.
a = range(1,10)
itemsToRemove = set([2, 3, 7])
b = filter(lambda x: x not in itemsToRemove, a)
or
b = [x for x in a if x not in itemsToRemove]
Don't create the set inside the lambda or inside the comprehension. If you do, it'll be recreated on every iteration, defeating the point of using a set at all.
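A small sketch of that advice, with a made-up helper name remove_all: the set is built once, outside the comprehension, so each membership test is a constant-time lookup on average. It assumes the items are hashable.

```python
def remove_all(source, unwanted):
    # Build the lookup set a single time, before filtering.
    unwanted_set = set(unwanted)
    # Order and duplicates of `source` are preserved.
    return [x for x in source if x not in unwanted_set]


print(remove_all(range(1, 10), [2, 3, 7]))
```

Unlike the plain set difference, this keeps repeated values and the original ordering of the source list.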
The simplest way is
>>> a = range(1, 10)
>>> for x in [2, 3, 7]:
...     a.remove(x)
...
>>> a
[1, 4, 5, 6, 8, 9]
One possible problem here is that each time you call remove(), all the items are shuffled down the list to fill the hole. So if a grows very large this will end up being quite slow.
This way builds a brand new list. The advantage is that we avoid all the shuffling of the first approach
>>> removeset = set([2, 3, 7])
>>> a = [x for x in a if x not in removeset]
If you want to modify a in place, just one small change is required
>>> removeset = set([2, 3, 7])
>>> a[:] = [x for x in a if x not in removeset]
Others have suggested ways to make newlist after filtering e.g.
newl = [x for x in l if x not in [2,3,7]]
or
newl = filter(lambda x: x not in [2,3,7], l)
but from your question it looks like you want in-place modification. For that you can do the following, which will also be much faster if the original list is long and the items to remove are few:
l = range(1,10)
for o in set([2,3,7,11]):
    try:
        l.remove(o)
    except ValueError:
        pass
print l
output:
[1, 4, 5, 6, 8, 9]
I am checking for the ValueError exception so it works even if items are not in the original list.
Also, if you do not need in-place modification, the solution by S.Mark is simpler.
>>> a=range(1,10)
>>> for i in [2,3,7]: a.remove(i)
...
>>> a
[1, 4, 5, 6, 8, 9]
>>> a=range(1,10)
>>> b=map(a.remove,[2,3,7])
>>> a
[1, 4, 5, 6, 8, 9]
I am looking for a way to easily split a python list in half.
So that if I have an array:
A = [0,1,2,3,4,5]
I would be able to get:
B = [0,1,2]
C = [3,4,5]
A = [1,2,3,4,5,6]
B = A[:len(A)//2]
C = A[len(A)//2:]
If you want a function:
def split_list(a_list):
    half = len(a_list)//2
    return a_list[:half], a_list[half:]
A = [1,2,3,4,5,6]
B, C = split_list(A)
A little more generic solution (you can specify the number of parts you want, not just split 'in half'):
def split_list(alist, wanted_parts=1):
    length = len(alist)
    return [alist[i*length // wanted_parts: (i+1)*length // wanted_parts]
            for i in range(wanted_parts)]
A = [0,1,2,3,4,5,6,7,8,9]
print(split_list(A, wanted_parts=1))
print(split_list(A, wanted_parts=2))
print(split_list(A, wanted_parts=8))
f = lambda A, n=3: [A[i:i+n] for i in range(0, len(A), n)]
f(A)
n - the predefined length of result arrays
def split(arr, size):
    arrs = []
    while len(arr) > size:
        piece = arr[:size]
        arrs.append(piece)
        arr = arr[size:]
    arrs.append(arr)
    return arrs
Test:
x=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
print(split(x, 5))
result:
[[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13]]
If you don't care about the order...
def split(list):
    return list[::2], list[1::2]
list[::2] gets every second element in the list starting from the 0th element.
list[1::2] gets every second element in the list starting from the 1st element.
Using list slicing. The syntax is basically my_list[start_index:end_index]
>>> i = [0,1,2,3,4,5]
>>> i[:3] # same as i[0:3] - grabs from first to third index (0->2)
[0, 1, 2]
>>> i[3:] # same as i[3:len(i)] - grabs from fourth index to end
[3, 4, 5]
To get the first half of the list, you slice from the first index to len(i)//2 (where // is integer division, so 3//2 gives the floored result of 1 instead of the invalid list index 1.5):
>>> i[:len(i)//2]
[0, 1, 2]
..and then swap the values around to get the second half:
>>> i[len(i)//2:]
[3, 4, 5]
B,C=A[:len(A)//2],A[len(A)//2:]
Here is a common solution, split arr into count part
def split(arr, count):
    return [arr[i::count] for i in range(count)]
def splitter(A):
    B = A[0:len(A)//2]
    C = A[len(A)//2:]
    return (B,C)
I tested, and the double slash is required to force int division in python 3. My original post was correct, although wysiwyg broke in Opera, for some reason.
If you have a big list, it's better to use itertools and write a function to yield each part as needed:
from itertools import islice
def make_chunks(data, SIZE):
    it = iter(data)
    # use `xrange` if you are on Python 2.7:
    for i in range(0, len(data), SIZE):
        yield [k for k in islice(it, SIZE)]
You can use this like:
A = [0, 1, 2, 3, 4, 5, 6]
size = len(A) // 2
for sample in make_chunks(A, size):
    print(sample)
The output is:
[0, 1, 2]
[3, 4, 5]
[6]
Thanks to @thefourtheye and @Bede Constantinides
This is similar to other solutions, but a little faster.
# Usage: split_half([1,2,3,4,5]) Result: ([1, 2], [3, 4, 5])
def split_half(a):
    half = len(a) >> 1
    return a[:half], a[half:]
There is an official Python recipe for the more generalized case of splitting an array into smaller arrays of size n.
from itertools import izip_longest
def grouper(n, iterable, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)
This code snippet is from the python itertools doc page.
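Note that izip_longest only exists in Python 2; in Python 3 the same function is called itertools.zip_longest. A sketch of the recipe adapted under that assumption, with the argument order kept as in the snippet above:

```python
from itertools import zip_longest


def grouper(n, iterable, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # The same iterator is repeated n times, so each output tuple
    # pulls n consecutive items from it.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


print(["".join(group) for group in grouper(3, "ABCDEFG", "x")])
```

The last chunk is padded with fillvalue, so filter the padding out if shorter final chunks are wanted.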
10 years later.. I thought - why not add another:
arr = 'Some random string' * 10; n = 4
print([arr[e:e+n] for e in range(0,len(arr),n)])
While the answers above are more or less correct, you may run into trouble with a / 2: in Python 3 its result is always a float (2.5 for a = 5, but also 3.0 for a = 6), and the same holds in earlier versions if you specify from __future__ import division at the beginning of your script. A float cannot be used as a slice index, so you are in any case better off using integer division, i.e. a // 2, to keep your code forward compatible.
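To illustrate the difference on Python 3 (a small sketch, variable names chosen for this example only): // yields an int that can be used in a slice, while / yields a float that the slice rejects with a TypeError.

```python
A = [0, 1, 2, 3, 4]

half = len(A) // 2          # integer division: 5 // 2 == 2
first, second = A[:half], A[half:]
print(first, second)

try:
    A[:len(A) / 2]          # true division: 5 / 2 == 2.5, not a valid index
except TypeError as error:
    print("float index rejected:", error)
```

With an odd length, the extra element ends up in the second half.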
#for python 3
A = [0,1,2,3,4,5]
l = len(A)/2
B = A[:int(l)]
C = A[int(l):]
General solution split list into n parts with parameter verification:
def sp(l,n):
    # split list l into n parts
    if l:
        p = len(l) if n < 1 else len(l) // n # no split
        p = p if p > 0 else 1 # split down to elements
        for i in range(0, len(l), p):
            yield l[i:i+p]
    else:
        yield [] # empty list split returns empty list
Since there was no restriction put on which package we can use.. Numpy has a function called split with which you can easily split an array any way you like.
Example
import numpy as np
A = np.array(list('abcdefgh'))
np.split(A, 2) # np.split raises ValueError unless the array divides evenly; np.array_split does not
With hints from @ChristopheD
def line_split(N, K=1):
    length = len(N)
    return [N[i*length//K:(i+1)*length//K] for i in range(K)]
A = [0,1,2,3,4,5,6,7,8,9]
print(line_split(A,1))
print(line_split(A,2))
Another take on this problem in 2020. Here's a generalization. I interpret 'divide a list in half' to mean exactly two lists, with no spillover into a third list when there is an odd element out. For instance, if the array length is 19 and we chunk it by 19 // 2 = 9, we end up with two arrays of length 9 and a third array of length 1 (three arrays in total). If we want a general solution that always gives two arrays, I will assume we are happy with two resulting arrays that are not equal in length (one will be longer than the other), and that it is OK for the order to be mixed (alternating in this case).
"""
arrayinput --> is an array of length N that you wish to split 2 times
"""
ctr = 1 # lets initialize a counter
holder_1 = []
holder_2 = []
for i in range(len(arrayinput)):
    if ctr == 1 :
        holder_1.append(arrayinput[i])
    elif ctr == 2:
        holder_2.append(arrayinput[i])
    ctr += 1
    if ctr > 2 : # if it exceeds 2 then we reset
        ctr = 1
This concept works for any number of partitions (you'd have to tweak the code depending on how many parts you want) and is rather straightforward to read. To speed things up, you could even write this loop in Cython / C / C++. Then again, I've tried this code on relatively small lists (~10,000 rows) and it finishes in a fraction of a second.
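One possible way to avoid tweaking the code per number of parts is to deal the items out by index modulo the number of parts. This is only a sketch of that idea; deal_round_robin and parts are names invented here:

```python
def deal_round_robin(items, parts=2):
    # One holder list per part, filled in alternating (round-robin) order.
    holders = [[] for _ in range(parts)]
    for index, item in enumerate(items):
        holders[index % parts].append(item)
    return holders


print(deal_round_robin(list(range(19)), parts=2))
```

With 19 items and two parts, the first holder gets 10 items and the second gets 9, so there is never a third leftover list.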
Just my two cents.
Thanks!
from itertools import islice
Input = [2, 5, 3, 4, 8, 9, 1]
small_list_length = [1, 2, 3, 1]
Input1 = iter(Input)
Result = [list(islice(Input1, elem)) for elem in small_list_length]
print("Input list :", Input)
print("Split length list: ", small_list_length)
print("List after splitting", Result)
You can try something like this with numpy
import numpy as np
np.array_split([1,2,3,4,6,7,8], 2)
result:
[array([1, 2, 3, 4]), array([6, 7, 8])]