I have a function insarrintomain which takes 2 arguments. The first one is the main list, the second one is the insert list. I need to create a new list containing all numbers from both lists in increasing order. For example: main is [1, 2, 3, 4, 8, 9, 12], ins is [5, 6, 7, 10]. I should get [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12].
Here is my code:
def insarrintomain(main, ins):
    arr = []
    c = 0
    for i, el in enumerate(main):
        if c < len(ins):
            if el > ins[c]:
                for j, ins_el in enumerate(ins):
                    if ins_el < el:
                        c += 1
                        arr.append(ins_el)
                    else:
                        break
            else:
                arr.append(el)
        else:
            arr.append(el)
    return arr
What did I miss?
Why not simply concatenate and sort?
new_array = main + ins
new_array.sort()
The Pythonic way of solving this problem is something like this:
def insarrintomain(main, ins):
    new_list = main + ins
    new_list.sort()
    return new_list
In Python readability counts.
This code is pythonic because it’s easy to read: the function takes two lists, concatenates them into one new list, sorts the result and returns it.
Another reason why this code is Pythonic is that it uses built-ins. There is no need to reinvent the wheel: someone has already needed to concatenate two lists, or to sort one. Built-ins such as sort have been optimised for decades and are mostly written in C. There is little chance we can beat them in pure Python.
Let’s analyse the implementation from @RiccardoBucco.
That is perfect C code. You can barely understand what is happening without comments. The algorithm is the best possible for our case (it exploits the existing ordering of the lists), and if you can find an implementation of that algorithm in the standard library, you should substitute sort with it.
But this is Python, not C. Solving your problem from scratch instead of using built-ins results in an uglier and slower solution.
You can verify this by running the following script and watching how much time each implementation needs:
import time
long_list = [x for x in range(100000)]
def insarrintomain(main, ins):
    # insert here the code you want to test
    return new_list
start = time.perf_counter()
_ = insarrintomain(long_list, long_list)
stop = time.perf_counter()
print(stop - start)
On my computer, my implementation took about 0.003 seconds, while the C-style implementation from @RiccardoBucco needed 0.055 seconds.
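For what it's worth, the standard library does ship a merge for already-sorted inputs: heapq.merge. Below is a minimal sketch, assuming both lists are already sorted in increasing order; the name merge_sorted is only illustrative and does not come from the answers here.

```python
from heapq import merge


def merge_sorted(main, ins):
    # heapq.merge lazily merges already-sorted iterables;
    # list() materialises the merged result.
    return list(merge(main, ins))


print(merge_sorted([1, 2, 3, 4, 8, 9, 12], [5, 6, 7, 10]))
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12]
```

Whether this is actually faster than main + ins followed by sort() is worth measuring with the timing script above: the built-in sort detects runs that are already ordered, so it handles two concatenated sorted lists very efficiently.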
A simple solution would be:
def insarrintomain(main, ins):
    return sorted(main + ins)
But this solution is clearly not optimal (the complexity is high, as we are not using the fact that the input arrays are already sorted). Specifically, the complexity here is O(k * log(k)), where k is the sum of n and m (n is the length of main and m is the length of ins).
A better solution:
def insarrintomain(main, ins):
    i = j = 0
    arr = []
    while i < len(main) and j < len(ins):
        if main[i] < ins[j]:
            arr.append(main[i])
            i += 1
        else:
            arr.append(ins[j])
            j += 1
    while i < len(main):
        arr.append(main[i])
        i += 1
    while j < len(ins):
        arr.append(ins[j])
        j += 1
    return arr
Example:
>>> insarrintomain([1, 2, 3, 4, 8, 9, 12], [5, 6, 7, 10])
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12]
This solution is much faster for big arrays (O(k), where k is the sum of n and m, n is the length of main and m is the length of ins).
I have a list arr and a value dif. I need an algorithm that goes through the values in arr, compares them by subtraction (the difference between two values in the same group may be at most dif), and then outputs the resulting groups as new lists.
I created this code. The first problem is that it is broken if there are 3+ lists in the output. The second problem is the time complexity - if I'm not mistaken, it is O(n^2) so it's really slow.
I've tried to create this using Insertion sort and Binary search, but I never get the right result. The output was even worse than in the above code.
Can someone help me? How to simplify the algorithm and how to make it work correctly?
Code :
arr = [9, 3, 11, 5, 10, 4]
arr2 = [16, 13, 3, 1, 8, 2]
dif = 3
def func(arr):
    pre_list = []
    for i in range(0, len(arr)):
        result = []
        for j in range(0, len(arr)):
            sub = arr[i] - arr[j]
            if (abs(sub) <= dif):
                result.append(arr[j])
        if (result not in pre_list):
            pre_list.append(result)
            print(result)
func(arr)
Output:
arr:
[9, 11, 10]
[3, 5, 4]
arr2:
[16, 13]
[3, 1, 2]
[8]
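One possible reading of the task is: sort the values, then start a new group whenever a value lies more than dif above the smallest value of the current group. The sketch below follows that assumption (the name group_by_dif is hypothetical). It costs O(n log n) because of the sort, and it returns the groups in sorted order rather than in input order.

```python
def group_by_dif(values, dif):
    groups = []
    for value in sorted(values):
        # Start a new group when the value is too far from the
        # smallest element of the current group.
        if not groups or value - groups[-1][0] > dif:
            groups.append([value])
        else:
            groups[-1].append(value)
    return groups


print(group_by_dif([9, 3, 11, 5, 10, 4], 3))  # [[3, 4, 5], [9, 10, 11]]
print(group_by_dif([16, 13, 3, 1, 8, 2], 3))  # [[1, 2, 3], [8], [13, 16]]
```

For the two sample lists this produces the same groups as the output shown above, only ordered differently.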
I'm trying my hand at converting the following loop to a comprehension.
The problem: given input_list = [1, 2, 3, 4, 5],
return a list in which each element is the product of all elements up to and including that index, going from left to right.
Hence the returned list would be [1, 2, 6, 24, 120].
The normal loop I have (and it's working):
l2r = list()
for i in range(lst_len):
    if i == 0:
        l2r.append(lst_num[i])
    else:
        l2r.append(lst_num[i] * l2r[i-1])
Python 3.8+ solution, using an assignment expression (:=):
lst = [1, 2, 3, 4, 5]
curr = 1
out = [(curr:=curr*v) for v in lst]
print(out)
Prints:
[1, 2, 6, 24, 120]
Other solution (with itertools.accumulate):
from itertools import accumulate
out = [*accumulate(lst, lambda a, b: a*b)]
print(out)
Well, you could do it like this(a):
import math
orig = [1, 2, 3, 4, 5]
print([math.prod(orig[:pos]) for pos in range(1, len(orig) + 1)])
This generates what you wanted:
[1, 2, 6, 24, 120]
and basically works by running a counter from 1 to the size of the list, at each point working out the product of all terms before that position:
pos  values     prod
===  =========  ====
1    1             1
2    1,2           2
3    1,2,3         6
4    1,2,3,4      24
5    1,2,3,4,5   120
(a) Just keep in mind that's less efficient at runtime since it calculates the full product for every single element (rather than caching the most recently obtained product). You can avoid that while still making your code more compact (often the reason for using list comprehensions), with something like:
def listToListOfProds(orig):
    curr = 1
    newList = []
    for item in orig:
        curr *= item
        newList.append(curr)
    return newList
print(listToListOfProds([1, 2, 3, 4, 5]))
That's obviously not a list comprehension but still has the advantages in that it doesn't clutter up your code where you need to calculate it.
People seem to often discount the function solution in Python, simply because the language is so expressive and allows things like list comprehensions to do a lot of work in minimal source code.
But, other than the function itself, this solution has the same advantages of a one-line list comprehension in that it, well, takes up one line :-)
In addition, you're free to change the function whenever you want (if you find a better way in a later Python version, for example), without having to change all the different places in the code that call it.
This should not be made into a list comprehension if one iteration depends on the state of an earlier one!
If the goal is a one-liner, then there are lots of solutions, with @AndrejKesely's itertools.accumulate() being an excellent one (+1). Here's mine, which abuses functools.reduce():
from functools import reduce
lst = [1, 2, 3, 4, 5]
print(reduce(lambda x, y: x + [x[-1] * y], lst, [lst.pop(0)]))
But as far as list comprehensions go, @AndrejKesely's assignment-expression-based solution is the wrong thing to do (-1). Here's a more self-contained comprehension that doesn't leak into the surrounding scope:
lst = [1, 2, 3, 4, 5]
seq = [a.append(a[-1] * b) or a.pop(0) for a in [[lst.pop(0)]] for b in [*lst, 1]]
print(seq)
But it's still the wrong thing to do! This is based on a similar problem that also got upvoted for the wrong reasons.
A recursive function could help.
input_list = [ 1, 2, 3, 4, 5]
def cumprod(ls, i=None):
    i = len(ls)-1 if i is None else i
    if i == 0:
        # base case: the product up to index 0 is the first element itself
        return ls[0]
    return ls[i] * cumprod(ls, i-1)
output_list = [cumprod(input_list, i) for i in range(len(input_list))]
output_list has value [1, 2, 6, 24, 120]
This method can be compressed in Python 3.8+ using the walrus operator:
input_list = [ 1, 2, 3, 4, 5]
def cumprod_inline(ls, i=None):
    return ls[0] if (i := len(ls)-1 if i is None else i) == 0 else ls[i] * cumprod_inline(ls, i-1)
output_list = [cumprod_inline(input_list, i) for i in range(len(input_list))]
output_list has value [1, 2, 6, 24, 120]
Because you plan to use this in list comprehension, there's no need to provide a default for the i argument. This removes the need to check if i is None.
input_list = [ 1, 2, 3, 4, 5]
def cumprod_inline_nodefault(ls, i):
    return ls[0] if i == 0 else ls[i] * cumprod_inline_nodefault(ls, i-1)
output_list = [cumprod_inline_nodefault(input_list, i) for i in range(len(input_list))]
output_list has value [1, 2, 6, 24, 120]
Finally, if you really want to keep it to a single, self-contained list comprehension, you can follow the approach noted here and use recursive lambda calls:
input_list = [ 1, 2, 3, 4, 5]
output_list = [(lambda func, x, y: func(func,x,y))(lambda func, ls, i: ls[0] if i == 0 else ls[i] * func(func, ls, i-1),input_list,i) for i in range(len(input_list))]
output_list has value [1, 2, 6, 24, 120]
It's entirely over-engineered and barely legible, but hey, it works, and it's just for fun.
For your list, it might not be intentional that the numbers are consecutive, starting from 1. But in cases where that pattern is intentional, you can use the built-in function math.factorial():
from math import factorial
input_list = [1, 2, 3, 4, 5]
l2r = [factorial(i) for i in input_list]
print(l2r)
Output:
[1, 2, 6, 24, 120]
The numpy package has fast built-in implementations of many operations that would otherwise be written as loops or list comprehensions. To obtain, for example, a cumulative product:
>>> import numpy as np
>>> np.cumprod([1, 2, 3, 4, 5])
array([  1,   2,   6,  24, 120])
The above returns a numpy array. If you are not familiar with numpy, you may prefer to obtain just a normal Python list:
>>> np.cumprod([1, 2, 3, 4, 5]).tolist()
[1, 2, 6, 24, 120]
Using itertools and operator:
from itertools import accumulate
import operator as op
ip_lst = [1,2,3,4,5]
print(list(accumulate(ip_lst, func=op.mul)))
I have a random list like this
X = [0, 1, 5, 6, 7, 10, 15]
and need to find and replace every climbing sequence with its average.
In the end it should look like this:
X = [0, 6, 10, 15] #the 0 and 1 to 0; and the 5,6,7 to 6
I tried to find the sequence by subtracting the second value from the first like this:
y = 0
z = []
while X[y + 1] - X[y] == 1:
    z.append(X[y])
    y = y + 1
And now I don't know how to delete, for example, 5, 6 and 7 and replace them with the average 6.
You can use itertools.groupby on the list with a key function that returns each item's difference with an incremental counter:
from itertools import groupby, count
from statistics import mean
X = [0, 1, 5, 6, 7, 10, 15]
c = count()
X = [int(mean(g)) for _, g in groupby(X, key=lambda i: i - next(c))]
X becomes:
[0, 6, 10, 15]
You can iterate over the list, collecting each climbing sequence into its own sublist, and then take the mean of each one.
>>> res = [[X[0]]]
>>> for i in range(1, len(X)):
...     if X[i] == X[i-1] + 1:
...         res[-1].append(X[i])
...     else:
...         res.append([X[i]])
...
>>> res
[[0, 1], [5, 6, 7], [10], [15]]
>>> [int(sum(l)/len(l)) for l in res]
[0, 6, 10, 15]
Here's a starting technique: make a new list that's the difference of adjacent elements in the list:
diff = [X[i] - X[i-1] for i in range(1, len(X)) ]
There are more "Pythonic" ways to do this, but I want to make sure this is accessible to newer programmers.
You now have diff as
[1, 4, 1, 1, 3, 5]
Where you have a 1 in diff, you have a climbing pair in X. Iterate through diff to find a sequence of 1 values. Where you find this, take the slice of X that corresponds to the 1 values. The middle element of that slice is your mean.
If the value is not 1, then you simply take the corresponding element of X, as you've been doing.
Append the identified values to z, and there's your desired result.
Can you take it from there?
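For readers who want to check their own attempt, here is one way the steps above could be put together. It is a sketch under the assumption that the input is a non-decreasing list of integers; the name collapse_runs is made up for illustration. Instead of building diff explicitly, it compares adjacent elements while scanning.

```python
def collapse_runs(values):
    """Replace each run of consecutive integers with its middle element."""
    result = []
    start = 0  # index where the current run begins
    for i in range(1, len(values) + 1):
        # A run ends at the end of the list or where the step is not 1.
        if i == len(values) or values[i] - values[i - 1] != 1:
            run = values[start:i]
            # For an even-length run this picks the lower middle element,
            # which equals the mean rounded down.
            result.append(run[(len(run) - 1) // 2])
            start = i
    return result


print(collapse_runs([0, 1, 5, 6, 7, 10, 15]))  # [0, 6, 10, 15]
```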
This is not really an answer to the question, which is a fairly basic CS 101 question that people should try to figure out themselves, but what I noticed about the nice answer from @blhsing was that it appeared fairly slow. I found that mean() is incredibly slow!
from itertools import groupby, count
from statistics import mean
from timeit import timeit
def generate_1step_seq1(xs):
    result = []
    n = 0
    while n < len(xs):
        # sequences with step of 1 only
        if not result or xs[n] == result[-1] + 1:
            result += [xs[n]]
        else:
            # int result, rounding down
            yield sum(result) // len(result)
            result = [xs[n]]
        n += 1
    if result:
        yield sum(result) // len(result)
def generate_1step_seq2(xs):
    c = count()
    return [int(sum(xs) // len(xs)) for xs in [list(g) for _, g in groupby(xs, key=lambda i: i - next(c))]]
def generate_1step_seq3(xs):
    c = count()
    return [int(mean(g)) for _, g in groupby(xs, key=lambda i: i - next(c))]
values = [0, 1, 5, 6, 7, 10, 15]
print(list(generate_1step_seq1(values)))
print(generate_1step_seq2(values))
print(generate_1step_seq3(values))
print(timeit(lambda: list(generate_1step_seq1(values)), number=10000))
print(timeit(lambda: list(generate_1step_seq2(values)), number=10000))
print(timeit(lambda: list(generate_1step_seq3(values)), number=10000))
Initially I figured that was probably due to the tiny list size, but even for large lists, mean() is horribly slow. Does anyone happen to know why? Is it due to the very careful nature of the statistics module's _sum, which tries to avoid float rounding errors?
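One variant that might be worth adding to the benchmark is statistics.fmean (available since Python 3.8), which the documentation describes as converting the data to floats and running faster than mean(). The sketch below uses the hypothetical name generate_1step_seq4; no timings are claimed here, so measure it on your own machine.

```python
from itertools import count, groupby
from statistics import fmean


def generate_1step_seq4(xs):
    c = count()
    # Consecutive values share the same (value - position) key,
    # so groupby yields one group per climbing run.
    return [int(fmean(g)) for _, g in groupby(xs, key=lambda i: i - next(c))]


print(generate_1step_seq4([0, 1, 5, 6, 7, 10, 15]))  # [0, 6, 10, 15]
```

Because fmean works in floating point, very large integers can lose precision before int() truncates the result, which is the trade-off for the speed.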
Premise: My question is not a duplicate of Cyclic rotation in Python . I am not asking how to resolve the problem or why my solution does not work, I have already resolved it and it works. My question is about another particular solution to the same problem I found, because I would like to understand the logic behind the other solution.
I came across the following cyclic array rotation problem (below the sources):
Cyclic rotation in Python
https://app.codility.com/programmers/lessons/2-arrays/cyclic_rotation/
An array A consisting of N integers is given. Rotation of the array means that each element is shifted right by one index, and the last element of the array is moved to the first place. For example, the rotation of array A = [3, 8, 9, 7, 6] is [6, 3, 8, 9, 7] (elements are shifted right by one index and 6 is moved to the first place).
The goal is to rotate array A K times; that is, each element of A will be shifted to the right K times.
which I managed to solve with the following Python code:
def solution(A, K):
    N = len(A)
    if N < 1 or N == K:
        return A
    K = K % N
    for x in range(K):
        tmp = A[N - 1]
        for i in range(N - 1, 0, -1):
            A[i] = A[i - 1]
        A[0] = tmp
    return A
Then, on the following website https://www.martinkysel.com/codility-cyclicrotation-solution/, I have found the following fancy solution to the same problem:
def reverse(arr, i, j):
    for idx in xrange((j - i + 1) / 2):
        arr[i+idx], arr[j-idx] = arr[j-idx], arr[i+idx]
def solution(A, K):
    l = len(A)
    if l == 0:
        return []
    K = K%l
    reverse(A, l - K, l -1)
    reverse(A, 0, l - K -1)
    reverse(A, 0, l - 1)
    return A
Could someone explain to me how this particular solution works? (The author does not explain it on his website.)
My solution does not perform well for large A and K, where K < N, e.g.:
A = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] * 1000
K = 1000
expectedResult = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] * 1000
res = solution(A, K) # 1455.05908203125 ms = almost 1.5 seconds
Because for K < N, my code has a time complexity of O(N * K), where N is the length of the array.
For large K and small N (K > N), my solution performs well thanks to the modulo operation K = K % N:
A = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
K = 999999999999999999999999
expectedRes = [2, 3, 4, 5, 6, 7, 8, 9, 10, 1]
res = solution(A, K) # 0.0048828125 ms, because K is torn down to 9 thanks to K = K % N
The other solution, on the other hand, performs well in all cases, even when N > K, and has a complexity of O(N).
What is the logic behind that solution?
Thank you for the attention.
Let me first cover the base case with K < N. The idea is to split the array into two parts, A and B: A is the first N-K elements and B is the last K elements. The algorithm reverses A and B separately and finally reverses the full array (with the two parts already reversed). Reversing the concatenation of the two reversed parts yields B followed by A, which is exactly the rotated array. To handle the case K > N, note that every time you rotate the array N times you obtain the original array again, so we can just use the modulo operator to find where to split the array (performing only the rotations that matter and avoiding useless shifting).
Graphical Example
A graphical step-by-step example can help in understanding the concept better. Note that:
The bold line indicates the splitting point of the array (K = 3 in this example);
The red array indicates the input and the expected output.
Starting from:
Notice that what we want at the front of the final output is the last 3 letters. For now, let us reverse them in place (first reverse of the algorithm):
Now reverse the first N-K elements (second reverse of the algorithm):
We already have the solution, but in the opposite direction. We can fix that by reversing the whole array (third and last reverse of the algorithm):
Here is the final output: the original array cyclically rotated with K = 3.
Code Example
Let us also give another step-by-step example with Python code, starting from:
A = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
K = 22
N = len(A)
We find the splitting index:
K = K%N
# K == 2
because, in this case, the first 20 shifts are useless. Now we reverse the last K (2) elements of the original array:
reverse(A, N-K, N-1)
# [1, 2, 3, 4, 5, 6, 7, 8, 10, 9]
As you can see, 9 and 10 have been swapped. Now we reverse the first N-K elements:
reverse(A, 0, N-K-1)
# [8, 7, 6, 5, 4, 3, 2, 1, 10, 9]
And, finally, we reverse the full array:
reverse(A, 0, N-1)
# [9, 10, 1, 2, 3, 4, 5, 6, 7, 8]
Note that reversing an array has time complexity O(N).
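The solution quoted in the question uses Python 2 idioms (xrange and integer division with /). The three reversals can be sketched in Python 3 as follows; the name rotate_right is only illustrative, and the list is modified in place just as in the original.

```python
def reverse(arr, i, j):
    # Reverse arr[i..j] (inclusive) in place by swapping from both ends.
    while i < j:
        arr[i], arr[j] = arr[j], arr[i]
        i += 1
        j -= 1


def rotate_right(A, K):
    n = len(A)
    if n == 0:
        return []
    K %= n                       # full cycles of n rotations change nothing
    reverse(A, n - K, n - 1)     # reverse the last K elements
    reverse(A, 0, n - K - 1)     # reverse the first n - K elements
    reverse(A, 0, n - 1)         # reverse the whole list
    return A


print(rotate_right([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 22))
# [9, 10, 1, 2, 3, 4, 5, 6, 7, 8]
```

Each reversal touches every element of its range at most once, so the whole rotation is O(N) with no extra list. If a new list is acceptable, A[-K:] + A[:-K] gives the same result for 0 < K < N.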
Here is a very simple solution in Ruby. (scored 100% in codility)
Remove the last element in the array, and insert it in the beginning.
def solution(a, k)
  if a.empty?
    return []
  end
  modified = a
  1.upto(k) do
    last_element = modified.pop
    modified = modified.unshift(last_element)
  end
  return modified
end