Numpy concatenate + merge 1D arrays - python

I need to concatenate arrays but also merge the end of A with the start of B if they are overlapping.
[1, 2, 4] + [2, 4, 5] -> [1, 2, 4, 5]
[1, 2, 4] + [2, 5, 4] -> [1, 2, 4, 2, 5, 4]
[1, 2, 4] + [1, 2, 4, 5] -> [1, 2, 4, 5]
Note: Order of elements must be preserved, [4, 5] is not the same as [5, 4].
Note 2: The question can be understood like this too: We need the shortest possible extension of A such that the output ends with B.
Of course I can iterate over the second array and compare element by element, but I am looking for a nice NumPy solution.

Originally misunderstood the problem. The problem, as I understand it now, is:
A two-item suffix of A matches a two-item prefix of B:
[1, 2, 4] +
[2, 4, 5] =>
[1, 2, 4, 5]
No suffix of A matches a prefix of B:
[1, 2, 4] +
[2, 5, 4] ->
[1, 2, 4, 2, 5, 4]
Then we can use this terribly inefficient function:
def merge(A, B):
    i = 0
    m = 0
    # Find largest suffix of A that matches the prefix of B with the same length
    while i <= len(A):
        if A[-i:] == B[:i] and i > m:
            m = i
        i += 1
    return A + B[m:]
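A quick check against the examples from the question (note that this version works on plain Python lists rather than NumPy arrays):
print(merge([1, 2, 4], [2, 4, 5]))     # [1, 2, 4, 5]
print(merge([1, 2, 4], [2, 5, 4]))     # [1, 2, 4, 2, 5, 4]
print(merge([1, 2, 4], [1, 2, 4, 5]))  # [1, 2, 4, 5]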

Below is a solution using NumPy. It's not ideal, since it requires a (possibly unneeded) sort, and an iteration. Both the sorting and iteration should be over a relatively small array (or even a single element).
import numpy as np

def merge(left, right):
    """Concatenating two arrays, merging the overlapping end and start of
    the left and right array"""
    # We can limit the search to the maximum possible overlap between
    # the arrays, which is the minimum of the two lengths
    l = min(len(left), len(right))
    # Find all indices in `right` where the element matches the last element of `left`.
    # Need to sort, since the `nonzero` documentation doesn't
    # explicitly state whether the returned indices follow the order
    # as in `right`.
    # As long as there are few matches, sorting will not be a showstopper.
    # Need to reverse the sorted array, to start from the back of the
    # right array and work towards the front, until there is a proper match.
    for i in np.sort(np.nonzero(right[:l] == left[-1])[0])[::-1]:
        # Check if the subarrays are equal
        if np.all(left[-i-1:] == right[:i+1]):
            return np.concatenate([left, right[i+1:]])
    # No match
    return np.concatenate([left, right])
a = np.array([1, 2, 4])
b = np.array([2, 4, 5])
c = np.array([2, 5, 4])
d = np.array([1, 2, 4, 5])
e = np.array([1, 2, 4, 2])
f = np.array([2, 4, 2, 5])
print(merge(a, b))
print(merge(a, c))
print(merge(a, d))
print(merge(e, b))
print(merge(e, f))
which yields
[1 2 4 5]
[1 2 4 2 5 4]
[1 2 4 5]
[1 2 4 2 4 5]
[1 2 4 2 5]

I have an O(n) solution, albeit without Numpy:
def merge(a, b):
    n_a = len(a)
    n = min(n_a, len(b))
    m = 0
    for i in range(1, n + 1):
        if b[n - i] == a[n_a - 1 - m]:
            m += 1
        else:
            m = 0
    return a + b[m:]
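For example, with the lists from the question:
print(merge([1, 2, 4], [2, 4, 5]))  # [1, 2, 4, 5]
print(merge([1, 2, 4], [2, 5, 4]))  # [1, 2, 4, 2, 5, 4]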

You could do it like this.
def concatenate(a, b):
    ret = a.copy()
    for element in b:
        if element not in ret:
            ret.append(element)
    return ret
This preserves the order of the elements as they appear in a followed by b.

Generate Permutation With Minimum Guaranteed Distance from Elements in Source

Given a sequence a with n unique elements, I want to create a sequence b which is a randomly selected permutation of a such that, in the sequence formed by appending b to a, there is at least a specified minimum distance d between duplicate elements.
For example, if a = [1,2,3] and d = 2, of the following permutations:
a          b          mindist
[1, 2, 3]  (1, 2, 3)  3
[1, 2, 3]  (1, 3, 2)  2
[1, 2, 3]  (2, 1, 3)  2
[1, 2, 3]  (2, 3, 1)  2
[1, 2, 3]  (3, 1, 2)  1
[1, 2, 3]  (3, 2, 1)  1
b could only take one of the first four values since the minimum distance for the last two is 1 < d.
I wrote the following implementation:
import random

n = 10
alist = list(range(n))
blist = alist[:]
d = n//2
avail_indices = list(range(n))
for a_ind, a_val in enumerate(reversed(alist)):
    min_ind = max(d - a_ind - 1, 0)
    new_ind = random.choice(avail_indices[min_ind:])
    avail_indices.remove(new_ind)
    blist[new_ind] = a_val
print(alist, blist)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9] [1, 3, 2, 8, 5, 6, 4, 0, 9, 7]
but I think this is O(n^2) time complexity (not completely sure), based on a plot of the time required as n increases for d = n//2.
Is it possible to do better than this?
Yes, your implementation is O(n^2).
You can adapt the Fisher-Yates shuffle to this purpose. You work from the start of the array to the end, at each position placing one of the remaining values into its final place.
The trick is that while in a full shuffle you can place any element at the start, you can only place from an index that respects the distance condition.
Here is an implementation.
import random

def distance_permutation(orig, d):
    answer = orig.copy()
    for i in range(len(orig)):
        choice = random.randrange(i, min(len(orig), len(orig) + i - d + 1))
        if i < choice:
            answer[i], answer[choice] = answer[choice], answer[i]
    return answer
n = 10
x = list(range(n))
print(x, distance_permutation(x, n//2))
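As a sanity check, one can verify the distance property on the concatenation of the original list and the permutation. The helper min_duplicate_distance below is my own illustration, not part of the answer:
def min_duplicate_distance(a, b):
    # Smallest gap between two occurrences of the same value in a + b.
    combined = a + b
    last_seen = {}
    dist = len(combined)
    for idx, val in enumerate(combined):
        if val in last_seen:
            dist = min(dist, idx - last_seen[val])
        last_seen[val] = idx
    return dist

y = distance_permutation(x, n//2)
assert min_duplicate_distance(x, y) >= n//2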

How to shift items in an array by a "K" number of times?

Shift the items in the given array by k positions, as shown in the examples below:
array = [1, 2 ,3 , 4, 5, 6]
k1 = 2
k2 = -3
k3 = 20
test1:
cirShift(array, k1)
Result: [5, 6, 1, 2, 3, 4]
test2:
cirShift(array, k2)
Result: [4, 5, 6, 1, 2, 3]
test3:
cirShift(array, k3)
Result: [5, 6, 1, 2, 3, 4]
I have used the code below to right-rotate a list by k positions:
def rightRotateByOne(A):
    Fin = A[-1]
    for i in reversed(range(len(A) - 1)):
        A[i + 1] = A[i]
    A[0] = Fin

def rightRotate(A, k):
    if k < 0 or k >= len(A):
        return
    for i in range(k):
        rightRotateByOne(A)

if __name__ == '__main__':
    A = [1, 2, 3, 4, 5, 6, 7]
    k = 3
    rightRotate(A, k)
    print(A)
As of now, I am able to obtain the result for test1, but I would also like to handle test2 and test3.
Even easier: split the array at the boundary, swap the two parts and glue them back together:
def cirShift(a, shift):
    if not shift or not a:
        return a
    return a[-shift % len(a):] + a[:-shift % len(a)]
Courtesy of #KellyBundy, a short-circuit one-liner:
def cirShift(a, shift):
    return a and a[-shift % len(a):] + a[:-shift % len(a)]
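Checking this against the three tests from the question:
array = [1, 2, 3, 4, 5, 6]
print(cirShift(array, 2))   # [5, 6, 1, 2, 3, 4]
print(cirShift(array, -3))  # [4, 5, 6, 1, 2, 3]
print(cirShift(array, 20))  # [5, 6, 1, 2, 3, 4]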
I think this question may be an exercise in self-learning ('how to do X using just Python'), so my answer is auxiliary, but you can always use np.roll():
#test 1
import numpy as np
a = [1, 2 ,3, 4, 5, 6]
np.roll(a, 2)
gives output
[5, 6, 1, 2, 3, 4]
and
#test 2
np.roll(a, -2)
gives output
[3, 4, 5, 6, 1, 2]
Even if you give a number that is larger than the array size, it handles the overflow:
#test 3
np.roll(a, 10)
gives output
[3, 4, 5, 6, 1, 2]
Rolling also works in multiple dimension arrays and across specified axes, which is pretty neat.
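For instance, a small illustration of the axis argument on a 2-D array (my own example, not from the question):
import numpy as np

m = np.arange(9).reshape(3, 3)
print(np.roll(m, 1, axis=0))  # rolls the rows:    [[6 7 8] [0 1 2] [3 4 5]]
print(np.roll(m, 1, axis=1))  # rolls the columns: [[2 0 1] [5 3 4] [8 6 7]]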
def shift(l, shift_t):
    r = [*l]
    for n in range(abs(shift_t)):
        if shift_t < 0:
            r = r[1:] + [r[0]]
        else:
            r = [r[-1]] + r[:-1]
    return r
The key is to take one item of the list and place it on the opposite side, which is essentially all that shifting is doing. If you shift negative, you put the first one at the end, and if you shift positive, you put the last one at the beginning.
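For example, with the array from the question:
print(shift([1, 2, 3, 4, 5, 6], 2))   # [5, 6, 1, 2, 3, 4]
print(shift([1, 2, 3, 4, 5, 6], -3))  # [4, 5, 6, 1, 2, 3]
print(shift([1, 2, 3, 4, 5, 6], 20))  # [5, 6, 1, 2, 3, 4]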

I am getting an error when running this loop

You are given an array of integers a. A new array b is generated by rearranging the elements of a in the following way:
b = [a[0], a[len(a)-1], a[1], a[len(a)-2], ...]
My code only loops through once, and I am just stuck from here. What I have tried is below:
def alternatingSort(a):
    length = len(a)
    b = []
    for i in range(length):
        if i % 2:
            b.append(a[length-i])
        else:
            b.append(a[i])
    return b
If my input is [1, 3, 5, 6, 4, 2], my output should be [1, 2, 3, 4, 5, 6], but I get [1, 2, 5, 6, 4, 3].
Your logic is not correct. Here is a working solution with minimal changes:
def alternatingSort(a):
    length = len(a)
    b = []
    for i in range(length):
        if i % 2:
            b.append(a[length - (i // 2) - 1])  # Updated.
        else:
            b.append(a[i // 2])  # Updated.
    return b
a = [1, 3, 5, 6, 4, 2]
print(alternatingSort(a))
a = [1, 3 ,2]
print(alternatingSort(a))
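which prints
[1, 2, 3, 4, 5, 6]
[1, 2, 3]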

Not able to print the desired output with these python lists

I want my source code output to be like this:
Lists: 1 3 4 2 1 2 1 3; 4 4 2 4 3 2 4 4 3 1 3
[2, 3]
Lists : 1 1 2 3 4 5; 2 3 4 5 6
[]
Lists : ;
[]
Lists:
I want to write a function which takes two lists and returns all the elements that occur more than once in both lists, but instead I end up finding the common elements of the lists. My return list should be in ascending order and without duplicates.
def occur_multiple(a, b):
    a_set = set(a)
    b_set = set(b)
    # check length
    if len(a_set.intersection(b_set)) > 0:
        return (a_set.intersection(b_set))
    else:
        return ("no common elements")

while True:
    original_string = input("Lists: ")
    if not original_string:
        exit()
    first_split = original_string.split(';')
    first_list, second_list = [elem.split(' ') for elem in first_split]
    first_list.sort()
    second_list.sort()
    print(occur_multiple(first_list, second_list))
The list count method might be helpful for your task. I have modified your code so that it goes through the elements in the intersection set and checks whether the count in both lists is more than 1.
def occur_multiple(a, b):
    a_set = set(a)
    b_set = set(b)
    # check length
    ans_set = set()
    c = a_set.intersection(b_set)
    if len(c) > 0:
        for i in c:
            if a.count(i) > 1 and b.count(i) > 1:
                ans_set.add(i)
        return (sorted(list(ans_set)))
    else:
        return ("no common elements")
Also, you might like to convert your list input into ints. As an improvement, you might like to store the counts of each element in a dictionary rather than reading the list multiple times.
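A minimal sketch of that dictionary-based idea, using collections.Counter (the name occur_multiple_counts is mine, purely for illustration):
from collections import Counter

def occur_multiple_counts(a, b):
    # Count every element of each list once, instead of calling
    # list.count repeatedly inside a loop.
    count_a = Counter(a)
    count_b = Counter(b)
    # Keep values occurring more than once in both lists, in ascending order.
    return sorted(v for v in count_a if count_a[v] > 1 and count_b[v] > 1)

print(occur_multiple_counts([1, 3, 4, 2, 1, 2, 1, 3],
                            [4, 4, 2, 4, 3, 2, 4, 4, 3, 1, 3]))  # [2, 3]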
Using NumPy functions np.unique and np.intersect1d:
import numpy as np

def my_fun(a, b):
    val_1, count_1 = np.unique(a, return_counts=True)  # Find unique elements and
    val_2, count_2 = np.unique(b, return_counts=True)  # number of occurrences
    val_1 = val_1[count_1 > 1]  # Retain elements occurring
    val_2 = val_2[count_2 > 1]  # more than once
    result = np.intersect1d(val_1, val_2)  # Set intersection
    return list(result)  # Convert to list
>>> a = [1, 3, 4, 2, 1, 2, 1, 3]
>>> b = [4, 4, 2, 4, 3, 2, 4, 4, 3, 1, 3]
>>> c = my_fun(a, b)
>>> print(c)
[2, 3]
>>> a = [1, 1, 2, 3, 4, 5]
>>> b = [2, 3, 4, 5, 6]
>>> c = my_fun(a, b)
>>> print(c)
[]
>>> a = [-5, 1, 2, 3, 4, 1, 0, 1, 2, 4, 4, 2, -5]
>>> b = [1, 3, 4, 5, -5, -5, -5, 1, 4]
>>> c = my_fun(a, b)
>>> print(c)
[-5, 1, 4]

Combining array elements in a particular way, and recording it

I am given a 1D array of numbers.
I need to go through the array adding consecutive elements to form a sum. Once this sum reaches a certain value, it becomes the next element of a new array. The sum is then reset and the process repeats, thus iterating over the whole array.
For example if given:
[1, 3, 4, 5, 2, 5, 3]
and requiring the minimum sum to be 5,
the new array would be:
[8, 5, 7]
Explicitly: [1 + 3 + 4, 5, 2 + 5]
I then also need to keep a record of the way the elements were combined for that particular array: I need to be able to take a different array of the same length and combine its elements in the same way as above.
e.g. give the array
[1, 2, 1, 1, 3, 2, 1]
I require the output
[4, 1, 5]
Explicitly: [1 + 2 + 1, 1, 3 + 2]
I have accomplished this with loops and increment counters, but it is very ugly. The array named "record" contains the number of old elements summed to make each element of the new array, i.e. [3, 1, 2].
import numpy as np

def bin(array, min_sum):
    num_points = len(array)
    # Create empty output.
    output = list()
    record = list()
    i = 0
    while i < num_points:
        sum = 0
        j = 0
        while sum < min_sum:
            # Break out if it reaches end of data whilst in loop.
            if i + j == num_points:
                break
            sum += array[i + j]
            j += 1
        output.append(sum)
        record.append(j)
        i += j
    # The final data point does not reach the min sum.
    del output[-1]
    return output

if __name__ == "__main__":
    array = [1, 3, 4, 5, 2, 5, 3]
    print(bin(array, 5))
I would advise you to simply walk through the list, adding each element to an accumulator like the_sum (do not use sum, since that is a builtin). Whenever the_sum reaches min_sum or more, you append it to the result and reset the_sum to zero. Like:
def bin(array, min_sum):
    result = []
    the_sum = 0
    for elem in array:
        the_sum += elem
        if the_sum >= min_sum:
            result.append(the_sum)
            the_sum = 0
    return result
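For example:
print(bin([1, 3, 4, 5, 2, 5, 3], 5))  # [8, 5, 7]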
The lines involving the accumulator the_sum are the important ones.
I leave combining the other array the same way as an exercise, but as a hint: use an additional accumulator and zip to iterate over both arrays concurrently.
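A possible sketch of that hint (the name bin_like and the two-accumulator layout are my own, not part of the answer):
def bin_like(array, other, min_sum):
    # Sketch of the hinted approach: walk both arrays in lockstep with zip,
    # grouping `other` exactly where `array` is grouped.
    result, other_result = [], []
    the_sum, other_sum = 0, 0
    for elem, other_elem in zip(array, other):
        the_sum += elem
        other_sum += other_elem
        if the_sum >= min_sum:
            result.append(the_sum)
            other_result.append(other_sum)
            the_sum, other_sum = 0, 0
    return result, other_result

print(bin_like([1, 3, 4, 5, 2, 5, 3], [1, 2, 1, 1, 3, 2, 1], 5))
# ([8, 5, 7], [4, 1, 5])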
Here is a straightforward solution: which computes a list of boolean values (a value is True when the accumulated sum reaches or exceeds the target value), and calc computes an accumulation using this list.
def which(l, s):
    w, a = [], 0
    for e in l:
        a += e
        c = (a >= s)
        w.append(c)
        if c:
            a = 0
    return w

def calc(l, w):
    a = 0
    for (e, c) in zip(l, w):
        a += e
        if c:
            yield a
            a = 0
Here is an interactive demonstration:
>>> l1 = [1, 3, 4, 5, 2, 5, 3]
>>> w = which(l1, 5)
>>> w
[False, False, True, True, False, True, False]
>>> list(calc(l1, w))
[8, 5, 7]
>>> l2 = [1, 2, 1, 1, 3, 2, 1]
>>> list(calc(l2, w))
[4, 1, 5]
You can use the short solutions below, which I found after a long struggle with flattening arrays.
To get the bounded sums, use:
f = lambda a,x,j,l: 0 if j>=l else [a[i] for i in range(j,l) if sum(a[j:i])<x]
This outputs:
>>> f = lambda a,x,j,l: 0 if j>=l else [a[i] for i in range(j,l) if sum(a[j:i])< x]
>>> a= [1, 3, 4, 5, 2, 5, 3]
>>> f(a,5,0,7)
[1, 3, 4]
>>> sum(f(a,5,0,7))
8
>>> sum(f(a,5,3,7))
5
>>> sum(f(a,5,4,7))
7
>>>
To get your records, use the function:
>>> import numpy as np
>>> y = lambda a,x,f,j,l: [] if j>=l else list(np.append(j,np.array(y(a,x,f,j+len(f(a,x,j,l)),l))))
From here, you can get both the array of records and the sums:
>>> listt=y(a,5,f,0,len(a))
>>> listt
[0.0, 3.0, 4.0, 6.0]
>>> [sum(f(a,5,int(listt[u]),len(a))) for u in range(0,len(listt)-1)]
[8, 5, 7]
>>>
Now, the bit of magic: you can even use it as an index-conditional boundary for the second vector:
>>> b=[1, 2, 1, 1, 3, 2, 1]
>>> [sum(f(b,5,int(listt[u]),int(listt[u+1]))) for u in range(0,len(listt)-1)]
[4, 1, 5]
>>>
