Faster Algorithm to Tailor Given Mathematical Expression [duplicate] - python

This question already has answers here:
How can I find the minimum index of the array in this case?
(3 answers)
Closed 3 years ago.
Is there a more optimized solution to solve the stated problem?
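Given an array 'arr' of 'N' elements and a number 'M', find the least index 'z' at which the equation is satisfied. As the code below computes it, the condition is [arr[z]/1] + [arr[z+1]/2] + ... + [arr[N-1]/(N-z)] <= M, where [ ] denotes floor().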
Code:
counts=0
ans=0
while(ans==0):
    s=0
    for i in range(counts,len(arr)):
        s+=int(arr[i]/(i+1-counts))
        if(s>M):
            break
    if((i+1)==len(arr) and s<=M):
        print(counts)
        ans=1
    counts+=1
Explanation:
Check array from left to right. The first index that satisfies the condition is the answer. This is more optimized than considering from right to left.
If at any time during the calculation, 's' is deemed more than M, break the loop and consider the next. This is more optimized than calculating 's' completely.
Example:
INPUT:
N=3 M=3
arr=[1 2 3]
OUTPUT:
0
This would give the answer 0 since the 0th index contains the first element to satisfy the given relation.
Thanks in advance.

If you're working with relatively small arrays, your algorithm is going to be fast enough. Minor improvements could be achieved by reorganizing the code a bit but nothing dramatic.
If you're working with very large arrays, then I would suggest you look into numpy. It is optimized for array-wide operations and performs them much faster than an equivalent Python loop.
For example, you can divide all the elements in an array by their inverted position in one operation:
terms = arr / np.arange(len(arr),0,-1)
and then get cumulative sums and the first index in a single line
index = np.where(np.cumsum(terms) <= M)
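A minimal sketch of how the question's own check could be vectorised, assuming arr holds non-negative integers. The function name least_index is hypothetical and not from the original post. For each candidate start z it floor-divides the remaining elements by 1, 2, 3, ... as the question's loop does, and compares the total with M. The outer loop still runs up to N times, so this only moves the inner loop into NumPy:

```python
import numpy as np


def least_index(arr, M):
    """Return the least z with sum(arr[i] // (i - z + 1) for i >= z) <= M, or None."""
    a = np.asarray(arr, dtype=np.int64)
    n = len(a)
    for z in range(n):
        # Divisors 1, 2, ..., n - z line up with a[z], a[z + 1], ..., a[n - 1].
        divisors = np.arange(1, n - z + 1)
        if int(np.sum(a[z:] // divisors)) <= M:
            return z
    return None  # no start index satisfies the condition


print(least_index([1, 2, 3], 3))  # the example from the question
```

For the question's example (arr = [1, 2, 3], M = 3) the sum at z = 0 is 1 + 1 + 1 = 3, so the function returns 0.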

Related

Finding and removing palindrome rows in 2D numpy array

What would be a pythonic and efficient way to find/remove palindrome rows from a matrix? Though the title suggests the matrix is a numpy ndarray, it can be a pandas DataFrame if that leads to a more elegant solution.
The obvious way would be to implement this using a for-loop, but I'm interested in whether there is a more efficient and succinct way.
My first idea was to concatenate rows and rows-inverse, and then extract duplicates from the concatenated matrix. But this list of duplicates will contain both the initial row and its inverse. So to remove the second instance of a palindrome I'd still have to do some for-looping.
My second idea was to somehow use broadcasting to get the cartesian product of rows and apply my own ufunc (perhaps created using numba) to get a 2D bool matrix. But I don't know how to create a ufunc that would take a matrix axis instead of a scalar.
EDIT:
I guess I should apologize for the poorly formulated question (English is not my native language). I don't need to find out whether any row is itself a palindrome, but whether there are pairs of rows within the matrix that are palindromes of each other.
I simply check whether the array is equal to its reflection (around axis 1) in all elements of a row; if true, that row is a palindrome (correct me if I am wrong). Then I index out the rows that aren't palindromes.
import numpy as np
a = np.array([
    [1,0,0,1], # Palindrome
    [0,2,2,0], # Palindrome
    [1,2,3,4],
    [0,1,4,0],
])
wherepalindrome = (a == a[:,::-1]).all(1)
print(a[~wherepalindrome])
#[[1 2 3 4]
# [0 1 4 0]]
Naphat's answer is the pythonic (numpythonic) way to go. That should be the accepted answer.
But if your array is really large, you don't want to create a temporary copy, and you wish to explore Numba's intricacies, you can use something like this:
import numpy as np
import numba as nb

@nb.njit(parallel=True)
def palindromic_rows(a):
    rows, cols = a.shape
    palindromes = np.full(rows, True, dtype=nb.boolean)
    mid = cols // 2
    for r in nb.prange(rows): # <-- parallel loop
        for c in range(mid):
            if a[r, c] != a[r, -c-1]:
                palindromes[r] = False
                break
    return palindromes
This contraption just replaces the elegant (a == a[:,::-1]).all(axis=1), but it's almost an order of magnitude faster for very large arrays and it doesn't duplicate them.
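The question's EDIT asks about pairs of rows that are reverses of each other, which the answers above do not cover. A minimal broadcasting sketch for that case follows; the function name reversed_row_pairs is hypothetical, and the approach builds an n x n x m boolean array, so it assumes the matrix is small enough for that to fit in memory:

```python
import numpy as np


def reversed_row_pairs(a):
    """Return index pairs (i, j), i < j, where row i equals row j read backwards."""
    a = np.asarray(a)
    # match[i, j] is True when row i equals the reverse of row j.
    match = (a[:, None, :] == a[None, :, ::-1]).all(axis=-1)
    # The relation is symmetric, so keep only the part above the diagonal.
    i, j = np.nonzero(np.triu(match, k=1))
    return list(zip(i.tolist(), j.tolist()))


example = [
    [1, 2, 3],
    [3, 2, 1],  # reverse of row 0
    [1, 0, 1],  # a palindrome on its own, not part of a pair
    [4, 5, 6],
]
print(reversed_row_pairs(example))
```

Rows that are palindromes by themselves sit on the diagonal of match and are excluded by k=1; the per-row check from the answers above handles those.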

how to calculate the minimum unfairness sum of a list

I have tried to summarize the problem statement like this:
Given n, k and an array (a list) arr, where n = len(arr) and k is an integer in the range 1 to n inclusive.
For an array (or list) myList, the Unfairness Sum is defined as the sum of the absolute differences between all possible pairs (combinations with 2 elements each) in myList.
To explain: if mylist = [1, 2, 5, 5, 6], its unfairness sum is computed as shown below (the smallest such sum over all candidate sub arrays is the Minimum Unfairness Sum, or MUS). Please note that elements are considered unique by their index in the list, not by their values.
Unfairness sum = |1-2| + |1-5| + |1-5| + |1-6| + |2-5| + |2-5| + |2-6| + |5-5| + |5-6| + |5-6| = 26
If you actually need to look at the problem statement, It's HERE
My Objective
given n, k, arr(as described above), find the Minimum Unfairness Sum out of all of the unfairness sums of sub arrays possible with a constraint that each len(sub array) = k [which is a good thing to make our lives easy, I believe :) ]
what I have tried
well, there is a lot to be added in here, so I'll try to be as short as I can.
My first approach was this, where I used itertools.combinations to get all the possible combinations and statistics.variance to check the spread of the data (yeah, I know I'm a mess).
Before you see the code below: do you think variance and unfairness sum are perfectly related (I know they are strongly related), i.e. does the sub array with minimum variance have to be the sub array with MUS??
You only have to check the LetMeDoIt(n, k, arr) function. If you need MCVE, check the second code snippet below.
from itertools import combinations as cmb
from statistics import variance as varn
def LetMeDoIt(n, k, arr):
    v = []
    s = []
    subs = [list(x) for x in list(cmb(arr, k))] # getting all sub arrays from arr in a list
    i = 0
    for sub in subs:
        if i != 0:
            var = varn(sub) # the variance thingy
            if float(var) < float(min(v)):
                v.remove(v[0])
                v.append(var)
                s.remove(s[0])
                s.append(sub)
            else:
                pass
        elif i == 0:
            var = varn(sub)
            v.append(var)
            s.append(sub)
            i = 1
    final = []
    f = list(cmb(s[0], 2)) # getting list of all pairs (after determining sub array with least MUS)
    for r in f:
        final.append(abs(r[0]-r[1])) # calculating the MUS in my messy way
    return sum(final)
The above code works fine for n<30 but raised a MemoryError beyond that.
In Python chat, Kevin suggested that I try a generator, which is memory efficient (it really is), but since a generator still produces those combinations on the fly as we iterate over them, it was estimated to take over 140 hours (:/) for n=50, k=8.
I posted the same as a question on SO HERE (you might wanna have a look to understand me properly - it has discussions and an answer by fusion which takes me to my second approach - a better one(i should say fusion's approach xD)).
Second Approach
from itertools import combinations as cmb
def myvar(arr): # a function to calculate variance
    l = len(arr)
    m = sum(arr)/l
    return sum((i-m)**2 for i in arr)/l
def LetMeDoIt(n, k, arr):
    sorted_list = sorted(arr) # i think sorting the array makes it easy to get the sub array with MUS quickly
    variance = None
    min_variance_sub = None
    for i in range(n - k + 1):
        sub = sorted_list[i:i+k]
        var = myvar(sub)
        if variance is None or var<variance:
            variance = var
            min_variance_sub=sub
    final = []
    f = list(cmb(min_variance_sub, 2)) # again getting all possible pairs in my messy way
    for r in f:
        final.append(abs(r[0] - r[1]))
    return sum(final)
def MainApp():
    n = int(input())
    k = int(input())
    arr = list(int(input()) for _ in range(n))
    result = LetMeDoIt(n, k, arr)
    print(result)
if __name__ == '__main__':
    MainApp()
This code works perfectly for n up to 1000 (maybe more), but terminates due to time out (5 seconds is the limit on the online judge :/ ) for n beyond 10000 (the biggest test case has n=100000).
=====
How would you approach this problem to take care of all the test cases in given time limits (5 sec) ? (problem was listed under algorithm & dynamic programming)
(for your references you can have a look on
successful submissions(py3, py2, C++, java) on this problem by other candidates - so that you can
explain that approach for me and future visitors)
an editorial by the problem setter explaining how to approach the question
a solution code by problem setter himself (py2, C++).
Input data (test cases) and expected output
Edit1 ::
For future visitors of this question, the conclusions I have till now are,
that variance and unfairness sum are not perfectly related (they are strongly related), which implies that among many lists of integers, the list with minimum variance doesn't always have to be the list with minimum unfairness sum. If you want to know why, I actually asked that as a separate question on math stack exchange HERE where one of the mathematicians proved it for me xD (and it's worth taking a look, 'cause it was unexpected)
As far as the question is concerned overall, you can read answers by archer & Attersson below (still trying to figure out a naive approach to carry this out - it shouldn't be far by now though)
Thank you for any help or suggestions :)
You must work on your list SORTED and check only sublists of consecutive elements. This is because any sublist that skips over an element of the sorted list cannot have a lower unfairness sum than the best sublist of consecutive elements.
For example if the list is
[1,3,7,10,20,35,100,250,2000,5000] and you want to check for sublists with length 3, then the solution must be one of [1,3,7] [3,7,10] [7,10,20] etc
Any other sublist, e.g. [1,3,10], will have a higher unfairness sum: since 10>7, each of its differences with the rest of the elements is larger than the corresponding difference for 7
The same goes for [1,7,10] (non-consecutive on the left side), as 1<3
Given that, you only have to check the consecutive sublists of length k, which reduces the execution time significantly
Regarding coding, something like this should work:
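import itertools
def myvar(array):
    return sum([abs(i[0]-i[1]) for i in itertools.combinations(array,2)])
def minsum(n, k, arr):
    arr = sorted(arr) # work on the sorted list, as explained above
    res=1000000000000000000000 #alternatively make it equal with first subarray
    for i in range(n-k+1): # there are n-k+1 consecutive sublists of length k
        res=min(res, myvar(arr[i:i+k]))
    return res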
I see this question still has no complete answer. I will outline a correct algorithm which will pass the judge. I will not write the code, in order to respect the purpose of the Hackerrank challenge, and since we already have working solutions.
1. The original array must be sorted. This has a complexity of O(NlogN).
2. At this point you can check only consecutive sub arrays, as non-consecutive ones will result in a worse (or equal, but not better) "unfairness sum". This is also explained in archer's answer.
3. The last pass, to find the minimum "unfairness sum", can be done in O(N). You need to calculate the US for every consecutive k-long subarray. The mistake is recalculating this from scratch at every step, which takes O(k) per step and brings the complexity of this pass to O(k*N). Each step can be done in O(1), as the editorial you posted shows, including the mathematical formulae. It requires initializing a cumulative array after step 1 (done in O(N), with space complexity O(N) too).
It works but terminates due to time out for n<=10000.
(from comments on archer's question)
To explain step 3, think about k = 100. You are scrolling the N-long array and the first iteration, you must calculate the US for the sub array from element 0 to 99 as usual, requiring 100 passages. The next step needs you to calculate the same for a sub array that only differs from the previous by 1 element 1 to 100. Then 2 to 101, etc.
If it helps, think of it like a snake. One block is removed and one is added.
There is no need to perform the whole O(k) scrolling. Just figure the maths as explained in the editorial and you will do it in O(1).
So the final complexity will asymptotically be O(NlogN) due to the first sort.
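For readers who want to see the shape of the O(1) update, here is a sketch derived directly from the definition rather than taken from the editorial, so it may differ from the setter's formulae. The function name min_unfairness_sum is hypothetical, and it keeps a running window sum instead of a cumulative array. In a sorted window, the element at offset j is larger than j elements and smaller than k-1-j elements, so it contributes x*(2j-(k-1)). When the window slides, dropping the smallest element a[i] and adding the new largest a[i+k] changes the sum by (k-1)*(a[i]+a[i+k]) - 2*(sum of the elements that stay):

```python
def min_unfairness_sum(arr, k):
    """Minimum sum of pairwise absolute differences over all k-element subsets."""
    a = sorted(arr)
    n = len(a)
    window_sum = sum(a[:k])
    # Unfairness of the first window, computed directly in O(k).
    unfairness = sum(x * (2 * j - (k - 1)) for j, x in enumerate(a[:k]))
    best = unfairness
    for i in range(n - k):
        staying = window_sum - a[i]  # elements shared by the old and new window
        # Remove a[i]'s differences, add a[i + k]'s differences, in O(1).
        unfairness += (k - 1) * (a[i] + a[i + k]) - 2 * staying
        window_sum = staying + a[i + k]
        best = min(best, unfairness)
    return best


print(min_unfairness_sum([1, 2, 5, 5, 6], 5))  # 26, the sum worked out in the question
print(min_unfairness_sum([1, 2, 5, 5, 6], 3))  # 2, from the window [5, 5, 6]
```

The sort dominates, so the whole function runs in O(N log N), matching the complexity stated above.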

Is there a non brute force based solution to optimise the minimum sum of a 2D array only using 1 value from each row and column

I have two arrays: one is an ordered array generated from a set of previous positions for connected points; the second is a new set of points specifying the new positions of the points. The task is to match up each old point with the best fitting new position. The distance between each pair of old and new points is stored in a new array of size n*n. The objective is to find a way to map each previous point to a new point resulting in the smallest total sum. As such, each old point is a row of the matrix and must match to a single column.
I have already looked into an exhaustive search. Although this works, it has complexity O(n!), which is just not a valid solution.
The code below can be used to generate test data for the 2D array.
import numpy as np
def make_data():
    org = np.random.randint(5000, size=(100, 2))
    new = np.random.randint(5000, size=(100, 2))
    arr = []
    # ranges = []
    for i,j in enumerate(org):
        values = np.linalg.norm(new-j, axis=1)
        arr.append(values)
    # print(arr)
    # print(ranges)
    arr = np.array(arr)
    return arr
Here are some small examples of the array and the expected output.
Ex. 1
1 3 5
0 2 3
5 2 6
For the above array the result should be [0,2,1], to signify that row 0 maps to column 0, row 1 to column 2 and row 2 to column 1, as the optimal solution picks the values 1, 3 and 2.
In
The algorithm would be nice to be 100% accurate although something much quicker that is 85%+ would also be valid.
Google search terms: "weighted graph minimum matching". You can consider your array to be a weighted graph, and you're looking for a matching that minimizes edge length.
The assignment problem is a fundamental combinatorial optimization problem. It consists of finding, in a weighted bipartite graph, a matching in which the sum of weights of the edges is as large as possible. A common variant consists of finding a minimum-weight perfect matching.
https://en.wikipedia.org/wiki/Assignment_problem
The Hungarian method is a combinatorial optimization algorithm that solves the assignment problem in polynomial time and which anticipated later primal-dual methods.
https://en.wikipedia.org/wiki/Hungarian_algorithm
I'm not sure whether to post the whole algorithm here; it's several paragraphs and in wikipedia markup. On the other hand I'm not sure whether leaving it out makes this a "link-only answer". If people have strong feelings either way, they can mention them in the comments.
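As an illustration of what a polynomial-time solver looks like, here is a sketch of one widely circulated O(n^3) formulation of the Hungarian method (row and column potentials plus augmenting paths). It is not the Wikipedia text, the function name min_cost_assignment is hypothetical, and it assumes a square cost matrix given as a list of lists:

```python
def min_cost_assignment(cost):
    """Return col_for_row, where col_for_row[r] is the column assigned to row r."""
    n = len(cost)
    INF = float("inf")
    # Arrays are 1-based; column 0 is a sentinel that holds the row being inserted.
    u = [0] * (n + 1)       # row potentials
    v = [0] * (n + 1)       # column potentials
    row_of = [0] * (n + 1)  # row_of[j]: row matched to column j (0 means free)
    way = [0] * (n + 1)     # way[j]: previous column on the augmenting path

    for i in range(1, n + 1):
        row_of[0] = i
        j0 = 0
        minv = [INF] * (n + 1)    # smallest reduced cost seen for each column
        used = [False] * (n + 1)  # columns already on the alternating tree
        while True:
            used[j0] = True
            i0 = row_of[j0]
            delta = INF
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            # Shift potentials so that at least one new edge becomes tight.
            for j in range(n + 1):
                if used[j]:
                    u[row_of[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if row_of[j0] == 0:  # reached a free column
                break
        # Flip the matching along the path back to the sentinel column.
        while j0 != 0:
            j1 = way[j0]
            row_of[j0] = row_of[j1]
            j0 = j1

    col_for_row = [0] * n
    for j in range(1, n + 1):
        col_for_row[row_of[j] - 1] = j - 1
    return col_for_row


print(min_cost_assignment([[1, 3, 5], [0, 2, 3], [5, 2, 6]]))  # [0, 2, 1]
```

On the question's Ex. 1 this yields [0, 2, 1], the mapping the question expects. If SciPy is available, scipy.optimize.linear_sum_assignment solves the same problem and returns the matched row and column indices, which is likely the more practical choice for a 100x100 matrix from make_data().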

Rearrange items in a list such that no two adjacent are same [duplicate]

This question already has answers here:
How to shuffle a character array with no two duplicates next to each other? [duplicate]
(4 answers)
Efficient algorithm for ordering different types of objects
(6 answers)
Closed 5 years ago.
How can we do it most efficiently?
Given a list with repeated items, task is to rearrange items in a list so that no two adjacent items are same.
Input: [1,1,1,2,3]
Output: [1,2,1,3,1]
Input: [1,1,1,2,2]
Output: [1,2,1,2,1]
Input: [1,1]
Output: Not Possible
Input: [1,1,1,1,2,3]
Output: Not Possible
Edit: General Algorithm is fine too! It doesn't need to be Python.
I am not good at python, so I am writing the general algorithm here:
1. Build a max heap, maxHeap, that stores elements and their frequencies as <array element, frequency>. maxHeap is ordered by the frequency of the element.
2. Create a temporary key that will be used as the previously visited element (the previous element in the resultant array). Initialize it as <item = -inf, freq = -1>.
3. While maxHeap is not empty:
    Pop an element and add it to the result.
    Decrease the frequency of the popped element by 1.
    Push the previous element back into the max heap if its frequency > 0.
    Make the current element the previous element for the next iteration.
4. If the length of the resultant array differs from the length of the original, there is no solution for this array. Else return the resultant array.
Edit
Those who're wondering why the greedy solution by putting the current most frequent element in even/odd positions by skipping one position each time won't work, you can try with the test-case [1 1 2 2 3 3].
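The heap-based steps above can be sketched in Python with the standard heapq module. This is one possible rendering, not the answerer's code: heapq is a min-heap, so counts are stored negated, and None stands in for the <-inf, -1> placeholder. The function name rearrange is hypothetical, and the items are assumed to be mutually comparable because ties on frequency fall back to comparing the items themselves:

```python
import heapq
from collections import Counter


def rearrange(items):
    """Return a reordering with no two equal neighbours, or None if impossible."""
    # Entries are (negated remaining count, item) so the most frequent pops first.
    heap = [(-freq, item) for item, freq in Counter(items).items()]
    heapq.heapify(heap)
    result = []
    prev = None  # entry held back for one round so it cannot repeat immediately
    while heap:
        neg_freq, item = heapq.heappop(heap)
        result.append(item)
        if prev is not None:
            heapq.heappush(heap, prev)
        neg_freq += 1  # one fewer copy of this item remains
        prev = (neg_freq, item) if neg_freq < 0 else None
    # A leftover held-back item means some copies could not be placed.
    return result if len(result) == len(items) else None


print(rearrange([1, 1, 1, 2, 3]))     # [1, 2, 1, 3, 1]
print(rearrange([1, 1, 1, 1, 2, 3]))  # None
```

Each element is pushed and popped once per occurrence, so the running time is O(n log d) for n items with d distinct values.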
Let n be the size of the list. If some element occurs more than (n + 1) / 2 times (integer division), there's clearly no solution (according to the pigeonhole principle). For example, three 1s in a list of five can still be placed at positions 0, 2 and 4, but four cannot.
Otherwise, we can always construct the answer. We can do it like this: let's write down all even positions first and then all odd positions (I use 0-based indices). After that, let's sort the elements by their frequency (in decreasing order) and fill the positions in the order described above (note: the elements are sorted as pairs (frequency, element) so that the same elements are grouped together. They're sorted only once at the very beginning of the answer construction).
This algorithm can run in linear time if we use a hash table to count elements and sort them by frequency using count sort.
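A short sketch of this even-then-odd construction, with the hypothetical name rearrange_even_odd. For brevity it orders the groups with Counter.most_common, which sorts by comparison rather than by the counting sort mentioned above, so this version is O(n log n) in the worst case rather than linear:

```python
from collections import Counter


def rearrange_even_odd(items):
    """Return a reordering with no two equal neighbours, or None if impossible."""
    n = len(items)
    counts = Counter(items)
    # More copies than there are even positions cannot be kept apart.
    if counts and max(counts.values()) > (n + 1) // 2:
        return None
    # Equal items grouped together, most frequent group first.
    ordered = [item for item, freq in counts.most_common() for _ in range(freq)]
    # Fill positions 0, 2, 4, ... and then 1, 3, 5, ...
    positions = list(range(0, n, 2)) + list(range(1, n, 2))
    result = [None] * n
    for pos, item in zip(positions, ordered):
        result[pos] = item
    return result


print(rearrange_even_odd([1, 1, 2, 2, 3, 3]))  # [1, 2, 1, 3, 2, 3]
```

The last line uses the test case [1 1 2 2 3 3] raised in the other answer's edit; with the groups sorted once and filled in order, the result has no equal neighbours.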

Array: Ascending and Multiplication Table

How to arrange 5 integers using array in ascending order, no use of sort()
then 5x5 multiplication table using array too, like using list, array append.
I'm going to take a stab at what I believe you're asking, mostly because I hope it's educational. You're lucky I'm procrastinating studying at the moment.
Sorting, because who likes entropy anyway?
Bubbles!
Your first task is to look at the bubble sort, a sorting algorithm that's as simple to code as it is to understand. (It performs poorly with large arrays due to its O(n^2) performance but is probably among the first sorts a lot of people encounter.) I highly, highly suggest you understand the algorithm before even thinking about looking at code.
How does it work?
Start at the beginning! Look at the first pair of numbers. If they're in the wrong order, swap them. Increment your starting position by 1 and repeat until the end of the array.
What would this look like in Python?
I'm glad you asked. You need to loop through the whole array and swap whenever appropriate. Thankfully Python makes swapping very easy, allowing you to pull tricks like a, b = b, a. We can (hopefully quickly) write down some code to do what we want:
def bubble_sort(array):
    for i in range(len(array)):
        # after pass i, the last i + 1 elements are in their final place
        for j in range(len(array) - i - 1):
            if array[j] > array[j + 1]:
                array[j], array[j + 1] = array[j + 1], array[j]
    return array
This should be straightforward and follows the sorting procedure directly. Pass in an array (or list of numbers) and it will return an array sorted in ascending order.
Multiplication Table
I'm assuming you mean something like this table that you learn in first grade. The requirement I'm imposing on your vague wording is that we want to return a 2D array where the first row is multiples of 0, the second is multiples of 1, etc. This goes for the columns as well, since multiplication tables are symmetric between rows and columns. There are a number of possible approaches, but I'm only going to consider the one I personally find the most elegant and Pythonic. Python comes packed with great list comprehension, so why not make use of it? Try this:
table = [[x*y for x in range(6)] for y in range(6)]
This creates a 6x6 matrix, i.e. the multiplication table from 0–5. Take some time to really understand this code. I think that list comprehension is absolutely fundamental to Python and is something that sets it apart. If you look at the (i, j)th element of the array, you'll see that it equals i*j. For example, table[3][2] == 6 is true.
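Since the question mentions building the table with a list and append, here is the same idea written as explicit loops, restricted to the 5x5 table for 1 through 5 that the question seems to ask for. The function name multiplication_table and the choice to start at 1 are assumptions, not part of the question:

```python
def multiplication_table(size):
    """Build a size x size table where table[r][c] == (r + 1) * (c + 1)."""
    table = []
    for y in range(1, size + 1):
        row = []
        for x in range(1, size + 1):
            row.append(x * y)  # grow the row one product at a time
        table.append(row)      # then add the finished row to the table
    return table


table = multiplication_table(5)
for row in table:
    # Right-align each value in a 3-character column for readability.
    print(" ".join(f"{value:3d}" for value in row))
```

The nested loops do exactly what the comprehension does; the comprehension is just the compact spelling.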
I desperately hope you learned something useful from this. Next time you post a question, hopefully you'll give us more to work on.
