I have an array of N elements. I can select the first item at most N times, the second item at most N-1 times, and so on.
I have K tokens and need to spend them so that I end up with the maximum number of items.
arr = [3, 4, 8], where arr[i] is the number of tokens required for the i-th item
n = 10, the number of tokens I have (K above)
Output:
3
Explanation:
We have 2 options here:
1. option 1: 1st item 2 times for 6 tokens (3*2) and second item once for 4 tokens (4*1)
2. option 2: 1st item 3 times for 9 tokens (3*3)
so maximum we can have 3 items
Code:
def process(arr,n):
    count = 0
    sum = 0
    size = len(arr)+1
    for i in range(0, len(arr), 1):
        size1 = size-1
        size -= 1
        while((sum+arr[i] <= n) and (size1 > 0)):
            size1 = size1 -1
            sum = sum + arr[i]
            count += 1
    return count;
But it worked for only few test cases, it failed for some hidden test cases. I am not sure where I made a mistake. Can anybody help me?
Your greedy approach will fail for test cases like this:
[8,2,1,1] 10
Your code will return 2 but the maximum will be 6.
I would use a heap of tuples, i.e. (cost_of_ride, -max_no_rides), so that the cheapest item is always popped first.
See the code below:
from heapq import *

def process(arr,n):
    count = 0
    heap = []
    for i in range(len(arr)):
        heappush(heap,(arr[i],-(len(arr)-i))) # Constructing min-heap with second index as negative of maximum number of rides
    while(n>0 and heap):
        cost,no_of_rides = heappop(heap)
        no_of_rides = -1 * no_of_rides # Changing maximum no_of_rides from negative to positive
        div = n//cost
        # If the amount of money is not sufficient to calculate the last number of rides user could take
        if(div<no_of_rides):
            count += div
            break
        # Else decrement the number of tokens by minimum cost * maximum no_of_rides
        else:
            count += no_of_rides
            n -= no_of_rides*cost
    return count
Time Complexity for the solution is: O(len(arr)*lg(len(arr))) or O(N*lg(N)).
Try:
def process(arr, n, res=[]):
    l=len(arr)
    for j in range(len(arr)+1):
        r=[arr[0]]*j
        if(sum(r)==n) or (sum(r)<n) and (l==1):
            yield len(res+r)
        elif(sum(r)<n):
            yield from process(arr[1:], n-sum(r), res+r)
        else:
            break
The idea is to iterate over all possible combinations of purchases: the item at (0-based) position i can be taken anywhere between 0 and N-i times, as per your rule.
Combinations that exceed n are discarded along the way. The function is a generator that yields the number of items in each complete combination, so to answer your question you need to take max(...) of it.
Outputs:
>>> print(max(process([3,4,8],10)))
3
>>> print(max(process([8,2,1,1],10)))
6
>>> print(max(process([10, 8, 6, 4, 2], 30)))
6
@learner, your logic doesn't seem to be working properly.
Please try this input: arr = [10, 8, 6, 4, 2], n = 30.
As per your description the answer should be 6 rides, but your code would produce 3.
Use a modified form of quickselect, where you select the next pivot based on the sum of the products of cost * max_times, but still sort based on just cost. This is worst-case O(n^2), but expected O(n).
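One way the quickselect idea could look is sketched below. This is only an illustration under stated assumptions, not the answerer's code: the name `max_items_quickselect` is hypothetical, costs are assumed to be positive integers, and the pivot is chosen at random. It partitions by cost and uses the total price of the cheaper side (the sum of cost * max_times) to decide which side to keep.

```python
import random

def max_items_quickselect(arr, tokens):
    """Hypothetical sketch: cheapest-first buying without a full sort."""
    # Item i (0-based) may be taken at most len(arr) - i times.
    items = [(cost, len(arr) - i) for i, cost in enumerate(arr)]
    count = 0
    while items and tokens > 0:
        pivot = random.choice(items)[0]  # assumption: all costs are > 0
        cheaper = [(c, k) for c, k in items if c < pivot]
        pricier = [(c, k) for c, k in items if c > pivot]
        equal_cap = sum(k for c, k in items if c == pivot)
        cheaper_total = sum(c * k for c, k in cheaper)
        if cheaper_total > tokens:
            # Not everything cheaper than the pivot is affordable: look only there.
            items = cheaper
            continue
        # Buy everything that is cheaper than the pivot.
        count += sum(k for _, k in cheaper)
        tokens -= cheaper_total
        # Buy as many pivot-priced items as the budget and the caps allow.
        take = min(equal_cap, tokens // pivot)
        count += take
        tokens -= take * pivot
        if take < equal_cap:
            break  # the budget ran out at the pivot price
        items = pricier
    return count

print(max_items_quickselect([8, 2, 1, 1], 10))
```

Each round discards either the cheaper or the pricier part of the list, which is where the expected linear running time comes from.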
import itertools as itt
layer_thickness=[1,2,3,4,5]
permu= itt.permutations(layer_thickness,5)
permu_list=list(permu)
for i in permu_list:
    if sum(i)==15:
        print(i)
Here, I want permutations of the elements in layer_thickness whose sum is 15. But the number of elements in a permutation is not constrained by any limit, as long as it gives the desired sum.
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1, 2 2 2 2 2 2 2 1, etc. should also be valid results.
What modifications should I make to achieve that?
You can't create all permutations as a list for an arbitrary total; it will simply hog too much memory:
Assuming [1,2,3,4,5] as numbers, 1 is your lowest element.
Assuming you want to reach a total = 15 - your "biggest" solution is (1,1,1,1,1,1,1,1,1,1,1,1,1,1,1).
To create all possible arrangements of 5 numbers over 15 places you would need 5**15 list entries: 5**15 = 30,517,578,125 possible arrangements.
Even storing each arrangement in a single byte would need about 30 GB of RAM, and a real Python tuple takes far more than one byte. Doubt you have that much.
The best you can do is to generate only numbers that work and quit out as soon as you reached your total. To do that for a fixed number set you can start with this question: Finding all possible combinations of numbers to reach a given sum
Using that, and feeding it a "maximized" list of your numbers (each number repeated as often as it fits into the total), might lead to something:
def subset_sum(numbers, target, partial=[], partial_sum=0):
    if partial_sum == target:
        yield partial
    if partial_sum >= target:
        return
    for i, n in enumerate(numbers):
        remaining = numbers[i + 1:]
        yield from subset_sum(remaining, target, partial + [n], partial_sum + n)
Credit: https://stackoverflow.com/a/4633515/7505395
def calc_nums_for_permutate(nums, total):
    """Creates a list of numbers from num - each single number is added as
    many times as itself fits into total"""
    return [n for n in nums for _ in range(total//n)]

if __name__ == "__main__":
    nums = [1, 2, 3, 4, 5]
    total = 15
    print( *subset_sum( calc_nums_for_permutate(nums, total), total))
This will not work for all and any inputs. Good luck with your runtime: it still works reasonably well for a total = 10, but for a total = 15 it takes more time than I needed to format, copy-paste and formulate this answer here.
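If what you are after is every ordered sequence (with repetition) that adds up to the total, a lazy generator that stops extending a sequence as soon as it would overshoot may be enough. The following is a minimal sketch, not part of the linked answer; the name `compositions` is hypothetical and the numbers are assumed to be positive integers.

```python
def compositions(numbers, total, prefix=()):
    """Yield every ordered sequence of values from `numbers` that sums to `total`.

    Hypothetical helper; assumes all numbers are positive, otherwise the
    recursion would never terminate.
    """
    if total == 0:
        yield prefix
        return
    for n in numbers:
        if n <= total:  # prune: never build a sequence that overshoots
            yield from compositions(numbers, total - n, prefix + (n,))

if __name__ == "__main__":
    # Only one sequence lives in memory at a time, so this can be streamed.
    for seq in compositions([1, 2, 3, 4, 5], 5):
        print(seq)
```

Because results are produced one at a time, nothing has to be stored, although the number of results still grows very quickly with the total.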
I'm doing some coding with DNA sequences and I'm interested in a function to find sequential repeats (which could represent where primers could 'slip' AKA do bad stuff).
An example of what I'm interested in would be as follows:
longest_repeat('ATTTTCCATGATGATG')
which would output the repeat length and coordinates, in this case 9 long and 7:15. The function should have picked up the ATGATGATG at the end and since it is longer than the TTTT repeat and the TGATGA repeat, it would only report the ATGATGATG. In the case of ties, I'd like if it could report all the ties, or at least one of them.
It would also be nice to set a threshold to only report these sequential repeats if they're over a specific length.
I have some experience in python, but this specific question has me stumped, since if I code it inefficiently and put in a 50 character long string it could take forever. I appreciate all the help!
Here is a solution:
def longest_repeat(seq, threshold):
    results = []
    longest = threshold
    # starting position
    for i in range(len(seq)):
        # pattern period
        for p in range(1, (len(seq)-i)//2+1):
            # skip unnecessary combinations
            if results != [] and results[-1][0] == i and results[-1][3] % p == 0: continue
            # max possible number of repetitions
            repetitions = len(seq)//p
            # position within the pattern's period
            for k in range(p):
                # get the max repetitions the k-th character in the period can support
                m = 1
                while i+k+m*p < len(seq) and seq[i+k] == seq[i+k+m*p]:
                    m += 1
                repetitions = min(m, repetitions)
                # check if we're already below the best result so far
                if repetitions*p < longest: break
            # save the result if it's good
            if repetitions > 1 and repetitions*p >= longest:
                # overwrite lesser results
                if repetitions*p > longest: results = []
                # store the current one (with ample information)
                results += [(i, seq[i:i+p], repetitions, repetitions*p)]
                longest = max(longest, repetitions*p)
    return results
The logic is that you run through each starting position in the sequence (i), you check every sensible pattern period (p) and for that combination you check if they result in a substring at least as good as the best one so far (or the threshold, if no result has been found yet).
The result is a list of tuples of the form (starting index, period string, repetitions, total length). Running your example
import time

threshold = 5
seq = 'ATTTCCATGATGATG'
t = time.time()
results = longest_repeat(seq, threshold)
print("execution time :", time.time()-t)
for t in results:
    print(t)
we get
execution time : 0.00010848045349121094
(6, 'ATG', 3, 9)
From there, it is trivial to get the full matched string (simply do period_string * repetitions)
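As a small illustration of that last step, the sketch below turns one result tuple into the matched string plus its start and (inclusive) end coordinates. The helper name `describe_repeat` is hypothetical and not part of the answer's code.

```python
def describe_repeat(result):
    """Expand a (start, period, repetitions, length) tuple from longest_repeat.

    Hypothetical helper: returns the full matched string together with its
    0-based start index and inclusive end index.
    """
    start, period, repetitions, length = result
    matched = period * repetitions
    return matched, start, start + length - 1

print(describe_repeat((6, 'ATG', 3, 9)))  # ('ATGATGATG', 6, 14)
```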
For a random input of 700 characters, the execution time is ~6.8 seconds, compared to ~20.2 seconds using @IoaTzimas's answer.
The following will work pretty efficiently. It returns the longest sequence, its length, its starting index and its ending index. If there are multiple sequences of maximum length, the result will be a list of them. The second parameter of longest(s, threshold) is the desired threshold, i.e. the minimum length:
import numpy as np

def b(n): # returns the factors of an integer; used by the next function
    r = np.arange(1, int(n ** 0.5) + 1)
    x = r[np.mod(n, r) == 0]
    return set(np.concatenate((x, n / x), axis=None))

def isseq(s): # tests whether a string is a repeated sequence: for each factor of its length, it splits the string into equal parts and checks whether they are all equal
    l=[int(p) for p in sorted(list(b(len(s))))[:-1]]
    for i in l:
        if len(set(s[k*i:i*(k+1)] for k in range(len(s)//i)))==1:
            return True
    return False

def longest(s, threshold): # the main function: returns the longest sequence, or a list of them if there are several, using threshold as the minimum length
    m=[]
    for i in range(len(s), threshold-1, -1):
        for k in range(len(s)-i+1):
            if isseq(s[k:k+i]):
                m.append([s[k:k+i], i, k, k+i-1])
        if len(m)>0:
            return m
    return False
Examples:
>>> s='ATTTTCCATGATGATGGST'
>>> longest(s, 1)
[['ATGATGATG', 9, 7, 15]]
>>> s='ATTTTCCATGATGATGGSTLWELWELWEGFRJGHIJH'
>>> longest(s, 1)
[['ATGATGATG', 9, 7, 15], ['LWELWELWE', 9, 19, 27]]
>>>s='ATTTTCCATGATGATGGSTWGTKWKWKWKWKWKWKWKWKWKWKWFRGWLWERLWERLWERLWERLWERLWERLWERLWERLWERLWERLWERLWERLWERLWERLWERLWERFGTFRGFTRUFGFGRFGRGBHJ'
>>> longest(s, 1)
[['LWERLWERLWERLWERLWERLWERLWERLWERLWERLWERLWERLWERLWERLWERLWERLWER', 64, 48, 111]]
This is a Find All Numbers Disappeared in an Array problem from LeetCode:
Given an array of integers where 1 ≤ a[i] ≤ n (n = size of array),
some elements appear twice and others appear once.
Find all the elements of [1, n] inclusive that do not appear in this array.
Could you do it without extra space and in O(n) runtime? You may
assume the returned list does not count as extra space.
Example:
Input:
[4,3,2,7,8,2,3,1]
Output:
[5,6]
My code is below. I think it's O(N), but the interviewer disagrees.
def findDisappearedNumbers(self, nums: List[int]) -> List[int]:
    results_list=[]
    for i in range(1,len(nums)+1):
        if i not in nums:
            results_list.append(i)
    return results_list
You can implement an algorithm that loops through the list and, for each value v it contains, makes the element at index v - 1 negative. Afterwards, every index i that still holds a positive element corresponds to a missing value i + 1, which you add to your list of missing items. It doesn't take any additional space and uses at most 3 for loops (not nested), which makes the complexity O(3*n), which is basically O(n). This site explains it much better and also provides the source code.
Edit: I have added the code in case someone wants it:
# The input list and the output list
nums = [4, 5, 3, 3, 1, 7, 10, 4, 5, 3]
missing_elements = []
# For each value v, mark it as seen by negating nums[abs(v) - 1]. abs() is necessary
# because v itself may already have been negated by an earlier step
for v in nums:
    if nums[abs(v) - 1] > 0:
        nums[abs(v) - 1] = -nums[abs(v) - 1]
# Loop through the list again; an index that still holds a positive value was never seen
for i in range(len(nums)):
    if nums[i] > 0:
        missing_elements.append(i + 1)
For me, using loops is not the best way to do it, because the membership test inside the loop scans the whole list each time, which increases the complexity. You can try doing it with sets (at the cost of O(n) extra space).
def findMissingNums(input_arr):
    n = len(input_arr) # the problem defines n as the size of the array
    input_set = set(input_arr) # convert input array into a set
    set_num = set(range(1, n + 1)) # create a set of all numbers from 1 to n
    missing_nums = sorted(set_num - input_set) # take difference of both sets and convert to a sorted list
    return missing_nums

input_arr = [4,3,2,7,8,2,3,1] # 1 <= input_arr[i] <= n
print(findMissingNums(input_arr)) # outputs [5, 6]
Use hash table, or dictionary in Python:
def findDisappearedNumbers(self, nums):
    hash_table={}
    for i in range(1,len(nums)+1):
        hash_table[i] = False
    for num in nums:
        hash_table[num] = True
    for i in range(1,len(nums)+1):
        if not hash_table[i]:
            print("missing..",i)
Try the following:
a=[4,3,2,7,8,2,3,1] # the input list
b=[x for x in range(1,len(a)+1)]
c,d=set(a),set(b)
print(sorted(d-c))
Given a list of size N. Find the number of pairs (i, j) such that A[i] XOR A[j] = x, and 1 <= i < j <= N.
Input : list = [3, 6, 8, 10, 15, 50], x = 5
Output : 2
Explanation : (3 ^ 6) = 5 and (10 ^ 15) = 5
This is my code (brute force):
import itertools
n=int(input())
pairs=0
l=list(map(int,raw_input().split()))
q=[x for x in l if x%2==0]
p=[y for y in l if y%2!=0]
for a, b in itertools.combinations(q, 2):
    if (a^b!=2) and ((a^b)%2==0) and (a!=b):
        pairs+=1
for a, b in itertools.combinations(p, 2):
    if (a^b!=2) and ((a^b)%2==0) and (a!=b):
        pairs+=1
print pairs
How can I do this more efficiently, in O(n) time, in Python?
Observe that if A[i]^A[j] == x, this implies that A[i]^x == A[j] and A[j]^x == A[i].
So, an O(n) solution would be to iterate through an associative map (dict) where each key is an item from A and each value is the count of that item. Then, for each item A[i], calculate A[i]^x and see if it is in the map. If it is, then A[i]^A[j] == x for some j. Since the map holds the count of all items that equal A[j], the number of pairs contributed is num_Ai * num_Aj. Note that each pair will be counted twice since XOR is commutative (i.e. A[i]^A[j] == A[j]^A[i]), so we have to divide the final count by 2.
def create_count_map(lst):
    result = {}
    for item in lst:
        if item in result:
            result[item] += 1
        else:
            result[item] = 1
    return result

def get_count(lst, x):
    count_map = create_count_map(lst)
    total_pairs = 0
    for item in count_map:
        xor_res = item ^ x
        if xor_res in count_map:
            total_pairs += count_map[xor_res] * count_map[item]
    return total_pairs // 2

print(get_count([3, 6, 8, 10, 15, 50], 5))
print(get_count([1, 3, 1, 3, 1], 2))
outputs
2
6
as desired.
Why is this O(n)?
Converting a list to a dict s.t. the dict contains the count of each item in the list is O(n) time.
Calculating item ^ x is O(1) time, and calculating whether this result is in a dict is also O(1) time. dict key access is also O(1), and so is multiplication of two elements. We do all this n times, hence O(n) time for the loop.
O(n) + O(n) reduces to O(n) time.
Edited to handle duplicates correctly.
The accepted answer does not give the correct result for X=0. The code below handles that case: it counts the pairs of equal elements, which are exactly the pairs whose XOR is 0. You can modify it to get answers for other values as well.
def calculate(a) :
    # Finding the maximum of the array
    maximum = max(a)
    # Creating frequency array
    # With initial value 0
    frequency = [0 for x in range(maximum + 1)]
    # Traversing through the array
    for i in a :
        # Counting frequency
        frequency[i] += 1
    answer = 0
    # Traversing through the frequency array
    for i in frequency :
        # Calculating answer
        answer = answer + i * (i - 1) // 2
    return answer
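The two answers can be folded into one function that treats x == 0 separately. The following is a sketch under that assumption rather than either answerer's code; the name `count_xor_pairs` is hypothetical, and it uses `collections.Counter` from the standard library for the frequency map.

```python
from collections import Counter

def count_xor_pairs(values, x):
    """Count pairs (i, j), i < j, with values[i] ^ values[j] == x (hypothetical helper)."""
    counts = Counter(values)
    if x == 0:
        # a ^ b == 0 only when a == b, so count pairs inside each group of equal values.
        return sum(c * (c - 1) // 2 for c in counts.values())
    # For x != 0 the partner v ^ x always differs from v, so every pair is seen twice.
    total = sum(c * counts.get(v ^ x, 0) for v, c in counts.items())
    return total // 2

print(count_xor_pairs([3, 6, 8, 10, 15, 50], 5))  # 2
print(count_xor_pairs([1, 3, 1, 3, 1], 2))        # 6
print(count_xor_pairs([1, 1, 1, 2], 0))           # 3
```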
I'm stumped on how to speed up my algorithm, which sums multiples in a given range. This is for a problem on codewars.com; here is a link to the problem:
codewars link
Here's the code; I'll explain what's going on at the bottom.
import itertools

def solution(number):
    return multiples(3, number) + multiples(5, number) - multiples(15, number)

def multiples(m, count):
    l = 0
    for i in itertools.count(m, m):
        if i < count:
            l += i
        else:
            break
    return l

print solution(50000000) #takes 41.8 seconds
#one of the testers takes 50000000000000000000000000000000000000000 as input
# def multiples(m, count):
#     l = 0
#     for i in xrange(m,count ,m):
#         l += i
#     return l
So basically the problem asks the user to return the sum of all the multiples of 3 or 5 below a given number. Here are the tests.
test.assert_equals(solution(10), 23)
test.assert_equals(solution(20), 78)
test.assert_equals(solution(100), 2318)
test.assert_equals(solution(200), 9168)
test.assert_equals(solution(1000), 233168)
test.assert_equals(solution(10000), 23331668)
My program has no problem getting the right answer. The problem arises when the input is large. When I pass in a number like 50000000 it takes over 40 seconds to return the answer. One of the inputs I'm asked to take is 50000000000000000000000000000000000000000, which is a huge number. That's also the reason why I'm using itertools.count(): I tried using xrange in my first attempt, but xrange can't handle numbers larger than a C long. I know the slowest part of the problem is the multiples method, yet it is still faster than my first attempt, which used a list comprehension and checked whether i % 3 == 0 or i % 5 == 0. Any ideas, guys?
This solution should be faster for large numbers.
def solution(number):
    number -= 1
    a, b, c = number // 3, number // 5, number // 15
    asum, bsum, csum = a*(a+1) // 2, b*(b+1) // 2, c*(c+1) // 2
    return 3*asum + 5*bsum - 15*csum
Explanation:
Take any sequence from 1 to n:
1, 2, 3, 4, ..., n
And its sum will always be given by the formula n(n+1)/2. This can be proven easily if you consider that the expression (1 + n) / 2 is just a shortcut for computing the average, or arithmetic mean, of this particular sequence of numbers. Because average(S) = sum(S) / length(S), if you take the average of any sequence of numbers and multiply it by the length of the sequence, you get the sum of the sequence.
If we're given a number n, and we want the sum of the multiples of some given k up to n, including n, we want to find the summation:
k + 2k + 3k + 4k + ... xk
where xk is the highest multiple of k that is less than or equal to n. Now notice that this summation can be factored into:
k(1 + 2 + 3 + 4 + ... + x)
We are given k already, so now all we need to find is x. If x is defined to be the highest number you can multiply k by to get a natural number less than or equal to n, then we can get the number x by using Python's integer division:
n // k == x
Once we find x, we can find the sum of the multiples of any given k up to a given n using previous formulas:
k(x(x+1)/2)
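As a quick sanity check of that formula, the sketch below wraps it in a helper and compares it with a plain loop for small inputs. The name `sum_of_multiples` is hypothetical, and n is treated as inclusive, as in the derivation above.

```python
def sum_of_multiples(k, n):
    """Sum of all positive multiples of k that are <= n, in O(1) (hypothetical helper)."""
    x = n // k                   # highest multiplier with x * k <= n
    return k * (x * (x + 1) // 2)

# Compare the closed form against a brute-force sum for small inputs.
for k in (3, 5, 15):
    for n in range(0, 200):
        assert sum_of_multiples(k, n) == sum(range(k, n + 1, k))

# Multiples of 3 or 5 below 10 (so n = 9): 3 + 5 + 6 + 9 = 23.
print(sum_of_multiples(3, 9) + sum_of_multiples(5, 9) - sum_of_multiples(15, 9))
```

Since Python integers are arbitrary precision, the same helper also accepts inputs such as 10**40 without any loop.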
Our three given k's are 3, 5, and 15.
We find our x's in this line:
a, b, c = number // 3, number // 5, number // 15
Compute the summations of their multiples up to n in this line:
asum, bsum, csum = a*(a+1) // 2, b*(b+1) // 2, c*(c+1) // 2
And finally, multiply their summations by k in this line:
return 3*asum + 5*bsum - 15*csum
And we have our answer!