I have two lists
a = [1, 4, 12]
b = [2, 13]
I want to know if values in list b are between two values in list a
So, in this case, 2 will fall between 1 and 4. 13 will not fall between any numbers.
I have tried the bisect module, but I couldn't get it to work. I was able to use it with a single value and a list, but not with two lists.
Maybe there's some subtlety that I do not get, but unless I am mistaken, you only have to check whether the elements are between the min and max of a. This is independent of whether the elements in a are sorted, or whether the values from b have to be between consecutive values from a. As long as they are between the min and max, there has to be a "segment" in a that contains those values.
>>> a = [1, 4, 12]
>>> b = [2, 13]
>>> n, m = min(a), max(a)
>>> [n < x < m for x in b]
[True, False]
That is, of course, only if (a) you do not need to know which numbers they fall between, and (b) not all values in b have to be in the same interval.
If you think I missed something, please comment.
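If you do need to know which pair of values each element falls between, the bisect module mentioned in the question can be applied to one element of b at a time. This is a minimal sketch, assuming "between" means strictly between two neighbouring values of the sorted list; the helper name `bracket` is made up for illustration:

```python
from bisect import bisect_left, bisect_right

def bracket(sorted_a, x):
    """Return the neighbouring pair (lo, hi) of sorted_a with lo < x < hi, or None."""
    left = bisect_left(sorted_a, x)
    right = bisect_right(sorted_a, x)
    if left != right:
        # x is equal to an element of sorted_a, so it is not strictly between two values
        return None
    if left == 0 or left == len(sorted_a):
        # x is below the minimum or above the maximum
        return None
    return (sorted_a[left - 1], sorted_a[left])

a = sorted([1, 4, 12])
b = [2, 13]
result = {x: bracket(a, x) for x in b}
print(result)  # {2: (1, 4), 13: None}
```

Each lookup is a binary search, so after sorting a once the whole check costs roughly O(len(b) * log(len(a))).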
It really depends on what you want it to return. I wrote a code that will return the first pattern that it finds, but with some changes I'm sure it would not be difficult to return all combinations.
def get_between(a, b):
    a, b = sorted(a), sorted(b)
    for b_value in b:
        smaller = None
        greater = None
        for a_value in a:
            if b_value > a_value:
                smaller = a_value
            elif b_value < a_value:
                greater = a_value
            # compare against None so that a bound of 0 still counts
            if smaller is not None and greater is not None:
                return f"{b_value} is between {smaller} and {greater}"
    return "There is no such combination"
a = [1, 4, 12]
b = [2, 13]
print(get_between(a, b))
In that case the output is "2 is between 1 and 4", but you can adapt the return value to be whatever you want.
You can keep two running indices to get a list of all elements that fall between values:
def get_between(arr1, arr2):
    # first sort the arrays
    arr1 = sorted(arr1)
    arr2 = sorted(arr2)
    # keep two indices into them
    i1 = 0
    i2 = 0
    # keep track of the values between two values
    ret = []
    while i1 < len(arr1) - 1 and i2 < len(arr2):
        # we're too small to be between a value
        # so we should increase the second index
        if arr2[i2] < arr1[i1]:
            i2 += 1
        # we're too large to be between a value
        # so we should increase the first index
        elif arr2[i2] > arr1[i1 + 1]:
            i1 += 1
        # we are between a value
        # so we should append to the return array
        # and move on to the next element
        else:
            ret.append(arr2[i2])
            i2 += 1
    return ret
get_between([1, 4, 12], [2, 8, 13]) # [2, 8]
If you don't care much about performance, here's a Pythonic solution:
def betwn(rangelist, valuelist):
    # Get list of all ranges between each consecutive number of rangelist
    rgs = [range(rangelist[n], rangelist[n + 1]) for n in range(len(rangelist) - 1)]
    # A function to check whether a given element exists between 2 consecutive numbers of rangelist
    verifyfunc = lambda e: any(e in r for r in rgs)
    # Return the qualifying elements from valuelist
    return [e for e in valuelist if verifyfunc(e)]
>>> betwn([1, 4, 12], [2, 13])
[2]
Hello. I have some trouble with this question:
I have a list, for example a=[1,4,5,7,10]
I need another list with randomly chosen elements from a, e.g. b = np.random.choice(a,3,replace=False) -> b = [4,7,10]
But the length of b must be the same as len(a). That description may be confusing, so here is an example:
1. `a = [1,4,5,7,10]`
2. `b = [0,4,0,7,10]`
So the elements that weren't chosen must be zeros.
Any suggestions would be appreciated
Reverse the question to "replacing random list entries with 0". Then the question becomes how to decide for any value whether to set it to 0 or not. Here we're using a certain random chance per value; you'll want to substitute that with whatever other specific criterion you have:
import random
a = [1,4,5,7,10]
b = [0 if random.randint(0, len(a)) < 2 else i for i in a]
There'll be better solutions, but this can work:
import numpy as np
a = np.array([1,4,5,7,10])
b = np.random.choice(np.arange(len(a)), 3, replace=False)
out = np.zeros(len(a), dtype=a.dtype)
out[b] = a[b]
Note that I'm using choice on the list of indices into a, not the actual values of a.
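The same index-based idea works without NumPy. This is a minimal sketch using only the standard library; the function name `keep_random` and its parameters are illustrative, not part of any library:

```python
import random

def keep_random(values, k):
    """Return a copy of values in which only k randomly chosen positions keep their value."""
    # Sample positions rather than values, so duplicate values in the input are handled correctly.
    keep = set(random.sample(range(len(values)), k))
    return [v if i in keep else 0 for i, v in enumerate(values)]

a = [1, 4, 5, 7, 10]
b = keep_random(a, 3)
print(b)  # for example [0, 4, 0, 7, 10]; the kept positions vary from run to run
```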
import random
a = [1, 4, 5, 7, 10]
b = random.sample(a, len(a))
for i in range(len(b)):
    if b[i] != a[i]:
        b[i] = 0
print('a:', a)
print('b:', b)
I'm trying to count the combinations of 7-digit numbers (or more; I actually need it to work for 10, but it's faster to test with 7) that contain all of 1, 3, 5, 7. I tried a few different methods, such as
combinations = 0
for combination in itertools.product(xrange(10), repeat=7):
    if all(x in combination for x in (1,3,5,7)):
        combinations += 1
However, this next method turned out to be about 4 times faster, as it doesn't look for 3, 5, 7 if 1 is not in the combination.
combinations = 0
for combination in itertools.product(xrange(10), repeat=7):
    if 1 in combination:
        if 3 in combination:
            if 5 in combination:
                if 7 in combination:
                    combinations += 1
I'm sure there is a cleverer way to achieve this result with numpy or something like that, but I can't figure it out.
Thanks for feedback
The problem is to find k-digit numbers that contain all the digits 1, 3, 5, 7.
This answer contains a number of solutions, increasing in sophistication and algorithmic efficiency. By the end, we'll be able to, in a fraction of a second, count solutions for huge k, for example 10^12, modulo a large prime.
The section at the end includes tests that provide good evidence that all the implementations are correct.
Brute force: O(k10^k) time, O(k) space
We'll use this slow approach to test the more optimized versions of the code:
def contains_1357(i):
    i = str(i)
    return all(x in i for x in '1357')
def combos_slow(k):
    return sum(contains_1357(i) for i in xrange(10 ** k))
Counting: O(k^4) time, O(k) space
The simplest moderately efficient method is to count. One way to do this is to count all k-digit numbers where the first occurrences of the four special digits appear at digits a, b, c, d.
Given such an a, b, c, d, the digits up to a must be 0,2,4,6,8,9, the digit a must be one of [1, 3, 5, 7], the digits between a and b must be either the same as the digit a or any of the safe digits, the digit b must be one of [1, 3, 5, 7] that's different from the digit at a, and so on.
Summing over all possible a, b, c, d gives the result. Like this:
import itertools
def combos0(k):
    S = 0
    for a, b, c, d in itertools.combinations(range(k), 4):
        S += 6 ** a * 4 * 7**(b-a-1) * 3 * 8**(c-b-1) * 2 * 9**(d-c-1) * 10**(k-d-1)
    return S
Dynamic programming: O(k) time, O(k) and then O(1) space
You can solve this more efficiently with dynamic programming: let c[j][i] be the number of i-digit numbers which contain exactly j different digits from (1, 3, 5, 7).
Then c satisfies these recurrence relations:
c[0][0] = 1
c[j][0] = 0 for j > 0
c[0][i] = 6 * c[0][i-1] for i > 0
c[j][i] = (6+j)c[j][i-1] + (5-j)c[j-1][i-1] for i, j > 0
The final line of the recurrence relations is the hardest one to understand. The first part, (6+j)c[j][i-1], says that you can make an i-digit number containing j of the digits 1, 3, 5, 7 from an (i-1)-digit number containing j of those digits by adding an extra digit that's either one of 0, 2, 4, 6, 8, 9 or any of the j special digits you've already got. Similarly, the second part, (5-j)c[j-1][i-1], says that you can take an (i-1)-digit number containing j-1 of the digits 1, 3, 5, 7 and make it an i-digit number containing j of the special digits by adding one of the special digits you haven't already used. There are 4-(j-1) = 5-j of these.
That leads to this O(k) solution using dynamic programming:
def combos(k):
    c = [[0] * (k + 1) for _ in xrange(5)]
    c[0][0] = 1
    for i in xrange(1, k+1):
        c[0][i] = 6 * c[0][i-1]
        for j in xrange(1, 5):
            c[j][i] = (6 + j) * c[j][i-1] + (5-j) * c[j-1][i-1]
    return c[4][k]
We can print combos(10):
print 'combos(10) =', combos(10)
This gives this output:
combos(10) = 1425878520
The solution above is already fast enough to compute combos(10000) in a fraction of a second. But it's possible to optimize the DP solution a little to use O(1) rather than O(k) space by observing that values of c depend only on the previous column in the table. With a bit of care (to make sure that we're not overwriting values before they're used), we can write the code like this:
def combos2(k):
    c = [1, 0, 0, 0, 0]
    for _ in xrange(k):
        for j in xrange(4, 0, -1):
            c[j] = (6+j)*c[j] + (5-j)*c[j-1]
        c[0] *= 6
    return c[4]
Matrix power: O(log k) time, O(1) space.
Ultimately, it's possible to get the result in O(log k) time and O(1) space, by expressing the recurrence relation as a matrix-by-vector multiply, and using exponentiation by squaring. That makes it possible to compute combos(k) modulo X even for massive k (here combos(10^12) modulo 2^31 - 1). That looks like this:
def mat_vec(M, v, X):
    # reduce the whole sum modulo X, not just each term
    return [sum(M[i][j] * v[j] for j in xrange(5)) % X for i in xrange(5)]
def mat_mul(M, N, X):
    return [[sum(M[i][j] * N[j][k] for j in xrange(5)) % X for k in xrange(5)] for i in xrange(5)]
def mat_pow(M, k, X):
    r = [[i==j for i in xrange(5)] for j in xrange(5)]
    while k:
        if k % 2:
            r = mat_mul(r, M, X)
        M = mat_mul(M, M, X)
        k //= 2
    return r
def combos3(k, X):
    M = [[6, 0, 0, 0, 0], [4, 7, 0, 0, 0], [0, 3, 8, 0, 0], [0, 0, 2, 9, 0], [0, 0, 0, 1, 10]]
    return mat_vec(mat_pow(M, k, X), [1, 0, 0, 0, 0], X)[4]
print combos3(10**12, (2**31) - 1)
Given that your original code struggled for k=10, this is quite an improvement!
We can test each of the functions against each other (and combos_slow for small values). Since combos3 has an extra arg, we wrap it in a function that passes a modulo that's guaranteed to be larger than the result.
def combos3p(k):
    return combos3(k, 10**k)
for c in [combos0, combos, combos2, combos3p]:
    for i in xrange(40 if c == combos0 else 100):
        assert c(i) == (combos_slow if i < 7 else combos)(i)
This tests all the implementations against combos_slow for i<7, and against each other for 7 <= i < 100 (except for the less efficient combos0 which stops at 40).
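One more independent check is possible with a different technique, inclusion-exclusion, which gives a closed form: out of 10^k digit strings, subtract those missing at least one of the four special digits. The sketch below is written for Python 3.8+ (it relies on `math.comb`), unlike the Python 2 code above, and the function names are made up for illustration:

```python
from math import comb

def combos_inclusion_exclusion(k):
    """Count length-k digit strings (leading zeros allowed) that contain 1, 3, 5 and 7."""
    # Choose j of the four special digits to forbid: (10 - j) ** k strings avoid all of them.
    return sum((-1) ** j * comb(4, j) * (10 - j) ** k for j in range(5))

def brute_force(k):
    """Slow reference count, only usable for small k."""
    return sum(all(d in str(i) for d in "1357") for i in range(10 ** k))

print(combos_inclusion_exclusion(10))  # 1425878520
```

The value for k = 10 agrees with the combos(10) output shown earlier.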
I have implemented a cyclic iteration function in two ways:
def Spin1(n, N) : # n - current state, N - highest state
    value = n + 1
    case1 = (value > N)
    case2 = (value <= N)
    return case1 * 0 + case2 * value
def Spin2(n, N) :
    value = n + 1
    if value > N :
        return 0
    else : return value
These functions are identical regarding the returned results. However the second function is not broadcasting-capable for a numpy array. So to test the first function I run this:
import numpy
AR1 = numpy.zeros((3, 4), dtype = numpy.uint32)
AR1[1,2] = 5
print AR1
print Spin1(AR1,5)
Magically it works, and that is so sweet. So I see exactly what I want:
[[0 0 0 0]
[0 0 5 0]
[0 0 0 0]]
[[1 1 1 1]
[1 1 0 1]
[1 1 1 1]]
Now with the second function print Spin2(AR1,5) it fails with this error:
if value > N
ValueError: The truth value of an array with more than one element is ambiguous.
Use a.any() or a.all()
And it's clear why, since an if statement on an array is nonsense. So for now I just use the first variant. But when I look at those functions, I have a strong feeling that the first function performs many more mathematical operations, so I haven't lost hope that I can optimise it.
1. Is it possible to optimise Spin1 so that it does fewer operations, or how can I use Spin2 in broadcasting mode (preferably without making my code too ugly)? Extra question: what would be the fastest way to do this manipulation on an array?
2. Is there a standard Python function that does the same calculation (it need not be broadcasting-capable), and what is it properly called? "Cyclic increment", perhaps?
There is a numpy function for this: np.where:
In [590]: AR1
array([[0, 0, 0, 0],
[0, 0, 5, 0],
[0, 0, 0, 0]], dtype=uint32)
In [591]: np.where(AR1 >= 5, 0, 1)
array([[1, 1, 1, 1],
[1, 1, 0, 1],
[1, 1, 1, 1]])
So, you could define:
def Spin1(n, N) :
    value = n + 1
    return np.where(value > N, 0, value)
NumPy also provides a way to turn normal Python functions into ufuncs:
def Spin2(n, N) :
    value = n + 1
    if value > N :
        return 0
    else : return value
Spin2 = np.vectorize(Spin2)
So that you can now call Spin2 on arrays:
In [595]: Spin2(AR1, 5)
array([[1, 1, 1, 1],
[1, 1, 0, 1],
[1, 1, 1, 1]])
However, np.vectorize mainly provides syntactic sugar. There is still a Python function call being made for each array element, which makes np.vectorized ufuncs no faster than equivalent code using Python for-loops.
Your Spin1 follows a well established pattern in array oriented languages (e.g. APL, MATLAB) for 'vectorizing' a function like Spin2. You create one or more booleans (or 0/1 arrays) to represent the various states the array elements can take, and then construct the output by multiplication and summation.
For example, to avoid divide-by-zero problems, I have used:
A variation on this is to use a boolean index array to select array elements that should be changed. In this case, you want to return value, but with selected elements 'rolled over'.
def Spin3(n, N) : # n - current state, N - highest state
    value = n + 1
    value[value>N] = 0
    return value
In this case, the indexing approach is simpler, and seems to fit the program logic better. It may be faster, but I can't guarantee that. It's good to keep both approaches in mind.
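Another option, assuming the states are always integers in the range 0..N: the roll-over is then ordinary modular arithmetic, so a single % replaces the comparison altogether. Because + and % are applied element-wise to NumPy arrays, the same expression should also work on AR1 from the question. The name `spin_mod` is just for illustration:

```python
def spin_mod(n, N):
    """Cyclic increment: n + 1, wrapping back to 0 once the highest state N is passed."""
    # Only valid when 0 <= n <= N; there are N + 1 states in total.
    return (n + 1) % (N + 1)

states = [spin_mod(n, 5) for n in range(6)]
print(states)  # [1, 2, 3, 4, 5, 0]
```

Whether this is faster than the boolean-mask version would have to be measured on the actual arrays.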
I am posting some feedback as an answer so as not to clutter the question. I ran timing tests on the various functions, and it turns out that assigning through a boolean mask is the fastest variant in this case (hpaulj's answer). np.where was 1.4 times slower and np.vectorize(Spin2) was 15 times slower. Out of curiosity I also wanted to test this with explicit loops, so I wrote this algorithm for testing:
AR1 = numpy.zeros((rows, cols), dtype = numpy.uint32)
while d <= 100:
    Buf = numpy.zeros_like(AR1)
    r = 0
    c = 0
    while (r < rows) :
        while (c < cols) :
            temp = AR1[r, c] + 1
            if temp > 5 :
                Buf[r, c] = 1
            else : Buf[r, c] = temp
            c += 1
        r += 1
        c = 0
    AR1 = Buf
    d += 1
I am not sure, but this seems to be a very straightforward implementation of all the functions mentioned above. Yet it is very slow, almost 300 times slower. I have read similar questions on SO, but I still don't understand why this is so and what exactly causes the slowdown. Here I intentionally used a buffer to avoid reading and writing the same elements, and I do no memory clean-up. What could be simpler? I am confused. I don't want to open a new question, since this has been asked a few times already, so perhaps someone can comment or post good links that clarify it.
I'm totally stuck and have no idea how to go about solving this. Let's say I've an array
arr = [1, 4, 5, 10]
and a number
n = 8
I need the shortest sequence of elements from arr whose sum equals n. For example, the following sequences of elements from arr sum to n:
c1 = 5,1,1,1
c2 = 4,4
c3= 1,1,1,1,1,1,1,1
So in the above case, the answer is c2, because it is the shortest sequence from arr that sums to n.
I'm not sure what the simplest way of finding a solution is. Any ideas or help would be really appreciated.
Fixed the array
The array will have positive values only.
I'm not sure how the subset-sum problem solves this, probably due to my own ignorance. Does the subset-sum algorithm always give the shortest sequence that equals the sum? For example, would it identify c2 as the answer in the scenario above?
As has been pointed out before, this is the minimum coin change problem, typically solved with dynamic programming. Here's a Python implementation with time complexity O(nC) and space complexity O(C), where n is the number of coins and C the required amount of money:
def min_change(V, C):
    table, solution = min_change_table(V, C)
    num_coins, coins = table[-1], []
    if num_coins == float('inf'):
        return []
    while C > 0:
        # record the coin chosen for amount C, then move to the remaining amount
        coins.append(V[solution[C]])
        C -= V[solution[C]]
    return coins
def min_change_table(V, C):
    m, n = C+1, len(V)
    table, solution = [0] * m, [0] * m
    for i in xrange(1, m):
        minNum, minIdx = float('inf'), -1
        for j in xrange(n):
            if V[j] <= i and 1 + table[i - V[j]] < minNum:
                minNum = 1 + table[i - V[j]]
                minIdx = j
        table[i] = minNum
        solution[i] = minIdx
    return (table, solution)
In the above functions V is the list of possible coins and C the required amount of money. Now when you call the min_change function the output is as expected:
min_change([1,4,5,10], 8)
> [4, 4]
For the benefit of people who find this question in future -
As Oscar Lopez and Priyank Bhatnagar have pointed out, this is the coin change (change-giving, change-making) problem.
In general, the dynamic programming solution they have proposed is the optimal solution - both in terms of (provably!) always producing the required sum using the fewest items, and in terms of execution speed. If your basis numbers are arbitrary, then use the dynamic programming solution.
If your basis numbers are "nice", however, a simpler greedy algorithm will do.
For example, the Australian currency system uses denominations of $100, $50, $20, $10, $5, $2, $1, $0.50, $0.20, $0.10, $0.05. Optimal change can be given for any amount by repeatedly giving the largest unit of change possible until the remaining amount is zero (or less than five cents.)
Here's an instructive implementation of the greedy algorithm, illustrating the concept.
def greedy_give_change (denominations, amount):
    # Sort from largest to smallest
    denominations = sorted(denominations, reverse=True)
    # number of each note/coin given
    change_given = list()
    for d in denominations:
        # >= so that an amount exactly equal to a denomination is paid out
        while amount >= d:
            change_given.append(d)
            amount -= d
    return change_given
australian_coins = [100, 50, 20, 10, 5, 2, 1, 0.50, 0.20, 0.10, 0.05]
change = greedy_give_change(australian_coins, 313.37)
print (change) # [100, 100, 100, 10, 2, 1, 0.2, 0.1, 0.05]
print (sum(change)) # 313.35
For the specific example in the original post (denominations = [1, 4, 5, 10] and amount = 8) the greedy solution is not optimal - it will give [5, 1, 1, 1]. But the greedy solution is much faster and simpler than the dynamic programming solution, so if you can use it, you should!
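To see the failure case concretely, here is a small sketch of the same greedy idea using integer amounts (cents for the currency example), which avoids floating-point rounding. The function name `greedy_change` is illustrative:

```python
def greedy_change(denominations, amount):
    """Repeatedly give the largest denomination that still fits into the remaining amount."""
    change = []
    for d in sorted(denominations, reverse=True):
        while amount >= d:
            change.append(d)
            amount -= d
    return change

# Denominations from the original post: greedy needs four items, although [4, 4] needs only two.
post_result = greedy_change([1, 4, 5, 10], 8)
print(post_result)  # [5, 1, 1, 1]

# Australian denominations expressed in cents, paying out 313.37 as 31337 cents.
cents = [10000, 5000, 2000, 1000, 500, 200, 100, 50, 20, 10, 5]
aus_result = greedy_change(cents, 31337)
print(sum(aus_result))  # 31335, the last 2 cents cannot be paid
```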
This problem is known as the minimum coin change problem.
You can solve it by using dynamic programming.
Here is the pseudocode:
Set MinCoin[i] equal to Infinity for all of i
MinCoin[0] = 0
For i = 1 to N // The number N
    For j = 0 to M - 1 // M denominations given
        // Number i is broken into i-Value[j] for which we already know the answer
        // And we update if it gives us lesser value than previous known.
        If (Value[j] <= i and MinCoin[i-Value[j]]+1 < MinCoin[i])
            MinCoin[i] = MinCoin[i-Value[j]]+1
Output MinCoin[N]
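The pseudocode translates almost line by line into Python. This is a sketch rather than a tuned implementation; the function name `min_coin_count` and the choice to return None for an unreachable amount are assumptions made here:

```python
def min_coin_count(values, n):
    """Fewest items from values (with repetition) that sum to n, or None if impossible."""
    INF = float("inf")
    min_coin = [INF] * (n + 1)
    min_coin[0] = 0
    for i in range(1, n + 1):
        for v in values:
            # i is split into v plus the already solved sub-amount i - v
            if v <= i and min_coin[i - v] + 1 < min_coin[i]:
                min_coin[i] = min_coin[i - v] + 1
    return None if min_coin[n] == INF else min_coin[n]

print(min_coin_count([1, 4, 5, 10], 8))  # 2
```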
This is a variant of the subset-sum problem. In your problem, you can pick an item several times. You can still use a similar idea to solve it with dynamic programming. The basic idea is to design a function F(k, j), such that F(k, j) = 1 means that there is a sequence from arr whose sum is j and whose length is k.
Formally, the base case is that F(1, j) = 1 if there exists an i such that arr[i] = j. For the inductive case, F(k, j) = 1 if there exists an i such that arr[i] = m and F(k-1, j-m) = 1.
The smallest k with F(k, n) = 1 is the length of the shortest sequence you want.
By using the dynamic programming technique, you can compute function F without using recursion.
By tracking additional information for every F(k, j), you also can reconstruct the shortest sequence.
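One way to realise F(k, j) without recursion is to keep, for each length k, the set of sums j with F(k, j) = 1 and to grow it one element at a time. This is a rough sketch assuming arr holds positive integers; the function name is made up, and it returns only the length, not the sequence itself:

```python
def shortest_sequence_length(arr, n):
    """Smallest k such that some k elements of arr (repetition allowed) sum to n, else None."""
    if n == 0:
        return 0
    layer = {0}  # sums reachable with exactly 0 elements
    for k in range(1, n + 1):  # positive integers: no sequence longer than n can sum to n
        # sums j with F(k, j) = 1, built from the sums with F(k-1, j-m) = 1
        layer = {s + m for s in layer for m in arr if s + m <= n}
        if n in layer:
            return k
        if not layer:
            break
    return None

print(shortest_sequence_length([1, 4, 5, 10], 8))  # 2
```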
What you're trying to solve is a variant of the coin change problem. Here you're looking for the smallest amount of change, that is, the minimum number of coins that sum up to a given amount.
Consider a simple case where your array is
c = [1, 2, 3]
Suppose you write 5 as a combination of elements from c and want to know the shortest such combination. Here c is the set of coin values and 5 is the amount for which you want to get change.
Let's write down all possible combinations:
1 + 1 + 1 + 1 + 1
1 + 1 + 1 + 2
1 + 2 + 2
1 + 1 + 3
2 + 3
Note that two combinations are the same up to re-ordering, so for instance 2 + 3 = 3 + 2.
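The list above can be reproduced by brute force. This is only an illustrative sketch for small inputs, assuming positive integer values; the helper name `combos_summing_to` is made up:

```python
from itertools import combinations_with_replacement

def combos_summing_to(values, target):
    """All multisets of values that sum to target, shortest first, ignoring order."""
    found = []
    # With positive values, no combination can be longer than target // min(values).
    for length in range(1, target // min(values) + 1):
        for combo in combinations_with_replacement(sorted(values), length):
            if sum(combo) == target:
                found.append(combo)
    return found

all_combos = combos_summing_to([1, 2, 3], 5)
print(len(all_combos))  # 5
print(all_combos[0])    # (2, 3), the shortest one
```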
Here there is an awesome result that's not obvious at first sight but it's very easy to prove. If you have any sequence of coins/values that is a sequence of minimum length that sums up to a given amount, no matter how you split this sequence the two parts will also be sequences of minimum length for the respective amounts.
For instance if c[3] + c[1] + c[2] + c[7] + c[2] + c[3] add up to S and we know that 6 is the minimal length of any sequence of elements from c that add up to S then if you split
S = c[3] + c[1] + c[2] + c[7] | + c[2] + c[3]
you have that 4 is the minimal length for sequences that add up to c[3] + c[1] + c[2] + c[7] and 2 the minimal length for sequences that add up to c[2] + c[3].
S = c[3] + c[1] + c[2] + c[7] | + c[2] + c[3]
= S_left + S_right
How to prove this? By contradiction, assume that the length of S_left is not optimal, that is there's a shorter sequence that adds up to S_left. But then we could write S as a sum of this shorter sequence and S_right, thus contradicting the fact that the length of S is minimal. □
Since this is true no matter how you split the sequence, you can use this result to build a recursive algorithm that follows the principles of dynamic programming paradigm (solving smaller problems while possibly skipping computations that won't be used, memoization or keeping track of computed values, and finally combining the results).
Because of this property of maintaining optimality for subproblems, the coins problem is also said to "exhibit optimal substructure".
OK, so in the small example above this is how we would go about solving the problem with a dynamic programming approach: assume we want to find the shortest sequence of elements from c = [1, 2, 3] for writing the sum 5. We solve the subproblems obtained by subtracting one coin: 5 - 1, 5 - 2, and 5 - 3, we take the smallest solution of these subproblems and add 1 (the missing coin).
So we can write something like
shortest_seq_length([1, 2, 3], 5) =
min( shortest_seq_length([1, 2, 3], 5-1),
shortest_seq_length([1, 2, 3], 5-2),
shortest_seq_length([1, 2, 3], 5-3)
) + 1
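Read literally, that formula is a recursion, and caching the results of the subproblems keeps it from recomputing the same sums. This is a top-down sketch under the assumption that the values are positive integers; the name `shortest_seq_length_rec` is made up, and for a large S Python's recursion limit would get in the way:

```python
from functools import lru_cache

def shortest_seq_length_rec(c, S):
    """Minimal number of elements of c (repetition allowed) summing to S, or None."""
    coins = tuple(c)

    @lru_cache(maxsize=None)
    def solve(s):
        if s == 0:
            return 0
        best = None
        for x in coins:
            if x <= s:
                sub = solve(s - x)  # subproblem obtained by removing one coin x
                if sub is not None and (best is None or sub + 1 < best):
                    best = sub + 1
        return best

    return solve(S)

print(shortest_seq_length_rec([1, 2, 3], 5))  # 2
```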
It is convenient to write the algorithm bottom-up, starting from smaller values of the sums that can be saved and used to form bigger sums. We just solve the problem for all possible values starting from 1 and going up to the desired sum.
Here's the code in Python:
def shortest_seq_length(c, S):
    res = {0: 0} # res contains computed results res[i] = shortest_seq_length(c, i)
    for i in range(1, S+1):
        res[i] = min([res[i-x] for x in c if x<=i]) + 1
    return res[S]
Now this works except for the cases when we cannot fill the memoization structure for all values of i. This is the case when we don't have the value 1 in c, so for instance we cannot form the sum 1 if c = [2, 3], and with the above function we get
shortest_seq_length([2, 3], 5)
# ValueError: min() arg is an empty sequence
So to take care of this issue one could for instance use a try/except:
def shortest_seq_length(c, S):
    res = {0: 0} # res contains results for each sum res[i] = shortest_seq_length(c, i)
    for i in range(1, S+1):
        try:
            res[i] = min([res[i-x] for x in c if x<=i and res[i-x] is not None]) +1
        except ValueError:
            res[i] = None # takes care of error when [res[i-x] for x in c if x<=i] is empty
    return res[S]
Or without try/except:
def shortest_seq_length(c, S):
    res = {0: 0} # res[i] = shortest_seq_length(c, i)
    for i in range(1, S+1):
        prev = [res[i-x] for x in c if x<=i and res[i-x] is not None]
        if len(prev)>0:
            res[i] = min(prev) +1
        else:
            res[i] = None # no usable subproblem, so the sum i cannot be formed
    return res[S]
Try it out:
print(shortest_seq_length([2, 3], 5))
# 2
print(shortest_seq_length([1, 5, 10, 25], 37))
# 4
print(shortest_seq_length([1, 5, 10], 30))
# 3
print(shortest_seq_length([1, 5, 10], 25))
# 3
print(shortest_seq_length([1, 5, 10], 29))
# 7
print(shortest_seq_length([5, 10], 9))
# None
To show not only the length but also the combinations of coins of minimal length:
from collections import defaultdict
def shortest_seq_length(coins, target):
    combos = defaultdict(list)
    combos[0] = [[]]
    for i in range(1, target+1):
        for x in coins:
            if x<=i and combos[i-x] is not None:
                for p in combos[i-x]:
                    comb = sorted(p + [x])
                    if comb not in combos[i]:
                        combos[i].append(comb)
        if len(combos[i])>0:
            # keep only the combinations of minimal length
            m = (min(map(len,combos[i])))
            combos[i] = [combo for combo in combos[i] if len(combo) == m]
        else:
            combos[i] = None
    return combos[target]
total = 9
coin_sizes = [10, 8, 5, 4, 1]
shortest_seq_length(coin_sizes, total)
# [[1, 8], [4, 5]]
To show all sequences, remove the minimum computation:
from collections import defaultdict
def all_seq_length(coins, target):
    combos = defaultdict(list)
    combos[0] = [[]]
    for i in range(1, target+1):
        for x in coins:
            if x<=i and combos[i-x] is not None:
                for p in combos[i-x]:
                    comb = sorted(p + [x])
                    if comb not in combos[i]:
                        combos[i].append(comb)
        if len(combos[i])==0:
            combos[i] = None
    return combos[target]
total = 9
coin_sizes = [10, 5, 4, 8, 1]
all_seq_length(coin_sizes, total)
# [[4, 5],
# [1, 1, 1, 1, 5],
# [1, 4, 4],
# [1, 1, 1, 1, 1, 4],
# [1, 8],
# [1, 1, 1, 1, 1, 1, 1, 1, 1]]
One small improvement to the algorithm is to skip the step of computing the minimum when the sum is equal to one of the values/coins, but this can be done better if we write a loop to compute the minimum. This, however, doesn't improve the overall complexity, which is O(mS) where m = len(c).