Search in sorted array - python

The task is quite simple: find values in a sorted array, which may contain duplicates, and print their indices to standard output on a single line.
First line of the input contains the numbers N and k, separated by a space.
N is the count of numbers and k is the number of queries to perform.
The next line or lines contain N numbers in non-decreasing order (data) and k numbers (queries) to search for in the input sequence.
Numbers are separated by spaces and ends of lines.
Read the data into memory and for each request find its first position i in the sequence (i.e., the smallest value i for which data[i]=x). Positions are indexed from 1 to N.
Write all these indices to standard output on a single line, separated by spaces. If the requested number is not present in the sequence, output 0 instead of its position. If the number is present more than once, output the index of its first occurrence. The size of the sequence (N) and the number of requests (k) are at most 1 000 000.
def custom_search(arr, target) -> int:
    n = len(arr) + 1
    for i in range(1, n):
        if (arr[i-1] == target):
            return(i)
    return(0)

def give_numbers():
    inputs = list(map(int, input().split()))
    if len(inputs) != 2:
        return([], None, None)
    n, m = inputs
    if ((n < 1 or n > 1000000) or (m < 1 or m > 1000000)):
        return([], None, None)
    i = 2
    stuff = []
    while i >= 1:
        stuff.append(list(map(int, input().split())))
        i -= 1
    return(stuff, n, m)

inpt, n, m = give_numbers()
if len(inpt) != 0:
    N, k = inpt
    if n == len(N) and m == len(k):
        for i in k:
            print(custom_search(N, i), end=" ")
Inputs:
10 4
4 8 9 9 9 9 18 28 32 100
4 9 28 32
Outputs:
1 3 8 9
Is there a better way than an O(n) scan to search the sorted array and speed this up?

The algorithm you are looking for is called binary search, and its time complexity is O(log N) per query. Here is a Python function that takes 2 parameters:
The value you are looking for
The sorted array
and it returns the 1-based position of the first element equal to value, or 0 if the value is not present (as the task requires)
def find_first_appearence(value, array):
    position = -1  # index of the leftmost element >= value, if any
    left = 0
    right = len(array) - 1
    while left <= right:
        middle = left + (right - left) // 2  # integer division, no floats
        if array[middle] >= value:
            right = middle - 1
            position = middle
        else:
            left = middle + 1
    if position == -1 or array[position] != value:
        return 0
    return position + 1  # positions are indexed from 1
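For comparison, here is a minimal sketch of the same lookup built on the standard library's `bisect.bisect_left`, which returns the leftmost insertion point in a sorted list. The helper name `first_position` is an assumption made for illustration and is not part of the original answer.

```python
from bisect import bisect_left

def first_position(data, x):
    """Return the 1-based index of the first x in sorted data, or 0 if absent."""
    i = bisect_left(data, x)  # leftmost index at which x could be inserted
    if i < len(data) and data[i] == x:
        return i + 1
    return 0

data = [4, 8, 9, 9, 9, 9, 18, 28, 32, 100]
queries = [4, 9, 28, 32]
print(" ".join(str(first_position(data, q)) for q in queries))  # 1 3 8 9
```

Each query costs O(log N), so a million queries against a million values stay well within reach.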

Have you considered implementing some sort of binary search?
Divide the array in half; if the searched value is greater than the middle value, take the second half and keep going. In pseudocode:
found = false
while (!found && array.length > 1) {
    i = floor(array.length / 2)
    if (array[i] == searchedValue) return true
    else if (array[i] > searchedValue) array = array.slice(0, i)
    else array = array.slice(i + 1, array.length)
}
if (array.length > 0 && array[0] == searchedValue) found = true
return found
This reduces the number of comparisons to O(log(n)). Note that it only reports whether the value is present; the task also needs the index of the first occurrence.

You can use a modified binary search that finds the leftmost occurrence of the given target in the given array. It returns the 1-based index, or 0 if the target is absent:
int binsearchLeftmost(int l, int r, int target, const std::vector<int>& array) {
    int res = 0;
    while (l <= r) {
        int m = l + (r - l) / 2;
        if (array[m] > target) {
            r = m - 1;
        }
        else if (array[m] < target) {
            l = m + 1;
        }
        else {
            res = m + 1;   // remember the match, then keep looking to the left
            r = m - 1;
        }
    }
    return res;
}

Related

Number of occurrences of digit in numbers from 0 to n

Given a number n, count the total number of occurrences of the digits 0, 2 and 4 in all numbers from 0 to n, inclusive.
Example1:
n = 10
output: 4
Example2:
n = 22
output: 11
My Code:
n = 22
def count_digit(n):
    count = 0
    for i in range(n+1):
        if '2' in str(i):
            count += 1
        if '0' in str(i):
            count += 1
        if '4' in str(i):
            count += 1
    return count
count_digit(n)
Code Output: 10
Desired Output: 11
Constraints: 1 <= N <= 10^5
Note: The solution should not cause outOfMemoryException or Time Limit Exceeded for large numbers.
You can increment your count like this:
def count_digit(n):
    count = 0
    for i in range(n + 1):
        if '2' in str(i):
            count += str(i).count('2')
        if '0' in str(i):
            count += str(i).count('0')
        if '4' in str(i):
            count += str(i).count('4')
    return count
In that way, edge cases like 22, 44, and so on are covered!
There are numbers in which a desired digit is repeated, such as 22 or 44, so for those you must add 2 instead of 1.
>>>
>>> string = ','.join(map(str,range(23)))
>>>
>>> string
'0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22'
>>>
>>> string.count('0') + string.count('2') + string.count('4')
11
>>>
n = 22
def count_digit(n):
    count = 0
    for i in map(str,range(n+1)):
        count+=i.count('0')
        count+=i.count('2')
        count+=i.count('4')  # the task asks for the digits 0, 2 and 4
    return count
print(count_digit(n))
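That solution is fast.
It can be developed to be faster by counting a whole block of ten numbers at a time:
def count_digit(n):
    i=0
    count=0
    s='024'
    while i + 9 <= n:  # only complete blocks i .. i+9 may be counted in bulk
        j = 0
        for v in str(i):
            if v in s:
                j+=1
        count+=3*j + (7*(j-1))
        i+=10
    for i in range(i,n+1,1):  # remaining numbers, one by one
        for v in str(i):
            if v in s:
                count+=1
    return count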
TL;DR: If you do it right, you can compute the count about a thousand times faster for n close to 10**5, and since the better algorithm uses time proportional to the number of digits in n, it can easily handle even values of n too large for a 64-bit integer.
As is often the case with puzzles like this ("in the numbers from x to y, how many...?"), the key is to find a way to compute an aggregate count, ideally in O(1), for a large range. For combinatorics over the string representation of numbers, a convenient range is often something like the set of all numbers whose string representation is a given size, possibly with a specific prefix. In other words, ranges of the form [prefix*10⁴, prefix*10⁴+9999], where the number of 0s in the lower limit is the same as the number of 9s in the upper limit and as the exponent of 10 in the multiplier. (It's often actually more convenient to use half-open ranges, where the lower limit is inclusive and the upper limit is exclusive, so the above example would be [prefix*10⁴, (prefix+1)*10⁴).)
Also note that if the problem is to compute a count for [x, y), and you only know how to compute [0, y), then you just do two computations, because
count [x, y) == count [0, y) - count [0, x)
That identity is one of the simplifications which half-open intervals allow.
That would work nicely with this problem, because it's clear how many times a digit d occurs in the set of all k-digit suffixes for a given prefix. (In the 10ᵏ suffixes, every digit has the same frequency as every other digit; there are a total of k×10ᵏ digits in those 10ᵏ suffixes, and since all ten digits have the same count, that count must be k×10ᵏ⁻¹.) Then you just have to add the digit count of the prefixes, but the prefix appears exactly 10ᵏ times, and each one contributes the same count.
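The suffix count stated above (each digit appears k times 10 to the power k-1 across all k-digit suffixes) can be checked by brute force. This is only an illustrative sketch; the function name `count_in_suffixes` is hypothetical.

```python
from itertools import product

def count_in_suffixes(k, digit):
    """Count occurrences of `digit` across all 10**k zero-padded k-digit suffixes."""
    # product(...) yields every suffix as a tuple of k digit characters.
    return sum(suffix.count(digit) for suffix in product("0123456789", repeat=k))

print(count_in_suffixes(3, "2"), 3 * 10 ** 2)  # 300 300
```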
So you could take a number like 72483, and decompose it into the following ranges; the number of ranges roughly corresponds to the sum of the digits in 72483, plus a few ranges containing fewer digits.
[0, 9]
[10, 99]
[100, 999]
[1000, 9999]
[10000, 19999]
[20000, 29999]
[30000, 39999]
[40000, 49999]
[50000, 59999]
[60000, 69999]
[70000, 70999]
[71000, 71999]
[72000, 72099]
[72100, 72199]
[72200, 72299]
[72300, 72399]
[72400, 72409]
[72410, 72419]
[72420, 72429]
[72430, 72439]
[72440, 72449]
[72450, 72459]
[72460, 72469]
[72470, 72479]
[72480, 72480]
[72481, 72481]
[72482, 72482]
[72483, 72483]
However, in the following code, I used a slightly different algorithm, which turned out to be a bit shorter. It considers the rectangle in which all the numbers from 0 to n are written out, including leading zeros, and then computes counts for each column. A column of digits in a rectangle of sequential integers follows a simple recurring pattern; the frequency can easily be computed by starting with the completely repetitive part of the column. After the complete repetitions, the remaining digits are in order, with each one except the last one appearing the same number of times. It's probably easiest to understand that by drawing out a small example on a pad of paper, but the following code should also be reasonably clear (I hope).
The one problem with that is that it counts leading zeros which don't actually exist, so it needs to be corrected by subtracting the leading zero count. Fortunately, that count is extremely easy to compute. If you consider a range ending with a five-digit number (which itself cannot start with a zero, since it wouldn't really be a five-digit number if it started with zero), then you can see that the range includes:
10000 numbers start with a zero
1000 more numbers which have a second leading zero
100 more numbers which have a third leading zero
10 more numbers which have a fourth leading zero
No numbers have five leading zeros, because we write 0 as such, not as an empty string.
That adds up to 11110, and it's easy to see how that generalises. That value can be computed without a loop, as (10⁵ − 1) / 9 − 1. That correction is done at the end of the following function:
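As a sanity check on that closed form, the sketch below counts the padding zeros directly and compares the result with the formula. Both function names are hypothetical, chosen only for this illustration.

```python
def leading_zero_count(k):
    """Count padding zeros when 0 .. 10**k - 1 are written with exactly k digits."""
    # A number with len(str(i)) real digits needs k - len(str(i)) padding zeros;
    # 0 itself keeps one real digit, so it gets k - 1 padding zeros.
    return sum(k - len(str(i)) for i in range(10 ** k))

def leading_zero_formula(k):
    """Closed form from the text: (10**k - 1) / 9 - 1."""
    return (10 ** k - 1) // 9 - 1

print(leading_zero_count(5), leading_zero_formula(5))  # 11110 11110
```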
def countd(m, s=(0,2,4)):
    if m < 0: return 0
    m += 1
    rv = 0
    rest = 0
    pos = 1
    while True:
        digit = m % 10
        m //= 10
        rv += m * pos * len(s)
        for d in s:
            if digit > d:
                rv += pos
            elif digit == d:
                rv += rest
        if m == 0:
            break
        rest += digit * pos
        pos *= 10
    if 0 in s:
        rv -= (10 * pos - 1) // 9 - 1
    return rv
That code could almost certainly be tightened up; I was just trying to get the algorithm down. But, as it is, its execution time is measured in microseconds, not milliseconds, even for much larger values of n.
Here's an update of Kelly's benchmark; I removed the other solutions because they were taking too long for the last value of n:
Try it online!
Another brute force, seems faster:
def count_digit(n):
    s = str(list(range(n+1)))
    return sum(map(s.count, '024'))
Benchmark with n = 10**5:
result   time     solution
115474   244 ms   original
138895    51 ms   Kelly
138895   225 ms   islam_abdelmoumen
138895   356 ms   CodingDaveS
Code (Try it online!):
from timeit import default_timer as time

def original(n):
    count = 0
    for i in range(n+1):
        if '2' in str(i):
            count += 1
        if '0' in str(i):
            count += 1
        if '4' in str(i):
            count += 1
    return count

def Kelly(n):
    s = str(list(range(n+1)))
    return sum(map(s.count, '024'))

def islam_abdelmoumen(n):
    count = 0
    for i in map(str,range(n+1)):
        count+=i.count('0')
        count+=i.count('2')
        count+=i.count('4')  # the task asks for the digits 0, 2 and 4
    return count

def CodingDaveS(n):
    count = 0
    for i in range(n + 1):
        if '2' in str(i):
            count += str(i).count('2')
        if '0' in str(i):
            count += str(i).count('0')
        if '4' in str(i):
            count += str(i).count('4')
    return count

funcs = original, Kelly, islam_abdelmoumen, CodingDaveS
print('result time solution')
print()
for _ in range(3):
    for f in funcs:
        t = time()
        print(f(10**5), ' %3d ms ' % ((time()-t)*1e3), f.__name__)
    print()
I ended up with an answer similar to rici's, though perhaps with a slightly different phrasing of the numeric formulation. The number of instances of each digit in each position ("counts for each column," as rici described) can be formulated in two parts. The first part is p * floor(n / (10 * p)), where p is 10 raised to the power of the position. For example, in position 0 (the rightmost), there is one 1 for each ten numbers. Counting the 0's, however, requires an additional check regarding the population of the current and next position.
To the first part we still need to add the counts attributed to the remainder of the division. For example, for n = 6, floor(6 / 10) = 0 but we do have one count of 2 and one of 4. We add p if the digit in that position in n is greater than the digit we're counting; or, if the digit is the same, we add the value on the right of the digit plus 1 (for example, for n = 45, we want to count the 6 instances where 4 appears in position 1: 40, 41, 42, 43, 44, 45).
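The two parts described above can be sketched in Python as follows. This is an illustrative reformulation under the assumption n >= 0, not the answer's own code; the function name and the `high`/`cur`/`low` variables are hypothetical, and the zero digit is handled by skipping positions where it could only be a leading zero.

```python
def count_digits_upto(n, digits=(0, 2, 4)):
    """Count occurrences of the given digits in the decimal forms of 0..n (n >= 0)."""
    total = 1 if 0 in digits else 0    # the number 0 itself is written as "0"
    p = 1                              # 10 raised to the current position
    while p <= n:
        high = n // (10 * p)           # value to the left of this position
        cur = (n // p) % 10            # digit of n at this position
        low = n % p                    # value to the right of this position
        for d in digits:
            if d == 0:
                if high == 0:
                    continue           # a zero here could only be a leading zero
                total += (high - 1) * p  # complete runs for prefixes 1 .. high-1
            else:
                total += high * p        # complete runs for prefixes 0 .. high-1
            # Remainder part: the incomplete run for the prefix equal to `high`.
            if cur > d:
                total += p
            elif cur == d:
                total += low + 1
        p *= 10
    return total

print(count_digits_upto(10), count_digits_upto(22))  # 4 11
```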
JavaScript code, comparing instantly with rici's for all numbers from 1 to 600,000. (If I'm not mistaken, the port of rici's code below returns 0 for n = 0, when the answer should be a count of 1.)
function countd(m, s = [0,2,4]) {
  if (m <= 0)
    return 0
  m += 1
  let rv = 0
  let rest = 0
  let pos = 1
  while (true) {
    const digit = m % 10
    m = Math.floor(m / 10)
    rv += m * pos * s.length
    for (const d of s) {
      if (digit > d)
        rv += pos
      else if (digit == d)
        rv += rest
    }
    if (m == 0) {
      break
    }
    rest += digit * pos
    pos *= 10
  }
  if (s.includes(0)) {
    rv -= Math.floor((10 * pos - 1) / 9) - 1
  }
  return rv
}

function f(n, ds = [0, 2, 4]) {
  // Value on the right of position
  let curr = 0;
  let m = n;
  // 10 to the power of position
  let p = 1;
  let result = 1;
  while (m) {
    const digit = m % 10;
    m = Math.floor(m / 10);
    for (const d of ds) {
      if (d != 0 || n >= 11 * p) {
        result += p * Math.floor((n - (d ? 0 : 10 * p)) / (10 * p));
      }
      if (digit > d && (d != 0 || m > 0)) {
        result += p;
      } else if (digit == d) {
        result += curr + 1;
      }
    }
    curr += p * digit;
    p *= 10;
  }
  return result;
}

for (let n = 1; n <= 600000; n += 1) {
  const _f = f(n);
  const _countd = countd(n);
  if (_f != _countd) {
    console.log(`n: ${ n }`);
    console.log(_f, _countd);
    break;
  }
}
console.log("Done.");
Using single branch conditional
def count_digit(n):
    s = '024'
    out = 0
    for integer in map(str, range(n+1)): # integer as string
        for digit in integer:
            if digit in s:
                out += 1
    return out
or more compactly
def count_digit(n):
    s = '024'
    return sum(1 for i in map(str, range(n+1)) for d in i if d in s)

Algorithm not passing tests even when I get correct results on my end

The question is mostly about base conversion. Here's the question.
1) Start with a random minion ID n, which is a nonnegative integer of length k in base b
2) Define x and y as integers of length k. x has the digits of n in descending order, and y has the digits of n in ascending order
3) Define z = x - y. Add leading zeros to z to maintain length k if necessary
4) Assign n = z to get the next minion ID, and go back to step 2
For example, given minion ID n = 1211, k = 4, b = 10, then x = 2111, y = 1112 and z = 2111 - 1112 = 0999. Then the next minion ID will be n = 0999 and the algorithm iterates again: x = 9990, y = 0999 and z = 9990 - 0999 = 8991, and so on.
Depending on the values of n, k (derived from n), and b, at some point the algorithm reaches a cycle, such as by reaching a constant value. For example, starting with n = 210022, k = 6, b = 3, the algorithm will reach the cycle of values [210111, 122221, 102212] and it will stay in this cycle no matter how many times it continues iterating. Starting with n = 1211, the routine will reach the integer 6174, and since 7641 - 1467 is 6174, it will stay as that value no matter how many times it iterates.
Given a minion ID as a string n representing a nonnegative integer of length k in base b, where 2 <= k <= 9 and 2 <= b <= 10, write a function solution(n, b) which returns the length of the ending cycle of the algorithm above starting with n. For instance, in the example above, solution(210022, 3) would return 3, since iterating on 102212 would return to 210111 when done in base 3. If the algorithm reaches a constant, such as 0, then the length is 1.
Here's my code
def solution(n, b): #n(num): str, b(base): int
    #Your code here
    num = n
    k = len(n)
    resList = []
    resIdx = 0
    loopFlag = True
    while loopFlag:
        numX = "".join(x for x in sorted(num, reverse=True))
        numY = "".join(y for y in sorted(num))
        xBaseTen, yBaseTen = getBaseTen(numX, b), getBaseTen(numY, b)
        xMinusY = xBaseTen - yBaseTen
        num = getBaseB(xMinusY, b, k)
        resListLen = len(resList)
        for i in range(resListLen - 1, -1, -1):
            if resList[i] == num:
                loopFlag = False
                resIdx = resListLen - i
                break
        if loopFlag:
            resList.append(num)
        if num == 0:
            resIdx = 1
            break
    return resIdx

def getBaseTen(n, b): #n(number): str, b(base): int -> int
    nBaseTenRes = 0
    n = str(int(n)) # Shave prepending zeroes
    length = len(n) - 1
    for i in range(length + 1):
        nBaseTenRes += int(n[i]) * pow(b, length - i)
    return nBaseTenRes

def getBaseB(n, b, k): #(number): int, b(base): int, k:(len): int -> str
    res = ""
    r = 0 # Remainder
    nCopy = n
    while nCopy > 0:
        r = nCopy % b
        nCopy = floor(nCopy / b)
        res += str(r)
    res = res[::-1]
    resPrependZeroesLen = k - len(res)
    if resPrependZeroesLen > 0:
        for i in range(resPrependZeroesLen):
            res = "0" + res
    return res
The two tests that are available to me, ('1211', 10) and ('210022', 3), are not passing. But I get the right answers for them (1, 3).
Why am I failing? Is the algo wrong? Hitting the time limit?
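The problem arose from differences between the execution environments.
When I executed this on my machine with Python 3.7
r = nCopy % b
gave me an int.
Foobar runs Python 2.7, where floor() returns a float, so after the first pass through the loop nCopy is a float, the remainder is a float as well, and str(r) yields digits such as '1.0'. Using integer division (nCopy // b) instead of floor(nCopy / b) behaves the same in both versions.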
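A minimal sketch of a version-independent variant, assuming only integer arithmetic (`divmod`) and the built-in `int(text, base)` for parsing. The names `to_base` and `cycle_length` are hypothetical and this is not the asker's submitted code.

```python
def to_base(value, b, k):
    """Write a nonnegative int in base b, left-padded with zeros to length k."""
    digits = []
    while value > 0:
        value, r = divmod(value, b)  # integer quotient and remainder, never floats
        digits.append(str(r))
    return "".join(reversed(digits)).rjust(k, "0")

def cycle_length(n, b):
    """Length of the cycle eventually reached from the ID string n in base b."""
    k = len(n)
    seen = {}  # ID -> step at which it first appeared
    step = 0
    while n not in seen:
        seen[n] = step
        step += 1
        x = int("".join(sorted(n, reverse=True)), b)  # digits in descending order
        y = int("".join(sorted(n)), b)                # digits in ascending order
        n = to_base(x - y, b, k)
    return step - seen[n]

print(cycle_length("1211", 10), cycle_length("210022", 3))  # 1 3
```

Storing the step of first appearance in a dictionary also avoids rescanning the list of earlier IDs on every iteration.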

How to reorder digits of a number and insert digit 5 to get the maximum possible absolute value

Please advise how I can reorder digits of a number and add a digit 5 to the result so that its absolute value will be the highest.
For example, if the input is 578 the expected result is 8755. Otherwise, if the input is negative -483, the output is expected to be -8543.
I've managed to make it work on positive numbers only, however, I need to make it work for negative numbers as well:
def solution(N):
    a = [] # list of digits, e.g. int(123)
    while N != 0:
        v = N % 10 # last digit as div remainder, e.g.: 123 % 10 = 3
        N = int(N / 10) # remove last digit using integer division: 123 / 10 = 12.3; int(12.3) = 12
        a = [v] + a # concatenate two lists: newly created list with one element (v = 3) and list a
        # as a result the digits will be in natural order => [1,2,3]
    if len(a) == 0: # need to create list with one element [0] as the cycle before ignores 0
        a = [0]
    inserted = False
    for i in range(0, len(a)): # i = 0, 1, 2; len = 3
        if a[i] < 5:
            # a[from:to exclusive] e.g.: [1, 2, 3][0:2] => [1, 2]. index of 1 is 0, index of 2 is 1, index 2 is excluded
            a = a[0:i] + [5] + a[i:]
            inserted = True
            break
    if not inserted:
        a = a + [5]
    N = 0 # reconstruct number from digits, list of digits to int
    for i in range(0, len(a)):
        N = N * 10 + a[i] # 0 + 1; 1 * 10 + 2; 12 * 10 + 3 = 123
    return N
if __name__ == '__main__':
    print("Solution:", solution(0))
Here I made some major changes, using some built-in Python methods:
def solution(N):
    sign = False #to determine the sign of N (positive or negative )
    if N < 0:
        sign = True
        N= N * -1 # as N<0 we make it positive
    a = []
    while N != 0:
        v = N % 10
        N = int(N / 10)
        a = [v] + a
    a.append(5) # built-in method to add an element at the end of the list
    a.sort() # built-in method to sort the list (ascending order)
    a.reverse() # built-in method to reverse the order of the list (make it descending)
    N = 0
    for i in range(0, len(a)):
        N = N * 10 + a[i]
    if sign: # convert negative integers back to negative
        N = N * -1
    return N
Sample output :
for negative
solution(-2859)
-98552
positive
solution(9672)
97652
If you need to insert 5 and to make the output number the maximum number both for negative and positive numbers (and without the condition to not replace or transform the input set of digits), then this may be a solution:
def solution(N):
    negative = False
    if N < 0:
        negative = True
        N = N * -1 # as N<0 we make it positive
    a = []
    while N != 0:
        v = N % 10
        N = int(N / 10)
        a = [v] + a
    if len(a) == 0:
        a = [0]
    inserted = False
    for i in range(0, len(a)):
        if (not negative and a[i] < 5) or (negative and a[i] > 5):
            a = a[0:i] + [5] + a [i:]
            inserted = True
            break
    if not inserted:
        a = a + [5]
    N = 0
    for i in range(0, len(a)):
        N = N * 10 + a[i]
    if negative:
        N = N * -1
    return N
if __name__ == '__main__':
    # N is not defined at module level, so call with a sample value instead
    print("Solution:", solution(0))
Will the below do the trick:
x=-34278
no_to_insert=5
res=int(''.join(sorted(list(str(abs(x)))+[str(no_to_insert)], reverse=True)))
if x<0:
    res=-res
Output:
-875432
Java Solution
public int solution(int N) {
    int digit = 5;
    if (N == 0) return digit * 10;
    int neg = N/Math.abs(N);
    N = Math.abs(N);
    int n = N;
    int ctr = 0;
    while (n > 0){
        ctr++;
        n = n / 10;
    }
    int pos = 1;
    int maxVal = Integer.MIN_VALUE;
    for (int i=0;i<=ctr;i++){
        int newVal = ((N/pos) * (pos*10)) + (digit*pos) + (N%pos);
        if (newVal * neg > maxVal){
            maxVal = newVal*neg;
        }
        pos = pos * 10;
    }
    return maxVal;
}

Find all pairs of elements within an array that sum up to a given value

I am stuck on one of the HackerRank problems, with the following description:
You will be given an array of integers and a target value. Determine the number of pairs of array elements that have a difference equal to a target value.
For example, given an array of [1, 2, 3, 4] and a target value of 1, we have three values meeting the condition: (2,1), (3,2), (4,3). So function pairs should return value 3.
We have to implement pairs function with following parameters :-
k: an integer, the target difference
arr: an array of integers
Constraints :-
1> Each Integer in arr[i] will be unique and positive.
2> target k will also be positive.
My implementation below fails one of the 18 test cases with a wrong result. Can anyone please help me debug the issue?
def binSearch(target,arr):
    lower = 0
    upper = len(arr)-1
    while lower <= upper:
        mid = int((lower + upper)/2)
        if(arr[mid] == target):
            return 1
        elif(arr[mid] > target):
            upper = mid - 1
        elif(arr[mid] < target):
            lower = mid + 1
    return -1

def pairs(k, arr):
    arr.sort()
    count = 0
    for i in range(len(arr)):
        target = abs(arr[i] - k)
        if(arr[i] == target):
            pass
        elif(binSearch(target,arr) == 1):
            count += 1
    return count
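This should be an O(n) solution (where n is the size of arr). First, convert the array to a set. Then iterate through each value val in arr and check whether val + k is in the set, i.e. whether some other value exceeds val by exactly k. If so, increment the counter by one. Looking up abs(arr[i] - k), as the question's code does, can match values whose sum, rather than difference, is k: with arr = [1, 2] and k = 3 it finds 2 for the value 1, although 2 - 1 is not 3.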
def pairs(k, arr):
    counter = 0
    set_arr = set(arr)
    for val in arr:
        if val + k in set_arr:
            counter += 1
    return counter
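To make the difference concrete, the sketch below runs both lookups on a small input. `pairs_with_abs` is a hypothetical stand-in that mirrors the question's `abs(arr[i] - k)` lookup (using a set instead of the binary search, which does not change the result); `pairs_by_sum` repeats the set-based idea under a different name.

```python
def pairs_with_abs(k, arr):
    """Mirror of the question's logic: search for abs(value - k)."""
    present = set(arr)
    count = 0
    for value in arr:
        target = abs(value - k)
        if target != value and target in present:
            count += 1
    return count

def pairs_by_sum(k, arr):
    """Set-based count: value + k must be present."""
    present = set(arr)
    return sum(1 for value in arr if value + k in present)

# No two elements of [1, 2] differ by 3, yet the abs() lookup reports 2 pairs.
print(pairs_with_abs(3, [1, 2]), pairs_by_sum(3, [1, 2]))              # 2 0
print(pairs_with_abs(1, [1, 2, 3, 4]), pairs_by_sum(1, [1, 2, 3, 4]))  # 3 3
```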
In Java: for run time, sorting the array takes O(n log n), and each of the n binary searches takes O(log n), so the total is O(n log n + n log n) = O(n log n).
For space, it takes O(n).
// Pair is assumed to be a simple generic tuple class (for example javafx.util.Pair).
public static ArrayList<Pair<Integer, Integer>> pairThatHaveDiffrenceK(int[] arr , int arrLength , int k ){
    ArrayList<Pair<Integer, Integer>> arrayList = new ArrayList<>();
    Arrays.sort(arr);
    for (int i = 0; i < arrLength ; i++) {
        int temp = arr[i] + k;  // value that would complete a pair with arr[i]
        int testIfFound = indexFromBinarySearch(arr, temp);
        if (testIfFound != -1){
            arrayList.add(new Pair<>(arr[i],arr[testIfFound] ));
        }
    }
    return arrayList ;
}

public static int indexFromBinarySearch(int[] arr, int valueForTarget){
    int start =0 ;
    int end = arr.length-1;
    while (start <= end){
        int mid = (start+end)/2 ;
        if(arr[mid] == valueForTarget){
            return mid;
        }
        else if(valueForTarget > arr[mid])
            start =mid+1;
        else
            end = mid-1;
    }
    return -1 ;
}

Math: More control on Integer Partition Functions and Algorithm

Integer partitions are a very interesting topic. Generating all the partitions of a given integer is fairly simple, for example with the following code:
def aP(n):
    """Generate partitions of n as ordered lists in ascending
    lexicographical order.
    This highly efficient routine is based on the delightful
    work of Kelleher and O'Sullivan."""
    a = [1]*n
    y = -1
    v = n
    while v > 0:
        v -= 1
        x = a[v] + 1
        while y >= 2 * x:
            a[v] = x
            y -= x
            v += 1
        w = v + 1
        while x <= y:
            a[v] = x
            a[w] = y
            yield a[:w + 1]
            x += 1
            y -= 1
        a[v] = x + y
        y = a[v] - 1
        yield a[:w]
To gain more control over the function, for example to generate only the partitions with a given number of parts, this solution appears better:
def sum_to_n(n, size, limit=None):
    """Produce all lists of `size` positive integers in decreasing order
    that add up to `n`."""
    if size == 1:
        yield [n]
        return
    if limit is None:
        limit = n
    start = (n + size - 1) // size
    stop = min(limit, n - size + 1) + 1
    for i in range(start, stop):
        for tail in sum_to_n(n - i, size - 1, i):
            yield [i] + tail
But both of them generate ALL the possible partitions of the given number: the first of every size, the second of the given size. What if I want only one specific partition of a given number?
The following code generates the next partition of a given partition:
def next_partition(p):
    if max(p) == 1:
        return [sum(p)]
    p.sort()
    p.reverse()
    q = [ p[n] for n in range(len(p)) if p[n] > 1 ]
    q[-1] -= 1
    if (p.count(1)+1) % q[-1] == 0:
        return q + [q[-1]]*((p.count(1)+1) // q[-1])
    else:
        return q + [q[-1]]*((p.count(1)+1) // q[-1]) + [(p.count(1)+1) % q[-1]]
But there is still a problem: you have to know the partition that precedes the requested one.
Suppose now that you need a given partition of an integer N and you only know the sequence number of the partition; for example:
The partitions of 4 are:
n.1 4
n.2 3+1
n.3 2+2
n.4 2+1+1
n.5 1+1+1+1
How can I create partition number 2 (3+1) given only the integer (4) and the sequence number (2), without generating all the partitions?
I have read somewhere that this is possible with a mathematical formula, but I do not know how.
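One possible approach, sketched here under the assumption that partitions are numbered in the order listed above (largest first part first), is to count how many partitions start with each candidate part and skip whole groups until the requested number falls inside one. This is not a closed formula, and the names `count_parts` and `partition_by_rank` are hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_parts(n, m):
    """Number of partitions of n whose parts are all <= m."""
    if n == 0:
        return 1
    if m == 0:
        return 0
    total = count_parts(n, m - 1)        # partitions that do not use a part m
    if m <= n:
        total += count_parts(n - m, m)   # partitions that use at least one m
    return total

def partition_by_rank(n, rank):
    """Return the rank-th partition of n (1-based), largest first part first."""
    if not 1 <= rank <= count_parts(n, n):
        raise ValueError("rank out of range")
    parts = []
    limit = n  # parts must not increase from left to right
    while n > 0:
        for first in range(min(n, limit), 0, -1):
            block = count_parts(n - first, first)  # partitions starting with `first`
            if rank <= block:
                parts.append(first)
                n -= first
                limit = first
                break
            rank -= block  # skip the whole group
    return parts

print(partition_by_rank(4, 2))  # [3, 1]
```

Only the counts are computed and cached; no partition other than the requested one is ever built.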
