Nth 1 in a sequence [closed] - python

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 5 years ago.
Improve this question
A array of length t has all elements initialized by 1 .Now we can perform two types of queries on the array
to replace the element at ith index to 0 .This query is denoted by 0 index
find and print an integer denoting the index of the kth 1 in array A on a new line; if no such index exists print -1.This query is denoted by 1 k
Now suppose for array of length t=4 all its elements at the beginning are [1,1,1,1] now for query 0 2 the array becomes [1,0,1,1] and for query 1 3 the output comes out to be 4
I have used a brute force approach but how to make the code more efficient?
n,q=4,2
arr=[1]*4
for i in range(q):
a,b=map(int,input().split())
if a==0:
arr[b-1]=0
else:
flag=True
count=0
target=b
for i,j in enumerate(arr):
if j ==1:
count+=1
if count==target:
print(i+1)
flag=False
break
if flag:
print(-1)
I have also tried to first append all the indexes of 1 in a list and then do binary search but pop 0 changes the indices due to which the code fails
def binary_search(low,high,b):
while(low<=high):
mid=((high+low)//2)
#print(mid)
if mid+1==b:
print(stack[mid]+1)
return
elif mid+1>b:
high=mid-1
else:
low=mid+1
n=int(input())
q=int(input())
stack=list(range(n))
for i in range(q):
a,b=map(int,input().split())
if a==0:
stack.pop(b-1)
print(stack)
else:
if len(stack)<b:
print(-1)
continue
else:
low=0
high=len(stack)-1
binary_search(low,high,b)

You could build a binary tree where each node gives you the number of ones that are below and at the left of it. So if n is 7, that tree would initially look like this (the actual list with all ones is shown below it):
4
/ \
2 2
/ \ / \
1 1 1 1
----------------
1 1 1 1 1 1 1 -
Setting the array element at index 4 (zero-based) to 0, would change that tree to:
4
/ \
2 1*
/ \ / \
1 1 0* 1
----------------
1 1 1 1 0*1 1 -
Putting a 0 thus represents a O(log(n)) time complexity.
Counting the number of ones can then also be done in the same time complexity by summing up the node values while descending down the tree in the right direction.
Here is Python code you could use. It represents the tree in a list in breadth-first order. I have not gone to great lengths to further optimise the code, but it has the above time complexities:
class Ones:
def __init__(self, n): # O(n)
self.lst = [1] * n
self.one_count = n
self.tree = []
self.size = 1 << (n-1).bit_length()
at_left = self.size // 2
width = 1
while width <= at_left:
self.tree.extend([at_left//width] * width)
width *= 2
def clear_index(self, i): # O(logn)
if i >= len(self.lst) or self.lst[i] == 0:
return
self.one_count -= 1
self.lst[i] = 0
# Update tree
j = 0
bit = self.size >> 1
while bit >= 1:
go_right = (i & bit) > 0
if not go_right:
self.tree[j] -= 1
j = j*2 + 1 + go_right
bit >>= 1
def get_index_of_ith_one(self, num_ones): # O(logn)
if num_ones <= 0 or num_ones > self.one_count:
return -1
j = 0
k = 0
bit = self.size >> 1
while bit >= 1:
go_right = num_ones > self.tree[j]
if go_right:
k |= bit
num_ones -= self.tree[j]
j = j*2 + 1 + go_right
bit >>= 1
return k
def is_consistent(self): # Only for debugging
# Check that list can be derived by calling get_index_of_ith_one for all i
lst = [0] * len(self.lst)
for i in range(1, self.one_count+1):
lst[self.get_index_of_ith_one(i)] = 1
return lst == self.lst
# Example use
ones = Ones(12)
print('tree', ones.tree)
ones.clear_index(5)
ones.clear_index(2)
ones.clear_index(1)
ones.clear_index(10)
print('tree', ones.tree)
print('lst', ones.lst)
print('consistent = ', ones.is_consistent())
Be aware that this treats indexes as zero-based, while the method get_index_of_ith_one expects an argument that is at least 1 (but it returns a zero-based index).
It should be easy to adapt to your needs.
Complexity
Creation: O(n)
Clear at index: O(logn)
Get index of one: O(logn)
Space complexity: O(n)

Let's start with some general tricks:
Check if the n-th element is too big for the list before iterating. If you also keep a "counter" that stores the number of zeros, you could even check if nth >= len(the_list) - number_of_zeros (not sure if >= is correct here, it seems like the example uses 1-based indices so I could be off-by-one). That way you save time whenever too big values are used.
Use more efficient functions.
So instead of input you could use sys.stdin.readline (note that it will include the trailing newline).
And, even though it's probably not useful in this context, the built-in bisect module would be better than the binary_search function you created.
You could also use for _ in itertools.repeat(None, q) instead of for i in range(q), that's a bit faster and you don't need that index.
Then you can use some more specialized facts about the problem to improve the code:
You only store zeros and ones, so you can use if j to check for ones and if not j to check for zeros. These will be a bit faster than manual comparisons especially in when you do that in a loop.
Every time you look for the nth 1, you could create a temporary dictionary (or a list) that contains the encountered ns + index. Then re-use that dict for subsequent queries (dict-lookup and list-random-access is O(1) while your search is O(n)). You could even expand it if you have subsequent queries without change in-between.
However if a change happens you either need to discard that dictionary (or list) or update it.
A few nitpicks:
The variable names are not very descriptive, you could use for index, item in enumerate(arr): instead of i and j.
You use a list, so arr is a misleading variable name.
You have two i variables.
But don't get me wrong. It's a very good attempt and the fact that you use enumerate instead of a range is great and shows that you already write pythonic code.

Consider something akin to the interval tree:
root node covers the entire array
children nodes cover left and right halves of the parent range respectively
each node holds the number of ones in its range
Both replace and search queries could be completed in logarithmic time.

Refactored with less lines, so more efficient in terms of line count but run time probably the same O(n).
n,q=4,2
arr=[1]*4
for i in range(q):
query, target = map(int,input('query target: ').split())
if query == 0:
arr[target-1] = 0
else:
count=0
items = enumerate(arr, 1)
try:
while count < target:
index, item = next(items)
count += item
except StopIteration as e:
index = -1
print(index)
Assumes arr contains ONLY ones and zeroes - you don't have to check if an item is one before you add it to count, adding zero has no affect.
No flags to check, just keep calling next on the enumerate object (items) till you reach your target or the end of arr.
For runtime efficiency, using an external library but basically the same process (algorithm):
import numpy as np
for i in range(q):
query, target = map(int,input('query target: ').split())
if query == 0:
arr[target-1] = 0
else:
index = -1
a = np.array(arr).cumsum() == target
if np.any(a):
index = np.argmax(a) + 1
print(index)

Related

How to improve efficiency and time-complexity of AIO Cloud Cover solution

Sorry for the noob question, but is there a less time expensive method to iterate through the input list, as upon submission I receive timeout errors. I tried changing the method of checking for the minimum answer by appending to a list and using min function, but as expected that didn't help at all.
Input:
6 3
3
6
4
2
5
Solution:
with open("cloudin.txt", "r") as input_file:
n, covered = map(int, input_file.readline().split())
ls = [None for i in range(100005)]
for i in range(n-1):
ls[i] = int(input_file.readline().strip())
ans = 1000000001
file = open("cloudout.txt", "w")
for i in range(n-covered):
a = 0
for j in range(covered):
a += ls[i+j]
if a < ans:
ans = a
file.write(str(ans))
output:
11
https://orac2.info/problem/aio18cloud/
Note: Blue + White indicates timeout
The core logic of your code is contained in these lines:
ans = 1000000001
for i in range(n-covered):
a = 0
for j in range(covered):
a += ls[i+j]
if a < ans:
ans = a
Let's break down what this code actually does. For each closed interval (i.e. including the endpoints) [left, right] from the list [0, covered-1], [1, covered], [2, covered+1], ..., [n-covered-1, n-2] (that is, all closed intervals containing exactly covered elements and that are subintervals of [0, n-2]), you are computing the range sum ls[left] + ls[left+1] + ... + ls[right]. Then you set ans to the minimum such range sum.
Currently, that nested loop takes O((n-covered)*covered)) steps, which is O(n^2) if covered is n/2, for example. You want a way to compute that range sum in constant time, eliminating the nested loop, to make the runtime O(n).
The easiest way to do this is with a prefix sum array. In Python, itertools.accumulate() is the standard/simplest way to generate those. To see how this helps:
Original Sum: ls[left] + ls[left+1] + ... + ls[right]
can be rewritten as the difference of prefix sums
(ls[0] + ls[1] + ... + ls[right])
- (ls[0] + ls[1] + ... + ls[left-1])
which is prefix_sum(0, right) - prefix_sum(0, left-1)
where are intervals are written in inclusive notation.
Pulling this into a separate range_sum() function, you can rewrite the original core logic block as:
prefix_sums = list(itertools.accumulate(ls, initial=0))
def range_sum(left: int, right: int) -> int:
"""Given indices left and right, returns the sum of values of
ls in the inclusive interval [left, right].
Equivalent to sum(ls[left : right+1])"""
return prefix_sums[right+1] - prefix_sums[left]
ans = 1000000001
for i in range(n - covered):
a = range_sum(left=i, right=i+covered-1)
if a < ans:
ans = a
The trickiest part of prefix sum arrays is just avoiding off-by-one errors in indexes. Notice that our prefix sum array of the length-n array ls has n+1 elements, since it starts with the empty initial prefix sum of 0, and so we add 1 to array accesses to prefix_sums compared to our formula.
Also, it's possible there may be an off-by-one error in your original code, as the value ls[n-1] is never accessed or used for anything after being set?

How to add the last two elements in a list and add the sum to existing list

I am learning Python and using it to work thru a challenge found in Project Euler. Unfortunately, I cannot seem to get around this problem.
The problem:
Even Fibonacci numbers
Each new term in the Fibonacci sequence is generated by adding the
previous two terms. By starting with 1 and 2, the first 10 terms will
be:
1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...
By considering the terms in the Fibonacci sequence whose values do not
exceed four million, find the sum of the even-valued terms.
I created a for loop that adds the second to last element and the last element from the list x:
x = [1,2]
for i in x:
second_to_last = x[-2]
running_sum = i + second_to_last
If you run the above, you get 3. I am looking to add this new element back to the original list, x, and repeat the process. However, each time I try to use the append() function, the program crashes and keeps on running without stopping. I tried to use a while loop to stop this, but that was a complete failure. Why am I not able to add or append() the new element (running_sum) back to the original list (x)?
UPDATE:
I did arrive at the solution (4613732), but I the work to getting there did not seem efficient. Here is my solution:
while len(x) in range(1,32):
for i in x:
second_to_last = x[-2]
running_sum = i + second_to_last
x.append(running_sum)
print(x)
new_x = []
for i in x:
if i%2 == 0:
new_x.append(i)
sum(new_x)
I did have to check the range to see visually whether I did not exceed 4 million. But as I said, the process I took was not efficient.
If you keep adding elements to a list while iterating over that list, the iteration will never finish.
You will need some other criterion to abort the loop - for example, in this case
if running_sum > 4000000:
break
would work.
(Note that you don't strictly speaking need a list at all here; I'd suggest experimenting a bit with it.)
Here are two different ways to solve this. One of them builds the whole list, then sums the even elements. The other one only keeps the last two elements, without making the whole list.
fib = [1,2]
while fib[-1] < 4000000:
fib.append(fib[-2]+fib[-1])
# Get rid of the last one, since it was over the limit.
fib.pop(-1)
print( sum(i for i in fib if i % 2 == 0) )
fib = (1,2)
sumx = 2
while True:
nxt = fib[0]+fib[1]
if nxt >= 4000000:
break
if nxt % 2 == 0:
sumx += nxt
fib = (fib[1],nxt)
print(sumx)
I don't answer your question about list modification but the solution for your problem:
def sum_even_number_fibonacci(limit):
n0 = 0 # Since we don't care about index (n-th), we can use n0 = 0 or 1
n1 = 1
even_number_sum = 0
while n1 <= limit:
if n1 % 2 == 0:
even_number_sum += n1
n2 = n0 + n1
# Only store the last two number of the Fibonacci sequence to calculate the next one
n0 = n1
n1 = n2
return even_number_sum
sum_even_number_fibonacci(4_000_000)

Python: Performance when looping and manipulating a large array

My question is two-fold:
Is there a way to both efficiently loop over and manipulate an
array using enumerate for example and manipulate the loop at
the same time?
Are there any memory-optimized versions of arrays in python?
(like NumPy creating smaller arrays with a specified type)
I have made an algorithm finding prime numbers in range (2 - rng) with the Sieve of Eratosthenes.
Note: The problem is nonexistent if searching for primes in 2 - 1,000,000 (under 1 sec total runtime too). In the tens and hundreds of millions this starts to hurt. So far changing the table from including all natural numbers to just odd ones, the rough maximum range I was able to search was 400 million (200 million in odd numbers).
Whiles instead of for loops decrease performance at least with the current algorithm.
NumPy while being able to create smaller arrays with type conversion, it actually takes roughly double the time to process with the same code, except
oddTable = np.int8(np.zeros(size))
in place of
oddTable = [0] * size
and using integers to assign values "prime" and "not prime" to keep the array type.
Using pseudo-code, the algorithm would look like this:
oddTable = [0] * size # Array representing odd numbers excluding 1 up to rng
for item in oddTable:
if item == 0: # Prime, since not product of any previous prime
set item to "prime"
set every multiple of item in oddTable to "not prime"
Python is a neat language particularly when looping over every item in a list, but as the index in, say
for i in range(1000)
can't be manipulated while in the loop, I had to convert the range a few times to produce an iterable which to use. In the code: "P" marks prime numbers, "_" marks not primes and 0 not checked.
num = 1 # Primes found (2 is prime)
size = int(rng / 2) - 1 # Size of table required to represent odd numbers
oddTable = [0] * size # Array with odd numbers \ 1: [3, 5, 7, 9...]
new_rng = int((size - 1) / 3) # To go through every 3rd item
for i in range(new_rng): # Eliminate no % 3's
oddTable[i * 3] = "_"
oddTable[0] = "P" # Set 3 to prime
num += 1
def act(x): # The actual integer index x in table refers to
x = (x + 1) * 2 + 1
return x
# Multiples of 2 and 3 eliminated, so all primes are 6k + 1 or 6k + 5
# In the oddTable: remaining primes are either 3*i + 1 or 3*i + 2
# new_rng to loop exactly 1/3 of the table length -> touch every item once
for i in range(new_rng):
j = 3*i + 1 # 3*i + 1
if oddTable[j] == 0:
num += 1
oddTable[j] = "P"
k = act(j)
multiple = j + k # The odd multiple indexes of act(j)
while multiple < size:
oddTable[multiple] = "_"
multiple += k
j += 1 # 3*i + 2
if oddTable[j] == 0:
num += 1
oddTable[j] = "P"
k = act(j)
multiple = j + k
while multiple < size:
oddTable[multiple] = "_"
multiple += k
To make your code more pythonic, split your algorithm in smaller chunks (functions), so that each chunk can be grasped easily.
My second comment might astound you: Python comes with "batteries included". In order to program your Erathostenes' Sieve, why do you need to manipulate arrays explicitly and pollute your code with it? Why not create a function (e.g. is_prime) and use the standard memoize decorator that was provided for that purpose? (If you insist on using 2.7, see also memoization library for python 2.7).
The result of the two pieces of advice above might not be the "most efficient", but it will (as I experienced with that exact problem) work well enough, while allowing you to quickly create sleek code that will save your programmer's time (both for creation and maintenance).

Is there a better way to write the following method in python?

I am writing a small program, in python, which will find a lone missing element from an arithmetic progression (where the starting element could be both positive and negative and the series could be ascending or descending).
so for example: if the input is 1 3 5 9 11, then the function should return 7 as this is the lone missing element in the above AP series.
The input format: the input elements are separated by 1 white space and not commas as is commonly done.
Here is the code:
def find_missing_elm_ap_series(n, series):
ap = series
ap = ap.split(' ')
ap = [int(i) for i in ap]
cd = []
for i in range(n-1):
cd.append(ap[i+1]-ap[i])
common_diff = 0
if len(set(cd)) == 1:
print 'The series is complete'
return series
else:
cd = [abs(i) for i in cd]
common_diff = min(cd)
if ap[0] > ap[1]:
common_diff = (-1)*common_diff
new_ap = []
for i in range(n+1):
new_ap.append(ap[0] + i*common_diff)
missing_element = set(new_ap).difference(set(ap))
return missing_element
where n is the length of the series provided (the series with the missing element:5 in the above example).
I am sure there are other shorter and more elegant way of writing this code in python. Can anybody help ?
Thanks
BTW: i am learning python by myself and hence the question.
Based on the fact that if an element is missing it is exactly expected-sum(series) - actual-sum(series). The expected sum for a series with n elements starting at a and ending at b is (a+b)*n/2. The rest is Python:
def find_missing(series):
A = map(int, series.split(' '))
a, b, n, sumA = A[0], A[-1], len(A), sum(A)
if (a+b)*n/2 == sumA:
return None #no element missing
return (a+b)*(n+1)/2-sumA
print find_missing("1 3 5 9") #7
print find_missing("-1 1 3 5 9") #7
print find_missing("9 6 0") #3
print find_missing("1 2 3") #None
print find_missing("-3 1 3 5") #-1
Well... You can do simpler, but it would completely change your algorithm.
First, you can prove that the step for the arithmetic progression is ap[1] - ap[0], unless ap[2] - ap[1] is lower in magnitude than it, in which case the missing element is between terms 0 and 1. (This is true as there is a single missing element.)
Then you can just take ap[0] + n * step and print the first one that doesn't match.
Here is the source code (also implementing some minor shortcuts, such as grouping your first three lines into one):
def find_missing_elm_ap_series(n, series):
ap = [int(i) for i in series.split(' ')]
step = ap[1] - ap[0]
if (abs(ap[2] - ap[1]) <= abs(step)): # Check missing elt is not between 0 and 1
return ap[0] + ap[2] - ap[1]
for (i, val) in zip(range(len(ap)), ap): # And check position of missing element
if ap[0] + i * step != val:
return ap[0] + i * step
return series # missing element not found
The code appears to be working. There is perhaps a slightly easier way to get it done. This is due to the fact that you don't have to attempt to look through all of the values to get the common difference. The following code simply looks at the difference between the 1st and 2nd as well as the last and second last.
This works in the event that only a single value is missing (and the length of the list is at least 3). As the min difference between the values will provide you the common difference.
def find_missing(prog):
# First we cast them to numbers.
items = [int(x) for x in prog.split()]
#Then we compare the first and second
first_to_second = items[1] - items[0]
#then we compare the last to second last
last_to_second_last = items[-1] - items[-2]
#Now we have to care about which one is closes
# to zero
if abs(first_to_second) < abs(last_to_second_last):
change = first_to_second
else:
change = last_to_second_last
#Iterate through the list. As soon as we find a gap
#that is larger than change, we fill in and return
for i in range(1, len(items)):
comp = items[i] - items[i-1]
if comp != change:
return items[i-1] + change
#There was no gap
return None
print(find_missing("1 3 5 9")) #7
print(find_missing("-1 1 3 5 9")) #7
print(find_missing("9 6 0")) #3
print(find_missing("1 2 3")) #None
The previous code shows this example. First of all attempting to find change between each of the values of the list. Then iterating till the change is missed, and returning the value that has been expected.
Here's the way I thought about it: find the position of the maximum difference between the elements of the array; then regenerate the expected number in the sequence from the other differences (which should be all the same and the minimum number in the differences list):
def find_missing(a):
d = [a[i+1] - a[i] for i in range(len(a)-1)]
i = d.index(max(d))
x = min(d)
return a[0] + (i+1)*x
print find_missing([1,3,5,9,11])
7
print find_missing([1,5,7,9,11])
3
Here are some ideas:
Passing the length of the series seems like a bad idea. The function can more easily calculate the length
There is no reason to assign series to ap, just do a function using series and assign the result to ap
When splitting the string, don't give the sep argument. If you don't give the argument, then consecutive white space will also be removed and leading and trailing white space will also be ignored. This is more friendly on the format of the data.
I've combined a few operations. For example the split and the list comprehension converting to integer make sense to group together. There is also no need to create cd as a list and then convert that to a set. Just build it as a set to start with.
I don't like that the function returns the original series in the case of no missing element. The value None would be more in keeping with the name of the function.
Your original function returned a one item set as the result. That seems odd, so I've used pop() to extract that item and return just the missing element.
The last item was more of an experiment with combining all of the code at the bottom into a single statement. Don't know if it is better, but it's something to think about. I built a set with all the correct numbers and a set with the given numbers and then subtracted them and returned the number that was missing.
Here's the code that I came up with:
def find_missing_elm_ap_series(series):
ap = [int(i) for i in series.split()]
n = len(ap)
cd = {ap[i+1]-ap[i] for i in range(n-1)}
if len(cd) == 1:
print 'The series is complete'
return None
else:
common_diff = min([abs(i) for i in cd])
if ap[0] > ap[1]:
common_diff = (-1)*common_diff
return set(range(ap[0],ap[0]+common_diff*n,common_diff)).difference(set(ap)).pop()
Assuming the first & last items are not missing, we can also make use of range() or xrange() with the step of the common difference, getting rid of the n altogether, it can also return more than 1 missing item (although not reliably depending on number of items missing):
In [13]: def find_missing_elm(series):
ap = map(int, series.split())
cd = map(lambda x: x[1]-x[0], zip(ap[:-1], ap[1:]))
if len(set(cd)) == 1:
print 'complete series'
return ap
mcd = min(cd) if ap[0] < ap[1] else max(cd)
sap = set(ap)
return filter(lambda x: x not in sap, xrange(ap[0], ap[-1], mcd))
....:
In [14]: find_missing_elm('1 3 5 9 11 15')
Out[14]: [7, 13]
In [15]: find_missing_elm('15 11 9 5 3 1')
Out[15]: [13, 7]

Fastest algorithm possible to pick number pairs

The question:
Given N integers [N<=10^5], count the total pairs of integers that have a difference of K. [K>0 and K<1e9]. Each of the N integers will be greater than 0 and at least K away from 2^31-1 (Everything can be done with 32 bit integers).
1st line contains N & K (integers).
2nd line contains N numbers of the set. All the N numbers are assured to be distinct.
Now the question is from hackerrank. I got a solution for the question but it doesn't satisfy the time-limit for all the sample test cases. I'm not sure if its possible to use another algorithm but I'm out of ideas. Will really appreciate if someone took a bit of time to check my code and give a tip or two.
temp = input()
temp = temp.split(" ")
N = int(temp[0])
K = int(temp[1])
num_array = input()
num_array = num_array.split(" ")
diff = 0
pairs= 0
i = 0
while(i < N):
num_array[i] = int(num_array[i])
i += 1
while(num_array != []):
j = 0
while(j < (len(num_array)-1)):
diff = abs(num_array[j+1] - num_array[0])
if(diff == K):
pairs += 1
j += 1
del num_array[0]
if(len(num_array) == 1):
break
print(pairs)
You can do this in aproximately linear time by following the procedure:
So, O(n) solution:
For each number x add it to hash-set H[x]
For each number x check whether x-k is in H, if yes - add 1 to answer
Or by using some balanced structure (like tree-based set) in O(nlgn)
This solution bases on the assumption that integers are distinct, if they are not you need to store the number of times element has been "added to set" and instead of adding 1 to answer - add the product of H[x]*H[x+k]
So in general you take some HashMap H with "default value 0"
For each number x update map: H[x] = H[x]+1
For each number x add to answer H[x]*H[x-k] (you don't have to check whether it is in the map, because if it is not, H[x-k]=0 )
and again - solution using hash-map is O(n) and using tree-map O(nlgn)
So given set of numbesr A, and number k (solution for distinct numbers):
H=set()
ans=0
for a in A:
H.add(a)
for a in A:
if a-k in H:
ans+=1
print ans
or shorter
H=set(A)
ans = sum(1 for a in A if a-k in H)
print ans
Use a dictionary (hash map).
Step 1: Fill the dictionary D with all entries from the array.
Step 2: Count occurences of all A[i] + k in the dictionary.
Dictionary<int> dict = new Dictionary<int>();
foreach (int n in num_array) do dict.Add(n);
int solitions = 0;
foreach (int n in num_Array) do
if dict.contains(n+k)
solutions += 1;
Filling a dictionary is O(1), Searching is O(1) as well. Doing it for each element in the array is O(n). This is as fast as it can get.
Sorry, you have to translate it to python, though.
EDIT: Same idea as the previous one. Sorry to post a duplicate. It's too late to remove my duplicate I guess.

Categories

Resources