I have a question about divide and conquer in algorithms. Suppose you are given a random list of integers in Python which consists of:
1. Unique contiguous pairs of integers
2. A single integer somewhere in the list
The conditions are exclusive, meaning that while [2,2,1,1,3,3,4,5,5,6,6] is valid, these are not:
[2,2,2,2,3,3,4] (violates condition 1: because there are two pairs of 2s while there can only be a maximum of 1 pair of any number)
[1,4,4,5,5,6,6,1] (violates condition 1: because there is a pair of 1s but they are not contiguous).
[1,4,4,5,5,6,6,3] (violates condition 2: there are 2 single numbers, 1 and 3)
Now the question is: can you find the index of the 'single' number with an O(lg n) algorithm?
My original stab at it is this:
    def single_num(array, arr_max_len):
        i = 0
        while (i < arr_max_len):
            if (arr_max_len - i == 1):
                return i
            elif (array[i] == array[i + 1]):
                i = i + 2
            else:
                return i # don't have to worry about odd index because it will never happen
        return None
However, the algorithm seems to run in O(n/2) time, which seems like the best it could do.
Even if I use divide and conquer, I don't think it's going to get better than O(n/2) time, unless there's some method that's beyond my comprehension at the moment.
Does anyone have a better idea, or can I arguably say that this is already O(log n) time?
EDIT: It seems like Manuel has the best solution. If allowed, I'll take some time to implement a solution myself for understanding, and then accept Manuel's answer.
Solution
Just binary search the even indexes to find the first whose value differs from the next value.
    from bisect import bisect

    def single_num(a):
        class E:
            def __getitem__(_, i):
                return a[2*i] != a[2*i+1]
        return 2 * bisect(E(), False, 0, len(a)//2)
Explanation
Visualization of the virtual "list" E() that I'm searching on:
         0  1  2  3  4  5  6  7  8  9  10   (indices)
    a = [2, 2, 1, 1, 3, 3, 4, 5, 5, 6, 6]
    E() = [False, False, False, True, True]
           0      1      2      3     4     (indices)
In the beginning, the pairs match (so != results in False-values). Starting with the single number, the pairs don't match (so != returns True). Since False < True, that's a sorted list which bisect happily searches in.
Alternative implementation
Without bisect, if you're not yet tired of writing binary searches:
    def single_num(a):
        i, j = 0, len(a) // 2
        while i < j:
            m = (i + j) // 2
            if a[2*m] == a[2*m+1]:
                i = m + 1
            else:
                j = m
        return 2*i
Sigh...
I wish bisect would support giving it a callable so I could just do return 2 * bisect(lambda i: a[2*i] != a[2*i+1], False, 0, len(a)//2). Ruby does, and it's perhaps the most frequent reason I sometimes solve coding problems with Ruby instead of Python.
Testing
Btw I tested both with all possible cases for up to 1000 pairs:
    from random import random

    for pairs in range(1001):
        a = [x for _ in range(pairs) for x in [random()] * 2]
        single = random()
        assert len(set(a)) == pairs and single not in a
        for i in range(0, 2*pairs+1, 2):
            a.insert(i, single)
            assert single_num(a) == i
            a.pop(i)
A lg n algorithm is one in which you split the input into smaller parts and discard some of them, so that you have a smaller input to work with. Since this is a searching problem, the likely solution for a lg n time complexity is binary search, in which you split the input in half each time.
My approach is to start off with a few simple cases, to spot any patterns that I can make use of.
In the following examples, the largest integer is the target number.
# input size: 3
[1,1,2]
[2,1,1]
# input size: 5
[1,1,2,2,3]
[1,1,3,2,2]
[3,1,1,2,2]
# input size: 7
[1,1,2,2,3,3,4]
[1,1,2,2,4,3,3]
[1,1,4,2,2,3,3]
[4,1,1,2,2,3,3]
# input size: 9
[1,1,2,2,3,3,4,4,5]
[1,1,2,2,3,3,5,4,4]
[1,1,2,2,5,3,3,4,4]
[1,1,5,2,2,3,3,4,4]
[5,1,1,2,2,3,3,4,4]
You probably notice that the input size is always an odd number i.e. 2*x + 1.
Since this is a binary search, you can check if the middle number is your target number. If the middle number is the single number (if middle_number != left_number and middle_number != right_number), then you have found it. Otherwise, you have to search the left side or the right side of the input.
Notice that in the sample test cases above, in which the middle number is not the target number, there is a pattern between the middle number and its pair.
For input size 3 (2*1 + 1), if middle_number == left_number, the target number is on the right, and vice versa.
For input size 5 (2*2 + 1), if middle_number == left_number, the target number is on the left, and vice versa.
For input size 7 (2*3 + 1), if middle_number == left_number, the target number is on the right, and vice versa.
For input size 9 (2*4 + 1), if middle_number == left_number, the target number is on the left, and vice versa.
That means the parity of x in 2*x + 1 (the array length) affects whether to search the left or right side of the input: search the right if x is odd and search the left if x is even, if middle_number == left_number (and vice versa).
Based on all this information, you can come up with a recursive solution. Note that you have to ensure that the input size is odd in each recursive call. (Edit: Ensuring that the input size is odd makes the code even messier. You probably want to come up with a solution in which the parity of the input size does not matter.)
    def find_single_number(array: list, start_index: int, end_index: int):
        # base case: array length == 1
        if start_index == end_index:
            return start_index
        middle_index = (start_index + end_index) // 2
        # base case: found target
        if array[middle_index] != array[middle_index - 1] and array[middle_index] != array[middle_index + 1]:
            return middle_index
        # make use of parity of array length to search left or right side
        # end_index == array length - 1
        x = (end_index - start_index) // 2
        # ensure array length is odd
        include_middle = (middle_index % 2 == 0)
        if array[middle_index] == array[middle_index - 1]: # middle == number on its left
            if x % 2 == 0: # x is even
                # search left side
                return find_single_number(
                    array,
                    start_index,
                    middle_index if include_middle else middle_index - 1
                )
            else: # x is odd
                # search right side
                return find_single_number(
                    array,
                    middle_index if include_middle else middle_index + 1,
                    end_index,
                )
        else: # middle == number on its right
            if x % 2 == 0: # x is even
                # search right side
                return find_single_number(
                    array,
                    middle_index if include_middle else middle_index + 1,
                    end_index,
                )
            else: # x is odd
                # search left side
                return find_single_number(
                    array,
                    start_index,
                    middle_index if include_middle else middle_index - 1
                )
    # test out the code
    if __name__ == '__main__':
        array = [2,2,1,1,3,3,4,5,5,6,6] # target: 4 (index: 6)
        print(find_single_number(array, 0, len(array) - 1))
        array = [1,1,2] # target: 2 (index: 2)
        print(find_single_number(array, 0, len(array) - 1))
        array = [1,1,3,2,2] # target: 3 (index: 2)
        print(find_single_number(array, 0, len(array) - 1))
        array = [1,1,4,2,2,3,3] # target: 4 (index: 2)
        print(find_single_number(array, 0, len(array) - 1))
        array = [5,1,1,2,2,3,3,4,4] # target: 5 (index: 0)
        print(find_single_number(array, 0, len(array) - 1))
My solution is probably not the most efficient or elegant, but I hope my explanation helps you understand the approach to tackling these kinds of algorithmic problems.
Proof that it has a time complexity of O(lg n):
Let's assume that the most important operation is the comparison of the middle number against the left and right number (if array[middle_index] != array[middle_index - 1] and array[middle_index] != array[middle_index + 1]), and that it has a time cost of 1 unit. Let us refer to this comparison as the main comparison.
Let T be time cost of the algorithm.
Let n be the length of the array.
Since this solution involves recursion, there is a base case and recursive case.
For the base case (n = 1), it is just the main comparison, so:
T(1) = 1.
For the recursive case, the input is split in half (either left half or right half) each time; at the same time, there is one main comparison. So:
T(n) = T(n/2) + 1
Now, I know that the input size must always be odd, but let us assume that n = 2^k for simplicity; the time complexity would still be the same.
We can rewrite T(n) = T(n/2) + 1 as:
T(2^k) = T(2^{k-1}) + 1
Also, T(1) = 1 is:
T(2^0) = 1
When we expand T(2^k) = T(2^{k-1}) + 1, we get:
T(2^k)
= T(2^{k-1}) + 1
= [T(2^{k-2}) + 1] + 1 = T(2^{k-2}) + 2
= [T(2^{k-3}) + 1] + 2 = T(2^{k-3}) + 3
= [T(2^{k-4}) + 1] + 3 = T(2^{k-4}) + 4
= ...(repeat until k)
= T(2^{k-k}) + k = T(2^0) + k = k + 1
Since n = 2^k, that means k = log2 n.
Substituting n back in, we get:
T(n) = log2 n + 1
1 is a constant so it can be dropped; the same goes for the base of the log operation.
Therefore, the upper bound of the time complexity of the algorithm is:
T(n) = O(lg n)
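The expansion above can be checked numerically. This is only a sketch of the cost recurrence itself (the function T below models the recurrence T(1) = 1, T(n) = T(n/2) + 1 from the proof, not the search code), under the same simplifying assumption that n is a power of two:

```python
def T(n):
    """Cost recurrence from the proof: T(1) = 1, T(n) = T(n/2) + 1."""
    return 1 if n == 1 else T(n // 2) + 1

# For n = 2^k the expansion predicts T(n) = k + 1.
for k in range(0, 21):
    assert T(2 ** k) == k + 1

print(T(2 ** 10))  # 11
```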
I need to find the length of the longest sublist that can be mirrored (it reads the same forwards and backwards), given the number of elements.
ex:
n = 5
my_list = [1,2,3,2,1]
Here's my code:
    n = int(input())
    my_list = list(map(int, input().split()))
    c = 0
    s1 = my_list
    x = 0
    i = 0
    while i < n:
        s2 = s1[i:]
        if s2 == s2[::-1]:
            if c <= len(s2):
                c = len(s2)
        if i >= n-1:
            i = 0
            n = n - 1
            s1 = s1[:-1]
        i += 1
    print(c)
As we can see, that list is the same when mirrored, so the answer is 5. But when n = 10 and my_list = [1,2,3,2,1,332,6597,6416,614,31], the result is 3 instead of the expected 5.
My solution would be splitting the array in each iteration into a left and a right array, and then reversing the left array.
Next, compare each element from each array and increment the length variable by one while the elements are the same.
    def longest_subarr(a):
        longest_exclude = 0
        for i in range(1, len(a) - 1):
            # this excludes a[i] as the root
            left = a[:i][::-1]
            # this also excludes a[i], needs to consider this in calculation later
            right = a[i + 1:]
            max_length = min(len(left), len(right))
            length = 0
            while(length < max_length and left[length] == right[length]):
                length += 1
            longest_exclude = max(longest_exclude, length)
        # times 2 because the current longest is for the half of the array
        # plus 1 to include to root
        longest_exclude = longest_exclude * 2 + 1
        longest_include = 0
        for i in range(1, len(a)):
            # this excludes a[i] as the root
            left = a[:i][::-1]
            # this includes a[i]
            right = a[i:]
            max_length = min(len(left), len(right))
            length = 0
            while(length < max_length and left[length] == right[length]):
                length += 1
            longest_include = max(longest_include, length)
        # times 2 because the current longest is for the half of the array
        longest_include *= 2
        return max(longest_exclude, longest_include)
print(longest_subarr([1, 4, 3, 5, 3, 4, 1]))
print(longest_subarr([1, 4, 3, 5, 5, 3, 4, 1]))
print(longest_subarr([1, 3, 2, 2, 1]))
This covers test cases for odd-length sub-arrays like [a, b, a] and even-length sub-arrays like [a, b, b, a].
Since you need the longest sequence that can be mirrored, here is a simple O(n^2) approach for this.
Go to each index, consider it as the center, and expand towards both left and right, one step at a time, if the numbers are equal. Or else break, and move onto the next index.
    def longest_mirror(my_array):
        maxLength = 1
        start = 0
        length = len(my_array)
        low = 0
        high = 0
        # One by one consider every element as center point of mirrored subarray
        for i in range(1, length):
            # checking for even length subarrays
            low = i - 1
            high = i
            while low >= 0 and high < length and my_array[low] == my_array[high]:
                if high - low + 1 > maxLength:
                    start = low
                    maxLength = high - low + 1
                low -= 1
                high += 1
            # checking for odd length subarrays
            low = i - 1
            high = i + 1
            while low >= 0 and high < length and my_array[low] == my_array[high]:
                if high - low + 1 > maxLength:
                    start = low
                    maxLength = high - low + 1
                low -= 1
                high += 1
        return maxLength
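To sanity-check any of the faster versions on small inputs, a brute-force reference can help. The sketch below uses a made-up helper name, longest_mirror_bruteforce, and simply tries every contiguous sublist, so it is roughly O(n^3) and only meant for testing:

```python
def longest_mirror_bruteforce(values):
    """Length of the longest contiguous sublist equal to its own reverse."""
    best = 0
    n = len(values)
    for lo in range(n):
        for hi in range(lo + 1, n + 1):
            chunk = values[lo:hi]
            if chunk == chunk[::-1]:
                best = max(best, hi - lo)
    return best

print(longest_mirror_bruteforce([1, 2, 3, 2, 1, 332, 6597, 6416, 614, 31]))  # 5
```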
I am working on a Fibonacci series of bit strings, which can be represented as:
f(0)=0;
f(1)=1;
f(2)=10;
f(3)=101;
f(4)=10110;
f(5)=10110101;
Secondly, I have a pattern, for example '10', and want to count how many times it occurs in a particular string of the series. For example, the Fibonacci string for 5 is '10110101', so '10' occurs 3 times.
My code runs correctly without error, but the problem is that it cannot run for values of n above 45, and I want to run it for n=100.
Can anyone help? I only want to calculate the count of occurrences.
    n=5
    fibonacci_numbers = ['0', '1']
    for i in range(1,n):
        fibonacci_numbers.append(fibonacci_numbers[i]+fibonacci_numbers[i-1])
        #print(fibonacci_numbers[-1])
    print(fibonacci_numbers[-1])
    nStr = str (fibonacci_numbers[-1])
    pattern = '10'
    count = 0
    flag = True
    start = 0
    while flag:
        a = nStr.find(pattern, start)
        if a == -1:
            flag = False
        else:
            count += 1
            start = a + 1
    print(count)
This is a fun one! The trick is that you don't actually need that giant bit string, just the number of 10s it contains and the edges. This solution runs in O(n) time and O(1) space.
    from typing import NamedTuple

    class FibString(NamedTuple):
        """First digit, last digit, and the number of 10s in between."""
        first: int
        tens: int
        last: int

    def count_fib_string_tens(n: int) -> int:
        """Count the number of 10s in a n-'Fibonacci bitstring'."""

        def combine(b: FibString, a: FibString) -> FibString:
            """Combine two FibStrings."""
            tens = b.tens + a.tens
            # mind the edges!
            if b.last == 1 and a.first == 0:
                tens += 1
            return FibString(b.first, tens, a.last)

        # First two values are 0 and 1 (tens=0 for both)
        a, b = FibString(0, 0, 0), FibString(1, 0, 1)
        for _ in range(1, n):
            a, b = b, combine(b, a)
        return b.tens  # tada!
I tested this against your original implementation and sure enough it produces the same answers for all values that the original function is able to calculate (but it's about eight orders of magnitude faster by the time you get up to n=40). The answer for n=100 is 218922995834555169026 and it took 0.1ms to calculate using this method.
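One way to gain confidence in the edge-tracking idea is to compare it against the explicit string for small n. The following is a sketch of such a check, not the original test: the helper names fib_string and count_tens_fast are made up here, and the tuple layout (first digit, count of '10', last digit) mirrors the FibString fields.

```python
def fib_string(n):
    """Build the n-th Fibonacci bit string explicitly (small n only)."""
    a, b = "0", "1"
    if n == 0:
        return a
    for _ in range(1, n):
        a, b = b, b + a
    return b

def count_tens_fast(n):
    """Track only (first digit, count of '10', last digit) per string."""
    a, b = ("0", 0, "0"), ("1", 0, "1")
    if n == 0:
        return a[1]
    for _ in range(1, n):
        # A new '10' appears at the join when b ends in '1' and a starts with '0'.
        bridge = 1 if (b[2] == "1" and a[0] == "0") else 0
        a, b = b, (b[0], b[1] + a[1] + bridge, a[2])
    return b[1]

# '10' cannot overlap itself, so str.count gives the true number of occurrences.
for n in range(25):
    assert fib_string(n).count("10") == count_tens_fast(n)

print(count_tens_fast(5))  # 3
```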
The nice thing about the Fibonacci sequence that will solve your issue is that you only need the last two values of the sequence. 10110 is made by combining 101 and 10. After that 10 is no longer needed. So instead of appending, you can just keep the two values. Here is what I've done:
    n=45
    fibonacci_numbers = ['0', '1']
    for i in range(1,n):
        temp = fibonacci_numbers[1]
        fibonacci_numbers[1] = fibonacci_numbers[1] + fibonacci_numbers[0]
        fibonacci_numbers[0] = temp
Note that it still uses a decent amount of memory, but it didn't give me a memory error (it does take a bit of time to run though).
I also wasn't able to print the full string as I got an OSError [Errno 5] Input/Output error but it can still count and print that output.
For larger numbers, storing as a string is going to quickly cause a memory issue. In that case, I'd suggest doing the fibonacci sequence with plain integers and then converting to bits. See here for tips on binary conversion.
While the regular Fibonacci sequence doesn't work in a direct sense, consider that 10 is 2 and 101 is 5. 5+2 doesn't work: you want 10110, which is the OR operation 10100 | 10, yielding 22. So if you shift one by the length of the other, you can get the result. See for example:
    x = 5
    y = 2
    print((x << 2) | y)  # 22
Shifting x by the number of bits representing y and then doing a bitwise or with | solves the issue. Python summarizes these bitwise operations well here. All that's left for you to do is determine how many bits to shift and implement this into your for loop!
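As a rough sketch of how that loop might look: the string length is tracked separately from the integer value, because f(0) = "0" has one digit while the integer 0 has a bit length of zero. The function name fib_bits and the (value, length) pairing are assumptions made for this illustration.

```python
def fib_bits(n):
    """Return (value, number_of_digits) of the n-th Fibonacci bit string."""
    a_val, a_len = 0, 1  # f(0) = "0"
    b_val, b_len = 1, 1  # f(1) = "1"
    if n == 0:
        return a_val, a_len
    for _ in range(1, n):
        # Concatenate b followed by a: shift b left by len(a), then OR in a.
        new_val = (b_val << a_len) | a_val
        new_len = b_len + a_len
        a_val, a_len, b_val, b_len = b_val, b_len, new_val, new_len
    return b_val, b_len

value, length = fib_bits(4)
print(value)                                 # 22
print(format(value, "0{}b".format(length)))  # 10110
```

The integer still has as many bits as the string has characters, so this saves a constant factor of memory rather than changing the growth rate.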
For really large n you will still have a memory issue, as shown in the plot (image not included here).
I finally got the answer, but can someone briefly explain why it works?
    def count(p, n):
        count = 0
        i = n.find(p)
        while i != -1:
            n = n[i + 1:]
            i = n.find(p)
            count += 1
        return count

    def occurence(p, n):
        a1 = "1"
        a0 = "0"
        lp = len(p)
        i = 1
        if n <= 5:
            return count(p, atring(n))
        while lp > len(a1):
            temp = a1
            a1 += a0
            a0 = temp
            i += 1
            if i >= n:
                return count(p, a1)
        fn = a1[:lp - 1]
        if -lp + 1 < 0:
            ln = a1[-lp + 1:]
        else:
            ln = ""
        countn = count(p, a1)
        a1 = a1 + a0
        i += 1
        if -lp + 1 < 0:
            lnp1 = a1[-lp + 1:]
        else:
            lnp1 = ""
        k = 0
        countn1 = count(p, a1)
        for j in range(i + 1, n + 1):
            temp = countn1
            countn1 += countn
            countn = temp
            if k % 2 == 0:
                string = lnp1 + fn
            else:
                string = ln + fn
            k += 1
            countn1 += count(p, string)
        return countn1

    def atring(n):
        a0 = "0"
        a1 = "1"
        if n == 0 or n == 1:
            return str(n)
        for i in range(2, n + 1):
            temp = a1
            a1 += a0
            a0 = temp
        return a1

    def fn():
        a = 100
        p = '10'
        print( occurence(p, a))

    if __name__ == "__main__":
        fn()
Here's the problem: I make a random choice between two elements n times (say from [0,1], giving 0 or 1), and my final list should have n/2 zeros and n/2 ones. I tend to get this kind of result: [0 1 0 0 0 1 0 1 1 1 1 1 1 0 0, until n]. The problem is that I don't want the same number 4 or 5 times in a row so often. I know that I could use a quasi-randomisation procedure, but I don't know how to do so (I'm using Python).
To guarantee that there will be the same number of zeros and ones you can generate a list containing n/2 zeros and n/2 ones and shuffle it with random.shuffle.
For small n, if you aren't happy that the result passes your acceptance criteria (e.g. not too many consecutive equal numbers), shuffle again. Be aware that doing this reduces the randomness of the result, not increases it.
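The shuffle-and-retry idea for small n could look roughly like this. It is a sketch only: the names longest_run and shuffle_until_ok, and the max_tries cutoff, are assumptions added for illustration.

```python
import random
from itertools import groupby

def longest_run(seq):
    """Length of the longest run of equal consecutive values."""
    return max((len(list(group)) for _, group in groupby(seq)), default=0)

def shuffle_until_ok(n, max_run, max_tries=10000):
    """Shuffle n//2 zeros and n - n//2 ones until no run exceeds max_run."""
    seq = [0] * (n // 2) + [1] * (n - n // 2)
    for _ in range(max_tries):
        random.shuffle(seq)
        if longest_run(seq) <= max_run:
            return seq
    return None  # gave up: the criteria are too strict for this n

print(shuffle_until_ok(20, 4))
```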
For larger n it will take too long to find a result that passes your criteria using this method (because most results will fail). Instead you could generate elements one at a time with these rules:
If you already generated 4 ones in a row the next number must be zero and vice versa.
Otherwise, if you need to generate x more ones and y more zeros, the chance of the next number being one is x/(x+y).
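The two rules above can be sketched as follows. The function name constrained_sequence and the max_run parameter are made up for this example. One caveat of this sketch: if only one value is left to hand out near the end, the run limit cannot be enforced and the tail may exceed it.

```python
import random

def constrained_sequence(n, max_run=4):
    """Emit n//2 zeros and n - n//2 ones, forcing a switch after max_run repeats."""
    left = [n // 2, n - n // 2]  # left[v] = how many of value v are still needed
    seq, run = [], 0
    for _ in range(n):
        x, y = left[1], left[0]  # x more ones, y more zeros
        if seq and run >= max_run and left[1 - seq[-1]] > 0:
            value = 1 - seq[-1]  # rule 1: break the run
        else:
            # rule 2: the next number is one with probability x / (x + y)
            value = 1 if random.random() < x / (x + y) else 0
        run = run + 1 if seq and value == seq[-1] else 1
        seq.append(value)
        left[value] -= 1
    return seq

print(constrained_sequence(20))
```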
You can use random.shuffle to randomize a list.
    import random

    n = 100
    seq = [0]*(n//2) + [1]*(n-n//2)
    random.shuffle(seq)
Now you can run through the list and whenever you see a run that's too long, swap an element to break up the sequence. I don't have any code for that part yet.
Having 6 1's in a row isn't particularly improbable -- are you sure you're not getting what you want?
There's a simple Python interface for a uniformly distributed random number, is that what you're looking for?
Here's my take on it. The first two functions are the actual implementation and the last function is for testing it.
The key is the first function, which looks at the last N elements of the list, where N+1 is the limit of how many times you want a number to appear in a row. It counts the number of ones that occur and then returns 1 with probability (1 - n/N), where n is the number of ones among those N elements. Note that this probability is 0 in the case of N consecutive ones and 1 in the case of N consecutive zeros.
Like a true random selection, there is no guarantee that the ratio of ones to zeros will be 1, but averaged out over thousands of runs, it does produce as many ones as zeros.
For longer lists, this will be better than repeatedly calling shuffle and checking that it satisfies your requirements.
    import random

    def next_value(selected):
        # Mathematically, this isn't necessary but it accounts for
        # potential problems with floating point numbers.
        if selected.count(0) == 0:
            return 0
        elif selected.count(1) == 0:
            return 1
        N = len(selected)
        selector = float(selected.count(1)) / N
        if random.uniform(0, 1) > selector:
            return 1
        else:
            return 0

    def get_sequence(N, max_run):
        lim = min(N, max_run - 1)
        seq = [random.choice((1, 0)) for _ in range(lim)]
        for _ in range(N - lim):
            seq.append(next_value(seq[-max_run+1:]))
        return seq

    def test(N, max_run, test_count):
        ones = 0.0
        zeros = 0.0
        for _ in range(test_count):
            seq = get_sequence(N, max_run)
            # Keep track of how many ones and zeros we're generating
            zeros += seq.count(0)
            ones += seq.count(1)
            # Make sure that the max_run isn't violated.
            counts = [0, 0]
            for i in seq:
                counts[i] += 1
                counts[not i] = 0
                if max_run in counts:
                    print(seq)
                    return
        # Print the ratio of zeros to ones. This should be around 1.
        print(zeros/ones)

    test(200, 5, 10000)
Probably not the smartest way, but it works for "no sequential runs", while not generating the same number of 0s and 1s. See below for version that fits all requirements.
    from random import choice

    CHOICES = (1, 0)

    def quasirandom(n, longest=3):
        serial = 0
        latest = 0
        result = []
        rappend = result.append
        for i in range(n):
            val = choice(CHOICES)
            if latest == val:
                serial += 1
            else:
                serial = 0
            if serial >= longest:
                # CHOICES[1] == 0 and CHOICES[0] == 1, so this flips the value
                val = CHOICES[val]
            rappend(val)
            latest = val
        return result

    print(quasirandom(10))
    print(quasirandom(100))
This one below corrects the filtering shuffle idea and works correctly AFAICT, with the caveat that the very last numbers might form a run. Pass debug=True to check that the requirements are met.
    from random import random
    from itertools import groupby # For testing the result
    try: xrange
    except NameError: xrange = range

    def generate_quasirandom(values, n, longest=3, debug=False):
        # Sanity check
        if len(values) < 2 or longest < 1:
            raise ValueError
        # Create a list with n * [val]
        source = []
        sourcelen = len(values) * n
        for val in values:
            source += [val] * n
        # For breaking runs
        serial = 0
        latest = None
        for i in xrange(sourcelen):
            # Pick something from source[i:]
            j = int(random() * (sourcelen - i)) + i
            if source[j] == latest:
                serial += 1
                if serial >= longest:
                    serial = 0
                    guard = 0
                    # We got a serial run, break it
                    while source[j] == latest:
                        j = int(random() * (sourcelen - i)) + i
                        guard += 1
                        # We just hit an infinite loop: there is no way to avoid a serial run
                        if guard > 10:
                            print("Unable to avoid serial run, disabling asserts.")
                            debug = False
                            break
            else:
                serial = 0
            latest = source[j]
            # Move the picked value into position i
            source[i], source[j] = source[j], source[i]
        # More sanity checks
        check_quasirandom(source, values, n, longest, debug)
        return source

    def check_quasirandom(shuffled, values, n, longest, debug):
        counts = []
        # We skip the last entries because breaking runs in them get too hairy
        for val, count in groupby(shuffled):
            counts.append(len(list(count)))
        highest = max(counts)
        print('Longest run: %d\nMax run length: %d' % (highest, longest))
        # Invariants
        assert len(shuffled) == len(values) * n
        for val in values:
            assert shuffled.count(val) == n
        if debug:
            # Only checked if we were able to avoid a sequential run >= longest
            assert highest <= longest

    for x in xrange(10, 1000):
        generate_quasirandom((0, 1, 2, 3), 1000, x//10, debug=True)