Let a, b, c be the digits of a three-digit number (e.g. 523 has a=5, b=2, c=3). I am trying to check whether abc == sqrt(a^b^c) for many values of a, b, c. (Note: abc = 523 stands for the number itself.)
I have tried this with Python, but for a>7 it already takes a significant amount of time to check just one digit combination. I have also tried rewriting the equality as nested logs, like log_c(log_b(log_a((abc)^2))) == 1, but I ran into math domain errors.
Is there a fast / better way to check this equality (preferably in Python)?
Note: Three digits are an example for StackOverflow. The goal is to test much higher powers with seven to ten digits (or more).
Here is the very basic piece of code I have used so far:
for a in range(1,10):
    for b in range(1,10):
        for c in range(1,10):
            N = a*10**2 + b*10 + c
            X = a**(b**c)
            if N == X:
                print a,b,c
The problem is that you are uselessly calculating very large integers, which takes a lot of time even though Python handles arbitrarily large integers natively.
You should limit the values of c you test.
If your largest possible number is 1000, you want a**b**c < 1000**2, so b**c < log(1000**2, a) = 2*log(1000, a), so c < log(2*log(1000, a), b)
Note that you should exclude a = 1, as any power of it is 1, and b = 1, as b^c would then be 1, and the whole expression is just a.
To test if the square root of a^b^c is abc, it's better to test if a^b^c is equal to the square of abc, in order to avoid using floats.
So here is the code; as expected it doesn't find any solution below 1000, but it runs very fast:
from math import log

for a in range(2, 10):
    for b in range(2, 10):
        for c in range(1, int(log(2*log(1000, a), b)) + 1):
            N2 = (a*100 + b*10 + c)**2
            X = a**(b**c)
            if N2 == X:
                print(a, b, c)
You are looking for numbers whose square root is equal to a three-digit integer. That means your X has to have at most 6 digits, or more precisely log10(X) < 6. Once your a gets larger, the potential solutions you're generating are much larger than that, so we can eliminate large swathes of them without needing to check them (or needing to calculate a ** b ** c, which can get very large: 9 ** 9 ** 9 has 369_693_100 DIGITS!).
log10(X) < 6 gives us log10(a ** b ** c) < 6 which is the same as b ** c * log10(a) < 6. Bringing it to the other side: log10(a) < 6 / b ** c, and then a < 10 ** (6 / b ** c). That means I know I don't need to check for any a that exceeds that. Correcting for an off-by-one error gives the solution:
for b in range(1, 10):
    for c in range(1, 10):
        t = b ** c
        for a in range(1, 1 + min(9, int(10 ** (6 / t)))):
            N = a * 100 + b * 10 + c
            X = a ** t
            if N * N == X:
                print(a, b, c)
Running this shows that there aren't any valid solutions to your equation, sadly!
a**(b**c) grows very quickly and will usually far exceed any three-digit number, so most of the calculations you are doing are wasted. To optimize your solution, do the following (a sketch is given below):
Iterate over all 3-digit numbers.
For each of these numbers, square it and check whether the square is a power of the first digit of the number.
For those that are, check whether the resulting exponent is in turn a power of the second digit.
And last, check whether that exponent is the third digit.
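For illustration, here is a minimal sketch of that approach (my own code, not the answerer's). exact_log is a helper I introduce for the example; it returns the exponent e with base**e == x, or None if none exists, and the last two checks are folded into the single comparison e == b ** c:

def exact_log(x, base):
    """Return e such that base**e == x, or None if no such integer e exists."""
    if base < 2:
        return None  # a first digit of 1 can never produce a six-digit square
    e = 0
    while x % base == 0:
        x //= base
        e += 1
    return e if x == 1 else None

for n in range(100, 1000):
    a, b, c = n // 100, (n // 10) % 10, n % 10
    e = exact_log(n * n, a)        # is n**2 an exact power of the first digit?
    if e is not None and e == b ** c:
        print(a, b, c)             # prints nothing: there is no solution below 1000

This never builds a**(b**c) itself, so even large exponents stay cheap.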
I need to create a function which takes one int argument and returns an int representing the number of ways the input integer can be partitioned into distinct parts. Namely,
input:3 -> output: 1 -> {1, 2}
input:6 -> output: 3 -> {1, 2, 3}, {2, 4}, {1, 5}
...
Since I am looking only for distinct parts, something like this is not allowed:
4 -> {1, 1, 1, 1} or {1, 1, 2}
So far I have managed to come up with some algorithms that find every possible combination, but they are pretty slow and only practical up to n=100 or so.
And since I only need the number of such partitions, not the partitions themselves, the Partition Function Q should solve the problem.
Does anybody know how to implement this efficiently?
More information about the problem: OEIS, Partition Function Q
EDIT:
To avoid any confusion: DarrylG's answer also counts the trivial (single-part) partition, but this does not affect its quality in any way.
EDIT 2:
jodag's (accepted) answer does not include the trivial partition.
Tested two algorithms
Simple recurrence relation
Wolfram MathWorld algorithm (based upon Georgiadis, Kediaya, Sloane)
Both implemented with memoization using lru_cache.
Result: the Wolfram MathWorld approach is orders of magnitude faster.
1. Simple recurrence relation (with Memoization)
Reference
Code
from functools import lru_cache

@lru_cache(maxsize=None)
def p(n, d=0):
    if n:
        return sum(p(n-k, n-2*k+1) for k in range(1, n-d+1))
    else:
        return 1
Performance
n      Time (sec)
10     0.0020
50     0.5530
100    8.7430
200    168.5830
2. Wolfram MathWorld algorithm
(based upon Georgiadis, Kediaya, Sloane)
Reference
Code
from functools import lru_cache
from math import sqrt

# Implementation of q recurrence
# https://mathworld.wolfram.com/PartitionFunctionQ.html
class PartitionQ():
    def __init__(self, MAXN):
        self.MAXN = MAXN
        self.j_seq = self.calc_j_seq(MAXN)

    @lru_cache(maxsize=None)
    def q(self, n):
        " Q strict partition function "
        assert n < self.MAXN
        if n == 0:
            return 1
        sqrt_n = int(sqrt(n)) + 1
        temp = sum(((-1)**(k+1))*self.q(n-k*k) for k in range(1, sqrt_n))
        return 2*temp + self.s(n)

    def s(self, n):
        if n in self.j_seq:
            return (-1)**self.j_seq[n]
        else:
            return 0

    def calc_j_seq(self, MAX_N):
        """ Used to determine if n is of the form j*(3*j (+/-) 1) / 2
            by creating a dictionary of n, j value pairs """
        result = {}
        j = 0
        valn = -1
        while valn <= MAX_N:
            jj = 3*j*j
            valp, valn = (jj - j)//2, (jj+j)//2
            result[valp] = j
            result[valn] = j
            j += 1
        return result
Performance
n      Time (sec)
10     0.00087
50     0.00059
100    0.00125
200    0.10933
Conclusion: This algorithm is orders of magnitude faster than the simple recurrence relationship
I think a straightforward and efficient way to solve this is to explicitly compute the coefficient of the generating function from the Wolfram PartitionsQ link in the original post.
This is a pretty illustrative example of how to construct generating functions and how they can be used to count solutions. To start, we recognize that the problem may be posed as follows:
Let m_1 + m_2 + ... + m_{n-1} = n where m_j = 0 or m_j = j for all j.
Q(n) is the number of solutions of the equation.
We can find Q(n) by constructing the following polynomial (i.e. the generating function)
(1 + x)(1 + x^2)(1 + x^3)...(1 + x^(n-1))
The number of solutions is the number of ways the terms combine to make x^n, i.e. the coefficient of x^n after expanding the polynomial. Therefore, we can solve the problem by simply performing the polynomial multiplication.
def Q(n):
    # Represent polynomial as a list of coefficients from x^0 to x^n.
    # G_0 = 1
    G = [int(g_pow == 0) for g_pow in range(n + 1)]
    for k in range(1, n):
        # G_k = G_{k-1} * (1 + x^k)
        # This is equivalent to adding G shifted to the right by k to G
        # Ignore powers greater than n since we don't need them.
        G = [G[g_pow] if g_pow - k < 0 else G[g_pow] + G[g_pow - k] for g_pow in range(n + 1)]
    return G[n]
Timing (average of 1000 iterations)
import time

print("n    Time (sec)")
for n in [10, 50, 100, 200, 300, 500, 1000]:
    t0 = time.time()
    for i in range(1000):
        Q(n)
    elapsed = time.time() - t0
    print('%-5d%.08f' % (n, elapsed / 1000))
n Time (sec)
10 0.00001000
50 0.00017500
100 0.00062900
200 0.00231200
300 0.00561900
500 0.01681900
1000 0.06701700
You can memoize the recurrences in equations 8, 9, and 10 of the MathWorld article you linked for a runtime quadratic in n.
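For illustration, here is a sketch of that idea. It memoizes a standard two-argument recurrence (the number of partitions of n into distinct parts, each at most k) rather than the exact equations referenced in that answer, but it has the same quadratic-in-n behaviour:

from functools import lru_cache

@lru_cache(maxsize=None)
def q(n, k):
    # number of partitions of n into distinct parts, each part at most k
    if n == 0:
        return 1                  # the empty partition
    if n < 0 or k <= 0:
        return 0
    # either k is not used at all, or it is used exactly once
    return q(n, k - 1) + q(n - k, k - 1)

def partitionQ(n):
    return q(n, n)                # counts the single-part partition {n} as well

print(partitionQ(10))             # 10

For large n you may need to raise Python's recursion limit or rewrite the recurrence as an iterative table.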
def partQ(n):
    result = []
    def rec(part, tgt, allowed):
        if tgt == 0:
            result.append(sorted(part))
        elif tgt > 0:
            for i in allowed:
                rec(part + [i], tgt - i, allowed - set(range(1, i + 1)))
    rec([], n, set(range(1, n)))
    return result
The work is done by the rec internal function, which takes:
part - a list of parts whose sum is always equal to or less than the target n
tgt - the remaining partial sum that needs to be added to the sum of part to get to n
allowed - a set of numbers still allowed to be used in the rest of the partition
When tgt = 0 is passed, it means the sum of part is n, and part is added to the result list. If tgt is still positive, each of the allowed numbers is attempted as an extension of part in a recursive call.
I found this task and I am completely stuck on its solution.
A non-empty zero-indexed string S consisting of Q characters is given. The period of this string is the smallest positive integer P such that:
P ≤ Q / 2 and S[K] = S[K+P] for 0 ≤ K < Q − P.
For example, 7 is the period of “abracadabracadabra”. A positive integer M is the binary period of a positive integer N if M is the period of the binary representation of N.
For example, 1651 has the binary representation "11001110011". Hence, its binary period is 5. On the other hand, 102 does not have a binary period, because its binary representation is “1100110” and it does not have a period.
Considering the above scenarios, write a function in Python that accepts an integer N as its parameter and returns the binary period of N, or −1 if N does not have one.
The attached code is still incorrect on some inputs (9, 11, 13, 17, etc.). The goal is to find and fix the bugs in the implementation. You can modify at most 2 lines.
def binary_period(n):
    d = [0] * 30
    l = 0
    while n > 0:
        d[l] = n % 2
        n //= 2
        l += 1
    for p in range(1, 1 + l):
        ok = True
        for i in range(l - p):
            if d[i] != d[i + p]:
                ok = False
                break
        if ok:
            return p
    return -1
I was given this piece of code in an interview.
The aim of the exercise is to find where the bug lies.
The function takes an integer as input and should return its binary period. For example, solution(4) first converts 4 to its binary representation, 100, which the loop stores in reverse order (least-significant bit first) as [0, 0, 1].
However, the question is the following: what is the bug?
The bug in this case is not code that crashes and burns; rather, it is behavior that should happen but, in this code, does not.
It is known as a logical error: the code does not break, but it does not fulfill the requirements.
Brute-forcing the code will not help, as there are billions of possibilities.
However, if you run the code, say from solution(1) to solution(100), you will see that it runs without any glitch. Yet, looking at the code, it should be returning -1 for some inputs.
The code never gives -1, even if you run it with a bigger number like 10000.
The bug here lies in the -1 that is never triggered.
So let's go step by step on the code.
Could it be the while part?
while n > 0:
    d[l] = n % 2
    n //= 2
    l += 1
If you look at this part, it is doing what it should be doing: converting the given number to binary, even though it stores the digits backwards. Instead of 1011 you get 1101, but it does the job.
The issue rather lies in this part:
for p in range(1, 1 + l):
    ok = True
    for i in range(l - p):
        if d[i] != d[i + p]:
            ok = False
            break
    if ok:
        return p
return -1
It is not returning -1.
If you put a print statement inside the loop, like this, you get a better view of what happens:
for p in range(1, 1 + l):
    ok = True
    for i in range(l - p):
        print('p from the outer loop, and l - p:', p, l - p)
        if d[i] != d[i + p]:
            ok = False
            break
    if ok:
        return p
return -1
If you run the whole script, you can see that the outer loop lets p grow all the way up to l. When p == l, the inner loop range(l - p) is empty, so ok is never set to False and p == l is returned; the final return -1 is never reached.
But why is that wrong?
The reason is that the definition requires P ≤ Q / 2, so the loop must only try periods up to half the length. Because of that, you need 1 + l//2 as the bound.
Which gives you the following
def solution(n):
    d = [0] * 30
    l = 0
    while n > 0:
        d[l] = n % 2
        n //= 2
        l += 1
    for p in range(1, 1 + l//2):  # here you put l//2
        ok = True
        print('p is', p)
        for i in range(l - p):
            if d[i] != d[i + p]:
                ok = False
                break
        if ok:
            return p
    return -1
Now if you run the code with solution(5), for example, the bug is fixed and you get -1.
Addendum:
This is a difficult test: the algorithm is not easy to reason about in a very short time, and the variable names do not help.
First step would be to ask the following questions:
What is the input of the algorithm? In this case, it is an integer.
What is the expected output? In this case, a -1
Is it a logical error or a crash and burn kind of error? In this case, it is a logical error.
These step-by-step heuristics will set you in the right direction when debugging a problem.
Following up on Andy's solution and checking @hdlopez's comment, there is a border case when passing int.MaxValue = 2147483647: if you do not increase the array size to 31 (instead of 30), the function throws an index-out-of-range error. So two places need to be modified:
1- int[] d = new int[31]; //changed 30 to 31 (unsigned integer)
2- for (p = 1; p < 1 + l / 2; ++p) //added division of l per the statement, P ≤ Q / 2
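For reference, here is a sketch of the fully repaired Python function, folding in both fixes discussed in the answers above (the 1 + l // 2 bound so that P ≤ Q / 2, and 31 bit slots so that values up to int.MaxValue fit):

def binary_period(n):
    d = [0] * 31                      # 31 bits instead of 30
    l = 0
    while n > 0:
        d[l] = n % 2
        n //= 2
        l += 1
    for p in range(1, 1 + l // 2):    # a period may not exceed half the length
        ok = True
        for i in range(l - p):
            if d[i] != d[i + p]:
                ok = False
                break
        if ok:
            return p
    return -1

print(binary_period(1651))   # 5
print(binary_period(102))    # -1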
I'm using the following code for finding primitive roots modulo n in Python:
Code:
def gcd(a, b):
    while b != 0:
        a, b = b, a % b
    return a

def primRoots(modulo):
    roots = []
    required_set = set(num for num in range(1, modulo) if gcd(num, modulo) == 1)
    for g in range(1, modulo):
        actual_set = set(pow(g, powers) % modulo for powers in range(1, modulo))
        if required_set == actual_set:
            roots.append(g)
    return roots

if __name__ == "__main__":
    p = 17
    primitive_roots = primRoots(p)
    print(primitive_roots)
Output:
[3, 5, 6, 7, 10, 11, 12, 14]
Code fragment extracted from: Diffie-Hellman (Github)
Can the primRoots method be simplified or optimized in terms of memory usage and performance/efficiency?
One quick change that you can make here (not yet optimally efficient) is to use list and set comprehensions:
def primRoots(modulo):
    coprime_set = {num for num in range(1, modulo) if gcd(num, modulo) == 1}
    return [g for g in range(1, modulo)
            if coprime_set == {pow(g, powers, modulo) for powers in range(1, modulo)}]
Now, one powerful and interesting algorithmic change that you can make here is to optimize your gcd function using memoization. Or, even better, you can simply use the built-in gcd function from the math module in Python 3.5+ (or from the fractions module in earlier versions):
from functools import wraps

def cache_gcd(f):
    cache = {}
    @wraps(f)
    def wrapped(a, b):
        key = (a, b)
        try:
            result = cache[key]
        except KeyError:
            result = cache[key] = f(a, b)
        return result
    return wrapped

@cache_gcd
def gcd(a, b):
    while b != 0:
        a, b = b, a % b
    return a

# or just do the following (recommended)
# from math import gcd
Then:
def primRoots(modulo):
    coprime_set = {num for num in range(1, modulo) if gcd(num, modulo) == 1}
    return [g for g in range(1, modulo)
            if coprime_set == {pow(g, powers, modulo) for powers in range(1, modulo)}]
As mentioned in the comments, a more Pythonic and optimized way is to use fractions.gcd (or, for Python 3.5+, math.gcd).
Based on the comment of Pete and answer of Kasramvd, I can suggest this:
from math import gcd as bltin_gcd

def primRoots(modulo):
    required_set = {num for num in range(1, modulo) if bltin_gcd(num, modulo) == 1}
    return [g for g in range(1, modulo)
            if required_set == {pow(g, powers, modulo) for powers in range(1, modulo)}]

print(primRoots(17))
Output:
[3, 5, 6, 7, 10, 11, 12, 14]
Changes:
It now uses pow's third argument for the modulus.
Switched to the built-in gcd function defined in math (Python 3.5+) for a speed boost.
Additional info about built-in gcd is here: Co-primes checking
In the special case that p is prime, the following is a good bit faster:
import sys

# translated to Python from http://www.bluetulip.org/2014/programs/primitive.js
# (some rights may remain with the author of the above javascript code)

def isNotPrime(possible):
    # We only test this here to protect people who copy and paste
    # the code without reading the first sentence of the answer.
    # In an application where you know the numbers are prime you
    # will remove this function (and the call). If you need to
    # test for primality, look for a more efficient algorithm, see
    # for example Joseph F's answer on this page.
    i = 2
    while i*i <= possible:
        if (possible % i) == 0:
            return True
        i = i + 1
    return False

def primRoots(theNum):
    if isNotPrime(theNum):
        raise ValueError("Sorry, the number must be prime.")
    o = 1
    roots = []
    r = 2
    while r < theNum:
        k = pow(r, o, theNum)
        while (k > 1):
            o = o + 1
            k = (k * r) % theNum
        if o == (theNum - 1):
            roots.append(r)
        o = 1
        r = r + 1
    return roots

print(primRoots(int(sys.argv[1])))
You can greatly improve your isNotPrime function by using a more efficient algorithm. You could double the speed by doing a special test for even numbers and then only testing odd numbers up to the square root, but this is still very inefficient compared to an algorithm such as the Miller Rabin test. This version in the Rosetta Code site will always give the correct answer for any number with fewer than 25 digits or so. For large primes, this will run in a tiny fraction of the time it takes to use trial division.
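As an illustration (this is a sketch of my own, not the Rosetta Code version linked above), here is a deterministic Miller-Rabin test written with the same convention as isNotPrime (it returns True when the number is not prime). The fixed witness set below is known to be sufficient for every input below roughly 3*10^23, which comfortably covers 64-bit values:

def is_not_prime(n):
    # Deterministic Miller-Rabin for n below ~3e23 using a fixed witness set.
    if n < 2:
        return True
    small_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
    for sp in small_primes:
        if n % sp == 0:
            return n != sp
    # write n - 1 as d * 2**s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in small_primes:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return True   # definitely composite
    return False          # prime (within the stated range)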
Also, you should avoid floating-point operations, such as the ** operator with a fractional exponent, for computing integer square roots in cases like this (even though the Rosetta Code that I just linked to does the same thing!). Things might work fine in a particular case, but it can be a subtle source of error when Python has to convert between floating point and integers, or when an integer is too large to represent exactly in floating point. There are efficient integer square root algorithms that you can use instead. Here's a simple one:
def int_sqrt(n):
    if n == 0:
        return 0
    x = n
    y = (x + n//x)//2
    while y < x:
        x = y
        y = (x + n//x)//2
    return x
These solutions are all inefficient in several ways. First of all, you do not need to iterate over all coprime remainders of n: you only need to check exponents that are divisors of Euler's totient of n. When n is prime, Euler's totient of n is n-1. So if n is prime, you factorize n-1 and test only those divisors, not every exponent. There is simple mathematics behind this.
Second, you need a better way of raising a number to a power when the exponent can be large: in Python you have pow(g, power, modulo), which reduces modulo the modulus at every step, keeping only the remainder (_ % modulo).
If you are going to implement the Diffie-Hellman algorithm, it is better to use safe primes. A prime p such that 2p+1 is also prime is a Sophie Germain prime, and 2p+1 is then called a safe prime. If you take n = 2p+1, then the divisors of n-1 (n is prime, so Euler's totient of n is n-1) are 1, 2, p and 2p. You only need to check whether g^2 or g^p is congruent to 1 modulo n: if either is, g is not a primitive root, so throw it away and try the next candidate g+1. If both g^2 and g^p are different from 1 modulo n, then g is a primitive root; this check guarantees that all powers below 2p give values different from 1 modulo n.
The example code uses a Sophie Germain prime p and the corresponding safe prime 2p+1, and calculates the primitive roots of that safe prime.
You can easily rework the code for any prime, or any other number, by adding a function to calculate Euler's totient and to find all divisors of that value. But this is only a demo, not complete code, and there might be better ways.
class SGPrime:
    '''
    This object expects a Sophie Germain prime p; it does not check that the input really is one.
    Euler's function of any prime n is n-1, and the order (see method get_order) of any coprime
    remainder of n can only be a divisor of that Euler's function value.
    '''
    def __init__(self, pSophieGermain):
        self.n = 2*pSophieGermain + 1
        # TODO! check that pSophieGermain is prime
        # TODO! check that n is also prime.
        # They both have to be primes, otherwise the code does not work!
        # Euler's function is n-1. TODO: for any n, calculate Euler's function of n.
        self.elrfunc = self.n - 1
        # All divisors of Euler's function value. TODO: for any n, get all divisors of that value.
        self.elrfunc_divisors = [1, 2, pSophieGermain, self.elrfunc]

    def get_order(self, r):
        '''
        Calculate the order of a number: the minimal power at which r is congruent to 1 modulo n.
        '''
        r = r % self.n
        for d in self.elrfunc_divisors:
            if pow(r, d, self.n) == 1:
                return d
        return 0  # no such order; not possible if n is prime - see Fermat's little theorem

    def is_primitive_root(self, r):
        '''
        Check if r is a primitive root modulo n. Such a root always exists if n is prime.
        '''
        return self.get_order(r) == self.elrfunc

    def find_all_primitive_roots(self, max_num_of_roots=None):
        '''
        Find all primitive roots (only for demo). If n is large the list is large; for DH or any
        other such algorithm it is better to stop at the first primitive root found.
        '''
        primitive_roots = []
        for g in range(1, self.n):
            if self.is_primitive_root(g):
                primitive_roots.append(g)
            if (max_num_of_roots is not None) and (len(primitive_roots) >= max_num_of_roots):
                break
        return primitive_roots

# demo, Sophie Germain prime
p = 20963
sggen = SGPrime(p)
print(f"Safe prime : {sggen.n}, and primitive roots of {sggen.n} are : ")
print(sggen.find_all_primitive_roots())
Regards
There is a puzzle which I am writing code to solve that goes as follows.
Consider a binary vector of length n that is initially all zeros. You choose a bit of the vector and set it to 1. Now a process starts that sets the bit that is the greatest distance from any 1 bit to 1 (or an arbitrary choice of furthest bit if there is more than one). This happens repeatedly with the rule that no two 1 bits can be next to each other. It terminates when there is no more space to place a 1 bit. The goal is to place the initial 1 bit so that as many bits as possible are set to 1 on termination.
Say n = 2. Then wherever we set the bit we end up with exactly one bit set.
For n = 3, if we set the first bit we get 101 in the end. But if we set the middle bit we get 010 which is not optimal.
For n = 4, whichever bit we set we end up with two set.
For n = 5, setting the first gives us 10101 with three bits set in the end.
For n = 7, we need to set the third bit to get 1010101 it seems.
I have written code to find the optimal value but it does not scale well to large n. My code starts to get slow around n = 1000 but I would like to solve the problem for n around 1 million.
#!/usr/bin/python

from __future__ import division
from math import *

def findloc(v):
    count = 0
    maxcount = 0
    id = -1
    for i in xrange(n):
        if (v[i] == 0):
            count += 1
        if (v[i] == 1):
            if (count > maxcount):
                maxcount = count
                id = i
            count = 0
    # Deal with vector ending in 0s
    if (2*count >= maxcount and count >= v.index(1) and count > 1):
        return n-1
    # Deal with vector starting in 0s
    if (2*v.index(1) >= maxcount and v.index(1) > 1):
        return 0
    if (maxcount <= 2):
        return -1
    return id - int(ceil(maxcount/2))

def addbits(v):
    id = findloc(v)
    if (id == -1):
        return v
    v[id] = 1
    return addbits(v)

# Set vector length
n = 21

max = 0
for i in xrange(n):
    v = [0]*n
    v[i] = 1
    v = addbits(v)
    score = sum([1 for j in xrange(n) if v[j] == 1])
    # print i, sum([1 for j in xrange(n) if v[j] == 1]), v
    if (score > max):
        max = score
print max
Latest answer (O(log n) complexity)
If we believe the conjecture by templatetypedef and Aleksi Torhamo (update: proof at the end of this post), there is a closed form solution count(n) calculable in O(log n) (or O(1) if we assume logarithm and bit shifting is O(1)):
Python:
from math import log

def count(n): # The count, using position k conjectured by templatetypedef
    k = p(n-1)+1
    count_left = k/2
    count_right = f(n-k+1)
    return count_left + count_right

def f(n): # The f function calculated using Aleksi Torhamo conjecture
    return max(p(n-1)/2 + 1, n-p(n-1))

def p(n): # The largest power of 2 not exceeding n
    return 1 << int(log(n,2)) if n > 0 else 0
C++:
int log(int n){ // Integer logarithm, by counting the number of leading 0
    return 31-__builtin_clz(n);
}

int p(int n){ // The largest power of 2 not exceeding n
    if(n==0) return 0;
    return 1<<log(n);
}

int f(int n){ // The f function calculated using Aleksi Torhamo conjecture
    int val0 = p(n-1);
    int val1 = val0/2+1;
    int val2 = n-val0;
    return val1>val2 ? val1 : val2;
}

int count(int n){ // The count, using position k conjectured by templatetypedef
    int k = p(n-1)+1;
    int count_left = k/2;
    int count_right = f(n-k+1);
    return count_left + count_right;
}
This code can calculate the result for n=100,000,000 (and even n=1e24 in Python!) correctly in no time[1].
I have tested the codes with various values for n (using my O(n) solution as the standard, see Old Answer section below), and they still seem correct.
This code relies on the two conjectures by templatetypedef and Aleksi Torhamo[2]. Anyone want to prove those? =D (Update 2: PROVEN)
[1] By "no time", I mean almost instantly.
[2] The conjecture by Aleksi Torhamo on the f function has been empirically verified for n <= 100,000,000.
Old answer (O(n) complexity)
I can return the count for n=1,000,000 (the result is 475712) in 1.358s (on my iMac) using Python 2.7. Update: It's 0.198s for n=10,000,000 in C++. =)
Here is my idea, which achieves O(n) time complexity.
The Algorithm
Definition of f(n)
Define f(n) as the number of bits that will be set in a bit vector of length n, assuming that the first and last bits are set (except for n=2, where only the first or the last bit is set). So we know some values of f(n) as follows:
f(1) = 1
f(2) = 1
f(3) = 2
f(4) = 2
f(5) = 3
Note that this is different from the value that we are looking for, since the initial bit might not be at the first or last, as calculated by f(n). For example, we have f(7)=3 instead of 4.
Note that this can be calculated rather efficiently (amortized O(n) to calculate all values of f up to n) using the recurrence relation:
f(2n) = f(n)+f(n+1)-1
f(2n+1) = 2*f(n+1)-1
for n>=5, since the next bit set following the rule will be the middle bit, except for n=1,2,3,4. Then we can split the bitvector into two parts, each independent of each other, and so we can calculate the number of bits set using f( floor(n/2) ) + f( ceil(n/2) ) - 1, as illustrated below:
n=11                 n=13
10000100001          1000001000001
<---->               <----->
 f(6)<---->           f(7)<----->
       f(6)                 f(7)

n=12                 n=14
100001000001         10000010000001
<---->               <----->
 f(6)<----->          f(7)<------>
       f(7)                 f(8)
We have the -1 in the formula to avoid double-counting the middle bit.
Now we are ready to count the solution of original problem.
Definition of g(n,i)
Define g(n,i) as the number of bits that will be set in a bit vector of length n, following the rules in the problem, where the initial bit is at the i-th position (1-based). Note that by symmetry the initial bit can be anywhere from the first bit up to the ceil(n/2)-th bit. For those cases, note that the first bit will be set before any bit between the first bit and the initial bit, and likewise for the last bit. Therefore the numbers of bits set in the first partition and the second partition are f(i) and f(n+1-i), respectively.
So the value of g(n,i) can be calculated as:
g(n,i) = f(i) + f(n+1-i) - 1
following the idea when calculating f(n).
Now, to calculate the final result is trivial.
Definition of g(n)
Define g(n) as the count being looked for in the original problem. We can then take the maximum of all possible i, the position of initial bit:
g(n) = max_{i=1..ceil(n/2)} ( f(i) + f(n+1-i) - 1 )
Python code:
import time

mem_f = [0,1,1,2,2]
mem_f.extend([-1]*(10**7)) # This will take around 40MB of memory

def f(n):
    global mem_f
    if mem_f[n]>-1:
        return mem_f[n]
    if n%2==1:
        mem_f[n] = 2*f((n+1)/2)-1
        return mem_f[n]
    else:
        half = n/2
        mem_f[n] = f(half)+f(half+1)-1
        return mem_f[n]

def g(n):
    return max(f(i)+f(n+1-i)-1 for i in range(1,(n+1)/2 + 1))

def main():
    while True:
        n = input('Enter n (1 <= n <= 10,000,000; 0 to stop): ')
        if n==0: break
        start_time = time.time()
        print 'g(%d) = %d, in %.3fs' % (n, g(n), time.time()-start_time)

if __name__=='__main__':
    main()
Complexity Analysis
Now, the interesting thing is, what is the complexity of calculating g(n) with the method described above?
We should first note that we iterate over n/2 values of i, the position of the initial bit, and in each iteration we call f(i) and f(n+1-i). Naive analysis would suggest O(n) iterations times the cost of one call to f, but since f is memoized, each value f(i) is calculated at most once. So the complexity is really the O(n) loop plus the time required to calculate all values of f up to n.
So what's the complexity of initializing f(n)?
We can assume that we precompute every value of f(n) first before calculating g(n). Note that due to the recurrence relation and the memoization, generating the whole values of f(n) takes O(n) time. And the next call to f(n) will take O(1) time.
So, the overall complexity is O(n+n) = O(n), as evidenced by these running times on my iMac for n=1,000,000 and n=10,000,000:
> python max_vec_bit.py
Enter n (1 <= n <= 10,000,000; 0 to stop): 1000000
g(1000000) = 475712, in 1.358s
Enter n (1 <= n <= 10,000,000; 0 to stop): 0
>
> <restarted the program to remove the effect of memoization>
>
> python max_vec_bit.py
Enter n (1 <= n <= 10,000,000; 0 to stop): 10000000
g(10000000) = 4757120, in 13.484s
Enter n (1 <= n <= 10,000,000; 0 to stop): 6745231
g(6745231) = 3145729, in 3.072s
Enter n (1 <= n <= 10,000,000; 0 to stop): 0
And as a by-product of the memoization, the calculation for smaller values of n will be much faster after the first call with a large n, as you can also see in the sample run. With a language better suited for number crunching, such as C++, you can get a significantly faster running time.
I hope this helps. =)
The code using C++, for performance improvement
The result in C++ is about 68x faster (measured by clock()):
> ./a.out
Enter n (1 <= n <= 10,000,000; 0 to stop): 1000000
g(1000000) = 475712, in 0.020s
Enter n (1 <= n <= 10,000,000; 0 to stop): 0
>
> <restarted the program to remove the effect of memoization>
>
> ./a.out
Enter n (1 <= n <= 10,000,000; 0 to stop): 10000000
g(10000000) = 4757120, in 0.198s
Enter n (1 <= n <= 10,000,000; 0 to stop): 6745231
g(6745231) = 3145729, in 0.047s
Enter n (1 <= n <= 10,000,000; 0 to stop): 0
Code in C++:
#include <cstdio>
#include <cstring>
#include <ctime>
int mem_f[10000001];
int f(int n){
if(mem_f[n]>-1)
return mem_f[n];
if(n%2==1){
mem_f[n] = 2*f((n+1)/2)-1;
return mem_f[n];
} else {
int half = n/2;
mem_f[n] = f(half)+f(half+1)-1;
return mem_f[n];
}
}
int g(int n){
int result = 0;
for(int i=1; i<=(n+1)/2; i++){
int cnt = f(i)+f(n+1-i)-1;
result = (cnt > result ? cnt : result);
}
return result;
}
int main(){
memset(mem_f,-1,sizeof(mem_f));
mem_f[0] = 0;
mem_f[1] = mem_f[2] = 1;
mem_f[3] = mem_f[4] = 2;
clock_t start, end;
while(true){
int n;
printf("Enter n (1 <= n <= 10,000,000; 0 to stop): ");
scanf("%d",&n);
if(n==0) break;
start = clock();
int result = g(n);
end = clock();
printf("g(%d) = %d, in %.3fs\n",n,result,((double)(end-start))/CLOCKS_PER_SEC);
}
}
Proof
Note that, for the sake of keeping this answer (which is already very long) simple, I've skipped some steps in the proof.
Conjecture of Aleksi Torhamo on the value of f
For n >= 1, prove that:

f(2^n + k) = 2^(n-1) + 1   for k = 1, 2, …, 2^(n-1)       ...(1)
f(2^n + k) = k             for k = 2^(n-1) + 1, …, 2^n    ...(2)

given f(0) = f(1) = f(2) = 1
The result above can be easily proven using induction on the recurrence relation, by considering the four cases:
Case 1: (1) for even k
Case 2: (1) for odd k
Case 3: (2) for even k
Case 4: (2) for odd k
Suppose we have the four cases proven for n. Now consider n+1.
Case 1:
f(2^(n+1) + 2i) = f(2^n + i) + f(2^n + i + 1) - 1,   for i = 1, …, 2^(n-1)
                = (2^(n-1) + 1) + (2^(n-1) + 1) - 1
                = 2^n + 1

Case 2:
f(2^(n+1) + 2i + 1) = 2*f(2^n + i + 1) - 1,   for i = 0, …, 2^(n-1) - 1
                    = 2*(2^(n-1) + 1) - 1
                    = 2^n + 1

Case 3:
f(2^(n+1) + 2i) = f(2^n + i) + f(2^n + i + 1) - 1,   for i = 2^(n-1) + 1, …, 2^n
                = i + (i + 1) - 1
                = 2i

Case 4:
f(2^(n+1) + 2i + 1) = 2*f(2^n + i + 1) - 1,   for i = 2^(n-1) + 1, …, 2^n - 1
                    = 2*(i + 1) - 1
                    = 2i + 1
So by induction the conjecture is proven.
Conjecture of templatetypedef on the best position
For n >= 1 and k = 1, …, 2^n, prove that g(2^n + k) = g(2^n + k, 2^n + 1)
That is, prove that placing the first bit on the (2^n + 1)-th position gives the maximum number of bits set.
The proof:
First, we have
g(2^n + k, 2^n + 1) = f(2^n + 1) + f(k - 1) - 1

Next, by the formula of f, we have the following equalities:

f(2^n + 1 - i) = f(2^n + 1),             for i = -2^(n-1), …, -1
f(2^n + 1 - i) = f(2^n + 1) - i,         for i = 1, …, 2^(n-2) - 1
f(2^n + 1 - i) = f(2^n + 1) - 2^(n-2),   for i = 2^(n-2), …, 2^(n-1)

and also the following inequalities:

f(k - 1 + i) <= f(k - 1),             for i = -2^(n-1), …, -1
f(k - 1 + i) <= f(k - 1) + i,         for i = 1, …, 2^(n-2) - 1
f(k - 1 + i) <= f(k - 1) + 2^(n-2),   for i = 2^(n-2), …, 2^(n-1)

and so we have:

f(2^n + 1 - i) + f(k - 1 + i) <= f(2^n + 1) + f(k - 1),   for i = -2^(n-1), …, 2^(n-1)

Now, note that we have:

g(2^n + k) = max_{i=1..ceil((2^n + k)/2)} ( f(i) + f(2^n + k + 1 - i) - 1 )
           <= f(2^n + 1) + f(k - 1) - 1
           = g(2^n + k, 2^n + 1)
And so the conjecture is proven.
So in a break with my normal tradition of not posting algorithms I don't have a proof for, I think I should mention that there's an algorithm that appears to be correct for numbers up to 50,000+ and runs in O(log n) time. This is due to Sophia Westwood, who I worked on this problem with for about three hours today. All credit for this is due to her. Empirically it seems to work beautifully, and it's much, much faster than the O(n) solutions.
One observation about the structure of this problem is that if n is sufficiently large (n ≥ 5), then if you put a 1 anywhere, the problem splits into two subproblems, one to the left of the 1 and one to the right. Although the 1s might be placed in the different halves at different times, the eventual placement is the same as if you solved each half separately and combined them back together.
The next observation is this: suppose you have an array of size 2^k + 1 for some k. In that case, suppose that you put a 1 on either side of the array. Then:
The next 1 is placed on the other side of the array.
The next 1 is placed in the middle.
You now have two smaller subproblems of size 2^(k-1) + 1.
The important part about this is that the resulting bit pattern is an alternating series of 1s and 0s. For example:
For 5 = 4 + 1, we get 10101
For 9 = 8 + 1, we get 101010101
For 17 = 16 + 1, we get 10101010101010101
The reason this matters is the following: suppose you have n total elements in the array and let k be the largest possible value for which 2^k + 1 ≤ n. If you place the 1 at position 2^k + 1, then the left part of the array up to that position will end up getting tiled with alternating 1s and 0s, which puts a lot of 1s into the array.
What's not obvious is that placing the 1 bit there, for all numbers up to 50,000, appears to yield an optimal solution! I've written a Python script that checks this (using a recurrence relation similar to the one @justhalf uses) and it seems to work well. The reason that this fact is so useful is that it's really easy to compute this index. In particular, if 2^k + 1 ≤ n, then 2^k ≤ n - 1, so k ≤ lg(n - 1). Choosing the value ⌊lg(n - 1)⌋ as your choice of k then lets you compute the bit index by computing 2^k + 1. This value of k can be computed in O(log n) time and the exponentiation can be done in O(log n) time as well, so the total runtime is Θ(log n).
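For concreteness, here is a minimal sketch of that index computation (my own illustration, assuming the conjecture holds), using bit_length to get ⌊lg(n - 1)⌋ without floating point:

def first_one_position(n):
    # 1-based index 2**k + 1, with k the largest value satisfying 2**k + 1 <= n
    assert n >= 2
    k = (n - 1).bit_length() - 1    # floor(log2(n - 1))
    return (1 << k) + 1

print(first_one_position(7))        # 5, the mirror image of the position-3 choice from the question
print(first_one_position(1000000))  # 524289, i.e. 2**19 + 1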
The only issue is that I haven't formally proven that this works. All I know is that it's right for the first 50,000 values we've tried. :-)
Hope this helps!
I'll attach what I have. Same as yours, alas, time is basically O(n**3). But at least it avoids recursion (etc), so won't blow up when you get near a million ;-) Note that this returns the best vector found, not the count; e.g.,
>>> solve(23)
[6, 0, 11, 0, 1, 0, 0, 10, 0, 5, 0, 9, 0, 3, 0, 0, 8, 0, 4, 0, 7, 0, 2]
So it also shows the order in which the 1 bits were chosen. The easiest way to get the count is to pass the result to max().
>>> max(solve(23))
11
Or change the function to return maxsofar instead of best.
If you want to run numbers on the order of a million, you'll need something radically different. You can't even afford quadratic time for that (let alone this approach's cubic time). Unlikely to get such a huge O() improvement from fancier data structures - I expect it would require deeper insight into the mathematics of the problem.
def solve(n):
    maxsofar, best = 1, [1] + [0] * (n-1)
    # by symmetry, no use trying starting points in last half
    # (would be a mirror image).
    for i in xrange((n + 1)//2):
        v = [0] * n
        v[i] = count = 1
        # d21[i] = distance to closest 1 from index i
        d21 = range(i, 0, -1) + range(n-i)
        while 1:
            d, j = max((d, j) for j, d in enumerate(d21))
            if d >= 2:
                count += 1
                v[j] = count
                d21[j] = 0
                k = 1
                while j-k >= 0 and d21[j-k] > k:
                    d21[j-k] = k
                    k += 1
                k = 1
                while j+k < n and d21[j+k] > k:
                    d21[j+k] = k
                    k += 1
            else:
                if count > maxsofar:
                    maxsofar = count
                    best = v[:]
                break
    return best
return best