Related
How could I check if a number is a perfect square?
Speed is of no concern, for now, just working.
See also: Integer square root in python.
The problem with relying on any floating point computation (math.sqrt(x), or x**0.5) is that you can't really be sure it's exact (for sufficiently large integers x, it won't be, and might even overflow). Fortunately (if one's in no hurry;-) there are many pure integer approaches, such as the following...:
def is_square(apositiveint):
x = apositiveint // 2
seen = set([x])
while x * x != apositiveint:
x = (x + (apositiveint // x)) // 2
if x in seen: return False
seen.add(x)
return True
for i in range(110, 130):
print i, is_square(i)
Hint: it's based on the "Babylonian algorithm" for square root, see wikipedia. It does work for any positive number for which you have enough memory for the computation to proceed to completion;-).
Edit: let's see an example...
x = 12345678987654321234567 ** 2
for i in range(x, x+2):
print i, is_square(i)
this prints, as desired (and in a reasonable amount of time, too;-):
152415789666209426002111556165263283035677489 True
152415789666209426002111556165263283035677490 False
Please, before you propose solutions based on floating point intermediate results, make sure they work correctly on this simple example -- it's not that hard (you just need a few extra checks in case the sqrt computed is a little off), just takes a bit of care.
And then try with x**7 and find clever way to work around the problem you'll get,
OverflowError: long int too large to convert to float
you'll have to get more and more clever as the numbers keep growing, of course.
If I was in a hurry, of course, I'd use gmpy -- but then, I'm clearly biased;-).
>>> import gmpy
>>> gmpy.is_square(x**7)
1
>>> gmpy.is_square(x**7 + 1)
0
Yeah, I know, that's just so easy it feels like cheating (a bit the way I feel towards Python in general;-) -- no cleverness at all, just perfect directness and simplicity (and, in the case of gmpy, sheer speed;-)...
Use Newton's method to quickly zero in on the nearest integer square root, then square it and see if it's your number. See isqrt.
Python ≥ 3.8 has math.isqrt. If using an older version of Python, look for the "def isqrt(n)" implementation here.
import math
def is_square(i: int) -> bool:
return i == math.isqrt(i) ** 2
Since you can never depend on exact comparisons when dealing with floating point computations (such as these ways of calculating the square root), a less error-prone implementation would be
import math
def is_square(integer):
root = math.sqrt(integer)
return integer == int(root + 0.5) ** 2
Imagine integer is 9. math.sqrt(9) could be 3.0, but it could also be something like 2.99999 or 3.00001, so squaring the result right off isn't reliable. Knowing that int takes the floor value, increasing the float value by 0.5 first means we'll get the value we're looking for if we're in a range where float still has a fine enough resolution to represent numbers near the one for which we are looking.
If youre interested, I have a pure-math response to a similar question at math stackexchange, "Detecting perfect squares faster than by extracting square root".
My own implementation of isSquare(n) may not be the best, but I like it. Took me several months of study in math theory, digital computation and python programming, comparing myself to other contributors, etc., to really click with this method. I like its simplicity and efficiency though. I havent seen better. Tell me what you think.
def isSquare(n):
## Trivial checks
if type(n) != int: ## integer
return False
if n < 0: ## positivity
return False
if n == 0: ## 0 pass
return True
## Reduction by powers of 4 with bit-logic
while n&3 == 0:
n=n>>2
## Simple bit-logic test. All perfect squares, in binary,
## end in 001, when powers of 4 are factored out.
if n&7 != 1:
return False
if n==1:
return True ## is power of 4, or even power of 2
## Simple modulo equivalency test
c = n%10
if c in {3, 7}:
return False ## Not 1,4,5,6,9 in mod 10
if n % 7 in {3, 5, 6}:
return False ## Not 1,2,4 mod 7
if n % 9 in {2,3,5,6,8}:
return False
if n % 13 in {2,5,6,7,8,11}:
return False
## Other patterns
if c == 5: ## if it ends in a 5
if (n//10)%10 != 2:
return False ## then it must end in 25
if (n//100)%10 not in {0,2,6}:
return False ## and in 025, 225, or 625
if (n//100)%10 == 6:
if (n//1000)%10 not in {0,5}:
return False ## that is, 0625 or 5625
else:
if (n//10)%4 != 0:
return False ## (4k)*10 + (1,9)
## Babylonian Algorithm. Finding the integer square root.
## Root extraction.
s = (len(str(n))-1) // 2
x = (10**s) * 4
A = {x, n}
while x * x != n:
x = (x + (n // x)) >> 1
if x in A:
return False
A.add(x)
return True
Pretty straight forward. First it checks that we have an integer, and a positive one at that. Otherwise there is no point. It lets 0 slip through as True (necessary or else next block is infinite loop).
The next block of code systematically removes powers of 4 in a very fast sub-algorithm using bit shift and bit logic operations. We ultimately are not finding the isSquare of our original n but of a k<n that has been scaled down by powers of 4, if possible. This reduces the size of the number we are working with and really speeds up the Babylonian method, but also makes other checks faster too.
The third block of code performs a simple Boolean bit-logic test. The least significant three digits, in binary, of any perfect square are 001. Always. Save for leading zeros resulting from powers of 4, anyway, which has already been accounted for. If it fails the test, you immediately know it isnt a square. If it passes, you cant be sure.
Also, if we end up with a 1 for a test value then the test number was originally a power of 4, including perhaps 1 itself.
Like the third block, the fourth tests the ones-place value in decimal using simple modulus operator, and tends to catch values that slip through the previous test. Also a mod 7, mod 8, mod 9, and mod 13 test.
The fifth block of code checks for some of the well-known perfect square patterns. Numbers ending in 1 or 9 are preceded by a multiple of four. And numbers ending in 5 must end in 5625, 0625, 225, or 025. I had included others but realized they were redundant or never actually used.
Lastly, the sixth block of code resembles very much what the top answerer - Alex Martelli - answer is. Basically finds the square root using the ancient Babylonian algorithm, but restricting it to integer values while ignoring floating point. Done both for speed and extending the magnitudes of values that are testable. I used sets instead of lists because it takes far less time, I used bit shifts instead of division by two, and I smartly chose an initial start value much more efficiently.
By the way, I did test Alex Martelli's recommended test number, as well as a few numbers many orders magnitude larger, such as:
x=1000199838770766116385386300483414671297203029840113913153824086810909168246772838680374612768821282446322068401699727842499994541063844393713189701844134801239504543830737724442006577672181059194558045164589783791764790043104263404683317158624270845302200548606715007310112016456397357027095564872551184907513312382763025454118825703090010401842892088063527451562032322039937924274426211671442740679624285180817682659081248396873230975882215128049713559849427311798959652681930663843994067353808298002406164092996533923220683447265882968239141724624870704231013642255563984374257471112743917655991279898690480703935007493906644744151022265929975993911186879561257100479593516979735117799410600147341193819147290056586421994333004992422258618475766549646258761885662783430625 ** 2
for i in range(x, x+2):
print(i, isSquare(i))
printed the following results:
1000399717477066534083185452789672211951514938424998708930175541558932213310056978758103599452364409903384901149641614494249195605016959576235097480592396214296565598519295693079257885246632306201885850365687426564365813280963724310434494316592041592681626416195491751015907716210235352495422858432792668507052756279908951163972960239286719854867504108121432187033786444937064356645218196398775923710931242852937602515835035177768967470757847368349565128635934683294155947532322786360581473152034468071184081729335560769488880138928479829695277968766082973795720937033019047838250608170693879209655321034310764422462828792636246742456408134706264621790736361118589122797268261542115823201538743148116654378511916000714911467547209475246784887830649309238110794938892491396597873160778553131774466638923135932135417900066903068192088883207721545109720968467560224268563643820599665232314256575428214983451466488658896488012211237139254674708538347237589290497713613898546363590044902791724541048198769085430459186735166233549186115282574626012296888817453914112423361525305960060329430234696000121420787598967383958525670258016851764034555105019265380321048686563527396844220047826436035333266263375049097675787975100014823583097518824871586828195368306649956481108708929669583308777347960115138098217676704862934389659753628861667169905594181756523762369645897154232744410732552956489694024357481100742138381514396851789639339362228442689184910464071202445106084939268067445115601375050153663645294106475257440167535462278022649865332161044187890625 True
1000399717477066534083185452789672211951514938424998708930175541558932213310056978758103599452364409903384901149641614494249195605016959576235097480592396214296565598519295693079257885246632306201885850365687426564365813280963724310434494316592041592681626416195491751015907716210235352495422858432792668507052756279908951163972960239286719854867504108121432187033786444937064356645218196398775923710931242852937602515835035177768967470757847368349565128635934683294155947532322786360581473152034468071184081729335560769488880138928479829695277968766082973795720937033019047838250608170693879209655321034310764422462828792636246742456408134706264621790736361118589122797268261542115823201538743148116654378511916000714911467547209475246784887830649309238110794938892491396597873160778553131774466638923135932135417900066903068192088883207721545109720968467560224268563643820599665232314256575428214983451466488658896488012211237139254674708538347237589290497713613898546363590044902791724541048198769085430459186735166233549186115282574626012296888817453914112423361525305960060329430234696000121420787598967383958525670258016851764034555105019265380321048686563527396844220047826436035333266263375049097675787975100014823583097518824871586828195368306649956481108708929669583308777347960115138098217676704862934389659753628861667169905594181756523762369645897154232744410732552956489694024357481100742138381514396851789639339362228442689184910464071202445106084939268067445115601375050153663645294106475257440167535462278022649865332161044187890626 False
And it did this in 0.33 seconds.
In my opinion, my algorithm works the same as Alex Martelli's, with all the benefits thereof, but has the added benefit highly efficient simple-test rejections that save a lot of time, not to mention the reduction in size of test numbers by powers of 4, which improves speed, efficiency, accuracy and the size of numbers that are testable. Probably especially true in non-Python implementations.
Roughly 99% of all integers are rejected as non-Square before Babylonian root extraction is even implemented, and in 2/3 the time it would take the Babylonian to reject the integer. And though these tests dont speed up the process that significantly, the reduction in all test numbers to an odd by dividing out all powers of 4 really accelerates the Babylonian test.
I did a time comparison test. I tested all integers from 1 to 10 Million in succession. Using just the Babylonian method by itself (with my specially tailored initial guess) it took my Surface 3 an average of 165 seconds (with 100% accuracy). Using just the logical tests in my algorithm (excluding the Babylonian), it took 127 seconds, it rejected 99% of all integers as non-Square without mistakenly rejecting any perfect squares. Of those integers that passed, only 3% were perfect Squares (a much higher density). Using the full algorithm above that employs both the logical tests and the Babylonian root extraction, we have 100% accuracy, and test completion in only 14 seconds. The first 100 Million integers takes roughly 2 minutes 45 seconds to test.
EDIT: I have been able to bring down the time further. I can now test the integers 0 to 100 Million in 1 minute 40 seconds. A lot of time is wasted checking the data type and the positivity. Eliminate the very first two checks and I cut the experiment down by a minute. One must assume the user is smart enough to know that negatives and floats are not perfect squares.
import math
def is_square(n):
sqrt = math.sqrt(n)
return (sqrt - int(sqrt)) == 0
A perfect square is a number that can be expressed as the product of two equal integers. math.sqrt(number) return a float. int(math.sqrt(number)) casts the outcome to int.
If the square root is an integer, like 3, for example, then math.sqrt(number) - int(math.sqrt(number)) will be 0, and the if statement will be False. If the square root was a real number like 3.2, then it will be True and print "it's not a perfect square".
It fails for a large non-square such as 152415789666209426002111556165263283035677490.
My answer is:
def is_square(x):
return x**.5 % 1 == 0
It basically does a square root, then modulo by 1 to strip the integer part and if the result is 0 return True otherwise return False. In this case x can be any large number, just not as large as the max float number that python can handle: 1.7976931348623157e+308
It is incorrect for a large non-square such as 152415789666209426002111556165263283035677490.
This can be solved using the decimal module to get arbitrary precision square roots and easy checks for "exactness":
import math
from decimal import localcontext, Context, Inexact
def is_perfect_square(x):
# If you want to allow negative squares, then set x = abs(x) instead
if x < 0:
return False
# Create localized, default context so flags and traps unset
with localcontext(Context()) as ctx:
# Set a precision sufficient to represent x exactly; `x or 1` avoids
# math domain error for log10 when x is 0
ctx.prec = math.ceil(math.log10(x or 1)) + 1 # Wrap ceil call in int() on Py2
# Compute integer square root; don't even store result, just setting flags
ctx.sqrt(x).to_integral_exact()
# If previous line couldn't represent square root as exact int, sets Inexact flag
return not ctx.flags[Inexact]
For demonstration with truly huge values:
# I just kept mashing the numpad for awhile :-)
>>> base = 100009991439393999999393939398348438492389402490289028439083249803434098349083490340934903498034098390834980349083490384903843908309390282930823940230932490340983098349032098324908324098339779438974879480379380439748093874970843479280329708324970832497804329783429874329873429870234987234978034297804329782349783249873249870234987034298703249780349783497832497823497823497803429780324
>>> sqr = base ** 2
>>> sqr ** 0.5 # Too large to use floating point math
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: int too large to convert to float
>>> is_perfect_power(sqr)
True
>>> is_perfect_power(sqr-1)
False
>>> is_perfect_power(sqr+1)
False
If you increase the size of the value being tested, this eventually gets rather slow (takes close to a second for a 200,000 bit square), but for more moderate numbers (say, 20,000 bits), it's still faster than a human would notice for individual values (~33 ms on my machine). But since speed wasn't your primary concern, this is a good way to do it with Python's standard libraries.
Of course, it would be much faster to use gmpy2 and just test gmpy2.mpz(x).is_square(), but if third party packages aren't your thing, the above works quite well.
I just posted a slight variation on some of the examples above on another thread (Finding perfect squares) and thought I'd include a slight variation of what I posted there here (using nsqrt as a temporary variable), in case it's of interest / use:
import math
def is_square(n):
if not (isinstance(n, int) and (n >= 0)):
return False
else:
nsqrt = math.sqrt(n)
return nsqrt == math.trunc(nsqrt)
It is incorrect for a large non-square such as 152415789666209426002111556165263283035677490.
A variant of #Alex Martelli's solution without set
When x in seen is True:
In most cases, it is the last one added, e.g. 1022 produces the x's sequence 511, 256, 129, 68, 41, 32, 31, 31;
In some cases (i.e., for the predecessors of perfect squares), it is the second-to-last one added, e.g. 1023 produces 511, 256, 129, 68, 41, 32, 31, 32.
Hence, it suffices to stop as soon as the current x is greater than or equal to the previous one:
def is_square(n):
assert n > 1
previous = n
x = n // 2
while x * x != n:
x = (x + (n // x)) // 2
if x >= previous:
return False
previous = x
return True
x = 12345678987654321234567 ** 2
assert not is_square(x-1)
assert is_square(x)
assert not is_square(x+1)
Equivalence with the original algorithm tested for 1 < n < 10**7. On the same interval, this slightly simpler variant is about 1.4 times faster.
This is my method:
def is_square(n) -> bool:
return int(n**0.5)**2 == int(n)
Take square root of number. Convert to integer. Take the square. If the numbers are equal, then it is a perfect square otherwise not.
It is incorrect for a large square such as 152415789666209426002111556165263283035677489.
If the modulus (remainder) leftover from dividing by the square root is 0, then it is a perfect square.
def is_square(num: int) -> bool:
return num % math.sqrt(num) == 0
I checked this against a list of perfect squares going up to 1000.
It is possible to improve the Babylonian method by observing that the successive terms form a decreasing sequence if one starts above the square root of n.
def is_square(n):
assert n > 1
a = n
b = (a + n // a) // 2
while b < a:
a = b
b = (a + n // a) // 2
return a * a == n
If it's a perfect square, its square root will be an integer, the fractional part will be 0, we can use modulus operator to check fractional part, and check if it's 0, it does fail for some numbers, so, for safety, we will also check if it's square of the square root even if the fractional part is 0.
import math
def isSquare(n):
root = math.sqrt(n)
if root % 1 == 0:
if int(root) * int(root) == n:
return True
return False
isSquare(4761)
You could binary-search for the rounded square root. Square the result to see if it matches the original value.
You're probably better off with FogleBirds answer - though beware, as floating point arithmetic is approximate, which can throw this approach off. You could in principle get a false positive from a large integer which is one more than a perfect square, for instance, due to lost precision.
A simple way to do it (faster than the second one) :
def is_square(n):
return str(n**(1/2)).split(".")[1] == '0'
Another way:
def is_square(n):
if n == 0:
return True
else:
if n % 2 == 0 :
for i in range(2,n,2):
if i*i == n:
return True
else :
for i in range(1,n,2):
if i*i == n:
return True
return False
This response doesn't pertain to your stated question, but to an implicit question I see in the code you posted, ie, "how to check if something is an integer?"
The first answer you'll generally get to that question is "Don't!" And it's true that in Python, typechecking is usually not the right thing to do.
For those rare exceptions, though, instead of looking for a decimal point in the string representation of the number, the thing to do is use the isinstance function:
>>> isinstance(5,int)
True
>>> isinstance(5.0,int)
False
Of course this applies to the variable rather than a value. If I wanted to determine whether the value was an integer, I'd do this:
>>> x=5.0
>>> round(x) == x
True
But as everyone else has covered in detail, there are floating-point issues to be considered in most non-toy examples of this kind of thing.
If you want to loop over a range and do something for every number that is NOT a perfect square, you could do something like this:
def non_squares(upper):
next_square = 0
diff = 1
for i in range(0, upper):
if i == next_square:
next_square += diff
diff += 2
continue
yield i
If you want to do something for every number that IS a perfect square, the generator is even easier:
(n * n for n in range(upper))
I think that this works and is very simple:
import math
def is_square(num):
sqrt = math.sqrt(num)
return sqrt == int(sqrt)
It is incorrect for a large non-square such as 152415789666209426002111556165263283035677490.
a=int(input('enter any number'))
flag=0
for i in range(1,a):
if a==i*i:
print(a,'is perfect square number')
flag=1
break
if flag==1:
pass
else:
print(a,'is not perfect square number')
In kotlin :
It's quite easy and it passed all test cases as well.
really thanks to >> https://www.quora.com/What-is-the-quickest-way-to-determine-if-a-number-is-a-perfect-square
fun isPerfectSquare(num: Int): Boolean {
var result = false
var sum=0L
var oddNumber=1L
while(sum<num){
sum = sum + oddNumber
oddNumber = oddNumber+2
}
result = sum == num.toLong()
return result
}
def isPerfectSquare(self, num: int) -> bool:
left, right = 0, num
while left <= right:
mid = (left + right) // 2
if mid**2 < num:
left = mid + 1
elif mid**2 > num:
right = mid - 1
else:
return True
return False
This is an elegant, simple, fast and arbitrary solution that works for Python version >= 3.8:
from math import isqrt
def is_square(number):
if number >= 0:
return isqrt(number) ** 2 == number
return False
Decide how long the number will be.
take a delta 0.000000000000.......000001
see if the (sqrt(x))^2 - x is greater / equal /smaller than delta and decide based on the delta error.
import math
def is_square(n):
sqrt = math.sqrt(n)
return sqrt == int(sqrt)
It fails for a large non-square such as 152415789666209426002111556165263283035677490.
The idea is to run a loop from i = 1 to floor(sqrt(n)) then check if squaring it makes n.
bool isPerfectSquare(int n)
{
for (int i = 1; i * i <= n; i++) {
// If (i * i = n)
if ((n % i == 0) && (n / i == i)) {
return true;
}
}
return false;
}
I'm trying to continue on my previous question in which I'm trying to calculate Fibonacci numbers using Benet's algorithm. To work with arbitrary precision I found mpmath. However the implementation seems to fail above certain value. For instance the 99th value gives:
218922995834555891712
This should be (ref):
218922995834555169026
Here is my code:
from mpmath import *
def Phi():
return (1 + sqrt(5)) / 2
def phi():
return (1 - sqrt(5)) / 2
def F(n):
return (power(Phi(), n) - power(phi(), n)) / sqrt(5)
start = 99
end = 100
for x in range(start, end):
print(x, int(F(x)))
mpmath does do arbitrary precision math, and it does do it accurately to any precision (as described above) if you are using the arbitrary precision math module and not the default behavior.
mpmath has more than one module which determines the accuracy and speed of the results (to be chosen depending on what you need), and by default it uses Python floats, which is what I believe you saw above.
If you call mpmath's fib( ) having set mp.dps high enough, you will get the correct answer as stated above.
>>> from mpmath import mp
>>> mp.dps = 25
>>> mp.nprint( mp.fib( 99 ), 25 )
218922995834555169026.0
>>> mp.nprint( mpmath.fib( 99 ), 25 )
218922995834555169026.0
Whereas, if you don't use the mp module, you will only get results as accurate as a Python double.
>>> import mpmath
>>> mpmath.dps = 25
>>> mpmath.nprint( mpmath.fib( 99 ), 25
218922995834555170816.0
mpmath provides arbitrary precision (as set in mpmath.mp.dps), but still inaccuate calculation. For example, mpmath.sqrt(5) is not accurate, so any calculation based on that will also be inaccurate.
To get an accurate result for sqrt(5), you have to use a library which supports abstract calculation, e.g. http://sympy.org/ .
To get an accurate result for Fibonacci numbers, probably the simplest way is using an algorithm which does only integer arithmetics. For example:
def fib(n):
if n < 0:
raise ValueError
def fib_rec(n):
if n == 0:
return 0, 1
else:
a, b = fib_rec(n >> 1)
c = a * ((b << 1) - a)
d = b * b + a * a
if n & 1:
return d, c + d
else:
return c, d
return fib_rec(n)[0]
Actually mpmath's default precision is 15 which I think is not enough if you want to get the result of up to 21-digit precision.
One thing you can do is set the precision to be a higher value and use mpmath's defined arithmetic functions for addition, subtraction, etc.
from mpmath import mp
mp.dps = 50
sqrt5 = mp.sqrt(5)
def Phi():
return 0.5*mp.fadd(1, sqrt5)
def phi():
return 0.5*mp.fsub(1, sqrt5)
def F(n):
return mp.fdiv(mp.power(Phi(), n) - mp.power(phi(), n), sqrt5)
print int(F(99))
This will give you
218922995834555169026L
[This is related to Minimum set cover ]
I would like to solve the following puzzle by computer for small size of n. Consider all 2^n binary vectors of length n. For each one you delete exactly n/3 of the bits, leaving a binary vector length 2n/3 (assume n is an integer multiple of 3). The goal is to choose the bits you delete so as to minimize the number of different binary vectors of length 2n/3 that remain at the end.
For example, for n = 3 the optimal answer is 2 different vectors 11 and 00. For n = 6 it is 4, for n = 9 it is 6 and for n = 12 it is 10.
I had previously attempted to solve this problem as a minimum set cover problem of the following sort. All the lists contain only 1s and 0s.
I say that a list A covers a list B if you can make B from A by inserting exactly x symbols.
Consider all 2^n lists of 1s and 0s of length n and set x = n/3. I would like to compute a minimal set of lists of length 2n/3 that covers them all. David Eisenstat provided code that converted this minimal set cover problem into a Mixed Integer Programming Problem that could be fed into CPLEX (or http://scip.zib.de/ which is open source).
from collections import defaultdict
from itertools import product, combinations
def all_fill(source, num):
output_len = (len(source) + num)
for where in combinations(range(output_len), len(source)):
poss = ([[0, 1]] * output_len)
for (w, s) in zip(where, source):
poss[w] = [s]
for tup in product(*poss):
(yield tup)
def variable_name(seq):
return ('x' + ''.join((str(s) for s in seq)))
n = 12
shortn = ((2 * n) // 3)
x = (n // 3)
all_seqs = list(product([0, 1], repeat=shortn))
hit_sets = defaultdict(set)
for seq in all_seqs:
for fill in all_fill(seq, x):
hit_sets[fill].add(seq)
print('Minimize')
print(' + '.join((variable_name(seq) for seq in all_seqs)))
print('Subject To')
for (fill, seqs) in hit_sets.items():
print(' + '.join((variable_name(seq) for seq in seqs)), '>=', 1)
print('Binary')
for seq in all_seqs:
print(variable_name(seq))
print('End')
The problem is that if you set n=15 then the instance it outputs is too large for any solver I can find. Is there a more efficient way of solving this problem so I can solve n=15 or even n = 18?
This doesn't solve your problem (well, not quickly enough), but you're not getting many ideas and someone else may find something useful to build on here.
It's a short pure Python 3 program, using backtracking search with some greedy ordering heuristics. It solves the N = 3, 6, and 9 instances very quickly. It finds a cover of size 10 for N=12 quickly too, but will apparently take a much longer time to exhaust the search space (I'm out of time for this, and it's still running). For N=15, the initialization time is already slow.
Bitstrings are represented by plain N-bit integers here, so consume little storage. That's to ease recoding in a faster language. It does make heavy use of sets of integers, but no other "advanced" data structures.
Hope this helps someone! But it's clear that the combinatorial explosion of possibilities as N increases ensures that nothing will be "fast enough" without digging deeper into the mathematics of the problem.
def dump(cover):
for s in sorted(cover):
print(" {:0{width}b}".format(s, width=I))
def new_best(cover):
global best_cover, best_size
assert len(cover) < best_size
best_size = len(cover)
best_cover = cover.copy()
print("N =", N, "new best cover, size", best_size)
dump(best_cover)
def initialize(N, X, I):
from itertools import combinations
# Map a "wide" (length N) bitstring to the set of all
# "narrow" (length I) bitstrings that generate it.
w2n = [set() for _ in range(2**N)]
# Map a narrow bitstring to all the wide bitstrings
# it generates.
n2w = [set() for _ in range(2**I)]
for wide, wset in enumerate(w2n):
for t in combinations(range(N), X):
narrow = wide
for i in reversed(t): # largest i to smallest
hi, lo = divmod(narrow, 1 << i)
narrow = ((hi >> 1) << i) | lo
wset.add(narrow)
n2w[narrow].add(wide)
return w2n, n2w
def solve(needed, cover):
if len(cover) >= best_size:
return
if not needed:
new_best(cover)
return
# Find something needed with minimal generating set.
_, winner = min((len(w2n[g]), g) for g in needed)
# And order its generators by how much reduction they make
# to `needed`.
for g in sorted(w2n[winner],
key=lambda g: len(needed & n2w[g]),
reverse=True):
cover.add(g)
solve(needed - n2w[g], cover)
cover.remove(g)
N = 9 # CHANGE THIS TO WHAT YOU WANT
assert N % 3 == 0
X = N // 3 # number of bits to exclude
I = N - X # number of bits to include
print("initializing")
w2n, n2w = initialize(N, X, I)
best_cover = None
best_size = 2**I + 1 # "infinity"
print("solving")
solve(set(range(2**N)), set())
Example output for N=9:
initializing
solving
N = 9 new best cover, size 6
000000
000111
001100
110011
111000
111111
Followup
For N=12 this eventually finished, confirming that the minimal covering set contains 10 elements (which it found very soon at the start). I didn't time it, but it took at least 5 hours.
Why's that? Because it's close to brain-dead ;-) A completely naive search would try all subsets of the 256 8-bit short strings. There are 2**256 such subsets, about 1.2e77 - it wouldn't finish in the expected lifetime of the universe ;-)
The ordering gimmicks here first detect that the "all 0" and "all 1" short strings must be in any covering set, so pick them. That leaves us looking at "only" the 254 remaining short strings. Then the greedy "pick an element that covers the most" strategy very quickly finds a covering set with 11 total elements, and shortly thereafter a covering with 10 elements. That happens to be optimal, but it takes a long time to exhaust all other possibilities.
At this point, any attempt at a covering set that reaches 10 elements is aborted (it can't possibly be smaller than 10 elements then!). If that were done wholly naively too, it would need to try adding (to the "all 0" and "all 1" strings) all 8-element subsets of the 254 remaining, and 254-choose-8 is about 3.8e14. Very much smaller than 1.2e77 - but still way too large to be practical. It's an interesting exercise to understand how the code manages to do so much better than that. Hint: it has a lot to do with the data in this problem.
Industrial-strength solvers are incomparably more sophisticated and complex. I was pleasantly surprised at how well this simple little program did on the smaller problem instances! It got lucky.
But for N=15 this simple approach is hopeless. It quickly finds a cover with 18 elements, but makes no more visible progress for at least hours. Internally, it's still working with needed sets containing hundreds (even thousands) of elements, which makes the body of solve() quite expensive. It still has 2**10 - 2 = 1022 short strings to consider, and 1022-choose-16 is about 6e34. I don't expect it would visibly help even if this code were sped by a factor of a million.
It was fun to try, though :-)
And a small rewrite
This version runs at least 6 times faster on a full N=12 run, simply by cutting off futile searches one level earlier. Also speeds initialization, and cuts memory use by changing the 2**N w2n sets into lists (no set operations are used on those). It's still hopeless for N=15, though :-(
def dump(cover):
for s in sorted(cover):
print(" {:0{width}b}".format(s, width=I))
def new_best(cover):
global best_cover, best_size
assert len(cover) < best_size
best_size = len(cover)
best_cover = cover.copy()
print("N =", N, "new best cover, size", best_size)
dump(best_cover)
def initialize(N, X, I):
from itertools import combinations
# Map a "wide" (length N) bitstring to the set of all
# "narrow" (length I) bitstrings that generate it.
w2n = [set() for _ in range(2**N)]
# Map a narrow bitstring to all the wide bitstrings
# it generates.
n2w = [set() for _ in range(2**I)]
# mask[i] is a string of i 1-bits
mask = [2**i - 1 for i in range(N)]
for t in combinations(range(N), X):
t = t[::-1] # largest i to smallest
for wide, wset in enumerate(w2n):
narrow = wide
for i in t: # delete bit 2**i
narrow = ((narrow >> (i+1)) << i) | (narrow & mask[i])
wset.add(narrow)
n2w[narrow].add(wide)
# release some space
for i, s in enumerate(w2n):
w2n[i] = list(s)
return w2n, n2w
def solve(needed, cover):
if not needed:
if len(cover) < best_size:
new_best(cover)
return
if len(cover) >= best_size - 1:
# can't possibly be extended to a cover < best_size
return
# Find something needed with minimal generating set.
_, winner = min((len(w2n[g]), g) for g in needed)
# And order its generators by how much reduction they make
# to `needed`.
for g in sorted(w2n[winner],
key=lambda g: len(needed & n2w[g]),
reverse=True):
cover.add(g)
solve(needed - n2w[g], cover)
cover.remove(g)
N = 9 # CHANGE THIS TO WHAT YOU WANT
assert N % 3 == 0
X = N // 3 # number of bits to exclude
I = N - X # number of bits to include
print("initializing")
w2n, n2w = initialize(N, X, I)
best_cover = None
best_size = 2**I + 1 # "infinity"
print("solving")
solve(set(range(2**N)), set())
print("best for N =", N, "has size", best_size)
dump(best_cover)
First consider if you have 6 bits. You can throw away 2 bits. Therefore, any pattern balance of 6-0, 5-1 or 4-2 can be converted to 0000 or 1111. In the case a 3-3 zero-one balance any pattern can be converted to one of four cases: 1000, 0001, 0111, or 1110. Therefore, one possible minimum set for 6 bits is:
0000
0001
0111
1110
1000
1111
Now consider 9 bits with 3 thrown away. You have the following set of 14 master patterns:
000000
100000
000001
010000
000010
110000
000011
001111
111100
101111
111101
011111
111110
111111
In other words, each pattern set has ones/zeros in the center, with every permutation of n/3-1 bits on each end. For example, if you have 24 bits then you will have 17 bits in the center and 7 bits on the ends. Since 2^7 = 128 you will have 4 x 128 - 2 = 510 possible patterns.
To find correct deletions there are various algorithms. One method is to find the edit distance between the current bit set and each master pattern. The pattern with the minimum edit distance is the one to convert to. This method uses dynamic programming. Another method would be to do a tree search through the patterns using a set of rules to find the matching pattern.
I want to implement Karatsuba's 2-split multiplication in Python. However, writing numbers in the form
A=c*x+d
where x is a power of the base (let x=b^m) close to sqrt(A).
How am I supposed to find x, if I can't even use division and multiplication? Should I count the number of digits and shift A to the left by half the number of digits?
Thanks.
Almost. You don't shift A by half the number of digits; you shift 1. Of course, this is only efficient if the base is a power of 2, since "shifting" in base 10 (for example) has to be done with multiplications. (Edit: well, ok, you can multiply with shifts and additions. But it's ever so much simpler with a power of 2.)
If you're using Python 3.1 or greater, counting the bits is easy, because 3.1 introduced the int.bit_length() method. For other versions of Python, you can count the bits by copying A and shifting it right until it's 0. This can be done in O(log N) time (N = # of digits) with a sort of binary search method - shift by many bits, if it's 0 then that was too many, etc.
You already accepted an answer since I started writing this, but:
What Tom said: in Python 3.x you can get n = int.bit_length() directly.
In Python 2.x you get n in O(log2(A)) time by binary-search, like below.
Here is (2.x) code that calculates both. Let the base-2 exponent of x be n, i.e. x = 2**n.
First we get n by binary-search by shifting. (Really we only needed n/2, so that's one unnecessary last iteration).
Then when we know n, getting x,c,d is easy (still no using division)
def karatsuba_form(A,n=32):
"""Binary-search for Karatsuba form using binary shifts"""
# First search for n ~ log2(A)
step = n >> 1
while step>0:
c = A >> n
print 'n=%2d step=%2d -> c=%d' % (n,step,c)
if c:
n += step
else:
n -= step
# More concisely, could say: n = (n+step) if c else (n-step)
step >>= 1
# Then take x = 2^(n/2) ˜ sqrt(A)
ndiv2 = n/2
# Find Karatsuba form
c = (A >> ndiv2)
x = (1 << ndiv2)
d = A - (c << ndiv2)
return (x,c,d)
Your question is already answered in the article to which you referred: "Karatsuba's basic step works for any base B and any m, but the recursive algorithm is most efficient when m is equal to n/2, rounded up" ... n being the number of digits, and 0 <= value_of_digit < B.
Some perspective that might help:
You are allowed (and required!) to use elementary operations like number_of_digits // 2 and divmod(digit_x * digit_x, B) ... in school arithmetic, where B is 10, you are required (for example) to know that divmod(9 * 8, 10) produces (7, 2).
When implementing large number arithmetic on a computer, it is usual to make B the largest power of 2 that will support the elementary multiplication operation conveniently. For example in the CPython implementation on a 32-bit machine, B is chosen to to be 2 ** 15 (i.e. 32768), because then product = digit_x * digit_y; hi = product >> 15; lo = product & 0x7FFF; works without overflow and without concern about a sign bit.
I'm not sure what you are trying to achieve with an implementation in Python that uses B == 2, with numbers represented by Python ints, whose implementation in C already uses the Karatsuba algorithm for multiplying numbers that are large enough to make it worthwhile. It can't be speed.
As a learning exercise, you might like to try representing a number as a list of digits, with the base B being an input parameter.
What does the % in a calculation? I can't seem to work out what it does.
Does it work out a percent of the calculation for example: 4 % 2 is apparently equal to 0. How?
The % (modulo) operator yields the remainder from the division of the first argument by the second. The numeric arguments are first converted to a common type. A zero right argument raises the ZeroDivisionError exception. The arguments may be floating point numbers, e.g., 3.14%0.7 equals 0.34 (since 3.14 equals 4*0.7 + 0.34.) The modulo operator always yields a result with the same sign as its second operand (or zero); the absolute value of the result is strictly smaller than the absolute value of the second operand [2].
Taken from http://docs.python.org/reference/expressions.html
Example 1:
6%2 evaluates to 0 because there's no remainder if 6 is divided by 2 ( 3 times ).
Example 2: 7%2 evaluates to 1 because there's a remainder of 1 when 7 is divided by 2 ( 3 times ).
So to summarise that, it returns the remainder of a division operation, or 0 if there is no remainder. So 6%2 means find the remainder of 6 divided by 2.
Somewhat off topic, the % is also used in string formatting operations like %= to substitute values into a string:
>>> x = 'abc_%(key)s_'
>>> x %= {'key':'value'}
>>> x
'abc_value_'
Again, off topic, but it seems to be a little documented feature which took me awhile to track down, and I thought it was related to Pythons modulo calculation for which this SO page ranks highly.
An expression like x % y evaluates to the remainder of x ÷ y - well, technically it is "modulus" instead of "reminder" so results may be different if you are comparing with other languages where % is the remainder operator. There are some subtle differences (if you are interested in the practical consequences see also "Why Python's Integer Division Floors" bellow).
Precedence is the same as operators / (division) and * (multiplication).
>>> 9 / 2
4
>>> 9 % 2
1
9 divided by 2 is equal to 4.
4 times 2 is 8
9 minus 8 is 1 - the remainder.
Python gotcha: depending on the Python version you are using, % is also the (deprecated) string interpolation operator, so watch out if you are coming from a language with automatic type casting (like PHP or JS) where an expression like '12' % 2 + 3 is legal: in Python it will result in TypeError: not all arguments converted during string formatting which probably will be pretty confusing for you.
[update for Python 3]
User n00p comments:
9/2 is 4.5 in python. You have to do integer division like so: 9//2 if you want python to tell you how many whole objects is left after division(4).
To be precise, integer division used to be the default in Python 2 (mind you, this answer is older than my boy who is already in school and at the time 2.x were mainstream):
$ python2.7
Python 2.7.10 (default, Oct 6 2017, 22:29:07)
[GCC 4.2.1 Compatible Apple LLVM 9.0.0 (clang-900.0.31)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 9 / 2
4
>>> 9 // 2
4
>>> 9 % 2
1
In modern Python 9 / 2 results 4.5 indeed:
$ python3.6
Python 3.6.1 (default, Apr 27 2017, 00:15:59)
[GCC 4.2.1 Compatible Apple LLVM 8.1.0 (clang-802.0.42)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 9 / 2
4.5
>>> 9 // 2
4
>>> 9 % 2
1
[update]
User dahiya_boy asked in the comment session:
Q. Can you please explain why -11 % 5 = 4 - dahiya_boy
This is weird, right? If you try this in JavaScript:
> -11 % 5
-1
This is because in JavaScript % is the "remainder" operator while in Python it is the "modulus" (clock math) operator.
You can get the explanation directly from GvR:
Edit - dahiya_boy
In Java and iOS -11 % 5 = -1 whereas in python and ruby -11 % 5 = 4.
Well half of the reason is explained by the Paulo Scardine, and rest of the explanation is below here
In Java and iOS, % gives the remainder that means if you divide 11 % 5 gives Quotient = 2 and remainder = 1 and -11 % 5 gives Quotient = -2 and remainder = -1.
Sample code in swift iOS.
But when we talk about in python its gives clock modulus. And its work with below formula
mod(a,n) = a - {n * Floor(a/n)}
Thats means,
mod(11,5) = 11 - {5 * Floor(11/5)} => 11 - {5 * 2}
So, mod(11,5) = 1
And
mod(-11,5) = -11 - 5 * Floor(-11/5) => -11 - {5 * (-3)}
So, mod(-11,5) = 4
Sample code in python 3.0.
Why Python's Integer Division Floors
I was asked (again) today to explain why integer division in Python returns the floor of the result instead of truncating towards zero like C.
For positive numbers, there's no surprise:
>>> 5//2
2
But if one of the operands is negative, the result is floored, i.e., rounded away from zero (towards negative infinity):
>>> -5//2
-3
>>> 5//-2
-3
This disturbs some people, but there is a good mathematical reason. The integer division operation (//) and its sibling, the modulo operation (%), go together and satisfy a nice mathematical relationship (all variables are integers):
a/b = q with remainder r
such that
b*q + r = a and 0 <= r < b
(assuming a and b are >= 0).
If you want the relationship to extend for negative a (keeping b positive), you have two choices: if you truncate q towards zero, r will become negative, so that the invariant changes to 0 <= abs(r) < otherwise, you can floor q towards negative infinity, and the invariant remains 0 <= r < b. [update: fixed this para]
In mathematical number theory, mathematicians always prefer the latter choice (see e.g. Wikipedia). For Python, I made the same choice because there are some interesting applications of the modulo operation where the sign of a is uninteresting. Consider taking a POSIX timestamp (seconds since the start of 1970) and turning it into the time of day. Since there are 24*3600 = 86400 seconds in a day, this calculation is simply t % 86400. But if we were to express times before 1970 using negative numbers, the "truncate towards zero" rule would give a meaningless result! Using the floor rule it all works out fine.
Other applications I've thought of are computations of pixel positions in computer graphics. I'm sure there are more.
For negative b, by the way, everything just flips, and the invariant becomes:
0 >= r > b.
So why doesn't C do it this way? Probably the hardware didn't do this at the time C was designed. And the hardware probably didn't do it this way because in the oldest hardware, negative numbers were represented as "sign + magnitude" rather than the two's complement representation used these days (at least for integers). My first computer was a Control Data mainframe and it used one's complement for integers as well as floats. A pattern of 60 ones meant negative zero!
Tim Peters, who knows where all Python's floating point skeletons are buried, has expressed some worry about my desire to extend these rules to floating point modulo. He's probably right; the truncate-towards-negative-infinity rule can cause precision loss for x%1.0 when x is a very small negative number. But that's not enough for me to break integer modulo, and // is tightly coupled to that.
PS. Note that I am using // instead of / -- this is Python 3 syntax, and also allowed in Python 2 to emphasize that you know you are invoking integer division. The / operator in Python 2 is ambiguous, since it returns a different result for two integer operands than for an int and a float or two floats. But that's a totally separate story; see PEP 238.
Posted by Guido van Rossum at 9:49 AM
The modulus is a mathematical operation, sometimes described as "clock arithmetic." I find that describing it as simply a remainder is misleading and confusing because it masks the real reason it is used so much in computer science. It really is used to wrap around cycles.
Think of a clock: Suppose you look at a clock in "military" time, where the range of times goes from 0:00 - 23.59. Now if you wanted something to happen every day at midnight, you would want the current time mod 24 to be zero:
if (hour % 24 == 0):
You can think of all hours in history wrapping around a circle of 24 hours over and over and the current hour of the day is that infinitely long number mod 24. It is a much more profound concept than just a remainder, it is a mathematical way to deal with cycles and it is very important in computer science. It is also used to wrap around arrays, allowing you to increase the index and use the modulus to wrap back to the beginning after you reach the end of the array.
Python - Basic Operators
http://www.tutorialspoint.com/python/python_basic_operators.htm
Modulus - Divides left hand operand by right hand operand and returns remainder
a = 10 and b = 20
b % a = 0
In most languages % is used for modulus. Python is no exception.
% Modulo operator can be also used for printing strings (Just like in C) as defined on Google https://developers.google.com/edu/python/strings.
# % operator
text = "%d little pigs come out or I'll %s and %s and %s" % (3, 'huff', 'puff', 'blow down')
This seems to bit off topic but It will certainly help someone.
Also, there is a useful built-in function called divmod:
divmod(a, b)
Take two (non complex) numbers as arguments and return a pair of numbers
consisting of their quotient and
remainder when using long division.
x % y calculates the remainder of the division x divided by y where the quotient is an integer. The remainder has the sign of y.
On Python 3 the calculation yields 6.75; this is because the / does a true division, not integer division like (by default) on Python 2. On Python 2 1 / 4 gives 0, as the result is rounded down.
The integer division can be done on Python 3 too, with // operator, thus to get the 7 as a result, you can execute:
3 + 2 + 1 - 5 + 4 % 2 - 1 // 4 + 6
Also, you can get the Python style division on Python 2, by just adding the line
from __future__ import division
as the first source code line in each source file.
Modulus operator, it is used for remainder division on integers, typically, but in Python can be used for floating point numbers.
http://docs.python.org/reference/expressions.html
The % (modulo) operator yields the remainder from the division of the first argument by the second. The numeric arguments are first converted to a common type. A zero right argument raises the ZeroDivisionError exception. The arguments may be floating point numbers, e.g., 3.14%0.7 equals 0.34 (since 3.14 equals 4*0.7 + 0.34.) The modulo operator always yields a result with the same sign as its second operand (or zero); the absolute value of the result is strictly smaller than the absolute value of the second operand [2].
It's a modulo operation, except when it's an old-fashioned C-style string formatting operator, not a modulo operation. See here for details. You'll see a lot of this in existing code.
It was hard for me to readily find specific use cases for the use of % online ,e.g. why does doing fractional modulus division or negative modulus division result in the answer that it does. Hope this helps clarify questions like this:
Modulus Division In General:
Modulus division returns the remainder of a mathematical division operation. It is does it as follows:
Say we have a dividend of 5 and divisor of 2, the following division operation would be (equated to x):
dividend = 5
divisor = 2
x = 5/2
The first step in the modulus calculation is to conduct integer division:
x_int = 5 // 2 ( integer division in python uses double slash)
x_int = 2
Next, the output of x_int is multiplied by the divisor:
x_mult = x_int * divisor
x_mult = 4
Lastly, the dividend is subtracted from the x_mult
dividend - x_mult = 1
The modulus operation ,therefore, returns 1:
5 % 2 = 1
Application to apply the modulus to a fraction
Example: 2 % 5
The calculation of the modulus when applied to a fraction is the same as above; however, it is important to note that the integer division will result in a value of zero when the divisor is larger than the dividend:
dividend = 2
divisor = 5
The integer division results in 0 whereas the; therefore, when step 3 above is performed, the value of the dividend is carried through (subtracted from zero):
dividend - 0 = 2 —> 2 % 5 = 2
Application to apply the modulus to a negative
Floor division occurs in which the value of the integer division is rounded down to the lowest integer value:
import math
x = -1.1
math.floor(-1.1) = -2
y = 1.1
math.floor = 1
Therefore, when you do integer division you may get a different outcome than you expect!
Applying the steps above on the following dividend and divisor illustrates the modulus concept:
dividend: -5
divisor: 2
Step 1: Apply integer division
x_int = -5 // 2 = -3
Step 2: Multiply the result of the integer division by the divisor
x_mult = x_int * 2 = -6
Step 3: Subtract the dividend from the multiplied variable, notice the double negative.
dividend - x_mult = -5 -(-6) = 1
Therefore:
-5 % 2 = 1
Be aware that
(3 +2 + 1 - 5) + (4 % 2) - (1/4) + 6
even with the brackets results in 6.75 instead of 7 if calculated in Python 3.4.
And the '/' operator is not that easy to understand, too (python2.7): try...
- 1/4
1 - 1/4
This is a bit off-topic here, but should be considered when evaluating the above expression :)
The % (modulo) operator yields the remainder from the division of the first argument by the second. The numeric arguments are first converted to a common type.
3 + 2 + 1 - 5 + 4 % 2 - 1 / 4 + 6 = 7
This is based on operator precedence.
% is modulo. 3 % 2 = 1, 4 % 2 = 0
/ is (an integer in this case) division, so:
3 + 2 + 1 - 5 + 4 % 2 - 1 / 4 + 6
1 + 4%2 - 1/4 + 6
1 + 0 - 0 + 6
7
It's a modulo operation
http://en.wikipedia.org/wiki/Modulo_operation
http://docs.python.org/reference/expressions.html
So with order of operations, that works out to
(3+2+1-5) + (4%2) - (1/4) + 6
(1) + (0) - (0) + 6
7
The 1/4=0 because we're doing integer math here.
It is, as in many C-like languages, the remainder or modulo operation. See the documentation for numeric types — int, float, long, complex.
Modulus - Divides left hand operand by right hand operand and returns remainder.
If it helps:
1:0> 2%6
=> 2
2:0> 8%6
=> 2
3:0> 2%6 == 8%6
=> true
... and so on.
I have found that the easiest way to grasp the modulus operator (%) is through long division. It is the remainder and can be useful in determining a number to be even or odd:
4%2 = 0
2
2|4
-4
0
11%3 = 2
3
3|11
-9
2
def absolute(c):
if c>=0:
return c
else:
return c*-1
x=int(input("Enter the value:"))
a=absolute(x)
print(a)