Project Euler - Problem 160 - python

For any N, let f(N) be the last five
digits before the trailing zeroes in
N!. For example,
9! = 362880 so f(9)=36288
10! = 3628800 so f(10)=36288
20! = 2432902008176640000 so f(20)=17664
Find f(1,000,000,000,000)
I've successfully tackled this question for the given examples, my function can correctly find f(9), f(10), etc. However it struggles with larger numbers, especially the number the problem asks for - f(10^12).
My current optimizations are as follows: I remove trailing zeros from the multiplier and the sum, and shorten the sum to 5 digits after each multiplication. The code in python is as follows:
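import re
def SFTR(n):
    sum, a = 1, 2
    while a < n+1:
        # strip trailing zeros from the multiplier
        mul = int(re.sub("0+$", "", str(a)))
        sum *= mul
        # strip trailing zeros from the product, then keep the last five digits
        sum = int(re.sub("0+$", "", str(sum))[-5:])
        a += 1
    return sum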
Can anyone tell me why this function scales so badly and why it's taking so long? Also, could anyone point me in the right direction to optimize my algorithm? (The name of the general topic will suffice.) Thank you.
Update:
I have made some changes for optimization and it is significantly faster, but it is still not fast enough for f(10^12). Can anyone tell me what's making my code slow or how to make it faster?
def SFTR(n):
    sum, a = 1, 2
    while a < n+1:
        mul = a
        # integer division, so mul stays an int on Python 3 as well
        while mul % 10 == 0: mul = mul // 10
        mul = mul % 100000
        sum *= mul
        while sum % 10 == 0: sum = sum // 10
        sum = sum % 100000
        a += 1
    return sum

mul can get very big. Is that necessary? If I asked you to compute the last 5 non-zero digits of 1278348572934847283948561278387487189900038 * 38758
by hand, exactly how many digits of the first number do you actually need to know?

Building strings frequently is expensive. I'd rather use the modulo operator when truncating to the last five digits.
python -m timeit 'x = str(111111111111111111111111111111111)[-5:]'
1000000 loops, best of 3: 1.09 usec per loop
python -m timeit 'x = 111111111111111111111111111111111 % 100000'
1000000 loops, best of 3: 0.277 usec per loop
The same applies to stripping the trailing zeros. There should be a more efficient way to do this, and you probably don't have to do it in every single step.
I didn't check your algorithm for correctness, though, it's just a hint for optimization.
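To make the modulo-only idea concrete, here is a minimal sketch that uses no strings at all. It is not the asker's code: the name last_nonzero_digits and the way factors of 2 and 5 are cancelled are my own choices. It is still linear in n, so treat it as a correctness reference for small inputs rather than a solution for 10^12.

```python
def last_nonzero_digits(n, digits=5):
    """Last `digits` digits of n! before its trailing zeros (linear time)."""
    mod = 10 ** digits
    result = 1
    excess_twos = 0  # number of factors of 2 minus number of factors of 5
    for a in range(2, n + 1):
        while a % 2 == 0:
            a //= 2
            excess_twos += 1
        while a % 5 == 0:
            a //= 5
            excess_twos -= 1
        # a is now coprime to 10, so reducing modulo 10**digits is safe
        result = result * a % mod
    # every 5 has been paired with a 2; put the leftover 2s back
    return result * pow(2, excess_twos, mod) % mod


print(last_nonzero_digits(9), last_nonzero_digits(10), last_nonzero_digits(20))
```

Separating the 2s and 5s first is what makes the truncation safe: once the running product is coprime to 10, no later multiplication can create a new trailing zero out of digits that were already thrown away.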

In fact, you might even note that there are only a restricted set of possible trailing non-zero digits. If I recall correctly, there are only a few thousand possible trailing non-zero digit combinations, when you look only at the last 5 digits. For example, is it possible for the final non-zero digit ever to be odd? (Ignore the special cases of 0! and 1! here.)

Related

How can I compute faster (x**2-2)%n - Python

x = 2**1000000
n = 2**100000000
(x**2-2)%n is too slow. I found pow() but I can't use it because I can't subtract 2. (pow(x, 2)-2)%n and (x*x-2)%n are also slow. When I tested (x*x-2) it was fast but when I added the modulo operator it was slow. Is there a way to compute (x**2-2)%n faster?
Are you running this in the interpreter? I did some testing and the main slowdown seemed to come from the interpreter trying to display the result.
If you assign the expression to a variable, the interpreter won't try to display the result, and it will be very quick:
x = 2**1000000
n = 2**100000000
result = (x**2-2)%n
Addendum:
I was also originally thinking along the same lines as MikeW's answer, and if you wanted every part of the code to be fast, you could take advantage of Python's internal base 2 representation of integers and use bitwise left shifts:
x = 1 << 1000000
n = 1 << 100000000
This comes with the caveat that this only works because x and n are powers of 2, and you have to be more careful to avoid making an off-by-one error. This answer is a good explanation of how bitshifts basically work, but Python is a bit different from other languages like C, C++, or Java because Python integers have unlimited precision, so you can never left shift a bit completely away like you could in other languages.
Some modulo rules:
1) (a+b) mod n = ((a mod n) + (b mod n)) mod n
2) (a.b) mod n = ((a mod n).(b mod n)) mod n
So you can transform your equation into:
(x**2-2)%n ==> (x.x - 2)%n ==> ((x%n).(x%n) - (2%n)) % n
If n is always greater than 2, (2%n) is 2 itself.
Solving (x%n):
If x and n are always powers of 2 (2**value), then (x%n) = 0 when x >= n, and (x%n) = x when x < n.
So the answer is either (0 - 2) % n = n - 2, or (x**2 - 2) % n.
If x is always a power of 2, and n is always a power of 2, then you can compute it easily and quickly using bit operations on a byte array, which you can then reconstitute into a "number".
If 2^N is (binary) 1 followed by N zeroes, then (2^N)^2 is (binary) 1 followed by 2N zeros.
2^3 squared is b'1000000'
If you have a number 2^K (binary 1 followed by K zeroes), then 2^K - 2 will be K-1 1s (ones) followed by a zero.
eg 2^4 is 16 = b'10000', 2^4 - 2 is b'1110'
If you require "% 2^M" then in binary, you just select the last (lower) M bits, and disregard the rest .
9999 is b'10011100001111'
9999 % 2^8 is b'00001111'
Hence combining the parts, if x=2^A and n=2^B, then
(x^2 - 2 ) % n
will be: (last B bits of) (binary) (2*A - 1 '1's followed by a '0')
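The bit-pattern argument can be sketched with Python's own integers instead of a byte array. The function name and the parameters a_exp and b_exp are hypothetical, and the sketch only applies when x = 2^A and n = 2^B exactly.

```python
def square_minus_two_mod_pow2(a_exp, b_exp):
    """Return (x**2 - 2) % n for x = 2**a_exp and n = 2**b_exp."""
    value = (1 << (2 * a_exp)) - 2   # 2*A - 1 ones followed by a zero
    mask = (1 << b_exp) - 1          # keeps only the lowest B bits
    return value & mask


# Compare against the direct formula on small exponents.
for a_exp in range(1, 8):
    for b_exp in range(1, 20):
        x, n = 2 ** a_exp, 2 ** b_exp
        assert square_minus_two_mod_pow2(a_exp, b_exp) == (x ** 2 - 2) % n
print("mask version agrees with the direct formula")
```

The mask plays the role of "% 2^M": and-ing with M one-bits keeps the last M bits and discards the rest.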
If you want to compute (x ** y - z) % n
it will be equivalent to ((x ** y) % n - z) % n
Python pow function includes as optional parameter a modulo, as it is very often used and can be computed in an optimized way. So you should use:
(pow(x, y, n) - z) % n
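A small check of that equivalence, as a sketch with made-up sample values (the helper name power_minus_mod is not from the question):

```python
def power_minus_mod(x, y, z, n):
    """Compute (x**y - z) % n without ever building the full power x**y."""
    return (pow(x, y, n) - z) % n


# Small values, so the direct formula is still cheap enough to compare against.
x, y, z, n = 123456789, 1000, 2, 10**9 + 7
assert power_minus_mod(x, y, z, n) == (x ** y - z) % n
print(power_minus_mod(x, y, z, n))
```

The outer "% n" matters: pow(x, y, n) - z can be negative when the modular power is smaller than z, and the final reduction brings it back into the range 0..n-1.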
OP says in comment : it's slow because I assign x to the answer and I repeat the process.
I try this :
x = 2**(1000*1000)
n = 2**(100*1000*1000)
import time
t0=time.time()
for i in range(6):
    x = (x*x-2)%n
    t1=time.time()
    print(i,t1-t0)
    t0=t1
print(x<n)
"""
0 0.0
1 0.4962291717529297
2 0.5937404632568359
3 1.9043104648590088
4 5.708504915237427
5 16.74528479576111
True
"""
It shows that in this problem, it's just slow because x grows, doubling its number of digits at each iteration:
In [5]: %timeit u=x%n
149 ns ± 6.42 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
the %n takes absolutely no time if x<n.

When does python start using a different algorithm for big multiplication?

I'm currently in an algorithms class and was interested to see which of two methods of multiplying a list of large numbers gives the faster runtime. What I found was that the recursive multiply performs about 10x faster. For the code below, I got t_sim=53.05s and t_rec=4.73s. I did some other tests and they all seemed to be around the 10x range.
Additionally, you could put the values from the recursive multiply into a tree and reuse them to even more quickly compute multiplications of subsets of the list.
I did a theoretical runtime analysis, and both are n^2 using standard multiplication, but when the Karatsuba algorithm is used, that goes down to n^log_2(3).
Every multiply in simple_multiply should have runtime i * 1. Summing over i=1...n, we get an arithmetic series and can use Gauss's formula to get n*(n+1)/2 = O(n^2).
For the second one, we can see that the time to multiply for a given level of recursion is (2^d)^2, where d is the depth, but it only needs to multiply n*2^-d values. The levels turn out to form a geometric series where the runtime at each level is n*2^d with a final depth of log_2(n). The solution to the geometric series is n * (1-2^log_2(n))/(1-2) = n*(n-1) = O(n^2). If using the Karatsuba algorithm, you can get O(n^log_2(3)) by the same method.
If the code were using the Karatsuba algorithm, then the speedup would make sense, but what doesn't seem to make sense is the linear relationship between the two runtimes, making it seem like Python is using standard multiplication, which according to Wikipedia is faster when using under 500ish bits. (I'm using 2^23 bits in the code below. Each number is literally a megabyte long.)
import random
import time
def simple_multiply(values):
    a = 1
    for val in values:
        a *= val
    return a
def recursive_multiply(values):
    if len(values) == 1:
        return values[0]
    temp = []
    i = 0
    while i + 1 < len(values):
        temp.append(values[i] * values[i+1])
        i += 2
    if len(values) % 2 == 1:
        temp.append(values[-1])
    return recursive_multiply(temp)
def test(func, values):
    t1 = time.time()
    func(values)
    print(time.time() - t1)
def main():
    n = 2**11
    scale = 2**12
    values = [random.getrandbits(scale) for i in range(n)]
    test(simple_multiply, values)
    test(recursive_multiply, values)
if __name__ == '__main__':
    main()
Both versions of the code have the same number of multiplications, but in the simple version each multiplication is ~2000 bits long on average.
In the second version n/2 multiplications are 24 bits long, n/4 are 48 bits long, n/8 are 96 bits long, etc... The average length is only 48 bits.
There is something wrong in your assumption. Your assumption is that multiplications between operands of different sizes should take the same time as long as the total length is the same, for instance len(24)*len(72) approx len(48)*len(48). But that's not true, as is evident from the following snippets:
%%timeit
random.getrandbits(2**14)*random.getrandbits(2**14)*random.getrandbits(2**14)*random.getrandbits(2**14)
>>>1000 loops, best of 3: 1.48 ms per loop
%%timeit
(random.getrandbits(2**14)*random.getrandbits(2**14))*(random.getrandbits(2**14)*random.getrandbits(2**14))
>>>1000 loops, best of 3: 1.23 ms per loop
The difference is consistent even on such a small scale
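For reference, the balanced (pairwise) product from the question can also be written as a loop, which makes it easy to check that it returns the same value as the left-to-right product. This is only a sketch: pairwise_product is a made-up name and the sample sizes are arbitrary.

```python
import random
from functools import reduce
from operator import mul


def pairwise_product(values):
    """Multiply neighbours level by level so operands stay similar in size."""
    values = list(values)
    if not values:
        return 1
    while len(values) > 1:
        paired = [values[i] * values[i + 1]
                  for i in range(0, len(values) - 1, 2)]
        if len(values) % 2 == 1:
            paired.append(values[-1])  # odd element carries over unchanged
        values = paired
    return values[0]


random.seed(0)
sample = [random.getrandbits(256) for _ in range(37)]
print(pairwise_product(sample) == reduce(mul, sample, 1))
```

Both orders compute the same product; only the sizes of the intermediate operands differ, which is where the timing gap comes from.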

Unlucky number 13

I came across this problem, Unlucky number 13!, recently but could not think of an efficient solution to it.
Problem statement :
N is taken as input.
N can be very large 0<= N <= 1000000009
Find total number of such strings that are made of exactly N characters which don't include "13". The strings may contain any integer from 0-9, repeated any number of times.
# Example:
# N = 2 :
# output : 99 (0-99 without 13 number)
# N =1 :
# output : 10 (0-9 without 13 number)
My solution:
N = int(raw_input())
if N < 2:
    print 10
else:
    without_13 = 10
    for i in range(10, int('9' * N)+1):
        string = str(i)
        if string.count("13") >= 1:
            continue
        without_13 += 1
    print without_13
Output
The output file should contain answer to each query in a new line modulo 1000000009.
Any other efficient way to solve this ? My solution gives time limit exceeded on coding site.
I think this can be solved via recursion:
ans(n) = { ans([n/2])^2 - ans([n/2]-1)^2 }, if n is even
ans(n) = { ans([n/2]+1)*ans([n/2]) - ans([n/2])*ans([n/2]-1) }, if n is odd
Base Cases:
ans(0) = 1
ans(1) = 10
Its implementation runs quite fast even for large inputs like 10^9 (which is expected, as its complexity is O(log[n]) instead of O(n) like the other answers):
cache = {}
mod = 1000000009
def ans(n):
    if cache.has_key(n):
        return cache[n]
    if n == 0:
        cache[n] = 1
        return cache[n]
    if n == 1:
        cache[n] = 10
        return cache[n]
    temp1 = ans(n/2)
    temp2 = ans(n/2-1)
    if (n & 1) == 0:
        cache[n] = (temp1*temp1 - temp2*temp2) % mod
    else:
        temp3 = ans(n/2 + 1)
        cache[n] = (temp1 * (temp3 - temp2)) % mod
    return cache[n]
print ans(1000000000)
Explanation:
Let a string s have even number of digits 'n'.
Let ans(n) be the answer for the input n, i.e. the number of strings without the substring 13 in them.
Therefore, the answer for string s having length n can be written as the multiplication of the answer for the first half of the string (ans([n/2])) and the answer for the second half of the string (ans([n/2])), minus the number of cases where the string 13 appears in the middle of the number n, i.e. when the last digit of the first half is 1 and the first digit of the second half is 3.
This can be expressed mathematically as:
ans(n) = ans([n/2])^2 - ans([n/2]-1)^2
Similarly for the cases where the input number n is odd, we can derive the following equation:
ans(n) = ans([n/2]+1)*ans([n/2]) - ans([n/2])*ans([n/2]-1)
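A quick sanity check of the two identities against brute force, for small n only. This is a sketch with hypothetical helper names (brute_force, halving); it leaves out the cache and the modulus to keep the comparison direct.

```python
from itertools import product


def brute_force(n):
    """Count digit strings of length n that do not contain '13'."""
    return sum("13" not in "".join(digits)
               for digits in product("0123456789", repeat=n))


def halving(n):
    """The even/odd recurrence from the explanation, without caching."""
    if n == 0:
        return 1
    if n == 1:
        return 10
    half = n // 2
    if n % 2 == 0:
        return halving(half) ** 2 - halving(half - 1) ** 2
    return halving(half) * (halving(half + 1) - halving(half - 1))


for n in range(6):
    assert halving(n) == brute_force(n)
print([halving(n) for n in range(6)])
```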
I get the feeling that this question is designed with the expectation that you would initially instinctively do it the way you have. However, I believe there's a slightly different approach that would be faster.
You can produce all the numbers that contain the number 13 yourself, without having to loop through all the numbers in between. For example:
2 digits:
13
3 digits, position 1:
113
213
313 etc.
3 digits, position 2:
131
132
133 etc.
Therefore, you don't have to check every number in the range. You simply count all the numbers with 13 in them, length by length, until the length is larger than N.
This may not be the fastest solution (in fact I'd be surprised if this couldn't be solved efficiently by using some mathematics trickery) but I believe it will be more efficient than the approach you have currently taken.
This is a P&C (permutations and combinations) problem. I'm going to assume 0 is a valid string and so are 00, 000 and so on, each being treated as distinct from the others.
The total number of strings not containing 13, of length N, is unsurprisingly given by:
(Total Number of strings of length N) - (Total number of strings of length N that have 13 in them)
Now, the Total number of strings of length N is easy, you have 10 digits and N slots to put them in: 10^N.
The number of strings of length N with 13 in them is a little trickier.
You'd think you can do something like this:
=> (N-1)C1 * 10^(N-2)
=> (N-1) * 10^(N-2)
But you'd be wrong, or more accurately, you'd be overcounting certain strings. For example, you'd be overcounting the strings that have two or more 13s in them.
What you really need to do is apply the inclusion-exclusion principle to count the number of strings with 13 in them, so that they're all included once.
If you look at this problem as a set counting problem, you have quite a few sets:
S(0,N): Set of all strings of Length N.
S(1,N): Set of all strings of Length N, with at least one '13' in it.
S(2,N): Set of all strings of Length N, with at least two '13's in it.
...
S(N/2,N): Set of all strings of Length N, with at least floor(N/2) '13's in it.
You want the set of all strings with 13 in them, but counted at most once. You can use the inclusion-exclusion principle for computing that set.
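The answer stops before giving a formula, so the following is only one way to carry the inclusion-exclusion through, not necessarily what the author had in mind. It relies on the fact that "13" cannot overlap itself, so k occurrences can be placed in C(N-k, k) ways with the remaining N-2k digits free. The function name is made up, and math.comb needs Python 3.8 or later.

```python
from math import comb


def count_no_13_inclusion_exclusion(n):
    """Strings of n digits without '13', by inclusion-exclusion."""
    total = 0
    for k in range(n // 2 + 1):
        # choose k non-overlapping slots for '13'; other digits are free
        total += (-1) ** k * comb(n - k, k) * 10 ** (n - 2 * k)
    return total


print([count_no_13_inclusion_exclusion(n) for n in range(7)])
```

This uses exact integers and O(N) terms, so it illustrates the counting argument rather than meeting the limits for N near 10^9.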
Let f(n) be the number of sequences of length n that have no "13" in them, and g(n) be the number of sequences of length n that have "13" in them.
Then f(n) = 10^n - g(n) (in mathematical notation), because it's the number of possible sequences (10^n) minus the ones that contain "13".
Base cases:
f(0) = 1
g(0) = 0
f(1) = 10
g(1) = 0
When looking for the sequences with "13", a sequence can have a "13" at the beginning. That will account for 10^(n-2) possible sequences with "13" in them. It could also have a "13" in the second position, again accounting for 10^(n-2) possible sequences. But if it has a "13" in the third position and we assumed there were again 10^(n-2) possible sequences, we would count twice those that already had a "13" in the first position. So we have to subtract them. Instead, we count 10^(n-4) times f(2) (because those are exactly the combinations in the first two positions that don't have "13" in them).
E.g. for g(5):
g(5) = 10^(n-2) + 10^(n-2) + f(2)*10^(n-4) + f(3)*10^(n-5)
We can rewrite that to look the same everywhere:
g(5) = f(0)*10^(n-2) + f(1)*10^(n-3) + f(2)*10^(n-4) + f(3)*10^(n-5)
Or simply the sum of f(i)*10^(n-(i+2)) with i ranging from 0 to n-2.
In Python:
from functools import lru_cache
@lru_cache(maxsize=1024)
def f(n):
    return 10**n - g(n)
@lru_cache(maxsize=1024)
def g(n):
    return sum(f(i)*10**(n-(i+2)) for i in range(n-1))  # range is exclusive
The lru_cache is optional, but often a good idea when working with recursion.
>>> [f(n) for n in range(10)]
[1, 10, 99, 980, 9701, 96030, 950599, 9409960, 93149001, 922080050]
The results are instant and it works for very large numbers.
In fact this question is more about math than about python.
For N digits there are 10^N possible unique strings. To get the answer to the problem we need to subtract the number of strings containing "13".
If a string starts with "13" we have 10^(N-2) possible unique strings. If we have 13 at the second position (i.e. a string like x13...), we again have 10^(N-2) possibilities. But we can't continue this logic further, as this will lead to double counting of strings which have 13 at different positions. For example, for N=4 there will be a string "1313" which we would count twice. To avoid this we should count only those strings which we haven't counted before. So for "13" at position p (counting from 0) we should find the number of unique strings which don't have "13" to the left of p, that is, for each p
number_of_strings_for_13_at_p = total_number_of_strings_without_13(p) * 10^(N-p-2)
So we define the total_number_of_strings_without_13 function recursively.
Here is the idea in the code:
def number_of_strings_without_13(N):
    sum_numbers_with_13 = 0
    for p in range(N-1):
        if p < 2:
            sum_numbers_with_13 += 10**(N-2)
        else:
            sum_numbers_with_13 += number_of_strings_without_13(p) * 10**(N-p-2)
    return 10**N - sum_numbers_with_13
I should say that 10**N means 10 to the power of N. Everything else is described above. The function also has the surprisingly pleasant ability to give correct answers for N=1 and N=2.
To test this works correct I've rewritten your code into function and refactored a little bit:
def number_of_strings_without_13_bruteforce(N):
    without_13 = 0
    for i in range(10**N):
        if str(i).count("13"):
            continue
        without_13 += 1
    return without_13
for N in range(1, 7):
    print(number_of_strings_without_13(N),
          number_of_strings_without_13_bruteforce(N))
They gave the same answers. With bigger N, brute force is very slow. But for very large N the recursive function also gets much slower. There is a well-known solution for that: as we use the value of number_of_strings_without_13 with parameters smaller than N multiple times, we should remember the answers and not recalculate them each time. It's quite simple to do, like this:
def number_of_strings_without_13(N, answers=dict()):
    if N in answers:
        return answers[N]
    sum_numbers_with_13 = 0
    for p in range(N-1):
        if p < 2:
            sum_numbers_with_13 += 10**(N-2)
        else:
            sum_numbers_with_13 += number_of_strings_without_13(p) * 10**(N-p-2)
    result = 10**N - sum_numbers_with_13
    answers[N] = result
    return result
Thanks to L3viathan's comment now it is clear. The logic is beautiful.
Let's assume a(n) is the number of strings of n digits without "13" in them. If we know all the good strings for n-1, we can add one more digit to the left of each string and calculate a(n). As we can combine the previous digits with any of 10 new ones, we get 10*a(n-1) different strings. But we must subtract the strings which now start with "13", which we wrongly counted as OK in the previous step. There are a(n-2) such wrongly added strings. So a(n) = 10*a(n-1) - a(n-2). That is it. It's that simple.
What is even more interesting is that this sequence can be calculated without iteration using a closed formula, see https://oeis.org/A004189. But in practice that doesn't help much, as the formula requires floating-point calculations, which lead to rounding and would not work for big n (the answer would be slightly off).
Nevertheless the original sequence is quite easy to calculate and it doesn't need to store all the previous values, just the last two. So here is the code:
def number_of_strings(n):
    result = 0
    result1 = 99
    result2 = 10
    if n == 1:
        return result2
    if n == 2:
        return result1
    for i in range(3, n+1):
        result = 10*result1 - result2
        result2 = result1
        result1 = result
    return result
This one is several orders of magnitude faster than my previous suggestion. And memory consumption is now just O(n).
P.S. If you run this with Python2, you'd better change range to xrange
This python3 solution meets time and memory requirement of HackerEarth
from functools import lru_cache
mod = 1000000009
@lru_cache(1024)
def ans(n):
    if n == 0:
        return 1
    if n == 1:
        return 10
    temp1 = ans(n//2)
    temp2 = ans(n//2-1)
    if (n & 1) == 0:
        return (temp1*temp1 - temp2*temp2) % mod
    else:
        temp3 = ans(n//2 + 1)
        return (temp1 * (temp3 - temp2)) % mod
for t in range(int(input())):
    n = int(input())
    print(ans(n))
I came across this problem on
https://www.hackerearth.com/problem/algorithm/the-unlucky-13-d7aea1ff/
I haven't been able to get the judge to accept my solution(s) in Python but (2) in ANSI C worked just fine.
Straightforward recursive counting of a(n) = 10*a(n-1) - a(n-2) is pretty slow when getting to large numbers but there are several options (one which is not mentioned here yet):
1.) using generating functions:
https://www.wolframalpha.com/input/?i=g%28n%2B1%29%3D10g%28n%29+-g%28n-1%29%2C+g%280%29%3D1%2C+g%281%29%3D10
the powers should be counted using squaring and modulo needs to be inserted cleverly into that and the numbers must be rounded but Python solution was slow for the judge anyway (it took 7s on my laptop and judge needs this to be counted under 1.5s)
2.) using matrices:
the idea is that we can get vector [a(n), a(n-1)] by multiplying vector [a(n-1), a(n-2)] by specific matrix constructed from equation a(n) = 10*a(n-1) - a(n-2)
| a(n)   | = | 10 -1 | * | a(n-1) |
| a(n-1) |   |  1  0 |   | a(n-2) |
and by induction:
| a(n)   | = | 10 -1 |^(n-1) * | a(1) |
| a(n-1) |   |  1  0 |          | a(0) |
the 2x2 matrix power should be computed by repeated squaring, reducing modulo 1000000009 at each step. The 2x2 multiplication should be hard-coded rather than computed with for loops, as that is much faster.
Again this was slow for Python (8s on my laptop) but fast for ANSI C (0.3s)
3.) the solution proposed by Anmol Singh Jaggi above which is the fastest in Python (3s) but the memory consumption for cache is big enough to break memory limits of the judge. Removing cache or limiting it makes the computation very slow.
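Option 2.) above can be sketched in Python as follows. This is an illustration only: the names mat_mul and count_no_13_matrix are made up, and it makes no claim about passing the judge's time limit. The -1 entry of the matrix is stored as MOD - 1 so every intermediate value stays non-negative.

```python
MOD = 1_000_000_009


def mat_mul(a, b):
    """Multiply two 2x2 matrices modulo MOD (written out, no loops)."""
    return (
        ((a[0][0] * b[0][0] + a[0][1] * b[1][0]) % MOD,
         (a[0][0] * b[0][1] + a[0][1] * b[1][1]) % MOD),
        ((a[1][0] * b[0][0] + a[1][1] * b[1][0]) % MOD,
         (a[1][0] * b[0][1] + a[1][1] * b[1][1]) % MOD),
    )


def count_no_13_matrix(n):
    """a(n) mod MOD, where a(n) = 10*a(n-1) - a(n-2), a(0) = 1, a(1) = 10."""
    if n == 0:
        return 1
    result = ((1, 0), (0, 1))        # identity matrix
    base = ((10, MOD - 1), (1, 0))   # -1 written as MOD - 1
    exponent = n - 1
    while exponent:                  # exponentiation by squaring
        if exponent & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        exponent >>= 1
    # [a(n), a(n-1)] = M^(n-1) * [a(1), a(0)] = M^(n-1) * [10, 1]
    return (result[0][0] * 10 + result[0][1]) % MOD


print([count_no_13_matrix(n) for n in range(6)])
```

Only about log2(n) matrix squarings are needed, so n = 10^9 takes roughly 30 iterations of the loop.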
You are given a string S of length N. The string S consists of digits from 1-9. Consider the string indexing to be 1-based.
You need to divide the string into blocks such that the i-th block contains the elements from index ((i - 1) • X + 1) to min(N, (i • X)) (both inclusive). A number is valid if it is formed by choosing exactly one digit from each block and placing the digits in the order of their block number

Prime numbers which can be written as sum of the squares of two numbers x and y

The problem is:
Given a range of numbers (x,y) , Find all the prime numbers(Count only) which are sum of the squares of two numbers, with the restriction that 0<=x<y<=2*(10^8)
According to Fermat's theorem :
Fermat's theorem on sums of two squares asserts that an odd prime number p can be
expressed as p = x^2 + y^2 with integer x and y if and only if p is congruent to
1 (mod4).
I have done something like this:
import math
def is_prime(n):
    if n % 2 == 0 and n > 2:
        return False
    return all(n % i for i in range(3, int(math.sqrt(n)) + 1, 2))
a,b=map(int,raw_input().split())
count=0
for i in range(a,b+1):
    if(is_prime(i) and (i-1)%4==0):
        count+=1
print(count)
But this exceeds the time limit and the memory limit in some cases.
Here is my submission result:
Can anyone help me reduce the Time Complexity and Memory limit with better algorithm?
Problem Link(Not an ongoing contest FYI)
Do not check whether each number is prime. Precompute all the prime numbers in the range, using Sieve of Eratosthenes. This will greatly reduce the complexity.
Since you have a maximum of 200M numbers, a 256 MB memory limit, and need at least 4 bytes per number, you need a little hack. Do not initialize the sieve with all numbers up to y, but only with numbers that are not divisible by 2, 3 and 5. That will reduce the initial size of the sieve enough to fit into the memory limit.
UPD As correctly pointed out by Will Ness in comments, sieve contains only flags, not numbers, thus it requires not more than 1 byte per element and you don't even need this precomputing hack.
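A minimal sketch of the sieve idea with one byte per flag, assuming Python 3.8+ for math.isqrt. The function name and the (lo, hi) parameters are made up for illustration, and like the question's code it counts only primes congruent to 1 mod 4 (so 2 is not counted).

```python
import math


def count_primes_1_mod_4(lo, hi):
    """Count primes p with lo <= p <= hi and p % 4 == 1, via a byte sieve."""
    if hi < 2:
        return 0
    sieve = bytearray([1]) * (hi + 1)   # one flag byte per number
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(hi) + 1):
        if sieve[p]:
            # clear all multiples of p from p*p upward in one slice assignment
            sieve[p * p::p] = bytearray(len(range(p * p, hi + 1, p)))
    return sum(1 for p in range(max(lo, 5), hi + 1, 1)
               if sieve[p] and p % 4 == 1)


print(count_primes_1_mod_4(1, 100))
```

For the full range of 2*10^8 the flags alone take about 200 MB as a bytearray, so a bit-packed or segmented sieve would be the natural next step if that is too close to the limit.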
You can reduce your memory usage by changing for i in range(a,b+1): to for i in xrange(a,b+1):, so that you are not generating an entire list in memory.
You can do the same thing inside the statement below, but you are right that it does not help with time.
return all(n % i for i in xrange(3, int(math.sqrt(n)) + 1, 2))
One time optimization that might not cost as much in terms of memory as the other answer is to use Fermat's Little Theorem. It may help you reject many candidates early.
More specifically, you could pick maybe 3 or 4 random values to test and if one of them rejects, then you can reject. Otherwise you can do the test you are currently doing.
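A sketch of that pre-filter, with a hypothetical function name and an arbitrary default of 4 rounds. A number that fails is certainly composite; a number that passes is only probably prime (Carmichael numbers can slip through), which is why the existing trial division is still needed afterwards.

```python
import random


def passes_fermat(n, rounds=4):
    """Return False if n is certainly composite, True if it may be prime."""
    if n < 4:
        return n in (2, 3)
    for _ in range(rounds):
        a = random.randrange(2, n - 1)   # random base in [2, n-2]
        if pow(a, n - 1, n) != 1:        # Fermat's little theorem violated
            return False
    return True


print(passes_fermat(97), passes_fermat(100))
```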
First of all, although it will not change the order of your time-complexity, you can still narrow down the list of numbers that you are checking by a factor of 6, since you only need to check numbers that are either equal to 1 mod 12 or equal to 5 mod 12 (such as [1,5], [13,17], [25,29], [37,41], etc).
Since you only need to count the primes which are sum of squares of two numbers, the order doesn't matter. Therefore, you can change range(a,b+1) to range(1,b+1,12)+range(5,b+1,12).
Obviously, you can then remove the if n % 2 == 0 and n > 2 condition in function is_prime, and in addition, change the if is_prime(i) and (i-1)%4 == 0 condition to if is_prime(i).
And finally, you can check the primality of each number by dividing it only with numbers that are adjacent to multiples of 6 (such as [5,7], [11,13], [17,19], [23,25], etc).
So you can change this:
range(3,int(math.sqrt(n))+1,2)
To this:
range(5,int(math.sqrt(n))+1,6)+range(7,int(math.sqrt(n))+1,6)
And you might as well calculate int(math.sqrt(n))+1 beforehand.
To summarize all this, here is how you can improve the overall performance of your program:
import math
def is_prime(n):
    max = int(math.sqrt(n))+1
    return all(n % i for i in range(5,max,6)+range(7,max,6))
count = 0
b = int(raw_input())
for i in range(1,b+1,12)+range(5,b+1,12):
    if is_prime(i):
        count += 1
print count
Please note that 1 is typically not regarded as prime, so you might want to print count-1 instead. On the other hand, 2 is not equal to 1 mod 4, yet it is the sum of two squares, so you may leave it as is...

Python: Summing Digits in a Long Number

I have come up with the following method for summing the digits in number. It seems to work for numbers below about 10^23, but doesn't work for higher numbers. Why does it not work for higher numbers?
x=0
for i in range(0, 5000):
    x+=((number/10**i)//1)%10
print(x)
Leaving aside that this is a very inefficient way to sum the digits, and assuming you're using Python 3, add a conditional print to the loop:
for i in range(0, 5000):
    piece = ((number/10**i)//1)%10
    if piece:
        print(i, piece)
    x+=((number/10**i)//1)%10
Then you'll be able to see where it's going wrong. Starting with number = 10**24, I get output:
0 4.0
1 2.0
24 1.0
7.0
You're not expecting those intermediate results, right? Now you only have to figure out why you're getting them ;-)
The short course is that you're doing floating-point computations when you should be doing integer computations. You get off track immediately (when i is 0):
>>> 10**24/10**0
1e+24
>>> _ // 1
1e+24
>>> _ % 10
4.0
Why is that? Because 10**24 isn't exactly representable as a binary floating-point number:
>>> from decimal import Decimal
>>> Decimal(1e24)
Decimal('999999999999999983222784')
So the exact value of the approximation stored for 1e24 is 999999999999999983222784, and that indeed leaves a remainder of 4 when divided by 10.
To fix this, just stick to integer operations:
number = 10**24
x=0
for i in range(0, 5000):
    x += number//10**i % 10
print(x)
That prints 1. Much more efficient is to do, e.g.,
print(sum(int(ch) for ch in str(number)))
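If you would rather stay with arithmetic than go through a string, a loop that stops as soon as the number is exhausted avoids the fixed 5000 iterations. This is a sketch; digit_sum is a made-up name.

```python
def digit_sum(number):
    """Sum the decimal digits of an integer using only integer arithmetic."""
    number = abs(number)
    total = 0
    while number:
        number, digit = divmod(number, 10)  # peel off the last digit
        total += digit
    return total


big = 123456789 ** 10
print(digit_sum(10**24), digit_sum(big) == sum(int(ch) for ch in str(big)))
```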
This doesn't particularly answer the question but whenever you're looking at the decimal digits in a number you should use the decimal module:
from decimal import Decimal
sum(Decimal(number).as_tuple().digits)
Some advantages over the naïve sum(int(c) for c in str(int(abs(number)))):
sum(int(c) for c in str(int(abs(1.32))))
#>>> 1
sum(Decimal(1.5).as_tuple().digits)
#>>> 6
Note that
Decimal(1.3)
#>>> Decimal('1.3000000000000000444089209850062616169452667236328125')
because 1.3 is a floating-point number. You'd want
Decimal('1.3')
#>>> Decimal('1.3')
for an exact decimal. This might confuse your results for floating points.
A way to avoid this is to use Decimal(str(number)) instead of Decimal(number), which will give you a more obvious but less technically correct answer.
Also:
%~> python -m timeit -s "from decimal import Decimal" -s "number = 123456789**10" "sum(int(c) for c in str(int(abs(number))))"
10000 loops, best of 3: 70.4 usec per loop
%~> python -m timeit -s "from decimal import Decimal" -s "number = 123456789**10" "sum(Decimal(number).as_tuple().digits)"
100000 loops, best of 3: 11.8 usec per loop
But the real reason is that it's just conceptually simpler to think about the decimal representation than about the string.
This looks like Python 3. The "//1" is simply truncation to an integer value, and you don't need that at all if you use // for the first division:
x=0
for i in range(0, 5000):
    x += (number // 10**i)%10
print(x)
As for 10^24, that is equal to 18, so the sum is 1+8=9. (^ is C-style bitwise exclusive or for integer arguments.)
Try 10**24 if you wanted 1 with 24 zeroes following it.
About your loop boundary. To get the number of digits in the integer part of a number, add "math" to your imports and:
ndigits = 1 + int(math.log10(max(1, abs(n))))
If you know you don't have negative numbers, you can leave out the abs().
If you know you don't have numbers less than 1, you can leave out the max() and the abs().
