Number of occurrences of digit in numbers from 0 to n - python

Given a number n, count the total number of occurrences of the digits 0, 2 and 4 in all numbers from 0 to n, inclusive.
Example1:
n = 10
output: 4
Example2:
n = 22
output: 11
My Code:
n = 22
def count_digit(n):
    count = 0
    for i in range(n+1):
        if '2' in str(i):
            count += 1
        if '0' in str(i):
            count += 1
        if '4' in str(i):
            count += 1
    return count
count_digit(n)
Code Output: 10
Desired Output: 11
Constraints: 1 <= N <= 10^5
Note: The solution should not cause outOfMemoryException or Time Limit Exceeded for large numbers.

You can increment your count like this:
def count_digit(n):
    count = 0
    for i in range(n + 1):
        if '2' in str(i):
            count += str(i).count('2')
        if '0' in str(i):
            count += str(i).count('0')
        if '4' in str(i):
            count += str(i).count('4')
    return count
In that way, edge cases like 22, 44, and so on are covered!

There are numbers in which a target digit is repeated, such as 22, so for those you must add the number of occurrences instead of adding 1.
>>>
>>> string = ','.join(map(str,range(23)))
>>>
>>> string
'0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22'
>>>
>>> string.count('0') + string.count('2') + string.count('4')
11
>>>
n = 22
def count_digit(n):
    count = 0
    for i in map(str,range(n+1)):
        count+=i.count('0')
        count+=i.count('2')
        count+=i.count('4')
    return count
print(count_digit(n))
That solution is fast. It can be developed to be faster:
def count_digit(n):
    i=0
    count=0
    s='024'
    # handle complete blocks of ten (i .. i+9) at once
    while i+9<=n:
        j = 0
        for v in str(i):
            if v in s:
                j+=1
        count+=3*j + (7*(j-1))
        i+=10
    # count the remaining numbers one by one
    for i in range(i,n+1,1):
        for v in str(i):
            if v in s:
                count+=1
    return count

TL;DR: If you do it right, you can compute the count about a thousand times faster for n close to 10**5, and since the better algorithm uses time proportional to the number of digits in n, it can easily handle even values of n too large for a 64-bit integer.
As is often the case with puzzles like this ("in the numbers from x to y, how many...?"), the key is to find a way to compute an aggregate count, ideally in O(1), for a large range. For combinatorics over the string representation of numbers, a convenient range is often something like the set of all numbers whose string representation is a given size, possibly with a specific prefix. In other words, ranges of the form [prefix*10⁴, prefix*10⁴+9999], where the number of 0s following the prefix in the lower limit is the same as the number of 9s in the upper limit and the exponent of 10 in the multiplier. (It's often actually more convenient to use half-open ranges, where the lower limit is inclusive and the upper limit is exclusive, so the above example would be [prefix*10⁴, (prefix+1)*10⁴).)
Also note that if the problem is to compute a count for [x, y), and you only know how to compute [0, y), then you just do two computations, because
count [x, y) == count [0, y) - count [0, x)
That identity is one of the simplifications which half-open intervals allow.
That would work nicely with this problem, because it's clear how many times a digit d occurs in the set of all k-digit suffixes for a given prefix. (In the 10^k suffixes, every digit has the same frequency as every other digit; there are a total of k×10^k digits in those 10^k suffixes, and since all digits have the same count, that count must be k×10^(k−1).) Then you just have to add the digit count of the prefix, but the prefix appears exactly 10^k times, and each appearance contributes the same count.
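The frequency claim above can be checked by brute force for small k. This is only a sketch: the function name suffix_digit_counts is made up for illustration, and it enumerates all suffixes, so it is meant for small k only.

```python
from collections import Counter
from itertools import product

def suffix_digit_counts(k):
    """Count every digit over all 10**k zero-padded k-digit suffixes."""
    counts = Counter()
    for suffix in product("0123456789", repeat=k):
        counts.update(suffix)
    return counts

k = 3
counts = suffix_digit_counts(k)
expected = k * 10 ** (k - 1)  # k * 10^(k-1), i.e. 300 for k = 3
print(all(counts[d] == expected for d in "0123456789"))
```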
So you could take a number like 72483 and decompose the range from 0 to 72483 into the following ranges. The number of ranges roughly corresponds to the sum of the digits in 72483, plus a few ranges containing numbers with fewer digits.
[0, 9]
[10, 99]
[100, 999]
[1000, 9999]
[10000, 19999]
[20000, 29999]
[30000, 39999]
[40000, 49999]
[50000, 59999]
[60000, 69999]
[70000, 70999]
[71000, 71999]
[72000, 72099]
[72100, 72199]
[72200, 72299]
[72300, 72399]
[72400, 72409]
[72410, 72419]
[72420, 72429]
[72430, 72439]
[72440, 72449]
[72450, 72459]
[72460, 72469]
[72470, 72479]
[72480, 72480]
[72481, 72481]
[72482, 72482]
[72483, 72483]
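The list above can be generated mechanically. The following is a sketch of one way to do it (the name decompose is hypothetical, and inclusive ranges are used to match the list): first the ranges of shorter numbers, then, for each digit position of n, one range per smaller digit at that position.

```python
def decompose(n):
    """Split [0, n] into inclusive ranges sharing a prefix and covering a full suffix."""
    s = str(n)
    length = len(s)
    ranges = []
    # All numbers with fewer digits than n: [0, 9], [10, 99], ...
    for k in range(1, length):
        low = 0 if k == 1 else 10 ** (k - 1)
        ranges.append((low, 10 ** k - 1))
    # Numbers with as many digits as n, grouped by the first position
    # at which their digit is smaller than the digit of n.
    for p, ch in enumerate(s):
        width = 10 ** (length - p - 1)
        prefix = int(s[:p]) if p else 0
        first = 1 if p == 0 and length > 1 else 0  # no leading zero
        for c in range(first, int(ch)):
            low = (prefix * 10 + c) * width
            ranges.append((low, low + width - 1))
    ranges.append((n, n))
    return ranges

for low, high in decompose(72483):
    print([low, high])
```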
However, in the following code, I used a slightly different algorithm, which turned out to be a bit shorter. It considers the rectangle in which all the numbers from 0 to n are written out, including leading zeros, and then computes counts for each column. A column of digits in a rectangle of sequential integers follows a simple recurring pattern; the frequency can easily be computed by starting with the completely repetitive part of the column. After the complete repetitions, the remaining digits are in order, with each one except the last one appearing the same number of times. It's probably easiest to understand that by drawing out a small example on a pad of paper, but the following code should also be reasonably clear (I hope).
The one problem with that is that it counts leading zeros which don't actually exist, so it needs to be corrected by subtracting the leading zero count. Fortunately, that count is extremely easy to compute. If you consider a range ending with a five-digit number (which itself cannot start with a zero, since it wouldn't really be a five-digit number if it started with zero), then you can see that the range includes:
10000 numbers start with a zero
1000 more numbers which have a second leading zero
100 more numbers which have a third leading zero
10 more numbers which have a fourth leading zero
No numbers have five leading zeros, because we write 0 as such, not as an empty string.
That adds up to 11110, and it's easy to see how that generalises. That value can be computed without a loop, as (10⁵ − 1) / 9 − 1. That correction is done at the end of the following function:
def countd(m, s=(0,2,4)):
    if m < 0: return 0
    m += 1
    rv = 0
    rest = 0
    pos = 1
    while True:
        digit = m % 10
        m //= 10
        rv += m * pos * len(s)
        for d in s:
            if digit > d:
                rv += pos
            elif digit == d:
                rv += rest
        if m == 0:
            break
        rest += digit * pos
        pos *= 10
    if 0 in s:
        rv -= (10 * pos - 1) // 9 - 1
    return rv
That code could almost certainly be tightened up; I was just trying to get the algorithm down. But, as it is, its execution time is measured in microseconds, not milliseconds, even for much larger values of n.
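The leading-zero correction can be checked independently of the function. A small sketch (both function names are made up for illustration) that pads every number below 10^k to k digits, counts the zeros that the padding adds, and compares the total with the closed form:

```python
def leading_zeros_brute(k):
    """Zeros added when 0 .. 10**k - 1 are padded to k digits (0 is still written '0')."""
    total = 0
    for i in range(10 ** k):
        total += len(str(i).zfill(k)) - len(str(i))
    return total

def leading_zeros_formula(k):
    """Closed form from the text: (10**k - 1) / 9 - 1."""
    return (10 ** k - 1) // 9 - 1

print(leading_zeros_brute(5), leading_zeros_formula(5))  # both are 11110
```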
Here's an update of Kelly's benchmark; I removed the other solutions because they were taking too long for the last value of n:
Try it online!

Another brute force, seems faster:
def count_digit(n):
s = str(list(range(n+1)))
return sum(map(s.count, '024'))
Benchmark with n = 10**5:
result   time    solution
115474   244 ms  original
138895    51 ms  Kelly
138895   225 ms  islam_abdelmoumen
138895   356 ms  CodingDaveS
Code (Try it online!):
from timeit import default_timer as time
def original(n):
    count = 0
    for i in range(n+1):
        if '2' in str(i):
            count += 1
        if '0' in str(i):
            count += 1
        if '4' in str(i):
            count += 1
    return count
def Kelly(n):
    s = str(list(range(n+1)))
    return sum(map(s.count, '024'))
def islam_abdelmoumen(n):
    count = 0
    for i in map(str,range(n+1)):
        count+=i.count('0')
        count+=i.count('2')
        count+=i.count('4')
    return count
def CodingDaveS(n):
    count = 0
    for i in range(n + 1):
        if '2' in str(i):
            count += str(i).count('2')
        if '0' in str(i):
            count += str(i).count('0')
        if '4' in str(i):
            count += str(i).count('4')
    return count
funcs = original, Kelly, islam_abdelmoumen, CodingDaveS
print('result time solution')
print()
for _ in range(3):
    for f in funcs:
        t = time()
        print(f(10**5), ' %3d ms ' % ((time()-t)*1e3), f.__name__)
    print()

I ended up with a similar answer to rici's, though perhaps with a slightly different phrasing of the numeric formulation. The number of instances of each digit in each position ("counts for each column," as rici described) can be formulated in two parts. The first part is p * floor(n / (10 * p)), where p is 10 raised to the power of the position. For example, in position 0 (the rightmost), there is one 1 for each ten numbers. Counting the 0's, however, requires an additional check regarding the population of the current and next position.
To the first part we still need to add the counts attributed to the remainder of the division. For example, for n = 6, floor(6 / 10) = 0 but we do have one count of 2 and one of 4. We add p if the digit in that position in n is greater than the digit we're counting; or, if the digit is the same, we add the value on the right of the digit plus 1 (for example, for n = 45, we want to count the 6 instances where 4 appears in position 1: 40, 41, 42, 43, 44, 45).
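The two parts described above can be sketched in Python for a single nonzero digit. This is an illustration, not the answer's code: the name count_nonzero_digit is hypothetical, and the digit 0 is deliberately left out because it needs the extra leading-zero check mentioned earlier.

```python
def count_nonzero_digit(n, d):
    """Occurrences of digit d (1..9) in all numbers 0..n, counted column by column."""
    if not 1 <= d <= 9:
        raise ValueError("this sketch only handles digits 1..9")
    total = 0
    p = 1  # 10 ** position
    while p <= n:
        total += p * (n // (10 * p))  # complete cycles of 0..9 in this column
        digit = (n // p) % 10
        if digit > d:
            total += p                # the partial cycle contains a full run of d
        elif digit == d:
            total += n % p + 1        # the run of d is cut off at n
        p *= 10
    return total

print(count_nonzero_digit(45, 4))  # 11: five in position 0, six in position 1
```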
JavaScript code, comparing with rici's instantly for all numbers from 1 to 600,000. (If I'm not mistaken, rici's code as ported below returns 0 for n = 0, when the answer should be 1.)
function countd(m, s = [0,2,4]) {
  if (m <= 0)
    return 0
  m += 1
  let rv = 0
  let rest = 0
  let pos = 1
  while (true) {
    const digit = m % 10
    m = Math.floor(m / 10)
    rv += m * pos * s.length
    for (const d of s) {
      if (digit > d)
        rv += pos
      else if (digit == d)
        rv += rest
    }
    if (m == 0) {
      break
    }
    rest += digit * pos
    pos *= 10
  }
  if (s.includes(0)) {
    rv -= Math.floor((10 * pos - 1) / 9) - 1
  }
  return rv
}
function f(n, ds = [0, 2, 4]) {
  // Value on the right of position
  let curr = 0;
  let m = n;
  // 10 to the power of position
  let p = 1;
  let result = 1;
  while (m) {
    const digit = m % 10;
    m = Math.floor(m / 10);
    for (const d of ds) {
      if (d != 0 || n >= 11 * p) {
        result += p * Math.floor((n - (d ? 0 : 10 * p)) / (10 * p));
      }
      if (digit > d && (d != 0 || m > 0)) {
        result += p;
      } else if (digit == d) {
        result += curr + 1;
      }
    }
    curr += p * digit;
    p *= 10;
  }
  return result;
}
for (let n = 1; n <= 600000; n += 1) {
  const _f = f(n);
  const _countd = countd(n);
  if (_f != _countd) {
    console.log(`n: ${ n }`);
    console.log(_f, _countd);
    break;
  }
}
console.log("Done.");

Using single branch conditional
def count_digit(n):
    s = '024'
    out = 0
    for integer in map(str, range(n+1)): # integer as string
        for digit in integer:
            if digit in s:
                out += 1
    return out
or more compactly
def count_digit(n):
    s = '024'
    return sum(1 for i in map(str, range(n+1)) for d in i if d in s)

Related

Search in sorted array

There is a fairly simple task: find values in a sorted array, which may contain duplicates, and print their indices to standard output on a single line.
First line of the input contains the numbers N and k, separated by a space.
N is the count of numbers and k is the number of queries to perform.
The next line or lines contain N numbers in non-decreasing order (data) and k numbers (queries) to search for in the input sequence.
Numbers are separated by spaces and ends of lines.
Read the data into memory and for each request find its first position i in the sequence (i.e., the smallest value i for which data[i]=x). Positions are indexed from 1 to N.
Write all these indices to standard output on a single line, separated by spaces. If the requested number is not present in the sequence, output 0 instead of its position. If the number is present more than once, output the index of its first occurrence. The size of the sequence (N) and the number of requests (k) are at most 1 000 000.
def custom_search(arr, target) -> int:
    n = len(arr) + 1
    for i in range(1, n):
        if (arr[i-1] == target):
            return(i)
    return(0)
def give_numbers():
    inputs = list(map(int, input().split()))
    if len(inputs) != 2:
        return([], None, None)
    n, m = inputs
    if ((n < 1 or n > 1000000) or (m < 1 or m > 1000000)):
        return([], None, None)
    i = 2
    stuff = []
    while i >= 1:
        stuff.append(list(map(int, input().split())))
        i -= 1
    return(stuff, n, m)
inpt, n, m = give_numbers()
if len(inpt) != 0:
    N, k = inpt
    if n == len(N) and m == len(k):
        for i in k:
            print(custom_search(N, i), end=" ")
Inputs:
10 4
4 8 9 9 9 9 18 28 32 100
4 9 28 32
Outputs:
1 3 8 9
Is there any better way to avoid O(n) in searching in ordered array and speed this up?
The algorithm you are looking for is called binary search, and its time complexity is O(log2(N)). Here is a Python function that takes 2 parameters:
The value you are looking for
The sorted array
and returns the first position i, counted from 1 as the task requires, at which the array holds that value, or 0 if the value is absent
def find_first_appearence(value, array):
    position = 0
    left = 0
    right = len(array) - 1
    while left <= right:
        middle = left + (right - left) // 2
        if array[middle] >= value:
            # possible match; keep looking further left
            right = middle - 1
            position = middle
        else:
            left = middle + 1
    if not array or array[position] != value:
        return 0
    return position + 1  # 1-based, so 0 can mean "not found"
Have you considered implementing some sort of binary search?
Divide the array in half; if the value searched for is greater than the middle value, take the second half and keep going. In pseudocode:
found = false
while(!found && array.length > 1){
  i = floor(array.length / 2);
  if (array[i]==searchedValue) return true
  else if (array[i]>searchedValue) array = array.slice(0, i)
  else array = array.slice(i+1, array.length)
}
if (array.length > 0 && array[0] == searchedValue) found = true
return found
This will decrease the complexity to O(log(n))
You can use a modified binary search that finds the leftmost occurrence of the given target in the given array:
int binsearchLeftmost(int l, int r, int target, const std::vector<int>& array) {
int res = 0;
while (l <= r) {
int m = l + (r - l) / 2;
if (array[m] > target) {
r = m - 1;
}
else if (array[m] < target) {
l = m + 1;
}
else {
res = m + 1;
r = m - 1;
}
}
return res;
}
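Since the question is in Python, the same leftmost search can also be sketched with the standard library's bisect module. The helper name first_position is made up for illustration; bisect_left returns the leftmost insertion point, which is the first occurrence when the value is present.

```python
from bisect import bisect_left

def first_position(data, x):
    """1-based index of the first occurrence of x in sorted data, or 0 if absent."""
    i = bisect_left(data, x)
    return i + 1 if i < len(data) and data[i] == x else 0

data = [4, 8, 9, 9, 9, 9, 18, 28, 32, 100]
print(*(first_position(data, q) for q in [4, 9, 28, 32]))  # 1 3 8 9
```

Each query then costs O(log N) instead of O(N), which matters with up to 1 000 000 queries.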

Fibonacci series in bit string

I am working on Fibonacci series but in bit string which can be represented as:
f(0)=0;
f(1)=1;
f(2)=10;
f(3)=101;
f(4)=10110;
f(5)=10110101;
Secondly, I have a pattern, for example '10', and I want to count how many times it occurs in a particular string of the series. For example, the string for n = 5 is '10110101', so '10' occurs 3 times.
My code runs correctly without error, but the problem is that it cannot run for values of n above 45, and I want to run it for n = 100.
Can anyone help? I only want to calculate the count of occurrences.
n=5
fibonacci_numbers = ['0', '1']
for i in range(1,n):
    fibonacci_numbers.append(fibonacci_numbers[i]+fibonacci_numbers[i-1])
    #print(fibonacci_numbers[-1])
print(fibonacci_numbers[-1])
nStr = str (fibonacci_numbers[-1])
pattern = '10'
count = 0
flag = True
start = 0
while flag:
    a = nStr.find(pattern, start)
    if a == -1:
        flag = False
    else:
        count += 1
        start = a + 1
print(count)
This is a fun one! The trick is that you don't actually need that giant bit string, just the number of 10s it contains and the edges. This solution runs in O(n) time and O(1) space.
from typing import NamedTuple
class FibString(NamedTuple):
    """First digit, last digit, and the number of 10s in between."""
    first: int
    tens: int
    last: int
def count_fib_string_tens(n: int) -> int:
    """Count the number of 10s in a n-'Fibonacci bitstring'."""
    def combine(b: FibString, a: FibString) -> FibString:
        """Combine two FibStrings."""
        tens = b.tens + a.tens
        # mind the edges!
        if b.last == 1 and a.first == 0:
            tens += 1
        return FibString(b.first, tens, a.last)
    # First two values are 0 and 1 (tens=0 for both)
    a, b = FibString(0, 0, 0), FibString(1, 0, 1)
    for _ in range(1, n):
        a, b = b, combine(b, a)
    return b.tens  # tada!
I tested this against your original implementation and sure enough it produces the same answers for all values that the original function is able to calculate (but it's about eight orders of magnitude faster by the time you get up to n=40). The answer for n=100 is 218922995834555169026 and it took 0.1ms to calculate using this method.
The nice thing about the Fibonacci sequence that will solve your issue is that you only need the last two values of the sequence. 10110 is made by combining 101 and 10. After that 10 is no longer needed. So instead of appending, you can just keep the two values. Here is what I've done:
n=45
fibonacci_numbers = ['0', '1']
for i in range(1,n):
    temp = fibonacci_numbers[1]
    fibonacci_numbers[1] = fibonacci_numbers[1] + fibonacci_numbers[0]
    fibonacci_numbers[0] = temp
Note that it still uses a decent amount of memory, but it didn't give me a memory error (it does take a bit of time to run though).
I also wasn't able to print the full string as I got an OSError [Errno 5] Input/Output error but it can still count and print that output.
For larger numbers, storing as a string is going to quickly cause a memory issue. In that case, I'd suggest doing the fibonacci sequence with plain integers and then converting to bits. See here for tips on binary conversion.
While the regular Fibonacci sequence doesn't work in a direct sense, consider that 10 is 2 and 101 is 5. Adding 5 + 2 doesn't work: you want 10110, that is, the OR operation 10100 | 10, which yields 22. So if you shift one value by the length of the other, you get the result. For example:
x = 5
y = 2
(x << 2) | y
>> 22
Shifting x by the number of bits representing y and then doing a bitwise or with | solves the issue. Python summarizes these bitwise operations well here. All that's left for you to do is determine how many bits to shift and implement this into your for loop!
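A minimal sketch of that loop follows. The function name fib_bits is hypothetical, and the bit lengths are tracked explicitly (an assumption of this sketch) because f(0) = '0' is one bit long even though the integer 0 has a bit length of 0.

```python
def fib_bits(n):
    """Return (value, length) of the n-th 'Fibonacci bit string' as an integer."""
    prev, cur = (0, 1), (1, 1)  # f(0) = '0', f(1) = '1', each one bit long
    if n == 0:
        return prev
    for _ in range(1, n):
        # f(k+1) is f(k) followed by f(k-1): shift f(k) left by len(f(k-1)), then OR.
        value = (cur[0] << prev[1]) | prev[0]
        prev, cur = cur, (value, cur[1] + prev[1])
    return cur

value, length = fib_bits(5)
print(format(value, "0{}b".format(length)))  # 10110101
```

The integers still grow as fast as the strings do, so this reduces the constant factor rather than the growth in memory.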
For really large n you will still have a memory issue. [Plot not shown]
Finally I got the answer, but can someone briefly explain why it works?
def count(p, n):
count = 0
i = n.find(p)
while i != -1:
n = n[i + 1:]
i = n.find(p)
count += 1
return count
def occurence(p, n):
a1 = "1"
a0 = "0"
lp = len(p)
i = 1
if n <= 5:
return count(p, atring(n))
while lp > len(a1):
temp = a1
a1 += a0
a0 = temp
i += 1
if i >= n:
return count(p, a1)
fn = a1[:lp - 1]
if -lp + 1 < 0:
ln = a1[-lp + 1:]
else:
ln = ""
countn = count(p, a1)
a1 = a1 + a0
i += 1
if -lp + 1 < 0:
lnp1 = a1[-lp + 1:]
else:
lnp1 = ""
k = 0
countn1 = count(p, a1)
for j in range(i + 1, n + 1):
temp = countn1
countn1 += countn
countn = temp
if k % 2 == 0:
string = lnp1 + fn
else:
string = ln + fn
k += 1
countn1 += count(p, string)
return countn1
def atring(n):
a0 = "0"
a1 = "1"
if n == 0 or n == 1:
return str(n)
for i in range(2, n + 1):
temp = a1
a1 += a0
a0 = temp
return a1
def fn():
a = 100
p = '10'
print( occurence(p, a))
if __name__ == "__main__":
fn()

Calculating the sum of the 4th power of each digit, why do I get a wrong result?

I am trying to complete Project Euler question #30, and I decided to verify my code against a known answer. Basically the question is this:
Find the sum of all the numbers that can be written as the sum of fifth powers of their digits.
Here is the known answer I am trying to prove with python:
1634 = 1^4 + 6^4 + 3^4 + 4^4
8208 = 8^4 + 2^4 + 0^4 + 8^4
9474 = 9^4 + 4^4 + 7^4 + 4^4
As 1 = 1^4 is not a sum it is not included.
The sum of these numbers is 1634 + 8208 + 9474 = 19316.
When I run my code I get all three of the values which add up to 19316, great! However among these values there is an incorrect one: 6688
Here is my code:
i=1
answer = []
while True:
    list = []
    i=i+1
    digits = [int(x) for x in str(i)]
    for x in digits:
        a = x**4
        list.append(a)
        if sum(list) == i:
            print(sum(list))
            answer.append(sum(list))
The sum of list returns the three correct values, and the value 6688. Can anybody spot something I have missed?
You are checking the sum too early. You check for a matching sum for each individual digit in the number, and 6 ^ 4 + 6 ^ 4 + 8 ^ 4 is 6688. That's three of the digits, not all four.
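To see the false positive concretely, here is a small sketch (the helper name running_power_sums is made up for illustration) that prints the partial sums the original loop compares against i:

```python
from itertools import accumulate

def running_power_sums(i, n=4):
    """Partial sums of the n-th powers of the digits of i, left to right."""
    return list(accumulate(int(d) ** n for d in str(i)))

print(running_power_sums(6688))  # [1296, 2592, 6688, 10784]
```

The third partial sum equals 6688, so a test made inside the loop fires before the last digit has been added.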
Move your sum() test out of your for loop:
for x in digits:
    a = x**4
    list.append(a)
if sum(list) == i:
    print(sum(list))
    answer.append(sum(list))
At best you could discard a number early when the sum already exceeds the target:
digitsum = 0
for d in digits:
    digitsum += d ** 4
    if digitsum > i:
        break
else:
    if digitsum == i:
        answer.append(i)
but I'd not bother with that here, and just use a generator expression to combine determining the digits, raising them to the 4th power, and summing:
if sum(int(d) ** 4 for d in str(i)) == i:
    answer.append(i)
You haven't defined an upper bound: the point beyond which numbers will always be bigger than the sum of the powers of their digits, so that you can stop incrementing i. For the sum of nth powers, you can find such a point by taking 9 ^ n and counting its digits, then multiplying that digit count by 9 ^ n (the largest sum that many digits can produce). If the product has more digits, repeat with the new digit count until the number of digits no longer changes.
In the same vein, you can start i at max(10, 1 + 2 ** n), because the smallest sum you'll be able to make from digits will be using a single 2 digit plus the minimum number of 1 and 0 digits you can get away with, and at any power greater than 1, the power of digits other than 1 and 0 is always greater than the digit value itself, and you can't use i = 1:
def determine_bounds(n):
    """Given a power n > 1, return the lower and upper bounds in which to search"""
    nine_power, digit_count = 9 ** n, 1
    while True:
        upper = digit_count * nine_power
        new_count = len(str(upper))
        if new_count == digit_count:
            return max(10, 1 + 2 ** n), upper
        digit_count = new_count
If you combine the above function with range(*<expression>) variable-length parameter passing to range(), you can use a for loop:
for i in range(*determine_bounds(4)):
    # ...
You can put determining if a number is equal to the sum of its digits raised to a given power n in a function:
def is_digit_power_sum(i, n):
    return sum(int(d) ** n for d in str(i)) == i
then you can put everything into a list comprehension:
>>> n = 4
>>> [i for i in range(*determine_bounds(n)) if is_digit_power_sum(i, n)]
[1634, 8208, 9474]
>>> n = 5
>>> [i for i in range(*determine_bounds(n)) if is_digit_power_sum(i, n)]
[4150, 4151, 54748, 92727, 93084, 194979]
The is_digit_power_sum() could benefit from a cache of powers; adding a cache makes the function more than twice as fast for 4-digit inputs:
def is_digit_power_sum(i, n, _cache={}):
    try:
        powers = _cache[n]
    except KeyError:
        powers = _cache[n] = {str(d): d ** n for d in range(10)}
    return sum(powers[d] for d in str(i)) == i
and of course, the solution to the question is the sum of the numbers:
n = 5
answer = sum(i for i in range(*determine_bounds(n)) if is_digit_power_sum(i, n))
print(answer)
which produces the required output in under half a second on my 2.9 GHz Intel Core i7 MacBook Pro, using Python 3.8.0a3.
Here it is, fixed:
i=1
answer = []
while True:
    list = []
    i=i+1
    digits = [int(x) for x in str(i)]
    for x in digits:
        a = x**4
        list.append(a)
        if sum(list) == i and len(list) == 4:
            print(sum(list))
            answer.append(sum(list))
The bug I found:
6^4+6^4+8^4 = 6688
So I just put in a check on the length of the list.

Need better logic in finding the count of palindrome numbers in the range

I have two numbers say A = 10 and B =20.
Now I need to count the palindrome numbers in Range (A,B)
I tried this:
s = list(map(int,raw_input().split()))
a = s[0]
b = s[1]
l = range(s[0],s[1]+1)
# print "list : ",l
def isNumberPalindrome(n):
    return str(n) == str(n)[::-1]
x = filter(isNumberPalindrome, l)
# print " All Palindorme numbers : ",x
count = len(x)
print count
I run out of memory if A and B are on the order of 10^18.
Can somebody suggest how to solve this?
Thanks in advance
Use a generator instead of calling range().
from __future__ import print_function
def isNumberPalindrome(n):
    return str(n) == str(n)[::-1]
a = pow(10, 18)
b = pow(10, 19) + 1
def gen_range(start, end):
    i = long(start)
    while i < end:
        yield i
        i = i + 1
count = 0
for l in gen_range(a, b):
    count += isNumberPalindrome(l)
print(count)
It is not the whole answer. But consider this:
For the range 10^n to 10^(n+1) you can find the number of palindromes in constant time. It is 10^(floor(n/2)+1) - 10^floor(n/2), because (for example) for n = 6 and the range from 10^6 to 10^7 (1 000 000 - 10 000 000) the number of palindromes is the number of possible numbers from 1000 to 10000, which represent the first half of the original numbers (4315 is the first half of 4315134, and so on).
So you don't filter numbers but find how many palindromes you can generate in such range.
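As a sketch of that idea, the count can be written in terms of the number of digits d (the range 10^n to 10^(n+1) holds the numbers with d = n + 1 digits) and compared with brute force for small d. Both function names are made up for illustration.

```python
def palindromes_with_digits(d):
    """Number of d-digit palindromes: the first ceil(d/2) digits determine the rest."""
    half = (d + 1) // 2
    return 10 ** half - 10 ** (half - 1)

def palindromes_with_digits_brute(d):
    """Count d-digit palindromes by testing every d-digit number (0 excluded)."""
    return sum(str(i) == str(i)[::-1] for i in range(10 ** (d - 1), 10 ** d))

for d in range(1, 7):
    print(d, palindromes_with_digits(d), palindromes_with_digits_brute(d))
```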

How to calculate no. of palindroms in a large number interval?

I want to count how many numbers are palindromes in a large interval, say up to 10^15.
My simple code (Python) snippet is:
def count_palindromes(start, end):
    count = 0
    for i in range(start, end + 1):
        if str(i) == str(i)[::-1]:
            count += 1
    return count
start = 1000 #some initial number
end = 10000000000000 #some other large number
if __name__ == "__main__":
    print(count_palindromes(start, end))
It's a simple program which checks each number one by one. It's very time consuming and takes a lot of computer resources.
Is there any other method/technique by which we can count palindrome numbers? Any algorithm to use for this?
I want to minimize the time taken to produce the output.
When you want to count the numbers having some given property between two limits, it is often useful to solve the somewhat simpler problem
How many numbers with the given property are there between 0 and n?
Keeping one limit fixed can make the problem significantly simpler to tackle. When the simpler problem is solved, you can get the solution to the original problem with a simple subtraction:
countBetween(a,b) = countTo(b) - countTo(a)
or countTo(b ± 1) - countTo(a ± 1), depending on whether the limit is included in countTo and which limits shall be included in countBetween.
If negative limits can occur (not for palindromes, I presume), countTo(n) should be <= 0 for negative n (one can regard the function as an integral with respect to the counting measure).
So let us determine
palindromes_below(n) = #{ k : 0 <= k < n, k is a palindrome }
We get more uniform formulae for the first part if we pretend that 0 is not a palindrome, so for the first part, we do that.
Part 1: How many palindromes with a given number d of digits are there?
The first digit cannot be 0, otherwise it's unrestricted, hence there are 9 possible choices (b-1 for palindromes in an arbitrary base b).
The last digit is equal to the first by the fact that it shall be a palindrome.
The second digit - if d >= 3 - can be chosen arbitrarily and independently from the first. That also determines the penultimate digit.
If d >= 5, one can also freely choose the third digit, and so on.
A moment's thought shows that for d = 2*k + 1 or d = 2*k + 2, there are k digits that can be chosen without restriction, and one digit (the first) that is subject to the restriction that it be non-zero. So there are
9 * 10**k
d-digit palindromes then ((b-1) * b**k for base b).
That's a nice and simple formula. From that, using the formula for a geometric sum, we can easily obtain the number of palindromes smaller than 10**n (that is, with at most n digits):
if n is even, the number is
2 * ∑_{k=0}^{n/2-1} 9*10**k = 18 * ∑_{k=0}^{n/2-1} 10**k = 18 * (10**(n/2) - 1) / (10 - 1) = 2 * (10**(n/2) - 1)
if n is odd, the number is
2 * (10**((n-1)/2) - 1) + 9 * 10**((n-1)/2) = 11 * 10**((n-1)/2) - 2
(for general base b, the numbers are 2 * (b**(n/2) - 1) resp. (b+1) * b**((n-1)/2) - 2).
That's not quite as uniform anymore, but still simple enough:
def palindromes_up_to_n_digits(n):
    if n < 1:
        return 0
    if n % 2 == 0:
        return 2*10**(n//2) - 2
    else:
        return 11*10**(n//2) - 2
(remember, we don't count 0 yet).
Now for the remaining part. Given n > 0 with k digits, the palindromes < n are either
palindromes with fewer than k digits, there are palindromes_up_to_n_digits(k-1) of them, or
palindromes with exactly k digits that are smaller than n.
So it remains to count the latter.
Part 2:
Let m = (k-1)//2 and
d[1] d[2] ... d[m] d[m+1] ... d[k]
the decimal representation of n (the whole thing works with the same principle for other bases, but I don't explicitly mention that in the following), so
n = ∑_{j=1}^{k} d[j]*10**(k-j)
For each 1 <= c[1] < d[1], we can choose the m digits c[2], ..., c[m+1] freely to obtain a palindrome
p = c[1] c[2] ... c[m+1] {c[m+1]} c[m] ... c[2] c[1]
(the digit c[m+1] appears once for odd k and twice for even k). Now,
c[1]*(10**(k-1) + 1) <= p < (c[1] + 1)*10**(k-1) <= d[1]*10**(k-1) <= n,
so all these 10**m palindromes (for a given choice of c[1]!) are smaller than n.
Thus there are (d[1] - 1) * 10**m k-digit palindromes whose first digit is smaller than the first digit of n.
Now let us consider the k-digit palindromes with first digit d[1] that are smaller than n.
If k == 2, there is one if d[1] < d[2] and none otherwise. If k >= 3, for each 0 <= c[2] < d[2], we can freely choose the m-1 digits c[3] ... c[m+1] to obtain a palindrome
p = d[1] c[2] c[3] ... c[m] c[m+1] {c[m+1]} c[m] ... c[3] c[2] d[1]
We see p < n:
d[1]*(10**(k-1) + 1) + c[2]*(10**(k-2) + 10)
<= p < d[1]*(10**(k-1) + 1) + (c[2] + 1)*(10**(k-2) + 10)
<= d[1]*(10**(k-1) + 1) + d[2]*(10**(k-2) + 10) <= n
(assuming k > 3, for k == 3 replace 10**(k-2) + 10 with 10).
So that makes d[2]*10**(m-1) k-digit palindromes with first digit d[1] and second digit smaller than d[2].
Continuing, for 1 <= r <= m, there are
d[r+1]*10**(m-r)
k-digit palindromes whose first r digits are d[1] ... d[r] and whose r+1st digit is smaller than d[r+1].
Summing up, there are
(d[1]-1)*10**m + d[2]*10**(m-1) + ... + d[m]*10 + d[m+1]
k-digit palindromes that have one of the first m+1 digits smaller than the corresponding digit of n and all preceding digits equal to the corresponding digit of n. Obviously, these are all smaller than n.
There is one k-digit palindrome p whose first m+1 digits are d[1] .. d[m+1], we must count that too if p < n.
So, wrapping up, and now incorporating 0 too, we get
def palindromes_below(n):
    if n < 1:
        return 0
    if n < 10:
        return n # 0, 1, ..., n-1
    # General case
    dec = str(n)
    digits = len(dec)
    count = palindromes_up_to_n_digits(digits-1) + 1 # + 1 for 0
    half_length = (digits-1) // 2
    front_part = dec[0:half_length + 1]
    count += int(front_part) - 10**half_length
    i, j = half_length, half_length+1
    if digits % 2 == 1:
        i -= 1
    while i >= 0 and dec[i] == dec[j]:
        i -= 1
        j += 1
    if i >= 0 and dec[i] < dec[j]:
        count += 1
    return count
Since the limits are both to be included in the count for the given problem (unless the OP misunderstood), we then have
def count_palindromes(start, end):
    return palindromes_below(end+1) - palindromes_below(start)
for a fast solution:
>>> bench(10**100,10**101-1)
900000000000000000000000000000000000000000000000000 palindromes between
10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
and
99999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999
in 0.000186920166016 seconds
Actually, it's a problem for Google Codejam (which I'm pretty sure you're not supposed to get outside help on) but alas, I'll throw in my 2 cents.
The idea I came up with (but failed to implement) for the large problem was to precompile (generated at runtime, not hardcoded into the source) a list of all palindromic numbers less than 10^15 (there's not very many, it takes like ~60 seconds) then find out how many of those numbers lie between the bounds of each input.
EDIT: This won't work on the 10^100 problem, like you said, that would be a mathematical solution (although there is a pattern if you look, so you'd just need an algorithm to generate all numbers with that pattern)
I presume this is for something like Project Euler... My rough idea would be to generate all numbers up to half the length of your limit (like, if you're going to 99999, go up to 99). Then reverse them, append them to the unreversed ones, and potentially add a digit in the middle (for the numbers with odd lengths). You might have to do some filtering for duplicates or weird ones (like if you had a zero at the beginning of the number or something), but that should be a lot faster than what you were doing.
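A sketch of that mirroring idea follows. It is a variation rather than the exact recipe above: the generator name palindromes_up_to is hypothetical, 0 is left out, and instead of inserting a separate middle digit it mirrors each "half" once without repeating its last digit (odd length) and once in full (even length). Because halves never start with a zero, no duplicates arise.

```python
def palindromes_up_to(limit):
    """Yield every palindrome in 1..limit (not in sorted order) by mirroring halves."""
    half = 1
    while True:
        s = str(half)
        odd = int(s + s[-2::-1])   # e.g. '12' -> 121; the last digit is the middle
        if odd > limit:
            return                 # every later palindrome is larger still
        yield odd
        even = int(s + s[::-1])    # e.g. '12' -> 1221
        if even <= limit:
            yield even
        half += 1

print(sorted(palindromes_up_to(200)))
```

Generating takes time proportional to the number of palindromes, roughly the square root of the limit, instead of testing every number.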
