How to split big numbers? - python

I have a big number, which I need to split into smaller numbers in Python. I wrote the following code to swap between the two:
def split_number(num, part_size):
    string = str(num)
    string_size = len(string)
    arr = []
    pointer = 0
    while pointer < string_size:
        e = pointer + part_size
        arr.append(int(string[pointer:e]))
        pointer += part_size
    return arr
def join_number(arr):
    num = ""
    for x in arr:
        num += str(x)
    return int(num)
But the number comes back different. The number is so large that this is hard to debug, so before digging in I thought I would post it here to see whether there is a better way to do it or whether I'm missing something obvious.
Thanks a lot.

Clearly, any leading 0s in the "parts" can't be preserved by this operation. Can't join_number also receive the part_size argument, so that it can reconstruct the string formats with all the leading zeros?
Without some information such as part_size that's known to both the sender and receiver, or the equivalent (such as the base number to use for a similar split and join based on arithmetic, roughly equivalent to 10**part_size given the way you're using part_size), the task becomes quite a bit harder. If the receiver is initially clueless about this, why not just place the part_size (or base, etc) as the very first int in the arr list that's being sent and received? That way, the encoding trivially becomes "self-sufficient", i.e., doesn't need any supplementary parameter known to both sender and receiver.
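A minimal sketch of that self-describing idea, assuming non-negative integers. The names split_with_size and join_with_size are hypothetical, not from the question. The part size travels as the first list element, and the digit string is left-padded so that every chunk has exactly part_size digits:

```python
def split_with_size(num, part_size):
    """Split a non-negative int into chunks; element 0 carries part_size."""
    digits = str(num)
    # Left-pad so the length is an exact multiple of part_size.
    padded_len = -(-len(digits) // part_size) * part_size
    digits = digits.zfill(padded_len)
    parts = [part_size]
    for start in range(0, len(digits), part_size):
        parts.append(int(digits[start:start + part_size]))
    return parts


def join_with_size(parts):
    """Rebuild the number, restoring the leading zeros each chunk lost."""
    part_size, chunks = parts[0], parts[1:]
    text = "".join(str(chunk).zfill(part_size) for chunk in chunks)
    return int(text)


parts = split_with_size(1000005, 3)
print(parts)                  # [3, 1, 0, 5]
print(join_with_size(parts))  # 1000005
```

The receiver needs nothing but the list itself, at the cost of one extra element.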

There is no need to convert to and from strings, which can be very time consuming for really large numbers.
>>> def split_number(n, part_size):
...     base = 10**part_size
...     L = []
...     while n:
...         n, part = divmod(n, base)
...         L.append(part)
...     return L[::-1]
...
>>> def join_number(L, part_size):
...     base = 10**part_size
...     n = 0
...     L = L[::-1]
...     while L:
...         n = n*base + L.pop()
...     return n
...
>>> print split_number(1000005,3)
[1, 0, 5]
>>> print join_number([1,0,5],3)
1000005
>>>
Here you can see that just converting the number to a str takes longer than my entire function!
>>> from time import time
>>> t=time();b = split_number(2**100000,3000);print time()-t
0.204252004623
>>> t=time();b = split_number(2**100000,30);print time()-t
0.486856222153
>>> t=time();b = str(2**100000);print time()-t
0.730905056

You should think of the following number split into 3-sized chunks:
1000005 -> 100 000 5
You have two problems. The first is that if you put those integers back together, you'll get:
100 0 5 -> 100005
(i.e., the middle one is 0, not 000) which is not what you started with. Second problem is that you're not sure what size the last part should be.
I would ensure that you're first using a string whose length is an exact multiple of the part size so you know exactly how big each part should be:
def split_number(num, part_size):
    string = str(num)
    string_size = len(string)
    while string_size % part_size != 0:
        string = "0%s"%(string)
        string_size = string_size + 1
    arr = []
    pointer = 0
    while pointer < string_size:
        e = pointer + part_size
        arr.append(int(string[pointer:e]))
        pointer += part_size
    return arr
Secondly, make sure that you put the parts back together with the right length for each part (ensuring you don't put leading zeros on the first part of course):
def join_number(arr, part_size):
    fmt_str = "%%s%%0%dd"%(part_size)
    num = arr[0]
    for x in arr[1:]:
        num = fmt_str%(num,int(x))
    return int(num)
Tying it all together, the following complete program:
#!/usr/bin/python
def split_number(num, part_size):
    string = str(num)
    string_size = len(string)
    while string_size % part_size != 0:
        string = "0%s"%(string)
        string_size = string_size + 1
    arr = []
    pointer = 0
    while pointer < string_size:
        e = pointer + part_size
        arr.append(int(string[pointer:e]))
        pointer += part_size
    return arr
def join_number(arr, part_size):
    fmt_str = "%%s%%0%dd"%(part_size)
    num = arr[0]
    for x in arr[1:]:
        num = fmt_str%(num,int(x))
    return int(num)
x = 1000005
print x
y = split_number(x,3)
print y
z = join_number(y,3)
print z
produces the output:
1000005
[1, 0, 5]
1000005
which shows that it goes back together.
Just keep in mind I haven't done Python for a few years. There's almost certainly a more "Pythonic" way to do it with those new-fangled lambdas and things (or whatever Python calls them) but, since your code was of the basic form, I just answered with the minimal changes required to get it working. Oh yeah, and be wary of negative numbers :-)
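On the negative-number caveat: a minimal sketch of one way to handle it, assuming the sign can be carried separately from the parts. The names split_signed and join_signed are hypothetical. It chunks from the right so that only the first part can be short, and pads every later part back to part_size digits when joining:

```python
def split_signed(num, part_size):
    """Return (sign, parts) for any int; parts hold the digits of abs(num)."""
    sign = -1 if num < 0 else 1
    digits = str(abs(num))
    # Only the first part may be shorter than part_size.
    first_len = len(digits) % part_size or part_size
    parts = [int(digits[:first_len])]
    for start in range(first_len, len(digits), part_size):
        parts.append(int(digits[start:start + part_size]))
    return sign, parts


def join_signed(sign, parts, part_size):
    """Inverse of split_signed: re-pad every part after the first."""
    text = str(parts[0]) + "".join(str(p).zfill(part_size) for p in parts[1:])
    return sign * int(text)


sign, parts = split_signed(-1000005, 3)
print(sign, parts)                   # -1 [1, 0, 5]
print(join_signed(sign, parts, 3))   # -1000005
```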

Here's some code for Alex Martelli's answer.
def digits(n, base):
    while n:
        yield n % base
        n //= base
def split_number(n, part_size):
    base = 10 ** part_size
    return list(digits(n, base))
def join_number(digits, part_size):
    base = 10 ** part_size
    return sum(d * (base ** i) for i, d in enumerate(digits))

Related

How to reverse an int in python?

I'm creating a python script which prints out the whole song of '99 bottles of beer', but reversed. The only thing I cannot reverse is the numbers, being integers, not strings.
This is my full script,
def reverse(str):
    return str[::-1]
def plural(word, b):
    if b != 1:
        return word + 's'
    else:
        return word
def line(b, ending):
    print b or reverse('No more'), plural(reverse('bottle'), b), reverse(ending)
for i in range(99, 0, -1):
    line(i, "of beer on the wall")
    line(i, "of beer")
    print reverse("Take one down, pass it around")
    line(i-1, "of beer on the wall \n")
I understand my reverse function takes a string as an argument; however, I do not know how to take in an integer, or how to reverse the integer later on in the script.
Without converting the number to a string:
def reverse_number(n):
    r = 0
    while n > 0:
        r *= 10
        r += n % 10
        n //= 10
    return r
print(reverse_number(123))
You are approaching this in quite an odd way. You already have a reversing function, so why not make line just build the line the normal way around?
def line(bottles, ending):
    return "{0} {1} {2}".format(bottles,
                                plural("bottle", bottles),
                                ending)
Which runs like:
>>> line(49, "of beer on the wall")
'49 bottles of beer on the wall'
Then pass the result to reverse:
>>> reverse(line(49, "of beer on the wall"))
'llaw eht no reeb fo selttob 94'
This makes it much easier to test each part of the code separately and see what's going on when you put it all together.
Something like this?
>>> x = 123
>>> str(x)
'123'
>>> str(x)[::-1]
'321'
The best way is:
x = 12345
a = str(x)[::-1]  # build the reversed digits as a string (a == "54321")
a = int(a)  # convert the string back to an integer
Or, as a one-liner:
a = int(str(x)[::-1])
def reverse(x):
    re = 0
    negative = x < 0
    MAX_BIG = 2 ** 31 - 1
    MIN_BIG = -2 ** 31
    x = abs(x)
    while x != 0:
        a = int(x % 10)
        re = re * 10 + a
        x = int(x // 10)
    reverse = -1 * re if negative else re
    return 0 if reverse < MIN_BIG or reverse > MAX_BIG else reverse
This is for a 32-bit integer (-2^31 to 2^31 - 1).
def reverse_number(n):
    r = 0
    while n > 0:
        r = (r*10) + (n % 10)
        n //= 10
    return r
print(reverse_number(123))
You can cast an integer to string with str(i) and then use your reverse function.
The following line should do what you are looking for:
def line(b, ending):
    print reverse(str(b)) if b else reverse('No more'), plural(reverse('bottle'), b), reverse(ending)
Original number is taken in a
a = 123
We convert the int to string ,then reverse it and again convert in int and store reversed number in b
b = int("".join(reversed(str(a))))
Print the values of a and b
print(a,b)
def reverse_number(n):
    r = 0
    while n > 0:
        r *= 10
        r += n % 10
        n //= 10
    return r
print(reverse_number(123))
This code loses trailing zeros: for example, 100 and 1000 both return 1.
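The zeros disappear because an int has no notion of leading zeros, so no integer-returning version can keep them. A small sketch, assuming the reversed digits are only needed for display and the input is non-negative; reverse_digits is a hypothetical name:

```python
def reverse_digits(n):
    """Reverse the decimal digits of a non-negative int, returning a string."""
    return str(n)[::-1]


print(reverse_digits(100))       # 001
print(int(reverse_digits(100)))  # 1
```

Converting back to int is the step that drops the zeros, so keep the string if they matter.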
def reverse(num):
    rev = 0
    while num != 0:
        remainder = num % 10
        rev = (rev * 10) + remainder
        num = num // 10
    print("Reverse number is : ", rev)
num = input("enter number : ")
reverse(int(num))
# / always produces a float
# // is floor division: the result is a whole number, rounded toward the left on the number line
I think the following code should be good to reverse your positive integer.
You can use it as a function in your code.
n = input() # input is always taken as a string
rev = int(str(n)[::-1])
If n is already an integer, you need to convert it with str as shown. This is the quickest way to reverse a positive integer.
import math
def Function(inputt):
    a = 1
    input2 = inputt
    while(input2 > 9):
        input2 = input2/10
        a = a + 1
    print("There are ", a, " digits ")
    N = 10
    m = 1
    print(" The reversed digits are: ")
    for i in range(a):
        l = (inputt%N)/m
        print(math.floor(l), end = '')
        N = N*10
        m = m*10
    print(" \n")
    return 0
enter = int(input("Enter the number: "))
print(Function(enter))
More robust solution to handle negative numbers:
def reverse_integer(num):
    sign = [1,-1][num < 0]
    output = sign * int(str(abs(num))[::-1])
    return output
An easy and fast way to do it is as follows:
def reverse(x: int|str) -> int:
    num = str(x)
    reverse_x = int(''.join([dgt for dgt in reversed(num) if dgt != '-']))
    if '-' in num:
        reverse_x = -reverse_x
    return reverse_x
First we create a list (using a list comprehension) of the digits in reverse order. We must exclude the sign, otherwise the list would end with it, like [3, 2, 1, -]. We then turn the list into a string using the ''.join() method and convert it to an int.
Next we check whether the original number had a negative sign in it. If it did, we negate reverse_x.
Easily you can write this class:
class reverse_number:
    def __init__(self, rvs_num):
        self.rvs_num = rvs_num
        rvs_ed = int(str(rvs_num)[::-1])
        print(rvs_ed)
You can use it by writing:
reverse_number(your number)
I have written it in a different way, but it works
def isPalindrome(x: int) -> bool:
    if x<0:
        return False
    elif x<10:
        return True
    else:
        rev=0
        rem = x%10
        quot = x//10
        rev = rev*10+rem
        while (quot>=10):
            rem = quot%10
            quot = quot//10
            rev = rev*10+rem
        rev = rev*10+quot
        if rev==x:
            return True
        else:
            return False
res=isPalindrome(1221)

Converting a string to binary

I need some help converting a string to binary. I have to do it using my own code, not built in functions (except I can use 'ord' to get the characters into decimal).
The problem I have is that it only seems to convert the first character into binary, not all of the characters of the string. For instance, if you type "hello" it will convert the h to binary but not the whole thing.
Here's what I have so far
def convertFile():
    myList = []
    myList2 = []
    flag = True
    string = input("input a string: ")
    for x in string:
        x = ord(x)
        myList.append(x)
    print(myList)
    for i in range(len(myList)):
        for x in myList:
            print(x)
            quotient = x / 2
            quotient = int(quotient)
            print(quotient)
            remainder = x % 2
            remainder = int(remainder)
            print(remainder)
            myList2.append(remainder)
            print(myList2)
            if int(quotient) < 1:
                pass
            else:
                x = quotient
    myList2.reverse()
    print ("" .join(map(str, myList2)))
convertFile()
If you're just wanting "hex strings", you can use the following snippet:
''.join( '%x' % ord(i) for i in input_string )
Eg. 'hello' => '68656c6c6f', where 'h' => '68' in the ascii table.
def dec2bin(decimal_value):
    return magic_that_converts_a_decimal_to_binary(decimal_value)
ordinal_generator = (ord(letter) for letter in my_word)  # generators are lazily evaluated
bins = [dec2bin(ordinal_value) for ordinal_value in ordinal_generator]
print bins
As an aside, this is bad:
for x in myList:
    ...
    x = whatever
Once the loop returns to the top, whatever you assigned to x is thrown away and x is rebound to the next value in the list.
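For what it's worth, here is one way the dec2bin placeholder could be filled in with repeated division by 2. This is only a sketch under the question's constraint (ord is allowed, bin and format are not); string_to_binary is a hypothetical helper name. The inner loop keeps dividing the same value until it reaches zero, which is the step the question's code was missing:

```python
def dec2bin(value):
    """Return the binary digits of a non-negative int as a string."""
    if value == 0:
        return "0"
    bits = []
    while value > 0:
        # The remainder is the next bit, least significant first.
        value, remainder = value // 2, value % 2
        bits.append(str(remainder))
    return "".join(reversed(bits))


def string_to_binary(text):
    """Convert every character of text, not just the first one."""
    return [dec2bin(ord(ch)) for ch in text]


print(string_to_binary("hi"))  # ['1101000', '1101001']
```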

A cleaner way to generate a list of running IDs

I just made a function to generate a list of running IDs within a given range. Each ID begins with a letter followed by 5 digits (e.g. A00002). The function below works, but I was wondering if there was a cleaner way to do this. Thanks!
def running_ids(start, end):
    list = []
    start = int(start[1:])
    end = int(end[1:])
    steps = end - start
    def zeros(n):
        zeros = 5 - len(str(n))
        return zeros
    while start <= end:
        string = "A" + "0"*zeros(start) + str(start)
        list.append(string)
        start += 1
    return list
print running_ids('A00001', 'A00005')
['A00001', 'A00002', 'A00003', 'A00004', 'A00005']
Use a generator. This way you can generate the numbers as needed and not store them all at once. It also maintains the state of your counter, useful if you start building large projects and you forget to add one to your index. It's a very powerful way of programming in Python:
def running_id():
    n = 1
    while True:
        yield 'A{0:05d}'.format(n)
        n += 1
C = running_id()
for n in xrange(5):
    print next(C)
Giving:
A00001
A00002
A00003
A00004
A00005
You could just use simple builtin string formatting:
>>> 'A%05d'%1
'A00001'
>>> 'A{0:05d}'.format(1)
'A00001'
You can use the builtin format method
print "A" + format(1, "05d") # A00001
print "A" + format(100, "05d") # A00100
Or you can use str.zfill method like this
print "A" + str(1).zfill(5) # A00001
print "A" + str(100).zfill(5) # A00100
def running_ids(start, end):
    t = start[0]
    low = int(start[1:])
    high = int(end[1:]) + 1
    res = []
    for x in range(low, high):
        res.append(t + '{0:05d}'.format(x))
    return res
print(running_ids('A00001', 'A00005'))
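The generator and formatting ideas from the answers above can be combined. A rough sketch, assuming both IDs share a one-character prefix and the same digit width; iter_running_ids is a hypothetical name:

```python
def iter_running_ids(start, end):
    """Lazily yield IDs from start to end inclusive, e.g. 'A00001'..'A00005'."""
    prefix, width = start[0], len(start) - 1
    for n in range(int(start[1:]), int(end[1:]) + 1):
        # The nested field {2} supplies the zero-padding width.
        yield "{0}{1:0{2}d}".format(prefix, n, width)


print(list(iter_running_ids("A00001", "A00005")))
# ['A00001', 'A00002', 'A00003', 'A00004', 'A00005']
```

Taking the width from the input avoids hard-coding the 5, so IDs with a different number of digits work as well.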

Is there a faster way of converting a number to a name?

The following code defines a sequence of names that are mapped to numbers. It is designed to take a number and retrieve a specific name. The class operates by ensuring the name exists in its cache, and then returns the name by indexing into its cache. The question is this: how can the name be calculated from the number without storing a cache?
The name can be thought of as a base 63 number, except for the first digit which is always in base 53.
import itertools
import string
class NumberToName:
    def __generate_name():
        def generate_tail(length):
            if length > 0:
                for char in NumberToName.CHARS:
                    for extension in generate_tail(length - 1):
                        yield char + extension
            else:
                yield ''
        for length in itertools.count():
            for char in NumberToName.FIRST:
                for extension in generate_tail(length):
                    yield char + extension
    FIRST = ''.join(sorted(string.ascii_letters + '_'))
    CHARS = ''.join(sorted(string.digits + FIRST))
    CACHE = []
    NAMES = __generate_name()
    @classmethod
    def convert(cls, number):
        for _ in range(number - len(cls.CACHE) + 1):
            cls.CACHE.append(next(cls.NAMES))
        return cls.CACHE[number]
    def __init__(self, *args, **kwargs):
        raise NotImplementedError()
The following interactive sessions show some of the values that are expected to be returned in order.
>>> NumberToName.convert(0)
'A'
>>> NumberToName.convert(26)
'_'
>>> NumberToName.convert(52)
'z'
>>> NumberToName.convert(53)
'A0'
>>> NumberToName.convert(1692)
'_1'
>>> NumberToName.convert(23893)
'FAQ'
Unfortunately, these numbers need to be mapped to these exact names (to allow a reverse conversion).
Please note: A variable number of bits are received and converted unambiguously into a number. This number should be converted unambiguously to a name in the Python identifier namespace. Eventually, valid Python names will be converted to numbers, and these numbers will be converted to a variable number of bits.
Final solution:
import string
HEAD_CHAR = ''.join(sorted(string.ascii_letters + '_'))
TAIL_CHAR = ''.join(sorted(string.digits + HEAD_CHAR))
HEAD_BASE, TAIL_BASE = len(HEAD_CHAR), len(TAIL_CHAR)
def convert_number_to_name(number):
    if number < HEAD_BASE: return HEAD_CHAR[number]
    q, r = divmod(number - HEAD_BASE, TAIL_BASE)
    return convert_number_to_name(q) + TAIL_CHAR[r]
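Since the question also mentions a reverse conversion, here is a sketch of what the inverse might look like. convert_name_to_number is a hypothetical name and is not part of the original solution; it undoes one divmod step per trailing character. The forward function is repeated so the snippet runs on its own:

```python
import string

HEAD_CHAR = ''.join(sorted(string.ascii_letters + '_'))
TAIL_CHAR = ''.join(sorted(string.digits + HEAD_CHAR))
HEAD_BASE, TAIL_BASE = len(HEAD_CHAR), len(TAIL_CHAR)


def convert_number_to_name(number):
    # Same logic as the final solution, repeated for self-containment.
    if number < HEAD_BASE:
        return HEAD_CHAR[number]
    q, r = divmod(number - HEAD_BASE, TAIL_BASE)
    return convert_number_to_name(q) + TAIL_CHAR[r]


def convert_name_to_number(name):
    # Hypothetical inverse: the last character gives r, the prefix gives q.
    if len(name) == 1:
        return HEAD_CHAR.index(name)
    q = convert_name_to_number(name[:-1])
    return q * TAIL_BASE + TAIL_CHAR.index(name[-1]) + HEAD_BASE


print(convert_name_to_number('A0'))   # 53
print(convert_name_to_number('FAQ'))  # 23893
```

It assumes the name is valid (a head character followed by tail characters); str.index raises ValueError otherwise.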
This is a fun little problem full of off by 1 errors.
Without loops:
import string
first_digits = sorted(string.ascii_letters + '_')
rest_digits = sorted(string.digits + string.ascii_letters + '_')
def convert(number):
    if number < len(first_digits):
        return first_digits[number]
    current_base = len(rest_digits)
    remain = number - len(first_digits)
    return convert(remain // current_base) + rest_digits[remain % current_base]
And the tests:
print convert(0)
print convert(26)
print convert(52)
print convert(53)
print convert(1692)
print convert(23893)
Output:
A
_
z
A0
_1
FAQ
What you've got is a corrupted form of bijective numeration (the usual example being spreadsheet column names, which are bijective base-26).
One way to generate bijective numeration:
def bijective(n, digits=string.ascii_uppercase):
    result = []
    while n > 0:
        n, mod = divmod(n - 1, len(digits))
        result += digits[mod]
    return ''.join(reversed(result))
All you need to do is supply a different set of digits for the case where 53 >= n > 0. You will also need to increment n by 1, as properly the bijective 0 is the empty string, not "A":
def name(n, first=sorted(string.ascii_letters + '_'), digits=sorted(string.ascii_letters + '_' + string.digits)):
    result = []
    while n >= len(first):
        n, mod = divmod(n - len(first), len(digits))
        result += digits[mod]
    result += first[n]
    return ''.join(reversed(result))
Tested for the first 10,000 names:
first_chars = sorted(string.ascii_letters + '_')
later_chars = sorted(list(string.digits) + first_chars)
def f(n):
    # first, determine length by subtracting the number of items of length l
    # also determines the index into the list of names of length l
    ix = n
    l = 1
    while ix >= 53 * (63 ** (l-1)):
        ix -= 53 * (63 ** (l-1))
        l += 1
    # determine first character
    first = first_chars[ix // (63 ** (l-1))]
    # rest of string is just a base 63 number
    s = ''
    rem = ix % (63 ** (l-1))
    for i in range(l-1):
        s = later_chars[rem % 63] + s
        rem //= 63
    return first+s
You can use the code in this answer to the question "Base 62 conversion in Python" (or perhaps one of the other answers).
Using the referenced code, I think the answer to your real question, which was "how can the name be calculated based on the number without storing a cache?", would be to make the name the simple base 62 conversion of the number, with a leading underscore added if the first character of the name is a digit (the underscore is simply ignored when converting the name back into a number).
Here's sample code illustrating what I propose:
from base62 import base62_encode, base62_decode
def NumberToName(num):
    ret = base62_encode(num)
    return ('_' + ret) if ret[0] in '0123456789' else ret
def NameToNumber(name):
    return base62_decode(name if name[0] != '_' else name[1:])
if __name__ == '__main__':
    def test(num):
        name = NumberToName(num)
        num2 = NameToNumber(name)
        print 'NumberToName({0:5d}) -> {1!r:>6s}, NameToNumber({2!r:>6s}) -> {3:5d}' \
            .format(num, name, name, num2)
    test(26)
    test(52)
    test(53)
    test(1692)
    test(23893)
Output:
NumberToName( 26) -> 'q', NameToNumber( 'q') -> 26
NumberToName( 52) -> 'Q', NameToNumber( 'Q') -> 52
NumberToName( 53) -> 'R', NameToNumber( 'R') -> 53
NumberToName( 1692) -> 'ri', NameToNumber( 'ri') -> 1692
NumberToName(23893) -> '_6dn', NameToNumber('_6dn') -> 23893
If the numbers could be negative, you might have to modify the code from the referenced answer (and there is some discussion there on how to do it).

Encoding a 128-bit integer in Python?

Inspired by the "encoding scheme" of the answer to this question, I implemented my own encoding algorithm in Python.
Here is what it looks like:
import random
from math import pow
from string import ascii_letters, digits
# RFC 2396 unreserved URI characters
unreserved = '-_.!~*\'()'
characters = ascii_letters + digits + unreserved
size = len(characters)
seq = range(0,size)
# Seed random generator with same randomly generated number
random.seed(914576904)
random.shuffle(seq)
dictionary = dict(zip(seq, characters))
reverse_dictionary = dict((v,k) for k,v in dictionary.iteritems())
def encode(n):
    d = []
    n = n
    while n > 0:
        qr = divmod(n, size)
        n = qr[0]
        d.append(qr[1])
    chars = ''
    for i in d:
        chars += dictionary[i]
    return chars
def decode(str):
    d = []
    for c in str:
        d.append(reverse_dictionary[c])
    value = 0
    for i in range(0, len(d)):
        value += d[i] * pow(size, i)
    return value
The issue I'm running into is encoding and decoding very large integers. For example, this is how a large number is currently encoded and decoded:
s = encode(88291326719355847026813766449910520462)
# print s -> "3_r(AUqqMvPRkf~JXaWj8"
i = decode(s)
# print i -> "8.82913267194e+37"
# print long(i) -> "88291326719355843047833376688611262464"
The highest 16 places match up perfectly, but after those the number deviates from its original.
I assume this is a problem with the precision of extremely large integers when dividing in Python. Is there any way to circumvent this problem? Or is there another issue that I'm not aware of?
The problem lies within this line:
value += d[i] * pow(size, i)
You're using math.pow here (via from math import pow) instead of the built-in pow function. math.pow returns a floating point number, so you lose accuracy for your large numbers. You should use the built-in pow or the ** operator or, even better, keep the current power of the base in an integer variable:
def decode(s):
    d = [reverse_dictionary[c] for c in s]
    result, power = 0, 1
    for x in d:
        result += x * power
        power *= size
    return result
It gives me the following result now:
print decode(encode(88291326719355847026813766449910520462))
# => 88291326719355847026813766449910520462
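The difference is easy to see in isolation. A small sketch, assuming a base of 71 (the number of characters in the question's alphabet: 52 letters, 10 digits and 9 unreserved symbols) and an arbitrary exponent of 20:

```python
import math

exact = 71 ** 20              # integer arithmetic, arbitrary precision
approx = math.pow(71, 20)     # always returns a float

print(isinstance(approx, float))  # True
print(pow(71, 20) == exact)       # True: the built-in pow stays exact
print(int(approx) == exact)       # False: the float cannot hold every digit
```

A float carries only 53 bits of significand, which is why roughly the first 16 decimal digits survived in the question's output.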
