Python binary manipulation

I am manipulating binary values with Python. Here is my situation:
I have a float value, for example 127.136.
First, I split this value in order to differentiate the integer and fractional parts, then I transform them into binary strings:
velocity = 127.136
split_num = str(velocity).split('.')
int_part = int(split_num[0]) # = 127
decimal_part = int(split_num[1]) # = 136
int_part_b = "{0:b}".format(int_part)
decimal_part_b = "{0:b}".format(decimal_part)
I have to send the final value as a 32-bit word.
The total space is 32 bits: 127 is the integer part and 136 is the fractional part. The integer part takes 16 bits and the decimal part takes 16 bits.
So I want to initialize three 32-bit binary words like this:
final_value = 00000000000000000000000000000000
integer_value = 00000000000000000000000000000000
fractional_value = 00000000000000000000000000000000
But then how can I insert the value of int_part_b starting at the 16th bit of integer_value? For example, if int_part_b = 10011 I would like to have integer_value = 00000000000100110000000000000000, and for the fractional one, if decimal_part_b = 11110 I would like to have fractional_value = 00000000000000000000000000011110.
And at the end, I sum these two binary values to obtain:
final_value = 00000000000100110000000000011110
Does anyone have any idea how to do that?

For this encoding to make any sense, the value in the fractional part should be a fraction of 2^16 (or any denominator <= 65536 you pick as a convention).
You cannot obtain the fractional part of the float value by string manipulation because that makes you lose the base10 denominator. (e.g. 0.36 and 0.036 are two different fractions but your code will produce 36 in both cases).
Note that there are a lot of different ways to encode numbers in binary (BCD, for example), so you should probably be specific about that in your question.
Binary encoding:
Here is one way to do it with a denominator of 2^16:
velocity = 127.136
denom = 2**16
intPart,fracPart = divmod(round(velocity*denom),denom)
encoded = (intPart<<16) | fracPart
print(f"{encoded:032b}")
# 00000000011111110010001011010001
Or, more straightforwardly (but only if denom is 2**16):
encoded = round(velocity*2**16)
print(f"{encoded:032b}")
# 00000000011111110010001011010001
Here, the value in the right part (fractional) is 8913, representing a fraction of 8913/65536 (0.1360015...) which is as close to 0.136 as a binary fraction over 16 bits can get.
Decoding the value (slicing the 32-character binary string):
bits = f"{encoded:032b}"
int(bits[:16], 2) + int(bits[16:], 2)/denom
# 127.13600158691406
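Equivalently, you can decode with a shift and a mask instead of slicing the string:
decoded = (encoded >> 16) + (encoded & 0xFFFF) / denom
# 127.13600158691406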
Decimal encoding:
If you want to represent the fractional part in base 10, you'll have to decide on a base10 denominator that fits within 16 bits (e.g. 10000) and compute your fractional part accordingly.
velocity = 127.136
denom = 10000
intPart,fracPart = divmod(round(velocity*denom),denom)
encoded = (intPart<<16) | fracPart
print(f"{encoded:032b}")
# 00000000011111110000010101010000
Now the fractional part is 1360, representing a fraction of 1360/10000 (0.136), which doesn't lose precision compared to the base-10 original (as long as there are no more than 4 decimal digits).
Decoding the value:
bits = f"{encoded:032b}"
int(bits[:16], 2) + int(bits[16:], 2)/denom
# 127.136

You don't really need 3 different words, just use the binary shift operator and sum the integer and decimal parts like this:
final_value = (int_part << 16) + decimal_part
print("{:032b}".format(final_value))
So here you shift your integer part 16 bits to the left and then you add your decimal part to it, so they will occupy their respective 16-bit halves of the 32-bit word.
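Putting the pieces together, a minimal end-to-end sketch using this convention (with the caveat from the answer above that the raw decimal digits lose their base-10 denominator):
velocity = 127.136
int_part, _, frac_digits = str(velocity).partition('.')
final_value = (int(int_part) << 16) + int(frac_digits)
print("{:032b}".format(final_value))
# 00000000011111110000000010001000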

Related

In Python - how to retrieve the binary code of negative integer from the memory? [duplicate]

Integers in Python are stored in two's complement, correct?
Although:
>>> x = 5
>>> bin(x)
0b101
And:
>>> x = -5
>>> bin(x)
-0b101
That's pretty lame. How do I get Python to give me the numbers in REAL binary bits, and without the 0b in front of it? So:
>>> x = 5
>>> bin(x)
0101
>>> y = -5
>>> bin(y)
1011
It works best if you provide a mask. That way you specify how far to sign extend.
>>> bin(-27 & 0b1111111111111111)
'0b1111111111100101'
Or perhaps more generally:
def bindigits(n, bits):
    s = bin(n & int("1"*bits, 2))[2:]
    return ("{0:0>%s}" % (bits)).format(s)
>>> print(bindigits(-31337, 24))
111111111000010110010111
In basic theory, the actual width of the number is a function of the size of the storage. If it's a 32-bit number, then a negative number has a 1 in the MSB of a set of 32. If it's a 64-bit value, then there are 64 bits to display.
But in Python, integer precision is limited only to the constraints of your hardware. On my computer, this actually works, but it consumes 9GB of RAM just to store the value of x. Anything higher and I get a MemoryError. If I had more RAM, I could store larger numbers.
>>> x = 1 << (1 << 36)
So with that in mind, what binary number represents -1? Python is well-capable of interpreting literally millions (and even billions) of bits of precision, as the previous example shows. In 2's complement, the sign bit extends all the way to the left, but in Python there is no pre-defined number of bits; there are as many as you need.
But then you run into ambiguity: does binary 1 represent 1, or -1? Well, it could be either. Does 111 represent 7 or -1? Again, it could be either. So does 111111111 represent 511, or -1... well, both, depending on your precision.
Python needs a way to represent these numbers in binary so that there's no ambiguity of their meaning. The 0b prefix just says "this number is in binary". Just like 0x means "this number is in hex". So if I say 0b1111, how do I know if the user wants -1 or 15? There are two options:
Option A: The sign bit
You could declare that all numbers are signed, and the left-most bit is the sign bit. That means 0b1 is -1, while 0b01 is 1. That also means that 0b111 is also -1, while 0b0111 is 7. In the end, this is probably more confusing than helpful particularly because most binary arithmetic is going to be unsigned anyway, and people are more likely to run into mistakes by accidentally marking a number as negative because they didn't include an explicit sign bit.
Option B: The sign indication
With this option, binary numbers are represented unsigned, and negative numbers have a "-" prefix, just like they do in decimal. This is (a) more consistent with decimal, (b) more compatible with the way binary values are most likely going to be used. You lose the ability to specify a negative number using its two's complement representation, but remember that two's complement is a storage implementation detail, not a proper indication of the underlying value itself. It shouldn't have to be something that the user has to understand.
In the end, Option B makes the most sense. There's less confusion and the user isn't required to understand the storage details.
To properly interpret a binary sequence as two's complement, there needs to be a length associated with the sequence. When you are working with low-level types that correspond directly to CPU registers, there is an implicit length. Since Python integers can have an arbitrary length, there really isn't an internal two's complement format. Since there isn't a length associated with a number, there is no way to distinguish between positive and negative numbers. To remove the ambiguity, bin() includes a minus sign when formatting a negative number.
Python's arbitrary length integer type actually uses a sign-magnitude internal format. The logical operations (bit shifting, and, or, etc.) are designed to mimic two's complement format. This is typical of multiple precision libraries.
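You can see that mimicry with a few quick checks; masking makes the infinite two's-complement view concrete:
>>> -1 & 0xFF         # the low 8 bits of ...11111111
255
>>> bin(-2 >> 1)      # shifts are arithmetic: the sign is preserved
'-0b1'
>>> bin(~5)           # ~x == -x - 1, exactly as in two's complement
'-0b110'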
Here is a slightly more readable version of tylerl's answer. For example, let's say you want -2 in its 8-bit negative "two's complement" representation:
bin(-2 & (2**8-1))
2**8 stands for the ninth bit (256); subtract 1 from it and you have all the preceding bits set to one (255).
For 8- and 16-bit masks, you can replace (2**8-1) with 0xff or 0xffff. The hexadecimal version becomes less readable after that point.
If this is unclear, here it is as a regular function:
def twosComplement(value, bitLength):
    return bin(value & (2**bitLength - 1))
The two's complement of a negative number is the modulus value minus the positive value.
So I think the brief way to get the two's complement of -27 is:
bin((1 << 32) - 27)  # 32-bit length: '0b11111111111111111111111111100101'
bin((1 << 16) - 27)  # 16-bit length
bin((1 << 8) - 27)   # 8-bit length: '0b11100101'
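As a quick sanity check, this mod-value form agrees with the masking approach from the answers above:
>>> (1 << 8) - 27 == (-27 & 0xFF)
True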
Not sure how to get what you want using the standard lib. There are a handful of scripts and packages out there that will do the conversion for you.
I just wanted to note the "why", and why it's not lame.
bin() doesn't return binary bits; it converts the number to a binary string. The leading '0b' tells the interpreter that you're dealing with a binary number, as per the Python language definition. This way you can directly work with binary numbers, like this:
>>> 0b01
1
>>> 0b10
2
>>> 0b11
3
>>> 0b01 + 0b10
3
That's not lame. That's great.
http://docs.python.org/library/functions.html#bin
bin(x)
Convert an integer number to a binary string.
http://docs.python.org/reference/lexical_analysis.html#integers
Integer and long integer literals are described by the following lexical definitions:
bininteger ::= "0" ("b" | "B") bindigit+
bindigit ::= "0" | "1"
Use slices to get rid of unwanted '0b'.
bin(5)[2:]
'101'
or if you want digits,
tuple(bin(5)[2:])
('1', '0', '1')
or even
list(map(int, bin(5)[2:]))
[1, 0, 1]
tobin = lambda x, count=8: "".join(map(lambda y:str((x>>y)&1), range(count-1, -1, -1)))
e.g.
tobin(5) # => '00000101'
tobin(5, 4) # => '0101'
tobin(-5, 4) # => '1011'
Or as clear functions:
# Returns bit y of x (base 10), i.e.
#   bit 2 of 5 is 1
#   bit 1 of 5 is 0
#   bit 0 of 5 is 1
def getBit(y, x):
    return str((x >> y) & 1)

# Returns the lowest `count` bits of base-10 integer `x`, most significant bit first
def tobin(x, count=8):
    shift = range(count - 1, -1, -1)
    bits = map(lambda y: getBit(y, x), shift)
    return "".join(bits)
(Adapted from W.J. Van de Laan's comment)
I'm not entirely certain what you ultimately want to do, but you might want to look at the bitarray package.
def tobin(data, width):
    data_str = bin(data & (2**width - 1))[2:].zfill(width)
    return data_str
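For example:
>>> tobin(-5, 8)
'11111011'
>>> tobin(5, 8)
'00000101'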
You can use the Binary fractions package. This package implements TwosComplement with binary integers and binary fractions. You can convert binary-fraction strings into their two's complement and vice versa.
Example:
>>> from binary_fractions import TwosComplement
>>> TwosComplement.to_float("11111111111") # TwosComplement --> float
-1.0
>>> TwosComplement.to_float("11111111100") # TwosComplement --> float
-4.0
>>> TwosComplement(-1.5) # float --> TwosComplement
'10.1'
>>> TwosComplement(1.5) # float --> TwosComplement
'01.1'
>>> TwosComplement(5) # int --> TwosComplement
'0101'
To use this with Binary objects instead of floats, you can use the Binary class inside the same package.
PS: Shameless plug, I'm the author of this package.
For positive numbers, just use:
bin(x)[2:].zfill(4)
For negative numbers, it's a little different:
bin((eval("0b"+str(int(bin(x)[3:].zfill(4).replace("0","2").replace("1","0").replace("2","1"))))+eval("0b1")))[2:].zfill(4)
As a whole script, this is how it should look:
def binary(number):
    if number < 0:
        return bin((eval("0b"+str(int(bin(number)[3:].zfill(4).replace("0","2").replace("1","0").replace("2","1"))))+eval("0b1")))[2:].zfill(4)
    return bin(number)[2:].zfill(4)
x = int(input())
print(binary(x))
A modification of tylerl's very helpful answer that provides sign extension for positive numbers as well as negative (no error checking):
def to2sCompStr(num, bitWidth):
    num &= (2 << bitWidth - 1) - 1  # mask off anything beyond bitWidth bits
    formatStr = '{:0' + str(bitWidth) + 'b}'
    ret = formatStr.format(int(num))
    return ret
Example:
In [11]: to2sCompStr(-24, 18)
Out[11]: '111111111111101000'
In [12]: to2sCompStr(24, 18)
Out[12]: '000000000000011000'
No need, it already is; it is just Python choosing to represent it differently. If you start printing each nibble separately, it will show its true colours.
checkNIB = '{0:04b}'.format
checkBYT = lambda x: '-'.join( map( checkNIB, [ (x>>4)&0xf, x&0xf] ) )
checkBTS = lambda x: '-'.join( [ checkBYT( ( x>>(shift*8) )&0xff ) for shift in reversed( range(4) ) if ( x>>(shift*8) )&0xff ] )
print( checkBTS(-0x0002) )
Output is simple:
1111-1111-1111-1111-1111-1111-1111-1110
It reverts to the original representation when you want to display the two's complement of a single nibble, but it is still possible if you split it into halves of a nibble and so on. Just keep in mind that this works best for negative hex and binary integer interpretations; for simple positive numbers it is less interesting, and with hex you can also set the byte size.
We can leverage the property of bitwise XOR: use bitwise XOR to flip the bits and then add 1. Then you can use Python's built-in bin() function to get the binary representation of the 2's complement. Here's an example function:
def twos_complement(input_number):
    print(bin(input_number))  # print binary value of input
    mask = 2**(1 + len(bin(input_number)[2:])) - 1  # mask for the bitwise XOR operation
    twos_comp = (input_number ^ mask) + 1  # 2's complement, i.e. the negative of input_number
    print(bin(twos_comp))  # print 2's complement representation of -input_number
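For example:
>>> twos_complement(5)
0b101
0b1011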
I hope this solves your problem.
num = int(input("Enter number : "))
bin_num = bin(num)
binary = '0' + bin_num[2:]
print(binary)

How does python handle very small float numbers?

This is more a curiosity than a technical problem.
I'm trying to better understand how floating point numbers are handled in Python. In particular, I'm curious about the number returned by sys.float_info.epsilon = 2.220446049250313e-16.
I can see, looking up on the documentation on Double-precision floating-point, that this number can also be written as 1/pow(2, 52). So far, so good.
I decided to write a small Python script (see below; disclaimer: this code is ugly and can burn your eyes) which starts from eps = 0.1 and makes the comparison 1.0 == 1.0 + eps. If False, it means eps is big enough to make a difference. Then I try to find a smaller number by subtracting 1 from the last digit, appending the digit 1 to the right of it, and looking for False again by incrementing that last digit.
I am pretty confident that the code is OK because at a certain point (32 decimal places) I get eps = 0.00000000000000011102230246251567 = 1.1102230246251567e-16, which is very close to 1/pow(2, 53) = 1.1102230246251565e-16 (the last digit differs by 2).
I thought the code would not produce sensible numbers after that. However, the script kept working, always zeroing in on a more accurate decimal number until it reached 107 decimal places. Beyond that, the code did not find a False in the test. I got very intrigued by that result and could not wrap my head around it.
Does this 107-decimal-place float number have any meaning? If so, what is particular about it?
If not, what is Python doing past the 32-decimal-place eps? Surely there is some algorithm Python is cranking through to get to the 107-digit float.
The script.
total = 520  # hard-coded after trial-and-error max number of iterations
dig = [1]
n = 1
for t in range(total):
    eps = '0.' + ''.join(str(x) for x in dig)
    if 1.0 == 1.0 + float(eps):
        if dig[-1] == 9:
            print(eps, n)
            n += 1
            dig.append(1)
        else:
            dig[-1] += 1
    else:
        print(eps, n)
        n += 1
        dig[-1] -= 1
        dig.append(1)
The output (part of it); the values are the eps and the number of decimal places:
0.1 1
0.01 2
(...)
0.000000000000001 15
0.0000000000000002 16
0.00000000000000012 17
0.000000000000000112 18
0.0000000000000001111 19
0.00000000000000011103 20
(...)
0.0000000000000001110223024625157 31
0.00000000000000011102230246251567 32
0.000000000000000111022302462515667 33
(...)
0.000000000000000111022302462515666368314810887391490808258832543534838643850548578484449535608291625976563 105
0.0000000000000001110223024625156663683148108873914908082588325435348386438505485784844495356082916259765626 106
0.00000000000000011102230246251566636831481088739149080825883254353483864385054857848444953560829162597656251 107
I ran this code in Python 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:20:19) [MSC v.1925 32 bit (Intel)] on win32.
Your test involves a double rounding and is finding the number 2^-53 + 2^-105.
Many Python implementations use the IEEE-754 binary64 format. (This is not required by the Python documentation.) In this format, the significand (fraction portion) of a floating-point number has 53 bits. (52 are encoded in a primary significand field; 1 is encoded via the exponent field.) For numbers in the interval [1, 2), the significand is scaled (by the exponent portion of the floating-point representation) so that its leading bit corresponds to a value of 1 (2^0). This means its trailing bit corresponds to a value of 2^-52.
Thus, the difference between 1 and the next number representable in this format is 2^-52; that is the smallest change that can be made in the number, by increasing the low bit.
Now, suppose x contains 1. If we add 2^-52 to it, we will of course get 1 + 2^-52, since that result is representable. What happens if we add something slightly smaller, say ¾·2^-52? In this case, the real-number result, 1 + ¾·2^-52, is not representable. It must be rounded to a representable number. The common default rounding method is to round to the nearest representable number. In this case, that is 1 + 2^-52.
Thus, adding to 1 some numbers smaller than 2^-52 still produces 1 + 2^-52. What is the smallest number we can add to 1 and get this result?
In case of ties, where the real-number result is exactly halfway between two representable numbers, the common default rounding method uses the one with the even low bit. So, with a choice between 1 (trailing bit 0) and 1 + 2^-52 (trailing bit 1), it chooses 1. That means if we add ½·2^-52 to 1, it will produce 1.
If we add any number greater than ½·2^-52 to 1, there will be no tie; the real-number result will be nearer to 1 + 2^-52, and that will be the result.
The next question is: what is the smallest number greater than ½·2^-52 (that is, 2^-53) that we can add to 1? If the number has to be in the IEEE-754 binary64 format, it is limited by its significand. With the leading bit scaled to represent 2^-53, the trailing bit represents 2^(-53-52) = 2^-105.
Therefore, 2^-53 + 2^-105 is the smallest binary64 value we can add to 1 to get 1 + 2^-52.
As your program tests values, it works with a decimal numeral. That decimal numeral is converted to the floating-point format and then added to 1. So it is finding the smallest number in the floating-point format that produces a sum greater than 1, and that is the number described above, 2^-53 + 2^-105. Its value in decimal is 1.110223024625156663683148108873914908082588325435348386438505485784844495356082916259765625 × 10^-16.
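You can verify the boundary directly (a quick check, assuming IEEE-754 binary64 doubles):
>>> 1.0 + 2**-53 == 1.0               # a tie: round-to-even goes back down to 1.0
True
>>> 1.0 + (2**-53 + 2**-105) == 1.0   # just past the tie: rounds up to 1 + 2**-52
False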

Performing right shift and bit masking on binary fraction in python

I am looking for a way in Python to perform a right shift and bit masking on a binary number which has a fractional part as well. For example, if there are 1 integer and 2 fraction bits in the number, then the number 0b101 corresponds to 1.25 in decimal. First, I want to know the Pythonic way to represent such a number in Python.
Second, I want to perform 1 right shift on this number (0b101 >> 1) so that the resultant number will be 0b010, which is 0.5 in decimal. Is there an intrinsic and Pythonic way to perform this operation? Similarly, how to mask and get a specific bit from the binary number?
Presently, for the shift I am multiplying the number by 2**-x, where x is the number of right shifts. I cannot think of a similar operation I can perform for the bit mask.
If you really must get directly at the internal representation of a float you can use struct, like this:
>>> import struct
>>> a = 1.25
>>> b = struct.pack('>d',a)
>>> b
b'?\xf4\x00\x00\x00\x00\x00\x00' # the ? means \x3f, leftmost 7 bits of exponent
>>> a.hex()
'0x1.4000000000000p+0'
You can mask the bit you want out of the bytestring that struct.pack() returns.
[edit] The question mark representing \x3f is because the default output representation of a bytestring is a string and Python will where possible show an ascii character, not two hex digits.
[edit] This representation is in principle platform-dependent, but in practice it isn't, because virtually every computer (even IBM mainframes nowadays) has a floating-point processor that uses this format.
Finding out which bit you want may be something of a challenge.
>>> c = struct.pack('>d',a/2)
>>> c
b'?\xe4\x00\x00\x00\x00\x00\x00'
>>> (a/2).hex()
'0x1.4000000000000p-1'
As you can see, division by 2 is not quite the simple one-bit shift to the right that your question seems to suggest you are expecting. In this case, the division by 2 has decremented the exponent by 1 (from 0x3ff to 0x3fe; 1023 to 1022) and left the bit pattern of the fraction (0x4000) unchanged. The exponent appears large because it is biased by 1023.
The main difficulties are
Sign, exponent and fraction don't align to byte boundaries, but to nybble boundaries (sign plus exponent: 12 bits; fraction: 52 bits)
The number is normalized so that it has no leading zeroes (much as scientific notation in decimal is normalized so that it has no leading zeroes) and, since everyone knows it's there, the leading 1 is not stored.
I can recommend the Wikipedia article on this subject: it has lots of useful examples.
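If you would rather not count hex digits by hand, here is a small sketch (my own helper, not part of struct) that unpacks the sign, biased exponent, and fraction fields with integer masks, mirroring the layout described above:
import struct

def fields(x):
    bits = int.from_bytes(struct.pack('>d', x), 'big')
    sign = bits >> 63                   # 1 bit
    exponent = (bits >> 52) & 0x7FF     # 11 bits, biased by 1023
    fraction = bits & ((1 << 52) - 1)   # 52 bits, leading 1 implicit
    return sign, exponent, fraction

print(fields(1.25))   # (0, 1023, 1125899906842624) -> fraction 0x4000000000000
print(fields(0.625))  # (0, 1022, 1125899906842624) -> same fraction, exponent - 1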
But I suspect that you don't really want to get at the internal representation of a float. Instead, you want a fixed-point binary class, without pesky binary exponents, that works much the same as you would do it on paper, and where division by a power of 2 really does reflect as a shift of so many bits to the right.
Depending on how much work you want to put into it, you could do this by defining a FixedBinary class as a subclass of numbers.Real, with the integer portion internally represented by one int and the fractional component by another int, and the sign by a third int, so that 1.25 would be represented as (1, int(0.25 * 65536), +1) (or some other power of 2).
This also shows you the simplest way to get a bit representation of your fraction.
[edit] I recommend storing the sign separately. You could store it in the integer portion, or the fraction, or both, but all have disadvantages.
If you store it in the sign of the fraction, the two's-complement representation of negative integers will give you difficulty when you want to mask your bits.
If you don't store it in the sign of the fraction, there will be no way to represent -0.5.
If you don't store it in the sign of the integer portion, there will be no way to represent -1.0.
A multiplicand of 65536 will give you 4 decimal digits of accuracy. You can increase it if you want more. I also recommend that you store your fraction in the rightmost bits and simply ignore the leftmost bits. In other words, be content with the binary point being in the middle of the int, don't insist on it being on the left. That is because you will need headroom to the left of the binary point when you do multiplication.
Implementing your own numeric class is a considerable amount of work, though.
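For illustration, here is a minimal sketch of that idea (a hypothetical toy class, fixed at 16 fractional bits, nowhere near a complete numbers.Real subclass):
SCALE_BITS = 16
SCALE = 1 << SCALE_BITS  # the power-of-2 denominator suggested above

class FixedBinary:
    # value = sign * (int_part + frac_part / SCALE)
    def __init__(self, int_part, frac_part, sign=1):
        self.int_part = int_part    # non-negative integer portion
        self.frac_part = frac_part  # fraction in units of 1/SCALE
        self.sign = sign            # +1 or -1, stored separately

    @classmethod
    def from_float(cls, x):
        sign = -1 if x < 0 else 1
        i, f = divmod(round(abs(x) * SCALE), SCALE)
        return cls(i, f, sign)

    def __rshift__(self, n):
        # here a right shift really is a division by 2**n
        raw = ((self.int_part << SCALE_BITS) | self.frac_part) >> n
        return FixedBinary(raw >> SCALE_BITS, raw & (SCALE - 1), self.sign)

    def __float__(self):
        return self.sign * (self.int_part + self.frac_part / SCALE)

print(float(FixedBinary.from_float(1.25) >> 1))  # 0.625, no exponent fiddling needed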
You can work with fxpmath.
Info about this package is at:
https://github.com/francof2a/fxpmath
For your example:
from fxpmath import Fxp
x = Fxp('0b0101', signed=True, n_word=4, n_frac=2)
print(x)
y = x >> 1
print(y)
# example of AND mask
z = x & Fxp('0b0110', signed=True, n_word=4, n_frac=2)
print(z.bin())
outputs:
1.25
0.5
0100

using struct pack and unpack on floating point number 0.01 outputs 0.009999999776482582

using 'struct pack and unpack' on floating point number 0.01 outputs 0.009999999776482582.
For my project I have to configure values which are floats, and they need to be stored in a binary file; I will need this data later for analysis, and I need the exact values that were configured.
Can someone help me if there is any way that I could store the exact value 0.01 and retrieve the same?
Based on some of the posts, I have already tried using double instead of float and it works, but the problem with using double is that my file size increases. I have a lot of parameters which are floats and can take values anywhere between 0 and 5.
Hence any other suggestion would be appreciated.
Here is the sample code
import struct
a = 0.01
b = struct.pack('<f', a)
c = struct.unpack('<f', b)
print(c)
(0.009999999776482582,)
There is no floating point number 0.01. IEEE floating point numbers do not represent the entire real number line; they represent an approximation using a particular binary scheme. The closest approximation has the decimal representation 0.009999999776482582.
You could try serializing a textual representation (e.g. JSON), or pickle a decimal.Decimal, which can represent arbitrary precision.
You say that numbers go from 0 to 5. If you only need two decimal places (0.00, 0.01, ..., 5.00) then you can store the numbers as 16 bit integers, first by multiplying the number by 100, and then when reading dividing the number by 100:
>>> import struct
>>> digits = 2
>>> a = 0.01
>>> b = struct.pack('<H', int(round(a * (10 ** digits))))
>>> c = struct.unpack('<H', b)
>>> print (c[0] / (10.0 ** digits))
0.01
Storing numbers with the H format you can actually save 50% space compared to floats. If you need more precision, more than two decimal places, you need to multiply and then divide by 1000, 10000, etc., changing the format from H to I.

How to describe the maths behind this function

This is a decimal-to-binary converter. I need help explaining the math behind it, as I don't have a clue how to explain all the shifts etc.
number = int(input("Enter the Number:"))
binary = ''
while number > 0:
    binary = str(number % 2) + binary
    number >>= 1
print(binary)
The loop builds up a string which represents a binary value.
str(number % 2) finds the lowest bit of the number (either 0 or 1).
binary = str(number % 2) + binary adds the bit to the left end of string binary
number >>=1 removes the low bit now that we're done with it
while number > 0 continues until the number is 0
Suppose you are converting 56 to binary.
When right-shifted by 1 bit, 56 (111000) becomes 28 (011100).
Note: the right shift operator causes the bit pattern in the first operand to be shifted right by the number of bits specified by the second operand. Bits vacated by the shift operation are zero-filled for unsigned quantities.
Like this, the variable number is right-shifted by 1 while it is greater than 0, and each time the remainder of number divided by 2 (which is always 0 or 1) is prepended to the result variable, binary.
Finally, the variable binary will hold the binary equivalent.
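Tracing the loop for 56 makes this concrete:
# number   number % 2   binary so far
#   56         0        '0'
#   28         0        '00'
#   14         0        '000'
#    7         1        '1000'
#    3         1        '11000'
#    1         1        '111000'
# result: '111000', the binary equivalent of 56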
