I was trying to understand bitwise NOT in python.
I tried following:
print('{:b}'.format(~ 0b0101))
print(~ 0b0101)
The output is
-110
-6
I tried to understand the output as follows:
Bitwise negating 0101 gives 1010. With 1 in most significant bit, python interprets it as a negative number in 2's complement form and to get back corresponding decimal it further takes 2's complement of 1010 as follows:
1010
0101 (negating)
0110 (adding 1 to get final value)
So it prints it as -110 which is equivalent to -6.
Am I right with this interpretation?
You're half right.
The value is indeed given by ~x == -(x+1) (add one and negate), but the explanation of why is a little misleading.
Two's complement requires setting the MSB of the integer, which is a little difficult if the number can be an arbitrary number of bits long (as is the case with Python). Internally, Python keeps a separate field (with optimizations for small numbers, however) that tracks how many digits the number uses and what its sign is. When you print a negative int using the binary format, f'{-6:b}', it just slaps a negative sign in front of the binary representation of the positive value (sign-and-magnitude, in effect). Otherwise, how would Python determine how many leading ones there should be? Should positive values always have leading zeros to indicate they're positive? The bitwise operations do behave as if the values were in two's complement, though.
If we consider signed 8-bit numbers (and display all the digits) in 2's complement, your example becomes:
~ 0000 0101: 5
= 1111 1010: -6
So in short, Python is performing correct bitwise negation; it is the display of binary-formatted negative numbers that is misleading.
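A quick interactive check of both points made above (the ~x == -(x+1) identity and the minus-sign formatting):
>>> x = 0b0101
>>> ~x == -(x + 1)
True
>>> format(~x, 'b')          # minus sign in front of the magnitude's bits
'-110'
>>> format(~x & 0xFF, 'b')   # mask to 8 bits to see the two's complement pattern
'11111010'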
Python integers are arbitrarily long, so if you invert 0b0101, it would be 1111...11111010. How many ones do you write? Well, a 4-bit two's complement -6 is 1010, and a 32-bit two's complement -6 is 11111111111111111111111111111010. So an arbitrarily long -6 could ideally just be written as -6.
Check what happens when ~5 is masked to look at the bits it represents:
>>> ~5
-6
>>> format(~5 & 0xF,'b')
'1010'
>>> format(~5 & 0xFFFF,'b')
'1111111111111010'
>>> format(~5 & 0xFFFFFFFF,'b')
'11111111111111111111111111111010'
>>> format(~5 & 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,'b')
'11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111010'
So a negative decimal representation makes sense, and you must mask if you want to limit the representation to a specific number of bits.
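If you mask like this a lot, a tiny helper keeps it readable (just a sketch; the name as_bits and the default width are arbitrary):
def as_bits(n, bits=8):
    # mask n down to `bits` bits and show the resulting two's complement pattern
    return format(n & ((1 << bits) - 1), '0{}b'.format(bits))

>>> as_bits(~5, 4)
'1010'
>>> as_bits(~5, 16)
'1111111111111010'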
It is a dumb question, but I need to understand it more deeply: why does 14 & -14 give 2, while 16 & -16 gives 16?
Python's bitwise operations treat integers as if they were stored in two's complement. That means positive numbers are represented simply by their bit sequence (so 14 is 00001110, since it's equal to 8 + 4 + 2). Negative numbers, on the other hand, are represented by taking the positive quantity, inverting all of its bits, and adding one. So -14 is 11110010: we took the bitwise representation of 14 (00001110), inverted it (11110001), and added one (11110010).
But there's an added wrinkle. Python integer values are bignums; they can be arbitrarily large. So our usual notion of "this number is stored in N bits" breaks down. Instead, we may end up with two numbers of differing lengths. If we end up in that situation, we may have to sign extend the shorter one. This is just a fancy way of saying "take the most significant bit and repeat it until the number is long enough for our liking".
So in the case of 14 and -14, written in 8 bits, we have
00001110   (14)
11110010   (-14)
We & them together. Only the second bit (counting from the right, or least significant bit) is set in both, so we get 00000010, or 2. On the other hand, with 16 and -16, we get
010000   (16)
110000   (-16)
For -16, we took positive sixteen (010000), flipped all of the bits (101111), and then added one, which carried all the way over to the second most significant bit (110000). When we & these together, only the bit worth 16 is set in both, so the result is 010000, i.e. 16.
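A quick interpreter check of both cases, with an 8-bit mask to make the stored patterns visible:
>>> 14 & -14
2
>>> 16 & -16
16
>>> format(-14 & 0xFF, 'b')
'11110010'
>>> format(-16 & 0xFF, 'b')
'11110000'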
See also BitwiseOperators - Python Wiki
A bitwise operation works on the binary representations of its operands.
In some representations of integers, one bit is used to represent the sign of the number. Which bit that is, and how negative values are encoded, changes the result of a bitwise &. In 2's complement, negative numbers are represented by inverting all the bits and then adding 1. There are many ways of representing signed numbers in binary, so in general a bitwise operation mixing a positive and a negative number gives a representation-dependent result. That is probably why most calculators will only allow positive integers in bitwise operations (that is, calculators advanced enough to have such a feature).
EDIT: The specific numbers you chose are significant. The number 14 is represented in binary using fewer bits than 16. It also just so happens that -16 and 16 share exactly the same low five bits (and the higher bits are not significant when you come to & them together).
Now a bitwise & only sets a bit if that bit is set in both the numbers you are and-ing together.
 01110 = 14
 10010 = -14
&00010 = 2!

 010000 = 16
 110000 = -16
&010000 = 16!
There's your answer.
Here are two results I get when I XOR two integers. The same bits, but a different sign for the second operand of the XOR.
>>> bin(0b0001 ^ -0b0010)
'-0b1'
>>> bin(0b0001 ^ 0b0010)
'0b11'
I don't really understand the logic. Isn't XOR just supposed to XOR every bit, one by one? Even for signed values? I would expect to get the same results (with a different sign).
If python's integers were fixed-width (eg: 32-bit, or 64-bit), a negative number would be represented in 2's complement form. That is, if you want -a, then take the bits of a, invert them all, and then add 1. Then a ^ b is just the number that's represented by the bitwise xor of the bits of a and b in two's complement. The result is re-interpreted in two's complement (ie: negative if the top bit is set).
Python's int type isn't fixed-width, but the result of a ^ b follows the same pattern: imagine that the values are represented in a wide-enough fixed-width int type, and then take the xor of the two values.
Although this now seems a bit arbitrary, it makes sense historically: Python adopted many operations from C, so xor was defined to work like in C. Python used to have a fixed-width integer type like C's, and having a ^ b give the same result for the fixed-width and arbitrary-width integer types essentially forces the current definition.
Back to a worked example: 1 ^ -2. 8 bits is more than enough to represent these two values. In 2's complement:
1 = 00000001
-2 = 11111110
Then the bitwise xor is:
= 11111111
This is the 8-bit 2's complement representation of -1. Although we've used 8 bits here, the result is the same no matter the width chosen as long as it's enough to represent the two values.
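The same arithmetic checked in the interpreter, with an 8-bit mask to make the all-ones pattern visible:
>>> 1 ^ -2
-1
>>> format((1 ^ -2) & 0xFF, 'b')
'11111111'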
I need to compute the hamming distance between two integers by counting the number of differing bits between their binary representations.
This is the function that I am using for that purpose:
def hamming(a, b):
    # compute and return the Hamming distance between the integers
    return bin(int(a) ^ int(b)).count("1")
I started to conduct some simple tests on this function to make sure it works properly, but almost immediately I saw that it does not, and I am trying to understand why.
I tested the function with these two numbers:
a = -1704441252336819740
b = -1704441252336819741
The binary representations of these numbers given by python are:
bin(a): -0b10111 10100111 01100100 01001001 11011010 00001110 11011110 00011100
bin(b): -0b10111 10100111 01100100 01001001 11011010 00001110 11011110 00011101
As you can see, their binary representations are the same apart from the last digit, so the Hamming distance should be 1.
However, the returned hamming distance from the function is 3 and I can't seem to understand why.
The issue arises when I compute the XOR between these two numbers, as a ^ b returns 7 (thus counting 3 '1' bits) when I would expect it to return 1 (and count 1 '1' bit).
I believe this has to do with the fact that the XOR value seems to be getting stored as an unsigned integer with the minimal number of possible bits, whereas I need it to be treated as a signed, fixed-width value.
How am I misunderstanding the XOR operator and how can I change my function to work the way I want it to?
Actually, it is the bin function that is misleading:
Instead of displaying the actual binary value stored, it displays the binary of |x| (the absolute value) and prints a minus sign in front of it for negative numbers.
But, that is not how the values are actually stored.
XOR operates on the actual values, which behave as if stored in two's complement, and that is why you are getting a bigger bit difference than you expected.
As a simple example, let's take two small negative numbers written in 5-bit two's complement:
-10 = 0b10110
-11 = 0b10101
  ^ = 0b00011
As you can see, in this representation there are two bits of difference between these two numbers, while their positive counterparts (01010 and 01011) differ in only one bit.
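Checking the question's actual numbers shows the same effect: the XOR of the values themselves has three set bits, while the XOR of the displayed magnitudes has only one:
>>> a = -1704441252336819740
>>> b = -1704441252336819741
>>> bin(a ^ b)
'0b111'
>>> bin(abs(a) ^ abs(b))
'0b1'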
I am trying to solve a challenge on this site. I have everything correct except I can't properly convert a bitstring to its 32-bit signed integer representation.
For example I have this bitstring:
block = '10101010001000101110101000101110'
My own way of converting this bitstring to a 32-bit signed integer: I partially remember from school that the first bit is the sign bit. If it is 1 we have a negative number, and vice versa.
But when I simply convert the whole string, it just gives me the unsigned value in base 10:
int(block, 2) #yields 2854414894
I have also tried excluding the first bit, converting the remaining 31-character bitstring, and then checking the first bit to decide whether the number should be negative or not:
int(block[1:32], 2) #yields 706931246
But the correct answer is -1440552402. What operation should I do to this bitstring to get this integer? Is it relevant if the byte order of the system is little endian or big endian? My system is little endian.
In Python there's no fixed size for integers, so you'll never get a negative value just because a high-order bit is 1.
To "emulate" 32-bit behaviour, just do this, since your value 2854414894 is greater than 2**31 - 1, a.k.a. 0x7FFFFFFF:
print(int(block[1:32], 2)-2**31)
you'll get
-1440552402
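The same idea wrapped in a small function for any width (just a sketch; the helper name to_signed is illustrative, and it assumes the string is exactly width bits long):
def to_signed(block, width=32):
    # a leading '1' means the two's complement value is negative:
    # the remaining bits minus the weight of the sign bit give the value
    if block[0] == '1':
        return int(block[1:], 2) - (1 << (width - 1))
    return int(block, 2)

>>> to_signed('10101010001000101110101000101110')
-1440552402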
You're right that the upper bit determines the sign, but it's not a simple sign flag. Instead, the whole bit pattern of a negative number is different. This is a positive 1 (in 8 bits):
00000001
This is a negative 1:
11111111
The upshot is that addition and subtraction "wrap around". So 4 - 1 would be:
0100 - 0001 = 0011
And so 0 - 1 is the same as 1_0000_0000 - 1. The "borrow" just goes off the top of the integer.
The general way to "negate" a number is "invert the bits, add 1". This works both ways, so you can go from positive to negative and back.
In your case, use the leading '1' to detect whether negation is needed, then convert to int, then maybe perform the negation steps. Note, however, that Python's int is not a fixed-width value: it's an arbitrary-precision integer with a dynamically allocated representation, where the sign is kept separately rather than encoded as a simple 2's complement bit pattern.
block = '10101010001000101110101000101110'
asnum = int(block, 2)
if block[0] == '1':
    asnum ^= 0xFFFFFFFF
    asnum += 1
    asnum = -asnum
print(asnum)
You should check for when the input value is out of the positive range for 32 bit signed integers:
res = int(block, 2)
if res >= 2**31:
    res -= 2**32
So you first interpret the number as unsigned; when you notice the sign bit was set (value >= 2^31), you subtract 2^32 to get the corresponding negative number.
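Applied to the bitstring from the question:
>>> res = int('10101010001000101110101000101110', 2)
>>> res - 2**32 if res >= 2**31 else res
-1440552402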
I have 32 bit numbers A=0x0000000A and B=0X00000005.
I get A xor B by A^B and it gives 0b1111.
I rotated this and got D=0b111100000, but I want the result to be a 32-bit number, not just for printing; I also need the most significant bits (even though they are 0 in this case) for further manipulation.
Most high-level languages don't have ROR/ROL operators. There are two ways to deal with this: one is to add an external library like ctypes or https://github.com/scott-griffiths/bitstring that gives you fixed-width integers with rotate or bit-slice support (which is pretty easy to add).
One thing to keep in mind is that Python integers have 'infinite' precision: those MSBs are always 0 for positive numbers and 1 for negative numbers, and Python stores only as many digits as it needs to hold the value. This is one reason you see notation in Python like ~0x3 being shown as -0x4, which is equivalent in two's complement notation, rather than some equivalent positive value; -0x4 is always correct, and even if you AND it against a 5000-bit number, it will just mask off the bottom two bits.
Or, you can just do it yourself, the way we all used to, and the way the hardware actually does it:
def rotate_left(number, rotatebits, numbits=32):
    # keep the low `numbits` bits of the shifted value and bring the
    # bits that fell off the top back in at the bottom
    rotatebits %= numbits
    mask = (1 << numbits) - 1
    return ((number << rotatebits) & mask) | ((number & mask) >> (numbits - rotatebits))
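For the question's values (0xA ^ 0x5 is 0b1111, rotated left by 5), formatting to all 32 digits just for display:
>>> d = rotate_left(0xA ^ 0x5, 5)
>>> format(d, '032b')
'00000000000000000000000111100000'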
To get the binary of an integer you could use bin().
Just a short example:
>>> i = 333333
>>> print (i)
333333
>>> print (bin(i))
0b1010001011000010101
>>>
bin(i)[2:].zfill(32)
I guess does what you want.
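For non-negative values, the same zero-padding can also be done in one step with a format spec:
>>> format(333333, '032b')
'00000000000001010001011000010101'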
I think your bigger problem here is that you are misunderstanding the difference between a number and its representation:
12 ^ 18  # XOR the values
56 & 11  # AND the values
If you need actual 32-bit signed integers, you can use numpy:
import numpy as np
a = np.array(range(100), dtype=np.int32)
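A minimal sketch (assuming numpy is installed) of what genuinely fixed-width values look like, using ~5 as in the earlier examples:
import numpy as np

a = np.array([5], dtype=np.int32)      # a true 32-bit signed value
b = ~a                                 # array([-6], dtype=int32)
print(b.view(np.uint32))               # [4294967290] -- the raw 32-bit pattern of -6
print(format(int(b.view(np.uint32)[0]), '032b'))  # 11111111111111111111111111111010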