Below a and b (hex), representing two's complement signed binary numbers.
For example:
a = 0x17c7cc6e
b = 0xc158a854
Now I want to know the signed representation of a & b in base 10. Sorry I'm a low level programmer and new to python; feel very stupid for asking this. I don't care about additional library's but the answer should be simple and straight forward. Background: a & b are extracted data from a UDP packet. I have no control over the format. So please don't give me an answer that would assume I can change the format of those varibles before hand.
I have converted a & b into the following with this:
aBinary = bin(int(a, 16))[2:].zfill(32) => 00010111110001111100110001101110 => 398969966
bBinary = bin(int(b, 16))[2:].zfill(32) => 11000001010110001010100001010100 => -1051154348
I was trying to do something like this (doesn't work):
if aBinary[1:2] == 1:
aBinary = ~aBinary + int(1, 2)
What is the proper way to do this in python?
why not using ctypes ?
>>> import ctypes
>>> a = 0x17c7cc6e
>>> ctypes.c_int32(a).value
398969966
>>> b = 0xc158a854
>>> ctypes.c_int32(b).value
-1051154348
A nice way to do this in Python is using bitwise operations. For example, for 32-bit values:
def s32(value):
return -(value & 0x80000000) | (value & 0x7fffffff)
Applying this to your values:
>>> s32(a)
398969966
>>> s32(b)
-1051154348
What this function does is sign-extend the value so it's correctly interpreted with the right sign and value.
Python is a bit tricky in that it uses arbitrary precision integers, so negative numbers are treated as if there were an infinite series of leading 1 bits. For example:
>>> bin(-42 & 0xff)
'0b11010110'
>>> bin(-42 & 0xffff)
'0b1111111111010110'
>>> bin(-42 & 0xffffffff)
'0b11111111111111111111111111010110'
>>> import numpy
>>> numpy.int32(0xc158a854)
-1051154348
You'll have to know at least the width of your data. For instance, 0xc158a854 has 8 hexadecimal digits so it must be at least 32 bits wide; it appears to be an unsigned 32 bit value. We can process it using some bitwise operations:
In [232]: b = 0xc158a854
In [233]: if b >= 1<<31: b -= 1<<32
In [234]: b
Out[234]: -1051154348L
The L here marks that Python 2 has switched to processing the value as a long; it's usually not important, but in this case indicates that I've been working with values outside the common int range for this installation. The tool to extract data from binary structures such as UDP packets is struct.unpack; if you just tell it that your value is signed in the first place, it will produce the correct value:
In [240]: s = '\xc1\x58\xa8\x54'
In [241]: import struct
In [242]: struct.unpack('>i', s)
Out[242]: (-1051154348,)
That assumes two's complement representation; one's complement (such as the checksum used in UDP), sign and magnitude, or IEEE 754 floating point are some less common encodings for numbers.
Another modern solution:
>>> b = 0xc158a854
>>> int.from_bytes(bytes.fromhex(hex(b)[2:]), byteorder='big', signed=True)
-1051154348
2^31 = 0x80000000 (sign bit, which in two's compliment represents -2^31)
2^31-1 = 0x7fffffff (all the positive bits)
and hence (n & 0x7fffffff) - (n & 0x80000000) will apply the sign correctly
you could even do, n - ((n & 0x80000000)<<1) to subtract the msb value twice
or finally there's, (n & 0x7fffffff) | -(n & 0x80000000) to just merge the negative bit, not subtract
def signedHex(n): return (n & 0x7fffffff) | -(n & 0x80000000)
signedHex = lambda n: (n & 0x7fffffff) | -(n & 0x80000000)
value=input("enter hexa decimal value for getting compliment values:=")
highest_value="F"*len(value)
resulting_decimal=int(highest_value,16)-int(value,16)
ones_compliment=hex(resulting_decimal)
twos_compliment=hex(r+1)
print(f'ones compliment={ones_compliment}\n twos complimet={twos_compliment}')
Related
Based on what I've read about the binary representation of integers, the first bit is for sign (positive or negative).
Let's say we have an integer x = 5 and sys.getsizeof(x) returns 28 (that is binary representation in 28 bit).
For now I am trying to flip the first bit to 1 by using x|=(1<<27)but it returns 134217733.
I was just wondering whether it needs to be some negative number? (not -5)
Is there anything wrong with what I am doing?
You can't switch a Python int from positive to negative the way you're trying to, by just flipping a bit in its representation. You're assuming it's stored in a fixed-length two's complement representation. But integers in Python 3 are not fixed-length bit strings, and they are not stored in a two's complement representation. Instead, they are stored as variable-length strings of 30- or 15-bit "digits", with the sign stored separately (like a signed-magnitude representation). So the "lowest-level" way to negate a Python int is not with bit operations, but with the unary - operator, which will switch its sign. (See the end of this answer for details from the Python 3 source.)
(I should also mention that sys.getsizeof() does not tell you the number of bits in your int. It gives you the number of bytes of memory that the integer object is using. This is also not the number of bytes of the actual stored number; most of those bytes are for other things.)
You can still play around with two's complement representations in Python, by emulating a fixed-length bit string using a positive int. First, choose the length you want, for example 6 bits. (You could just as easily choose larger numbers like 28 or 594.) We can define some helpful constants and functions:
BIT_LEN = 6
NUM_INTS = 1 << BIT_LEN # 0b1000000
BIT_MASK = NUM_INTS - 1 # 0b111111
HIGH_BIT = 1 << (BIT_LEN - 1) # 0b100000
def to2c(num):
"""Returns the two's complement representation for a signed integer."""
return num & BIT_MASK
def from2c(bits):
"""Returns the signed integer for a two's complement representation."""
bits &= BIT_MASK
if bits & HIGH_BIT:
return bits - NUM_INTS
Now we can do something like you were trying to:
>>> x = to2c(2)
>>> x |= 1 << 5
>>> bin(x)
'0b100010'
>>> from2c(x)
-30
Which shows that turning on the high bit for the number 2 in a 6-bit two's complement representation turns the number into -30. This makes sense, because 26-1 = 32, so the lowest integer in this representation is -32. And -32 + 2 = -30.
If you're interested in the details of how Python 3 stores integers, you can look through Objects/longobject.c in the source. In particular, looking at the function _PyLong_Negate():
/* If a freshly-allocated int is already shared, it must
be a small integer, so negating it must go to PyLong_FromLong */
Py_LOCAL_INLINE(void)
_PyLong_Negate(PyLongObject **x_p)
{
PyLongObject *x;
x = (PyLongObject *)*x_p;
if (Py_REFCNT(x) == 1) {
Py_SIZE(x) = -Py_SIZE(x);
return;
}
*x_p = (PyLongObject *)PyLong_FromLong(-MEDIUM_VALUE(x));
Py_DECREF(x);
}
you can see that all it does in the normal case is negate the Py_SIZE() value of the integer object. Py_SIZE() is simply a reference to the ob_size field of the integer object. When this value is 0, the integer is 0. Otherwise, its sign is the sign of the integer, and its absolute value is the number of 30- or 15-bit digits in the array that holds the integer's absolute value.
Negative number representations in python :
Depending on how many binary digit you want, subtract from a number (2n):
>>> bin((1 << 8) - 1)
'0b11111111'
>>> bin((1 << 16) - 1)
'0b1111111111111111'
>>> bin((1 << 32) - 1)
'0b11111111111111111111111111111111'
Function to generate two's compliment(negative number):
def to_twoscomplement(bits, value):
if value < 0:
value = ( 1<<bits ) + value
formatstring = '{:0%ib}' % bits
return formatstring.format(value)
Output:
>>> to_twoscomplement(16, 3)
'0000000000000011'
>>> to_twoscomplement(16, -3)
'1111111111111101'
Refer : two's complement of numbers in python
In Python 2.5, I have a float and I'd like to obtain and manipulate its bit pattern as an integer.
For example, suppose I have
x = 173.3125
In IEEE 754 format, x's bit pattern in hexadecimal is 432D5000.
How can I obtain & manipulate (e.g., perform bitwise operations) on that bit pattern?
You can get the string you want (apparently implying a big-endian, 32-bit representation; Python internally uses the native endianity and 64-bits for floats) with the struct module:
>>> import struct
>>> x = 173.125
>>> s = struct.pack('>f', x)
>>> ''.join('%2.2x' % ord(c) for c in s)
'432d2000'
this doesn't yet let you perform bitwise operations, but you can then use struct again to map the string into an int:
>>> i = struct.unpack('>l', s)[0]
>>> print hex(i)
0x432d2000
and now you have an int which you can use in any sort of bitwise operations (follow the same two steps in reverse if after said operations you need to get a float again).
The problem is that a Python float object might not be a IEEE 754, because it is an object (in fact they are, but internally they could hold whichever representation is more convenient)...
As leo said, you can do a type cast with ctypes, so you are enforcing a particular representation (in this case, single precision):
from ctypes import *
x = 173.3125
bits = cast(pointer(c_float(x)), POINTER(c_int32)).contents.value
print hex(bits)
#swap the least significant bit
bits ^= 1
And then back:
y = cast(pointer(c_int32(bits)), POINTER(c_float)).contents.value
For reference, it is also possible to use numpy and view.
import numpy
def fextract( f ):
bits = numpy.asarray( f, dtype=numpy.float64 ).view( numpy.int64 )
if not bits & 0x7fffffffffffffff: # f == +/-0
return 0, 0
sign = numpy.sign(bits)
exponent = ( (bits>>52) & 0x7ff ) - 1075
mantissa = 0x10000000000000 | ( bits & 0xfffffffffffff )
# from here on f == sign * mantissa * 2**exponent
for shift in 32, 16, 8, 4, 2, 1:
if not mantissa & ((1<<shift)-1):
mantissa >>= shift
exponent += shift
return sign * mantissa, exponent
fextract( 1.5 ) # --> 3, -1
Use struct or xdrlib module:
>>> import struct
>>> x = 173.3125
>>> rep = struct.pack('>f', x)
>>> numeric = struct.unpack('>I', rep)[0]
>>> '%x' %numeric
'432d5000'
Now you can work with numeric, and then go in the reverse direction to get your floating point number back. You have to use >I (unsigned int) to avoid getting a negative number. xdrlib is similar.
References: struct, xdrlib.
I am not too well versed on this topic, but have you tried the ctypes module?
Let's say I have this number i = -6884376.
How do I refer to it as to an unsigned variable?
Something like (unsigned long)i in C.
Assuming:
You have 2's-complement representations in mind; and,
By (unsigned long) you mean unsigned 32-bit integer,
then you just need to add 2**32 (or 1 << 32) to the negative value.
For example, apply this to -1:
>>> -1
-1
>>> _ + 2**32
4294967295L
>>> bin(_)
'0b11111111111111111111111111111111'
Assumption #1 means you want -1 to be viewed as a solid string of 1 bits, and assumption #2 means you want 32 of them.
Nobody but you can say what your hidden assumptions are, though. If, for example, you have 1's-complement representations in mind, then you need to apply the ~ prefix operator instead. Python integers work hard to give the illusion of using an infinitely wide 2's complement representation (like regular 2's complement, but with an infinite number of "sign bits").
And to duplicate what the platform C compiler does, you can use the ctypes module:
>>> import ctypes
>>> ctypes.c_ulong(-1) # stuff Python's -1 into a C unsigned long
c_ulong(4294967295L)
>>> _.value
4294967295L
C's unsigned long happens to be 4 bytes on the box that ran this sample.
To get the value equivalent to your C cast, just bitwise and with the appropriate mask. e.g. if unsigned long is 32 bit:
>>> i = -6884376
>>> i & 0xffffffff
4288082920
or if it is 64 bit:
>>> i & 0xffffffffffffffff
18446744073702667240
Do be aware though that although that gives you the value you would have in C, it is still a signed value, so any subsequent calculations may give a negative result and you'll have to continue to apply the mask to simulate a 32 or 64 bit calculation.
This works because although Python looks like it stores all numbers as sign and magnitude, the bitwise operations are defined as working on two's complement values. C stores integers in twos complement but with a fixed number of bits. Python bitwise operators act on twos complement values but as though they had an infinite number of bits: for positive numbers they extend leftwards to infinity with zeros, but negative numbers extend left with ones. The & operator will change that leftward string of ones into zeros and leave you with just the bits that would have fit into the C value.
Displaying the values in hex may make this clearer (and I rewrote to string of f's as an expression to show we are interested in either 32 or 64 bits):
>>> hex(i)
'-0x690c18'
>>> hex (i & ((1 << 32) - 1))
'0xff96f3e8'
>>> hex (i & ((1 << 64) - 1)
'0xffffffffff96f3e8L'
For a 32 bit value in C, positive numbers go up to 2147483647 (0x7fffffff), and negative numbers have the top bit set going from -1 (0xffffffff) down to -2147483648 (0x80000000). For values that fit entirely in the mask, we can reverse the process in Python by using a smaller mask to remove the sign bit and then subtracting the sign bit:
>>> u = i & ((1 << 32) - 1)
>>> (u & ((1 << 31) - 1)) - (u & (1 << 31))
-6884376
Or for the 64 bit version:
>>> u = 18446744073702667240
>>> (u & ((1 << 63) - 1)) - (u & (1 << 63))
-6884376
This inverse process will leave the value unchanged if the sign bit is 0, but obviously it isn't a true inverse because if you started with a value that wouldn't fit within the mask size then those bits are gone.
Python doesn't have builtin unsigned types. You can use mathematical operations to compute a new int representing the value you would get in C, but there is no "unsigned value" of a Python int. The Python int is an abstraction of an integer value, not a direct access to a fixed-byte-size integer.
Since version 3.2 :
def unsignedToSigned(n, byte_count):
return int.from_bytes(n.to_bytes(byte_count, 'little', signed=False), 'little', signed=True)
def signedToUnsigned(n, byte_count):
return int.from_bytes(n.to_bytes(byte_count, 'little', signed=True), 'little', signed=False)
output :
In [3]: unsignedToSigned(5, 1)
Out[3]: 5
In [4]: signedToUnsigned(5, 1)
Out[4]: 5
In [5]: unsignedToSigned(0xFF, 1)
Out[5]: -1
In [6]: signedToUnsigned(0xFF, 1)
---------------------------------------------------------------------------
OverflowError Traceback (most recent call last)
Input In [6], in <cell line: 1>()
----> 1 signedToUnsigned(0xFF, 1)
Input In [1], in signedToUnsigned(n, byte_count)
4 def signedToUnsigned(n, byte_count):
----> 5 return int.from_bytes(n.to_bytes(byte_count, 'little', signed=True), 'little', signed=False)
OverflowError: int too big to convert
In [7]: signedToUnsigned(-1, 1)
Out[7]: 255
Explanations : to/from_bytes convert to/from bytes, in 2's complement considering the number as one of size byte_count * 8 bits. In C/C++, chances are you should pass 4 or 8 as byte_count for respectively a 32 or 64 bit number (the int type).
I first pack the input number in the format it is supposed to be from (using the signed argument to control signed/unsigned), then unpack to the format we would like it to have been from. And you get the result.
Note the Exception when trying to use fewer bytes than required to represent the number (In [6]). 0xFF is 255 which can't be represented using a C's char type (-128 ≤ n ≤ 127). This is preferable to any other behavior.
You could use the struct Python built-in library:
Encode:
import struct
i = -6884376
print('{0:b}'.format(i))
packed = struct.pack('>l', i) # Packing a long number.
unpacked = struct.unpack('>L', packed)[0] # Unpacking a packed long number to unsigned long
print(unpacked)
print('{0:b}'.format(unpacked))
Out:
-11010010000110000011000
4288082920
11111111100101101111001111101000
Decode:
dec_pack = struct.pack('>L', unpacked) # Packing an unsigned long number.
dec_unpack = struct.unpack('>l', dec_pack)[0] # Unpacking a packed unsigned long number to long (revert action).
print(dec_unpack)
Out:
-6884376
[NOTE]:
> is BigEndian operation.
l is long.
L is unsigned long.
In amd64 architecture int and long are 32bit, So you could use i and I instead of l and L respectively.
[UPDATE]
According to the #hl037_ comment, this approach works on int32 not int64 or int128 as I used long operation into struct.pack(). Nevertheless, in the case of int64, the written code would be changed simply using long long operand (q) in struct as follows:
Encode:
i = 9223372036854775807 # the largest int64 number
packed = struct.pack('>q', i) # Packing an int64 number
unpacked = struct.unpack('>Q', packed)[0] # Unpacking signed to unsigned
print(unpacked)
print('{0:b}'.format(unpacked))
Out:
9223372036854775807
111111111111111111111111111111111111111111111111111111111111111
Next, follow the same way for the decoding stage. As well as this, keep in mind q is long long integer — 8byte and Q is unsigned long long
But in the case of int128, the situation is slightly different as there is no 16-byte operand for struct.pack(). Therefore, you should split your number into two int64.
Here's how it should be:
i = 10000000000000000000000000000000000000 # an int128 number
print(len('{0:b}'.format(i)))
max_int64 = 0xFFFFFFFFFFFFFFFF
packed = struct.pack('>qq', (i >> 64) & max_int64, i & max_int64)
a, b = struct.unpack('>QQ', packed)
unpacked = (a << 64) | b
print(unpacked)
print('{0:b}'.format(unpacked))
Out:
123
10000000000000000000000000000000000000
111100001011110111000010000110101011101101001000110110110010000000011110100001101101010000000000000000000000000000000000000
just use abs for converting unsigned to signed in python
a=-12
b=abs(a)
print(b)
Output:
12
I am trying to append some hex values in python and I always seem to get 0x between the number. From what I searched, either this is not possible without converting it into a lit of values ?? I am not sure.
a = 0x7b
b = 0x80000
hex(a) + hex(b) = 0x7b0x80000
I dont want the 0x in the middle - I need, 0x7b80000. is there any other way to do this? If I convert to integer I get the sum of the two and converting it to hex is a different value than 0x7b80000
I don't think you want to "append" them. Doing integer arithmetic by using strings is a bad idea. I think you want to bit-shift a into the right place and OR them together:
>>> a = 0x7B
>>> b = 0x80000
>>>
>>> hex( (a<<20) | b )
'0x7b80000'
Perhaps if you were more specific about what these numbers are and what exactly you're trying to accomplish I could provide a more general answer.
You can use f-string formatting with Python 3:
>>> a = 0x7b
>>> b = 0x80000
>>> f'0x{a:x}{b:x}'
'0x7b80000'
This is a more generic way to append hex / int / bin values.
Only works for positive values of b.
a = 0x7b
b = 0x80000
def append_hex(a, b):
sizeof_b = 0
# get size of b in bits
while((b >> sizeof_b) > 0):
sizeof_b += 1
# align answer to nearest 4 bits (hex digit)
sizeof_b += sizeof_b % 4
return (a << sizeof_b) | b
print(hex(append_hex(a, b)))
Basically you have to find the highest set bit that b has.
Align that number to the highest multiple of 4 since that's what hex chars are.
Append the a to the front of the highest multiple of 4 that was found before.
It's been 7 years but the accepted answer is wrong and this post still appears in the first place in google searches; so here is a correct answer:
import math
def append_hex(a, b):
sizeof_b = 0
# get size of b in bits
while((b >> sizeof_b) > 0):
sizeof_b += 1
# every position in hex in represented by 4 bits
sizeof_b_hex = math.ceil(sizeof_b/4) * 4
return (a << sizeof_b_hex) | b
The accepted answer doesn't make sense (you can check it with values a=10, b=1a). In this solution, we search for the nearest divider of 4 - since every hex value is represented by 4 bits - and then move the first value this time of bits.
I have a bit counting method that I am trying to make as fast as possible. I want to try the algorithm below from Bit Twiddling Hacks, but I don't know C. What is 'type T' and what is the python equivalent of (T)~(T)0/3?
A generalization of the best bit
counting method to integers of
bit-widths upto 128 (parameterized by
type T) is this:
v = v - ((v >> 1) & (T)~(T)0/3); // temp
v = (v & (T)~(T)0/15*3) + ((v >> 2) & (T)~(T)0/15*3); // temp
v = (v + (v >> 4)) & (T)~(T)0/255*15; // temp
c = (T)(v * ((T)~(T)0/255)) >> (sizeof(v) - 1) * CHAR_BIT; // count
T is a integer type, which I'm assuming is unsigned. Since this is C, it'll be fixed width, probably (but not necessarily) one of 8, 16, 32, 64 or 128. The fragment (T)~(T)0 that appears repeatedly in that code sample just gives the value 2**N-1, where N is the width of the type T. I suspect that the code may require that N be a multiple of 8
for correct operation.
Here's a direct translation of the given code into Python, parameterized in terms of N, the width of T in bits.
def count_set_bits(v, N=128):
mask = (1 << N) - 1
v = v - ((v >> 1) & mask//3)
v = (v & mask//15*3) + ((v >> 2) & mask//15*3)
v = (v + (v >> 4)) & mask//255*15
return (mask & v * (mask//255)) >> (N//8 - 1) * 8
Caveats:
(1) the above will only work for numbers up to 2**128. You might be able to generalize it for larger numbers, though.
(2) There are obvious inefficiencies: for example, 'mask//15' is computed twice. This doesn't matter for C, of course, because the compiler will almost certainly do the division at compile time rather than run time, but Python's peephole optimizer may not be so clever.
(3) The fastest C method may well not translate to the fastest Python method. For Python speed, you should probably be looking for an algorithm that minimizes the number of Python bitwise operations. As Alexander Gessler said: profile!
What you copied is a template for generating code. It's not a good idea to transliterate that template into another language and expect it to run fast. Let's expand the template.
(T)~(T)0 means "as many 1-bits as fit in type T". The algorithm needs 4 masks which we will compute for the various T-sizes we might be interested in.
>>> for N in (8, 16, 32, 64, 128):
... all_ones = (1 << N) - 1
... constants = ' '.join([hex(x) for x in [
... all_ones // 3,
... all_ones // 15 * 3,
... all_ones // 255 * 15,
... all_ones // 255,
... ]])
... print N, constants
...
8 0x55 0x33 0xf 0x1
16 0x5555 0x3333 0xf0f 0x101
32 0x55555555L 0x33333333L 0xf0f0f0fL 0x1010101L
64 0x5555555555555555L 0x3333333333333333L 0xf0f0f0f0f0f0f0fL 0x101010101010101L
128 0x55555555555555555555555555555555L 0x33333333333333333333333333333333L 0xf0f0f0f0f0f0f0f0f0f0f0f0f0f0f0fL 0x1010101010101010101010101010101L
>>>
You'll notice that the masks generated for the 32-bit case match those in the hardcoded 32-bit C code. Implementation detail: lose the L suffix from the 32-bit masks (Python 2.x) and lose all L suffixes for Python 3.x.
As you can see the whole template and (T)~(T)0 caper is merely obfuscatory sophistry. Put quite simply, for a k-byte type, you need 4 masks:
k bytes each 0x55
k bytes each 0x33
k bytes each 0x0f
k bytes each 0x01
and the final shift is merely N-8 (i.e. 8*(k-1)) bits. Aside: I doubt if the template code would actually work on a machine whose CHAR_BIT was not 8, but there aren't very many of those around these days.
Update: There is another point that affects the correctness and the speed when transliterating such algorithms from C to Python. The C algorithms often assume unsigned integers. In C, operations on unsigned integers work silently modulo 2**N. In other words, only the least significant N bits are retained. No overflow exceptions. Many bit twiddling algorithms rely on this. However (a) Python's int and long are signed (b) old Python 2.X will raise an exception, recent Python 2.Xs will silently promote int to long and Python 3.x int == Python 2.x long.
The correctness problem usually requires register &= all_ones at least once in the Python code. Careful analysis is often required to determine the minimal correct masking.
Working in long instead of int doesn't do much for efficiency. You'll notice that the algorithm for 32 bits will return a long answer even from input of 0, because the 32-bits all_ones is long.