How to keep leading zeros in binary integer (python)?


I need to calculate a checksum for a hex serial word string using XOR. To my (limited) knowledge this has to be performed using the bitwise operator ^. Also, the data has to be converted to binary integer form. Below is my rudimentary code - but the checksum it calculates is 1000831. It should be 01001110 or 47hex. I think the error may be due to missing the leading zeros. All the formatting I've tried to add the leading zeros turns the binary integers back into strings. I appreciate any suggestions.
word = ('010900004f')
#divide word into 5 separate bytes
wd1 = word[0:2]
wd2 = word[2:4]
wd3 = word[4:6]
wd4 = word[6:8]
wd5 = word[8:10]
#this converts a hex string to a binary string
wd1bs = bin(int(wd1, 16))[2:]
wd2bs = bin(int(wd2, 16))[2:]
wd3bs = bin(int(wd3, 16))[2:]
wd4bs = bin(int(wd4, 16))[2:]
wd5bs = bin(int(wd5, 16))[2:]
#this converts binary string to binary integer
wd1i = int(wd1bs)
wd2i = int(wd2bs)
wd3i = int(wd3bs)
wd4i = int(wd4bs)
wd5i = int(wd5bs)
#now that I have binary integers, I can use the XOR bitwise operator to cal cksum
checksum = (wd1i ^ wd2i ^ wd3i ^ wd4i ^ wd5i)
#I should get 47 hex as the checksum
print (checksum, type(checksum))

Why use all these conversions and the costly string functions?
(I will answer the X part of your XY-Problem, not the Y part.)
def checksum (s):
    v = int (s, 16)
    checksum = 0
    while v:
        checksum ^= v & 0xff
        v >>= 8
    return checksum
cs = checksum ('010900004f')
print (cs, bin (cs), hex (cs) )
Result is 0x47 as expected. Btw 0x47 is 0b1000111 and not as stated 0b1001110.
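On the title question itself: an int has no leading zeros, they only exist in a formatted string. A minimal sketch, assuming an 8-bit field width, of how the checksum could be displayed zero-padded (xor_checksum is a hypothetical name, not part of the answer above):

```python
def xor_checksum(hex_word):
    """XOR all bytes of a hex string together (same loop as above)."""
    value = int(hex_word, 16)
    result = 0
    while value:
        result ^= value & 0xFF
        value >>= 8
    return result

cs = xor_checksum('010900004f')
# Leading zeros are a matter of formatting, not of the integer itself.
padded_bin = format(cs, '08b')   # 8 binary digits, zero-padded
padded_hex = format(cs, '02x')   # 2 hex digits, zero-padded
print(padded_bin, padded_hex)
```

The integer stays an int for the XOR; only the final display step turns it into a padded string.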

s = '010900004f'
b = int(s, 16)
print reduce(lambda x, y: x ^ y, ((b>> 8*i)&0xff for i in range(0, len(s)/2)), 0)

Just modify like this.
before:
wd1i = int(wd1bs)
wd2i = int(wd2bs)
wd3i = int(wd3bs)
wd4i = int(wd4bs)
wd5i = int(wd5bs)
after:
wd1i = int(wd1bs, 2)
wd2i = int(wd2bs, 2)
wd3i = int(wd3bs, 2)
wd4i = int(wd4bs, 2)
wd5i = int(wd5bs, 2)
Why doesn't your code work?
Because you are misunderstanding how int(wd1bs) behaves.
See the documentation for int(). By default, int() treats its string argument as base 10.
But you expect it to treat the argument as base 2.
So you need to write int(wd1bs, 2).
Alternatively, you can rewrite the entire code like this, which avoids the bin function altogether. This is basically the same as @Hyperboreus's answer. :)
w = int('010900004f', 16)
w1 = (0xff00000000 & w) >> 4*8
w2 = (0x00ff000000 & w) >> 3*8
w3 = (0x0000ff0000 & w) >> 2*8
w4 = (0x000000ff00 & w) >> 1*8
w5 = (0x00000000ff & w)
checksum = w1 ^ w2 ^ w3 ^ w4 ^ w5
print hex(checksum)
#'0x47'
And here is a shorter one.
import binascii
word = '010900004f'
print hex(reduce(lambda a, b: a ^ b, (ord(i) for i in binascii.unhexlify(word))))
#0x47
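The snippets above are Python 2 (print statement, built-in reduce, ord on the bytes of a str). A rough Python 3 equivalent, assuming the input is always an even-length hex string, might look like this:

```python
from functools import reduce
from operator import xor

word = '010900004f'
# bytes.fromhex yields a bytes object; iterating it gives ints directly,
# so no ord() call is needed.
checksum = reduce(xor, bytes.fromhex(word), 0)
print(hex(checksum))
```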

Related

Reverse int as hex

I have an int in Python whose bytes I want to reverse:
x = int(1234567899)
I want the result to be 3674379849.
Explanation: 1234567899 = 0x499602DB and 3674379849 = 0xDB029649.
How can I do that in Python?

>>> import struct
>>> struct.unpack('>I', struct.pack('<I', 1234567899))[0]
3674379849
>>>
This converts the integer to a 4-byte array (I), then decodes it in reverse order (> vs <).
Documentation: struct
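In Python 3, the same byte swap can also be sketched with int.to_bytes and int.from_bytes, assuming the value is unsigned and fits in 4 bytes (swap32 is a hypothetical helper name):

```python
def swap32(n):
    """Reverse the byte order of an unsigned 32-bit integer."""
    # Serialize little-endian, then read the same bytes back as big-endian.
    return int.from_bytes(n.to_bytes(4, byteorder='little'), byteorder='big')

swapped = swap32(1234567899)
print(swapped, hex(swapped))
```

to_bytes raises OverflowError if the value does not fit in 4 bytes, so out-of-range input is rejected rather than silently truncated.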

If you just want the result, use sabik's approach. If you want the intermediate steps for bragging rights, you would need to:
create the hex of the number (#1) and maybe add a leading 0 for correctness
reverse it byte-wise, i.e. two hex digits at a time (#2)
create an integer again (#3)
for example like so:
n = 1234567899
# 1
h = hex(n)
if len(h) % 2: # fix for odd-length inputs (e.g. n = int("234",16))
    h = '0x0'+h[2:]
# 2 (skips 0x and prepends 0x for looks only)
bh = '0x'+''.join([h[i: i+2] for i in range(2, len(h), 2)][::-1])
# 3
b = int(bh, 16)
print(n, h, bh, b)
to get
1234567899 0x499602db 0xdb029649 3674379849

Using z3 where constraint depends on output of function

I want to use z3 to solve this case. The input is a 10 character string. Each character of the input is a printable character (ASCII). The input should be such that when calc2() function is called with input as a parameter, the result should be: 0x0009E38E1FB7629B.
How can I use z3py in such cases?
Usually I would just add independent equations as a constraint to z3. In this case, I am not sure how to use z3.
import sys
def calc2(input):
    result = 0
    for i in range(len(input)):
        r1 = (result << 0x5) & 0xffffffffffffffff
        r2 = result >> 0x1b
        r3 = (r1 ^ r2)
        result = (r3 ^ ord(input[i]))
    return result
if __name__ == "__main__":
    input = sys.argv[1]
    result = calc2(input)
    if result == 0x0009E38E1FB7629B:
        print("solved")
Update: I tried the following however it does not give me correct answer:
from z3 import *
def calc2(input):
    result = 0
    for i in range(len(input)):
        r1 = (result << 0x5) & 0xffffffffffffffff
        r2 = result >> 0x1b
        r3 = (r1 ^ r2)
        result = r3 ^ Concat(BitVec(0, 56), input[i])
    return result
if __name__ == "__main__":
    s = Solver()
    X = [BitVec('x' + str(i), 8) for i in range(10)]
    s.add(calc2(X) == 0x0009E38E1FB7629B)
    if s.check() == sat:
        print(s.model())

I hope this isn't homework, but here's one way to go about it:
from z3 import *
s = Solver()
# Input is 10 character long; represent with 10 8-bit symbolic variables
input = [BitVec("input%s" % i, 8) for i in range(10)]
# Make sure each character is printable ASCII, i.e., between 0x20 and 0x7E
for i in range(10):
    s.add(input[i] >= 0x20)
    s.add(input[i] <= 0x7E)
def calc2(input):
    # result is a 64-bit value
    result = BitVecVal(0, 64)
    for i in range(len(input)):
        # NB. We don't actually need to mask with 0xffffffffffffffff
        # Since we explicitly have a 64-bit value in result.
        # But it doesn't hurt to mask it, so we do it here.
        r1 = (result << 0x5) & 0xffffffffffffffff
        # NB. >> on a z3 bit-vector is an arithmetic shift (LShR is the logical one).
        # With 10 input characters the top bit is never set, so both agree here.
        r2 = result >> 0x1b
        r3 = r1 ^ r2
        # We need to zero-extend to match sizes
        result = r3 ^ ZeroExt(56, input[i])
    return result
# Assert the required equality
s.add(calc2(input) == 0x0009E38E1FB7629B)
# Check and get model
print(s.check())
m = s.model()
# reconstruct the string (kept in its own name so the solver s stays usable):
secret = ''.join([chr (m[input[i]].as_long()) for i in range(10)])
print(secret)
This prints:
$ python a.py
sat
L`p:LxlBVU
Looks like your secret string is
"L`p:LxlBVU"
I've put in some comments in the program to help you with how things are coded in z3py, but feel free to ask for clarification. Hope this helps!
Getting all solutions
To get other solutions, you simply loop and assert that the solution shouldn't be the previous one. You can use the following while loop after the assertion:
while s.check() == sat:
    m = s.model()
    print(''.join([chr (m[input[i]].as_long()) for i in range(10)]))
    s.add(Or([input[i] != m[input[i]] for i in range(10)]))
When I ran it, it kept going! You might want to stop it after a while.
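Any string the solver reports can be checked against the original, non-symbolic function. A small sketch in plain Python (no z3 needed; is_solution is a hypothetical helper, and the target constant is the one from the question):

```python
TARGET = 0x0009E38E1FB7629B

def calc2(text):
    """Plain-integer version of the function from the question."""
    result = 0
    for ch in text:
        r1 = (result << 5) & 0xFFFFFFFFFFFFFFFF
        r2 = result >> 27
        result = r1 ^ r2 ^ ord(ch)
    return result

def is_solution(candidate):
    """True if candidate is 10 printable ASCII characters that hash to TARGET."""
    printable = all(0x20 <= ord(ch) <= 0x7E for ch in candidate)
    return len(candidate) == 10 and printable and calc2(candidate) == TARGET

# Replace the argument with a string printed by the solver loop.
print(is_solution("AAAAAAAAAA"))
```

Running each model through such a check guards against encoding mistakes in the symbolic version, for example a wrong shift kind or a missing zero-extension.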

You can encode calc2 in Z3. You'll need to unroll the loop for 1,2,3,4,..,n times (for n = max input size expected), but that's it.
(You don't actually need to unroll the loop, you can use z3py to create the constraints)

Python: XOR hex values in strings

I'm new to Python and I have a (maybe) dumb question.
I have to XOR two values. Currently my values look like this:
v1 =
<class 'str'>
2dbdd2157b5a10ba61838a462fc7754f7cb712d6
v2 =
<class 'str'>
5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8
But the thing is, I need to XOR the actual hex values instead of the ASCII values of the characters in the strings.
So for example:
The first byte in the first string is s1 = 2d, in the second string s2 = 5b.
def sxor(s1,s2):
    return ''.join(chr(ord(a) ^ ord(b)) for a,b in zip(s1,s2))
This will not work, because it takes the ASCII value of each character and XORs those, which obviously differs from the actual hex value.

Your mistake is converting the characters to their ASCII codepoints, not to integer values.
You can convert them using int() and format() instead:
return ''.join(format(int(a, 16) ^ int(b, 16), 'x') for a,b in zip(s1,s2))
int(string, 16) interprets the input string as a hexadecimal value. format(integer, 'x') outputs a hexadecimal string for the given integer.
You can do this without zip() by just taking the whole strings as one big integer number:
return '{1:0{0}x}'.format(len(s1), int(s1, 16) ^ int(s2, 16))
To make sure leading 0 characters are not lost, the above uses str.format() to format the resulting integer to the right length of zero-padded hexadecimal.
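A byte-oriented variant is also possible in Python 3. The following sketch assumes both inputs are even-length hex strings of equal length; xor_hex is a hypothetical name, not something from the answer above:

```python
def xor_hex(h1, h2):
    """XOR two equal-length hex strings byte by byte; returns a hex string."""
    b1 = bytes.fromhex(h1)
    b2 = bytes.fromhex(h2)
    if len(b1) != len(b2):
        raise ValueError("inputs must have the same length")
    # bytes.hex() always emits two digits per byte, so leading zeros survive.
    return bytes(x ^ y for x, y in zip(b1, b2)).hex()

v1 = '2dbdd2157b5a10ba61838a462fc7754f7cb712d6'
v2 = '5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8'
print(xor_hex(v1, v2))
```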

Parse the strings into ints, then XOR the ints:
def sxor(s1,s2):
    i1 = int(s1, 16)
    i2 = int(s2, 16)
    # do you want the result as an int or as another string?
    return hex(i1 ^ i2)

Works with Python 3 (the inputs must be bytes objects):
_HAS_NUMPY = False
try:
    import numpy
    _HAS_NUMPY = True
except ImportError:
    pass
def my_xor(s1, s2):
    if _HAS_NUMPY:
        # frombuffer/tobytes replace the deprecated fromstring/tostring
        b1 = numpy.frombuffer(s1, dtype="uint8")
        b2 = numpy.frombuffer(s2, dtype="uint8")
        return (b1 ^ b2).tobytes()
    result = bytearray(s1)
    for i, b in enumerate(s2):
        result[i] ^= b
    return result

Write boolean string to binary file?

I have a string of booleans and I want to create a binary file using these booleans as bits. This is what I am doing:
# first append the string with 0s to make its length a multiple of 8
while len(boolString) % 8 != 0:
    boolString += '0'
# write the string to the file byte by byte
i = 0
while i < len(boolString) / 8:
    byte = int(boolString[i*8 : (i+1)*8], 2)
    outputFile.write('%c' % byte)
    i += 1
But this generates the output 1 byte at a time and is slow. What would be a more efficient way to do it?

It should be quicker if you calculate all your bytes first and then write them all together. For example
b = bytearray([int(boolString[x:x+8], 2) for x in range(0, len(boolString), 8)])
outputFile.write(b)
I'm also using a bytearray which is a natural container to use, and can also be written directly to your file.
You can of course use libraries if that's appropriate such as bitarray and bitstring. Using the latter you could just say
bitstring.Bits(bin=boolString).tofile(outputFile)
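In Python 3 the whole bit string can also be converted in one step with int.to_bytes. A sketch, assuming the string contains only '0' and '1' characters and should be right-padded with zeros to a whole number of bytes, as in the question (bits_to_bytes is a hypothetical name):

```python
def bits_to_bytes(bit_string):
    """Pack a string of '0'/'1' characters into bytes, padding on the right."""
    if not bit_string:
        return b''
    n_bytes = (len(bit_string) + 7) // 8
    padded = bit_string.ljust(n_bytes * 8, '0')
    # Passing the byte count explicitly keeps any leading zero bytes.
    return int(padded, 2).to_bytes(n_bytes, byteorder='big')

data = bits_to_bytes('0100000101')
print(data)
# Writing is then a single call, e.g.:
# with open('out.bin', 'wb') as f:
#     f.write(data)
```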

Here's another answer, this time using an industrial-strength utility function from PyCrypto (The Python Cryptography Toolkit). In version 2.6 (the latest stable release at the time of writing), it's defined in pycrypto-2.6/lib/Crypto/Util/number.py.
The comments preceding it say:
Improved conversion functions contributed by Barry Warsaw, after careful benchmarking
import struct
def long_to_bytes(n, blocksize=0):
    """long_to_bytes(n:long, blocksize:int) : string
    Convert a long integer to a byte string.
    If optional blocksize is given and greater than zero, pad the front of the
    byte string with binary zeros so that the length is a multiple of
    blocksize.
    """
    # after much testing, this algorithm was deemed to be the fastest
    s = b('')
    n = long(n)
    pack = struct.pack
    while n > 0:
        s = pack('>I', n & 0xffffffffL) + s
        n = n >> 32
    # strip off leading zeros
    for i in range(len(s)):
        if s[i] != b('\000')[0]:
            break
    else:
        # only happens when n == 0
        s = b('\000')
        i = 0
    s = s[i:]
    # add back some pad bytes. this could be done more efficiently w.r.t. the
    # de-padding being done above, but sigh...
    if blocksize > 0 and len(s) % blocksize:
        s = (blocksize - len(s) % blocksize) * b('\000') + s
    return s

You can convert a boolean string to a long using data = long(boolString, 2). Then to write this long to disk you can use:
while data > 0:
    data, byte = divmod(data, 0x100)
    file.write('%c' % byte)
Note that the divisor must be 0x100 (256) to peel off one byte at a time, and that this loop emits the least significant byte first.
However, there is no need to make a boolean string. It is much easier to use a long. The long type can hold an arbitrary number of bits. Using bit manipulation you can set or clear the bits as needed. You can then write the long to disk as a whole in a single write operation.

You can try this code using the array class:
import array
buffer = array.array('B')
i = 0
while i < len(boolString) / 8:
    byte = int(boolString[i*8 : (i+1)*8], 2)
    buffer.append(byte)
    i += 1
f = open(filename, 'wb')
buffer.tofile(f)
f.close()

A helper class (shown below) makes it easy:
class BitWriter:
    def __init__(self, f):
        self.acc = 0
        self.bcount = 0
        self.out = f
    def __del__(self):
        self.flush()
    def writebit(self, bit):
        if self.bcount == 8 :
            self.flush()
        if bit > 0:
            self.acc |= (1 << (7-self.bcount))
        self.bcount += 1
    def writebits(self, bits, n):
        while n > 0:
            self.writebit( bits & (1 << (n-1)) )
            n -= 1
    def flush(self):
        self.out.write(chr(self.acc))
        self.acc = 0
        self.bcount = 0
with open('outputFile', 'wb') as f:
    bw = BitWriter(f)
    bw.writebits(int(boolString,2), len(boolString))
    bw.flush()

Use the struct package.
This can be used in handling binary data stored in files or from network connections, among other sources.
Edit:
An example using ? as the format character for a bool.
import struct
p = struct.pack('????', True, False, True, False)
assert p == '\x01\x00\x01\x00'
with open("out", "wb") as o:
    o.write(p)
Let's take a look at the file:
$ ls -l out
-rw-r--r-- 1 lutz lutz 4 Okt 1 13:26 out
$ od out
0000000 000001 000001
000000
Read it in again:
with open("out", "rb") as i:
    q = struct.unpack('????', i.read())
assert q == (True, False, True, False)

Concatenate two 32 bit int to get a 64 bit long in Python

I want to generate 64-bit long ints to serve as unique IDs for documents.
One idea is to combine the user's ID, which is a 32-bit int, with the Unix timestamp, which is another 32-bit int, to form a unique 64-bit long integer.
A scaled-down example would be:
Combine two 4-bit numbers 0010 and 0101 to form the 8-bit number 00100101.
Does this scheme make sense?
If it does, how do I do the "concatenation" of numbers in Python?

Left shift the first number by the number of bits in the second number, then add (or bitwise OR - replace + with | in the following examples) the second number.
result = (user_id << 32) + timestamp
With respect to your scaled-down example,
>>> x = 0b0010
>>> y = 0b0101
>>> (x << 4) + y
37
>>> 0b00100101
37
>>>

foo = <some int>
bar = <some int>
foobar = (foo << 32) + bar

This should do it:
(x << 32) + y

For the next guy (which, in this case, was me): here is one way to do it in general (for the scaled-down example):
def combineBytes(*args):
    """
    given the bytes of a multi byte number combine into one
    pass them in least to most significant
    """
    ans = 0
    for i, val in enumerate(args):
        ans += (val << i*4)
    return ans
for other sizes change the 4 to a 32 or whatever.
>>> bin(combineBytes(0b0101, 0b0010))
'0b100101'

None of the answers before this cover both merging and splitting the numbers. Splitting can be as much a necessity as merging.
NUM_BITS_PER_INT = 4 # Replace with 32, 48, 64, etc. as needed.
MAXINT = (1 << NUM_BITS_PER_INT) - 1
def merge(a, b):
    c = (a << NUM_BITS_PER_INT) | b
    return c
def split(c):
    a = (c >> NUM_BITS_PER_INT) & MAXINT
    b = c & MAXINT
    return a, b
# Test
EXPECTED_MAX_NUM_BITS = NUM_BITS_PER_INT * 2
for a in range(MAXINT + 1):
    for b in range(MAXINT + 1):
        c = merge(a, b)
        assert c.bit_length() <= EXPECTED_MAX_NUM_BITS
        assert (a, b) == split(c)
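If the combined value has to be stored or transmitted as raw bytes, the same layout can be produced with struct. A sketch for the 32-bit case, assuming both inputs are unsigned and fit in 32 bits (merge32 and split64 are hypothetical names):

```python
import struct

def merge32(high, low):
    """Combine two unsigned 32-bit ints into one unsigned 64-bit int."""
    # struct.pack raises struct.error if either value is out of range.
    return struct.unpack('>Q', struct.pack('>II', high, low))[0]

def split64(value):
    """Split an unsigned 64-bit int back into (high, low) 32-bit halves."""
    return struct.unpack('>II', struct.pack('>Q', value))

combined = merge32(0x12345678, 0x9ABCDEF0)
print(hex(combined), split64(combined))
```

Unlike the plain shift-and-OR version, this variant rejects out-of-range inputs instead of letting them spill into the other half.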
