Why am I failing this simple Python algorithm?

I have an algorithm that I want to implement in Python and analyze. I think I wrote it correctly, but my output doesn't match the expected output.
The given algorithm:
Input{inStr: a binary string of bytes}
Output{outHash: 32-bit hashcode for the inStr in a series of hex values}
Mask: 0x3FFFFFFF
outHash: 0
for byte in input
    intermediate_value = ((byte XOR 0xCC) Left Shift 24) OR
                         ((byte XOR 0x33) Left Shift 16) OR
                         ((byte XOR 0xAA) Left Shift 8) OR
                         (byte XOR 0x55)
    outHash = (outHash AND Mask) + (intermediate_value AND Mask)
return outHash
My Python version of the algorithm is:
Input = "Hello world!"
Mask = 0x3FFFFFFF
outHash = 0
for byte in Input:
    intermediate_value = ((ord(byte) ^ 0xCC) << 24) or ((ord(byte) ^ 0x33) << 16) or ((ord(byte) ^ 0xAA) << 8) or (ord(byte) ^ 0x55)
    outHash = (outHash & Mask) + (intermediate_value & Mask)
print outHash
# use %x to print result in hex
print '%x'%outHash
For the input "Hello world!", I should get 0x50b027cf, but my output is very different:
1291845632
4d000000

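The OR in the algorithm must be the bitwise OR operator (|), not the boolean or keyword. The boolean or returns its first truthy operand, so for this input only the first term, (ord(byte) ^ 0xCC) << 24, ever reached intermediate_value.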
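With bitwise OR in place, the loop could look like the sketch below. It is written for Python 3 (the question uses Python 2), and the function name out_hash is only an illustrative choice, not something from the question.

```python
MASK = 0x3FFFFFFF


def out_hash(data: bytes) -> int:
    """Hash the bytes as in the pseudocode, using bitwise OR (|)."""
    result = 0
    for byte in data:  # iterating over bytes yields ints in Python 3
        intermediate_value = (
            ((byte ^ 0xCC) << 24)
            | ((byte ^ 0x33) << 16)
            | ((byte ^ 0xAA) << 8)
            | (byte ^ 0x55)
        )
        result = (result & MASK) + (intermediate_value & MASK)
    return result


print("%x" % out_hash(b"Hello world!"))  # 50b027cf
```

In Python 2 the same fix is simply replacing each or in the original line with |.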

Related

Transform hex to hex code point

I have a hex code like this:
\xf0\x9f\x94\xb4
And I want to encode this like this:
1F534
How can I transform it with a method in python 2.7?
Thanks
Here you are just asking: how can I find the unicode code of the character represented in utf8 with the (byte) string '\xf0\x9f\x94\xb4'?
In Python3 it would be as simple as:
>>> hex(ord(b'\xf0\x9f\x94\xb4'.decode()))
'0x1f534'
In a Python2 version compiled with --enable-unicode=ucs4, it would be more or less the same:
>>> hex(ord('\xf0\x9f\x94\xb4'.decode('utf-8')))
'0x1f534'
But after your comments, you have a Python 2.7 version compiled with --enable-unicode=ucs2. In that case, Unicode strings actually contain a UTF16 representation of the string:
>>> print [hex(ord(i)) for i in '\xf0\x9f\x94\xb4'.decode('utf-8')]
['0xd83d', '0xdd34']
with no direct way to find the true unicode code point of the U+1F534 LARGE RED CIRCLE character.
The last option is then to decode the UTF-8 sequence by hand. You can find a description of the UTF-8 encoding on Wikipedia. The following function takes the UTF-8 representation of a Unicode character and returns its code point:
def from_utf8(bstr):
    b = [ord(i) for i in bstr]
    if b[0] & 0x80 == 0:     # 1-byte sequence (ASCII)
        return b[0]
    if b[0] & 0xe0 == 0xc0:  # 2-byte sequence
        return ((b[0] & 0x1F) << 6) | (b[1] & 0x3F)
    if b[0] & 0xf0 == 0xe0:  # 3-byte sequence
        return ((b[0] & 0xF) << 12) | ((b[1] & 0x3F) << 6) | (b[2] & 0x3F)
    else:                    # 4-byte sequence
        return ((b[0] & 7) << 18) | ((b[1] & 0x3F) << 12) | \
               ((b[2] & 0x3F) << 6) | (b[3] & 0x3F)
Beware, no check is done here to make sure that the string is a correct UTF-8 representation of a single character. But at least it gives the expected result:
>>> print hex(from_utf8("\xf0\x9f\x94\xb4"))
0x1f534
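On a narrow (UCS-2) build, the two UTF-16 code units printed earlier (0xd83d and 0xdd34) can also be combined arithmetically, which may be an alternative to decoding the UTF-8 bytes by hand. The sketch below is written for Python 3 and uses the standard UTF-16 surrogate formula; the helper name combine_surrogates is hypothetical and not part of the answer.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a UTF-16 high/low surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    # Each surrogate carries 10 bits of the code point, offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


print(hex(combine_surrogates(0xD83D, 0xDD34)))  # 0x1f534
```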

Unsupported operand types for bitwise exclusive OR in Python

I am working with a function which generates cyclic redundancy check (CRC) values on data packets prior to sending them out over serial, and I seem to be having problems with Python not being able to tell the difference between a hex representation and an ASCII representation of a value. I send the following data:
('+', ' ', 'N', '\x00', '\x08')
to the following function:
# Computes CRC checksum using CRC-32 polynomial
def crc_stm32(self,data):
    crc = 0xFFFFFFFF
    for d in data:
        crc ^= d
        for i in range(32):
            if crc & 0x80000000:
                crc = (crc << 1) ^ 0x04C11DB7 #Polynomial used in STM32
            else:
                crc = (crc << 1)
        crc = (crc & 0xFFFFFFFF)
    return crc
Now the actual value of the '+' char that is going through this function is (as one might expect) 0x2B, however when Python gets to the line
crc ^= d
I am faced with the following error
unsupported operand type(s) for ^=: 'long' and 'str'
I have tried casting the value to chr(), hex(), int(), long() etc. all to no avail. It seems as though Python is interpreting the '+' value as a char or string.
As per juanpa's comment, the following modification to the code allowed for the proper handling of the data.
# Computes CRC checksum using CRC-32 polynomial
def crc_stm32(self,data):
    crc = 0xFFFFFFFF
    for d in map(ord,data):
        crc ^= d
        for i in range(32):
            if crc & 0x80000000:
                crc = (crc << 1) ^ 0x04C11DB7 #Polynomial used in STM32
            else:
                crc = (crc << 1)
        crc = (crc & 0xFFFFFFFF)
    print crc
    return crc
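For comparison, here is a minimal Python 3 sketch of the same loop as a standalone function (no self). It assumes the data is either a bytes object, whose items are already integers in Python 3, or an iterable of one-character strings, which are converted with ord. The constant name POLY is only an illustrative choice.

```python
POLY = 0x04C11DB7  # polynomial taken from the code above


def crc_stm32(data):
    """Same loop as above; accepts bytes or one-character strings."""
    crc = 0xFFFFFFFF
    for d in data:
        value = d if isinstance(d, int) else ord(d)  # ord('+') == 0x2B
        crc ^= value
        for _ in range(32):
            if crc & 0x80000000:
                crc = ((crc << 1) ^ POLY) & 0xFFFFFFFF
            else:
                crc = (crc << 1) & 0xFFFFFFFF
    return crc


packet = ('+', ' ', 'N', '\x00', '\x08')
print(hex(crc_stm32(packet)))
print(crc_stm32(packet) == crc_stm32(b'+ N\x00\x08'))  # True
```

Masking after every shift keeps the intermediate value within 32 bits; the low 32 bits come out the same as when masking once per item.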

Python working with bits

I want to do a bit operation, and need some help:
I have a 16-bit word and want to split it into two halves, reverse the bits of each half, and then join them again.
For example, if I have 0b11000011:
First I divide it into 0b1100 and 0b0011.
Then I reverse both, getting 0b0011 and 0b1100.
And finally I rejoin them, getting 0b00111100.
Thanks!
Here's one way to do it:
def rev(n):
    res = 0
    mask = 0x01
    while mask <= 0x80:
        res <<= 1
        res |= bool(n & mask)
        mask <<= 1
    return res

x = 0b1100000110000011
x = (rev(x >> 8) << 8) | rev(x & 0xFF)
print bin(x) # 0b1000001111000001
Note that the method above operates on 16-bit words, not on a single byte as in the question's example.
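The same idea can be generalized so that it covers both the 8-bit example from the question and a 16-bit word. This is a Python 3 sketch under the assumption that the width is even; the names reverse_bits and reverse_halves are hypothetical.

```python
def reverse_bits(n: int, width: int) -> int:
    """Reverse the lowest `width` bits of n."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (n & 1)
        n >>= 1
    return result


def reverse_halves(value: int, width: int) -> int:
    """Split value into two halves, reverse each half, rejoin them."""
    half = width // 2
    low_mask = (1 << half) - 1
    high = reverse_bits(value >> half, half)
    low = reverse_bits(value & low_mask, half)
    return (high << half) | low


print(format(reverse_halves(0b11000011, 8), '08b'))            # 00111100
print(format(reverse_halves(0b1100000110000011, 16), '016b'))  # 1000001111000001
```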
Here are some basic string operations you can try; after splitting the string in two and reversing each half, you can concatenate the results:
a = "0b11000011" # make a string
b = a[:6] # get the first 6 chars, '0b1100'
c = a[::-1] # reverse the string
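Put together, the string approach might look like the following Python 3 sketch. It assumes the input is a binary literal string with an even number of digits and an optional 0b prefix; the function name reverse_halves_str is hypothetical.

```python
def reverse_halves_str(bits: str) -> str:
    """Reverse each half of a binary string such as '0b11000011'."""
    digits = bits[2:] if bits.startswith("0b") else bits
    half = len(digits) // 2
    first, second = digits[:half], digits[half:]
    # Slicing with [::-1] reverses a string.
    return "0b" + first[::-1] + second[::-1]


print(reverse_halves_str("0b11000011"))  # 0b00111100
```

Working on strings keeps the leading zeros, which an integer would drop when printed with bin().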

Get nth byte of integer

I have the following integer:
target = 0xd386d209
print hex(target)
How can I print the nth byte of this integer? For example, expected output for the first byte would be:
0x09
You can do this with the help of bit manipulation. Create a bit mask for an entire byte, then bitshift that mask the number of bytes you'd like. Mask out the byte using binary AND and finally bitshift back the result to the first position:
target = 0xd386d209
byte_index = 0
mask = 0xFF << (8 * byte_index)
print hex((target & mask) >> (8 * byte_index))
You can simplify it a little bit by shifting the input number first. Then you don't need to bitshift the mask value at all:
target = 0xd386d209
byte_index = 0
mask = 0xFF
print hex((target >> (8 * byte_index)) & mask)
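The shift-then-mask variant can be wrapped in a small helper. The sketch below is Python 3 (the snippets above are Python 2), and the name nth_byte is only illustrative. It also cross-checks the result against int.to_bytes, assuming the value fits in four bytes.

```python
def nth_byte(number: int, n: int) -> int:
    """Return byte n of number, counting from the least significant byte."""
    return (number >> (8 * n)) & 0xFF


target = 0xd386d209
print([format(nth_byte(target, n), '#04x') for n in range(4)])
# ['0x09', '0xd2', '0x86', '0xd3']

# Little-endian byte order puts the least significant byte at index 0.
print(target.to_bytes(4, 'little')[0] == nth_byte(target, 0))  # True
```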
def byte(number, i):
    return (number & (0xff << (i * 8))) >> (i * 8)
>>> def print_n_byte(target, n):
...     return hex((target&(0xFF<<(8*n)))>>(8*n))
...
>>> print_n_byte(0xd386d209, 0)
'0x9L'
>>> print_n_byte(0xd386d209, 1)
'0xd2L'
>>> print_n_byte(0xd386d209, 2)
'0x86L'
This only involves some simple binary operations.
>>> target = 0xd386d209
>>> b = 0
>>> hex((target & (0xff << b * 8)) >> b * 8)
'0x9'
>>> b = 1
>>> hex((target & (0xff << b * 8)) >> b * 8)
'0xd2'

How to keep leading zeros in binary integer (python)?

I need to calculate a checksum for a hex serial word string using XOR. To my (limited) knowledge this has to be performed using the bitwise operator ^. Also, the data has to be converted to binary integer form. Below is my rudimentary code - but the checksum it calculates is 1000831. It should be 01001110 or 47hex. I think the error may be due to missing the leading zeros. All the formatting I've tried to add the leading zeros turns the binary integers back into strings. I appreciate any suggestions.
word = ('010900004f')
#divide word into 5 separate bytes
wd1 = word[0:2]
wd2 = word[2:4]
wd3 = word[4:6]
wd4 = word[6:8]
wd5 = word[8:10]
#this converts a hex string to a binary string
wd1bs = bin(int(wd1, 16))[2:]
wd2bs = bin(int(wd2, 16))[2:]
wd3bs = bin(int(wd3, 16))[2:]
wd4bs = bin(int(wd4, 16))[2:]
wd5bs = bin(int(wd5, 16))[2:]
#this converts binary string to binary integer
wd1i = int(wd1bs)
wd2i = int(wd2bs)
wd3i = int(wd3bs)
wd4i = int(wd4bs)
wd5i = int(wd5bs)
#now that I have binary integers, I can use the XOR bitwise operator to cal cksum
checksum = (wd1i ^ wd2i ^ wd3i ^ wd4i ^ wd5i)
#I should get 47 hex as the checksum
print (checksum, type(checksum))
Why use all these conversions and the costly string functions?
(I will answer the X part of your XY-Problem, not the Y part.)
def checksum (s):
    v = int (s, 16)
    checksum = 0
    while v:
        checksum ^= v & 0xff
        v >>= 8
    return checksum

cs = checksum ('010900004f')
print (cs, bin (cs), hex (cs) )
The result is 0x47 as expected. By the way, 0x47 is 0b1000111 (01000111 with the leading zero), not 0b1001110 as stated.
s = '010900004f'
b = int(s, 16)
print reduce(lambda x, y: x ^ y, ((b>> 8*i)&0xff for i in range(0, len(s)/2)), 0)
Just modify like this.
before:
wd1i = int(wd1bs)
wd2i = int(wd2bs)
wd3i = int(wd3bs)
wd4i = int(wd4bs)
wd5i = int(wd5bs)
after:
wd1i = int(wd1bs, 2)
wd2i = int(wd2bs, 2)
wd3i = int(wd3bs, 2)
wd4i = int(wd4bs, 2)
wd5i = int(wd5bs, 2)
Why doesn't your code work?
Because you are misunderstanding the behavior of int(wd1bs).
See the documentation for int. By default, Python's int function assumes that wd1bs is in base 10.
But you expect int to treat its argument as base 2.
So you need to write int(wd1bs, 2).
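A small illustration of the difference, using the second byte of the word ('09') as the example value. It runs under Python 3; the variable name bits is only illustrative.

```python
bits = bin(int('09', 16))[2:]  # hex '09' as a binary digit string
print(bits)          # 1001
print(int(bits))     # 1001 (read as decimal: one thousand and one)
print(int(bits, 2))  # 9    (read as binary)
```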
Or you can rewrite your entire code like this, so you don't need the bin function at all. This code is basically the same as @Hyperboreus's answer. :)
w = int('010900004f', 16)
w1 = (0xff00000000 & w) >> 4*8
w2 = (0x00ff000000 & w) >> 3*8
w3 = (0x0000ff0000 & w) >> 2*8
w4 = (0x000000ff00 & w) >> 1*8
w5 = (0x00000000ff & w)
checksum = w1 ^ w2 ^ w3 ^ w4 ^ w5
print hex(checksum)
#'0x47'
And this is an even shorter one.
import binascii
word = '010900004f'
print hex(reduce(lambda a, b: a ^ b, (ord(i) for i in binascii.unhexlify(word))))
#0x47
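The snippets above are Python 2. A possible Python 3 equivalent is sketched below, using bytes.fromhex, functools.reduce and operator.xor; the function name xor_checksum is hypothetical. It also shows that leading zeros are a matter of formatting the integer for display, not of the integer itself.

```python
from functools import reduce
from operator import xor


def xor_checksum(hex_word: str) -> int:
    """XOR all bytes of a hex string such as '010900004f'."""
    # bytes.fromhex yields a bytes object; iterating over it gives ints.
    return reduce(xor, bytes.fromhex(hex_word), 0)


cs = xor_checksum('010900004f')
print(hex(cs))            # 0x47
print(format(cs, '08b'))  # 01000111
```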
