I am trying to convert this C function into Python:
typedef unsigned long var;

/* Bit rotate rightwards */
var ror(var v, unsigned int bits) {
    return (v>>bits)|(v<<(8*sizeof(var)-bits));
}
I have tried Googling for some solutions, but I can't seem to get any of them to give the same results as the one here.
This is one solution I found in another program:
def mask1(n):
    """Return a bitmask of length n (suitable for masking against an
    int to coerce the size to a given length)
    """
    if n >= 0:
        return 2**n - 1
    else:
        return 0

def ror(n, rotations=1, width=8):
    """Return a given number of bitwise right rotations of an integer n,
    for a given bit field width.
    """
    rotations %= width
    if rotations < 1:
        return n
    n &= mask1(width)
    return (n >> rotations) | ((n << (8 * width - rotations)))
I am trying to bitshift key = 0xf0f0f0f0f123456. The C code gives 000000000f0f0f12 when it is called as ror(key, 8 << 1), whereas Python gives 0x0f0f0f0f0f123456 (the original input!).
Your C output doesn't match the function that you provided. That is presumably because you are not printing it correctly. This program:
#include <stdio.h>
#include <stdint.h>

uint64_t ror(uint64_t v, unsigned int bits)
{
    return (v>>bits) | (v<<(8*sizeof(uint64_t)-bits));
}

int main(void)
{
    printf("%llx\n", ror(0x0123456789abcdef, 4));
    printf("%llx\n", ror(0x0123456789abcdef, 8));
    printf("%llx\n", ror(0x0123456789abcdef, 12));
    printf("%llx\n", ror(0x0123456789abcdef, 16));
    return 0;
}
produces the following output:
f0123456789abcde
ef0123456789abcd
def0123456789abc
cdef0123456789ab
To produce an ror function in Python I refer you to this excellent article: http://www.falatic.com/index.php/108/python-and-bitwise-rotation
This Python 2 code produces the same output as the C program above:
ror = lambda val, r_bits, max_bits: \
((val & (2**max_bits-1)) >> r_bits%max_bits) | \
(val << (max_bits-(r_bits%max_bits)) & (2**max_bits-1))
print "%x" % ror(0x0123456789abcdef, 4, 64)
print "%x" % ror(0x0123456789abcdef, 8, 64)
print "%x" % ror(0x0123456789abcdef, 12, 64)
print "%x" % ror(0x0123456789abcdef, 16, 64)
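For anyone on Python 3, here is a minimal sketch of the same rotation written as a plain function instead of a lambda. The name ror and the parameter order are carried over from the lambda above; the `:016x` formatting is my own choice so that leading zeros stay visible.

```python
def ror(value, r_bits, max_bits):
    """Rotate `value` right by `r_bits` inside a field of `max_bits` bits."""
    mask = (1 << max_bits) - 1
    r_bits %= max_bits          # rotating by the full width is a no-op
    value &= mask               # discard anything outside the field
    return (value >> r_bits) | ((value << (max_bits - r_bits)) & mask)


for shift in (4, 8, 12, 16):
    print(f"{ror(0x0123456789abcdef, shift, 64):016x}")
```

The four printed lines should match the output of the C program above.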
The shortest way I've found in Python:
(note this works only with integers as inputs)
def ror(n, rotations, width):
    return (2**width-1)&(n>>rotations|n<<(width-rotations))
There are several distinct problems in your question.
C part:
You use a 64-bit value for key (0x0f0f0f0f0f123456), but the output shows that for your compiler unsigned long is only 32 bits wide. So the C code actually rotates the 32-bit value 0x0f123456 right by 16 bits, giving 0x34560f12.
If you had used unsigned long long (assuming it is 64 bits on your architecture, as it is on mine), you would have got 0x34560f0f0f0f0f12 (a 16-bit rotation of the 64-bit value).
Python part:
The meaning of width is not consistent between mask1 and ror: mask1 takes a width in bits, whereas ror treats width as a number of bytes (one byte = 8 bits).
The ror function should be:
def ror(n, rotations=1, width=8):
    """Return a given number of bitwise right rotations of an integer n,
    for a given bit field width.
    """
    rotations %= width * 8  # width bytes give 8*bytes bits
    if rotations < 1:
        return n
    mask = mask1(8 * width)  # store the mask
    n &= mask
    return (n >> rotations) | ((n << (8 * width - rotations)) & mask)  # apply the mask to result
That way with key = 0x0f0f0f0f0f123456, you get :
>>> hex(ror(key, 16))
'0x34560f0f0f0f0f12L'
>>> hex(ror(key, 16, 4))
'0x34560f12L'
exactly the same as C output
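To make the 32-bit versus 64-bit point concrete, here is a small Python 3 sketch. It uses a hypothetical helper, ror_bits, that takes the width in bits rather than bytes (my own naming, not the function from the question), and it masks the input first to mimic what happens in C when a 64-bit constant is stored in a 32-bit unsigned long.

```python
def ror_bits(n, rotations, bits):
    """Rotate right inside a field of `bits` bits (width given in bits)."""
    mask = (1 << bits) - 1
    rotations %= bits
    n &= mask  # truncates like assigning to a narrower unsigned C type
    return (n >> rotations) | ((n << (bits - rotations)) & mask)


key = 0x0f0f0f0f0f123456
print(hex(key & 0xFFFFFFFF))       # the part a 32-bit unsigned long keeps
print(hex(ror_bits(key, 16, 32)))  # rotation seen with a 32-bit type
print(hex(ror_bits(key, 16, 64)))  # rotation seen with a 64-bit type
```

The last two lines should reproduce the 0x34560f12 and 0x34560f0f0f0f0f12 values discussed above.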
I know this question is nearly 6 years old.
I always find it easier to use string slices than bitwise operations.
def rotate_left(x, n):
    return int(f"{x:032b}"[n:] + f"{x:032b}"[:n], 2)

def rotate_right(x, n):
    return int(f"{x:032b}"[-n:] + f"{x:032b}"[:-n], 2)
def rotation_value(value, rotations, widht=32):
    """Return a given number of bitwise left or right rotations of an
    integer value, for a given bit-field width (the `widht` parameter).

    A negative `rotations` rotates left; otherwise the rotation is to
    the right.
    """
    if int(rotations) != abs(int(rotations)):
        rotations = widht + int(rotations)
    return (int(value)<<(widht-(rotations%widht)) | (int(value)>>(rotations%widht))) & ((1<<widht)-1)
Related
I'm wondering if there's a way to do a two's complement sign extension as you would in C/C++ in Python, using standard libraries (preferably on a bitarray).
C/C++:
// Example program
#include <iostream>
#include <string>

int main()
{
    int x = 0xFF;
    x <<= (32 - 8);
    x >>= (32 - 8);
    std::cout << x;
    return 0;
}
And here's a Python function I've written which (in my testing) accomplishes the same thing. I'm simply wondering if there's a built-in (or just faster) way of doing it:
def sign_extend(value, bits):
    highest_bit_mask = 1 << (bits - 1)
    remainder = 0
    for i in xrange(bits - 1):
        remainder = (remainder << 1) + 1

    if value & highest_bit_mask == highest_bit_mask:
        value = (value & remainder) - highest_bit_mask
    else:
        value = value & remainder
    return value
The following code delivers the same results as your function, but is a bit shorter. Also, obviously, if you are going to apply this to a lot of data, you can pre-calculate both the masks.
def sign_extend(value, bits):
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)
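As a rough illustration of the "pre-calculate both the masks" remark, the sketch below builds the masks once and returns a function that reuses them. The factory name make_sign_extender is hypothetical, something I made up for this example; the arithmetic is the same as in the two-line version above.

```python
def make_sign_extender(bits):
    """Return a function that sign-extends `bits`-wide values."""
    sign_bit = 1 << (bits - 1)   # e.g. 0x80 for 8 bits
    low_mask = sign_bit - 1      # e.g. 0x7F for 8 bits

    def sign_extend(value):
        # The low bits keep their weight; the sign bit counts as negative.
        return (value & low_mask) - (value & sign_bit)

    return sign_extend


sign_extend8 = make_sign_extender(8)
print(sign_extend8(0xFF), sign_extend8(0x7F), sign_extend8(0x80))
```

Bits above the field width are simply ignored, so the caller does not need to mask the input first.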
I am attempting to implement a CARP hash in Python as described in the following IETF draft:
https://datatracker.ietf.org/doc/html/draft-vinod-carp-v1-03#section-3.1
Specifically:
3.1. Hash Function
The hash function outputs a 32 bit unsigned integers based on a
zero-terminated ASCII input string. The machine name and domain
names of the URL, the protocol, and the machine names of each member
proxy should be evaluated in lower case since that portion of the
URL is case insensitive.
Because irreversibility and strong cryptographic features are
unnecessary for this application, a very simple and fast hash
function based on the bitwise left rotate operator is used.
For (each char in URL):
URL_Hash += _rotl(URL_Hash, 19) + char ;
Member proxy hashes are computed in a similar manner:
For (each char in MemberProxyName):
MemberProxy_Hash += _rotl(MemberProxy_Hash, 19) + char ;
Becaues member names are often similar to each other, their hash
values are further spread across hash space via the following
additional operations:
MemberProxy_Hash += MemberProxy_Hash * 0x62531965 ;
MemberProxy_Hash = _rotl (MemberProxy_Hash, 21) ;
3.2. Hash Combination
Hashes are combined by first exclusive or-ing (XOR) the URL hash by
the machine name and then multiplying by a constant and performing
a bitwise rotation.
All final and intermediate values are 32 bit unsigned integers.
Combined_Hash = (URL_hash ^ MemberProxy_Hash) ;
Combined_Hash += Combined_Hash * 0x62531965 ;
Combined_Hash = _rotl(Combined_Hash, 21) ;
I've tried to use numpy to create 32-bit unsigned integers. The first problem arises when the left bit shift is implemented: numpy automatically recasts the result as a 64-bit integer. The same happens for any arithmetic that would overflow 32 bits.
For example:
from numpy import uint32

def key_hash(data):
    # hash should be a 32-bit unsigned integer
    hashed = uint32()
    for char in data:
        hashed += hashed << 19 + ord(char)
    return hashed

x = key_hash("testkey")
print type(x)
Returns:
<type 'numpy.int64'>
Any tips of how I constrain this all to 32 bit space? Also, I am a bit confused by the spec in how performing some of these operations like "MemberProxy_Hash += MemberProxy_Hash * 0x62531965" will ever fit in 32 bits as it is calculating the hash.
EDIT:
Based upon feedback, it sounds like the right solution would be:
def key_hash(data):
    # hash should be a 32-bit unsigned integer
    hashed = 0
    for char in data:
        hashed += ((hashed << 19) + (hashed >> 13) + ord(char)) & 0xFFFFFFFF
    return hashed

def server_hash(data):
    # hash should be a 32-bit unsigned integer
    hashed = 0
    for char in data:
        hashed += ((hashed << 19) + (hashed >> 13) + ord(char)) & 0xFFFFFFFF
    hashed += (hashed * 0x62531965) & 0xFFFFFFFF
    hashed = ((hashed << 21) + (hashed >> 11)) & 0xFFFFFFFF
    return hashed

def hash_combination(key_hash, server_hash):
    # hash should be a 32-bit unsigned integer
    combined_hash = (key_hash ^ server_hash) & 0xFFFFFFFF
    combined_hash += (combined_hash * 0x62531965) & 0xFFFFFFFF
    return combined_hash
EDIT #2:
Another fixed version.
def rotate_left(x, n, maxbit=32):
    # assumes 32 bit
    x = x & (2 ** maxbit - 1)
    return ((x << n) | (x >> (maxbit - n)))

def key_hash(data):
    # hash should be a 32-bit unsigned integer
    hashed = 0
    for char in data:
        hashed = (hashed + rotate_left(hashed, 19) + ord(char))
    return hashed

def server_hash(data):
    # hash should be a 32-bit unsigned integer
    hashed = 0
    for char in data:
        hashed = (hashed + rotate_left(hashed, 19) + ord(char))
    hashed = hashed + hashed * 0x62531965
    hashed = rotate_left(hashed, 21)
    return hashed

def hash_combination(key_hash, server_hash):
    # hash should be a 32-bit unsigned integer
    combined_hash = key_hash ^ server_hash
    combined_hash = combined_hash + combined_hash * 0x62531965
    return combined_hash & 0xFFFFFFFF
Don't bother with numpy uint32. Just use standard Python int. Constrain the result of operations as necessary by doing result &= 0xFFFFFFFF to remove unwanted high-order bits.
def key_hash(data):
    # hash should be a 32-bit unsigned integer
    hashed = 0
    for char in data:
        # hashed += ((hashed << 19) + ord(char)) & 0xFFFFFFFF
        # the above is wrong; it's not masking the final addition.
        hashed = (hashed + (hashed << 19) + ord(char)) & 0xFFFFFFFF
    return hashed
You could do just one final masking but that would be rather slow on long input as the intermediate hashed would be a rather large number.
By the way, the above would not be a very good hash function. The rot in rotl means rotate, not shift.
You need
# hashed += ((hashed << 19) + (hashed >> 13) + ord(char)) & 0xFFFFFFFF
# the above is wrong; it's not masking the final addition.
hashed = (hashed + (hashed << 19) + (hashed >> 13) + ord(char)) & 0xFFFFFFFF
Edit ... a comparison; this code:
def rotate_left(x, n, maxbit=32):
    # assumes 32 bit
    x = x & (2 ** maxbit - 1)
    return ((x << n) | (x >> (maxbit - n)))

def key_hash(data):
    # hash should be a 32-bit unsigned integer
    hashed = 0
    for char in data:
        hashed = (hashed + rotate_left(hashed, 19) + ord(char))
    return hashed

def khash(data):
    h = 0
    for c in data:
        assert 0 <= h <= 0xFFFFFFFF
        h = (h + (h << 19) + (h >> 13) + ord(c)) & 0xFFFFFFFF
    assert 0 <= h <= 0xFFFFFFFF
    return h

guff = "twas brillig and the slithy toves did whatever"
print "yours: %08X" % key_hash(guff)
print "mine : %08X" % khash(guff)
produces:
yours: A20352DB4214FD
mine : DB4214FD
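Putting the pieces together, here is a minimal Python 3 sketch of the three steps quoted from the draft (sections 3.1 and 3.2), with every intermediate value masked to 32 bits. The function names rotl32, url_hash, member_hash and combined_hash are my own, and the sketch does not lowercase the host part of the URL as the draft asks, so treat it as an illustration rather than a reference implementation.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit value left by n bits; the result is masked to 32 bits."""
    x &= MASK32
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32


def url_hash(url):
    # URL_Hash += _rotl(URL_Hash, 19) + char
    h = 0
    for ch in url:
        h = (h + rotl32(h, 19) + ord(ch)) & MASK32
    return h


def member_hash(name):
    # Same loop as url_hash, then the extra spreading step for member names.
    h = url_hash(name)
    h = (h + h * 0x62531965) & MASK32
    return rotl32(h, 21)


def combined_hash(url_h, member_h):
    # XOR, multiply-add by the constant, then the final rotation.
    h = (url_h ^ member_h) & MASK32
    h = (h + h * 0x62531965) & MASK32
    return rotl32(h, 21)


score = combined_hash(url_hash("http://example.com/"), member_hash("proxy1"))
print(f"{score:08X}")
```

Masking after every addition and multiplication keeps the Python integers small and mirrors the wraparound that 32-bit unsigned arithmetic gives for free in C.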
The following works for me, though maybe a little unpythonic:
from numpy import uint32

def key_hash(data):
    # hash should be a 32-bit unsigned integer
    hashed = uint32()
    for char in data:
        hashed += hashed << uint32(19) + uint32(ord(char))
    return hashed

x = key_hash("testkey")
print type(x)
The problem is that numbers are coerced towards more bits rather than less.
I want to generate 64 bits long int to serve as unique ID's for documents.
One idea is to combine the user's ID, which is a 32 bit int, with the Unix timestamp, which is another 32 bits int, to form an unique 64 bits long integer.
A scaled-down example would be:
Combine two 4-bit numbers 0010 and 0101 to form the 8-bit number 00100101.
Does this scheme make sense?
If it does, how do I do the "concatenation" of numbers in Python?
Left shift the first number by the number of bits in the second number, then add (or bitwise OR - replace + with | in the following examples) the second number.
result = (user_id << 32) + timestamp
With respect to your scaled-down example,
>>> x = 0b0010
>>> y = 0b0101
>>> (x << 4) + y
37
>>> 0b00100101
37
>>>
foo = <some int>
bar = <some int>
foobar = (foo << 32) + bar
This should do it:
(x << 32) + y
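A slightly more defensive sketch of the same shift-and-combine idea, for the 32-bit user ID plus 32-bit timestamp case from the question. The names make_doc_id and split_doc_id are hypothetical, and the range checks are my own addition: without them, a value wider than 32 bits would silently spill into the other half.

```python
def make_doc_id(user_id, timestamp):
    """Pack two unsigned 32-bit integers into one 64-bit integer."""
    for name, value in (("user_id", user_id), ("timestamp", timestamp)):
        if not 0 <= value < 2**32:
            raise ValueError(f"{name} does not fit in 32 bits: {value}")
    return (user_id << 32) | timestamp


def split_doc_id(doc_id):
    """Recover (user_id, timestamp) from a packed ID."""
    return doc_id >> 32, doc_id & 0xFFFFFFFF


doc_id = make_doc_id(0b0010, 0b0101)
print(doc_id, split_doc_id(doc_id))
```

One caveat worth keeping in mind: with a one-second timestamp, two documents created by the same user within the same second would get the same ID, so uniqueness depends on how the IDs are actually issued.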
For the next guy (which in this case was me), here is one way to do it in general (for the scaled-down example):
def combineBytes(*args):
    """
    given the bytes of a multi byte number combine into one
    pass them in least to most significant
    """
    ans = 0
    for i, val in enumerate(args):
        ans += (val << i*4)
    return ans
For other sizes, change the 4 to a 32 or whatever.
>>> bin(combineBytes(0b0101, 0b0010))
'0b100101'
None of the answers before this cover both merging and splitting the numbers. Splitting can be as much a necessity as merging.
NUM_BITS_PER_INT = 4  # Replace with 32, 48, 64, etc. as needed.
MAXINT = (1 << NUM_BITS_PER_INT) - 1

def merge(a, b):
    c = (a << NUM_BITS_PER_INT) | b
    return c

def split(c):
    a = (c >> NUM_BITS_PER_INT) & MAXINT
    b = c & MAXINT
    return a, b

# Test
EXPECTED_MAX_NUM_BITS = NUM_BITS_PER_INT * 2
for a in range(MAXINT + 1):
    for b in range(MAXINT + 1):
        c = merge(a, b)
        assert c.bit_length() <= EXPECTED_MAX_NUM_BITS
        assert (a, b) == split(c)
I have these two functions that I got from some other code:
def ROR(x, n):
    mask = (2L**n) - 1
    mask_bits = x & mask
    return (x >> n) | (mask_bits << (32 - n))

def ROL(x, n):
    return ROR(x, 32 - n)
I wanted to use them in a program where 16-bit rotations are required. However, there are also other functions that require 32-bit rotations, so I wanted to keep 32 as the default, and I ended up with:
def ROR(x, n, bits = 32):
    mask = (2L**n) - 1
    mask_bits = x & mask
    return (x >> n) | (mask_bits << (bits - n))

def ROL(x, n, bits = 32):
    return ROR(x, bits - n)
However, the answers came out wrong when I tested this version. Yet the values come out correctly when the code is:
def ROR(x, n):
    mask = (2L**n) - 1
    mask_bits = x & mask
    return (x >> n) | (mask_bits << (16 - n))

def ROL(x, n,bits):
    return ROR(x, 16 - n)
What is going on, and how do I fix this?
Well, just look at what happens when you call ROL(x, n, 16). It calls ROR(x,16-n), which is equivalent to ROR(x,16-n,32), but what you really wanted was ROR(x, 16-n, 16).
Basically, the implication of @GregS's correct answer is that you need to fix one detail in your second implementation:
def ROL(x, n, bits=32):
    return ROR(x, bits - n, bits)
(I'd make this a comment, but then I couldn't have readably formatted code in it!-).
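For reference, a Python 3 sketch of the pair with the width passed through consistently. The function names and the bits parameter come from the question; masking the input and reducing n modulo bits are my own additions, so that out-of-range arguments do not leak extra high bits into the result.

```python
def ROR(x, n, bits=32):
    """Rotate x right by n within a field of `bits` bits."""
    mask = (1 << bits) - 1
    x &= mask
    n %= bits
    return ((x >> n) | (x << (bits - n))) & mask


def ROL(x, n, bits=32):
    """Rotate left by rotating right the other way round, with the same width."""
    return ROR(x, bits - n, bits)


print(hex(ROL(0x8001, 1, 16)))   # 16-bit rotation
print(hex(ROL(0x80000001, 1)))   # default 32-bit rotation
```

The key point is the third argument in the ROL body: without it, the inner call falls back to the default of 32 regardless of what the caller asked for.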
How do I get the maximum signed short integer in Python (i.e. SHRT_MAX in C's limits.h)?
I want to normalize samples from a single channel of a *.wav file, so instead of a bunch of 16-bit signed integers, I want a bunch of floats between 1 and -1. Here's what I've got (the pertinent code is in the normalized_samples() function):
import struct

def samples(clip, chan_no = 0):
    # *.wav files generally come in 8-bit unsigned ints or 16-bit signed ints
    # python's wave module gives sample width in bytes, so STRUCT_FMT
    # basically converts the wave.samplewidth into a struct fmt string
    STRUCT_FMT = { 1 : 'B',
                   2 : 'h' }
    for i in range(clip.getnframes()):
        yield struct.unpack(STRUCT_FMT[clip.getsampwidth()] * clip.getnchannels(),
                            clip.readframes(1))[chan_no]

def normalized_samples(clip, chan_no = 0):
    for sample in samples(clip, chan_no):
        yield float(sample) / float(32767) ### THIS IS WHERE I NEED HELP
GregS is right, this is not the right way to solve the problem. If your samples are known 8 or 16 bit, you don't want to be dividing them by a number that varies by platform.
You may be running into trouble because a signed 16-bit int actually ranges from -32768 to 32767. Dividing by 32767 is going to give you < -1 in the extreme negative case.
Try this:
yield float(sample + 2**15) / 2**15 - 1.0
Here is a way using cython
getlimit.py
import pyximport; pyximport.install()
import limits
print limits.shrt_max
limits.pyx
import cython
cdef extern from "limits.h":
cdef int SHRT_MAX
shrt_max = SHRT_MAX
in module sys, sys.maxint. Though I'm not sure that is the correct way to solve your problem.
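Note that sys.maxint describes Python's own int rather than C's short, and it no longer exists in Python 3. If you really want the platform's SHRT_MAX without compiling anything, one possible sketch is to derive it from the size that ctypes reports for c_short. The helper name signed_max is hypothetical, and the calculation assumes two's complement integers.

```python
import ctypes


def signed_max(ctype):
    """Largest value of a signed C integer type, assuming two's complement."""
    bits = 8 * ctypes.sizeof(ctype)
    return (1 << (bits - 1)) - 1


SHRT_MAX = signed_max(ctypes.c_short)
SHRT_MIN = -SHRT_MAX - 1
print(SHRT_MAX, SHRT_MIN)
```

On platforms where short is 16 bits, this should print 32767 and -32768.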
I can't imagine circumstances on a modern computer (i.e. one that uses 2's complement integers) where this would fail:
assert -32768 <= signed_16_bit_integer <= 32767
To do exactly what you asked for:
if signed_16_bit_integer >= 0:
    afloat = signed_16_bit_integer / 32767.0
else:
    afloat = signed_16_bit_integer / -32768.0
Having read your code a bit more closely: you have sample_width_in_bytes so just divide by 255 or 256 if it's B and by 32768 if it's h
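A minimal sketch of that idea, keyed on the sample width in bytes as returned by the wave module's getsampwidth(). The function name normalize_sample is hypothetical, and re-centring 8-bit samples around 128 is my own assumption (8-bit WAV data is unsigned); dividing by 32768 for 16-bit samples follows the suggestion above.

```python
def normalize_sample(sample, sample_width):
    """Map a raw PCM sample to a float in [-1.0, 1.0)."""
    if sample_width == 1:
        # Unsigned 8-bit: 0..255, treated here as centred on 128.
        return (sample - 128) / 128.0
    if sample_width == 2:
        # Signed 16-bit: -32768..32767.
        return sample / 32768.0
    raise ValueError("unsupported sample width: %d bytes" % sample_width)


for raw in (-32768, 0, 32767):
    print(raw, normalize_sample(raw, 2))
```

Dividing by 32768 rather than 32767 means the most negative sample maps exactly to -1.0, while the most positive one lands just below 1.0.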
#!/usr/bin/env python2
# maximums.py
####################################333#########################
B16_MAX = (1 << 16) - 1
B15_MAX = (1 << 15) - 1
B08_MAX = (1 << 8) - 1
B07_MAX = (1 << 7) - 1
print
print "hex(B16_MAX) =",hex(B16_MAX) # 0xffff
print "hex(B15_MAX) =",hex(B15_MAX) # 0x7fff
print "hex(B08_MAX) =",hex(B08_MAX) # 0xff
print "hex(B07_MAX) =",hex(B07_MAX) # 0x7f
print
####################################333#########################
UBYTE2_MAX = B16_MAX
SBYTE2_MAX = B15_MAX
UBYTE1_MAX = B08_MAX
SBYTE1_MAX = B07_MAX
print
print "UBYTE2_MAX =",UBYTE2_MAX # 65535
print "SBYTE2_MAX =",SBYTE2_MAX # 32767
print "UBYTE1_MAX =",UBYTE1_MAX # 255
print "SBYTE1_MAX =",SBYTE1_MAX # 127
print
####################################333#########################
USHRT_MAX = UBYTE2_MAX
SHRT_MAX = SBYTE2_MAX
CHAR_MAX = UBYTE1_MAX
BYTE_MAX = SBYTE1_MAX
print
print "USHRT_MAX =",USHRT_MAX # 65535
print " SHRT_MAX =", SHRT_MAX # 32767
print " CHAR_MAX =", CHAR_MAX # 255
print " BYTE_MAX =", BYTE_MAX # 127
print
####################################333#########################