Comparing bit representation of objects in Python


I am watching a video named The Mighty Dictionary which has the following code:
k1 = bits(hash('Monty'))
k2 = bits(hash('Money'))
diff = ('^ '[a==b] for a,b in zip(k1,k2))
print(k1,k2,''.join(diff))
As I understand it, bits is not a built-in function in Python but a helper he wrote himself, similar to format(x, 'b'). Or is it something that existed in Python 2? (I've never written code in Python 2.)
I've tried to accomplish the same, get the bits representation of the strings and check where the bits differ:
k1 = format(hash('Monty'),'b')
k2 = format(hash('Money'),'b')
diff = ('^ ' [a==b] for a,b in zip(k1,k2))
print(k1,'\n',k2,'\n',''.join(diff))
I do get the expected result:
UPDATED
Had to shift the first line by 1 space to match the symbols
110111010100001110100101100000100110111111110001001101111000110
-1000001111101001011101001010101101000111001011011000011110100
^ ^^^ ^ ^^ ^^^ ^^^^^^^ ^ ^^^^^ ^^ ^^ ^^^^^^^ ^ ^ ^^^
Also, the lengths of the bit strings are not the same, whereas I understood that every hash takes the same number of bits no matter the string (64 in my case). But they are 63 and 62 characters long:
print(len(format(hash('Monty'),'b')))
print(len(format(hash('Money'),'b')))
63
62
So, to sum up my question:
Is bits a built-in method in Python2?
Is the following the recommended way to compare the bit representation of an object?
def fn():
    pass
print(format(hash(fn),'b'))
# -111111111111111111111111111111111101111000110001011100000000101
Shouldn't all objects have the same length of bits that represent the object depending on the processor? If I run the following code several times I get these results:
def fn():
    pass
def nf():
    pass
print(format(hash(fn),'b'))
print(format(hash(nf),'b'))
# first time
# 10001001010011010111110000100
# -111111111111111111111111111111111101110110101100101000001000001
# second time
# 10001001010011010111111101010
# 10001001010011010111110000100
# third time
# 10001001010011010111101010001
# -111111111111111111111111111111111101110110101100101000001000001

No, bits is not a built-in function in Python 2 or Python 3.
By default format() doesn't show leading zeroes. Put a zero-padded width in the format string to get a fixed-length result. On a 64-bit build hash values are 64 bits wide, so use 064b (032b is only wide enough on a 32-bit build).
>>> format(hash('Monty'), '064b')
'0001001000100011010010110101101101011000010101011100110001010001'
Another problem you're running into is that hash() can return negative numbers, which format() prints with a minus sign instead of as two's complement bits. Maybe this couldn't happen in Python 2, or his bits() function shows the two's complement bits of the number. You can do this by normalizing the input (64 here is the hash width on a 64-bit build):
def bits(n):
    if n < 0:
        n = 2**64 + n
    return format(n, '064b')
Every time you run the code, you define new fn and nf functions. Different functions will not necessarily have the same hash code, even if they have the same name.
If you don't redefine the functions, you should get the same hash codes each time within a single run.
Hashing strings and numbers just depends on the contents, but hashing more complex objects depends on the specific instance. (In Python 3, string hashes are also salted per process unless PYTHONHASHSEED is set, so they are stable within a run but can change between runs.)
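To tie the points above together, here is a minimal sketch of what a bits() helper might look like. It is not the function from the video; the names bits and diff_marks are my own, and it assumes sys.hash_info.width reports the hash width of your build.

```python
import sys

def bits(n, width=sys.hash_info.width):
    """Return the two's-complement bit string of n, zero-padded to width."""
    # n % 2**width maps a negative n to 2**width + n and leaves others alone.
    return format(n % (1 << width), "0{}b".format(width))

def diff_marks(a, b):
    """Put '^' under every position where the two bit strings differ."""
    return "".join(" " if x == y else "^" for x, y in zip(a, b))

k1 = bits(hash("Monty"))
k2 = bits(hash("Money"))
print(k1)
print(k2)
print(diff_marks(k1, k2))
```

Because both strings now have the same fixed length, the markers line up without shifting anything by hand. The actual bit patterns will differ between runs because of string hash salting.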

Related

Declaring and Looping over a variable in one line

Just for fun, I am trying to compress a programming problem into one line. I know this is typically a bad practice, but it is a fun challenge that I am asking for your help on.
I have a piece of code whose first line declares the variables and whose second line loops over the string built in the first line until a number is no longer found. Finally it returns that value.
The programming question is as follows. Given a sentence, convert each character to its ASCII code. Then convert that code to binary (padding with leading zeros if the binary number has fewer than 8 digits), and combine the numbers into one string. Starting from the number 0, convert it to binary and check if it is in the string. If it is, add one to the number and check again. Return the last consecutive binary number that is in the string.
Ex)
string = "0000010"
0 in string: add 1
1 in string: add 1
10 in string: add 1
11 not in string: the last consecutive binary number was 10 (base 2) = 2 (base 10). Return 2
You can see my code below
def findLastBinary(s: str):
    string, n = ''.join(['0'*(10-len(bin(ord(char))))+bin(ord(char))[2:] for char in s]), 0
    while bin(n)[2:] in string: n+=1
    return n-1
It would also be nice if I could combine the return statement and loop into one line as well.
EDIT
Fixed the code (it should work now). Also below, you will see a sample test case. Hope this helps with answering this question.
Sample test case
Input:
s="Roses and thorns"
Below you will see the steps my code follows to get the correct answer (obviously made more readable)
Organized into columns in the following order:
Character-Ascii-Binary Representation of ascii value:
R - 82 - 01010010
o - 111 - 01101111
s - 115 - 01110011
etc.
Keep in mind that if the binary number has less than 8 digits, zeros should be added to the beginning of the number until it is 8 digits.
Each binary integer is then concatenated into a single string (I added spaces for readability only):
01010010 01101111 01110011 01100101 01110011 00100000 01100001 01101110 01100100 00100000 01110100 01101000 01101111 01110010 01101110 01110011
Now we start from the binary number 0 and check if it is in the string. It is, so we move on to 1. 1 is in the string, so we move on to 10. 10 is in the string. And so we continue until we find that the binary string 11111 is not in our string. 11111 (base 2) = 31 (base 10). Since 31 was the first number whose binary representation was not in the string, we return the last number whose binary representation was in the string: namely, 31-1=30. 30 is what the function should return.

The problem statement has changed. See the bottom of this answer for the updated solution.
The function can be defined this way, thanks to @treuss's observation (this applies to the original problem of finding the largest base 10 integer which, when converted to binary, is in the string):
def largest_binary_number(sentence: str):
    return int(''.join([bin(ord(char))[2:].zfill(8) for char in sentence]), 2)
But suppose that the problem was to "find the smallest base 10 integer larger than 1000 whose binary representation is in the string." Then we have something like this:
def find(sentence: str):
    return list(iter(lambda: globals().__setitem__('_c', globals().get('_c', 1000-1) + 1) or bin(globals().get('_c'))[2:] in ''.join([bin(ord(c))[2:].zfill(8) for c in sentence]), True)) is type or globals().get('_c')
Let's break this down into four parts:
globals().__setitem__('_c', globals().get('_c', 1000-1) + 1) - initialize and increment a counter
... or bin(globals().get('_c'))[2:] in ''.join([bin(ord(c))[2:].zfill(8) for c in sentence]) - check if the binary representation of the counter is in the binary representation of the sentence
list(iter(lambda: ..., True)) - inline while loop using black magic
... is type or globals().get('_c') - get the final value of the counter, which satisfies our condition
Part 1: globals().__setitem__('_c', globals().get('_c', 1000-1) + 1)
Since we are confined to do everything in one line, we don't have the luxury of defining variables. This is where globals comes in: we can store and use arbitrary variables as dictionary entries using the __setitem__ and get methods. Here we name our counter variable _c, calling get to initialize and fetch the value, then immediately increment it by one and save the value with __setitem__. Now we have a counter variable.
Part 2: ... or bin(globals().get('_c'))[2:] in ''.join([bin(ord(c))[2:].zfill(8) for c in sentence])
bin(globals().get('_c'))[2:] converts the counter to binary and removes the 0b prefix. ''.join([bin(ord(c))[2:].zfill(8) for c in sentence]), as before, converts the input sentence to binary. We use in to check if the binary counter is a substring of the binary sentence. Because the __setitem__ call from part 1 returns None, we use or here to ignore that and execute this part.
Part 3: list(iter(lambda: ..., True))
This is the bread and butter, allowing us to perform inline iteration. iter is usually passed an iterable to create an iterator, but it actually has a second form that takes two arguments: a callable and a sentinel. When iterating over an iterator created using this two-argument form, the callable is called repeatedly until it returns the sentinel value (beware infinite loops!). So we define a lambda function that returns True when the condition is satisfied, and set the sentinel to True. Finally we use the list constructor to begin iterating.
Part 4: ... is type or globals().get('_c')
Once the list constructor finishes iterating, we need to fetch and return the final value of the counter. We follow list(...) with is type to make an expression that always evaluates to False, then chain it with or globals().get('_c') at the end of this one-liner to return the counter. Et voilà!
Part 5:
Of course, what we had before was a two-liner.
find = lambda sentence: list(iter(lambda: globals().__setitem__('_c', globals().get('_c', 1000-1) + 1) or bin(globals().get('_c'))[2:] in ''.join([bin(ord(c))[2:].zfill(8) for c in sentence]), True)) is type or globals().get('_c')
Now we have a one-liner.
Note: In hindsight, maybe the walrus := could be used to make the counter, instead of having to call globals() every time. However, replacing globals with locals doesn't work for some reason.
Note 2: Using these techniques, we can make one-liners that satisfy various conditions.
Update: Here's another version using the walrus
find = lambda sentence: (_c := {'v': 1000-1}) and list(iter(lambda: _c.__setitem__('v', _c['v'] + 1) or bin(_c['v'])[2:] in ''.join([bin(ord(c))[2:].zfill(8) for c in sentence]), True)) is type or _c['v']
We initialize the counter at the top level and simply use _c everywhere else. Note how it is a dict instead of an int because outer variables cannot be assigned within the inner lambda (but mutating outer variables is fine).
Update 2: OP has updated the problem statement, so here's the new solution:
find = lambda s: (_c := {'v': 0-1}) and list(iter(lambda: _c.__setitem__('v', _c['v'] + 1) or bin(_c['v'])[2:] in ''.join([bin(ord(c))[2:].zfill(8) for c in s]), False)) is type or _c['v'] - 1
The techniques are the same, but now we start the counter from -1 (the first iteration increments it to 0 before anything else), the sentinel becomes False (because we stop the loop when the binary counter is not in the binary string), and decrement the return value by 1 to get the last number satisfying the condition.
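For comparison, here is a plain multi-line sketch of what the final one-liner computes. The helper names to_bits, last_consecutive and find_last_binary are mine, and it assumes every character code fits in 8 bits (plain ASCII input).

```python
def to_bits(sentence):
    """Concatenate the 8-bit, zero-padded binary form of each character code."""
    return "".join(format(ord(ch), "08b") for ch in sentence)

def last_consecutive(bit_string):
    """Count up from 0 while the counter's binary form occurs in bit_string."""
    n = 0
    while format(n, "b") in bit_string:
        n += 1
    return n - 1  # n is the first counter NOT found, so step back by one

def find_last_binary(sentence):
    return last_consecutive(to_bits(sentence))

print(last_consecutive("0000010"))  # 2, matching the small example in the question
```

The counter, the membership test and the final decrement map one-to-one onto the _c dict, the lambda body and the trailing "- 1" of the one-liner.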

Converting string to binary then xor binary

So I am trying to convert a string to binary then xor the binary by using the following methods
def string_to_binary(s):
    return ' '.join(map(bin,bytearray(s,encoding='utf-8')))
def xor_bin(a,b):
    return int(a,2) ^ int(b,2)
When I try and run the xor_bin function I get the following error:
Exception has occurred: exceptions.ValueError
invalid literal for int() with base 2: '0b1100010 0b1111001 0b1100101 0b1100101 0b1100101'
I can't see what's wrong here.

bin is bad here; it doesn't pad out to eight digits (so you'll lose data alignment whenever the high bit is a 0 and misinterpret all bits to the left of that loss as being lower magnitude than they should be), and it adds a 0b prefix that you don't want. str.format can fix both issues, by zero padding and omitting the 0b prefix (I also removed the space in the joiner string, since you don't want spaces in the result):
def string_to_binary(s):
    return ''.join(map('{:08b}'.format, bytearray(s, encoding='utf-8')))
With that, string_to_binary('byeee') gets you '0110001001111001011001010110010101100101' which is what you want, as opposed to '0b1100010 0b1111001 0b1100101 0b1100101 0b1100101' which is obviously not a (single) valid base-2 integer.
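As a rough end-to-end sketch (Python 3, and assuming the goal is to xor the bit strings of two whole strings), the fixed string_to_binary can be fed straight into the question's xor_bin. The second input "hello" and the zero-padding of the result are my own choices for illustration.

```python
def string_to_binary(s):
    # Each UTF-8 byte becomes exactly eight binary digits, with no '0b' prefix.
    return "".join("{:08b}".format(byte) for byte in s.encode("utf-8"))

def xor_bin(a, b):
    # int(..., 2) parses a base-2 digit string; ^ is bitwise xor on ints.
    return int(a, 2) ^ int(b, 2)

a = string_to_binary("byeee")
b = string_to_binary("hello")
width = max(len(a), len(b))
# Pad the result back to the input width so leading zero bits are not lost.
print(format(xor_bin(a, b), "0{}b".format(width)))
```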

Your question is unclear because you don't show how the two functions you defined were being used when the error occurred, so this answer is a guess.
You can convert a binary string representation of an integer into a Python int (which is stored internally as a binary value) by simply passing it to the int() function, as you're doing in the xor_bin() function. Once you have two int values, you can xor them "in binary" by simply using the ^ operator, which again you seem to know.
This means that xoring the binary string representations of two integers and converting the result back into a binary string representation can be done with one of your functions just as it is. Here's what I mean:
def xor_bin(a, b):
    return int(a, 2) ^ int(b, 2)
s1 = '0b11000101111001110010111001011100101'
s2 = '0b00000000000000000000000000001111111'
# ---------------------------------------
# '0b11000101111001110010111001010011010' expected result of xoring them
result = xor_bin(s1, s2)
print bin(result) # -> 0b11000101111001110010111001010011010

Float converted to 2.dp reverts to original number of decimal places when inserted into a string

I have created the following snippet of code and I am trying to convert my 5 dp DNumber to a 2 dp one and insert this into a string. However, whichever method I try always seems to revert DNumber back to the original number of decimal places (5).
Code snippet below:
if key == (1, 1):
    DNumber = '{r[csvnum]}'.format(r=row)
    # returns 7.65321
    DNumber = """%.2f""" % (float(DNumber))
    # returns 7.65
    Check2 = False
    if DNumber:
        if DNumber <= float(8):
            Check2 = True
    if Check2:
        print DNumber
        # returns 7.65
        string = 'test {r[csvhello]} TESTHERE test'.format(r=row).replace("TESTHERE", str("""%.2f""" % (float(gtpe))))
        # returns: test Hello 7.65321 test
        string = 'test {r[csvhello]} TESTHERE test'.format(r=row).replace("TESTHERE", str(DNumber))
        # returns: test Hello 7.65321 test
What I hoped it would return: test Hello 7.65 test
Any Ideas or suggestion on alternative methods to try?

It seems like you were hoping that converting the float to a 2-decimal-place string and then back to a float would give you a 2-decimal-place float.
The first problem is that your code doesn't actually do that anywhere. If you'd done that, you would get something very close to 7.65, not 7.65321.
But the bigger problem is that what you're trying to do doesn't make any sense. A float always has 53 binary digits, no matter what. If you round it to two decimal digits (no matter how you do it, including by converting to string and back), what you actually get is a float rounded to two decimal digits and then rounded to 53 binary digits. The closest float to 7.65 is not exactly 7.65, but 7.650000000000000355271368.* So, that's what you'd end up with. And there's no way around that; it's inherent to the way float is stored.
However, there is a different type you can use for this: decimal.Decimal. For example:
>>> f = 7.65321
>>> s = '%.2f' % f
>>> d = decimal.Decimal(s)
>>> f, s, d
(7.65321, '7.65', Decimal('7.65'))
Or, of course, you could just pass around a string instead of a float (as you're accidentally doing in your code already), or you could remember to use the .2f format every time you want to output it.
As a side note, since your DNumber ends up as a string, this line is not doing anything useful:
if DNumber <= 8:
In Python 2.x, comparing two values of different types gives you a consistent but arbitrary and meaningless answer. With CPython 2.x, it will always be False.** In a different Python 2.x implementation, it might be different. In Python 3.x, it raises a TypeError.
And changing it to this doesn't help in any way:
if DNumber <= float(8):
Now, instead of comparing a str to an int, you're comparing a str to a float. This is exactly as meaningless, and follows the exact same rules. (Also, float(8) means the same thing as 8.0, but less readable and potentially slower.)
For that matter, this:
if DNumber:
… is always going to be true. For a number, if foo checks whether it's non-zero. That's a bad idea for float values (you should check whether it's within some absolute or relative error range of 0). But again, you don't have a float value; you have a str. And for strings, if foo checks whether the string is non-empty. So, even if you started off with 0, your string "0.00" is going to be true.
* I'm assuming here that you're using CPython, on a platform that uses IEEE-754 double for its C double type, and that all those extra conversions back and forth between string and float aren't introducing any additional errors.
** The rule is, slightly simplified: If you compare two numbers, they're converted to a type that can hold them both; otherwise, if either value is None it's smaller; otherwise, if either value is a number, it's smaller; otherwise, whichever one's type has an alphabetically earlier name is smaller.

I think you're trying to do the following - combine the formatting with the getter:
>>> a = 123.456789
>>> row = {'csvnum': a}
>>> print 'test {r[csvnum]:.2f} hello'.format(r=row)
test 123.46 hello
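In Python 3 terms, a small sketch of the two options discussed above might look like this. The row dict is a hypothetical stand-in for the CSV row in the question; the idea is to keep the value numeric for comparisons and only round when formatting, or to use Decimal when the stored value itself must have two places.

```python
from decimal import Decimal, ROUND_HALF_UP

row = {"csvnum": "7.65321", "csvhello": "Hello"}  # hypothetical CSV row

# Keep a real float for comparisons; round only at output time.
value = float(row["csvnum"])
if value <= 8:
    text = "test {r[csvhello]} {v:.2f} test".format(r=row, v=value)
    print(text)  # test Hello 7.65 test

# If the stored value must have exactly two decimal places, use Decimal.
exact = Decimal(row["csvnum"]).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(exact)  # 7.65
```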

If your number is a 7 followed by five digits, you might want to try:
print "%r" % float(str(x)[:4])
where x is the float in question.
Example:
>>> x = 1.11111
>>> print "%r" % float(str(x)[:4])
1.11

Length of hexadecimal number

How can we get the length of a hexadecimal number in the Python language?
I tried using this code but even this is showing some error.
i = 0
def hex_len(a):
    if a > 0x0:
        # i = 0
        i = i + 1
        a = a/16
    return i
b = 0x346
print(hex_len(b))
Here I just used 346 as the hexadecimal number, but my actual numbers are very big to be counted manually.

Use the function hex:
>>> b = 0x346
>>> hex(b)
'0x346'
>>> len(hex(b))-2
3
or using string formatting:
>>> len("{:x}".format(b))
3

While using the string representation as an intermediate result has some merit in simplicity, it wastes some time and memory. I'd prefer a mathematical solution (returning the pure number of digits without any 0x prefix):
from math import ceil, log
def numberLength(n, base=16):
    return ceil(log(n+1)/log(base))
The +1 adjustment takes care of the fact that an exact power of your number base needs a leading "1".
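One caveat with the logarithm approach is that log works in floating point, so for very large integers or values right at a power of 16 the result can be off by one. A sketch of an integer-only alternative, assuming non-negative input and counting 0 as one digit, uses int.bit_length (the function name hex_len is just borrowed from the question):

```python
def hex_len(n):
    """Number of hexadecimal digits in a non-negative integer (0 counts as 1)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Each hex digit covers 4 bits, so round the bit count up to a multiple of 4.
    return max(1, (n.bit_length() + 3) // 4)

print(hex_len(0x346))  # 3
```

This stays exact for arbitrarily large integers because no float is involved.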

As Ashwini wrote, the hex function does the hard work for you:
hex(x)
Convert an integer number (of any size) to a hexadecimal string. The result is a valid Python expression.

How to convert a string representing a binary fraction to a number in Python

Let us suppose that we have a string representing a binary fraction such as:
".1"
As a decimal number this is 0.5. Is there a standard way in Python to go from such strings to a number type (whether it is binary or decimal is not strictly important).
For an integer, the solution is straightforward:
>>> int("101", 2)
5
int() takes an optional second argument to provide the base, but float() does not.
I am looking for something functionally equivalent (I think) to this:
def frac_bin_str_to_float(num):
    """Assuming num to be a string representing
    the fractional part of a binary number with
    no integer part, return num as a float."""
    result = 0
    ex = 2.0
    for c in num:
        if c == '1':
            result += 1/ex
        ex *= 2
    return result
I think that does what I want, although I may well have missed some edge cases.
Is there a built-in or standard method of doing this in Python?

The following is a shorter way to express the same algorithm:
def parse_bin(s):
    return int(s[1:], 2) / 2.**(len(s) - 1)
It assumes that the string starts with the dot. If you want something more general, the following will handle both the integer and the fractional parts:
def parse_bin(s):
    t = s.split('.')
    return int(t[0], 2) + int(t[1], 2) / 2.**len(t[1])
For example:
In [56]: parse_bin('10.11')
Out[56]: 2.75

It is reasonable to suppress the point instead of splitting on it, as follows. This bin2float function (unlike parse_bin in previous answer) correctly deals with inputs without points (except for returning an integer instead of a float in that case).
For example, the invocations bin2float('101101'), bin2float('.11101'), and bin2float('101101.11101') return 45, 0.90625, and 45.90625 respectively.
def bin2float (b):
    s, f = b.find('.')+1, int(b.replace('.',''), 2)
    return f/2.**(len(b)-s) if s else f

You could actually generalize James's code to convert from any number system if you replace the hard-coded '2' with that base.
def str2float(s, base=10):
    dot, f = s.find('.') + 1, int(s.replace('.', ''), base)
    return f / float(base)**(len(s) - dot) if dot else f
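If exactness matters more than getting a float back, one possible variation (a sketch, with the name parse_radix being my own) is to return a fractions.Fraction, which avoids any rounding until you explicitly call float() on it. It uses str.partition so that inputs with or without a point are handled the same way.

```python
from fractions import Fraction

def parse_radix(s, base=2):
    """Parse 'int.frac' digits in the given base into an exact Fraction."""
    whole, _, frac = s.partition(".")
    digits = whole + frac
    if not digits:
        raise ValueError("no digits in {!r}".format(s))
    # Dropping the point scales the value by base**len(frac); divide it back.
    return Fraction(int(digits, base), base ** len(frac))

print(float(parse_radix("10.11")))  # 2.75
```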

You can use the Binary fractions package. With this package you can convert binary-fraction strings into floats and vice-versa.
Example:
>>> from binary_fractions import Binary
>>> float(Binary("0.1"))
0.5
>>> str(Binary(0.5))
'0b0.1'
It has many more helper functions to manipulate binary strings such as: shift, add, fill, to_exponential, invert...
PS: Shameless plug, I'm the author of this package.
