Python: format negative number with parentheses - python

Is there a way to use either string interpolation or string.format to render negative numbers into text formatted using parentheses instead of "negative signs"?
I.e. -3.14 should be (3.14).
I had hoped to do this using string interpolation or string.format rather than needing an import specifically designed for currencies or accounting.
Edit to clarify: Please assume the variable to be formatted is either an int or a float. I.e. while this can be done with regular expressions (see good answers below), I was thinking this would be a more native operation for Python's formatting functionality.
So to be clear:
import numpy as np
list_of_inputs = [-10, -10.5, -10 * np.sqrt(2), 10, 10.5, 10 * np.sqrt(2)]
for i in list_of_inputs:
# your awesome solution goes here
should return:
(10)
(10.5)
(14.14)
10
10.5
14.14
Clearly there is some flexibility about that last one. I had hoped the "put negative numbers in parentheses" would be a natural argument of string interpolation or string.format so that I could use other formatting language while setting the display style of negative numbers.

If you just need to handle possibly-negative numeric input:
print '{0:.2f}'.format(num) if num>=0 else '({0:.2f})'.format(abs(num))
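A Python 3 sketch of the same idea, wrapped in a helper so that any ordinary format spec can be reused. The name `paren_format` and its `spec` parameter are hypothetical, chosen for illustration; this is not a built-in formatting option.

```python
def paren_format(value, spec=""):
    """Format a number, wrapping negative values in parentheses.

    `spec` is an ordinary format spec such as ".2f"; both names are illustrative.
    """
    body = format(abs(value), spec)
    return "({})".format(body) if value < 0 else body


for value in (-10, -10.5, 10, 10.5):
    print(paren_format(value))
print(paren_format(-3.14159, ".2f"))
```

Because the sign is removed before formatting, the spec only ever sees a non-negative number, so width and precision behave as usual.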

This is what subclassing the formatter class is for. Try the following:
import string
class NegativeParenFormatter(string.Formatter):
    def format_field(self, value, format_spec):
        try:
            if value < 0:
                return "(" + string.Formatter.format_field(self, -value, format_spec) + ")"
            else:
                return string.Formatter.format_field(self, value, format_spec)
        except TypeError:
            # Values that cannot be compared with 0 (e.g. strings on Python 3) are formatted unchanged.
            return string.Formatter.format_field(self, value, format_spec)
f = NegativeParenFormatter()
print(f.format("{0} is positive, {1} is negative, {2} is a string", 3, -2, "-4"))
This prints:
3 is positive, (2) is negative, -4 is a string

Pandas has a display option for floats and numpy has a display option for any dtype:
In [11]: df = pd.DataFrame([[1., -2], [-3., 4]], columns=['A', 'B'])
Note: A is a float column, B is an int column.
We can just write a simple formatter depending on the sign of the number:
In [12]: formatter = lambda x: '(%s)' % str(x)[1:] if x < 0 else str(x)
In [13]: pd.options.display.float_format = formatter
In [14]: df  # doesn't work for the int column :(
Out[14]:
       A  B
0    1.0 -2
1  (3.0)  4
In [15]: df.astype(float)
Out[15]:
       A      B
0    1.0  (2.0)
1  (3.0)    4.0
You can also configure numpy's print options:
In [21]: df.values  # float
Out[21]:
array([[ 1., -2.],
       [-3.,  4.]])
In [22]: df['B'].values  # int
Out[22]: array([-2,  4])
In [23]: np.set_printoptions(formatter={'int': formatter, 'float': formatter})
In [24]: df.values  # float
Out[24]:
array([[1.0, (2.0)],
       [(3.0), 4.0]])
In [25]: df['B'].values  # int
Out[25]: array([(2), 4])
Note: this doesn't change the way the data is stored, just the way you view it.

Your easiest approach would be to use a ternary (conditional) expression.
import math
num = -3.14
output = "({})".format(math.fabs(num)) if num < 0 else "{}".format(num)
The same expression also works directly inside a print call; this was confirmed on both Python 2.x and 3.x (thanks to LartS for the 3.x confirmation):
print("({})".format(math.fabs(num)) if num < 0 else "{}".format(num))

Maybe you're looking for something like this
value = -3.14  # renamed from `float`, which would shadow the built-in type
num = "(%(key)s)" % {'key': str(abs(value))} if value < 0 else str(value)

You can use conditionals in a Python print statement:
print "%s%d%s" % ( "(" if (i<0) else(""), i, ")" if (i<0) else("") )

Related

Zylabs 7.14 Lab: Reverse binary [duplicate]

Are there any canned Python methods to convert an Integer (or Long) into a binary string in Python?
There are a myriad of dec2bin() functions out on Google... But I was hoping I could use a built-in function / library.
Python's string format method can take a format spec.
>>> "{0:b}".format(37)
'100101'
Format spec docs for Python 2
Format spec docs for Python 3
If you're looking for bin() as an equivalent to hex(), it was added in python 2.6.
Example:
>>> bin(10)
'0b1010'
Python actually does have something already built in for this, the ability to do operations such as '{0:b}'.format(42), which will give you the bit pattern (in a string) for 42, or 101010.
For a more general philosophy, no language or library will give its user base everything that they desire. If you're working in an environment that doesn't provide exactly what you need, you should be collecting snippets of code as you develop to ensure you never have to write the same thing twice. Such as, for example, the pseudo-code:
define intToBinString, receiving intVal:
    if intVal is equal to zero:
        return "0"
    set strVal to ""
    while intVal is greater than zero:
        if intVal is odd:
            prefix "1" to strVal
        else:
            prefix "0" to strVal
        divide intVal by two, rounding down
    return strVal
which will construct your binary string based on the decimal value. Just keep in mind that's a generic bit of pseudo-code which may not be the most efficient way of doing it though, with the iterations you seem to be proposing, it won't make much difference. It's really just meant as a guideline on how it could be done.
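For reference, the pseudo-code above translates to Python roughly as follows. This is a sketch for non-negative integers only; the function name `int_to_bin_string` is just a snake_case rendering of the pseudo-code's name, and the negative-input check is an addition.

```python
def int_to_bin_string(int_val):
    """Return the binary digits of a non-negative integer as a string."""
    if int_val < 0:
        raise ValueError("this sketch handles non-negative integers only")
    if int_val == 0:
        return "0"
    digits = []
    while int_val > 0:
        # Collect the lowest bit first, then drop it.
        digits.append("1" if int_val % 2 else "0")
        int_val //= 2
    # Bits were collected lowest-first, so reverse them for display.
    return "".join(reversed(digits))


print(int_to_bin_string(42))  # 101010
```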
The general idea is to use code from (in order of preference):
the language or built-in libraries.
third-party libraries with suitable licenses.
your own collection.
something new you need to write (and save in your own collection for later).
If you want a textual representation without the 0b-prefix, you could use this:
get_bin = lambda x: format(x, 'b')
print(get_bin(3))
>>> '11'
print(get_bin(-3))
>>> '-11'
When you want a n-bit representation:
get_bin = lambda x, n: format(x, 'b').zfill(n)
>>> get_bin(12, 32)
'00000000000000000000000000001100'
>>> get_bin(-12, 32)
'-00000000000000000000000000001100'
Alternatively, if you prefer having a function:
def get_bin(x, n=0):
    """
    Get the binary representation of x.
    Parameters
    ----------
    x : int
    n : int
        Minimum number of digits. If x needs fewer digits in binary, the rest
        is filled with zeros.
    Returns
    -------
    str
    """
    return format(x, 'b').zfill(n)
I am surprised there is no mention of a nice way to accomplish this using formatting strings that are supported in Python 3.6 and higher. TLDR:
>>> number = 1
>>> f'0b{number:08b}'
'0b00000001'
Longer story
This is functionality of formatting strings available from Python 3.6:
>>> x, y, z = 1, 2, 3
>>> f'{x} {y} {2*z}'
'1 2 6'
You can request binary as well:
>>> f'{z:b}'
'11'
Specify the width:
>>> f'{z:8b}'
'      11'
Request zero padding:
>>> f'{z:08b}'
'00000011'
And add common prefix to signify binary number:
>>> f'0b{z:08b}'
'0b00000011'
You can also let Python add the prefix for you but I do not like it so much as the version above because you have to take the prefix into width consideration:
>>> f'{z:#010b}'
'0b00000011'
More info is available in official documentation on Formatted string literals and Format Specification Mini-Language.
As a reference:
def toBinary(n):
    return ''.join(str(1 & int(n) >> i) for i in range(64)[::-1])
This function can convert a positive integer as large as 18446744073709551615, represented as string '1111111111111111111111111111111111111111111111111111111111111111'.
It can be modified to serve a much larger integer, though it may not be as handy as "{0:b}".format() or bin().
This is for python 3 and it keeps the leading zeros !
print(format(0, '08b'))
A simple way to do that is to use string formatting.
>>> "{0:b}".format(10)
'1010'
And if you want to have a fixed length of the binary string, you can use this:
>>> "{0:{fill}8b}".format(10, fill='0')
'00001010'
If two's complement is required, then the following line can be used:
'{0:{fill}{width}b}'.format((x + 2**n) % 2**n, fill='0', width=n)
where n is the width of the binary string.
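As a sketch, the one-liner above can be wrapped in a function that also rejects values that do not fit in n bits. The name `twos_complement` and the range check are additions for illustration; masking with `(1 << n) - 1` gives the same result as the `(x + 2**n) % 2**n` expression.

```python
def twos_complement(x, n):
    """Return the n-bit two's complement representation of x as a string."""
    if not -(1 << (n - 1)) <= x < (1 << (n - 1)):
        raise ValueError("{} does not fit in {} signed bits".format(x, n))
    # Masking keeps the low n bits, which maps a negative x to x + 2**n.
    return format(x & ((1 << n) - 1), "0{}b".format(n))


print(twos_complement(10, 8))   # 00001010
print(twos_complement(-10, 8))  # 11110110
```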
A one-liner with lambda (note the floor division //; with / the recursion never reaches 0 on Python 3):
>>> binary = lambda n: '' if n==0 else binary(n//2) + str(n%2)
test:
>>> binary(5)
'101'
EDIT:
but then :(
from time import time
t1 = time()
for i in range(1000000):
    binary(i)
t2 = time()
print(t2 - t1)
# 6.57236599922
compared to
t1 = time()
for i in range(1000000):
    '{0:b}'.format(i)
t2 = time()
print(t2 - t1)
# 0.68017411232
As the preceding answers mostly used format(),
here is an f-string implementation.
integer = 7
bit_count = 5
print(f'{integer:0{bit_count}b}')
Output:
00111
For convenience here is the python docs link for formatted string literals: https://docs.python.org/3/reference/lexical_analysis.html#f-strings.
Summary of alternatives:
n=42
assert "-101010" == format(-n, 'b')
assert "-101010" == "{0:b}".format(-n)
assert "-101010" == (lambda x: x >= 0 and str(bin(x))[2:] or "-" + str(bin(x))[3:])(-n)
assert "0b101010" == bin(n)
assert "101010" == bin(n)[2:] # But this won't work for negative numbers.
Contributors include John Fouhy, Tung Nguyen, mVChr, Martin Thoma, and Martijn Pieters.
>>> format(123, 'b')
'1111011'
For those of us who need to convert signed integers (range -2**(digits-1) to 2**(digits-1)-1) to 2's complement binary strings, this works:
def int2bin(integer, digits):
    if integer >= 0:
        return bin(integer)[2:].zfill(digits)
    else:
        return bin(2**digits + integer)[2:]
This produces:
>>> int2bin(10, 8)
'00001010'
>>> int2bin(-10, 8)
'11110110'
>>> int2bin(-128, 8)
'10000000'
>>> int2bin(127, 8)
'01111111'
You can do it like this:
bin(10)[2:]
or, to collect the individual bits as integers:
f = bin(10)  # bin() already returns a str
c = [int(ch) for ch in f[2:]]
print(c)  # [1, 0, 1, 0]
Using numpy pack/unpackbits, they are your best friends.
Examples
--------
>>> a = np.array([[2], [7], [23]], dtype=np.uint8)
>>> a
array([[ 2],
[ 7],
[23]], dtype=uint8)
>>> b = np.unpackbits(a, axis=1)
>>> b
array([[0, 0, 0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 0, 1, 1, 1],
[0, 0, 0, 1, 0, 1, 1, 1]], dtype=uint8)
Yet another solution with another algorithm, by using bitwise operators.
def int2bin(val):
    res = ''
    while val > 0:
        res += str(val & 1)
        val = val >> 1  # val = val // 2
    return res[::-1]  # reverse the string
A faster version without reversing the string.
def int2bin(val):
    res = ''
    while val > 0:
        res = chr((val & 1) + 0x30) + res
        val = val >> 1
    return res
numpy.binary_repr(num, width=None)
Examples from the documentation link above:
>>> np.binary_repr(3)
'11'
>>> np.binary_repr(-3)
'-11'
>>> np.binary_repr(3, width=4)
'0011'
The two’s complement is returned when the input number is negative and width is specified:
>>> np.binary_repr(-3, width=3)
'101'
>>> np.binary_repr(-3, width=5)
'11101'
The accepted answer didn't address negative numbers, which I'll cover.
In addition to the answers above, you can also just use the bin and hex functions. And in the opposite direction, use binary notation:
>>> bin(37)
'0b100101'
>>> 0b100101
37
But with negative numbers, things get a bit more complicated. The question doesn't specify how you want to handle negative numbers.
Python just adds a negative sign so the result for -37 would be this:
>>> bin(-37)
'-0b100101'
In computer/hardware binary data, negative signs don't exist. All we have is 1's and 0's. So if you're reading or producing binary streams of data to be processed by other software/hardware, you need to first know the notation being used.
One notation is sign-magnitude notation, where the first bit represents the negative sign, and the rest is the actual value. In that case, -37 would be 0b1100101 and 37 would be 0b0100101. This looks like what python produces, but just add a 0 or 1 in front for positive / negative numbers.
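A minimal sketch of sign-magnitude formatting, assuming a fixed total width that includes the sign bit. The helper name `sign_magnitude` is hypothetical.

```python
def sign_magnitude(value, width):
    """Return `value` as a sign bit followed by width - 1 magnitude bits."""
    magnitude = format(abs(value), "0{}b".format(width - 1))
    if len(magnitude) > width - 1:
        raise ValueError("magnitude does not fit in the requested width")
    return ("1" if value < 0 else "0") + magnitude


print(sign_magnitude(-37, 7))  # 1100101
print(sign_magnitude(37, 7))   # 0100101
```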
More common is Two's complement notation, which seems more complicated and the result is very different from python's string formatting. You can read the details in the link, but with an 8bit signed integer -37 would be 0b11011011 and 37 would be 0b00100101.
Python has no easy way to produce these binary representations. You can use numpy to turn Two's complement binary values into python integers:
>>> import numpy as np
>>> np.int8(0b11011011)
-37
>>> np.uint8(0b11011011)
219
>>> np.uint8(0b00100101)
37
>>> np.int8(0b00100101)
37
But I don't know an easy way to do the opposite with builtin functions. The bitstring package can help though.
>>> from bitstring import BitArray
>>> arr = BitArray(int=-37, length=8)
>>> arr.uint
219
>>> arr.int
-37
>>> arr.bin
'11011011'
>>> BitArray(bin='11011011').int
-37
>>> BitArray(bin='11011011').uint
219
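For comparison, the same round trip can be sketched with built-ins only, assuming the bit width is known in advance. The helper name `from_twos_complement` is hypothetical.

```python
def from_twos_complement(bits):
    """Interpret a string of 0s and 1s as a signed two's complement integer."""
    value = int(bits, 2)
    if bits[0] == "1":
        # The top bit carries weight -2**(width - 1), so subtract 2**width.
        value -= 1 << len(bits)
    return value


print(from_twos_complement("11011011"))  # -37
print(from_twos_complement("00100101"))  # 37
print(format(-37 & 0xFF, "08b"))         # 11011011
```

The last line goes the other way: masking with 0xFF keeps the low 8 bits of -37, which is its 8-bit two's complement pattern.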
Python 3.6 added a new string formatting approach called formatted string literals or “f-strings”.
Example:
name = 'Bob'
number = 42
f"Hello, {name}, your number is {number:>08b}"
Output will be 'Hello, Bob, your number is 00101010'
Unless I'm misunderstanding what you mean by binary string I think the module you are looking for is struct
n = int(input())  # input() returns a str on Python 3, so convert it first
print(bin(n).replace("0b", ""))
def binary(decimal):
    otherBase = ""
    while decimal != 0:
        otherBase = str(decimal % 2) + otherBase
        decimal //= 2
    return otherBase
print(binary(10))
output:
1010
Here is the code I've just implemented. This is not a method, but you can use it as a ready-to-use function!
def inttobinary(number):
    if number == 0:
        return str(0)
    result = ""
    while number != 0:
        remainder = number % 2
        number = number // 2  # floor division; / would produce a float on Python 3
        result += str(remainder)
    return result[::-1]  # to invert the string
Calculator with all necessary functions for DEC, BIN, HEX
(made and tested with Python 3.5).
You can change the input test numbers and get the converted ones.
# CONVERTER: DEC / BIN / HEX
def dec2bin(d):
    # dec -> bin
    b = bin(d)
    return b
def dec2hex(d):
    # dec -> hex
    h = hex(d)
    return h
def bin2dec(b):
    # bin -> dec
    bin_numb = "{0:b}".format(b)
    d = int(bin_numb, 2)  # parse the digits as base 2 instead of eval()-ing them
    return d, bin_numb
def bin2hex(b):
    # bin -> hex
    h = hex(b)
    return h
def hex2dec(h):
    # hex -> dec
    d = int(h)
    return d
def hex2bin(h):
    # hex -> bin
    b = bin(h)
    return b
## TESTING NUMBERS
numb_dec = 99
numb_bin = 0b0111
numb_hex = 0xFF
## CALCULATIONS
res_dec2bin = dec2bin(numb_dec)
res_dec2hex = dec2hex(numb_dec)
res_bin2dec, bin_numb = bin2dec(numb_bin)
res_bin2hex = bin2hex(numb_bin)
res_hex2dec = hex2dec(numb_hex)
res_hex2bin = hex2bin(numb_hex)
## PRINTING
print('------- DECIMAL to BIN / HEX -------\n')
print('decimal:', numb_dec, '\nbin: ', res_dec2bin, '\nhex: ', res_dec2hex, '\n')
print('------- BINARY to DEC / HEX -------\n')
print('binary: ', bin_numb, '\ndec: ', res_bin2dec, '\nhex: ', res_bin2hex, '\n')
print('----- HEXADECIMAL to BIN / DEC -----\n')
print('hexadec:', hex(numb_hex), '\nbin: ', res_hex2bin, '\ndec: ', res_hex2dec, '\n')
Somewhat similar solution
def to_bin(dec):
    flag = True
    bin_str = ''
    while flag:
        remainder = dec % 2
        quotient = dec // 2  # floor division; / would produce a float on Python 3
        if quotient == 0:
            flag = False
        bin_str += str(remainder)
        dec = quotient
    bin_str = bin_str[::-1]  # reverse the string
    return bin_str
Here is a simple solution using the divmod() function, which returns the quotient of an integer division together with the remainder.
def dectobin(number):
    bits = ''  # renamed from `bin` to avoid shadowing the built-in
    while number >= 1:
        number, rem = divmod(number, 2)
        bits = str(rem) + bits  # prepend, so the most significant bit ends up first
    return bits
Here's yet another way using regular math, no loops, only recursion. (Trivial case 0 returns nothing).
def toBin(num):
    if num == 0:
        return ""
    return toBin(num//2) + str(num%2)
print([(toBin(i)) for i in range(10)])
['', '1', '10', '11', '100', '101', '110', '111', '1000', '1001']
To print the binary representation of a number:
print("Binary is {0:>08b}".format(16))
To print the hexadecimal representation of a number:
print("Hexadecimal is {0:>0x}".format(15))
To print the binary representation of every number up to 16:
for i in range(17):
    print("{0:>2}: binary is {0:>08b}".format(i))
To print the hexadecimal representation of every number up to 16:
for i in range(17):
    print("{0:>2}: hexadecimal is {0:>0x}".format(i))
## two digits are enough for the hexadecimal representation of these numbers
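try:
    while True:
        p = ""
        a = int(input())  # input() returns a str on Python 3
        while a != 0:
            l = a % 2
            b = a - l
            a = b // 2  # floor division keeps a an integer
            p = str(l) + p
        print(p)
except (ValueError, EOFError):
    print("write 1 number")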
I found a method using matrix operations to convert decimal to binary.
import numpy as np
E_mat = np.tile(E, [1, M])
M_order = pow(2, (M - 1 - np.array(range(M)))).T
bindata = np.remainder(np.floor(E_mat / M_order).astype(int), 2)  # np.int was removed in NumPy 1.24; use the built-in int
E is the input decimal data and M is the number of binary digits. bindata is the output binary data, in the form of a 1-by-M binary matrix.

Logical OR for Bit-string in Python

What I want to do is compute the logical OR of two bit-strings. For example:
a='010010'
b='000101'
c=LOGIC_OR(a,b)
c
010111
The error I encounter most of the time is that when I convert 'b' from string to binary, the leading zeros are removed. Other methods I have tried convert 'a' and 'b' to integers. So far nothing is working, and help would be much appreciated.
Thanks in advance
You can convert them to integers with int, specifying the base to be 2. Then perform a bitwise OR operation and convert the result to a bit string with bin.
>>> c = int(a, 2) | int(b, 2)
>>> c
23
If you want to print the result as a bit string, use str.format. If you're on python-3.6, you can also use f-strings.
>>> '{:b}'.format(c)
'10111'
>>> print(f"{c:b}")
10111
To capture leading zeros with respect to a/b, use str.zfill -
>>> f"{c:b}".zfill(len(a))
'010111'
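Putting these steps together, a small helper might look like the sketch below. The name `bit_string_or` is hypothetical; it pads the result to the longer of the two inputs.

```python
def bit_string_or(a, b):
    """Bitwise OR of two bit strings, preserving leading zeros."""
    width = max(len(a), len(b))
    result = int(a, 2) | int(b, 2)
    return format(result, "b").zfill(width)


print(bit_string_or("010010", "000101"))  # 010111
```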
Here are a couple of alternative methods.
Third-party bitarray library:
from bitarray import bitarray
a='010010'
b='000101'
logical_or_bitarray = bitarray(a) | bitarray(b) # output: bitarray('010111')
logical_or_string = ''.join(map(str, map(int, logical_or_bitarray))) # output: '010111'
Python strings:
a = '010010'
b = '000101'
def compare_bits(A, B):
    # Parse both strings as base-2 integers, OR them, then pad back to the original width.
    c_1 = format(int(A, 2) | int(B, 2), 'b')
    return c_1.zfill(len(A))
compare_bits(a, b)  # '010111'
You should convert to int objects and do numerical operations in the numerical data type. Then you use string-formatting when you need to see it. If you have Python 3.6, using f-strings makes this trivial:
>>> a='010010'
>>> b='000101'
>>> a = int(a, base=2) # we should be ints
>>> b = int(b, base=2) # we should be ints
>>> c = a | b # operations natural and built in
>>> print(f"{c:b}") # use formatting when you need it
10111
Read the string formatting spec's. You can make them do whatever you desire. Using a fill value of '0' and a width of '6':
>>> print(f"{c:0>6b}")
010111
And this is cool too:
>>> pad='0'
>>> width = 6
>>> print(f"{c:{pad}>{width}b}")
010111

Matlab's vectorized sprintf like function in python

After using Matlab for some time I grew quite fond of its sprintf function, which is vectorized (vectorization is the crucial part of the question).
Assuming one has a list li = [1, 2, 3, 4, 5, 6],
sprintf("%d %d %d\n", li)
would apply the format on the elements in li one after another returning
"1 2 3\n4 5 6\n"
as string.
My current solution does not strike me as very Pythonic:
def my_sprintf(formatstr, args):
    # number of arguments for format string:
    n = formatstr.count('%')
    res = ""
    # if there are k*n+m elements in the list, leave the last m out
    for i in range(n, len(args) + 1, n):
        res += formatstr % tuple(args[i-n:i])
    return res
What would be the usual/better way of doing it in python?
Would it be possible, without explicitly eliciting the number of expected parameters from the format string (n=formatstr.count('%') feels like a hack)?
PS: For the sake of simplicity one could assume, that the number of elements in the list is a multiple of number of arguments in the format string.
You could use a variation of the grouper recipe if you get the user to pass in the chunk size.
def sprintf(iterable, fmt, n):
    args = zip(*[iter(iterable)] * n)
    return "".join([fmt % t for t in args])
Output:
In [144]: sprintf(li,"%.2f %.2f %d\n", 3)
Out[144]: '1.00 2.00 3\n4.00 5.00 6\n'
In [145]: sprintf(li,"%d %d %d\n", 3)
Out[145]: '1 2 3\n4 5 6\n'
You could handle the case where the list size is not a multiple of the chunk size using zip_longest (izip_longest on Python 2) and str.format, but it would not let you specify the types without erroring:
from itertools import zip_longest  # Python 2: from itertools import izip_longest
def sprintf(iterable, fmt, n, fillvalue=""):
    args = zip_longest(*[iter(iterable)] * n, fillvalue=fillvalue)
    return "".join([fmt.format(*t) for t in args])
If you split the placeholders or get the user to pass an iterable of placeholders you could catch all the potential issues.
def sprintf(iterable, fmt, sep=" "):
    obj = object()  # sentinel marking the padding in the last, incomplete chunk
    args = zip_longest(*[iter(iterable)] * len(fmt), fillvalue=obj)
    return "".join(["{sep}".join([f % i for f, i in zip(fmt, t) if i is not obj]).format(sep=sep) + "\n"
                    for t in args])
Demo:
In [165]: sprintf(li, ["%.2f", "%d", "%.2f", "%2.f"])
Out[165]: '1.00 2 3.00 4\n5.00 6\n'
In [166]: sprintf(li, ["%d", "%d", "%d"])
Out[166]: '1 2 3\n4 5 6\n'
In [167]: sprintf(li, ["%f", "%f", "%.4f"])
Out[167]: '1.000000 2.000000 3.0000\n4.000000 5.000000 6.0000\n'
In [168]: sprintf(li, ["%.2f", "%d", "%.2f", "%2.f"])
Out[168]: '1.00 2 3.00 4\n5.00 6\n'
You may want to remove the += in the for loop. The following version is approximately three times faster than yours. It also works even in cases where you want to print the % symbol in the output. Therefore, the format string contains '%%'.
def my_sprintf(format_str, li):
    n = format_str.count('%') - 2*format_str.count('%%')
    repeats = len(li)//n
    return (format_str*repeats) % tuple(li[:repeats*n])
A less hacky way is possible if you use the newer .format method instead of %. In such a case, you can use the string.Formatter().parse() method to get the list of fields used in the format_str.
The function then looks like this:
import string
li = [1, 2, 3, 4, 5, 6, 7]
format_str = '{:d} {:d} {:d}\n'
def my_sprintf(format_str, li):
    formatter = string.Formatter()
    n = len(list(filter(lambda a: a[2] is not None,
                        formatter.parse(format_str))))
    repeats = len(li)//n
    return (format_str*repeats).format(*li[:repeats*n])

How can I "trim" significant figures of a Decimal in Python to only those that are non-zero?

I've got the user entering values such as 10.5 or 9 which, when displayed, result in 10.5 and 9.0 respectively. Instead, I want to trim them down to the non-zero significant figures so that they display as 10.5 and 9.
Is there any way in the standard library to do this easily or should I write my own function? For what it's worth, this is being done in a Django project.
>>> import decimal
>>> d = decimal.Decimal("9.0")
>>> d
Decimal('9.0')
>>> help(d.normalize)
Help on method normalize in module decimal:
normalize(self, context=None) method of decimal.Decimal instance
Normalize- strip trailing 0s, change anything equal to 0 to 0e0
>>> d.normalize()
Decimal('9')
>>> str(d)
'9.0'
>>> str(d.normalize())
'9'
Although it's possible that by "Decimal" you don't mean decimal.Decimal. In that case, say "float" instead ;-)
EDIT: caution
By "trailing 0s", the docs mean all trailing zeroes, not necessarily just those "after the decimal point". For example,
>>> d = decimal.Decimal("100.000")
>>> d
Decimal('100.000')
>>> d.normalize()
Decimal('1E+2')
If that's not what you want, then I think you'll have to write your own function :-(
EDIT: trying a regexp
The normalize() function here gets a lot closer to what I guess you want:
import re
trailing0 = re.compile(r"""(\. # decimal point
\d*?) # and as few digits as possible
0+$ # before at least 1 trailing 0
""", re.VERBOSE)
def replacer(m):
    g = m.group(1)
    if len(g) == 1:
        assert g == "."
        return ""
    else:
        return g
def normalize(x):
    return trailing0.sub(replacer, str(x))
Then, e.g.,
from decimal import Decimal as D
for x in 1.0, 2, 10.010, D("1000.0000"), D("10.5"), D("9.0"):
    print(str(x), "->", normalize(x))
displays:
1.0 -> 1
2 -> 2
10.01 -> 10.01
1000.0000 -> 1000
10.5 -> 10.5
9.0 -> 9
Both .normalize() and Python's string formatting options mess up by displaying the value in scientific notation when the zeroes are to the left of the decimal point as well as to the right.
I ended up writing this simple function to do the conversion, which does what I need:
def trim_decimal(d):
    """
    Trims decimal values if they are all 0; otherwise does nothing.
    """
    i = int(d)
    if i == d:
        return i
    return d
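Another option, if the values are decimal.Decimal instances, follows the approach described in the decimal module's FAQ: normalize non-integral values, and quantize integral ones so they are not rendered with an exponent. A minimal sketch; the function name is arbitrary, and very large values may exceed the context precision in quantize().

```python
from decimal import Decimal


def remove_exponent(d):
    """Strip trailing zeros from a Decimal without switching to exponent notation."""
    if d == d.to_integral():
        # Integral value: drop the fractional digits entirely.
        return d.quantize(Decimal(1))
    # Non-integral value: normalize() only strips zeros after the last non-zero digit.
    return d.normalize()


for text in ("9.0", "10.5", "10.50", "100.000"):
    print(text, "->", remove_exponent(Decimal(text)))
```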

Convert bytes to bits in python

I am working with Python3.2. I need to take a hex stream as an input and parse it at bit-level. So I used
bytes.fromhex(input_str)
to convert the string to actual bytes. Now how do I convert these bytes to bits?
Another way to do this is by using the bitstring module:
>>> from bitstring import BitArray
>>> input_str = '0xff'
>>> c = BitArray(hex=input_str)
>>> c.bin
'0b11111111'
And if you need to strip the leading 0b:
>>> c.bin[2:]
'11111111'
The bitstring module isn't a requirement, as jcollado's answer shows, but it has lots of performant methods for turning input into bits and manipulating them. You might find this handy (or not), for example:
>>> c.uint
255
>>> c.invert()
>>> c.bin[2:]
'00000000'
etc.
What about something like this?
>>> bin(int('ff', base=16))
'0b11111111'
This converts your hexadecimal string to an integer, and that integer to a string in which each character is 0 or 1 according to the corresponding bit of the integer.
As pointed out by a comment, if you need to get rid of the 0b prefix, you can do it this way:
>>> bin(int('ff', base=16))[2:]
'11111111'
... or, if you are using Python 3.9 or newer:
>>> bin(int('ff', base=16)).removeprefix('0b')
'11111111'
Note: using lstrip("0b") here will lead to 0 integer being converted to an empty string. This is almost always not what you want to do.
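A quick illustration of that pitfall: lstrip treats its argument as a set of characters to remove, not as a prefix, so it also strips the zero digit itself.

```python
for n in (0, 5):
    stripped = bin(n).lstrip("0b")  # removes any leading '0' and 'b' characters
    sliced = bin(n)[2:]             # removes exactly the two-character prefix
    print(n, repr(stripped), repr(sliced))
```

For 0 the stripped result is an empty string while the sliced result is '0'; for 5 both give '101'.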
Operations are much faster when you work at the integer level. In particular, converting to a string as suggested here is really slow.
If you want bit 7 and 8 only, use e.g.
val = (byte >> 6) & 3
(That is: shift the byte 6 bits to the right, dropping the lowest 6 bits, then keep only the last two bits; 3 is the number with the lowest two bits set.)
These can easily be translated into simple CPU operations that are super fast.
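As a sketch, the shift-and-mask step can be generalized to any bit field. The helper `extract_bits` and its parameters are hypothetical names, not part of any library.

```python
def extract_bits(value, shift, width):
    """Return `width` bits of `value`, starting `shift` bits from the least significant end."""
    mask = (1 << width) - 1
    return (value >> shift) & mask


byte = 0b10110100
print(extract_bits(byte, 6, 2))  # 2  (the top two bits, 0b10)
print(extract_bits(byte, 0, 4))  # 4  (the low nibble, 0b0100)
```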
using python format string syntax
>>> mybyte = bytes.fromhex("0F") # create my byte using a hex string
>>> binary_string = "{:08b}".format(int(mybyte.hex(),16))
>>> print(binary_string)
00001111
The second line is where the magic happens. All byte objects have a .hex() function, which returns a hex string. Using this hex string, we convert it to an integer, telling the int() function that it's a base 16 string (because hex is base 16). Then we apply formatting to that integer so it displays as a binary string. The {:08b} is where the real magic happens. It is using the Format Specification Mini-Language format_spec. Specifically it's using the width and the type parts of the format_spec syntax. The 8 sets width to 8, which is how we get the nice 0000 padding, and the b sets the type to binary.
I prefer this method over the bin() method because using a format string gives a lot more flexibility.
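For inputs longer than one byte, the same format spec can be applied per byte. A minimal sketch, assuming Python 3, where iterating over a bytes object yields integers; the function name `bytes_to_bits` is illustrative.

```python
def bytes_to_bits(data):
    """Return the bits of a bytes object as a string, 8 bits per byte."""
    return "".join("{:08b}".format(byte) for byte in data)


print(bytes_to_bits(bytes.fromhex("0F")))    # 00001111
print(bytes_to_bits(bytes.fromhex("ff01")))  # 1111111100000001
```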
I think the simplest option here would be to use numpy. For example, you can read a file as bytes and then expand it to bits like this:
Bytes = numpy.fromfile(filename, dtype = "uint8")
Bits = numpy.unpackbits(Bytes)
input_str = "ABC"
[bin(byte) for byte in bytes(input_str, "utf-8")]
Will give:
['0b1000001', '0b1000010', '0b1000011']
Here is how to do it using format():
print("bin_signedDate : ", ''.join(format(x, '08b') for x in bytevector))
The 08b is important: it zero-pads each converted byte to a width of 8 binary digits. If you don't specify it, each converted byte will have a variable bit length.
To binary:
bin(byte)[2:].zfill(8)
Use ord when reading bytes:
byte_binary = bin(ord(f.read(1))) # Add [2:] to remove the "0b" prefix
Or
Using str.format():
'{:08b}'.format(ord(f.read(1)))
The other answers here provide the bits in big-endian order ('\x01' becomes '00000001')
In case you're interested in little-endian order of bits, which is useful in many cases, like common representations of bignums etc -
here's a snippet for that:
def bits_little_endian_from_bytes(s):
    return ''.join(bin(ord(x))[2:].rjust(8,'0')[::-1] for x in s)
And for the other direction:
def bytes_from_bits_little_endian(s):
    return ''.join(chr(int(s[i:i+8][::-1], 2)) for i in range(0, len(s), 8))
A one-line function to convert bytes (not a string) to a list of bits. There is no endianness issue when the source goes from one byte reader/writer to another byte reader/writer, only when the source and target are bit readers and bit writers.
def byte2bin(b):
    return [int(X) for X in "".join(["{:0>8}".format(bin(X)[2:]) for X in b])]
I came across this answer when looking for a way to convert an integer into a list of bit positions where the bitstring is equal to one. This becomes very similar to this question if you first convert your hex string to an integer like int('0x453', 16).
Now, given an integer - a representation already well-encoded in the hardware, I was very surprised to find out that the string variants of the above solutions using things like bin turn out to be faster than numpy based solutions for a single number, and I thought I'd quickly write up the results.
I wrote three variants of the function. First using numpy:
import math
import numpy as np
def bit_positions_numpy(val):
    """
    Given an integer value, return the positions of the on bits.
    """
    bit_length = val.bit_length() + 1
    length = math.ceil(bit_length / 8.0)  # bytelength
    bytestr = val.to_bytes(length, byteorder='big', signed=True)
    arr = np.frombuffer(bytestr, dtype=np.uint8, count=length)
    bit_arr = np.unpackbits(arr, bitorder='big')
    bit_positions = np.where(bit_arr[::-1])[0].tolist()
    return bit_positions
Then using string logic:
def bit_positions_str(val):
    is_negative = val < 0
    if is_negative:
        bit_length = val.bit_length() + 1
        length = math.ceil(bit_length / 8.0)  # bytelength
        neg_position = (length * 8) - 1
        # special logic for negatives to get two's complement repr
        max_val = 1 << neg_position
        val_ = max_val + val
    else:
        val_ = val
    binary_string = '{:b}'.format(val_)[::-1]
    bit_positions = [pos for pos, char in enumerate(binary_string)
                     if char == '1']
    if is_negative:
        bit_positions.append(neg_position)
    return bit_positions
And finally, I added a third method where I precomputed a lookuptable of the positions for a single byte and expanded that given larger itemsizes.
BYTE_TO_POSITIONS = []
pos_masks = [(s, (1 << s)) for s in range(0, 8)]
for i in range(0, 256):
    positions = [pos for pos, mask in pos_masks if (mask & i)]
    BYTE_TO_POSITIONS.append(positions)
def bit_positions_lut(val):
    bit_length = val.bit_length() + 1
    length = math.ceil(bit_length / 8.0)  # bytelength
    bytestr = val.to_bytes(length, byteorder='big', signed=True)
    bit_positions = []
    for offset, b in enumerate(bytestr[::-1]):
        pos = BYTE_TO_POSITIONS[b]
        if offset == 0:
            bit_positions.extend(pos)
        else:
            pos_offset = (8 * offset)
            bit_positions.extend([p + pos_offset for p in pos])
    return bit_positions
The benchmark code is as follows:
def benchmark_bit_conversions():
    # for val in [-0, -1, -3, -4, -9999]:
    test_values = [
        # -1, -2, -3, -4, -8, -32, -290, -9999,
        # 0, 1, 2, 3, 4, 8, 32, 290, 9999,
        4324, 1028, 1024, 3000, -100000,
        999999999999,
        -999999999999,
        2 ** 32,
        2 ** 64,
        2 ** 128,
        2 ** 128,
    ]
    for val in test_values:
        r1 = bit_positions_str(val)
        r2 = bit_positions_numpy(val)
        r3 = bit_positions_lut(val)
        print(f'val={val}')
        print(f'r1={r1}')
        print(f'r2={r2}')
        print(f'r3={r3}')
        print('---')
        assert r1 == r2
    import xdev
    xdev.profile_now(bit_positions_numpy)(val)
    xdev.profile_now(bit_positions_str)(val)
    xdev.profile_now(bit_positions_lut)(val)
    import timerit
    ti = timerit.Timerit(10000, bestof=10, verbose=2)
    for timer in ti.reset('str'):
        for val in test_values:
            bit_positions_str(val)
    for timer in ti.reset('numpy'):
        for val in test_values:
            bit_positions_numpy(val)
    for timer in ti.reset('lut'):
        for val in test_values:
            bit_positions_lut(val)
    for timer in ti.reset('raw_bin'):
        for val in test_values:
            bin(val)
    for timer in ti.reset('raw_bytes'):
        for val in test_values:
            val.to_bytes(val.bit_length(), 'big', signed=True)
And it clearly shows the str and lookup table implementations are ahead of numpy. I tested this on CPython 3.10 and 3.11.
Timed str for: 10000 loops, best of 10
time per loop: best=20.488 µs, mean=21.438 ± 0.4 µs
Timed numpy for: 10000 loops, best of 10
time per loop: best=25.754 µs, mean=28.509 ± 5.2 µs
Timed lut for: 10000 loops, best of 10
time per loop: best=19.420 µs, mean=21.305 ± 3.8 µs
