The issue is this: in PL/Python code, we've calculated an integer, 26663.
It converts easily to hex: hex(myint) gives 0x6827.
So far so good!
Now, how do I write this value, as part of a concatenation of strings, into a PostgreSQL (v9) bytea field? The DB is UTF8-encoded, if this matters.
E.g., neither of these examples works:
Here, of course, I cannot concatenate 'str' and 'int' objects:
rv = plpy.execute(plan, [ (string1 + 6827) ])
This one inputs the wrong hex code for 0x6827
rv = plpy.execute(plan, [ (string1 + str('6827')) ])
Help!
I'm not familiar with Postgres, but the hex(n) function returns a string representation of the numeric value of n in hexadecimal. The nicest way in my opinion to concatenate this with a string is to use format strings. For example:
rv = plpy.execute(plan, [ ( 'foo %s bar' % hex(0x6827) ) ] )
If the string is really in a variable called string1, and you only need to append the hex value to it, then simple concatenation using the + sign will work fine:
rv = plpy.execute(plan, [ ( string1 + hex(0x6827) ) ])
This works without conversion because the hex() function returns a string. Note that the literal must be written as 0x6827 (or 26663); a bare 6827 is a different, decimal number.
If you don't actually want to store a printable string representation, but rather a binary string, use the struct module to create an array of bytes.
import struct
packed = struct.pack('i', 0x6827) # native byte order and size; use '<i' or '>i' to pin the endianness
A lot of people are confused about what storing something as "binary" actually means, and since you are using a field type (bytea) which seems to be intended for binary storage, maybe this is what you actually want?
The value returned by struct.pack is a byte string that you can either concatenate with another string, or continue to pack more binary values into.
See the struct module documentation for more information!
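To make the difference between the two approaches concrete, here is a minimal sketch. It assumes the value should occupy exactly two bytes in big-endian order, and `prefix` is a hypothetical stand-in for string1. How the result is passed to plpy.execute and mapped to bytea is not shown here.

```python
import struct

value = 26663        # 0x6827, the integer from the question
prefix = b"foo"      # hypothetical stand-in for string1, as a byte string

# ">H" means big-endian unsigned 16-bit, so the bytes are 0x68 then 0x27.
packed = struct.pack(">H", value)
binary_payload = prefix + packed

# The printable alternative: the six ASCII characters "0x6827".
text_payload = prefix + hex(value).encode("ascii")
```

The binary payload is 5 bytes long, the printable one is 9, so the two are not interchangeable once stored.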
I am trying to convert a number stored as a list of ints to a float type. I got the number via a serial console and want to reassemble it back together into a float.
The way I would do it in C is something like this:
bit_data = ((int16_t)byte_array[0] << 8) | byte_array[1];
result = (float)bit_data;
What I tried to use in Python is a much simpler conversion:
result = int_list[0]*256.0 + int_list[1]
However, this does not preserve the sign of the result, as the C code does.
What is the right way to do this in python?
UPDATE:
Python version is 2.7.3.
My byte array has a length of 2.
In the Python code, byte_array is a list of ints. I've renamed it to avoid misunderstanding. I cannot just use the float() function because it will not preserve the sign of the number.
I'm a bit confused by what data you have, and how it is represented in Python. As I understand it, you have received two unsigned bytes over a serial connection, which are now represented by a list of two python ints. This data represents a big endian 16-bit signed integer, which you want to extract and turn into a float. eg. [0xFF, 0xFE] -> -2 -> -2.0
import array, struct
two_unsigned_bytes = [255, 254] # represented by ints
byte_array = array.array("B", two_unsigned_bytes)
# change above to "b" if the ints represent signed bytes ie. in range -128 to 127
signed_16_bit_int, = struct.unpack(">h", byte_array)
float_result = float(signed_16_bit_int)
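If Python 3 is an option (an assumption, since the question states 2.7.3), the same conversion might be written without array or struct. The function name below is made up for illustration.

```python
def be_int16_to_float_py3(int_list):
    """Interpret two ints (each 0-255) as a big-endian signed 16-bit integer."""
    raw = bytes(int_list)  # Python 3 only: bytes([255, 254]) == b'\xff\xfe'
    return float(int.from_bytes(raw, byteorder="big", signed=True))

example = be_int16_to_float_py3([0xFF, 0xFE])
```

Here signed=True is what performs the two's complement interpretation.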
I think what you want is the struct module.
Here's a round trip snippet:
import struct
sampleValue = 42.13
somebytes = struct.pack('=f', sampleValue)
print(somebytes)
result = struct.unpack('=f', somebytes)
print(result)
result may surprise you: unpack returns a tuple. So to get to the value you can do
result[0]
or modify the line that sets result to be
result = struct.unpack('=f', somebytes)[0]
I personally hate that, so I use the following instead
result, = struct.unpack('=f', somebytes) # tuple unpacking on assignment
The second thing you'll notice is that the value has extra digits of noise. That's because Python's native float is a double (64-bit), while '=f' packs a 32-bit float, so the value is rounded to single precision on the way through.
(This is python3 btw, adjust for using old versions of python as appropriate)
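A small sketch of where that noise comes from, comparing a 32-bit and a 64-bit round trip of the same sample value. The variable names are only for illustration.

```python
import struct

sample = 42.13

# Round trip through a 32-bit float: the value is rounded to single precision.
as_float32, = struct.unpack("=f", struct.pack("=f", sample))

# Round trip through a 64-bit double: same width as a Python float.
as_float64, = struct.unpack("=d", struct.pack("=d", sample))

float32_error = abs(as_float32 - sample)
```

The double survives unchanged, while the single-precision copy is off by a tiny amount, which is what shows up as the trailing digits.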
I am not sure I really understand what you are doing, but I think you got 4 bytes from a stream and know them to represent a float32 value. The way you are handling this suggests big-endian byte order.
Python has the struct package (https://docs.python.org/2/library/struct.html) to handle bytestreams.
import struct
stream = struct.pack(">f", 2/3.)
len(stream) # 4
reconstructed_float, = struct.unpack(">f", stream) # unpack returns a 1-tuple
Okay, so I think int_list isn't really just a list of ints. The ints are constrained to 0-255 and represent bytes that can be built into a signed integer. You then want to turn that into a float. The trick is to set the sign of the first byte properly (a value above 127 stands for that value minus 256 in two's complement) and then proceed much like you did.
float((byte_array[0]-256 if byte_array[0]>127 else byte_array[0])*256 + byte_array[1])
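A minimal sketch of doing the sign correction by hand, assuming the two ints are the high and low byte of a big-endian 16-bit two's complement value. The function name is hypothetical.

```python
def signed16_to_float(hi, lo):
    """Combine two bytes (0-255 each) into a signed 16-bit value, as a float."""
    value = (hi << 8) | lo     # unsigned result, 0..65535
    if value >= 0x8000:        # top bit set means the number is negative
        value -= 0x10000       # subtract 2**16 to get the signed value
    return float(value)

result = signed16_to_float(0xFF, 0xFE)
```

This mirrors the C cast to int16_t and works the same on Python 2 and 3.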
I'm struggling a bit to generate an integer ID for a given string in Python.
I thought the built-in hash function would be perfect, but it appears that the IDs are sometimes too long. That's a problem since I'm limited to 64 bits as the maximum length.
My code so far: hash(s) % 10000000000.
The input string(s) which I can expect will be in range of 12-512 chars long.
Requirements are:
integers only
generated from provided string
ideally up to 10-12 chars long (I'll have ~5 million items only)
low probability of collision..?
I would be glad if someone can provide any tips / solutions.
I would do something like this:
>>> import hashlib
>>> m = hashlib.md5()
>>> m.update("some string".encode("utf-8"))
>>> str(int(m.hexdigest(), 16))[0:12]
'120665287271'
The idea:
Calculate the hash of a string with MD5 (or SHA-1 or ...) in hexadecimal form (see module hashlib)
Convert the hex string into an integer and convert that back into a base-10 string (so the result contains only digits)
Use the first 12 characters of the string.
If characters a-f are also okay, I would do m.hexdigest()[0:12].
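The idea above could be wrapped in a function roughly like this. Note the swap: this sketch reduces the digest with a modulo instead of slicing the leading decimal digits, so it returns an int directly. The function name and the `digits` parameter are assumptions for illustration.

```python
import hashlib

def string_to_int_id(text, digits=12):
    """Map a string to a non-negative integer with at most `digits` decimal digits."""
    digest_hex = hashlib.md5(text.encode("utf-8")).hexdigest()
    return int(digest_hex, 16) % 10 ** digits

item_id = string_to_int_id("some string")
```

Assuming the hash behaves uniformly, the expected number of colliding pairs is roughly n²/(2N). For 5 million strings and 12 decimal digits that is about 12.5, so a few collisions are likely at that length. Using the full 64-bit range instead brings the figure well below one.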
If you're not allowed to add extra dependency, you can continue using hash function in the following way:
>>> my_string = "whatever"
>>> str(hash(my_string))[1:13]
'460440266319'
NB:
I am ignoring 1st character as it may be the negative sign.
hash may return different values for the same string from one run to the next, because string hashing is randomized per process unless PYTHONHASHSEED is set (this is the default since Python 3.3). You may want to set PYTHONHASHSEED to some fixed value.
encode utf-8 was needed for mine to work:
def unique_name_from_str(string: str, last_idx: int = 12) -> str:
    """
    Generates a unique id name
    refs:
    - md5: https://stackoverflow.com/questions/22974499/generate-id-from-string-in-python
    - sha3: https://stackoverflow.com/questions/47601592/safest-way-to-generate-a-unique-hash
    (- guid/uuid: https://stackoverflow.com/questions/534839/how-to-create-a-guid-uuid-in-python?noredirect=1&lq=1)
    """
    import hashlib
    m = hashlib.md5()
    # hashlib works on bytes, so encode the text first
    m.update(string.encode('utf-8'))
    unique_name: str = str(int(m.hexdigest(), 16))[0:last_idx]
    return unique_name
see my ultimate-utils python library.
I have a script that calls a function that takes a hexadecimal number for an argument. The argument needs to have the 0x prefix. The data source is a database table where the value is stored as a string, so it is returned as '0x77'. I am looking for a way to take the string from the database and use it as an argument in hex form with the 0x prefix.
This works:
addr = 0x77
value = class.function(addr)
The database entry has to be a string, as most of the other records do not have hexadecimal values in this column, but the values could be changed to make it easier, so instead of '0x77', it could be '119'.
Your class.function expects an integer which can be represented either by a decimal or a hexadecimal literal, so that these two calls are completely equivalent:
class.function(0x77)
class.function(119) # 0x77 == 119
Even print(0x77) will show 119 (because decimal is the default representation).
So, we should rather be talking about converting a string representation to integer. The string can be a hexadecimal representation, like '0x77', then parse it with the base parameter:
>>> int('0x77', 16)
119
or a decimal one, then parse it as int('119').
Still, storing integer whenever you deal with integers is better.
EDIT: as @gnibbler suggested, you can parse as int(x, 0), which handles both formats.
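A short sketch of the base-0 parsing, assuming the database column contains either a '0x'-prefixed hex string or a plain decimal string. The helper name is hypothetical.

```python
def parse_address(text):
    """Parse '0x77' or '119' into the same integer."""
    # Base 0 tells int() to infer the base from the prefix, like a Python literal.
    return int(text.strip(), 0)

from_hex = parse_address("0x77")
from_dec = parse_address("119")
```

One caveat: with base 0, Python 3 rejects decimal strings with leading zeros such as '0119' (ValueError), so values stored that way would need int(text, 10) instead.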
>>> hex(119)
'0x77'
#or:
>>> hex(int("119"))
'0x77'
This should work for you.
You can also get the hex representation of characters:
>>> hex(ord("a"))
'0x61'
I think you're saying that you read a string from the database and you want to convert it to an integer, if the string has the 0x prefix you can convert it like so:
>>> print(int("0x77", 16))
119
If it doesn't:
>>> print(int("119"))
119
The shortest ways I have found are:
n = 5
# Python 2.
s = str(n)
i = int(s)
# Python 3.
s = bytes(str(n), "ascii")
i = int(s)
I am particularly concerned with two factors: readability and portability. The second method, for Python 3, is ugly. However, I think it may be backwards compatible.
Is there a shorter, cleaner way that I have missed? I currently make a lambda expression to fix it with a new function, but maybe that's unnecessary.
Answer 1:
To convert a string to a sequence of bytes in either Python 2 or Python 3, you use the string's encode method. If you don't supply an encoding parameter, the default is used ('ascii' in Python 2, 'utf-8' in Python 3); either is always good enough for numeric digits.
s = str(n).encode()
Python 2: http://ideone.com/Y05zVY
Python 3: http://ideone.com/XqFyOj
In Python 2 str(n) already produces bytes; the encode will do a double conversion as this string is implicitly converted to Unicode and back again to bytes. It's unnecessary work, but it's harmless and is completely compatible with Python 3.
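For reference, the full round trip in this style looks like the sketch below. The variable names follow the question.

```python
n = 5

# int -> ASCII digits as bytes
s = str(n).encode()

# bytes -> int: int() accepts a bytes object holding ASCII digits
i = int(s)
```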
Answer 2:
Above is the answer to the question that was actually asked, which was to produce a string of ASCII bytes in human-readable form. But since people keep coming here trying to get the answer to a different question, I'll answer that question too. If you want to convert 10 to b'10' use the answer above, but if you want to convert 10 to b'\x0a\x00\x00\x00' then keep reading.
The struct module was specifically provided for converting between various types and their binary representation as a sequence of bytes. The conversion from a type to bytes is done with struct.pack. There's a format parameter fmt that determines which conversion it should perform. For a 4-byte integer, that would be i for signed numbers or I for unsigned numbers. For more possibilities see the format character table, and see the byte order, size, and alignment table for options when the output is more than a single byte.
import struct
s = struct.pack('<i', 5) # b'\x05\x00\x00\x00'
You can use the struct's pack:
In [11]: struct.pack(">I", 1)
Out[11]: '\x00\x00\x00\x01'
The ">" is the byte-order (big-endian) and the "I" is the format character. So you can be specific if you want to do something else:
In [12]: struct.pack("<H", 1)
Out[12]: '\x01\x00'
In [13]: struct.pack("B", 1)
Out[13]: '\x01'
This works the same on both python 2 and python 3.
Note: the inverse operation (bytes to int) can be done with unpack.
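A minimal sketch of that inverse, using the same format string for both directions. The names are only for illustration.

```python
import struct

fmt = ">I"                        # big-endian unsigned 32-bit integer
packed = struct.pack(fmt, 1)

# unpack always returns a tuple, even for a single value.
value, = struct.unpack(fmt, packed)

size = struct.calcsize(fmt)       # number of bytes this format occupies
```

unpack raises struct.error if the input length does not match calcsize(fmt), so the format has to describe the data exactly.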
I have found the only reliable, portable method to be
bytes(bytearray([n]))
Just bytes([n]) does not work in python 2. Taking the scenic route through bytearray seems like the only reasonable solution.
Converting an int to a byte in Python 3:
>>> n = 5
>>> bytes([n])
b'\x05'
;) guess that'll be better than messing around with strings
source: http://docs.python.org/3/library/stdtypes.html#binaryseq
In Python 3.x, you can convert an integer value (including large ones, which the other answers don't allow for) into a series of bytes like this:
import math
x = 0x1234
number_of_bytes = int(math.ceil(x.bit_length() / 8))
x_bytes = x.to_bytes(number_of_bytes, byteorder='big')
x_int = int.from_bytes(x_bytes, byteorder='big')
x == x_int
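The same approach as a pair of helpers, sketched under the assumption that the values are non-negative and that Python 3 is in use. The function names are made up. Integer arithmetic replaces math.ceil, and zero is given one byte rather than an empty result.

```python
def int_to_bytes(x):
    """Encode a non-negative int as big-endian bytes, using as few bytes as possible."""
    length = max(1, (x.bit_length() + 7) // 8)  # at least one byte, even for 0
    return x.to_bytes(length, byteorder="big")

def bytes_to_int(data):
    """Decode big-endian bytes back into a non-negative int."""
    return int.from_bytes(data, byteorder="big")

x_bytes = int_to_bytes(0x1234)
x_back = bytes_to_int(x_bytes)
```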
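From int to bytes:
bytes_string = int_v.to_bytes(length, endian)
where length is the number of bytes (1, 2, 3, 4, ...) and endian is 'big' or 'little'.
From bytes to int:
int_v = int.from_bytes(bytes_string, endian)
(list(bytes_string) gives the individual byte values as a list of ints, not a single integer.)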
When converting old code from Python 2 you often have "%s" % number; for Python 3 (3.5 or later) this can be converted to b"%d" % number (b"%s" % number does not work).
The format b"%d" % number is also another clean way to convert an int to a byte string of its decimal digits.
b"%d" % number
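A quick sketch of both cases, assuming Python 3.5 or later (where % formatting on bytes is available).

```python
number = 10

# %d formats the int as ASCII decimal digits.
as_bytes = b"%d" % number

# %s on bytes only accepts bytes-like objects, so an int is rejected.
try:
    b"%s" % number
    percent_s_failed = False
except TypeError:
    percent_s_failed = True
```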
Perl hex() analog in Python: how to?
I have the following Perl code:
my $Lon = substr($Hexline,16,8);
say $output_f "Lon: " . hex($Lon) . "";
where $Hexline has "6a48f82d8e828ce82b82..." format
I tried this in Python:
Lon = int(Hexline[16:24], 16)
f.write('lon = %s' % str(Lon)+'\n')
is it right?
EDIT: in perl's case hex() gives me a decimal value.
Yes, to convert a hexadecimal string to an integer you use int(hex_str, 16).
Note that in your write method call:
You don't need to concatenate two strings to add the new line character, you can add it to the formatting string directly.
To print an integer you should use %d instead of %s.
You don't really need to call str to transform the integer into a string.
Hence, the write call could be written as:
f.write('lon = %d\n' % Lon)
Alternatively, you could also use format this way:
f.write('lon = {0}\n'.format(Lon))
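Putting the pieces together as a sketch. The hexline value below is a made-up sample, since the real data in the question is truncated.

```python
# Hypothetical input; characters 16..23 hold the field of interest.
hexline = "0123456789abcdef0000012cffff"

# Equivalent of Perl's hex(substr($Hexline, 16, 8)).
lon = int(hexline[16:24], 16)

line = "lon = %d\n" % lon
```

One difference to keep in mind: slicing past the end of a short string silently returns fewer characters in Python, so a length check may be worth adding for real input.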