This is... a long one, so I apologize for any inconsistency regarding code and problems. I'll try to add as much of the source code as I can to make the issue as clear as possible.
This project at work is an attempt at converting Python 2 to 3, and thus far has been mildly straightforward. My coworker and I have reached a point though where no amount of googling or searching has given a straight answer, so here we are.
Alright, starting off with...
Python 2 code:
listBytes[102:104]=struct.pack('H',rotation_deg*100) # rotational position in degrees
listBytes[202:204]=struct.pack('H',rotation_deg*100) # rotational position in degrees
listBytes[302:304]=struct.pack('H',rotation_deg*100) # rotational position in degrees
# this continues on for a while in the same fashion
Where rotation_deg is a float between 0.00 and 359.99 (but for testing is almost always changing between 150-250)
For the purpose of testing, we're going to make rotation_deg be 150.00 all the time.
a = listBytes[102:104]=struct.pack('H',150.00*100)
print a
print type(a)
The print out of the following is:
�:
<type 'str'>
From what I understand, in the Python2 version of struct.pack, it is packing the floats as shorts, which then are "added" to the list as a short. Python 2 sees it as a string, and adds no encoding to the string (will get to that later for python 3). All simple and good, then a few more bits and bobs of dropping more stuff into the list and we get to:
return ''.join(listBytes)
Which is being sent back to a simple variable:
bytes=self.UpdatePacket(bytes,statusIndex,rotation_deg,StatusIdList,StatusValueList, Stat,offsetUTC)
To then be sent along as a string through
sock.sendto(bytes , (host, port) )
This all comes together to look like this:
A string with a bunch of bytes (I think)
This is the working version, in which we are sending the bytes along the socket, data is being retrieved, and everyone is happy. If I missed anything, please let me know; otherwise, let's move to...
Python 3
This is where the Fun Begins
There are a few changes that are required between Python 2 and 3 right off the bat.
struct.pack('H',rotation_deg*100) requires an INT type to be packed, meaning every pack call had to be given int(rotation_deg*100) so the program would not raise an error.
sock.sendto(bytes, (host, port)) did not work anymore as the socket needed a bytes object to send something. No more strings that look like bytes, they had to be properly encoded to send properly. So now this becomes sock.sendto(bytes.encode(), (host, port)) to properly encode the "bytes" string.
As more of a background, the length of listBytes should always be 1206. Any more and our socket won't work properly, and the issue is that no matter what we try with this Python 3 code, the .join seems to be sending a LOT more than just byte objects, often quintupling the length of listBytes and breaking the socket.sendto.
listBytes[102:104] = struct.pack('H', int(rotation_deg * 100)) # rotational position in degrees
listBytes[202:204] = struct.pack('H', int(rotation_deg * 100)) # rotational position in degrees
listBytes[302:304] = struct.pack('H', int(rotation_deg * 100)) # rotational position in degrees
# continues on in this fashion again
return ''.join(str(listBytes))
returns to:
bytes = self.UpdatePacket(bytes, statusIndex, rotation_deg, StatusIdList, StatusValueList, Stat, offsetUTC)
sock.sendto(bytes.encode(), (host, port))
Here's where things start getting weird
a = struct.pack('H', int(150.00 * 100))
returns:
b'\x98:', with its type being <class 'bytes'>, which is fine and the value we want, except we specifically need to store this variable into the list as maybe a string... to encode it later to send as a byte object for the socket.
You're starting to see the problem, yes?
The thing is, we've tried just about every technique to convert the two bytes that struct.pack returns into a string of some kind, and we've been able to convert it over, but then we run into the issue of the .join being evil.
Remember when I said listBytes had to remain a size of 1206 or else it would break? For some reason, if we .join literally anything other than the two bytes as a string, we think Python is trying to add a bunch of other stuff that we don't need.
So for now, we're focusing on trying to match the python 2 equivalent to python 3.
Here's what we've tried
binascii.hexlify(struct.pack('H', int(150.00 * 100))).decode() returns '983a'
str(struct.pack('H', int(150.00 * 100.00)).decode()) returns an error, 'utf-8' codec can't decode byte 0x98 in position 0: invalid start byte
str(struct.pack('H', int(150.00 * 100.00)).decode("utf-16")) returns '㪘'. Can't even begin to understand that.
return b''.join(listBytes) returns an error because there are ints at the start of the list.
return ''.join(str(listBytes)).encode('utf-8') still is adding a bunch of nonsense.
Now we get to the .join, and the first loop around it seems... fine? It has 1206 as listBytes length before .joining, but on the second loop around, it creates a massive influx of junk, making the list 5503 in length. Third go around it becomes 27487, and finally on the last go around, it becomes too large for the socket to handle and I get slapped with [WinError 10040] A message sent on a datagram socket was larger than the internal message buffer or some other network limit, or the buffer used to receive a datagram into was smaller than the datagram itself
Phew, if you made it this far, thank you. Any help at all would be extremely appreciated. If you have questions or I'm missing something, let me know.
Thanks!
You’re just trying too hard. It may be helpful to think of bytes and unicode rather than the ambiguous str type (which is the former in Python 2 and the latter in Python 3). struct always produces bytes, and socket needs to send bytes, regardless of the language version. Put differently, everything outside your process is bytes, although some represent characters in some encoding chosen by the rest of the system. Characters are a data type used to manipulate text within your program, just like floats are a data type used to manipulate “real numbers”. You have no characters here, so you definitely don’t need encode.
So just accumulate your bytes objects and then join them with b''.join (if you can’t just directly use the buffer into which your slice assignments seem to be writing) to keep them as bytes.
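A minimal sketch of that advice, not the asker's actual code: update_packet is a hypothetical, simplified stand-in for UpdatePacket, the three offsets and the 1206 length are taken from the question, and '<H' (little-endian) is an assumption chosen because it matches the b'\x98:' shown there. The first few lines also illustrate why the str() route inflates the data: str() of a list is its printable repr, so every element turns into digits, commas and spaces.

```python
import struct

PACKET_SIZE = 1206  # packet length stated in the question

# Why the str() route grows: in Python 3, slice-assigning bytes into a list
# stores ints, and str(list) is the list's printable repr, not its contents.
small = [0] * 6                                     # tiny stand-in for listBytes
small[2:4] = struct.pack('<H', int(150.00 * 100))   # list is now [0, 0, 152, 58, 0, 0]
as_text = ''.join(str(small))                       # '[0, 0, 152, 58, 0, 0]'


def update_packet(packet, rotation_deg):
    """Return a copy of packet with the rotation written at each offset."""
    buf = bytearray(packet)           # mutable bytes; the length never changes
    value = int(rotation_deg * 100)   # hundredths of a degree, fits in 'H'
    for offset in (102, 202, 302):    # offsets taken from the question
        struct.pack_into('<H', buf, offset, value)
    return bytes(buf)                 # can go to sock.sendto() without encode()


packet = update_packet(bytes(PACKET_SIZE), 150.00)
print(len(as_text), len(packet))
```

Because the buffer stays a bytes-like object from start to finish, nothing is ever converted to text and the packet stays at 1206 bytes on every pass.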
I am getting an error with zeromq in Python while sending strings through a ROUTER socket. String-type messages are received successfully, but sometimes a unicode message throws the exception "Type Error: unicode not allowed. use send_unicode", even though I have been trying to use msg.encode('utf-8'). I can't figure out a way to get past it.
I am on Python 2.7.3. I am not using pyzmq (import zmq only). Looking forward to your suggestions :) Thanks
if backend in sockets:
    request = backend.recv_multipart()
    #print ("Backend Thread is ready")
    worker_id, client_id = request[:2]
    if client_id != b"READY" and len(request) > 3:
        #print (len(request))
        empty2, reply = request[2:]
        router_socket.send_multipart([client_id, reply.encode('utf-8')])
The problem was resolved; the only thing was that I needed to convert the unicode strings back to ASCII by using string.encode('ascii').
I got the same error. My erroneous code was:
socket.send("Server message to client3")
You must convert the message to bytes to solve it. To do so, just add b like this:
socket.send(b"Server message to client3")
Is it better to convert strings to bytes, and then bytes back to strings, when data is sent through the network, and why?
So, because PyZMQ is actually a good library they have some docs.
https://pyzmq.readthedocs.io/en/latest/unicode.html
What it says is that the str object changed its nature over the course of Python's evolution.
In Python 3 str is a collection of characters and in Python 2 it is a simple wrapper (with some sugar) for char* that we know from C :).
Docs explain why the people behind pyZMQ chose to make the differences explicit - performance is the answer.
To send strings in Python3 you should use the right method, which is send_string, probably the other way around for Python2 (to send unicode you should use send_unicode).
It is however recommended to stick to bytes and explicitly provide correct encoding and decoding where needed.
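As a rough illustration of "stick to bytes", here is a sketch that uses only plain Python and no zmq calls. The helper names to_wire and from_wire are hypothetical, and UTF-8 is an assumed encoding; the point is only that text is encoded once on the way out and decoded once on the way in, so a list like frames is the kind of thing the question passes to send_multipart.

```python
def to_wire(text, encoding='utf-8'):
    """Encode text to bytes just before it leaves the process."""
    return text.encode(encoding)


def from_wire(data, encoding='utf-8'):
    """Decode received bytes back to text just after they arrive."""
    return data.decode(encoding)


# Every frame is bytes; u'h\xe9llo' is 'hello' with an accented e.
frames = [to_wire(u'client-1'), to_wire(u'h\xe9llo')]
received = [from_wire(frame) for frame in frames]
```

Keeping the encode and decode calls at the edges makes it obvious which encoding is in use, which is harder to see when a library converts implicitly.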
Also you are using pyzmq... the module name "zmq" comes from pyzmq library/package.
To confirm this, use: pip list | grep zmq (or pip list | select-string zmq on Windows).
I'm reading some strings from a memory buffer, written by a C program. I need to fetch them using python and print them. however when I encounter a string containing %llx python does not know how to parse this:
"unsupported format character 'l' (0x6c) at index 14"
I could use replace('%llx','%x') but then it would not be a long long.. would Python handle this correctly in this case?
then it would not be a long long
Python (essentially) doesn't have any concept of a long long. If you're pulling long longs from C code, just use %x and be done with it -- you're not ever going to get values from the C code that are out of the long long range, the only issue that could arise is if you were trying to send them from Python code into C. Just use (with a new-style format string):
print('{0:x}'.format(your_int))
Tested on both Python v3.3.3 and v2.7.6 :
>>> print('%x' % 523433939134152323423597861958781271347434)
6023bedba8c47434c84785469b1724910ea
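If the buffer holds more than one kind of C format string, the replace('%llx','%x') idea could be generalized along these lines. This is only a sketch: strip_length_modifiers is a hypothetical helper, and the regular expression covers the common printf flags, width, precision and length modifiers rather than the full C grammar.

```python
import re

# '%%' is matched first so a literal percent sign is never rewritten.
_SPEC = re.compile(
    r'%%|%([-+ #0]*\d*(?:\.\d+)?)(?:hh|ll|[hljztL])([diouxXeEfgGcs])'
)


def strip_length_modifiers(c_format):
    """Drop C length modifiers (ll, l, h, ...) that Python does not need."""
    def repl(match):
        if match.group(0) == '%%':
            return '%%'
        return '%' + match.group(1) + match.group(2)
    return _SPEC.sub(repl, c_format)


py_format = strip_length_modifiers('value=%llx')
print(py_format % 0xdeadbeefcafebabe)
```

Python integers have arbitrary precision, so the 64-bit value formats the same way with plain %x.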
I'm pretty new to Python so please bear with me here!
I've taken some code from ActiveState (and then butchered it around a bit) to open a DBF file and then output to CSV.
This worked perfectly well on Python 2.5 but I've now moved it to Python 3.3 and ran into a number of issues, most of which I've resolved.
The final issue I have is that in order to run the code, I've had to prefix some items with b (because I was getting TypeError: expected bytes, bytearray or buffer compatible object errors)
The code now works, and outputs correctly, except that every field is displayed as b'DATAHERE' (where DATAHERE is the actual data of course!)
So... does anyone know how I can stop it from outputting the b character? I can post code if required but it's fairly lengthy so I was hoping someone would be able to spot what I expect to be something simple that I've done wrong!
Thanks!
You are seeing the code output byte values; if you expected unicode strings instead, simply decode:
yourdata.decode('ascii')
where ascii should be replaced by the encoding your data uses.
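A small sketch of where that decode could go when writing CSV. The record contents and the helper name decode_record are made up for illustration, and 'ascii' is only a placeholder for whatever encoding the DBF file really uses.

```python
import csv
import io


def decode_record(record, encoding='ascii'):
    """Decode bytes fields to str and leave other values untouched."""
    return [
        field.decode(encoding) if isinstance(field, bytes) else field
        for field in record
    ]


record = [b'DATAHERE', 42]      # hypothetical row as read from the DBF file
out = io.StringIO()             # stands in for the real CSV output file
csv.writer(out).writerow(decode_record(record))
csv_text = out.getvalue()
```

Decoding before the row reaches csv.writer is what removes the b'...' wrapper, because the writer calls str() on anything that is not already a string.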
This seems like it should be simple, but I haven't been able to figure it out...
I'm trying to use PySerial to communicate with a microcontroller. I want to send an index location, but when I send it, PySerial sends the ASCII of the number (so when I send a 0, it sends 48).
I know for Python26 and up, I would just enclose the number with the built-in bytes function like so:
self.index = bytes([index])
However, Python25 doesn't have that function. I can't find any documentation suggesting an equivalent. Does anybody know what I should do?
Thanks in advance!
EDIT: Sorry, here's a simplified version of my code...
class SecondaryImage():
    def __init__(self, index):
        self.index = index
    def sendIndex(self):
        serial.write(self.index)
for i in range(64):
    img = SecondaryImage(i)
    imgs.append(img)
And then I'd call sendIndex() separately--
imgs[2].sendIndex()
chr is a built-in which will give you the character for the ordinal you send.
Serial ports transmit raw bytes, so use chr to convert each number into the single character (byte) with that value.
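On Python 2.5, chr(index) already gives the one-byte string described above. As an alternative that should behave the same on Python 2 and Python 3, here is a sketch using struct.pack with the 'B' (unsigned byte) format. This swaps in a different technique from the chr answer, and index_to_byte is a hypothetical helper name.

```python
import struct


def index_to_byte(index):
    """Pack an integer in the range 0-255 into exactly one raw byte."""
    return struct.pack('B', index)


payload = index_to_byte(0)   # one byte with value 0, not the character '0'
```

The result could then be passed to the write call in sendIndex in place of the bare integer.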
Have you tried the binascii module?
http://docs.python.org/release/2.5.4/lib/module-binascii.html