How do I 'declare' an empty bytes variable? - python

How do I initialize ('declare') an empty bytes variable in Python 3?
I am trying to receive chunks of bytes, and later change that to a
utf-8 string.
However, I'm not sure how to initialize the initial variable that will
hold the entire series of bytes. This variable is called msg.
I can't initialize it as None, because you can't add a bytes and a
NoneType. I can't initialize it as a unicode string, because then
I will be trying to add bytes to a string.
Also, as the receiving program evolves it might get me in to a mess
with series of bytes that contain only parts of characters.
I can't do without a msg initialization, because then msg would be
referenced before assignment.
The following is the code in question
def handleClient(conn, addr):
print('Connection from:', addr)
msg = ?
while 1:
chunk = conn.recv(1024)
if not chunk:
break
msg = msg + chunk
msg = str(msg, 'UTF-8')
conn.close()
print('Received:', unpack(msg))

Just use an empty byte string, b''.
However, concatenating to a string repeatedly involves copying the string many times. A bytearray, which is mutable, will likely be faster:
msg = bytearray() # New empty byte array
# Append data to the array
msg.extend(b"blah")
msg.extend(b"foo")
To decode the byte array to a string, use msg.decode(encoding='utf-8').

bytes() works for me;
>>> bytes() == b''
True

Use msg = bytes('', encoding = 'your encoding here').
Encase you want to go with the default encoding, simply use msg = b'', but this will garbage the whole buffer if its not in the same encoding

As per documentation:
Blockquote
socket.recv(bufsize[, flags])
Receive data from the socket. The return value is a string representing the data received.
Blockquote
So, I think msg="" should work just fine:
>>> msg = ""
>>> msg
''
>>> len(msg)
0
>>>

To allocate bytes of some arbitrary length do
bytes(bytearray(100))

Related

Trying to send string variable via Python socket

I'm in a CTF competition and I'm stuck on a challenge where I have to retrieve a string from a socket, reverse it and get it back. The string changes too fast to do it manually. I'm able to get the string and reverse it but am failing at sending it back. I'm pretty sure I'm either trying to do something that's not possible or am just too inexperienced at Python/sockets/etc. to kung fu my way through.
Here's my code:
import socket
aliensocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
aliensocket.connect(('localhost', 10000))
aliensocket.send('GET_KEY'.encode())
key = aliensocket.recv(1024)
truncKey = str(key)[2:16]
revKey = truncKey[::-1]
print(truncKey)
print(revKey)
aliensocket.send(bytes(revKey.encode('UTF-8')))
print(aliensocket.recv(1024))
aliensocket.close()
And here is the output:
F9SIJINIK4DF7M
M7FD4KINIJIS9F
b'Server expects key to unlock or GET_KEY to retrieve the reversed key'
key is received as a byte string. The b'' wrapped around it when printed just indicates it is a byte string. It is not part of the string. .encode() turns a Unicode string into a byte string, but you can just mark a string as a byte string by prefixing with b.
Just do:
aliensocket.send(b'GET_KEY')
key = aliensocket.recv(1024)
revKey = truncKey[::-1]
print(truncKey) # or do truncKey.decode() if you don't want to see b''
print(revKey)
aliensocket.send(revKey)
data = ''
while True:
chunk = aliensocket.recv(1)
data +=chunk
if not chunk:
rev = data[::-1]
aliensocket.sendall(rev)
break

Python String Prefix by 4 Byte Length

I'm trying to write a server in Python to communicate with a pre-existing client whose message packets are ASCII strings, but prepended by four-byte unsigned integer values representative of the length of the remaining string.
I've done a receiver, but I'm sure there's a a more pythonic way. Until I find it, I haven't done the sender. I can easily calculate the message length, convert it to bytes and transmit the message.The bit I'm struggling with is creating an integer which is an array of four bytes.
Let me clarify: If my string is 260 characters in length, I wish to prepend a big-endian four byte integer representation of 260. So, I don't want the ASCII string "0260" in front of the string, rather, I want four (non-ASCII) bytes representative of 0x00000104.
My code to receive the length prepended string from the client looks like this:
sizeBytes = 4 # size of the integer representing the string length
# receive big-endian 4 byte integer from client
data = conn.recv(sizeBytes)
if not data:
break
dLen = 0
for i in range(sizeBytes):
dLen = dLen + pow(2,i) * data[sizeBytes-i-1]
data = str(conn.recv(dLen),'UTF-8')
I could simply do the reverse. I'm new to Python and feel that what I've done is probably longhand!!
1) Is there a better way of receiving and decoding the length?
2) What's the "sister" method to encode the length for transmission?
Thanks.
The struct module is helpful here
for writing:
import struct
msg = 'some message containing 260 ascii characters'
length = len(msg)
encoded_length = struct.pack('>I', length)
encoded_length will be a string of 4 bytes with value '\x00\x00\x01\x04'
for reading:
length = struct.unpack('>I', received_msg[:4])[0]
An example using asyncio:
import asyncio
import struct
def send_message(writer, message):
data = message.encode()
size = struct.pack('>L', len(data))
writer.write(size + data)
async def receive_message(reader):
data = await reader.readexactly(4)
size = struct.unpack('>L', data)[0]
data = await reader.readexactly(size)
return data.decode()
The complete code is here

Python pySerial read data from arduino breaks when sending "(char)0"

I send some data from an arduino using pySerial.
My Data looks like
bytearray(DST, SRC, STATUS, TYPE, CHANNEL, DATA..., SIZEOFDATA)
where sizeofData is a test that all bytes are received.
The problem is, every time when a byte is zero, my python program just stops reading there:
serial_port = serial.Serial("/dev/ttyUSB0")
while serial_port.isOpen():
response_header_str = serial_port.readline()
format = '>';
format += ('B'*len(response_header_str));
response_header = struct.unpack(format, response_header_str)
pprint(response_header)
serial_port.close()
For example, when I send bytearray(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15) everything is fine. But when I send something like bytearray(1,2,3,4,0,1,2,3,4) I don't see everything beginning with the zero.
The problem is that I cannot avoid sending zeros as I am just sending the "memory dump" e.g. when I send a float value, there might be zero bytes.
how can I tell pyserial not to ignore zero bytes.
I've looked through the source of PySerial and the problem is in PySerial's implementation of FileLike.readline (in http://svn.code.sf.net/p/pyserial/code/trunk/pyserial/serial/serialutil.py). The offending function is:
def readline(self, size=None, eol=LF):
"""\
Read a line which is terminated with end-of-line (eol) character
('\n' by default) or until timeout.
"""
leneol = len(eol)
line = bytearray()
while True:
c = self.read(1)
if c:
line += c
if line[-leneol:] == eol:
break
if size is not None and len(line) >= size:
break
else:
break
return bytes(line)
With the obvious problem being the if c: line. When c == b'\x00' this evaluates to false, and the routine breaks out of the read loop. The easiest thing to do would be to reimplement this yourself as something like:
def readline(port, size=None, eol="\n"):
"""\
Read a line which is terminated with end-of-line (eol) character
('\n' by default) or until timeout.
"""
leneol = len(eol)
line = bytearray()
while True:
line += port.read(1)
if line[-leneol:] == eol:
break
if size is not None and len(line) >= size:
break
return bytes(line)
To clarify from your comments, this is a replacement for the Serial.readline method that will consume null-bytes and add them to the returned string until it hits the eol character, which we define here as "\n".
An example of using the new method, with a file-object substituted for the socket:
>>> # Create some example data terminated by a newline containing nulls.
>>> handle = open("test.dat", "wb")
>>> handle.write(b"hell\x00o, w\x00rld\n")
>>> handle.close()
>>>
>>> # Use our readline method to read it back in.
>>> handle = open("test.dat", "rb")
>>> readline(handle)
'hell\x00o, w\x00rld\n'
Hopefully this makes a little more sense.

Converting string to int in serial connection

I am trying to read a line from serial connection and convert it to int:
print arduino.readline()
length = int(arduino.readline())
but getting this error:
ValueError: invalid literal for int() with base 10: ''
I looked up this error and means that it is not possible to convert an empty string to int, but the thing is, my readline is not empty, because it prints it out.
The print statement prints it out and the next call reads the next line. You should probably do.
num = arduino.readline()
length = int(num)
Since you mentioned that the Arduino is returning C style strings, you should strip the NULL character.
num = arduino.readline()
length = int(num.strip('\0'))
Every call to readline() reads a new line, so your first statement has read a line already, next time you call readline() data is not available anymore.
Try this:
s = arduino.readline()
if len(s) != 0:
print s
length = int(s)
When you say
print arduino.readline()
you have already read the currently available line. So, the next readline might not be getting any data. You might want to store this in a variable like this
data = arduino.readline()
print data
length = int(data)
As the data seems to have null character (\0) in it, you might want to strip that like this
data = arduino.readline().rstrip('\0')
The problem is when the arduino starts to send serial data it starts by sending empty strings initially, so the pyserial picks up an empty string '' which cannot be converted to an integer. You can add a delay above serial.readline(), like this:
while True:
time.sleep(1.5)
pos = arduino.readline().rstrip().decode()
print(pos)

Error concatenate string+byte

I've the variable buffer(string) and eip(byte) and I want concatenate to buffer.
My code:
junk = "\x41" * 50 # A
eip = pack("<L", 0x0015FCC4) # false jmp register
buffer = junk + eip # Problem HERE
print(buffer)
Error:
TypeError: Can't convert 'bytes' object to str implicitly
Well, I can't convert eip to string, because if I convert eip to string with str(eip), the output is: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAb'\xc4\xfc\x15\x00'
I just want that buffer contain the hexadecimal string to use it, and for this reason I put the print (for debug).
Thank you.
The following returns 'c4fc1500'
import binascii
binascii.hexlify(eip)
Is that what you need?

Categories

Resources