I am connecting by telnet to my server, which replies with messages. Each message ends with an appended hex 00 (null character), which I cannot read. I have searched extensively but can't make it work. A simple example:
from telnetlib import Telnet
connection = Telnet('localhost', 5001)
connection.write('aa\n')
connection.read_eager()
This returns an output:
'Fail - Command aa not found.\n\r'
whereas there should be something like:
'Fail - Command aa not found.\n\r\0'
Is there any way to get this end-of-string character? Can I get the raw bytes as output if the character is being dropped on purpose?
The 00 character is there:
I ran into this same problem when trying to get data from an RS232-TCP/IP converter using telnet: telnetlib suppressed every 0x00 in the message. As Fredrik Johansson's answer explains, this is how telnetlib is implemented.
One solution is to replace the process_rawq() method of telnetlib's Telnet class with a version that doesn't eat the null characters:
import telnetlib
from telnetlib import IAC, DO, DONT, WILL, WONT, SB, SE, NOOPT
def _process_rawq(self):
    """Modified copy of telnetlib's process_rawq, needed because
    telnetlib strips 0x00 and 0x11 from the data it reads.
    """
    buf = ['', '']
    try:
        while self.rawq:
            c = self.rawq_getchar()
            if not self.iacseq:
                # The two checks below are the ones that drop the bytes;
                # they are left commented out on purpose.
                # if c == theNULL:
                #     continue
                # if c == "\021":
                #     continue
                if c != IAC:
                    buf[self.sb] = buf[self.sb] + c
                    continue
                else:
                    self.iacseq += c
            elif len(self.iacseq) == 1:
                # 'IAC: IAC CMD [OPTION only for WILL/WONT/DO/DONT]'
                if c in (DO, DONT, WILL, WONT):
                    self.iacseq += c
                    continue
                self.iacseq = ''
                if c == IAC:
                    buf[self.sb] = buf[self.sb] + c
                else:
                    if c == SB: # SB ... SE start.
                        self.sb = 1
                        self.sbdataq = ''
                    elif c == SE:
                        self.sb = 0
                        self.sbdataq = self.sbdataq + buf[1]
                        buf[1] = ''
                    if self.option_callback:
                        # Callback is supposed to look into
                        # the sbdataq
                        self.option_callback(self.sock, c, NOOPT)
                    else:
                        # We can't offer automatic processing of
                        # suboptions. Alas, we should not get any
                        # unless we did a WILL/DO before.
                        self.msg('IAC %d not recognized' % ord(c))
            elif len(self.iacseq) == 2:
                cmd = self.iacseq[1]
                self.iacseq = ''
                opt = c
                if cmd in (DO, DONT):
                    self.msg('IAC %s %d',
                        cmd == DO and 'DO' or 'DONT', ord(opt))
                    if self.option_callback:
                        self.option_callback(self.sock, cmd, opt)
                    else:
                        self.sock.sendall(IAC + WONT + opt)
                elif cmd in (WILL, WONT):
                    self.msg('IAC %s %d',
                        cmd == WILL and 'WILL' or 'WONT', ord(opt))
                    if self.option_callback:
                        self.option_callback(self.sock, cmd, opt)
                    else:
                        self.sock.sendall(IAC + DONT + opt)
    except EOFError: # raised by self.rawq_getchar()
        self.iacseq = '' # Reset on EOF
        self.sb = 0
        pass
    self.cookedq = self.cookedq + buf[0]
    self.sbdataq = self.sbdataq + buf[1]
Then override the Telnet class's method:
telnetlib.Telnet.process_rawq = _process_rawq
This solved the problem for me.
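If the server does not depend on Telnet option negotiation, another way to keep the trailing NUL may be to skip telnetlib and read from a plain socket, which returns the bytes untouched. The following is only a sketch under that assumption, written for Python 3; `socket.socketpair()` stands in for the real server and the helper name `read_message` is hypothetical:

```python
import socket

def read_message(sock, terminator=b"\x00", bufsize=4096):
    """Read from sock until the data ends with terminator.

    Assumes the server sends nothing after the terminator until the
    next command, so a message always ends exactly at a chunk boundary.
    """
    data = b""
    while not data.endswith(terminator):
        chunk = sock.recv(bufsize)
        if not chunk:          # peer closed the connection
            break
        data += chunk
    return data

# Stand-in for the real server: one end of a connected socket pair.
server, client = socket.socketpair()
server.sendall(b"Fail - Command aa not found.\n\r\x00")
reply = read_message(client)
server.close()
client.close()
print(repr(reply))
```

With a real server, `socket.create_connection(('localhost', 5001))` would take the place of the socket pair.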
This code (http://www.opensource.apple.com/source/python/python-3/python/Lib/telnetlib.py) seems to just ignore null characters. Is that really correct behavior?
def process_rawq(self):
    """Transfer from raw queue to cooked queue.
    Set self.eof when connection is closed. Don't block unless in
    the midst of an IAC sequence.
    """
    buf = ''
    try:
        while self.rawq:
            c = self.rawq_getchar()
            if c == theNULL:
                continue
            :
            :
process_rawq is then in turn called by e.g. read_until
def read_until(self, match, timeout=None):
    """Read until a given string is encountered or until timeout.
    When no match is found, return whatever is available instead,
    possibly the empty string. Raise EOFError if the connection
    is closed and no cooked data is available.
    """
    n = len(match)
    self.process_rawq()
    :
    :
I also want to receive the null character. In my particular case it marks the end of a multiline message.
So the answer seems to be that this is expected behavior as the library code is written.
FWIW https://support.microsoft.com/en-us/kb/231866 states:
Communication is established using TCP/IP and is based on a Network
Virtual Terminal (NVT). On the client, the Telnet program is
responsible for translating incoming NVT codes to codes understood by
the client's display device as well as for translating
client-generated keyboard codes into outgoing NVT codes.
The NVT uses 7-bit codes for characters. The display device, referred
to as a printer in the RFC, is only required to display the standard
printing ASCII characters represented by 7-bit codes and to recognize
and process certain control codes. The 7-bit characters are
transmitted as 8-bit bytes with the most significant bit set to zero.
An end-of-line is transmitted as a carriage return (CR) followed by a
line feed (LF). If you want to transmit an actual carriage return,
this is transmitted as a carriage return followed by a NUL (all bits
zero) character.
and
Name    Code    Decimal Value    Function
NULL    NUL     0                No operation
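The two line-ending rules quoted above (CR LF is an end-of-line, CR NUL is a bare carriage return) can be sketched as a small decoder. This is an illustration only, not what telnetlib does: the function name `decode_nvt_newlines` is hypothetical, and mapping CR LF to "\n" is a choice made for the example. A NUL that does not follow a CR is passed through unchanged:

```python
def decode_nvt_newlines(data: bytes) -> bytes:
    """Translate NVT line endings: CR LF -> LF, CR NUL -> CR."""
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i] == 0x0D and i + 1 < len(data):
            nxt = data[i + 1]
            if nxt == 0x0A:      # CR LF: end of line
                out += b"\n"
                i += 2
                continue
            if nxt == 0x00:      # CR NUL: a literal carriage return
                out += b"\r"
                i += 2
                continue
        out.append(data[i])      # everything else is copied as-is
        i += 1
    return bytes(out)

decoded = decode_nvt_newlines(b"ab\r\ncd\r\x00ef")
print(repr(decoded))
```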
I'm reading a string from a microcontroller to Raspberry Pi using Python. The string looks like this:
5050313 9
I then split this up into an MQTT topic and payload. The value to the left of the " " is the topic, and the one to the right of the " " is the payload. My code adds extra newlines to the MQTT topic. How can I avoid these newlines? I've even tried rstrip() on the payload. Here's the code:
import serial
import time
import paho.mqtt.publish as publish
def readlineCR(port):
    rv = ""
    while True:
        ch = port.read()
        rv += ch
        if ch=='\r\n' or ch=='':
            return rv
port = serial.Serial("/dev/ttyAMA0", baudrate=115200, timeout=3.0)
while True:
    rcv = port.readline()
    print(rcv)
    if len(rcv) > 4:
        mytopic, mypayload = rcv.split(" ")
        mypayload.rstrip()
        publish.single(mytopic, mypayload, hostname="localhost")
If I subscribe to that topic, I get this exactly:
pi@raspberrypi:/media/pycode $ mosquitto_sub -h localhost -t 50C51C570B00
97

98

99
There shouldn't be any extra lines between the numbers. It should just be
97
98
99
Any ideas where these new lines are coming from?
Basically, your readlineCR shouldn't return rv when it gets nothing from read. It should not return until the complete string rv ends with \r\n, and then it can return the rstripped string:
def readlineCR(port):
    rv = ""
    while True:
        ch = port.read()
        rv += ch
        if rv.endswith("\r\n"):
            return rv.rstrip()
In addition I don't see why you are checking the length of rcv - but it won't matter once rcv is a complete message.
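A variant of the same idea can be tried without hardware. This is a minimal sketch, assuming Python 3 (where a serial port's read() returns bytes); `io.BytesIO` stands in for the port, the name `read_line_crlf` is hypothetical, and returning None on an empty read is a design choice of this sketch rather than part of the answer above:

```python
import io

def read_line_crlf(port):
    """Collect bytes from port until CR LF; return the line without it."""
    buf = b""
    while not buf.endswith(b"\r\n"):
        ch = port.read(1)
        if not ch:            # timeout or end of stream: no complete line
            return None
        buf += ch
    return buf[:-2].decode("ascii")

# Stand-in for the serial port: two complete lines and one partial line.
fake_port = io.BytesIO(b"5050313 9\r\n5050313 10\r\npartial")
lines = [read_line_crlf(fake_port) for _ in range(3)]
print(lines)
```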
You didn't save the result of mypayload.rstrip() in a variable and then send that variable. Strings are immutable, so mypayload itself is not affected. Look at this example:
>>> s='\r\n97\r\n'
>>> s.strip()
'97'
>>> s
'\r\n97\r\n'
Your code should then be:
if len(rcv) > 4:
    mytopic, mypayload = rcv.split(" ")
    v=mypayload.strip()
    publish.single(mytopic, v, hostname="localhost")
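The same point can be shown in isolation. The sketch below assumes each line has the form "topic payload"; the helper name `parse_line` is hypothetical. Calling split() with no argument splits on any run of whitespace, so the trailing \r\n is dropped without a separate strip:

```python
def parse_line(line):
    """Split 'topic payload' into its two parts, ignoring surrounding whitespace."""
    parts = line.split()      # no argument: splits on any whitespace run
    if len(parts) != 2:
        return None           # blank or malformed line
    return parts[0], parts[1]

s = "9\r\n"
s.rstrip()                    # returns a new string; s itself is unchanged
stripped = s.rstrip()         # keep the result to actually use it

result = parse_line("5050313 9\r\n")
print(result)
```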
print(rcv) adds a newline. To change this to a space (for example), try this:
print(rcv, end=' ')
I am not sure what I am doing wrong here but I am trying to open a file, trace1.flow, read the header information then throw the source IP and destination IP into dictionaries. This is done in Python running on a Fedora VM. I am getting the following error:
(secs, nsecs, booted, exporter, mySourceIP, myDestinationIP) = struct.unpack('IIIIII',myBuf)
struct.error: unpack requires a string argument of length 24
Here is my code:
import struct
import socket
#Dictionaries
uniqSource = {}
uniqDestination = {}
def int2quad(i):
    z = struct.pack('!I', i)
    return socket.inet_ntoa(z)
myFile = open('trace1.flow')
myBuf = myFile.read(8)
(magic, endian, version, headerLen) = struct.unpack('HBBI', myBuf)
print "Magic: ", hex(magic), "Endian: ", endian, "Version: ", version, "Header Length: ", headerLen
myFile.read(headerLen - 8)
try:
    while(True):
        myBuf = myFile.read(24)
        (secs, nsecs, booted, exporter, mySourceIP, myDestinationIP) = struct.unpack('IIIIII',myBuf)
        mySourceIP = int2quad(mySourceIP)
        myDestinationIP = int2quad(myDestinationIP)
        if mySourceIP not in uniqSource:
            uniqSource[mySourceIP] = 1
        else:
            uniqSource[mySourceIP] += 1
        if myDestinationIP not in uniqDestination:
            uniqDestination[myDestinationIP] = 1
        else:
            uniqDestination[myDestinationIP] += 1
        myFile.read(40)
except EOFError:
    print "END OF FILE"
You seem to assume that file.read will raise EOFError on end of file, but this error is only raised by input() and raw_input(). file.read will simply return a string that's shorter than requested (possibly empty).
So you need to check the length after reading:
myBuf = myFile.read(24)
if len(myBuf) < 24:
    break
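Put together, the read loop might look like the sketch below. It assumes Python 3 and uses `io.BytesIO` with made-up records in place of trace1.flow; the helper name `read_records` is hypothetical. With a real file, opening it in binary mode ('rb') keeps the bytes intact:

```python
import io
import struct

RECORD = struct.Struct("6I")   # six unsigned ints in native byte order

def read_records(stream, skip=0):
    """Read fixed-size records until a short read signals end of file."""
    records = []
    while True:
        buf = stream.read(RECORD.size)
        if len(buf) < RECORD.size:   # short or empty read: end of file
            break
        records.append(RECORD.unpack(buf))
        stream.read(skip)            # skip the bytes we do not need
    return records

# Made-up data: two records, each followed by 40 filler bytes, then a stray tail.
sample = io.BytesIO(
    RECORD.pack(1, 2, 3, 4, 5, 6) + b"\x00" * 40 +
    RECORD.pack(7, 8, 9, 10, 11, 12) + b"\x00" * 40 +
    b"\x01\x02"
)
records = read_records(sample, skip=40)
print(records)
```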
Perhaps you have reached end-of-file. Check the length of myBuf:
len(myBuf)
It's probably less than 24 bytes long. Also, you don't need those extra parentheses, and you can specify repeated types with a count, as in 'nI':
secs, nsecs, booted, exporter, mySourceIP, myDestinationIP = struct.unpack('6I',myBuf)
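To check how many bytes a format expects before reading, struct.calcsize can be used. A short sketch; the "<" prefix (little-endian, standard sizes) is an assumption made here to get a platform-independent size and is not something the file format in the question is known to use:

```python
import struct

# A repeat count is equivalent to writing the type code out six times.
same_size = struct.calcsize("6I") == struct.calcsize("IIIIII")

# With an explicit byte order, "I" is always 4 bytes, so six of them are 24.
record_size = struct.calcsize("<6I")

# Unpacking 24 zero bytes gives six zeros.
fields = struct.unpack("<6I", bytes(record_size))
print(same_size, record_size, fields)
```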
I'm trying to capture a string from the output of a subprocess and when the subprocess asks for user input, include the user input in the string, but I can't get stdout to work.
I got the string output from stdout using a while loop, but I don't know how to terminate it after reading the string.
I tried using subprocess.check_output, but then I can't see the prompts for user input.
import subprocess
import sys
child = subprocess.Popen(["java","findTheAverage"], stdout = subprocess.PIPE, stdin = subprocess.PIPE )
string = u""
while True:
    line = str(child.stdout.read(1))
    if line != '':
        string += line[2]
        print(string)
    else:
        break
print(string)
for line in sys.stdin:
    print(line)
    child.stdin.write(bytes(line, 'utf-8'))
EDIT:
With help and code from Alfe's post, I now have a string being built from the subprocess program's output and the user's input to that program, but it's jumbled.
The string appears to get the first letter of the output first, then the user input, then the rest of the output.
Example of string muddling:
U2
3ser! please enter a double:U
4ser! please enter another double: U
5ser! please enter one final double: Your numbers were:
a = 2.0
b = 3.0
c = 4.0
average = 3.0
Is meant to be:
User! please enter a double:2
User! please enter another double: 3
User! please enter one final double: 4
Your numbers were:
a = 2.0
b = 3.0
c = 4.0
average = 3.0
Using the code:
import subprocess
import sys
import signal
import select
def signal_handler(signum, frame):
    raise Exception("Timed out!")
child = subprocess.Popen(["java","findTheAverage"], universal_newlines = True, stdout = subprocess.PIPE, stdin = subprocess.PIPE )
string = u""
stringbuf = ""
while True:
    print(child.poll())
    if child.poll() != None and not stringbuf:
        break
    signal.signal(signal.SIGALRM, signal_handler)
    signal.alarm(1)
    try:
        r, w, e = select.select([ child.stdout, sys.stdin ], [], [])
        if child.stdout in r:
            stringbuf = child.stdout.read(1)
            string += stringbuf
            print(stringbuf)
    except:
        print(string)
        print(stringbuf)
    if sys.stdin in r:
        typed = sys.stdin.read(1)
        child.stdin.write(typed)
        string += typed
FINAL EDIT:
Alright, I played around with it and got it working with this code:
import subprocess
import sys
import select
import fcntl
import os
# the string that we will return filled with tasty program output and user input #
string = ""
# the subprocess running the program #
child = subprocess.Popen(["java","findTheAverage"],bufsize = 0, universal_newlines = True, stdout = subprocess.PIPE, stdin = subprocess.PIPE )
# stuff to stop IO blocks in child.stdout and sys.stdin ## (I stole it from http://stackoverflow.com/a/8980466/2674170)
fcntl.fcntl(child.stdout.fileno(), fcntl.F_SETFL, os.O_NONBLOCK)
fcntl.fcntl(sys.stdin.fileno(), fcntl.F_SETFL, os.O_NONBLOCK)
# this is here in the unlikely event that the program has #
# finished by the time the main loop is first running #
# because if that happened the loop would end without #
# having added the program's output to the string! #
progout = ""
typedbuf = "#"
### here we have the main loop, this friendly fellah is
### going to read from the program and user, and tell
### each other what needs to be known
while True:
    ## stop when the program finishes and there is no more output
    if child.poll() != None and not progout:
        break
    # read from the user
    typed = ""
    while typedbuf:
        try:
            typedbuf = sys.stdin.read(1)
        except:
            break
        typed += typedbuf
    stringbuf = "#"
    string += typed
    child.stdin.write(typed)
    # read from the program
    progout = ""
    progoutbuf = "#"
    while progoutbuf:
        try:
            progoutbuf = child.stdout.read(1)
        except:
            typedbuf = "#"
            break
        progout += progoutbuf
    if progout:
        print(progout)
    string += progout
# the final output string #
print( string)
You need select to read from more than one source at the same time (in your case stdin and the output of the child process).
import select
string = ''
while True:
    r, w, e = select.select([ child.stdout, sys.stdin ], [], [])
    if child.stdout in r:
        string += child.stdout.read()
    if sys.stdin in r:
        typed = sys.stdin.read()
        child.stdin.write(typed)
        string += typed
You will still need to find a proper breaking condition to leave that loop. But you probably get the idea already.
I want to give a warning at this point: Processes writing into pipes typically buffer until the latest possible moment; you might not expect this because when testing the same program from the command line (in a terminal) typically only lines get buffered. This is due to performance considerations. When writing to a terminal, typically a user expects to see the output as soon as possible. When writing to a pipe, typically a reading process is happy to be given larger chunks in order to sleep longer before they arrive.
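One possible breaking condition, together with a way around the buffering issue, is sketched below. It is Unix-only (select on pipes) and rests on assumptions: a Python one-liner started with -u (unbuffered output) stands in for the Java program, and only the child's output is read, so forwarding of user input is left out:

```python
import os
import select
import subprocess
import sys

# Stand-in child: a Python one-liner instead of the Java program.
# "-u" turns off output buffering in the child so its text arrives promptly.
child = subprocess.Popen(
    [sys.executable, "-u", "-c",
     "print('User! please enter a double:'); print('done')"],
    stdout=subprocess.PIPE,
)

captured = b""
fd = child.stdout.fileno()
while True:
    readable, _, _ = select.select([fd], [], [], 1.0)
    if fd in readable:
        chunk = os.read(fd, 1024)
        if not chunk:          # empty read: the child closed its stdout
            break
        captured += chunk
    elif child.poll() is not None:
        break                  # nothing to read and the child has exited
child.wait()
child.stdout.close()
text = captured.decode()
print(text)
```

An empty read from the pipe means end of file, which is the signal to leave the loop; checking poll() alone can miss output that is still sitting in the pipe.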
I'm receiving the following message through TCP:
{"message": "Start", "client": "134.106.74.21", "type": 1009}<EOM>
but when I'm trying to partition that
msg.partition( "<EOM>" )
I'm getting the following array:
('{\x00\x00\x00"\x00\x00\x00m\x00\x00\x00e\x00\x00\x00s\x00\x00\x00s\x00\x00\x00a\x00
\x00\x00g\x00\x00\x00e\x00\x00\x00"\x00\x00\x00:\x00\x00\x00 \x00\x00\x00"\x00\x00\x00#
\x00\x00\x00B\x00\x00\x00E\x00\x00\x00G\x00\x00\x00I\x00\x00\x00N\x00\x00\x00;\x00\x00
\x00A\x00\x00\x00l\x00\x00\x00l\x00\x00\x00;\x00\x00\x000\x00\x00\x00;\x00\x00\x001\x00\x00
\x00;\x00\x00\x000\x00\x00\x00;\x00\x00\x001\x00\x00\x003\x00\x00\x004\x00\x00\x00.\x00\x00
\x001\x00\x00\x000\x00\x00\x006\x00\x00\x00.\x00\x00\x007\x00\x00\x004\x00\x00\x00.\x00\x00
\x001\x00\x00\x002\x00\x00\x005\x00\x00\x00:\x00\x00\x003\x00\x00\x000\x00\x00\x000\x00\x00
\x000\x00\x00\x000\x00\x00\x00;\x00\x00\x00#\x00\x00\x00E\x00\x00\x00N\x00\x00\x00D\x00\x00
\x00"\x00\x00\x00,\x00\x00\x00 \x00\x00\x00"\x00\x00\x00c\x00\x00\x00l\x00\x00\x00i\x00\x00
\x00e\x00\x00\x00n\x00\x00\x00t\x00\x00\x00"\x00\x00\x00:\x00\x00\x00 \x00\x00\x00"\x00
\x00\x001\x00\x00\x003\x00\x00\x004\x00\x00\x00.\x00\x00\x001\x00\x00\x000\x00\x00\x006
\x00\x00\x00.\x00\x00\x007\x00\x00\x004\x00\x00\x00.\x00\x00\x001\x00\x00\x002\x00\x00
\x005\x00\x00\x00"\x00\x00\x00,\x00\x00\x00 \x00\x00\x00"\x00\x00\x00t\x00\x00\x00y\x00
\x00\x00p\x00\x00\x00e\x00\x00\x00"\x00\x00\x00:\x00\x00\x00 \x00\x00\x002\x00\x00\x000
\x00\x00\x000\x00\x00\x005\x00\x00\x00}\x00\x00\x00<\x00\x00\x00E\x00\x00\x00O\x00\x00\x00M
\x00\x00\x00>\x00\x00\x00{"message": "Start", "client": "134.106.74.21", "type": 1009}',
'', '')
Updated
try:
    #Check if there are messages; if there are none an exception is thrown, otherwise continue
    ans = self.request.recv( 20480 )
    if( ans ):
        recv = self.getMessage( recv + ans )
    else:
        #Master client disconnected
        break
except:
    ...
def getMessage( self, msg ):
    print( "masg:" + msg );
    aSplit = msg.partition( "<EOM>" )
    while( aSplit[ 1 ] == "<EOM>" ):
        self.recvMessageHandler( json.loads( aSplit[ 0 ] ) )
        #Get the next message, if any
        msg = aSplit[ 3 ]
        aSplit = msg.partition( "<EOM>" )
    return msg;
The problem has occurred when I'm trying to add two strings.
recv + ans
If you print msg.encode("hex") then you will likely see that this is exactly what is in the string.
In any case, you may have noticed that every 4th byte of the result is one of the characters that you expected. This suggests that you have a UCS4 Unicode string that you are not handling properly.
Did you receive UCS4-encoded bytes? If so, you should decode them into a unicode string before partitioning, for example stuff.decode("utf-32-le") for the little-endian layout shown in your output. But if you are receiving UCS4-encoded bytes and you have any influence over the sender, you really should get things changed to transmit and receive UTF-8 encoded strings, since that is more usual over network connections.
Are you sure that the 5 literal bytes < E O M > are indeed the delimiter that you need to use for partitioning? Or is it supposed to be the single-byte ASCII code named EOM? Or is it a UCS4-encoded u"<EOM>"?
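The effect can be reproduced in a few lines. This sketch assumes Python 3 and that the sender really uses little-endian UCS4 (UTF-32) without a byte order mark, as the '{\x00\x00\x00' pattern in the question suggests; the message content is shortened for the example:

```python
import json

# Encode a short message the way the sender appears to: 4 bytes per character.
raw = '{"type": 1009}<EOM>'.encode("utf-32-le")
first_four = raw[:4]                   # b'{\x00\x00\x00', as in the question

# Decode to text first, then partition on the delimiter.
text = raw.decode("utf-32-le")
head, sep, rest = text.partition("<EOM>")
message = json.loads(head)
print(first_four, head, sep, repr(rest))
```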
I really need python regexp which would give me this information:
Data:
Received from 1.1.1.1 18:41:51:330
(123 bytes):
INVITE: sip:dsafsdf#fsdafas.com To:
sdfasdfasdfas From: "test"
Via:
sdafsdfasdfasd
Sent from 1.1.1.1 18:42:51:330
(123 bytes):
INVITE: sip:dsafsdf#fsdafas.com
From: "test"
To:
sdfasdfasdfas Via:
sdafsdfasdfasd
Received from 1.1.1.1 18:50:51:330
(123 bytes):
INVITE: sip:dsafsdf#fsdafas.com
Via: sdafsdfasdfasd
From: "test"
To:
sdfasdfasdfas
What I need to achieve, is to find the newest INVITE that was "Received" in order to get From: header value. So searching the data backwards.
Is it possible with a single regexp? :)
Thanks.
One-line answer, assuming you suck the entire header into a string with embedded newlines (or cr/nl's):
sorted(re.findall("Received [^\r\n]+ (\d{2}:\d{2}:\d{2}:\d{3})[^\"]+From: \"([^\r\n]+)\"", data))[-1][1]
The trick to doing it with one RE is using [^\r\n] instead of . when you want to scan over stuff. This works assuming from string always has the double quotes. The double quotes are used to keep the scanner from swallowing the entire string at the first Received... ;)
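Here is that one-liner run against some sample text. The sample is made up for the example (one header per line, with placeholder addresses), so it only shows the mechanics. Sorting the (timestamp, from) tuples as strings puts the latest time last, which works as long as all entries are from the same day:

```python
import re

# Made-up log text in the same shape as the question's data.
data = (
    'Received from 1.1.1.1 18:41:51:330 (123 bytes):\n'
    'INVITE: sip:user@example.com\n'
    'To: x\n'
    'From: "first"\n'
    'Via: y\n'
    '\n'
    'Sent from 1.1.1.1 18:42:51:330 (123 bytes):\n'
    'INVITE: sip:user@example.com\n'
    'From: "sent"\n'
    '\n'
    'Received from 1.1.1.1 18:50:51:330 (123 bytes):\n'
    'INVITE: sip:user@example.com\n'
    'Via: y\n'
    'From: "latest"\n'
    'To: x\n'
)

pattern = re.compile(
    r'Received [^\r\n]+ (\d{2}:\d{2}:\d{2}:\d{3})[^"]+From: "([^\r\n]+)"'
)
matches = pattern.findall(data)        # list of (timestamp, from) tuples
latest_from = sorted(matches)[-1][1]   # From value of the newest Received block
print(matches, latest_from)
```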
I do not think a single regular expression is the answer. I think a stateful line-by-line matcher is what you're looking for here.
import re
import collections
_msg_start_re = re.compile(r'^(Received|Sent)\s+from\s+(\S.*):\s*$')
_msg_field_re = re.compile(r'^([A-Za-z](?:(?:\w|-)+)):\s+(\S(?:.*\S)?)\s*$')
def message_parser():
    hdr = None
    fields = collections.defaultdict(list)
    msg = None
    while True:
        if msg is not None:
            # A complete message was built: hand it out and reset the state
            line = (yield msg)
            msg = None
            hdr = None
            fields = collections.defaultdict(list)
        else:
            line = (yield None)
        if hdr is None:
            hdr_match = _msg_start_re.match(line)
            hdr = None if hdr_match is None else hdr_match.groups()
        elif len(fields) <= 0:
            field_match = _msg_field_re.match(line)
            if field_match is not None:
                fields[field_match.group(1)].append(field_match.group(2))
        else: # Waiting for the end of the message
            if line.strip() == '':
                msg = (hdr, dict(fields))
            else:
                field_match = _msg_field_re.match(line)
                # Skip lines that do not look like "Name: value"
                if field_match is not None:
                    fields[field_match.group(1)].append(field_match.group(2))
Example of use:
parser = message_parser()
next(parser)
recvd_invites = [msg for msg in (parser.send(line) for line in linelst) \
                 if (msg is not None) and \
                    (msg[0][0] == 'Received') and \
                    ('INVITE' in msg[1])]
You might be able to do this with a multiple line regex, but if you do it this way you get the message nicely parsed into its various fields. Presumably you want to do something interesting with the messages, and this will let you do a whole bunch more with them without having to use more regexps.
This also allows you to parse something other than an already existing file or a giant string with all the messages in it. For example, if you want to parse the output of a pipe that's printing out these requests as they happen, you can simply do msg = parser.send(line) every time you receive a line and get a new message out as soon as it has all been printed (if the line isn't the end of a message then msg will be None).