Python, read uart and post to MQTT, has extra spaces - python

I'm reading a string from a microcontroller to Raspberry Pi using Python. The string looks like this:
5050313 9
I then split this up into MQTT topic and payload. The value left of the " " is the topic, and the one right of " " is the payload. My code adds extra new lines to the MQTT topic. How can I avoid these new lines? I've even try rstrip() on the payload. Here's the code:
import serial
import time
import paho.mqtt.publish as publish
def readlineCR(port):
rv = ""
while True:
ch = port.read()
rv += ch
if ch=='\r\n' or ch=='':
return rv
port = serial.Serial("/dev/ttyAMA0", baudrate=115200, timeout=3.0)
while True:
rcv = port.readline()
print(rcv)
if len(rcv) > 4:
mytopic, mypayload = rcv.split(" ")
mypayload.rstrip()
publish.single(mytopic, mypayload, hostname="localhost")
If I subscribe to that topic, I get this exactly:
pi#raspberrypi:/media/pycode $ mosquitto_sub -h localhost -t
50C51C570B00
97
98
99
There shouldn't be any extra lines between the numbers. It should just be
97
98
99
Any ideas where these new lines are coming from?

Basically, your readlineCR shouldn't be returning rv when it gets nothing from read - it needs to never return until the complete string rv string ends with \r\n, and then it can return the rstripped string:
def readlineCR(port):
rv = ""
while True:
ch = port.read()
rv += ch
if rv.endswith("\r\n"):
return rv.rstrip()
In addition I don't see why you are checking the length of rcv - but it won't matter once rcv is a complete message.

you did'nt save result of mypayload.rstrip() in a variable then send this variable i.e (mpayload not affected) look at this example:
>>> s='\r\n97\r\n'
>>> s.strip()
'97'
>>> s
'\r\n97\r\n'
the your code should be:
if len(rcv) > 4:
mytopic, mypayload = rcv.split(" ")
v=mypayload.strip()
publish.single(mytopic, v, hostname="localhost")

print(rcv) adds a newline. To change this to a space (for example), try this:
print(rcv, end=' ')

Related

Index Error: list index out of range in python

I have project in internet security class. My partner started the project and wrote some python code and i have to continue from where he stopped. But i don't know python and i was planning to learn by running his code and checking how it works. however when i am executing his code i get an error which is "IndexError: list index out of range".
import os
# Deauthenticate devices
os.system("python2 ~/Downloads/de_auth.py -s 00:22:b0:07:58:d4 -d & sleep 30; kill $!")
# renew DHCP on linux "sudo dhclient -v -r & sudo dhclient -v"
# Capture DHCP Packet
os.system("tcpdump -lenx -s 1500 port bootps or port bootpc -v > dhcp.txt & sleep 20; kill $!")
# read packet txt file
DHCP_Packet = open("dhcp.txt", "r")
# Get info from txt file of saved packet
line1 = DHCP_Packet.readline()
line1 = line1.split()
sourceMAC = line1[1]
destMAC = line1[3]
TTL = line1[12]
length = line1[8]
#Parse packet
line = DHCP_Packet.readline()
while "0x0100" not in line:
line = DHCP_Packet.readline()
packet = line + DHCP_Packet.read()
packet = packet.replace("0x0100:", "")
packet = packet.replace("0x0110:", "")
packet = packet.replace("0x0120:", "")
packet = packet.replace("0x0130:", "")
packet = packet.replace("0x0140:", "")
packet = packet.replace("0x0150:", "")
packet = packet.replace("\n", "")
packet = packet.replace(" ", "")
packet = packet.replace(" ", "")
packet = packet.replace("000000000000000063825363", "")
# Locate option (55) = 0x0037
option = "0"
i=0
length = 0
while option != "37":
option = packet[i:i+2]
hex_length = packet[i+2:i+4]
length = int(packet[i+2:i+4], 16)
i = i+ length*2 + 4
i = i - int(hex_length, 16)*2
print "Option (55): " + packet[i:i+length*2 ] + "\nLength: " + str(length) + " Bytes"
print "Source MAC: " + sourceMAC
Thank you a lot
The index error probably means you have an empty or undefined section (index) in your lists. It's most likely in the loop condition at the bottom:
while option != "37":
option = packet[i:i+2]
hex_length = packet[i+2:i+4]
length = int(packet[i+2:i+4], 16)
i = i+ length*2 + 4
Alternatively, it could be earlier in reading your text file:
# Get info from txt file of saved packet
line1 = DHCP_Packet.readline()
line1 = line1.split()
sourceMAC = line1[1]
destMAC = line1[3]
TTL = line1[12]
length = line1[8]
Try actually opening the text file and make sure all the lines are referred to correctly.
If you're new to coding and not used to understanding error messages or using a debugger yet, one way to find the problem area is including print ('okay') between lines in the code, moving it down progressively until the line no longer prints.
I'm pretty new to python as well, but I find it easier to learn by writing your own code and googling what you want to achieve (especially when a partner leaves you code like that...). This website provides documentation on in-built commands (choose your version at the top): https://docs.python.org/3.4/contents.html,
and this website contains more in-depth tutorials for common functions: http://www.tutorialspoint.com/python/index.htm
I think the variable line1 that being split does not have as much as 13 numbers,so you will get error when executing statement TTL = line1[12].
Maybe you do not have the same environment as your partner worked with ,so the result you get(file dhcp.txt) by executing os.system("") maybe null(or with a bad format).
You should check the content of the file dhcp.txt or add statement print line1 after line1 = DHCP_Packet.readline() to check if it has a correct format.

unpack requires a string argument of length 24

I am not sure what I am doing wrong here but I am trying to open a file, trace1.flow, read the header information then throw the source IP and destination IP into dictionaries. This is done in Python running on a Fedora VM. I am getting the following error:
(secs, nsecs, booted, exporter, mySourceIP, myDestinationIP) = struct.unpack('IIIIII',myBuf)
struct.error: unpack requires a string argument of length 24
Here is my code:
import struct
import socket
#Dictionaries
uniqSource = {}
uniqDestination = {}
def int2quad(i):
z = struct.pack('!I', i)
return socket.inet_ntoa(z)
myFile = open('trace1.flow')
myBuf = myFile.read(8)
(magic, endian, version, headerLen) = struct.unpack('HBBI', myBuf)
print "Magic: ", hex(magic), "Endian: ", endian, "Version: ", version, "Header Length: ", headerLen
myFile.read(headerLen - 8)
try:
while(True):
myBuf = myFile.read(24)
(secs, nsecs, booted, exporter, mySourceIP, myDestinationIP) = struct.unpack('IIIIII',myBuf)
mySourceIP = int2quad(mySourceIP)
myDestinationIP = int2quad(myDestinationIP)
if mySourceIP not in uniqSource:
uniqSource[mySourceIP] = 1
else:
uniqSource[mySourceIP] += 1
if myDestinationIP not in uniqDestination:
uniqDestination[myDestinationIP] = 1
else:
uniqDestination[myDestinationIP] += 1
myFile.read(40)
except EOFError:
print "END OF FILE"
You seem to assume that file.read will raise EOFError on end of file, but this error is only raised by input() and raw_input(). file.read will simply return a string that's shorter than requested (possibly empty).
So you need to check the length after reading:
myBuf = myFile.read(24)
if len(myBuf) < 24:
break
Perhaps your have reached end-of-file. Check the length of myBuf:
len(myBuf)
It's probably less than 24 chars long. Also you don't need those extra parenthesis, and try to specify duplicated types using 'nI' like this:
secs, nsecs, booted, exporter, mySourceIP, myDestinationIP = struct.unpack('6I',myBuf)

Python reading until null character from Telnet

I am telneting to my server, which answers to me with messages and at the end of each message is appended hex00 (null character) which cannot be read. I tried searching through and through, but can't seem to make it work, a simple example:
from telnetlib import Telnet
connection = Telnet('localhost', 5001)
connection.write('aa\n')
connection.read_eager()
This returns an output:
'Fail - Command aa not found.\n\r'
whereas there should be sth like:
'Fail - Command aa not found.\n\r\0'
Is there any way to get this end of string character? Can I get bytes as an output if the character is missed on purpose?
The 00 character is there:
I stumbled in this same problem when trying to get data from an RS232-TCP/IP Converter using telnet - the telnetlib would suppress every 0x00 from the message. As Fredrik Johansson well answered, it is the way telnetlib was implemented.
One solution would be to override the process_rawq() function from telnetlib's Telnet class that doesn't eat all the null characters:
import telnetlib
from telnetlib import IAC, DO, DONT, WILL, WONT, SE, NOOPT
def _process_rawq(self):
"""Alteração da implementação desta função necessária pois telnetlib suprime 0x00 e \021 dos dados lidos
"""
buf = ['', '']
try:
while self.rawq:
c = self.rawq_getchar()
if not self.iacseq:
# if c == theNULL:
# continue
# if c == "\021":
# continue
if c != IAC:
buf[self.sb] = buf[self.sb] + c
continue
else:
self.iacseq += c
elif len(self.iacseq) == 1:
# 'IAC: IAC CMD [OPTION only for WILL/WONT/DO/DONT]'
if c in (DO, DONT, WILL, WONT):
self.iacseq += c
continue
self.iacseq = ''
if c == IAC:
buf[self.sb] = buf[self.sb] + c
else:
if c == SB: # SB ... SE start.
self.sb = 1
self.sbdataq = ''
elif c == SE:
self.sb = 0
self.sbdataq = self.sbdataq + buf[1]
buf[1] = ''
if self.option_callback:
# Callback is supposed to look into
# the sbdataq
self.option_callback(self.sock, c, NOOPT)
else:
# We can't offer automatic processing of
# suboptions. Alas, we should not get any
# unless we did a WILL/DO before.
self.msg('IAC %d not recognized' % ord(c))
elif len(self.iacseq) == 2:
cmd = self.iacseq[1]
self.iacseq = ''
opt = c
if cmd in (DO, DONT):
self.msg('IAC %s %d',
cmd == DO and 'DO' or 'DONT', ord(opt))
if self.option_callback:
self.option_callback(self.sock, cmd, opt)
else:
self.sock.sendall(IAC + WONT + opt)
elif cmd in (WILL, WONT):
self.msg('IAC %s %d',
cmd == WILL and 'WILL' or 'WONT', ord(opt))
if self.option_callback:
self.option_callback(self.sock, cmd, opt)
else:
self.sock.sendall(IAC + DONT + opt)
except EOFError: # raised by self.rawq_getchar()
self.iacseq = '' # Reset on EOF
self.sb = 0
pass
self.cookedq = self.cookedq + buf[0]
self.sbdataq = self.sbdataq + buf[1]
telnetlib.Telnet.process_rawq = _process_rawq
then override the Telnet class' method:
telnetlib.Telnet.process_rawq = _process_rawq
This solved the problem for me.
This code (http://www.opensource.apple.com/source/python/python-3/python/Lib/telnetlib.py) seems to just ignore null characters. Is that really correct behavior?
def process_rawq(self):
"""Transfer from raw queue to cooked queue.
Set self.eof when connection is closed. Don't block unless in
the midst of an IAC sequence.
"""
buf = ''
try:
while self.rawq:
c = self.rawq_getchar()
if c == theNULL:
continue
:
:
process_rawq is then in turn called by e.g. read_until
def read_until(self, match, timeout=None):
"""Read until a given string is encountered or until timeout.
When no match is found, return whatever is available instead,
possibly the empty string. Raise EOFError if the connection
is closed and no cooked data is available.
"""
n = len(match)
self.process_rawq()
:
:
I also want to receive the null character. In my particular case it marks the end of a multiline message.
So the answer seems to be that this is expected behavior as the library code is written.
FWIW https://support.microsoft.com/en-us/kb/231866 states:
Communication is established using TCP/IP and is based on a Network
Virtual Terminal (NVT). On the client, the Telnet program is
responsible for translating incoming NVT codes to codes understood by
the client's display device as well as for translating
client-generated keyboard codes into outgoing NVT codes.
The NVT uses 7-bit codes for characters. The display device, referred
to as a printer in the RFC, is only required to display the standard
printing ASCII characters represented by 7-bit codes and to recognize
and process certain control codes. The 7-bit characters are
transmitted as 8-bit bytes with the most significant bit set to zero.
An end-of-line is transmitted as a carriage return (CR) followed by a
line feed (LF). If you want to transmit an actual carriage return,
this is transmitted as a carriage return followed by a NUL (all bits
zero) character.
and
Name Code Decimal Value
Function NULL NUL 0 No operation

Parsing chat messages as config

I'm trying write a function that would be able to parse out a file with defined messages for a set of replies but am at loss on how to do so.
For example the config file would look:
[Message 1]
1: Hey
How are you?
2: Good, today is a good day.
3: What do you have planned?
Anything special?
4: I am busy working, so nothing in particular.
My calendar is full.
Each new line without a number preceding it is considered part of the reply, just another message in the conversation without waiting for a response.
Thanks
Edit: The config file will contain multiple messages and I would like to have the ability to randomly select from them all. Maybe store each reply from a conversation as a list, then the replies with extra messages can carry the newline then just split them by the newline. I'm not really sure what would be the best operation.
Update:
I've got for the most part this coded up so far:
def parseMessages(filename):
messages = {}
begin_message = lambda x: re.match(r'^(\d)\: (.+)', x)
with open(filename) as f:
for line in f:
m = re.match(r'^\[(.+)\]$', line)
if m:
index = m.group(1)
elif begin_message(line):
begin = begin_message(line).group(2)
else:
cont = line.strip()
else:
# ??
return messages
But now I am stuck on being able to store them into the dict the way I'd like..
How would I get this to store a dict like:
{'Message 1':
{'1': 'How are you?\nHow are you?',
'2': 'Good, today is a good day.',
'3': 'What do you have planned?\nAnything special?',
'4': 'I am busy working, so nothing in particular.\nMy calendar is full'
}
}
Or if anyone has a better idea, I'm open for suggestions.
Once again, thanks.
Update Two
Here is my final code:
import re
def parseMessages(filename):
all_messages = {}
num = None
begin_message = lambda x: re.match(r'^(\d)\: (.+)', x)
with open(filename) as f:
messages = {}
message = []
for line in f:
m = re.match(r'^\[(.+)\]$', line)
if m:
index = m.group(1)
elif begin_message(line):
if num:
messages.update({num: '\n'.join(message)})
all_messages.update({index: messages})
del message[:]
num = int(begin_message(line).group(1))
begin = begin_message(line).group(2)
message.append(begin)
else:
cont = line.strip()
if cont:
message.append(cont)
return all_messages
Doesn't sound too difficult. Almost-Python pseudocode:
for line in configFile:
strip comments from line
if line looks like a section separator:
section = matched section
elsif line looks like the beginning of a reply:
append line to replies[section]
else:
append line to last reply in replies[section][-1]
You may want to use the re module for the "looks like" operation. :)
If you have a relatively small number of strings, why not just supply them as string literals in a dict?
{'How are you?' : 'Good, today is a good day.'}

Is there a good regular expression for multiline matching of received SIP invites?

I really need python regexp which would give me this information:
Data:
Received from 1.1.1.1 18:41:51:330
(123 bytes):
INVITE: sip:dsafsdf#fsdafas.com To:
sdfasdfasdfas From: "test"
Via:
sdafsdfasdfasd
Sent from 1.1.1.1 18:42:51:330
(123 bytes):
INVITE: sip:dsafsdf#fsdafas.com
From: "test"
To:
sdfasdfasdfas Via:
sdafsdfasdfasd
Received from 1.1.1.1 18:50:51:330
(123 bytes):
INVITE: sip:dsafsdf#fsdafas.com
Via: sdafsdfasdfasd
From: "test"
To:
sdfasdfasdfas
What I need to achieve, is to find the newest INVITE that was "Received" in order to get From: header value. So searching the data backwards.
Is it possible with unique regexp ? :)
Thanks.
One-line answer, assuming you suck the entire header into a string with embedded newlines (or cr/nl's):
sorted(re.findall("Received [^\r\n]+ (\d{2}:\d{2}:\d{2}:\d{3})[^\"]+From: \"([^\r\n]+)\"", data))[-1][1]
The trick to doing it with one RE is using [^\r\n] instead of . when you want to scan over stuff. This works assuming from string always has the double quotes. The double quotes are used to keep the scanner from swallowing the entire string at the first Received... ;)
I do not think a single regular expression is the answer. I think a stateful line-by-line matcher is what you're looking for here.
import re
import collections
_msg_start_re = re.compile('^(Received|Sent)\s+from\s+(\S.*):\s*$')
_msg_field_re = re.compile('^([A-Za-z](?:(?:\w|-)+)):\s+(\S(?:.*\S)?)\s*$')
def message_parser():
hdr = None
fields = collections.defaultdict(list)
msg = None
while True:
if msg is not None:
line = (yield msg)
msg = None
hdr = None
fields = collections.defaultdict(list)
else:
line = (yield None)
if hdr is None:
hdr_match = _msg_start_re.match(line)
hdr = None if hdr_match is None else hdr_match.groups()
elif len(fields) <= 0:
field_match = _msg_field_re.match(line)
if field_match is not None:
fields[field_match.group(1)].append(field_match.group(2))
else: # Waiting for the end of the message
if line.strip() == '':
msg = (hdr, dict(fields))
else:
field_match = _msg_field_re.match(line)
fields[field_match.group(1)].append(field_match.group(2))
Example of use:
parser = msg_parser()
parser.next()
recvd_invites = [msg for msg in (parser.send(line) for line in linelst) \
if (msg is not None) and \
(msg[0][0] == 'Received') and \
('INVITE' in msg[1])]
You might be able to do this with a multiple line regex, but if you do it this way you get the message nicely parsed into its various fields. Presumably you want to do something interesting with the messages, and this will let you do a whole bunch more with them without having to use more regexps.
This also allows you to parse something other than an already existing file or a giant string with all the messages in it. For example, if you want to parse the output of a pipe that's printing out these requests as they happen you can simply do msg = parser.send(line) every time you receive a line and get a new message out as soon as its all been printed (if the line isn't the end of a message then msg will be None).

Categories

Resources