Implementing "Simplest Protocol" pseudo-code algorithm in python

Implementing "Simplest Protocol" pseudo-code algorithm in python - python

Using two different computers, I have to implement sender and receiver algorithms to send and receive frames. I'm a strong programmer, but relatively new to network programming and python. The algorithms are below.
Sender-site algorithm:
while(true)
{
WaitForEvent();
if(Event(RequestToSend))
{
GetData();
MakeFrame();
SendFrame();
}
Receiver-site algorithm:
while(true)
{
WaitForEvent();
if(Event(ArrivalNotification))
{
ReceiveFrame();
ExtractData();
DeliverData();
}
I have to implement these algorithms on two separate computers, one as a sender and the other as a receiver. I have no idea where to start or look for examples. I've done some research with little luck. If someone can supply example code or a good article on implementing this, that would be lots of help.

I find myself using Python's Socket Server examples. This will get you going on the SendFrame(), ReceiveFrame(), and DeliverData() routines.
The MakeFrame() and ExtractData(), will vary widely based on how much data you're needing to send back and forth. I will try to dig up some good examples I have used in the past.
If you're looking for a 1 stop solution, I would suggest looking into Twisted. There is a definite learning curve to it, but it might be worth it to you. Note that if you're wanting to package the Python code into an exe using pyInstaller or py2exe, Twisted may give you trouble based on a number of threads I have read.
So after looking back through my notes, the framing aspect was a sore subject for me as I could not find any good examples to help. I instead wrote one from scratch and have (and still am) tweaking it.
As you read up on socket programming you will surely see that just because you send all the data (socket.sendall()) does not mean that you will receive it all after the first socket.recv(). This adds some complexity to the message framing question. Due to lack of examples on the web, below I have a stripped down version of what I am using right now in several processes.
Update
So after further testing the under heavy / bursting I moved away from the regex and process the stream character by character which has greatly improved its performance.
SendFrame(), ReceiveFrame(), ExtractData(), DeliverData() Examples:
MESSAGE_FRAME_START = '#'
MESSAGE_FRAME_END = '#'
def process_raw_socket_message_stream(raw_message_stream):
message_list = []
cmd = ''
last_footer_idx = message_string.rfind(MESSAGE_FRAME_END)
cmd_str_len = len(message_string)
byte_cnt = 0
while (byte_cnt <= last_footer_idx):
cmd_chr = message_string[byte_cnt]
cmd += cmd_chr
if cmd_chr == MESSAGE_FRAME_START:
cmd = MESSAGE_FRAME_START
elif cmd_chr == MESSAGE_FRAME_END:
message_list.append(cmd)
byte_cnt += 1
# Remove the parsed data
if last_footer_idx > 0:
message_string = message_string[last_footer_idx+1:]
return message_list, message_string
def add_message_frames(unframed_message):
return MESSAGE_FRAME_START + unframed_message + MESSAGE_FRAME_END
def remove_message_frames(framed_message):
clean_message = framed_message.lstrip(MESSAGE_FRAME_START)
clean_message = clean_message.rstrip(MESSAGE_FRAME_END)
return clean_message
def process_messsage(clean_message):
# Do what needs to be done
pass
def send_data(mysocket, payload):
framed_payload = add_message_frames(payload)
mysocket.sendall(framed_payload)
def receive_data(mysocket, byte_size=1024):
data = ''
while(1):
try: # Wait for data
data += mysocket.recv(byte_size)
if(data != '') and (data != None):
# Decode all messsages
message_list, remaining_data = process_raw_socket_message_stream(data)
# Process all of the messages
for messsage in message_list:
process_messsage(remove_message_frames(message))
# Store the remaining data
data = remaining_data
except:
print "Unexpected Error"

Related

How to navigate function documentation for a parameter that can be any type (python)

I have a process that does many things in python (scrapes data, reads csvs, preprocesses data, loads a model and scores data, pushes data to a database, et. al.). I want to time specific parts of the process separately for monitoring reasons. Before I knew it, my script looked something like this:
import time
import pandas
print('Doing foo...')
foo_start = time.time
foo = pd.read_csv('data.csv')
foo_end = time.time()
foo_delta = round(foo_end - foo_start)
print(f'Complete! ({foo_delta} seconds)')
print('Doing bar...')
bar_start = time.time
bar = 1+1 # (or some other operation)
bar_end = time.time()
bar_delta = round(bar_end - bar_start)
print(f'Complete! ({bar_delta} seconds)')
...
and so on. In the spirit of DRY (Don't Repeat Yourself), I figured I could reduce number of lines by making this a function
timedProcess(operation, msg):
print(msg)
operation_start = time.time()
var = operation
operation_end = time.time()
operation_delta = round(operation_end - operation_star)
print(f'Complete! ({operation_delta} sec)')
return operation_delta, var
And this works EDIT: This does not work. The operation is performed before the function is called. The operation runs, but the time is always 0.
load_time, data = timedProcess(
operation = pd.read_csv('data.csv'),
msg = 'Reading data...'
)
And the time of the process is returned along with the pandas dataframe of the data.
My question regards documentation. How do I document this function?
"""
This function times any given operation.
Parameters:
operation (*what goes here?*): Operation to be timed
msg (str): Message describing operation
Returns:
operation_delta (int): Time of operation in seconds
var (*what goes here?*): Outcome of operation
"""
Since operation can really be any type, I am not sure how to go about documenting this. The codebase won't really ever change and I will be the only one to ever run it since this is just a personal project of mine, but I really want to get into the good habit of well documented code.
So my first question is to how to navigate the documentation, and my second question would be the next level of that - how to go about this using type hinting should I so choose eventually.
My third question is, is this even good practice? Should I ignore DRY in this scenario, and go back to what I have in my original example? Thanks!
NOTE: I know this question may be opinion based, but I am looking for the most agreed upon or accepted solution to this type of problem so I can go forward doing it the most pythonic way.
EDIT(S): Grammar

How to make multiple python programs communicate in this situation?

I'm a bit new at Python and I am working on a robotics project. The short form of my question is that I am trying to find the best way (for my situation) to run multiple python programs at once.
A little bit of context, my robot is a platform for a service robot that is capable of following markers and paths using image algorithms and also receive commands from a remote computer. I want to have separate programs for the image processing, the driving, and so on, and then manage all of them through a main program. I know I can't use anything basic like functions or classes, because each of these processes must be looping continuously, and I don't want to combine all the code to run in a single while loop, because it runs very slowly and it is significantly harder to manage.
So, in short, how do I make two separate, looping programs "talk"? Like I want the imaging program to send information about what it sees to the driving and steering program, etc.
I did some research and I found some information on multithreading and API's and stuff like that, though I can't really tell which one is actually the thing I'm looking for.
To clarify, I just need to be pointed in the right direction. This doesn't seem like a very high-level thing, and I know there are definitely tutorials out there, I'm just really confused as to where to start as I am teaching myself this as I go.

After some sniffing around, I found that using IPC was a good solution. The process I used wasn't too difficult, I just made some very simple server and client classes and had them communicate over the Localhost IP. There's undoubtedly a better way to do this, but for a beginner like myself, it was a simple way to make two programs talk without modifying code too much. For those who are trying to do a similar thing as I did, here's the classes I made for myself. Fair warning, they're not exactly pristine or even very complex, but they got the job done.
Here's the class I made for the server:
import socket
from random import random
from time import sleep
class ServerObject:
def __init__(self,host_address,port):
self._host_address = host_address
self._s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
self._s.bind((self._host_address,port))
def handshake(self):
print "Server Started. Awaiting Connection"
while True:
_data, _addr = self._s.recvfrom(1024)
if str(self._s.recvfrom(1024)[0]) == 'marco':
break
print 'marco recieved. sending polo...'
while True:
self._s.sendto('polo',_addr)
if str(self._s.recvfrom(1024)[0]) == 'confirm':
break
sleep(.5)
print 'connection verified'
self._addr = _addr
return True
def send(self,data):
self._s.sendto(str(data),self._addr)
def recieve(self,mode = 0):
_data, _addr = self._s.recvfrom(1024)
if mode == 0:
return str(_data)
if mode == 1:
return int(_data)
if mode == 2:
return float(_data)
if mode == 3:
return tuple(_data)
def change_port(self,port):
self._s.bind((self._host_address,port))
def close(self):
self._s.close()
print '_socket closed_'
if __name__ == '__main__':
host = '127.0.0.1'
talk = ServerObject(host,6003)
talk.handshake()
And here's the class I made for the client:
import socket
from time import sleep
class ClientObject:
def __init__(self,host_address,server_port,port = 0):
self._server = (host_address,server_port)
self._s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
self._s.bind((host_address,port))
def handshake(self):
print ' sending marco'
self._s.sendto('marco',self._server)
sleep(.1)
self._s.sendto('marco',self._server)
while True:
if str(self._s.recvfrom(1024)[0]) == 'polo':
break
#self._s.sendto('marco',self._server)
#self._s.sendto('marco',self._server)
print ' connection verified'
self._s.sendto('confirm',self._server)
self._s.setblocking(0)
return True
def recieve(self,mode = 0):
_data, _addr = self._s.recvfrom(1024)
if mode == 0:
return str(_data)
if mode == 1:
return int(_data)
if mode == 2:
return float(_data)
if mode == 3:
return tuple(_data)
def send(self,data):
self._s.sendto(str(data),self._server)
def close(self):
self._s.close()
print '_socket closed_'
if __name__ == '__main__':
host = '127.0.0.1'
port = 0
talk = ClientObject(host,24603,port)
talk.handshake()
#while True:
#print talk.recieve()
Use the ServerObject class on the program that will primarily send data and the ClientObject class on the program that will primarily recieve data. These can be flipped around in many situations, but I found it's best to do it this way to take advantage of UDP. The client class has an optional port variable that is set to 0 by default. This is because for UDP the client needs another port to establish itself on. 0 means it will pick an available port, but if you specify one, it's possible to re-establish a connection if the client goes offline without needing to restart both programs.
Use the handshake first on both programs being sure to use the same IP and port (not referring to the last variable on the client) and then use the send and receive functions to pass data back and forth.
again, these aren't that good, in fact there's many problems that cab arise with using this method, but for a simple task, they got the job done. I set up the handshake to print verifications of what is happening, but if those get annoying, you can just remove those lines.
Hope this helps!

I think multiproccessing library could be a solution.
You will be able to run several processes in parallel when each process could perform it specific work, while sending data to each other.
You can check this example
This is generic directory walker, which have process that scans directory tree and passes the data to other process, which scans files in already discovered folders. All this done in parallel.

This is probably a little bit outside the scope of your project, but have you considered using ROS? It lets you run a bunch of different nodes (can be Python scripts) at the same time that communicate by publishing and subscribing to topics. They can be on the same system (i.e. one or more nodes on the robot) or different systems (i.e. one node on the robot, multiple nodes on the PC). ROS also has a lot of awesome built in tools and libraries that are specifically made for robotic systems such as visualization, mapping, odometry, etc. Here's a bit of starting info:
https://en.wikipedia.org/wiki/Robot_Operating_System
http://wiki.ros.org/ROS/StartGuide
It's usually used for much larger frameworks than you seem to be describing, and beware that it takes quite a bit of time (in my experience) to implement, but it is very easy to expand once its up and running. Like I said, it all depends on the scope of your project!
Good luck!

How to Covert to dictionary in Python

I am working on a large scale embedded system built using Python and we are using ZeroMQ to make everything modular. I have sensor data being sent across a ZeroMQ serial port in the form of the python Dictionary as shown here:
accel_com.publish_message({"ACL_X": ACL_1_X_val})
Where accel_com is a Communicator class we built that wraps the ZeroMQ logic that publishes messages across a port. Here you can see we are sending Dictionaries across.
However, on the other side of the communication port, I have another module that grabs this data using this code:
accel_msg = san.get_last_message("sensor/accelerometer")
accel.ax = accel_msg.get('ACL_X')
accel.ay = accel_msg.get('ACL_Y')
accel.az = accel_msg.get('ACL_Z')
The problem is when I try to treat accel_msg as a Python Dictionary, I get an Error:
'NoneType' object does not have a method 'get()'.
So my guess is the dictionary is not going across the wire correctly. I am not very familiar with Python so I am not sure how to solve this problem.

Expanding on #JoranBeasley's comment:
accel_msg is sometimes None, such as while it's waiting for a message. The solution is to skip over None messages
while True: # waiting indefinitely for messages
accel_msg = san.get_last_message("sensor/accelerometer")
if accel_msg: # or more explicitly, if accel_msg is not None:
accel.ax = accel_msg.get('ACL_X')
accel.ay = accel_msg.get('ACL_Y')
accel.az = accel_msg.get('ACL_Z')
break # if you only want one message. otherwise remove this
else:
print accel_msg # which is almost certainly None

Is it possible to loop over an httplib.HTTPResponse's data?

I'm trying to develop a very simple proof-of-concept to retrieve and process data in a streaming manner. The server I'm requesting from will send data in chunks, which is good, but I'm having issues using httplib to iterate through the chunks.
Here's what I'm trying:
import httplib
def getData(src):
d = src.read(1024)
while d and len(d) > 0:
yield d
d = src.read(1024)
if __name__ == "__main__":
con = httplib.HTTPSConnection('example.com', port='8443', cert_file='...', key_file='...')
con.putrequest('GET', '/path/to/resource')
response = con.getresponse()
for s in getData(response):
print s
raw_input() # Just to give me a moment to examine each packet
Pretty simple. Just open an HTTPS connection to server, request a resource, and grab the result, 1024 bytes at a time. I'm definitely making the HTTPS connection successfully, so that's not a problem at all.
However, what I'm finding is that the call to src.read(1024) returns the same thing every time. It only ever returns the first 1024 bytes of the response, apparently never keeping track of a cursor within the file.
So how am I supposed to receive 1024 bytes at a time? The documentation on read() is pretty sparse. I've thought about using urllib or urllib2, but neither seems to be able to make an HTTPS connection.
HTTPS is required, and I am working in a rather restricted corporate environment where packages like Requests are a bit tough to get my hands on. If possible, I'd like to find a solution within Python's standard lib.
// Big Old Fat Edit
Turns out in my original code I had simply forgot to update the d variable. I initialized it with a read outside the yield loop and never changed it in the loop. Once I added it back in there it worked perfectly.
So, in short, I'm just a big idiot.

Is your con.putrequest() actually working? Doing a request with that method requires you to also call a bunch of other methods as you can see in the official httplib documentation:
http://docs.python.org/2/library/httplib.html
As an alternative to using the request() method described above, you
can also send your request step by step, by using the four functions
below.
putrequest()
putheader()
endheaders()
send()
Is there any reason why you're not using the default HTTPConnection.request() function?
Here's a working version for me, using request() instead:
import httlplib
def getData(src, chunk_size=1024):
d = src.read(chunk_size)
while d:
yield d
d = src.read(chunk_size)
if __name__ == "__main__":
con = httplib.HTTPSConnection('google.com')
con.request('GET', '/')
response = con.getresponse()
for s in getData(response, 8):
print s
raw_input() # Just to give me a moment to examine each packet

You can use the seek command to move the cursor along with your read.
This is my attempt at the problem. I apologize if I made it less pythonic in process.
if __name__ == "__main__":
con = httplib.HTTPSConnection('example.com', port='8443', cert_file='...', key_file='...')
con.putrequest('GET', '/path/to/resource')
response = con.getresponse()
c=0
while True:
response.seek(c*1024,0)
data =d.read(1024)
c+=1
if len(data)==0:
break
print data
raw_input()
I hope it is at least helpful.

My python program is running really slow

I'm making a program that (at least right now) retrives stream information from TwitchTV (streaming platform). This program is to self educate myself but when i run it, it's taking 2 minutes to print just the name of the streamer.
I'm using Python 2.7.3 64bit on Windows7 if that is important in anyway.
classes.py:
#imports:
import urllib
import re
#classes:
class Streamer:
#constructor:
def __init__(self, name, mode, link):
self.name = name
self.mode = mode
self.link = link
class Information:
#constructor:
def __init__(self, TWITCH_STREAMS, GAME, STREAMER_INFO):
self.TWITCH_STREAMS = TWITCH_STREAMS
self.GAME = GAME
self.STREAMER_INFO = STREAMER_INFO
def get_game_streamer_names(self):
"Connects to Twitch.TV API, extracts and returns all streams for a spesific game."
#start connection
self.con = urllib2.urlopen(self.TWITCH_STREAMS + self.GAME)
self.info = self.con.read()
self.con.close()
#regular expressions to get all the stream names
self.info = re.sub(r'"teams":\[\{.+?"\}\]', '', self.info) #remove all team names (they have the same name: parameter as streamer names)
self.streamers_names = re.findall('"name":"(.+?)"', self.info) #looks for the name of each streamer in the pile of info
#run in a for to reduce all "live_user_NAME" values
for name in self.streamers_names:
if name.startswith("live_user_"):
self.streamers_names.remove(name)
#end method
return self.streamers_names
def get_streamer_mode(self, name):
"Returns a streamers mode (on/off)"
#start connection
self.con = urllib2.urlopen(self.STREAMER_INFO + name)
self.info = self.con.read()
self.con.close()
#check if stream is online or offline ("stream":null indicates offline stream)
if self.info.count('"stream":null') > 0:
return "offline"
else:
return "online"
main.py:
#imports:
from classes import *
#consts:
TWITCH_STREAMS = "https://api.twitch.tv/kraken/streams/?game=" #add the game name at the end of the link (space = "+", eg: Game+Name)
STREAMER_INFO = "https://api.twitch.tv/kraken/streams/" #add streamer name at the end of the link
GAME = "League+of+Legends"
def main():
#create an information object
info = Information(TWITCH_STREAMS, GAME, STREAMER_INFO)
streamer_list = [] #create a streamer list
for name in info.get_game_streamer_names():
#run for every streamer name, create a streamer object and place it in the list
mode = info.get_streamer_mode(name)
streamer_name = Streamer(name, mode, 'http://twitch.tv/' + name)
streamer_list.append(streamer_name)
#this line is just to try and print something
print streamer_list[0].name, streamer_list[0].mode
if __name__ == '__main__':
main()
the program itself works perfectly, just really slow
any ideas?

Program efficiency typically falls under the 80/20 rule (or what some people call the 90/10 rule, or even the 95/5 rule). That is, 80% of the time the program is actually running in 20% of the code. In other words, there is a good shot that your code has a "bottleneck": a small area of the code that is running slow, while the rest runs very fast. Your goal is to identify that bottleneck (or bottlenecks), then fix it (them) to run faster.
The best way to do this is to profile your code. This means you are logging the time of when a specific action occurs with the logging module, use timeit like a commenter suggested, use some of the built-in profilers, or simply print out the current time at very points of the program. Eventually, you will find one part of the code that seems to be taking the most amount of time.
Experience will tell you that I/O (stuff like reading from a disk, or accessing resources over the internet) will take longer than in-memory calculations. My guess as to the problem is that you're using 1 HTTP connection to get a list of streamers, and then one HTTP connection to get the status of that streamer. Let's say that there are 10000 streamers: your program will need to make 10001 HTTP connections before it finishes.
There would be a few ways to fix this if this is indeed the case:
See if Twitch.TV has some alternatives in their API that allows you to retrieve a list of users WITH their streaming mode so that you don't need to call an API for each streamer.
Cache results. This won't actually help your program run faster the first time it runs, but you might be able to make it so that if it runs a second time within a minute, it can reuse results.
Limit your application to only dealing with a few streamers at a time. If there are 10000 streamers, what exactly does your application do that it really needs to look at the mode of all 10000 of them? Perhaps it's better to just grab the top 20, at which point the user can press a key to get the next 20, or close the application. Often times, programming is not just about writing code, but managing expectations of what your users want. This seems to be a pet project, so there might not be "users", meaning you have free reign to change what the app does.
Use multiple connections. Right now, your app makes one connection to the server, waits for the results to come back, parses the results, saves it, then starts on the next connection. This process might take an entire half a second. If there were 250 streamers, running this process for each of them would take a little over two minutes total. However, if you could run four of them at a time, you could potentially reduce your time to just under 30 seconds total. Check out the multiprocessing module. Keep in mind that some APIs might have limits to how many connections you can make at a certain time, so hitting them with 50 connections at a time might irk them and cause them to forbid you from accessing their API. Use caution here.

You are using the wrong tool here to parse the json data returned by your URL. You need to use json library provided by default rather than parsing the data using regex.
This will give you a boost in your program's performance
Change the regex parser
#regular expressions to get all the stream names
self.info = re.sub(r'"teams":\[\{.+?"\}\]', '', self.info) #remove all team names (they have the same name: parameter as streamer names)
self.streamers_names = re.findall('"name":"(.+?)"', self.info) #looks for the name of each streamer in the pile of info
To json parser
self.info = json.loads(self.info) #This will parse the json data as a Python Object
#Parse the name and return a generator
return (stream['name'] for stream in data[u'streams'])

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.