Python: How to trigger multiple processes at the same instant

I am trying to run a process that does an HTTP POST, which in turn sends an alert to a server (sending one alert takes on the order of nanoseconds). I want to test the server's capacity to handle alerts at millisecond granularity. According to its specification, the server can handle 6000 alerts/second.
I wrote a piece of code using the multiprocessing module that sends 6000 alerts, but it uses a for loop, and executing the loop takes more than a second. As a result, the 6000 processes are not all triggered at the SAME INSTANT.
Is there a way to trigger multiple (N) processes at the same instant?
This is my code: flowtesting.py, which is a library. It is followed by my test script (marked 'My testScript').
import json
import httplib2


class flowTesting():
    def __init__(self, companyId, deviceIp):
        self.companyId = companyId
        self.deviceIp = deviceIp

    def generate_savedSearchName(self, randNum):
        self.randMsgId = randNum
        self.savedSearchName = "TEST %s risk31 more than 3" % self.randMsgId

    def def_request_body_dict(self):
        self.reqBody_dict = {
            "Header": {"agid": "Agent1",
                       "mid": self.randMsgId,
                       "ts": 1253125001},
            "mp": {"host": self.deviceIp,
                   "index": self.companyId,
                   "savedSearchName": self.savedSearchName},
        }
        self.req_body = json.dumps(self.reqBody_dict)

    def get_default_hdrs(self):
        self.hdrs = {'Content-type': 'application/json',
                     'Accept-Language': 'en-US,en;q=0.8'}

    def send_request(self, sIp, method="POST"):
        self.sIp = sIp
        self.url = "http://%s:8080/agent/splunk/messages" % self.sIp
        http_cli = httplib2.Http(timeout=180, disable_ssl_certificate_validation=True)
        rsp, rsp_body = http_cli.request(uri=self.url, method=method, headers=self.hdrs, body=self.req_body)
        print "rsp: %s and rsp_body: %s" % (rsp, rsp_body)
# My testScript
from flowTesting import flowTesting
import random
import multiprocessing

deviceIp = "10.31.421.35"
companyId = "CPY0000909"
noMsgToBeSent = 1000
sIp = "10.31.44.235"
uniq_msg_id_list = random.sample(xrange(1,10000), noMsgToBeSent)


def runner(companyId, deviceIp, uniq_msg_id):
    proc = flowTesting(companyId, deviceIp)
    proc.generate_savedSearchName(uniq_msg_id)
    proc.def_request_body_dict()
    proc.get_default_hdrs()
    proc.send_request(sIp)

process_list = []
for uniq_msg_id in uniq_msg_id_list:
    savedSearchName = "TEST-1000 %s risk31 more than 3" % uniq_msg_id
    process = multiprocessing.Process(target=runner, args=(companyId,deviceIp,uniq_msg_id,))
    process.start()
    # join() waits for this process to exit before the loop starts the next one
    process.join()
    process_list.append(process)
print "Process list: %s" % process_list
print "Unique Message Id: %s" % uniq_msg_id_list

Making them all happen in the same instant is obviously impossible—unless you have a 6000-core machine and an OS kernel whose scheduler is able to handle them all perfectly (which you don't), you can't get 6000 pieces of code running at once.
And, even if you did, what they're all trying to do is to send a message on a socket. Even if your kernel was that insanely parallel, unless you have 6000 separate NICs, they're going to end up serialized in the NIC buffer. That's the way IP works: one packet after another. And of course there are all the routers on the path, the server's NIC, the server's OS, etc. And even if IP doesn't get in the way, bytes take time to transfer over a cable. So the only way to do this at the same instant, even in theory, would be to have 6000 NICs on each side and wire them up directly to each other with identical fiber.
However, you don't really need them in the same instant, just closer to each other than they are. You didn't show us your code, but presumably you're just starting 6000 Processes that all immediately try to send a message. That means you're including the process startup time—which can be pretty slow (especially on Windows)—in the skew time.
You can reduce that by using threads instead of processes. That may seem counterintuitive, but Python is pretty good at handling I/O-bound threads, and every modern OS is very good at starting new threads.
But really, what you need is a Barrier on your threads or processes, to let all of them complete all the setup work (including process startup) before any of them try to do any work.
It still probably won't be tight enough, but it will be a lot tighter than you probably have right now.
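To make the Barrier suggestion concrete, here is a minimal sketch using threading.Barrier. It assumes Python 3.2 or later (the question's code is Python 2, where this class does not exist). The function fire is a hypothetical stand-in for the real HTTP POST, and the worker count is arbitrary.

```python
import threading
import time

N_WORKERS = 50  # illustrative only; far fewer than 6000


def fire(worker_id):
    """Hypothetical stand-in for the real HTTP POST; just records a timestamp."""
    return time.perf_counter()


def run_burst(n_workers=N_WORKERS):
    barrier = threading.Barrier(n_workers)
    stamps = [None] * n_workers

    def worker(i):
        # ... per-worker setup (request body, headers) would go here ...
        barrier.wait()        # block until every worker has finished its setup
        stamps[i] = fire(i)   # all workers are released together

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return stamps


if __name__ == "__main__":
    stamps = run_burst()
    print("skew between first and last send: %.6f s" % (max(stamps) - min(stamps)))
```

The printed skew excludes thread startup and setup time, which is the point of the barrier. multiprocessing.Barrier offers the same wait() call for processes.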
The next limit you're going to face is context-switching time. Modern OSs are pretty good at scheduling, but not 6000-simultaneous-tasks good. So really, you want to reduce this to N processes, each one just spamming 6000/N connections sequentially as fast as possible. That will get them into the kernel/NIC much faster than trying to do 6000 at once and making the OS do the serialization for you. (In fact, on some platforms, depending on your hardware, you might actually be better off with one process doing 6000 in a row than N doing 6000/N. Test it both ways.)
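The "N workers, each sending 6000/N sequentially" idea can be sketched as follows. This is Python 3 and uses threads for brevity; concurrent.futures.ProcessPoolExecutor has the same interface if you want processes. send_alert is a hypothetical stub standing in for one HTTP POST, and split_evenly and send_in_batches are names made up for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def split_evenly(items, n_workers):
    """Split items into n_workers contiguous slices whose sizes differ by at most one."""
    base, extra = divmod(len(items), n_workers)
    slices, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        slices.append(items[start:start + size])
        start += size
    return slices


def send_alert(msg_id):
    """Hypothetical stand-in for one HTTP POST; returns the id it 'sent'."""
    return msg_id


def send_batch(msg_ids):
    # Each worker sends its share back to back, with no per-message startup cost.
    return [send_alert(m) for m in msg_ids]


def send_in_batches(msg_ids, n_workers):
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        batches = list(pool.map(send_batch, split_evenly(msg_ids, n_workers)))
    return [m for batch in batches for m in batch]
```

Varying n_workers (including 1) and timing each run is the "test it both ways" comparison suggested above.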
There's still some overhead for the socket library itself. To get around that, you want to pre-craft all of the IP packets, then create a single raw socket and spam those packets. Send the first packet from each connection, then the second packet from each connection, etc.

You need to use an inter-process synchronization primitive. On Linux you would use a Sys-V semaphore, on Windows you would use a Win32 event.
Your 6000 processes would wait on this semaphore/event, and from a different process you would trigger it, thus releasing all your 6000 processes from their waiting state to a ready state, and then the OS would start executing them as quickly as possible.

Related

Python Twisted multithreaded TCP proxy

I am trying to write a TCP proxy using Python's Twisted framework. I started with Twisted's port forward example and it seems to do the job in a standard scenario. The problem is that I have a rather peculiar scenario. What we need to do is to process each TCP data packet and look for a certain pattern.
If the pattern matches, we need to run a certain process. This process takes anywhere between 30 and 40 seconds (I know it's not a good design, but currently that's how things stand). The trouble is that once this process starts, all other packets are held up until it completes. So if there are 100 live connections and even one of them calls the process, the remaining 99 connections are stuck.
Is there a standard 'twisted' way wherein each connection/session is handled in a separate thread so that the 'blocking process' does not interfere with the other live connections?
Example Code:
from time import sleep

from twisted.internet import reactor
from twisted.protocols import portforward
from twisted.internet import threads


def processingOperation(data):
    # doing the processing operation here
    sleep(30)
    return data


def server_dataReceived(self, data):
    if data.find("pattern we need to test") != -1:
        data = processingOperation(data)
    portforward.Proxy.dataReceived(self, data)

portforward.ProxyServer.dataReceived = server_dataReceived


def client_dataReceived(self, data):
    portforward.Proxy.dataReceived(self, data)

portforward.ProxyClient.dataReceived = client_dataReceived

reactor.listenTCP(8383, portforward.ProxyFactory('xxx.yyy.uuu.iii', 80))
reactor.run()
Of course there is. You defer the processing to a thread. For example:
def render_POST(self, request):
    # some code you may have to run before processing
    d = threads.deferToThread(method_that_does_the_processing, request)
    return ''
There is a catch: this returns before the processing is done, so the client gets its answer back right away. You might therefore want to return 202/Accepted instead of 200/OK (or my dummy '').
If you need to return after the processing is complete, you can use inlineCallbacks (http://twistedmatrix.com/documents/10.2.0/api/twisted.internet.defer.inlineCallbacks.html).

gevent / requests hangs while making lots of head requests

I need to make 100k head requests, and I'm using gevent on top of requests. My code runs for a while, but then eventually hangs. I'm not sure why it's hanging, or whether it's hanging inside requests or gevent. I'm using the timeout argument inside both requests and gevent.
Please take a look at my code snippet below, and let me know what I should change.
import gevent
from gevent import monkey, pool
monkey.patch_all()
import datetime
import requests


def get_head(url, timeout=3):
    try:
        return requests.head(url, allow_redirects=True, timeout=timeout)
    except:
        return None


def expand_short_urls(short_urls, chunk_size=100, timeout=60*5):
    chunk_list = lambda l, n: ( l[i:i+n] for i in range(0, len(l), n) )
    p = pool.Pool(chunk_size)
    print 'Expanding %d short_urls' % len(short_urls)
    results = {}
    for i, _short_urls_chunked in enumerate(chunk_list(short_urls, chunk_size)):
        print '\t%d. processing %d urls # %s' % (i, chunk_size, str(datetime.datetime.now()))
        jobs = [p.spawn(get_head, _short_url) for _short_url in _short_urls_chunked]
        gevent.joinall(jobs, timeout=timeout)
        results.update({_short_url:job.get().url for _short_url, job in zip(_short_urls_chunked, jobs) if job.get() is not None and job.get().status_code==200})
    return results
I've tried grequests, but it's been abandoned, and I've gone through the github pull requests, but they all have issues too.
The RAM usage you are observing mainly stems from all the data that piles up while storing 100,000 response objects, and all the underlying overhead. I have reproduced your application case and fired off HEAD requests against 15,000 URLs from the top Alexa ranking. It did not really matter whether I used a gevent Pool (i.e. one greenlet per connection) or a fixed set of greenlets, all requesting multiple URLs, nor how large I set the pool size.
In the end, the RAM usage grew over time, to considerable amounts. However, I noticed that changing from requests to urllib2 already led to a reduction in RAM usage by about a factor of two. That is, I replaced
result = requests.head(url)
with
request = urllib2.Request(url)
request.get_method = lambda : 'HEAD'
result = urllib2.urlopen(request)
Some other advice: do not use two timeout mechanisms. Gevent's timeout approach is very solid, and you can easily use it like this:
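from gevent import Timeout
import requests


def gethead(url):
    result = None
    try:
        # Timeout(5, False) silently leaves the block after 5 seconds
        with Timeout(5, False):
            result = requests.head(url)
    except Exception as e:
        result = e
    return result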
It might look tricky, but it returns either None (after quite precisely 5 seconds, indicating a timeout), an exception object representing a communication error, or the response. Works great!
Although this is likely not part of the issue, in such cases I recommend keeping workers alive and letting each of them work on multiple items. The overhead of spawning greenlets is small, indeed. Still, this would be a very simple solution with a set of long-lived greenlets:
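from gevent import spawn, joinall
from gevent.queue import Queue, Empty


def qworker(qin, qout):
    # Keep pulling URLs until the input queue is empty
    while True:
        try:
            qout.put(gethead(qin.get(block=False)))
        except Empty:
            break

# urls and POOLSIZE are assumed to be defined elsewhere
qin = Queue()
qout = Queue()
for url in urls:
    qin.put(url)
workers = [spawn(qworker, qin, qout) for i in xrange(POOLSIZE)]
joinall(workers)
returnvalues = [qout.get() for _ in xrange(len(urls))]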
Also, you really need to appreciate that this is a large-scale problem you are tackling there, yielding non-standard issues. When I reproduced your scenario with a timeout of 20 s and 100 workers and 15000 URLs to be requested, I easily got a large number of sockets:
# netstat -tpn | wc -l
10074
That is, the OS had more than 10000 sockets to manage, most of them in TIME_WAIT state. I also observed "Too many open files" errors, and tuned the limits up via sysctl. When you request 100,000 URLs you will probably hit such limits too, and you need to come up with measures to prevent system starvation.
Also note that the way you are using requests, it automatically follows redirects from HTTP to HTTPS and automatically verifies the certificate, all of which surely costs RAM.
In my measurements, when I divided the number of requested URLs by the runtime of the program, I almost never passed 100 responses/s, which is the result of the high-latency connections to foreign servers all over the world. I guess you are also affected by such a limit. Adjust the rest of the architecture to this limit, and you will probably be able to generate a data stream from the Internet to disk (or database) without much RAM usage in between.
To address your two main questions specifically:
I think gevent, or the way you are using it, is not your problem. I think you are just underestimating the complexity of your task. It comes with nasty problems and drives your system to its limits.
Your RAM usage issue: start off by using urllib2, if you can. Then, if memory use still accumulates too much, you need to work against the accumulation. Try to reach a steady state: you might want to start writing data to disk and generally work towards a situation where objects can be garbage collected.
Your code "eventually hangs": this is probably a consequence of your RAM issue. If it is not, then do not spawn so many greenlets, but reuse them as indicated. Also, reduce concurrency further, monitor the number of open sockets, increase system limits if necessary, and try to find out exactly where your software hangs.
I'm not sure if this will resolve your issue, but you are not using pool.Pool() correctly.
Try this:
def expand_short_urls(short_urls, chunk_size=100):
    # Pool() automatically limits your process to chunk_size greenlets running concurrently
    # thus you don't need to do all that chunking business you were doing in your for loop
    p = pool.Pool(chunk_size)
    print 'Expanding %d short_urls' % len(short_urls)
    # spawn() (both gevent.spawn() and Pool.spawn()) returns a gevent.Greenlet object
    # NOT the value your function, get_head, will return
    threads = [p.spawn(get_head, short_url) for short_url in short_urls]
    p.join()
    # to access the returned value of your function, access the Greenlet.value property
    results = {short_url: thread.value.url for short_url, thread in zip(short_urls, threads)
               if thread.value is not None and thread.value.status_code == 200}
    return results

ZMQ PUB Send file

I'm trying (PY)ZMQ for the first time, and wonder if it's possible to send a complete FILE (binary) using PUB/SUB? I need to send database updates to many subscribers. I see examples of short messages but not files. Is it possible?
publisher:
import zmq
import time
import os
import sys

while True:
    print 'loop'
    msg = 'C:\TEMP\personnel.db'
    # Prepare context & publisher
    context = zmq.Context()
    publisher = context.socket(zmq.PUB)
    publisher.bind("tcp://*:2002")
    time.sleep(1)
    curFile = 'C:/TEMP/personnel.db'
    size = os.stat(curFile).st_size
    print 'File size:',size
    target = open(curFile, 'rb')
    file = target.read(size)
    if file:
        publisher.send(file)
    publisher.close()
    context.term()
    target.close()
    time.sleep(10)
subscriber:
'''always listening'''
import zmq
import os
import time
import sys

while True:
    path = 'C:/TEST'
    filename = 'personnel.db'
    destfile = path + '/' + filename
    if os.path.isfile(destfile):
        os.remove(destfile)
        time.sleep(2)
    context = zmq.Context()
    subscriber = context.socket(zmq.SUB)
    subscriber.connect("tcp://127.0.0.1:2002")
    subscriber.setsockopt(zmq.SUBSCRIBE,'')
    msg = subscriber.recv(313344)
    if msg:
        f = open(destfile, 'wb')
        print 'open'
        f.write(msg)
        print 'close\n'
        f.close()
    time.sleep(5)
You should be able to distribute files to many subscribers using zmq and the PUB/SUB pattern.
Your code is almost there; in other words, it might work in most situations but could be improved a bit.
Things to be aware of
Messages live in memory
The message must fit into memory when being published (it lives in the PUB socket) and stays there until the last currently subscribed consumer has read it or disconnects.
The message must also fit into memory when being received. But with reasonably large files (like your 313 kB) it should work unless you are really short on RAM.
Slow consumer issue
If you have multiple consumers and one of them reads much slower than the others, it will start slowing down all of them. The zmq documentation explains this problem and proposes some ways to avoid it (e.g. suicide of the slow subscriber).
However, in most situations, you will not encounter this problem.
Start your consumer first so as not to miss a message
zmq messaging is extremely fast. There is no problem if you start your consumer sooner than the publisher; zmq makes this scenario easy and the consumer will connect automatically.
However, your publisher should allow consumers to connect before it starts publishing. Your code sleeps for 1 second before sending the message, which should be sufficient.
Comments to your code
Do you really have to sleep after os.remove? Probably not.
subscriber.recv: there is no need to know the message size in advance. A zmq message carries its own length, so if you call recv without the number of bytes to receive, you will get the whole message. (In pyzmq the first positional argument of recv is flags, not a size.)
Send large files in chunks
zmq provides a feature called multipart messages, but according to the docs, all message parts have to fit completely in memory before being sent out, so this is not the trick to use.
On the other hand, you can create an application-level multipart protocol in which you send messages with a structure like (hasNextPart, chunkData). This way you would send messages of a well-controlled size, and only the last one would have hasNextPart == False.
The consumer would then read all the parts and write them to disk until the last message arrives, the one stating that there is no further part.
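The (hasNextPart, chunkData) idea can be sketched without any zmq code at all. The following Python 3 example only shows the framing and reassembly logic. The names iter_chunks and reassemble are made up for this sketch, and how each pair is put on the wire (for example as a one-byte flag followed by the chunk) is left open.

```python
import io


def iter_chunks(stream, chunk_size):
    """Yield (has_next, chunk) pairs; only the final pair has has_next == False."""
    current = stream.read(chunk_size)
    while True:
        following = stream.read(chunk_size)
        if not following:
            yield (False, current)   # nothing left to read: this is the last part
            return
        yield (True, current)
        current = following


def reassemble(frames, out_stream):
    """Write chunks to out_stream until a frame reports that no part follows."""
    for has_next, chunk in frames:
        out_stream.write(chunk)
        if not has_next:
            break


if __name__ == "__main__":
    source = io.BytesIO(b"x" * 2500)
    target = io.BytesIO()
    reassemble(iter_chunks(source, 1000), target)
    print(len(target.getvalue()))
```

Reading one chunk ahead is what lets the sender mark the last part without knowing the file size up front. On the publisher side, each pair would become one message. On the subscriber side, the frames iterable would be fed from successive recv calls.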

Threadsafe printing across multiple processes python 2.x

I have experienced a very weird issue that I just can't explain when printing to a file from multiple processes (started with the subprocess module). Some of my output is slightly truncated and some of it is completely missing. I am using a slightly modified version of Alex Martelli's solution for thread-safe printing from "How do I get a thread safe print in Python 2.6?". The main difference is in the write method. To guarantee that output is not interleaved between the multiple processes writing to the same file, I buffer the output and only write when I see a newline.
import sys
import threading

tls = threading.local()


class ThreadSafeFile(object):
    """
    #author: Alex Martelli
    #see: https://stackoverflow.com/questions/3029816/how-do-i-get-a-thread-safe-print-in-python-2-6
    #summary: Allows for safe printing of output of multi-threaded programs to stdout.
    """
    def __init__(self, f):
        self.f = f
        self.lock = threading.RLock()
        self.nesting = 0
        self.dataBuffer = ""

    def _getlock(self):
        self.lock.acquire()
        self.nesting += 1

    def _droplock(self):
        nesting = self.nesting
        self.nesting = 0
        for i in range(nesting):
            self.lock.release()

    def __getattr__(self, name):
        if name == 'softspace':
            return tls.softspace
        else:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        if name == 'softspace':
            tls.softspace = value
        else:
            return object.__setattr__(self, name, value)

    def write(self, data):
        self._getlock()
        self.dataBuffer += data
        if data == '\n':
            self.f.write(self.dataBuffer)
            self.f.flush()
            self.dataBuffer = ""
            self._droplock()

    def flush(self):
        self.f.flush()
It should also be noted that to get this to behave abnormally it is going to require either a lot of time or a machine with multiple processors or cores. I ran the offending program in my test suite ~7000 times on a single processor machine before it reported a failure. This program that I've created to demonstrate the issue I've been experiencing in my test suite also seems to work on a single processor machine, but when you execute it on a multicore or multiprocessor machine it will certainly fail.
The following program shows the issue. It is somewhat more involved than I wanted it to be, but I wanted to preserve as much of the behavior of my programs as possible.
The code for process 1 main.py
import subprocess, sys, socket, time, random
from threadSafeFile import ThreadSafeFile

sys.stdout = ThreadSafeFile(sys.__stdout__)
usage = "python main.py nprocs niters"
workerFilename = "/path/to/worker.py"


def startMaster(n, iters):
    host = socket.gethostname()
    for i in xrange(n):
        #set up ~synchronization between master and worker
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.bind((host,0))
        sock.listen(1)
        socketPort = sock.getsockname()[1]
        cmd = 'ssh %s python %s %s %d %d %d' % \
            (host, workerFilename, host, socketPort, i, iters)
        proc = subprocess.Popen(cmd.split(), shell=False, stdout=None, stderr=None)
        conn, addr = sock.accept()
        #wait for worker process to start
        conn.recv(1024)
        for j in xrange(iters):
            #do very bursty i/o
            for k in xrange(iters):
                print "master: %d iter: %d message: %d" % (n,i, j)
            #sleep for some amount of time between .02s and .5s
            time.sleep(1 * (random.randint(1,50) / float(100)))
        #wait for worker to finish
        conn.recv(1024)
        sock.close()
        proc.kill()


def main(nprocs, niters):
    startMaster(nprocs, niters)


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print usage
        sys.exit(1)
    nprocs = int(sys.argv[1])
    niters = int(sys.argv[2])
    main(nprocs, niters)
code for process 2 worker.py
import sys, socket, time, random
from threadSafeFile import ThreadSafeFile

usage = "python host port id iters"
sys.stdout = ThreadSafeFile(sys.__stdout__)


def main(host, port, n, iters):
    #tell master to start
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((host, port))
    sock.send("begin")
    for i in xrange(iters):
        #do bursty i/o
        for j in xrange(iters):
            print "worker: %d iter: %d message: %d" % (n,i, j)
        #sleep for some amount of time between .02s and .5s
        time.sleep(1 * (random.randint(1,50) / float(100)))
    #tell master we are done
    sock.send("done")
    sock.close()


if __name__ == "__main__":
    if len(sys.argv) != 5:
        print usage
        sys.exit(1)
    host = sys.argv[1]
    port = int(sys.argv[2])
    n = int(sys.argv[3])
    iters = int(sys.argv[4])
    main(host,port,n,iters)
When testing I ran main.py as follows:
python main.py 1 75 > main.out
The resulting file should be of length 75*75*2 = 11250 lines of the format:
(master|worker): %d iter: %d message: %d
Most of the time it is short by 20-30 lines, but on occasion I have seen the program produce the correct number of lines. On further investigation of those rare successes, some of the lines turn out to be truncated, leaving something like:
ter: %d message: %d
Another interesting aspect to this is that when starting the ssh process using multiprocessing instead of subprocess this program behaves as intended. Some may just say why bother using subprocess when multiprocessing works fine. Unfortunately, it is the academic in me that really wants to know why this is behaving abnormally. Any thoughts and/or insights would be very appreciated. Thanks.
***edit
Ben I understand that threadSafeFile uses different locks per process, but I need it in my larger project for 2 reasons.
1) Each process may have multiple threads that will be writing to stdout even though this example does not. So I need to guarantee both safety at the thread level and at the process level.
2) If I don't make sure that when stdout gets flushed that there is a '\n' at the end of the buffer then there is going to be some potential execution trace where process 1 writes its buffer to a file without a trailing '\n' and then process 2 comes in and writes its buffer. Now we have lines interleaving and that's not what I want.
I also understand that this mechanism makes things a bit restrictive for what can be printed. Right now, in my stage of development of this project, restrictiveness is ok. When I can guarantee correctness I can start to relax the restrictions.
Your comment about locking inside of the conditional check if data == '\n' is incorrect. If the lock goes inside the conditional check then threadSafeFile is no longer thread safe in the general case. If any thread can add to the data buffer then there will be a race condition at dataBuffer += data as this is not an atomic operation. Perhaps your comment is simply related to this example in which we only have 1 thread per process, but if that's the case then we don't even need a lock at all.
In regards to OS level locks, my understanding was that multiple programs were able to safely write to the same file on a unix platform iff the number of bytes being written was smaller than the size of the internal buffer. Shouldn't the OS take care of all of the necessary locking for me in this case?
In each process you create a ThreadSafeFile for sys.stdout, each of which has a lock, but they're different locks; there's nothing connecting the locks used in all the different processes. So you're getting the same effect as if you used no locks at all; no process is ever going to be blocked by a lock held in another process, since they all have their own.
The only reason this works when run on a single processor machine is the buffering you do to queue up writes until a newline is encountered. This means that each line of output is written all in one go. On a uniprocessor, it's not unlikely that the OS will decide to switch processes in the middle of a bunch of successive calls to write, which would trash your data. But if the output is all written in chunks of a single line and you don't care about the order in which lines end up in the file, then it's very very unlikely for a context switch to happen in the middle of an operation you care about. Not theoretically impossible though, so I wouldn't call this code correct even for a uniprocessor.
ThreadSafeFile is very specifically only thread safe. It relies on the fact that the program only has a single ThreadSafeFile object for each file it's writing to. So any writes to that file are going to be going through that single object, synchronizing upon the lock.
When you have multiple processes, you don't have the shared global memory that threads in a single process do. So each process necessarily has its own separate ThreadSafeFile(sys.stdout) object. This is exactly the same mistake as if you had used threads and spawned N threads, each of which created its own ThreadSafeFile(sys.stdout).
I have no idea how this works when you use multiprocessing, because you haven't posted the code you used to do that. But my understanding is that this would still fail, for all the same reasons, if you used multiprocessing in such a way that each process created its own fresh ThreadSafeFile. Maybe you're not doing that in the version that uses multiprocessing?
What you need to do is arrange for the synchronization object (the lock) to be connected somehow. The multiprocessing module can do this for you. Note in the example here how the lock is created once and then passed in to each new process as it is created. (This still results in 10 different lock objects in 10 different processes of course, but what Python must be doing behind the scenes is creating an OS-level lock and then making each of the copied Python-level lock objects refer to the single OS-level lock).
If you want to do this with subprocess, where you're just starting totally independent worker commands from separate scripts, then you'll need some way to get them all talking to a single OS-level lock. I don't know of anything in the standard library that helps you do that. I would just use multiprocessing.
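One Unix-only possibility for independent scripts, not part of the answer above, is advisory file locking with fcntl.flock. The Python 3 sketch below assumes that every writer opens the output file itself by name (rather than inheriting a shared stdout) and that every writer goes through the same helper, since advisory locks do nothing against processes that ignore them. locked_append is a name made up for this sketch.

```python
import fcntl


def locked_append(path, text):
    """Append text to path while holding an exclusive advisory lock on the file."""
    with open(path, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)   # blocks while another process holds the lock
        try:
            f.write(text)
            f.flush()                   # push the data out before releasing the lock
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Each worker script would call locked_append(output_path, line) instead of printing, so whole lines from different processes cannot interleave.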
As another thought, your buffering and locking code looks a little suspicious too. What happens if something calls sys.stdout.write("foo\n")? I'm not certain, but at a guess this is only working because the implementation of print happens to call sys.stdout.write on whatever you're printing, then call it again with a single newline. There is absolutely no reason it has to do this! It could just as easily assemble a single string of output in memory and then only call sys.stdout.write once. Plus, what happens if you need to print a block of multiple lines that need to go together in the output?
Another problem is that you acquire the lock the first time a process writes to the buffer, continue to hold it as the buffer is filled, then write the line, and finally release the lock. If your lock actually worked and a process took a long time between starting a line and finishing it, it would block all other processes from even buffering up their writes! Maybe that's sort of what you want, if the intention is that when a process starts writing something it gets a guarantee that its output will hit the file next. But in that case, you don't even need the buffering at all. I think you should be acquiring the lock just after if data == '\n':, and then you wouldn't need all that code tracking the nesting level either.
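The last two points can be combined in a small sketch: buffer per thread, accept data that contains a newline anywhere (such as "foo\n" or a multi-line block), and take the lock only for the actual write. This is Python 3 and only thread-safe within one process. It does not address the cross-process problem described above, and LineBufferedWriter is a name made up for this sketch.

```python
import io
import threading


class LineBufferedWriter(object):
    """Buffers per thread and writes only whole lines, locking just for the write."""

    def __init__(self, f):
        self.f = f
        self.lock = threading.Lock()
        self.local = threading.local()   # each thread gets its own pending buffer

    def write(self, data):
        pending = getattr(self.local, "pending", "") + data
        # Everything up to and including the last newline is complete.
        cut = pending.rfind("\n") + 1
        complete, self.local.pending = pending[:cut], pending[cut:]
        if complete:
            with self.lock:
                self.f.write(complete)
                self.f.flush()

    def flush(self):
        # Incomplete lines stay buffered until their newline arrives.
        with self.lock:
            self.f.flush()


if __name__ == "__main__":
    out = io.StringIO()
    writer = LineBufferedWriter(out)
    writer.write("foo\n")
    print(repr(out.getvalue()))
```

Because the lock is held only while complete lines are written, a thread that is slow to finish a line never blocks the others, and no nesting counter is needed.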

Sending data through a socket from another thread does not work in Python

This is my 'game server'. It's nothing serious; I thought this was a nice way of learning a few things about Python and sockets.
First the Server class initializes the server.
Then, when someone connects, we create a client thread. In this thread we continually listen on our socket.
Once a certain command comes in (I12345001001, for example) it spawns another thread.
The purpose of this last thread is to send updates to the client.
But even though I can see that the server is executing this code, the data isn't actually being sent.
Could anyone tell me where it's going wrong?
It's like I have to receive something before I'm able to send, so I guess I'm missing something somewhere.
#!/usr/bin/env python
"""
An echo server that uses threads to handle multiple clients at a time.
Entering any line of input at the terminal will exit the server.
"""
import select
import socket
import sys
import threading
import time
import Queue

globuser = {}
queue = Queue.Queue()


class Server:
    def __init__(self):
        self.host = ''
        self.port = 2000
        self.backlog = 5
        self.size = 1024
        self.server = None
        self.threads = []

    def open_socket(self):
        try:
            self.server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            self.server.bind((self.host,self.port))
            self.server.listen(5)
        except socket.error, (value,message):
            if self.server:
                self.server.close()
            print "Could not open socket: " + message
            sys.exit(1)

    def run(self):
        self.open_socket()
        input = [self.server,sys.stdin]
        running = 1
        while running:
            inputready,outputready,exceptready = select.select(input,[],[])
            for s in inputready:
                if s == self.server:
                    # handle the server socket
                    c = Client(self.server.accept(), queue)
                    c.start()
                    self.threads.append(c)
                elif s == sys.stdin:
                    # handle standard input
                    junk = sys.stdin.readline()
                    running = 0
        # close all threads
        self.server.close()
        for c in self.threads:
            c.join()


class Client(threading.Thread):
    initialized=0

    def __init__(self,(client,address), queue):
        threading.Thread.__init__(self)
        self.client = client
        self.address = address
        self.size = 1024
        self.queue = queue
        print 'Client thread created!'

    def run(self):
        running = 10
        isdata2=0
        receivedonce=0
        while running > 0:
            if receivedonce == 0:
                print 'Wait for initialisation message'
                data = self.client.recv(self.size)
                receivedonce = 1
            if self.queue.empty():
                print 'Queue is empty'
            else:
                print 'Queue has information'
                data2 = self.queue.get(1, 1)
                isdata2 = 1
                if data2 == 'Exit':
                    running = 0
                    print 'Client is being closed'
                    self.client.close()
            if data:
                print 'Data received through socket! First char: "' + data[0] + '"'
                if data[0] == 'I':
                    print 'Initializing user'
                    user = {'uid': data[1:6] ,'x': data[6:9], 'y': data[9:12]}
                    globuser[user['uid']] = user
                    print globuser
                    initialized=1
                    self.client.send('Beginning - Initialized'+';')
                    m=updateClient(user['uid'], queue)
                    m.start()
                else:
                    print 'Reset receivedonce'
                    receivedonce = 0
                print 'Sending client data'
                self.client.send('Feedback: ' +data+';')
                print 'Client Data sent: ' + data
                data=None
            if isdata2 == 1:
                print 'Data2 received: ' + data2
                self.client.sendall(data2)
                self.queue.task_done()
                isdata2 = 0
            time.sleep(1)
            running = running - 1
        print 'Client has stopped'
class updateClient(threading.Thread):
    def __init__(self,uid, queue):
        threading.Thread.__init__(self)
        self.uid = uid
        self.queue = queue
        global globuser
        print 'updateClient thread started!'

    def run(self):
        running = 20
        test=0
        while running > 0:
            test = test + 1
            self.queue.put('Test Queue Data #' + str(test))
            running = running - 1
            time.sleep(1)
        print 'Updateclient has stopped'


if __name__ == "__main__":
    s = Server()
    s.run()
I don't understand your logic -- in particular, why you deliberately set up two threads writing at the same time on the same socket (which they both call self.client), without any synchronization or coordination, an arrangement that seems guaranteed to cause problems.
Anyway, a definite bug in your code is your use of the send method. You appear to believe that it is guaranteed to send all of its argument string, but that's very definitely not the case; see the docs:
"Returns the number of bytes sent. Applications are responsible for checking that all data has been sent; if only some of the data was transmitted, the application needs to attempt delivery of the remaining data."
sendall is the method that you probably want:
"Unlike send(), this method continues to send data from string until either all data has been sent or an error occurs."
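To make the difference concrete, here is a minimal sketch of the loop that sendall performs for you. It is written in Python 3 syntax (the question's code is Python 2), and send_all is a hypothetical helper name, not part of the socket module; a local socket pair stands in for the real client connection.

```python
import socket

def send_all(sock, data):
    """Call send() repeatedly until every byte of data has been accepted."""
    total_sent = 0
    while total_sent < len(data):
        # send() may accept only part of the buffer; it returns how much it took
        sent = sock.send(data[total_sent:])
        if sent == 0:
            raise RuntimeError("socket connection broken")
        total_sent += sent
    return total_sent

# Demonstration over a connected local socket pair (assumption: stands in
# for the client socket in the question).
a, b = socket.socketpair()
payload = b"Feedback: hello;"
sent_count = send_all(a, payload)
received = b.recv(1024)
a.close()
b.close()
```

In practice you would simply call sock.sendall(data); the loop is only shown to illustrate what a bare send leaves undone.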
Other problems include: updateClient is apparently designed never to terminate (unlike the other two thread classes; when those terminate, updateClient instances will just keep running and keep the process alive); redundant global statements (innocuous, just confusing); and some threads reading a dict (via the iteritems method) while other threads are changing it, again without any locking or coordination. I'm sure there are more bugs or problems, but after spotting several, one's eyes tend to start to glaze over ;-).
You have three major problems. The first problem is likely the answer to your question.
Blocking (Question Problem)
socket.recv is blocking. This means that execution halts and the thread goes to sleep until it can read data from the socket. So your third thread (updateClient) just fills the queue up, but the queue only gets emptied when you receive a message, and even then only one item at a time.
This is likely why it will not send data unless you send it data.
Message Protocol On Stream Protocol
You are trying to use the socket stream like a message stream. What I mean is you have:
self.server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
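SOCK_STREAM means the socket is a byte stream, not a message-oriented socket such as SOCK_DGRAM. TCP has no notion of message boundaries, so you have to build the messages yourself, for example by prefixing each one with its length:
import struct
data = struct.pack('!I', len(msg)) + msg  # 4-byte length in network byte order, then the payload
sock.sendall(data)  # sock is the connected socket object, not the socket module
The receiving end then looks for the length field and reads the data into a buffer. Once enough data is in the buffer, it can pull out the entire message.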
Your current setup works because your messages are small enough to fit into a single packet and to land in the socket buffer together. However, once you start sending larger data over multiple calls to send or sendall, a single recv may return several messages at once or only part of one, unless you implement a message protocol on top of the socket byte stream.
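The receiving side of such a length-prefixed protocol could look roughly like the sketch below (Python 3 syntax). The helper names send_msg, recv_exact and recv_msg are made up for illustration, and the 4-byte big-endian header is an assumption; any fixed-size header works as long as both ends agree on it.

```python
import socket
import struct

HEADER = struct.Struct("!I")  # assumed header: 4-byte unsigned length, network byte order

def send_msg(sock, payload):
    """Send one message: length header followed by the payload bytes."""
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv() may return fewer."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise EOFError("peer closed the connection mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

def recv_msg(sock):
    """Read the length header, then exactly that many payload bytes."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)

# Two messages written back to back still come out as two separate messages.
a, b = socket.socketpair()
send_msg(a, b"first")
send_msg(a, b"second message")
messages = [recv_msg(b), recv_msg(b)]
a.close()
b.close()
```

Because the reader always knows how many bytes to expect, it no longer matters how TCP splits or merges the data in transit.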
Threads
Even though threads can be easier to use when starting out, they come with a lot of problems and can degrade performance if used incorrectly, especially in Python. I love threads, so do not get me wrong. Python has the GIL (global interpreter lock), so you get bad performance from threads that are CPU bound. Your code is mostly I/O bound at the moment, but in the future it may become CPU bound. You also have to worry about locking when threading. A thread can be a quick fix but may not be the best fix. There are circumstances where threading is quite simply the easiest way to break up some time-consuming process, so do not discard threads as evil or terrible. In Python they are considered bad mainly because of the GIL, and in any language because of concurrency issues, so most people recommend using multiple processes or asynchronous code in Python. Whether to use a thread or not is a complex subject: it depends on the language (how your code is run), the system (single or multiple processors), contention (trying to share a resource with locking), and other factors. Generally, though, asynchronous code is faster because it keeps the CPU busy with less overhead, especially if you are not CPU bound.
The solution is the usage of the select module in Python, or something similar. It will tell you when a socket has data to be read, and you can set a timeout parameter.
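As a rough illustration (Python 3 syntax), select.select takes an optional fourth argument, a timeout in seconds, and returns empty lists if nothing became ready in that time. wait_readable is a hypothetical helper name, and the socket pair stands in for a real client connection.

```python
import select
import socket

def wait_readable(sock, timeout):
    """Return True if sock has data to read within timeout seconds."""
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)

a, b = socket.socketpair()

idle = wait_readable(b, 0.1)    # nothing has been sent, so this times out
a.sendall(b"ping")
ready = wait_readable(b, 1.0)   # data is pending, so this returns early
data = b.recv(1024) if ready else b""

a.close()
b.close()
```

The thread only calls recv once select has reported the socket as readable, so it never gets stuck waiting for a client that has nothing to say.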
You can gain more performance by doing the work asynchronously (non-blocking sockets). To put a socket into non-blocking mode you simply call settimeout(0) on it, which makes its calls return immediately instead of blocking. On its own, however, that means you will constantly eat CPU spinning while waiting for data. The select module and friends prevent you from spinning.
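A small sketch of that behaviour, in Python 3 syntax, where a non-blocking recv with no data pending raises BlockingIOError (in Python 2 it raises socket.error instead). The socket pair is only a stand-in for a real connection.

```python
import select
import socket

a, b = socket.socketpair()
b.settimeout(0)  # non-blocking; equivalent to b.setblocking(False)

# With nothing to read, recv() fails immediately instead of sleeping.
try:
    b.recv(1024)
    would_block = False
except BlockingIOError:
    would_block = True

a.sendall(b"ping")
# Rather than retrying recv() in a tight loop, let select() sleep until
# the socket is readable (or one second has passed).
select.select([b], [], [], 1.0)
data = b.recv(1024)

a.close()
b.close()
```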
Generally, for performance you want to do as much work asynchronously (in the same thread) as possible, and then expand with more threads that are also doing as much as possible asynchronously. However, as previously noted, Python is an exception to this rule because of the GIL (global interpreter lock), which can actually degrade performance from what I have read. If you are interested, you should try writing a test case and find out!
You should also check out the thread locking primitives in the threading module, such as Lock, RLock, and Condition. They can help multiple threads share data without problems.
import threading

lock = threading.Lock()

def myfunction(arg):
    # only one thread at a time can be inside this block
    with lock:
        arg.do_something()
Some Python objects are thread safe and others are not.
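For a shared dict like the one in the question, a sketch along these lines (Python 3 syntax) keeps writers and readers from stepping on each other. The names users, users_lock, register and snapshot are hypothetical; the point is that every access goes through the same lock, and readers iterate over a copy rather than the live dict.

```python
import threading

users = {}                     # shared state, similar in spirit to globuser
users_lock = threading.Lock()  # guards every access to users

def register(uid, x, y):
    """Add or replace a user record while holding the lock."""
    with users_lock:
        users[uid] = {"uid": uid, "x": x, "y": y}

def snapshot():
    """Return a copy taken under the lock, safe to iterate afterwards."""
    with users_lock:
        return dict(users)

threads = [
    threading.Thread(target=register, args=("u%04d" % i, i, i))
    for i in range(50)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

copy = snapshot()
```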
Sending Updates Asynchronously (Improvement)
Instead of using a third thread only to send updates, you could have the client thread send them by comparing the current time with the time the last update was sent. This would remove the need for the Queue and the extra Thread. To do this you must convert your client code into asynchronous code and put a timeout on your select call, so that at regular intervals you can check the current time and see whether an update is due.
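One way this could look, as a sketch under stated assumptions rather than a drop-in replacement (Python 3 syntax): serve_client, update_interval and max_updates are made-up names, the update payload is a placeholder, and the loop stops after a fixed number of updates only so that the example terminates.

```python
import select
import socket
import time

def serve_client(sock, update_interval, max_updates):
    """Echo incoming data and push an update every update_interval seconds."""
    last_update = time.monotonic()
    updates_sent = 0
    while updates_sent < max_updates:
        # Sleep in select() only until the next update is due.
        wait = max(0.0, update_interval - (time.monotonic() - last_update))
        readable, _, _ = select.select([sock], [], [], wait)
        if readable:
            data = sock.recv(1024)
            if not data:
                break  # peer closed the connection
            sock.sendall(b"Feedback: " + data + b";")
        if time.monotonic() - last_update >= update_interval:
            updates_sent += 1
            sock.sendall(("Update #%d;" % updates_sent).encode("ascii"))
            last_update = time.monotonic()
    return updates_sent

# Demonstration: one end plays the client, the other runs the loop.
a, b = socket.socketpair()
a.sendall(b"hi")
count = serve_client(b, 0.05, 2)
b.close()

received = b""
while True:
    chunk = a.recv(1024)
    if not chunk:
        break
    received += chunk
a.close()
```

Only one thread ever writes to the socket here, which also removes the problem of two threads sending on the same connection without coordination.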
Summary
I would recommend you rewrite your code using asynchronous socket code. I would also recommend that you use a single thread for all clients and the server. This will improve performance and decrease latency. It would make debugging easier because you would have no possible concurrency issues like you have with threads. Also, fix your message protocol before it fails.
