Deadlock in Python Threads - python

I am trying to implement a simpley portscanner with Python. It works by creating a number of worker threads which scan ports that are provided in a queue. They save the results in another queue. When all ports are scanned the threads and the application should terminate. And here lies the problem: For small numbers of ports everything works fine, but if I try to scan 200 or more ports, the application will get caught in a deadlock. I have no idea, why.
class ConnectScan(threading.Thread):
def __init__(self, to_scan, scanned):
threading.Thread.__init__(self)
self.to_scan = to_scan
self.scanned = scanned
def run(self):
while True:
try:
host, port = self.to_scan.get()
except Queue.Empty:
break
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
s.connect((host, port))
s.close()
self.scanned.put((host, port, 'open'))
except socket.error:
self.scanned.put((host, port, 'closed'))
self.to_scan.task_done()
class ConnectScanner(object):
def scan(self, host, port_from, port_to):
to_scan = Queue.Queue()
scanned = Queue.Queue()
for port in range(port_from, port_to + 1):
to_scan.put((host, port))
for i in range(20):
ConnectScan(to_scan, scanned).start()
to_scan.join()
Does anybody see what might be wrong? Also I would appreciate some tipps how to debug such threading issues in Python.

I don't see anything obviously wrong with your code, but as it stands the break will never be hit - self.to_scan.get() will wait forever rather than raising Queue.Empty. Given that you're loading up the queue with ports to scan before starting the threads, you can change that to self.to_scan.get(False) to have the worker threads exit correctly when all the ports have been claimed.
Combined with the fact that you have non-daemon threads (threads that will keep the process alive after the main thread finishes), that could be the cause of the hang. Try printing something after the to_scan.join() to see whether it's stopped there, or at process exit.
As Ray says, if an exception other than socket.error is raised between self.to_scan.get() and self.to_scan.task_done(), then the join call will hang. It could help to change that code to use a try/finally to be sure:
def run(self):
while True:
try:
host, port = self.to_scan.get(False)
except Queue.Empty:
break
try:
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
s.connect((host, port))
s.close()
self.scanned.put((host, port, 'open'))
except socket.error:
self.scanned.put((host, port, 'closed'))
finally:
self.to_scan.task_done()
In general, debugging multithreaded processes is tricky. I try to avoid anything blocking indefinitely - it's better to have something crash noisily because a timeout was too short than to have it just stop forever waiting for an item that will never appear. So I'd specify timeouts for your self.to_scan.get, socket.connect and to_scan.join calls.
Use logging to work out the order events are occurring - printing can get interleaved from different threads, but loggers are thread-safe.
Also, something like this recipe can be handy for dumping the current stack trace for each thread.
I haven't used any debuggers with support for debugging multiple threads in Python, but there are some listed here.

It is likely that not all items on the to_scan queue are consumed and that you're not calling the task_done method enough times to unblock ConnectScanner.
Could it be that an exception is thrown during the runtime of ConnectScan.run that you're not catching and your threads prematurely terminate?

Related

Gracefully stop socket during blocking call to socket.recv()

I have a program which runs on 2 threads. The main thread is for its own work and the other thread keeps calling recv() on a UDP socket.
Basically, the code structure looks like this:
done = False
def run_sock():
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(('localhost', 12345))
while not done: # receive data until work done
data = sock.recv(1500)
print(data)
sock.close()
thread = threading.Thread(target=run_sock, daemon=True)
thread.start()
# Main thread
while not done:
... # Do work here
if some_condition: # Stop running, thread should as well
done = True
thread.join()
I want to close the socket when the main thread changes done to True, but when that happens, the socket is still in its current blocking recv call and it has to receive another message before it finally stops.
Is there a way to gracefully close the socket (without having to handle errors)? I've tried sock.shutdown(socket.SHUT_RDWR), sock.setblocking(False) and but they all raise errors.
So sock.recv(1500) will block until it receives something. If it receives nothing, then it waits.
But if you set a timeout then periodically that wait will throw an exception and you can do other stuff (like look at the done flag) before trying to read again.
sock.settimeout(1.0)
sock.bind(...)
while not done:
try:
data = sock.recv(1500)
except timeout:
continue
sock.close()
Of course, if the remote end closes the connection that is different. Then you need to look at data to see if it is empty.
while not done:
try:
data = sock.recv(1500)
if not data:
break
except timeout:
continue

Multithreaded TCP socket

I'm trying to create a threaded TCP socket server that can handle multiple socket request at a time.
To test it, I launch several thread in the client side to see if my server can handle it. The first socket is printed successfully but I get a [Errno 32] Broken pipe for the others.
I don't know how to avoid it.
import threading
import socketserver
import graphitesend
class ThreadedTCPRequestHandler(socketserver.BaseRequestHandler):
def handle(self):
data = self.request.recv(1024)
if data != "":
print(data)
class ThreadedTCPServer(socketserver.ThreadingTCPServer):
allow_reuse_address = True
def __init__(self, host, port):
socketserver.ThreadingTCPServer.__init__(self, (host, port), ThreadedTCPRequestHandler)
def stop(self):
self.server_close()
self.shutdown()
def start(self):
threading.Thread(target=self._on_started).start()
def _on_started(self):
self.serve_forever()
def client(g):
g.send("test", 1)
if __name__ == "__main__":
HOST, PORT = "localhost", 2003
server = ThreadedTCPServer(HOST, PORT)
server.start()
g = graphitesend.init(graphite_server = HOST, graphite_port = PORT)
threading.Thread(target = client, args=(g,)).start()
threading.Thread(target = client, args=(g,)).start()
threading.Thread(target = client, args=(g,)).start()
threading.Thread(target = client, args=(g,)).start()
threading.Thread(target = client, args=(g,)).start()
threading.Thread(target = client, args=(g,)).start()
threading.Thread(target = client, args=(g,)).start()
server.stop()
It's a little bit difficult to determine what exactly you're expecting to happen, but I think the proximate cause is that you aren't giving your clients time to run before killing the server.
When you construct a Thread object and call its start method, you're creating a thread, and getting it ready to run. It will then be placed on the "runnable" task queue on your system, but it will be competing with your main thread and all your other threads (and indeed all other tasks on the same machine) for CPU time.
Your multiple threads (main plus others) are also likely being serialized by the python interpreter's GIL (Global Interpreter Lock -- assuming you're using the "standard" CPython) which means they may not have even gotten "out of the gate" yet.
But then you're shutting down the server with server_close() before they've had a chance to send anything. That's consistent with the "Broken Pipe" error: your remaining clients are attempting to write to a socket that has been closed by the "remote" end.
You should collect the thread objects as you create them and put them in a list (so that you can reference them later). When you're finished creating and starting all of them, then go back through the list and call the .join method on each thread object. This will ensure that the thread has had a chance to finish. Only then should you shut down the server. Something like this:
threads = []
for n in range(7):
th = threading.Thread(target=client, args=(g,))
th.start()
threads.append(th)
# All threads created. Wait for them to finish.
for th in threads:
th.join()
server.stop()
One other thing to note is that all of your clients are sharing the same single connection to send to the server, so that your server will never create more than one thread: as far as it's concerned, there is only a single client. You should probably move the graphitesend.init into the client function if you actually want separate connections for each client.
(Disclaimer: I know nothing about graphitesend except what I could glean in a 15 second glance at the first result in google; I'm assuming it's basically just a wrapper around a TCP connection.)

Python: Multithreaded socket server runs endlessly when client stops unexpectedly

I have created a multithreaded socket server to connect many clients to the server using python. If a client stops unexpectedly due to an exception, server runs nonstop. Is there a way to kill that particular thread alone in the server and the rest running
Server:
class ClientThread(Thread):
def __init__(self,ip,port):
Thread.__init__(self)
self.ip = ip
self.port = port
print("New server socket thread started for " + ip + ":" + str(port))
def run(self):
while True :
try:
message = conn.recv(2048)
dataInfo = message.decode('ascii')
print("recv:::::"+str(dataInfo)+"::")
except:
print("Unexpected error:", sys.exc_info()[0])
Thread._stop(self)
tcpServer = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcpServer.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
tcpServer.bind((TCP_IP, 0))
tcpServer.listen(10)
print("Port:"+ str(tcpServer.getsockname()[1]))
threads = []
while True:
print( "Waiting for connections from clients..." )
(conn, (ip,port)) = tcpServer.accept()
newthread = ClientThread(ip,port)
newthread.start()
threads.append(newthread)
for t in threads:
t.join()
Client:
def Main():
s = socket.socket(socket.AF_INET,socket.SOCK_STREAM)
s.connect((host,int(port)))
while True:
try:
message = input("Enter Command")
s.send(message.encode('ascii'))
except Exception as ex:
logging.exception("Unexpected error:")
break
s.close()
Sorry about a very, very long answer but here goes.
There are quite a many issues with your code. First of all, your client does not actually close the socket, as s.close() will never get executed. Your loop is interrupted at break and anything that follows it will be ignored. So change the order of these statements for the sake of good programming but it has nothing to do with your problem.
Your server code is wrong in quite a many ways. As it is currently written, it never exits. Your threads also do not work right. I have fixed your code so that it is a working, multithreaded server, but it still does not exit as I have no idea what would be the trigger to make it exit. But let us start from the main loop:
while True:
print( "Waiting for connections from clients..." )
(conn, (ip,port)) = tcpServer.accept()
newthread = ClientThread(conn, ip,port)
newthread.daemon = True
newthread.start()
threads.append(newthread) # Do we need this?
for t in threads:
t.join()
I have added passing of conn to your client thread, the reason of which becomes apparent in a moment. However, your while True loop never breaks, so you will never enter the for loop where you join your threads. If your server is meant to be run indefinitely, this is not a problem at all. Just remove the for loop and this part is fine. You do not need to join threads just for the sake of joining them. Joining threads only allows your program to block until a thread has finished executing.
Another addition is newthread.daemon = True. This sets your threads to daemonic, which means they will exit as soon as your main thread exits. Now your server responds to control + c even when there are active connections.
If your server is meant to be never ending, there is also no need to store threads in your main loop to threads list. This list just keeps growing as a new entry will be added every time a client connects and disconnects, and this leaks memory as you are not using the threads list for anything. I have kept it as it was there, but there still is no mechanism to exit the infinite loop.
Then let us move on to your thread. If you want to simplify the code, you can replace the run part with a function. There is no need to subclass Thread in this case, but this works so I have kept your structure:
class ClientThread(Thread):
def __init__(self,conn, ip,port):
Thread.__init__(self)
self.ip = ip
self.port = port
self.conn = conn
print("New server socket thread started for " + ip + ":" + str(port))
def run(self):
while True :
try:
message = self.conn.recv(2048)
if not message:
print("closed")
try:
self.conn.close()
except:
pass
return
try:
dataInfo = message.decode('ascii')
print("recv:::::"+str(dataInfo)+"::")
except UnicodeDecodeError:
print("non-ascii data")
continue
except socket.error:
print("Unexpected error:", sys.exc_info()[0])
try:
self.conn.close()
except:
pass
return
First of all, we store conn to self.conn. Your version used a global version of conn variable. This caused unexpected results when you had more than one connection to the server. conn is actually a new socket created for the client connection at accept, and this is unique to each thread. This is how servers differentiate between client connections. They listen to a known port, but when the server accepts the connection, accept creates another port for that particular connection and returns it. This is why we need to pass this to the thread and then read from self.conn instead of global conn.
Your server "hung" upon client connetion errors as there was no mechanism to detect this in your loop. If the client closes connection, socket.recv() does not raise an exception but returns nothing. This is the condition you need to detect. I am fairly sure you do not even need try/except here but it does not hurt - but you need to add the exception you are expecting here. In this case catching everything with undeclared except is just wrong. You have also another statement there potentially raising exceptions. If your client sends something that cannot be decoded with ascii codec, you would get UnicodeDecodeError (try this without error handling here, telnet to your server port and copypaste some Hebrew or Japanese into the connection and see what happens). If you just caught everything and treated as socket errors, you would now enter the thread ending part of the code just because you could not parse a message. Typically we just ignore "illegal" messages and carry on. I have added this. If you want to shut down the connection upon receiving a "bad" message, just add self.conn.close() and return to this exception handler as well.
Then when you really are encountering a socket error - or the client has closed the connection, you will need to close the socket and exit the thread. You will call close() on the socket - encapsulating it in try/except as you do not really care if it fails for not being there anymore.
And when you want to exit your thread, you just return from your run() loop. When you do this, your thread exits orderly. As simple as that.
Then there is yet another potential problem, if you are not only printing the messages but are parsing them and doing something with the data you receive. This I do not fix but leave this to you.
TCP sockets transmit data, not messages. When you build a communication protocol, you must not assume that when your recv returns, it will return a single message. When your recv() returns something, it can mean one of five things:
The client has closed the connection and nothing is returned
There is exactly one full message and you receive that
There is only a partial message. Either because you read the socket before the client had transmitted all data, or because the client sent more than 2048 bytes (even if your client never sends over 2048 bytes, a malicious client would definitely try this)
There are more than one messages waiting and you received them all
As 4, but the last message is partial.
Most socket programming mistakes are related to this. The programmer expects 2 to happen (as you do now) but they do not cater for 3-5. You should instead analyse what was received and act accordingly. If there seems to be less data than a full message, store it somewhere and wait for more data to appear. When more data appears, concatenate these and see if you now have a full message. And when you have parsed a full message from this buffer, inspect the buffer to see if there is more data there - the first part of the next message or even more full messages if your client is fast and server is slow. If you process a message and then wipe the buffer, you might have wiped also bytes from your next message.

Python - Can't kill main thread with KeyboardInterrupt

I'm making a simple multi-threaded port scanner. It scans all ports on host and returns open ports. The trouble is interrupting the scan. It take a lot of time for a scan to complete and sometimes I wish to kill program with C-c while in the middle of scan. Trouble is the scan won't stop. Main thread is locked on queue.join() and oblivious to KeyboardInterrupt, until all data from queue is processed thus deblocking main thread and exiting program gracefully. All my threads are daemonized so when main thread dies they should die with him.
I tried using signal lib, no success. Overriding threading.Thread class and adding method for graceful termination didn't work... Main thread just won't receive KeyboardInterrupt while executing queue.join()
import threading, sys, Queue, socket
queue = Queue.Queue()
def scan(host):
while True:
port = queue.get()
if port > 999 and port % 1000 == 0:
print port
try:
#sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
#sock.settimeout(2) #you need timeout or else it will try to connect forever!
#sock.connect((host, port))
#----OR----
sock = socket.create_connection((host, port), timeout = 2)
sock.send('aaa')
data = sock.recv(100)
print "Port {} open, message: {}".format(port, data)
sock.shutdown()
sock.close()
queue.task_done()
except:
queue.task_done()
def main(host):
#populate queue
for i in range(1, 65536):
queue.put(i)
#spawn worker threads
for port in range(100):
t = threading.Thread(target = scan, args = (host,))
t.daemon = True
t.start()
if __name__ == '__main__':
host = ""
#does input exist?
try:
host = sys.argv[1]
except:
print "No argument was recivied!"
exit(1)
#is input sane?
try:
host = socket.gethostbyname(host)
except:
print "Adress does not exist"
exit(2)
#execute main program and wait for scan to complete
main(host)
print "Post main() call!"
try:
queue.join()
except KeyboardInterrupt:
print "C-C"
exit(3)
EDIT:
I have found a solution by using time module.
#execute main program and wait for scan to complete
main(host)
#a little trick. queue.join() makes main thread immune to keyboardinterrupt. So use queue.empty() with time.sleep()
#queue.empty() is "unreliable" so it may return True a bit earlier then intented.
#when queue is true, queue.join() is executed, to confirm that all data was processed.
#not a true solution, you can't interrupt main thread near the end of scan (when queue.empty() returns True)
try:
while True:
if queue.empty() == False:
time.sleep(1)
else:
break
except KeyboardInterrupt:
print "Alas poor port scanner..."
exit(1)
queue.join()
You made your threads daemons already, but you need to keep your main thread alive while daemon threads are there, there's how to do that: Cannot kill Python script with Ctrl-C
When you create the threads add them to a list of running threads and when dealing with ctrl-C send a kill signal to each thread on the list. That way you are actively cleaning up rather than relying on it being done for you.

How can I restart a BaseHTTPServer instance?

This is what I have:
http.py:
class HTTPServer():
def __init__(self, port):
self.port = port
self.thread = None
self.run = True
def serve(self):
self.thread = threading.Thread(target=self._serve)
self.thread.start()
def _serve(self):
serverAddress = ("", self.port)
self.server = MyBaseHTTPServer(serverAddress,MyRequestHandler)
logging.log(logging.INFO, "HTTP server started on port %s"%self.port)
while self.run:
self.server.handle_request()
def stop(self):
self.run = False
self.server.server_close()
Then in another file, to restart it:
def restartHTTP(self):
try:
self.httpserver.stop()
reload(http)
self.httpserver = http.HTTPServer(80)
self.httpserver.serve()
except:
traceback.print_exc()
This gives me an address already in use error, so it seems the HTTP server isn't stopping properly. What else do I need to do to stop it?
EDIT:
Where I call restartHTTP:
def commandHTTPReload(self, parts, byuser, overriderank):
self.client.factory.restartHTTP()
self.client.sendServerMessage("HTTP server reloaded.")
I do know the command is executing because I get the message it's supposed to send.
You just need to let the OS know that you really do want to reuse the port immediately after closing it. Normally it's held in a closed state for a while, in case any extra packets show up. You do this with SO_REUSEADDR:
mysocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
..after opening mysocket. A good place to do this with HTTPServer could be in an overridden server_bind method:
def server_bind(self):
HTTPServer.server_bind(self)
self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
Edit: Having looked more closely at your code, I see that your threading model is also likely causing problems here. You're closing the socket in the main(?) thread while the other thread is waiting on a connection on that same socket (in accept()). This arrangement does not have well-defined semantics, and I believe it does different things on different OSes. In any case, it is something you ought to avoid in order to minimize confusion (already lots of that to go around in a multithreaded program). Your old thread will not actually go away until after it gets a connection and handles its request (because it won't re-check self.run until then), and so the port may not be re-bindable until after that.
There isn't really a simple solution to this. You could add a communication pipe between the threads, and then use select()/poll() in the server thread to wait for activity on either of them, or you could timeout the accept() calls after a short amount of time so that self.run gets checked more frequently. Or you could have the main thread connect to the listening socket itself. But whatever you do, you're probably approaching the level of complexity where you ought to look at using a "real" httpd or network framework instead of rolling your own: apache, lighttpd, Tornado, Twisted, etc.
For gracefully stop HTTPServer and close socket one should use:
# Start server
httpd = HTTPServer(...)
httpd.serve_forever()
# Stop server
httpd.shutdown()
httpd.server_close()

Categories

Resources