Is there a way to test whether the connection still exists before executing transport.write()?
I have modified the simpleserv/simpleclient examples so that a message is sent (written to Protocol.transport) every 5 seconds. The connection is persistent.
When I disconnect my wifi, it still writes to the transport (of course the messages don't arrive on the other side), but no error is thrown.
When I enable the wifi again, the messages are delivered, but the next attempt to send a message fails (and Protocol.connectionLost is called).
Here is what happens, chronologically:
1. Sending a message establishes the connection; the message is delivered.
2. Disabling wifi.
3. Sending a message writes to the transport and does not throw an error; the message does not arrive.
4. Enabling wifi.
5. The message sent in step 3 arrives.
6. Sending a message results in a Protocol.connectionLost call.
It would be nice to know, before executing step 6, whether I can write to the transport. Is there any way?
Server:
# Copyright (c) Twisted Matrix Laboratories.
# See LICENSE for details.
from twisted.internet import reactor, protocol
class Echo(protocol.Protocol):
"""This is just about the simplest possible protocol"""
def dataReceived(self, data):
"As soon as any data is received, write it back."
print
print data
self.transport.write(data)
def main():
"""This runs the protocol on port 8000"""
factory = protocol.ServerFactory()
factory.protocol = Echo
reactor.listenTCP(8000,factory)
reactor.run()
# this only runs if the module was *not* imported
if __name__ == '__main__':
main()
Client:
# Copyright (c) Twisted Matrix Laboratories.
# See LICENSE for details.
"""
An example client. Run simpleserv.py first before running this.
"""
from twisted.internet import reactor, protocol
# a client protocol
counter = 0
class EchoClient(protocol.Protocol):
"""Once connected, send a message, then print the result."""
def connectionMade(self):
print 'connectionMade'
def dataReceived(self, data):
"As soon as any data is received, write it back."
print "Server said:", data
def connectionLost(self, reason):
print "connection lost"
def say_hello(self):
global counter
counter += 1
msg = '%s. hello, world' %counter
print 'sending: %s' %msg
self.transport.write(msg)
class EchoFactory(protocol.ClientFactory):
def buildProtocol(self, addr):
self.p = EchoClient()
return self.p
def clientConnectionFailed(self, connector, reason):
print "Connection failed - goodbye!"
def clientConnectionLost(self, connector, reason):
print "Connection lost - goodbye!"
def say_hello(self):
self.p.say_hello()
reactor.callLater(5, self.say_hello)
# this connects the protocol to a server running on port 8000
def main():
f = EchoFactory()
reactor.connectTCP("REMOTE_SERVER_ADDR", 8000, f)
reactor.callLater(5, f.say_hello)
reactor.run()
# this only runs if the module was *not* imported
if __name__ == '__main__':
main()
Protocol.connectionLost is the only way to know when the connection no longer exists. It is also called at the earliest time when it is known that the connection no longer exists.
It is obvious to you or me that disconnecting your network adapter (i.e., turning off your wifi card) will break the connection - at least, if you leave it off or if you configure it differently when you turn it back on again. It's not obvious to your platform's TCP implementation, though.
Since network communication isn't instant and any individual packet may be lost for normal (non-fatal) reasons, TCP includes various timeouts and retries. When you disconnect your network adapter these packets can no longer be delivered, but the platform doesn't know that this condition will outlast the longest TCP timeout. So your TCP connection doesn't get closed when you turn off your wifi. It hangs around, retrying the send and waiting for an acknowledgement.
At some point the timeouts and retries all expire and the connection really does get closed (although the way TCP works means that if there is no data waiting to be sent then there actually isn't a timeout; a "dead" connection will live forever, and addressing this is the reason the TCP "keepalive" feature exists). This is made slightly more complicated by the fact that there are timeouts on both sides of the connection. If the connection closes as soon as you do the write in step 6 (and no sooner), then the cause is probably a "reset" (RST) packet.
A reset will occur after the timeout on the other side of the connection expires and closes the connection while the connection is still open on your side. Now, when your side sends a packet for this TCP connection, the other side won't recognize the TCP connection it belongs to (because as far as the other side is concerned, that connection no longer exists) and will reply with a reset message. This tells the original sender that there is no such connection. The original sender reacts to this by closing its side of the connection (since one side of a two-sided connection isn't very useful by itself). This is presumably when Protocol.connectionLost is called in your application.
All of this is basically just how TCP works. If the timeout behavior isn't suitable for your application then you have a couple of options. You could turn on TCP keepalives (this usually doesn't help: by default TCP keepalives introduce timeouts that are hours long, though you can tune this on most platforms) or you could build an application-level keepalive feature. This is simply some extra traffic that your protocol generates and then expects a response to. You can build your own timeouts on top of this (no response in 3 seconds? close the connection and establish a new one) or just rely on it to trigger one of the somewhat faster (~2 minute) TCP timeouts. The downside of a faster timeout is that spurious network issues may cause you to close the connection when you really didn't need to.
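To make that concrete, here is a minimal sketch (not from the original answer) of both ideas in Twisted: transport.setTcpKeepAlive() turns on the OS-level keepalives, and a LoopingCall plus callLater implement an application-level ping with its own timeout. The protocol name, ping payload and intervals are invented for the example.

from twisted.internet import protocol, reactor, task

class KeepalivePinger(protocol.Protocol):
    """Hypothetical protocol that pings its peer and drops dead connections."""
    PING_INTERVAL = 5   # seconds between pings (illustrative)
    PING_TIMEOUT = 3    # seconds to wait for any reply (illustrative)

    def connectionMade(self):
        # also ask the OS for TCP-level keepalives (intervals are platform-tuned)
        self.transport.setTcpKeepAlive(1)
        self._timeout = None
        self._pinger = task.LoopingCall(self._send_ping)
        self._pinger.start(self.PING_INTERVAL, now=False)

    def _send_ping(self):
        self.transport.write(b'ping\n')
        # if nothing at all comes back in time, treat the connection as dead
        self._timeout = reactor.callLater(self.PING_TIMEOUT,
                                          self.transport.loseConnection)

    def dataReceived(self, data):
        # any inbound data counts as proof that the connection is still alive
        if self._timeout is not None and self._timeout.active():
            self._timeout.cancel()

    def connectionLost(self, reason):
        if self._pinger.running:
            self._pinger.stop()
        if self._timeout is not None and self._timeout.active():
            self._timeout.cancel()

Hooked up to a client factory in place of EchoClient, connectionLost would then fire within PING_INTERVAL + PING_TIMEOUT seconds of the link going dead, instead of after the multi-minute TCP timeouts.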
I have a fake HTTP server that I use as a fixture in my testing. At some point in the test, I want to stop the server regardless of any still open connections. Clients on these open connections should get a TCP FIN.
I am aware that production servers usually need to solve a different problem, that of quiescing, sometimes called graceful shutdown. This is the opposite of what I want.
With a standalone process, it is usually possible to simply get the process to quit and the OS will take care of the rest. (Forcibly killing processes is easy, while forcibly killing threads is not.) My fake server is, however, running in a thread of the test process itself, so I don't have this option (and I don't want to externalize it if there is another way around it).
I investigated this issue in Python, with the HTTPServer class, where I was not able to find any solution.
I also investigated this in Go, where I found the concept of Contexts, which is close to what I need, but it works the other way around: an HTTP server would propagate a Context that can be used to cancel, e.g., a database lookup if a client disconnects.
Edit: it looks like Go actually does what I need and has separate graceful and non-graceful shutdown methods, with the non-graceful one being net/http#Server.Close.
server = http.server.HTTPServer(...)
thread = threading.Thread(target=server.serve_forever)
thread.start()
# a client has connected ....
server.shutdown()
# at this point I want to have the server stopped,
# without waiting for the request handling to complete
I've implemented the Go solution in Python. When a new client connects, I remember the client socket, and when I want to quit, I shut down all remembered sockets.
It seems to work.
import socket
from http.server import HTTPServer
from typing import Any, List, Tuple
class MyHTTPServer(HTTPServer):
"""Adds a method to the HTTPServer to allow it to exit gracefully"""
def __init__(self, addr, handler_cls):
super().__init__(addr, handler_cls)
self._client_sockets: List[socket.socket] = []
self.server_killed = False
def get_request(self) -> Tuple[socket.socket, Any]:
"""Remember the client socket"""
sock, addr = super().get_request()
self._client_sockets.append(sock)
return sock, addr
def shutdown_request(self, request: socket.socket) -> None:
"""Forget the client socket"""
self._client_sockets.remove(request)
print(f"{self._client_sockets=}")
super().shutdown_request(request)
def force_disconnect_clients(self) -> None:
"""Shutdown the remembered sockets"""
for client in self._client_sockets:
client.shutdown(socket.SHUT_RDWR)
Usage
server = MyHTTPServer(server_addr, MyRequestHandler)
# in a new thread
while not server.server_killed:
    server.handle_request()
# ... use the server (keep in mind it can have at most one client at a time) ...
# in the main program
server.server_killed = True
server.force_disconnect_clients()
server.server_close()
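For completeness, a hedged sketch of how those pieces might be wired together with a thread (MyRequestHandler and server_addr are placeholders, as above):

import threading

def serve(srv):
    # handle_request() returns after each request, so this loop can notice
    # server_killed instead of relying on serve_forever()'s internal poll
    while not srv.server_killed:
        srv.handle_request()

server = MyHTTPServer(server_addr, MyRequestHandler)
thread = threading.Thread(target=serve, args=(server,))
thread.start()

# ... run the test against the fixture ...

server.server_killed = True
server.force_disconnect_clients()   # open clients get cut off immediately
server.server_close()
thread.join(timeout=5)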
My Python script constantly has to send messages to RabbitMQ once it receives one from another data source. The frequency with which the script sends them can vary, say, from 1 minute to 30 minutes.
Here's how I establish a connection to RabbitMQ:
rbt_conn = pika.BlockingConnection(pika.ConnectionParameters("some_host"))
channel = rbt_conn.channel()
I just got an exception
pika.exceptions.ConnectionClosed
How can I reconnect to it? What's the best way? Is there any "strategy"? Is there an ability to send pings to keep a connection alive or set timeout?
Any pointers will be appreciated.
RabbitMQ uses heartbeats to detect and close "dead" connections and to prevent network devices (firewalls etc.) from terminating "idle" connections. From version 3.5.5 on, the default timeout is set to 60 seconds (previously it was ~10 minutes). From the docs:
Heartbeat frames are sent about every timeout / 2 seconds. After two missed heartbeats, the peer is considered to be unreachable.
The problem with Pika's BlockingConnection is that it is unable to respond to heartbeats until some API call is made (for example, channel.basic_publish(), connection.sleep(), etc.).
The approaches I found so far:
Increase or deactivate the timeout
RabbitMQ negotiates the timeout with the client when establishing the connection. In theory, it should be possible to override the server default value with a bigger one using the heartbeat_interval argument, but the current Pika version (0.10.0) uses the minimum of the values offered by the server and the client. This issue is fixed on the current master.
On the other hand, it is possible to deactivate the heartbeat functionality completely by setting the heartbeat_interval argument to 0, which may well drive you into new issues (firewalls dropping idle connections, etc.).
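As an illustrative sketch (hostname made up, keyword as in pika 0.10), disabling or stretching the heartbeat looks like this:

import pika

# heartbeat_interval=0 requests no heartbeats at all; a large value such as
# 600 would merely lengthen the timeout instead
params = pika.ConnectionParameters(host='some_host', heartbeat_interval=0)
connection = pika.BlockingConnection(params)
channel = connection.channel()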
Reconnecting
Expanding on #itsafire's answer, you can write your own publisher class, letting you reconnect when required. An example naive implementation:
import logging
import json
import pika
class Publisher:
EXCHANGE='my_exchange'
TYPE='topic'
ROUTING_KEY = 'some_routing_key'
def __init__(self, host, virtual_host, username, password):
self._params = pika.connection.ConnectionParameters(
host=host,
virtual_host=virtual_host,
credentials=pika.credentials.PlainCredentials(username, password))
self._conn = None
self._channel = None
def connect(self):
if not self._conn or self._conn.is_closed:
self._conn = pika.BlockingConnection(self._params)
self._channel = self._conn.channel()
self._channel.exchange_declare(exchange=self.EXCHANGE,
type=self.TYPE)
def _publish(self, msg):
self._channel.basic_publish(exchange=self.EXCHANGE,
routing_key=self.ROUTING_KEY,
body=json.dumps(msg).encode())
logging.debug('message sent: %s', msg)
def publish(self, msg):
"""Publish msg, reconnecting if necessary."""
try:
self._publish(msg)
except pika.exceptions.ConnectionClosed:
logging.debug('reconnecting to queue')
self.connect()
self._publish(msg)
def close(self):
if self._conn and self._conn.is_open:
logging.debug('closing queue connection')
self._conn.close()
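Hypothetical usage of the Publisher above (host, virtual host and credentials are placeholders):

pub = Publisher('some_host', '/', 'guest', 'guest')
pub.connect()
pub.publish({'event': 'reading', 'value': 42})  # reconnects behind the scenes if the connection dropped
pub.close()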
Other possibilities
Other possibilities which I haven't explored yet:
Using an asynchronous adapter for publishing
Keeping your RabbitMQ connection and your "publish" code on a background thread, which periodically calls connection.sleep() to respond to server heartbeats (a rough sketch follows).
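That last idea could look roughly like this (class, queue and routing key names are invented for the sketch; the thread owns the connection, so heartbeats get serviced by connection.sleep() while it is idle):

import json
import queue
import threading

import pika

class ThreadedPublisher(threading.Thread):
    """Hypothetical background publisher that keeps the connection alive."""
    def __init__(self, params, routing_key='some_routing_key'):
        super().__init__(daemon=True)
        self._params = params
        self._routing_key = routing_key
        self._outbox = queue.Queue()
        self._stopped = threading.Event()

    def publish(self, msg):
        self._outbox.put(msg)   # safe to call from other threads

    def stop(self):
        self._stopped.set()

    def run(self):
        conn = pika.BlockingConnection(self._params)
        channel = conn.channel()
        while not self._stopped.is_set():
            try:
                msg = self._outbox.get(timeout=1)
                channel.basic_publish(exchange='',
                                      routing_key=self._routing_key,
                                      body=json.dumps(msg).encode())
            except queue.Empty:
                conn.sleep(1)   # gives pika a chance to answer heartbeats
        conn.close()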
Dead simple: some pattern like this.
import time
while True:
try:
communication_handles = connect_pika()
do_your_stuff(communication_handles)
except pika.exceptions.ConnectionClosed:
print 'oops. lost connection. trying to reconnect.'
# avoid rapid reconnection on longer RMQ server outage
time.sleep(0.5)
You will probably have to refactor your code, but basically it is about catching the exception, mitigating the problem, and continuing with your stuff.
The communication_handles contain all the pika elements, like channels and queues, that your code needs to communicate with RabbitMQ via pika.
I'm developing a Flask/gevent WSGIServer web server that needs to communicate (in the background) with a hardware device over two sockets using XML.
One socket is initiated by the client (my application) and I can send XML commands to the device. The device answers on a different port and sends back information that my application has to confirm. So my application has to listen to this second port.
Up until now I have issued a command, opened the second port as a server, waited for a response from the device and closed the second port.
The problem is that it's possible that the device sends multiple responses that I have to confirm. So my solution was to keep the port open and keep responding to incoming requests. However, in the end the device is done sending requests, and my application is still listening (I don't know when the device is done), thereby blocking everything else.
This seemed like a perfect use case for a thread, so that my application launches a listening server in a separate thread. Because I'm already using gevent as a WSGI server for Flask, I can use greenlets.
The problem is, I have looked for a good example of such a thing, but all I can find are examples of multi-threaded handlers for a single socket server. I don't need to handle a lot of connections on the socket server, but I need it launched in a separate thread so it can listen for and handle incoming messages while my main program keeps sending messages.
The second problem I'm running into is that in the server I need to use some methods from my "main" class. Being relatively new to Python, I'm unsure how to structure it in a way that makes this possible.
class Device(object):
def __init__(self, ...):
self.clientsocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
self.serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
def _connect_to_device(self):
print "OPEN CONNECTION TO DEVICE"
try:
self.clientsocket.connect((self.ip, 5100))
except socket.error as e:
pass
def _disconnect_from_device(self):
print "CLOSE CONNECTION TO DEVICE"
self.clientsocket.close()
def deviceaction1(self, ...):
# the data that is sent is an XML document that depends on the parameters of this method.
self._connect_to_device()
self._send_data(XMLdoc)
self._wait_for_response()
return True
def _send_data(self, data):
print "SEND:"
print(data)
self.clientsocket.send(data)
def _wait_for_response(self):
print "WAITING FOR REQUESTS FROM DEVICE (CHANNEL 1)"
self.serversocket.bind(('10.0.0.16', 5102))
self.serversocket.listen(5) # listen for answer, maximum 5 connections
connection, address = self.serversocket.accept()
        # the data is of a specific length I can calculate
        data = connection.recv(1024)  # buffer size is illustrative here
        if len(data) > 0:
self._process_response(data)
self.serversocket.close()
def _process_response(self, data):
print "RECEIVED:"
print(data)
# here is some code that processes the incoming data and
# responds to the device
# this may or may not result in more incoming data
if __name__ == '__main__':
machine = Device(ip="10.0.0.240")
    machine.deviceaction1(...)
This is (roughly, I left out sensitive information) what I'm doing now. As you can see, everything is sequential.
If anyone can provide an example of a listening server in a separate thread (preferably using greenlets) and a way to communicate from the listening server back to the spawning thread, it would be of great help.
Thanks.
EDIT:
After trying several methods, I decided to use Python's built-in select() to solve this problem. This worked, so my question regarding the use of threads is no longer relevant. Thanks to the people who provided input for their time and effort.
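For reference, a rough sketch of what that select()-based approach can look like (address, buffer size and poll timeout are illustrative, not the actual values):

import select
import socket

def listen_for_device(host='10.0.0.16', port=5102, poll_timeout=0.5):
    # poll the listening socket with a timeout so the caller is never
    # blocked indefinitely waiting for the device
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen(5)
    sockets = [server]
    try:
        while True:
            readable, _, _ = select.select(sockets, [], [], poll_timeout)
            if not readable:
                # nothing pending: do other work here, or break out once
                # you decide the device is done sending requests
                continue
            for sock in readable:
                if sock is server:
                    conn, addr = server.accept()
                    sockets.append(conn)
                else:
                    data = sock.recv(4096)
                    if data:
                        pass  # process and confirm the device's request here
                    else:
                        sockets.remove(sock)
                        sock.close()
    finally:
        server.close()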
Hope this helps. In the example class, if we call the tenMessageSender function it fires up an async thread without blocking the main loop, and _zmqBasedListener then starts listening on a separate port as long as that thread is alive. Whatever messages our tenMessageSender function sends are received by the client, which responds back to _zmqBasedListener.
Server Side
import threading
import zmq
import sys
class Example:
def __init__(self):
self.context = zmq.Context()
self.publisher = self.context.socket(zmq.PUB)
self.publisher.bind('tcp://127.0.0.1:9997')
self.subscriber = self.context.socket(zmq.SUB)
self.thread = threading.Thread(target=self._zmqBasedListener)
def _zmqBasedListener(self):
self.subscriber.connect('tcp://127.0.0.1:9998')
self.subscriber.setsockopt(zmq.SUBSCRIBE, "some_key")
while True:
message = self.subscriber.recv()
print message
sys.exit()
def tenMessageSender(self):
self._decideListener()
for message in range(10):
self.publisher.send("testid : %d: I am a task" %message)
def _decideListener(self):
if not self.thread.is_alive():
print "STARTING THREAD"
self.thread.start()
Client
import zmq
context = zmq.Context()
subscriber = context.socket(zmq.SUB)
subscriber.connect('tcp://127.0.0.1:9997')
publisher = context.socket(zmq.PUB)
publisher.bind('tcp://127.0.0.1:9998')
subscriber.setsockopt(zmq.SUBSCRIBE, "testid")
count = 0
print "Listener"
while True:
message = subscriber.recv()
print message
publisher.send('some_key : Message received %d' %count)
count+=1
Instead of a thread you can use a greenlet, etc.
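Since the original question mentions gevent and greenlets, here is a hedged sketch of that variant using gevent's StreamServer (address, port and handler body are placeholders):

from gevent.server import StreamServer

def handle_device(sock, address):
    # runs in its own greenlet for each incoming connection
    data = sock.recv(4096)
    if data:
        # process the device's request and confirm it back
        sock.sendall(b'OK')

listener = StreamServer(('10.0.0.16', 5102), handle_device)
listener.start()   # returns immediately; accepting happens in a greenlet

# ... the main program keeps sending commands on its client socket ...

listener.stop()    # once you know the device is done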
I am building a minimal socket server that displays a window when it receives a message.
The server uses the newConnection signal to get the client connections and connect the proper signals (connected, disconnected and readyRead) for each socket.
When running the server program and sending some data to the proper address, for each datum sent, a QTcpSocket is connected, then readyRead fires, then it is immediately disconnected.
And within the readyRead callback, readAll returns nothing (I suppose this is the cause of the disconnected signal).
Still, the client gets no "broken pipe" error and can continue to send data.
How can that happen?
Here's the code:
class Client(QObject):
def connects(self):
self.connect(self.socket, SIGNAL("connected()"), SLOT(self.connected()))
self.connect(self.socket, SIGNAL("disconnected()"), SLOT(self.disconnected()))
self.connect(self.socket, SIGNAL("readyRead()"), SLOT(self.readyRead()))
self.socket.error.connect(self.error)
print "Client Connected from IP %s" % self.socket.peerAddress().toString()
def error(self):
print "error somewhere"
def connected(self):
print "Client Connected Event"
def disconnected(self):
self.readyRead()
print "Client Disconnected"
def readyRead(self):
print "reading"
msg = self.socket.readAll()
print QString(msg), len(msg)
n = Notification(QString(msg))
n.show()
class Server(QObject):
def __init__(self, parent=None):
QObject.__init__(self)
self.clients = []
def setup_client_socket(self):
print "incoming"
client = Client(self)
client.socket = self.server.nextPendingConnection()
#self.client.socket.nextBlockSize = 0
client.connects()
self.clients.append(client)
def StartServer(self):
self.server = QTcpServer()
self.server.listen(QHostAddress.LocalHost, 8888)
print self.server.isListening(), self.server.serverAddress().toString(), self.server.serverPort()
self.server.newConnection.connect(self.setup_client_socket)
Update:
I tested directly with the socket module from the Python standard library and it works, so the general setup of my machine and the network are not the guilty parties. This is likely some Qt issue.
The SLOT function expects a string representing the name of a slot, and it should be preceded by the target of the slot:
self.connect(self.socket, SIGNAL("connected()"), self, SLOT("connected()"))
but without quotes, you are calling the function immediately, at the point where the connect line is executed, and passing its result to SLOT.
As you didn't even declare these functions as slots using the QtCore.Slot/pyqtSlot decorator, they wouldn't be called anyway when the signals are emitted, if you use the SLOT function.
For undecorated python functions, you should use one of these syntaxes:
self.connect(self.socket, SIGNAL("connected()"), self.connected)
Or
self.socket.connected.connect(self.connected)
Notice the lack of () at the end of the last parameter.
Additionally, you shouldn't call readyRead in disconnected: if there was data to read before the disconnection, the readyRead function was most likely already called.
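Put together, a corrected connects() along these lines might look like this (new-style connections throughout; untested against the rest of your program):

def connects(self):
    # bound methods: no parentheses, no SLOT() wrapper needed
    self.socket.connected.connect(self.connected)
    self.socket.disconnected.connect(self.disconnected)
    self.socket.readyRead.connect(self.readyRead)
    self.socket.error.connect(self.error)
    print "Client Connected from IP %s" % self.socket.peerAddress().toString()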
Probably a half-open socket - you can't trust clients to always open or terminate sessions correctly. The client might be disconnecting without doing a proper hangup on the server?
I haven't looked at the QTcpServer code, but it might be caused by someone port scanning your box?
Either way, the server should always be able to handle bad or no data gracefully. The internet is a crazy place with all sorts of weird packets floating around.
I am currently using this library to stress-test a Kafka server that I have set up: https://github.com/dsully/pykafka
import kafka
import time
def test_kafka_server(n=1):
for i in range(0,n):
producer = kafka.producer.Producer('test',host='10.137.8.192')
message = kafka.message.Message(str(time.time()))
producer.send(message)
producer.disconnect()
def main():
test_kafka_server(100000)
if __name__ == '__main__':
main()
What ends up happening is that I overload my own local machine.
I get error 10055, which according to Google means that "Windows has run out of TCP/IP socket buffers because too many connections are open at once." According to netstat, producer.disconnect() is not closing the socket, but rather putting it in a TIME_WAIT state.
The IPython debugger points to this line:
C:\Python27\lib\socket.pyc in meth(name, self, *args)
222 proto = property(lambda self: self._sock.proto, doc="the socket protocol")
223
--> 224 def meth(name,self,*args):
225 return getattr(self._sock,name)(*args)
226
as the culprit, but this then seems to get into messing with things at a lower level than I am comfortable with.
I searched and found this question, Python socket doesn't close connection properly, which recommended doing:
setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
So, I rebuilt the pykafka library with that option added in the io.py file:
def connect(self):
""" Connect to the Kafka server. """
global socket
self.socket = socket.socket()
self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
self.socket.connect((self.host, self.port))
and I still get the same error.
Am I not putting the setsockopt line in the right spot? Is there anything else I could be trying?
What you are describing is normal TCP behavior at the socket level. When a user-level program closes a socket, the kernel does not free it right away; it enters the TIME_WAIT state:
TIME-WAIT (either server or client) represents waiting for enough time to pass to be sure the remote TCP received the acknowledgment of its connection termination request. [According to RFC 793 a connection can stay in TIME-WAIT for a maximum of four minutes, known as two MSL (maximum segment lifetime).]
So the socket is closed. The socket.SO_REUSEADDR option is for listeners (servers) and doesn't affect client connections; really, it only matters when binding the socket.
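One practical way to keep the stress test from piling up TIME_WAIT sockets is therefore to reuse a single producer connection instead of opening and closing one per message; a sketch against the same dsully/pykafka API used in the question (not from the original answer):

import time

import kafka

def test_kafka_server(n=1):
    # one connection for the whole run, so sockets don't accumulate in TIME_WAIT
    producer = kafka.producer.Producer('test', host='10.137.8.192')
    try:
        for _ in range(n):
            producer.send(kafka.message.Message(str(time.time())))
    finally:
        producer.disconnect()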