I am currently using this library to stress test a Kafka server that I have set up: https://github.com/dsully/pykafka
import kafka
import time

def test_kafka_server(n=1):
    for i in range(0, n):
        producer = kafka.producer.Producer('test', host='10.137.8.192')
        message = kafka.message.Message(str(time.time()))
        producer.send(message)
        producer.disconnect()

def main():
    test_kafka_server(100000)

if __name__ == '__main__':
    main()
What ends up happening is that I overload my own local machine.
I get error 10055, which according to google means that "Windows has run out of TCP/IP socket buffers because too many connections are open at once." According to netstat, producer.disconnect() is not closing the socket, but rather putting it in a TIME_WAIT state.
The ipython debugger points to this line:
C:\Python27\lib\socket.pyc in meth(name, self, *args)
222 proto = property(lambda self: self._sock.proto, doc="the socket protocol")
223
--> 224 def meth(name,self,*args):
225 return getattr(self._sock,name)(*args)
226
as the culprit, but this seems to get into messing with things at a lower level than I am comfortable with.
I searched and found this question, Python socket doesn't close connection properly, which recommended doing:
setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
so I rebuilt the pykafka library with that option added in io.py:
def connect(self):
    """ Connect to the Kafka server. """
    global socket
    self.socket = socket.socket()
    self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    self.socket.connect((self.host, self.port))
and I still get the same error.
Am I not putting the setsockopt line in the right spot? Is there anything else I could be trying?
What you are describing is normal TCP behavior at the socket level. When a user-level program closes a socket, the kernel does not free it right away; it enters the TIME_WAIT state:
TIME-WAIT (either server or client) represents waiting for enough
time to pass to be sure the remote TCP received the acknowledgment of
its connection termination request. (According to RFC 793 a connection
can stay in TIME-WAIT for a maximum of four minutes, known as two
maximum segment lifetimes, or 2MSL.)
So the socket is closed as far as your program is concerned; the kernel just keeps the connection entry around for a while. SO_REUSEADDR is meant for listening (server) sockets and doesn't affect client connections; strictly speaking, it only matters when you bind the socket.
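For reference, this is the situation where SO_REUSEADDR actually matters: a socket that is about to be bound, typically a server's listening socket. A minimal sketch (the port is just a placeholder):
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# must be set before bind(); it lets the server rebind a port that is
# still in TIME_WAIT from a previous run
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(('', 9092))   # placeholder port
listener.listen(5)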
I am writing a UDP server application that serves as a back end to Teltonika FMB630 car mounted devices.
I already took care of the protocol specifics and decoding, the problem I am facing relates to the UDP socket used.
My UDP server has to send an acknowledgement to the client device upon receiving a message (that is the protocol), however, if I send those ACKs, the server socket stops receiving data after a while.
The server's UDP socket object is passed to a concurrent.futures.ThreadPoolExecutor that fires a function (send_ack) that sends the ACK; however, this is not the issue, because I tried calling send_ack in the main thread after receiving data and the same issue occurs.
I suspect that the remote device somehow breaks the connection, or that the ISP or MNO doesn't route the reply packet (this is a GPRS device), and that the socket.send() call used to send the acknowledgement then somehow freezes other socket operations, specifically the recvfrom_into called in the main thread's loop.
I wrote two scripts to illustrate the situation:
udp_test_echo.py:
#!/usr/bin/env python
import socket
import concurrent.futures

def send_ack(sock, addr, ack):
    print("Sending ACK to {}".format(addr))
    sock.connect(addr)
    print("connected to {}".format(addr))
    sock.send(ack)
    print("ACK sent to {}".format(addr))

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.bind(("127.0.0.1", 1337))
data = bytearray([0] * 10)
executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)

while True:
    print("listening")
    nbytes, address = s.recvfrom_into(data)
    print("Socket Data received {} bytes Address {}".format(nbytes, address))
    print("Data received: ", data, " Echoing back to client")
    executor.submit(send_ack, s, address, data[:nbytes])
udp_test_client.py:
#!/usr/bin/env python
import socket
import time
import random

def get_random_bytes():
    return bytearray([random.randint(0, 255) for b in range(10)])

ip = "127.0.0.1"
port = 1337

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.connect((ip, port))

while True:
    stuff_to_send = get_random_bytes()
    print("Sending stuff", stuff_to_send)
    s.sendall(stuff_to_send)
    print("reply: ", s.recvfrom(10))
    time.sleep(0.1)
Running udp_test_echo.py in one terminal and udp_test_client.py in another shows normal operation, but if you Ctrl+C the test client and re-run it, the server no longer responds until it is restarted.
Is there a way to time out a specific call to the socket.send() method without affecting other calls? (I want my socket.recvfrom_into call to keep blocking in the main thread.)
If I call settimeout() on the entire socket object, I will have to deal with many exceptions while waiting for data in the main thread, and I don't want to rely on exceptions for normal program operation.
The culprit was the socket.connect() call in send_ack: calling connect() on the server's UDP socket associates it with a single remote address, after which datagrams from any other address (including the same client once it restarts on a new source port) are no longer delivered to it.
Instead, the send_ack function was changed to:
def send_ack(sock, addr, ack):
    print("Sending ACK to {}".format(addr))
    sock.sendto(ack, addr)
    print("ACK sent to {}".format(addr))
socket.sendto(data, address) sends the reply through the already-bound server socket to the given address, without connecting the socket to any particular peer.
Is there a way to test whether the connection still exists before executing a transport.write()?
I have modified the simpleserv/simpleclient examples so that a message is sent (written to Protocol.transport) every 5 seconds. The connection is persistent.
When I disconnect my wifi, the writes to the transport still succeed (of course the messages don't arrive on the other side), but no error is thrown.
When I enable the wifi again, the messages are delivered, but the next attempt to send a message fails (and Protocol.connectionLost is called).
Here again is what happens, chronologically:
1. Sending a message establishes the connection; the message is delivered.
2. Wifi is disabled.
3. Sending a message writes to the transport without throwing an error, but the message does not arrive.
4. Wifi is enabled again.
5. The message sent in step 3 arrives.
6. Sending a message results in a Protocol.connectionLost call.
It would be nice to know, before executing step 6, whether I can write to the transport. Is there any way?
Server:
# Copyright (c) Twisted Matrix Laboratories.
# See LICENSE for details.
from twisted.internet import reactor, protocol

class Echo(protocol.Protocol):
    """This is just about the simplest possible protocol"""

    def dataReceived(self, data):
        "As soon as any data is received, write it back."
        print
        print data
        self.transport.write(data)

def main():
    """This runs the protocol on port 8000"""
    factory = protocol.ServerFactory()
    factory.protocol = Echo
    reactor.listenTCP(8000, factory)
    reactor.run()

# this only runs if the module was *not* imported
if __name__ == '__main__':
    main()
Client:
# Copyright (c) Twisted Matrix Laboratories.
# See LICENSE for details.
"""
An example client. Run simpleserv.py first before running this.
"""
from twisted.internet import reactor, protocol

# a client protocol

counter = 0

class EchoClient(protocol.Protocol):
    """Once connected, send a message, then print the result."""

    def connectionMade(self):
        print 'connectionMade'

    def dataReceived(self, data):
        "As soon as any data is received, write it back."
        print "Server said:", data

    def connectionLost(self, reason):
        print "connection lost"

    def say_hello(self):
        global counter
        counter += 1
        msg = '%s. hello, world' % counter
        print 'sending: %s' % msg
        self.transport.write(msg)

class EchoFactory(protocol.ClientFactory):
    def buildProtocol(self, addr):
        self.p = EchoClient()
        return self.p

    def clientConnectionFailed(self, connector, reason):
        print "Connection failed - goodbye!"

    def clientConnectionLost(self, connector, reason):
        print "Connection lost - goodbye!"

    def say_hello(self):
        self.p.say_hello()
        reactor.callLater(5, self.say_hello)

# this connects the protocol to a server running on port 8000
def main():
    f = EchoFactory()
    reactor.connectTCP("REMOTE_SERVER_ADDR", 8000, f)
    reactor.callLater(5, f.say_hello)
    reactor.run()

# this only runs if the module was *not* imported
if __name__ == '__main__':
    main()
Protocol.connectionLost is the only way to know when the connection no longer exists. It is also called at the earliest time when it is known that the connection no longer exists.
It is obvious to you or me that disconnecting your network adapter (i.e., turning off your wifi card) will break the connection - at least if you leave it off, or if you configure it differently when you turn it back on. It's not obvious to your platform's TCP implementation, though.
Since network communication isn't instant and any individual packet may be lost for normal (non-fatal) reasons, TCP includes various timeouts and retries. When you disconnect your network adapter these packets can no longer be delivered but the platform doesn't know that this condition will outlast the longest TCP timeout. So your TCP connection doesn't get closed when you turn off your wifi. It hangs around and starts retrying the send and waiting for an acknowledgement.
At some point the timeouts and retries all expire and the connection really does get closed (although the way TCP works means that if there is no data waiting to be sent then there actually isn't a timeout, a "dead" connection will live forever; addressing this is the reason the TCP "keepalive" feature exists). This is made slightly more complicated by the fact that there are timeouts on both sides of the connection. If the connection closes as soon as you do the write in step six (and no sooner) then the cause is probably a "reset" (RST) packet.
A reset will occur after the timeout on the other side of the connection expires and closes the connection while the connection is still open on your side. Now when your side sends a packet for this TCP connection the other side won't recognize the TCP connection it belongs to (because as far as the other side is concerned that connection no longer exists) and reply with a reset message. This tells the original sender that there is no such connection. The original sender reacts to this by closing its side of the connection (since one side of a two-sided connection isn't very useful by itself). This is presumably when Protocol.connectionLost is called in your application.
All of this is basically just how TCP works. If the timeout behavior isn't suitable for your application then you have a couple options. You could turn on TCP keepalives (this usually doesn't help, by default TCP keepalives introduce timeouts that are hours long though you can tune this on most platforms) or you could build an application-level keepalive feature. This is simply some extra traffic that your protocol generates and then expects a response to. You can build your own timeouts (no response in 3 seconds? close the connection and establish a new one) on top of this or just rely on it to trigger one of the somewhat faster (~2 minute) TCP timeouts. The downside of a faster timeout is that spurious network issues may cause you to close the connection when you really didn't need to.
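As a rough illustration of that application-level keepalive idea (a sketch only, not part of the original examples; the 5-second ping interval and 3-second timeout are arbitrary, and 'ping' is just a placeholder message), something along these lines could be layered onto the echo protocol:
from twisted.internet import reactor, protocol
from twisted.internet.task import LoopingCall

class KeepaliveProtocol(protocol.Protocol):
    """Pings the peer periodically and drops the connection if nothing comes back."""

    def connectionMade(self):
        # optional: also ask the platform for TCP-level keepalives
        self.transport.setTcpKeepAlive(True)
        self._timeout = None
        self._pinger = LoopingCall(self._send_ping)
        self._pinger.start(5, now=False)      # ping every 5 seconds

    def _send_ping(self):
        # against the echo server above, the ping simply comes back as an echo
        self.transport.write('ping')
        # if nothing arrives within 3 seconds, give up on the connection
        self._timeout = reactor.callLater(3, self.transport.loseConnection)

    def dataReceived(self, data):
        # any traffic from the peer proves the connection is still alive
        if self._timeout is not None and self._timeout.active():
            self._timeout.cancel()

    def connectionLost(self, reason):
        if self._pinger.running:
            self._pinger.stop()
        if self._timeout is not None and self._timeout.active():
            self._timeout.cancel()
With this in place, a dead link is noticed within a few seconds of the next ping rather than waiting for the platform's TCP timeouts, at the cost of a little extra traffic and the occasional false positive on a flaky network.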
On Windows 7:
Given this server code:
# in server.py
import socket
import subprocess
import time

if __name__ == '__main__':
    serversock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # uncommenting this won't help
    #serversock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    serversock.bind(('', 8888))
    serversock.listen(5)
    # accept and receive dummy data from client
    clientsock, address = serversock.accept()
    data = clientsock.recv(1024)
    # as long as calc.exe is running, I can't do this again
    subprocess.Popen(r"c:\windows\system32\calc.exe")
    # letting client close first still won't help
    time.sleep(3)
    # closing won't help either
    clientsock.close()
    serversock.close()
And the client code
# in client.py
import socket

if __name__ == '__main__':
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(('127.0.0.1', 8888))
    s.sendall('Hello, world')
    # close early to help prevent TIME_WAIT on server, but doesn't help
    s.close()
Running the server first and then the client will launch the calculator app.
While the calculator app is still running, I can't run the server again. It complains with:
python server.py (ok)
python client.py (ok)
python server.py (boom!)
socket.error: [Errno 10048] Only one usage of each socket address (protocol/network address/port) is normally permitted
If I close the calculator app, running the server is OK again...
This does not happen on Mac.
Enabling SO_REUSEADDR only makes the error go away, but then the server is unreachable from the client.
In the example above, I specifically let the client close first so that the server socket doesn't go into TIME_WAIT.
So the questions:
Am I running into the TIME_WAIT problem on the server?
Are any sockets/filedescriptors left unclosed in the server?
Why doesn't SO_REUSEADDR help in this case? Could the client be coming from the same port?
Could the child process be hanging on to some descriptors?
What can I do about this?
The SOLUTION:
The problem IS that the spawned Calculator process inherits and holds on to the socket handle from its parent (the Python server).
So adding close_fds=True to the Popen call ensures the child doesn't inherit it and everything is released properly.
subprocess.Popen(r"c:\windows\system32\calc.exe", close_fds=True)
Closing socket after subprocess.Popen leaves socket in TIME_WAIT as long as child process is still running
No, it doesn't. The close leaves the socket in TIME_WAIT for a fixed amount of time, typically 2 or 4 minutes. After the close, it has nothing to do with the child process at all.
How can I have a socket server running that accepts incoming connections and deals with them, without the code that waits for new connections getting stuck in that same loop?
I am just starting to learn. Would a TCP handler be useful?
I just need some simple examples on this topic. I want something like a command interface in the server, so I can do certain things while the server is running.
EDIT: What I'm trying to do:
1 - TCP server for multiple clients
2 - Respond to more than one at a time when needed
3 - Text input available at all times, to be used for getting/setting info
4 - A simple way to get/save client address info. Currently using a list to save them.
You can run your socket server in a thread.
import threading
import SocketServer
server = SocketServer.TCPServer(('localhost', 0), SocketServer.BaseRequestHandler)
th = threading.Thread(target=server.serve_forever)
th.daemon = True
th.start()
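If something fuller is wanted, here is a rough sketch along the same lines (Python 2's SocketServer assumed, as above; the echo handler, port 9999 and the command loop are illustrative, not from the original answer). The server keeps serving clients in the background thread while the main thread stays free for text commands:
import threading
import SocketServer

class EchoHandler(SocketServer.BaseRequestHandler):
    def handle(self):
        # echo one request back; a real handler would parse your protocol here
        data = self.request.recv(1024)
        if data:
            self.request.sendall(data)

server = SocketServer.ThreadingTCPServer(('localhost', 9999), EchoHandler)
th = threading.Thread(target=server.serve_forever)
th.daemon = True
th.start()

# the main thread is now free for a simple command loop
while True:
    cmd = raw_input('> ')
    if cmd == 'quit':
        server.shutdown()
        break
ThreadingTCPServer hands each client to its own thread, which also covers responding to more than one client at a time.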
Python has built-in support for asynchronous socket handling in the asyncore module (http://docs.python.org/library/asyncore.html).
Asynchronous socket handling means that you have to execute at least one iteration of the socket-processing loop inside your own code (main loop):
asyncore.loop(count=1)
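For instance, a minimal sketch of driving asyncore from your own main loop could look like this (do_other_work is a hypothetical stand-in for whatever else your program does each iteration):
import asyncore

while True:
    # process whatever socket events are pending, then return control to us
    asyncore.loop(timeout=0.1, count=1)
    do_other_work()   # hypothetical: your own per-iteration logic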
Example taken from documentation:
import asyncore
import socket

class EchoHandler(asyncore.dispatcher_with_send):

    def handle_read(self):
        data = self.recv(8192)
        if data:
            self.send(data)

class EchoServer(asyncore.dispatcher):

    def __init__(self, host, port):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.set_reuse_addr()
        self.bind((host, port))
        self.listen(5)

    def handle_accept(self):
        pair = self.accept()
        if pair is None:
            pass
        else:
            sock, addr = pair
            print('Incoming connection from %s' % repr(addr))
            handler = EchoHandler(sock)

server = EchoServer('localhost', 8080)
# Note that here the loop is infinite (count is not given)
asyncore.loop()
Each time the socket accepts a connection, handle_accept is called by the loop. Each time data is available to read from the socket, handle_read is called, and so on.
You can use both TCP and UDP sockets in this manner.
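As a rough sketch of the UDP variant (illustrative only; the port is arbitrary and this is not taken from the asyncore documentation), the dispatcher's underlying socket can be used directly for recvfrom/sendto:
import asyncore
import socket

class UdpEcho(asyncore.dispatcher):

    def __init__(self, host, port):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.bind((host, port))

    def writable(self):
        return False               # nothing is ever queued for writing here

    def handle_read(self):
        data, addr = self.socket.recvfrom(8192)
        if data:
            self.socket.sendto(data, addr)   # echo the datagram back

server = UdpEcho('localhost', 8081)
asyncore.loop()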
I'm not exactly sure what you are asking, but normally on the server side you make socket(), bind() and listen() calls to set up the socket, and then loop around an accept() call. This accept() call blocks until a client connection is made.
For simple servers, you handle whatever request the client makes within the loop. For real-world servers, you need to spawn some other mechanism (e.g. a new thread or process, depending on the language/platform) to handle the request asynchronously, so that the original loop can iterate again on the accept() call and go back to listening for connections.
See the Python socket doc for more info and examples in Python:
http://docs.python.org/howto/sockets.html
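For example, a bare-bones sketch of that pattern (an illustrative echo server; the port, protocol and one-thread-per-client choice are placeholders, not a prescription):
import socket
import threading

def handle_client(conn, addr):
    try:
        while True:
            data = conn.recv(1024)
            if not data:               # empty read means the client closed
                break
            conn.sendall(data)
    finally:
        conn.close()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(('', 9000))
server.listen(5)

while True:
    conn, addr = server.accept()       # blocks until a client connects
    t = threading.Thread(target=handle_client, args=(conn, addr))
    t.daemon = True
    t.start()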
I have a question regarding client sockets on a TCP/IP network. Let's say I use:
import socket
import sys

try:
    comSocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    comSocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
except socket.error, msg:
    sys.stderr.write("[ERROR] %s\n" % msg[1])
    sys.exit(1)

try:
    comSocket.bind(('', 5555))
    comSocket.connect()
except socket.error, msg:
    sys.stderr.write("[ERROR] %s\n" % msg[1])
    sys.exit(2)
The socket created will be bound to port 5555. The problem is that after ending the connection with:
comSocket.shutdown(1)
comSocket.close()
Using Wireshark, I see the socket closed with FIN/ACK and ACK from both sides, but I can't use the port again. I get the following error:
[ERROR] Address already in use
I wonder how I can clear the port right away so that next time I can still use that same port.
comSocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
setsockopt doesn't seem to be able to resolve the problem
Thank you!
Try using the SO_REUSEADDR socket option before binding the socket.
comSocket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
Edit:
I see you're still having trouble with this. There is a case where SO_REUSEADDR won't work. If you try to bind a socket and reconnect to the same destination (with SO_REUSEADDR enabled), then TIME_WAIT will still be in effect. It will, however, allow you to connect to a different host:port.
A couple of solutions come to mind. You can either keep retrying until you can gain a connection again, or, if the client initiates the closing of the socket (not the server), it should magically work.
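A rough sketch of the retry approach (the host, port, and retry/delay values are placeholders):
import socket
import time

def connect_with_retry(host, port, retries=30, delay=2):
    for attempt in range(retries):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            s.connect((host, port))
            return s
        except socket.error:
            s.close()
            time.sleep(delay)          # the port may still be in TIME_WAIT
    raise socket.error("could not connect after %d attempts" % retries)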
Here is the complete code that I've tested and that absolutely does NOT give me an "address already in use" error. You can save this in a file and run it from within the base directory of the HTML files you want to serve. Additionally, you could programmatically change directories prior to starting the server.
import socket
import SimpleHTTPServer
import SocketServer
# import os  # uncomment if you want to change directories within the program

PORT = 8000

# Absolutely essential! This ensures that socket reuse is set up BEFORE
# the socket is bound. It will avoid the TIME_WAIT issue.
class MyTCPServer(SocketServer.TCPServer):
    def server_bind(self):
        self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.socket.bind(self.server_address)

Handler = SimpleHTTPServer.SimpleHTTPRequestHandler
httpd = MyTCPServer(("", PORT), Handler)
# os.chdir("/My/Webpages/Live/here.html")
httpd.serve_forever()
# httpd.shutdown()  # If you want to programmatically shut off the server
According to this link:
Actually, the SO_REUSEADDR flag can lead to much greater consequences:
SO_REUSEADDR permits you to use a port that is stuck in TIME_WAIT, but
you still cannot use that port to establish a connection to the last
place it connected to. What? Suppose I pick local port 1010, and
connect to foobar.com port 300, and then close locally, leaving that
port in TIME_WAIT. I can reuse local port 1010 right away to connect
to anywhere except for foobar.com port 300.
However, you can completely avoid the TIME_WAIT state by ensuring that the remote end initiates the closure (close event). So the server can avoid problems by letting the client close first. The application protocol must be designed so that the client knows when to close. The server can safely close in response to an EOF from the client; however, it will also need to set a timeout for the expected EOF in case the client has left the network ungracefully. In many cases simply waiting a few seconds before closing will be adequate.
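As a small illustration of the "let the client close first" idea on the server side (a sketch only; the 5-second timeout is arbitrary and conn stands for an accepted connection socket):
import socket

def close_after_client(conn, timeout=5):
    conn.settimeout(timeout)           # don't wait forever for the client's EOF
    try:
        while conn.recv(1024):         # drain until an empty read (EOF) arrives
            pass
    except socket.timeout:
        pass                           # client left ungracefully; close anyway
    conn.close()                       # server closes second, so TIME_WAIT lands on the client side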
I also advise you to learn more about networking and network programming. You should know at least how the TCP protocol works. The protocol is quite small and straightforward, and understanding it may save you a lot of time in the future.
With the netstat command you can easily see which programs (as (program_name, pid) tuples) are bound to which ports and what the socket's current state is: TIME_WAIT, CLOSING, FIN_WAIT and so on.
A really good explanation of Linux network configuration can be found at https://serverfault.com/questions/212093/how-to-reduce-number-of-sockets-in-time-wait.
In case you face the problem using TCPServer or SimpleHTTPServer, override the SocketServer.TCPServer.allow_reuse_address (Python 2.7.x) or socketserver.TCPServer.allow_reuse_address (Python 3.x) attribute:
class MyServer(SocketServer.TCPServer):
    allow_reuse_address = True

server = MyServer((HOST, PORT), MyHandler)
server.serve_forever()
You need to set allow_reuse_address before binding. Instead of running SimpleHTTPServer directly, run this snippet:
Handler = SimpleHTTPServer.SimpleHTTPRequestHandler
httpd = SocketServer.TCPServer(("", PORT), Handler, bind_and_activate=False)
httpd.allow_reuse_address = True
httpd.server_bind()
httpd.server_activate()
httpd.serve_forever()
This prevents the server from binding before we get a chance to set the flag.
As Felipe Cruze mentioned, you must set SO_REUSEADDR before binding. I found a solution on another site, reproduced below:
The problem is that the SO_REUSEADDR socket option must be set before
the address is bound to the socket. This can be done by subclassing
ThreadingTCPServer and overriding the server_bind method as follows:
import SocketServer, socket

class MyThreadingTCPServer(SocketServer.ThreadingTCPServer):
    def server_bind(self):
        self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.socket.bind(self.server_address)
I found another reason for this exception.
When the application is run from the Spyder IDE (in my case Spyder3 on Raspbian) and the program is terminated by ^C or an exception, the socket is still active:
sudo netstat -ap | grep 31416
tcp 0 0 0.0.0.0:31416 0.0.0.0:* LISTEN 13210/python3
Running the program again raised "Address already in use"; the IDE seems to start each new 'run' as a separate process, which finds the socket still held by the previous 'run'.
socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
did NOT help.
Killing process 13210 helped.
Starting the Python script from the command line like
python3 <app-name>.py
always worked well when SO_REUSEADDR was set to true. The newer Thonny IDE and Idle3 did not have this problem.
socket.socket() should be called before socket.bind(), and use SO_REUSEADDR as noted above.
I know you've already accepted an answer, but I believe the problem has to do with calling bind() on a client socket. This might be OK, but bind() and shutdown() don't seem to play well together. Also, SO_REUSEADDR is generally used with listening sockets, i.e. on the server side.
You should be passing an IP and port to connect(), like this:
comSocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
comSocket.connect(('', 5555))
Don't call bind(), don't set SO_REUSEADDR.
For me, the better solution was the following. Since the connection was being closed by the server side, setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) had no effect, and TIME_WAIT was preventing a new connection on the same port, with the error:
[Errno 10048]: Address already in use. Only one usage of each socket address (protocol/IP address/port) is normally permitted
I finally settled on letting the OS choose the port itself; that way another port is used if the previous one is still in TIME_WAIT.
I replaced:
self._socket.bind((guest, port))
with:
self._socket.bind((guest, 0))
As indicated in the Python socket documentation for a TCP address:
If supplied, source_address must be a 2-tuple (host, port) for the socket to bind to as its source address before connecting. If host or port are ‘’ or 0 respectively the OS default behavior will be used.
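As a small sketch of that approach (illustrative only), getsockname() reveals which port the OS actually picked, so it can be logged or advertised to the peer:
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(('', 0))                        # port 0: let the OS pick a free ephemeral port
chosen_port = s.getsockname()[1]
print("bound to port %d" % chosen_port)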
Another solution, in a development environment of course, is to kill the process that is using the port, for example:
import os
from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler

PORT_NUMBER = 8000  # hypothetical port for the example; not in the original snippet

def serve():
    server = HTTPServer(('', PORT_NUMBER), BaseHTTPRequestHandler)
    print 'Started httpserver on port ', PORT_NUMBER
    server.serve_forever()

try:
    serve()
except Exception, e:
    print "probably port is used. killing processes using given port %d, %s" % (PORT_NUMBER, e)
    os.system("xterm -e 'sudo fuser -kuv %d/tcp'" % PORT_NUMBER)
    serve()
    raise e
I think the best way is just to kill the process on that port, by typing in the terminal fuser -k [PORT NUMBER]/tcp, e.g. fuser -k 5001/tcp.
I had the same problem and I couldn't find any other solution (the reuse options didn't work) except restarting the Raspberry Pi each time. Then I found a workaround:
comSocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
comSocket.close()
comSocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
comSocket.connect(('', 5555))
This means: define the socket first, close it, then define it again, so you can use the same port if it is stuck.