Cannot bind to address after socket program crashes

If my program crashes before a socket is closed, the next time I run it I get an error that looks like this:
socket.error: [Errno 48] Address already in use
Changing the port fixes the problem.
Is there any way to avoid this, and why does this happen (when the program exits, shouldn't the socket be garbage collected, and closed)?

Use .setsockopt(SOL_SOCKET, SO_REUSEADDR, 1) on your listening socket, before calling bind().
A search for those terms will net you many explanations for why this is necessary. Basically, after your first program closes down, the OS keeps the old connections on that port around in the TIME_WAIT state for a while, and the address stays reserved until they expire. SO_REUSEADDR says that you want to use the same listening port regardless.
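
For reference, a minimal sketch of what that looks like in Python 3. The helper name make_listener and its default host, port and backlog are made up for illustration; the only part that matters is that the option is set before bind().

```python
import socket


def make_listener(host="127.0.0.1", port=5000, backlog=5):
    """Return a listening TCP socket with SO_REUSEADDR set before bind()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # The option has to be set before bind() to affect this bind.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen(backlog)
    except OSError:
        # Do not leak the descriptor if bind() or listen() fails.
        sock.close()
        raise
    return sock
```

Calling make_listener() again right after a crashed run should then succeed, unless another live process is still listening on the same address.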

Most OSes take up to 2 minutes to close the socket when the program doesn't properly close it first. I've hit this many times with C programs that SEGFAULT (and I don't have it handled) or similar.
Edit:
Thanks to ephemient for pointing out RFC 793 (TCP) which defines this timeout.

Other people who are getting this error may be getting it because the port is in use by another process. So check whether the port is being used by any other process, and either run your program on another port or kill the blocking process.
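
One rough way to check this from Python itself is to try connecting to the port. This is only a sketch: port_in_use is a hypothetical helper name, and it only detects a process that is actively accepting TCP connections. It says nothing about which process owns the port, and it will not see leftovers in TIME_WAIT.

```python
import socket


def port_in_use(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        # connect_ex() returns 0 on success and an errno value on failure.
        return probe.connect_ex((host, port)) == 0
```

If this returns True while your own program is not running, some other process is holding the port, and SO_REUSEADDR will not help.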

Related

Keeping python sockets alive in event of connection loss

I'm trying to make a socket connection that stays alive in the event of connection loss. Basically, I want to keep the server always open (and preferably the client too) and restart the client after the connection is lost. But if one end shuts down, both ends shut down. I simulated this by having both ends on the same computer ("localhost") and just clicking the X button. Could this be the source of my problems?
Anyway my connection code
m.connect(("localhost", 5000))
is in an if, a try and a while, e.g.
while True:
    if tryconnection:
        # Error handling
        try:
            m.connect(("localhost", 5000))
            init = True
            tryconnection = False
        except socket.error:
            init = False
            tryconnection = True
And at the end of my code I just call m.send("example") when I press a button, and if that raises an error the code that tries to connect to "localhost" starts again. The server is a pretty generic server setup with a while loop around the x.accept(). So how do I keep them both alive when the connection closes, so they can reconnect when it opens again? Or is my code all right, and it's just simulating on the same computer that is messing with it?

I'm assuming we're dealing with TCP here since you use the word "connection".
It all depends on what you mean by "connection loss".
If by connection loss you mean that the data exchange between the server and the client may be suspended or unresponsive (important: I did not say "closed" here) for a long amount of time, seconds or minutes, then there's not much you can do about it, and that's fine, because TCP has been carefully designed to handle such situations gracefully. The timeout before one side decides the other is definitely down, gives up, and closes the connection is very long (minutes). An example of such a situation: the client is your smartphone, connected to some server on the web, and you enter a long tunnel.
But when you say: "But if one end shuts down both ends shut down. I simulated this by having both ends on the same computer localhost and just clicking the X button", what you are doing is actually closing the connections.
If you abruptly terminate the server, the TCP/IP implementation of your operating system will know that there is no longer a process listening on port 5000, and will cleanly close all connections to that port. In doing so, a few TCP segments are exchanged with the client side (either a TCP four-way teardown or a reset), and all clients are disconnected. It is important to understand that this is done at the level of the TCP/IP implementation, that is to say, by your operating system.
If you abruptly terminate a client, accordingly, the TCP/IP implementation of your operating system will cleanly close the connection from its port Y to your server port 5000.
In both cases/side, at the network level, that would be the same as if you explicitly (not abruptly) closed the connection in your code.
...and once closed, there's no way you can possibly re-establish those connections as they were before. You have to establish new connections.
If you want to establish these new connections and get the application logic to the state it was before, now that's another topic. TCP alone can't help you here. You need a higher level protocol, maybe your own, to implement stateful client/server application.
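
On the client side, "establish new connections" could look roughly like the sketch below. connect_with_retry and its attempts/delay parameters are hypothetical names, not something from the question. The point is that every attempt gets a fresh socket, because a socket whose connect() failed, or whose connection was closed, should not be reused.

```python
import socket
import time


def connect_with_retry(host, port, attempts=5, delay=1.0):
    """Open a new TCP connection, using a fresh socket for every attempt."""
    last_error = None
    for _ in range(attempts):
        try:
            # create_connection() builds a brand-new socket on each call.
            return socket.create_connection((host, port), timeout=5.0)
        except OSError as exc:
            last_error = exc
            time.sleep(delay)
    raise ConnectionError(f"could not connect to {host}:{port}") from last_error
```

When a later send() or recv() fails, close the old socket and call connect_with_retry() again. Restoring the application state after that is still up to your own protocol.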

The issue is not related to the programming language, in this case Python. The operating system (Windows or Linux) has the final word on how resilient the socket is.

How to abruptly disconnect a socket without closing it appropriately

I have a Python test program for testing features of another software component, let's call the latter the component under test (COT).
The Python test program is connected to the COT via a persistent TCP connection.
The Python program is using the Python socket API for this.
Now in order to simulate a failure of the physical link, I'd like to have the Python program shut the socket down, but without disconnecting appropriately.
I.e. I don't want anything to be sent on the TCP channel any more, including any TCP SYN/ACK/FIN. I just want the socket to go silent. It must not respond to the remote packets any more.
This is not as easy as it seems, since calling close on a socket will send TCP FIN packets to the remote end (graceful disconnection).
So how can I kill the socket without sending any packets out?
I cannot shut down the Python program itself, because it needs to maintain other connections to other components.
For information, the socket runs in a separate thread. So I thought of abruptly killing the thread, but this is also not so easy. (Is there any way to kill a Thread?)
Any ideas?

You can't do that from a userland process, since the in-kernel network stack still holds resources and state related to the given TCP connection. Even if you kill your whole process, the kernel is going to send a FIN to the other side, since it knows what file descriptors your process had and will try to clean them up properly.
One way to get around this is to engage firewall software (on the local machine or an intermediate one). Call a script that tells the firewall to drop all packets from/to the given IP and port (that of course would need appropriate administrative privileges).

Contrary to Nikolai's answer, there is indeed a way to reset the connection from userland such that an RST is sent and pending data discarded, rather than a FIN after all the pending data. However as it is more abused than used, I won't publish it here. And I don't know whether it can be done from Python. Setting one of the three possible SO_LINGER configurations and closing will do it. I won't say more than that, and I will say that this technique should only be used for the purpose outlined in the question.

closing a previously opened socket

I created a program which listens on a particular socket in Python. However, I Ctrl+C'd the script, so .close() was never called. How can I free the socket now?

The socket is closed when the process exits. The port it was using may hang around for a couple of minutes, that's normal, then it will disappear. If you need to re-use the port immediately, set SO_REUSEADDR before binding or connecting.

Set the SO_REUSEADDR socket option before calling the bind method, like this:
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
That will instruct the socket to freely reuse the ports left in a waiting state by recent runs of the program.
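Or, use the SocketServer.TCPServer class from the standard library (socketserver in Python 3), which will do this automatically if allow_reuse_address is set to a true value. Note that the attribute is read while the server binds, inside the constructor, so set it on the server class (or a subclass) rather than on an already-constructed instance.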
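
A small sketch of the TCPServer route, using the Python 3 module name socketserver. EchoHandler and ReusableTCPServer are made-up names for illustration, and the echo behaviour is only there to make the example complete.

```python
import socketserver


class EchoHandler(socketserver.BaseRequestHandler):
    """Illustrative handler: echo one chunk of data back to the client."""

    def handle(self):
        data = self.request.recv(1024)
        if data:
            self.request.sendall(data)


class ReusableTCPServer(socketserver.TCPServer):
    # Class attribute: it is read by server_bind(), which runs inside
    # __init__, so it has to be in place before the instance is created.
    allow_reuse_address = True
```

A server built as ReusableTCPServer(("127.0.0.1", 5000), EchoHandler) then has SO_REUSEADDR set on its listening socket before it binds. Call serve_forever() to run it and server_close() to release the port.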

How to close a socket left open by a killed program?

I have a Python application which opens a simple TCP socket to communicate with another Python application on a separate host. Sometimes the program will either error or I will directly kill it, and in either case the socket may be left open for some unknown time.
The next time I go to run the program I get this error:
socket.error: [Errno 98] Address already in use
Now the program always tries to use the same port, so it appears as though it is still open. I checked and am quite sure the program isn't running in the background and yet my address is still in use.
So, how can I manually (or otherwise) close a socket/address so that my program can immediately re-use it?
Update
Based on Mike's answer I checked out the socket(7) page and looked at SO_REUSEADDR:
SO_REUSEADDR
Indicates that the rules used in validating addresses supplied in a bind(2) call should
allow reuse of local addresses. For AF_INET sockets this means that a socket may bind,
except when there is an active listening socket bound to the address. When the listening
socket is bound to INADDR_ANY with a specific port then it is not possible to bind
to this port for any local address. Argument is an integer boolean flag.

Assume your socket is named s... you need to set socket.SO_REUSEADDR on the server's socket before binding to an interface... this will allow you to immediately restart a TCP server...
s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
s.bind((ADDR, PORT))

You might want to try using Twisted for your networking. Mike gave the correct low-level answer, SO_REUSEADDR, but he didn't mention that this isn't a very good option to set on Windows. This is the sort of thing that Twisted takes care of for you automatically. There are many, many other examples of this kind of boring low-level detail that you have to pay attention to when using the socket module directly but which you can forget about if you use a higher level library like Twisted.

You are confusing sockets, connections, and ports. Sockets are endpoints of connections, which in turn are 5-tuples {protocol, local-ip, local-port, remote-ip, remote-port}. The killed program's socket has been closed by the OS, and ditto the connection. The only relic of the connection is the peer's socket and the corresponding port at the peer host. So what you should really be asking about is how to reuse the local port. To which the answer is SO_REUSEADDR as per the other answers.

Python doesn't detect a closed socket until the second send

When I close the socket on one end of a connection, the other end gets an error the second time it sends data, but not the first time:
import socket
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("localhost", 12345))
server.listen(1)
client = socket.create_connection(("localhost",12345))
sock, addr = server.accept()
sock.close()
client.sendall(b"Hello World!") # no error (bytes literal, so this also runs on Python 3)
client.sendall(b"Goodbye World!") # error happens here
I've tried setting TCP_NODELAY, using send instead of sendall, checking the fileno(), I can't find any way to get the first send to throw an error or even to detect afterwards that it failed. EDIT: calling sock.shutdown before sock.close doesn't help. EDIT #2: even adding a time.sleep after closing and before writing doesn't matter. EDIT #3: checking the byte count returned by send doesn't help, since it always returns the number of bytes in the message.
So the only solution I can come up with if I want to detect errors is to follow each sendall with a client.sendall("") which will raise an error. But this seems hackish. I'm on a Linux 2.6.x so even if a solution only worked for that OS I'd be happy.

This is expected; it is how the TCP/IP APIs are implemented (so it's similar in pretty much all languages and on all operating systems).
The short story is that you cannot do anything to guarantee that a send() call returns an error directly if that send() call somehow cannot deliver data to the other end. send/write calls just hand the data to the TCP stack, and it's up to the TCP stack to deliver it when it can.
TCP is also just a transport protocol. If you need to know whether your application "messages" have reached the other end, you need to implement that yourself (some form of ACK) as part of your application protocol. There's no free lunch.
However, if you read() from a socket, you can get notified immediately when an error occurs or when the other end has closed the socket. You usually need to do this in some form of multiplexing event loop (that is, using select/poll or some other I/O multiplexing facility).
Just note that you cannot read() from a socket to learn whether the most recent send/write succeeded. Here are a few cases that show why (but it's the cases one doesn't think about that always get you):
- Several write() calls got buffered up due to network congestion, or because the TCP window was closed (perhaps a slow reader), and then the other end closes the socket or a hard network error occurs. You can't tell whether it was the last write that didn't get through or a write you did 30 seconds ago.
- A network error occurs, or a firewall silently drops your packets (no ICMP replies are generated). You will have to wait until TCP times out the connection to get an error, which can take many seconds, usually several minutes.
- TCP is busy retransmitting as you call send, and those retransmissions may generate an error (really the same as the first case).
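
To illustrate the read-side check, here is a small sketch that polls a socket with select and peeks at it without consuming data. peer_closed is a hypothetical helper name. As explained above, a False result only means that no close has been seen yet; it does not prove that an earlier send was delivered.

```python
import select
import socket


def peer_closed(sock):
    """Return True if the peer has closed (or reset) the connection."""
    # A zero timeout makes select() a non-blocking poll.
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False  # nothing to read, so no close has been seen so far
    try:
        # MSG_PEEK leaves pending data in the receive buffer; an empty
        # result means the peer closed its end of the connection.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except (ConnectionResetError, ConnectionAbortedError):
        return True
```

In the question's example, calling peer_closed(client) before the first sendall would typically report the close, provided the peer's FIN has already arrived.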

As per the docs, try calling sock.shutdown(socket.SHUT_RDWR) before the call to sock.close().
