I am learning network programming in Python, and I'm trying to write a toy VPN forked from the Android SDK sample: https://github.com/android/platform_development/tree/master/samples/ToyVpn.
My toy VPN is https://github.com/325862401/ToyVPN.
It's only for Linux.
My home network is behind NAT.
I can use this VPN to surf the internet after connecting to the remote server.
But after about half an hour, sometimes longer, the client UDP socket stops receiving any data, while the server can still receive and send normally.
At this point I must terminate my client and run ToyVpnClient again.
It then works normally for a while until it stops receiving again.
Please help me check the client logs.
>2013-08-24 11:42:38 INFO receive data from the tunnel timeout
You can see that when the problem happens, the socket always sends and never receives.
(> means send, < means receive.)
I want to know why the UDP socket stops receiving data.
Is there any debugging method to find the cause?
So far I've only used logging to debug my program.
Since you're running your client over the Internet, there is a whole universe of possible causes: the entire network path between you and the server.
There's no simple way of debugging here. Possible causes include, of course, a software error, but also some intermediate network configuration between you and the remote server.
You should capture the UDP traffic with Wireshark or the command-line tcpdump, on both ends, and check whether you stop sending packets or the server stops receiving them.
If you send packets but your server doesn't receive them (run tcpdump on the server), then something on the network is deciding to filter your packets. And if it's not on the server itself (firewall rules that rate-limit packets, for example, or something like that), there's nothing you can do about it without modifying the logic of your program, like changing the UDP port every X seconds or using a persistent TCP connection.
A UDP socket is not stable and may go dead once a scan or some other event occupies your network interface for a while (especially true on Android). Using TCP avoids this problem. If you want to maintain a stable UDP socket, keep monitoring its status; if it goes dead or anything unusual happens, delete the socket and create a new one. Put this reactivation logic in a loop so that your UDP socket is always alive; a sketch follows.
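For illustration, here is a minimal sketch of such a reactivation loop (the server address is hypothetical, and the 30-second timeout is an assumption, not something taken from the original code):

import socket

SERVER = ("203.0.113.1", 8000)  # hypothetical VPN server

def make_socket():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(30)            # treat 30s of silence as a dead socket
    sock.sendto(b"hello", SERVER)  # outgoing packet re-creates the NAT mapping
    return sock

sock = make_socket()
while True:
    try:
        data, addr = sock.recvfrom(65536)
        # ... hand the data to the tun device here ...
    except socket.timeout:
        # Nothing received for 30 seconds: assume the socket (or the
        # NAT mapping behind it) went stale; discard it and start over.
        sock.close()
        sock = make_socket()

On a home NAT this also works around expired NAT mappings, which are a frequent cause of a UDP client that can still send but no longer receives.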
Related
In Python, TCP connect returns success even though the connect request is still queued at the server end. Is there any way for the client to know whether accept has happened or the request is still queued at the server?
The problem is not related to Python but is caused by the underlying socket machinery, which does its best to hide low-level network events from the program. The best I can imagine would be to try a higher-level protocol handshake (send a hello string and set a timeout for receiving the answer), but it could not distinguish between the following problems:
connection is queued on the peer and still not accepted
connection has been accepted, but for any other reason the server could not process it in the allocated time
(only if the timeout is very short) congestion on the machines (including the sender) and the network added a delay greater than the timeout
My advice is simply that you do not even want to worry about such low-level details. As problems can arise server-side after the connection has been accepted, you will have to deal with possible higher-level protocol errors, timeouts, or connection loss anyway. Just accept that there is no difference between a timeout after the connection has been accepted and a timeout waiting for it to be accepted. A minimal sketch of such a handshake follows.
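For illustration, a minimal sketch of that handshake probe (the greeting protocol, host, and port are hypothetical assumptions, not part of the original answer):

import socket

def check_server(host, port, timeout=2.0):
    # connect() returning only proves the TCP handshake; the exchange
    # below is what proves the server is actually processing requests.
    sock = socket.create_connection((host, port), timeout=timeout)
    sock.settimeout(timeout)
    try:
        sock.sendall(b"hello\n")
        return sock.recv(64).startswith(b"hello")
    except socket.timeout:
        # Queued-but-not-accepted, accepted-but-overloaded, and plain
        # network congestion all look identical from here.
        return False
    finally:
        sock.close()

print(check_server("203.0.113.9", 7000))  # hypothetical endpoint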
If connect returns and there is no error, the TCP 3-Way Handshake has taken place successfully.
1. Client: connect sends a SYN (and blocks)
2. Server: (blocking on accept) sends a SYN,ACK
3. Client: connect sends an ACK
After step 3, connect gives control back to you on the client side, and accept also gives control back to the caller on the server side.
Of course, if the server is fully loaded, there is no guarantee that the wake-up of accept means actual processing of the request, but the fact that connect has woken up and returned with no error is a guarantee of having successfully set up the TCP connection.
Packets can now be sent.
For a good explanation see for example:
https://www.ibm.com/developerworks/aix/library/au-tcpsystemcalls/index.html
and head to the section "The 3-way TCP handshake".
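In Python, that guarantee looks like this sketch (the address is a placeholder):

import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    sock.connect(("198.51.100.7", 80))  # hypothetical server
except socket.error as e:
    # The 3-way handshake did not complete (refused, unreachable, timeout...).
    print("handshake failed:", e)
else:
    # connect() returned without error: SYN, SYN-ACK, and ACK have all been
    # exchanged, so packets can be sent, even if the server process has not
    # called (or returned from) accept() yet.
    sock.sendall(b"GET / HTTP/1.0\r\n\r\n")
finally:
    sock.close()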
I'm going crazy writing a little socket server in Python. Everything was working fine, but I noticed that when the client just disappears, the server can't tell. I simulate this by pulling the ethernet cable between the client and server, closing the client, then plugging the cable back in. The server never hears that the client disconnected and will wait forever, never allowing more clients to connect.
I figured I'd solve this by adding a timeout to the read loop so that it would try and read every 10 seconds. I thought maybe if it tried to read from the socket it would notice the client was missing. But then I realized there really is no way for the server to know that.
So I added a heartbeat. If the server goes 10 seconds without reading, it will send data to the client. However, even this is successful (meaning it doesn't throw any kind of exception). So I am able to both read from and write to a client that isn't there any more. Is there any way to know that the client is gone without implementing some kind of challenge/response protocol between the client and server? That would be a breaking change in this case, and I'd like to avoid it.
Here is the core of my code for this:
def _loop(self):  # assumes `import socket` at module level
    command = ""
    while True:
        conn, address = self._listen_socket.accept()  # renamed so it doesn't shadow the socket module
        self._socket = conn
        conn.settimeout(10)
        conn.sendall("Welcome\r\n\r\n")
        while True:
            try:
                data = conn.recv(1)
            except socket.timeout:  # Went 10 seconds without data
                data = None         # was undefined on the first pass; make the timeout case explicit
            except Exception:       # Likely the client closed the connection
                break
            if data:
                command = command + data
                if data == "\n" or data == "\r":
                    if len(command.strip()) > 0:
                        self._parse_command(command.strip(), conn)
                    command = ""
                if data == '\x08':  # Backspace: drop it and the character before it
                    command = command[:-2]
            else:  # Timeout on read
                try:
                    conn.sendall("event,heartbeat\r\n")  # Send heartbeat
                except Exception:
                    conn.close()
                    break
The sendall for the heartbeat never throws an exception and the recv only throws a timeout (or another exception if the client properly closes the connection under normal circumstances).
Any ideas? Am I wrong that sending to a client that doesn't ACK should eventually generate an exception? (I've tested for several minutes.)
The behavior you are observing is the expected behavior for a TCP socket connection. In particular, in general the TCP stack has no way of knowing that an ethernet cable has been pulled or that the (now physically disconnected) remote client program has shut down; all it knows is that it has stopped receiving acknowledgement packets from the remote peer, and for all it knows the packets could just be getting dropped by an overloaded router somewhere and the issue will resolve itself momentarily. Given that, it does what TCP always does when its packets don't get acknowledged: it reduces its transmission rate and its number-of-packets-in-flight limit, and retransmits the unacknowledged packets in the hope that they will get through this time.
Assuming the server's socket has outgoing data pending, the TCP stack will eventually (i.e. after a few minutes) decide that no data has gone through for a long-enough time, and unilaterally close the connection. So if you're okay with a problem-detection time of a few minutes, the easiest way to avoid the zombie-connection problem is simply to be sure to periodically send a bit of heartbeat data over the TCP connection, as you described. When the TCP stack tries (and repeatedly fails) to get the outgoing data sent-and-acknowledged, that is what eventually will trigger it to close the connection.
If you want something quicker than that, you'll need to implement your own challenge/response system with timeouts (either over the TCP socket, or over a separate TCP socket, or over UDP), but note that in doing so you are likely to suffer from false positives yourself (e.g. you might end up severing a TCP connection that was not actually dead but only suffering from a temporary condition of lost packets due to congestion). Whether or not that's a worthwhile tradeoff depends on what sort of program you are writing. (Note also that UDP has its own issues, particularly if you want your system to work across firewalls, etc)
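As a kernel-level variant of the heartbeat described above, TCP keepalive can be enabled on the socket so that the stack itself probes an idle connection and closes it when the probes go unanswered. A sketch, assuming Linux (these constant names are Linux-specific and not part of the original answer):

import socket

def enable_keepalive(sock, idle=10, interval=5, probes=3):
    # Start probing after `idle` seconds of silence, probe every
    # `interval` seconds, and declare the peer dead after `probes`
    # unanswered probes (here: roughly 10 + 3*5 = 25 seconds).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)

Once the connection is declared dead, a blocked recv() fails with an error, which gives the server its missing disconnect notification without any change to the application protocol.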
We have a server, written using Tornado, which sends asynchronous messages to a client over websockets. In this case, a JavaScript app running in Chrome on a Mac. When the client is forcibly disconnected, in this case by putting the client machine to sleep, the server still thinks it is sending messages to the client. Additionally, when the client awakens from sleep, the messages are delivered in a burst.
What is the mechanism by which these messages are queued/buffered? Who is responsible? Why are they still delivered? Who is reconnecting the socket? My intuition is that even though websockets are not request/response like HTTP, they should still require ACK packets since they are built on TCP. Is this being done on purpose to make the protocol more robust to temporary drops in the mobile age?
Browsers may handle websocket client messages in a separate thread, which is not blocked by sleep.
Even if a thread of your custom application is not active because you forced it to sleep (like sleep(100)), the TCP connection is not closed. The socket handle is still managed by the OS kernel, and the TCP server keeps sending messages until it fills the TCP client's receive window. Even after that, an application on the server side can still submit new messages successfully; they are buffered at the TCP level on the server side until the outgoing buffer overflows. When the outgoing buffer is full, the application should get an error code on the send request, like "no more space". I have not tried it myself, but it should behave like this; the sketch below shows one way to observe it.
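A minimal sketch of observing that overflow from Python 3 (the peer address is a placeholder, and a real peer that has stopped reading is assumed):

import socket

HOST, PORT = "192.168.1.50", 9000  # hypothetical sleeping peer

sock = socket.create_connection((HOST, PORT))
sock.setblocking(False)  # make send() fail instead of blocking when buffers fill

sent = 0
chunk = b"x" * 4096
try:
    while True:
        sent += sock.send(chunk)  # succeeds while the kernel buffers have room
except BlockingIOError:
    # The local send buffer and the peer's receive window are both full.
    print("kernel buffers full after", sent, "bytes")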
Try closing the client (terminating the process) and you will see a totally different picture: the server will notice the disconnect.
Both cases, disconnect and overflow, are difficult to handle on the server side in highly reliable scenarios. The disconnect case can be converted into the overflow case (the websocket server can buffer messages up to some limit in user space while the client reconnects). However, there is no easy way to reliably handle overflow of the transmit buffer limit. I see only one solution: propagate the overflow error back to the originator of the event that raised the message which was discarded due to overflow.
I am trying to implement a Python traceroute that sends UDP messages and receives the ICMP responses via raw sockets. I've run into an issue where the ICMP packets seem to avoid capture at all costs. The ICMP responses show up in Wireshark as exactly what I'd expect, but the socket never receives any data to read. Another complication is that I am running the code in VirtualBox running Ubuntu, as sendto() would not get the packets onto the wire in Windows 7. (I'm running Wireshark in Windows to capture the packets.) The strange thing is that Wireshark will capture the ICMP messages when I run the Python script from the virtual machine. However, when I try to run the script on Windows, the ICMP messages don't show up in Wireshark. (The UDP packets have magically started working on Windows.)
I've played around with all sorts of different versions of setting up the socket from online examples, and played around with using bind() and not using it, but no configuration seems to produce a socket that reads. It will just time out waiting to read the ICMP message.
It should also be noted that if I try to read from my UDP sending socket, it successfully reads the UDP packets. As soon as I set IPPROTO_ICMP, the read times out.
receive_response method:
def receive_response(rec_socket, packetid, tsend, timeout):  # assumes `import select, time`
    remain = timeout
    print packetid
    while remain > 0:
        start = time.time()
        ready = select.select([rec_socket], [], [], remain)
        if ready[0] == []:  # select timed out: no ICMP reply arrived
            return
        print 'got something'
        data, addr = rec_socket.recvfrom(1024)   # drain the packet, or select fires again immediately
        remain = remain - (time.time() - start)  # was never decremented: the loop could spin forever
setting up the socket:
rec_socket = socket.socket(socket.AF_INET, socket.SOCK_RAW, ICMP_CODE)  # ICMP_CODE is presumably socket.getprotobyname('icmp')
rec_socket.setsockopt(socket.SOL_IP, socket.IP_HDRINCL, 1)  # note: IP_HDRINCL only affects packets you *send*
rec_socket.bind(("", 0))  # played with using this statement and skipping it
call to receive is simply:
reached = receive_response(rec_socket, packetid, time.time(), timeout)
It looks like the problem is that VirtualBox defaults to using NAT to connect to the network. This means the virtual machine won't receive the ICMP messages, by virtue of them being ICMP messages. The solution seems to be to configure VirtualBox networking to use "Bridged Adapter" mode. Unfortunately I cannot confirm this, as I can't set up the virtual machine in bridged mode on my university's network. As for why it didn't work in Windows, that must be related to Windows' lack of support for raw sockets.
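For reference, here is a minimal single-hop sketch that receives the ICMP reply on Linux when run as root (the destination host is a placeholder assumption):

import socket

DEST = "example.com"
PORT = 33434  # traditional traceroute destination port
TTL = 1       # probe the first hop

send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
send_sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, TTL)

recv_sock = socket.socket(socket.AF_INET, socket.SOCK_RAW,
                          socket.getprotobyname("icmp"))
recv_sock.settimeout(3)

send_sock.sendto(b"", (DEST, PORT))
try:
    packet, addr = recv_sock.recvfrom(1024)
    # Assuming a 20-byte IP header without options, bytes 20 and 21 are
    # the ICMP type and code: type 11 is "time exceeded" (intermediate
    # hop), type 3 is "destination/port unreachable" (final hop).
    print("hop:", addr[0], "ICMP type/code:", packet[20], packet[21])
except socket.timeout:
    print("no ICMP reply within 3 seconds")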
I have a Python test program for testing features of another software component, let's call the latter the component under test (COT).
The Python test program is connected to the COT via a persistent TCP connection.
The Python program is using the Python socket API for this.
Now, in order to simulate a failure of the physical link, I'd like to have the Python program shut the socket down without disconnecting cleanly.
That is, I don't want anything to be sent on the TCP channel any more, including any TCP SYN/ACK/FIN. I just want the socket to go silent. It must not respond to remote packets any more.
This is not as easy as it seems, since calling close on a socket will send a TCP FIN packet to the remote end (graceful disconnection).
So how can I kill the socket without sending any packets out?
I cannot shut down the Python program itself, because it needs to maintain other connections to other components.
For information, the socket runs in a separate thread. So I thought of abruptly killing the thread, but this is also not so easy. (Is there any way to kill a Thread?)
Any ideas?
You can't do that from a userland process, since the in-kernel network stack still holds resources and state related to the given TCP connection. Even if you kill your whole process, the kernel is going to send a FIN to the other side, since it knows which file descriptors your process had and will try to clean them up properly.
One way to get around this is to engage firewall software (on the local or an intermediate machine). Call a script that tells the firewall to drop all packets from/to the given IP and port (that of course requires appropriate administrative privileges), as in the sketch below.
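A sketch of that approach on Linux, shelling out to iptables (requires root; the peer address and port are placeholders):

import subprocess

PEER_IP, PEER_PORT = "10.0.0.5", 5000  # hypothetical remote endpoint

# Silently drop everything we would otherwise send to the peer, so the
# connection simply goes dark without any FIN or RST leaving the machine.
subprocess.check_call([
    "iptables", "-A", "OUTPUT",
    "-p", "tcp", "-d", PEER_IP, "--dport", str(PEER_PORT),
    "-j", "DROP",
])

# Replace -A with -D later to remove the rule and restore the link.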
Contrary to Nikolai's answer, there is indeed a way to reset the connection from userland such that an RST is sent and pending data discarded, rather than a FIN after all the pending data. However, as it is more abused than used, I won't publish it here, and I don't know whether it can be done from Python. Setting one of the three possible SO_LINGER configurations and then closing will do it. I won't say more than that, except that this technique should only be used for the purpose outlined in the question.