I am struggling to find a mechanism to send a request to the target server and then, when the socket has data to be read, pass the socket to another process to read the data out.
I got as far as implementing this with epoll on Linux: I perform the handshake, send the request, and the response arrives; then I pass the socket fd to another process for further handling. I explicitly save the SSL session using PEM_write_bio_SSL_SESSION, read it back using PEM_read_bio_SSL_SESSION, and add it to the context, but I cannot read the SSL socket in the other process because I get either an internal error or a handshake failure.
I've read this article but still couldn't find any mechanism to make it work. I know this is because OpenSSL is an application-level library, but there has to be a way, because Apache is already doing this.
At least, if it's not possible, is there a way to decrypt the data from the socket (which I can read normally) using the master key from OpenSSL's session?
The only way you can do this is by cloning the full user-space part of the SSL socket, which is spread over multiple internal data structures. Since you don't have access to all of these structures from Python, you can only do this by cloning the process, i.e. by using fork.
Note that once you have forked the process, you should only continue to work with the SSL socket in one of the processes: it is not possible to fork, do some work in the child, and then do some more work in the parent. Whenever one process uses the socket, the SSL state changes, but only in that process; in the other process the state gets out of sync, and any attempt to use this stale state later will cause strange errors.
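For illustration, here is a minimal sketch of the fork approach (the host, port, and request are placeholders; it assumes Python's ssl module on Linux): the parent does the handshake and sends the request, then forks, and only the child touches the SSL socket afterwards.

import os
import socket
import ssl

ctx = ssl.create_default_context()
raw = socket.create_connection(("example.com", 443))
# wrap_socket performs the TLS handshake on the already-connected socket
tls = ctx.wrap_socket(raw, server_hostname="example.com")
tls.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")

pid = os.fork()
if pid == 0:
    # Child: its forked copy of the user-space SSL state is valid here.
    data = tls.recv(4096)
    print("child read", len(data), "bytes")
    os._exit(0)
else:
    # Parent: must not touch the SSL socket again, or the two copies of
    # the SSL state will diverge and later use will fail with odd errors.
    os.waitpid(pid, 0)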
I used ftputil to download a batch of files from an FTP server. It raised the error ftputil.error.FTPIOError: [Errno 60] Operation timed out.
As described in the ftputil documentation,
keep_alive() attempts to keep the connection to the remote server active in order to prevent timeouts from happening. This method is primarily intended to keep the underlying FTP connection of an FTPHost object alive while a file is uploaded or downloaded. This will require either an extra thread while the upload or download is in progress or calling keep_alive from a callback function.
I called keep_alive from a callback function with,
ftp_host.download(source, target, callback=ftp_host.keep_alive)
but it raised ERROR __main__ keep_alive() takes 1 positional argument but 2 were given.
How do I keep a FTP connection alive?
This isn't directly an answer to your question, but it may help you find an answer to your particular problem yourself. Also, a ticket on the ftputil website is better suited for help with debugging a problem. That said, I think it's fine to ask on StackOverflow first, since you don't know in advance whether the problem is a simple one or not. :-)
Since FTP is a stateful protocol, the client and server can't send arbitrary commands at any given time. The allowed commands and possible replies are determined by the state the connection is in. See also the state diagrams in RFC 959.
To work around this limitation, ftputil creates a new FTP connection behind the scenes for each remote file object [1]. With this approach, you can still send commands like chdir or start a new download while another transfer is still in progress. However, this means that from the perspective of the server, all these FTP connections coming from a single FTPHost object are independent connections, so each of them can time out at a different moment, depending on the usage pattern of the respective connection.
For example, there was ftputil ticket 141, where presumably the main connection initiated by the FTPHost object timed out while a connection used for downloading was still usable.
In your case, it might be helpful to find out which of the underlying connections is timing out (the initial connection or a connection for a remote file). You can use ftputil.session.session_factory to create factories that have FTP debugging enabled (see the documentation).
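For example, a sketch along these lines should show the FTP dialog of every underlying connection (host, user, password, and paths are placeholders; debug_level=2 is the ftplib debug level that session_factory passes through):

import ftputil
import ftputil.session

# Each new underlying FTP connection will print its protocol exchange,
# which makes it possible to see which connection times out.
my_session_factory = ftputil.session.session_factory(debug_level=2)
with ftputil.FTPHost("host", "user", "password",
                     session_factory=my_session_factory) as ftp_host:
    ftp_host.download("source", "target")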
Unfortunately, a timeout of 60 seconds is quite short, so there are relatively many chances for timeouts.
Especially given the possibility of timeouts in FTP connections, my advice is to write software for FTP transfers in a way so that you can restart the operation (ideally with a new FTPHost object for robustness) where it was interrupted by the timeout. So far I haven't been able to come up with a way to universally work around timeouts. In simple cases you may actually be better off using ftplib directly, although ftputil has robustness and latency improvements that ftplib doesn't have. Using ftplib doesn't save you from timeouts, but at least you don't have any "hidden" connections that may make debugging more difficult.
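A minimal sketch of such a restart strategy might look like this (host, credentials, and paths are placeholders, and the retry count is arbitrary):

import ftputil
import ftputil.error

def download_with_retries(source, target, attempts=3):
    # Use a brand-new FTPHost for every attempt, so a timed-out
    # connection from a previous attempt can't get in the way.
    for attempt in range(attempts):
        try:
            with ftputil.FTPHost("host", "user", "password") as ftp_host:
                ftp_host.download(source, target)
            return
        except ftputil.error.FTPIOError:
            continue
    raise RuntimeError("download failed after {} attempts".format(attempts))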
[1] That said, if you close a remote file in ftputil, the underlying FTP connection can be reused as long as it hasn't timed out. The library checks for a timeout before it reuses the connection.
The picture regarding timeouts is even more complicated by ftputil caching a lot of information from the server to reduce latency. For example, if you call FTPHost.getcwd(), the current directory is retrieved from a cached attribute, not by sending a PWD command to the server and thereby resetting the timeout. Stat information from directory listings is also usually cached.
After a couple of hours looking for solutions, I got it running without '421 Timeout' errors by calling keep_alive from a separate thread. However, your I/O timeout error was probably caused by connection problems.
import ftputil
from threading import Thread
from time import sleep

fhandle = ftputil.FTPHost('host', 'user', 'pwd')
quit_thread = False

def _thread_keep_alive():
    # Ping the server every 25 seconds so the control connection stays alive.
    while not quit_thread:
        print("KEEPALIVE!")
        fhandle.keep_alive()
        sleep(25)

thread = Thread(target=_thread_keep_alive)
thread.start()

# some downloading...

quit_thread = True   # tell the keep-alive thread to stop ...
thread.join()        # ... and wait for it before closing the connection
fhandle.close()
I tried to use Python's zmq library, and now I have two questions:
Is there a way to check the socket connection state?
I'd like to know whether the connection is established after calling connect.
I want a one-to-one communication model.
I tried the PAIR zmq socket type.
In that case, if one client is already connected, the server will not receive any messages from a second connected client.
But I'd like the second client to be informed that there is another client and the server is busy.
You'd get an error if connect fails.
But I guess the real question is how often you want to check this: once at startup, before each message, or periodically, using some heartbeat?
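If you do want an explicit check, one option is pyzmq's socket monitor, sketched below (the endpoint is a placeholder):

import zmq
from zmq.utils.monitor import recv_monitor_message

ctx = zmq.Context.instance()
sock = ctx.socket(zmq.PAIR)
monitor = sock.get_monitor_socket()   # take the monitor before connecting
sock.connect("tcp://127.0.0.1:5555")

# Read monitor events until the connection is established (or keep
# polling this monitor socket periodically as a heartbeat).
while True:
    event = recv_monitor_message(monitor)
    if event["event"] == zmq.EVENT_CONNECTED:
        print("connected to", event["endpoint"])
        break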
That does not make sense, as you cannot send info without connecting first.
However, some socket types might give some more info.
But the best way would be to use multiple sockets: one for such status information, and another one for sending data.
ZMQ is made to use multiple sockets.
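As a sketch of that pattern (ports, messages, and the particular socket types are my assumptions, not the only choice): a REP socket answers status queries such as "is the server busy?", while a PAIR socket carries the actual data.

import zmq

ctx = zmq.Context.instance()

status = ctx.socket(zmq.REP)    # clients ask here whether the server is busy
status.bind("tcp://*:5556")

data = ctx.socket(zmq.PAIR)     # the actual one-to-one data channel
data.bind("tcp://*:5555")

busy = False
poller = zmq.Poller()
poller.register(status, zmq.POLLIN)
poller.register(data, zmq.POLLIN)

while True:
    for sock, _ in poller.poll():
        if sock is status:
            sock.recv()         # any request works as a status query
            sock.send(b"busy" if busy else b"free")
        elif sock is data:
            busy = True
            message = data.recv()
            # ... handle the message ...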
Let's say we have a server application written in Python.
Let's also say that this main server process forked two more processes at startup.
The server awaits its clients, and when one connects, it decides which of the two forked processes the client's socket should be passed to.
I do not want to fork a process each time a client connects; I want a fixed number of servers, with one main server that receives connections and then passes each one to the server that deals with the specific work the client asked for.
This is meant for DoS attack protection, job separation, and so on.
Is there any trick to pass a Python object between running Python programs?
Some shared memory or something like that?
Would pickling the socket object and pushing it through IPC work?
No. Inside that object is a file descriptor or handle to the kernel socket. It's just a number that the process uses to identify the socket when making system calls.
If you pickle that Python socket object and send it to another process, that process will be using a handle for a socket it didn't open. Or worse, that handle may refer to a different open file.
The most efficient way to handle this (on Linux) is like this:
Master process opens listening socket (e.g. TCP port 80)
Master process forks N children who all inherit that open socket
They all call accept() and block, waiting for a new connection
When a new client connects, the kernel will select one of the processes with a handle to that socket to accept the connection; the others will continue to wait
This way, you let the kernel handle the load balancing, as in the sketch below.
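A minimal sketch of this pre-fork pattern (Linux/UNIX only; the port and the number of workers are placeholders):

import os
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("", 8080))
listener.listen(128)

for _ in range(4):                       # fork N children
    if os.fork() == 0:
        # Child: inherits the listening socket and blocks in accept();
        # the kernel hands each new connection to exactly one child.
        while True:
            conn, addr = listener.accept()
            conn.sendall(b"handled by pid %d\n" % os.getpid())
            conn.close()

# Parent: just keep the children alive.
while True:
    os.wait()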
If you don't want this behavior, there is a way (in UNIX) to pass an open socket to another process. Again, this is more than just the handle; the kernel effectively copies the open socket to your process's open file list. This mechanism is known as SCM_RIGHTS, and you can see an example (in C) here:
http://man7.org/tlpi/code/online/dist/sockets/scm_rights_send.c.html
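If you are on Python 3.9 or newer, the standard library wraps SCM_RIGHTS directly via socket.send_fds() and socket.recv_fds(); here is a sketch (the Unix-domain socket connecting the two processes is assumed to exist already):

import socket

def send_socket(unix_sock, tcp_sock):
    # Passes the open TCP socket's descriptor to the peer process;
    # the kernel duplicates it into the receiver's open file table.
    socket.send_fds(unix_sock, [b"fd"], [tcp_sock.fileno()])

def recv_socket(unix_sock):
    msg, fds, flags, addr = socket.recv_fds(unix_sock, 1024, 1)
    # Rebuild a Python socket object around the received descriptor.
    return socket.socket(fileno=fds[0])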
Otherwise, your master process will need to effectively proxy the connection to the child processes, reducing the efficiency of the system.
I am currently working on a server-plus-client combo in Python, using TCP sockets. From networking classes I know that a TCP connection should be closed step by step: first one side sends a signal that it wants to close the connection and waits for confirmation, then the other side does the same. After that, the socket can be safely closed.
I've seen the function socket.shutdown(flag) in the Python documentation, but I don't see how it fits into this standard, theoretical method of closing a TCP socket. As far as I know, it just blocks reading, writing, or both.
What is the best, most correct way to close a TCP socket in Python? Are there standard functions for the closing signals, or do I need to implement them myself?
shutdown is useful when you have to signal the remote client that no more data is being sent. You can specify in the shutdown() parameter which half-channel you want to close.
Most commonly, you want to close the TX half-channel by calling shutdown(1). At the TCP level, this sends a FIN packet, and the remote end will receive 0 bytes if blocking on read(), but the remote end can still send data back, because the RX half-channel is still open.
Some application protocols use this to signal the end of the message. Other protocols find the end of the message based on the data itself. For example, in an interactive protocol (where messages are exchanged many times) there may be no opportunity, or need, to close a half-channel.
In HTTP, shutdown(1) is one method that a client can use to signal that an HTTP request is complete. But the HTTP protocol itself embeds data that allows detecting where a request ends, so HTTP connections carrying multiple requests are still possible.
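A sketch of that HTTP use of shutdown(1) in Python (example.com is a placeholder; SHUT_WR is Python's name for the TX half-channel):

import socket

sock = socket.create_connection(("example.com", 80))
sock.sendall(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")
sock.shutdown(socket.SHUT_WR)     # send FIN; we can still receive

chunks = []
while True:
    data = sock.recv(4096)
    if not data:                  # 0 bytes: the server closed its side
        break
    chunks.append(data)
sock.close()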
I don't think that calling shutdown() before close() is always necessary, unless you need to explicitly close a half-channel. If you want to cease all communication, close() does that too. Calling shutdown() and forgetting to call close() is worse, because the file descriptor resources are not freed.
From Wikipedia: "On SVR4 systems use of close() may discard data. The use of shutdown() or SO_LINGER may be required on these systems to guarantee delivery of all data." This means that, if you have outstanding data in the output buffer, a close() could discard this data immediately on an SVR4 system. Linux, BSD, and BSD-derived systems like Apple's are not SVR4 and will try to send the output buffer in full after close(). I am not sure whether any major commercial UNIX is still SVR4 these days.
Again using HTTP as an example, an HTTP client running on SVR4 would not lose data by using close(), because it keeps the connection open after the request in order to get the response. An HTTP server under SVR4 would have to be more careful, calling shutdown(2) before close() after sending the whole response, because the response might still be partly in the output buffer.
According to the Python documentation:
Strictly speaking, you’re supposed to use shutdown on a socket before you close it. The shutdown is an advisory to the socket at the other end. Depending on the argument you pass it, it can mean “I’m not going to send anymore, but I’ll still listen”, or “I’m not listening, good riddance!”. Most socket libraries, however, are so used to programmers neglecting to use this piece of etiquette that normally a close is the same as shutdown(); close(). So in most situations, an explicit shutdown is not needed.
I think the most correct way to close a TCP connection is to use shutdown before closing it, because close is not atomic, and skipping shutdown can cause subtle bugs. Suppose you call close without shutdown while some of the data has not yet reached the server correctly: Python closes the connection right away, the server can't reply to the client, and the socket at the other end may hang indefinitely.
I have a Python test program for testing features of another software component, let's call the latter the component under test (COT).
The Python test program is connected to the COT via a persistent TCP connection.
The Python program is using the Python socket API for this.
Now, in order to simulate a failure of the physical link, I'd like to have the Python program shut the socket down without disconnecting properly.
I.e. I don't want anything to be sent on the TCP channel any more, including any TCP SYN/ACK/FIN. I just want the socket to go silent. It must not respond to remote packets any more.
This is not as easy as it seems, since calling close on a socket sends TCP FIN packets to the remote end (a graceful disconnect).
So how can I kill the socket without sending any packets out?
I cannot shut down the Python program itself, because it needs to maintain other connections to other components.
For information, the socket runs in a separate thread, so I thought of abruptly killing the thread, but this is not so easy either. (Is there any way to kill a Thread?)
Any ideas?
You can't do that from a userland process, since the in-kernel network stack still holds resources and state related to the given TCP connection. Even if you kill your whole process, the kernel is going to send a FIN to the other side, since it knows which file descriptors your process had and will try to clean them up properly.
One way to get around this is to engage firewall software (on the local or an intermediate machine). Call a script that tells the firewall to drop all packets from/to the given IP and port (this of course requires appropriate administrative privileges).
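For instance, a sketch that shells out to iptables on Linux (requires root; the IP and port are placeholders, and the rule should be deleted again after the test):

import subprocess

def silence_connection(remote_ip, remote_port):
    # Drop all outgoing packets to the peer, so no FIN/ACK ever leaves.
    subprocess.run(
        ["iptables", "-A", "OUTPUT", "-d", remote_ip,
         "-p", "tcp", "--dport", str(remote_port), "-j", "DROP"],
        check=True,
    )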
Contrary to Nikolai's answer, there is indeed a way to reset the connection from userland such that an RST is sent and pending data discarded, rather than a FIN after all the pending data. However, as it is more abused than used, I won't publish it here, and I don't know whether it can be done from Python. Setting one of the three possible SO_LINGER configurations and then closing will do it. I won't say more than that, except that this technique should only be used for the purpose outlined in the question.