How do I keep an FTP connection alive? - python

I used ftputil to download a batch of files from an FTP server. It raised the error ftputil.error.FTPIOError: [Errno 60] Operation timed out.
As described in Documentation – ftputil,
keep_alive() attempts to keep the connection to the remote server active in order to prevent timeouts from happening. This method is primarily intended to keep the underlying FTP connection of an FTPHost object alive while a file is uploaded or downloaded. This will require either an extra thread while the upload or download is in progress or calling keep_alive from a callback function.
I called keep_alive from a callback function with:
ftp_host.download(source, target, callback=ftp_host.keep_alive)
but it raised ERROR __main__ keep_alive() takes 1 positional argument but 2 were given.
How do I keep an FTP connection alive?
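(Presumably download calls the callback with the data chunk that was just transferred, while keep_alive() only accepts self, hence the TypeError. A minimal sketch of a wrapper that simply discards the chunk, assuming that signature mismatch is the only problem:)

ftp_host.download(source, target,
                  callback=lambda chunk: ftp_host.keep_alive())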

This isn't directly an answer to your question, but it may help you find an answer to your particular problem yourself. Also, a ticket on the ftputil website is better suited for help with debugging a problem. That said, I think it's fine to ask on StackOverflow first, since you don't know in advance whether the problem is a simple one or not. :-)
Since FTP is a stateful protocol, client and server can't send arbitrary commands at any given time. The allowed commands and possible replies are determined by the state the connection is in. See also the state diagrams in RFC 959.
To work around this limitation, ftputil creates a new FTP connection behind the scenes for each remote file object [1]. With this approach, you can still send commands like chdir or start a download while another download is still in progress. However, this means that from the perspective of the server, all these FTP connections that come from a single FTPHost object are independent connections, so each of them can time out at a different time, depending on the usage pattern of the respective connection.
For example, there was ftputil ticket 141, where presumably the main connection initiated by the FTPHost object timed out while a connection used for downloading was still usable.
In your case, it might be helpful to find out which of the underlying connections is timing out (the initial connection or a connection for a remote file). You can use ftputil.session.session_factory to create factories that have FTP debugging enabled (see the documentation).
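For example, a minimal sketch based on the documented session_factory (host, user, and password are placeholders); debug_level follows ftplib's set_debuglevel values, so 2 prints the full FTP dialogue:

import ftplib
import ftputil
import ftputil.session

# Session factory with FTP protocol debugging enabled, so you can see
# which underlying connection sends commands and which one times out.
debugging_session_factory = ftputil.session.session_factory(
    base_class=ftplib.FTP,
    debug_level=2,
)

with ftputil.FTPHost("host", "user", "password",
                     session_factory=debugging_session_factory) as ftp_host:
    ftp_host.download("source_file", "target_file")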
Unfortunately, a timeout of 60 seconds is quite short, so there are relatively many chances for timeouts.
Especially given the possibility of timeouts in FTP connections, my advice is to write software for FTP transfers in such a way that you can restart the operation (ideally with a new FTPHost object, for robustness) where it was interrupted by the timeout. So far I haven't been able to come up with a way to universally work around timeouts. In simple cases you may actually be better off using ftplib directly, although ftputil has robustness and latency improvements that ftplib doesn't have. Using ftplib doesn't save you from timeouts, but at least you don't have any "hidden" connections that may make debugging more difficult.
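A minimal sketch of that approach, assuming a plain download that can simply be restarted from scratch (host, credentials, and paths are placeholders):

import ftputil
import ftputil.error

def download_with_retries(source, target, max_attempts=3):
    # Each attempt uses a fresh FTPHost object, so a timed-out connection
    # from a previous attempt can't get in the way.
    for attempt in range(max_attempts):
        try:
            with ftputil.FTPHost("host", "user", "password") as ftp_host:
                ftp_host.download(source, target)
            return
        except ftputil.error.FTPIOError:
            if attempt == max_attempts - 1:
                raise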
[1] That said, if you close a remote file in ftputil, the underlying FTP connection can be reused as long as it hasn't timed out. The library checks for a timeout before it reuses the connection.
The picture regarding timeouts is even more complicated by ftputil caching a lot of information from the server to reduce latency. For example, if you call FTPHost.getcwd(), the current directory is retrieved from a cached attribute, not by sending a PWD command to the server and thereby resetting the timeout. Stat information from directory listings is also usually cached.

After a couple of hours looking for solutions, I got it running without '421 Timeout' errors by calling keep_alive from a separate thread. However, your I/O timeout error was probably caused by connection problems.
import ftputil
from threading import Thread
from time import sleep

fhandle = ftputil.FTPHost('host', 'user', 'pwd')

quitThread = 0

def _thread_keep_alive():
    while quitThread == 0:
        print("KEEPALIVE!")
        fhandle.keep_alive()
        sleep(25)

thread = Thread(target=_thread_keep_alive)
thread.start()

# some downloading...

quitThread = 1
fhandle.close()

Related

How to close a SolrClient connection?

I am using SolrClient for python with Solr 6.6.2. It works as expected but I cannot find anything in the documentation for closing the connection after opening it.
def getdocbyid(docidlist):
    for id in docidlist:
        solr = SolrClient('http://localhost:8983/solr', auth=("solradmin", "Admin098"))
        doc = solr.get('Collection_Test', doc_id=id)
        print(doc)
I do not know if the client closes the connection automatically or not. If it doesn't, wouldn't it be a problem if several connections are left open? I just want to know if there is any way to close the connection. Here is the link to the documentation:
https://solrclient.readthedocs.io/en/latest/
The connections are not kept around indefinitely. The standard timeout for any persistent HTTP connection in Jetty is five seconds, as far as I remember, so you do not have to worry about the number of kept-alive connections exploding.
The Jetty server will also just drop the connection if required, as it's not required to keep it around as a guarantee for the client. SolrClient uses a requests session internally, so it should reuse the underlying connection for subsequent queries. If you run into issues with this, you can keep a set of clients available as a pool in your application instead, and then request an available client instead of creating a new one each time.
I'm however pretty sure you won't run into any issues with the default settings.
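If you would still rather not open a connection per document, a minimal sketch that reuses a single client (and therefore a single underlying requests session), using only the calls already shown in the question:

from SolrClient import SolrClient

# One client for the whole batch instead of one per document id.
solr = SolrClient('http://localhost:8983/solr', auth=("solradmin", "Admin098"))

def getdocbyid(docidlist):
    for doc_id in docidlist:
        doc = solr.get('Collection_Test', doc_id=doc_id)
        print(doc)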

How can a Python socket detect that the server is closed while continuing to send data to it?

I use a Python socket to send data to a server. When I close the server, the client still sends the data twice, and only then does it reach the "except" branch. If I set SEND_INTERVAL too long, it will be a disaster. So, how can I get the error immediately when the server is closed or goes down?
Nothing happens immediately over the network. That's one thing.
Secondly, the underlying OS will detect broken connections (and Python gets that information in the form of an exception), but this usually takes time. That's why you can still send messages even though the connection is already dead. And since the OS operates at the network layer (as opposed to the application layer), there's another issue: the connection may be dead but the OS may never detect this. For example, this will happen when the server is dead but sits behind a proxy that is still alive.
Thirdly, the most reliable way to know that a server is alive is when it sends something back to the client. So you should always .recv() (with a timeout) after a .sendall() call, and the server should always .sendall() after a .recv() (the request-response pattern, even when the response is a simple "I received the message"). If you can't modify the server side and (in the worst case) the server doesn't send anything back to the client, then there's no reliable way to know this.
That's why you need some form of framing/correctness protocol. Simple .sendall() won't do.
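A minimal sketch of that pattern on the client side, assuming the server answers each message with some kind of acknowledgement (HOST and PORT are placeholders):

import socket

HOST, PORT = "127.0.0.1", 9000

with socket.create_connection((HOST, PORT), timeout=5) as sock:
    sock.sendall(b"some data\n")
    try:
        reply = sock.recv(1024)
        if not reply:
            # Empty bytes means the server closed the connection cleanly.
            raise ConnectionError("server closed the connection")
    except socket.timeout:
        # No acknowledgement at all within the timeout window.
        raise ConnectionError("no response from server within 5 seconds")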

Pass connected SSL Socket to another Process

I am struggling to find a mechanism to send a request to the target server and, when the socket has data to be read, pass the socket to another process to get the data out.
I got as far as implementing this with epoll on Linux, to the point where I do the handshake, send the request, and the request arrives; then I pass the socket fd to another process for further handling. I explicitly save the SSL session using PEM_write_bio_SSL_SESSION, then read it back with PEM_read_bio_SSL_SESSION and add it to the context, but I cannot read the SSL socket in the other process: I get either an internal error or a handshake failure.
I've read this article but still couldn't find any mechanism to make it work. I know this is because OpenSSL is an application-level library, but there has to be a way, because Apache is already doing this.
At the very least, if it's not possible, is there a way to decrypt the data from the socket (which I can read normally) using the master key from OpenSSL's session?
The only way you can do this is by cloning the full user-space part of the SSL socket, which is spread over multiple internal data structures. Since you don't have access to all of these structures from Python, you can only do this by cloning the process, i.e. using fork.
Note that once you have forked the process, you should only continue to work with the SSL socket in one of the processes, i.e. it is not possible to fork, do some work in the child, and then do some work in the parent process. This is because as soon as one process uses the socket, the SSL state changes, but only in that process. In the other process the state gets out of sync, and any attempt to use that stale state later will cause strange errors.
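A minimal sketch of the fork approach with Python's ssl module (the host and request are placeholders); note that only the child touches the TLS socket after the fork:

import os
import socket
import ssl

context = ssl.create_default_context()
raw_sock = socket.create_connection(("example.com", 443))
tls = context.wrap_socket(raw_sock, server_hostname="example.com")

# Do the handshake and send the request in the parent ...
tls.sendall(b"GET / HTTP/1.0\r\nHost: example.com\r\n\r\n")

pid = os.fork()
if pid == 0:
    # Child: inherits the complete user-space SSL state and reads the reply.
    data = tls.recv(4096)
    print(data[:100])
    tls.close()
    os._exit(0)
else:
    # Parent: must not use `tls` after the fork; just wait for the child.
    os.waitpid(pid, 0)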

Python server interaction

I'm new to Python, so please excuse me in advance if the question doesn't make sense.
We have a Python messaging server which has one file, server.py, with a main function in it. It also has a class "server", and main defines a global instance of this class, "the_server". All other functions in the same file or in different modules (in the same dir) import this instance as "from main import the_server".
Now, my job is to devise a mechanism which allows us to get latest message status (number of messages etc.) from the aforementioned messaging server.
This is the dir structure:
src/ -> all .py files; only one file has main
In the same directory I created another status server with a main function listening for connections on a different port, and I'm hoping that every time a client asks me for message status, I can invoke function(s) on my messaging server which return the expected numbers.
How can I import the global instance "the_server" into my status server? Or rather, is that the right way to go?
You should probably use a single server and design a protocol that supports several kinds of messages: 'send' messages get sent, 'recv' messages read any existing messages, 'status' messages get the server status, 'stop' messages shut it down, etc.
You might look at existing protocols such as REST, for ideas.
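A minimal sketch of such a dispatch; the handler methods here are hypothetical stand-ins for whatever the real server object exposes:

def handle_request(server, request):
    # `request` is assumed to be a dict like {"command": "status"}.
    command = request.get("command")
    if command == "send":
        return server.send_message(request["payload"])
    if command == "recv":
        return server.read_messages()
    if command == "status":
        return {"message_count": server.message_count()}
    if command == "stop":
        server.shutdown()
        return {"ok": True}
    return {"error": "unknown command %r" % command}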
Unless your "status server" and "real server" are running in the same process (that is, loosely, one of them imports the other and starts it), just from main import the_server in your status server isn't going to help. That will just give you a new, completely independent instance of the_server that isn't doing anything, which you can then report status on.
There are a few obvious ways to solve the problem.
Merge the status server into the real server completely, by expanding the existing protocol to handle status-related requests, as Peter Wooster suggests.
Merge the status server into the real server's async I/O implementation, but still listening on two different ports, with different protocol handlers for each.
Merge the status server into the real server process, but with a separate async I/O implementation.
Store the status information in, e.g., a mmap or a multiprocessing.Array instead of directly in the Server object, so the status server can open the same mmap/etc. and read from it. (You might be able to put the Server object itself in shared memory, but I wouldn't recommend this even if you could make it work.)
I could make these more concrete if you explained how you're dealing with async I/O in the server today. Select (or poll/kqueue/epoll) loop? Thread per connection? Magical greenlets? Non-magical cooperative threading (like PEP 3156/tulip)? Even just "All I know is that we're using twisted/tornado/gevent/etc., so whatever that does" is enough.
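A minimal sketch of the shared-memory option (the last one in the list above), using multiprocessing.Value as a stand-in for the suggested mmap/multiprocessing.Array; the server and status loops are placeholders for the real code:

import time
from multiprocessing import Process, Value

def messaging_server(message_count):
    while True:
        # ... handle one message, then bump the shared counter ...
        with message_count.get_lock():
            message_count.value += 1
        time.sleep(1)

def status_server(message_count):
    while True:
        # A real status server would answer clients on its own port.
        print("messages handled so far:", message_count.value)
        time.sleep(5)

if __name__ == "__main__":
    message_count = Value("i", 0)   # shared between both processes
    Process(target=messaging_server, args=(message_count,), daemon=True).start()
    Process(target=status_server, args=(message_count,), daemon=True).start()
    time.sleep(12)                  # let the demo run for a bit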

MySQLdb execute timeout

Sometimes in our production environment a situation occurs where the connection between the service (a Python program that uses MySQLdb) and the MySQL server is flaky: some packets are lost, some black magic happens, and .execute() on a MySQLdb.Cursor object never returns (or takes a huge amount of time to return).
This is very bad because it wastes service worker threads. Sometimes it exhausts the worker pool and the service stops responding at all.
So the question is: Is there a way to interrupt a MySQLdb.Connection.execute operation after a given amount of time?
If the communication is such a problem, consider writing a 'proxy' that receives your SQL commands over the flaky connection and relays them to the MySQL server on a reliable channel (maybe running on the same box as the MySQL server). This way you have total control over failure detection and retrying.
You need to analyse exactly what the problem is. MySQL connections should eventually time out if the server is gone; TCP keepalives are generally enabled. You may be able to tune the OS-level TCP timeouts.
If the database is "flaky", then you definitely need to investigate how. It seems unlikely that the database really is the problem, more likely that networking in between is.
If you are using (some) stateful firewalls of any kind, it's possible that they're losing some of the state, thus causing otherwise good long-lived connections to go dead.
You might want to consider changing the idle timeout parameter in MySQL; otherwise, a long-lived, unused connection may go "stale", where the server and client both think it's still alive, but some stateful network element in between has "forgotten" about the TCP connection. An application trying to use such a "stale" connection will have a long wait before receiving an error (but it should eventually).
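If the client-side timeouts are the part you control, here is a minimal sketch with the mysqlclient fork of MySQLdb (assumption: your version accepts read_timeout/write_timeout; connect_timeout is long-standing). These bound how long a call such as .execute() can block on a dead or stale connection:

import MySQLdb

conn = MySQLdb.connect(
    host="db.example.com",      # placeholder connection details
    user="service",
    passwd="secret",
    db="production",
    connect_timeout=10,         # seconds to wait for the initial connect
    read_timeout=30,            # seconds to wait for a result set
    write_timeout=30,           # seconds to wait while sending a query
)

cur = conn.cursor()
# Optionally shorten the server-side idle timeout for this session as well.
cur.execute("SET SESSION wait_timeout = 300")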
