How to close a SolrClient connection?

How to close a SolrClient connection? - python

I am using SolrClient for python with Solr 6.6.2. It works as expected but I cannot find anything in the documentation for closing the connection after opening it.
def getdocbyid(docidlist):
for id in docidlist:
solr = SolrClient('http://localhost:8983/solr', auth=("solradmin", "Admin098"))
doc = solr.get('Collection_Test',doc_id=id)
print(doc)
I do not know if the client closes it automatically or not. If it doesn't, wouldn't it be a problem if several connections are left open? I just want to know if it there is any way to close the connection. Here is the link to the documentation:
https://solrclient.readthedocs.io/en/latest/

The connections are not kept around indefinitely. The standard timeout for any persistent http connection in Jetty is five seconds as far as I remember, so you do not have to worry about the number of connections being kept alive exploding.
The Jetty server will also just drop the connection if required, as it's not required to keep it around as a guarantee for the client. solrclient uses a requests session internally, so it should do pipelining for subsequent queries. If you run into issues with this you can keep a set of clients available as a pool in your application instead, then request an available client instead of creating a new one each time.
I'm however pretty sure you won't run into any issues with the default settings.

Related

Realtime Database connection limit

Played around with a python script and got my connection limit to full by a loop lol.
Any way to clear it? I've restarted PC already.
image

In case of a so-called dirty disconnect, where the client simply disappears without information the server, the Realtime Database server depends on the socket timeout to detect when the client is gone. This may take a few minutes, but aside from that should take no action from your side.

How do I keep a FTP connection alive?

I used ftputil to download a batch of files from a FTP server. It raised the error ftputil.error.FTPIOError: [Errno 60] Operation timed out.
As described in Documentation – ftputil,
keep_alive() attempts to keep the connection to the remote server active in order to prevent timeouts from happening. This method is primarily intended to keep the underlying FTP connection of an FTPHost object alive while a file is uploaded or downloaded. This will require either an extra thread while the upload or download is in progress or calling keep_alive from a callback function.
I called keep_alive from a callback function with,
ftp_host.download(source, target, callback=ftp_host.keep_alive)
but it raised ERROR __main__ keep_alive() takes 1 positional argument but 2 were given.
How do I keep a FTP connection alive?

This isn't directly an answer to your question, but it may help finding an answer for your particular problem yourself. Also, a ticket on the ftputil website is better for help with debugging a problem. That said, I think it's fine to ask on StackOverflow first since you don't know in advance if the problem is a simple one or not. :-)
Since FTP is a stateful protocol, client and server can't send arbitrary commands at a given time. The allowed commands and possibly replies are determined by the state the connection is in. See also the state diagrams in RFC 959.
To work around this limitation, ftputil creates a new FTP connection behind the scenes for each remote file object [1]. With this approach, you can still send commands like chdir or start a download while another is still in progress. However, this means that from the perspective of the server, all these FTP connections that come from a single FTPHost object are independent connections, so each of these connections can have their timeout at different times, depending on the usage pattern of the respective connection.
For example, there was ftputil ticket 141, where presumably the main connection initiated by the FTPHost object timed out while a connection used for downloading was still usable.
In your case, it might be helpful to find out which of the underlying connections is timing out (the initial connection or a connection for a remote file). You can use ftputil.session.session_factory to create factories that have FTP debugging enabled (see the documentation).
Unfortunately, a timeout of 60 seconds is quite short, so there are relatively many chances for timeouts.
Especially given the possibility of timeouts in FTP connections, my advice is to write software for FTP transfers in a way so that you can restart the operation (ideally with a new FTPHost object for robustness) where it was interrupted by the timeout. So far I haven't been able to come up with a way to universally work around timeouts. In simple cases you may actually be better off using ftplib directly, although ftputil has robustness and latency improvements that ftplib doesn't have. Using ftplib doesn't save you from timeouts, but at least you don't have any "hidden" connections that may make debugging more difficult.
[1] That said, if you close a remote file in ftputil, the underlying FTP connection can be reused unless it's not timed out. The library checks for a timeout before it reuses the connection.
The picture regarding timeouts is even more complicated by ftputil caching a lot of information from the server to reduce latency. For example, if you call FTPHost.getcwd(), the current directory is retrieved from a cached attribute, not by sending a PWD command to the server and thereby resetting the timeout. Stat information from directory listings is also usually cached.

After couple hours looking for solutions I got it running without '421 Timeout' errors calling keepalive from separate thread. However your I/O Timeout error probably was caused by connection problems.
import ftputil
from threading import Thread
from time import sleep
fhandle = ftputil.FTPHost('host', 'user', 'pwd')
quitThread = 0
def _thread_keep_alive():
while quitThread == 0:
print("KEEPALIVE!")
fhandle.keep_alive()
sleep(25)
thread = Thread(target = _thread_keep_alive)
thread.start()
# some downloading...
quitThread = 1
fhandle.close()

What happens if a HTTP connection is closed while AppEngine is still running

The real question is if Google App Engine guarantees it would complete a HTTP request even if the connection is no longer existed (such as terminated, lost Internet connection).
Says we have a python script running on Google App Engine:
db.put(status = "Outputting")
print very_very_very_long_string_like_1GB
db.put(status = "done")
If the client decides to close the connection in the middle (too much data coming...), will status = "done" be executed? Or will the instance be killed and all following code be ignored?

If the client breaks the connect, the request will continue to execute. Unless it reaches the deadline of 60 seconds.

GAE uses Pending Queue to queue up requests. If client drops connection and request is already in the queue or being executed, then it will not be aborted. Afaik all other http servres behave the same way.
This will be a real problem when you make requests that change state (PUT, POST, DELETE) on mobile networks. On Edge networks we see about 1% of large requests (uploads, ~500kb) dropped in the middle of request executing (exec takes about 1s): e.g. server gets the data and processes it, but client does not receive response, triggering it to retry. This could produce duplicate data in the DB, breaking integrity of this data.
To alleviate this you will need to make your web methods idempotent: repeating the same method with same arguments does not change state. The easiest way to achieve this would be one of:
Hash relevant data and compare to existing hashes. In you case it would be the string you are trying to save (very_very_very_long_string_like_1GB). You can do this server side.
Client provides unique request-scoped ID, and sever checks if this ID was already used.

Closing and restarting DB connection during web2py request cycle

My db is SQL Server on a remote machine and I am encountering quite a bit of latency.
I have a method in a controller that is structured like this:
def submitData():
parameters = db.site.select(...)
results = some_http_post() # posts data to remote server,
if results:
rec = db.status_table.insert(...)
rec.status_tabl.update(...)
What tends to happen is that some_http_post() takes several seconds to get a response and I run out of threads
When I hit web2py with more than 6 concurrent requests to submitData, I am encountering freezes in requests, rather than getting DB error.
This has the effect of stopping any further web2py requests.
I would ideally like to close the db connection before the call to some_http_post and start another db connection after it, but I don't see a simple way to do this with the DAL API. Is this possible or is there a better optimisation that I could be trying?

Connection pooling is enabled by default.
If you add "pool_size=0" to the connection string this disables connection pooling and I assume the behavior to forcibly open/close each conn. instead of leaving them open.
If you need more threads (sounds like), increase your pool_size and see what happens.
OR, yes, you can use the DAL and do a db.close() and the connection is auto-reopened on first request

Try enabling connection pooling: http://web2py.com/books/default/chapter/29/06#Connection-pooling

MySQLdb execute timeout

Sometimes in our production environment occurs situation when connection between service (which is python program that uses MySQLdb) and mysql server is flacky, some packages are lost, some black magic happens and .execute() of MySQLdb.Cursor object never ends (or take great amount of time to end).
This is very bad because it is waste of service worker threads. Sometimes it leads to exhausting of workers pool and service stops responding at all.
So the question is: Is there a way to interrupt MySQLdb.Connection.execute operation after given amount of time?

if the communication is such a problem, consider writing a 'proxy' that receives your SQL commands over the flaky connection and relays them to the MySQL server on a reliable channel (maybe running on the same box as the MySQL server). This way you have total control over failure detection and retrying.

You need to analyse exactly what the problem is. MySQL connections should eventually timeout if the server is gone; TCP keepalives are generally enabled. You may be able to tune the OS-level TCP timeouts.
If the database is "flaky", then you definitely need to investigate how. It seems unlikely that the database really is the problem, more likely that networking in between is.
If you are using (some) stateful firewalls of any kind, it's possible that they're losing some of the state, thus causing otherwise good long-lived connections to go dead.
You might want to consider changing the idle timeout parameter in MySQL; otherwise, a long-lived, unused connection may go "stale", where the server and client both think it's still alive, but some stateful network element in between has "forgotten" about the TCP connection. An application trying to use such a "stale" connection will have a long wait before receiving an error (but it should eventually).

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.