Pylons error - 'MySQL server has gone away' - python

I'm using Pylons (a python framework) to serve a simple web application, but it seems to die from time to time, with this in the error log: (2006, 'MySQL server has gone away')
I did a bit of checking, and saw that this was because the connections to MySQL were not being renewed. This shouldn't be a problem though, because the sqlalchemy.pool_recycle in the config file should automatically keep it alive. The default was 3600, but I dialed it back to 1800 because of this problem. It helped a bit, but 3600 should be fine according to the docs. The errors still happen semi-regularly. I don't want to lower it too much though and DOS my own database :).
Maybe something in my MySQL config is goofy? Not sure where to look exactly.
Other relevant details:
Python 2.5
Pylons: 0.9.6.2 (w/ sql_alchemy)
MySQL: 5.0.51

I think I fixed it. It turns out I had a simple config error. My ini file read:
sqlalchemy.default.url = [connection string here]
sqlalchemy.pool_recycle = 1800
The problem is that my environment.py file declared that the engine would only map keys with the prefix sqlalchemy.default, so the sqlalchemy.pool_recycle setting was ignored.
The solution is to simply change the second line in the ini to:
sqlalchemy.default.pool_recycle = 1800
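To see why the original key was silently dropped, here is a minimal sketch of prefix-based option filtering, in the spirit of what a call like engine_from_config(config, prefix=...) does. It is not SQLAlchemy's actual code: options_for_prefix is a hypothetical helper, the connection URL is a placeholder, and the trailing dot on the prefix is an assumption.

```python
def options_for_prefix(config, prefix):
    """Return the settings whose keys start with `prefix`, with the prefix stripped."""
    return {
        key[len(prefix):]: value
        for key, value in config.items()
        if key.startswith(prefix)
    }

# Settings as in the broken ini file (the URL is a placeholder).
broken = {
    "sqlalchemy.default.url": "mysql://user:password@localhost/app",
    "sqlalchemy.pool_recycle": "1800",
}

# Settings after the fix: pool_recycle now carries the same prefix as the URL.
fixed = {
    "sqlalchemy.default.url": "mysql://user:password@localhost/app",
    "sqlalchemy.default.pool_recycle": "1800",
}

broken_opts = options_for_prefix(broken, "sqlalchemy.default.")
fixed_opts = options_for_prefix(fixed, "sqlalchemy.default.")

print(sorted(broken_opts))  # pool_recycle is missing here
print(sorted(fixed_opts))   # pool_recycle is picked up here
```

Because the unmatched key is skipped rather than rejected, nothing warns you that the setting never reached the engine.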

You might want to check MySQL's timeout variables:
show variables like '%timeout%';
You're probably interested in wait_timeout (less likely but possible: interactive_timeout). On Debian and Ubuntu, the defaults are 28800 (MySQL kills connections after 8 hours), but maybe the default for your platform is different or whoever administrates the server has configured things differently.
AFAICT, pool_recycle doesn't actually keep the connections alive; it expires them on its own before MySQL kills them. I'm not familiar with Pylons, but if making the connections intermittently run a SELECT 1; is an option, that will keep them alive at the cost of basically no server load and minimal network traffic. One final thought: are you somehow managing to use a connection that Pylons thinks has expired?
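The SELECT 1 idea can be sketched against any DB-API connection. This is only an illustration: ping and get_live_connection are hypothetical helper names, and sqlite3 is used purely as a stand-in so the example runs without a MySQL server. With MySQLdb the same pattern should apply, but that is an assumption and is not tested here.

```python
import sqlite3

def ping(conn):
    """Return True if a trivial query succeeds on this DB-API connection."""
    try:
        cur = conn.cursor()
        cur.execute("SELECT 1")
        cur.fetchone()
        cur.close()
        return True
    except Exception:
        # Broad on purpose: each driver raises its own error type for a dead connection.
        return False

def get_live_connection(conn, connect):
    """Return `conn` if it still answers, otherwise a new connection from `connect()`."""
    return conn if ping(conn) else connect()

def connect():
    # Stand-in connection factory; a real application would connect to MySQL here.
    return sqlite3.connect(":memory:")

conn = connect()
alive_before = ping(conn)   # the connection is open, so the query succeeds
conn.close()                # simulate the server dropping the connection
alive_after = ping(conn)    # the query now fails and ping reports it
fresh = get_live_connection(conn, connect)  # a dead connection is replaced
```

Running the check periodically keeps an idle connection active; running it just before use detects a connection that has already been killed.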

Related

insufficient data in "D" message

I'm using SQLAlchemy scoped sessions to work with a postgresql 9.4 database.
Sometimes I get an error that says "DatabaseError: (DatabaseError) insufficient data in "D" message". I cannot reproduce this error and it happens in an unpredictable way.
After looking at the postgres log files, I found that this error occurs shortly after postgresql logs "could not receive data from client: Connection reset by peer". I guess that means that the connection was cut from the application side. But I don't see anything that could cause this.
It's time to break out your network tools. You have errors on both ends, which suggests something caused your connection to drop.
It might be hardware, drivers, some bug in your software stack or a proxy / firewall deciding it didn't like the look of your connection and killed it. It's unlikely to be PostgreSQL itself or any of your Python code.
Fire up tcpdump or wireshark and take a look at the packets going back and forth, ideally on both ends of the connection. That should give you a good indication of where the problem is.

Locust.io Load Testing getting "Connection aborted BadStatusLine" Errors

I'm using Locust.io to load test an application. I occasionally get one of two errors whose cause I am unable to pinpoint:
1)
ConnectionError(ProtocolError('Connection aborted.', BadStatusLine("''",)),)
2)
ConnectionError(ProtocolError('Connection aborted.', error(104, 'Connection reset by peer')),)
The first one happens a few times every 1,000,000 requests or so and seems to happen in groups: there will be 5-20 all at once and then it is fine. The second only happens every couple of days or so.
CPU and memory usage are well below the maximum on the database server, the app server, and the machine running locust.io.
The servers are medium-sized Linode servers running Ubuntu 14.04. The app is Django and the database is PostgreSQL. I have already increased the maximum open file limit but am wondering if something else needs to be increased on the server that could be leading to the occasional errors.
From what I have been able to gather from searching, the error might have something to do with the python requests library.
Any help would be greatly appreciated.
BadStatusLine is most likely a server side issue. See for example this answer https://stackoverflow.com/a/1767954/1591921 It could be some sort of flood/DoS protection on the server.
Connection reset by peer could also be any number of things, but it is most likely a server/network issue, not an issue on the loadgen side (perhaps connections are idle for too long, or there is a max connection age somewhere)
I don't think there are any general answers to this question; it all depends on your system under test.

How to get instance of database and close it? Tornado

I'm having trouble with MySQL timing out and going away after 8 hours. I am using google app engine as a host. My Python script uses the Tornado framework.
Right now I instantiate my MySQL db connection before any functions right at the top of the main server script. Once I deploy that, the clock starts ticking and 8 hours or so later, MySQL will go away and I will have to deploy my script again.
I haven't been using db.close() at all because I hear that restarting the database connection takes a long time. Is this true? Or is there a proper way to use db.close()?
One of my friends suggested I try getting the database instance and then closing it after each function. Is that recommended, and where might I find some tutorials on that?
I'm mostly looking for resources here, but if someone wants to lay it out for me that would be awesome.
Thank you all in advance.
The connection is going away because of the wait_timeout session variable, which is the number of seconds the server waits for activity on a noninteractive connection before closing it.
http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#sysvar_wait_timeout
If you are not reusing the same connection very frequently, a good approach is to close the connection each time and create a new one when you need it; otherwise you can increase the value of wait_timeout.
Establishing a connection to a MySQL database should be quite fast and it is certainly good practice to keep the connection open only for as long as you need it.
I am not certain why your connection should be non-responsive for 8 hours - have you tried checking your settings?
The correct command in Python is connection.close().
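The "open it, use it, close it" approach can be sketched with contextlib.closing, which calls close() even if the query raises. This is a minimal sketch under assumptions: fetch_one and connect are hypothetical names, and sqlite3 stands in for the MySQL driver so the example runs on its own.

```python
import sqlite3
from contextlib import closing

def connect():
    # Stand-in connection factory; a real application would connect to MySQL here.
    return sqlite3.connect(":memory:")

def fetch_one(connect, sql):
    """Open a connection, run one query, and close everything before returning."""
    with closing(connect()) as conn:
        with closing(conn.cursor()) as cur:
            cur.execute(sql)
            return cur.fetchone()

row = fetch_one(connect, "SELECT 1")
print(row)
```

Since no connection outlives the function call, none of them sits idle long enough for wait_timeout to apply.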

SQLAlchemy Connection Pooling Problems - Postgres on Windows

I'm using SQLAlchemy 0.6.6 against a Postgres 8.3 DB on Windows 7 and Python 2.6. I am leaving the defaults for configuring pooling when I create my engine, which are pool_size=5, max_overflow=10.
For some reason, the connections keep piling up and I intermittently get "Too many clients" from PG. I am positive that connections are being closed in a finally block as this application is only accessed via WSGI (CherryPy) and uses a connection/request pattern. I am also logging when connections are being closed just to make sure.
I've tried to see what's going on by adding echo_pool=True during my engine creation, but nothing is being logged. I can see SQL statements rolling through the console when I set echo=True, but nothing for pooling.
Anyway, this is driving me crazy because my co-worker who is on a Mac doesn't have any of these issues (I know, get a Mac), so I'm trying to see if this is the result of a bug or something. Google has yielded nothing so I'm hoping to get some help here.
Thanks,
cc
It turns out a ScopedSession was being used outside the normal application flow, and its close wasn't in a finally block.
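The difference a finally block makes shows up on the error path. The sketch below uses a hypothetical StubSession class in place of a real SQLAlchemy session (it only records whether close() ran), and handle_request is an assumed name for whatever code path uses the session.

```python
class StubSession:
    """Hypothetical stand-in for a session; it only tracks whether close() ran."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

def handle_request(session, work):
    """Run `work` with the session and always close the session afterwards."""
    try:
        return work(session)
    finally:
        # Runs on success and on failure, so the connection goes back to the pool.
        session.close()

def failing_work(session):
    raise RuntimeError("simulated failure mid-request")

session = StubSession()
try:
    handle_request(session, failing_work)
except RuntimeError:
    pass  # the error still propagates to the caller

print(session.closed)
```

Without the finally, every request that raised would leave one connection checked out, which matches the slow pile-up described in the question.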

MySQLdb execute timeout

Sometimes in our production environment the connection between a service (a Python program that uses MySQLdb) and the MySQL server becomes flaky: some packets are lost, some black magic happens, and .execute() on a MySQLdb.Cursor object never returns (or takes a very long time to return).
This is very bad because it wastes service worker threads. Sometimes it exhausts the worker pool and the service stops responding entirely.
So the question is: is there a way to interrupt a MySQLdb.Connection.execute operation after a given amount of time?
If the communication is such a problem, consider writing a 'proxy' that receives your SQL commands over the flaky connection and relays them to the MySQL server on a reliable channel (maybe running on the same box as the MySQL server). This way you have total control over failure detection and retrying.
You need to analyse exactly what the problem is. MySQL connections should eventually time out if the server is gone; TCP keepalives are generally enabled. You may be able to tune the OS-level TCP timeouts.
If the database is "flaky", then you definitely need to investigate how. It seems unlikely that the database really is the problem, more likely that networking in between is.
If you are using (some) stateful firewalls of any kind, it's possible that they're losing some of the state, thus causing otherwise good long-lived connections to go dead.
You might want to consider changing the idle timeout parameter in MySQL; otherwise, a long-lived, unused connection may go "stale", where the server and client both think it's still alive, but some stateful network element in between has "forgotten" about the TCP connection. An application trying to use such a "stale" connection will have a long wait before receiving an error (but it should eventually).
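For reference, this is roughly what TCP keepalive tuning looks like at the socket level in Python. It only illustrates the OS options on a plain socket and does not show how to apply them to a MySQLdb connection. enable_keepalive is a hypothetical helper, the timing values are arbitrary examples, and the TCP_KEEP* constants are platform-dependent, so each one is set only if the socket module defines it.

```python
import socket

def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive and, where the platform allows it, tune its timing."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    tuning = (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # failed probes before the connection is dropped
    )
    for name, value in tuning:
        option = getattr(socket, name, None)
        if option is not None:  # not every platform defines these constants
            sock.setsockopt(socket.IPPROTO_TCP, option, value)

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

With probes like these, a connection that a stateful firewall has "forgotten" is reported as dead after roughly idle + interval * count seconds rather than hanging for the OS default.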
