How to reduce the number of client connections in MySQL? - python

Recently our testing system is experiencing the error:
Error connecting to database: (1040, 'Too many connections')
Due to memory limitations we cannot raise the current max_connections much further, and adding more memory to the server would only be a short-term fix.
The database is MySQL and we are using Python for the database programming. Basically, the Python code controls a test run on a specific node and stores the result in the DB: open a connection, store the data, close the connection. Every day we have millions of tests to run and store. I wonder whether there is a way, methodology, or philosophy I could follow, from the programming side, to reduce the number of connections.
Also, what is the definition of one client connection? Is it counted per machine, or per core in total?
Thanks.
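For what it's worth, one common way to cut the connection count in an open/store/close workload like this is to share a small pool of connections across tests instead of opening a fresh one per test. A minimal sketch using mysql-connector-python's built-in pooling (pool size, credentials, and the table/statement are placeholders, not the actual schema):
import mysql.connector.pooling

# A single pool shared by the whole test-runner process; the server only
# ever sees pool_size connections, no matter how many tests run per day.
pool = mysql.connector.pooling.MySQLConnectionPool(
    pool_name="test_results",
    pool_size=5,                      # placeholder: tune to what memory allows
    host="db.example.com",            # placeholder connection details
    user="tester",
    password="secret",
    database="results",
)

def store_result(node_id, outcome):
    """Borrow a pooled connection, store one test result, and return it."""
    conn = pool.get_connection()      # reuses an idle connection if available
    try:
        cur = conn.cursor()
        cur.execute(
            "INSERT INTO test_results (node_id, outcome) VALUES (%s, %s)",
            (node_id, outcome),
        )
        conn.commit()
        cur.close()
    finally:
        conn.close()                  # returns the connection to the pool
As for what counts as one client connection: the server counts each open client session (each socket from a client process), not machines or cores, so a single machine that opens ten connections uses ten of the limit.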

Related

cx_Oracle Query Performance

I plan to use Python at my job to connect directly to our main system production database. However, the IT department is reluctant, since apparently there is no easy way to control how much I can query the database, and they are worried that I could affect performance for the rest of the users.
Is there a way to limit frequency of queries from python connection to the database? Or some other method that I can "sell" to my IT department so they will let me connect directly to production DB via python?
Many thanks
The Database Resource Manager gives quite a few options for this, depending on how the production usage compares with what you will be adding. This does not depend on the type of client.
https://blogs.oracle.com/db/oracle-resource-manager-and-dbmsresourcemanager
Often a plan is created in which an order of resource limiting is specified: regular production gets most of the resources, and your project sits a class lower. If production is running, your session(s) get whatever production leaves over.
Also very nice is the cost estimation, which allows a query deemed too expensive to be cancelled.
A bit of thought must be given to slow, long-running transactions that hold blocking locks. It does take a bit of experimentation to get this right.
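On the client side, a crude complement is to throttle your own session so it can never issue queries faster than some agreed rate. A minimal sketch assuming cx_Oracle and an arbitrary one-query-per-second budget (the credentials, DSN, interval, and table name are placeholders):
import time
import cx_Oracle

MIN_INTERVAL = 1.0          # placeholder: allow at most one query per second
_last_query = 0.0

def throttled_execute(cursor, sql, params=None):
    """Run a query, sleeping first if we queried too recently."""
    global _last_query
    wait = MIN_INTERVAL - (time.monotonic() - _last_query)
    if wait > 0:
        time.sleep(wait)
    _last_query = time.monotonic()
    if params is None:
        return cursor.execute(sql)
    return cursor.execute(sql, params)

conn = cx_Oracle.connect(user="me", password="secret", dsn="prod")  # placeholders
cur = conn.cursor()
throttled_execute(cur, "SELECT COUNT(*) FROM some_table")
print(cur.fetchone())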

Why does PostgreSQL say FATAL: sorry, too many clients already when I am nowhere close to the maximum connections?

I am working with an installation of PostgreSQL 11.2 that periodically complains in its system logs
FATAL: sorry, too many clients already
despite being nowhere close to its configured limit of connections. This query:
SELECT current_setting('max_connections') AS max,
COUNT(*) AS total
FROM pg_stat_activity
tells me that the database is configured for a maximum of 100 connections. I have never seen this query report more than about 45 connections into the database, not even moments before a running program receives a "too many clients" database error backed by the above message in the Postgres logs.
Absolutely everything I can find on this issue on the Internet suggests that the error means you have exceeded the max_connections setting, but the database itself tells me that I have not.
For what it's worth, pyspark is the only database client that triggers this error, and only when it's writing into tables from dataframes. The regular python code using psycopg2 (that is the main client) never triggers it (not even when writing into tables in the same manner from Pandas dataframes), and admin tools like pgAdmin also never trigger it. If I didn't see the error in the database logs directly, I would think that Spark is lying to me about the error. Most of the time, if I use a query like this:
SELECT pg_terminate_backend(pid) FROM pg_stat_activity
WHERE pid <> pg_backend_pid() AND application_name LIKE 'pgAdmin%';
then the problem goes away for several days. But like I said, I've never seen even 50% of the supposed max of 100 connections in use, according to the database itself. How do I figure out what is causing this error?
This is caused by how Spark reads/writes data using JDBC. Spark tries to open several concurrent connections to the database in order to read/write multiple partitions of data in parallel.
I couldn't find it in the docs, but I think that by default the number of connections is equal to the number of partitions in the dataframe you want to write into the DB table. This explains the intermittency you've noticed.
However, you can control this number by setting the numPartitions option:
The maximum number of partitions that can be used for parallelism in table reading and writing. This also determines the maximum number of concurrent JDBC connections. If the number of partitions to write exceeds this limit, we decrease it to this limit by calling coalesce(numPartitions) before writing.
Example:
(spark.read.format("jdbc")
    .option("numPartitions", "20")
    # ... remaining options (url, dbtable, credentials), then .load()
)
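Since it is the write path from dataframes that triggers the error in the question, the same idea applies on write, either via the numPartitions option or by coalescing the dataframe first. A sketch with placeholder connection details, table name, and partition count:
(df.coalesce(8)                       # cap concurrent JDBC connections at 8
   .write.format("jdbc")
   .option("url", "jdbc:postgresql://dbhost:5432/mydb")   # placeholder URL
   .option("dbtable", "public.results")                   # placeholder table
   .option("user", "writer")
   .option("password", "secret")
   .option("numPartitions", "8")      # same cap applied on the writer side
   .mode("append")
   .save())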
Three possibilities:
The connections are very short-lived, and they were already gone by the time you looked.
You have a lower connection limit on that database.
You have a lower connection limit on the database user.
But options 2 and 3 would result in a different error message, so it must be the short-lived connections.
Whatever it is, the answer to your problem would be a well-configured connection pool.
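As a concrete illustration of that last point, a small client-side pool keeps the number of simultaneous backends bounded no matter how bursty the callers are. A minimal sketch with psycopg2's built-in pool (the DSN and pool sizes are placeholders):
from psycopg2.pool import ThreadedConnectionPool

# At most 10 backends on the server from this process, ever.
pool = ThreadedConnectionPool(
    minconn=1,
    maxconn=10,
    dsn="dbname=mydb user=app host=dbhost",   # placeholder DSN
)

conn = pool.getconn()
try:
    with conn.cursor() as cur:
        cur.execute("SELECT count(*) FROM pg_stat_activity")
        print(cur.fetchone())
    conn.commit()
finally:
    pool.putconn(conn)    # hand the backend back instead of closing it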

Inconsistently slow queries in production (RDS)

First, the server setup:
nginx frontend to the world
gunicorn running a Flask app with gevent workers
Postgres database, connection pooled in the app, running from Amazon RDS, connected with psycopg2 patched to work with gevent
The problem I'm encountering is inexplicably slow queries that are sometimes running on the order of 100ms or so (ideal), but which often spike to 10s or more. While time is a parameter in the query, the difference between the fast and slow query happens much more frequently than a change in the result set. This doesn't seem to be tied to any meaningful spike in CPU usage, memory usage, read/write I/O, request frequency, etc. It seems to be arbitrary.
I've tried:
Optimizing the query - definitely valid, but it runs quite well locally, as well as any time I've tried it directly on the server through psql.
Running on a larger/better RDS instance - I'm currently working on an m3.medium instance with PIOPS and not coming close to that read rate, so I don't think that's the issue.
Tweaking the number of gunicorn workers - I thought this could be an issue, if the psycopg2 driver is having to context switch excessively, but this had no effect.
More - I've been working for a decent amount of time at this, so these were just a couple of the things I've tried.
Does anyone have ideas about how to debug this problem?
This is what shared tenancy gets you: unpredictable results.
What is the size of the data set the queries run on? Although Craig says it sounds like bursty checkpoint activity, that doesn't make sense because this is RDS. It sounds more like cache fallout, e.g. your relations are falling out of cache.
You say you are running PIOPS, but m3.medium is not an EBS-optimized instance.
You need at least:
A higher instance class; make sure you have more memory than the active data set.
An EBS-optimized instance, see here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSOptimized.html
Lots of memory.
PIOPS
By the time you have all of that, you will realize you would save a ton of money by pushing PostgreSQL (or any database) to bare metal and leaving AWS to what it is good at, memory and CPU (not I/O).
You could try this from within psql to get more details on query timing
EXPLAIN sql_statement
Also turn on more database logging. MySQL has a slow query log; PostgreSQL's equivalent is the log_min_duration_statement setting (and the auto_explain module).
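If it helps, the same check can be scripted from the app side so you can capture a plan at the moment a request is actually slow. A sketch with psycopg2, where the DSN, table, and query are placeholders (note that EXPLAIN ANALYZE actually executes the statement, so point it at a read-only query):
import psycopg2

conn = psycopg2.connect("dbname=mydb user=app host=rds-endpoint")  # placeholder DSN
with conn.cursor() as cur:
    # ANALYZE + BUFFERS report real execution time and cache hits/misses,
    # which helps distinguish "bad plan" from "data fell out of cache".
    cur.execute(
        "EXPLAIN (ANALYZE, BUFFERS) "
        "SELECT count(*) FROM events WHERE ts > now() - interval '1 hour'"  # placeholder query
    )
    for (line,) in cur.fetchall():
        print(line)
conn.close()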

Persistant MySQL connection in Python for social media harvesting

I am using Python to stream large amounts of Twitter data into a MySQL database. I anticipate my job running over a period of several weeks. I have code that interacts with the twitter API and gives me an iterator that yields lists, each list corresponding to a database row. What I need is a means of maintaining a persistent database connection for several weeks. Right now I find myself having to restart my script repeatedly when my connection is lost, sometimes as a result of MySQL being restarted.
Does it make the most sense to use the mysqldb library, catch exceptions and reconnect when necessary? Or is there an already made solution as part of sqlalchemy or another package? Any ideas appreciated!
I think the right answer is to try to handle the connection errors; it sounds like you'd only be pulling in a much larger library just for this one feature, and try/except-and-reconnect is probably how it's done at whatever level of the stack it lives. If necessary, you could multithread these things since they're probably I/O-bound (i.e. suitable for Python GIL threading as opposed to multiprocessing) and decouple the production and the consumption with a queue, too, which would take some of the load off of the database connection.
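A minimal sketch of that try/except-and-reconnect approach, using mysql-connector-python here purely for illustration (MySQLdb would look much the same; the credentials, table, and retry counts are placeholders):
import time
import mysql.connector
from mysql.connector import Error as MySQLError

def connect():
    # placeholder credentials
    return mysql.connector.connect(
        host="localhost", user="harvest", password="secret", database="tweets"
    )

conn = connect()

def store_row(row, retries=5):
    """Insert one row, transparently reconnecting if the server went away."""
    global conn
    for attempt in range(retries):
        try:
            cur = conn.cursor()
            cur.execute("INSERT INTO tweets (id, text) VALUES (%s, %s)", row)
            conn.commit()
            cur.close()
            return
        except MySQLError:
            time.sleep(2 ** attempt)        # back off, then rebuild the connection
            try:
                conn.close()
            except MySQLError:
                pass
            conn = connect()
    raise RuntimeError("giving up after %d attempts" % retries)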

Postgres 8.4.4 + psycopg2 + python 2.6.5 + Win7 instability

You can see the combination of software components I'm using in the title of the question.
I have a simple 10-table database running on a Postgres server (Win 7 Pro). I have client apps (Python, using psycopg2 to connect to Postgres) that connect to the database at random intervals to conduct relatively light transactions. There is only one client app at a time doing any kind of heavy transaction, and those are typically < 500 ms. The rest of them spend more time connecting than actually waiting for the database to execute the transaction. The point is that the database is under light load, but the load is evenly split between reads and writes.
My client apps run as servers/services themselves. I've found that it is pretty common for me to be able to (1) take the Postgres server completely down, and (2) ruin the database by killing the client app with a keyboard interrupt.
By (1), I mean that the Postgres process on the server aborts and the service needs to be restarted.
By (2), I mean that the database crashes again whenever a client tries to access the database after it has restarted and (presumably) finished "recovery mode" operations. I need to delete the old database/schema from the database server, then rebuild it each time to return it to a stable state. (After recovery mode, I have tried various combinations of Vacuums to see whether that improves stability; the vacuums run, but the server will still go down quickly when clients try to access the database again.)
I don't recall seeing the same effect when I kill the client app using a "taskkill" - only when using a keyboard interrupt to take the python process down. It doesn't happen all the time, but frequently enough that it's a major concern (25%?).
Really surprised that anything on a client would actually be able to take down an "enterprise class" database. Can anyone share tips on how to improve robustness, and hopefully help me to understand why this is happening in the first place? Thanks, M
If you're having problems with postgresql acting up like this, you should read this page:
http://wiki.postgresql.org/wiki/Guide_to_reporting_problems
For an example of a real bug, and how to ask a question that gets action and answers, read this thread.
http://archives.postgresql.org/pgsql-general/2010-12/msg01030.php
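Independent of filing a proper bug report as suggested above, one cheap robustness measure on the client side is to make sure a keyboard interrupt still rolls back and closes the connection instead of dying mid-transaction. A minimal sketch with psycopg2 (the DSN, table, and statement are placeholders):
import psycopg2

conn = psycopg2.connect("dbname=app user=svc host=dbserver")  # placeholder DSN
try:
    with conn.cursor() as cur:
        cur.execute("UPDATE jobs SET state = 'running' WHERE id = %s", (42,))
    conn.commit()
except KeyboardInterrupt:
    conn.rollback()      # don't leave the transaction dangling on Ctrl-C
    raise
finally:
    conn.close()         # always give the server a clean disconnect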
