psycopg2 and infinite python script

I have a Python script that runs in an infinite loop, connects to PostgreSQL, and inserts a record whenever a person appears in front of the camera connected to my computer.
I would like to know the best way to connect to the database: do I need to connect and close every time a person appears, or can I keep the connection open somehow? When I create a connection before the infinite loop and there is no activity in front of the camera, the connection sits idle, and by the time the script wants to insert a new row, the connection has been closed. When I connect every time I want to insert a new row, there is no problem, but this is slower.
Thank you for any suggestions.

A connection pool works well for this kind of thing. I have not worked with it in production (using mainly Django or SQLAlchemy), but psycopg2.pool includes a few different implementations (SimpleConnectionPool or PersistentConnectionPool) that would probably fit your need. Generally speaking, a pool not only helps with managing connections as a shared resource, but also testing and re-initializing the connection when it's needed.
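from psycopg2 import pool

# minconn, maxconn, and dbopts are placeholders for your own settings.
conn_pool = pool.PersistentConnectionPool(minconn, maxconn, **dbopts)

def work_method():
    conn = conn_pool.getconn()
    with conn.cursor() as stmt:
        stmt.execute(sql)
    conn.commit()  # needed for INSERT/UPDATE unless autocommit is enabled
    conn_pool.putconn(conn)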
The putconn call is extremely important: if an exception skips it, the pool keeps thinking the connection is still in use. It is better to handle it with a context manager that returns the connection in a finally block:
import contextlib

@contextlib.contextmanager
def get_db_connection():
    conn = conn_pool.getconn()
    try:
        yield conn
    finally:
        # Return the connection even if the with-block raised.
        conn_pool.putconn(conn)

def work_method():
    with get_db_connection() as conn:
        with conn.cursor() as stmt:
            stmt.execute(sql)
        conn.commit()  # needed for INSERT/UPDATE unless autocommit is enabled
Hope that helps.
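Independently of the pool, the idle-connection problem from the question can also be handled by keeping one connection and reopening it when a statement fails. The following is only a minimal sketch of that idea, not part of psycopg2: the class name `ReconnectingInserter`, the `connect` factory, and the `retry_on` tuple are all assumptions made for illustration. With psycopg2 one would presumably pass something like `lambda: psycopg2.connect(**dbopts)` and `(psycopg2.OperationalError, psycopg2.InterfaceError)`.

```python
class ReconnectingInserter:
    """Keeps one DB-API connection open and reopens it if a statement fails.

    `connect` is a hypothetical zero-argument factory that returns a new
    connection; `retry_on` is the tuple of exceptions that signal a dead
    connection for the driver in use. Both are assumptions of this sketch.
    """

    def __init__(self, connect, retry_on):
        self._connect = connect
        self._retry_on = retry_on
        self._conn = None

    def _drop(self):
        """Discard the current connection, ignoring errors while closing."""
        if self._conn is not None:
            try:
                self._conn.close()
            except Exception:
                pass
            self._conn = None

    def execute(self, sql, params=()):
        """Run one statement and commit, reconnecting once on failure."""
        for attempt in range(2):
            try:
                if self._conn is None:
                    self._conn = self._connect()
                cur = self._conn.cursor()
                cur.execute(sql, params)
                self._conn.commit()
                cur.close()
                return
            except self._retry_on:
                # The connection is probably dead: drop it, then either
                # retry with a fresh one or give up on the second failure.
                self._drop()
                if attempt == 1:
                    raise
```

One caveat of retrying: if the failure happens after the server has already committed, the retry can insert the same row twice, so the table may need a unique key if duplicates matter.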

Related

MySQLConnector (Python): New DB connection for each query vs. one single connection

I have this problem: I'm writing some Python scripts and while, up until now, I had no problems at all using a single MySQLConnector connection throughout the entire script (only closing it at the end of the script), lately I'm having some problems.
If I create a connection at the beginning of the script, something like (ignore the security concerns, I know):
db_conn = mysql.connector.connect(user='root', password='myPassword', host='127.0.0.1', database='my_db', autocommit=True)
and then always use it like:
db_conn.cursor(buffered=True).execute(...)
or fetch and other methods, I will get errors like:
Failed executing the SQL query: MySQL Connection not available.
OR
Failed executing the SQL query: No result set to fetch from.
OR
OperationalError: (2013, 'Lost connection to MySQL server during query')
The code is correct, I just don't understand why this happens. Maybe because I'm concurrently running the same function multiple times (tried with 2), in async, so maybe the concurrent access to the cursor causes this?
I found that someone fixed it by using a different DB connection every time.
I tried creating a new connection for every single query to the DB and now there are no errors at all. It works fine, but it seems like overkill. Imagine calling the async function 10 or 100 times: a lot of DB connections would be created. Will it cause problems? Will it run out of memory? I also guess it will slow things down.
Is there a way to solve it by keeping the same connection for all the queries? Why does that happen?

MySQL's protocol is stateful (more like FTP than HTTP in this way). This means that if multiple concurrent async routines send and receive packets on the same MySQL connection, the protocol can't handle it. The server and client will get confused, because messages will arrive in the wrong order.
What I mean is that if different async routines try to use the database connection at the same time, you can easily get into trouble:
async1: sends query "select * from table1"
async2: sends query "insert into table2 ..."
async1: expects to fetch rows of a result set, but receives only rows-affected and last insert id
It gets worse from there. For example, a query cannot execute while an earlier query still has an open result set. Or even worse, you could prepare two queries that have parameters, then send parameters for the wrong query.
You can use the same database connection for many queries, but DO NOT share the same connection among concurrently executing async threads. To be safe, each async routine should open its own connection. Then the thread that opened a given connection can use that connection for multiple queries.
Think of it like a call center, where dozens of people each have their own phone line. They certainly should not try to share a single phone line and carry on multiple conversations! The only way that could work is if every word uttered on the phone carried some identifying information for which conversation it belonged to. "Hi, this is Mr. Smith calling about case #1234, and the answer to the question you just asked me is..."
But MySQL's protocol doesn't do that. It assumes that each message is a continuation of the previous one, and both client and server remember what that is.
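The "one connection per concurrent routine" rule can be sketched roughly as follows. This is an illustration under assumptions, not the asker's code: `run_query`, `run_concurrently`, and the `connect` factory are hypothetical names, and `asyncio.to_thread` requires Python 3.9 or later. With MySQL Connector the factory would presumably be something like `lambda: mysql.connector.connect(...)`.

```python
import asyncio
import contextlib


def run_query(connect, sql, params=()):
    """Open a private connection, run one query, and always close it.

    `connect` is a hypothetical zero-argument factory returning a new
    DB-API connection; no other task ever sees this connection.
    """
    with contextlib.closing(connect()) as conn:
        cur = conn.cursor()
        cur.execute(sql, params)
        # Statements without a result set (INSERT, UPDATE, ...) have no
        # description, so only fetch when there is something to fetch.
        rows = cur.fetchall() if cur.description is not None else []
        conn.commit()
        return rows


async def run_concurrently(connect, queries):
    """Run (sql, params) pairs concurrently, one connection per query."""
    jobs = [
        asyncio.to_thread(run_query, connect, sql, params)
        for sql, params in queries
    ]
    # gather returns the results in the same order as `queries`.
    return await asyncio.gather(*jobs)
```

Because every call opens its own connection, no two routines ever interleave packets on the same one. The cost is one connection per in-flight query, which stays bounded only if the number of concurrent calls does.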

How to handle AWS-RDS connections within my application?

I'm looking for some input on how to handle my connection to AWS-RDS. Should I open and close the connection each time I execute a query? Should I use a lambda function, and why?
I currently have it set up so that the connection remains open and all executions go through it. I never close the connection and have no timeouts.
conn = pymysql.connect(db=dbname, host=host, port=port, user=user,
                       password=password)
cur = conn.cursor()
I then have query executions throughout the code, like this:
cur.execute("SELECT product, amount, total " +
            "FROM " + table +
            " WHERE po_date BETWEEN %s AND %s",
            (cur_month, next_month))

This depends on your application preferences.
Global connection - If you create the connection at the global level, you save the cost of opening a connection each time you need to access the database, but you use more memory on the database server, which has to maintain the open connection. If the application does not close the connection on exit, the database has to time out the idle connection and kill it on its own. You will also need retry logic in the application to make sure the connection is still alive before using it.
Connect each time - Adds the overhead of creating and closing the connection. Uses extra CPU on the client and database side to open and close the connection, but keeps the connection count lower.
As for using Lambda, that depends entirely on the application design. But I would say yes, use it when you can!
If you want to use Lambda to connect to a database, you will need to build a deployment package or a Lambda layer that includes the SQL client. Here are some links with step-by-step instructions for creating these for Python with pymysql. If needed, you can substitute another SQL client for pymysql using the same instructions.
https://geektopia.tech/post.php?blogpost=Create_Lambda_Package_Python
https://geektopia.tech/post.php?blogpost=Create_Lambda_Layer_Python

using sqlite3 in multithreading

I am writing a server program for many clients, and I use threads.
Every client can perform an action that requires writing to or reading from a SQLite database.
Do I need to open and close a connection for every action, or should I open the database once and have all the clients share one connection?
Example for my code:
if command == "s":
    conn = open_database()  # connect to the database
    cursor = conn.cursor()
    cursor.execute('''SELECT s FROM users WHERE username=?''', (username,))
    s = cursor.fetchone()[0]
    conn.close()
    if not s:
        s = "Empty!"
    clientsock.send(str(s))
I also run INSERT commands against the database.

One connection has exactly one transaction, so your program is likely to blow up when multiple threads try to share the same connection without locking around all transactions.
Use one connection per thread.
(If you need high concurrency, SQLite might not be the best choice.)
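A minimal sketch of "one connection per thread" with the standard sqlite3 module might look like this. The table layout, the `init_db`, `lookup`, `serve_client`, and `serve_all` names, and the sample row are assumptions for illustration; only the `users` table and the `s` and `username` columns come from the question.

```python
import sqlite3
import threading


def init_db(db_path):
    """Create the example table; the schema here is an assumption."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS users "
                "(username TEXT PRIMARY KEY, s TEXT)"
            )
            conn.execute(
                "INSERT OR REPLACE INTO users VALUES (?, ?)",
                ("alice", "hello"),
            )
    finally:
        conn.close()


def lookup(conn, username):
    """Return the stored value, or "Empty!" if it is missing or empty."""
    row = conn.execute(
        "SELECT s FROM users WHERE username=?", (username,)
    ).fetchone()
    return row[0] if row and row[0] else "Empty!"


def serve_client(db_path, usernames, results):
    """Thread body: open a private connection, use it, then close it."""
    conn = sqlite3.connect(db_path, timeout=10)
    try:
        for username in usernames:
            results.append(lookup(conn, username))
    finally:
        conn.close()


def serve_all(db_path, clients):
    """Start one thread per client, each with its own connection."""
    results = []
    threads = [
        threading.Thread(target=serve_client, args=(db_path, names, results))
        for names in clients
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Each thread keeps its connection for as long as its client is active, so the cost of opening the database is paid once per client rather than once per action, and no connection is ever shared between threads.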

Do I need to close a database connection in a short python script?

Does python (version 2.7, 3.3) close the database connection immediately after a program is finished?
For example:
import MySQLdb
conn = sqlite3.connect(host="localhost", user="adam", password="12345", db ="my_db")
c = conn.cursor()
c.execute('''SELECT * FROM MY_TABLE''')
c.close() # Do I really need that ?
conn.close() # Do I really need that ?
Could there be an issue with closing connections, if I run this script again and again immediately one after another?
ps. Yes, I know that the best practice is to close all the resources.

As you mentioned, you should close the connection explicitly in your code.
Most of the time, letting the connection close implicitly will not cause issues, but with some DBMSes odd things can happen.
SQLite, for example, will roll back any open transactions.
I have also read that you can use the with statement with MySQL; that might be worth looking up.

There is some confusion in your code snippet: it seems to mix MySQL and sqlite3.
I assumed you meant MySQL throughout, and rewrote your code as:
import MySQLdb
conn = MySQLdb.connect(host="127.0.0.1", user="test", passwd="", db="test")
c = conn.cursor()
c.execute('''SELECT now()''')
Then I checked the connection state using both netstat and the MySQL process list. On my Linux machine, the connection is closed when the script ends, leaving the TCP connection in TIME_WAIT as expected.

MySQL, should I stay connected or connect when needed?

I have been logging temperatures at home to a MySQL database (read 10 sensors in total every 5 minutes), and have been using Python, but I am wondering something...
Currently when I first run my program, I run the normal connect to MySQL, which is only run once.
db = MySQLdb.connect(mysql_server, mysql_username, mysql_passwd, mysql_db)
cursor = db.cursor()
Then I collect the data and publish it to the database successfully. The script then sleeps for 5 minutes, then starts again and collects and publishes the data again and so on. However, I only connect once, and I don't ever disconnect; it just keeps going in a loop. I only disconnect if I terminate the program.
Is this the best practice? That is, should I keep the connection to the MySQL server open all the time, or should I disconnect after I have done an insert/commit?
The reason I ask: every now and then, I have to restart the script because maybe my MySQL server has gone offline or some other issue. Should I:
Keep doing what I am doing and just handle any MySQL database disconnections with a reconnect,
Put it in the crontab to collect data every five minutes and have no loop and no sleep, or
Something else?

MySQL servers are configured to handle a fixed, limited number of connections. It's not good practice to tie up a connection that you are not using constantly. So typically you should close the connection as soon as you are done with it, and reconnect only when you need it again. What MySQLdb's own connection context manager does on exit has varied between versions (older releases commit or roll back a transaction rather than close the connection), so wrapping the connection in contextlib.closing is a version-independent way to make closing automatic with the with-statement syntax.
import contextlib
import MySQLdb

with contextlib.closing(MySQLdb.connect(
        host=config.HOST, user=config.USER,
        passwd=config.PASS, db=config.MYDB)) as connection:
    cursor = connection.cursor()
    print(cursor)
# the connection is closed for you automatically
# when Python leaves the `with-suite`.
For robustness, you might want to use try..except to handle the case when (even on the first run) connect fails to make a connection.
Having said that, I would just put it in a crontab entry and dispense with sleeping.
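A cron-driven version of the logger could be organized around a single "connect, insert, close" function, sketched below under assumptions. `publish_once` and the `connect` factory are hypothetical names; with MySQLdb the factory would presumably wrap `MySQLdb.connect(...)`, the placeholders in the SQL would be `%s`, and the broad `except Exception` clauses would be narrowed to `MySQLdb.Error`.

```python
import contextlib


def publish_once(connect, insert_sql, rows):
    """Connect, insert `rows`, commit, and always close the connection.

    Returns True on success and False if the database was unreachable or
    the insert failed, so the calling script can choose its exit status.
    `connect` is a hypothetical zero-argument connection factory.
    """
    try:
        conn = connect()
    except Exception as exc:  # narrow to the driver's Error class in real code
        print("could not connect:", exc)
        return False
    with contextlib.closing(conn):
        try:
            cur = conn.cursor()
            cur.executemany(insert_sql, rows)
            conn.commit()
        except Exception as exc:  # narrow to the driver's Error class
            print("insert failed:", exc)
            return False
    return True
```

Run every five minutes from crontab, a script built on this holds a connection only for the duration of one insert. If the server is offline during one run, that run reports the failure and the next scheduled run simply tries again, so no restart logic is needed.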
