I have a script running which I want to process data when it's added to the database.
import mysql.connector
import time
wait_time = 2
mydb = mysql.connector.connect(
    host="localhost",
    user="xxx",
    passwd="yyy",
    database="my_database"
)
mycursor = mydb.cursor()
while True:
    sql = "SELECT * FROM data WHERE processed = 0"
    mycursor.execute(sql)
    records = mycursor.fetchall()
    for i, r in enumerate(records):
        print(r)
    time.sleep(wait_time)
However, if I insert a row via a different connection, this connection doesn't show it.
I.e. if I connect to my database via a third party app, and insert a row to
However if I restart the above script, it appears.
Any ideas?
Use a message queue (e.g. RabbitMQ) and get the third-party app to publish to it. Message queue implementations have better APIs for processing information asynchronously, even if you only use the queue to carry the primary key of the new database row.
Alternatively, enable binary logging and use a replication protocol library to process events.
I just faced the same problem, and the easiest way to solve it is to define mydb and mycursor inside the loop.
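import mysql.connector
import time
wait_time = 2
while True:
    # A fresh connection on every pass sees rows committed since the last one.
    mydb = mysql.connector.connect(
        host="localhost",
        user="xxx",
        passwd="yyy",
        database="my_database"
    )
    mycursor = mydb.cursor()
    sql = "SELECT * FROM data WHERE processed = 0"
    mycursor.execute(sql)
    records = mycursor.fetchall()
    for i, r in enumerate(records):
        print(r)
    # Close the connection so that connections do not pile up over time.
    mycursor.close()
    mydb.close()
    time.sleep(wait_time)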
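A likely cause, stated here as an assumption rather than something confirmed in the thread: with autocommit off, the first SELECT opens a transaction, and under InnoDB's default REPEATABLE READ isolation every later SELECT on that connection reads the same snapshot. Ending the transaction after each poll (or connecting with autocommit enabled) should let one long-lived connection see new rows without reconnecting. The sketch below shows that loop shape for any DB-API connection. `poll_unprocessed` and `max_polls` are hypothetical names, and sqlite3 stands in for MySQL only so that the snippet runs without a server.

```python
import sqlite3
import time


def poll_unprocessed(conn, handle, wait_time=2.0, max_polls=None):
    """Poll for unprocessed rows, ending the transaction after every pass."""
    polls = 0
    while max_polls is None or polls < max_polls:
        cur = conn.cursor()
        cur.execute("SELECT * FROM data WHERE processed = 0")
        for row in cur.fetchall():
            handle(row)
        cur.close()
        # Ending the transaction releases the read snapshot, so the next
        # SELECT starts a fresh one and can see rows committed elsewhere.
        conn.commit()
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(wait_time)


# Demo with an in-memory SQLite database as a stand-in for MySQL.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE data (id INTEGER, processed INTEGER)")
demo.execute("INSERT INTO data VALUES (1, 0), (2, 1)")
demo.commit()
seen = []
poll_unprocessed(demo, seen.append, wait_time=0.0, max_polls=2)
```

With MySQL, the connection from the question would be passed in instead, e.g. poll_unprocessed(mydb, print, wait_time).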
Python has the psycopg library, which I use to run queries.
I have an array of texts, and I iterate over them, running a query to find each text in Postgres. Sometimes a query takes a very long time to execute, and when that happens I want to stop/terminate it and move on to the next one. For example, if a query takes 10 seconds or longer, I need to stop it and go to the next.
Is this possible with psycopg? Or is it possible with something else?
You can use the psycopg2 library: create a connection and a cursor, then cancel the running query from a timer thread once the timeout has elapsed.
pip install psycopg2
import psycopg2
import threading
connection = psycopg2.connect(dbname="database", user="user", password="password", host="localhost", port=5432)
cursor = connection.cursor()
# Cancel the running statement if it has not finished after 10 seconds.
timer = threading.Timer(10.0, connection.cancel)
timer.start()
try:
    cursor.execute("SELECT * FROM table WHERE column = 'value'")
except psycopg2.extensions.QueryCanceledError:
    pass
finally:
    timer.cancel()  # stop the timer if the query finished in time
cursor.close()
connection.close()
Docs: https://www.psycopg.org/docs/
Use statement_timeout to cancel a statement that runs too long. Pass the setting through the options parameter of connect(). By default the integer value is in milliseconds; a unit can be appended, e.g. 10s = 10 seconds.
import psycopg2
# Set timeout to 1 millisecond for test purposes.
con = psycopg2.connect(options='-c statement_timeout=1', dbname="test", host='localhost', user='postgres', port=5432)
cur = con.cursor()
cur.execute("select * from cell_per")
# Output: QueryCanceled: canceling statement due to statement timeout
con.close()
# Set timeout to 10 seconds.
con = psycopg2.connect(options='-c statement_timeout=10s', dbname="test", host='localhost', user='postgres', port=5432)
cur = con.cursor()
cur.execute("select * from cell_per")
cur.rowcount
# Output: 73
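For the original use case (try each text and skip the slow ones), the timeout has to be combined with a rollback, because PostgreSQL aborts the current transaction when it cancels a statement. The following is a minimal sketch under assumptions: `search_texts`, the `documents` table and the `body` column are hypothetical, and `conn` is expected to be a psycopg2 connection opened with statement_timeout as shown above. It recognises the cancellation through SQLSTATE 57014 (query_canceled), which psycopg2 exposes as the `pgcode` attribute of the exception.

```python
QUERY_CANCELED = "57014"  # PostgreSQL SQLSTATE for query_canceled


def search_texts(conn, texts):
    """Run one lookup per text and skip lookups that hit statement_timeout."""
    found, skipped = {}, []
    for text in texts:
        cur = conn.cursor()
        try:
            # Hypothetical table and column names, for illustration only.
            cur.execute("SELECT id FROM documents WHERE body = %s", (text,))
            found[text] = cur.fetchall()
            conn.commit()
        except Exception as exc:
            # Re-raise anything that is not a cancelled statement.
            if getattr(exc, "pgcode", None) != QUERY_CANCELED:
                raise
            # The cancelled statement left the transaction aborted;
            # roll back so the next query can run.
            conn.rollback()
            skipped.append(text)
        finally:
            cur.close()
    return found, skipped
```

A call such as found, skipped = search_texts(con, my_texts) would then return the rows per text plus the list of texts whose query was cancelled.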
Let me start off by saying I am extremely new to Python and Postgresql so I feel like I'm in way over my head. My end goal is to get connected to the dvdrental database in postgresql and be able to access/manipulate the data. So far I have:
created a .config folder containing a database.ini file with my login credentials.
in my src I have a config.py file that uses ConfigParser, see below:
from configparser import ConfigParser
def config(filename='.config/database.ini', section='postgresql'):
    # create a parser
    parser = ConfigParser()
    # read config file
    parser.read(filename)
    # get section, default to postgresql
    db = {}
    if parser.has_section(section):
        params = parser.items(section)
        for param in params:
            db[param[0]] = param[1]
    else:
        raise Exception('Section {0} not found in the {1} file'.format(section, filename))
    return db
then also in my src I have a tasks.py file that has a basic connect function, see below:
import pandas as pd
from clients.config import config
import psycopg
def connect():
    """ Connect to the PostgreSQL database server """
    conn = None
    try:
        # read connection parameters
        params = config()
        # connect to the PostgreSQL server
        print('Connecting to the PostgreSQL database...')
        conn = psycopg.connect(**params)
        # create a cursor
        cur = conn.cursor()
        # execute a statement
        print('PostgreSQL database version:')
        cur.execute('SELECT version()')
        # display the PostgreSQL database server version
        db_version = cur.fetchone()
        print(db_version)
        # close the communication with the PostgreSQL
        cur.close()
    except (Exception, psycopg.DatabaseError) as error:
        print(error)
    finally:
        if conn is not None:
            conn.close()
            print('Database connection closed.')
if __name__ == '__main__':
    connect()
Now this runs and prints out the Postgresql database version which is all well & great but I'm struggling to figure out how to change the code so that it's more generalized and maybe just creates a cursor?
I need the connect function to basically just connect to the dvdrental database and create a cursor so that I can then use my connection to select from the database in other needed "tasks" -- for example I'd like to be able to create another function like the below:
def select_from_table(cursor, table_name, schema):
    cursor.execute(f"SET search_path TO {schema}, public;")
    results = cursor.execute(f"SELECT * FROM {table_name};").fetchall()
    return results
but I'm struggling with how to just create a connection to the dvdrental database & a cursor so that I'm able to actually fetch data and create pandas tables with it and whatnot.
so it would be like
task 1 is connecting to the database
task 2 is interacting with the database (selecting tables and whatnot)
task 3 is converting the result from 2 into a pandas df
thanks so much for any help!! This is for a project in a class I am taking and I am extremely overwhelmed and have been googling-researching non-stop and seemingly end up nowhere fast.
The fact that you established the connection is honestly the hardest step. I know it can be overwhelming but you're on the right track.
Just copy these three lines from connect into the select_from_table function:
params = config()
conn = psycopg.connect(**params)
cursor = conn.cursor()
It will look like this (also added conn.close() at the end):
def select_from_table(cursor, table_name, schema):
    params = config()
    conn = psycopg.connect(**params)
    cursor = conn.cursor()
    cursor.execute(f"SET search_path TO {schema}, public;")
    results = cursor.execute(f"SELECT * FROM {table_name};").fetchall()
    conn.close()
    return results
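One way to keep the three tasks separate is to open the connection once and pass it to the functions that need it. The sketch below is illustrative only: `fetch_table` is a hypothetical helper and the `actor` table is created here purely as demo data. sqlite3 stands in for psycopg so that the snippet runs without a PostgreSQL server; both follow the Python DB-API, so `cursor.description` and `fetchall()` are used the same way. The table name is interpolated into the SQL, so it should only come from trusted code, never from user input.

```python
import sqlite3


def fetch_table(conn, table_name):
    """Task 2: return (column_names, rows) for one table."""
    cur = conn.cursor()
    try:
        cur.execute(f"SELECT * FROM {table_name}")
        # The first item of each description entry is the column name.
        columns = [col[0] for col in cur.description]
        rows = cur.fetchall()
    finally:
        cur.close()
    return columns, rows


# Task 1: connect once. With psycopg this would be psycopg.connect(**config()).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE actor (actor_id INTEGER, first_name TEXT)")
conn.execute("INSERT INTO actor VALUES (1, 'Penelope')")
conn.commit()

columns, rows = fetch_table(conn, "actor")
conn.close()

# Task 3: with pandas installed, the result converts directly:
#   df = pandas.DataFrame(rows, columns=columns)
```

Keeping the connection outside the query function means several queries can share it, and it is closed in one place when the work is done.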
I have a main Python script which connects to a MySQL database and pulls a few records from it. Based on the result returned, it starts one thread (class instance) per record. Each thread should go back to the database and update another table by setting one status flag to a different state ("process started").
To achieve this I tried to:
1.) Pass the database connection to all threads
2.) Open a new database connection from each thread
but neither approach worked.
In both cases the update ran inside try/except without raising any error, but the MySQL table was not updated. I used commit in both cases.
My question would be how to handle MySQL connection(s) in such a case?
Update based on the first few comments:
MAIN SCRIPT
-----------
# Connecting to DB
db = MySQLdb.connect(host = db_host,
                     db = db_db,
                     port = db_port,
                     user = db_user,
                     passwd = db_password,
                     charset='utf8')
# Initiating database cursor
cur = db.cursor()
# Fetching records for which I need to initiate a class instance
cur.execute('SELECT ...')
for row in cur.fetchall():
    # Initiating new instance, appending it to a list and
    # starting all of them
CLASS WHICH IS INSTANTIATED
---------------------------
# Connecting to DB again. I also tried to pass connection
# which has been opened in the main script but it did not
# work either.
db = MySQLdb.connect(host = db_host,
                     db = db_db,
                     port = db_port,
                     user = db_user,
                     passwd = db_password,
                     charset='utf8')
# Initiating database cursor
cur_class = db.cursor()
cur.execute('UPDATE ...')
db.commit()
Here is an example of using multiple threads with MySQL in Python. I don't know
your table or data, so you will need to adapt the code:
import threading
import MySQLdb
Num_Of_threads = 5
class myThread(threading.Thread):
    def __init__(self, threadID, conn, cur, data_to_deal):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.conn = conn
        self.cur = cur
        self.data_to_deal = data_to_deal
    def run(self):
        # Placeholder statement: replace it with your own SQL.
        sql = 'insert into table id values ({0});'
        for i in self.data_to_deal:
            self.cur.execute(sql.format(i))
        self.conn.commit()
        self.conn.close()
threads = []
data_list = [1, 2, 3, 4, 5]
for i in range(Num_Of_threads):
    # Give every thread its own connection and cursor.
    conn = MySQLdb.connect(host='localhost', user='root', passwd='', db='')
    cur = conn.cursor()
    # Each thread receives a one-item list of the values it should insert.
    new_thread = myThread(i, conn, cur, [data_list[i]])
    threads.append(new_thread)
for th in threads:
    th.start()
for t in threads:
    t.join()
It seems there's no problem with my code but with my MySQL version. I'm using MySQL standard community edition and based on the official documentation found here :
The thread pool plugin is a commercial feature. It is not included in MySQL community distributions.
I'm about to upgrade to MariaDB to solve this issue.
It looks like MySQL 5.7 does support multithreading.
As you tried previously, make sure the connection is created inside worker() (or a function it calls). Defining the connections globally was my mistake.
Here's sample code that prints 10 records via 5 threads, 5 times
import MySQLdb
import MySQLdb.cursors
import threading
def write_good_proxies():
    # Open the connection inside the function so each thread gets its own.
    local_db = MySQLdb.connect("localhost", "username", "PassW", "DB", port=3306)
    local_cursor = local_db.cursor(MySQLdb.cursors.DictCursor)
    sql_select = 'select http from zproxies where update_time is null order by rand() limit 10'
    local_cursor.execute(sql_select)
    records = local_cursor.fetchall()
    id_list = [f['http'] for f in records]
    print(id_list)
    local_db.close()
def worker():
    x = 0
    while x < 5:
        x = x + 1
        write_good_proxies()
threads = []
for i in range(5):
    print(i)
    t = threading.Thread(target=worker)
    threads.append(t)
    t.start()
I have been trying to insert data into the database using the following code in python:
import sqlite3 as db
conn = db.connect('insertlinks.db')
cursor = conn.cursor()
db.autocommit(True)
a="asd"
b="adasd"
cursor.execute("Insert into links (link,id) values (?,?)",(a,b))
conn.close()
The code runs without any errors, but the database is not updated. I tried adding conn.commit(), but it gives an error saying module not found. Please help.
You do have to commit after inserting:
cursor.execute("Insert into links (link,id) values (?,?)",(a,b))
conn.commit()
or use the connection as a context manager:
with conn:
    cursor.execute("Insert into links (link,id) values (?,?)", (a, b))
or enable autocommit correctly by passing isolation_level=None as a keyword parameter to connect():
conn = db.connect('insertlinks.db', isolation_level=None)
See Controlling Transactions.
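The difference is easy to see by reopening the database after the insert. A minimal sketch, using a temporary file and a hypothetical `count_after_insert` helper (the table layout follows the question):

```python
import os
import sqlite3
import tempfile


def count_after_insert(commit):
    """Insert one row, optionally commit, then count rows from a new connection."""
    path = os.path.join(tempfile.mkdtemp(), "insertlinks.db")
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE links (link TEXT, id TEXT)")
    conn.commit()
    conn.execute("INSERT INTO links (link, id) VALUES (?, ?)", ("asd", "adasd"))
    if commit:
        conn.commit()
    conn.close()  # closing does not commit; pending changes are discarded

    check = sqlite3.connect(path)
    count = check.execute("SELECT COUNT(*) FROM links").fetchone()[0]
    check.close()
    return count


without_commit = count_after_insert(commit=False)
with_commit = count_after_insert(commit=True)
```

Here without_commit ends up as 0 and with_commit as 1, which matches the behaviour described in the question.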
It may be a bit late, but turning autocommit on saved me a lot of time, especially in scripts that run bulk actions such as update/insert/delete.
Reference: https://docs.python.org/2/library/sqlite3.html#sqlite3.Connection.isolation_level
This is how I usually do it in my scripts:
import sqlite3
def get_connection():
    conn = sqlite3.connect('../db.sqlite3', isolation_level=None)
    cursor = conn.cursor()
    return conn, cursor
def get_jobs():
    conn, cursor = get_connection()
    if conn is None:
        raise sqlite3.DatabaseError("Could not get connection")
I hope it helps you!
I want to call a stored procedure from Python code, like below.
import pyodbc
conn = pyodbc.connect('Trusted_Connection=yes', driver = '{SQL Server}',
                      server = r'ZAMAN\SQLEXPRESS', database = 'foy3')
def InsertUser(studentID,name,surname,birth,address,telephone):
    cursor = conn.cursor()
    cursor.execute("exec InserttoDB studentID,name,surname,birth,address,telephone")
    rows = cursor.fetchall()
I have a problem with the part of the code below. How can I pass the function parameters to the database through InserttoDB (the stored procedure)?
cursor.execute("exec InserttoDB studentID,name,surname,birth,address,telephone")
I am not sure what database you are using but I think this should do the job.
cursor.execute("call SP_YOUR_SP_NAME(params)")
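With pyodbc against SQL Server, a common approach is to pass the values as bound parameters, using `?` placeholders and the ODBC call syntax, rather than writing the Python variable names into the SQL string. The sketch below is hedged accordingly: `call_statement` and `insert_user` are hypothetical helpers, the procedure name and argument list are taken from the question, and `conn` is assumed to be the pyodbc connection opened there.

```python
def call_statement(proc_name, n_params):
    """Build an ODBC call string with one '?' placeholder per argument."""
    placeholders = ", ".join("?" * n_params)
    return "{CALL %s (%s)}" % (proc_name, placeholders)


def insert_user(conn, student_id, name, surname, birth, address, telephone):
    """Call the InserttoDB stored procedure with bound parameters."""
    params = (student_id, name, surname, birth, address, telephone)
    cursor = conn.cursor()
    try:
        # The driver substitutes the values for the placeholders.
        cursor.execute(call_statement("InserttoDB", len(params)), params)
        # pyodbc connections start with autocommit off, so commit explicitly.
        conn.commit()
    finally:
        cursor.close()


sql = call_statement("InserttoDB", 6)
```

If the procedure only inserts and returns no result set, calling cursor.fetchall() afterwards raises an error, which is why the sketch leaves it out.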