I wrote a Python program like this that should run in multithreaded mode:
def Func(host, cursor, db):
    cursor.execute('''SELECT If_index, Username, Version, Community, Ip_traff FROM HOST
                      WHERE Hostname = ?''', (host,))
    # do something
#--- Main ---
db = sqlite3.connect(os.getcwd()+'\HOST', check_same_thread=False)  # open the database
cursor = db.cursor()  # generate a cursor
for ii in range(len(host)):  # host is a list of IP addresses
    # for each host I want to generate a thread
    thr = threading.Thread(target=Func, args=(host[ii], cursor, db))
    thr.start()
I receive sqlite3.ProgrammingError: Recursive use of cursors not allowed. How can I manage the recursive cursor use in sqlite3 in this case?
thanks a lot
Paolo
Well, the thing is that the sqlite3 module doesn't like multithreaded use; you can see that in the sqlite3 module's documentation:
...the Python module disallows sharing connections and cursors between threads[1]
What I would do is use some sort of synchronization in the Func function, for example a threading.Lock[2]. Your Func would look like this:
# Define the lock globally
lock = threading.Lock()

def Func(host, cursor, db):
    try:
        lock.acquire(True)
        res = cursor.execute('''...''', (host,))
        # do something
    finally:
        lock.release()
The previous code synchronizes the execution of cursor.execute by letting just one thread hold the lock at a time; the other threads wait until it is released, and when the thread holding the lock is done, it releases it for the others to take.
That should fix the problem.
[1] https://docs.python.org/2/library/sqlite3.html#multithreading
[2] https://docs.python.org/2/library/threading.html?highlight=threading#rlock-objects
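Since the documentation only forbids sharing connections and cursors between threads, another option is to give each thread its own connection. A minimal sketch of that approach (not from the original answer), with Func reduced to taking just the host:

import os
import sqlite3
import threading

def Func(host):
    # each thread opens its own connection and cursor, so nothing is shared
    db = sqlite3.connect(os.path.join(os.getcwd(), 'HOST'))
    try:
        cursor = db.cursor()
        cursor.execute('''SELECT If_index, Username, Version, Community, Ip_traff
                          FROM HOST WHERE Hostname = ?''', (host,))
        rows = cursor.fetchall()
        # do something with rows
    finally:
        db.close()

for h in host:  # host is the list of IP addresses from the question
    threading.Thread(target=Func, args=(h,)).start()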
One way to fix it is to just lock SQLite before writing:
import sqlite3

con = sqlite3.connect(...)
...
with con:
    # Database is locked here
    cur = con.cursor()
    res = cur.execute('''...''', (host,))
I am trying to implement a worker thread to go through a queue and add the items inside to a SQL DB.
But I am experiencing a weird issue: even though I am definitely putting different statements into the queue, they all become copies of each other inside the queue if I put them in within 2 seconds of each other.
This is the worker thread with the queue:
class DBWriterThread(threading.Thread):
    def __init__(self):
        super().__init__()
        self.q = queue.Queue()
        self.put = self.q.put
        self.start()

    def run(self):
        db_conn = None
        while True:
            statements = [self.q.get()]
            try:
                while self.q.empty() is False:
                    statements.append(self.q.get(block=False))
            except queue.Empty:
                pass
            try:
                if statements[0] is None:
                    return
                if not db_conn:
                    db_conn = connect_to_db()
                try:
                    cursor = db_conn.cursor()
                    for statement in statements:
                        print(statement)
                        if statement is None:
                            return
                        work_order = statement[0][0]
                        data = statement[0][1]
                        if work_order == 'insertTick':
                            print(f"GOT ORDER TO INSERT DATA OF SYMBOL {data['symbol']}")
                            insertDataRow(data, cursor)
                        elif work_order == 'insertTrade':
                            insertTradeSignal(data, cursor)
                        else:
                            print("Unknown work order")
                            print(work_order)
                finally:
                    db_conn.commit()
            finally:
                for _ in statements:
                    self.q.task_done()
I instantiate this thread class inside my main program (which is also a class), where in the init method I define it as
self.db_writer = DBWriterThread()
and then throughout the program I am doing
self.db_writer.put((['insertTick', self.peopleTick],))
to insert data in the queue.
I believe what is happening is that I am getting multiple peopleTicks within a very short span (practically simultaneously), after which I call self.db_writer.put((['insertTick', self.peopleTick],)) with the two different peopleTicks directly after each other.
This is where I see that even though I put two different peopleTicks into the queue, when the worker thread drains the queue, the items inside are duplicates of the same tick.
This can be stopped if I do a time.sleep(2) (not less) between the calls to self.db_writer.put, but that would not work for my program and defeats the whole point of a queue. How do I solve this?
I have tried figuring out whether locks could help, but I don't know how to implement them or whether they would be the solution.
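One likely culprit (an assumption, since the tick-producing code is not shown) is that self.peopleTick is a mutable object that is updated in place between the put calls, so every queued item ends up pointing at the same, latest tick. Snapshotting it before enqueueing would rule that out; a minimal sketch, assuming peopleTick is a dict:

import copy

# enqueue a copy so later in-place updates of self.peopleTick cannot
# change items that are already sitting in the queue
self.db_writer.put((['insertTick', copy.deepcopy(self.peopleTick)],))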
For my repository class, I am creating a connection to the database when a function needs it (if None). I am using __del__ to close the connection if it is not None. The object is guaranteed to be short-lived.
I realized I am dependent on reference counting to close the connection as soon as possible. I need the connection to be a field in the class because I am doing a select for update. I read that the default Python GC does reference counting, at least in the CPython implementation, but the PyPy implementation has many different types of garbage collectors. Is it okay to rely on a reference-counting GC to close the connection as soon as possible? Or should I write a separate function and ensure that the connection is closed using a try/finally block?
The class only consists of the connection field and two functions: one to lock the table in the database and select values, and the other to update them.
This is my current implementation:
class ReserveRepository:
    def __init__(self):
        self.conn = None

    def select(self):
        cur = None
        try:
            self.conn = psycopg2.connect(config.DATABASE_URL)
            cur = self.conn.cursor()
            cur.execute('lock table sometable in ACCESS EXCLUSIVE mode')
            cur.execute('select id from sometable where condition')
            return list(map(lambda x: x[0], cur.fetchall()))
        finally:
            if cur is not None:
                cur.close()

    def update_quantities(self, id):
        cur = None
        try:
            if self.conn is None:
                self.conn = psycopg2.connect(config.DATABASE_URL)
            cur = self.conn.cursor()
            cur.execute('update sometable set column1 = value where id = %s', (id,))
            self.conn.commit()
            return reservation_id
        finally:
            if cur is not None:
                cur.close()

    def __del__(self):
        if self.conn is not None:
            self.conn.close()
            self.conn = None
tl;dr: The answer is no. PyPy does not implement reference counting, as far as I understand. To demonstrate this, I made this simple program:
class Test:
    def __init__(self):
        print('__init__')

    def __del__(self):
        print('__del__')

t = Test()
print(t)
The output on CPython 3.8.2 is:
__init__
<__main__.Test object at 0x7f6e3bc13dc0>
__del__
However, the output on PyPy 7.3.1 (Python 3.6.9) is:
__init__
<__main__.Test object at 0x00007f58325dfe88>
The __del__ function is not called.
Even on CPython, the answer is no. If the object is attached to a long-lived cached instance, __del__ will never be called. A better design pattern is to use a context manager for your object and do cleanup in __exit__:
with make_a_connection(parameter) as conn:
    useconn(conn)
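A minimal sketch of that pattern, assuming the repository is reworked to receive an already-open connection (the names here are illustrative, not from the original code):

import contextlib
import psycopg2

@contextlib.contextmanager
def open_connection(database_url):
    # the connection is closed on exit no matter which GC is in use
    conn = psycopg2.connect(database_url)
    try:
        yield conn
    finally:
        conn.close()

# usage: the connection lives exactly as long as the with-block
with open_connection(config.DATABASE_URL) as conn:
    repo = ReserveRepository(conn)  # assumes __init__ now takes the connection
    ids = repo.select()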
I have an issue while importing a Python file (which contains a multithreading class; it reads every row in the database and starts a thread for every row).
When I try to import this file from another Python file, nothing happens. As in, the kernel looks as if it hung (blank).
Basically the code isn't progressing past the import line. It is getting stuck there.
Any suggestions or help?
Pseudo-code (of the file I want to import):
import time
import MySQLdb
import threading

class Job(threading.Thread):
    def __init__(self, x, conn, sleepBuffer=0):
        threading.Thread.__init__(self)
        self.x = x
        self.conn = conn
        self.sleepBuffer = sleepBuffer

    def run(self):
        self.session = Session(hostname=self.x)
        self.job(self.x)

    def job(self, x):
        # do something and update the database columns. It keeps running
        # continuously and updates the table periodically.

db = MySQLdb.connect(####user,password,dbname)
cur = db.cursor()
cur.execute("select x from TABLE where x = %s" % (x))
rows = cur.fetchall()

threads = []
for row in rows:
    conn = MySQLdb.connect(####user,password,dbname)
    time.sleep(1)
    thread = Job(row[0], conn)
    thread.start()
    threads.append(thread)

for thread in threads:
    thread.join()
I'm trying to import the file using the line "import filename". I thought maybe I was doing it wrong and checked other ways of importing. But none of them work.
Have you tried this?
import sys
sys.path.insert(0, '/path/to/application/app/folder')
import file
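If the path is not the problem, note that importing a module executes all of its top-level code, so (assuming the select and the thread loop sit at module level, as the pseudo-code suggests) the row loop and the thread.join() calls run during the import itself and block it. A minimal sketch of guarding them, not from the original answer:

import MySQLdb
import threading

def main():
    # same top-level logic as before, just wrapped in a function
    db = MySQLdb.connect(...)  # user, password, dbname as in the original
    cur = db.cursor()
    cur.execute("select x from TABLE")
    threads = []
    for row in cur.fetchall():
        conn = MySQLdb.connect(...)
        thread = Job(row[0], conn)
        thread.start()
        threads.append(thread)
    for thread in threads:
        thread.join()

# this guard keeps the threads from starting when the file is imported;
# they only run when the file is executed directly
if __name__ == "__main__":
    main()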
How would I go about creating a queue to run tasks in the background in Python?
I have tried asyncio.Queue(), but whenever I use Queue.put(task) it immediately starts the task.
It is for an application which receives an unknown number of entries (filenames) from a database on a specified time interval. What I wish to accomplish with this background queue is that the Python application keeps running and keeps returning new filenames. Every time the application finds new filenames, it should handle them by creating a task, which would contain (method(variables)). These tasks should all be thrown into an ever-expanding queue which runs the tasks on its own. Here's the code.
class DatabaseHandler:
    def __init__(self):
        try:
            self.cnx = mysql.connector.connect(user='root', password='', host='127.0.0.1', database='mydb')
            self.cnx.autocommit = True
            self.q = asyncio.Queue()
        except mysql.connector.Error as err:
            if err.errno == errorcode.ER_ACCESS_DENIED_ERROR:
                print("Something is wrong with your user name or password")
            elif err.errno == errorcode.ER_BAD_DB_ERROR:
                print("Database does not exist")
            else:
                print(err)
        self.get_new_entries(30.0)

    def get_new_entries(self, delay):
        start_time = t.time()
        while True:
            current_time = datetime.datetime.now() - datetime.timedelta(seconds=delay)
            current_time = current_time.strftime("%Y-%m-%d %H:%M:%S")
            data = current_time
            print(current_time)
            self.select_latest_entries(data)
            print("###################")
            t.sleep(delay - ((t.time() - start_time) % delay))

    def select_latest_entries(self, input_data):
        query = """SELECT FILE_NAME FROM `added_files` WHERE CREATION_TIME > %s"""
        cursor = self.cnx.cursor()
        cursor.execute(query, (input_data,))
        for file_name in cursor.fetchall():
            file_name_string = ''.join(file_name)
            self.q.put(self.handle_new_file_names(file_name_string))
        cursor.close()

    def handle_new_file_names(self, filename):
        create_new_npy_files(filename)
        self.update_entry(filename)

    def update_entry(self, filename):
        print(filename)
        query = """UPDATE `added_files` SET NPY_CREATED_AT=NOW(), DELETED=1 WHERE FILE_NAME=%s"""
        update_cursor = self.cnx.cursor()
        self.cnx.commit()
        update_cursor.execute(query, (filename,))
        update_cursor.close()
As I said, this will instantly run the task.
create_new_npy_files is a pretty time consuming method in a static class.
There are two problems with this expression:
self.q.put(self.handle_new_file_names(file_name_string))
First, it is actually calling the handle_new_file_names method and is enqueueing its result. This is not specific to asyncio.Queue, it is how function calls work in Python (and most mainstream languages). The above is equivalent to:
_tmp = self.handle_new_file_names(file_name_string)
self.q.put(_tmp)
The second problem is that asyncio.Queue operations like get and put are coroutines, so you must await them.
If you want to enqueue a callable, you can use a lambda:
await self.q.put(lambda: self.handle_new_file_names(file_name_string))
But since the consumer of the queue is under your control, you can simply enqueue the file names, as suggested by @dirn:
await self.q.put(file_name_string)
The consumer of the queue would use await self.q.get() to read the file names and call self.handle_new_file_names() on each.
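For example, a minimal sketch of such a consumer (the method name and the way the task is scheduled are illustrative, not part of the original code):

async def consume_file_names(self):
    # background consumer: drain the queue one file name at a time
    while True:
        file_name = await self.q.get()
        self.handle_new_file_names(file_name)
        self.q.task_done()

# scheduled once at startup, alongside the producer coroutine
# (assumes the application is already running inside an asyncio event loop)
asyncio.create_task(handler.consume_file_names())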
If you plan to use asyncio, consider reading a tutorial that covers the basics, and switching to an asyncio compliant database connector, so that the database queries play along with the asyncio event loop.
For people who see this in the future: the answer I marked as accepted explains how to solve the problem. I'll write down the code I used to create what I wanted, that is, tasks that run in the background. Here you go.
from multiprocessing import Queue
import threading

class ThisClass:
    def __init__(self):
        self.q = Queue()
        self.worker = threading.Thread(target=self._consume_queue)
        self.worker.start()
        self.run()
The queue created is not a queue for tasks, but for the variables you want to handle.
def run(self):
    for i in range(100):
        self.q.put(i)
Then for the _consume_queue(), which consumes the items in the queue when there are items:
def _consume_queue(self):
    while True:
        number = self.q.get()
        # the logic you want to use per number.
It seems the self.q.get() waits for new entries, even when there are none.
The (simplified) code above works for me; I hope it will also work for others.
Long-time lurker here.
I have a thread controller object. This object takes in other objects called "Checks". These Checks pull in DB rows that match their criteria. The thread manager polls each check (asking it for its DB rows, aka work units) and then enqueues each row along with a reference to that check object. The thought is that N threads will come in, pull an item off the queue, and execute the corresponding Check's do_work method. The do_work method will return Pass/Fail, and all passes will be enqueued for further processing.
The main script (not shown) instantiates the checks and adds them to the thread manager using add_check and then calls kick_off_work.
So far I am testing and it simply locks up:
import Queue
from threading import Thread

class ThreadMan:
    def __init__(self, reporter):
        print "Initializing thread manager..."
        self.workQueue = Queue.Queue()
        self.resultQueue = Queue.Queue()
        self.checks = []

    def add_check(self, check):
        self.checks.append(check)

    def kick_off_work(self):
        for check in self.checks:
            for work_unit in check.populate_work():
                # work unit is a DB row
                self.workQueue.put({"object": check, "work": work_unit})
        threads = Thread(target=self.execute_work_unit)
        threads = Thread(target=self.execute_work_unit)
        threads.start()
        self.workQueue.join()

    def execute_work_unit(self):
        unit = self.workQueue.get()
        check_object = unit['object']  # Check object
        work_row = unit['work']  # DB ROW
        check_object.do_work(work_row)
        self.workQueue.task_done()
        print "Done with work!!"
The output is simply:
Initializing thread manager...
In check1's do_work method... Doing work
Done with work!!
(locked up)
I would like to run through the entire queue.
You should just add a "while" in your execute_work_unit, otherwise it stops after the first iteration:
def execute_work_unit(self):
    while True:
        unit = self.workQueue.get()
        check_object = unit['object']  # Check object
        work_row = unit['work']  # DB ROW
        check_object.do_work(work_row)
        self.workQueue.task_done()
        print "Done with work!!"
Have a look here:
http://docs.python.org/2/library/queue.html#module-Queue
EDIT: to get it to finish, just add threads.join() after your self.workQueue.join() in
def kick_off_work(self):