How to prevent a QTableModel from updating the table depending on some condition - Python

I have a MySQL table that uses a write-lock mechanism. The lock can be held for quite a long time (we're talking 1-2 minutes here).
I had to add a check for whether the table is in use before the update is applied (using the beforeUpdate signal),
but even after the check reports that the table is in use, the system hangs until the other user unlocks the table. Is it possible to prevent the data from being updated when the flag says the table is in use?
I'm searching for a better way to handle this. I don't want to re-implement the setData method, since doing that is a pain, but if you have a good re-implementation of it, that would be very helpful.
Thanks in advance.

Python threading: http://docs.python.org/library/thread.html You can create a thread that waits until the table is free; it should cost a negligible amount of system resources, and your end user won't have to wait for the system to respond before continuing with a different task.
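A minimal sketch of that idea (my own, not from the original answer): poll the lock from a worker thread and only perform the write once the table is free. The helpers table_is_locked and do_update are hypothetical stand-ins for your lock check and the actual write, and remember not to touch Qt widgets directly from the worker thread.

import threading
import time

def apply_update_when_free(record):
    """Run the blocking MySQL write off the GUI thread so the window stays responsive."""
    def worker():
        while table_is_locked():      # hypothetical check against the in-use flag
            time.sleep(1)             # wait for the other user to release the table
        do_update(record)             # hypothetical function performing the actual UPDATE
    t = threading.Thread(target=worker)
    t.daemon = True                   # don't block application exit
    t.start()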

Related

Concurrency issue from read/write from table on multiple threads (race condition)

I am building an app where each user is assigned a task from a tasks table. To assign a task, we mark an existing entry as deleted (a flag) and then add an entry to that table holding the person responsible for the task.
The issue is that if multiple users request a task at the same time, the query prioritizes older entries over newer ones, so there is a chance they will read, and be assigned, the same task. Is there a simple way around this?
My first inclination was to create a singleton class that handles job distribution, but I am pretty sure such issues can be handled by Django directly. What should I try?
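A common way to avoid this race in Django (a suggestion of mine, not from the thread) is to claim the row inside a transaction with select_for_update, so concurrent requests block until the first claim commits. A minimal sketch, assuming a Task model with assigned_to and deleted fields:

from django.db import transaction

def claim_next_task(user):
    """Atomically pick the oldest unclaimed task for `user`.
    Task, assigned_to and deleted are assumed names for illustration."""
    with transaction.atomic():
        task = (Task.objects
                .select_for_update()                      # row lock held until the transaction commits
                .filter(assigned_to__isnull=True, deleted=False)
                .order_by('id')
                .first())
        if task is None:
            return None                                   # nothing left to assign
        task.assigned_to = user
        task.save(update_fields=['assigned_to'])
        return task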

MySQL: will it deadlock on too many inserts - Django?

I want to migrate from sqlite3 to MySQL in Django. I have worked with Oracle and MS SQL Server, and I know I can catch the exception and retry over and over until the statement goes through. However, these are inserts into the same table, and the data must be INSERTED right away, because users will not be happy waiting their turn to INSERT into the same table.
So I was wondering: will the table deadlock if too many users insert at the same time, and what should I do to avoid that, so that users don't notice it?
I don't think you can get a deadlock just from rapid insertions. Deadlock occurs when you have two processes that are each waiting for the other one to do something before they can make the change the other one is waiting for. If two processes are just inserting, the database will simply process the statements in the order they're received; there's no dependency between them.
If you're using InnoDB, it uses row-level locking. So unless two inserts both try to insert the same unique key, they shouldn't even lock each other out; they can be done concurrently.
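The question mentions retrying on failure; if you want a defensive retry anyway (my own sketch, not part of the answer), something along these lines works with Django on MySQL, where 1213 is the deadlock error code and 1205 is a lock wait timeout:

import time

from django.db import OperationalError, transaction

def save_with_retry(obj, attempts=3, delay=0.1):
    """Retry a save when MySQL reports a deadlock or lock wait timeout.
    The attempt count and backoff are arbitrary illustration values."""
    for attempt in range(attempts):
        try:
            with transaction.atomic():
                obj.save()
            return
        except OperationalError as exc:
            if exc.args and exc.args[0] in (1205, 1213) and attempt < attempts - 1:
                time.sleep(delay * (attempt + 1))    # brief backoff, then try again
            else:
                raise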

Should I use MySQL LOCK TABLES, or is there a better solution?

I'm working on a multi-process application, and each process sometimes executes the following code:
db_cursor.execute("SELECT MAX(id) FROM prqueue;")
for record in db_cursor.fetchall():
if record[0]:
db_cursor.execute("DELETE FROM prqueue WHERE id='%s'" % record[0]);
db_connector.commit()
And I'm facing the following problem: there may be a situation where two processes read the same maximum value and both try to delete it. That is not acceptable in the context of my application; each value must be taken (and deleted) by only one process.
How can I achieve this? Is locking the table while taking the maximum and deleting absolutely necessary, or is there another, nicer way to do it?
Thank you.
Consider simulating record locks with GET_LOCK().
Choose a name specific to the operation you want to lock, e.g. 'prqueue_max_del'.
Call SELECT GET_LOCK('prqueue_max_del', 30) to lock the name 'prqueue_max_del'. It returns 1 and sets the lock if the name becomes available, or returns 0 if the lock is still unavailable after 30 seconds (the second parameter is the timeout).
Call SELECT RELEASE_LOCK('prqueue_max_del') when you are finished.
You have to use the same name in each transaction, and calling GET_LOCK() again in a transaction releases any previously held lock.
Beware: since only the abstract name is locked, any other process that does not use this method and this name can still modify your table independently.
GET_LOCK() docs
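A minimal sketch of the pattern applied to the max/delete code from the question (the cursor and connector names are taken from there; the lock name and timeout are just examples):

def pop_max(db_cursor, db_connector):
    """Take and delete the current MAX(id) row, serialized across processes
    through a MySQL named lock."""
    db_cursor.execute("SELECT GET_LOCK('prqueue_max_del', 30)")
    if db_cursor.fetchone()[0] != 1:
        return None                                   # could not acquire the lock within 30 s
    try:
        db_cursor.execute("SELECT MAX(id) FROM prqueue")
        row = db_cursor.fetchone()
        if row and row[0] is not None:
            db_cursor.execute("DELETE FROM prqueue WHERE id = %s", (row[0],))
            db_connector.commit()
            return row[0]
        return None
    finally:
        db_cursor.execute("SELECT RELEASE_LOCK('prqueue_max_del')")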

SQLite3 and Multiprocessing

I noticed that sqlite3 isn't really capable or reliable when I use it inside a multiprocessing environment. Each process tries to write some data into the same database, so a connection is used by multiple threads. I tried it with the check_same_thread=False option, but the number of insertions is pretty random: sometimes it includes everything, sometimes not. Should I parallel-process only parts of the function (fetching data from the web), stack the outputs into a list and put them into the table all together, or is there a reliable way to handle multiple connections with sqlite?
First of all, there's a difference between multiprocessing (multiple processes) and multithreading (multiple threads within one process).
It seems that you're talking about multithreading here. There are a couple of caveats that you should be aware of when using SQLite in a multithreaded environment. The SQLite documentation mentions the following:
Do not use the same database connection at the same time in more than one thread.
On some operating systems, a database connection should always be used in the same thread in which it was originally created.
See here for more detailed information: Is SQLite thread-safe?
I've actually just been working on something very similar:
- multiple processes (for me, a processing pool of 4 to 32 workers)
- each process worker does some stuff that includes getting information from the web (a call to the Alchemy API, in my case)
- each process opens its own sqlite3 connection, all to a single file, and each process adds one entry before getting the next task off the stack
At first I thought I was seeing the same issue as you; then I traced it to overlapping and conflicting issues with retrieving the information from the web. Since I was right there, I did some torture testing on sqlite and multiprocessing and found I could run MANY process workers, all connecting and adding to the same sqlite file without coordination, and it was rock solid when I was just putting in test data.
So now I'm looking at your phrase "(fetching data from the web)" - perhaps you could try replacing that data fetching with some dummy data to make sure it really is the sqlite3 connection causing your problems. At least in my tested case (running right now in another window), I found that multiple processes were all able to add through their own connections without issues, but your description exactly matches the problem I was having when two processes stepped on each other while going for the web API (a very odd error, actually) and sometimes didn't get the expected data, which of course leaves an empty slot in the database. My eventual solution was to detect this failure within each worker and retry the web API call when it happened (it could have been more elegant, but this was a personal hack).
My apologies if this doesn't apply to your case; without code it's hard to know what you're facing, but the description makes me wonder whether you might widen your considerations.
sqlitedict: A lightweight wrapper around Python's sqlite3 database, with a dict-like interface and multi-thread access support.
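For reference, a minimal sqlitedict usage sketch (the file name and keys are made up; values are pickled transparently and the wrapper serializes access to the underlying file):

from sqlitedict import SqliteDict

# autocommit=True writes each assignment immediately instead of batching.
results = SqliteDict('results.sqlite', autocommit=True)
results['item-1'] = {'score': 42}
print(results['item-1'])
results.close()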
If I had to build a system like the one you describe using SQLite, I would start by writing an async server (using the asynchat module) to handle all of the SQLite database access, and then I would write the other processes to use that server. When only one process accesses the db file directly, it can enforce a strict sequence of queries, so there is no danger of two processes stepping on each other's toes. It is also faster than continually opening and closing the db.
In fact, I would also try to avoid maintaining sessions; in other words, I would try to write all the other processes so that every database transaction is independent. At minimum this would mean allowing a transaction to contain a list of SQL statements, not just one, and it might even require some if/then capability so that you could SELECT a record, check that a field is equal to X, and only then UPDATE that field. If your existing app closes the database after every transaction, then you don't need to worry about sessions.
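As a rough illustration of the same single-writer idea (using a multiprocessing queue rather than the asynchat server the answer names, so this is not that author's exact design), all writes are funneled through one process that owns the connection:

import multiprocessing
import sqlite3

def db_writer(queue, db_path):
    """The only process holding the SQLite connection; applies each queued
    transaction (a list of (sql, params) tuples) in the order received."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS results (url TEXT, body TEXT)")
    while True:
        statements = queue.get()
        if statements is None:                 # sentinel: shut down
            break
        with conn:                             # one transaction per message
            for sql, params in statements:
                conn.execute(sql, params)
    conn.close()

if __name__ == '__main__':
    q = multiprocessing.Queue()
    writer = multiprocessing.Process(target=db_writer, args=(q, 'data.db'))
    writer.start()
    # Worker processes send independent transactions instead of opening the file themselves:
    q.put([("INSERT INTO results (url, body) VALUES (?, ?)", ('http://example.com', '...'))])
    q.put(None)
    writer.join()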
You might be able to use something like nosqlite http://code.google.com/p/nosqlite/

Google App Engine - design considerations about cron tasks

I'm developing software using the Google App Engine.
I have a question about the optimal design for the following issue: I need to create and save snapshots of some entities at regular intervals.
In the conventional relational db world, I would create db jobs that insert new summary records.
For example, a job would insert a record containing each active user's current score into the "userrank" table, say, every hour.
I'd like to know the best way to achieve this on Google App Engine. I know there is the Cron service, but does it allow us to execute jobs that insert/update thousands of records?
I think you'll find that snapshotting every user's state every hour isn't something that will scale well, no matter what your framework. A more conventional environment disguises this by letting you run longer tasks, but you'll still reach the point where it's not practical to take a snapshot of every user's data every hour.
My suggestion would be this: add a 'last snapshot' field, and override the put() method of your model (assuming you're using Python; the same is possible in Java, but I don't know the syntax), so that whenever you update a record, it checks whether it's been more than an hour since the last snapshot and, if so, creates and writes a snapshot record.
In order to prevent concurrent updates creating two identical snapshots, you'll want to give the snapshots a key name derived from the time at which the snapshot was taken. That way, if two concurrent updates try to write a snapshot, one will harmlessly overwrite the other.
To get the snapshot for a given hour, simply query for the oldest snapshot newer than the requested period. As an added bonus, since inactive records aren't snapshotted, you're saving a lot of space, too.
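A rough sketch of what that override could look like with the old db API; the model names, properties, and hourly key-name scheme here are invented for illustration:

from datetime import datetime, timedelta

from google.appengine.ext import db


class RankSnapshot(db.Model):
    user_id = db.StringProperty()
    score = db.IntegerProperty()
    taken_at = db.DateTimeProperty()


class UserRank(db.Model):
    user_id = db.StringProperty()
    score = db.IntegerProperty()
    last_snapshot = db.DateTimeProperty()

    def put(self, **kwargs):
        now = datetime.utcnow()
        if not self.last_snapshot or now - self.last_snapshot >= timedelta(hours=1):
            # Key name derived from the hour: concurrent updates in the same
            # hour overwrite the same snapshot entity instead of duplicating it.
            RankSnapshot(
                key_name='%s-%s' % (self.user_id, now.strftime('%Y%m%d%H')),
                user_id=self.user_id,
                score=self.score,
                taken_at=now,
            ).put()
            self.last_snapshot = now
        return super(UserRank, self).put(**kwargs)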
Have you considered using the remote api instead? This way you could get a shell to your datastore and avoid the timeouts. The Mapper class they demonstrate in that link is quite useful and I've used it successfully to do batch operations on ~1500 objects.
That said, cron should work fine too. You do have a limit on the time of each individual request so you can't just chew through them all at once, but you can use redirection to loop over as many users as you want, processing one user at a time. There should be an example of this in the docs somewhere if you need help with this approach.
I would use a combination of cron jobs and the looping URL fetch method detailed here: http://stage.vambenepe.com/archives/549. That way you can catch your timeouts and begin another request.
To summarize the article: the cron job calls your initial process, you catch the timeout error and call the process again, masked as a second URL. You have to ping back and forth between two URLs to keep App Engine from thinking you are in an accidental loop. You also need to be careful not to loop infinitely; make sure there is an end state for your updating loop, since an endless loop would put you over your quotas pretty quickly.
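As a loose illustration of the two-URL chaining (not the article's actual code, and it omits the timeout handling the article describes), a handler might process one batch and then ping its twin with a cursor; process_one_batch and the URLs are hypothetical:

from google.appengine.api import urlfetch
from google.appengine.ext import webapp


class SnapshotA(webapp.RequestHandler):
    """Cron hits /snapshot_a; the twin handler lives at /snapshot_b and pings
    back, so App Engine never sees the same URL calling itself in a loop."""

    def get(self):
        cursor = self.request.get('cursor') or None
        next_cursor = process_one_batch(cursor)   # hypothetical helper; returns None when finished
        if next_cursor:                           # end state: stop when there is no more work
            urlfetch.fetch('http://myapp.appspot.com/snapshot_b?cursor=%s' % next_cursor)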
