SQLAlchemy - select for update example - python

I'm looking for a complete example of using select for update in SQLAlchemy, but haven't found one by googling. I need to lock a single row and update a column. The following code doesn't work (it blocks forever):
s = table.select(table.c.user=="test",for_update=True)
# Do update or not depending on the row
u = table.update().where(table.c.user=="test")
u.execute(email="foo")
Do I need a commit? How do I do that? As far as I know you need to:
begin transaction
select ... for update
update
commit

If you are using the ORM, try the with_for_update function:
foo = session.query(Foo).filter(Foo.id==1234).with_for_update().one()
# this row is now locked
foo.name = 'bar'
session.add(foo)
session.commit()
# this row is now unlocked

Late answer, but maybe someone will find it useful.
First, you don't need to commit (at least not in between queries, which I'm assuming is what you are asking about). Your second query hangs indefinitely because you are effectively creating two concurrent connections to the database. The first one obtains a lock on the selected records, then the second one tries to modify the locked records, so it can't work properly. (By the way, in the example given you are not executing the first query at all, so I'm assuming that in your real tests you did something like s.execute() somewhere.) So, to the point: a working implementation should look more like this:
s = conn.execute(table.select(table.c.user=="test", for_update=True))
u = conn.execute(table.update().where(table.c.user=="test"), {"email": "foo"})
conn.commit()
Of course, in such a simple case there's no reason to do any locking, but I guess it is only an example and you were planning to add some additional logic between those two calls.
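For readers on a newer SQLAlchemy, the steps listed in the question (begin, select ... for update, update, commit) can be sketched on a single connection as below. This is only a sketch under assumptions: it assumes SQLAlchemy 1.4 or later, the users table, its id column and the set_email_if_present helper are made up for illustration, and it runs against in-memory SQLite, whose dialect leaves FOR UPDATE out of the emitted SQL, so real row locking only happens on a backend such as PostgreSQL or MySQL.

```python
from sqlalchemy import (Column, Integer, MetaData, String, Table,
                        create_engine, select, update)

# In-memory SQLite keeps the sketch self-contained. The SQLite dialect omits
# FOR UPDATE from the emitted SQL; use PostgreSQL/MySQL for real row locks.
engine = create_engine("sqlite://")
metadata = MetaData()

# Hypothetical table, loosely modelled on the one in the question.
users = Table(
    "users", metadata,
    Column("id", Integer, primary_key=True),
    Column("user", String, nullable=False),
    Column("email", String),
)
metadata.create_all(engine)

with engine.begin() as conn:
    conn.execute(users.insert(), [{"user": "test", "email": "old"}])


def set_email_if_present(engine, username, new_email):
    """Lock the row, decide, update, and commit, all on ONE connection."""
    # engine.begin() opens a transaction and commits when the block exits
    # (or rolls back if an exception is raised).
    with engine.begin() as conn:
        row = conn.execute(
            select(users).where(users.c.user == username).with_for_update()
        ).first()
        if row is None:
            return False  # nothing to update
        # Any logic that depends on the locked row would go here.
        conn.execute(
            update(users).where(users.c.id == row.id).values(email=new_email)
        )
        return True


updated = set_email_if_present(engine, "test", "foo")
```

Because the select and the update share the same connection and transaction, the update does not wait on a lock held by a different connection, which is the cause of the hang described above.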

Yes, you do need to commit, which you can do on the Connection or by creating a Transaction explicitly. Also, the new values are specified in the values(...) method, not in execute:
>>> conn.execute(table.update().
... where(table.c.user=="test").
... values(email="foo")
... )
>>> conn.commit()

Related

What is the return of an UPDATE query?

I'm using sqlalchemy in combination with sqlite and the databases library and I'm trying to wrap my head around what that combination returns when doing update queries. I'm running a testcase and I have sqlalchemy set up to roll back upon execution of each testcase via force_rollback=True.
db = databases.Database(DB_URL, force_rollback=True)
query = update(my_table).where(my_table.columns.id == some_id_to_update).values(**values)
res = await db.execute(query)
When working with psql, I'd expect res to be the number of rows that were affected by the UPDATE query, but from reading the documentation, sqlite seems to behave differently in that it doesn't return anything. I tested this manually by connecting to the database via sqlite3 and as expected, there is no return when doing UPDATE queries. sqlalchemy however does return something, which I assume is the number of total rows in the table, but I'm not sure. Can anybody shed some light into what is actually returned?
What's more, when I tried to get the number of rows affected by the UPDATE query via SELECT changes(), I'm also getting the number of total rows in the table and not the rows affected by the most recent query. Do I have a misunderstanding of what changes() does?
"The changes() function returns the number of database rows that were changed or inserted or deleted by the most recently completed INSERT, DELETE, or UPDATE statement, exclusive of statements in lower-level triggers."
When you use the Python sqlite3 module, you use .executeXXX interfaces to evaluate/prepare your query. If the query is supposed to modify the database, it does it at this stage. You have to use the same interface to prepare a SELECT statement. In either case, the .executeXXX interfaces never return the query results themselves. To get the result of a SELECT query, you have to use a .fetchXXX interface after running .executeXXX.
To get the number of changed rows after INSERT, DELETE, or UPDATE statement via sqlite3, you can also take the difference in con.total_changes before/after running .executeXXX.
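As a small illustration of the points above, the following sketch uses only the standard sqlite3 module (not the databases library from the question) and compares three ways of counting the rows touched by an UPDATE. The table contents are made up for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, name TEXT)")
con.executemany("INSERT INTO my_table (name) VALUES (?)",
                [("a",), ("b",), ("c",)])
con.commit()

# Snapshot the connection-wide change counter before the statement.
before = con.total_changes

cur = con.execute("UPDATE my_table SET name = ? WHERE id = ?", ("z", 2))

# 1) rowcount of the cursor that ran the UPDATE
rows_from_cursor = cur.rowcount
# 2) difference in total_changes before/after the statement
rows_from_total = con.total_changes - before
# 3) SQLite's changes() function, queried on the SAME connection
rows_from_changes = con.execute("SELECT changes()").fetchone()[0]

print(rows_from_cursor, rows_from_total, rows_from_changes)
```

Note that changes() is tracked per connection, so it has to be queried on the connection that ran the UPDATE.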

Sequence nextval/currval in two sessions

Setup:
Oracle DB running on a windows machine
Mac connected with the database, both in the same network
Problem:
When I created a sequence in SQL Developer, I can see and use the sequence in this session. If I logoff and login again the sequence is still there. But if I try to use the sequence via Python and cx_Oracle, it doesn't work. It also doesn't work the other way around.
[In SQL Developer: user: uc]
create SEQUENCE seq1;
select seq1.nextval from dual; ---> 1
commit; --> although the create statement is a DDL method, just in case
[login via Python, user: uc]
select seq1.currval from dual;--> ORA-08002 Sequence seq1.currval isn't defined in this session
The python code:
import cx_Oracle
cx_Oracle.init_oracle_client(lib_dir="/Users/benreisinger/Documents/testclients/instantclient_19_8", config_dir=None, error_url=None, driver_name=None)
# Connect as user "uc" with password "uc" to the "orcl" service running on a remote computer.
connection = cx_Oracle.connect("uc", "uc", "10.0.0.22/orcl")
cursor = connection.cursor()
cursor.execute("""
select seq1.currval from dual
""")
print(cursor)
for seq1 in cursor:
    print(seq1)
The error says, that [seq1] wasn't defined in this session, but why does the following work:
select seq1.nextval from dual
--> returns 2
Even after issuing this, I can't use seq1.currval
Btw., select sequence_name from user_sequences returns seq1 in Python
[as SYS user]
select * from v$session
where username = 'uc';
--> returns zero rows
Why is seq1 not in reach for the python program ?
Note: With tables, everything just works fine
EDIT:
also with 'UC' being upper case, no rows returned
first issuing
still doesn't work
Not sure how to explain this. The previous 2 answers are correct, but somehow you seem to miss the point.
First, take everything that is irrelevant out of the equation. Mac client on Windows db: doesn't matter. SQLDeveloper vs python: doesn't matter. The only thing that matters is that you connect twice to the database as the same schema. You connect twice, which means that you have 2 separate sessions, and those sessions don't know about each other. Both sessions have access to the same database objects, so if you execute DDL (e.g. create sequence), that object will be visible in the other session.
Now to the core of your question. The oracle documentation states
"To use or refer to the current sequence value of your session, reference seq_name.CURRVAL. CURRVAL can only be used if seq_name.NEXTVAL has been referenced in the current user session (in the current or a previous transaction)."
You have 2 different sessions, so according to the documentation, you should not be able to call seq_name.CURRVAL in the other session. That is exactly the behaviour you are seeing.
You ask "Why is seq1 not in reach for the python program ?". The answer is: you're not correct, it is in reach for the python program. You can call seq1.NEXTVAL from any session. But you cannot invoke seq1.NEXTVAL from one session (SQLDeveloper) and then invoke seq1.CURRVAL from another session (python), because that is just how sequences work, as stated in the documentation.
Just to confirm you're not in the same session, execute the following statement for both clients (SQLDeveloper and python):
select sys_context('USERENV','SID') from dual;
You'll notice that the session id is different.
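To make the session-scoped behaviour concrete, here is a toy model in plain Python. It is not Oracle and does not talk to a database; the Sequence and Session classes are invented purely to mimic the rule quoted above, namely that the counter is shared while CURRVAL is remembered per session.

```python
class Sequence:
    """Shared, database-wide counter (stands in for the sequence object)."""

    def __init__(self, name):
        self.name = name
        self.last_allocated = 0


class Session:
    """One connection; remembers its own last NEXTVAL for each sequence."""

    def __init__(self):
        self._currval = {}

    def nextval(self, seq):
        # Allocation advances the shared counter and records the value
        # for THIS session only.
        seq.last_allocated += 1
        self._currval[seq.name] = seq.last_allocated
        return seq.last_allocated

    def currval(self, seq):
        # Mimics ORA-08002: undefined until this session has called nextval.
        if seq.name not in self._currval:
            raise LookupError(
                f"{seq.name}.CURRVAL is not yet defined in this session")
        return self._currval[seq.name]


seq1 = Sequence("seq1")
sql_developer = Session()
python_client = Session()

first = sql_developer.nextval(seq1)   # the SQL Developer session allocates

try:
    python_client.currval(seq1)       # the other session has no CURRVAL yet
    currval_failed = False
except LookupError:
    currval_failed = True

second = python_client.nextval(seq1)  # NEXTVAL works from any session
```

After the last line, each session's currval returns the value that session itself allocated, independent of the other one.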
CURRVAL returns the last allocated sequence number in the current session. So it only works when we have previously executed a NEXTVAL. So these two statements will return the same value when run in the same session:
select seq1.nextval from dual
/
select seq1.currval from dual
/
It's not entirely clear what you're trying to achieve, but it looks like your python code is executing a single statement for the connection, so it's not tapping into an existing session.
This statement returns zero rows ...
select * from v$session
where username = 'uc';
... because database objects in Oracle are stored in UPPER case (at least by default, but it's wise to stick with that default). So use where username = 'UC' instead.
Python established a new session. In it, sequence hasn't been invoked yet, so its currval doesn't exist. First you have to select nextval (which, as you said, returned 2) - only then currval will make sense.
Saying that
Even after issuing this, I can't use seq1.currval
is hard to believe.
This: select * from v$session where username = 'uc' returned nothing because - by default - all objects are stored in uppercase, so you should have run
.... where username = 'UC'
Finally:
commit; --> although the create statement is a DDL method, just in case
Which case? There's no case. DDL commits. Moreover, commits twice (before and after the actual DDL statement). And there's nothing to commit either. Therefore, what you did is unnecessary and pretty much useless.

Luigi/SQLite: How to update database after initial load?

I'm loading data into an SQLite database via Luigi with the following code:
class LoadData(luigi.Task):
    def requires(self):
        return TransformData()

    def run(self):
        with sqlite3.connect('database.db') as db:
            cursor = db.cursor()
            cursor.execute("INSERT INTO prod SELECT * FROM staging;")

    def output(self):
        return luigi.LocalTarget('database.db')
This works, but when I want to update or insert new data, the task doesn't execute because Luigi considers it complete (database.db already exists).
Maybe I didn't understand the good use of LocalTarget. What is the right way to approach this?
///EDIT: My question applies to the example given on this page (code for le_create_db.py). How do you solve updates and inserts in that example?
///EDIT: This question about appending to a file is similar, but the solution using marker files does not work because sqla expects an SQLAlchemyTarget output. Are there any other answers, specifically about appending to a database?
Consider using a mock file:
http://gouthamanbalaraman.com/blog/building-luigi-task-pipeline.html
In each execution you will be creating a new file.
Another solution could be using the strategy of creating a marker table inside the db, for example: https://luigi.readthedocs.io/en/stable/api/luigi.contrib.postgres.html#luigi.contrib.postgres.PostgresTarget
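The marker-table idea can be sketched without Luigi, using only the standard sqlite3 module. This is a rough illustration and not Luigi's own implementation: the table name table_updates, the update_id values and the helpers is_done and load_once are all hypothetical. A task's complete() could then be built on a check like is_done.

```python
import sqlite3

MARKER_TABLE = "table_updates"  # hypothetical name for the marker table


def ensure_marker_table(con):
    con.execute(
        f"CREATE TABLE IF NOT EXISTS {MARKER_TABLE} ("
        "update_id TEXT PRIMARY KEY, "
        "inserted TEXT DEFAULT CURRENT_TIMESTAMP)"
    )


def is_done(con, update_id):
    """True if a load with this update_id has already been recorded."""
    ensure_marker_table(con)
    row = con.execute(
        f"SELECT 1 FROM {MARKER_TABLE} WHERE update_id = ?", (update_id,)
    ).fetchone()
    return row is not None


def load_once(con, update_id):
    """Copy staging into prod unless this update_id was already loaded."""
    if is_done(con, update_id):
        return False
    with con:  # commits on success, rolls back on error
        con.execute("INSERT INTO prod SELECT * FROM staging")
        con.execute(
            f"INSERT INTO {MARKER_TABLE} (update_id) VALUES (?)", (update_id,)
        )
    return True


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE staging (value TEXT)")
con.execute("CREATE TABLE prod (value TEXT)")
con.execute("INSERT INTO staging VALUES ('row-1')")
con.commit()

first_run = load_once(con, "2018-01-01")   # loads the data
second_run = load_once(con, "2018-01-01")  # skipped: marker already present
third_run = load_once(con, "2018-01-02")   # new update_id, loads again
prod_rows = con.execute("SELECT COUNT(*) FROM prod").fetchone()[0]
```

Completeness is then tied to an identifier for each load rather than to the existence of database.db, so a new load with a new identifier is not considered already done.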
I had the same issue and was able to solve it by overriding the complete method to simply return False:
def complete(self):
    return False
Now the task is re-run every time, even if database file is present.

(BigQuery PY Client Library v0.28) - Fetch result from table 'query' job

I'm learning BigQuery API using Python Client Libraries v0.28
https://googlecloudplatform.github.io/google-cloud-python/latest/bigquery/usage.html#run-a-simple-query
Wrote this simple code to fetch data from the table
1) Create client object
client_ = bigquery.Client.from_service_account_json('/Users/xyz/key.json')
2) Begin new Async query job
QUERY = 'SELECT visitid FROM `1234567.ga_sessions_20180101`'
query_job = client_.query(QUERY, job_id=str(uuid.uuid4()))
3) poll until the query is DONE
while (query_job.state == 'RUNNING'):
    time.sleep(5)
    query_job.reload()
4) Fetch the results in iteration
query_job.reload()
iter = query_job.result()
At this stage I'd like to fetch how many rows are in the table. As per the docs and the GitHub code, iter is of type bigquery.table.RowIterator, with a property iter.total_rows.
5) However, at this stage when I print:
print(iter.total_rows)
It keeps returning None
I'm pretty sure this table is NOT empty and my query is correctly formatted!
Any help or pointers on what I'm missing here would be really helpful... Thanks a lot!
Cheers!
You need to also check query_job.error_result to make sure query succeeded.
You can also see your job in the UI, which can be useful for debugging, using project id and job id:
https://bigquery.cloud.google.com/results/projectid:jobid
Also, query_job.result() already waits for the job completion so you don't need to poll.
The current behavior of how RowIterator returns None is indeed perplexing. Luckily, according to this issue, tswast's comment from 10 days ago indicates that the developers are working on a better solution.
Current awkward behavior of .total_rows
Currently, .total_rows is initialized only once iteration begins. (In what follows, for clarity I renamed your iter variable to row_iter.)
row_iter = query_job.result()
itr = iter(row_iter)
first_row = next(itr)
print(row_iter.total_rows) # Now you get a number instead of None.
This is ugly because to continue the iteration, we must either handle the first row differently or call row_iter = query_job.result() again.
Temporary workaround
A currently-working alternative is to use the value of query_job._query_results.total_rows. Unfortunately this is cheating because _query_results is private, so there is no reason to expect that this will work in the future.
Future behavior
If tswast's proposal is implemented, then row_iter.total_rows will be initialized at the beginning, just as you expect.
Suggestion
In my code, I'm going to use something like
try:
    num_rows = row_iter.total_rows or query_job._query_results.total_rows
except AttributeError:  # raised if the private _query_results attribute goes away
    num_rows = None
to be compatible with future behavior while falling-back to the temporary workaround if necessary.

SQL alchemy not updating as expected

This SQL Alchemy 0.9.7 code executes without error -- but does not update the underlying database as expected.
Here is the python:
print t #prints TITLE ABSTRACTOR 1
print newtitle #prints TITLE ABSTRACTOR I
print session.query(Basic).filter(Basic.title==t).count() #prints 1
ret = update(Basic).where(Basic.title==t).values(title=newtitle)
session.commit()
Here is what the database looks like after the update:
select count(*) from basics where title='TITLE ABSTRACTOR 1';
count
-------
1
(1 row)
select count(*) from basics where title='TITLE ABSTRACTOR I';
count
-------
0
(1 row)
Have I hit a SQL alchemy bug or am I missing something?
You're just constructing an update statement:
ret = update(Basic).where(Basic.title==t).values(title=newtitle)
That doesn't do anything unless you execute the statement:
stmt = update(Basic).where(Basic.title==t).values(title=newtitle)
ret = conn.execute(stmt)
But I think you were trying to use the ORM interface, not the Core interface. In that case, although I don't remember the details, I'm pretty sure you do that by modifying the mapped object that a query returns, not by calling anything named update. Hopefully someone who's fresher on this will provide a better answer, but something like this:
ret = session.query(Basic).filter(Basic.title==t).one()
ret.title = newtitle
session.commit()
If this doesn't make sense to you, see Executing in the tutorial. But I'm guessing you know this and it was just one of those stupid bugs we all make and all have a hard enough time seeing in other people's code, and it's 100x worse in our own. :)
