I would like to update my table through Python using SQLAlchemy. Since the table I would like to update is not in the default schema, I referred to this question and set the search path for the session with sess.execute("SET search_path TO client1").
The whole code example is shown as follows:
session = DBSession()
session.execute("SET search_path TO client1")
session.commit()

total_rows = session.query(table).all()
for row in total_rows:
    try:
        row.attr1 = getAttr1()
        row.attr2 = getAttr2()
        session.commit()
    except Exception as inst:
        print(inst)
        session.rollback()
Though my code can update the table at the beginning, after several hundred iterations (around 500, maybe?) it throws an exception saying that the relation table does not exist. My current workaround is to run my code several times, updating 500 records each time. But I don't think that is a proper solution to this problem, and I am still looking for the reason that causes this exception.
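For reference, the batching workaround currently looks roughly like this (the batch size is illustrative, and each batch re-runs the SET search_path from the snippet above, since that is effectively what re-running the script does):

BATCH_SIZE = 500  # illustrative; I run roughly 500 rows per pass

session = DBSession()
rows = session.query(table).all()

for start in range(0, len(rows), BATCH_SIZE):
    # Re-issue the search_path before each batch, in case the previous
    # commit returned the connection to the pool and the setting was lost.
    session.execute("SET search_path TO client1")
    for row in rows[start:start + BATCH_SIZE]:
        row.attr1 = getAttr1()
        row.attr2 = getAttr2()
    try:
        session.commit()
    except Exception as inst:
        print(inst)
        session.rollback()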
I have a Flask application that uses SQLAlchemy (with some Marshmallow for serialization and deserialization).
I'm currently encountering some intermittent issues when trying to dump an object post-commit.
To give an example, let's say I have implemented a (multi-tenant) system for tracking system faults of some sort. This information is contained in a fault table:
class Fault(Base):
    __tablename__ = "fault"

    fault_id = Column(BIGINT, primary_key=True)
    workspace_id = Column(Integer, ForeignKey('workspace.workspace_id'))
    local_fault_id = Column(Integer)
    name = Column(String)
    description = Column(String)
I've removed a number of columns in the interest of simplicity, but this is the core of the model. The columns should be largely self-explanatory, with workspace_id effectively representing the tenant, and local_fault_id representing a tenant-specific fault sequence number, which is handled via a separate fault_sequence table.
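For reference, the fault_sequence table can be pictured roughly like this (again simplified; the column names here are just illustrative, using the same Base and Column imports as the Fault model):

class FaultSequence(Base):
    __tablename__ = "fault_sequence"

    workspace_id = Column(Integer, ForeignKey('workspace.workspace_id'),
                          primary_key=True)
    # Last local_fault_id handed out for this workspace; bumped by the
    # on_fault_created() trigger described below.
    counter = Column(Integer, nullable=False, default=0)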
This fault_sequence table holds a counter against workspace, and is updated by means of a simple on_fault_created() function that is executed by a trigger:
CREATE TRIGGER fault_created
AFTER INSERT
ON "fault"
FOR EACH ROW
EXECUTE PROCEDURE on_fault_created();
So - the problem:
I have a Flask endpoint for fault creation, where we create an instance of a Fault entity, add this via a scoped session (session.add(fault)), then call session.commit().
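The endpoint itself boils down to something like this (a simplified sketch; the route, payload keys and FaultSchema name are just illustrative, and app/session come from the application setup):

from flask import request

@app.route("/faults", methods=["POST"])
def create_fault():
    payload = request.get_json()
    fault = Fault(
        workspace_id=payload["workspace_id"],     # illustrative payload keys
        name=payload["name"],
        description=payload.get("description"),
    )
    session.add(fault)   # scoped session
    session.commit()
    # Roughly 10% of the time, touching fault's attributes here blows up
    # with InFailedSqlTransaction / InvalidTextRepresentation.
    return FaultSchema().dump(fault), 201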
It seems that this is always successful in creating the desired entities in the database, executing the sequence update trigger etc. However, when I then try to interrogate the fault object for updated fields (after commit()), around 10% of the time I find that each key/field just points to an Exception:
psycopg2.errors.InFailedSqlTransaction: current transaction is aborted, commands ignored until end of transaction block
Which seems to boil down to the following:
(psycopg2.errors.InvalidTextRepresentation) invalid input syntax for integer: ""
[SQL: SELECT fault.fault_id AS fault_fault_id, fault.workspace_id AS fault_workspace_id, fault.local_fault_id AS fault_local_fault_id, fault.name as fault_name, fault.description as fault_description
FROM fault
WHERE fault.fault_id = %(param_1)s]
[parameters: {'param_1': 166}]
(Background on this error at: http://sqlalche.me/e/13/2j8)
My question, then, is what do we think could be causing this?
I think it smells like a race condition, with the update trigger not being complete before SQLAlchemy has tried to get the updated data; perhaps local_fault_id is null, and this is resulting in the invalid input syntax error.
That said, I have very low confidence on this. Any guidance here would be amazing, as I could really do with retrieving that sequence number that's incremented/handled by the update trigger.
Thanks
Edit 1:
Some more info:
I have tried removing the update trigger, in the hope of eliminating that as a suspect. This behaviour is still intermittently evident, so I don't think it's related to that.
I have tried using flush and refresh before the commit, and this lets me get the values that I need - though commit still appears to 'break' the fault object.
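Concretely, the flush/refresh variant I tried looks roughly like this (variable names are illustrative):

session.add(fault)
session.flush()            # emits the INSERT, so the sequence trigger fires
session.refresh(fault)     # re-selects the row, so local_fault_id is populated
needed_value = fault.local_fault_id
session.commit()           # after this, the instance appears 'broken' again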
Edit 2:
So it really seems to be more of a Postgres issue than anything else. When I interrogate my database logs, this is the weirdest thing: I can copy and paste the command it says is failing, and I struggle to see how the integer value in the WHERE clause could possibly be evaluating to an empty string.
This same error is reproducible with SELECT ... FROM fault WHERE fault.fault_id = '', which in no way seems to be the query making it to the DB.
I am stumped.
Your sentence "This same error is reproducible with SELECT ... FROM fault WHERE fault.fault_id = '', which in no way seems to be the query making it to the DB." seems to indicate that you are trying to access an object that does not have its database primary key "fault_id" populated.
I guess, given that you did not provide the code, that you are adding the object to your session (session.add), committing (session.commit) and then using the object. As fault_id is autogenerated by the database, the fault object in the session (in memory) does not have fault_id.
I believe you can correct this with:
session.add(fault)
session.commit()
session.refresh(fault)
The refresh needs to be AFTER commit to refresh the fault object and retrieve fault_id.
If you are using an async session, you need:
session.add(fault)
await session.commit()
await session.refresh(fault)
I am new to SQLAlchemy and need a way to run a script whenever a new entry is added to a table. I am currently using the following method to get the task done, but I am sure there has to be a more efficient way.
I am using Python 2 for my project and MS SQL Server as the database.
Suppose my table is carData and I add a new row for car details from the website. The new car data is added to carData. My code works as follows:
class CarData:
    <fields for table class>

with session_scope() as session:
    car_data = session.query(CarData)
    reference_df = pd.read_sql_query(car_data.statement, car_data.session.bind)

while True:
    with session_scope() as session:
        new_df = pd.read_sql_query(car_data.statement, car_data.session.bind)
        if len(new_df) > len(reference_df):
            print "New Car details added"
            <code to get the id of new row added>
            <run script>
            reference_df = new_df
    sleep(10)
The above is of course a much simplified version of the code that I am using, but the idea is to have a reference point and then keep checking every 10 seconds whether there is a new entry. However, even when using session_scope() I have seen connection issues after a few days, as this script is supposed to run indefinitely.
Is there a better way to know that a new row has been added, get the id of the new row and run the required script?
I believe the error you've described is a connectivity issue with the database, e.g. a temporary network problem:
OperationalError: TCP Provider: Error code 0x68
So what you need to do is cater for this with error handling:
try:
    new_df = pd.read_sql_query(car_data.statement, car_data.session.bind)
except Exception:
    print("Problem with query, will try again shortly")
I have a database table with a UNIQUE key. If I want to insert a record, there are two possible cases. First, the unique item doesn't exist yet; that's fine, just return the new id. Second, the item already exists, and I need to get the id of that existing record.
The problem is that whatever I try, I always get some exception.
Here's an example of the code:
def __init__(self, host, user, password, database):
    # set basic attributes
    super().__init__(host, user, password, database)

    # open connection
    try:
        self.__cnx = mysql.connector.connect(
            database=database, user=user, password=password, host=host)
        #self.__cursor = self.__cnx.cursor()
    except ...

def insert_domain(self, domain):
    insertq = "INSERT INTO `sp_domains` (`domain`) VALUES ('{0}')".format(domain)
    cursor = self.__cnx.cursor()
    try:
        cursor.execute(insertq)
        print("unique")
    except (mysql.connector.errors.IntegrityError) as err:
        self.__cnx.commit()
        print("duplicate")
        s = "SELECT `domain_id` FROM `sp_domains` WHERE `domain` = '{0}';".format(domain)
        try:
            id = cursor.execute(s).fetchone()[0]
        except AttributeError as err:
            print("Unable to execute the query:", err, file=sys.stderr)
        except mysql.connector.errors.ProgrammingError as err:
            print("Query syntax error:", err, file=sys.stderr)
        else:
            self.__cnx.commit()
    cursor.close()
But anything I try, on the first duplicate record I get either 'MySQL Connection not available' or 'Unread result'. The code is just an example to demonstrate it.
This is my first program using Connector/Python, so I don't know all the rules about fetching results, committing queries and so on.
Could anyone help me with this issue, please? Or is there a more efficient way to handle this task (because this one doesn't seem like the best solution to me)? Thank you for any advice.
I can't fix your code, because you've given us two different versions of the code and two partially-described errors without full information, but I can tell you how to get started.
From a comment:
In previous version it was type error I guess, something like "NoneType has no attribute 'fetchone'.
Looking at your code, the only place you call fetchone is here:
id = cursor.execute(s).fetchone()[0]
So obviously, cursor.execute(s) returned None. Why did it return None? Well, that's what it's supposed to return, according to the documentation.*
What you want to do is:
cursor.execute(s)
id = cursor.fetchone()[0]
… as all of the sample code does.
And for future reference, it's a lot easier to debug an error like this if you first note which line it happens on instead of throwing away the traceback, and then break that line into pieces and log the intermediate values. Usually, you'll find one that isn't what you expected, and the problem will be much more obvious at that point than three steps later on when you get a bizarre exception.
* Technically, the documentation just says that "Return values are not defined" for cursor.execute, so it would be perfectly legal for a DB-API module to return self here. Then again, it would also be legal to return some object that erases your hard drive when you call a method on it.
I have an Event model, where external_id is set to be unique.
session1 = create_session()
session2 = create_session()
e1 = Event(external_id=1, headline='session1')
session1.add(e1)
e2 = Event(external_id=1, headline='session2')
session2.add(e2)
session1.commit()
session2.commit()
s = create_session()
e = s.query(Event).filter_by(external_id=1).first()
print e.headline
I am getting the output "session1" with no errors, which means session2.commit() failed silently. Ultimately I would like to be able to choose whether to overwrite what's in the db or not. So if session2.commit() fails, I would like to choose whether to change the insert into an update for some cases. Can anyone help with this? Thanks.
EDITED:
I found the answer. The way to do it is through a two-pass mechanism:
both sessions should add/commit the row with minimal information (the unique key only)
both sessions should then query and get the row for an update
if we want one session to have priority, make sure we use a lock for update for it
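A rough sketch of that two-pass approach, reusing the Event model and create_session() from above (with_for_update() is what I mean by "lock for update"):

from sqlalchemy.exc import IntegrityError

session = create_session()

# Pass 1: insert only the unique key; if the row already exists,
# roll back and fall through to the update path.
try:
    session.add(Event(external_id=1))
    session.commit()
except IntegrityError:
    session.rollback()

# Pass 2: fetch the row (locking it so a competing session waits),
# then write the rest of the data as an update.
event = (session.query(Event)
                .filter_by(external_id=1)
                .with_for_update()
                .one())
event.headline = 'session1'
session.commit()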
I've run into another problem with SQLAlchemy. I have a relationship that is supposed to cascade-delete some data from my model, declared like so:
parentProject = relationship(Project, backref=backref("OPERATIONS", cascade="all,delete"))
This works fine as long as the data is from the current session. But if I start a session, add some data, then close it, start another session, and try to delete data from the previous one, the cascade doesn't work. The initializer of the database is as follows:
if isDBEmpty:
    LOGGER.info("Initializing Database")
    session = dao.Session()
    model.Base.metadata.create_all(dao.Engine)
    session.commit()
    LOGGER.info("Database Default Tables created successfully!")
    dao.storeEntity(model.User(administrator_username, md5(administrator_password).hexdigest(),
                               administrator_email, True, model.ROLE_ADMINISTRATOR))
    LOGGER.info("Database Default Generic Values were stored!")
else:
    LOGGER.info("Database already has some data, will not be re-created at this startup!")
I'm guessing I'm missing something very basic here. Some help would be much appreciated.
Regards,
Bogdan