Updating dynamically determined fields with peewee - python

I have a peewee model like the following:
class Parrot(Model):
    is_alive = BooleanField()
    bought = DateField()
    color = CharField()
    name = CharField()
    id = IntegerField()
I get this data from the user and look up the corresponding id in the (MySQL) database. What I want to do now is update only those attributes which are currently unset or empty. For example, if the new data has the following attributes:
is_alive = True
bought = '1965-03-14'
color = None
name = 'norwegian'
id = 17
and the data from the database has:
is_alive = False
bought = None
color = 'blue'
name = ''
id = 17
I would like to update the bought date and the name (which are unset or empty in the database), but without changing the is_alive status. In this case, I could fetch the new and old data into separate class instances, manually create a list of attributes, compare them one by one, update where necessary, and finally save the result to the database. However, I feel there might be a better way of handling this, one that would also work for any class with any attributes. Is there?

MySQL Solution:
UPDATE my_table SET
    bought = (CASE WHEN bought IS NULL OR bought = '' THEN ? ELSE bought END)
  , name   = (CASE WHEN name   IS NULL OR name   = '' THEN ? ELSE name   END)
  -- include other field values, if any, here
WHERE
    id = ?
Use your scripting language to set the parameter values.
If the bound parameters end up matching the stored values, MySQL skips the write for that row by default.
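A database-agnostic alternative is to do the same thing in peewee itself. The following is a minimal sketch (my own, not part of the answer above) that works for any model class by iterating over its fields via the _meta.fields mapping:

def fill_empty_fields(model_cls, new_data):
    # Fetch the stored row matching the incoming id.
    record = model_cls.get(model_cls.id == new_data['id'])
    for name in model_cls._meta.fields:
        if name == 'id':
            continue
        new_value = new_data.get(name)
        # Only overwrite fields that are currently unset or empty.
        if new_value not in (None, '') and getattr(record, name) in (None, ''):
            setattr(record, name, new_value)
    record.save()
    return record

# Hypothetical usage with the data from the question:
# fill_empty_fields(Parrot, {'id': 17, 'is_alive': True,
#                            'bought': '1965-03-14', 'name': 'norwegian'})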

Related

How to append to a list in on_conflict_do_update in SQLAlchemy?

Model:
class Example(Base):
    __tablename__ = "example"
    create_time = Column(DateTime, server_default=func.now())
    time_stamps = Column(MutableList.as_mutable(ARRAY(DateTime)), server_default="{}")
    update_time = Column(DateTime, server_default=func.now())
Now, when I insert a new example, I need to append its create_time to the time_stamps array, then sort the array to find the newest time and set that time as the new update_time.
I managed to do it in separate steps:
def update_record(db: Session, create_time: datetime, db_record: Example):
    db_record.time_stamps.append(create_time)
    sorted_times = sorted(db_record.time_stamps, reverse=True)
    db_record.update_time = sorted_times[0]
    db_record.time_stamps = sorted_times
    db.commit()
But I need to do it atomically, using an INSERT ... ON CONFLICT DO UPDATE clause.
So far I have:
db_dict = {"create_time": record.create_time,
           "time_stamps": [record.create_time],
           "update_time": record.create_time}
stm = insert(Example).values(db_dict)
do_update_stm = stm.on_conflict_do_update(constraint='my_unique_constraint',
                                          set_=dict(??))
My question is: how do I access and append to the values of the original conflicting row inside set_ in on_conflict_do_update in SQLAlchemy?
Thanks
In the end I bypassed SQLAlchemy by writing a textual query, in which I could use this syntax to append to the ARRAY:
.. DO UPDATE SET time_stamps = example.time_stamps || EXCLUDED.create_time,
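For reference, the same append can also be expressed through on_conflict_do_update itself, assuming PostgreSQL: the statement's excluded attribute exposes the row that was proposed for insertion, and func.array_append / func.greatest emit the corresponding PostgreSQL functions. A rough sketch under those assumptions, reusing db_dict and db from the question:

from sqlalchemy import func
from sqlalchemy.dialects.postgresql import insert

stm = insert(Example).values(db_dict)
do_update_stm = stm.on_conflict_do_update(
    constraint='my_unique_constraint',
    set_={
        # Example.time_stamps refers to the stored row; stm.excluded to the
        # row that conflicted. array_append yields the concatenated array.
        'time_stamps': func.array_append(Example.time_stamps,
                                         stm.excluded.create_time),
        # The newest time is either the stored update_time or the new one.
        'update_time': func.greatest(Example.update_time,
                                     stm.excluded.create_time),
    },
)
db.execute(do_update_stm)
db.commit()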

SQLAlchemy's "post_update" behaves differently with objects that have been expunged from a session

I'm trying to copy rows from one DB instance to another DB with an identical schema in a different environment. Two tables within this schema are linked in such a way that they result in mutually dependent rows. When these rows are inserted, the post_update runs afterward as expected, but the update statement sets the value of the ID field to None instead of the expected ID.
This only happens when using objects that have been expunged from a session. When using newly created objects, the post_update behaves exactly as expected.
Examples
I have a relationship set up that looks like this:
class Category(Base):
    __tablename__ = 'categories'
    id = Column(Integer, primary_key=True)
    top_product_id = Column(Integer, ForeignKey('products.id'))
    products = relationship('Product', primaryjoin='Product.category_id == Category.id', back_populates='category', cascade='all', lazy='selectin')
    top_product = relationship('Product', primaryjoin='Category.top_product_id == Product.id', post_update=True, cascade='all', lazy='selectin')

class Product(Base):
    __tablename__ = 'products'
    id = Column(Integer, primary_key=True)
    category_id = Column(Integer, ForeignKey('categories.id'))
    category = relationship('Category', primaryjoin='Product.category_id == Category.id', back_populates='products', cascade='all', lazy='selectin')
If I query a category and its related products from one DB and try to write them to another, the update of top_product_id doesn't behave as expected, and sets the value to None instead. The following code:
category = source_session.query(Category).filter(Category.id == 99).one()
source_session.expunge(category)
make_transient(category)
for product in category.products:
    make_transient(product)

# this step is necessary to prevent a foreign key error on the initial category insert
category.top_product_id = None
dest_session.add(category)
results in SQLAlchemy generating the following SQL:
INSERT INTO categories (name, top_product_id) VALUES (%s, %s)
('SomeCategoryName', None)
INSERT INTO products (name, category_id) VALUES (%s, %s)
('SomeProductName', 99)
UPDATE categories SET top_product_id=%s WHERE categories.id = %s
(None, 99)
But if I use newly created objects, everything works as expected.
category = Category()
product = Product()
category.name = 'SomeCategoryName'
product.name = 'SomeProductName'
product.category = category
category.top_product = product
dest_session.add(category)
results in:
INSERT INTO categories (name, top_product_id) VALUES (%s, %s)
('SomeCategoryName', None)
INSERT INTO products (name, category_id) VALUES (%s, %s)
('SomeProductName', 99)
UPDATE categories SET top_product_id=%s WHERE categories.id = %s
(1, 99)
Aside from this difference, everything behaves in the same way between these two actions. All other relationships are created properly, and IDs and foreign keys are set as expected. Only the top_product_id set in the UPDATE clause created by the post_update fails to behave as expected.
As an additional troubleshooting step, I tried:
Creating new objects
Adding them to a session
Flushing the session to the DB
Expunging the objects from the session
Unsetting the foreign key ID fields on the objects (to avoid the initial insert error) and making the objects transient
Re-adding the objects to the session
Re-flushing to the DB
On the first flush to the DB, the top_product_id is set properly. On the second, it's set to None. So this confirms that the issue is not with differences in the sessions, but something to do with expunging objects from sessions and making them transient. There must be something that does/doesn't happen during the expunge/make transient process that leaves these objects in a fundamentally different state and prevents post_update from behaving the way it should.
Any ideas on where to go from here would be appreciated.
I assume your Base class mixes in the name column?
Your goal is to make inspect(category).committed_state look like it does for newly created objects (except maybe for id attribute). Same for each product object.
In your "newly created objects" example, category's committed_state looks like this before flushing the session:
{'id': symbol('NEVER_SET'),
'name': symbol('NO_VALUE'),
'products': [],
'top_product': symbol('NEVER_SET')}
while product's committed_state looks like this:
{'category': symbol('NEVER_SET'),
'id': symbol('NEVER_SET'),
'name': symbol('NO_VALUE')}
To get the post-update behavior, you need to both expire category.top_product_id (to prevent it from being included in the INSERT) and fudge category.top_product's committed_state (to make SQLAlchemy believe that the value has changed and therefore needs to cause an UPDATE).
First, expire category.top_product_id before making category transient:
source_session.expire(category, ["top_product_id"])
Then fudge category.top_product's committed_state (this can happen before or after making category transient):
from sqlalchemy import inspect
from sqlalchemy.orm.base import NEVER_SET
inspect(category).committed_state.update(top_product=NEVER_SET)
Full example:
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, inspect
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import Session, make_transient, relationship
from sqlalchemy.orm.base import NEVER_SET

class Base(object):
    name = Column(String(50), nullable=False)

Base = declarative_base(cls=Base)

class Category(Base):
    __tablename__ = 'categories'
    id = Column(Integer, primary_key=True)
    top_product_id = Column(Integer, ForeignKey('products.id'))
    products = relationship('Product', primaryjoin='Product.category_id == Category.id', back_populates='category', cascade='all', lazy='selectin')
    top_product = relationship('Product', primaryjoin='Category.top_product_id == Product.id', post_update=True, cascade='all', lazy='selectin')

class Product(Base):
    __tablename__ = 'products'
    id = Column(Integer, primary_key=True)
    category_id = Column(Integer, ForeignKey('categories.id'), nullable=False)
    category = relationship('Category', primaryjoin='Product.category_id == Category.id', back_populates='products', cascade='all', lazy='selectin')

source_engine = create_engine('sqlite:///')
dest_engine = create_engine('sqlite:///', echo=True)

def fk_pragma_on_connect(dbapi_con, con_record):
    dbapi_con.execute('pragma foreign_keys=ON')

from sqlalchemy import event

for engine in [source_engine, dest_engine]:
    event.listen(engine, 'connect', fk_pragma_on_connect)

Base.metadata.create_all(bind=source_engine)
Base.metadata.create_all(bind=dest_engine)

source_session = Session(bind=source_engine)
dest_session = Session(bind=dest_engine)

source_category = Category(id=99, name='SomeCategoryName')
source_product = Product(category=source_category, id=100, name='SomeProductName')
source_category.top_product = source_product
source_session.add(source_category)
source_session.commit()
source_session.close()

# If you want to test UPSERTs in dest_session:
# dest_category = Category(id=99, name='PrevCategoryName')
# dest_product = Product(category=dest_category, id=100, name='PrevProductName')
# dest_category.top_product = dest_product
# dest_session.add(dest_category)
# dest_session.commit()
# dest_session.close()

category = source_session.query(Category).filter(Category.id == 99).one()

# Ensure relationship attributes are initialized before we make objects transient.
_ = category.top_product

# source_session.expire(category, ['id'])  # only if you want new IDs in dest_session
source_session.expire(category, ['top_product_id'])

for product in category.products:
    # Ensure relationship attributes are initialized before we make objects transient.
    _ = product.category
    # source_session.expire(product, ['id'])  # only if you want new IDs in dest_session
    # Not strictly needed as long as Product.category is not a post-update relationship.
    source_session.expire(product, ['category_id'])

make_transient(category)
inspect(category).committed_state.update(top_product=NEVER_SET)

for product in category.products:
    make_transient(product)
    # Not strictly needed as long as Product.category is not a post-update relationship.
    inspect(product).committed_state.update(category=NEVER_SET)

dest_session.add(category)
# Or, if you want UPSERT (must retain original IDs in this case):
# dest_session.merge(category)
dest_session.flush()
Which produces this DML in dest_session:
INSERT INTO categories (name, id, top_product_id) VALUES (?, ?, ?)
('SomeCategoryName', 99, None)
INSERT INTO products (name, id, category_id) VALUES (?, ?, ?)
('SomeProductName', 100, 99)
UPDATE categories SET top_product_id=? WHERE categories.id = ?
(100, 99)
It seems like make_transient should reset committed_state to be as if it were a new object, but I guess not.

Mapping Class Column Headers in Python SQLAlchemy from CSV import

I set up the column names in the class like below:
class Stat1(Base):
    __tablename__ = 'stat1'
    __table_args__ = {'sqlite_autoincrement': True}

    id = Column(VARCHAR, primary_key=True, nullable=False)
    Date_and_Time = Column(VARCHAR)
    IP_Address = Column(VARCHAR)
    Visitor_Label = Column(VARCHAR)
    Browser = Column(VARCHAR)
    Version = Column(VARCHAR)
The CSV file, downloaded from the internet, does not use underscores in its column names. For instance, the header corresponding to "Date_and_Time" is imported as "Date and Time".
I had assumed (wrongly, it seems) that the CSV's column names would map onto the class attributes I set up, but that's not happening, and queries fail because of it. I am getting messages like this:
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such column: stat1.Date_and_Time
[SQL: 'SELECT stat1.id AS stat1_id, stat1."Date_and_Time" AS "stat1_Date_and_Time",
stat1."IP_Address" AS "stat1_IP_Address"...etc.
Is there a way to map these automatically so that queries are successful? Or a way to change the CSV's column headings automatically to insert an UNDERSCORE in the column headings to match with the columns defined in the Class?
There are a couple of different ways that you can approach this:
Implement Your Own De-serialization Logic
This means that you define your model class manually (as in your question) and then read and map your CSV file to its attributes using your own custom code.
I think, in this scenario, having underscores in your model class attributes (Stat1.Date_and_Time) but not in your CSV header (...,"Date and Time",...) will complicate your code a bit. However, depending on how you've implemented your mapping code you can set your Column to use one model attribute name (Stat1.Date_and_Time)
and a different database column name (e.g. have Stat1.Date_and_Time map to your database column "Date and Time"). To accomplish this, you need to pass the name argument as below:
class Stat1(Base):
    __tablename__ = 'stat1'
    __table_args__ = {'sqlite_autoincrement': True}

    id = Column(name='id', type_=VARCHAR, primary_key=True, nullable=False)
    Date_and_Time = Column(name='Date and Time', type_=VARCHAR)
    IP_Address = Column(name='IP Address', type_=VARCHAR)
    # etc.
Now when you read records from your CSV file, you will need to load them into the appropriate model attributes in your Stat1 class. A pseudo-code example would be:
id, date_and_time, ip_address = read_csv_record(csv_record)
# Let's assume the "read_csv_record()" function reads your CSV record and
# returns the appropriate values for `id`, `Date_and_Time`, and `IP_Address`.
my_record = Stat1(id=id,
                  Date_and_Time=date_and_time,
                  IP_Address=ip_address,
                  # etc.
                  )
Here, the trick is in implementing your read_csv_record() function so that it reads and returns the column values for your model attributes, so that you can then pass them appropriately to your Stat1() constructor.
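As a rough illustration (my own sketch, not part of the original answer), one way to implement that mapping is to re-key each row as it is read, assuming the CSV headers differ from the model attribute names only by spaces in place of underscores:

import csv

def iter_csv_records(path):
    # Re-key each row so a header like "Date and Time" becomes
    # "Date_and_Time", matching the Stat1 attribute names defined above.
    with open(path, newline='') as f:
        for row in csv.DictReader(f):
            yield {key.replace(' ', '_'): value for key, value in row.items()}

# Hypothetical usage: build model instances directly from the re-keyed rows.
records = [Stat1(**row) for row in iter_csv_records('your_csv_file.csv')]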
Use SQLAthanor
An (I think easier) alternative to implementing your own de-serialization solution is to use a library like SQLAthanor (full disclosure: I'm the library's author, so I'm a bit biased). Using SQLAthanor, you can either:
Create your Stat1 model class programmatically:
from sqlathanor import generate_model_from_csv

Stat1 = generate_model_from_csv('your_csv_file.csv',
                                'stat1',
                                primary_key='id')
Please note, however, that if your column header names are not ANSI SQL standard column names (if they contain spaces, for example), this will likely produce an error.
Define your model, and then create instances from your CSV.
To do this, you would define your model very similarly to how you do above:
from sqlathanor import BaseModel

class Stat1(BaseModel):
    __tablename__ = 'stat1'
    __table_args__ = {'sqlite_autoincrement': True}

    id = Column(name='id', type_=VARCHAR, primary_key=True, nullable=False, supports_csv=True, csv_sequence=1)
    Date_and_Time = Column(name='Date and Time', type_=VARCHAR, supports_csv=True, csv_sequence=2)
    IP_Address = Column(name='IP Address', type_=VARCHAR, supports_csv=True, csv_sequence=3)
    # etc.
The supports_csv argument tells your Stat1 class that model attribute Stat1.id can be de-serialized from (and serialized to) CSV, and the csv_sequence argument indicates that it will always be the first column in a CSV record.
Now you can create a new Stat1 instance (a record in your database) by passing your CSV record to Stat1.new_from_csv():
# let's assume you have loaded a single CSV record into a variable "csv_record"
my_record = Stat1.new_from_csv(csv_record)
and that's it! Now your my_record variable will contain an object representation of your CSV record, which you can then commit to the database if and when you choose. Since there is a wide variety of ways that CSV files can be constructed (using different delimiters, wrapping strategies, etc.) there are a large number of configuration arguments that can be supplied to .new_from_csv(), but you can find all of them documented here: https://sqlathanor.readthedocs.io/en/latest/using.html#new-from-csv
SQLAthanor is an extremely robust library for moving data into / out of CSV and SQLAlchemy, so I strongly recommend you review the documentation. Here are the important links:
Github Repo
Comprehensive Documentation
PyPi
Hope this helps!

Static table in peewee

I want to store an enum in the database, according to this.
So let's say I have a Gender enum and a Person model, and I want to do a select like Person.select().where(Person.gender == Gender.MALE).
This can be achieved by creating a GenderField in Person as described here, but then gender won't be in the database as a table, and I want Person to have a foreign key to a Gender table.
So how can I store static Gender data in the database, then query the Person table by the enum values?
You can, as part of the table creation process, populate the gender table:
class Gender(Model):
    label = CharField(primary_key=True)

    class Meta:
        database = db

def create_schema():
    # Use an execution context to ensure the connection is closed when
    # we're finished.
    with db.execution_context():
        db.create_tables([Gender, ...], True)
        if not Gender.select().exists():
            defaults = ['MALE', 'FEMALE', 'UNSPECIFIED']
            Gender.insert_many([{'label': label} for label in defaults]).execute()
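Person can then reference the table with a ForeignKeyField, and because Gender's primary key is the label itself, you can compare the column against an enum's value directly. A minimal sketch, assuming a Python enum.Enum whose values match the seeded labels:

import enum

class GenderEnum(enum.Enum):
    MALE = 'MALE'
    FEMALE = 'FEMALE'
    UNSPECIFIED = 'UNSPECIFIED'

class Person(Model):
    name = CharField()
    gender = ForeignKeyField(Gender)

    class Meta:
        database = db

# The foreign key column stores the label string, so the enum's value can
# be compared against it directly.
query = Person.select().where(Person.gender == GenderEnum.MALE.value)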
The only reason to make Gender a table in this example, though, would be if you plan to dynamically add new Gender labels at run-time. If the set of values is fixed, my opinion is you're better off doing something like this:
class Person(Model):
    GENDER_MALE = 'M'
    GENDER_FEMALE = 'F'
    GENDER_UNSPECIFIED = 'U'

    name = CharField()
    gender = CharField()

# Then, simply:
Person.create(name='Huey', gender=Person.GENDER_MALE)
Person.create(name='Zaizee', gender=Person.GENDER_UNSPECIFIED)
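Querying with the class-level constants is then a plain equality filter (a brief usage sketch):

# Select everyone whose gender was stored with the GENDER_MALE constant.
males = Person.select().where(Person.gender == Person.GENDER_MALE)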

Inserting a record into a MySQL table with auto-increment via SQLAlchemy throws an error

I am trying to insert a row into a table which has an auto_increment column as well as some foreign keys. All the foreign keys exist, but the insert throws an error.
sqlalchemy.orm.exc.FlushError: Instance <Stock at 0x9cf062c> has a
NULL identity key. If this is an auto-generated value, check that the
database table allows generation of new primary key values, and that
the mapped Column object is configured to expect these generated
values. Ensure also that this flush() is not occurring at an
inappropriate time, such as within a load() event.
Even inserting the record directly in MySQL, by copy-pasting the SQL produced by echo=True, executes fine.
Stock Class
class Stock(Base):
    __tablename__ = 'Stock'

    Code = Column('Code', String(8), primary_key=True)
    Symbol = Column('Symbol', String(128))
    ListingName = Column('ListingName', String(256))
    ListingDate = Column('ListingDate', DateTime())
    RecordAddedDate = Column('RecordAddedDate', DateTime())
    HomeCountry = Column('HomeCountry', ForeignKey('Country.Code'))
    PrimaryExchange = Column('PrimaryExchange', ForeignKey('Exchange.Code'))
    BaseCurrency = Column('BaseCurrency', ForeignKey('Currency.Code'))
    InstrumentType = Column('InstrumentType', ForeignKey('Instrument.InstrumentType'))
Record insertion
Engine = sqlalchemy.create_engine('mysql://user:pass@host/db', echo=True)
Session = sqlalchemy.orm.sessionmaker(bind=Engine)
SessionObj = Session()

NewStock = Stock()
NewStock.InstrumentType = 'Stock'
NewStock.Symbol = 'MSFT'
NewStock.ListingName = 'Microsoft'
NewStock.HomeCountry = 'IN'
NewStock.PrimaryExchange = 'NSEOI'
NewStock.BaseCurrency = 'INR'
NewStock.ListingDate = datetime.datetime.now().strftime("%Y%m%d")
NewStock.RecordAddedDate = datetime.datetime.now().strftime("%Y%m%d")

print NewStock
SessionObj.add(NewStock)
SessionObj.flush()
print NewStock.Code
Add autoincrement=True to your column.
Got it. I had the column type as String; after converting it to Integer, it worked fine.
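For completeness, here is a minimal sketch of the fix both answers converge on; MySQL can only auto-generate values for an integer primary key:

from sqlalchemy import Column, Integer, String

class Stock(Base):
    __tablename__ = 'Stock'

    # The auto-increment primary key must be an Integer column for MySQL
    # to generate new values for it.
    Code = Column('Code', Integer, primary_key=True, autoincrement=True)
    Symbol = Column('Symbol', String(128))
    # ... remaining columns unchanged ...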
