Bulk INSERT IGNORE using Flask-SQLAlchemy - python

I'm trying to update a database using API-gathered data, and I need to make sure all tables are being updated.
Sometimes I will receive data that's already in the database, so I want to do an INSERT IGNORE.
My current code is something like this:
def update_orders(new_orders):
    entries = []
    for each_order in new_orders:
        shipping_id = each_order['id']
        title = each_order['title']
        price = each_order['price']
        code = each_order['code']
        source = each_order['source']
        phone = each_order['phone']
        category = each_order['delivery_category']
        carrier = each_order['carrier_identifier']
        new_entry = Orders(
            id=shipping_id,
            title=title,
            code=code,
            source=source,
            phone=phone,
            category=category,
            carrier=carrier,
            price=price
        )
        entries.append(new_entry)
    if len(entries) == 0:
        print('No new orders.')
    else:
        print('New orders:', len(entries))
        db.session.add_all(entries)
        db.session.commit()
This works well when I'm creating the database from scratch, but it will give me an error if there's duplicate data, and I'm not able to commit the inserts.
I've been reading for a while, and found a workaround that uses prefix_with:
print('New orders:', len(entries))
if len(entries) == 0:
    print('No new orders.')
else:
    insert_command = Orders.__table__.insert().prefix_with('OR IGNORE').values(entries)
    db.session.execute(insert_command)
    db.session.commit()
The problem is that values(entries) is passed a list of ORM objects such as
<shop.database.models.Orders object at 0x11986def0>, that is, the in-memory class instances rather than the column values they hold.
Does anybody have any suggestions for approaching this problem?
Feel free to suggest a different approach, or just an adjustment.
Thanks a lot.

What database are you using? Under MySQL, "INSERT OR IGNORE" is not valid syntax; use "INSERT IGNORE" instead. I had the same situation and got my query to work with the following:
insert_command = Orders.__table__.insert().prefix_with(' IGNORE').values(entries)
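Whichever prefix is used, the rows passed to values() need to be plain dictionaries of column values rather than Orders instances. The effect of the statement itself can be sketched with the standard-library sqlite3 module. The orders table and its three columns below are a hypothetical, trimmed-down stand-in for the real model, and SQLite spells the prefix INSERT OR IGNORE where MySQL uses INSERT IGNORE.

```python
import sqlite3

# Hypothetical, trimmed-down stand-in for the Orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, title TEXT, price REAL)")
conn.execute("INSERT INTO orders VALUES (1, 'existing order', 9.5)")

# Rows are plain dicts of column values, not model instances.
new_orders = [
    {"id": 1, "title": "duplicate of an existing id", "price": 9.5},
    {"id": 2, "title": "new order", "price": 12.0},
]

with conn:  # commits on success, rolls back on error
    conn.executemany(
        "INSERT OR IGNORE INTO orders (id, title, price) VALUES (:id, :title, :price)",
        new_orders,
    )

# The row with the duplicate primary key was skipped without raising an error.
titles = [row[0] for row in conn.execute("SELECT title FROM orders ORDER BY id")]
print(titles)
```

In the question's code, the equivalent change would be to append a dict per order to entries instead of an Orders(...) instance.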

Related

Troubleshooting uncooperative For Loop of SQLalchemy results

Looking for a second set of eyes here. I cannot figure out why the following loop will not continue past the first iteration.
The 'servicestocheck' sqlalchemy query returns 45 rows in my test, but I cannot iterate through the results like I'm expecting... and no errors are returned. All of the functionality works on the first iteration.
Anyone have any ideas?
def serviceAssociation(current_contact_id, perm_contact_id):
    servicestocheck = oracleDB.query(PORTAL_CONTACT).filter(
        PORTAL_CONTACT.contact_id == current_contact_id
    ).order_by(PORTAL_CONTACT.serviceID).count()
    print(servicestocheck)  # returns 45 items
    servicestocheck = oracleDB.query(PORTAL_CONTACT).filter(
        PORTAL_CONTACT.contact_id == current_contact_id
    ).order_by(PORTAL_CONTACT.serviceID).all()
    for svc in servicestocheck:
        #
        # Check to see if already exists
        #
        check_existing_association = mysqlDB.query(
            CONTACTTOSERVICE).filter(CONTACTTOSERVICE.contact_id ==
            perm_contact_id, CONTACTTOSERVICE.serviceID ==
            svc.serviceID).first()
        #
        # If no existing association
        #
        if check_existing_association is None:
            print("Prepare Association")
            assoc_contact_id = perm_contact_id
            assoc_serviceID = svc.serviceID
            assoc_role_billing = False
            assoc_role_technical = False
            assoc_role_commercial = False
            if svc.contact_type == 'Billing':
                assoc_role_billing = True
            if svc.contact_type == 'Technical':
                assoc_role_technical = True
            if svc.contact_type == 'Commercial':
                assoc_role_commercial = True
            try:
                newAssociation = CONTACTTOSERVICE(
                    assoc_contact_id, assoc_serviceID,
                    assoc_role_billing, assoc_role_technical,
                    assoc_role_commercial)
                mysqlDB.add(newAssociation)
                mysqlDB.commit()
                mysqlDB.flush()
            except Exception as e:
                print(e)
This function is called from a script, and it is called from within another loop. I can't find any issues with nested loops.
Ended up being an issue with SQLAlchemy ORM (see SqlAlchemy not returning all rows when querying table object, but returns all rows when I query table object column)
I think the issue is that one of my tables above does not have a primary key in real life, and adding a fake one did not help. (I don't have access to the DB to add a key.)
Rather than fight it further... I went ahead and wrote raw SQL to move my project along.
This did the trick:
query = 'SELECT * FROM PORTAL_CONTACT WHERE contact_id = ' + str(current_contact_id) + ' ORDER BY contact_id ASC'
servicestocheck = oracleDB.execute(query)
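Building the statement by string concatenation is easy to get wrong (a missing space before ORDER BY is enough to break it) and leaves the query open to SQL injection. A minimal sketch of the same lookup with a bound parameter follows. It uses the standard-library sqlite3 module as a stand-in for the Oracle connection, the table contents are made up for illustration, and the placeholder syntax differs between drivers.

```python
import sqlite3

# In-memory stand-in for the Oracle database; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE PORTAL_CONTACT (contact_id INTEGER, serviceID INTEGER, contact_type TEXT)"
)
conn.executemany(
    "INSERT INTO PORTAL_CONTACT VALUES (?, ?, ?)",
    [(7, 30, "Billing"), (7, 10, "Technical"), (8, 20, "Commercial")],
)

current_contact_id = 7
# The driver substitutes the value for "?"; nothing is pasted into the SQL text.
query = (
    "SELECT serviceID, contact_type FROM PORTAL_CONTACT "
    "WHERE contact_id = ? ORDER BY serviceID ASC"
)
servicestocheck = conn.execute(query, (current_contact_id,)).fetchall()

for service_id, contact_type in servicestocheck:
    print(service_id, contact_type)
```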

Django: Insert or update database entries

I created the following function where I either create a new database entry or update it if event_pk already exists. Now I looked into update_or_create. However, that doesn't work in my case, as the other entries (yhat, etc.) always differ. Do you have any better idea to write it so I don't repeat myself as I do now? One more idea I had was that I could maybe save event=event_obj, yhat=event_forecast.get('yhat') etc. in a dict and unpack it. But didn't figure out how that could work.
def insert_forecast_data_to_db(self) -> None:
    """Insert or update forecast data in database."""
    forecast_data = self.get_forecast_data()
    for event_pk, event_forecast in forecast_data.items():
        event_obj = Event.objects.get(pk=event_pk)
        forecast_obj = Forecast.objects.filter(event=event_pk)
        if forecast_obj.exists():
            forecast_obj.update(
                event=event_obj,
                yhat=event_forecast.get('yhat'),
                yhat_lower=event_forecast.get('yhat_lower'),
                yhat_upper=event_forecast.get('yhat_upper'),
                img_key=event_forecast.get('img_key'),
            )
        else:
            Forecast.objects.create(
                event=event_obj,
                yhat=event_forecast.get('yhat'),
                yhat_lower=event_forecast.get('yhat_lower'),
                yhat_upper=event_forecast.get('yhat_upper'),
                img_key=event_forecast.get('img_key'),
            )
I think I just figured it out. That's what I was aiming for:
def insert_forecast_data_to_db(self) -> None:
    """Insert or update forecast data in database."""
    forecast_data = self.get_forecast_data()
    for event_pk, event_forecast in forecast_data.items():
        event_obj = Event.objects.get(pk=event_pk)
        data = {
            'event': event_obj,
            'yhat': event_forecast.get('yhat'),
            'yhat_lower': event_forecast.get('yhat_lower'),
            'yhat_upper': event_forecast.get('yhat_upper'),
            'img_key': event_forecast.get('img_key'),
        }
        forecast_obj = Forecast.objects.filter(event=event_pk)
        if forecast_obj.exists():
            forecast_obj.update(**data)
        else:
            Forecast.objects.create(**data)

Python Flask and SQLAlchemy, selecting all data from a column

I am attempting to query all rows for a column called show_id. I would then like to compare each potential item to be added to the DB with the results. Now the simplest way I can think of doing that is by checking if each show is in the results. If so pass etc. However the results from the below snippet are returned as objects. So this check fails.
Is there a better way to create the query to achieve this?
shows_inDB = Show.query.filter(Show.show_id).all()
print(shows_inDB)
Results:
<app.models.user.Show object at 0x10c2c5fd0>,
<app.models.user.Show object at 0x10c2da080>,
<app.models.user.Show object at 0x10c2da0f0>
Code for the entire function:
def save_changes_show(show_details):
    """
    Save the changes to the database
    """
    try:
        shows_inDB = Show.query.filter(Show.show_id).all()
        print(shows_inDB)
        for show in show_details:
            #Check the show isnt already in the DB
            if show['id'] in shows_inDB:
                print(str(show['id']) + ' Already Present')
            else:
                #Add show to DB
                tv_show = Show(
                    show_id = show['id'],
                    seriesName = str(show['seriesName']).encode(),
                    aliases = str(show['aliases']).encode(),
                    banner = str(show['banner']).encode(),
                    seriesId = str(show['seriesId']).encode(),
                    status = str(show['status']).encode(),
                    firstAired = str(show['firstAired']).encode(),
                    network = str(show['network']).encode(),
                    networkId = str(show['networkId']).encode(),
                    runtime = str(show['runtime']).encode(),
                    genre = str(show['genre']).encode(),
                    overview = str(show['overview']).encode(),
                    lastUpdated = str(show['lastUpdated']).encode(),
                    airsDayOfWeek = str(show['airsDayOfWeek']).encode(),
                    airsTime = str(show['airsTime']).encode(),
                    rating = str(show['rating']).encode(),
                    imdbId = str(show['imdbId']).encode(),
                    zap2itId = str(show['zap2itId']).encode(),
                    added = str(show['added']).encode(),
                    addedBy = str(show['addedBy']).encode(),
                    siteRating = str(show['siteRating']).encode(),
                    siteRatingCount = str(show['siteRatingCount']).encode(),
                    slug = str(show['slug']).encode()
                )
                db.session.add(tv_show)
                db.session.commit()
    except Exception:
        traceback.print_exc()
I have decided to use the method above and extract the data I wanted into a list, comparing each show to the list.
show_compare = []
shows_inDB = Show.query.filter().all()
for item in shows_inDB:
    show_compare.append(item.show_id)
for show in show_details:
    #Check the show isnt already in the DB
    if show['id'] in show_compare:
        print(str(show['id']) + ' Already Present')
    else:
        #Add show to DB
For querying a specific column value, have a look at this question: Flask SQLAlchemy query, specify column names. This is the example code given in the top answer there:
result = SomeModel.query.with_entities(SomeModel.col1, SomeModel.col2)
The crux of your problem is that you want to create a new Show instance if that show doesn't already exist in the database.
Querying the database for all shows and looping through the result for each potential new show might become very inefficient if you end up with a lot of shows in the database, and finding an object by identity is what an RDBMS does best!
This function will check to see if an object exists, and create it if not. Inspired by this answer:
def add_if_not_exists(model, **kwargs):
    if not model.query.filter_by(**kwargs).first():
        instance = model(**kwargs)
        db.session.add(instance)
So your example would look like:
def add_if_not_exists(model, **kwargs):
    if not model.query.filter_by(**kwargs).first():
        instance = model(**kwargs)
        db.session.add(instance)
for show in show_details:
    add_if_not_exists(Show, show_id=show['id'])
If you really want to query all shows upfront, you could collect the IDs into a set instead of a list, which will speed up your membership test.
E.g.:
show_compare = {item.show_id for item in Show.query.all()}
for show in show_details:
    # ... same as your code
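The set-based check can be tried without a database. In this sketch, existing_ids stands in for the show_id values a query would return and show_details is a made-up API payload; both are assumptions for illustration.

```python
# Stand-in for {item.show_id for item in Show.query.all()}.
existing_ids = {101, 102, 103}

# Made-up API payload.
show_details = [
    {"id": 102, "seriesName": "Already stored"},
    {"id": 204, "seriesName": "Brand new"},
]

to_add = []
for show in show_details:
    if show["id"] in existing_ids:  # hash lookup, no scan over every stored id
        print(str(show["id"]) + " Already Present")
    else:
        to_add.append(show)
        existing_ids.add(show["id"])  # also guards against repeats within one batch
```

Only the entries collected in to_add would then need to be turned into Show instances and committed.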

python sqlite3 Getting error: sqlite3.OperationalError: database is locked

Can you help me, I wrote this code:
class Favorits(object):
    def __init__(self, graph_):
        self.graph_ = graph_
        self.conn3 = sqlite3.connect('C:/C/V2.db')
    def add_(self):
        c3 = self.conn3.cursor()
        c3.execute("SELECT COUNT(*) FROM fav")
        t_count = c3.fetchall()
        self.conn3.commit()
        t_count = t_count[0][0]
        to_add_rus = self.graph_.text_rus.get('1.0', 'end')
        to_add_eng = self.graph_.text_eng.get('1.0', 'end')
        to_add_esp = self.graph_.text_esp.get('1.0', 'end')
        c3.execute("INSERT INTO fav VALUES(?,?,?,?)", (t_count + 1, to_add_rus, to_add_eng, to_add_esp))
        self.conn3.commit()
    def rem_(self):
        c4 = self.conn3.cursor()
        idx = (self.graph_.word_.f_0_to_remove)
        idx = idx[0]
        print(idx)
        c4.execute("DELETE FROM fav WHERE id_=?", (idx,))
        self.conn3.commit()
This is a class which I use to add and remove different rows from db (using Tkinter as GUI).
So basically I'm trying to make two different connections to the same db via two different cursors (in order to be able to add and remove words from it). And I constantly get this error:
self.conn3.commit()
sqlite3.OperationalError: database is locked
I already tried different options, making two different cursors, etc.; nothing helps.
You need to close the database connection once you are done with it. Otherwise the connection persists, and the database assumes an edit is still underway and stays locked.
You can close the connection with the command self.conn3.close().
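One way to guarantee the close is to open a short-lived connection per operation instead of keeping self.conn3 for the lifetime of the object. The following is a minimal sketch, not a drop-in replacement: the helper names, the column names other than id_, and the temporary database path are hypothetical.

```python
import os
import sqlite3
import tempfile
from contextlib import closing

# Hypothetical path for the demo; the question uses 'C:/C/V2.db'.
DB_PATH = os.path.join(tempfile.mkdtemp(), "V2.db")


def init_db():
    # closing() closes the connection; the inner "conn" commits or rolls back.
    with closing(sqlite3.connect(DB_PATH)) as conn, conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS fav "
            "(id_ INTEGER PRIMARY KEY, rus TEXT, eng TEXT, esp TEXT)"
        )


def add_favorite(rus, eng, esp):
    with closing(sqlite3.connect(DB_PATH)) as conn, conn:
        cur = conn.execute(
            "INSERT INTO fav (rus, eng, esp) VALUES (?, ?, ?)", (rus, eng, esp)
        )
        return cur.lastrowid  # id assigned by SQLite


def remove_favorite(idx):
    with closing(sqlite3.connect(DB_PATH)) as conn, conn:
        conn.execute("DELETE FROM fav WHERE id_ = ?", (idx,))


def count_favorites():
    with closing(sqlite3.connect(DB_PATH)) as conn:
        return conn.execute("SELECT COUNT(*) FROM fav").fetchone()[0]


init_db()
first = add_favorite("дом", "house", "casa")
second = add_favorite("кот", "cat", "gato")
remove_favorite(first)
print(count_favorites())
```

Letting SQLite assign id_ also sidesteps the COUNT(*) + 1 scheme from the question, which can produce an id that is already taken once rows have been deleted.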

Skipping a field on save (Django models, Insert and Update)

Given PostgreSQL 9.2.10, Django 1.8, python 2.7.5, the following model:
class soapProdAPI(models.Model):
    soap_id = models.PositiveIntegerField(primary_key=True)
    soap_host = models.CharField(max_length=20)
    soap_ip = models.GenericIPAddressField(default='0.0.0.0')
    soap_asset = models.CharField(max_length=20)
    soap_state = models.CharField(max_length=20)
And the following code:
tableProdSoap = soapProdQuery()
#periodic_task(run_every=timedelta(minutes=2))
def saveSoapProd():
    tableProdSoap = soapProdQuery()
    if tableProdSoap != None:
        for item in tableProdSoap:
            commit = soapProdAPI(soap_id=item[0], soap_host=item[1], soap_asset=item[2], soap_state=item[3])
            commit.save()
    saveSoapNullIP()
To answer Josué Padilla's question:
#task
def saveSoapNullIP():
    missingIP = soapProdAPI.objects.filter(soap_ip='0.0.0.0')
    if missingIP:
        for record in missingIP:
            if str(record.soap_host).lower().startswith('1a'):
                fqdn = str(record.soap_host) + 'stringvaluehere'
            elif str(record.soap_host).lower().startswith('1b'):
                fqdn = str(record.soap_host) + 'stringvaluehere'
            elif str(record.soap_host).lower().startswith('1c'):
                fqdn = str(record.soap_host) + 'stringvaluehere'
            else:
                fqdn = str(record.soap_host) + 'stringvaluehere'
            try:
                hostIp = check_output('host %s' % fqdn, shell=True)
                hostIp = hostIp.split()[-1]
            except:
                hostIp = '0.0.0.0'
            record.soap_ip = hostIp
            record.save(update_fields=['soap_ip'])
My soapProdQuery only returns these 4 fields, whereas the model has a 5th field (soap_ip). I know it is probably not the best way to do it, but I have a separate block of code that queries the db for missing (0.0.0.0) values in soap_ip, runs a subprocess host lookup on them, and saves the IP address back. The number of rows returned/updated should get smaller on each pass, as opposed to putting the host lookup into the request/this celery task itself, which would run on every API request. I have tried that already, and it takes FOREVER to return the completed data. The soap API I query does not provide the IP, or I would obviously grab it that way. This all runs as background tasks using celery to make it invisible/seamless to the web user.
The issue I run into is that every time the saveSoapProd() runs it overwrites the previous soap_ip field with '0.0.0.0' thus negating the work of my other function. The other issue is that I cannot force_insert or force_update as I need both functionalities with this. My question is this: is there a way to selectively update/insert at the same time and completely exclude doing anything to the soap_ip each time saveSoapProd() runs? Any and all help is greatly appreciated. Thank you in advance.
** EDIT 1 **
I may or may not have found a solution in update_or_create or get_or_create, however I am unsure on the exact usage. The docs have me slightly confused.
** EDIT 2 **
I guess get_or_create is a bust. Works first pass through but every save after that fails with this:
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "<console>", line 8, in saveSoapProd
File "/lib/python2.7/site-packages/django/db/models/base.py", line 690, in save
% ', '.join(non_model_fields))
ValueError: The following fields do not exist in this model or are m2m fields: soap_id
Here is the code:
#periodic_task(run_every=timedelta(minutes=2))
def saveSoapProd():
    tableProdSoap = soapProdQuery()
    if tableProdSoap != None:
        for item in tableProdSoap:
            obj, created = soapProdAPI.objects.get_or_create(soap_id=item[0], defaults={'soap_host': item[1], 'soap_asset': item[2], 'soap_state': item[3]})
            if created == False:
                commit = soapProdAPI(soap_id=item[0], soap_host=item[1], soap_asset=item[2], soap_state=item[3])
                commit.save(update_fields=['soap_id', 'soap_host', 'soap_asset', 'soap_state'])
I will be honest, I am not entirely sure what is causing this error.
** EDIT 3/CURRENT SOLUTION **
I was able to resolve my own issue by modifying my model and my task function. The solution uses get_or_create, but you could easily extrapolate how to use update_or_create from the solution provided. See the selected answer below for a coded example.
** TLDR **
I want to do a .save() that performs an insert for new records or an update for changed records WITHOUT touching the soap_ip field (no insert_only or update_only).
I don't know if you already knew this, but you can override the save() function of your model.
class soapProdAPI(models.Model):
    soap_id = models.PositiveIntegerField(primary_key=True)
    soap_host = models.CharField(max_length=20)
    soap_ip = models.GenericIPAddressField(default='0.0.0.0')
    soap_asset = models.CharField(max_length=20)
    soap_state = models.CharField(max_length=20)
    # Override save
    def save(self, *args, **kwargs):
        if self.soap_ip == '0.0.0.0':
            self.soap_ip = your_ip  # Here you can get your old IP and save that instead of 0.0.0.0
        # Without calling the parent save(), nothing is written to the database
        super(soapProdAPI, self).save(*args, **kwargs)
EDIT
You are getting
ValueError: The following fields do not exist in this model or are m2m fields: soap_id
because you are trying to update soap_id. That field is defined as your model's primary key, so it cannot be listed in update_fields. That's why it crashes when you do:
commit.save(update_fields=['soap_id', 'soap_host', 'soap_asset', 'soap_state'])
Try removing soap_id from update_fields.
Solved my own issue without modifying the save method by making the following changes to my model:
class soapProdAPI(models.Model):
    soap_id = models.PositiveIntegerField(unique=True, null=False)
    soap_host = models.CharField(max_length=20)
    soap_ip = models.GenericIPAddressField(default='0.0.0.0')
    soap_asset = models.CharField(max_length=20)
    soap_state = models.CharField(max_length=20)
and my task:
def saveSoapProd():
    tableProdSoap = soapProdQuery()
    if tableProdSoap != None:
        for item in tableProdSoap:
            try:
                obj, created = soapProdAPI.objects.get_or_create(soap_id=item[0], defaults={'soap_host': item[1], 'soap_asset': item[2], 'soap_state': item[3]})
                if created == False:
                    obj.soap_host = item[1]
                    obj.soap_asset = item[2]
                    obj.soap_state = item[3]
                    obj.save(update_fields=['soap_host', 'soap_asset', 'soap_state'])
            except:
                continue
    saveSoapMissingIP()
EDIT
Just noticed Josué Padilla's response, which was in fact part of my problem that I solved with this answer. Thank you to Josué for all of your help.
