Python Flask and SQLAlchemy, selecting all data from a column - python

I am attempting to query all rows for a column called show_id. I would then like to compare each potential item to be added to the DB with the results. The simplest way I can think of is to check whether each show is already in the results and, if so, skip it. However, the snippet below returns model objects rather than show_id values, so this check fails.
Is there a better way to create the query to achieve this?
shows_inDB = Show.query.filter(Show.show_id).all()
print(shows_inDB)
Results:
<app.models.user.Show object at 0x10c2c5fd0>,
<app.models.user.Show object at 0x10c2da080>,
<app.models.user.Show object at 0x10c2da0f0>
Code for the entire function:
def save_changes_show(show_details):
    """
    Save the changes to the database
    """
    try:
        shows_inDB = Show.query.filter(Show.show_id).all()
        print(shows_inDB)
        for show in show_details:
            #Check the show isnt already in the DB
            if show['id'] in shows_inDB:
                print(str(show['id']) + ' Already Present')
            else:
                #Add show to DB
                tv_show = Show(
                    show_id = show['id'],
                    seriesName = str(show['seriesName']).encode(),
                    aliases = str(show['aliases']).encode(),
                    banner = str(show['banner']).encode(),
                    seriesId = str(show['seriesId']).encode(),
                    status = str(show['status']).encode(),
                    firstAired = str(show['firstAired']).encode(),
                    network = str(show['network']).encode(),
                    networkId = str(show['networkId']).encode(),
                    runtime = str(show['runtime']).encode(),
                    genre = str(show['genre']).encode(),
                    overview = str(show['overview']).encode(),
                    lastUpdated = str(show['lastUpdated']).encode(),
                    airsDayOfWeek = str(show['airsDayOfWeek']).encode(),
                    airsTime = str(show['airsTime']).encode(),
                    rating = str(show['rating']).encode(),
                    imdbId = str(show['imdbId']).encode(),
                    zap2itId = str(show['zap2itId']).encode(),
                    added = str(show['added']).encode(),
                    addedBy = str(show['addedBy']).encode(),
                    siteRating = str(show['siteRating']).encode(),
                    siteRatingCount = str(show['siteRatingCount']).encode(),
                    slug = str(show['slug']).encode()
                )
                db.session.add(tv_show)
                db.session.commit()
    except Exception:
        print(traceback.print_exc())

I have decided to use the method above and extract the data I wanted into a list, comparing each show to the list.
show_compare = []
shows_inDB = Show.query.filter().all()
for item in shows_inDB:
    show_compare.append(item.show_id)
for show in show_details:
    #Check the show isnt already in the DB
    if show['id'] in show_compare:
        print(str(show['id']) + ' Already Present')
    else:
        #Add show to DB

For querying a specific column value, have a look at this question: Flask SQLAlchemy query, specify column names. This is the example code given in the top answer there:
result = SomeModel.query.with_entities(SomeModel.col1, SomeModel.col2)
The crux of your problem is that you want to create a new Show instance if that show doesn't already exist in the database.
Querying the database for all shows and looping through the result for each potential new show might become very inefficient if you end up with a lot of shows in the database, and finding an object by identity is what an RDBMS does best!
This function will check to see if an object exists, and create it if not. Inspired by this answer:
def add_if_not_exists(model, **kwargs):
    if not model.query.filter_by(**kwargs).first():
        instance = model(**kwargs)
        db.session.add(instance)
So your example would look like:
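def add_if_not_exists(model, **kwargs):
    if not model.query.filter_by(**kwargs).first():
        instance = model(**kwargs)
        db.session.add(instance)

for show in show_details:
    # The model's column is show_id, so filter on that name
    add_if_not_exists(Show, show_id=show['id'])
# Persist the pending inserts once the loop is done
db.session.commit()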
If you really want to query all shows upfront, you could put the IDs into a set instead of a list, which will speed up your membership test.
E.g.:
show_compare = {item.show_id for item in Show.query.all()}
for show in show_details:
    # ... same as your code
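To see why the original check fails and the set version works, here is a minimal sketch that runs without a database. The Show class below is a hypothetical stand-in for the ORM model, and the sample IDs are made up for illustration.

```python
class Show:
    """Stand-in for the ORM model; only show_id matters here."""

    def __init__(self, show_id):
        self.show_id = show_id


# Pretend these rows came back from Show.query.all()
shows_in_db = [Show(101), Show(102)]
show_details = [{"id": 102}, {"id": 103}]

# Comparing an integer ID against model objects never matches
found_in_objects = 102 in shows_in_db

# Extract the IDs into a set first, then test membership against it
known_ids = {item.show_id for item in shows_in_db}
to_add = [show["id"] for show in show_details if show["id"] not in known_ids]
```

Here only the show with ID 103 ends up in to_add, because 102 is already in the set of known IDs.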

Related

Troubleshooting uncooperative For Loop of SQLalchemy results

Looking for a second set of eyes here. I cannot figure out why the following loop will not continue past the first iteration.
The 'servicestocheck' sqlalchemy query returns 45 rows in my test, but I cannot iterate through the results like I'm expecting... and no errors are returned. All of the functionality works on the first iteration.
Anyone have any ideas?
def serviceAssociation(current_contact_id,perm_contact_id):
    servicestocheck = oracleDB.query(PORTAL_CONTACT).filter(
        PORTAL_CONTACT.contact_id == current_contact_id
    ).order_by(PORTAL_CONTACT.serviceID).count()
    print(servicestocheck) # returns 45 items
    servicestocheck = oracleDB.query(PORTAL_CONTACT).filter(
        PORTAL_CONTACT.contact_id == current_contact_id
    ).order_by(PORTAL_CONTACT.serviceID).all()
    for svc in servicestocheck:
        #
        # Check to see if already exists
        #
        check_existing_association = mysqlDB.query(
            CONTACTTOSERVICE).filter(CONTACTTOSERVICE.contact_id ==
            perm_contact_id,CONTACTTOSERVICE.serviceID ==
            svc.serviceID).first()
        #
        # If no existing association
        #
        if check_existing_association is None:
            print ("Prepare Association")
            assoc_contact_id = perm_contact_id
            assoc_serviceID = svc.serviceID
            assoc_role_billing = False
            assoc_role_technical = False
            assoc_role_commercial = False
            if svc.contact_type == 'Billing':
                assoc_role_billing = True
            if svc.contact_type == 'Technical':
                assoc_role_technical = True
            if svc.contact_type == 'Commercial':
                assoc_role_commercial = True
            try:
                newAssociation = CONTACTTOSERVICE(
                    assoc_contact_id, assoc_serviceID,
                    assoc_role_billing,assoc_role_technical,
                    assoc_role_commercial)
                mysqlDB.add(newAssociation)
                mysqlDB.commit()
                mysqlDB.flush()
            except Exception as e:
                print(e)
This function is called from a script, and it is called from within another loop. I can't find any issues with nested loops.
This ended up being an issue with the SQLAlchemy ORM (see SqlAlchemy not returning all rows when querying table object, but returns all rows when I query table object column).
I think the issue is that one of my tables above does not have a primary key in real life, and adding a fake one did not help. (I don't have access to the DB to add a key.)
Rather than fight it further, I went ahead and wrote raw SQL to move my project along.
This did the trick:
# Note the leading space before ORDER BY; without it the ID and the keyword run together
query = 'SELECT * FROM PORTAL_CONTACT WHERE contact_id = ' + str(current_contact_id) + ' ORDER BY contact_id ASC'
servicestocheck = oracleDB.execute(query)
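Building the statement by string concatenation is easy to get wrong and is open to SQL injection, so a bound parameter is usually the safer choice. The sketch below uses the standard-library sqlite3 module as a stand-in for the Oracle session, with a made-up two-column table; with SQLAlchemy the same idea would be expressed with text() and a named bind parameter.

```python
import sqlite3

# In-memory database standing in for the real Oracle connection
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PORTAL_CONTACT (contact_id INTEGER, serviceID INTEGER)")
conn.executemany(
    "INSERT INTO PORTAL_CONTACT (contact_id, serviceID) VALUES (?, ?)",
    [(7, 30), (7, 10), (8, 20), (7, 20)],
)


def services_for_contact(conn, current_contact_id):
    """Return all rows for one contact, using a named bound parameter."""
    query = (
        "SELECT contact_id, serviceID FROM PORTAL_CONTACT "
        "WHERE contact_id = :contact_id ORDER BY serviceID ASC"
    )
    return conn.execute(query, {"contact_id": current_contact_id}).fetchall()


rows = services_for_contact(conn, 7)
```

The driver substitutes the value itself, so there is no quoting or spacing to manage by hand.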

Django: Insert or update database entries

I created the following function, where I either create a new database entry or update it if event_pk already exists. I looked into update_or_create; however, that doesn't seem to work in my case, as the other fields (yhat, etc.) always differ. Do you have a better idea for writing this so that I don't repeat myself as I do now? One more idea I had was to store event=event_obj, yhat=event_forecast.get('yhat'), etc. in a dict and unpack it, but I couldn't figure out how that would work.
def insert_forecast_data_to_db(self) -> None:
    """Insert or update forecast data in database."""
    forecast_data = self.get_forecast_data()
    for event_pk, event_forecast in forecast_data.items():
        # Look up the event by the loop variable event_pk
        event_obj = Event.objects.get(pk=event_pk)
        forecast_obj = Forecast.objects.filter(event=event_pk)
        if forecast_obj.exists():
            forecast_obj.update(
                event=event_obj,
                yhat=event_forecast.get('yhat'),
                yhat_lower=event_forecast.get('yhat_lower'),
                yhat_upper=event_forecast.get('yhat_upper'),
                img_key=event_forecast.get('img_key'),
            )
        else:
            Forecast.objects.create(
                event=event_obj,
                yhat=event_forecast.get('yhat'),
                yhat_lower=event_forecast.get('yhat_lower'),
                yhat_upper=event_forecast.get('yhat_upper'),
                img_key=event_forecast.get('img_key'),
            )
I think I just figured it out. That's what I was aiming for:
def insert_forecast_data_to_db(self) -> None:
    """Insert or update forecast data in database."""
    forecast_data = self.get_forecast_data()
    for event_pk, event_forecast in forecast_data.items():
        event_obj = Event.objects.get(pk=event_pk)
        data = {
            'event': event_obj,
            'yhat': event_forecast.get('yhat'),
            'yhat_lower': event_forecast.get('yhat_lower'),
            'yhat_upper': event_forecast.get('yhat_upper'),
            'img_key': event_forecast.get('img_key'),
        }
        forecast_obj = Forecast.objects.filter(event=event_pk)
        if forecast_obj.exists():
            forecast_obj.update(**data)
        else:
            Forecast.objects.create(**data)

Bulk INSERT IGNORE using Flask-SQLAlchemy

I'm trying to update a database using API-gathered data, and I need to make sure all tables are being updated.
Sometimes I will receive data that's already in the database, so I want to do an INSERT IGNORE.
My current code is something like this:
def update_orders(new_orders):
    entries = []
    for each_order in new_orders:
        shipping_id = each_order['id']
        title = each_order['title']
        price = each_order['price']
        code = each_order['code']
        source = each_order['source']
        phone = each_order['phone']
        category = each_order['delivery_category']
        carrier = each_order['carrier_identifier']
        new_entry = Orders(
            id=shipping_id,
            title=title,
            code=code,
            source=source,
            phone=phone,
            category=category,
            carrier=carrier,
            price=price
        )
        entries.append(new_entry)
    # This check runs after the loop, so no break is needed (or allowed) here
    if len(entries) == 0:
        print('No new orders.')
    else:
        print('New orders:', len(entries))
        db.session.add_all(entries)
        db.session.commit()
This works well when I'm creating the database from scratch, but it will give me an error if there's duplicate data, and I'm not able to commit the inserts.
I've been reading for a while, and found a workaround that uses prefix_with:
print('New orders:', len(entries))
if len(entries) == 0:
    print('No new orders.')
else:
    insert_command = Orders.__table__.insert().prefix_with('OR IGNORE').values(entries)
    db.session.execute(insert_command)
    db.session.commit()
The problem is that values(entries) receives a list of ORM objects, such as
<shop.database.models.Orders object at 0x11986def0>, rather than the column values those objects hold.
Does anybody have any suggestions for approaching this problem?
Feel free to suggest a different approach, or just an adjustment.
Thanks a lot.
What database are you using? Under MySQL, "INSERT OR IGNORE" is not valid syntax; one should use "INSERT IGNORE" instead. I had the same situation and got my query to work with the following:
insert_command = Orders.__table__.insert().prefix_with(' IGNORE').values(entries)
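On the question of what to pass in: a Core insert works with plain column values, typically a list of dictionaries, rather than ORM instances. The sketch below illustrates that idea with the standard-library sqlite3 module instead of Flask-SQLAlchemy, so it runs on its own. The three-column orders table and the sample rows are assumptions made for the example, and "INSERT OR IGNORE" is the SQLite spelling.

```python
import sqlite3

# In-memory database with one order already stored
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, title TEXT, price REAL)")
conn.execute("INSERT INTO orders (id, title, price) VALUES (1, 'existing', 9.5)")

# Plain dictionaries of column values, as received from the API
new_orders = [
    {"id": 1, "title": "duplicate", "price": 1.0},
    {"id": 2, "title": "new order", "price": 20.0},
]

# Rows whose primary key already exists are skipped instead of raising an error
conn.executemany(
    "INSERT OR IGNORE INTO orders (id, title, price) VALUES (:id, :title, :price)",
    new_orders,
)
conn.commit()

rows = conn.execute("SELECT id, title FROM orders ORDER BY id").fetchall()
```

The existing row with id 1 keeps its original title, and only the order with id 2 is added.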

How to change one value in one place and use it in several functions?

I'm writing test automation for an API in BDD with behave. I need a switch between environments. Is there any way to change one value in one place without adding this value to every function? Example:
I've tried to do it by adding the value to every function, but it makes the whole project very complicated.
headers = {
    'Content-Type': 'application/json',
    'country': 'fi'
}
What I want is to switch only the country value in headers, e.g. from 'fi' to 'es',
and then all functions should switch themselves to the es environment, e.g.
def sending_post_request(endpoint, user):
    url = fi_api_endpoints.api_endpoints_list.get(endpoint)
    personalId = {'personalId': user}
    json_post = requests.post(url,
                              headers=headers,
                              data=json.dumps(personalId)
                              )
    endpoint_message = json_post.text
    server_status = json_post.status_code

def phone_number(phone_number_status):
    if phone_number_status == 'wrong':
        cursor = functions_concerning_SQL_conection.choosen_db('fi_sql_identity')
        cursor.execute("SELECT TOP 1 PersonalId from Registrations where PhoneNumber is NULL")
        result = cursor.fetchone()
        user_with_no_phone_number = result[0]
        return user_with_no_phone_number
    else:
        cursor = functions_concerning_SQL_conection.choosen_db('fi_sql_identity')
        cursor.execute("SELECT TOP 1 PersonalId from Registrations where PhoneNumber is not NULL")
        result = cursor.fetchone()
        user_with_phone_number = result[0]
        return user_with_phone_number
When I change 'fi' to 'es' in headers, I want:
fi_sql_identity to change to es_sql_identity
url = fi_api_endpoints.api_endpoints_list.get(endpoint) to change to
url = es_api_endpoints.api_endpoints_list.get(endpoint)
Thanks, and please help.
With respect to your original question, one solution for this case is a closure:
def f(x):
    def long_calculation(y):
        return x * y
    return long_calculation

# create different functions without dispatching multiple times
g = f(val_1)
h = f(val_2)
g(val_3)
h(val_3)
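Applied to the region question, the closure idea might look like the following sketch. make_header_builder is a hypothetical helper name introduced only for this illustration; the header keys are taken from the question.

```python
def make_header_builder(region):
    """Return a function that builds request headers for one fixed region."""

    def build_headers():
        # region is captured from the enclosing call, so callers never pass it again
        return {"Content-Type": "application/json", "country": region}

    return build_headers


# One builder per environment, created in a single place
fi_headers = make_header_builder("fi")
es_headers = make_header_builder("es")

fi_result = fi_headers()
es_result = es_headers()
```

Each builder remembers the region it was created with, so switching environments means choosing a different builder rather than editing every function.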
Well, the problem is why do you hardcode everything? With the update you can simplify your function as:
def phone_number(phone_number_status, db_name='fi_sql_identity'):
    cursor = functions_concerning_SQL_conection.choosen_db(db_name)
    if phone_number_status == 'wrong':
        sql = "SELECT TOP 1 PersonalId from Registrations where PhoneNumber is NULL"
    else:
        sql = "SELECT TOP 1 PersonalId from Registrations where PhoneNumber is not NULL"
    cursor.execute(sql)
    result = cursor.fetchone()
    return result[0]
Also, please don't write code like this:
# WRONG
fi_db_conn.send_data()
But use a parameter:
region = 'fi' # or "es"
db_conn = initialize_conn(region)
db_conn.send_data()
And use a config file to store your endpoints with respect to your region, e.g. consider YAML:
# config.yml
es:
  db_name: es_sql_identity
fi:
  db_name: fi_sql_identity
Then use them in Python:
import yaml

with open('config.yml') as f:
    config = yaml.safe_load(f)

region = 'fi'
db_name = config[region]['db_name'] # "fi_sql_identity"
# status = ...
result = phone_number(status, db_name)
See additional useful link for using YAML.
First, encapsulate how the resources of a region are accessed, and give that encapsulation a region parameter. It may also be a good idea to provide this functionality as a behave fixture.
CASE 1: region parameter needs to vary between features / scenarios
For example, this means that SCENARIO_1 needs region="fi" and SCENARIO_2 needs region="es".
Use fixture and fixture-tag with region parameter.
In this case you need to write your own scenarios for each region (BAD TEST REUSE)
or use a ScenarioOutline as template to let behave generate the tests for you (by using a fixture-tag with a region parameter value for example).
CASE 2: region parameter is constant for all features / scenarios (during test-run)
You can support multiple test-runs with different region parameters by using a userdata parameter.
Look at behave userdata concept.
This allows you to run behave -D region=fi ... and behave -D region=es ...
This case provides a better reuse of testsuite, meaning a large part of the testsuite is the common testsuite that is applied to all regions.
HINT: Your code examples are too specific ("fi" based) which is a BAD-SMELL.
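For CASE 2, a minimal sketch of an environment.py hook might look like this. It assumes the region is passed as behave -D region=es and read through context.config.userdata. The attribute names stored on the context (region, db_name, headers) are choices made for this example, and the fake context at the end only stands in for the object behave would normally supply.

```python
from types import SimpleNamespace


def before_all(context):
    """behave hook: runs once before the test run starts."""
    # Fall back to "fi" when no -D region=... was given on the command line
    region = context.config.userdata.get("region", "fi")
    context.region = region
    context.db_name = f"{region}_sql_identity"
    context.headers = {"Content-Type": "application/json", "country": region}


# Stand-in for the context behave would pass in, used only to try the hook out
fake_context = SimpleNamespace(config=SimpleNamespace(userdata={"region": "es"}))
before_all(fake_context)
```

Step implementations can then read context.headers and context.db_name instead of hardcoding "fi".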

How to iterate over GQLQuery objects when an attribute exists for only some of them

So I'm building a basic blog where I've also implemented an IP address API to store the user's location. I've used this website's API to get the latitude and longitude of the user. This is the Entries class which is used to store each entry -
class Entries(db.Model):
    title = db.StringProperty(required = True)
    body = db.TextProperty(required = True)
    created = db.DateTimeProperty(auto_now_add = True)
    coords = db.GeoPtProperty()
This is the function to store the latitude and longitude in the coords property -
IP_URL = 'http://ip-api.com/xml/'

def get_coords(ip):
    ip = '4.2.2.2'
    url = IP_URL + ip
    content = None
    try:
        content = urllib2.urlopen(url).read()
    except urllib2.URLError:
        return
    if content:
        # parse the XML and find the coordinates
        dom = minidom.parseString(content)
        status = dom.getElementsByTagName("status")[0].childNodes[0].nodeValue
        if status == 'success':
            lonNode = dom.getElementsByTagName('lon')[0]
            latNode = dom.getElementsByTagName('lat')[0]
            if lonNode and latNode and lonNode.childNodes[0].nodeValue and latNode.childNodes[0].nodeValue:
                lon = lonNode.childNodes[0].nodeValue
                lat = latNode.childNodes[0].nodeValue
                return db.GeoPt(lat, lon)
When I go to the admin page (localhost:8000), I can see that some entries have the coords property while others don't. (To be precise, I have 8 entries currently, 6 of which have nothing in the coords column, 1 has the coordinates of 4.2.2.2 and the other has None, because I had passed in my localhost address.)
I have a main page where I list all the entries in the blog and it's here that I tried to see whether I can actually get all the coordinates stored in the database, to be displayed on the page -
class MainPage(webapp2.RequestHandler):
    def get(self):
        # self.response.write(self.request.remote_addr)
        # self.response.write(repr(get_coords(self.request.remote_addr)))
        entries = db.GqlQuery('select * from Entries order by created desc limit 10')
        # find which entries have coordinates
        points = []
        for e in entries:
            if entries.coords:
                points.append(a.coords)
        self.response.write(repr(points))
        self.response.write(render_str('mainpage.html', entries=entries))
I tried printing out the property itself - self.response.write(repr(get_coords(self.request.remote_addr))) and it showed datastore_types.GeoPt(40.7128, -74.0059). So it seems the property IS stored in Datastore.
The problem I have is that the interpreter throws an AttributeError for the loop in the get of MainPage()-
AttributeError: 'GqlQuery' object has no attribute 'coords'
I think this may be because some of the entries do not have a coords attribute in the first place. But if this is the case, is there any way around it?
If it's of any use, I'm doing this for the Web Development course on Udacity by Steve Huffman and he showed this method of storing and displaying the user coordinates. He instructed to NOT use required=True for the coords property as the already existing entries don't have that property. When he looped over the entries in his blog, everything seemed fine and all the coordinates were displayed as a list.
I managed to solve the problem.
I used a filter instead of the for loop inside MainPage.
points = filter(None, (e.coords for e in entries))
This seemed to work and displayed the list of coordinates. (Though I still do not understand why this method worked)
Edit: I found out why I got the problem with my original code. It should have been e.coords, rather than entries.coords.
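As a small illustration of both fixes, here is a sketch that runs without the Datastore. The Entry class and the sample entries are hypothetical stand-ins for the query results. It shows the corrected loop, which reads coords from each item rather than from the query object, next to the filter version, which drops every falsy value such as None.

```python
class Entry:
    """Stand-in for a Datastore entity; coords defaults to None when unset."""

    def __init__(self, title, coords=None):
        self.title = title
        self.coords = coords


entries = [
    Entry("first"),
    Entry("second", (40.7128, -74.0059)),
    Entry("third"),
]

# Corrected loop: use the loop variable e, not the collection itself
points = []
for e in entries:
    if e.coords:
        points.append(e.coords)

# Equivalent one-liner: filter(None, ...) keeps only truthy values
filtered = list(filter(None, (e.coords for e in entries)))
```

Both approaches produce the same list, which is why the filter version worked even before the typo in the loop was found.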
