sqlalchemy: previous row and next row by id - python

I have a table Images with id and name columns. Given an image's id, I want to query its previous image and next image in the database using SQLAlchemy. How can I do this in only one query?
sel = select([images.c.id, images.c.name]).where(images.c.id == id)
res = engine.connect().execute(sel)
#How to obtain its previous and next row?
...
Suppose it is possible that some rows have been deleted, i.e., the ids may not be continuous. For example,
Table: Images
------------
id | name
------------
1 | 'a.jpg'
2 | 'b.jpg'
4 | 'd.jpg'
------------

prev_image = your_session.query(Images).order_by(Images.id.desc()).filter(Images.id < id).first()
next_image = your_session.query(Images).order_by(Images.id.asc()).filter(Images.id > id).first()
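Note that .first() returns None when the target row is already at either end of the table, so guard for that before using the result; a minimal sketch, assuming the same session and Images model as above:
# Sketch: .first() yields None at the boundaries (no previous/next row exists).
prev_name = prev_image.name if prev_image is not None else None
next_name = next_image.name if next_image is not None else None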

# previous
prv = select([images.c.id, images.c.name]).where(images.c.id < id).order_by(images.c.id.desc()).limit(1)
res = engine.connect().execute(prv)
for row in res:
    print(row.id, row.name)
# next
nxt = select([images.c.id, images.c.name]).where(images.c.id > id).order_by(images.c.id).limit(1)
res = engine.connect().execute(nxt)
for row in res:
    print(row.id, row.name)

This can be accomplished in a "single" query by taking the UNION of two queries: one that selects the previous and target rows, and one that selects the next row (unless the backend is SQLite, which does not permit an ORDER BY before the final statement in a UNION):
import sqlalchemy as sa
...
with engine.connect() as conn:
    target = 3
    query1 = sa.select(tbl).where(tbl.c.id <= target).order_by(tbl.c.id.desc()).limit(2)
    query2 = sa.select(tbl).where(tbl.c.id > target).order_by(tbl.c.id.asc()).limit(1)
    res = conn.execute(query1.union(query2))
    for row in res:
        print(row)
producing
(2, 'b.jpg')
(3, 'c.jpg')
(4, 'd.jpg')
Note that we could make the second query the same as the first, apart from reversing the inequality
query2 = sa.select(tbl).where(tbl.c.id >= target).order_by(tbl.c.id.asc()).limit(2)
and we would get the same result, since the UNION removes the duplicate target row.
If the requirement were to find the surrounding rows for a selection of rows, we could use the lag and lead window functions, where the backend supports them.
# Works in PostgreSQL, MariaDB and SQLite, at least.
with engine.connect() as conn:
    query = sa.select(
        tbl.c.id,
        tbl.c.name,
        sa.func.lag(tbl.c.name).over(order_by=tbl.c.id).label('prev'),
        sa.func.lead(tbl.c.name).over(order_by=tbl.c.id).label('next'),
    )
    res = conn.execute(query)
    for row in res:
        print(row._mapping)
Output:
{'id': 1, 'name': 'a.jpg', 'prev': None, 'next': 'b.jpg'}
{'id': 2, 'name': 'b.jpg', 'prev': 'a.jpg', 'next': 'c.jpg'}
{'id': 3, 'name': 'c.jpg', 'prev': 'b.jpg', 'next': 'd.jpg'}
{'id': 4, 'name': 'd.jpg', 'prev': 'c.jpg', 'next': 'e.jpg'}
{'id': 5, 'name': 'e.jpg', 'prev': 'd.jpg', 'next': 'f.jpg'}
{'id': 6, 'name': 'f.jpg', 'prev': 'e.jpg', 'next': None}
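If only a few target rows are of interest, the windowed query can be wrapped in a subquery and filtered afterwards, since the window functions must see all rows before the filter is applied. A minimal sketch under the same tbl and engine assumptions:
# Sketch: filter the windowed result to selected ids via a subquery.
subq = sa.select(
    tbl.c.id,
    tbl.c.name,
    sa.func.lag(tbl.c.name).over(order_by=tbl.c.id).label('prev'),
    sa.func.lead(tbl.c.name).over(order_by=tbl.c.id).label('next'),
).subquery()
query = sa.select(subq).where(subq.c.id.in_([2, 5]))
with engine.connect() as conn:
    for row in conn.execute(query):
        print(row._mapping)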

To iterate through your records, which I think is what you're looking for:
for row in res:
    print(row.id)
    print(row.name)

Related

Selecting from DB where some data points can be NULL (psycopg2)

I have the following query, where some of the values I am trying to select with can be empty and therefore default to None.
So I've come up with something like this:
db.cursor.execute(
'''
SELECT
s.prod_tires_size_id as size_id
,s.rim_mm
,s.rim_inch
,s.width_mm
,s.width_inch
,s.aspect_ratio
,s.diameter_inch
FROM product_tires.sizes s
WHERE
s.rim_mm = %(rim_mm)s
AND s.rim_inch = %(rim_inch)s
AND s.width_mm = %(width_mm)s
AND s.width_inch = %(width_inch)s
AND s.aspect_ratio = %(aspect_ratio)s
AND s.diameter_inch = %(diameter_inch)s
''', {
'rim_mm': data['RIM_MM'] or None,
'rim_inch': data['RIM_INCH'] or None,
'width_mm': data['WIDTH_MM'] or None,
'width_inch': data['WIDTH_INCH'] or None,
'aspect_ratio': data['ASPECT_RATIO'] or None,
'diameter_inch': data['OVL_DIAMETER'] or None,
}
)
However, = NULL does not work.
If I use IS, then it will not match the values I am providing.
How can I solve this problem?
Generate the query using python:
k = {'rim_mm': 'RIM_MM',
'rim_inch': 'RIM_INCH',
'width_mm': 'WIDTH_MM',
'width_inch': 'WIDTH_INCH',
'aspect_ratio': 'ASPECT_RATIO',
'diameter_inch': 'OVL_DIAMETER',
}
where = []
for column, key in k.items():
    if data[key]:
        where.append("%s = %%(%s)s" % (column, key))
    else:
        where.append("%s IS NULL" % column)
sql = "your select where " + " AND ".join(where)
cursor.execute(sql, data)
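Alternatively (not part of the original answer), PostgreSQL's IS NOT DISTINCT FROM treats two NULLs as equal, so the query text can stay static regardless of which parameters are None; a sketch for two of the columns:
# Sketch: IS NOT DISTINCT FROM matches equal values and also NULL against NULL,
# so a None parameter behaves like an IS NULL test.
db.cursor.execute(
    '''
    SELECT s.prod_tires_size_id AS size_id, s.rim_mm, s.rim_inch
    FROM product_tires.sizes s
    WHERE s.rim_mm IS NOT DISTINCT FROM %(rim_mm)s
    AND s.rim_inch IS NOT DISTINCT FROM %(rim_inch)s
    ''', {
        'rim_mm': data['RIM_MM'] or None,
        'rim_inch': data['RIM_INCH'] or None,
    }
)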

Save dataframe in Postgresql Database with SERIAL Autogenerated ID

Having a dataframe in the following way:
word classification counter
0 house noun 2
1 the article 2
2 white adjective 1
3 yellow adjective 1
I would like to store in Postgresql table with the following definition:
CREATE TABLE public.word_classification (
    id SERIAL,
    word character varying(100),
    classification character varying(10),
    counter integer,
    start_date date,
    end_date date
);
ALTER TABLE public.word_classification OWNER TO postgres;
The current basic configuration I have is as follows:
from sqlalchemy import create_engine
import pandas as pd
# Postgres username, password, and database name
POSTGRES_ADDRESS = 'localhost' ## INSERT YOUR DB ADDRESS IF IT'S NOT ON PANOPLY
POSTGRES_PORT = '5432'
POSTGRES_USERNAME = 'postgres' ## CHANGE THIS TO YOUR PANOPLY/POSTGRES USERNAME
POSTGRES_PASSWORD = 'BVict31C' ## CHANGE THIS TO YOUR PANOPLY/POSTGRES PASSWORD
POSTGRES_DBNAME = 'local-sandbox-dev' ## CHANGE THIS TO YOUR DATABASE NAME
# A long string that contains the necessary Postgres login information
postgres_str = ('postgresql://{username}:{password}@{ipaddress}:{port}/{dbname}'
                .format(username=POSTGRES_USERNAME, password=POSTGRES_PASSWORD,
                        ipaddress=POSTGRES_ADDRESS, port=POSTGRES_PORT, dbname=POSTGRES_DBNAME))
# Create the connection
cnx = create_engine(postgres_str)
data=[['the','article',0],['house','noun',1],['yellow','adjective',2],
['the','article',4],['house','noun',5],['white','adjective',6]]
df = pd.DataFrame(data, columns=['word','classification','position'])
df_db = pd.DataFrame(columns=['word','classification','counter','start_date','end_date'])
count_series=df.groupby(['word','classification']).size()
new_df = count_series.to_frame(name = 'counter').reset_index()
df_db = new_df.to_sql('word_classification',cnx,if_exists='append',chunksize=1000)
I would like to insert into the table as I am able to do with SQL syntax:
insert into word_classification(word, classification, counter) values('hello','world',1);
Currently, I am getting an error when inserting into the table because I am passing the index:
(psycopg2.errors.UndefinedColumn) column "index" of relation "word_classification" does not exist
LINE 1: INSERT INTO word_classification (index, word, classification...
^
[SQL: INSERT INTO word_classification (index, word, classification, counter) VALUES (%(index)s, %(word)s, %(classification)s, %(counter)s)]
[parameters: ({'index': 0, 'word': 'house', 'classification': 'noun', 'counter': 2}, {'index': 1, 'word': 'the', 'classification': 'article', 'counter': 2}, {'index': 2, 'word': 'white', 'classification': 'adjective', 'counter': 1}, {'index': 3, 'word': 'yellow', 'classification': 'adjective', 'counter': 1})]
I have been searching for ways to get rid of passing the index with no luck.
Thanks for your help
Turn off the index when storing to the database, as follows:
df_db = new_df.to_sql('word_classification',cnx,if_exists='append',chunksize=1000, index=False)
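As a quick check (a sketch, assuming the cnx engine from the question), reading the table back should show the id column filled in by the SERIAL sequence while the unsupplied date columns stay NULL:
# Sketch: the id values are assigned by the database, not by pandas.
check = pd.read_sql('SELECT id, word, classification, counter FROM word_classification', cnx)
print(check)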

Looping through a function

I am struggling with figuring out the best way to loop through a function. The output of this API is a Graph Connection and I am a little out of my element. I really need to obtain IDs from an API output and have them in a dict, or some other form that I can pass to another API call.
**** It is important to note that the original output is a graph connection: print(type(api_response)) does show it as a list; however, print(type(api_response[0])) returns a graph connection object rather than a dict.
This is the original output from the API call:
[{'_from': None, 'to': {'id': '5c9941fcdd2eeb6a6787916e', 'type': 'user'}}, {'_from': None, 'to': {'id': '5cc9055fcc5781152ca6eeb8', 'type': 'user'}}, {'_from': None, 'to': {'id': '5d1cf102c94c052cf1bfb3cc', 'type': 'user'}}]
This is the code that I have up to this point.....
api_response = api_instance.graph_user_group_members_list(group_id, content_type, accept,limit=limit, skip=skip, x_org_id=x_org_id)
import string

def extract_id(result):
    result = str(result).split(' ')
    for i, r in enumerate(result):
        if 'id' in r:
            id = (result[i+1].translate(str.maketrans('', '', string.punctuation)))
            print(id)
            return id

extract_id(api_response)

def extract_id(result):
    result = str(result).split(' ')
    for i, r in enumerate(result):
        if 'id' in r:
            id = (result[i+8].translate(str.maketrans('', '', string.punctuation)))
            print(id)
            return id

extract_id(api_response)

def extract_id(result):
    result = str(result).split(' ')
    for i, r in enumerate(result):
        if 'id' in r:
            id = (result[i+15].translate(str.maketrans('', '', string.punctuation)))
            print(id)
            return id

extract_id(api_response)
I have been able to use a function to extract the IDs, but I am doing so through a string. I need a scalable solution that lets me pass these IDs along to another API call.
I have tried to use a single for loop, but because the output is one string and i+1 defines each id's position, it is redundant and just outputs one of the ids multiple times.
I am receiving the correct output from each of these functions; however, this is not scalable and is not a real solution. Please help guide me.
To deal with the response-as-a-string issue, I would suggest using Python's built-in json module. Specifically, the .loads() method can convert a string to a dict or a list of dicts. From there you can iterate over the list (or dict) and check whether the key is equal to 'id'. Here's an example based on what you said the response looks like.
import json
s = "[{'_from': None, 'to': {'id': '5c9941fcdd2eeb6a6787916e', 'type': 'user'}}, {'_from': None, 'to': {'id': '5cc9055fcc5781152ca6eeb8', 'type': 'user'}}, {'_from': None, 'to': {'id': '5d1cf102c94c052cf1bfb3cc', 'type': 'user'}}]"
# json uses double quotes and null; there is probably a better way to do this though
s = s.replace("\'", '\"').replace('None', 'null')
response = json.loads(s) # list of dicts
for d in response:
    for key, value in d['to'].items():
        if key == 'id':
            print(value)  # or whatever else you want to do
# 5c9941fcdd2eeb6a6787916e
# 5cc9055fcc5781152ca6eeb8
# 5d1cf102c94c052cf1bfb3cc
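An alternative worth noting (not from the original answer): because the string uses Python literals (single quotes, None), ast.literal_eval can parse it directly, without the quote and None replacements:
import ast

# Sketch: literal_eval understands Python-style quoting and None, so the raw
# string can be parsed as-is into a list of dicts.
response = ast.literal_eval(s)
ids = [d['to']['id'] for d in response]
print(ids)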

How to concatenate structs in a loop in python

I am trying to search for all users in an SQL database whose first names are "blah" and return that data to my HTML through an AJAX call. I have this functioning with a single user like this:
user = db.execute(
'SELECT * FROM user WHERE genres LIKE ?', (str,)
).fetchone()
user_details = {
'first': user['first'],
'last': user['last'],
'email': user['email']
}
y = json.dumps(user_details)
return jsonify(y)
Now for multiple users I want the struct to look something like this:
users{
user1_details = {
'first': user['first'],
'last': user['last'],
'email': user['email']
}
user2_details = {
'first': user2['first'],
'last': user2['last'],
'email': user2['email']
}
user3_details = {
'first': user3['first'],
'last': user3['last'],
'email': user3['email']
}
}
generating each user_details in a loop. I know I can use fetchall() to find all the users, but how do I concatenate the details?
Fetch all the rows after the query, then structure the results as you'd like.
Example:
db = mysql.connection.cursor()
# query
db.execute('SELECT * FROM user')
# returned columns
header = [x[0] for x in db.description]
# returned rows
results = db.fetchall()
# data to be returned
users_object = {}
# structure results
for result in results:
    # each row comes back as a tuple, so zip it with the column names first
    row = dict(zip(header, result))
    users_object[row["user_id"]] = row
return jsonify(users_object)
As you can see under "# structure results", you just loop through the results and insert the data for each row into users_object, keyed by "user_id" for example.
If you want the results in an array instead, convert users_object into a list (e.g. users_array) and append each row's dict to it within the loop, as shown in the sketch below.
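A minimal sketch of that list variant, under the same assumptions as the answer above:
# Sketch: collect each row as a dict in a list instead of keying by user_id.
users_array = []
for result in results:
    users_array.append(dict(zip(header, result)))
return jsonify(users_array)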
The keys in the desired users dictionary do not seem particularly useful so you could instead build a list of user dicts. It's easy to go directly from fetchall() to such a list:
result = db.execute('SELECT * FROM user WHERE genres LIKE ?', (str,))
users = [{'first': first, 'last': last, 'email': email} for first, last, email in result.fetchall()]
return jsonify(users)
To return a dict containing the user list:
return jsonify({'users': users})

Execute user-defined query on list of dictionaries

I have a set of data that a user needs to query using their own query string. The current solution creates a temporary in-memory sqlite database that the query is run against.
The dataset is a list of "flat" dictionaries, i.e. there is no nested data. The query string does not need to be SQL, but it should be simple to define using an existing query framework.
It needs to support ordering (ascending, descending, custom) and filtering.
The purpose of this question is to get a range of different solutions that might work for this use case.
import sqlite3
items = [
{'id': 1},
{'id': 2, 'description': 'This is a description'},
{'id': 3, 'comment': 'This is a comment'},
{'id': 4, 'height': 1.78}
]
# Assemble temporary sqlite database
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
knownTypes = { "id": "real", "height": "real", "comment": "text" }
allKeys = list(set().union(*(d.keys() for d in items)))
allTypes = list(knownTypes.get(k, "text") for k in allKeys)
createTable_query = "CREATE TABLE data ({});".format(", ".join(["{} {}".format(x[0], x[1]) for x in zip(allKeys, allTypes)]))
cur.execute(createTable_query)
conn.commit()
qs = ["?" for i in range(len(allKeys))]
insertRow_query = "INSERT INTO data VALUES ({});".format(", ".join(qs))
for p in items:
    vals = list([p.get(k, None) for k in allKeys])
    cur.execute(insertRow_query, vals)
conn.commit()
# modify user query here
theUserQuery = "SELECT * FROM data"
# Get data from query
data = [row for row in cur.execute(theUserQuery)]
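For example (an illustrative sketch, not from the original post), a user-supplied query exercising the filtering and ordering requirements could look like this:
# Sketch: a user-defined query with filtering and ordering against the table above.
theUserQuery = "SELECT id, comment FROM data WHERE height IS NULL ORDER BY id DESC"
data = [row for row in cur.execute(theUserQuery)]
print(data)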
YAQL is what I'm looking for.
It doesn't do SQL, but it does execute a query string - which is a simple way to do complex user-defined sorting and filtering.
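A rough sketch of what this might look like for the items list above (the where/orderBy function names are from the YAQL standard library as I understand it, so treat the details as an assumption to verify against the YAQL docs):
import yaql

# Rough sketch: evaluate a user-supplied YAQL query string over the list of dicts.
engine = yaql.factory.YaqlFactory().create()
theUserQuery = '$.where($.id > 1).orderBy($.id)'
expression = engine(theUserQuery)
result = expression.evaluate(data=items)
print(result)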
There's a library called litebox that does what you want. It is backed by SQLite.
from litebox import LiteBox
items = [
{'id': 1},
{'id': 2, 'description': 'This is a description'},
{'id': 3, 'comment': 'This is a comment'},
{'id': 4, 'height': 1.78}
]
types = {"id": int, "height": float, "comment": str}
lb = LiteBox(items, types)
lb.find("height > 1.5")
Result: [{'id': 4, 'height': 1.78}]
