I'm building a short answer quiz (input as a string). While I've managed to get the quiz up and running, I'm finding that I'm writing code which seems to be a) a bit ugly/unwieldy and b) isn't expandable/reusable in different situations.
Here's an example (this pattern is used a number of times in different functions):
answers_to_store = Answer(answer_1=user_answer_list[0],
                          answer_2=user_answer_list[1],
                          answer_3=user_answer_list[2],
                          answer_4=user_answer_list[3],
                          answer_5=user_answer_list[4])
Is there a better way to create these parameters? I can't see a way of replacing the parameter name (i.e. answer_1) with a variable or similar (which is the only way I can think of making the task easier).
The only way I've found is to build the parameter list as text in a loop and then run the resulting command via exec. Is that really the only way, or is there something better? The other limitation is that a storage function for 5 answers would be different from one storing 20 answers (the database has that many columns).
I've tried searching, but the problem is I don't know the right term for this; the nearest thing I came up with was, again, building the command in a loop and running it with exec, which would work but seems a long-winded way of doing it.
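For what it's worth, the term being searched for here is keyword argument unpacking: the parameter names can be built as dictionary keys in a loop and passed with **, no exec needed. A minimal sketch (Answer and user_answer_list stand in for the names from the question):

```python
user_answer_list = ["a", "b", "c", "d", "e"]

# Build the keyword arguments in a loop; the dict keys become parameter names.
kwargs = {"answer_%d" % (i + 1): ans for i, ans in enumerate(user_answer_list)}

# Equivalent to Answer(answer_1="a", ..., answer_5="e"), for any list length:
# answers_to_store = Answer(**kwargs)
print(kwargs["answer_1"], kwargs["answer_5"])
```

The same loop works unchanged whether the table has 5 or 20 answer columns.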
Assuming this is a short-answer quiz, create a database table of (question_id, answer_string), then look up the row by question_id and compare the stored answer with the user's input. Here is the idea:
from sqlalchemy import create_engine, Table, MetaData, Column, Integer, VARCHAR
from sqlalchemy.orm import mapper, sessionmaker
Session = sessionmaker()
engine = create_engine('sqlite:///foo.db')
Session.configure(bind=engine)
sess = Session()
metadata = MetaData()
question = Table('question', metadata,
                 Column('id', Integer, primary_key=True),
                 Column('answer', VARCHAR)
                 )
metadata.create_all(engine)

class Question(object):
    def __init__(self, answer):
        self.answer = answer

mapper(Question, question)

def check(question_id, answer_input):
    # Fetch the answer string associated with the question and compare it.
    ans = sess.query(Question.answer).filter(Question.id == question_id).scalar()
    return answer_input == ans
This method stores an answer string (up to 8,000 characters in SQL Server), looks it up by ID, and compares it for grading.
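The same look-up-and-compare idea can be sketched without any ORM at all; here a plain dict stands in for the answer table (all names are hypothetical):

```python
# question_id -> stored answer string, standing in for the database table
answer_key = {1: "paris", 2: "1945"}

def check(question_id, answer_input):
    # Normalise the input, then compare against the stored answer.
    stored = answer_key.get(question_id)
    return stored is not None and answer_input.strip().lower() == stored

print(check(1, " Paris "))  # True
print(check(2, "1944"))     # False
```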
I spent a lot of time looking for a solution to parametrize table names and field names in SQLAlchemy plain textual SQL queries for SQL Server. I stumbled upon several Stack Overflow questions and other resources, like:
SQL Alchemy Parametrized Query, binding table name as parameter gives error
The answer to the above, which I don't like, as it just builds the query from a string and is prone to SQL injection attacks.
I know it is possible (I did it that way in the past) by creating table objects from sqlalchemy.ext.declarative.declarative_base, but that requires declaring the whole schema of your database, which is a lot of unscalable code.
Without much luck with SQL Server, I found a solution for Postgres in psycopg2, using psycopg2.sql.Identifier. So from there I started looking for an equivalent in SQLAlchemy and found quoted_name, which I understand works as an identifier that prevents SQL injection. But does it really? Could somebody confirm that it is safe to use?
Here is a code example that returns the number of rows in the passed-in table:
def count_rows(self, table_name: str) -> int:
    query_base = "SELECT COUNT(*) FROM {}"
    query_params = [quoted_name(table_name, True)]
    query = text(query_base.format(*query_params))
    with self.engine.connect() as con:
        result = con.execute(query).fetchone()
    return result[0]
I don't get the impression from the documentation that this is the purpose quoted_name is intended for. My reading is that it exists for cases where non-standard naming conventions are in use for column or table names, requiring quoting for them to work.
I think there are two possible solutions:
1. exercise total control over the allowed table names
f"SELECT COUNT(*) FROM {table_name}" is fine if you don't allow table_name to be provided by the user without filtering.
For example, you could simply have
...
allowed = ["table_1", ..., "table_N"]
if table_name not in allowed:
    raise ValueError(f"Table name must be one of {allowed}. Received {table_name}.")
There are, of course, plenty of other ways to do this. But the idea is to either map user input to allowed values, reject disallowed values, or a mixture of both.
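As a sketch of the mapping variant (all table names here are hypothetical), user-facing keys resolve to real table names, so arbitrary input never reaches the SQL string:

```python
# Map user-facing keys to actual table names; anything else is rejected.
table_map = {"users": "app_users", "orders": "app_orders"}

def resolve_table(user_value: str) -> str:
    try:
        return table_map[user_value]
    except KeyError:
        raise ValueError(f"Unknown table: {user_value!r}")

print(resolve_table("users"))  # app_users
```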
2. reflect the schema
You mentioned that
I know it is possible (I was doing it that way in the past) to do by creating table objects from sqlalchemy.ext.declarative.declarative_base but it requires to declare whole schema of your database which is a lot of unscalable code.
This is not true. You can 'reflect' the schema of an existing database as follows:
from sqlalchemy import create_engine, func, select, MetaData

class YourClass:
    def __init__(self, db_connection_string: str):
        """
        __init__ for YourClass
        (for example)
        """
        self.engine = create_engine(db_connection_string)
        self.metadata = MetaData(bind=self.engine)
        self.metadata.reflect()

    def count_rows(self, table_name: str) -> int:
        """
        count_rows
        Returns the COUNT of the rows for a given table
        """
        table = self.metadata.tables[table_name]
        result = select([func.count()]).select_from(table).scalar()
        return result
Worth noting that this approach will also throw an exception if table_name doesn't exist in the database.
Alternative syntax - for full ORM-goodness, use a sessionmaker:
from sqlalchemy import create_engine, MetaData
from sqlalchemy.orm import sessionmaker

class YourClass:
    def __init__(self, db_connection_string: str):
        self.engine = create_engine(db_connection_string)
        self.Session = sessionmaker(bind=self.engine)
        self.metadata = MetaData(bind=self.engine)
        self.metadata.reflect()

    def count_rows(self, table_name: str) -> int:
        table = self.metadata.tables[table_name]
        # if you want a new session every call:
        with self.Session.begin() as session:
            return session.query(table).count()
I'm working with Oracle SQL and I want to use some Oracle functions that don't exist in other relational databases.
Basically, I want to add a function that returns the weekday for a given date.
From what I understand, SQLAlchemy gives me two ways to do that: one is to provide the SQL query as text, the other is to extend SQLAlchemy by implementing a new Python function that represents the SQL function. I'm leaning towards implementing the function because I expect to use it in a few queries.
Here is what I have implemented so far; I'm not really sure what my next step is, or if this is even correct.
from sqlalchemy.sql.expression import FunctionElement
from sqlalchemy.ext.compiler import compiles
class weekday(FunctionElement):
    name = 'weekday'

@compiles(weekday)
def compile_weekday(element, compiler, **kw):
    if len(element.clauses) == 1:
        return "TO_CHAR(%s, 'DY')" % compiler.process(element.clauses)
    elif len(element.clauses) == 0:
        raise TypeError("weekday needs a date as parameter")
    else:
        raise TypeError("weekday needs just one parameter")
When I tried to add this function to one of my objects, instead of calculating the result I got the function itself back. Here is an example of what I'm talking about:
from sqlalchemy import Column, Date
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.ext.hybrid import hybrid_property
Base = declarative_base()

class SomeObject(Base):
    __tablename__ = 'table1'
    asof = Column(Date, primary_key=True)

    @hybrid_property
    def weekday(self):
        return weekday(self.asof)
In shell I tried:
from datetime import datetime
my_object = SomeObject()
my_object.asof = datetime(2018,1,1)
session.add(my_object)
session.commit()
result = session.query(SomeObject).filter(SomeObject.asof == datetime(2018,1,1)).first()
result.weekday # returns `<orm.weekday at 0x1724b7deeb8; weekday>`
NOTE
I insist on extracting the weekday in the SQL query rather than in Python because I need it to filter out some records, and in my case that function determines whether SQLAlchemy pulls out a couple of million records or just a couple.
After trying out a few things, I realized that the Python-side getter of a hybrid_property is not supposed to return a SQL expression; it needs to return the actual value that the SQL expression would have returned.
That being said, my SQL function goes into the expression part of the hybrid_property, which looks like this:
@weekday.expression
def weekday(cls):
    return weekday(cls.asof)
Old question, but you can also use SQLAlchemy's SQL and Generic Functions:
https://docs.sqlalchemy.org/en/13/core/functions.html
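For instance, the TO_CHAR call from the question can be generated with func, which renders any function by name without writing a custom FunctionElement (the column name asof is taken from the question; this is a sketch, not verified against Oracle):

```python
from sqlalchemy import func, literal_column

# func.<name>(...) renders as name(...) in SQL; literals become bind params.
expr = func.to_char(literal_column("asof"), "DY")
print(str(expr))  # something like: to_char(asof, :to_char_1)
```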
I am using the Core of SQLAlchemy, so I am not using a declarative base class like in other similar questions.
How to get the primary key of a table using the engine?
So, I just ran into the same problem. You need to create a Table object that reflects the table for which you are looking to find the primary key.
from sqlalchemy import create_engine, Table, MetaData
dbUrl = 'postgres://localhost:5432/databaseName' #will be different for different db's
engine = create_engine(dbUrl)
meta = MetaData()
table = Table('tableName', meta, autoload=True, autoload_with=engine)
primaryKeyColName = table.primary_key.columns.values()[0].name
The Table construct above is useful for a number of different functions. I use it quite a bit for managing geospatial tables since I do not really like any of the current packages out there.
In your comment you mention that you are not defining a table... I think that means you aren't creating a SQLAlchemy model of the table. With the approach above you don't need to do that, and you can gather all sorts of information from a table in a more dynamic fashion. It's especially useful if you are handed someone else's messy database!
Hope that helps.
I'd like to comment, but I do not have enough reputation for that.
Building on greenbergé's answer:
from sqlalchemy import create_engine, Table, MetaData
dbUrl = 'postgres://localhost:5432/databaseName' #will be different for different db's
engine = create_engine(dbUrl)
meta = MetaData()
table = Table('tableName', meta, autoload=True, autoload_with=engine)
table.primary_key.columns.values() is a list, and [0] is not always the PK column; it is only when the PK consists of a single column.
In order to get all columns of a multi-column pk you could use:
primaryKeyColNames = [pk_column.name for pk_column in table.primary_key.columns.values()]
The two answers above retrieve the primary key from a metadata object.
Even though that works well, sometimes you may want to retrieve the primary key from an instance of a SQLAlchemy model, without even knowing what the model class actually is (for example, a helper function called, let's say, get_primary_key, that accepts an instance of a DB model and outputs the primary keys).
For this we can use the inspect function from the inspection module:
from sqlalchemy import inspect

def get_primary_key(model_instance):
    inspector = inspect(model_instance)
    model_columns = inspector.mapper.columns
    return [c.description for c in model_columns if c.primary_key]
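A hypothetical usage of such a helper, with a throwaway declarative model (the model and column names are invented for the example):

```python
from sqlalchemy import inspect, Column, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String)

def get_primary_key(model_instance):
    # inspect() on an instance exposes the mapper and its columns.
    inspector = inspect(model_instance)
    return [c.description for c in inspector.mapper.columns if c.primary_key]

print(get_primary_key(User(name="x")))  # ['id']
```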
You could also directly use the __mapper__ object
def get_primary_key(model_instance):
    model_columns = model_instance.__mapper__.columns
    return [c.description for c in model_columns if c.primary_key]
For a reflected table this works:
insp = inspect(self.db.engine)
pk_temp = insp.get_pk_constraint(self.__tablename__)['constrained_columns']
I'm using python sqlite3 api to create a database.
In all the examples I saw in the documentation, table names and column names are hardcoded inside the queries. But this could be a potential problem if I reuse the same table multiple times (i.e., creating the table, inserting records into it, reading data from it, altering it, and so on), because in case of a table modification I would need to change the hardcoded names in multiple places, and that is not good programming practice.
How can I solve this problem?
I thought of creating a class with just a constructor in order to store all these string names, and using it inside the class that operates on the database. As I'm not an expert Python programmer, I would like to share my thoughts...
class TableA(object):
    def __init__(self):
        self.table_name = 'tableA'
        self.name_col1 = 'first_column'
        self.type_col1 = 'INTEGER'
        self.name_col2 = 'second_column'
        self.type_col2 = 'TEXT'
        self.name_col3 = 'third_column'
        self.type_col3 = 'BLOB'
and then inside the DB class:
table_A = TableA()

def insert_table(self):
    conn = sqlite3.connect(self._db_name)
    query = 'INSERT INTO ' + table_A.table_name + ..... <SNIP>
    conn.execute(query)
Is this a proper way to proceed?
I don't know what's proper but I can tell you that it's not conventional.
If you really want to structure tables as classes, you could consider an object-relational mapper like SQLAlchemy. Otherwise, with the way you're going about it, how do you know how many column variables you have? What about storing a list of 2-item lists, or a list of dictionaries?
self.column_list = []
self.column_list.append({'name': 'first', 'type': 'integer'})
The way you're doing it sounds pretty novel. Check out how the ORMs do it and see how their code compares.
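A sketch of how the list-of-dictionaries idea could drive query generation (the table and column names are just placeholders):

```python
# Each dict describes one column; the list length answers "how many columns?"
columns = [
    {"name": "first_column", "type": "INTEGER"},
    {"name": "second_column", "type": "TEXT"},
]

col_sql = ", ".join("%s %s" % (c["name"], c["type"]) for c in columns)
query = "CREATE TABLE tableA (%s)" % col_sql
print(query)  # CREATE TABLE tableA (first_column INTEGER, second_column TEXT)
```

Renaming a column then means editing one dict entry rather than hunting through hardcoded strings.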
If you are going to start using classes to provide an abstraction layer for your database tables, you might as well start using an ORM. Some examples are SQLAlchemy and SQLObject, both of which are extremely popular.
Here's a taste of SQLAlchemy:
from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class TableA(Base):
    __tablename__ = 'tableA'
    id = Column(Integer, primary_key=True)
    first_column = Column(Integer)
    second_column = Column(String)
    # etc...

engine = create_engine('sqlite:///test.db')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

ta = TableA(first_column=123, second_column='Hi there')
session.add(ta)
session.commit()
Of course you would choose semantic names for the table and columns, but you can see that declaring a table is something along the lines of what you were proposing in your question, i.e. using a class. Inserting records is simplified by creating instances of that class.
I personally don't like to use libraries and frameworks without a proper reason. So if I had such a reason, I would write a thin wrapper around sqlite.
class Column(object):
    def __init__(self, col_name="FOO", col_type="INTEGER"):
        # standard initialization
And then a Table class that encapsulates operations with the database:
class Table(object):
    def __init__(self, list_of_columns, cursor):
        # initialization
        # create-update-delete commands
In the Table class you can encapsulate all the operations on the database that you want.
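One way the sketch above might be fleshed out, using only the standard library (all names are illustrative):

```python
import sqlite3

class Column(object):
    def __init__(self, col_name="FOO", col_type="INTEGER"):
        self.name = col_name
        self.type = col_type

class Table(object):
    def __init__(self, name, columns, conn):
        self.name = name        # single source of truth for the table name
        self.columns = columns  # list of Column objects
        self.conn = conn

    def create(self):
        cols = ", ".join("%s %s" % (c.name, c.type) for c in self.columns)
        self.conn.execute("CREATE TABLE %s (%s)" % (self.name, cols))

    def insert(self, *values):
        # Values go through ? placeholders; only the DDL uses string building.
        placeholders = ", ".join("?" for _ in values)
        self.conn.execute(
            "INSERT INTO %s VALUES (%s)" % (self.name, placeholders), values)

conn = sqlite3.connect(":memory:")
t = Table("tableA", [Column("first_column", "INTEGER"),
                     Column("second_column", "TEXT")], conn)
t.create()
t.insert(1, "hello")
```

Note that the table and column names themselves cannot be parametrized with ? placeholders, which is why they live in one place on the wrapper objects.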
This example illustrates a mystery I encountered in an application I am building. The application needs to support an option allowing the user to exercise the code without actually committing changes to the DB. However, when I added this option, I discovered that changes were persisted to the DB even when I did not call the commit() method.
My specific question can be found in the code comments. The underlying goal is to have a clearer understanding of when and why SQLAlchemy will commit to the DB.
My broader question is whether my application should (a) use a global Session instance, or (b) use a global Session class, from which particular instances would be instantiated. Based on this example, I'm starting to think that the correct answer is (b). Is that right? Edit: this SQLAlchemy documentation suggests that (b) is recommended.
import sys
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class User(Base):
    __tablename__ = 'users'

    id = Column(Integer, primary_key=True)
    name = Column(String)
    age = Column(Integer)

    def __init__(self, name, age=0):
        self.name = name
        self.age = age

    def __repr__(self):
        return "<User(name='{0}', age={1})>".format(self.name, self.age)
engine = create_engine('sqlite://', echo = False)
Base.metadata.create_all(engine)
Session = sessionmaker()
Session.configure(bind=engine)
global_session = Session() # A global Session instance.
commit_ages = False # Whether to commit in modify_ages().
use_global = True # If True, modify_ages() will commit, regardless
# of the value of commit_ages. Why?
def get_session():
    return global_session if use_global else Session()

def add_users(names):
    s = get_session()
    s.add_all(User(nm) for nm in names)
    s.commit()

def list_users():
    s = get_session()
    for u in s.query(User):
        print ' ', u

def modify_ages():
    s = get_session()
    n = 0
    for u in s.query(User):
        n += 10
        u.age = n
    if commit_ages:
        s.commit()
add_users(('A', 'B', 'C'))
print '\nBefore:'
list_users()
modify_ages()
print '\nAfter:'
list_users()
tl;dr - The updates are not actually committed to the database-- they are part of an uncommitted transaction in progress.
I made 2 separate changes to your call to create_engine(). (Other than this one line, I'm using your code exactly as posted.)
The first was
engine = create_engine('sqlite://', echo = True)
This provides some useful information. I'm not going to post the entire output here, but notice that no SQL update commands are issued until after the second call to list_users() is made:
...
After:
xxxx-xx-xx xx:xx:xx,xxx INFO sqlalchemy.engine.base.Engine.0x...d3d0 UPDATE users SET age=? WHERE users.id = ?
xxxx-xx-xx xx:xx:xx,xxx INFO sqlalchemy.engine.base.Engine.0x...d3d0 (10, 1)
...
This is a clue that the data is not persisted, but kept around in the session object.
The second change I made was to persist the database to a file with
engine = create_engine('sqlite:///db.sqlite', echo = True)
Running the script again provides the same output as before for the second call to list_users():
<User(name='A', age=10)>
<User(name='B', age=20)>
<User(name='C', age=30)>
However, if you now open the db we just created and query its contents, you can see that the added users were persisted to the database, but the age modifications were not:
$ sqlite3 db.sqlite "select * from users"
1|A|0
2|B|0
3|C|0
So, the second call to list_users() is getting its values from the session object, not from the database, because there is a transaction in progress that hasn't been committed yet. To prove this, add the following lines to the end of your script:
s = get_session()
s.rollback()
print '\nAfter rollback:'
list_users()
Since you state you are actually using MySQL on the system you are seeing the problem, check the engine type the table was created with. The default is MyISAM, which does not support ACID transactions. Make sure you are using the InnoDB engine, which does do ACID transactions.
You can see which engine a table is using with
show create table users;
You can change the db engine for a table with alter table:
alter table users engine="InnoDB";
1. the example: Just to make sure that (or to check whether) the session does not commit the changes, it is enough to call expunge_all on the session object. This will most probably show that the changes are not actually committed:
....
print '\nAfter:'
get_session().expunge_all()
list_users()
2. mysql: As you already mentioned, the sqlite example might not reflect what you actually see when using mysql. As documented in sqlalchemy - MySQL - Storage Engines, the most likely reason for your problem is the usage of non-transactional storage engines (like MyISAM), which results in an autocommit mode of execution.
3. session scope: Although having one global session sounds like a quest for a problem, using new session for every tiny little request is also not a great idea. You should think of a session as a transaction/unit-of-work. I find the usage of the contextual sessions the best of two worlds, where you do not have to pass the session object in the hierarchy of method calls, and at the same time you are given a pretty good safety in the multi-threaded environment. I do use the local session once in a while where I know I do not want to interact with the currently running transaction (session).
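A sketch of the contextual-session pattern mentioned above, using SQLAlchemy's scoped_session (the engine URL is a throwaway in-memory database):

```python
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

engine = create_engine("sqlite://")
Session = scoped_session(sessionmaker(bind=engine))

# Calling Session() repeatedly in the same thread yields the same session,
# so there is no need to pass the session object down the call stack.
s1 = Session()
s2 = Session()
print(s1 is s2)  # True

# Session.remove() ends the current scope; the next call makes a new session.
Session.remove()
s3 = Session()
print(s3 is s1)  # False
```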
Note that the defaults of create_session() are the opposite of that of sessionmaker(): autoflush and expire_on_commit are False, autocommit is True.
global_session is already instantiated when you call modify_ages() and you've already committed to the database. If you re-instantiate global_session after you commit, it should start a new transaction.
My guess is since you've already committed and are re-using the same object, each additional modification is automatically committed.