Retrieve max of column using reflected table in SQLAlchemy - python

The problem is quite simply explained. You have a DB, you have a table and you have an attribute.
What you want to do is connect to the database and query the table for the maximum value of a specific attribute.
What I tried so far is:
attr_name = 'foo'
meta = MetaData()
meta.reflect(bind=self._engine)
obj_table = meta.tables[table_name]
print("<< select max(attr_name) from obj_table >>")
What I would like to do is print out the max. I tried with sessions and with getattr, but got nowhere. I just want to get the max out of this table, from a column whose name is passed as a parameter (I can't use dot notation).
Any clue?

from sqlalchemy import select, func
col = getattr(obj_table.c, attr_name)
# SQLAlchemy 1.4+ style; on older versions write select([func.max(col)])
q = select(func.max(col))
with self._engine.connect() as conn:
    res = conn.execute(q)
    max_value = res.fetchone()[0]
The important thing here is that table columns are accessed via obj_table.c (or obj_table.columns).
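As a rough end-to-end sketch of the answer above, the following reflects a table from a throwaway in-memory SQLite database and reads the maximum of a column chosen by name. The table name demo, the helper column_max and the sample rows are made up for illustration, and the select(...) call assumes SQLAlchemy 1.4 or later.

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine, func, select

# Throwaway in-memory database with one table (illustrative names only).
engine = create_engine("sqlite://")
setup_meta = MetaData()
demo = Table(
    "demo", setup_meta,
    Column("id", Integer, primary_key=True),
    Column("foo", Integer),
)
setup_meta.create_all(engine)
with engine.begin() as conn:
    conn.execute(demo.insert(), [{"foo": 3}, {"foo": 11}, {"foo": 7}])


def column_max(engine, table_name, attr_name):
    """Reflect table_name and return MAX(attr_name), or None for an empty table."""
    meta = MetaData()
    meta.reflect(bind=engine, only=[table_name])
    # Columns can be looked up by string, so no dot notation is needed.
    col = meta.tables[table_name].c[attr_name]
    with engine.connect() as conn:
        return conn.execute(select(func.max(col))).scalar()


max_value = column_max(engine, "demo", "foo")
print(max_value)
```

Indexing obj_table.c with a string is equivalent to the getattr call and tends to read more clearly when the column name arrives as a parameter.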

Related

Python MySQL search entire database for value

I have a GUI interacting with my database, and the MySQL database has around 50 tables. I need to search each table for a value and return the field and key of the item in each table if it is found. I would like to search for partial matches. For example, with the search value "test", both "Protest" and "Test123" would be matches. Here is my attempt.
def searchdatabase(self, event):
    print('Searching...')
    self.connect_mysql()  # Function to connect to database
    d_tables = []
    results_list = []  # I will store results here
    s_string = "test"  # Value I am searching
    self.cursor.execute("USE db")  # select the database
    self.cursor.execute("SHOW TABLES")
    for (table_name,) in self.cursor:
        d_tables.append(table_name)
    # Loop through tables list, get column name, and check if value is in the column
    for table in d_tables:
        # Get the columns
        self.cursor.execute(f"SELECT * FROM `{table}` WHERE 1=0")
        field_names = [i[0] for i in self.cursor.description]
        # Find Value
        for f_name in field_names:
            print("RESULTS:", self.cursor.execute(f"SELECT * FROM `{table}` WHERE {f_name} LIKE {s_string}"))
        print(table)
I get an error on print("RESULTS:", self.cursor.execute(f"SELECT * FROM `{table}` WHERE {f_name} LIKE {s_string}"))
Exception: (1054, "Unknown column 'test' in 'where clause'")
I use a similar insert query that works fine so I am not understanding what the issue is.
ex. insert_query = (f"INSERT INTO `{source_tbl}` ({query_columns}) VALUES ({query_placeholders})")
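This is probably because the search value is not quoted, so MySQL reads test as a column name. Put the value in single quotes (with % wildcards for partial matches) and keep backticks, not single quotes, around the column name; a single-quoted column name would be compared as a literal string.
Try:
print("RESULTS:", self.cursor.execute(f"SELECT * FROM `{table}` WHERE `{f_name}` LIKE '%{s_string}%'"))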
Don’t insert user-provided data into SQL queries like this. It is begging for SQL injection attacks. Your database library will have a way of sending parameters to queries. Use that.
The whole design is fishy. Normally, there should be no need to look for a string across several columns of 50 different tables. Admittedly, sometimes you end up in these situations because of reasons outside your control.
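To illustrate the advice about sending parameters instead of formatting them into the query, here is a minimal sketch. It uses the standard-library sqlite3 module as a stand-in so that it runs without a MySQL server; the tables, the rows and the helper search_all_tables are invented for the example. With a MySQL driver such as PyMySQL or mysql-connector the placeholder would be %s rather than ?, and the table and column lists would come from SHOW TABLES and cursor.description as in the question.

```python
import sqlite3


def search_all_tables(conn, needle):
    """Return (table, column, row) for every row where a column contains needle."""
    hits = []
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        # Identifiers cannot be bound as parameters; here they come from the
        # database catalog, not from user input, and are quoted.
        columns = [r[1] for r in conn.execute(f'PRAGMA table_info("{table}")')]
        for column in columns:
            sql = f'SELECT * FROM "{table}" WHERE "{column}" LIKE ?'
            # The search value is sent as a bound parameter, wildcards included.
            for row in conn.execute(sql, (f"%{needle}%",)):
                hits.append((table, column, row))
    return hits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("CREATE TABLE tags (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany("INSERT INTO notes (body) VALUES (?)", [("Protest",), ("hello",)])
conn.executemany("INSERT INTO tags (label) VALUES (?)", [("Test123",), ("other",)])

hits = search_all_tables(conn, "test")
for table, column, row in hits:
    print(table, column, row)
```

Because the value never becomes part of the SQL text, a search string containing quotes or SQL keywords is matched literally instead of being executed.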

How to dynamically use select with SQLAlchemy?

I am trying to create a function which can filter a sql table using SQLAlchemy, with optional parameters.
The function is
def fetch_new_requests(status, request_type, request_subtype, engine, id_r=None):
    table = Table('Sample_Table', metadata, autoload=True,
                  autoload_with=engine)
    query = session.query(load_requests).filter_by(status = status,
                                                   request_type = request_type,
                                                   request_subtype = request_subtype,
                                                   id_r = id_r)
    return pd.read_sql((query).statement,session.bind)
But it returns an empty table every time if I do not define the id_r variable.
I have googled but cannot find a workaround.
Then I tried to use **kwargs, but it is not what I really need: I still have to name the columns explicitly, and I run into the same issue with id_r.
def fetch_new_requests(**kwargs):
    for x in kwargs.values():
        query = session.query(load_requests).filter_by(status=x)
    return pd.read_sql((query).statement,session.bind)
My ideal result
def fetch_new_requests(any column names, values of the columns):
    for x in kwargs.values():
        query = session.query(load_requests).filter_by(column_name=column_value)
    return pd.read_sql((query).statement,session.bind)
In theory I could use two lists or a dict, but if there is another solution I would be happy to hear it.
I can only give you an answer for SQLAlchemy Core syntax, but it works with a dict! The keys are the column names and the values are the values those columns must equal.
table = Table('Sample_Table', metadata, autoload=True,
              autoload_with=engine)
query = table.select()
where_dict = {"status": 1, "request_type": "something"}
for k, v in where_dict.items():
    query = query.where(getattr(table.c, k) == v)
Just for completeness, here's the syntax to select only specific fields (your question kinda sounds like you're also looking for this):
query = table.select().with_only_columns(select_columns) # select_columns is a list
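The dict-based loop can be wrapped in a small helper that also addresses the original symptom: filtering on id_r = None renders as id_r IS NULL, which is why the result was empty whenever id_r was left out. The sketch below skips None values instead. The helper name build_query, the table layout and the sample rows are assumptions made for the example, and select(table) assumes SQLAlchemy 1.4 or later.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine, select

engine = create_engine("sqlite://")
metadata = MetaData()
requests = Table(
    "sample_table", metadata,
    Column("id_r", Integer, primary_key=True),
    Column("status", Integer),
    Column("request_type", String),
)
metadata.create_all(engine)
with engine.begin() as conn:
    conn.execute(requests.insert(), [
        {"id_r": 1, "status": 1, "request_type": "new"},
        {"id_r": 2, "status": 1, "request_type": "old"},
        {"id_r": 3, "status": 2, "request_type": "new"},
    ])


def build_query(table, **filters):
    """Build a SELECT with one equality condition per keyword argument."""
    query = select(table)
    for name, value in filters.items():
        if value is None:
            continue  # None means "no filter", not "column IS NULL"
        query = query.where(table.c[name] == value)
    return query


with engine.connect() as conn:
    stmt = build_query(requests, status=1, request_type="new", id_r=None)
    ids = [row.id_r for row in conn.execute(stmt)]
print(ids)
```

The resulting statement can be passed to pd.read_sql together with the engine or a connection, in the same way as the query.statement used in the question.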

SQLAlchemy : Querying a database column using elements from a given array [duplicate]

I'm trying to do this query in sqlalchemy
SELECT id, name FROM user WHERE id IN (123, 456)
I would like to bind the list [123, 456] at execution time.
How about
session.query(MyUserClass).filter(MyUserClass.id.in_((123,456))).all()
edit: Without the ORM, it would be
session.execute(
    select(
        [MyUserTable.c.id, MyUserTable.c.name],
        MyUserTable.c.id.in_((123, 456))
    )
).fetchall()
In this legacy calling style, select() takes two parameters: the first is a list of fields to retrieve, the second is the where condition (SQLAlchemy 1.4 and later prefer select(col1, col2).where(condition)). You can access all fields on a table object via the c (or columns) property.
Assuming you use the declarative style (i.e. ORM classes), it is pretty easy:
query = db_session.query(User.id, User.name).filter(User.id.in_([123,456]))
results = query.all()
db_session is your database session here, while User is the ORM class with __tablename__ equal to "users".
An alternative is to use raw SQL with SQLAlchemy. I use SQLAlchemy 0.9.8, Python 2.7, MySQL 5.x, and MySQL-Python as the connector; in this case, a tuple is needed. My code is listed below:
id_list = [1, 2, 3, 4, 5] # in most case we have an integer list or set
s = text('SELECT id, content FROM myTable WHERE id IN :id_list')
conn = engine.connect() # get a mysql connection
rs = conn.execute(s, id_list=tuple(id_list)).fetchall()
Hope everything works for you.
Just wanted to share my solution using sqlalchemy and pandas in python 3. Perhaps, one would find it useful.
import sqlalchemy as sa
import pandas as pd
engine = sa.create_engine("postgresql://postgres:my_password@my_host:my_port/my_db")
values = [val1,val2,val3]
query = sa.text("""
SELECT *
FROM my_table
WHERE col1 IN :values;
""")
query = query.bindparams(values=tuple(values))
df = pd.read_sql(query, engine)
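Binding a tuple to IN :values depends on the database driver adapting tuples, which psycopg2 and MySQL-Python do but others may not. A driver-independent variant, sketched below against an in-memory SQLite database, marks the parameter as expanding so that SQLAlchemy renders one placeholder per list element at execution time. The table and data are invented for the example, and the code assumes SQLAlchemy 1.4 or later.

```python
from sqlalchemy import bindparam, create_engine, text

engine = create_engine("sqlite://")
with engine.begin() as conn:
    conn.execute(text("CREATE TABLE my_table (id INTEGER PRIMARY KEY, content TEXT)"))
    conn.execute(
        text("INSERT INTO my_table (id, content) VALUES (:id, :content)"),
        [{"id": i, "content": f"row {i}"} for i in range(1, 6)],
    )

# expanding=True turns the single :id_list parameter into (?, ?, ...) when
# the statement is executed, so a plain list can be passed.
stmt = text(
    "SELECT id, content FROM my_table WHERE id IN :id_list ORDER BY id"
).bindparams(bindparam("id_list", expanding=True))

with engine.connect() as conn:
    found = [row.id for row in conn.execute(stmt, {"id_list": [2, 4, 99]})]
print(found)
```

The same stmt object can be handed to pd.read_sql with params={"id_list": [...]} if a DataFrame is wanted.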
With the expression API, which based on the comments is what this question is asking for, you can use the in_ method of the relevant column.
To query
SELECT id, name FROM user WHERE id in (123,456)
use
myList = [123, 456]
select = sqlalchemy.sql.select([user_table.c.id, user_table.c.name], user_table.c.id.in_(myList))
result = conn.execute(select)
for row in result:
    process(row)
This assumes that user_table and conn have been defined appropriately.
Or use .in_(list), similar to what @Carl has already suggested:
stmt = select(
    id,
    name
).where(
    id.in_(idlist),
)
Complete code assuming you have the data model in User class:
import pandas as pd
from sqlalchemy import select
from sqlalchemy.exc import SQLAlchemyError
from sqlalchemy.orm import Session
def fetch_name_ids(engine, idlist):
    # create an empty dataframe
    df = pd.DataFrame()
    # create session with engine (outside try, so it always exists in except/finally)
    session = Session(engine, future=True)
    try:
        stmt = select(
            User.id,
            User.name
        ).where(
            User.id.in_(idlist),
        )
        data = session.execute(stmt)
        df = pd.DataFrame(data.all())
        if len(df) > 0:
            df.columns = data.keys()
        else:
            columns = data.keys()
            df = pd.DataFrame(columns=columns)
    except SQLAlchemyError:
        session.rollback()
        # re-raise the original exception; raising a plain string is a TypeError
        raise
    else:
        session.commit()
    finally:
        session.close()
        engine.dispose()
    return df
The answers to this question solve the SELECT case, but unfortunately they did not work for my UPDATE query. The approach below works for UPDATE and can be used in SELECT conditions as well.
Update Query Solution:
id_list = [1, 2, 3, 4, 5] # in most cases we have an integer list or set
query = 'update myTable set content = 1 WHERE id IN {id_list}'.format(id_list=tuple(id_list))
conn.execute(query)
Note: Use a tuple instead of a list.
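One caveat with formatting a tuple into the SQL string: a one-element tuple is rendered as (1,), which is not valid SQL, and any non-numeric value would open the door to SQL injection. A possible alternative, sketched here with SQLAlchemy Core against an in-memory SQLite table, lets in_() generate the placeholders. The table definition is assumed for the example and the code targets SQLAlchemy 1.4 or later.

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine, select, update

engine = create_engine("sqlite://")
metadata = MetaData()
my_table = Table(
    "myTable", metadata,
    Column("id", Integer, primary_key=True),
    Column("content", Integer),
)
metadata.create_all(engine)

id_list = [2]  # a single-element list works just like a longer one

with engine.begin() as conn:
    conn.execute(my_table.insert(), [{"id": i, "content": 0} for i in range(1, 6)])
    # The ids are sent as bound parameters; nothing is formatted into the SQL.
    result = conn.execute(
        update(my_table).where(my_table.c.id.in_(id_list)).values(content=1)
    )
    updated = result.rowcount
    changed_ids = conn.execute(
        select(my_table.c.id).where(my_table.c.content == 1)
    ).scalars().all()
print(updated, changed_ids)
```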
Just an addition to the answers above.
If you want to execute a SQL with an "IN" statement you could do this:
ids_list = [1,2,3]
query = "SELECT id, name FROM user WHERE id IN %s"
args = [(ids_list,)] # Don't forget the "comma", to force the tuple
conn.execute(query, args)
Two points:
There is no need for parentheses in the IN clause (as in "... IN (%s)"); just write "... IN %s".
Force the list of ids to be one element of a tuple. Don't forget the ",": (ids_list,)
EDIT
Watch out that if the length of list is one or zero this will raise an error!

sqlalchemy join and order by on multiple tables

I'm working with a database that has a relationship that looks like:
class Source(Model):
    id = Identifier()
class SourceA(Source):
    source_id = ForeignKey('source.id', nullable=False, primary_key=True)
    name = Text(nullable=False)
class SourceB(Source):
    source_id = ForeignKey('source.id', nullable=False, primary_key=True)
    name = Text(nullable=False)
class SourceC(Source, ServerOptions):
    source_id = ForeignKey('source.id', nullable=False, primary_key=True)
    name = Text(nullable=False)
What I want to do is join all tables Source, SourceA, SourceB, SourceC and then order_by name.
Sounds easy to me, but I've been banging my head on this for a while now and my head's starting to hurt. Also, I'm not very familiar with SQL or SQLAlchemy, so there's been a lot of browsing the docs, but to no avail. Maybe I'm just not seeing it. This seems to be close, albeit related to a newer version than what I have available (see versions below).
I feel close not that that means anything. Here's my latest attempt which seems good up until the order_by call.
Sources = [SourceA, SourceB, SourceC]
# list of join on Source
joins = [session.query(Source).join(source) for source in Sources]
# union the list of joins
query = joins.pop(0).union_all(*joins)
query seems right at this point as far as I can tell i.e. query.all() works. So now I try to apply order_by which doesn't throw an error until .all is called.
Attempt 1: I just use the attribute I want
query.order_by('name').all()
# throws sqlalchemy.exc.ProgrammingError: (ProgrammingError) column "name" does not exist
Attempt 2: I just use the defined column attribute I want
query.order_by(SourceA.name).all()
# throws sqlalchemy.exc.ProgrammingError: (ProgrammingError) missing FROM-clause entry for table "SourceA"
Is it obvious? What am I missing? Thanks!
versions:
sqlalchemy.version = '0.8.1'
(PostgreSQL) 9.1.3
EDIT
I'm dealing with a framework that wants a handle to a query object. I have a bare query that appears to accomplish what I want but I would still need to wrap it in a query object. Not sure if that's possible. Googling ...
select = """
select s.*, a.name from Source s inner join SourceA a on s.id = a.Source_id
union
select s.*, b.name from Source s inner join SourceB b on s.id = b.Source_id
union
select s.*, c.name from Source s inner join SourceC c on s.id = c.Source_id
ORDER BY "name";
"""
selectText = text(select)
result = session.execute(selectText)
# how to put result into a query. maybe Query(selectText)? googling...
result.fetchall()
Assuming that the coalesce function is good enough, the examples below should point you in the right direction. One option builds the list of children automatically, while the other is explicit.
This is not the query you specified in your edit, but it does let you sort (your original request):
from sqlalchemy import func
from sqlalchemy.orm import with_polymorphic
def test_explicit():
    # specify all children tables to be queried
    Sources = [SourceA, SourceB, SourceC]
    AllSources = with_polymorphic(Source, Sources)
    name_col = func.coalesce(*(_s.name for _s in Sources)).label("name")
    query = session.query(AllSources).order_by(name_col)
    for x in query:
        print(x)
def test_implicit():
    # get all children tables in the query
    from sqlalchemy.orm import class_mapper
    _map = class_mapper(Source)
    Sources = [_smap.class_
               for _smap in _map.self_and_descendants
               if _smap != _map  # @note: exclude base class, it has no `name`
               ]
    AllSources = with_polymorphic(Source, Sources)
    name_col = func.coalesce(*(_s.name for _s in Sources)).label("name")
    query = session.query(AllSources).order_by(name_col)
    for x in query:
        print(x)
Your first attempt sounds like it isn't working because there is no name in Source, which is the root table of the query. In addition, there will be multiple name columns after your joins, so you will need to be more specific. Try
query.order_by('SourceA.name').all()
As for your second attempt, what is ServerA?
query.order_by(ServerA.name).all()
Probably a typo, but not sure if it's for SO or your code. Try:
query.order_by(SourceA.name).all()
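For readers on a current SQLAlchemy (1.4 or later), the coalesce idea from the first answer can be tried in isolation with the sketch below. It is not the asker's schema: the declarative models, the discriminator column kind and the sample names are invented so that the example runs against an in-memory SQLite database, and only two subclasses are modelled.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base, with_polymorphic

Base = declarative_base()


class Source(Base):
    __tablename__ = "source"
    id = Column(Integer, primary_key=True)
    kind = Column(String)  # hypothetical discriminator column
    __mapper_args__ = {"polymorphic_on": kind, "polymorphic_identity": "source"}


class SourceA(Source):
    __tablename__ = "source_a"
    source_id = Column(Integer, ForeignKey("source.id"), primary_key=True)
    name = Column(String, nullable=False)
    __mapper_args__ = {"polymorphic_identity": "a"}


class SourceB(Source):
    __tablename__ = "source_b"
    source_id = Column(Integer, ForeignKey("source.id"), primary_key=True)
    name = Column(String, nullable=False)
    __mapper_args__ = {"polymorphic_identity": "b"}


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([SourceA(name="pear"), SourceB(name="apple"), SourceA(name="fig")])
    session.commit()

    # LEFT OUTER JOIN every subclass table onto source in a single query.
    AllSources = with_polymorphic(Source, [SourceA, SourceB])
    # Each row has a name in exactly one subclass table; coalesce picks it.
    name_col = func.coalesce(AllSources.SourceA.name, AllSources.SourceB.name)
    rows = session.execute(select(AllSources).order_by(name_col)).scalars().all()
    names = [row.name for row in rows]
print(names)
```

Ordering on a single coalesced expression avoids both errors from the question: there is no bare name column to look up, and every subclass table is already present in the FROM clause.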

sqlalchemy+sqlite stripping column names with dots?

I am using SQLAlchemy 0.7.6. I am aliasing columns with:
column = table.c["name"].label("foo.bar")
and SQLite uses only 'bar' as result field alias. Is there any workaround for that?
Example code:
create_table("sqlite:////tmp/test.sqlite", schema)
engine = create_engine(url)
metadata = MetaData(engine, reflect=True)
table = Table("test_table", metadata, schema=schema, autoload=True)
column = table.c["name"].label("foo.bar")
cursor = sql.expression.select([column])
row = cursor.execute().fetchone()
print "keys are: %s" % (row.keys(), )
Will print:
keys are: [u'bar']
instead of:
keys are: [u'foo.bar']
Works for postgres.
Here is full test code: https://gist.github.com/2506388
I've already reported that to the sqlalchemy lists, however meanwhile I would like to know if anyone else is experiencing similar problem and might have a workaround.
Going to be patched in SQLAlchemy with an engine option. See mailing list thread for more information.
Meanwhile, the workaround is:
# select is sqlalchemy.sql.expression.select()
# each selected column was derived as column = table.c[reference].label(label_with_dot)
labels = [c.name for c in select.columns]
...
record = dict(zip(labels, row))
The solution after the patch is:
conn = engine.connect().execution_options(sqlite_raw_colnames=True)
result = conn.execute(stmt)
