I'm trying to do this query in sqlalchemy
SELECT id, name FROM user WHERE id IN (123, 456)
I would like to bind the list [123, 456] at execution time.
How about
session.query(MyUserClass).filter(MyUserClass.id.in_((123,456))).all()
edit: Without the ORM, it would be
session.execute(
    select(
        [MyUserTable.c.id, MyUserTable.c.name],
        MyUserTable.c.id.in_((123, 456))
    )
).fetchall()
select() takes two parameters here: the first is a list of fields to retrieve, and the second is the WHERE condition. You can access all fields on a table object via the c (or columns) property.
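Note that this is the old two-argument form of select(). On newer SQLAlchemy (1.4+), a rough equivalent, assuming the same MyUserTable and session, would be:
from sqlalchemy import select

stmt = select(MyUserTable.c.id, MyUserTable.c.name).where(
    MyUserTable.c.id.in_([123, 456])
)
rows = session.execute(stmt).fetchall()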
Assuming you use the declarative style (i.e. ORM classes), it is pretty easy:
query = db_session.query(User.id, User.name).filter(User.id.in_([123,456]))
results = query.all()
db_session is your database session here, while User is the ORM class with __tablename__ equal to "users".
An alternative is to use raw SQL with SQLAlchemy. I use SQLAlchemy 0.9.8, Python 2.7, MySQL 5.x, and MySQL-Python as the connector; in this case, a tuple is needed. My code is listed below:
from sqlalchemy import text

id_list = [1, 2, 3, 4, 5]  # in most cases we have an integer list or set
s = text('SELECT id, content FROM myTable WHERE id IN :id_list')
conn = engine.connect()  # get a mysql connection
rs = conn.execute(s, id_list=tuple(id_list)).fetchall()
Hope everything works for you.
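On SQLAlchemy 1.2 and newer, an expanding bind parameter avoids the tuple conversion; a minimal sketch, assuming the same engine and table:
from sqlalchemy import text, bindparam

s = text('SELECT id, content FROM myTable WHERE id IN :id_list')
s = s.bindparams(bindparam('id_list', expanding=True))
with engine.connect() as conn:
    rs = conn.execute(s, {'id_list': [1, 2, 3, 4, 5]}).fetchall()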
Just wanted to share my solution using SQLAlchemy and pandas in Python 3. Perhaps someone will find it useful.
import sqlalchemy as sa
import pandas as pd
engine = sa.create_engine("postgresql://postgres:my_password@my_host:my_port/my_db")
values = [val1,val2,val3]
query = sa.text("""
SELECT *
FROM my_table
WHERE col1 IN :values;
""")
query = query.bindparams(values=tuple(values))
df = pd.read_sql(query, engine)
With the expression API, which based on the comments is what this question is asking for, you can use the in_ method of the relevant column.
To query
SELECT id, name FROM user WHERE id in (123,456)
use
myList = [123, 456]
select = sqlalchemy.sql.select([user_table.c.id, user_table.c.name], user_table.c.id.in_(myList))
result = conn.execute(select)
for row in result:
process(row)
This assumes that user_table and conn have been defined appropriately.
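For completeness, a minimal sketch of how user_table and conn could be set up (the column definitions and database URL here are assumptions based on the query):
from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String

engine = create_engine("sqlite:///example.db")  # any supported database URL
metadata = MetaData()
user_table = Table(
    "user", metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String(50)),
)
conn = engine.connect()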
Or maybe use .in_(list), similar to what @Carl has already suggested, as in:
stmt = select(
    id,
    name
).where(
    id.in_(idlist),
)
Complete code assuming you have the data model in User class:
import pandas as pd
from sqlalchemy import select
from sqlalchemy.orm import Session
from sqlalchemy.exc import SQLAlchemyError

def fetch_name_ids(engine, idlist):
    # create an empty dataframe
    df = pd.DataFrame()
    try:
        # create session with engine
        session = Session(engine, future=True)
        stmt = select(
            User.id,   # User is your ORM model class
            User.name
        ).where(
            User.id.in_(idlist),
        )
        data = session.execute(stmt)
        df = pd.DataFrame(data.all())
        if len(df) > 0:
            df.columns = data.keys()
        else:
            columns = data.keys()
            df = pd.DataFrame(columns=columns)
    except SQLAlchemyError as e:
        error = str(e.__dict__['orig'])  # driver-level error message
        session.rollback()
        raise RuntimeError(error) from e
    else:
        session.commit()
    finally:
        session.close()
        engine.dispose()
    return df
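A quick usage sketch (the database URL here is just an assumption; use your own):
from sqlalchemy import create_engine

engine = create_engine("sqlite:///example.db")  # assumed URL
df = fetch_name_ids(engine, [123, 456])
print(df.head())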
The answers here cover the SELECT query; unfortunately, that approach was not working for my UPDATE query. The solution below helps with UPDATE, and it works for SELECT conditions as well.
Update Query Solution:
id_list = [1, 2, 3, 4, 5] # in most cases we have an integer list or set
query = 'update myTable set content = 1 WHERE id IN {id_list}'.format(id_list=tuple(id_list))
conn.execute(query)
Note: Use a tuple instead of a list.
Just an addition to the answers above.
If you want to execute a SQL with an "IN" statement you could do this:
ids_list = [1,2,3]
query = "SELECT id, name FROM user WHERE id IN %s"
args = [(ids_list,)] # Don't forget the "comma", to force the tuple
conn.execute(query, args)
Two points:
There is no need for parentheses around the IN placeholder (like "... IN (%s)"); just write "... IN %s".
Force the list of your ids to be a single element of a tuple. Don't forget the trailing comma: (ids_list,)
EDIT
Watch out: if the length of the list is one or zero, this will raise an error!
Related
I want to get data from a DB into a pandas DataFrame in Python. I use the following code:
self.cursor = self.connection.cursor()
query = """
SELECT * FROM `an_visit` AS `visit`
JOIN `an_ip` AS `ip` ON (`visit`.`ip_id` = `ip`.`ip_id`)
JOIN `an_useragent` AS `useragent` ON (`visit`.`useragent_id` = `useragent`.`useragent_id`)
JOIN `an_pageview` AS `pageview` ON (`visit`.`visit_id` = `pageview`.`visit_id`)
WHERE `visit`.`visit_id` BETWEEN %s AND %s
"""
self.cursor.execute(query, (start_id, end_id))
df = pd.DataFrame(self.cursor.fetchall())
This code works, but I want to get the column names as well. I tried the approach from this question: MySQL: Get column name or alias from query,
but it did not work:
fields = map(lambda x: x[0], self.cursor.description)
result = [dict(zip(fields, row)) for row in self.cursor.fetchall()]
How can I get the column names from the DB into the DataFrame? Thanks.
The easy way to include column names in the result set is to set dictionary=True, as follows:
self.cursor = self.connection.cursor(dictionary=True)
Then fetchone(), fetchmany(), and fetchall() all return dictionaries with column names as keys and the row data as values.
Check out these links:
https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursordict.html
https://mariadb-corporation.github.io/mariadb-connector-python/connection.html
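Since each row then comes back as a dict, getting the column names into the DataFrame is a one-liner; a minimal sketch, assuming mysql-connector-python and the an_visit table from the question:
import pandas as pd

cursor = connection.cursor(dictionary=True)
cursor.execute("SELECT * FROM an_visit LIMIT 5")
df = pd.DataFrame(cursor.fetchall())  # column names are taken from the dict keys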
What worked for me is:
field_names = [i[0] for i in self.cursor.description ]
The best way to list all the columns in the database is to execute this query from the connection cursor:
SELECT TABLE_CATALOG,TABLE_SCHEMA,TABLE_NAME,COLUMN_NAME,DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA='<schema>' AND TABLE_NAME = '<table_name>'
There is a column_names property on the MySQL cursor that you can use:
row = dict(zip(self.cursor.column_names, self.cursor.fetchone()))
https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursor-column-names.html
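Combined with the code from the question, that property also gives the DataFrame its column names directly; a small sketch, assuming the MySQL Connector/Python cursor from the linked docs:
import pandas as pd

cursor.execute(query, (start_id, end_id))
df = pd.DataFrame(cursor.fetchall(), columns=cursor.column_names)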
I am trying to create a function which can filter a sql table using SQLAlchemy, with optional parameters.
The function is
def fetch_new_requests(status, request_type, request_subtype, engine, id_r=None):
    table = Table('Sample_Table', metadata, autoload=True,
                  autoload_with=engine)
    query = session.query(load_requests).filter_by(status=status,
                                                   request_type=request_type,
                                                   request_subtype=request_subtype,
                                                   id_r=id_r)
    return pd.read_sql(query.statement, session.bind)
But it returns an empty table every time if I do not define the id_r variable.
I have googled but cannot find a workaround.
Then I tried to use **kwargs, but it is not really what I need; I still have to explicitly define column names and run into the same issue with id_r:
def fetch_new_requests(**kwargs):
    for x in kwargs.values():
        query = session.query(load_requests).filter_by(status=x)
    return pd.read_sql(query.statement, session.bind)
My ideal result
def fetch_new_requests(any column names, values of the columns):
    for x in kwargs.values():
        query = session.query(load_requests).filter_by(column_name=column_value)
    return pd.read_sql(query.statement, session.bind)
In theory I could use two lists or a dict, but if there is another solution I would be happy to hear it.
I can only give you an answer for SQLAlchemy Core syntax, but it works with a dict! The dict has the column names as keys and the required values as values.
table = Table('Sample_Table', metadata, autoload=True,
              autoload_with=engine)
query = table.select()
where_dict = {"status": 1, "request_type": "something"}
for k, v in where_dict.items():
    query = query.where(getattr(table.c, k) == v)
Just for completeness, here's the syntax to select only specific fields (your question sounds like you're also looking for this):
query = table.select().with_only_columns(select_columns) # select_columns is a list
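Putting that together with the original function signature, a sketch of the whole thing (the column names and the call style are assumptions taken from the question):
import pandas as pd
from sqlalchemy import MetaData, Table

def fetch_new_requests(engine, **filters):
    metadata = MetaData()
    table = Table('Sample_Table', metadata, autoload=True, autoload_with=engine)
    query = table.select()
    for column_name, value in filters.items():
        if value is not None:  # optional parameters that were not supplied are skipped
            query = query.where(table.c[column_name] == value)
    return pd.read_sql(query, engine)

# e.g. fetch_new_requests(engine, status=1, request_type='new', id_r=None)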
I have data in a pandas DataFrame which I am storing in an SQLite database using Python. When I try to query the tables inside it, I get the results but without the column names. Can someone please guide me?
sql_query = """Select date(report_date), insertion_order_id, sum(impressions), sum(clicks), (sum(clicks)+0.0)/sum(impressions)*100 as CTR
from RawDailySummaries
Group By report_date, insertion_order_id
Having report_date like '2014-08-12%' """
cursor.execute(sql_query)
query1 = cursor.fetchall()
for i in query1:
    print i
Below is the output that I get
(u'2014-08-12', 10187, 2024, 8, 0.3952569169960474)
(u'2014-08-12', 12419, 15054, 176, 1.1691244851866613)
What do I need to do to display the results in a tabular form with column names?
In DB-API 2.0 compliant clients, cursor.description is a sequence of 7-item sequences of the form (<name>, <type_code>, <display_size>, <internal_size>, <precision>, <scale>, <null_ok>), one for each column, as described here. Note description will be None if the result of the execute statement is empty.
If you want to create a list of the column names, you can use list comprehension like this: column_names = [i[0] for i in cursor.description] then do with them whatever you'd like.
Alternatively, you can set the row_factory parameter of the connection object to something that provides column names with the results. An example of a dictionary-based row factory for SQLite is found here, and you can see a discussion of the sqlite3.Row type below that.
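For SQLite specifically, the row_factory route looks roughly like this (a sketch; the database file name is an assumption):
import sqlite3

connection = sqlite3.connect("mydata.db")   # assumed file name
connection.row_factory = sqlite3.Row        # rows now expose column names
cursor = connection.cursor()
cursor.execute("SELECT * FROM RawDailySummaries LIMIT 5")
for row in cursor.fetchall():
    print(dict(row))  # or row.keys() for just the column names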
Step 1: Select your engine like pyodbc, SQLAlchemy etc.
Step 2: Establish connection
cursor = connection.cursor()
Step 3: Execute SQL statement
cursor.execute("Select * from db.table where condition=1")
Step 4: Extract Header from connection variable description
headers = [i[0] for i in cursor.description]
print(headers)
Try pandas .read_sql(); I can't check it right now, but it should be something like:
pd.read_sql(Q, connection)
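With the query from the question, that would look roughly like this (a sketch, assuming connection is the open SQLite connection):
import pandas as pd

df = pd.read_sql(sql_query, connection)  # column names become the DataFrame columns
print(df.head())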
Here is a sample code using cx_Oracle, that should do what is expected:
import cx_Oracle

def test_oracle():
    connection = cx_Oracle.connect('user', 'password', 'tns')
    try:
        cursor = connection.cursor()
        cursor.execute('SELECT day_no,area_code ,start_date from dic.b_td_m_area where rownum<10')
        # only print head
        title = [i[0] for i in cursor.description]
        print(title)
        # column info
        for x in cursor.description:
            print(x)
    finally:
        cursor.close()

if __name__ == "__main__":
    test_oracle()
The problem is quite simple to explain. You have a DB, you have a table, and you have an attribute.
What you want to do is connect to the database and query the table for the maximum value of a specific attribute.
What I tried so far is:
attr_name = 'foo'
meta = MetaData()
meta.reflect(bind=self._engine)
obj_table = meta.tables[table_name]
print("<< select max(attr_name) from obj_table >>")
What I would like to do is print out the max. I tried with sessions and with getattr... no clue. I just want to get the max out of this table, from a column whose name is passed as a parameter (so I can't use dot notation).
Any clue?
from sqlalchemy import select, func
col = getattr(obj_table.c, attr_name)
q = select([func.max(col)], obj_table)
with self._engine.connect() as conn:
    res = conn.execute(q)
    max_value = res.fetchone()[0]
The important thing here is that table columns are accessed via obj_table.c (or obj_table.columns).
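On SQLAlchemy 1.4+ the same idea reads slightly differently; a sketch, untested against the original setup:
from sqlalchemy import select, func

q = select(func.max(obj_table.c[attr_name]))
with self._engine.connect() as conn:
    max_value = conn.execute(q).scalar()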
I am using SQLAlchemy 0.7.6. I am aliasing columns with:
column = table.c["name"].label("foo.bar")
and SQLite uses only 'bar' as result field alias. Is there any workaround for that?
Example code:
create_table("sqlite:////tmp/test.sqlite", schema)
engine = create_engine(url)
metadata = MetaData(engine, reflect=True)
table = Table("test_table", metadata, schema=schema, autoload=True)
column = table.c["name"].label("foo.bar")
cursor = sql.expression.select([column])
row = cursor.execute().fetchone()
print "keys are: %s" % (row.keys(), )
Will print:
keys are: [u'bar']
instead of:
keys are: [u'foo.bar']
Works for postgres.
Here is full test code: https://gist.github.com/2506388
I've already reported this to the SQLAlchemy mailing list; meanwhile, I would like to know if anyone else is experiencing a similar problem and might have a workaround.
This is going to be patched in SQLAlchemy with an engine option. See the mailing list thread for more information.
Meanwhile, the workaround is:
# select is sqlalchemy.sql.expression.select()
# each selected column was derived as column = table.c[reference].label(label_with_dot)
labels = [c.name for c in select.columns]
...
record = dict(zip(labels, row))
The solution after the patch is:
conn = engine.connect().execution_options(sqlite_raw_colnames=True)
result = conn.execute(stmt)