In trying to replicate a MySQL query in SQLAlchemy, I've hit a snag in specifying which tables to select from.
The query that works is
SELECT c.*
FROM attacks AS a INNER JOIN hosts h ON a.host_id = h.id
INNER JOIN cities c ON h.city_id = c.id
GROUP BY c.id;
I try to accomplish this in SQLAlchemy using the following function
def all_cities():
    session = connection.globe.get_session()
    destination_city = aliased(City, name='destination_city')
    query = session.query(City). \
        select_from(Attack).\
        join((Host, Attack.host_id == Host.id)).\
        join((destination_city, Host.city_id == destination_city.id)).\
        group_by(destination_city.id)
    print query
    results = [result.serialize() for result in query]
    session.close()
    file(os.path.join(os.path.dirname(__file__), "servers.geojson"), 'a').write(geojson.feature_collection(results))
When printing the query, I end up with ALMOST the right query
SELECT
cities.id AS cities_id,
cities.country_id AS cities_country_id,
cities.province AS cities_province,
cities.latitude AS cities_latitude,
cities.longitude AS cities_longitude,
cities.name AS cities_name
FROM cities, attacks
INNER JOIN hosts ON attacks.host_id = hosts.id
INNER JOIN cities AS destination_city ON hosts.city_id = destination_city.id
GROUP BY destination_city.id
However, you will note that it is selecting from cities, attacks...
How can I get it to select only from the attacks table?
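The problem is this line:
query = session.query(City)
It selects from the unaliased cities table, while your joins go to the alias destination_city. SQLAlchemy treats City and destination_city as two separate FROM elements, so the unaliased cities table is added to the FROM clause next to attacks:
FROM cities, attacks
Query the alias instead, session.query(destination_city), or drop the alias and join to City directly, and the statement selects from attacks only.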
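A minimal sketch of one way to avoid the extra FROM entry: select the same aliased entity that the joins target. It assumes SQLAlchemy 1.4 or newer and uses cut-down, hypothetical versions of the City, Host and Attack models (only the columns needed for the joins), so the real models will differ.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, select
from sqlalchemy.orm import aliased, declarative_base

Base = declarative_base()


# Hypothetical, reduced models; only the join columns are included.
class City(Base):
    __tablename__ = "cities"
    id = Column(Integer, primary_key=True)
    name = Column(String)


class Host(Base):
    __tablename__ = "hosts"
    id = Column(Integer, primary_key=True)
    city_id = Column(Integer, ForeignKey("cities.id"))


class Attack(Base):
    __tablename__ = "attacks"
    id = Column(Integer, primary_key=True)
    host_id = Column(Integer, ForeignKey("hosts.id"))


destination_city = aliased(City, name="destination_city")

# Select the alias itself, so no unaliased "cities" entry reaches FROM.
stmt = (
    select(destination_city)
    .select_from(Attack)
    .join(Host, Attack.host_id == Host.id)
    .join(destination_city, Host.city_id == destination_city.id)
    .group_by(destination_city.id)
)

sql = str(stmt)
print(sql)
```

The printed statement starts its FROM clause at attacks and reaches cities only through the destination_city alias.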
Can I update a parameter in a SQL query from Python? I want to read a SQL query from a file, run it, and process the returned data in Python. The query filters on a date that I currently have to edit inside the SQL file every time, and I would like to set that value from Python instead.
The SQL query is like the following:
set nocount on;
declare @pdate datetime
set @pdate = '2022-12-31'
select
cast(L.Date as datetime) as Date
, Amount
, AccountNumber
, Property
, County
, ZipCode
, Price
, Owner
from Account.Detail L
inner join Owner.Detail M
on L.Date = M.Date
and L.Number = M.Number
inner join Purchase.Detail P
on L.Date = P.Date
and L.Purchase.Number = P.Purchase.Number
where L.Date = @pdate
and Purchase.Number not in ('CL1', 'CL2')
and Amount > 0
I want to run Python code like the following:
import pandas as pd
import pyodbc

server = 'my_server_name'
database = 'my_database_name'
connection = pyodbc.connect(Trusted_Connection = "yes", DRIVER = "{SQL Server}", SERVER = server, DATABASE = database)
cursor = connection.cursor()
# Read the query text from the .sql file
with open('Pathway_for_SQL_Query.sql') as f:
    query = f.read()
data = pd.read_sql(query, connection)
connection.close()
At the moment I have to edit the declared pdate value in the SQL query every time. Can I set it from Python instead?
Instead of parsing and replacing an SQL script, you could use bind variables and have Python control the value (note the "?" in the query):
pdate = "some value"
# query could be read from file, given here for simplicity
query = """
select
cast(L.Date as datetime) as Date
, Amount
, AccountNumber
, Property
, County
, ZipCode
, Price
, Owner
from Account.Detail L
inner join Owner.Detail M
on L.Date = M.Date
and L.Number = M.Number
inner join Purchase.Detail P
on L.Date = P.Date
and L.Purchase.Number = P.Purchase.Number
where L.Date = ?
and Purchase.Number not in ('CL1', 'CL2')
and Amount > 0
"""
data = pd.read_sql(query, connection, params=(pdate,))
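To show the bind-variable idea in a form that runs without a SQL Server instance, here is a small sketch that uses the standard-library sqlite3 module as a stand-in for pyodbc; both accept "?" placeholders. The detail table, its columns and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory database standing in for the real server (assumption).
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE detail (date TEXT, amount REAL, account_number TEXT)"
)
connection.executemany(
    "INSERT INTO detail VALUES (?, ?, ?)",
    [
        ("2022-12-31", 100.0, "A1"),
        ("2022-12-31", -5.0, "A2"),
        ("2023-01-31", 50.0, "A3"),
    ],
)

# The query text never changes; only the bound value does.
query = """
SELECT date, amount, account_number
FROM detail
WHERE date = ?
  AND amount > 0
"""


def fetch_for_date(pdate):
    """Run the fixed query with pdate bound to the '?' placeholder."""
    return connection.execute(query, (pdate,)).fetchall()


rows_dec = fetch_for_date("2022-12-31")
rows_jan = fetch_for_date("2023-01-31")
print(rows_dec)
print(rows_jan)
```

Because the date travels as a parameter rather than as part of the SQL text, the same file can be reused for any date without editing it.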
I want to convert this SQL query to SQLAlchemy:
SELECT * FROM dbcloud.client_feedback as a
join (select distinct(max(submitted_on)) sub,pb_channel_id pb, mail_thread_id mail from client_feedback group by pb_channel_id, mail_thread_id) as b
where (a.submitted_on = b.sub and a.pb_channel_id = b.pb) or ( a.submitted_on = b.sub and a.mail_thread_id = b.mail )
I can't find an equivalent of the AS keyword in SQLAlchemy.
I think that what you may be looking for is .label(name).
Assuming you have a model
class MyModel(db.Model):
    id = db.Column(primary_key=True)
    name = db.Column()
here is an example of how .label(name) can be used
query = db.session.query(MyModel.name.label('a'))
will produce the SQL
SELECT my_model.name as a FROM my_model
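.label() names a column. For the table alias ("client_feedback as a") and the derived table ("as b") in the question, the counterparts are aliased() and .subquery(name=...). The sketch below is one possible translation, assuming SQLAlchemy 1.4 or newer and a hypothetical ClientFeedback model whose column types are guessed. The dbcloud schema is left out, and the WHERE condition is moved into the join's ON clause and factored into an equivalent form.

```python
from sqlalchemy import Column, DateTime, Integer, and_, func, or_, select
from sqlalchemy.orm import aliased, declarative_base

Base = declarative_base()


# Hypothetical model; column types are assumptions.
class ClientFeedback(Base):
    __tablename__ = "client_feedback"
    id = Column(Integer, primary_key=True)
    submitted_on = Column(DateTime)
    pb_channel_id = Column(Integer)
    mail_thread_id = Column(Integer)


# "client_feedback AS a"
a = aliased(ClientFeedback, name="a")

# "(SELECT max(submitted_on) sub, ... GROUP BY ...) AS b"
b = (
    select(
        func.max(ClientFeedback.submitted_on).label("sub"),
        ClientFeedback.pb_channel_id.label("pb"),
        ClientFeedback.mail_thread_id.label("mail"),
    )
    .group_by(ClientFeedback.pb_channel_id, ClientFeedback.mail_thread_id)
    .subquery(name="b")
)

# Same logic as the original WHERE: the date matches, and either the
# channel id or the mail thread id matches.
stmt = select(a).join(
    b,
    and_(
        a.submitted_on == b.c.sub,
        or_(a.pb_channel_id == b.c.pb, a.mail_thread_id == b.c.mail),
    ),
)

sql = str(stmt)
print(sql)
```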
I couldn't find clear documentation on this, so pointers to docs are welcome. What I want is to pass multiple conditions to .where() and .order_by() in a way that is safe from SQL injection.
Here's how I am currently doing it. There are two tables, Archive and Backup. I filter by archive.city, archive.zip, and backup.serial, and then order by those same fields. The values come from the user via URL parameters, so I need to make sure they are sanitized and safe from SQL injection.
filters = []
sorts = []
if 'city' in query:
    city = query['city']
    filters.append(text(f'archive.city = {city}'))
    sorts.append(text(f'archive.city = {city}'))
if 'zip' in query:
    zip = query['zip']
    filters.append(text(f'archive.zip > {zip}'))
    sorts.append(text(f'archive.zip DESC'))
if 'serial' in query:
    serial = query['serial']
    filters.append(text(f'backup.serial IN {serial}'))
    sorts.append(text(f'backup.serial ASC'))
with Session(engine) as session:
    results = session.exec(select(Archive, Backup)
        .join(Backup)
        .where(and_(*filters))
        .order_by(*sorts)).all()
As I understand it, text() with interpolated values is not safe from SQL injection, so how do I transform this so that it does what I want and is safe?
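You can invoke .where() and .order_by() on a select() multiple times and SQLAlchemy will logically "and" the conditions for you. If you compare columns to Python values instead of formatting the values into text(), each value is sent as a bound parameter (see :description_1 and :priority_1 in the output below), which is what protects against SQL injection: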
qry = select(Task)
qry = qry.where(Task.description == "foo")
qry = qry.where(Task.priority < 2)
qry = qry.order_by(Task.priority)
qry = qry.order_by(Task.description)
print(qry)
"""
SELECT task.id, task.description, task.priority
FROM task
WHERE task.description = :description_1 AND task.priority < :priority_1
ORDER BY task.priority, task.description
"""
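Applied to the Archive/Backup question, the same idea might look like the sketch below. It assumes SQLAlchemy 1.4 or newer, uses hypothetical Core tables in place of the asker's models (including a guessed archive_id foreign key), sorts by the city column rather than by a comparison, and expects serial to arrive as a list.

```python
from sqlalchemy import Column, ForeignKey, Integer, MetaData, String, Table, select

metadata = MetaData()

# Hypothetical tables; the real schema is not shown in the question.
archive = Table(
    "archive", metadata,
    Column("id", Integer, primary_key=True),
    Column("city", String),
    Column("zip", Integer),
)
backup = Table(
    "backup", metadata,
    Column("id", Integer, primary_key=True),
    Column("archive_id", Integer, ForeignKey("archive.id")),
    Column("serial", String),
)


def build_query(params):
    """Add filters and sorts only for the parameters that are present."""
    stmt = select(archive, backup).select_from(
        archive.join(backup, archive.c.id == backup.c.archive_id)
    )
    if "city" in params:
        stmt = stmt.where(archive.c.city == params["city"]).order_by(archive.c.city)
    if "zip" in params:
        # int() rejects non-numeric input before it reaches the database.
        stmt = stmt.where(archive.c.zip > int(params["zip"])).order_by(archive.c.zip.desc())
    if "serial" in params:
        stmt = stmt.where(backup.c.serial.in_(params["serial"])).order_by(backup.c.serial.asc())
    return stmt


hostile = "x'; DROP TABLE archive; --"
stmt = build_query({"city": hostile, "zip": "90000", "serial": ["A1", "B2"]})
sql = str(stmt)
bound = stmt.compile().params
print(sql)
print(bound)
```

The hostile city string appears only in the parameter dictionary, never in the SQL text, so the database treats it as a plain value.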
Goal
I am aiming to insert database records into MySQL using Python. But with an extra detail, I'll explain as I go along..
This is my current script (Fully functional & working):
#Get data from SQL
sqlCursor = mjmConnection.cursor()
sqlCursor.execute("SELECT sol.id, p.id, p.code,p.description, p.searchRef1, so.number, c.code, c.name, sol.requiredQty \
FROM salesorderline sol JOIN \
salesorder so \
ON sol.salesorderid = so.id JOIN \
product p \
ON sol.productid = p.id JOIN \
customer c \
ON so.customerid = c.id \
WHERE so.orderdate > DATEADD(dd,-35,CAST(GETDATE() AS date));")
#Send received data from SQL query from above to MySQL database
print("Sending MJM records to MySQL Database")
mjmCursorMysql = productionConnection.cursor()
for x in sqlCursor.fetchall():
    a,b,c,d,e,f,g,h,i = x
    mjmCursorMysql.execute("INSERT ignore INTO mjm_python (id, product_id, product_code, product_description, product_weight, \
        salesorder_number, customer_code, customer_name, requiredQty) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s);", (a,b,c,d,e,f,g,h,i))
productionConnection.commit()
mjmCursorMysql.close()
sqlCursor.close()
What it does
The above script does the following:
Gets data from SQL Server
Inserts that data into MySQL
I have specifically used IGNORE in the MySQL query, to prevent duplicate id numbers.
Data will look like this:
Next..
Now I'd like to add a column named sales_id_increment. It should start from 1, increment for each row with the same salesorder_number, and reset to 1 when the salesorder_number changes. So I want it to look something like this:
Question
How do I achieve this? Where do I need to look, in my Python script or the MySQL query?
You can get this column when you select the rows from SQL Server with window functions ROW_NUMBER() or DENSE_RANK() (if there are duplicate ids):
SELECT sol.id, p.id, p.code,p.description, p.searchRef1, so.number, c.code, c.name, sol.requiredQty,
ROW_NUMBER() OVER (PARTITION BY so.number ORDER BY sol.id) sales_id_increment
FROM salesorderline sol
JOIN salesorder so ON sol.salesorderid = so.id
JOIN product p ON sol.productid = p.id
JOIN customer c ON so.customerid = c.id
WHERE so.orderdate > DATEADD(dd,-35,CAST(GETDATE() AS date));
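If changing the SQL Server query is not an option, the same numbering could also be computed in the Python script before inserting. The sketch below is an alternative to the window function, not part of the answer above. It assumes each fetched row is a 9-item sequence with so.number at index 5 and sol.id at index 0, as in the SELECT list, and the sample rows are invented.

```python
from itertools import groupby
from operator import itemgetter


def add_sales_id_increment(rows, order_index=5, id_index=0):
    """Append a per-sales-order counter (1, 2, 3, ...) to each row."""
    # groupby only groups adjacent items, so sort by order number first,
    # then by line id to get a stable numbering inside each order.
    ordered = sorted(rows, key=lambda r: (r[order_index], r[id_index]))
    numbered = []
    for _, group in groupby(ordered, key=itemgetter(order_index)):
        for n, row in enumerate(group, start=1):
            numbered.append(tuple(row) + (n,))
    return numbered


# Invented sample rows in the same column order as the SELECT.
rows = [
    (10, 1, "P1", "Widget", "1.0", "SO-1", "C1", "Acme", 2),
    (12, 3, "P3", "Bolt", "0.1", "SO-2", "C2", "Bravo", 5),
    (11, 2, "P2", "Gadget", "2.0", "SO-1", "C1", "Acme", 1),
]

numbered = add_sales_id_increment(rows)
for row in numbered:
    print(row)
```

Each resulting tuple has ten items, so the INSERT statement would need the extra sales_id_increment column and one more %s placeholder.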
Relative SQLAlchemy newbie here. I create an outer join object and then use it in a select query. When the query is built, the join condition disappears, resulting in a cartesian product.
Creating the join:
data_set = join(db.client, db.employee, isouter=True)
Debugger shows the value of the join object as:
data_set = client LEFT OUTER JOIN employee ON employee.id = client.account_manager_id
Query the join:
qry = select([data_set.c.client_id.label('ID'), data_set.c.client_contract_client_name.label('CONTRACT CLIENT'),
data_set.c.client_project_client_name.label('PROJECT CLIENT'),
data_set.c.client_ins_dt.label('INSERT'), data_set.c.client_update_dt.label('UPDATE'),
(data_set.c.employee_last_name + data_set.c.employee_first_name).label('ACCT MGR')]).\
order_by(data_set.c.client_contract_client_name)
Debugger shows the SQL of qry as:
SELECT client.id AS "ID",
client.contract_client_name AS "CONTRACT CLIENT",
client.project_client_name AS "PROJECT CLIENT",
client.ins_dt AS "INSERT",
client.update_dt AS "UPDATE",
employee.last_name || employee.first_name AS "ACCT MGR"
FROM client, employee
ORDER BY client.contract_client_name
Notice the FROM clause. Where did my JOIN go?
I just figured it out! I needed to use the select_from() method of the select command. My new (correctly functioning) query appears as follows...
qry = select([data_set.c.client_id.label('ID'), data_set.c.client_contract_client_name.label('CONTRACT CLIENT'),
data_set.c.client_project_client_name.label('PROJECT CLIENT'),
data_set.c.client_ins_dt.label('INSERT'), data_set.c.client_update_dt.label('UPDATE'),
(data_set.c.employee_last_name + data_set.c.employee_first_name).label('ACCT MGR')]).\
select_from(data_set).\
order_by(data_set.c.client_contract_client_name)
Notice the second line from the bottom - select_from(data_set)
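The effect of select_from() can be checked by compiling the statement both ways. This is a reduced sketch, assuming SQLAlchemy 1.4 or newer (where select() takes columns positionally rather than as a list) and hypothetical client and employee tables with only the columns needed here.

```python
from sqlalchemy import Column, ForeignKey, Integer, MetaData, String, Table, join, select

metadata = MetaData()

# Hypothetical, reduced versions of the two tables.
employee = Table(
    "employee", metadata,
    Column("id", Integer, primary_key=True),
    Column("last_name", String),
)
client = Table(
    "client", metadata,
    Column("id", Integer, primary_key=True),
    Column("contract_client_name", String),
    Column("account_manager_id", Integer, ForeignKey("employee.id")),
)

# The ON clause is inferred from the foreign key.
data_set = join(client, employee, isouter=True)

cols = (
    data_set.c.client_id.label("ID"),
    data_set.c.employee_last_name.label("ACCT MGR"),
)

# Columns alone only tell SQLAlchemy which tables are involved.
without_from = str(select(*cols))
# select_from() tells it how those tables are joined.
with_from = str(select(*cols).select_from(data_set))

print(without_from)
print(with_from)
```

The first statement lists both tables in FROM separated by a comma, while the second carries the LEFT OUTER JOIN and its ON condition.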