Python Pony ORM Insert multiple values at once - python

I'm trying to insert multiple values into my postgres database using Pony ORM. My current approach is very inefficient:
from pony.orm import *
db = Database()
class Names(db.Entity):
    first_name = Optional(str)
    last_name = Optional(str)
family = [["Peter", "Mueller"], ["Paul", "Meyer"], ...]
@db_session
def populate_names(name_list):
    # one INSERT statement per row, which is the slow part
    for name in name_list:
        db.insert("Names", first_name=name[0], last_name=name[1])
if __name__ == "__main__":
    db.bind(provider='postgres', user='', password='', host='', database='')
    db.generate_mappings(create_tables=True)
    populate_names(family)
This is just a short example, but the structure of the input is the same: a list of lists.
I'm extracting the data from several XML files and inserting one "file" at a time.
Does anyone have an idea how to put several rows of data into one insert query in Pony ORM?

Pony doesn't provide anything special for this. You can use execute_values from psycopg2.extras; get the connection object from db to use it.
from psycopg2.extras import execute_values
...
names = [
    ('はると', '一温'),
    ('りく', '俐空'),
    ('はる', '和晴'),
    ('ひなた', '向日'),
    ('ゆうと', '佑篤')
]
@db_session
def populate_persons(names):
    # %s is expanded by execute_values into a multi-row VALUES list
    sql = 'insert into Person(first_name, last_name) values %s'
    con = db.get_connection()
    cur = con.cursor()
    execute_values(cur, sql, names)
populate_persons(names)
execute_values is in the "Fast execution helpers" list of the psycopg2 documentation, so I think it should be the most efficient way.
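To make the idea of "several rows in one insert query" concrete, here is a minimal sketch that builds a single multi-row INSERT by hand. It uses the standard-library sqlite3 module as a stand-in so it runs without a PostgreSQL server; the table name person and the helper insert_rows_single_statement are made up for illustration, and this is not how execute_values is implemented internally.

```python
import sqlite3


def insert_rows_single_statement(conn, rows):
    """Insert all (first_name, last_name) pairs with one multi-row INSERT."""
    if not rows:
        return 0
    # One "(?, ?)" group per row, joined into a single VALUES list.
    placeholders = ", ".join(["(?, ?)"] * len(rows))
    sql = f"INSERT INTO person (first_name, last_name) VALUES {placeholders}"
    # Flatten [(a, b), (c, d)] into [a, b, c, d] to line up with the placeholders.
    flat_params = [value for row in rows for value in row]
    cur = conn.execute(sql, flat_params)
    return cur.rowcount


# Hypothetical in-memory table standing in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (first_name TEXT, last_name TEXT)")

family = [("Peter", "Mueller"), ("Paul", "Meyer")]
inserted = insert_rows_single_statement(conn, family)
conn.commit()
```

Databases limit the number of bound parameters per statement, so for large inputs you would split the rows into chunks. execute_values does that kind of batching for you through its page_size argument.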

I'm currently experimenting with PonyORM for a future project and came to the same conclusion.
The only way to insert data in bulk is to create one entity instance per row:
# assuming data has this structure:
# [['foo','bar','bazooka'],...]
@db_session
def insert_bulk_array(field1, field2, field3):
    MyClass(field1=field1, field2=field2, field3=field3)
# assuming the data is:
# {'field1':'foo','field2':'bar','field3':'bazooka'}
@db_session
def insert_bulk_dict(data):
    MyClass(**data)
But from my point of view this is still fairly handy, especially when your data comes as JSON.

There is an open issue in the issue tracker of PonyORM which asks for exactly this feature.
I recommend voting for it.

Related

Pythonic way to optimize SQL VIEW count to extract information schema metadata from Snowflake

I have 12 views in Snowflake. For each view I would like to extract TABLE_NAME, CREATED and LAST_ALTERED from Snowflake's INFORMATION_SCHEMA and also get its row count, along with the same metadata for the base tables mentioned in the code below. Is there a way to get the row count for the 12 views using the code below, or is there a better approach for this task, i.e. getting the TABLE_NAME, CREATED and LAST_ALTERED metadata for both base tables and views in Snowflake?
Let's say my first 4 view names are "V_ACCOUNT", "V_ADDRESS", "V_COUNTRY" and "V_ORDER".
Below is my code so far.
Thanks in advance for your time and efforts!
Python Code:
import pandas as pd
import snowflake.connector
conn = snowflake.connector.connect(
    user="MY_USER",
    password="MY_PSWD",
    account="MY_ACCOUNT",
    warehouse="COMPUTE_WH",
    database="SNOWFLAKE_SAMPLE_DATA",
    schema="INFORMATION_SCHEMA",
    role="SYSADMIN"
)
cur = conn.cursor()
tables = ['CUSTOMER', 'CALL_CENTER', 'CUSTOMER_ADDRESS']
view_tables = ['V_ACCOUNT', 'V_ADDRESS', 'V_COUNTRY', 'V_ORDER']
try:
    # both halves of the UNION must list their columns in the same order
    cur.execute(
        f"""
        SELECT TABLE_NAME, ROW_COUNT, CREATED, LAST_ALTERED
        FROM TABLES
        WHERE TABLE_TYPE='BASE TABLE'
        AND TABLE_SCHEMA='TPCDS_SF100TCL'
        AND TABLE_NAME IN ({','.join("'" + x + "'" for x in tables)})
        UNION
        SELECT TABLE_NAME, (select count(*) from DB.SCHEMA.V_ACCOUNT) AS ROW_COUNT, CREATED, LAST_ALTERED FROM VIEWS WHERE TABLE_NAME='V_ACCOUNT'
        """
    )
    df = cur.fetch_pandas_all()
finally:
    cur.close()
    conn.close()
If you are looking for how to dynamically create a view in Python, then please take a look at Barmar's answer on my other question here. This will give you a hint of what needs to be done for dynamic view creation. Thanks, Barmar for help!
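One way to avoid writing the per-view SELECT twelve times is to generate it from the list of view names. The sketch below only builds the SQL text and has not been run against Snowflake; the function name build_view_metadata_sql is made up, DB and SCHEMA are the same placeholders used in the question, and it assumes INFORMATION_SCHEMA.VIEWS exposes TABLE_NAME, CREATED and LAST_ALTERED as the question's query does.

```python
import re

# Plain unquoted identifiers only, because the names are interpolated
# into the SQL text (identifiers cannot be bound as query parameters).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def build_view_metadata_sql(views, database="DB", schema="SCHEMA"):
    """Return one statement yielding name, row count and timestamps per view."""
    for name in [database, schema, *views]:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    selects = [
        f"SELECT TABLE_NAME, "
        f"(SELECT COUNT(*) FROM {database}.{schema}.{view}) AS ROW_COUNT, "
        f"CREATED, LAST_ALTERED "
        f"FROM {database}.INFORMATION_SCHEMA.VIEWS "
        f"WHERE TABLE_SCHEMA = '{schema}' AND TABLE_NAME = '{view}'"
        for view in views
    ]
    # UNION ALL skips the duplicate-elimination step that UNION performs.
    return "\nUNION ALL\n".join(selects)


view_tables = ["V_ACCOUNT", "V_ADDRESS", "V_COUNTRY", "V_ORDER"]
view_sql = build_view_metadata_sql(view_tables)
```

The resulting string can be appended to the base-table query with another UNION ALL and passed to cur.execute() as before.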

Unable to INSERT into PostgreSQL with psycopg2 python library

I am new to working with SQL and Postgres specifically and am trying to write a simple program that stores a course id and some URLs in an SQL table with two columns. I am using the psycopg2 python library.
I am able to read from the table using:
def get_course_urls(course):
    con = open_db_connection()
    cur = con.cursor()
    query = f"SELECT urls FROM courses WHERE course = '{course}'"
    cur.execute(query)
    rows = cur.fetchall()
    cur.close()
    close_db_connection(con)
    urls = []
    for url in rows:
        urls.extend(url[0])
    return urls
However, I am unable to insert into the table using:
def format_urls_string(urls):
    return '{"' + '","'.join(urls) + '"}'
def add_course_urls(course, urls):
    con = open_db_connection()
    cur = con.cursor()
    query = f"INSERT INTO courses (course, urls) VALUES ('{course}', '{format_urls_string(urls)}');"
    print(query)
    cur.execute(query)
    cur.close()
    close_db_connection(con)
add_course_urls("CS136", ["http://google.com", "http://wikipedia.com"])
I do not think anything is wrong with my query because when I run the same query in the SQL Shell it works as I want it to.
The locks on the columns say that the columns are READ-ONLY, however, I am able to insert through the shell. I feel like this is a very minor fix but since I am new to PostgreSQL, I am having some trouble.
Your help is appreciated!
This is the danger of doing the substitution yourself, instead of letting the db connector do it. You looked at your string, yes? You're writing
... VALUES ('CS136', '['http://google.com','http://wikipedia.com']')
which is obviously the wrong syntax. It needs to be
... VALUES ('CS136', '{"http://google.com","http://wikipedia.com"}')
which Python's formatter won't generate. So, you can either format the insertion string by hand, or put placeholders and pass the parameters to the cursor.execute call:
query = "INSERT INTO courses (course, urls) VALUES (%s,%s);"
cur.execute( query, (course, urls) )
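As a rough illustration of why placeholders are safer than building the string yourself, the sketch below inserts a value containing a single quote, which would break a hand-formatted query. It uses the standard-library sqlite3 module as a stand-in so it runs without a server: sqlite3 spells the placeholder ? where psycopg2 uses %s, and it has no array type, so the hypothetical table here stores one URL per row. The helper add_course_url is made up for this example. It also commits explicitly, since psycopg2 connections likewise do not autocommit by default.

```python
import sqlite3

# Hypothetical in-memory table; the real one lives in PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courses (course TEXT, url TEXT)")


def add_course_url(conn, course, url):
    # The driver binds the values, so no manual quoting or escaping is needed.
    conn.execute(
        "INSERT INTO courses (course, url) VALUES (?, ?)",
        (course, url),
    )
    # Commit so the inserted row is actually persisted.
    conn.commit()


# A value with an embedded quote would break naive string formatting.
tricky_url = "http://example.com/?q=it's"
add_course_url(conn, "CS136", tricky_url)

stored = conn.execute(
    "SELECT url FROM courses WHERE course = ?", ("CS136",)
).fetchone()[0]
```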

querying mysql database in python where table is a variable

I am aware this may be a duplicate post. However, I have looked at the other posts and can't figure it out in my case.
from configdata import configdata
from dbfiles.dbconnect import connection
c,conn = connection()
table = configdata()[4]
userid = 'jdeepee'
value = c.execute("SELECT * FROM %s WHERE userid = (%s)" % (table), userid)
print(value)
I think the code is self-explanatory. Essentially, I am trying to query a MySQL database where both the table name and the userid come from variables. I believe my syntax is wrong, but I'm not sure how to fix it. Help would be great.
Try this:
value = c.execute("SELECT * FROM {} WHERE userid = %s".format(table), (userid,))
Basically, you need to interpolate the table name into the query first, then pass any query parameters to .execute() in a tuple.
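Because the table name is interpolated as plain text, it is worth checking it against a list of known tables before formatting it into the query. A minimal sketch of that check, where ALLOWED_TABLES and build_user_query are hypothetical names and the table names are invented:

```python
# Hypothetical set of tables the application is allowed to query.
ALLOWED_TABLES = {"users", "admins"}


def build_user_query(table):
    """Return the query text for a known table, leaving userid as a placeholder."""
    if table not in ALLOWED_TABLES:
        raise ValueError("unknown table: {!r}".format(table))
    # Only the validated table name is interpolated; userid stays a
    # driver-level parameter and is passed separately to .execute().
    return "SELECT * FROM {} WHERE userid = %s".format(table)


query = build_user_query("users")
# Usage with a real cursor would look like: c.execute(query, (userid,))
```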

python mysql.connector DictCursor?

In Python mysqldb I could declare a cursor as a dictionary cursor like this:
cursor = db.cursor(MySQLdb.cursors.DictCursor)
This would enable me to reference columns in the cursor loop by name like this:
for row in cursor:  # Using the cursor as iterator
    city = row["city"]
    state = row["state"]
Is it possible to create a dictionary cursor using this MySQL connector?
http://dev.mysql.com/doc/connector-python/en/connector-python-example-cursor-select.html
Their example only returns a tuple.
I imagine the creators of MySQL would eventually do this for us?
According to this article it is available by passing in 'dictionary=True' to the cursor constructor:
http://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursordict.html
so I tried:
cnx = mysql.connector.connect(database='bananas')
cursor = cnx.cursor(dictionary=True)
and got:
TypeError: cursor() got an unexpected keyword argument 'dictionary'
and I tried:
cnx = mysql.connector.connect(database='bananas')
cursor = cnx.cursor(named_tuple=True)
and got:
TypeError: cursor() got an unexpected keyword argument 'named_tuple'
and I tried this one too: cursor = MySQLCursorDict(cnx)
but to no avail. Clearly I'm on the wrong version here and I suspect we just have to be patient as the document at http://downloads.mysql.com/docs/connector-python-relnotes-en.a4.pdf suggests these new features are in alpha phase at point of writing.
A possible solution involves subclassing the MySQLCursor class like this:
class MySQLCursorDict(mysql.connector.cursor.MySQLCursor):
    def _row_to_python(self, rowdata, desc=None):
        row = super(MySQLCursorDict, self)._row_to_python(rowdata, desc)
        if row:
            return dict(zip(self.column_names, row))
        return None
db = mysql.connector.connect(user='root', database='test')
cursor = db.cursor(cursor_class=MySQLCursorDict)
Now the _row_to_python() method returns a dictionary instead of a tuple.
I found this on the mysql forum, and I believe it was posted by the mysql developers themselves. I hope they add it to the mysql connector package some day.
I tested this and it does work.
UPDATE: As mentioned below by Karl M.W... this subclass is no longer needed in v2 of the mysql.connector. The mysql.connector has been updated and now you can use the following option to enable a dictionary cursor.
cursor = db.cursor(dictionary=True)
This example works:
cnx = mysql.connector.connect(database='world')
cursor = cnx.cursor(dictionary=True)
cursor.execute("SELECT * FROM country WHERE Continent = 'Europe'")
print("Countries in Europe:")
for row in cursor:
    print("* {Name}".format(Name=row['Name']))
Keep in mind that in this example, 'Name' is specific to the column name of the database being referenced.
Also, if you want to use stored procedures, do this instead:
cursor.callproc(stored_procedure_name, args)
result = []
for recordset in cursor.stored_results():
    for row in recordset:
        result.append(dict(zip(recordset.column_names, row)))
where stored_procedure_name is the name of the stored procedure to use and args is the list of arguments for that stored procedure (leave this field empty like [] if no arguments to pass in).
This is an example from the MySQL documentation found here: http://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursordict.html
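Both the subclass and the stored-procedure loop above rely on the same trick: pairing the column names with each row tuple via dict(zip(...)). A small self-contained sketch of just that step, with invented column names and rows standing in for what a cursor would return (rows_to_dicts is a made-up helper name):

```python
def rows_to_dicts(column_names, rows):
    """Turn row tuples into dicts keyed by column name."""
    return [dict(zip(column_names, row)) for row in rows]


# Invented sample data in the shape a tuple-returning cursor produces.
column_names = ("city", "state")
rows = [("Austin", "TX"), ("Boise", "ID")]

records = rows_to_dicts(column_names, rows)
first_city = records[0]["city"]
```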
Using Python 3.6.2 and MySQLdb version 1.3.10, I got this to work with:
import MySQLdb
import MySQLdb.cursors
...
conn = MySQLdb.connect(host='...',
                       <connection info>,
                       cursorclass=MySQLdb.cursors.DictCursor)
try:
    with conn.cursor() as cursor:
        query = '<SQL>'
        cursor.execute(query)  # run the query before fetching
        data = cursor.fetchall()
        for record in data:
            ... record['<field name>'] ...
finally:
    conn.close()
I'm using PyCharm, and simply dug into the MySQLdb modules connections.py and cursors.py.
I had the same problem with the default cursor returning tuples with no column names.
The answer is here:
Getting error while using MySQLdb.cursors.DictCursor in MYSQL_CURSORCLASS
app.config["MYSQL_CURSORCLASS"] = "DictCursor"

How to execute raw SQL in Flask-SQLAlchemy app

How do you execute raw SQL in SQLAlchemy?
I have a python web app that runs on flask and interfaces to the database through SQLAlchemy.
I need a way to run the raw SQL. The query involves multiple table joins along with Inline views.
I've tried:
connection = db.session.connection()
connection.execute( <sql here> )
But I keep getting gateway errors.
Have you tried:
result = db.engine.execute("<sql here>")
or:
from sqlalchemy import text
sql = text('select name from penguins')
result = db.engine.execute(sql)
names = [row[0] for row in result]
print(names)
Note that db.engine.execute() is "connectionless" execution, which is deprecated in SQLAlchemy 1.4 and removed in 2.0.
SQL Alchemy session objects have their own execute method:
result = db.session.execute('SELECT * FROM my_table WHERE my_column = :val', {'val': 5})
All your application queries should be going through a session object, whether they're raw SQL or not. This ensures that the queries are properly managed by a transaction, which allows multiple queries in the same request to be committed or rolled back as a single unit. Going outside the transaction using the engine or the connection puts you at much greater risk of subtle, possibly hard to detect bugs that can leave you with corrupted data. Each request should be associated with only one transaction, and using db.session will ensure this is the case for your application.
Also take note that execute is designed for parameterized queries. Use parameters, like :val in the example, for any inputs to the query to protect yourself from SQL injection attacks. You can provide the value for these parameters by passing a dict as the second argument, where each key is the name of the parameter as it appears in the query. The exact syntax of the parameter itself may be different depending on your database, but all of the major relational databases support them in some form.
Assuming it's a SELECT query, this will return an iterable of RowProxy objects.
You can access individual columns with a variety of techniques:
for r in result:
    print(r[0])  # Access by positional index
    print(r['my_column'])  # Access by column name as a string
    r_dict = dict(r.items())  # convert to dict keyed by column names
Personally, I prefer to convert the results into namedtuples:
from collections import namedtuple
Record = namedtuple('Record', result.keys())
records = [Record(*r) for r in result.fetchall()]
for r in records:
    print(r.my_column)
    print(r)
If you're not using the Flask-SQLAlchemy extension, you can still easily use a session:
import sqlalchemy
from sqlalchemy.orm import sessionmaker, scoped_session
engine = sqlalchemy.create_engine('my connection string')
Session = scoped_session(sessionmaker(bind=engine))
s = Session()
result = s.execute('SELECT * FROM my_table WHERE my_column = :val', {'val': 5})
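The :val named-parameter style used above can be tried without SQLAlchemy or a database server, because the standard-library sqlite3 module happens to accept the same syntax. The sketch below is only a stand-in to show the binding behaviour; my_table and my_column are the placeholder names from the answer, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER PRIMARY KEY, my_column INTEGER)")
conn.executemany(
    "INSERT INTO my_table (my_column) VALUES (:val)",
    [{"val": 5}, {"val": 7}, {"val": 5}],
)

# The dict supplies the value for :val; it is bound, never pasted into the SQL.
matching = conn.execute(
    "SELECT id, my_column FROM my_table WHERE my_column = :val ORDER BY id",
    {"val": 5},
).fetchall()

# A hostile value is compared as one literal string, so it matches nothing
# instead of being interpreted as extra SQL.
hostile = conn.execute(
    "SELECT id FROM my_table WHERE my_column = :val",
    {"val": "5 OR 1=1"},
).fetchall()
```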
docs: SQL Expression Language Tutorial - Using Text
example:
from sqlalchemy.sql import text
connection = engine.connect()
# recommended
cmd = 'select * from Employees where EmployeeGroup = :group'
employeeGroup = 'Staff'
employees = connection.execute(text(cmd), group = employeeGroup)
# or - wee more difficult to interpret the command
employeeGroup = 'Staff'
employees = connection.execute(
text('select * from Employees where EmployeeGroup = :group'),
group = employeeGroup)
# or - notice the requirement to quote 'Staff'
employees = connection.execute(
text("select * from Employees where EmployeeGroup = 'Staff'"))
for employee in employees: logger.debug(employee)
# output
(0, 'Tim', 'Gurra', 'Staff', '991-509-9284')
(1, 'Jim', 'Carey', 'Staff', '832-252-1910')
(2, 'Lee', 'Asher', 'Staff', '897-747-1564')
(3, 'Ben', 'Hayes', 'Staff', '584-255-2631')
You can get the results of SELECT SQL queries using from_statement() and text() as shown here. You don't have to deal with tuples this way. As an example for a class User having the table name users you can try,
from sqlalchemy.sql import text
user = session.query(User).from_statement(
text("""SELECT * FROM users where name=:name""")
).params(name="ed").all()
return user
For SQLAlchemy ≥ 1.4
Starting in SQLAlchemy 1.4, connectionless or implicit execution has been deprecated, i.e.
db.engine.execute(...) # DEPRECATED
as well as bare strings as queries.
The new API requires an explicit connection, e.g.
from sqlalchemy import text
with db.engine.connect() as connection:
    result = connection.execute(text("SELECT * FROM ..."))
    for row in result:
        # ...
Similarly, it’s encouraged to use an existing Session if one is available:
result = session.execute(sqlalchemy.text("SELECT * FROM ..."))
or using parameters:
session.execute(sqlalchemy.text("SELECT * FROM a_table WHERE a_column = :val"),
{'val': 5})
See "Connectionless Execution, Implicit Execution" in the documentation for more details.
result = db.engine.execute(text("<sql here>"))
executes the <sql here> but doesn't commit it unless you're on autocommit mode. So, inserts and updates wouldn't reflect in the database.
To commit after the changes, do
result = db.engine.execute(text("<sql here>").execution_options(autocommit=True))
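The general point, that a write is not durable until the surrounding transaction is committed, can be seen with any DB-API driver. Here is a minimal sketch using the standard-library sqlite3 module as a stand-in for the engine; it relies on sqlite3's default behaviour of opening a transaction implicitly before an INSERT, and the blogs table and count_blogs helper are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blogs (name TEXT)")


def count_blogs(conn):
    return conn.execute("SELECT COUNT(*) FROM blogs").fetchone()[0]


# The INSERT opens a transaction that stays pending until commit or rollback.
conn.execute("INSERT INTO blogs (name) VALUES (?)", ("draft",))
pending = conn.in_transaction
conn.rollback()  # discards the uncommitted row
after_rollback = count_blogs(conn)

# Once committed, a later rollback has nothing left to undo.
conn.execute("INSERT INTO blogs (name) VALUES (?)", ("published",))
conn.commit()
conn.rollback()
after_commit = count_blogs(conn)
```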
This is a simplified answer on how to run a SQL query from the Flask shell.
First, point Flask at your module (assuming your module/app is manage.py in the project's root folder and you are on a UNIX operating system):
export FLASK_APP=manage
Run the Flask shell:
flask shell
Import what we need:
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
db = SQLAlchemy(app)
from sqlalchemy import text
Run your query:
result = db.engine.execute(text("<sql here>").execution_options(autocommit=True))
This uses the application's current database connection.
Flask-SQLAlchemy v: 3.0.x / SQLAlchemy v: 1.4
users = db.session.execute(db.select(User).order_by(User.title.desc()).limit(150)).scalars()
So for the latest stable version of Flask-SQLAlchemy, the documentation suggests using the session.execute() method in conjunction with db.select(Object).
Have you tried using connection.execute(text( <sql here> ), <bind params here> ) and bind parameters as described in the docs? This can help solve many parameter formatting and performance problems. Maybe the gateway error is a timeout? Bind parameters tend to make complex queries execute substantially faster.
If you want to avoid tuples, another way is by calling the first, one or all methods:
query = db.engine.execute("SELECT * FROM blogs "
"WHERE id = 1 ")
assert query.first().name == "Welcome to my blog"
