I'm using Python (and Peewee) to connect to a SQLite database. My data access layer (DAL) is a mix of peewee ORM and SQL-based functions. I would like to enable EXPLAIN PLAN for all queries upon connecting to the database and toggle it via configuration or CLI parameter ... how can I do that using the Python API?
from playhouse.db_url import connect
self._logger.info("opening db connection to database, creating cursor and initializing orm model ...")
self.__db = connect(url)
# add support for a REGEXP and POW implementation
# TODO: this should be added only for the SQLite case and doesn't apply to other vendors.
self.__db.connection().create_function("REGEXP", 2, regexp)
self.__db.connection().create_function("POW", 2, pow)
self.__cursor = self.__db.cursor()
self.__cursor.arraysize = 100
# what shall I do here to enable EXPLAIN PLANs?
That is a feature of the SQLite interactive shell. To get the query plans from Python, you will need to request them explicitly. This is not quite straightforward with Peewee because it uses parameterized queries. You can get the SQL executed by Peewee in a couple of ways.
# Print all queries to stderr.
import logging
logger = logging.getLogger('peewee')
logger.addHandler(logging.StreamHandler())
logger.setLevel(logging.DEBUG)
Or for an individual query:
query = SomeModel.select()
sql, params = query.sql()
# To get the query plan (plain EXPLAIN returns the bytecode listing instead):
curs = db.execute_sql('EXPLAIN QUERY PLAN ' + sql, params)
print(curs.fetchall()) # prints query plan
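To toggle this from configuration or a CLI flag, one option is a thin wrapper that prints the plan before running the statement whenever a flag is set. Below is a minimal sketch using only the standard-library sqlite3 module. The names `EXPLAIN_ENABLED`, `query_plan` and `execute_with_plan` are hypothetical, and the same idea should carry over to `db.execute_sql` in Peewee:

```python
import sqlite3

EXPLAIN_ENABLED = True  # hypothetical flag; could be set from a config file or argparse


def query_plan(conn, sql, params=()):
    """Return the description of each EXPLAIN QUERY PLAN row for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the last column of each row.
    return [row[-1] for row in rows]


def execute_with_plan(conn, sql, params=(), explain=EXPLAIN_ENABLED):
    """Run a statement, printing its query plan first when explain is set."""
    if explain:
        for step in query_plan(conn, sql, params):
            print("PLAN:", step)
    return conn.execute(sql, params)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO person (name) VALUES (?)", ("alice",))
rows = execute_with_plan(
    conn, "SELECT name FROM person WHERE name = ?", ("alice",)
).fetchall()
```

The exact wording of the plan rows depends on the SQLite version, so treat the printed text as diagnostic output rather than something to parse.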
I have a code that executes queries to redshift like this:
import os
import psycopg2

def send_sql_query(source, sql_query, lst=None):
    connection = None  # so the finally block is safe if connect() raises
    try:
        connection = psycopg2.connect(
            host=os.environ["REDSHIFT_HOST"],
            port="5439",
            dbname="dbname",
            user=os.environ["REDSHIFT_USERNAME"],
            password=os.environ["REDSHIFT_PASSWORD"],
        )
        cursor = connection.cursor()
        cursor.execute(sql_query, lst)
        sql_results = cursor.fetchall()
        return sql_results
    finally:
        if connection:
            connection.close()
I would like to mock the method so that it receives the sql_query and holds fake DB data (preferably in JSON), but actually executes the sql_query against that fake data and returns the result.
Using mock.return_value and mock.side_effect will not help, because I want to verify that the SQL query is correct. Writing code that returns canned results doesn't really check the SQL query.
Is there a framework in python for it?
Testing the SQL requires a SQL engine. As different databases use different dialects and as you use PostgreSQL as your main database, you should install a PostgreSQL instance in your dev environment with fake data and redirect your queries there while testing.
As you use the environment to store the reference to the database, you just have to set up a test environment pointing to the test database.
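One way to make that switch easy is to build every connection argument from the environment, so a test run only has to export different values. This is a rough sketch. The function name `connection_settings` and the variables `REDSHIFT_PORT` and `REDSHIFT_DBNAME` are assumptions that do not appear in the question's code:

```python
import os


def connection_settings(env=os.environ):
    """Build connection keyword arguments from environment-style variables."""
    return {
        "host": env["REDSHIFT_HOST"],
        "port": env.get("REDSHIFT_PORT", "5439"),        # hypothetical variable
        "dbname": env.get("REDSHIFT_DBNAME", "dbname"),  # hypothetical variable
        "user": env["REDSHIFT_USERNAME"],
        "password": env["REDSHIFT_PASSWORD"],
    }


# A test run would export values like these before starting the test suite.
test_env = {
    "REDSHIFT_HOST": "localhost",
    "REDSHIFT_PORT": "5432",
    "REDSHIFT_DBNAME": "test_db",
    "REDSHIFT_USERNAME": "tester",
    "REDSHIFT_PASSWORD": "secret",
}
settings = connection_settings(test_env)
# psycopg2.connect(**settings) would then reach the test instance.
```

With this in place, send_sql_query could call psycopg2.connect(**connection_settings()) and stay unchanged between production and test runs.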
I would like to use data from SQL server in Pycharm using python. I have my database connection set up in Pycharm, but not sure how to access this data within my python code. I would like to query the data within the python code (similar to what I would do in R using the RODBC package).
Any suggestions on what to do or where to look would be much appreciated.
I have been struggling with this myself while learning databases in Python over the last few days. I am working in Flask, but that doesn't really seem to matter.
I did get the following to work. It is not exactly what you asked for, but it might give you a start:
import MySQLdb

def database():
    db = MySQLdb.connect(host="localhost", port=3306, user="root", passwd="admin", db="echo")
    cursor = db.cursor()
    cursor.execute("INSERT INTO `post` (`hello`) VALUES (null), ('hello_world')")
    db.commit()
    db.close()
I had to set up my database from the command line first. It's not pretty or intuitive, but it should get you started.
If you want to work with Python objects rather than SQL, I'd use SQLAlchemy and reflection.
from sqlalchemy import MetaData, create_engine
from sqlalchemy.orm import Session
from sqlalchemy.ext.automap import automap_base
engine = create_engine('mysql+mysqldb://...', pool_recycle=3600)
metadata = MetaData()
metadata.reflect(bind=engine)
session = Session(engine)
Base = automap_base(metadata=metadata)
Base.prepare()
# assuming I have a table named 'users'
Users = Base.classes.users
# automapped classes have no .query attribute, so query through the session
someUsers = session.query(Users).filter(Users.name.in_(['Jack', 'Bob', 'YakMan'])).all()
import mysql.connector
connection=mysql.connector.connect(user='root', password='daniela', host='localhost', database='girrafe')
mycursor=connection.cursor()
There is a concept called Object-Relational Mapping (ORM), which Python libraries use for database access. One module you can import for this purpose is SQLAlchemy.
First, you will need to install sqlalchemy by:
pip install sqlalchemy
Now, for database connection, we have an Engine class in the sqlalchemy, which is responsible for the database connectivity. We create an object of the Engine class for establishing connection.
from sqlalchemy import create_engine,MetaData,select
engine=create_engine("mysql://user:pwd@localhost/dbname", echo=True)
connection=engine.connect()
The process of reading the database and creating metadata is called Reflection.
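from sqlalchemy import Table
metadata=MetaData()
# Reflect the existing table; assumes the database already has a table named Student (SQLAlchemy 1.4+ syntax)
Student=Table("Student", metadata, autoload_with=engine)
query=select(Student)
result=connection.execute(query)
row=result.fetchall() #This works similar to the select* query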
In this way, you can manipulate data through other queries too, using sqlalchemy!
I can't seem to correctly connect and pull from a test postgreSQL database in python. I installed PostgreSQL using Homebrew. Here's how I have been accessing the database table and value from the terminal:
xxx-macbook:~ xxx$ psql
psql (9.4.0)
Type "help" for help.
xxx=# \dn
List of schemas
Name | Owner
--------+---------
public | xxx
(1 row)
xxx=# \connect postgres
You are now connected to database "postgres" as user "xxx".
postgres=# SELECT * from test.test;
coltest
-----------
It works!
(1 row)
But when trying to access it from python, using the code below, it doesn't work. Any suggestions?
########################################################################################
# Importing variables from PostgreSQL database via SQL commands
db_conn = psycopg2.connect(database='postgres',
user='xxx')
cursor = db_conn.cursor()
#querying the database
result = cursor.execute("""
Select * From test.test
""")
print "Result: ", result
>>> Result: None
It should say: Result: It works!
You need to fetch the results.
From the docs:
The [execute()-]method returns None. If a query was executed, the returned values can be retrieved using fetch*() methods.
Example:
result = cursor.fetchall()
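Put together, the pattern is to call execute() and then read the rows from the cursor. Here is a small sketch of that as a helper. The name `run_query` is hypothetical, and sqlite3 stands in for the PostgreSQL connection only so the snippet runs without a server; the execute-then-fetch sequence is the same with a psycopg2 connection:

```python
import sqlite3


def run_query(connection, sql, params=()):
    """Execute a statement and return every row it produced."""
    cursor = connection.cursor()
    cursor.execute(sql, params)   # do not rely on the return value
    return cursor.fetchall()      # the rows come from the cursor


# sqlite3 stands in for the PostgreSQL connection here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (coltest TEXT)")
conn.execute("INSERT INTO test VALUES ('It works!')")
result = run_query(conn, "SELECT * FROM test")
print("Result:", result[0][0])
```

fetchall() returns a list of row tuples, which is why the value is read as result[0][0].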
For reference:
http://initd.org/psycopg/docs/cursor.html#execute
http://initd.org/psycopg/docs/cursor.html#fetch
Note that (unlike psql) psycopg2 wraps everything in transactions. So if you intend to issue persistent changes to the database (INSERT, UPDATE, DELETE, ...) you need to commit them explicitly. Otherwise changes will be rolled back automatically when the connection object is destroyed. Read more on that topic here:
http://initd.org/psycopg/docs/usage.html
http://initd.org/psycopg/docs/usage.html#transactions-control
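The effect of a missing commit can be seen in a small experiment. The sketch below uses the standard-library sqlite3 module as a stand-in, since its default mode also opens a transaction implicitly before an INSERT. The table name and the helper `count_rows` are made up for illustration, and psycopg2 itself is not exercised here:

```python
import os
import sqlite3
import tempfile

# sqlite3 is a stand-in; the database file lives in a temporary directory.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")


def count_rows():
    """Open a fresh connection and count the rows currently stored."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT COUNT(*) FROM post").fetchone()[0]
    finally:
        conn.close()


conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE post (hello TEXT)")
conn.commit()

conn.execute("INSERT INTO post VALUES ('lost')")
conn.close()                      # closed without commit: the insert is discarded
rows_without_commit = count_rows()

conn = sqlite3.connect(db_path)
conn.execute("INSERT INTO post VALUES ('kept')")
conn.commit()                     # explicit commit makes the change persistent
conn.close()
rows_with_commit = count_rows()
```

Only the second insert survives, because it is the only one followed by commit().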
In sqlalchemy, I make the connection:
conn = engine.connect()
I found this will set autocommit = 0 in my mysqld log.
Now I want to set autocommit = 1 because I do not want to query in a transaction.
Is there a way to do this?
From The SQLAlchemy documentation: Understanding autocommit
conn = engine.connect()
conn.execute("INSERT INTO users VALUES (1, 'john')") # autocommits
The “autocommit” feature is only in effect when no Transaction has otherwise been declared. This means the feature is not generally used with the ORM, as the Session object by default always maintains an ongoing Transaction.
Full control of the “autocommit” behavior is available using the generative Connection.execution_options() method provided on Connection, Engine, Executable, using the “autocommit” flag which will turn on or off the autocommit for the selected scope. For example, a text() construct representing a stored procedure that commits might use it so that a SELECT statement will issue a COMMIT:
engine.execute(text("SELECT my_mutating_procedure()").execution_options(autocommit=True))
What is your dialect for mysql connection?
You can set autocommit to true in the connection URL to solve the problem, like this: mysql+mysqldb://user:password@host:port/db?charset=foo&autocommit=true
You can use this:
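from sqlalchemy import create_engine
from sqlalchemy.sql import text
# create_engine() takes a database URL rather than separate host/user/password arguments
engine = create_engine("mysql+mysqldb://user:password@host/dbname")
engine.execute(text(sql).execution_options(autocommit=True))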
In case you're configuring SQLAlchemy for a Python application using Flask (via the Flask-SQLAlchemy extension, which provides the SQLAlchemy class used below), you can create the engine like this:
# Configure the SqlAlchemy part of the app instance
app.config['SQLALCHEMY_DATABASE_URI'] = conn_url
session_options = {
'autocommit': True
}
# Create the SqlAlchemy db instance
db = SQLAlchemy(app, session_options=session_options)
I might be a little late here, but for folks who are using sqlalchemy >= 2.0.*, the solutions above might not work; they did not work for me.
So I went through the official documentation, and the solution below worked for me.
from sqlalchemy import create_engine
db_engine = create_engine(database_uri, isolation_level="AUTOCOMMIT")
The code above works if you want to set autocommit engine-wide.
But if you want to use autocommit for a particular query, you can use the following (in 2.0, raw SQL strings have to be wrapped in text(), imported from sqlalchemy):
with engine.connect().execution_options(isolation_level="AUTOCOMMIT") as connection:
    with connection.begin():
        connection.execute(text("<statement>"))
Official Documentation
This can be done using the autocommit option in the execution_options() method (text comes from sqlalchemy.sql):
engine.execute(text("UPDATE table SET field1 = 'test'").execution_options(autocommit=True))
This information is available within the documentation on Autocommit
I have a simple web2py server that we use to visualize data from our PostgreSQL Server. The following functions are all part of the global models in web2py.
The current solution to fetch data is very simple. Every time I connect, and after I get the data I close the connection:
# Old way:
# (imports excluded)
def get_data(query):
    postgres_connection = psycopg2.connect("credentials")
    df = psql.frame_query(query, con=postgres_connection) # Pandas function to put data from query into DataFrame
    postgres_connection.close()
    return df
For small queries, opening and closing the connection takes about 9/10 of the time it takes to run the function.
Is this a good way to do it instead? If not, what is a better way?
# Better way?
def connect():
    """
    Create a connection to server.
    """
    return psycopg2.connect("credentials")

db_connection = connect()

def create_pandas_frame(query):
    """
    Get query if connection is open.
    """
    return psql.frame_query(query, con=db_connection)

def get_data(query):
    """
    Try to get data, open a new connection if connection is closed.
    """
    global db_connection
    try:
        data = create_pandas_frame(query)
    except Exception:
        db_connection = connect()
        data = create_pandas_frame(query)
    return data
If you run that code in a web2py model file, you'll end up creating a new connection on each HTTP request anyway. Instead, you might consider connection pooling.
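To illustrate what pooling means, here is a minimal sketch of a fixed-size pool built on a queue. The class `ConnectionPool` and its methods are hypothetical names, and sqlite3 stands in for psycopg2.connect("credentials") so the snippet runs on its own. For real use, psycopg2 ships its own psycopg2.pool module, which is likely a better starting point than this toy version:

```python
import queue
import sqlite3


class ConnectionPool:
    """Keep a fixed number of open connections and hand them out on demand."""

    def __init__(self, connect, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    def acquire(self):
        # Blocks until a connection is free instead of opening a new one.
        return self._idle.get()

    def release(self, connection):
        # Hand the connection back so the next caller can reuse it.
        self._idle.put(connection)


# sqlite3 stands in for psycopg2.connect("credentials").
pool = ConnectionPool(
    lambda: sqlite3.connect(":memory:", check_same_thread=False), size=2
)

first = pool.acquire()
value = first.execute("SELECT 1").fetchone()[0]
pool.release(first)
```

The connections are opened once, up front, so each query only pays for acquire() and release() rather than for a full connect and close.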
An easier option might be to use the web2py DAL to fetch the data. Something like:
from pandas.core.api import DataFrame
db = DAL([db connection string], pool_size=10, migrate_enabled=False)
rows = db.executesql(query)
data = DataFrame.from_records(rows, columns=[list, of, column, names])
If you specify the pool_size argument to DAL(), it will automatically maintain a connection pool to be used across requests.
Note, I haven't tried this, so it may need some tweaking, but something along these lines should work.
If you'd like, you can even use the DAL to generate the SQL by defining database table models:
db.define_table('mytable',
Field('field1', 'integer'),
Field('field2', 'double'),
Field('field3', 'boolean'))
rows = db.executesql(db(db.mytable.id > 0)._select())
data = DataFrame.from_records(rows, columns=db.mytable.fields)
The ._select() method just generates the SQL without actually doing the select. The SQL is then passed to .executesql() to fetch the data.
An alternative is to create a special Pandas processor and pass it as the processor argument to .select().
def pandas_processor(rows, fields, columns, cacheable):
    return DataFrame.from_records(rows, columns=columns)
data = db(db.mytable.id > 0).select(processor=pandas_processor)
I used Anthony's answer and now have functions that look like this:
# In one of the models files.
from pandas.core.api import DataFrame
external_db = DAL('postgres://connection_stuff',pool_size=10,migrate_enabled=False)
def create_simple_html_table(query):
    dict_from_db = external_db.executesql(query, as_dict=True)
    return DataFrame(dict_from_db).to_html()
Then later in a view or controller a html table is created using:
# In Controller:
my_table = create_simple_html_table('select * from random_table limit 50')
# In View:
{{=XML(create_simple_html_table('select * from random_table limit 50'))}}
I still need to do more testing, but my understanding so far is that this solution will let me query things from the external db and let web2py keep the connection, and let web2py use the same connection for all users.
Note that this solution is only good if all you want to do is read from and write to your Postgres server with raw SQL.
If you want to use DAL to read and write, you need to either try to find the DAL alternative called MyDAL or play around with the search_path option in Postgres.