The pandas package has a method called .to_sql that helps insert the current DataFrame into a database.
.to_sql doc:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_sql.html
The second parameter is con: sqlalchemy.engine.(Engine or Connection) or sqlite3.Connection.
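For reference, a typical call passes a table name and a live engine; the names below are purely illustrative:

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("sqlite:///example.db")
frame = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
frame.to_sql("my_table", engine, if_exists="replace", index=False)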
Is it possible to generate the SQL query without passing a database connection?
We cannot make pandas print the query without handing it some engine-like object, but we can use SQLAlchemy's create_mock_engine and give it a SQLite URL (no real database is ever touched) to trick pandas, e.g.:
from sqlalchemy import create_mock_engine, MetaData

def dump(sql, *multiparams, **params):
    # every statement handed to the mock engine is compiled and printed
    print(sql.compile(dialect=engine.dialect))

engine = create_mock_engine("sqlite://", dump)
MetaData().create_all(engine, checkfirst=False)  # emits DDL for any tables on this MetaData
frame.to_sql("my_table", engine)
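If all you need is the CREATE TABLE statement, pandas also exposes get_schema, which takes no connection at all (note it does not generate the INSERTs); a minimal sketch with a made-up frame:

import pandas as pd

frame = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
print(pd.io.sql.get_schema(frame, "my_table"))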
Related
I'm developing a python application where most of its functions will interact (create, read, update and delete) with a specific table in a MySQL database. I know that I can query this specific table with the following code:
engine = create_engine(
    f"mysql+pymysql://{username}:{password}@{host}:{port}",
    pool_pre_ping=True
)
meta = MetaData(engine)
my_table = Table(
    'my_table',
    meta,
    autoload=True,
    schema=db_name
)
dbsession = sessionmaker(bind=engine)
session = dbsession()

# example query to table
results = session.query(my_table).filter(my_table.columns.id >= 1)
results.all()
However, I do not understand how to make these definitions (engine, meta, table, session) global to all of my functions. Should I define these things in my __init__.py and then pass them along as function arguments? Should I define a big class and initialize them during the class __init__?
My goal is to be able to query that table in any of my functions at any time without having to worry if the connection has gone away. According to the SQLAlchemy docs:
Just one time, somewhere in your application’s global scope. It should be looked upon as part of your application’s configuration. If your application has three .py files in a package, you could, for example, place the sessionmaker line in your __init__.py file; from that point on your other modules say “from mypackage import Session”. That way, everyone else just uses Session(), and the configuration of that session is controlled by that central point.
Ok, but what about the engine, table and meta? Do I need to worry about those?
If you are working with a single table, then the reflected table instance (my_table) and the engine should be all you need to expose globally:

- the metadata object (meta) is not required for querying, and is available as my_table.metadata if needed
- sessions are not required, because you do not appear to be using the SQLAlchemy ORM.
The engine maintains a pool of connections, which you can check out to run queries (don't forget to close them though). This example code uses context managers to ensure that transactions are committed and connections are closed:
from sqlalchemy import select

# Check out a connection
with engine.connect() as conn:
    # Start a transaction
    with conn.begin():
        q = select(my_table).where(my_table.c.id >= 1)
        result = conn.execute(q)
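A minimal sketch of that arrangement, with made-up module path and credentials: build the engine and reflect the table once in a single module (here using autoload_with, the current spelling of autoload=True), then import them everywhere else.

# mypackage/db.py -- hypothetical module, built once at import time
from sqlalchemy import MetaData, Table, create_engine

engine = create_engine(
    "mysql+pymysql://user:password@localhost:3306",
    pool_pre_ping=True,
)
meta = MetaData()
my_table = Table("my_table", meta, autoload_with=engine, schema="db_name")

Any other module can then query the table without worrying about connection lifetime, since the engine's pool (with pool_pre_ping) replaces stale connections:

# elsewhere in the package
from sqlalchemy import select
from mypackage.db import engine, my_table

def rows_from(minimum_id):
    with engine.connect() as conn:
        return conn.execute(select(my_table).where(my_table.c.id >= minimum_id)).all()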
I'm creating my DB from an existing schema and it's stored in :memory:.
db = Database(filename=':memory:', schema='schema.sql')
db.recreate()
I now want to "link" this to SQLAlchemy. I have followed different methods but could not get it right.
My current attempt stands as follow:
from sqlalchemy import create_engine
from sqlalchemy.ext.automap import automap_base
from sqlalchemy.orm import Session

engine = create_engine('sqlite:///:memory:')
Base = automap_base()
Base.prepare(engine, reflect=True)
User = Base.classes.user
session = Session(engine)
Much like the other things I tried, this throws AttributeError: user.
How can I have this work together?
The relevant part of the documentation is here: https://sqlite.org/inmemorydb.html .
If you use :memory:, then every connection will have its own memory database. The trick is to use a named in-memory database with the URI format, like the following:
import random
import string
import sqlite3

from sqlalchemy import create_engine

# create a random name for the temporary in-memory DB
sqlite_shared_name = "test_db_{}".format(
    "".join(random.sample(string.ascii_letters, k=4))
)
engine = create_engine(
    "sqlite:///file:{}?mode=memory&cache=shared&uri=true".format(
        sqlite_shared_name)
)
- the format is a URI, as stated by the query string parameter uri=true (see the SQLAlchemy documentation)
- it is a memory DB, with mode=memory
- it can be shared among various connections, with cache=shared

If you have another connection, you can use more or less the same connection string. For instance, to get a connection to that same in-memory DB from Python's sqlite3 module, drop uri=true from the query string (and the dialect part sqlite:///) and pass it as an argument:
dest = sqlite3.connect(
    "file:{}?mode=memory&cache=shared".format(sqlite_shared_name),
    uri=True)
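A quick way to convince yourself the two handles share state, reusing the engine and dest objects from the snippets above (the table name and value are made up for this demo):

from sqlalchemy import text

with engine.begin() as conn:
    conn.execute(text("CREATE TABLE t (x INTEGER)"))
    conn.execute(text("INSERT INTO t VALUES (1)"))

# the plain sqlite3 connection sees the row written through the engine
print(dest.execute("SELECT x FROM t").fetchall())  # expected: [(1,)]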
As the title states, I need some help with Python and MySQL. I am currently studying Python further, and I am focusing on using Python and MySQL for database design, development, administration and applications.
I am familiar with MySQL and somewhat familiar with Python. Currently I am working on object-oriented programming, and I am trying my hand at setting up a database connection inside a database class and then using that class to create, read, update and delete data.
I have created a new Python object:
import pymysql as MySQL

class Database(object):
    Host = "127.0.0.1"
    Database = "****"
    user = "****"
    password = "****"

    @staticmethod
    def initialize():
        currentdb = MySQL.connect(Database.Host, Database.user, Database.password, Database.Database)
        cursor = currentdb.cursor()

    @staticmethod
    def insert(Table, DataDict):
        placeholders = ", ".join(["%s"] * len(DataDict))
        columns = ", ".join(DataDict.keys())
        sql = "INSERT INTO %s (%s) VALUES (%s)" % (Table, columns, placeholders)
        cursor.execute(sql, DataDict.values())
I want to know: how do I work with the cursor inside of an object? I don't know if my current approach is even close to how it should be handled; I am really not sure.
Can the cursor be initialized in this way, and then used further in the object as I intend to do in the extract above?
Any help would be highly appreciated.
The right way to work with cursors is like this:
import contextlib

def doSomething():
    with contextlib.closing(database.cursor()) as cursor:
        cursor.execute(...)
    # At the end of the `with` statement, the cursor is closed
Do not keep a cursor open for too long. Keeping a connection open for a long time, as you do, is fine. Also, read up on transaction control.
If you're doing more than a handful of DB operations, consider using a library like SQLAlchemy or Pony ORM.
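On transaction control specifically: with pymysql, autocommit is off by default, so writes only become permanent once you commit on the connection (not the cursor). A minimal sketch, with an illustrative table:

import contextlib

with contextlib.closing(database.cursor()) as cursor:
    cursor.execute("INSERT INTO my_table (name) VALUES (%s)", ("example",))
database.commit()  # without this, the INSERT is discarded when the connection closes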
Have you considered using SQLAlchemy? This gives you a mapping between Python classes and MySQL (or any other RDBMS) tables. I've recently been using it on a fairly hefty real-world project and it seems to do the job fairly well and is easy enough to learn.
Check out the following code. I moved the content of your initialize() into the standard Python __init__ method and made the Database object be initialized with the different connection parameters:
import pymysql as MySQL

class Database(object):
    def __init__(self, host, db, user, pw):
        self.currentdb = MySQL.connect(host=host, user=user, password=pw, database=db)

    def insert(self, Table, DataDict):
        placeholders = ", ".join(["%s"] * len(DataDict))
        columns = ", ".join(DataDict.keys())
        sql = "INSERT INTO %s (%s) VALUES (%s)" % (Table, columns, placeholders)
        with self.currentdb.cursor() as db_cursor:
            db_cursor.execute(sql, list(DataDict.values()))
Once you are here, you can initialize a Database object as below and insert data:
my_db = Database(host="127.0.0.1", user="****", pw="****", db="****")
my_db.insert('table_name', data_dict)
Please note, I haven't changed your code, only presented an organization based on your initial post that could work.
I am using SQLAlchemy 0.9.7 over Postgres with psycopg2 as the driver.
I have a stray transaction that isn't being closed properly, and in order to debug it, I would like to log all of the operations being sent to the database.
The psycopg2.extras.LoggingConnection looks like it provides the functionality I need, but I can't see how I might persuade SQLAlchemy to use this feature of the dialect.
Is this possible via SQLAlchemy?
You could pass a custom connection factory to the SQLAlchemy engine:

import psycopg2.extras

def _connection_factory(*args, **kwargs):
    connection = psycopg2.extras.LoggingConnection(*args, **kwargs)
    connection.initialize(open('sql.log', 'a'))
    return connection

db_engine = create_engine(
    conn_string,
    connect_args={"connection_factory": _connection_factory})
Alternatively, you could implement a custom cursor class (see psycopg2.extras.LoggingCursor for example), and pass it in a similar way:
connect_args={ "cursor_factory": MyCursor }
It isn't a direct answer to my own question, but a workaround: similar functionality can be obtained by turning on query logging at the SQLAlchemy layer, rather than the Psycopg2 layer:
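Concretely, that can be done either through the standard logging module or with the engine's echo flag:

import logging

logging.basicConfig()
logging.getLogger("sqlalchemy.engine").setLevel(logging.INFO)

# or, equivalently, when creating the engine:
db_engine = create_engine(conn_string, echo=True)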
PostgreSQL supports specifying date formats using the DateStyle parameter, as described here:
http://www.postgresql.org/docs/current/interactive/runtime-config-client.html#GUC-DATESTYLE (the link was originally to the 8.3 version of the docs).
I could not find any reference in the SQLAlchemy ORM documentation on how to set this parameter. Is it possible to do it?
SQLAlchemy makes use of the DBAPI, usually psycopg2, to marshal date values to and from Python datetime objects; you can then format/parse any way you want using standard Python techniques. So no database-side date formatting features are needed.
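For example, a date column comes back as a plain datetime regardless of DateStyle, and is rendered client-side (the value here is made up):

from datetime import datetime

value = datetime(2009, 1, 31)       # what the driver hands back (illustrative)
print(value.strftime("%d/%m/%Y"))   # -> 31/01/2009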
If you do want to set this variable, you can just execute PG's SET statement:
conn = engine.connect()
conn.execute("SET DateStyle='somestring'")
# work with conn
To make this global to all connections:
from sqlalchemy import event
from sqlalchemy.engine import Engine

@event.listens_for(Engine, "connect")
def connect(dbapi_connection, connection_record):
    cursor = dbapi_connection.cursor()
    cursor.execute("SET DateStyle='somestring'")
    cursor.close()