connecting sqlalchemy to MS Access - python

How can I connect to MS Access with SQLAlchemy? On their website, it says the connection string is access+pyodbc. Does that mean that I need pyodbc for the connection? Since I am a newbie, please be gentle.

In theory this would be via create_engine("access:///some_odbc_dsn"), but the Access backend hasn't been in service at all since SQLAlchemy 0.5, and it's not clear how well it was working back then either (this is why it's marked as "development" at http://docs.sqlalchemy.org/en/latest/core/engines.html#supported-databases - "development" means "a development version of the dialect exists, but is not yet usable"). There just isn't enough interest/volunteers to keep this dialect running right now. (When/if that changes, you'll see it at http://docs.sqlalchemy.org/en/latest/dialects/access.html.)
Your best bet for Access right now would be to export the data into a SQLite database file (or of course some other database, though SQLite is file-based in a similar way at least), then use that.
Update, September 2019:
The sqlalchemy-access dialect has been resurrected. Details here.
Usage example:
engine = create_engine("access+pyodbc://#some_odbc_dsn")

I primarily needed read access and some simple queries. The latest version of sqlalchemy ships the (broken) Access backend modules, but they aren't registered as an entry point.
It needed a few fixups, but this worked for me:
def fixup_access():
    import sqlalchemy.dialects.access.base

    class FixedAccessDialect(sqlalchemy.dialects.access.base.AccessDialect):
        def _check_unicode_returns(self, connection):
            return True

        def do_execute(self, cursor, statement, params, context=None, **kwargs):
            if params == {}:
                params = ()
            super(sqlalchemy.dialects.access.base.AccessDialect, self).do_execute(
                cursor, statement, params, **kwargs)

    class SomeObject(object):
        pass

    fixed_dialect_mod = SomeObject
    fixed_dialect_mod.dialect = FixedAccessDialect
    sqlalchemy.dialects.access.fix = fixed_dialect_mod

fixup_access()
ENGINE = sqlalchemy.create_engine('access+fix://admin@/%s' % db_location)

Related

SQLite: How to enable explain plan with the Python API?

I'm using Python (and Peewee) to connect to a SQLite database. My data access layer (DAL) is a mix of peewee ORM and SQL-based functions. I would like to enable EXPLAIN PLAN for all queries upon connecting to the database and toggle it via configuration or CLI parameter ... how can I do that using the Python API?
from playhouse.db_url import connect
self._logger.info("opening db connection to database, creating cursor and initializing orm model ...")
self.__db = connect(url)
# add support for a REGEXP and POW implementation
# TODO: this should be added only for the SQLite case and doesn't apply to other vendors.
self.__db.connection().create_function("REGEXP", 2, regexp)
self.__db.connection().create_function("POW", 2, pow)
self.__cursor = self.__db.cursor()
self.__cursor.arraysize = 100
# what shall I do here to enable EXPLAIN PLANs?
That is a feature of the SQLite interactive shell. To get the query plans, you will need to request them explicitly. This is not quite straightforward with Peewee because it uses parameterized queries. You can get the SQL executed by Peewee in a couple of ways.
# Print all queries to stderr.
import logging
logger = logging.getLogger('peewee')
logger.addHandler(logging.StreamHandler())
logger.setLevel(logging.DEBUG)
Or for an individual query:
query = SomeModel.select()
sql, params = query.sql()
# To get the query plan:
curs = db.execute_sql('EXPLAIN ' + sql, params)
print(curs.fetchall()) # prints query plan
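If you want something you can toggle from configuration, a small helper along these lines could work (a sketch only; explain_enabled is a hypothetical config flag and db is your peewee database object):
def maybe_explain(db, query, explain_enabled=False):
    # Optionally print the query plan for a peewee query before running it.
    if explain_enabled:
        sql, params = query.sql()
        plan = db.execute_sql('EXPLAIN QUERY PLAN ' + sql, params)
        for row in plan.fetchall():
            print(row)
    return query.execute()
Then call maybe_explain(db, SomeModel.select(), explain_enabled=True) instead of executing the query directly.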

sqlalchemy looking for server version as string, instead of bytes-like object

I don't know what suddenly caused this (I recently reinstalled Anaconda and all my python libraries, but I am back to the same versions as before), but when sqlalchemy tries to connect to the SQL server, it fails because it looks for the server version and tries to run a string operation on it.
The following had no problems prior to my reinstall of packages. I'd connect like so:
sqlalchemy_conn_string = 'mssql+pyodbc://myDSN'
sqlalchemy.create_engine(sqlalchemy_conn_string, module=pypyodbc)
Then it gets all the way to a file called pyodbc.py and fails at this function:
def _get_server_version_info(self, connection):
    try:
        raw = connection.scalar("SELECT SERVERPROPERTY('ProductVersion')")
    except exc.DBAPIError:
        # ...
    else:
        version = []
        r = re.compile(r'[.\-]')
        for n in r.split(raw):  # culprit here
            try:
                version.append(int(n))
            except ValueError:
                version.append(n)
        return tuple(version)
Out[1]: TypeError: cannot use a string pattern on a bytes-like object
That's because at this step, raw is not a string that can be split:
# from PyCharm's debugger window
raw = {bytes}b'13.0.5026.0'
At this point, I don't know if I should submit a bug report for sqlalchemy and/or pypyodbc, or if there's something I can do to fix this myself. But I'd like a solution that doesn't involve editing the sqlalchemy code on my own machine (like handling the bytes-like object specifically), because we have other team members who will also be downloading vanilla sqlalchemy & pypyodbc and won't have the confidence to edit that source code.
I have confirmed the pypyodbc behaviour under Python 3.6.4.
import pypyodbc
print(pypyodbc.version)  # 1.3.5

cnxn = pypyodbc.connect("DSN=myDSN")  # connect to the DSN from the question
crsr = cnxn.cursor()
sql = """\
SELECT SERVERPROPERTY('ProductVersion')
"""
crsr.execute(sql)
x = crsr.fetchone()[0]
print(repr(x))  # b'12.0.5207.0'
Note that SQLAlchemy's mssql+pyodbc dialect is coded for pyodbc, not pypyodbc, and the two are not guaranteed to be 100% compatible.
The obvious solution would be to use pyodbc instead.
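For example, a sketch reusing the DSN from the question (assuming pyodbc is installed in place of pypyodbc):
import sqlalchemy

# no module=pypyodbc override; the mssql+pyodbc dialect imports pyodbc itself
engine = sqlalchemy.create_engine('mssql+pyodbc://myDSN')
conn = engine.connect()  # version detection happens here, without the TypeError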
UPDATE:
Check your version of SQLAlchemy. I just looked at the current source code for the mssql+pyodbc dialect and it does
def _get_server_version_info(self, connection):
    try:
        # "Version of the instance of SQL Server, in the form
        # of 'major.minor.build.revision'"
        raw = connection.scalar(
            "SELECT CAST(SERVERPROPERTY('ProductVersion') AS VARCHAR)")
which should avoid the issue, even when using pypyodbc.
If you are using the latest production release of SQLAlchemy (currently version 1.2.15), then you might have better luck with version 1.3.0b1.

how to set autocommit = 1 in a sqlalchemy.engine.Connection

In sqlalchemy, I make the connection:
conn = engine.connect()
I found this will set autocommit = 0 in my mysqld log.
Now I want to set autocommit = 1 because I do not want to query in a transaction.
Is there a way to do this?
From The SQLAlchemy documentation: Understanding autocommit
conn = engine.connect()
conn.execute("INSERT INTO users VALUES (1, 'john')") # autocommits
The “autocommit” feature is only in effect when no Transaction has otherwise been declared. This means the feature is not generally used with the ORM, as the Session object by default always maintains an ongoing Transaction.
Full control of the “autocommit” behavior is available using the generative Connection.execution_options() method provided on Connection, Engine, Executable, using the “autocommit” flag which will turn on or off the autocommit for the selected scope. For example, a text() construct representing a stored procedure that commits might use it so that a SELECT statement will issue a COMMIT:
engine.execute(text("SELECT my_mutating_procedure()").execution_options(autocommit=True))
What is your dialect for mysql connection?
You can set autocommit to true in the connection URL to solve the problem, like this: mysql+mysqldb://user:password@host:port/db?charset=foo&autocommit=true
You can use this:
from sqlalchemy.sql import text
engine = create_engine(connection_string)  # a database URL, e.g. "mysql+mysqldb://user:password@host/dbname"
engine.execute(text(sql).execution_options(autocommit=True))
In case you're configuring SQLAlchemy for a Python application using Flask (via the Flask-SQLAlchemy extension), you can create the engine like this:
# Configure the SQLAlchemy part of the app instance
app.config['SQLALCHEMY_DATABASE_URI'] = conn_url
session_options = {
    'autocommit': True
}
# Create the SQLAlchemy db instance
db = SQLAlchemy(app, session_options=session_options)
I might be a little late here, but for folks who are using sqlalchemy >= 2.0.*, the above solutions might not work, as they did not work for me.
So I went through the official documentation, and the solution below worked for me.
from sqlalchemy import create_engine
db_engine = create_engine(database_uri, isolation_level="AUTOCOMMIT")
The above code works if you want to set autocommit engine-wide.
But if you want to use autocommit for a particular query, then you can use the following:
from sqlalchemy import text

with engine.connect().execution_options(isolation_level="AUTOCOMMIT") as connection:
    with connection.begin():
        connection.execute(text("<statement>"))
Official Documentation
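For contrast, without the AUTOCOMMIT isolation level, SQLAlchemy 2.0 works in "commit as you go" style, so nothing is persisted until you call commit() yourself (a sketch):
from sqlalchemy import text

with engine.connect() as connection:
    connection.execute(text("INSERT INTO users VALUES (1, 'john')"))
    connection.commit()  # required here; there is no implicit autocommit in 2.0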
This can be done using the autocommit option of the execution_options() method:
from sqlalchemy import text
engine.execute(text("UPDATE table SET field1 = 'test'").execution_options(autocommit=True))
This information is available within the documentation on Autocommit

twisted adbapi turn off autocommit (psycopg2)

I am inserting a lot of rows and it seems that postgres can't keep up. I've googled a bit and it is suggested that you can turn off autocommit. I don't need to commit the values right away. It's data that I can fetch again if something goes wrong.
Now when I search for turning off autocommit I'm not finding what I'm looking for. I've tried supplying autocommit=False in the dbpool constructor:
dbpool = adbapi.ConnectionPool('psycopg2', user="xxx", password="xxx", database="postgres", host="localhost", autocommit=False)
2013-01-27 18:24:42,254 - collector.EventLogItemController - WARNING - [Failure instance: Traceback: : invalid connection option "autocommit"
psycopg2 does not claim to support an autocommit keyword argument to connect:
connect(dsn=None, database=None, user=None, password=None, host=None, port=None, connection_factory=None, async=False, **kwargs)
Create a new database connection.
The connection parameters can be specified either as a string:
conn = psycopg2.connect("dbname=test user=postgres password=secret")
or using a set of keyword arguments:
conn = psycopg2.connect(database="test", user="postgres", password="secret")
The basic connection parameters are:
- *dbname*: the database name (only in dsn string)
- *database*: the database name (only as keyword argument)
- *user*: user name used to authenticate
- *password*: password used to authenticate
- *host*: database host address (defaults to UNIX socket if not provided)
- *port*: connection port number (defaults to 5432 if not provided)
Using the *connection_factory* parameter a different class or connections
factory can be specified. It should be a callable object taking a dsn
argument.
Using *async*=True an asynchronous connection will be created.
Any other keyword parameter will be passed to the underlying client
library: the list of supported parameter depends on the library version.
The current postgresql documentation doesn't discuss any "autocommit" parameter either:
http://www.postgresql.org/docs/current/static/libpq-connect.html#LIBPQ-CONNSTRING
So perhaps the problem is that this is not the correct way to disable autocommit for a psycopg2 connection. Apart from that, you won't find that turning off autocommit actually helps you at all. adbapi.ConnectionPool will begin and commit explicit transactions for you, side-stepping any behavior autocommit mode might give you.
The problem with adbapi is that:
1) it's missing features specific to some of the database backends
2) its asynchronous API is fake: under the hood it uses a thread pool to call the blocking methods.
For postgres I'd suggest using the txpostgres library (source is here: https://github.com/wulczer/txpostgres). It uses the asynchronous API of psycopg2 and lets you specify the connection string.
You can find an example here: http://txpostgres.readthedocs.org/en/latest/usage.html#customising-the-connection-and-cursor-factories
Use the cp_openfun option of the twisted.enterprise.adbapi.ConnectionPool constructor.
This function is called with the connection as a parameter.
In the case of psycopg2 you can then set the autocommit property of that connection to True or False, as stated here.
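A minimal sketch of that approach, reusing the connection parameters from the question (cp_openfun is invoked once for each newly opened connection):
from twisted.enterprise import adbapi

def configure_connection(conn):
    # psycopg2 connections expose an autocommit attribute;
    # True disables psycopg2's implicit transactions.
    conn.autocommit = True

dbpool = adbapi.ConnectionPool('psycopg2', user="xxx", password="xxx",
                               database="postgres", host="localhost",
                               cp_openfun=configure_connection)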

Multi-tenancy with SQLAlchemy

I've got a web-application which is built with Pyramid/SQLAlchemy/Postgresql and allows users to manage some data, and that data is almost completely independent for different users. Say, Alice visits alice.domain.com and is able to upload pictures and documents, and Bob visits bob.domain.com and is also able to upload pictures and documents. Alice never sees anything created by Bob and vice versa (this is a simplified example, there may be a lot of data in multiple tables really, but the idea is the same).
Now, the most straightforward option to organize the data in the DB backend is to use a single database, where each table (pictures and documents) has user_id field, so, basically, to get all Alice's pictures, I can do something like
user_id = _figure_out_user_id_from_domain_name(request)
pictures = session.query(Picture).filter(Picture.user_id==user_id).all()
This is all easy and simple; however, there are some disadvantages:
- I need to remember to always use an additional filter condition when making queries, otherwise Alice may see Bob's pictures;
- if there are many users, the tables may grow huge;
- it may be tricky to split the web application between multiple machines.
So I'm thinking it would be really nice to somehow split the data per-user. I can think of two approaches:
Have separate tables for Alice's and Bob's pictures and documents within the same database (Postgres' Schemas seems to be a correct approach to use in this case):
documents_alice
documents_bob
pictures_alice
pictures_bob
and then, using some dark magic, "route" all queries to one or to the other table according to the current request's domain:
_use_dark_magic_to_configure_sqlalchemy('alice.domain.com')
pictures = session.query(Picture).all() # selects all Alice's pictures from "pictures_alice" table
...
_use_dark_magic_to_configure_sqlalchemy('bob.domain.com')
pictures = session.query(Picture).all() # selects all Bob's pictures from "pictures_bob" table
Use a separate database for each user:
- database_alice
- pictures
- documents
- database_bob
- pictures
- documents
which seems like the cleanest solution, but I'm not sure if multiple database connections would require much more RAM and other resources, limiting the number of possible "tenants".
So, the question is, does it all make sense? If yes, how do I configure SQLAlchemy to either modify the table names dynamically on each HTTP request (for option 1) or to maintain a pool of connections to different databases and use the correct connection for each request (for option 2)?
After pondering jd's answer, I was able to achieve the same result with postgresql 9.2, sqlalchemy 0.8, and the flask 0.9 framework:
from flask import session
from sqlalchemy import event
from sqlalchemy.pool import Pool

@event.listens_for(Pool, 'checkout')
def on_pool_checkout(dbapi_conn, connection_rec, connection_proxy):
    tenant_id = session.get('tenant_id')
    cursor = dbapi_conn.cursor()
    if tenant_id is None:
        cursor.execute("SET search_path TO public, shared;")
    else:
        cursor.execute("SET search_path TO t" + str(tenant_id) + ", shared;")
    dbapi_conn.commit()
    cursor.close()
OK, I ended up modifying search_path at the beginning of every request, using Pyramid's NewRequest event:
from pyramid import events

def on_new_request(event):
    schema_name = _figure_out_schema_name_from_request(event.request)
    DBSession.execute("SET search_path TO %s" % schema_name)

def app(global_config, **settings):
    """ This function returns a WSGI application.
    It is usually called by the PasteDeploy framework during
    ``paster serve``.
    """
    ....
    config.add_subscriber(on_new_request, events.NewRequest)
    return config.make_wsgi_app()
Works really well, as long as you leave transaction management to Pyramid (i.e. do not commit/roll back transactions manually, letting Pyramid do that at the end of the request) - which is fine, as committing transactions manually is not a good approach anyway.
What works very well for me is to set the search path at the connection pool level, rather than in the session. This example uses Flask and its thread-local proxies to pass the schema name, so you'll have to change schema = current_schema._get_current_object() and the try block around it.
from sqlalchemy.interfaces import PoolListener

class SearchPathSetter(PoolListener):
    '''
    Dynamically sets the search path on connections checked out from a pool.
    '''
    def __init__(self, search_path_tail='shared, public'):
        self.search_path_tail = search_path_tail

    @staticmethod
    def quote_schema(dialect, schema):
        return dialect.identifier_preparer.quote_schema(schema, False)

    def checkout(self, dbapi_con, con_record, con_proxy):
        try:
            schema = current_schema._get_current_object()
        except RuntimeError:
            search_path = self.search_path_tail
        else:
            if schema:
                search_path = self.quote_schema(con_proxy._pool._dialect, schema) + ', ' + self.search_path_tail
            else:
                search_path = self.search_path_tail
        cursor = dbapi_con.cursor()
        cursor.execute("SET search_path TO %s;" % search_path)
        dbapi_con.commit()
        cursor.close()
At engine creation time:
engine = create_engine(dsn, listeners=[SearchPathSetter()])
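Note that PoolListener is an old-style interface that was deprecated in SQLAlchemy 0.7; on newer versions the same idea can be expressed with the event API used in the Flask answer above. A rough equivalent (a sketch; current_schema is the same thread-local proxy as before):
from sqlalchemy import create_engine, event

engine = create_engine(dsn)

@event.listens_for(engine, "checkout")
def set_search_path(dbapi_con, con_record, con_proxy):
    cursor = dbapi_con.cursor()
    cursor.execute("SET search_path TO %s;" % current_schema._get_current_object())
    dbapi_con.commit()
    cursor.close()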
