SQLAlchemy or psycopg2? - python

I am writing a quick and dirty script which requires interaction with a database (PG).
The script is a pragmatic, tactical solution to an existing problem. However, I envisage that the script will evolve over time into a more "refined" system. Given that it is currently being put together very quickly (i.e. I don't have the time to pore over huge reams of documentation), I am tempted to go the quick and dirty route, using psycopg2.
The advantages of psycopg2 (as I currently understand them) are:
Written in C, so faster than SQLAlchemy (written in Python)?
No abstraction layer over the DBAPI, since it works with one database and one database only (implication: fast)
(For now) I don't need an ORM, so I can execute my SQL statements directly without having to learn a new ORM syntax (i.e. lightweight)
Disadvantages:
I KNOW that I will want an ORM further down the line
psycopg2 is possibly "dated"; I don't know how long it will remain around
Are my perceptions of SQLAlchemy (slow/interpreted, bloated, steep learning curve) true? Is there any way I can use SQLAlchemy in the "rough and ready" way I want to use psycopg2, namely:
execute SQL statements directly without having to mess about with the ORM layer, etc.
Are any examples of doing this available?

SQLAlchemy is an ORM, psycopg2 is a database driver. These are completely different things: SQLAlchemy generates SQL statements and psycopg2 sends SQL statements to the database. SQLAlchemy depends on psycopg2 or other database drivers to communicate with the database!
As a rather complex software layer, SQLAlchemy does add some overhead, but it is also a huge boost to development speed, at least once you have learned the library. SQLAlchemy is an excellent library and will teach you the whole ORM concept, but if you don't want to generate SQL statements to begin with, then you don't want SQLAlchemy.
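For the "rough and ready" usage the question asks about, here is a minimal sketch of running hand-written SQL through SQLAlchemy without defining any ORM classes. It assumes SQLAlchemy is installed and uses an in-memory SQLite database only so that it runs without a server; the users table and the connection URL in the comment are hypothetical placeholders.

```python
from sqlalchemy import create_engine, text

# In-memory SQLite keeps the sketch self-contained. For PostgreSQL the URL
# would look like "postgresql+psycopg2://user:password@host/dbname"
# (placeholder credentials), with psycopg2 acting as the underlying driver.
engine = create_engine("sqlite://")


def run_raw_sql():
    """Create a hypothetical table, insert two rows, and query one back."""
    # engine.begin() opens a connection and commits when the block exits.
    with engine.begin() as conn:
        conn.execute(text(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT)"
        ))
        # Named parameters (:name) are bound by the driver, not string-formatted.
        insert = text("INSERT INTO users (first_name) VALUES (:name)")
        conn.execute(insert, {"name": "bob"})
        conn.execute(insert, {"name": "alice"})
        result = conn.execute(
            text("SELECT id, first_name FROM users WHERE first_name = :name"),
            {"name": "bob"},
        )
        return [tuple(row) for row in result]


rows = run_raw_sql()
print(rows)
```

Used this way, SQLAlchemy mostly contributes connection handling and parameter binding; the SQL itself stays exactly as written.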

To talk to a database, you need a driver. If you use a client such as SQL*Plus for Oracle or the mysql command-line client for MySQL, it runs the query directly, and that client ships with the database server package.
To communicate with a database from a language such as Java, C, Python, or C#, you need a driver for that database. psycopg2 is a driver for running queries against PostgreSQL from Python.
SQLAlchemy is an ORM, which is not the same as a database driver. It gives you the flexibility to write code that is not tied to any database-specific dialect; an ORM provides database independence for the programmer. If you write object.save in an ORM, it checks which database is associated with that object and generates an INSERT query suited to the backend database.

Related

Purpose of SQLAlchemy over MYSQL CONNECTOR PYTHON

I am new to working with databases and couldn't find any relevant answers for this.
What are the uses of SQLAlchemy over MySQL Connector for Python?
I do not have much experience with MySQL Connector for Python. However, from what I know, SQLAlchemy primarily uses ORM (Object-Relational Mapping) to abstract the details of handling the database. This can help avoid some errors (and possibly introduce others). You might want to have a look at the ORM technique and see if it is for you (but don't use it as a way to avoid learning SQL). Generally, ORMs tend not to be as scalable as raw SQL either.
I am also a newbie. In my understanding, SQLAlchemy is an ORM (Object-Relational Mapping) that lets you abstract the database and query data more easily from your programming language, treating query results as ordinary objects. The advantage is that you can more easily switch your database under the hood. But it has some learning curve.
Whereas MySQL Connector is "just" a plain, direct connection to the DBMS, and you write SQL queries to get the data.
For now I am sticking with MySQL Connector so that I can practice SQL queries more. But later on I will definitely test out SQLAlchemy.

what is difference between raw sql queries & normal sql queries?

I am fairly new to SQL and databases, and I am currently developing a website with the Django framework.
While reading the Django documentation I came across raw SQL queries, which are executed using Manager.raw() as below.
for p in Person.objects.raw('SELECT * FROM myapp_person'):
    print(p)
Manager.raw(raw_query, params=None, translations=None)
How do raw queries differ from normal SQL queries, and when should I use raw SQL queries instead of the Django ORM?
Django (like other similar ORM tools) is a connection between relational databases and object-oriented programming. One of the very important functions that it implements is providing a uniform interface to the database -- regardless of the underlying database.
When you use underlying Django functionality, the code should be supported on any database (there may be specific limits on this). This makes it particularly easy to port to another database. It also helps ensure that the generated queries do what you intend.
When you use raw SQL, the code is likely to be specific to one database (creating a porting problem). The code is also not checked, which can result in hard-to-understand errors.
I have a strong preference for using SQL directly -- but that is because I am not a programmer using an ORM framework. If you are going to use such a framework, it is probably better to use the built-in functionality wherever possible.
This is a borderline opinion question so it might get flagged, but it is a good point. Essentially, raw SQL queries are intended to be used only for the edge cases where the Django ORM does not fulfil your needs (and with each new version of Django it supports more and more query types, so raw becomes less useful).
In general I would suggest using the ORM for the more helpful error messages, maintainability, and plain ease of use, and only using raw as a last resort.

Searching a MySQL Database on ClearDB using Python script

We have built a series of forms using PHP that populate a MySQL database and, after learning more Python, want to begin transitioning our whole web app over to a Python back end instead of PHP. Can anyone offer a quick intro to searching a MySQL DB using Python?
The easiest way is to abstract the database using an ORM. I found that the Python package SQLAlchemy works fantastically.
An ORM lets you treat the database objects as normal Python objects. Most queries can be abstracted, and for 99% of cases you won't need to write SQL queries. It also makes transitioning between database technologies very simple.
Go check it out on:
http://www.sqlalchemy.org/
A search query would be something like:
session.query(User).filter_by(first_name='bob')
Iterating over this query (or calling .all() on it) gives you all users with the first name 'bob'.
You search it the same way you would in PHP; because the queries you are using will not change.
The only difference is the driver you have to use.

Coming to Python from Perl, I'm wondering if there's something like DBI for Python?

In Perl, the DBI module is the standard way of interacting with DBs, where each DB vendor provides its own DBD module which is used by the DBI. (It's somewhat similar to JDBC.) I can't figure out if a similar model exists in Python. In the case of Postgres, I see there are pg and pgdb modules, where pgdb follows DB-API 2.0 and pg doesn't. Should I care about that? If I go with pgdb, should I expect the same interface from a MySQL DB module that follows DB-API 2.0?
Thank you!
A popular module for interacting with Postgres in Python which is DB API 2.0 compliant is psycopg2 (http://initd.org/psycopg/docs/index.html).
That's the one I always use in my Python code to interact with Postgres. I find it straightforward to use, and it offers some nice extras that are fairly easy to add, such as dictionary-based cursors (i.e. DictCursor, where the rows are in a dictionary with the column names as keys, as opposed to an array).
There are also named cursors: all you have to do is supply a cursor with a name, and psycopg2 will automatically create a server-side cursor for you with a default chunk size of 2000. You can iterate over it like any other Python iterable, with the subsequent fetches going on transparently in the background.
Yes, Python DBAPI 2.0 is the standard API for interacting with databases in Python. Note, though, that DBAPI is a very simple, low-level interface. By itself, it does not make it easy to write queries that are portable across databases, because different databases implement SQL differently.
For a higher-level interface that does help you write portable database applications, you can check out SQLAlchemy. Both SQLAlchemy Core and the ORM provide a language for querying databases in a portable way.
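To illustrate the DB-API 2.0 pattern discussed in these answers, here is a minimal sketch using the standard library's sqlite3 module, chosen only because it needs no server. Drivers such as psycopg2 follow the same connect / cursor / execute / fetch sequence, although the arguments to connect() and the parameter placeholder style (`?` here, `%s` in psycopg2) differ from driver to driver. The person table is hypothetical.

```python
import sqlite3


def fetch_names(conn, min_id):
    """Return names of rows with id >= min_id, using only DB-API calls."""
    cur = conn.cursor()
    try:
        # Parameters are passed separately so the driver does the quoting.
        cur.execute(
            "SELECT name FROM person WHERE id >= ? ORDER BY id", (min_id,)
        )
        return [row[0] for row in cur.fetchall()]
    finally:
        cur.close()


# Set up a throwaway in-memory database with a hypothetical table.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
cur.executemany(
    "INSERT INTO person (id, name) VALUES (?, ?)",
    [(1, "ann"), (2, "bob"), (3, "cy")],
)
conn.commit()
cur.close()

names = fetch_names(conn, 2)
print(names)
```

Each DB-API module also exposes module-level attributes such as `apilevel` and `paramstyle`, which is one way to check what a given driver expects before porting queries between drivers.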

cx_Oracle and the data source paradigm

There is a Java paradigm for database access implemented in the Java DataSource. This object creates a useful abstraction around the creation of database connections. The DataSource object keeps the database configuration, but only creates database connections on request. This allows you to keep all database configuration and initialization code in one place, and makes it easy to change the database implementation or use a mock database for testing.
I am currently working on a Python project which uses cx_Oracle. In cx_Oracle, one gets a connection directly from the module:
import cx_Oracle as dbapi
connection = dbapi.connect(connection_string)
# At this point I am assuming that a real connection has been made to the database.
# Is this true?
I am trying to find a parallel to the DataSource in cx_Oracle. I can easily create this by creating a new class and wrapping cx_Oracle, but I was wondering if this is the right way to do it in Python.
You'll find relevant information of how to access databases in Python by looking at PEP-249: Python Database API Specification v2.0. cx_Oracle conforms to this specification, as do many database drivers for Python.
In this specification a Connection object represents a database connection, but there is no built-in pooling. Tools such as SQLAlchemy do provide pooling facilities, and although SQLAlchemy is often billed as an ORM, it does not have to be used as such and offers nice abstractions for use on top of SQL engines.
If you do want to do object-relational-mapping, then SQLAlchemy does the business, and you can consider either its own declarative syntax or another layer such as Elixir which sits on top of SQLAlchemy and provides increased ease of use for more common use cases.
I don't think there is a "right" way to do this in Python, except maybe to go one step further and use another layer between yourself and the database.
Depending on the reason for wanting to use the DataSource concept (which I've only ever come across in Java), SQLAlchemy (or something similar) might solve the problems for you, without you having to write something from scratch.
If that doesn't fit the bill, writing your own wrapper sounds like a reasonable solution.
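As a rough idea of what such a wrapper could look like, here is a minimal sketch of a DataSource-style class: it stores the connection settings and only opens a connection when asked. The class and method names are hypothetical, not part of cx_Oracle or any library, and sqlite3 stands in for the real driver so that the sketch runs without a database server.

```python
import sqlite3
from contextlib import contextmanager


class DataSource:
    """Hypothetical wrapper: keeps settings, connects only on request."""

    def __init__(self, connect, *args, **kwargs):
        # 'connect' is any DB-API connect function, e.g. cx_Oracle.connect.
        self._connect = connect
        self._args = args
        self._kwargs = kwargs

    def get_connection(self):
        # The real connection is created here, not in __init__.
        return self._connect(*self._args, **self._kwargs)

    @contextmanager
    def connection(self):
        """Yield a connection; commit on success, roll back on error."""
        conn = self.get_connection()
        try:
            yield conn
            conn.commit()
        except Exception:
            conn.rollback()
            raise
        finally:
            conn.close()


# sqlite3 is a stand-in; with cx_Oracle this would be something like
# DataSource(cx_Oracle.connect, connection_string).
source = DataSource(sqlite3.connect, ":memory:")
with source.connection() as conn:
    cur = conn.cursor()
    cur.execute("SELECT 1 + 1")
    answer = cur.fetchone()[0]
print(answer)
```

Because the connect function is passed in, a test can supply a fake or an in-memory database instead of the production driver, which is the main benefit the DataSource idea is after.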
Yes, Python has a similar abstraction.
This is from our local build regression test, where we ensure that we can talk to all of our databases whenever we build a new Python.
if database == SYBASE:
    import Sybase
    conn = Sybase.connect('sybasetestdb', 'mh', 'secret')
elif database == POSTGRESQL:
    import pgdb
    conn = pgdb.connect('pgtestdb:mh:secret')
elif database == ORACLE:
    import cx_Oracle
    conn = cx_Oracle.connect("mh/secret@oracletestdb")
# From here on the code is the same for every driver.
curs = conn.cursor()
curs.execute('select a,b from testtable')
for row in curs.fetchall():
    print(row)
(Note: this is the simple version; in our multi-database-aware code we have a dbconnection class that contains this logic.)
I just sucked it up and wrote my own. It allowed me to add things like abstracting the database (Oracle/MySQL/Access/etc), adding logging, error handling with transaction rollbacks, etc.
