Lightweight Object->Database in Python

Lightweight Object->Database in Python - python

I am in need of a lightweight way to store dictionaries of data into a database. What I need is something that:
Creates a database table from a simple type description (int, float, datetime etc)
Takes a dictionary object and inserts it into the database (including handling datetime objects!)
If possible: Can handle basic references, so the dictionary can reference other tables
I would prefer something that doesn't do a lot of magic. I just need an easy way to setup and get data into an SQL database.
What would you suggest? There seems to be a lot of ORM software around, but I find it hard to evaluate them.

SQLAlchemy's SQL expression layer can easily cover the first two requirements. If you also want reference handling then you'll need to use the ORM, but this might fail your lightweight requirement depending on your definition of lightweight.

SQLAlchemy offers an ORM much like django, but does not require that you work within a web framework.

From it's description, perhaps Axiom is a pythonic tool for this .

Seeing as you have mentioned sql, python and orm in your tags, are you looking for Django? Of all the web frameworks I've tried, I like this one the best. You'd be looking at models, specifically. This could be too fancy for your needs, perhaps, but that shouldn't stop you looking at the code of Django itself and learning from it.

Related

Dynamic Scalable Mysql Table

Here is my situation. I used Python, Django and MySQL for a web development.
I have several tables for form posting, whose fields may change dynamically. Here is an example.
Like a table called Article, it has three fields now, called id INT, title VARCHAR(50), author VARCHAR(20), and it should be able store some other values dynamically in the future, like source VARCHAR(100) or something else.
How can I implement this gracefully? Is MySQL be able to handle it? Anyway, I don't want to give up MySQL totally, for that I'm not really familiar with NoSQL databases, and it may be risky to change technique plan in the process of development.
Any ideas welcome. Thanks in advance!

You might be interested in this post about FriendFeed's schemaless SQL approach.
Loosely:
Store documents in JSON, extracting the ID as a column but no other columns
Create new tables for any indexes you require
Populate the indexes via code
There are several drawbacks to this approach, such as indexes not necessarily reflecting the actual data. You'll also need to hack up django's ORM pretty heavily. Depending on your requirements you might be able to keep some of your fields as pure DB columns and store others as JSON?

I've never actually used it, but django-not-eav looks like the tool for the job.
"This app attempts the impossible: implement a bad idea the right way." I already love it :)
That said, this question sounds like a "rethink your approach" situation, for sure. But yes, sometimes that is simply not an option...

python database abstraction to store datastructures unpickled

I am looking for a generic way to store python objects in a database. Of course I could just pickle the objects, but that way I would have binary blobs in my database. That way I can not search my objects. Also it seems to be easier to put it together with other applications.
So in my fantasy, I have on object like
class myClass
data1=1
data2='foobar'
data3=some_html_object
...
and could do something like
mydata=myClass()
mydata.add_data(various_things)
mydata.save_to_database()
and would end up with a database which has colums called data1,data2, data3, where I have the values of the of the objects attributes in the rows stored as text which would be searchable. Of course some inital setup would have to be done.
And of course it would be nice if I could plug any database I want (well, at least not just one database) and would not be bothered with the details.
Now of course I could program my own framework to let me do this, but I was hoping that this has been done bevore by someone else :)
Any suggestions?

Your fantasy in fact exists!
You describe something called the Active Record Pattern. It is usually implemented by using Object-Relational Mapping. One common solution for Python is SQLAlchemy, but Storm is somehow popular too:
See What are some good Python ORM solutions?
If your are developing for the Web, Django possess its own ORM.

It sounds like what you want is an Object Relational Mapper (ORM) to map SQL tables to objects.
The most popular ORMs that support different dialects by community are the following:
Python -- SQLAlchemy, Storm, Django (built into web framework)
Ruby -- ActiveRecord, Sequel
Node -- Sequelize
For a specific example of implementing what you described in Python using SQLAlchemy, check out this blog post that walks through a simple example

PyMongo vs MongoEngine for Django

For one of my projects I prefered using Django+Mongo.
Why should I use MongoEngine, but not just PyMongo? What are advantages? Querying with PyMongo gives results that are allready objects, aren't they? So what is the purpose of MongoEngine?

This is an old question but stumbling across it, I don't think the accepted answer answers the question. The question wasn't "What is MongoEngine?" - it was "Why should I use MongoEngine?" And the advantages of such an approach. This goes beyond Django to Python/Mongo in general. My two cents:
While both PyMongo and MongoEngine do both return objects (which is not wrong), PyMongo returns dictionaries that need to have their keys referenced by string. MongoEngine allows you to define a schema via classes for your document data. It will then map the documents into those classes for you and allow you to manipulate them. Why define a schema for schema-less data? Because in my opinion, its clear, explicit, and much easier to program against. You don't end up with dictionaries scattered about your code where you can't tell what's in them without actually looking at the data or running the program. In the case of MongoEngine and a decent IDE like PyCharm, typing a simple "." after the object will tell you all you need to know via auto-complete. It's also much easier for other developers coming in to examine and learn the data model as they work and will make anybody who hasn't seen the code in a while more productive, quicker.
Additionally, to me, the syntax used to manipulate documents with PyMongo (which is essentially the same as the mongo console) is ugly, error prone, and difficult to maintain.
Here is a basic example of updating a document in MongoEngine, which to me, is very elegant:
BlogPost.objects(id=post.id).update(title='Example Post')
Why use PyMongo? MongoEngine is a layer between you and the bare metal, so it's probably slower, although I don't have any benchmarks. PyMongo is lower level, so naturally you have more control. For simple projects, MongoEngine might be unnecessary. If you're already fluent in Mongo syntax, you may find PyMongo much more intuitive than I do and have no problem writing complex queries and updates. Perhaps you enjoy working directly with dictionaries on that lower level and aren't interested in an additional layer of abstraction. Maybe you're writing a script that isn't part of a big system, and you need it to be as lean and as fast as possible.
There's more to the argument, but I think that's pretty good for the basics.

I assume you have not read the MongoEngine claim.
MongoEngine is a Document-Object
Mapper (think ORM, but for document
databases) for working with MongoDB
from Python.
This basically say it all.
In addition: your claim that Pymongo would deliver objects is wrong....well in Python everything is an object - even a dict is an object...so you are true but not in the sense of having a custom class defined on the application level.
PyMongo is the low-level driver wrapping the MongoDB API into Python and delivering JSON in and out.
MongoEngine or other layers like MongoKit map your MongoDB-based data to objects similar to native Python database drivers + SQLAlchemy as ORM.

Probably way too late, but for anyone else attempting Django+Mongo, Django-nonrel is worth considering.

mongoengine will use the pymongo driver to connect to mongodb.
If you are familiar with django.. use mongoengine

How to persist data between executions in Python

I am working on a personal project in Python where I need some form of persistent data. The data would fit in 2-3 tables of 10-20 columns and 100-200 records each. I have a basic understanding of SQL, so a database seems to make some sense.
I am new to Python, so I am not familiar with the options for database interface from Python. I have also heard about pickling and am not sure if that would be a better solution for my project size. Can anyone recommend a good solution?

Or, if you just want to persist data between executions - for such a small data set you could have a look at the pickle module for persistency, and just load the data into memory during execution.
It's a simple solution - but for a personal project it might be enough.

You should use sqlite3 module for this, it is included in Python.
Also you may want too look for an ORM solution.

This sounds like very few data. An SQL DB might be overkill, especially with an ORM on top. I'd check whether JSON could do the job...

I agree with using sqlite3. It is very easy to use, you don't need to worry about having to set up a database server. You should check out the SQLAlchemy library too.

The real question is really what kind of operations you want to do with your data.
As far as storage possibilities, the simplest solutions are indeed sqlite3 and pickle.
The solution that you will choose depends basically on whether using SQL or Python is the easiest way for you to manage your data. SQL is probably better at complex operations than Python, but Python is definitely more lightweight and simpler, and therefore is a good choice for simple operations. So, if using pickle+Python is too cumbersome, then sqlite3 is a very good choice.

Peewee is another ORM that works with SQLite. It is an alternative to SQLAlchemy. If using SQLite, I would consider Peewee for pet projects and SQLAlchemy for professional work. I typically would not use SQLite directly.

Is this a good approach to avoid using SQLAlchemy/SQLObject?

Rather than use an ORM, I am considering the following approach in Python and MySQL with no ORM (SQLObject/SQLAlchemy). I would like to get some feedback on whether this seems likely to have any negative long-term consequences since in the short-term view it seems fine from what I can tell.
Rather than translate a row from the database into an object:
each table is represented by a class
a row is retrieved as a dict
an object representing a cursor provides access to a table like so:
cursor.mytable.get_by_ids(low, high)
removing means setting the time_of_removal to the current time
So essentially this does away with the need for an ORM since each table has a class to represent it and within that class, a separate dict represents each row.
Type mapping is trivial because each dict (row) being a first class object in python/blub allows you to know the class of the object and, besides, the low-level database library in Python handles the conversion of types at the field level into their appropriate application-level types.
If you see any potential problems with going down this road, please let me know. Thanks.

That doesn't do away with the need for an ORM. That is an ORM. In which case, why reinvent the wheel?
Is there a compelling reason you're trying to avoid using an established ORM?

You will still be using SQLAlchemy. ResultProxy is actually a dictionary once you go for .fetchmany() or similar.
Use SQLAlchemy as a tool that makes managing connections easier, as well as executing statements. Documentation is pretty much separated in sections, so you will be reading just the part that you need.

web.py has in a decent db abstraction too (not an ORM).
Queries are written in SQL (not specific to any rdbms), but your code remains compatible with any of the supported dbs (sqlite, mysql, postresql, and others).
from http://webpy.org/cookbook/select:
myvar = dict(name="Bob")
results = db.select('mytable', myvar, where="name = $name")

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.