For one of my projects I preferred using Django+Mongo.
Why should I use MongoEngine rather than just PyMongo? What are the advantages? Querying with PyMongo gives results that are already objects, aren't they? So what is the purpose of MongoEngine?
This is an old question, but stumbling across it, I don't think the accepted answer answers it. The question wasn't "What is MongoEngine?" - it was "Why should I use MongoEngine?", and what the advantages of such an approach are. This goes beyond Django to Python/Mongo in general. My two cents:
While PyMongo and MongoEngine do both return objects (which is not wrong), PyMongo returns dictionaries whose keys have to be referenced by string. MongoEngine allows you to define a schema via classes for your document data. It will then map the documents into those classes for you and allow you to manipulate them. Why define a schema for schema-less data? Because in my opinion it's clear, explicit, and much easier to program against. You don't end up with dictionaries scattered about your code where you can't tell what's in them without actually looking at the data or running the program. In the case of MongoEngine and a decent IDE like PyCharm, typing a simple "." after the object will tell you all you need to know via auto-complete. It's also much easier for other developers coming in to examine and learn the data model as they work, and it will make anybody who hasn't seen the code in a while productive again more quickly.
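To make that concrete, here is a minimal sketch of what such a class-based schema might look like with MongoEngine (the field names and database name are illustrative assumptions, not anything from the original post):

    # Illustrative MongoEngine document definition; field names are made up.
    from mongoengine import Document, StringField, DateTimeField, connect

    connect('blog')  # connect to a MongoDB database named "blog"

    class BlogPost(Document):
        title = StringField(required=True, max_length=200)
        body = StringField()
        published = DateTimeField()

    post = BlogPost(title='Hello', body='First post').save()
    print(post.title)  # attribute access with auto-complete, not post['title']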
Additionally, to me, the syntax used to manipulate documents with PyMongo (which is essentially the same as in the mongo console) is ugly, error-prone, and difficult to maintain.
Here is a basic example of updating a document in MongoEngine, which, to me, is very elegant:
BlogPost.objects(id=post.id).update(title='Example Post')
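For contrast, a rough sketch of the same update written directly against the PyMongo driver (the database and collection names are assumed to match the example above):

    # Roughly equivalent update with the raw PyMongo driver.
    from pymongo import MongoClient

    client = MongoClient()   # defaults to localhost:27017
    db = client['blog']      # database name assumed
    db.blog_post.update_one(             # collection name assumed
        {'_id': post_id},                # post_id assumed to hold the ObjectId
        {'$set': {'title': 'Example Post'}},
    )

Every key here is a plain string, so a typo in a field name is only caught when the data looks wrong, not by the code itself.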
Why use PyMongo? MongoEngine is a layer between you and the bare metal, so it's probably slower, although I don't have any benchmarks. PyMongo is lower level, so naturally you have more control. For simple projects, MongoEngine might be unnecessary. If you're already fluent in Mongo syntax, you may find PyMongo much more intuitive than I do and have no problem writing complex queries and updates. Perhaps you enjoy working directly with dictionaries on that lower level and aren't interested in an additional layer of abstraction. Maybe you're writing a script that isn't part of a big system, and you need it to be as lean and as fast as possible.
There's more to the argument, but I think that's pretty good for the basics.
I assume you have not read MongoEngine's own description:
MongoEngine is a Document-Object Mapper (think ORM, but for document databases) for working with MongoDB from Python.
This basically says it all.
In addition: your claim that PyMongo delivers objects is wrong... well, in Python everything is an object - even a dict is an object - so you are right in that sense, but not in the sense of having a custom class defined at the application level.
PyMongo is the low-level driver wrapping the MongoDB API into Python, delivering JSON-like documents (Python dicts) in and out.
MongoEngine and other layers like MongoKit map your MongoDB-based data to objects, much as SQLAlchemy provides an ORM on top of the native Python database drivers.
Probably way too late, but for anyone else attempting Django+Mongo, Django-nonrel is worth considering.
MongoEngine uses the PyMongo driver under the hood to connect to MongoDB.
If you are familiar with Django, use MongoEngine.
Now I want to use MongoDB as my Python website's backend storage, but I am wondering whether it's necessary to use an ODM such as MongoEngine, or whether I should just use the MongoDB Python driver directly.
Any good advice?
Is it strictly necessary? No - you can use the Python driver directly without an ODM in the middle. If you prefer defining schemas and models to crafting/modifying your own schema via normal database operations, then an ODM is probably something you should look into.
A lot of people got used to using this kind of solution when mapping their development data model into a relational database (in that case an ORM). Because the MongoDB document model more closely maps to an object in your code (for example), you may feel you no longer need this mapping.
It can still be convenient though (as you can see from the popularity of mongoengine, mongoid, morphia and others) - the choice, in the end, is yours.
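For what it's worth, working without an ODM really is as direct as handing dicts to the driver; a minimal, hypothetical sketch (database and collection names are made up):

    # The document model maps straight onto a plain Python dict.
    from pymongo import MongoClient

    tasks = MongoClient()['app']['tasks']  # database/collection names assumed
    tasks.insert_one({'title': 'write docs', 'done': False})
    task = tasks.find_one({'done': False})
    print(task['title'])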
I wonder whether there is a solution for (or a need for) an ORM with a graph database (e.g. Neo4j). I'm tracking relationships (A is related to B, which is related to A via C, etc., thus constructing a large graph) between entities (including additional attributes for those entities) and need to store them in a DB, and I think a graph database would fit this task perfectly.
Now, with SQL-like DBs, I use SQLAlchemy's ORM to store my objects, especially because I can retrieve objects from the DB and work with them in a Pythonic style (use their methods, etc.).
Is there any object-mapping solution for Neo4j or another graph DB, so that I can store and retrieve Python objects into and from the graph DB and work with them easily?
Or would you write your own functions or adapters, as in the Python sqlite3 documentation (http://docs.python.org/library/sqlite3.html#letting-your-object-adapt-itself), to store and retrieve objects?
Shameless plug... there is also my own ORM, which you may want to check out: https://github.com/robinedwards/neomodel
It's built on top of py2neo, using Cypher and REST API calls under the hood, i.e. no dependency on Gremlin.
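A rough sketch of what a model looks like with neomodel (the node and relationship names are illustrative, and the connection setup follows the current documentation rather than any particular release):

    # Illustrative neomodel node definition; names are made up.
    from neomodel import StructuredNode, StringProperty, RelationshipTo, config

    config.DATABASE_URL = 'bolt://neo4j:password@localhost:7687'  # assumed local setup

    class Person(StructuredNode):
        name = StringProperty(unique_index=True)
        knows = RelationshipTo('Person', 'KNOWS')

    alice = Person(name='Alice').save()
    bob = Person(name='Bob').save()
    alice.knows.connect(bob)  # store a relationship between two entities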
There are a couple of choices in Python out there right now, based on the databases' REST interfaces.
As I mentioned in the link @Peter provided, we're working on neo4django, which updates the old Neo4j/Django integration. It's a good choice if you need complex queries and want an ORM that will manage node indexing as well, or if you're already using Django. It works very similarly to the native Django ORM. Find it on PyPI or GitHub.
There's also a more general solution called Bulbflow that is supposed to work with any graph database supported by Blueprints. I haven't used it, but from what I've seen it focuses on domain modeling - Bulbflow already has working relationship models, for example, which we're still working on - but doesn't have much support for complex querying (as we do with Django querysets + index use). It also lets you work a bit closer to the graph.
Maybe you could take a look at Bulbflow, which allows you to create models in Django, Flask or Pyramid. However, it works over a REST client instead of the Python bindings provided by Neo4j, so perhaps it's not as fast as the native bindings are.
I have a script with several functions that all need to make database calls. I'm trying to get better at writing clean code rather than just throwing together scripts with horrible style. What is generally considered the best way to establish a global database connection that can be accessed anywhere in the script, but is not susceptible to errors such as accidentally redefining the variable holding the connection? I'd imagine I should be putting everything in a module? Any links to actual code would be very useful as well. Thanks.
If you are working with Python and databases, you cannot afford not to look at SQLAlchemy:
SQLAlchemy is the Python SQL toolkit and Object Relational Mapper that gives application developers the full power and flexibility of SQL. It provides a full suite of well known enterprise-level persistence patterns, designed for efficient and high-performing database access, adapted into a simple and Pythonic domain language.
I have built very complex databases with a surprisingly small amount of code (a few hundred lines). The schema definition is almost self-documenting, the objects used for the Object Relational Mapper are Plain Old Python Objects (i.e., what you already have), and the querying API is almost obvious. In addition, the documentation is excellent: many online examples, fully documented API, and an O'Reilly book which, while far from perfect, does take you from zero to dangerous in a few evenings.
If you don't want to use the Object Relational Mapper, you can always fall back to plain connections and literal SQL. Also, the code is portable and database independent (the same code will work with MySQL, Oracle, SQLite, and other database managers).
The Session object will automatically take care of the pooling (what you mention as your concern).
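As a minimal sketch of that setup (the module, table and class names are illustrative), one module can own the engine and session factory while the rest of the script just imports Session:

    # db.py - one module owns the engine and the session factory.
    from sqlalchemy import create_engine, Column, Integer, String
    from sqlalchemy.orm import declarative_base, sessionmaker

    engine = create_engine('sqlite:///app.db')  # pooling is handled by the engine
    Session = sessionmaker(bind=engine)
    Base = declarative_base()

    class User(Base):
        __tablename__ = 'users'
        id = Column(Integer, primary_key=True)
        name = Column(String, nullable=False)

    Base.metadata.create_all(engine)

    # elsewhere in the script: one session per unit of work
    session = Session()
    session.add(User(name='Alice'))
    session.commit()
    session.close()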
The best way to understand its power is probably to follow the tutorials that come up on the first page of results for a Google search for sqlalchemy tutorial.
Use a model system/ORM system.
I am working on a personal project in Python where I need some form of persistent data. The data would fit in 2-3 tables of 10-20 columns and 100-200 records each. I have a basic understanding of SQL, so a database seems to make some sense.
I am new to Python, so I am not familiar with the options for database interface from Python. I have also heard about pickling and am not sure if that would be a better solution for my project size. Can anyone recommend a good solution?
Or, if you just want to persist data between executions - for such a small data set, you could have a look at the pickle module for persistence, and just load the data into memory during execution.
It's a simple solution - but for a personal project it might be enough.
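A minimal sketch of that pattern (the file name and data layout are arbitrary): load the pickle at startup, work on the objects in memory, and dump them back before exiting.

    # Simple pickle-based persistence between runs.
    import pickle
    from pathlib import Path

    DATA_FILE = Path('data.pickle')

    def load_data():
        if DATA_FILE.exists():
            with DATA_FILE.open('rb') as f:
                return pickle.load(f)
        return {}  # first run: start with an empty structure

    def save_data(data):
        with DATA_FILE.open('wb') as f:
            pickle.dump(data, f)

    records = load_data()
    records.setdefault('users', []).append({'name': 'Alice', 'age': 30})
    save_data(records)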
You should use the sqlite3 module for this; it is included in Python's standard library.
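A quick, illustrative example (the table and column names are made up) - the whole database lives in a single file and needs no server:

    # sqlite3 ships with Python's standard library.
    import sqlite3

    conn = sqlite3.connect('project.db')
    conn.execute("""
        CREATE TABLE IF NOT EXISTS measurements (
            id INTEGER PRIMARY KEY,
            taken_at TEXT,
            value REAL
        )
    """)
    conn.execute("INSERT INTO measurements (taken_at, value) VALUES (?, ?)",
                 ('2013-01-01', 42.0))
    conn.commit()

    for row in conn.execute("SELECT taken_at, value FROM measurements"):
        print(row)
    conn.close()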
Also, you may want to look for an ORM solution.
This sounds like very little data. An SQL DB might be overkill, especially with an ORM on top. I'd check whether JSON could do the job...
I agree with using sqlite3. It is very easy to use, and you don't need to worry about setting up a database server. You should check out the SQLAlchemy library too.
The real question is what kind of operations you want to do with your data.
As far as storage possibilities, the simplest solutions are indeed sqlite3 and pickle.
The solution that you will choose depends basically on whether using SQL or Python is the easiest way for you to manage your data. SQL is probably better at complex operations than Python, but Python is definitely more lightweight and simpler, and therefore is a good choice for simple operations. So, if using pickle+Python is too cumbersome, then sqlite3 is a very good choice.
Peewee is another ORM that works with SQLite. It is an alternative to SQLAlchemy. If using SQLite, I would consider Peewee for pet projects and SQLAlchemy for professional work. I typically would not use SQLite directly.
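For a sense of what that looks like, here is a small, hypothetical Peewee model backed by SQLite (the field names are made up):

    # Illustrative Peewee model on top of SQLite.
    from peewee import SqliteDatabase, Model, CharField, IntegerField

    db = SqliteDatabase('pets.db')

    class Pet(Model):
        name = CharField()
        age = IntegerField(default=0)

        class Meta:
            database = db  # bind the model to the SQLite file above

    db.connect()
    db.create_tables([Pet])
    Pet.create(name='Rex', age=3)
    for pet in Pet.select().where(Pet.age > 1):
        print(pet.name)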
I am in need of a lightweight way to store dictionaries of data into a database. What I need is something that:
Creates a database table from a simple type description (int, float, datetime etc)
Takes a dictionary object and inserts it into the database (including handling datetime objects!)
If possible: Can handle basic references, so the dictionary can reference other tables
I would prefer something that doesn't do a lot of magic. I just need an easy way to setup and get data into an SQL database.
What would you suggest? There seems to be a lot of ORM software around, but I find it hard to evaluate them.
SQLAlchemy's SQL expression layer can easily cover the first two requirements. If you also want reference handling then you'll need to use the ORM, but this might fail your lightweight requirement depending on your definition of lightweight.
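A sketch of how the first two requirements might look with the expression layer (the table, columns and connection URL are assumptions):

    # Table built from simple types; rows inserted straight from dicts.
    import datetime
    from sqlalchemy import (create_engine, MetaData, Table, Column,
                            Integer, Float, String, DateTime)

    engine = create_engine('sqlite:///data.db')
    metadata = MetaData()

    events = Table('events', metadata,
                   Column('id', Integer, primary_key=True),
                   Column('name', String(50)),
                   Column('score', Float),
                   Column('created', DateTime))

    metadata.create_all(engine)

    row = {'name': 'run', 'score': 1.5, 'created': datetime.datetime.now()}
    with engine.begin() as conn:  # datetime values handled by the DateTime type
        conn.execute(events.insert(), [row])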
SQLAlchemy offers an ORM much like Django's, but does not require that you work within a web framework.
From its description, perhaps Axiom is a Pythonic tool for this.
Seeing as you have mentioned sql, python and orm in your tags, are you looking for Django? Of all the web frameworks I've tried, I like this one the best. You'd be looking at models, specifically. This could be too fancy for your needs, perhaps, but that shouldn't stop you from looking at the code of Django itself and learning from it.