How can I use AWS's DynamoDB with Django? - python

I am developing web applications, APIs, and backends using the Django MVC framework. A major aspect of Django is its implementation of an ORM for models, and it is an exceptionally good ORM. Typically, one uses an existing interface that maps one's Django models to a specific DBMS such as Postgres, MySQL, or Oracle.
I have specific performance and scalability requirements, so I really want to use AWS's DynamoDB: it is highly cost-efficient, very performant, and scales well.
While I think Django allows one to implement one's own DBMS interface, it is clearly advantageous to use an existing interface, if one exists, when constructing one's Django models.
Can someone recommend a Django model interface I can use to construct a model in Django backed by AWS's DynamoDB?
How about one using MongoDB?

As others have written, Django does not have NoSQL DBMS support, but there are third-party packages.
PynamoDB seems fine, but I have never used it, so I can't recommend it. In all the use cases I came across, boto3 was sufficient. Setup is pretty simple, but the devil is in the details (in the data structure and how nested it is, to be precise). Basically, three steps are needed:
1. Connect to the DB and perform the operation you want (boto3).
2. Parse the incoming data into a Python dictionary (e.g. with dynamodb-json, boto3.dynamodb.types.TypeDeserializer, or your own code).
3. Do the business logic and store the data in a relational DB using the Django ORM, or whatever else you need.
Simplest example:
import logging

import boto3
from botocore.exceptions import ClientError
from dynamodb_json import json_util as dynamodb_json

from .models import YourModel

logger = logging.getLogger(__name__)

def get(request, partition_key):
    table = boto3.resource(
        'dynamodb',
        aws_access_key_id=...,      # fill in your credentials
        aws_secret_access_key=...,
        region_name=...,
    ).Table(some_table_name)  # the name of your DynamoDB table
    try:
        # The dict key must be the table's partition key attribute name.
        response = table.get_item(
            Key={'partition_key': partition_key})
    except ClientError as e:
        logger.warning(e.response['Error']['Message'])
    else:
        data_str = response['Item']
        _data_dict = dynamodb_json.loads(data_str)
        # Validation and modification of incoming data goes here.
        data_dict = validation_and_modification(_data_dict)
        # Then you can do whatever you need, for example:
        obj, created = YourModel.objects.update_or_create(**data_dict)
        ...
Examples for create, delete, list and update views can be found in the serverless repo.
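One detail worth noting for step 2: the Table resource used above already returns items as native Python types, so the extra parsing mainly matters when you use the low-level client, which returns DynamoDB's wire format. A minimal sketch of deserializing such a response with TypeDeserializer (table and key names here are placeholders):
import boto3
from boto3.dynamodb.types import TypeDeserializer

client = boto3.client('dynamodb', region_name='us-east-1')
# Low-level items look like {'name': {'S': 'foo'}, 'count': {'N': '3'}}
response = client.get_item(
    TableName='your_table',                   # placeholder
    Key={'pk': {'S': 'some-partition-key'}})  # placeholder key schema

deserializer = TypeDeserializer()
item = {k: deserializer.deserialize(v)
        for k, v in response['Item'].items()}
# item is now a plain dict, e.g. {'name': 'foo', 'count': Decimal('3')}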

It's not a ready-made battery for Django, but it's worth looking at regardless.
https://github.com/pynamodb/PynamoDB
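For a taste of its API, here is a minimal model sketch closely following PynamoDB's own documentation (the table and attribute names are illustrative):
from pynamodb.models import Model
from pynamodb.attributes import NumberAttribute, UnicodeAttribute

class Thread(Model):
    class Meta:
        table_name = 'Thread'    # illustrative table name
        region = 'us-east-1'

    forum_name = UnicodeAttribute(hash_key=True)
    subject = UnicodeAttribute(range_key=True)
    views = NumberAttribute(default=0)

# Usage feels Django-ish, but talks to DynamoDB:
thread = Thread('forum-slug', 'subject-slug', views=1)
thread.save()
same_thread = Thread.get('forum-slug', 'subject-slug')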

You can try DynamORM or PynamoDB. I haven't tried them myself, but maybe they can help.

DynamoDB is non-relational, which I think makes it architecturally incompatible with an ORM like Django's.

There is no Django model interface for AWS DynamoDB, but you can retrieve data from that kind of DB using the boto3 library provided by AWS.

Related

Cleanest way to make an ORM for neo4j + sql in python flask? One model over 2 databases

How can I create one model that talks to two databases in Flask, where one is, say, sqlite, and the other is specifically neo4j?
I'd like to have login and password stuff in a traditional db, and keep other graphy information in neo4j. I'm told neo4j is bad for things that need large graph traversals. Perhaps I'm wrong in needing this, but I have an instance where I'd like to say something like...
"return a dict(person.x,person.y,person.z) from all nodes where type==person", and then feed that into the view of my index page.
I've seen related questions about ORMs with neo4j:
ORM with Graph-Databases like Neo4j in Python
...and this about multiple DBs in Flask:
http://packages.python.org/Flask-SQLAlchemy/binds.html
Specifically, I see this taking the form of my create statement writing to the SQLite DB connection and then writing a key from there to the additional relationship information in neo4j.
I've recently released an OGM (Object-Graph Mapping) module for py2neo (http://book.py2neo.org/en/latest/ogm.html). This might help with what you're trying to do.
Otherwise, you could also look at neomodel (https://github.com/robinedwards/neomodel). It's written for Django but should be usable in Flask too.
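As an illustration, a minimal neomodel sketch (hedged: the class and property names are mine, and the query shown uses neomodel's documented nodes manager):
from neomodel import StructuredNode, StringProperty, RelationshipTo

class Person(StructuredNode):
    name = StringProperty(unique_index=True)
    friends = RelationshipTo('Person', 'FRIEND')

# Something like "return person.name from all nodes where type==person":
people = [{'name': p.name} for p in Person.nodes.all()]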
I don't know about mixed backend models, but I think depending on your user count, you can use neo4j for your users, too. If you put the user nodes into an index, you can get all users without searching the graph.
If you find that this actually is a bottleneck, migrating it to a split storage should not be too difficult.
It is not that hard to adapt neo4j-driver and py2neo to use e.g. Flask-Login.
I have used py2neo for that and it worked well, but I have now migrated to neo4j-driver.
The downside is that I did not manage to get it working with e.g. SQLAlchemy.
Using a double-backend solution is not a problem; in an earlier project I used SQLAlchemy with SQLite3 and PostgreSQL, Neo4j, and Redis together.
Using that, I found no issues other than some design issues.

Tasty-Pie - pulling in related fields, without using full=True?

I have an app that uses Python Requests to query a Tastypie-enabled Django app.
I have a model called Application, with a corresponding Tasty-Pie resource.
This model/Resource has several foreign keys that link Application to other models (e.g. Binary, Host, Colocation etc.)
I'm using a Tasty-Pie filter to get a subset of Applications, then I want to print a nice table of Applications, along with some fields from those related models.
Right now, I'm using the following to get a table of Applications:
def get_applications(self, parsed_args):
    r = requests.get('http://foobar.com:8000/api/v1/application/?name__iregex={0}&format=json'.format(parsed_args.applications))
    print(r.url)
    return r

def application_iter(self, parsed_args):
    # In requests >= 1.0, .json is a method, not a property.
    for application in self.get_applications(parsed_args).json()['objects']:
        yield (application['name'], application['author'], application['some_other_field'])

def take_action(self, parsed_args):
    return (('Name', 'Author', 'Some Other Field'),
            self.application_iter(parsed_args),
            )
My question is, what is the "recommended", or idiomatic way of pulling in all the related fields? Is there a way to extend the above to do this?
I get the impression that full=True is bad practice, and that using resource URIs is a better way.
How can I do this whilst minimising the number of requests and DB hits?
Cheers,
Victor
Why do you think that full=True is bad?
https://django-tastypie.readthedocs.org/en/latest/resources.html#why-resource-uris
Ideology aside, you should use whatever suits you. If you prefer fewer requests & fewer endpoints, use of full=True is available, but be aware of the consequences of each approach.
You can do whatever you like if it can be read cleanly and if it does what you want. full=True is there to be used by developers.
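To make the trade-off concrete, here is a hedged sketch of a resource declaring a related field with full=True (HostResource and the module paths are assumptions):
from tastypie import fields
from tastypie.resources import ModelResource
from myapp.models import Application  # assumed module path

class ApplicationResource(ModelResource):
    # full=True embeds the related resource's fields in each response;
    # with the default full=False you get only the resource URI.
    host = fields.ForeignKey('myapp.api.HostResource', 'host', full=True)

    class Meta:
        queryset = Application.objects.all()
        resource_name = 'application'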

ORM with Graph-Databases like Neo4j in Python

I wonder whether there is a solution (or a need) for an ORM with a graph database (e.g. Neo4j). I'm tracking relationships (A is related to B which is related to A via C etc., thus constructing a large graph) of entities (including additional attributes for those entities) and need to store them in a DB, and I think a graph database would fit this task perfectly.
Now, with SQL-like DBs, I use SQLAlchemy's ORM to store my objects, especially because I can retrieve objects from the DB and work with them in a Pythonic style (use their methods, etc.).
Is there any object-mapping solution for Neo4j or another graph DB, so that I can store and retrieve Python objects into and from the graph DB and work with them easily?
Or would you write some functions or adapters, like in the Python sqlite documentation (http://docs.python.org/library/sqlite3.html#letting-your-object-adapt-itself), to retrieve and store objects?
Shameless plug... there is also my own ORM which you may want to check out: https://github.com/robinedwards/neomodel
It's built on top of py2neo, using Cypher and REST API calls under the hood, i.e. no dependency on Gremlin.
There are a couple of choices in Python out there right now, based on databases' REST interfaces.
As I mentioned in the link @Peter provided, we're working on neo4django, which updates the old Neo4j/Django integration. It's a good choice if you need complex queries and want an ORM that manages node indexing as well, or if you're already using Django. It works very similarly to the native Django ORM (a small model sketch follows at the end of this answer). Find it on PyPI or GitHub.
There's also a more general solution called Bulbflow that is supposed to work with any graph database supported by Blueprints. I haven't used it, but from what I've seen it focuses on domain modeling (Bulbflow already has working relationship models, for example, which we're still working on) but doesn't have much support for complex querying (as we do with Django querysets plus index use). It also lets you work a bit closer to the graph.
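The neo4django sketch mentioned above, roughly following its README (hedged; the class and property names are illustrative):
from neo4django.db import models

class Person(models.NodeModel):
    name = models.StringProperty()
    age = models.IntegerProperty()

# Queries mimic the Django ORM:
# Person.objects.create(name='Alice', age=30)
# Person.objects.filter(name='Alice')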
Maybe you could take a look at Bulbflow, which allows you to create models in Django, Flask, or Pyramid. However, it works over a REST client instead of the Python binding provided by Neo4j, so it's perhaps not as fast as the native binding.

Setting up Pyramid to use MySQL raw instead of SQLAlchemy

We're trying to set up a Pyramid project that will use raw MySQL instead of SQLAlchemy.
My experience with Pyramid/Python is limited, so I was hoping to find a guide online. Unfortunately, I haven't been able to find anything to push us in the right direction. Most search results were for people trying to use raw SQL/MySQL commands with SQLAlchemy (many were re-posted links).
Anyone have a useful tutorial on this?
Pyramid at its base does not assume that you will use any one specific library to help you with your persistence. To make things easier for people who DO wish to use libraries such as SQLAlchemy, the Pyramid library contains scaffolding, which is essentially auto-generated code for a basic site, with some additions to set up items like SQLAlchemy or a specific routing strategy. The Pyramid documentation should be able to lead you through creating a new project using the "pyramid_starter" scaffolding, which sets up a basic site without SQLAlchemy.
This will give you the basics you need to set up your views, but next you will need to add code to connect to a database. Luckily, since your site is just Python code, learning how to use MySQL in Pyramid is simply learning how to use MySQL in Python, and then doing the exact same steps within your Pyramid project.
Keep in mind that even if you'd rather use raw SQL queries, you might still find some use for SQLAlchemy. At its base level, SQLAlchemy simply wraps the DBAPI calls and adds useful features like connection pooling. The ORM functionality is actually a large addition on top of the lower-level SQLAlchemy toolset.
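As a concrete illustration of that point, a minimal Pyramid view using a plain DB-API driver (PyMySQL here is an assumption; mysqlclient or MySQLdb work the same way, and real connection settings would come from your Pyramid configuration):
import pymysql

def list_foo(request):
    # Connection parameters are placeholders; read them from
    # request.registry.settings in a real application.
    conn = pymysql.connect(host='localhost', user='app',
                           password='secret', database='appdb')
    try:
        with conn.cursor() as cursor:
            cursor.execute("SELECT id, name FROM foo")
            rows = cursor.fetchall()
    finally:
        conn.close()
    return {'rows': rows}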
SQLAlchemy does not make any assumption that you will be using its ORM. If you wish to use plain SQL, you can do so with nothing more than what SQLAlchemy already provides. For instance, if you followed the recipe in the cookbook, you would have access to the SQLAlchemy session object as request.db, and your handler would look something like this:
def someHandler(request):
    rows = request.db.execute("SELECT * FROM foo").fetchall()
The Quick Tutorial shows a Pyramid application that uses SQL but not SQLAlchemy. It uses SQLite, but should be reasonably easy to adapt for MySQL.

Converting Django project from MySQL to Mongo, any major pitfalls?

I want to try MongoDB with mongoengine. I'm new to Django and databases, and I'm having a fit with foreign keys, joins, circular imports (you name it). I know I could eventually work through these issues, but Mongo just seems like a simpler solution for what I am doing. My question is: I'm using a lot of pluggable apps (ImageKit, Haystack, Registration, etc.), and I want to know whether these apps will continue to work if I make the switch. Are there any known headaches that I will encounter? If so, I might just keep banging my head against MySQL.
There's no reason why you can't use one of the standard RDBMSs for all the standard Django apps, and then Mongo for your app. You'll just have to replace all the standard ways of processing things from the Django ORM with doing it the Mongo way.
So you can keep urls.py and its neat pattern matching, views will still get parameters, and templates can still take objects.
You'll lose querysets because I suspect they are too closely tied to the RDBMS models - but they are just lazily evaluated lists really. Just ignore the Django docs on writing models.py and code up your database business logic in a Mongo paradigm.
Oh, and you won't have the Django Admin interface for easy access to your data.
You might want to check out django-nonrel, which is a young but promising attempt at a NoSQL backend for Django. Documentation is lacking at the moment, but it works great if you just work it out.
I've used mongoengine with Django, but you need to create a file like mongo_models.py, for example. In that file you define your Mongo documents. You then create forms to match each Mongo document. Each form has a save method which inserts or updates what's stored in Mongo. Django forms are designed to plug into any data backend (with a bit of craft).
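A minimal sketch of that pattern (the document, form, and field names are illustrative, and it assumes a local mongod):
# mongo_models.py
import mongoengine
from django import forms

mongoengine.connect('mydb')  # assumed local MongoDB instance

class Article(mongoengine.Document):
    title = mongoengine.StringField(required=True)
    body = mongoengine.StringField()

class ArticleForm(forms.Form):
    title = forms.CharField()
    body = forms.CharField(widget=forms.Textarea, required=False)

    def save(self):
        # Insert into Mongo instead of the relational DB.
        doc = Article(title=self.cleaned_data['title'],
                      body=self.cleaned_data.get('body', ''))
        doc.save()
        return doc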
BEWARE: If you have very well defined and structured data that can be described in documents or models, then don't use Mongo. It's not designed for that, and something like PostgreSQL will work much better.
I use PostgreSQL for relational or well structured data because it's good for that: small memory footprint and good response.
I use Redis for caching or to operate in-memory queues/lists because it's very good for that: great performance, providing you have the memory to cope with it.
I use Mongo to store large JSON documents and to perform map and reduce on them (if needed) because it's very good for that. Be sure to use indexing on certain fields if you can, to speed up lookups (see the sketch at the end of this answer).
Don't force a round peg into a square hole. It won't fit.
I've seen too many posts where someone wanted to swap a relational DB for Mongo because Mongo is a buzzword. Don't get me wrong, Mongo is really great... when you use it appropriately. I love using Mongo appropriately.
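The indexing advice above, as a quick pymongo sketch (collection and field names are illustrative):
from pymongo import MongoClient

client = MongoClient()   # assumes a local mongod
db = client.mydb         # illustrative database name

# Index the fields you query by; without an index, finds on large
# collections fall back to full collection scans.
db.articles.create_index('author')
docs = db.articles.find({'author': 'victor'})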
Upfront, it won't work for any existing Django app that ships its models. There's no backend for storing Django's model data in MongoDB or other NoSQL storages at the moment. Database backends aside, models themselves are somewhat of a moot point: once you start using someone's app (django.contrib apps included) that ships model-template-view triads, whenever you require a slightly different model for your purposes you either have to edit the application code (plain wrong), dynamically edit the contents of imported Python modules at runtime (magical), fork the application source altogether (cumbersome), or provide additional settings (good, but a rare encounter, with django.contrib.auth probably being the only widely known example of an application that lets you dynamically specify which model it will use, as is the case with user profile models through the AUTH_PROFILE_MODULE setting).
This might sound bad, but what it really means is that you'll have to deploy SQL and NoSQL databases in parallel and go on an app-by-app basis (like Spacedman suggested), and if MongoDB is the best fit for a certain app, hell, just roll your own custom app.
There are a lot of fine Djangonauts with NoSQL storages on their minds. If you followed the streams from past DjangoCon presentations, every year there have been important discussions about how Django should leverage NoSQL storages. I'm pretty sure that, this year or the next, someone will refactor the apps and models API to pave the way to a clean design that can finally unify all the different flavors of NoSQL storage as part of Django core.
I have recently tried this (although without Mongoengine). There are a huge number of pitfalls, IMHO:
No admin interface.
No auth: django.contrib.auth relies on the DB interface.
Many things rely on django.contrib.auth.User. For example, the RequestContext class. This is a huge hindrance.
No registration (relies on the DB interface and django.contrib.auth).
Basically, search the Django source for references to django.contrib.auth and you'll see how many things will be broken.
That said, it's possible that MongoEngine provides some support to replace/augment django.contrib.auth with something better, but there are so many things that depend on it that it's hard to say how you'd monkey-patch something that extensive.
Primary pitfall (for me): no JOINs!
