How to manage mongodb connection using mongoengine django

I am running a Django project and using mongoengine as my ORM for MongoDB. The project uses daemons, celery, crons, and custom Django management commands, all of which load the Django project and store large volumes of data in the database (MongoDB Atlas).
Here's my settings.py file, where I build the MongoDB connection using mongoengine:
CONNECTION_STRING = 'mongodb://{user}:{pwd}@{host}/{db}?replicaSet={replicaset}&ssl={ssl}&authSource={auth_source}'.format(
    user=env('DB_USERNAME'),
    pwd=env('DB_PASSWORD'),
    host=env('DB_HOST'),
    db=env('DB_DATABASE'),
    replicaset=env.str('DB_REPLICASET'),
    ssl=env.str('DB_SSL'),
    auth_source=env.str('DB_AUTH_SOURCE'),
)
DB_CONNECTION = mongoengine.connect(host=CONNECTION_STRING, connect=False)
Suddenly I'm getting alerts from MongoDB Atlas that the connection limit is being exceeded. I'm not sure how to handle connections in an optimised way, or whether my existing settings are already optimised.
The Atlas metrics show 450+ connections open at all times.
Do I have to close the connection in every request class? Any help is much appreciated.
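For reference, here is a minimal sketch of building the same kind of URI with escaped credentials and an explicit pool cap. maxPoolSize is a standard MongoDB connection-string option (PyMongo defaults to 100 per client), and each process that connects (every celery worker, daemon, and cron run) keeps its own pool, so the total seen by Atlas is roughly the number of processes times the pool size. The helper name, the host, the credentials, and the value 10 are illustrative assumptions, not settings from the project above.

```python
from urllib.parse import quote_plus, urlencode


def build_mongo_uri(user, pwd, host, db, **options):
    """Build a mongodb:// URI with percent-escaped credentials.

    `options` become query-string options such as replicaSet or maxPoolSize.
    """
    uri = "mongodb://{}:{}@{}/{}".format(quote_plus(user), quote_plus(pwd), host, db)
    query = urlencode(options)
    return uri + ("?" + query if query else "")


# Hypothetical values; in the project above these would come from env(...).
uri = build_mongo_uri(
    "app", "p@ss/word", "db.example.com:27017", "mydb",
    replicaSet="rs0", maxPoolSize=10,
)
print(uri)
```

Escaping matters because a password containing "@" or "/" would otherwise be read as part of the URI structure.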

Related

Do I authenticate at database level, at Flask User level, or both?

I have an MS-SQL deployed on AWS RDS, that I'm writing a Flask front end for.
I've been following some intro Flask tutorials, all of which seem to pass the DB credentials in the connection string URI. I'm following the tutorial here:
https://medium.com/@rodkey/deploying-a-flask-application-on-aws-a72daba6bb80#.e6b4mzs1l
For deployment, do I prompt for the DB login info and add to the connection string? If so, where? Using SQLAlchemy, I don't see any calls to create_engine (using the code in the tutorial), I just see an initialization using config.from_object, referencing the config.py where the SQLALCHEMY_DATABASE_URI is stored, which points to the DB location. Trying to call config.update(dict(UID='****', PASSWORD='******')) from my application has no effect, and looking in the config dict doesn't seem to have any applicable entries to set for this purpose. What am I doing wrong?
Or should I be authenticating using Flask-User, and then get rid of the DB level authentication? I'd prefer authenticating at the DB layer, for ease of use.

The tutorial you are following uses Flask-SQLAlchemy to abstract away the database setup, which is why you don't see create_engine() or engine.connect() anywhere.
Frameworks like Flask-SQLAlchemy are designed around the idea that you create a connection pool to the database on launch and share that pool amongst your various worker threads. You will not be able to use that for what you are doing, because it initializes the session and related objects early in the process, before any user has supplied credentials.
Because of your requirements, I don't know that you'll be able to make any use of things like connection pooling. Instead, you'll have to handle that yourself. The actual connection isn't too hard...
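from sqlalchemy import create_engine, text

engine = create_engine('dialect://username:password@host/db')
connection = engine.connect()
try:
    result = connection.execute(text("SOME SQL QUERY"))
    for row in result:
        pass  # Do something with each row
finally:
    # Always release the connection, even if the query fails
    connection.close()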
The issue is that you're going to have to do that in every endpoint. A database connection isn't something you can store in the session; you'll have to store the credentials there and do a connect/disconnect cycle in every endpoint you write. Worse, you'll have to figure out either encrypted sessions or server-side sessions (without a db connection!) to prevent those stored credentials from becoming a horrible security leak.
I promise you, it will be easier both now and in the long run to figure out a simple way to authenticate users so that they can share a connection pool that is abstracted out of your app endpoints. But if you HAVE to do it this way, this is how you will do it. (make sure you are closing those connections every time!)
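If you do go the per-endpoint route, one way to make sure the connection is always closed is to wrap the connect/disconnect cycle in a context manager. The following is only a sketch: it uses the standard-library sqlite3 module as a stand-in for the real database, and the names db_connection and endpoint are made up for illustration.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def db_connection(dsn):
    """Open a connection for one unit of work and always close it."""
    conn = sqlite3.connect(dsn)  # stand-in for the real per-user connect call
    try:
        yield conn
    finally:
        conn.close()


def endpoint(dsn):
    """Hypothetical endpoint body: connect, query, disconnect."""
    with db_connection(dsn) as conn:
        return conn.execute("SELECT 1 + 1").fetchall()


rows = endpoint(":memory:")
print(rows)
```

Each endpoint then only needs the with block, and the finally clause covers the case where the query raises.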

Setup Cassandra DB in django using cqlengine but without using django-cassandra-engine

I'm a Django beginner and have developed one app using MySQL as the primary DB. In my next project I have to use Cassandra through https://github.com/cqlengine/cqlengine, but without https://github.com/r4fek/django-cassandra-engine (which is a wrapper over cqlengine?).
I don't have any clue how to start. How and where should I create the db connection, and then create models in the models.py file?
Should I create the connection in the __init__.py file? In views.py? What would be the most efficient way?
It would be great (for future readers too) if someone could provide a simple configuration and a model.

The twissandra demo should be a good example of how to build an app using Cassandra and Django.
In this implementation there is no models.py and the connection is maintained in the file cass.py.
You'll see cass.py also hosts all the functions required to return data from the C* database and make objects which are used by the system. This is where you would swap out the api requests with your CqlEngine code.
I hope these resources get you pointed in the right direction

Rustyrazorblade shows an easy way to accomplish this in his CQLEngine tutorial branch.
You can easily set up the connection by doing something like this in your_app_project/models/connection.py:
from cqlengine import management
from cqlengine.connection import setup

def connect():
    setup(["127.0.0.1", "127.0.1.1", "127.0.1.2"], "tutorial", retry_connect=True)
    management.create_keyspace("tutorial", replication_factor=1, strategy_class="SimpleStrategy")
In this example, "tutorial" is the keyspace we are using, strategy_class is the replication strategy your C* instance is using, replication_factor is the number of replicas that will be stored throughout the ring, 127.0.0.1 is a Cassandra cluster node IP address (you can pass a list or a string), and retry_connect specifies whether to attempt to reconnect if there is a connection failure.
From here, it is very easy for new C* users to get confused. You can call connect() at any time, as long as it happens before syncing the C* tables or running a C* query.
So, you'll want to do something like:
from cqlengine.management import sync_table
from models.connection import connect
from models.somemodels import MyCassandraModel
# This will fire off our previously setup 'connect' method
connect()
# This will setup the Model as a table in your C* DB
sync_table(MyCassandraModel)
You can even drop this into manage.py, just as long as that CQLEngine setup() is properly executed.
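Since the only requirement is that setup() has run before the first sync or query, a small guard can make the call safe to repeat from several entry points. This is a sketch with hypothetical names (ensure_connected, fake_connect); in a real project the function passed in would be the connect() defined above, and a Django AppConfig.ready() method would be one plausible place to call it.

```python
import threading

_setup_lock = threading.Lock()
_is_connected = False


def ensure_connected(connect):
    """Run connect() on the first call only; later calls are no-ops."""
    global _is_connected
    with _setup_lock:
        if not _is_connected:
            connect()
            _is_connected = True


# Stand-in for the real connect(); records each time setup actually runs.
calls = []


def fake_connect():
    calls.append("setup")


ensure_connected(fake_connect)
ensure_connected(fake_connect)  # second call does nothing
print(calls)
```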

Managing a database connection object in Django

I'm working with a Postgresql database with Django. Because of licensing reasons, I can't use psycopg2 , so I'm using the alternative pygresql.
I don't need to use the Django ORM at all, I simply need the cursor for cur.execute() and cur.fetchall().
Since I can't use the pygresql pgdb module in the DATABASES setting in settings.py, I have to open a connection object manually.
What would be the best practice to do this? Currently I've simply created the connection object conn=pgdb.connect(params) in views.py outside of all functions, but this seems a bit hacky.
Any tips?

It might be a good idea to create your own PYGRE_CONFIG dictionary in settings.py that has info about the server hostname, database name, login name etc. You can use it by using from django.conf import settings and settings.PYGRE_CONFIG. Then, create a separate application utils or pygre in the root of your project directory that manages the connection object (opening and closing it as needed using settings.PYGRE_CONFIG) and stores it in a thread-local variable. Your other applications can import things from this module. Keeping it as a separate app can make it easy for you to port it from project to project.
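A rough sketch of that thread-local approach follows. It assumes the standard-library sqlite3 module as a stand-in for pgdb (so it runs anywhere) and a module-level PYGRE_CONFIG dict in place of settings.PYGRE_CONFIG; the function names are made up for illustration.

```python
import sqlite3
import threading

# Stand-in for settings.PYGRE_CONFIG; the real dict would hold host, user, etc.
PYGRE_CONFIG = {"database": ":memory:"}

_local = threading.local()


def get_connection():
    """Return this thread's connection, opening it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = sqlite3.connect(PYGRE_CONFIG["database"])  # pgdb.connect(...) in the real app
        _local.conn = conn
    return conn


def close_connection():
    """Close and forget this thread's connection, if one is open."""
    conn = getattr(_local, "conn", None)
    if conn is not None:
        conn.close()
        _local.conn = None


cur = get_connection().cursor()
cur.execute("SELECT 42")
print(cur.fetchall())
```

Views would call get_connection() instead of sharing one module-level object, so each worker thread gets its own connection.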

How to delete pymongo.Database.Database object

I am using pymongo to connect to MongoDB in my code. I am writing a Google Analytics-style application. My db structure is such that each website gets its own db: when someone registers a website I create a new db with that name, and when the website is unregistered I want the database to be deleted. I remove all the collections, but the database itself still could not be removed.
As a result, the list of databases is growing very large. When I do
client = MongoClient(host=MONGO_HOST, port=27017, maxPoolSize=200)
client.list_database_names()
I see a list of more than 1000 apps. Many of them are just empty databases. Is there a way to remove the mongo databases?

Use drop_database method:
client = MongoClient(host=MONGO_HOST, port=27017, maxPoolSize=200)
client.drop_database("database_name")

Python database WITHOUT using Django (for Heroku)

To my surprise, I haven't found this question asked elsewhere. Short version, I'm writing an app that I plan to deploy to the cloud (probably using Heroku), which will do various web scraping and data collection. The reason it'll be in the cloud is so that I can have it be set to run on its own every day and pull the data to its database without my computer being on, as well as so the rest of the team can access the data.
I used to use AWS's SimpleDB and DynamoDB, but I found SDB's storage limitations to be too small and DDB's poor querying ability to be a problem, so I'm looking for a database system (SQL or NoSQL) that can store arbitrary-length values (and ideally arbitrary data structures) and that can be queried on any field.
I've found many database solutions for Heroku, such as ClearDB, but all of the information I've seen has shown how to set up Django to access the database. Since this is intended to be a script and not a site, I'd really prefer not to dive into Django if I don't have to.
Is there any kind of database that I can hook up to in Heroku with Python without using Django?

You can get a database provided from Heroku without requiring your app to use Django. To do so:
heroku addons:add heroku-postgresql:dev
If you need a larger more dedicated database, you can examine the plans at Heroku Postgres
Within your requirements.txt you'll want to add:
psycopg2
Then you can connect/interact with it similar to the following:
import os
from urllib.parse import urlparse

import psycopg2

# Heroku sets DATABASE_URL for the attached database
url = urlparse(os.environ['DATABASE_URL'])
conn = psycopg2.connect(dbname=url.path[1:], user=url.username, password=url.password, host=url.hostname)
cur = conn.cursor()
query = "SELECT ...."
cur.execute(query)

I'd use MongoDB. Heroku has support for it, so I think it will be really easy to start and scale out: https://addons.heroku.com/mongohq
About Python: MongoDB is a really easy database. The schema is flexible and fits really well with Python dictionaries. That's something really good.
You can use PyMongo
import datetime

from pymongo import MongoClient

# Connects to localhost:27017 by default
client = MongoClient()
# Get your DB
db = client.my_database
# Get your collection
cars = db.cars
# Create some objects
car = {"brand": "Ford",
       "model": "Mustang",
       "date": datetime.datetime.utcnow()}
# Insert it
cars.insert_one(car)
Pretty simple, uh?
Hope it helps.
EDIT:
As Endophage mentioned, another good option for interfacing with Mongo is mongoengine. If you have lots of data to store, you should take a look at that.

I did this recently with Flask. (https://github.com/HexIce/flask-heroku-sqlalchemy).
There are a couple of gotchas:
1. If you don't use Django you may have to set up your database yourself by doing:
heroku addons:add shared-database
(Or whichever database you want to use, the others cost money.)
2. The database URL is stored in Heroku in the "DATABASE_URL" environment variable.
In Python you can get it with:
import os
dburl = os.environ['DATABASE_URL']
What you do to connect to the database from there is up to you, one option is SQLAlchemy.
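As an illustration of what can be done with that value, here is a small sketch that splits a database URL into the pieces most drivers ask for, using only the standard library. The function name parse_database_url and the fallback URL are made up for the example; on Heroku the real value comes from the environment.

```python
import os
from urllib.parse import urlparse


def parse_database_url(url):
    """Split a URL such as postgres://user:pw@host:5432/name into its parts."""
    parts = urlparse(url)
    return {
        "dbname": parts.path.lstrip("/"),
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
    }


# The fallback URL is a placeholder so the sketch also runs off Heroku.
dburl = os.environ.get("DATABASE_URL", "postgres://user:secret@localhost:5432/mydb")
params = parse_database_url(dburl)
print(params["host"], params["dbname"])
```

SQLAlchemy's create_engine accepts a database URL string directly, so this split is mainly useful with drivers that take separate arguments.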

Create a standalone Heroku Postgres database. http://postgres.heroku.com
