I am looking for a small database that can be "embedded" into my Python application without running a separate server, as one can do with SQLite or Metakit. I don't need an SQL database, in fact storing free-form data like Python dictionaries or JSON is preferable.
The other requirement is to be able to run an instance of the database on a server and have instances of my application (clients) sync the database with the server (two-way), similar to what CouchDB replication can do.
Is there a database that will do this?
From what you describe, it sounds like you could get by using pickle and FTP.
If you don't need an SQL database, what's wrong with CouchDB? You can spawn a local process to serve the DB, and you could easily write a server wrapper to allow access only from your app. I'm not sure about the access story, but I believe the latest Ubuntu uses CouchDB for synchronizable user-level data.
Seems like the perfect job for CouchDB: two-way sync is incredibly easy, and schema-less JSON documents are the native format. If you're using Python, couchdb-python is a great way to work with CouchDB.
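For instance, two-way sync with couchdb-python boils down to a pair of replication calls. A rough sketch; the server URL and the "appdata" database name are placeholders:

    import couchdb

    server = couchdb.Server("http://localhost:5984/")

    # Push local changes up, then pull the server's changes down;
    # running both directions gives you the two-way sync.
    server.replicate("appdata", "http://db.example.com:5984/appdata")
    server.replicate("http://db.example.com:5984/appdata", "appdata")

Passing continuous=True as an extra keyword keeps a replication running in the background instead of doing a one-shot sync.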
Do you need clients to work offline and then resync when they reconnect to the network? I don't know if MongoDB can handle the offline-client scenario, but if the client is online all the time, MongoDB might be a good solution too. It has pretty good Python support. It's still a separate process, but perhaps easier to get running on Windows than CouchDB.
BerkeleyDB might be another option to check out, and it's lightweight enough. Run easy_install bsddb3 if you need a Python interface.
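A minimal sketch of the dict-like legacy interface in bsddb3 (the file name is arbitrary; keys and values must be bytes):

    import bsddb3

    db = bsddb3.hashopen("app_data.db", "c")  # "c" = create the file if missing
    db[b"user:42"] = b'{"name": "alice"}'     # store a serialized record
    print(db[b"user:42"])
    db.close()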
HSQLDB does this, but unfortunately it's Java rather than Python.
Firebird SQL might be closer to what you want, since it does seem to have a Python interface.
I have a Python program in which I am downloading user data and updating a table. I only need to store the most recent update for each user.
Is there a simple (maybe NoSQL, key/value) DB that would be good for maintaining a single table like this? I would just store it in a dict in Python, but I need persistence.
I am running this on an AWS EC2 linux server. I know there are AWS options (Dynamo) but I thought a local DB might be easier.
Thanks
Look into the Python stdlib dbm module, or fall back to an embedded ordered key-value store (OKVS).
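A minimal sketch with dbm, assuming one JSON-encoded record per user; overwriting a key keeps only the latest update:

    import dbm
    import json

    with dbm.open("users.db", "c") as db:      # "c" = create the file if needed
        db["user:42"] = json.dumps({"score": 10})
        print(json.loads(db["user:42"]))       # the most recent record wins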
I have a standard MySQL database, with a table containing contacts (I'm adding contacts to the table using a webapp using Zend Framework), thus with my own fields.
Is it possible to create a server which would be compatible with the Address Book application of OS X? I think it must be compatible with the CardDAV system.
Has anyone already done that? If yes, how did you handle it? Created your own server? Is there a CardDAV library for Python, for example? I just want to be able to read my contacts using the Address Book of OS X.
Thanks a lot for your answers,
Best,
Jean
Is it possible to create a server which would be compatible with the Address Book application of OS X? I think it must be compatible with the CardDAV system.
Yes, you can create such a server, and there are plenty already. You can choose between CardDAV and LDAP, depending on your needs. If LDAP is good enough for your use case, you might even get away with just configuring OpenLDAP to use your database.
LDAP is usually just read & query only (think big company address book / yellow pages). CardDAV is usually read/write and full sync.
Has anyone already done that?
Many people have, the CalConnect CardDAV Server Implementations site alone lists 16, most of them FOSS. There are more.
If yes, how did you handle it? Created your own server?
I think this is the most common approach.
Is there a CardDAV library for Python, for example?
Please do your research, this is trivial to figure out ...
Many PHP servers (you mentioned Zend) are using SabreDAV as a basis.
I just want to be able to read my contacts using the Address Book of OS X.
That makes it a lot easier. While you can use a library like SabreDAV, implementing read-only CardDAV is really not that hard: authentication, a few XML requests for locating an address book, and then some code to render your existing records as vCards.
If you want to add editing, things get more complicated.
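For the read-only case, the vCard rendering step is just string building. A rough sketch, with hypothetical column names standing in for whatever your MySQL table actually uses:

    # Turn one contact record into a vCard 3.0 string (CRLF line endings).
    def contact_to_vcard(contact):
        return "\r\n".join([
            "BEGIN:VCARD",
            "VERSION:3.0",
            "UID:%s" % contact["id"],
            "FN:%s %s" % (contact["first_name"], contact["last_name"]),
            "N:%s;%s;;;" % (contact["last_name"], contact["first_name"]),
            "EMAIL:%s" % contact["email"],
            "END:VCARD",
            "",
        ])

    print(contact_to_vcard({"id": 1, "first_name": "Jean",
                            "last_name": "Dupont", "email": "jean@example.com"}))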
I have a program that I wrote in Python that collects data. I want to be able to store the data on the internet somewhere and allow another user to access it from another computer somewhere else, anywhere in the world that has an internet connection. My original idea was to use an e-mail account, such as Gmail, to store the data by sending pickled strings to the address. This would allow anyone to access the address and simply read the newest e-mail to get the data. It worked perfectly, but the program requires a new e-mail to be sent every 5-30 seconds. So the method fell through because of the limits Gmail places on e-mail, among other reasons, such as not being able to completely delete old e-mails.
Now I want to try a different idea, but I do not know very much about network programming with Python. I want to set up a webpage with essentially nothing on it. The "master" program, the program actually collecting the data, will send a pickled string to the webpage. Then any of the "remote" programs will be able to read the string. I will also need the master program to delete old strings as it updates the webpage. It would be preferable to be able to store multiple strings, so there is no chance of the master updating while the remote is reading.
I do not know if this is a feasible task in Python, but any and all ideas are welcome. Also, if you have ideas on how to do this a different way, I am all ears (well, eyes in this case).
I would suggest taking a look at setting up a simple site in Google App Engine. It's free and you can use Python to build the site. Then it would just be a matter of creating a simple RESTful service that you could send a POST to with your pickled data and store it in a database. Then just create a simple web front end onto the database.
Another option in addition to what Casey already provided:
Set up a remote MySQL database somewhere that has user access levels allowing remote connections. Your Python program could then simply access the database and INSERT the data you're trying to store centrally (e.g. through the MySQLdb or pyodbc packages). Your users could then either read the data through a client that supports MySQL, or you could write a simple front end in Python or PHP that displays the data from the database.
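The INSERT side could look roughly like this with MySQLdb; the host, credentials, and the "readings" table are placeholders:

    import MySQLdb

    conn = MySQLdb.connect(host="db.example.com", user="collector",
                           passwd="secret", db="telemetry")
    cur = conn.cursor()
    # Parameterized query: MySQLdb uses %s placeholders and escapes for you.
    cur.execute("INSERT INTO readings (sensor, value) VALUES (%s, %s)",
                ("temp1", 21.5))
    conn.commit()
    conn.close()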
Adding this as an answer so that OP will be more likely to see it...
Make sure you consider security! If you just blindly accept pickled data, it can open you up to arbitrary code execution.
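A minimal demonstration of the problem; this payload runs a shell command the moment it is unpickled:

    import os
    import pickle

    class Evil:
        def __reduce__(self):
            # On unpickling, pickle will call os.system with this argument.
            return (os.system, ("echo pwned",))

    payload = pickle.dumps(Evil())
    pickle.loads(payload)   # prints "pwned" -- code ran just by loading data

If the clients only need plain data (strings, numbers, dicts), json.loads is a much safer choice for untrusted input.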
I suggest you use good middleware like ZeroC ICE, Pyro4, or Twisted.
Pyro4 can use pickle to serialize data (keep the security caveat above in mind).
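A rough sketch of what the "master" side could look like with Pyro4; the DataStore class and its methods are made up for illustration:

    import Pyro4

    @Pyro4.expose
    class DataStore(object):
        def __init__(self):
            self._latest = {}

        def put(self, key, value):
            self._latest[key] = value   # master overwrites old strings

        def get(self, key):
            return self._latest.get(key)

    daemon = Pyro4.Daemon()             # network daemon for this process
    uri = daemon.register(DataStore())  # returns a PYRO uri for clients
    print("Server ready. uri =", uri)
    daemon.requestLoop()

A remote program would then connect with Pyro4.Proxy(uri) and call put()/get() as if the object were local.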
Which is the best back end for Python applications, and what is the advantage of using SQLite? How can it be connected to Python applications?
What do you mean by back end? Python apps connect to SQLite just like any other database; you just have to import the correct module and check how to use it.
The advantages of using SQLite are:
You don't need to setup a database server, it's just a file
No configurations needed
Cross platform
Mainly, desktop applications are the ones that take real advantage of this. For web apps, SQLite is not recommended, since the file containing the data is easily readable (it lacks any kind of encryption), and unless the web server is specially configured, the file is downloadable by anyone.
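For illustration, the whole setup really is just one file and one connect call (the file name is arbitrary):

    import sqlite3

    conn = sqlite3.connect("app.db")   # creates the file if it doesn't exist
    conn.execute("CREATE TABLE IF NOT EXISTS notes "
                 "(id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("INSERT INTO notes (body) VALUES (?)", ("hello",))
    conn.commit()
    print(conn.execute("SELECT * FROM notes").fetchall())
    conn.close()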
Django, Twisted, and CherryPy are popular Python "Back-Ends" as far as web applications go, with Twisted likely being the most flexible as far as networking is concerned.
SQLite can, as has been previously posted, be directly interfaced with using SQL commands, as it has native bindings for Python, or it can be accessed with an object-relational mapper such as SQLObject (another Python library).
As far as performance is concerned, SQLite is fairly scalable and should be able to handle most use cases that don't require a separate database server (nothing enterprise level). An additional benefit of SQLite is that the database is self-contained in a single file, allowing for easy backup while remaining in a common enough format that multiple applications can access the data. A word of advice on using SQLite with Python, however: you may run into issues with threading (in the past most of the bindings for SQLite were not thread-safe, although this may have changed over time).
The language you are using at the application layer has little to do with your database choice underneath. You need to examine the advantages of other DB packages to get an idea of what you want.
Here are some popular database packages for cheap or free:
MS SQL Server Express, PostgreSQL, MySQL
If you mean "what is the best database?" then there's simply no way to answer this question. If you just want a small database that won't be used by more than a handful of people at a time, SQLite is what you're looking for. If you're running a database for a giant corporation serving thousands, you're probably looking for Oracle. In between those, you have MySQL, PostgreSQL, SQL Server, db2, and probably more.
If you're familiar with one of those, that may be the best to go with from a practical standpoint. If you're doing a typical webapp, my advice would be to go with MySQL or PostgreSQL, as they're free and well supported by just about any ORM you could think of (my personal preference is towards PostgreSQL, but I'm not experienced enough with either of these to make a good argument one way or another). If you do go with one of those two, my recommendation is to use Storm as the ORM.
(And yes, there are free versions of SQL Server and Oracle. You won't have as many choices as far as ORMs go though)
I'm starting to experiment with CouchDB because it looks like the perfect solution for certain problems we have. Given that all work will be on a brand new project with no legacy dependencies, which client library would you suggest that I use, and why?
This would be easier if there were any overlap in the OSes we use. FreeBSD only has py-simplecouchdb already available in its ports collection, but that library's project website says to use CouchDBKit instead. Neither of those comes with Ubuntu, which only ships with CouchDB. Since those two OSes don't have any libraries in common, I'll probably be installing something from source (and hopefully submitting packages to the Ubuntu and FreeBSD folks if I have time).
For those interested, I'd like to use CouchDB as a convenient intermediate storage place for data passed between various services - think of a message bus system but with less formality. For example, we have daemons that download and parse web pages, then send interesting bits to other daemons for further processing. A lot of those objects are ill-defined until runtime ("here's some HTML, plus a set of metadata, and some actions to run on it"). Rather than serialize it to an ad-hoc local network protocol or stick it in PostgreSQL, I'd much rather use something designed for the purpose. We're currently using NetWorkSpaces in this role, but it doesn't have nearly the breadth of support or the user community of CouchDB.
I have been using couchdb-python with quite a lot of success, and as far as I know the desktopcouch developers use it in Ubuntu. The prerequisites are very basic and you should have no problems:
httplib2
simplejson or cjson
Python
CouchDB 0.9.x (earlier or later versions are unlikely to work as the interface is still changing)
For me some of the advantages are:
Pythonic interface. You can work with the database as if it were a dict (see the sketch after this list).
Interface for design documents.
A CouchDB view server that allows writing view functions in Python
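A small sketch of that dict-like usage; the server URL and database name are placeholders:

    import couchdb

    server = couchdb.Server("http://localhost:5984/")
    db = server.create("scratch")     # or: db = server["scratch"] if it exists
    db["doc1"] = {"type": "page", "url": "http://example.com"}  # save a doc
    print(db["doc1"]["url"])          # fetch it back like a dict lookup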
It also provides a couple of command-line tools:
couchdb-dump: Writes a snapshot of a CouchDB database
couchdb-load: Reads a MIME multipart file as generated by couchdb-dump and loads all the documents, attachments, and design documents into a CouchDB database.
couchdb-replicate: Can be used as an update-notification script to trigger replication between databases when data is changed.
If you're still considering CouchDB, then I'll recommend Couchdbkit (http://www.couchdbkit.org). It's simple enough to quickly get the hang of, and it runs fine on my machine running Karmic Koala. Prior to that I tried couchdb-python, but some bugs (maybe ironed out by now) with httplib were giving me errors (duplicate documents, etc.), whereas Couchdbkit has got me up and running so far without any problems.
spycouch
Simple Python API for CouchDB
A Python library for easily managing CouchDB.
Unlike the libraries ordinarily available on the web, it works with the latest version of CouchDB (1.2.1).
Functionality
Create a new database on the server
Delete a database from the server
List the databases on the server
Get database information
Compact a database
Create a map view
Query a map view
List the documents in a DB
Get a document from the DB
Save a document to the DB
Delete a document from the DB
Edit a document
spycouch: https://github.com/cernyjan/repository
Considering the task you are trying to solve (distributed task processing) you should consider using one of the many tools designed for message passing rather than using a database. See for instance this SO question on running multiple tasks over many machines.
If you really want a simple casual message passing system, I recommend you shift your focus to MorbidQ. As you get more serious, use RabbitMQ or ActiveMQ. This way you reduce the latency in your system and avoid having many clients polling a database (and thus hammering that computer).
I've found that avoiding databases is a good idea (that's my blog), and I have an end-to-end live data system running using MorbidQ here.
I have written a CouchDB client library built on python-requests (which is in most distributions). We use this library in production.
https://github.com/adamlofts/couchdb-requests
Robust CouchDB Python interface using python-requests.
Goals:
Only one way to do something
Fast and stable (connection pooled)
Explicit is better than implicit: buffer sizes, connection pool size
Specify query parameters, no **params in query functions
After skimming through the docs of many CouchDB Python libraries, my choice went to pycouchdb.
All I needed to know was very quick to grasp from the doc: https://py-couchdb.readthedocs.org/en/latest/ and it works like a charm.
Also, it works well with Python 3.
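For the record, a minimal sketch following the linked docs; the URL and database name are placeholders, and the database is assumed to already exist:

    import pycouchdb

    server = pycouchdb.Server("http://admin:admin@localhost:5984/")
    db = server.database("scratch")
    doc = db.save({"type": "note", "body": "hello"})  # returns doc with _id/_rev
    print(db.get(doc["_id"]))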