Fast, thread-safe Python ORM?

Fast, thread-safe Python ORM? - python

Can you recommend a high-performance, thread-safe and stable ORM for Python? The data I need to work with isn't complex, so SQLAlchemy is probably an overkill.

If you are looking for something thats high performance, and based on one of your comments "something that can handle >5k queries per second". You need to keep in mind that an ORM is not built specifically for speed and performance, it is built for maintainability and ease of use. If the data is so basic that even SqlAlchemy might be overkill, and your mostly doing writes, it might be easier to just do straight inserts and skip the ORM altogether.

The peewee orm is fast and extremely lightweight, might be a good fit if SQA is too heavy.

SqlAlchemy's SqlSoup module skips most of the tediousness of SqlAlchemy's mapping and session overhead. You can pretty much 'dive' straight in, take a look.

You can use a more declarative layer on top of SQLAlchemy such as Elixir, or look at Storm which is also a bit more declerative than SQLAlchemy.
The latter was developed for and is used by sites such as Ubuntu/Canonical's launchpad, so it should scale well.

Related

Bad practice to have ORMs with NoSQL stores?

I use Redis (redis-py) inside my Python platform. Recently it was suggested that I switch to an ORM.
E.g.: python-stdnet, rom or redisco
Is use of ORMs considered bad practice in the NoSQL world?

Ultimately the question boils down to at what layer do you want to write code.
Do you want to write code that manipulates data structures in a remote database, or do you want to write higher-level code that uses the abstractions built on top of those data structures? You can think of it as a similar question about relational databases as do you want to write SQL, or do you want to write higher-level code?
Personally, despite using rom myself for a variety of tasks (I am the author), I also directly manipulate Redis in the same projects where it makes sense.

Comments pointing out that the R in ORM is for relational are technically correct. That doesn't mean there aren't valid uses and reasons for libraries that abstract redis away.
There are some great libraries that make interfacing with a redis feel much nicer and more idiomatic to the language you are using. For ruby libraries like ohm or redis-native_hash (disclosure: I wrote that one) do just that. For python there are tools like redisco and surely others. These make persisting objects to redis very simple and make working with redis feel much more ruby-ish or python-ish.
Here are a few more benefits from using even the most basic abstraction, like a very thin wrapper you might write and keep in your application:
Switching redis clients will be easier. Maybe you'll never do this, but if you did, changing your calls to redis in one place (your wrapper) is much simpler than changing them everywhere you use redis.
Implementing things you might need for scaling, like sharding or connection pooling, is likely going to be easier if your calls are made through some abstraction.
Replacing redis with some other key/value store or data structure server would be simpler if an abstraction is in place.
I'm not advocating using an object mapping library or building your own abstraction, just pointing out there are valid reasons why you would. Its up to you to evaluate your needs and pick what works best for you. There is nothing wrong with calling redis directly either.

What is the fastest/most performant SQL driver for Python?

I know of PyMySQLDb, is that pretty much the thinnest/lightest way of accessing MySql?

The fastest is SQLAlchemy.
"Say what!?"
Well, a nice ORM, and I like SQLAlchemy, you will get your code finished much faster. If your code then runs 0.2 seconds slower isn't really gonna make any noticeable difference. :)
Now if you get performance problems, then you can look into improving the code. But choosing the access module after who in theory is "fastest" is premature optimization.

The lightest possible way is to use ctypes and directly call into the MySQL API, of course, without using any translation layers. Now, that's ugly and will make your life miserable unless you also write C, so yes, the MySQLDb extension is the standard and most performant way to use MySQL while still using the Python Database API. Almost anything else will be built on top of that or one of its predecessors.
Of course, the connection layer is rarely where all of the database speed problems come from. That's mostly from misusing the API you have or building a bad database or queries.

MySQLDb is faster while SQLAlchemy makes code more user friendly -:)

which databases can be used better for pyqt application

i want to create application in windows. i need to use databases which would be preferable best for pyqt application
like
sqlalchemy
mysql
etc.

I would use SQLite every time unless performance became an obvious big problem.
It comes with Python
You don't need to worry about installing it on a target machine or having an existing installation which might clash (including a potential port clash - SQLite doesn't use a port)
It's fairly small (doesn't increase the installed size too much)
Then, a much less obvious choice that I would very much consider making: adding Django to the mix. Django's model system could make for much simpler management, depending on the type of data you're working with. Also, in the case where I've considered it (I just haven't got to that stage of development yet) it means I can reuse the models I've got on the server and a good bit of code from there too.
Obviously in this case you could need to be careful about what you expose; business-critical processing stuff that you don't want to share, potential security holes in server code which you've helpfully provided the code for, etc.

SQlite is fine for a single user.
If you are going over a network to talk to a central database, then you need a database woith a decent Python lirary.
Take a serious look at MySQL if you need/want SQL.
Otherwise, there is CouchDB in the Not SQL camp, which is great if you are storing documents, and can express searches as Map/reduce functions. Poor for adhoc queries.

If you want a relational database I'd recommend you to use SQLAlchemy, as you then get a choice as well as an ORM. Bu default go with SQLite, as per other recommendations here.
If you don't need a relational database, take a look at ZODB. It's an awesome Python-only object-oriented database.

i guess its totally upto you ..but as far as i am concerned i personlly use sqlite, becoz it is easy to use and amazingly simple syntax whereas for MYSQL u can use it for complex apps and has options for performance tuning. but in end its totally upto u and wt your app requires

Pros and cons of using sqlite3 vs custom table implementation

I noticed that a significant part of my (pure Python) code deals with tables. Of course, I have class Table which supports the basic functionality, but I end up adding more and more features to it, such as queries, validation, sorting, indexing, etc.
I to wonder if it's a good idea to remove my class Table, and refactor the code to use a regular relational database that I will instantiate in-memory.
Here's my thinking so far:
Performance of queries and indexing would improve but communication between Python code and the separate database process might be less efficient than between Python functions. I assume that is too much overhead, so I would have to go with sqlite which comes with Python and lives in the same process. I hope this means it's a pure performance gain (at the cost of non-standard SQL definition and limited features of sqlite).
With SQL, I will get a lot more powerful features than I would ever want to code myself. Seems like a clear advantage (even with sqlite).
I won't need to debug my own implementation of tables, but debugging mistakes in SQL are hard since I can't put breakpoints or easily print out interim state. I don't know how to judge the overall impact of my code reliability and debugging time.
The code will be easier to read, since instead of calling my own custom methods I would write SQL (everyone who needs to maintain this code knows SQL). However, the Python code to deal with database might be uglier and more complex than the code that uses pure Python class Table. Again, I don't know which is better on balance.
Any corrections to the above, or anything else I should think about?

SQLite does not run in a separate process. So you don't actually have any extra overhead from IPC. But IPC overhead isn't that big, anyway, especially over e.g., UNIX sockets. If you need multiple writers (more than one process/thread writing to the database simultaneously), the locking overhead is probably worse, and MySQL or PostgreSQL would perform better, especially if running on the same machine. The basic SQL supported by all three of these databases is the same, so benchmarking isn't that painful.
You generally don't have to do the same type of debugging on SQL statements as you do on your own implementation. SQLite works, and is fairly well debugged already. It is very unlikely that you'll ever have to debug "OK, that row exists, why doesn't the database find it?" and track down a bug in index updating. Debugging SQL is completely different than procedural code, and really only ever happens for pretty complicated queries.
As for debugging your code, you can fairly easily centralize your SQL calls and add tracing to log the queries you are running, the results you get back, etc. The Python SQLite interface may already have this (not sure, I normally use Perl). It'll probably be easiest to just make your existing Table class a wrapper around SQLite.
I would strongly recommend not reinventing the wheel. SQLite will have far fewer bugs, and save you a bunch of time. (You may also want to look into Firefox's fairly recent switch to using SQLite to store history, etc., I think they got some pretty significant speedups from doing so.)
Also, SQLite's well-optimized C implementation is probably quite a bit faster than any pure Python implementation.

You could try to make a sqlite wrapper with the same interface as your class Table, so that you keep your code clean and you get the sqlite performences.

If you're doing database work, use a database, if your not, then don't. Using tables, it sound's like you are. I'd recommend using an ORM to make it more pythonic. SQLAlchemy is the most flexible (though it's not strictly just an ORM).

Transitioning from php to python/pylons/SQLAlchemy -- Are ORMs the standard now?

Should I invest a lot of time trying to figure out an ORM style implementation, or is it still common to just stick with standard SQL queries in python/pylons/sqlalchemy?

ORMs are very popular, for several reasons -- e.g.: some people would rather not learn SQL, ORMs can ease porting among different SQL dialects, they may fit in more smoothly with the mostly-OOP style of applications, indeed might even ease some porting to non-SQL implementations (e.g, moving a Django app to Google App Engine would be much more work if the storage access layer relied on SQL statements -- as it relies on the ORM, that reduces, a bit, the needed porting work).
SQLAlchemy is the most powerful ORM I know of for Python -- it lets you work at several possible levels, from a pretty abstract declarative one all the way down to injecting actual SQL in some queries where your profiling work has determined it makes a big difference (I think most people use it mostly at the intermediate level where it essentially mediates between OOP and relational styles, just like other ORMs).
You haven't asked for my personal opinion in the matter, which is somewhat athwart of the popular one I summarized above -- I've never really liked "code generators" of any kind (they increase your productivity a bit when everything goes smoothly... but you can pay that back with interest when you find yourself debugging problems [[including performance bottlenecks]] due to issues occurring below the abstraction levels that generators strive to provide).
When I get a chance to use a good relational engine, such as PostgreSQL, I believe I'm overall more productive than I would be with any ORM in between (incuding SQLAlchemy, despite its many admirable qualities). However, I have to admit that the case is different when the relational engine is not all that good (e.g., I've never liked MySQL), or when porting to non-relational deployments is an important consideration.
So, back to your actual question, I do think that, overall, investing time in mastering SQLAlchemy is a good idea, and time well-spent.

If you have never use an ORM like SqlAlchemy before, I would suggest that you learn it - as long as you are learning the Python way. If nothing else, you will be better able to decide where/when to use it vs plain SQL. I don't think you should have to invest a lot of time on it. Documentation for SQLAlchemy is decent, and you can always ask for help if you get stuck.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.