Related
We have an in house developed web-based admin console that uses a combination of C CGI and Perl scripts to administer our mail server stack. Of late we have been thinking of cleaning up the code (well, replacing most of it), making the implementation more secure, and improving the overall behavior.
I don't have much programming knowledge, but I use Ruby on and off (mainly for writing erb templates), and hence was thinking of using ruby/rails for developing such an app (off-duty for now, I also need to learn stuff !).
Before blindly picking up a language though, what would you folks suggest ? Please let me know if this is too vague a question, I'll try to supply more information, if needed.
Have you considered writing your applications as Webmin modules?
You get a lot of stuff for free when you do so (users and groups, tons of security features, a pretty big variety of helper functions related to config files, and tons of existing code for most aspects of a UNIX/Linux system). You also get a lot of stuff for nearly free, like action logging, packages and updates via wbm or apt or yum, an online help system, etc.
There are some cons, as well. It's an old codebase, so it has some clunky bits in the API among other places. A lot of the old modules can be a bit hard to grok if you're not an old-school Perl programmer. But, it's a well-maintained codebase, and it's been banged on by millions of users for over a dozen years. It's pretty robust. The UI isn't beautiful, but it is relatively theme-able, and if you're distributing a minimized version it becomes easier to customize the UI.
I suspect you can be up and running a lot faster than starting from scratch or using most existing frameworks that aren't targeted specifically to building systems management interfaces the way Webmin is.
Also, it's BSD licensed, so you can do whatever you want with it, including building a custom commercial app with it (hundreds of companies have done so over the years).
If you already know a bit of ruby, then there's no reason not to use that.
If you're interested specifically in learning another language, then what you're trying to do could be done in pretty much any language/framework, it's just a matter of which one you want to learn.
Without knowing much about your existing application I'd say that this effectively boils down to "which language do you like to work with?".
Python and Ruby are both mature languages with ample library infrastructure. They also boast popular, similar web application frameworks namely Django and Ruby-on-Rails respectively.
Since you are porting an existing Perl app(lets) it may be worthwhile to note that Ruby is relatively more similar to Perl. Not surprising given that Ruby was influenced "primarily by Perl, Smalltalk, Eiffel and Lisp".
django has a nice admin interface
I'm lucky enough to have full control over the architecture of my company's app, and I've decided to scrap our prototype written in Ruby/Rails and start afresh in Python. This is for a few reasons: I want to learn Python, I prefer the syntax and I've basically said "F**k it, let's do it."
So, baring in mind this is going to be a pretty intensive app, I'd like to hear your opinions on the following:
Generic web frameworks
ORM/Database Layer (perhaps to work with MongoDB)
RESTful API w/ oAuth/xAuth authentication
Testing/BDD support
Message queue (I'd like to keep this in Python if possible)
The API is going to need to interface with a Clojure app to handle some internal data stuff, and interface with the message queue, so if it's not Python it'd be great to have some libraries to it.
TDD/BDD is very important to me, so the more tested, the better!
It'll be really interesting to read your thoughts on this. Much appreciated.
My best,
Jamie
Frameworks
OK, so I'm a little biased here as I currently make extensive use of Django and organise the Django User Group in London so bear that in mind when reading the following.
Start with Django because it's a great gateway drug. Lots of documentation and literature, a very active community of people to talk to and lots of example code around the web.
That's a completely non-technical reason. Pylons is probably purer in terms of Python philosophy (being much more a collection of discrete bits and pieces) but lots of the technical stuff is personal preference, at least until you get into Python more. Compare the very active Django tag on Stack Overflow with that of pylons or turbogears though and I'd argue getting started is simply easier with Django irrespective of anything to do with code.
Personally I default to Django, but find that an increasing amount of time I actually opt for writing using simpler micro frameworks (think Sinatra rather than Rails). Lots of things to choose from (good list here, http://fewagainstmany.com/blog/python-micro-frameworks-are-all-the-rage). I tend to use MNML (because I wrote parts of it and it's tiny) but others are actively developed. I tend to do this for small, stupid web services which are then strung together with a Django project in the middle serving people.
Worth noting here is appengine. You have to work within it's limitations and it's not designed for everything but it's a great way to just play with Python and get something up and working quickly. It makes a great testbed for learning and experimentation.
Mongo/ORM
On the MongoDB front you'll likely want to look at the basic python mongo library ( http://api.mongodb.org/python/ ) first to see if it has everything you need. If you really do want something a little more ORM like then mongoengine (http://hmarr.com/mongoengine/) might be what you're looking for. A bunch of folks are also working on making Django specifically integrate more seamlessly with nosql backends. Some of that is for future Django releases, but django-norel ( http://www.allbuttonspressed.com/projects/django-nonrel) has code now.
For relational data SQLAlchemy ( http://www.sqlalchemy.org/) is good if you want something standalone. Django's ORM is also excellent if you're using Django.
API
The most official Oauth library is python-oauth2 ( http://github.com/simplegeo/python-oauth2), which handily has a Django example as part of it's docs.
Piston ( http://bitbucket.org/jespern/django-piston/wiki/Home) is a Django app which provides lots of tools for building APIs. It has the advantage of being pretty active and well maintained and in production all over the place. Other projects exist too, including Dagny ( http://zacharyvoase.github.com/dagny/) which is an early attempt to create something akin to RESTful resources in Rails.
In reality any Python framework (or even just raw WSGI code) should be reasonably good for this sort of task.
Testing
Python has unittest as part of it's standard library, and unittest2 is in python 2.7 (but backported to previous versions too http://pypi.python.org/pypi/unittest2/0.1.4). Some people also like Nose ( http://code.google.com/p/python-nose/), which is an alternative test runner with some additional features. Twill ( http://twill.idyll.org/) is also nice, it's a "a simple scripting language for Web browsing", so handy for some functional testing. Freshen ( http://github.com/rlisagor/freshen) is a port of cucumber to Python. I haven't yet gotten round to using this in anger, but a quick look now suggests it's much better than when I last looked.
I actually also use Ruby for high level testing of Python apps and apis because I love the combination of celerity and cucumber. But I'm weird and get funny looks from other Python people for this.
Message Queues
For a message queue, whatever language I'm using, I now always use RabbitMQ. I've had some success with stompserver in the past but Rabbit is awesome. Don't worry that it's not itself written in Python, neither is PostgresSQL, Nginx or MongoDB - all for good reason. What you care about are the libraries available. What you're looking for here is py-amqplib ( http://barryp.org/software/py-amqplib/) which is a low level library for talking amqp (the protocol for talking to rabbit as well as other message queues). I've also used Carrot ( http://github.com/ask/carrot/), which is easier to get started with and provides a nicer API. Think bunny in Ruby if you're familiar with that.
Environment
Whatever bits and pieces you decide to use from the Python ecosystem I'd recommend getting to who pip and virtualenv ( http://clemesha.org/blog/2009/jul/05/modern-python-hacker-tools-virtualenv-fabric-pip/ - note that fabric is also cool, but not essential and these docs are out of date on that tool). Think about using Ruby without gem, bundler or rvm and you'll be in the right direction.
Ok, you might be making a mistake, the same one I made when I started with python.
Before you decide on a thing like django, which is an excellent, yet atypical python web framework, spend an night cuddled up with:
This, is a good start. Make sure you do A little Werkzeug watching , Then check out
some classic WebOb. Maybe, if you feel the fire in the blood, and you might, wsgi is a bit flawed, but only to the gods, check out Flask
I'm not saying use it, Django is beautiful too, but if you don't know python, and you go through django, you run the risk of learning a framework.
WSGI is super straightforward. You'll find out about Paste, and Pastescript, and Pylons.
Then, make your decision. It'll be much easier learning stuff doing bare bones wsgi or Flask, stuff like variable assignment, using the interpreter, style concerns, testing, on 3 files for a couple of nights, instead of django. Take 2 nights. Then you'll see the great similarity between python web frameworks, instead of the differences. Hell, you might even roll with Flask.
Just some advice, I did the same thing with ruby, going in through Rails, and... well, strong words were said.
Language, then basic wsgi and testing, then pick your framework and roll
I'm new to python myself, and plan to get more in depth with it this year. I've had a few false starts at this, but always professional needs bring me back to PHP. The few times I've done some development, I've had really good experiences with web2py as a python framework. It's quite well done, and complete in features, while still being extremely lightweight. The database layer seems to be very flexible and mature.
As for TDD/BDD and the rest of your questions, I don't have any experience with python options, but would be interested to hear what others say.
I am using Twisted Framework based Nevow library for python based web app.
All your criteria fit into this single framework.
I know it's kinda subjective but, if you were to put yourself in my shoes which would you invest the time in learning?
I want to write a web app which deals securely with relatively modest amounts of peoples private data, a few thousand records of a few Kb each but stuff that needs to be kept safe, addresses, phone numbers etc. I've done several web projects in PHP/MYSQL and have decided, handy though it is I really don't like PHP and don't want to do another large project in it...
As such I figure I'd best learn something new and so I am considering 2 options (although I'll happily entertain others if you have suggestions). I'm having terrible trouble deciding though. They both look quite involved so rather than just jump in and potentially waste days getting up to speed enough on both of them to make an informed choice I thought I'd come here and canvas some opinion.
So the two options I'm considering are...
One of the PYTHON Web frameworks - TurboGears seems well regarded?
Advantage: Of all the languages I ever tried Python is by far and away my favorite. There's loads of frameworks to choose from and I have done quite a lot of non web python coding over the last few years.
Disadvantage: There's loads to choose from so it's hard to pick! Need to run single server process? or mod_python? which I don't like the sound of. What I do like is the notion of process separation and compartmentalization, i.e. if one users account is compromised it gives an attacker no leverage against the rest of the system. I'm not clear to what extent a python solution would handle that.
Writing it as a SEASIDE app Which I guess runs on a squeak app server?
Adv: From what I've heard it would permit good compartmentalization of users as each would have their own little private VM independent of all the systems other users which sounds wonderful from a security, scaling and redundancy standpoint.
Dis: I've not done any Smalltalk since Uni 15 years back and I never dug too deep into it then. I don't see much entry level help for seaside or that many projects using it. I suspect setting a server up to run it is hard for the same reason i.e. not because it's inherently hard but just cause there will be less help online and a presumption you are already rather au fait with Sqeak/Smalltalk.
So, what do people think? Would I be able to efficiently get the kind of strong separation and compartmentalization I'm after with a Python framework? Is Seaside as good as I think in terms of insulating users from each other? Might I be better off, security wise, sticking to the languages I'm most familiar with so I don't make any n00b mistakes or will Seaside be worth worth scaling the learning curve and prove more secure, comprehensible and maintainable in the long run? At the end of the day it's not a life or death decision and I can always bail if I start with one and then hate it so pls nobody get all holy language war and start flaming anyone! ;-)
Cheers for any replies this gets,
Roger :)
Disclaimer: I really don't like PHP, Python is nice, but doesn't come close to Smalltalk in my book. But I am a biased Smalltalker. Some answers about Seaside/Squeak:
Q: Which I guess runs on a squeak app server?
Seaside runs in several different Smalltalks (VW, Gemstone, Squeak etc). The term "app server" is not really used in Smalltalk country. :)
Q: From what I've heard it would permit good compartmentalization of users as each would have their own little private VM independent of all the systems other users which sounds wonderful from a security, scaling and redundancy standpoint.
Yes, each user has its own WASession and all UI components the user sees are instances living on the server side in that session. So sharing of state between sessions is something you must do explicitly, typically through a db.
Q: I've not done any Smalltalk since Uni 15 years back and I never dug too deep into it then. I don't see much entry level help for seaside or that many projects using it.
Smalltalk is easy to get going with and there is a whole free online book on Seaside.
Q: I suspect setting a server up to run it is hard for the same reason i.e. not because it's inherently hard but just cause there will be less help online and a presumption you are already rather au fait with Sqeak/Smalltalk.
No, not hard. :) In fact, quite trivial. Tons of help - Seaside ml, IRC on freenode, etc.
Q: Is Seaside as good as I think in terms of insulating users from each other?
I would say so.
Q: Might I be better off, security wise, sticking to the languages I'm most familiar with so I don't make any n00b mistakes or will Seaside be worth worth scaling the learning curve and prove more secure, comprehensible and maintainable in the long run?
The killer argument in favor of Seaside IMHO is the true component model. It really, really makes it wonderful for complex UIs and maintenance. If you are afraid of learning "something different" (but then you wouldn't even consider it in the first place I guess) then I would warn you. But if you are not afraid then you will probably love it.
Also - Squeak (or VW) is a truly awesome development environment - debugging live Seaside sessions, changing code in the debugger and resuming etc etc. It rocks.
Forget about mod_python, there is WSGI.
I'd recommend Django. It runs on any WSGI server, there are a lot to choose from. There is mod_wsgi for Apache, wsgiref - reference implementation included in Python and many more. Also Google App Engine is WSGI, and includes Django.
Django is very popular and it's community is rapidly growing.
I'd say take a look at Django. It's a Python framework with a ready-made authentication system that's independent of the hosting OS, which means that compromises are limited to the app that was compromised (barring some exploit against the web server hosting the Python process).
I've been getting into seaside myself but in many ways it is very hard to get started, which has nothing to do with the smalltalk which can be picked up extremely quickly. The challenge is that you are really protected from writing html directly.
I find in most frameworks when you get stuck on how to do something there is always a work around of solving it by using the template. You may later discover that this solution causes problems with clarity down the road and there is in fact a better solutions built into the framework but you were able to move on from that problem until you learned the right way to do it.
Seaside doesn't have templates so you don't get that crutch. No problems have permanently stumped me but some have taken me longer to solve than I would have liked. The flip side of this is you end up learning the seaside methodology much quicker because you can't cheat.
If you decide to go the seaside route don't be afraid to post to the seaside mailing list at squeakfoundation.org. I found it intimidating at first because you don't see a lot of beginner questions there due to the low traffic but people are willing to help beginners there.
Also there are a handful of seaside developers who monitor stackoverflow regularly. Good luck.
Have you taken a look at www.nagare.org ?
A framework particularly for web apps rather than web sites.
It is based around the Seaside concepts but you program in Python (nagare deploys a distribution of python called Stackless Python to get the continuations working).
Like Seaside it will auto generate HTML, but additionally can use templates as required.
It has been recently open sourced by http://www.net-ng.com/ who themselves have many years experience in delivering web apps/sites in quality web frameworks like zope and plone.
I am researching it myself at the moment to see if it fits my needs, so can't tell you what I think of it in the wild. If you take a look, please give your feedback.
While considering a Smalltalk web framework, look at Aida/Web as well. Aida has built-in security with user/group/role management and strong access control, which can help you a lot in your case. That way you can achieve safe enough separation of users at the user level in one image. But if you really want, you can separate them with running many images as well. But this brings increased maintenance and I'd think twice if it is worth.
I'm toying with Seaside myself and found this tutorial to be invaluable in gaining insight into the capabilities of the framework.
I think you've pretty much summed up the pros and cons. Seaside isn't that hard to set up (I've installed it twice for various projects) but using it will definitely affect how you work--in addition to re-learning the language you'll probably have to adjust lots of assumptions about your work flow.
It also depends on two other factors
If other people will eventually be maintaining it, you'll have better luck finding python programmers
If you are doing a highly stateful site, Seaside is going to beat the pants off any other framework I've seen.
There is now an online book on Seaside to complete the tutorial pointed out earlier.
This is a follow-up to two questions I asked a week or so back. The upshot of those was that I was building a prototype of an AI-based application for the web, and I wondered what language(s) to use. The conclusion seemed to be that I should go for something like python and then convert any critical bits into something faster like Java or C/C++.
That sounds fine to me, but I'm wondering now whether python is really the right language to use for building a web application. Most web applications I've worked on in the past were C/C++ CGI and then php. The php I found much easier to work with as it made linking the user interface to the back-end so much easier, and also it made more logical sense to me.
I've not used python before, but what I'm basically wondering is how easy is CGI programming in python? Will I have to go back to the tedious way of doing it in C/C++ where you have to store HTML code in templates and have the CGI read them in and replace special codes with appropriate values or is it possible to have the templates be the code as with php?
I'm probably asking a deeply ignorant question here, for which I apologise, but hopefully someone will know what I'm getting at! My overall question is: is writing web applications in python a good idea, and is it as easy as it is with php?
Python is a good choice.
I would avoid the CGI model though - you'll pay a large penalty for the interpreter launch on each request. Most Python web frameworks support the WSGI standard and can be hooked up to servers in a myriad of ways, but most live in some sort of long-running process that the web server communicates with (via proxying, FastCGI, SCGI, etc).
Speaking of frameworks, the Python landscape is ripe with them. This is both good and bad. There are many fine options but it can be daunting to a newcomer.
If you are looking for something that comes prepackaged with web/DB/templating integration I'd suggest looking at Django, TurboGears or Pylons. If you want to have more control over the individual components, look at CherryPy, Colubrid or web.py.
As for whether or not it is as "easy as PHP", that is subjective. Usually it is encouraged to keep your templates and application logic separate in the Python web programming world, which can make your life easier. On the other hand, being able to write all of the code for a page in a PHP file is another definition of "easy".
Good luck.
"how easy is CGI programming in python?" Easier than C, that's for sure. Python is easier because -- simply -- it's an easier language to work with than C. First and foremost: no memory allocation-deallocation. Beyond that, the OO programming model is excellent.
Beyond the essential language simplicity, the Python WSGI standard is much easier to cope with than the CGI standard.
However, raw CGI is a huge pain when compared with the greatly simplified world of an all-Python framework (TurboGears, CherryPy, Django, whatever.)
The frameworks impose a lot of (necessary) structure. The out-of-the-box experience for a CGI programmer is that it's too much to learn. True. All new things are too much to learn. However, the value far exceeds the investment.
With Django, you're up and running within minutes. Seriously. django-admin.py startproject and you have something you can run almost immediately. You do have to design your URL's, write view functions and design page templates. All of which is work. But it's less work than CGI in C.
Django has a better architecture than PHP because the presentation templates are completely separated from the processing. This leads to some confusion (see Syntax error whenever I put python code inside a django template) when you want to use the free-and-unconstrained PHP style on the Django framework.
linking the user interface to the back-end
Python front-end (Django, for example) uses Python view functions. Those view functions can contain any Python code at all. That includes, if necessary, modules written in C and callable from Python.
That means you can compile a CLIPS module with a Python-friendly interface. It becomes something available to your Python code with the import statement.
Sometimes, however, that's ineffective because your Django pages are waiting for the CLIPS engine to finish. An alternative is to use something like a named pipe.
You have your CLIPS-based app, written entirely in C, reading from a named pipe. Your Django application, written entirely in Python, writes to that named pipe. Since you've got two independent processes, you'll max out all of your cores pretty quickly like this.
I would suggest Django, but given that you ask for something "as easy as it is with php" then you must take a look at PSP (Python Server Pages).
While Django is a complete framework for doing websites, PSP can be used in the same way than PHP, without any framework.
It is easier to write web-apps in python than it's in php. Particularly because python is not a broken language.
Pick up some web framework that supports mod_wsgi or roll out your own. WSGI apps are really easy to deploy after you get a hold from doing it.
If you want templates then genshi is about the best templating engine I've found for python and you can use it however you like.
Is Django a good choice for a security critical application?
I am asking this because most of the online banking software is built using Java. Is there any real reason for this?
Actually, the security in Java and Python is the same. Digest-only password handling, cookies that timeout rapidly, careful deletion of sessions, multi-factor authentication. None of this is unique to a Java framework or a Python framework like Django.
Django, indeed, has a security backend architecture that allows you to add your own LDAP (or AD) connection, possibly changing the digest technique used.
Django has a Profile model where you can keep additional authentication factors.
Django offers a few standard decorators for view function authorization checking. Since Python is so flexible, you can trivially write your own decorator functions to layer in different or additional authentication checking.
Security is a number of first-class features in Django.
Probably the reason behind Java is not in the in the security. I think Java is more used in large development companies and banks usually resort to them for their development needs (which probably are not only related to the web site but creep deeper in the backend).
So, I see no security reasons, mostly cultural ones.
The reasons for building banking apps in Java are not related to security, at least IMHO. They are related to:
Java is the COBOL of the 21st century, so there is a lot of legacy code that would have to be rewritten and that takes time. Basically banking apps are old apps, they were built in java some ten years ago and nobody wants to throw all the old code away (which BTW is almost always a bad decision),
some people believe that a static typed language is somewhat "safer" than the dynamic typed language. This is, IMHO, not true (take for instance collections prior to Java 5).
I find your connection between Java and banking wrong ended.
Most Banking Software has terrible security. And much banking software is written in Java. Does ths mean Java makes it more difficult to write secure software than other languages?
Probably it's not Java's fault that there is so little quality security (and safety) wise in Banking software. Actually, like the other posters mention, the choice of your Language usually has very little consequences for your security - unless you select one of the few languages where only hotshot coders can write secure code in (C and PHP come to mind).
Many huge E-Commerce sites are written in Python, Ruby and Perl using various frameworks. And I would argue that the security requirements for merchants are much higher than the requirements of the banking industry. That is because merchants have to provide security and good user experience, while banking customers are willing to put up with unusable interfaces SecureID tokens and whatever.
So yes: Django is up to the task.
You should not rely the security of the application on the framework. even though Django does come in with a pretty good number of measures against classical security issues, it can not guarantee that your application will be secure, you need much more than a programming Framework to get a security critical application.
I'd say yes, Django is a good choice as long as you know its powers and limitations and are aware of the security flaws of every application.
You can build a secure application with Django just as you can with any popular Java framework. One part where Java does shine is its extensive cryptographic library.
For the minimal encryption tasks that are required by Django, Python’s cryptographic services are sufficient, however its lack of strong block ciphers make the encryption mechanism in Django insecure for data at rest.
Python does natively support secure hashing algorithms to include SHA1, SHA224,
SHA256, SHA384, and SHA512, however Django’s authentication mechanism has yet
to be updated to use anything other than SHA1, making it potentially vulnerable to cryptographic analysis.
Are you referring to the fact that the complete application is built in Java, or just the part you see in your browser? If the latter, the reason is probably because in the context of webpages, Java applets can be downloaded and run.