I want to start writing a http proxy that will modify responses according to some rules/filters I will configure. However, before I start coding it, I want to make sure I'm making the right choice in going with Python. Later, this tool would have to be able to process a lot of requests, so, I would like to know I can count on it later on to be able to perform when "push comes to shove".
As long as the bulk of the processing uses Python's built-in modules it should be fine as far as performance. The biggest strength of Python is its clear syntax and ease of testing/maintainability. If you find that one section of your code is slowing down the process, you can rewrite that section and use it as a C module, while keeping the bulk of your control code in Python.
However if you're looking to make the most optimized Python Code you may want to check out this SO post.
Yes, I think you will find Python to be perfectly adequate for your needs. There's a huge number of web frameworks, WSGI libraries, etc. to choose from, or learn from when building your own.
There's an interesting post on the Python History blog about how Python was supporting high performance websites in 1996.
This will depend on the library you use more than the language itself. The twisted framework is known to scale well.
Here's a proxy server example in python/twisted to get you started.
Bottomline: choose your third party tools wisely and I'm sure you'll be fine.
Python performs pretty well for most tasks, but you'll need to change the way you program if you're used to other languages. See Python is not Java for more info.
If plain old CPython doesn't give the performance you need, you have other options as well.
As has been mentioned, you can extend it in C (using a tool like swig or Pyrex). I also hear good things about PyPy as well, but bear in mind that it uses a restricted subset of Python. Lastly, a lot of people use psyco to speed up performance.
Related
Recently i have developed a billing application for my company with Python/Django. For few months everything was fine but now i am observing that the performance is dropping because of more and more users using that applications. Now the problem is that the application is now very critical for the finance team. Now the finance team are after my life for sorting out the performance issue. I have no other option but to find a way to increase the performance of the billing application.
So do you guys know any performance optimization techniques in python that will really help me with the scalability issue
Guys we are using mysql database and its hosted on apache web server on Linux box. Secondly what i have noticed more is the over all application is slow and not the database transactional part. For example once the application is loaded then it works fine but if they navigate to other link on that application then it takes a whole lot of time.
And yes we are using HTML, CSS and Javascript
As I said in comment, you must start by finding what part of your code is slow.
Nobody can help you without this information.
You can profile your code with the Python profilers then go back to us with the result.
If it's a Web app, the first suspect is generally the database. If it's a calculus intensive GUI app, then look first at the calculations algo first.
But remember that perf issues car be highly unintuitive and therefor, an objective assessment is the only way to go.
ok, not entirely to the point, but before you go and start fixing it, make sure everyone understands the situation. it seems to me that they're putting some pressure on you to fix the "problem".
well first of all, when you wrote the application, have they specified the performance requirements? did they tell you that they need operation X to take less than Y secs to complete? Did they specify how many concurrent users must be supported without penalty to the performance? If not, then tell them to back off and that it is iteration (phase, stage, whatever) one of the deployment, and the main goal was the functionality and testing. phase two is performance improvements. let them (with your help obviously) come up with some non functional requirements for the performance of your system.
by doing all this, a) you'll remove the pressure applied by the finance team (and i know they can be a real pain in the bum) b) both you and your clients will have a clear idea of what you mean by "performance" c) you'll have a base that you can measure your progress and most importantly d) you'll have some agreed time to implement/fix the performance issues.
PS. that aside, look at the indexing... :)
A surprising feature of Python is that the pythonic code is quite efficient... So a few general hints:
Use built-ins and standard functions whenever possible, they're already quite well optimized.
Try to use lazy generators instead one-off temporary lists.
Use numpy for vector arithmetic.
Use psyco if running on x86 32bit.
Write performance critical loops in a lower level language (C, Pyrex, Cython, etc.).
When calling the same method of a collection of objects, get a reference to the class function and use it, it will save lookups in the objects dictionaries (this one is a micro-optimization, not sure it's worth)
And of course, if scalability is what matters:
Use O(n) (or better) algorithms! Otherwise your system cannot be linearly scalable.
Write multiprocessor aware code. At some point you'll need to throw more computing power at it, and your software must be ready to use it!
before you can "fix" something you need to know what is "broken". In software development that means profiling, profiling, profiling. Did I mention profiling. Without profiling you don't know where CPU cycles and wall clock time is going. Like others have said to get any more useful information you need to post the details of your entire stack. Python version, what you are using to store the data in (mysql, postgres, flat files, etc), what web server interface cgi, fcgi, wsgi, passenger, etc. how you are generating the HTML, CSS and assuming Javascript. Then you can get more specific answers to those tiers.
You may be interested in this document I've found some time ago.
As personal advice, be as more pythonic as you can: lazy evaluation is the keyword, so learn to use iterators and generators.
For the type of application you are describing (a web application probably backed by a database) your performance problems are unlikely to be language specific. They are far more likely to stem from design or architecture issues, though they could be simple coding problems too.
To sort this out you need to figure out where the bottlenecks are in your application and for that you need some sort of profiler.
Once you have found your bottlenecks you will be in a much better position. You can evaluate then problem areas for common issues including:
Design and Architecture issues
SQL anti-patterns
Incorrect usage of your framework (perhaps relying on inappropriate defaults)
Badly structured algorithms
The specifics of any solution are going to depend on the specifics of the bottlenecks your find.
http://wiki.python.org/moin/PythonSpeed/PerformanceTips
I optimized some python code a while back, the most surprising thing to me was how much each function call costs. If you minimize function calls or replace loops with builtins you'll be running much faster.
There are some great suggestions hereā¦ So let me suggest an implementation detail. I have found the runprofileserver command found in django-command-extensions very convenient for profiling my Django code.
I am not sure if this would solve the problem but you should have a look at psyco
I am thinking of creating a video library software which keep track of all my videos and keep track of videos that I already haven't watched and stats like this. The stats will be specific to each user using the software.
My question is, is python appropriate to create this software or do I need something like c++.
Python is perfectly appropriate for such tasks - indeed the most popular video site, YouTube, is essentially programmed in Python (using, of course, lower-level components called from Python for such tasks as web serving, relational db, video transcoding -- there are plenty of such reusable opensource components for all these kinds of tasks, but your application's logic flow and all application-level logic can perfectly well be in Python).
Just yesterday evening, at the local Python interest group meeting in Mountain View, we had new members who just moved to Silicon Valley exactly to take Python-based jobs in the video industry, and they were saying that professional level video handing in the industry is also veering more and more towards Python -- stalwarts like Pixar and ILM had been using Python forever, but in the last year or two it's been a flood of Python adoption in the industry.
If you want your code to be REAL FAST, use C++ (or parallel fortran).
However in your application, 99% of the runtime isn't going to be in YOUR code, it's going to be in GUI libraries, OS calls, waiting for user interaction, calling libraries (written in C) to open video files and make thumbnails, that kind of stuff.
So using C++ will make your code 100 times faster, and your application will, as a result, be 1% faster, which is utterly useless. And if you write it in C++ you'll need months, whereas using Python you'll be finished much faster and have lots more fun.
Using C++ could even make it a lot slower, because in Python you can very easily build more scalable algorithms by using super powerful primitives like hashes, sets, generators, etc, try several algorithms in 5 minutes to see which one is the best, import a library which already does 90% of the work, etc.
Write it in Python.
Yes. Python is much easier to use than c++ for something like this. You may want to use it as a front-end to a DB such as sqlite3
Maybe you should take a look at this project:
Moovida
It's a complete media center, open source, written in python that is easy to extend. I don't know if it will do exactly what you want out of the box but you can probably easily add the features you want.
Of course you can use almost any programming language for almost any task. But after noting that, it's also obvious that different languages are also differently well adapted for different tasks.
C/C++ are languages that are very "hardware friendly". Basically, the languages are just one abstraction level above assembler, with C's use of pointers etc. C++ is almost like a (semi-)portable object oriented assembler, if one wants to be funny. :) This makes C/C++ fast and good at talking to hardware.
But those same features become mis-features in other cases. The pointers make it possible to walk all over the memory and unless you are careful you will leak memory all over the place. So I would say (and now C people will get angry) that C/C++ in fact are directly inappropriate for what you want to do.
You want a language that are higher, does more things like memory management automatically and invisibly. There are many to choose from there, but without a doubt Python is eminently suited for this. Python has the last couple of years emerged as The New Cool Language to write these kind of softwares in, and much multimedia software such as Freevo and the already mentioned Moovida are written in Python.
I recall when I first read Pragmatic Programmer that they suggested using scripting languages to make you a more productive programmer.
I am in a quandary putting this into practice.
I want to know specific ways that using Python or Ruby can make me a more productive .NET developer.
One specific way per answer, and even better if you can say whether I could use Python or Ruby or Both for it.
See standard format below.
IronPython / IronRuby
IronPython in Action will do a better job explaining this (and exactly how best to use IronPython) that can possibly be accommodated in a SO answer. I'm biased -- I was a tech reviewer and am a friend of one of the authors -- but objectively think it's a great book. (No idea if IronRuby is blessed with a similarly wonderful book, yet).
As you want "one specific way per answer" (incompatible with SO, which STRONGLY discourages a poster posting 25 different answers if they have 25 "specific ways" to indicate...!-): prototyping in order to explore some specific assembly or collection thereof that you're unfamiliar with (to check if you've understood their docs right and how to perform certain tasks) is an order of magnitude more productive in IronPython than in C#, as you can explore interactively and compilation is instantaneous and as-needed. (Have not tried IronRuby but I'll assume it can work in a roughly equivalent way and speed).
Less Code
I think productivity is direct result on how proficient you are in a specific language. That said the terseness of a language like Python might save some time on getting certain things done.
If I compare how much less code I have to write for simple administration scripts (e.g. clean-up of old files) compared to .NET code there is certain amount of productivity gain. (Plus it is more fun which also helps getting the job done)
Advanced Text Processing
Traditional strengths of awk and perl. You can just glue together a bunch of regular expressions to create a simple data-mining system on the go.
Learning a new language gives you knowledge that you can bring back to any programming language. Here are some things you'd learn.
Add functionality to your objects on the fly.
Mix in modules.
Pass a chunk of code around.
Figure out how to do more with less code: ruby -e "puts 'hello world'"
C# can do some of these things, but a fresh perspective might bring you one step closer to automating your breakfast.
Embedding a script engine
Use of IronPython for a scripting engine inside your .NET application. For example enabling end-users of your application to change customizable parts with a full fledge language such as Python.
A possible example might be to expose custom logic to end-users for a work flow engine.
Quick Prototyping - Both
In the simplest cases when firing a python interpreter and writing a line or two is way faster than creating a new project in visual studio.
And you can use ruby to. Or lua, or evel perl, whatever. The point is implicit typing and light-weight feel.
Cross platform
Compared to .NET a simple script Python is more easily ported to other platforms such as Linux. Although possible to achieve the same with the likes of Mono it simpler to run a Python script file on different platforms.
Processing received Email
Python has built-in support for POP3 and IMAP where the standard .NET framework doesn't. Useful for automating email triggered tasks.
I am considering programming the network related features of my application in Python instead of the C/C++ API. The intended use of networking is to pass text messages between two instances of my application, similar to a game passing player positions as often as possible over the network.
Although the python socket modules seems sufficient and mature, I want to check if there are limitations of the python module which can be a problem at a later stage of the development.
What do you think of the python socket module :
Is it reliable and fast enough for production quality software ?
Are there any known limitations which can be a problem if my app. needs more complex networking other than regular client-server messaging ?
Thanks in advance,
Paul
Check out Twisted, a Python engine for Networking. Has built-in support for TCP, UDP, SSL/TLS, multicast, Unix sockets, a large number of protocols (including HTTP, NNTP, IMAP, SSH, IRC, FTP, and others)
Python is a mature language that can do almost anything that you can do in C/C++ (even direct memory access if you really want to hurt yourself).
You'll find that you can write beautiful code in it in a very short time, that this code is readable from the start and that it will stay readable (you will still know what it does even after returning one year later).
The drawback of Python is that your code will be somewhat slow. "Somewhat" as in "might be too slow for certain cases". So the usual approach is to write as much as possible in Python because it will make your app maintainable. Eventually, you might run into speed issues. That would be the time to consider to rewrite a part of your app in C.
The main advantages of this approach are:
You already have a running application. Translating the code from Python to C is much more simple than write it from scratch.
You already have a running application. After the translation of a small part of Python to C, you just have to test that small part and you can use the rest of the app (that didn't change) to do it.
You don't pay a price upfront. If Python is fast enough for you, you'll never have to do the optional optimization.
Python is much, much more powerful than C. Every line of Python can do the same as 100 or even 1000 lines of C.
To answer #1, I know that among other things, EVE Online (the MMO) uses a variant of Python for their server code.
The python that EVE online uses is StacklessPython (http://www.stackless.com/), and as far as i understand they use it for how it implements threading through using tasklets and whatnot. But since python itself can handle stuff like MMO with 40k people online i think it can do anything.
This bad answer and not really an answer to your question, rather addition to previous answer.
Alan.
What makes Python stand out for use in web development? What are some examples of highly successful uses of Python on the web?
Django is, IMHO, one of the major benefits of using Python. Model your domain, code your classes, and voila, your ORM is done, and you can focus on the UI. Add in the ease of templating with the built-in templating language (or one of many others you can use as well), and it becomes very easy to whip up effective web applications in no time. Throw in the built-in admin interface, and it's a no-brainer.
Certainly one successful use of Python on the web is Google App Engine. Site authors write code in (a slightly restricted subset of) Python, which is then executed by the App Engine servers in a distributed and scalable manner.
Quotes about Python:
"Python is fast enough for our site
and allows us to produce maintainable
features in record times, with a
minimum of developers," said Cuong Do,
Software Architect, YouTube.com.
YouTube uses a lot of Python and is probably the best example of a Python success story.
A great example of a Django success story is the Washington Post, who recently shared a big list of applications they have developed:
http://push.cx/2009/washington-post-update
www.lawrence.com and www.ljworld.com are two of the first sites to use Django (before it was even open source).
djangositeoftheweek.com has a bunch of good case studies.
www.everyblock.com is another great example.
Finally, http://www.djangosites.org/ links to nearly 2,000 other Django powered sites.
Short anwser: the diversity of tools readily available and freedom of choice.
This sounds like a simple question but which it really isn't. While Python is very good for web development and this has been shown by the, oh so famous, Google App Engine, Plone and Django. One has to point out that the development way in Python requires a lot more from the developer than PHP but it gives a lot more to the mix as well.
The entry level on actually producing something is higher. This is because there are bunch of different tools for doing web development with Python. Choosing the web development framework can be a hard decision for an inexperienced developer.
Having a lot of different tools is a two edged sword. To some extent it brings you the freedom of choice to pick the one you might want but then again how do you really know which one is good for what you're doing. This brings me to my point. Python stands out from the mass by not having a standard or de facto web development library. While this is pretty much against the principle of having only one simple way of doing on thing it also brings us a wide variety of different tools with different kind of design choices. At first this might feel very frustrating because it would be so much easier if somebody had made the choice for you but now that you're left to make the choice you actually might have to think about what you're doing and what would fit. ...or you might just end up picking one and blowing your head off after you've realized that you made the wrong choice. Anyway you end up, you've made the choice and no one else.
Furthermore,
Python is both strong in web and in data analytics and machine learning. For example scikit, sci-py and numpy are very strong. In some cases, it can be very interesting to have the both elements on the same server.
For example http://rankmytweet.com uses this a lot.
trac(bug tracker) and moinmoin(wiki) are too web based python tools that I find invaluable.
GNU Mailman is another project written in python that is widely successful.
As many have pointed out, Django is a great reason to use Python...so in order to figure out why Python is great for web development, the best bet is to look at why it is a good language to build a framework like Django.
IMHO Python combines the cleanest, or at least one of the cleanest, metaprogramming models of any language with a very pure object orientation. This not only makes it possible to write extremely general abstractions that are easy to use, but also allows the abstractions to combine relatively cleanly with others. This is harder to do in languages that take a code-generation based approach to metaprogramming (e.g. Ruby).
Dynamic languages are in general good for web apps because the speed of development. Python in particular has two advantages over most of them:
"batteries included" means lots of available libraries
Django. For me this is the only reason why i use Python instead of Lua (which i like a lot more).
Besides the frameworks...
Python's pervasive support for Unicode should make i18n much smoother.
A sane namespace system makes debugging much nicer, because it's typically easier to find where things are defined.
Python's inability to function as a standalone templating language should discourage the mixture of HTML with model code
Great standard library
Other examples of Python sites are Reddit and YouTube.