The Ruby folks have Ferret. Does anyone know of a similar initiative for Python? We're currently using PyLucene, but I'd like to investigate moving to pure-Python searching.
Whoosh is a new project that is similar to Lucene but written in pure Python.
The only pure-Python search solution (not even involving a C extension) that I know of is Nucular. It's slow (much slower than PyLucene) and not yet stable.
We moved from PyLucene-based home baked search and indexing to Solr but YMMV.
I recently found pyndexter. It provides an abstract interface to various backend full-text search engines/indexers, and it ships with a default pure-Python implementation.
These things can be disastrously slow though in Python.
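To give a sense of what a pure-Python engine has to do internally, here is a minimal sketch of an in-memory inverted index with AND queries. It is not how Whoosh, Nucular or pyndexter are implemented; `TinyIndex` and `tokenize` are made-up names, and stemming, ranking and persistence are all left out.

```python
import re
from collections import defaultdict

TOKEN_RE = re.compile(r"\w+")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN_RE.findall(text.lower())


class TinyIndex:
    """Hypothetical in-memory inverted index: term -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return the ids of documents containing every term in the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        # .get avoids creating empty postings entries for unknown terms.
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


index = TinyIndex()
index.add(1, "Ferret is a search library for Ruby")
index.add(2, "PyLucene wraps Lucene for Python")
index.add(3, "Xapian has Python bindings")
hits = index.search("python bindings")
```

Every lookup and set intersection here runs in the interpreter, which is roughly where the speed gap to C or Java engines comes from once the index gets large.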
For some applications pure Python is overrated. Take a look at Xapian.
lupy was a port of Lucene to pure Python. The lupy people suggest that you use PyLucene. Sorry. Maybe you can use the Java sources in combination with Jython.
+1 to the Xapian and Pyndexter answers.
Ferret is actually written in C with Ruby bindings on top. A pure Ruby search engine would be even slower than a pure Python one. I would love to see "someone else" write a Cython/Pyrex layer providing a Python interface to Ferret, but I won't do it myself, because why bother when there are Python bindings for Xapian.
For non-pure Python, Sphinx Search with its Python API is the fastest. According to benchmarks on multiple blogs, Sphinx Search is much faster than Lucene, uses much less memory, and is written in C.
I am developing a multi-document search engine based on it, using Python and web2py as the framework.
After weeks of searching for this, I found a nice Python solution: repoze.catalog. It's not strictly Python-only because it uses ZODB for storage, but it seems a better dependency to me than something like SOLR.
Related
I do understand that this topic has been covered in some way on StackOverflow, but I'm still not able to figure out the exact answer: can I treat IronPython as a Pythonic replacement for C#?
I use CPython every day, I love the Zen :) but my current task is a Windows-only application with a complex GUI and some other features which I would like to implement using .NET.
IronPython is NOT equivalent to "other languages that run on .NET", as the language has support for substantially fewer CLR runtime features.
IronPython classes are not "real" .NET classes, and DLR APIs need to be used when calling IronPython code from traditional CLR-based languages; this means that if you want genuinely easy interoperability, you're stuck writing glue to "hide" the DLR.
Boo is a much more complete Pythonically-inspired language targeting the CLR. Its (dynamically inferrable) static typing (which can be replaced with duck typing on a variable-by-variable basis) also allows libraries written in Boo to be natively used from C# and other CLR-based languages, without needing to make any allowances for the language in use.
That depends on what it is about C# that you need, and which needs replacing.
If the reason you use C# is that you need a reasonably high performance statically typed language then no, IronPython is likely not going to be a replacement.
If the reason you use it is simply "I need something that runs on .NET and can access .NET libraries", then yes, any language that runs on .NET can be used to replace it.
If you use C# because you're working with a team of programmers who only know C-like languages, C# might also be difficult to replace with IronPython.
It depends on what characteristics about C# it is that you care about, and need to find replacements for.
One thing to consider (and I have no idea how IronPython behaves in that respect) is Common Language Specification (CLS) compliance for assemblies. CLS compliance guarantees that any .NET language can access a compiled DLL of your code. That means, for example, that in C# you cannot have any public or protected method or parameter that is an unsigned integer. I have no idea how easy it is to achieve CLS-compliant code in IronPython; an interesting blog entry that I found dates from 2008.
The question "Can I treat IronPython as a Pythonic replacement for C#?" has been answered pretty well by jalf. If the question were "Is IronPython a Pythonic .NET Language?" though, then the answer would absolutely be yes. The principles of Zen - esp. least surprise - absolutely apply to IronPython's integration with the CLR as well.
I think if you were to do that, it would be easier in .NET 4.0. I think you can use the newly released IronPython 2.6.1 for .NET 4.0.
It is already easy to use C# in IronPython.
As you can see here, you can easily do it the other way around (using IronPython from C#) with .NET 4.0.
I think it's possible, but I agree Boo is a safer way to go.
"Complex" UI usually entails not "writing" it but building it within Visual Studio with point and click. All the callbacks and event-handling code are generated for you. There is almost nothing like that on the Python side. I'd say go straight for C#.
There is one nagging thing though. If you are a true Pythonista, the static typing will get to you very quickly and you will want to start throwing heavy objects at random people.
If that point comes, think about building out the UI with C# and embedding IronPython as a scripting engine for implementing your business logic. That could be a tolerable middle ground.
I recall when I first read Pragmatic Programmer that they suggested using scripting languages to make you a more productive programmer.
I am in a quandary putting this into practice.
I want to know specific ways that using Python or Ruby can make me a more productive .NET developer.
One specific way per answer, and even better if you can say whether I could use Python or Ruby or Both for it.
See standard format below.
IronPython / IronRuby
IronPython in Action will do a better job explaining this (and exactly how best to use IronPython) than can possibly be accommodated in a SO answer. I'm biased -- I was a tech reviewer and am a friend of one of the authors -- but objectively think it's a great book. (No idea if IronRuby is blessed with a similarly wonderful book, yet).
As you want "one specific way per answer" (incompatible with SO, which STRONGLY discourages a poster posting 25 different answers if they have 25 "specific ways" to indicate...!-): prototyping in order to explore some specific assembly or collection thereof that you're unfamiliar with (to check if you've understood their docs right and how to perform certain tasks) is an order of magnitude more productive in IronPython than in C#, as you can explore interactively and compilation is instantaneous and as-needed. (Have not tried IronRuby but I'll assume it can work in a roughly equivalent way and speed).
Less Code
I think productivity is a direct result of how proficient you are in a specific language. That said, the terseness of a language like Python might save some time in getting certain things done.
If I compare how much less code I have to write for simple administration scripts (e.g. cleaning up old files) with the equivalent .NET code, there is a certain amount of productivity gain. (Plus it is more fun, which also helps get the job done.)
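As an illustration of the kind of administration script meant here, a rough sketch of an old-file clean-up in Python. The function name, parameters and the dry-run default are my own choices for the example, not anything prescribed.

```python
import os
import time


def remove_old_files(directory, max_age_days, dry_run=True):
    """Find regular files in `directory` not modified for `max_age_days` days.

    Returns the sorted list of matching paths. Files are only deleted
    when dry_run is False.
    """
    cutoff = time.time() - max_age_days * 86400
    matched = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if not entry.is_file(follow_symlinks=False):
                continue
            if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                matched.append(entry.path)
    if not dry_run:
        for path in matched:
            os.remove(path)
    return sorted(matched)
```

Calling `remove_old_files("/some/dir", 30)` only reports what would go; pass `dry_run=False` once the list looks right.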
Advanced Text Processing
Traditional strengths of awk and perl. You can just glue together a bunch of regular expressions to create a simple data-mining system on the go.
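A small sketch of what "gluing regular expressions together" can look like in Python. The log format and host naming scheme below are invented for the example.

```python
import re
from collections import Counter

# Hypothetical log lines, made up for this example.
LOG_LINES = [
    "2009-03-01 12:00:01 ERROR disk full on /dev/sda1",
    "2009-03-01 12:00:05 INFO backup finished",
    "2009-03-01 12:01:17 ERROR timeout talking to db01",
    "2009-03-01 12:02:00 WARN retrying db01",
]

# One expression splits a line into fields, another mines the message text.
LINE_RE = re.compile(r"^(?P<date>\S+) (?P<time>\S+) (?P<level>[A-Z]+) (?P<message>.*)$")
HOST_RE = re.compile(r"\b(db\d+)\b")

levels = Counter()
hosts = Counter()
for line in LOG_LINES:
    match = LINE_RE.match(line)
    if not match:
        continue  # skip lines that do not fit the expected format
    levels[match.group("level")] += 1
    hosts.update(HOST_RE.findall(match.group("message")))
```

After the loop, `levels` holds a count per log level and `hosts` a count per database host mentioned in the messages.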
Learning a new language gives you knowledge that you can bring back to any programming language. Here are some things you'd learn.
Add functionality to your objects on the fly.
Mix in modules.
Pass a chunk of code around.
Figure out how to do more with less code: ruby -e "puts 'hello world'"
C# can do some of these things, but a fresh perspective might bring you one step closer to automating your breakfast.
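The list above is Ruby-flavoured, but the same ideas can be sketched in Python. All class and function names here are made up for illustration, and Python's closest equivalent to mixing in a module is a mixin class.

```python
import types


class Greeter:
    def __init__(self, name):
        self.name = name


# 1. Add functionality on the fly: attach a method to an existing class...
def shout(self):
    return self.name.upper() + "!"


Greeter.shout = shout

# ...or to one instance only.
bob = Greeter("bob")
bob.whisper = types.MethodType(lambda self: self.name.lower() + "...", bob)


# 2. Mix in behaviour through a small mixin class.
class ReprMixin:
    def __repr__(self):
        return "<%s %s>" % (type(self).__name__, vars(self))


class Point(ReprMixin):
    def __init__(self, x, y):
        self.x, self.y = x, y


# 3. Pass a chunk of code around: functions are ordinary values.
def apply_twice(func, value):
    return func(func(value))


result = apply_twice(lambda n: n * 3, 2)
```

Here `bob.shout()` works because the class was patched, while `whisper` exists only on `bob` and not on other `Greeter` instances.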
Embedding a script engine
Use IronPython as a scripting engine inside your .NET application, for example to let end users of your application change customizable parts with a full-fledged language such as Python.
A possible example might be to expose custom logic to end-users for a work flow engine.
Quick Prototyping - Both
In the simplest cases, firing up a Python interpreter and writing a line or two is way faster than creating a new project in Visual Studio.
And you can use Ruby too. Or Lua, or even Perl, whatever. The point is implicit typing and a lightweight feel.
Cross platform
Compared to .NET code, a simple Python script is more easily ported to other platforms such as Linux. Although it is possible to achieve the same with the likes of Mono, it is simpler to run a Python script file on different platforms.
Processing received Email
Python has built-in support for POP3 and IMAP (the poplib and imaplib modules), whereas the standard .NET Framework doesn't. This is useful for automating email-triggered tasks.
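A rough sketch of the POP3 side using the standard library in CPython. The host and credentials are whatever your mail server needs; `subject_of` and `fetch_subjects` are names I made up, and real code would want error handling around the network calls.

```python
import poplib
from email import message_from_bytes
from email.header import decode_header, make_header


def subject_of(raw_message):
    """Return the decoded Subject header of a raw RFC 822 message (bytes)."""
    message = message_from_bytes(raw_message)
    return str(make_header(decode_header(message.get("Subject", ""))))


def fetch_subjects(host, username, password):
    """Return the subject of every message in a POP3 mailbox, over SSL."""
    mailbox = poplib.POP3_SSL(host)
    try:
        mailbox.user(username)
        mailbox.pass_(password)
        count, _size = mailbox.stat()
        subjects = []
        for number in range(1, count + 1):
            # retr() returns (response, list of line bytes, octet count).
            _response, lines, _octets = mailbox.retr(number)
            subjects.append(subject_of(b"\r\n".join(lines)))
        return subjects
    finally:
        mailbox.quit()
```

A task runner could poll `fetch_subjects(...)` and trigger a job whenever a subject matches some agreed keyword.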
I want to start writing a http proxy that will modify responses according to some rules/filters I will configure. However, before I start coding it, I want to make sure I'm making the right choice in going with Python. Later, this tool would have to be able to process a lot of requests, so, I would like to know I can count on it later on to be able to perform when "push comes to shove".
As long as the bulk of the processing uses Python's built-in modules, it should be fine as far as performance goes. The biggest strength of Python is its clear syntax and ease of testing/maintainability. If you find that one section of your code is slowing down the process, you can rewrite that section as a C module, while keeping the bulk of your control code in Python.
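Before rewriting anything in C, it helps to measure which section is actually slow. A minimal sketch with the standard cProfile module; `slow_filter` and `handle_request` are stand-ins I invented for the proxy's real code.

```python
import cProfile
import io
import pstats


def slow_filter(body):
    """Deliberately naive: builds the result one character at a time."""
    out = ""
    for ch in body:
        out += ch.upper()
    return out


def handle_request(body):
    return slow_filter(body)


profiler = cProfile.Profile()
profiler.enable()
for _ in range(200):
    handle_request("x" * 1000)
profiler.disable()

# Collect the five most expensive entries by cumulative time into a string.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

Whatever dominates the report is the candidate for a C extension; everything else can usually stay in Python.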
However if you're looking to make the most optimized Python Code you may want to check out this SO post.
Yes, I think you will find Python to be perfectly adequate for your needs. There's a huge number of web frameworks, WSGI libraries, etc. to choose from, or learn from when building your own.
There's an interesting post on the Python History blog about how Python was supporting high performance websites in 1996.
This will depend on the library you use more than the language itself. The Twisted framework is known to scale well.
Here's a proxy server example in Python/Twisted to get you started.
Bottom line: choose your third-party tools wisely and I'm sure you'll be fine.
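For a feel of the overall shape without any third-party dependency, here is a rough standard-library sketch of a response-rewriting proxy. This is not the Twisted example, and it is not meant to scale; the filter rules and the `tracker.example` URL are invented, it handles plain-HTTP GET only, and upstream error statuses are not handled.

```python
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical rules: each filter takes and returns the response body as bytes.
FILTERS = [
    lambda body: body.replace(b"http://tracker.example/", b""),
    lambda body: body.replace(b"colour", b"color"),
]


def apply_filters(body, filters=FILTERS):
    """Run the body through every configured filter, in order."""
    for rule in filters:
        body = rule(body)
    return body


class FilteringProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # A client configured to use an HTTP proxy sends the absolute URL,
        # so self.path can be fetched directly.
        with urllib.request.urlopen(self.path) as upstream:
            body = apply_filters(upstream.read())
            content_type = upstream.headers.get(
                "Content-Type", "application/octet-stream")
            status = upstream.status
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port=8080):
    """Serve until interrupted; call this explicitly to start the proxy."""
    ThreadingHTTPServer(("127.0.0.1", port), FilteringProxy).serve_forever()
```

Keeping the rules as plain functions in a list means the filtering logic can be tested without any networking, and the same list could later be plugged into a Twisted-based server.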
Python performs pretty well for most tasks, but you'll need to change the way you program if you're used to other languages. See Python is not Java for more info.
If plain old CPython doesn't give the performance you need, you have other options as well.
As has been mentioned, you can extend it in C (using a tool like SWIG or Pyrex). I also hear good things about PyPy, though bear in mind that PyPy itself is written in a restricted subset of Python (RPython). Lastly, a lot of people use Psyco to speed things up.
What makes Python stand out for use in web development? What are some examples of highly successful uses of Python on the web?
Django is, IMHO, one of the major benefits of using Python. Model your domain, code your classes, and voila, your ORM is done, and you can focus on the UI. Add in the ease of templating with the built-in templating language (or one of many others you can use as well), and it becomes very easy to whip up effective web applications in no time. Throw in the built-in admin interface, and it's a no-brainer.
Certainly one successful use of Python on the web is Google App Engine. Site authors write code in (a slightly restricted subset of) Python, which is then executed by the App Engine servers in a distributed and scalable manner.
Quotes about Python:
"Python is fast enough for our site and allows us to produce maintainable features in record times, with a minimum of developers," said Cuong Do, Software Architect, YouTube.com.
YouTube uses a lot of Python and is probably the best example of a Python success story.
A great example of a Django success story is the Washington Post, who recently shared a big list of applications they have developed:
http://push.cx/2009/washington-post-update
www.lawrence.com and www.ljworld.com are two of the first sites to use Django (before it was even open source).
djangositeoftheweek.com has a bunch of good case studies.
www.everyblock.com is another great example.
Finally, http://www.djangosites.org/ links to nearly 2,000 other Django powered sites.
Short answer: the diversity of tools readily available and freedom of choice.
This sounds like a simple question, but it really isn't. Python is very good for web development, as shown by the oh-so-famous Google App Engine, Plone and Django. One has to point out, though, that developing in Python requires a lot more from the developer than PHP does, but it gives a lot more in return as well.
The barrier to entry for actually producing something is higher. This is because there are a bunch of different tools for doing web development with Python. Choosing a web development framework can be a hard decision for an inexperienced developer.
Having a lot of different tools is a double-edged sword. To some extent it gives you the freedom to pick the one you want, but then again, how do you really know which one is good for what you're doing? This brings me to my point. Python stands out from the crowd by not having a standard or de facto web development library. While this pretty much goes against the principle of having only one simple way of doing one thing, it also brings us a wide variety of tools with different design choices. At first this might feel very frustrating, because it would be so much easier if somebody had made the choice for you. But since you're left to make the choice, you actually have to think about what you're doing and what would fit. ...or you might just end up picking one and blowing your head off after you've realized that you made the wrong choice. Either way, you've made the choice and no one else.
Furthermore,
Python is strong both in web development and in data analytics and machine learning. For example, scikit-learn, SciPy and NumPy are very strong. In some cases, it can be very interesting to have both elements on the same server.
For example http://rankmytweet.com uses this a lot.
Trac (bug tracker) and MoinMoin (wiki) are two web-based Python tools that I find invaluable.
GNU Mailman is another widely successful project written in Python.
As many have pointed out, Django is a great reason to use Python...so in order to figure out why Python is great for web development, the best bet is to look at why it is a good language to build a framework like Django.
IMHO Python combines the cleanest, or at least one of the cleanest, metaprogramming models of any language with a very pure object orientation. This not only makes it possible to write extremely general abstractions that are easy to use, but also allows the abstractions to combine relatively cleanly with others. This is harder to do in languages that take a code-generation based approach to metaprogramming (e.g. Ruby).
Dynamic languages are in general good for web apps because of the speed of development. Python in particular has two advantages over most of them:
"batteries included" means lots of available libraries
Django. For me this is the only reason why I use Python instead of Lua (which I like a lot more).
Besides the frameworks...
Python's pervasive support for Unicode should make i18n much smoother.
A sane namespace system makes debugging much nicer, because it's typically easier to find where things are defined.
Python's inability to function as a standalone templating language should discourage the mixture of HTML with model code
Great standard library
Other examples of Python sites are Reddit and YouTube.
I'm a Python programmer who works on web applications. I know a fair bit about the application level, but not so much about the underlying "plumbing", which I find myself having to configure or debug.
I'm thinking of everything from using memcached to flup, fcgi, WSGI etc.
When I look for information about these online, Google typically delivers older documents (e.g. tutorials from before 2007), fragments of problems that may or may not have been resolved, etc.
Are there any good comprehensive and up-to-date resources to learn about how to put together a modern, high-performance server? One that explains both principles of the architecture and the actual packages?
Buy this. http://www.amazon.com/Scalable-Internet-Architectures-Developers-Library/dp/067232699X
General info about highly efficient web architecture: http://highscalability.com/
Interesting Python related articles: http://www.onlamp.com/python/
Printed magazine: http://pythonmagazine.com/
Zope is a still-evolving framework written in Python, and it is documented online. For a start, see Zope Concepts and Architecture. As with other Python-based web frameworks, the source is your best reference.
Note that Zope is not easy to grasp, and is different from frameworks like Django.
You can use this technique to limit search results to the past year:
http://www.tech-recipes.com/rx/2860/google_how_to_access_filter_by_date_dropdown_box/