From interpreted to native code: "dynamic" languages compiler support - Python

First, I am aware that "dynamic languages" is a term used mainly by a vendor; I am using it just to have a container word for languages like Perl (a favorite of mine), Python, Tcl, Ruby, PHP and so on. They are interpreted, but what interests me here are languages featuring a strong capability to support programmer efficiency, plus support for the typical constructs of modern interpreted languages.
My question is: are there dynamic languages that can be compiled efficiently into native executable code - typically for Windows platforms? Which ones? Maybe using some third-party ad-hoc tools? I am not talking about huge executables carrying a full interpreter with them or similar tricks, nor some smart module able to include its own dependencies or some required modules, but honest, straight, standard, solid executable code.
If not, is there some technical reason inhibiting the availability of such a best-of-both-worlds feature?
Thanks!
Daniel

I think you're operating under a misunderstanding: These executables aren't huge because they just lump the interpreter in there, they're huge because the whole runtime is in there.
On Windows, most of your runtime is already installed, so you don't have to ship it. You think your program is small, but a quick look at the virtual memory mappings will tell you that even a small "hello-world" type program written in C is a couple megabytes big.
That's just how big useful runtimes are.
If you really want to keep your ship-size small, your only choice is to use the runtimes that are already there, and that means C/C++ and (recently) dot-net.
If you really can't swallow the runtime, Forth is as small as it gets.
The best, most aggressive dynamic languages with the best compilers for Windows are the commercial Lisps. They do a lot of inlining and pruning when producing executables, so you end up shipping only what you use. They are still 1.5x to 5x larger than C/C++ programs.
As far as languages that you know: Perl is as fat as they get. ActiveState has perlapp, which I'm sure you're already aware of but dismissed because of its size. Revisit it if you can.
Now, to answer your question - is there some technical reason inhibiting the availability of such a best-of-both-worlds feature? Yes.
Perl cannot be statically analyzed (proof), which means there's no way for a Perl compiler to tell what can be discarded. That means every part of Perl's runtime needs to be available to your program, because there's no way for your program to indicate what parts can be discarded.
That means that getting a smaller executable is equivalent to getting a smaller runtime, and you should be comfortable accepting that if the Perl developers knew how to make the Perl runtime smaller without discarding any features, they'd probably have done it already.
If you are willing to write in a strict subset of Python or PHP, these languages can be analyzed. Shed Skin and HipHop-php are pretty good, but they're still quite large, and they don't support all of Python's and PHP's features, which means that some modules will simply not work. To my knowledge, nobody has implemented pruning for either of these languages (most of the focus in these compilers is on improving their lackluster performance), and it may be another decade or more before anyone bothers. These are still the restrictions you have to accept when doing this sort of thing.
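To give a feel for the restriction, here is a minimal, illustrative sketch (my own example, not taken from the Shed Skin docs) of the implicitly statically typed subset such compilers accept - every name keeps a single inferable type, with no eval, reflection, or dynamic attribute creation:

def total(xs):              # xs is inferred as a list of int
    s = 0                   # s is inferred as int
    for x in xs:
        s += x
    return s

print(total([1, 2, 3]))     # all types are statically known here, so a
                            # restricted compiler can translate this to C++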

The PyPy project does what you describe for a fairly complete subset of Python.
In the general case, this is a very hard problem to solve, largely due to the very attributes that make these languages "dynamic": late binding, weakly-typed variables, data structures and containers, eval facilities, a fuzzy divide between programming and meta-programming, etc. But a lot of effort is being poured into it, such as the JavaScript JIT-compiler projects listed here.
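As a concrete (hand-wavy, my own) illustration, a few lines of Python are enough to show the kind of dynamism that defeats ahead-of-time analysis:

name = input("method name? ")    # unknowable before run time
s = "hello"
print(getattr(s, name)())        # late binding: could be upper, lower, title...
eval(input("expression? "))      # arbitrary code: nothing can safely be pruned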

Shed Skin is an experimental (and restricted) Python-to-C++ compiler that can do what you describe. As Marcelo indicates above with PyPy, there are limitations on what you can compile with Shed Skin, but if you are willing to accept the restrictions, you can achieve large speedups.


Is it feasible to use Lisp/Scheme as a scripting language? [closed]

Is it feasible to script in a Lisp, as opposed to Ruby/Python/Perl/(insert accepted scripting language)? By this I mean do things like file processing (open a text file, count the number of words, return the nth line), string processing (reverse, split, slice, remove punctuation), prototyping/quick computations, and other things you would normally use Python, etc. for. How productive would doing such tasks in a Lisp be, as opposed to Ruby/Python/Perl/scripting language of choice?
I ask because I want to learn a Lisp but also use it to do something instead of only learning it for the sake of it. I looked around, but couldn't find much information about scripting in a Lisp. If it is feasible, what would be a good implementation?
Thank you!
Today, using "LISP" as if it were certain that anyone would understand which language one is talking about is absurd, since it hasn't been one language since the 70's, or probably some time earlier too. "LISP" only indicates fully parenthesized Polish prefix notation, just like Pascal, Ruby, Python and Perl are just variations of ALGOL.
Scheme is a standard and Common LISP is a standard. Both of those are general-purpose, though Common LISP is batteries-included while Scheme is a minimalistic language. They are quite different in style, so comparing them would be like comparing Java with Python.
Embedded LISPS
There are lots of uses of Scheme and specialized LISP dialects as embedded languages. Emacs is the most widely used editor in the Unix segment, and its Lisp, Elisp, is the most used Lisp language because of this. The image processing application GIMP has a Scheme base with extensions for image processing.
Stand alone scripts
It's possible in many Common LISP implementations to use the standard #!-notation to make a script work as an executable and run it as an application. E.g. I use CLISP and have scripts using #!/usr/bin/clisp -C as the first line. I also use Scheme the same way, and in the very fast incremental compiler Ikarus you use #!/usr/bin/ikarus --r6rs-script. Clojure has all the power of the Java libraries, you can use your own classes from it, and it can also be made into an application with #!/usr/bin/env java -cp /path/to/clojure-1.2.0.jar clojure.main
More permanent applications
In Common LISP you can dump an image. It will be a Common Lisp binary with your code already compiled in. Many Scheme implementations have compilation to native code, and Clojure can compile to Java bytecode (though that's not the most common way to do it). Still, I have had Ikarus sometimes interpreting faster than a compiled executable from Racket, Chicken or Gambit, so I often do my programming in DrRacket and run it in Ikarus when using Scheme.
Try both Common LISP and Scheme, as both of them are good enough for the tasks you specified in your question. There are many free books on the subject, and some are worth their price as well. You may also try Racket, which is a Scheme derivative with lots of libraries for everyday tasks, but it doesn't conform to any standard.
About productivity
I imagine you are referring to how quickly you can write a certain task in a Lisp dialect. I imagine it depends on how used you are to the syntax; it takes a while to get used to it after only knowing ALGOL dialects. It takes a different approach as well, since you need to think in a more functional manner, especially for Scheme. I imagine that when you are as good in Scheme as in your favorite ALGOL dialect, productivity will be similar. E.g. some ALGOL dialects are faster to prototype in than others, and that is true for Lisp dialects as well.
When I first started to poke around Lisp I used it to write shell scripts... I'm somewhat OCD about order and uniformity and I really liked Lisp languages because they have saner syntax (fewer syntax rules, no random decisions related to particular syntax elements).
If you are looking into Common Lisp, then SBCL, as packaged for virtually any Linux distro, is available right away for CGI scripting. SBCL also has its own means to process command line arguments, access pipes, processes and so on. If you aren't after portability between different Lisps, then I'd say that you are good to go. Just to give you examples of where I used this scripting: a girl in our office compiled and maintained a list of words which I had to further process in our application. This list was available as a Google Docs spreadsheet. My script would download the words table and parse it into the format I needed. I had scripts that helped me with file manipulation and preprocessing before the project was compiled (the project wasn't in Lisp).
Finally, SBCL has its own means of being used with FastCGI (http://kdr2.com/project/sb-fastcgi.html), but of course there are several full-blown HTTP servers, which you could either use as-is or put behind a proxy. Hunchentoot was historically the most popular one, but there are others too, like cl-http; here are some more links: http://www.cliki.net/web .
#!/usr/bin/env sbcl --script
would be the shebang comment to use.
Furthermore, I use Common Lisp for my classes, simply to do my homework, which is, I imagine, what you were after when you said "open a text file, count the number of words, return the nth line". Here's an example I used for the class on introduction to logic: https://github.com/wvxvw/coursera-logic/blob/master/formula.lisp .
Since there was a discussion on the matter, here's something else to be considered.
Typically, scripting languages have a twofold nature: they are written in a low-level language while exposing a high-level API to the programmer. This can be a blessing or a curse. On one hand, languages like Python, Ruby and JavaScript have highly optimised libraries for dealing with common tasks, while Common Lisp typically, similarly to C++ or Java, implements everything in Lisp. Thus, for example, strings are a lot less sophisticated in CL than they are in JavaScript: in CL they are simple arrays of characters, while in JS they are a special kind of tree known as ropes.
Typically, a programmer writing in a scripting language is neither required nor expected to produce high-performance code; the language compensates with highly optimized base-level library code. Unfortunately, once the programmer actually wants to squeeze as much performance as possible out of the scripting language, it turns out they just don't have the tools to do it, because they can't get to the bottom of the implementation.
On the other hand, dealing with the lower level of detail means that the programmer will be less productive overall and will need more skill, because optimizing code to get on par with industry-standard implementations is hard.
Common Lisp generally falls under the second category, but I'd argue it is still good for one-liners and casual programming because of its extensive library and highly developed macro system, which helps reduce the verbosity usually associated with low-level languages.
I understand that you have asked two questions:
Is it possible to run scheme scripts from a command line
How effective is that?
I can answer your first question, but not your second. How feasible it is depends heavily on your Scheme skills and on how much code you want to (re)write in Scheme.
So I'll just answer your first question :)
Have a look at this SO question: Running Scheme from the command line.
If you have installed, for example, DrRacket (which is a good IDE for many scheme dialects) as your scheme interpreter, you may use the shebang line #! /usr/bin/env mzscheme in your scheme script.
This test script (test.scm)
#!/usr/bin/env mzscheme
#lang scheme
(print (+ 40 2))
can be made executable (with chmod +x test.scm) and executed (with ./test.scm).
I'd say that Lisp/Scheme could be used to write small scripts or big applications. But they are not yet ready for wide use.
The big difference between Python/Ruby and Scheme is that Python has a huge library of modules centralized in one place. Ruby is quite similar to Python with RubyGems.
Scheme, on the other hand, has a small library of modules scattered across the internet. The quality of the modules doesn't always compare to the popular modules in Python and Ruby.
One could say that they are aiming at different goals, but I'd say Scheme just got old, and people started to forget about it and about how it could be used as a tool instead of just a school subject.
About Lisp, I can't really say. But from your description, it's possible to write the scripts that you'd like to write; if you need something specific, though, it's possible that it's not there and you'll have to write it yourself.
All I can say, is jump in. And become someone who gives a future to this language. Don't be scared. This language has a bright future and you'll learn a lot from it.
WRT your tasks, what about using Emacs, which comes with an interactive Python shell? That way you have the convenience of editing alongside running your scripts.

Learn a scripting language besides Python [closed]

Someone told me once, that programmers tend to learn one scripting language properly and ignore or dislike other scripting languages. Do you have similar experiences?
I'm using Python as my choice for scripting for a few years now; however, I'm sure that there are many existing and emerging languages that could impress the Pythonistas. Can you recommend scripting languages that would be interesting and useful to learn besides Python?
Look, Python pretty much has all you need (in my opinion) for application programming. You can write anything from a protocol stack to YouTube, from media players to 3D games and graphics and you get excellent performance.
It occupies the same niche as some of these other mentioned languages:
C, you have access to almost all of the useful C/C++ libraries. The only reason I would pick to write something in C over Python is because I needed the performance gain. Even then, I would probably prototype it in Python first; it's much easier to revise your design when your application is written in Python.
Ruby, there is no good reason to ever use Ruby instead of Python.
Perl, it's great for some particular kinds of tasks, but if you're a fan of consistent, readable and sane programming styles you will hate looking at about 95% of existing Perl code. I don't know if this is because the people who program in Perl tend to be (in my experience) sys admins first and programmers second, or because Perl has a design philosophy that allows for multiple distinct ways to achieve the same effect.
Given that, I would say that if you are going to learn another language, make sure it gives you the ability to do something new. There are two scripting languages that I would recommend for you to learn:
Bash, what a joy it is to manipulate your filesystem with a combination of for loops and pipes. Bash programming doesn't give you more than what you can already do with Python, but if you are a *nix user you will experience great gains in your daily productivity.
Javascript, being able to write browser-based applications is a useful skill and almost definitely the way most applications will be done in the future. The Javascript/browser environment is set to gain a whole host of capabilities in the coming few years, from audio manipulation to OpenGL graphics, and some very fast engines are either in the works or already available (like V8, which powers the Chrome browser and compiles Javascript to native machine code.) Have you seen Quake2 ported to WebGL?
My answer basically boils down to this: first, learn languages that are useful.
Ruby - what it enables and does with blocks is really interesting, and quite foreign to python based programming
Erlang - the functional language has a lot of interesting examples and it will definitely make your head work differently afterwards (in a good way)
Javascript - yes, I'm serious. Although there's a fair number of gripes to be had with this prototype-based language, it does some really interesting things with that prototyping, and just slightly differently than Ruby and/or Python. And a ton of folks are pouring big money into making Javascript an outstandingly fast scripting language.
I would recommend learning Haskell and a dialect of Lisp such as Scheme or Common Lisp, if you master either of those you'll gain insight into how things are accomplished with the functional paradigm and it'll help out your Python as well.
Here are some languages categorized by paradigms I'd learn:
Imperative/Procedural languages:
C
Functional paradigm languages:
Haskell
Common Lisp/Scheme
Similar object oriented languages:
Ruby
ECMAScript
Other:
Perl
I would advise you to stay away from PHP unless you really need the work. You would probably want to run back to Python.
Scripting languages are so similar that the marginal benefit of moving from one scripting language to another is usually low. So it's unsurprising that people wouldn't bother to learn more than one. Nevertheless, in my career I have passed through times when my main scripting language (in roughly chronological order) was
Awk
Tcl
Icon
Ksh
Lua
I also used Perl and Python but never found them enough better to be worth switching to.
If you want to check out another scripting language, I recommend Lua, because
It's powerful and remarkably simple, having the best power-to-weight ratio of all languages named here.
Like Tcl it was designed from the beginning to incorporate C code seamlessly. This facility works extremely well and greatly extends the range of problems for which it is useful (see Adobe Lightroom, World of Warcraft, Garry's Mod, CHDK).
The implementation is highly performant and brilliantly engineered. If you want to learn something about how languages are implemented, it will repay careful study.
If, however, your goal is to learn a new language to expand your mind, learn something else besides a scripting language. For example, learn Haskell and pick up some mind-blowing ideas (many stolen from the same sources that Guido stole from), or learn C and really understand exactly what's happening on the hardware.
The only relatively unbiased answer you can really look for is probably statistical, and you would still have to account for the natural tendency of people to follow the path of least resistance once one is found or carved.
How many people learnt Python to a decent level, found the language resonated with the way they want to work, then moved to something else because the language or the ecosystem, or both, didn't support their needs?
I'd say probably a single digit percentage of the educated userbase, wouldn't be surprised if it amounted to less than 5%.
Unless you have work-related prospects that involve a different language, or you need to move sideways for similar reasons, I'd say you're probably best off learning something complementary to Python rather than similar or equivalent.
C++ for low-level or computationally intensive tasks, CUDA if your field can take advantage of it (med-viz, CGI etc.), whatever flavour of shell/sysadmin-oriented scripting and hacks floats around where you work (bash, tcl, awk or whatever else), and so on.
Personally the reason I haven't bothered past a first glance with ruby, php, or a number of other languages is simply that it's better ROI to keep working on my python skills than picking up something that offers mostly the same qualities just in different forms.
If you really want to learn something else for the sake of opening your mind up a bit, and want to stick to "scripting", then Lua was an interesting toy for me for a while, mostly for the ridiculous performance you can squeeze out of a relatively easy integration process, and because it is a rather different set of tracks compared to Python. That, and the fact WoW plugins had to be written in Lua ;)
I'll give an honest answer from my perspective.
No.
Having started scripting using batch, bash, and Perl, discovering Python was discovering precisely what I'd want from a scripting language (and more, but that's off topic). It integrates with familiar Unix interfaces, is modular, doesn't force any particular paradigm, is cross-platform and is under active development. The same can be said of no other scripting language I know of.
The only other scripting languages I'd consider using are Lua or Scheme, for their smaller footprints and suitability for embedding; Python can be a little hefty. However, they're hardly suitable for more general-purpose shell and other forms of scripting.
Update0
I just noticed mentions of Ruby and PHP in other answers; these both slipped my mind because I'd never consider using them. Ruby is slower and not quite as popular, and PHP is more C/Perl-like, with flatter interfaces, which comes with performance boons of its own. Using these alternatives to Python is a matter of taste.
To answer your first question: Do people learn one language and then ignore or dislike others?
Well, if you know one language well, you will need to see great advantages to move to another.
I started out using Perl and eventually thought that there must be an easier way to do some things. I picked up Python and stopped using Perl almost at once.
A little while later I thought I'd try ruby and learned a bit about that. The advantages over using python weren't big enough to switch, so I decided to stick with python. If I had started out using ruby, I'd probably be using that still.
If you are using python, I don't think you will easily find another scripting language that will win you over.
On the other hand, if you learn functional programming, you will probably learn a few new things, some of which will even be useful in your Python programming, since a few things in Python seem to be inspired by functional programming, and knowing how to use them will make you a better programmer in general and a better Python programmer too.
Learn a Lisp. Whether it's "scripting" or not, Eric Raymond had the right of it when he wrote:
"Lisp is worth learning for the
profound enlightenment experience you
will have when you finally get it;
that experience will make you a better
programmer for the rest of your days,
even if you never actually use Lisp
itself a lot."
The programming paradigm needed to be highly effective in Lisp is sufficiently unlike what you use with Python day-to-day that the perspective it gives is very, very much worth it.
And within Lisps, my choice? Clojure; like other Lisps, its macro system gives you capabilities comparable (actually superior) to the excellent metaprogramming in Python, but Clojure in particular has a focus on batteries-included practicality (and an intelligent, opinionated design) which will be familiar to anyone fond of GvR's instincts. Moreover, Clojure's strengths are extremely disjoint from Python's -- in particular, it shines at highly-multithreaded, CPU-bound concurrent programming, which is one of Python's weaknesses -- so having both in your toolbox increases the chance you'll have the right tool when a tricky job comes along.
(Is it scripting? In my view, that's pretty academic these days; if you have a REPL where you can type code and get an immediate response, modify the state of a running program, or experiment with an API, I see a language as "scripting" enough).
I would learn a statically typed language with very powerful type expression capabilities and awesome concurrency.
One of the following would be a good choice (in order of my preference):
Scala
F#
Haskell
Ocaml
Erlang
Typed languages like the above make you think differently. Also, these languages have REPLs, so they can be used as scripting languages, although truthfully I'm not really sure what the definition of a "scripting" language is.
Python is missing good concurrency built into the language, so knowing how to deal with concurrency is a challenge for many Python programmers.
I have found that strongly typed languages scale better for big projects for many reasons:
Because types are so important they become an invaluable way to communicate the problem
Refactoring in these languages is much, much easier.
Automatic serialization is sometimes easier too (although for Haskell that's less true).
A lot less time is spent writing type-checking assertions.
Browsing the code is easier because most IDEs will let you click on and go to different types.
I'm actually learning Scala after Python. From "Programming in Scala":
The name Scala stands for “scalable language.” The language is so named because it was designed to grow with the demands of its users. You can apply Scala to a wide range of programming tasks, from writing small scripts to building large systems.
Integration of object-oriented and functional programming inside a language with an expressive, strong static type system is interesting by itself. And yes, you can use Scala as a scripting language. I feel uncomfortable coding in languages with a dynamic typing discipline, so Scala seems to be a good alternative, despite its complexity at the initial learning stage.
If you are satisfied with a dynamic typing discipline, take a look at the roots: Smalltalk, of course. Try Squeak with the Squeak by Example companion book, or its open-source fork Pharo with the Pharo by Example book, for a start.
Ruby/Groovy/Perl if you'd like to stick to traditional scripting practices.
Otherwise I'd heartily recommend Clojure and Scala - two of the more innovative programming languages of the past few years.
If you are already familiar with Python, you are unlikely to find something compelling in the same niche, although Ruby does have a very strong and vocal following that seems to like it very much. Perhaps you should consider a scripting language that fills a different role, such as BASH shell script for quick, simple scripts that don't need the complexity of Python or JavaScript which runs in the browser.
I can't say that I agree with wiping Ruby off the map... Ruby fixed every problem that Perl had as far as syntax goes... I loved Python first, but let Ruby get a little more mature and it will get into the fray more and more... Why do I support Ruby strongly? Just step away from Python for a few months and then give Ruby a chance... I was a Ruby hater when I was a Python guy. But I can hardly stand to use Python at this point. One day someone is gonna clean up the GC and toss in some native threads, and everybody better watch out.
Off the rant: Python is a full-featured, not just good, Great Language... Perl... what a mess... I don't know how Perl can look at itself in the mirror standing next to any other mainstream scripting language... PHP is much prettier... At least Perl is fast, right... (CPAN never hurt it either.) If speed is the real issue, there are other interpreters that juice things up a bit... Jython, JRuby, PyPy... the list goes on. Screw Bash...

Standardizing a Release/Tools group on a specific language

I'm part of a six-member build and release team for an embedded software company. We also support a lot of developer tools, such as Atlassian's Fisheye, Jira, etc., Perforce, Bugzilla, AnthillPro, and a couple of homebrew tools (like my Django release notes generator).
Most of the time, our team just writes little plugins for larger apps (ex: customize workflows in Anthill), long-term utility scripts (package up a release for QA), or things like Perforce triggers (don't let people check into a specific branch unless their change description includes a bug number; authenticate against Active Directory instead of Perforce's internal passwords). That's about the scale of our problems, although we sometimes tackle something slightly more sizable.
My boss, who is reasonably technical, has asked us to standardize on one or two languages so we can more easily substitute for each other. He's advocating bash scripts and Perl, due to their universality and simplicity. I can see his point--we mostly do "glue", so why not use "glue" languages rather than saddle ourselves with something designed for much larger projects? Since some of the tools we work with are Java-based, we do need to use something that speaks JVM sometimes. (The path of least resistance for these projects is BeanShell and Groovy.) I feel a tremendous itch toward language advocacy, but I'm trying to avoid saying "We should use Python 'cause I like it and Perl is gross."
Instead, I'm trying to come up with a good approach to defining our problem set: what problems do we solve with scripts? Would we benefit from a library of common functions by our team, or are most of our projects more isolated? What is it reasonable to expect my co-workers to learn? What languages give us the most ease of development and ease of modification?
Can you folks suggest some useful ways to approach this problem, both for my own thinking process and to help me facilitate some brainstorming among my coworkers?
Two points:
"Eww Perl gross" is somewhat of an urban legend. You can write great clean self-documenting code in Perl, and your can write write-only code in pretty much any language. It's a property of a developer, not a language.
Just because you're writing glue code, doesn't mean the code has to suck like some glue hacks tend to be.
From many threads comparing Perl vs Python on SO, it appears to me that Perl's CPAN is more expansive than Python's repository, but I have no experience with Python and can't substantiate that with a real comparison.
BUT, one thing I do know. After 5 seconds search, CPAN has a JIRA module. Whether that's a good factor for you or not, that's up to you.
Google standardized on Python for such tasks (and many more) a bit before I joined the company; as far as I know, huge advantages such as Python's great implementations on the JVM and .NET didn't even play a role in the decision -- it was all about readability. At the time (and to some extent, even now) the theory at Google was that every engineer must be able, at need, to tweak every part of the codebase -- including of course build scripts, spiders (which were in Python at the time), and so forth. Demanding of engineers already proficient in C++ and Java that they learn many more "scripting" languages (Python, Perl, Bash, Awk, Sed, and so forth) was simply unconscionable: one had to be selected. Given that constraint, Python was the clear choice (under other constraints, Perl might also have been -- but I can't see the inevitable mix of Bash, Awk and Sed ever competing on such grounds!) -- and that's how I ended up working there, a bit later;-).
Given that the overall potential of Python vs Ruby vs Perl vs PHP vs Bash + Awk + Sed vs ... is roughly equal, picking one is clearly a winner -- and Python has clean readability, plus strong implementations on the JVM and .NET, as big vigorishes. Seriously, I can only think of Javascript (inevitable for client-side work, now rich with strong implementations such as V8) as a possible "competitor" (unfortunately, JS inevitably carries a lot of baggage for backwards compatibility -- unless you can use a use strict;-like constraint to help with that, it must be an important downside).
I don't think anyone is going to be able to solve your problem on Stackoverflow. Your choice of tools, methods, and process are much more affected by social constraints, e.g. what your boss wants and what you want, then technical merits. That's not necessarily bad.
The short answer is "Use what is going to be most pleasing to the developers". If everyone likes Python more than Perl, for whatever reason, they are probably going to get more done in Python. If they like Ruby more than Python, it's the same thing.
Some things to evaluate as part of your selection:
What do the developers already know?
What are they most willing to learn?
How much weekly time can your team spend learning new things (e.g. lunch seminars, formal classes, etc)?
What do most people in the community use to work with the tool you need to support? For instance, Fisheye has a Java API, and some REST examples for Perl and Python. If you're writing Fisheye extensions, Java seems to be the win there. If you're merely accessing Fisheye data, any language can use the REST stuff.
What is most of your code base in already? What can you replace and what do you have to continue to support? I find that many companies can't answer this question because every developer seems to add two new technologies they don't tell anyone about. :)
Which platforms do you need to support? Some languages have platform specific issues, and I don't mean just Windows vs. Unix. Do you have legacy hardware you have to support? Does your tool work on that stuff?
How much of the stuff you produce can benefit other parts of the company? What are other teams using?
Do the people advocating one tool know it well enough to be its champion? I ask: what are five things you hate about your favorite language? If people can't name five valid things that are wrong with their language or tool, they don't have enough experience with it.
The Longer Answer
People tend to try to reduce this to a technical argument because they are afraid to admit their biases or examine why they think what they think. Your boss might favor bash and Perl because that's what he did a lot of work in when he was getting started. You might like Python because you have a personal affinity for the way Python does things. I like Perl because I like its flexibility and DWIMmery. Like any social situation, different people are going to be attracted to different parts of different things. Just because you like chocolate doesn't make vanilla evil. I could give you lots of good arguments why Perl can be useful, but that doesn't mean that something else can't give you the same value.
What problems do we solve with scripts?
That one you have to answer for yourself. :)
Would we benefit from a library of common functions by our team, or are most of our projects more isolated?
This is most likely a good thing in Python, Perl, Ruby, Java, and almost any other language that you might choose. I think this part of your requirement is language agnostic. No matter which one you choose, you'll probably want to do this.
What is it reasonable to expect my co-workers to learn?
A good developer should be able to work with several different languages at least to an apprentice level. Those languages should include ones that have vastly different assumptions about how people express problems, say, for example, the set { Smalltalk Perl C Lisp Java }.
The best developers I've hired and worked with have always wanted to use the right tool for the job instead of making the job right for the tool. They might have their favorite language, but they didn't grouse (too much) about using a different tool when it made more sense.
Many "developers", however, seem to think that they are getting paid to play with their favorite tool. You need to convert them into people who think they have a toolbox to solve problems that create business value.
And remember, you never stop learning. As a developer you don't have to choose one language then defend it with your life, forsaking all others, in sickness and in health, and so on. Good developers are going to continue to track new technologies and evaluate them for usefulness for their tasks. Just because you choose one tool over another doesn't mean you stop paying attention.
No matter what you choose, someone is going to complain. Don't look for the solution that makes everyone happy. There isn't one, short of getting rid of the developers who aren't happy.
What languages give us the most ease of development and ease of modification?
A skilled practitioner in just about any language will think that his chosen language is the easiest to develop, modify, and maintain. Unskilled practitioners tend to blame the language and the tools for their problems. Some languages have steeper learning curves, and some have bigger payouts. A person's tolerance for delayed gratification is a big factor here.
That being said, different languages have developed different cultures and different toolsets. Perl people tend to like vi or emacs, Ruby people tend to like TextMate, Java people tend to like Eclipse or IntelliJ. That's not always true, but the culture that evolves around the tools are often more important than the technical details of the tool. If your developers like a particular type of tool, they are probably going to like the language that has a culture built around that sort of tool.
Some processes and tools take more time to get used to or require more education, but they can have larger advantages when used properly. Other tools get you started sooner but might not give you a path to bigger and greater things, such as cross-team development. The trick, however, is to not code to the tools so you aren't stuck in any particular toolset.
First, it's important to note that it is very hard to convince someone they're wrong.
He's advocating bash scripts and Perl,
due to their universality and
simplicity
Bash scripts are not simple. The bash programming model is really complex and unfriendly; if statements and expressions, in particular, are horrifying.
Perl may or may not be simple.
Bash is universal. Perl, however, is exactly as universal as Python. Python is pre-installed in almost all Linux distributions. So that argument is specious.
The "universality" of bash, Perl and python is exactly the same. The "simplicity", however, is not the same. You won't find it easy to to "prove" or "convince" anyone of this once they've already pronounced Perl as simple.
The Situation.
If the boss is advocating Perl, and Perl is not the answer, you will find it is very hard to convince someone they're wrong, making this effort nearly impossible.
If the boss was just throwing out ideas, then this is just difficult.
Quick Hack.
An easy thing you can do is to attempt head-to-head comparisons of Python and Perl for some randomly-chosen jobs. You can then have a code walkthrough to demonstrate the relative opacity of Perl compared with the relative clarity of Python.
Even this is fraught with terrible dangers.
Some folks really think code golf is important. Python loses at code golf. Perl wins. There's nothing worse than "Angry Co-worker with Perl Bias" who will kill you with code-golf solutions that -- because they're smaller -- can baffle management into thinking that they're clearer or "better" on some arbitrary scale.
Some folks really think explicit is "wordy" and bad. Python often loses because the assumptions are stated as actual parameter values. Some folks can (and do) complain at having to actually write things down. Read Stack Overflow for all of the Python questions where someone wants to make the try: block go away in a puff of assumptions.
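As a hedged illustration of that trade-off (hypothetical config dict, my own example, not from any of those Stack Overflow questions):

config = {"retries": 3}                # hypothetical settings dict

# explicit: the default is stated as an actual parameter value
timeout = config.get("timeout", 30)

# implicit: the try: block some folks would rather make "go away"
try:
    timeout = config["timeout"]
except KeyError:
    timeout = 30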
If you choose random problems, you may -- accidentally -- choose something for which there's an existing piece of Perl or Python that can be downloaded and installed. A language can win just through an accident of the draw, rather than through a more in-depth comparison of language features.
Best Bet
The best you can do is the following.
Identify what folks value. You can call these "non-functional" requirements. These are quality factors. What are the foundational, core principles? Open, Accessible, Transferrable Skills, Simplicity, Cleanliness, Honesty, Integrity, Thriftiness, Reverence, Patience, Hard Work, A Sense of Perspective, Reef the Main in Winds over 20 kn, etc. This is hard. No sympathy here.
Identify the technical use cases. These are "functional" requirements. Which bits of glue and integration are there? This is hard, also. Requirements erupt out of the woodwork when you do this. Also, when you have a Perl bigot on the team, numerous non-functional requirements will pile into this area. Your manager -- who proposed Perl -- may be the Perl bigot, and the use cases may be difficult to collect in the presence of a Perl bigot.
Identify how (a) Perl + Bash vs. (b) Python vs. (c) Java fit these core values and the functional requirements. Note that using Python means you do not need to use Bash as much. My preference, BTW, is to pare Bash down to the rock-bottom minimum.
This is a big, difficult job. It's hard to short-cut. If you find that Perl is not the answer and the Perl bigot you need to convince is the manager who proposed Perl in the first place, you may find that convincing someone that they're wrong is very hard.
Edit. I am aware that I am forbidden from using the string "Perl Bigot" to describe the manager's potential level of bias toward Perl. I, however, insist on using "Perl Bigot" to describe the manager who proposed Perl. The question provides no information on which to change this. The worst case is that (a) the manager is the Perl Bigot and (b) Perl is not the answer.

Why is (python|ruby) interpreted?

What are the technical reasons why languages like Python and Ruby are interpreted (out of the box) instead of compiled? It seems to me like it should not be too hard for people knowledgeable in this domain to make these languages not be interpreted like they are today, and we would see significant performance gains. So certainly I am missing something.
Several reasons:
faster development loop, write-test vs write-compile-link-test
easier to arrange for dynamic behavior (reflection, metaprogramming)
makes the whole system portable (just recompile the underlying C code and you are good to go on a new platform)
Think of what would happen if the system was not interpreted. Say you used translation-to-C as the mechanism. The compiled code would periodically have to check if it had been superseded by metaprogramming. A similar situation arises with eval()-type functions. In those cases, it would have to run the compiler again, an outrageously slow process, or it would have to also have the interpreter around at run-time anyway.
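A minimal sketch (my own, not tied to any particular implementation) of why the compiler has to stay around at run time:

# the source string is built at run time, so no ahead-of-time compiler
# could have seen it; eval() has to compile it on the spot
source = "x + " + str(1)        # could just as well come from a user or a file
print(eval(source, {"x": 41}))  # prints 42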
The only alternative here is a JIT compiler. These systems are highly complex and sophisticated and have even bigger run-time footprints than all the other alternatives. They start up very slowly, making them impractical for scripting. Ever seen a Java script? I haven't.
So, you have two choices:
all the disadvantages of both a compiler and an interpreter
just the disadvantages of an interpreter
It's not surprising that generally the primary implementation just goes with the second choice. It's quite possible that some day we may see secondary implementations like compilers appearing. Ruby 1.9 and Python have bytecode VMs; those are halfway there. A compiler might target just non-dynamic code, or it might have various levels of language support declarable as options. But since such a thing can't be the primary implementation, it represents a lot of work for a very marginal benefit. Ruby already has 200,000 lines of C in it...
I suppose I should add that one can always add a compiled C (or, with some effort, any other language) extension. So, say you have a slow numerical operation. If you add, say Array#newOp with a C implementation then you get the speedup, the program stays in Ruby (or whatever) and your environment gets a new instance method. Everybody wins! So this reduces the need for a problematic secondary implementation.
Exactly like (in the typical implementation of) Java or C#, Python gets first compiled into some form of bytecode, depending on the implementation (CPython uses a specialized form of its own, Jython uses JVM just like a typical Java, IronPython uses CLR just like a typical C#, and so forth) -- that bytecode then gets further processed for execution by a virtual machine (AKA interpreter), which may also generate machine code "just in time" -- known as JIT -- if and when warranted (CLR and JVM implementations often do, CPython's own virtual machine typically doesn't but can be made to do so e.g. with psyco or Unladen Swallow).
JIT may pay for itself for sufficiently long-running programs (if memory's way cheaper than CPU cycles), but it may not (due to slower startup times and larger memory footprint), especially when the types also have to be inferred or specialized as part of the code generation. Generating machine code without type inference or specialization is easy if that's what you want, e.g. freeze does it for you, but it really doesn't present the advantages that "machine code fetishists" attribute to it. E.g., you get an executable binary of 1.5 to 2 MB in lieu of a tiny "hello world" .pyc -- not much point!-). That executable is stand-alone and distributable as such, but it will only work on a very specific narrow range of operating systems and CPU architectures, so the tradeoffs are quite iffy in most cases. And, the time it takes to prepare the executable is quite long indeed, so it would be a crazy choice to make that mode of operation the default one.
Merely replacing an interpreter with a compiler won't give you as big a performance boost as you might think for a language like Python. When most time is actually spend doing symbolic lookups of object members in dictionaries, it doesn't really matter if the call to the function performing such lookup is interpreted, or is native machine code - the difference, while not quite negligible, will be dwarfed by lookup overhead.
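As a rough, hedged illustration (my own micro-benchmark; numbers vary by machine and CPython version), the timeit module shows that lookup cost directly:

import timeit

setup = """
class C:
    def m(self):
        pass
c = C()
bound = c.m
"""
# each call pays for looking up 'm' through the class dict
print(timeit.timeit("c.m()", setup=setup))
# hoisting the lookup out (a classic CPython trick) skips that cost
print(timeit.timeit("bound()", setup=setup))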
To really improve performance, you need optimizing compilers. And optimization techniques here are very different from what you have with C++, or even Java JIT - an optimizing compiler for a dynamically typed / duck typed language such as Python needs to do some very creative type inference (including probabilistic - i.e. "90% chance of it being T" and then generating efficient machine code for that case with a check/branch before it) and escape analysis. This is hard.
I think the biggest reason for these languages being interpreted is portability. As a programmer, you can write code that will run in an interpreter rather than on a specific OS, so your programs behave more uniformly across platforms (more so than compiled languages). Another advantage I can think of is that it's easier to have a dynamic type system in an interpreted language. I think the creators of these languages were thinking that having a language where programmers can be more productive, due to automatic memory management, a dynamic type system and metaprogramming, wins over any performance loss due to the language being interpreted. If you are concerned about performance, you can always compile the language to native machine code employing a technique like JIT compilation.
Today, there is no longer a strong distinction between "compiled" and "interpreted" languages. Python is in fact compiled just as much as Java is, the only differences are:
The Python compiler is much faster than the Java compiler
Python automatically compiles source code as it is executed, there is no separate "compile" step required
Python bytecode is different from JVM bytecode
Python even has a function called compile() which is an interface to the compiler.
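For example, a minimal sketch of that interface:

code = compile("print(6 * 7)", "<demo>", "exec")  # source -> code object
exec(code)                                        # runs the bytecode: 42
print(type(code))                                 # <class 'code'>, the same kind
                                                  # of object a .pyc file stores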
It sounds like the distinction you are making is between "dynamically typed" and "statically typed" languages. In dynamic languages such as Python, you can write code like:
def fn(x, y):
    return x.foo(y)
Notice that the types of x and y are not specified. At runtime, this function will look at x to see whether it has a member function named foo, and if so will call it with y. If not, it will throw a runtime error that indicates no such function was found. This sort of runtime lookup is much easier to represent using an intermediate representation like bytecode, where a runtime VM does the lookup instead of having to generate machine code to do the lookup itself (or, call a function to do the lookup which is what the bytecode will do anyway).
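You can watch that lookup happen in the bytecode with the dis module (a sketch; the exact opcode names vary across CPython versions):

import dis

def fn(x, y):
    return x.foo(y)

dis.dis(fn)   # shows an attribute/method-load opcode for 'foo': the name is
              # resolved inside the VM on every call, not at compile time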
Python has projects such as Psyco, PyPy, and Unladen Swallow that take various approaches to compiling Python object code into something closer to native code. There is active research in this area but there is not (as yet) a simple answer.
The effort required to create a good compiler to generate native code for a new language is staggering. Small research groups typically take 5 to 10 years (examples: SML/NJ, Haskell, Clean, Cecil, lcc, Objective Caml, MLton, and many others). And when the language in question requires type checking and other decisions to be made at run time, a compiler writer has to work much harder to get good native-code performance (for an excellent example, see work by Craig Chambers and later Urs Hoelzle on Self). The performance gains you might hope for are harder to realize than you might think. This phenomenon partly explains why so many dynamically typed languages are interpreted.
As noted, a decent interpreter is also instantly portable, while porting compilers to new machine architectures takes substantial effort (and is a problem I personally have been working on for over 20 years, with some time off for good behavior). So an interpreter is a way to reach a wide audience quickly.
Finally, although fast compilers and slow interpreters exist, it's usually easier to make the edit-translate-go cycle faster by using an interpreter. (For some nice examples of fast compilers see the aforementioned lcc as well as Ken Thompson's Go compiler. For an example of a relatively slow interpreter, see GHCi.)
Well, isn't one of the strengths of these languages that they are so easily scriptable? They wouldn't be if they were compiled. And on the other hand, dynamic languages are easier to interpret than to compile.
In a compiled language, the loop you get into when making software is
Make a change
Compile changes
Test changes
goto 1
Interpreted languages tend to be faster to make stuff in because you get to cut out step two of that process (and when you're dealing with a large system where compile times can be upwards of two minutes, step two can add a significant amount of time).
This isn't necessarily the reason python|ruby designers thought of, but keep in mind that "How efficiently does the machine run this?" is only half the software development problem.
It also seems like it would be easier to compile code in a language that's interpreted naturally than it would be to add an interpreter to a language that's compiled by default.
REPL. Don't knock it 'till you've tried it. :)
By design.
The authors wanted something they could write scripts in.
Python gets compiled the first time it is executed though
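A small sketch of that step made explicit (assuming a hello.py sits in the current directory):

import py_compile

# CPython normally does this implicitly on first import, caching the result
# under __pycache__/; here we trigger it by hand
py_compile.compile("hello.py")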
Compiling Ruby at least is notoriously hard. I'm working on one, and as part of that I wrote a blog post enumerating some of the issues here.
Specifically, Ruby is suffering from a very unclear (i.e. non-existent) boundary between the "read" and "execute" phase of the program that makes it hard to compile efficiently. You could just emulate what the interpreter does, but then you're not going to see much speed up, so it wouldn't be worth the effort. If you want to compile it efficiently you then face a lot of additional complications to handle the extreme level of dynamism in Ruby.
The good news is that there are techniques for overcoming this. Self, Smalltalk and Lisp/Scheme have dealt quite successfully with most of the same issues. But it takes time to sift through it all and figure out how to make it work with Ruby. It also doesn't help that Ruby has a very convoluted grammar.
Raw compute performance is probably not a goal of most interpreted languages. Interpreted languages are typically more concerned about programmer productivity than raw speed. In most cases these languages are plenty fast enough for the tasks the languages were designed to tackle.
Given that, and that just about the only advantages of a compiler are type checking (difficult to do in a dynamic language) and speed, there's not much incentive to write compilers for most interpreted languages.

Python 3.0 and language evolution

Python 3.0 breaks backwards compatibility with previous versions and splits the language into two paths (at least temporarily). Do you know of any other language that went through such a major design phase while in maturity?
Also, do you believe that this is how programming languages should evolve or is the price to pay simply too high?
The only language I can think of to attempt such a mid-stream change would be Perl. Of course, Python is beating Perl to that particular finish line by releasing first. It should be noted, however, that Perl's changes are much more extensive than Python's and likely will be harder to detangle.
(There's a price for Perl's "There's More Than One Way To Do It" philosophy.)
There are examples like the changes from version to version of .NET-based languages (ironic, considering the whole point of .NET was supposed to be API stability and cross-platform compatibility). However, I would hardly call those languages "mature"; it's always been more of a design-on-the-go, build-the-plane-as-we-fly approach to things.
Or, as I tend to think of it, most languages come from either "organic growth" or "engineered construction." Perl is the perfect example of organic growth; it started as a fancy text processing tool à la awk/sed and grew into a full language.
Python, on the other hand, is much more engineered. Spend a bit of time wandering around the extensive whitepapers on their website to see the extensive debate that goes into every even minor change to the language's syntax and implementation.
The idea of making these sorts of far-reaching changes is somewhat new to programming languages because programming languages themselves have changed in nature. It used to be that programming methodologies changed only when a new processor came out that had a new instruction set. The early languages tended to either be so low-level and married to assembly language (e.g. C) or so utterly dynamic in nature (Forth, Lisp) that such a mid-stream change wouldn't even come up as a consideration.
As to whether or not the changes are good ones, I'm not sure. I tend to have faith in the people guiding Python's development, however; the changes in the language thus far have been largely for the better.
I think in the days to come the Global Interpreter Lock will prove more central than syntax changes, though the new multiprocessing library might alleviate most of that.
The price of insisting on near-absolute backwards compatibility is just too high. Spend two minutes programming in C++ if you want to see why.
The Python team has worked very hard to make the lack of backward compatibility as painless as possible, to the point where the 2.6 release was created with a mind towards a painless upgrade process. Once you have upgraded to 2.6, there are scripts you can run that will move you to 3.0 without issue.
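The main such tool is 2to3, which ships with Python and rewrites source files in place; a minimal sketch (hypothetical hello.py):

# Before (Python 2):
#     print "hello, world"
# Running the converter that ships with Python:
#     2to3 -w hello.py
# rewrites the file in place to the Python 3 form:
print("hello, world")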
It's worth mentioning that backward compatibility incurs costs of its own. In some cases it's almost impossible to evolve a language in the ideal way if 100% backward compatibility is required. Java's implementation of generics (which erases type information at compile-time in order to be backwardly-compatible) is a good example of how implementing features with 100% backward compatibility can result in a sub-optimal language feature.
So loosely speaking, it can come down to a choice between a poorly implemented new feature that's backwardly compatible, or a nicely implemented new feature that's not. In many cases, the latter is a better choice, particularly if there are tools that can automatically translate incompatible code.
I think there are many examples of backward compatibility breakages. Many of the languages that did this were either small or died out along the way.
Many examples of this involved renaming the language.
Algol 60 and Algol 68 were so different that the meetings on Algol 68 broke up into factions. The Algol 68 faction, the Pascal faction and the PL/I faction.
Wirth's Pascal morphed into Modula-2. It was very similar to Pascal -- very similar syntax and semantics -- but with several new features. Was that really a Pascal-2 with no backward compatibility?
The Lisp to Scheme thing involved a rename.
If you track down a scan of the old B programming language manual, you'll see that the evolution to C looks kind of incremental -- not radical -- but it did break compatibility.
Fortran existed in many forms. I don't know for sure, but I think that Digital's Fortran 90 for VAX/VMS wasn't completely compatible with ancient Fortran IV programs.
RPG went through major upheavals -- I think that there are really two incompatible languages called RPG.
Bottom Line: I think that thinking and learning are inevitable. You have three responses to learning the limitations of a language.
Invent a new language that's utterly incompatible.
Incremental change until you are forced to invent a new language.
Break compatibility in a controlled, thoughtful way.
I think that #1 and #2 are both coward's ways out. Chucking the old is easier than attempting to preserve it. Preserving every nuanced feature (no matter how bad) is a lot of work, some of it of little or no value.
Commercial enterprises opt for cowardly approaches in the name of "new marketing" or "preserving our existing customers". That's why commercial software ventures aren't hot-beds of innovation.
I think that only open-source projects can embrace innovation in the way that the Python community is tackling this change.
C# and the .NET framework broke compatibility between versions 1.0 and 1.1 as well as between 1.1 and 2.0. Running applications in different versions required having multiple versions of the .NET runtime installed.
At least they did include an upgrade wizard to upgrade source from one version to the next (it worked for most of our code).
Wouldn't VB6 to VB.net be the biggest example of this? Or do you all consider them two separate languages?
In the Lisp world it has happened a few times. Of course, the language is so dynamic that usually evolution simply means deprecating part of the standard library and standardizing another part.
Also, Lua 4 to 5 was pretty significant; but the language core is so minimal that even wide-reaching changes are documented in a couple of pages.
Perl 6 is also going through this type of split right now. Perl 5 programs won't run directly on Perl 6, but there will be a translator to translate the code into a form that may work (I don't think it can handle 100% of the cases).
Perl 6 even has its own article on Wikipedia.
First, here is a video talk about the changes Python will go through.
Second, changes are no good.
Third, I for one welcome evolution and believe it is necessary.
gcc regularly changes how it handles C++ almost every minor release. Of course, this is more a consequence of gcc tightening how they follow the rules, and less of C++ itself changing.
The new version of the Ruby programming language will also break compatibility.
And think of the libraries one might use: gtk, Qt, and so on (they also have incompatible versions).
I think incompatibility is necessary sometimes (but not too often) to support progress.
