As a language, is Python limited due to no end statement?

Since Python uses indentation to indicate scope (and as such, has no end or } symbols), does that limit the language in any way from having particular functionality?
Note: I'm not talking about personal preferences on coding style; I'm talking about a real language limitation that is a direct result of not having an end statement.
For example, it appears from a post directly from Guido that the lack of multi-line lambdas is due to Python not having a terminating end / } symbol.
If so, what other Python limitations are there because of this language design decision to use indentation?
Update:
Please note this question is not about lambdas and, technically, not even about Python per se. It's about programming language design, and what limitations a programming language has when it is designed to use indentation (as opposed to end statements) to represent block scope.

There is no lack of end/ }: an end is represented by a "dedent" to the previous depth. So there is no limitation in any way.
Example:
x = 123
while x > 10:
    if x % 21:
        print("x")
    print("y")
print("z")
A "begin" corresponds to increasing of indentation level (after while, after if).
An "end" corresponds to decreasing of indentation level (after the respective print()s).
If you omit the print("y"), you have a "dedentation" to the topmost level, which corresponds to having two successive "end"s.
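As a small illustration (my addition, not part of the original answer), the standard tokenize module makes these implicit begin/end markers visible as INDENT and DEDENT tokens:
import io
import tokenize

src = (
    "while x > 10:\n"
    "    if x % 21:\n"
    "        print('x')\n"
    "    print('y')\n"
    "print('z')\n"
)

# tokenize only lexes the text, it never executes it, so x need not be defined.
for tok in tokenize.generate_tokens(io.StringIO(src).readline):
    if tok.type in (tokenize.INDENT, tokenize.DEDENT):
        print(tokenize.tok_name[tok.type], "at line", tok.start[0])
This prints an INDENT for each "begin" and a DEDENT for each "end", exactly where an explicit end or } would appear.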

The answer to this question ranges somewhere between syntactic sugar and language style, i.e. how to phrase a problem elegantly and in keeping with the language's philosophy. Any Turing-complete language, even assembly language or C (definitely lacking any lambda support), can solve any problem. Lambda just allows a different (arguably more elegant, from a functional-language viewpoint) phrasing of things that can also be stated using a standard function definition. So I can't recognize a limitation here beyond having to code differently.

One of the biggest limitations (if you would call it that) is that you cannot mix tabs and spaces for indentation in the same program.
And you shouldn't. Ever.
There are no apparent structural limitations, except perhaps when parsing a python source file (parsing, not interpreting):
def foo(bar):
    # If bar contains multiple elements
    if len(bar) > 1:
        return bar
This is perfectly legal Python code; however, when parsing the file, you may run into trouble trying to figure out which indentation level the comment belongs to.

What do you mean by "limitation"? Do you mean there are computations Python cannot perform that other languages can? In that case, the answer is definitely no: Python is Turing complete.
Do you mean that the lack of end statements changes how computations are expressed in Python? In that case, the answer is "mostly not". You must understand that Python's dedent is an end statement; it's a byte sequence that the interpreter recognizes as the end of a block.
However, as others have mentioned, the use of indentation to denote blocks is awkward when it comes to inline functions (Python's lambda). This means the style of Python programs may differ slightly from that of, say, JavaScript (where it's common to write large inline functions).
That being said, many languages don't even have inline functions to begin with, so I wouldn't call this a limitation.
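To make the lambda point concrete (my example, not from the answers above): a lambda is restricted to a single expression, so anything needing statements has to become a named function.
# A lambda is limited to a single expression:
double = lambda x: x * 2

# Anything that needs statements (assignment, loops, print, ...) must be a named function:
def logged_double(x):
    print("doubling", x)
    return x * 2

print(list(map(double, [1, 2, 3])))         # [2, 4, 6]
print(list(map(logged_double, [1, 2, 3])))  # prints each value, then [2, 4, 6]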

Related

Questions about Python conceptual hierarchy

I'm reading through Learning Python (3rd Edition), by Mark Lutz, and I'm in the portion that is dealing with the nuts and bolts of Python syntax.
He defines the Python language-structure hierarchy as follows:
Programs are composed of modules
Modules contain statements
Statements contain expressions
Expressions create and process objects
I'm a little confused about the definition of Python statements.
I've heard expressions described as anything that has a value, but they can also contain things like addition, etc.
Is it safe to say that statements are structured operations on expressions that drive a module's logic?
Yes, you are almost there.
Expressions are something that evaluate to some value.
On the other hand, statements are something that cause some action.
That action can be on some object, based on result of an expression which may or may not involve some other object(s).
I found this with a quick Google search; is it what you were looking for?
What is the difference between an expression and a statement in Python?
"Statements (see 1, 2), on the other hand, are everything that can make up a line (or several lines) of Python code. Note that expressions are statements as well."
I'm quite wary of classifications like this, and especially attempts to make them into a hierarchy. An expression can also be, for example, a function call; I guess that falls into your "anything that is a value" definition since a function always returns a value even if it is None.
A statement is really everything else; assignment, flow control (eg defining a for or while loop, try/except, break, continue...), the introduction of a function or a class definition (the def or class keywords), and so on.
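A quick way to see the distinction (my example, not from the answers above) is to ask Python's own ast module: mode="eval" accepts only a single expression, while an assignment is a statement and must be parsed with mode="exec".
import ast

# "1 + foo()" is an expression: it evaluates to a value, so mode="eval" accepts it.
print(ast.dump(ast.parse("1 + foo()", mode="eval")))

# "x = 1 + foo()" is an assignment statement; it needs mode="exec".
print(ast.dump(ast.parse("x = 1 + foo()", mode="exec")))

# A statement is not an expression:
try:
    ast.parse("x = 1", mode="eval")
except SyntaxError as e:
    print("not an expression:", e)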

How do the for / while / print *things* work in python?

What I mean is, how is the syntax defined, i.e. how can I make my own constructs like these?
I realise in a lot of languages, things like this will be built into the compiler / spec, and so it's dealt with by the compiler (at least that's how I understand it to work).
But with Python, everything I've come across so far has been accessible to the programmer, and so you more or less have the freedom to do whatever you want.
How would I go about writing my own version of for or while? Is it even possible?
I don't have any actual application for this, so the answer to any WHY?! questions is just "because why not?" or "curiosity".
No, you can't, not from within Python. You can't add new syntax to the language. (You'd have to modify the source code of Python itself to make your own custom version of Python.)
Note that the iterator protocol allows you to define objects that can be used with for in a custom way, which covers a lot of the possible use cases of writing your own iteration syntax.
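As a small, hypothetical illustration of that point (my addition): an object only has to implement the iterator protocol (here, __iter__ via a generator) for the ordinary for statement to drive it.
class Countdown:
    """Iterable that counts down from start to 1 when used in a for loop."""
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        n = self.start
        while n > 0:
            yield n
            n -= 1

for i in Countdown(3):
    print(i)   # 3, then 2, then 1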
Well, you have a couple of options for creating your own syntax:
Write a higher-order function, like map or reduce (a small sketch follows this list).
Modify python at the C level. This is, as you might expect, relatively easy as compared with fiddling with many other languages. See this article for an example: http://eli.thegreenplace.net/2010/06/30/python-internals-adding-a-new-statement-to-python/
Fake it using the debug facilities, or the encodings facility. See this code: http://entrian.com/goto/download.html and http://timhatch.com/projects/pybraces/
Use a preprocessor. Here's one project that tries to make this easy: http://www.fiber-space.de/langscape/doc/index.html
Use of the python facilities built in to achieve a similar effect (decorators, metaclasses, and the like).
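A minimal sketch of the first option, with hypothetical names of my own (not from the answer): an ordinary function can act like a user-defined loop construct.
def loop_n(n, body):
    """A user-defined, for-like construct: call body(i) for each i in range(n)."""
    for i in range(n):
        body(i)

loop_n(3, lambda i: print("iteration", i))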
Obviously, none of this is quite what you're looking for, but Python, unlike Smalltalk or Lisp, isn't (necessarily) programmed in itself and doesn't guarantee to expose its own underlying execution and parsing mechanisms at runtime.
You can't make equivalent constructs. for, while, if etc. are statements, and they are built into the language with their own specific syntax. There are languages that do allow this sort of thing though (to some degree), such as Scala.
while, print, for etc. are keywords. That means the tokenizer turns them into tokens while reading the code (stripping any redundant characters), and the parser then takes those tokens as input and builds a program tree, which is then executed by the interpreter. That said, those constructs are just part of the underlying lexical and parsing machinery and as such are not visible from inside the code.
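For what it's worth (my addition), the standard keyword module lets you check which names are reserved; note that in Python 3, print became an ordinary built-in function rather than a keyword.
import keyword

print(keyword.iskeyword("for"))     # True  -- reserved syntax, cannot be redefined
print(keyword.iskeyword("while"))   # True
print(keyword.iskeyword("print"))   # False -- in Python 3, print is just a built-in function
# for = 1  # would be a SyntaxError: keywords cannot be used as names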

Why is it not possible to create a practical Perl to Python source code converter?

It would be nice if there existed a program that automatically transforms Perl code to Python code, making the resultant Python program as readable and maintainable as the original one, and of course working the same way.
The most obvious solution would be to just invoke perl from the Python script:
#!/usr/bin/python
import os
os.system("tail -n +4 " + __file__ + " | perl -")   # skip these three header lines, pipe the rest to perl
...the rest of the file is the original Perl program...
However, the resultant code is hardly Python code; it's essentially Perl code. The potential converter should convert Perl constructs and idioms to easy-to-read Python code, it should retain variable and subroutine names (i.e. the result should not look obfuscated), and it should not shatter the workflow too much.
Such a conversion is obviously very hard. The hardness of the conversion depends on the number of Perl features and syntactical constructs which do not have easy-to-read, unobfuscated Python equivalents. I believe that the large number of such features renders such automatic conversion practically impossible (though it is theoretically possible).
So, could you please name Perl idioms and syntax features that can't be expressed in Python as concisely as in the original Perl code?
Edit: some people linked Python-to-Perl converters and deduced, on this basis, that it should be easy to write a Perl-to-Python converter as well. However, I'm sure that converting to Python is in greater demand; still, this converter has not yet been written, while the reverse already has! Which only makes my confidence in the impossibility of writing a good converter to Python more solid.
Your best Perl to Python converter is probably 23 years old, just graduated university and is looking for a job.
Why Perl is not Python.
1. Perl has statements which Python more-or-less totally lacks. While you can probably contrive matching statements, the syntax will be so utterly unlike Perl as to make it difficult to call it a "translation". You'd really have to cook up some fancy Python stuff to make it as terse as the original Perl.
2. Perl has run-time semantics which are so unlike Python as to make translation very challenging. We'll look at just one example below.
3. Perl has data structures which are enough different from Python that translation is hard.
4. Perl threads don't share data by default. Only selected data elements can be shared. Python threads have more common "shared everything" data.
One example of #2 should be enough.
Perl:
do_something || die()
Where do_something is any statement of any kind.
To automagically translate this into Python you'd have to wrap every || die() statement in
try:
    python_version_of_do_something
except OrdinaryStatementFailure, e:
    die()
    sys.exit()
Where the more common formulation
Perl
do_something
Would become this using simple -- unthinking -- translation of the source
try:
    python_version_of_do_something
except OrdinaryStatementFailure, e:
    pass
And, of course,
Perl
do_this || do_that || die()
Is even more complex to translate into Python.
And
Perl
do_this && do_that || die()
really pushes the envelope. My Perl is rusty, so I can't recall the precise semantics of this kind of thing. But you have to totally understand the semantics to work out a Pythonic implementation.
The Python examples are not good Python. To write good Python requires "thinking", something an automatic translator can't do.
And every Perl construct would have to be "wrapped" like that in order to get the original Perl semantics into a Pythonic form.
Now, do a similar analysis for every feature of Perl.
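For contrast (my sketch, not the answer author's): a hand-written, rather than automatic, rendering of the last Perl line can be quite short, assuming do_this and do_that signal failure by returning a false value; the hard part is that an automatic tool cannot know that that assumption holds.
import sys

def do_this():
    return True   # hypothetical placeholder: assumed to return a false value on failure

def do_that():
    return True   # hypothetical placeholder

# Perl: do_this && do_that || die()
if not (do_this() and do_that()):
    sys.exit("died")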
Just to expand on some of the other lists here, these are a few Perl constructs that are probably very clumsy in Python (if they are possible at all).
dynamic scope (via the local keyword)
typeglob manipulation (multiple variables with the same name)
formats (they have a syntax all their own)
closures over mutable variables
pragmas
lvalue subroutines (mysub() = 5; type code)
source filters
context (list vs scalar, and the way that called code can inspect this with wantarray)
type coercion / dynamic typing
any program that uses string eval
The list goes on and on, and someone could try to create a mapping between all of the analogous constructs, but in the end it will be a failure for one simple reason.
Perl cannot be statically parsed. The definitions in Perl code (particularly those in BEGIN blocks) change the way the compiler is going to interpret the remaining code. So for non-trivial programs, conversion from Perl to Python suffers from the halting problem.
There is no way to know exactly how all of the program will be compiled until the program has finished running, and it is theoretically possible to create a Perl program that will compile differently every time it is run. This means that one Perl program could map to an infinite number of Python programs, the correct one of which is only known after running the original program in the perl interpreter.
It is not impossible, it would just take a lot of work.
By the way, there is Perthon, a Python-to-Perl translator. It just seems like nobody is willing to make one that goes the other way.
EDIT: I think I might have found the reason why a Python-to-Perl translator is much easier to implement. It's because Python lets you fiddle with a script's AST. See the parser module.
Perl can experimentally be built to collect additional information (for instance, comments) during compilation of perl code and even emit the results as XML. There doesn't appear to be any documentation of this outside the source, except for: http://search.cpan.org/perldoc/perl5100delta#MAD
This should be helpful in building a translator. I'd expect you to get 80% of the way there fairly easily, 95% with great difficulty, and never much better than that. There are too many things that don't map well.
Fundamentally, these are two different languages. Converting from one to another and have the result be mostly readable would mean that the software would have to be able to recognize and generate code idioms, and be able to do some static analysis.
The meaning of a program may be exactly defined by the language definition, but the programmer did not necessarily require all the details. A C programmer testing if the value a printf() returned is negative is checking for an error condition, and doesn't typically care about the exact value. if (printf("%s","...") < 0) exit(); can be translated into Perl as print "..." or die();. These statements may not mean exactly the same thing, but they'll typically be what the programmer means, and to create idiomatic C or Perl code from idiomatic Perl or C code the translator must take this into account.
Since different computer languages tend to have slightly different semantics for similar things, it's typically impossible to translate one language into another and come up with the exact same meaning in readable form. To create readable code, the translator needs to understand what the programmer was intending to do, and that's really difficult.
In addition, it would be easier to translate from Python to Perl rather than Perl to Python. Python is intended as a straightforward language with clear standard ways to do things, while Perl is an unduly complex language with the motto "There's More Than One Way To Do It." Translating a Python expression into one of the innumerable corresponding Perl expressions is easier than figuring out what the Perl programmer meant and expressing it in Python.
Python scope and namespace are different from Perl.
In Python, everything is an object. In Perl, everything under the hood seems to be a list/hash/scalar/reference/function. This induces different design approaches and idioms.
Perl has anonymous code blocks and can generate closures on the fly with some branches. I am pretty sure that is not a python feature.
I do think that a very smart chap could statically analyze the bulk of Perl and produce a program that takes small Perl programs and output Python programs that do the same job.
I am much more doubtful about the feasibility of large and/or gnarly Perl translation. Some of us write some really funky code at times.... :)
This is impossible just because you can't even properly parse perl code. See Perl Cannot Be Parsed: A Formal Proof for more details.
The B set of modules by Malcolm Beattie would be the only sane starting point for something like this, though I'm with other answers in that this would be a difficult problem to solve. In general, translating the sense of one high-level language into another high-level language requires a high-level translator, and, for the time being, that can mean only a human.
The difficulty of this problem, for any pair of languages, is due to fundamental differences in the nature of the languages in question, such as runtime semantics and common idioms, not to mention libraries.
The reason it is close to impossible to create a generic translator from one high-level language to another is that the program only describes HOW and not WHY (this is the reason for comments in the source code).
In order to create a meaningful program in another high-level language, you (or the translator program) need to know WHY in order to create the best possible program. If you cannot do that, all you can do is essentially create a Python interpreter for the compiled version of the Perl program.
In other words, to do this properly you need to go outside the box, and this is very hard for a computer.
NullUserException basically summed it up - it certainly can be done; it would just be an enormous amount of effort to do so. Some language conversion utilities I've seen compile to an intermediate language (such as .NET's CIL) and then decompile that to the desired language. I have not seen any for Perl to Python. You can, however, find a Python to Perl converter here, though that's likely of little use to you unless you're trying to create your own, in which case it may provide some helpful reference.
Edit: if you just need the exact functionality in a Python script, PyPerl may be of some use to you.
Try my version of the Pythonizer: http://github.com/snoopyjc/pythonizer - it does a decent job

Managing Perl habits in a Python environment

Perl habits die hard. Variable declaration, scoping, and global/local behaviour are different between the two languages. Is there a set of recommended Python language idioms that will render the transition from Perl coding to Python coding less painful?
Subtle variable misspelling can waste an extraordinary amount of time.
I understand the variable declaration issue is quasi-religious among Python folks. I'm not arguing for language changes or features, just a reliable bridge between the two languages that will not let my Perl habits sink my Python efforts.
Thanks.
Splitting Python classes into separate files (like in Java, one class per file) helps find scoping problems, although this is not idiomatic python (that is, not pythonic).
I have been writing python after much perl and found this from tchrist to be useful, even though it is old:
http://linuxmafia.com/faq/Devtools/python-to-perl-conversions.html
Getting used to doing without perl's most excellent variable scoping has been the second most difficult issue with my perl->python transition. The first is obvious if you have much perl: CPAN.
I like the question, but I don't have any experience in Perl so I'm not sure how to best advise you.
I suggest you do a Google search for "Python idioms". You will find some gems. In particular:
http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html
http://docs.python.org/dev/howto/doanddont.html
http://jaynes.colorado.edu/PythonIdioms.html
As for the variable "declaration" issue, here's my best advice for you:
Remember that in Python, objects have a life of their own, separate from variable names. A variable name is a tag that is bound to an object. At any time, you may rebind the name to a different object, perhaps of a completely different type. Thus, this is perfectly legal:
x = 1 # bind x to integer, value == 1
x = "1" # bind x to string, value is "1"
Python is in fact strongly typed; try executing the code 1 + "1" and see how well it works, if you don't believe me. The integer object with value 1 does not accept addition of a string value, in the absence of explicit type coercion. So Python names never ever have sigil characters that flag properties of the variable; that's just not how Python does things. Any legal identifier name could be bound to any Python object of any type.
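A short demonstration of both points (my addition): rebinding a name is always allowed, but mixing types without explicit coercion is not.
x = 1          # x bound to an int
x = "1"        # rebinding the same name to a str is perfectly legal

try:
    1 + "1"    # but mixing types without explicit coercion is not
except TypeError as e:
    print(e)   # unsupported operand type(s) for +: 'int' and 'str'

print(1 + int("1"))   # explicit coercion: 2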
In Python, $_ does not exist (the closest thing is _ in the interactive shell), and variables with global scope are frowned upon.
In practice this has two major effects:
In Python you can't use regular expressions as naturally as in Perl, so matching each iterated $_ and similarly capturing matches is more cumbersome (see the sketch after this list)
Python functions tend to be called explicitly or have default arguments
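As a small illustration of the regex point (my addition): where Perl matches against the implicit $_ and exposes $1, Python makes you create and check the match object explicitly.
import re

line = "2024-05-01 ERROR disk full"

# Perl (roughly):  if (/ERROR (.*)/) { print $1 }   -- matches against the implicit $_
# Python: the match object has to be created and checked explicitly:
m = re.search(r"ERROR (.*)", line)
if m:
    print(m.group(1))   # "disk full"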
However these differences are fairly minor when one considers that in Python just about everything becomes a class. When I used to do Perl I thought of "carving"; in Python I rather feel I am "composing".
Python doesn't have the idiomatic richness of Perl and I think it is probably a mistake to attempt to do the translation.
Read, understand, follow, and love PEP 8, which details the style guidelines for everything about Python.
Seriously, if you want to know about the recommended idioms and habits of Python, that's the source.
Don't mis-type your variable names. Seriously. Use short, easy, descriptive ones, use them locally, and don't rely on the global scope.
If you're doing a larger project that isn't served well by this, use pylint, unit tests and coverage.py to make SURE your code does what you expect.
Copied from a comment in one of the other threads:
"‘strict vars’ is primarily intended to stop typoed references and missed-out ‘my’s from creating accidental globals (well, package variables in Perl terms). This can't happen in Python as bare assignments default to local declaration, and bare unassigned symbols result in an exception."

Partial evaluation for parsing

I'm working on a macro system for Python (as discussed here) and one of the things I've been considering are units of measure. Although units of measure could be implemented without macros or via static macros (e.g. defining all your units ahead of time), I'm toying around with the idea of allowing syntax to be extended dynamically at runtime.
To do this, I'm considering using a sort of partial evaluation on the code at compile-time. If parsing fails for a given expression, due to a macro for its syntax not being available, the compiler halts evaluation of the function/block and generates the code it already has with a stub where the unknown expression is. When this stub is hit at runtime, the function is recompiled against the current macro set. If this compilation fails, a parse error would be thrown because execution can't continue. If the compilation succeeds, the new function replaces the old one and execution continues.
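A very rough sketch of the stub-and-recompile idea (hypothetical names of my own; plain compile/exec stands in for the real macro-aware compiler):
_pending_source = {}

def make_stub(name, source):
    """Return a stub that compiles `source` the first time it is called."""
    _pending_source[name] = source

    def stub(*args, **kwargs):
        # A SyntaxError raised here is the deferred "parse error at runtime".
        code = compile(_pending_source[name], "<deferred>", "exec")
        namespace = {}
        exec(code, namespace)
        real = namespace[name]
        globals()[name] = real      # replace the stub so later calls skip recompilation
        return real(*args, **kwargs)

    return stub

fun = make_stub("fun", "def fun():\n    return 42\n")
print(fun())   # compiled on first call, then executed: 42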
The biggest issue I see is that you can't find parse errors until the affected code is run. However, this wouldn't affect many cases, e.g. group operators like [], {}, (), and `` still need to be paired (requirement of my tokenizer/list parser), and top-level syntax like classes and functions wouldn't be affected since their "runtime" is really load time, where the syntax is evaluated and their objects are generated.
Aside from the implementation difficulty and the problem I described above, what problems are there with this idea?
Here are a few possible problems:
You may find it difficult to provide the user with helpful error messages in case of a problem. This seems likely, as any compile-time syntax error could just be an as-yet-undefined syntax extension.
Performance hit.
I was trying to find some discussion of the pluses, minuses, and/or implementation of dynamic parsing in Perl 6, but I couldn't find anything appropriate. However, you may find this quote from Niklaus Wirth (designer of Pascal and other languages) interesting:
The phantasies of computer scientists in the 1960s knew no bounds. Spurned by the success of automatic syntax analysis and parser generation, some proposed the idea of the flexible, or at least extensible language. The notion was that a program would be preceded by syntactic rules which would then guide the general parser while parsing the subsequent program. A step further: The syntax rules would not only precede the program, but they could be interspersed anywhere throughout the text. For example, if someone wished to use a particularly fancy private form of for statement, he could do so elegantly, even specifying different variants for the same concept in different sections of the same program. The concept that languages serve to communicate between humans had been completely blended out, as apparently everyone could now define his own language on the fly. The high hopes, however, were soon damped by the difficulties encountered when trying to specify, what these private constructions should mean. As a consequence, the intreaguing idea of extensible languages faded away rather quickly.
Edit: Here's Perl 6's Synopsis 6: Subroutines, unfortunately in markup form because I couldn't find an updated, formatted version; search within for "macro". Unfortunately, it's not too interesting, but you may find some things relevant, like Perl 6's one-pass parsing rule, or its syntax for abstract syntax trees. The approach Perl 6 takes is that a macro is a function that executes immediately after its arguments are parsed and returns either an AST or a string; Perl 6 continues parsing as if the source actually contained the return value. There is mention of generation of error messages, but they make it seem like if macros return ASTs, you can do alright.
Pushing this one step further, you could do "lazy" parsing and always only parse enough to evaluate the next statement. Like some kind of just-in-time parser. Then syntax errors could become normal runtime errors that just raise a normal Exception that could be handled by surrounding code:
def fun():
    not implemented yet

try:
    fun()
except:
    pass
That would be an interesting effect, but whether it's useful or desirable is a different question. Generally it's good to know about errors even if you don't call the code at the moment.
Macros would not be evaluated until control reaches them, and naturally the parser would already know all previous definitions. Also, the macro definition could maybe even use variables and data that the program has calculated so far (like adding some syntax for all elements in a previously calculated list). But it is probably a bad idea to start writing self-modifying programs for things that could usually be done just as well directly in the language. This could get confusing...
In any case you should make sure to parse code only once, and if it is executed a second time use the already parsed expression, so that it doesn't lead to performance problems.
Here are some ideas from my master's thesis, which may or may not be helpful.
The thesis was about robust parsing of natural language.
The main idea: given a context-free grammar for a language, try to parse a given text (or, in your case, a Python program). If parsing fails, you will have a partially generated parse tree. Use the tree structure to suggest new grammar rules that will better cover the parsed text.
I could send you my thesis, but unless you read Hebrew this will probably not be useful.
In a nutshell:
I used a bottom-up chart parser. This type of parser generates edges for productions from the grammar. Each edge is marked with the part of the tree that was consumed. Each edge gets a score according to how close it was to full coverage, for example:
S -> NP . VP
Has a score of one half (We succeeded in covering the NP but not the VP).
The highest-scored edges suggest a new rule (such as X->NP).
In general, a chart parser is less efficient than a common LALR or LL parser (the types usually used for programming languages) - O(n^3) instead of O(n) complexity, but then again you are trying something more complicated than just parsing an existing language.
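A toy rendering of the scoring idea (my sketch; a real chart parser also tracks which span of the input each edge covers):
class Edge:
    """A dotted rule such as  S -> NP . VP  with a naive coverage score."""
    def __init__(self, lhs, rhs, dot):
        self.lhs, self.rhs, self.dot = lhs, rhs, dot

    def score(self):
        return self.dot / len(self.rhs)   # fraction of the right-hand side already covered

edge = Edge("S", ["NP", "VP"], dot=1)     # S -> NP . VP
print(edge.score())                       # 0.5: the NP is covered, the VP is not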
If you can do something with the idea, I can send you further details.
I believe looking at natural language parsers may give you some other ideas.
Another thing I've considered is making this the default behavior across the board, but allowing languages (meaning a set of macros to parse a given language) to throw a parse error at compile time. Python 2.5 in my system, for example, would do this.
Instead of the stub idea, simply recompile functions that couldn't be handled completely at compile-time when they're executed. This will also make self-modifying code easier, as you can modify the code and recompile it at runtime.
You'll probably need to delimit the bits of input text with unknown syntax, so that the rest of the syntax tree can be resolved, apart from some character-sequence nodes which will be expanded later. Depending on your top-level syntax, that may be fine.
You may find that the parsing algorithm and the lexer and the interface between them all need updating, which might rule out most compiler creation tools.
(The more usual approach is to use string constants for this purpose, which can then be parsed by a little interpreter at run time.)
I don't think your approach would work very well. Let's take a simple example written in pseudo-code:
define some syntax M1 with definition D1
if _whatever_:
    define M1 to do D2
else:
    define M1 to do D3
code that uses M1
So there is one example where, if you allow syntax redefinition at runtime, you have a problem (since with your approach the code that uses M1 would be compiled using definition D1). Note that verifying whether syntax redefinition occurs is undecidable. An over-approximation could be computed by some kind of typing system or some other kind of static analysis, but Python is not well known for this :D.
Another thing that bothers me is that your solution does not 'feel' right. I find it evil to store source code you can't parse just because you may be able to parse it at runtime.
Another example that jumps to mind is this:
...function definition fun1 that calls fun2...
define M1 (at runtime)
use M1
...function definition for fun2
Technically, when you use M1, you cannot parse it, so you need to keep the rest of the program (including the function definition of fun2) as unparsed source code. When you run the entire program, fun1 will hit a call to fun2 that cannot be resolved, even though fun2 is defined later in the source.
