Reading a line from standard input in Python - python

What (if any) are the differences between the following two methods of reading a line from standard input: raw_input() and sys.stdin.readline()? And in which cases is one of these methods preferable to the other?

raw_input() takes an optional prompt argument. It also strips the trailing newline character from the string it returns, and supports history features if the readline module is loaded.
readline() takes an optional size argument, does not strip the trailing newline character and does not support history whatsoever.
Since they don't do the same thing, they're not really interchangeable. I personally prefer using raw_input() to fetch user input, and readline() to read lines out of a file.
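The newline difference described above can be checked with a small sketch. It is written for Python 3, where raw_input() is named input(); the helper name read_both and the use of io.StringIO as a stand-in for real standard input are assumptions made for illustration only.

```python
import io
import sys


def read_both(text):
    """Read the same input once with input() and once with readline()."""
    original_stdin = sys.stdin
    try:
        # input() is the Python 3 name for raw_input().
        sys.stdin = io.StringIO(text)
        via_input = input()
        # Start again from the beginning of the same text.
        sys.stdin = io.StringIO(text)
        via_readline = sys.stdin.readline()
    finally:
        sys.stdin = original_stdin
    return via_input, via_readline


if __name__ == "__main__":
    stripped, unstripped = read_both("hello world\n")
    print(repr(stripped))    # trailing newline removed
    print(repr(unstripped))  # trailing newline kept
```

If you use readline() for user input, you would typically call rstrip("\n") on the result yourself.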

"However, from the point of view of many Python beginners and educators, the use of sys.stdin.readline() presents the following problems:
Compared to the name "raw_input", the name "sys.stdin.readline()" is clunky and inelegant.
The names "sys" and "stdin" have no meaning for most beginners, who are mainly interested in what the function does, and not where in the package structure it is located. The lack of meaning also makes it difficult to remember: is it "sys.stdin.readline()", or " stdin.sys.readline()"? To a programming novice, there is not any obvious reason to prefer one over the other. In contrast, functions simple and direct names like print, input, and raw_input, and open are easier to remember." from here: http://www.python.org/dev/peps/pep-3111/

Related

Parsing blocks as Python

I am writing a lexer + parser in JFlex + CUP, and I wanted to have Python-like syntax regarding blocks; that is, indentation marks the block level.
I am unsure how to tackle this, and whether it should be done at the lexical or syntax level.
My current approach is to solve the issue at the lexical level: newlines are parsed as instruction separators, and when one is processed I move the lexer to a special state which checks how many characters are in front of the new line and remembers in which column the last line started, and accordingly introduces an open-block or close-block character.
However, I am running into all sorts of trouble. For example:
JFlex cannot match empty strings, so my instructions need to have at least one blank after every newline.
I cannot close two blocks at the same time with this approach.
Is my approach correct? Should I be doing things differently?
Your approach of handling indents in the lexer rather than the parser is correct. Well, it’s doable either way, but this is usually the easier way, and it’s the way Python itself (or at least CPython and PyPy) does it.
I don’t know much about JFlex, and you haven’t given us any code to work with, but I can explain in general terms.
For your first problem, you're already putting the lexer into a special state after the newline, so that "grab 0 or more spaces" should be doable by escaping from the normal flow of things and just running a regex against the line.
For your second problem, the simplest solution (and the one Python uses) is to keep a stack of indents. I'll demonstrate something a bit simpler than what Python does.
First:
indents = [0]
After each newline, grab a run of 0 or more spaces as spaces. Then:
if len(spaces) == indents[-1]:
    pass
elif len(spaces) > indents[-1]:
    indents.append(len(spaces))
    emit(INDENT_TOKEN)
else:
    while len(spaces) != indents[-1]:
        indents.pop()
        emit(DEDENT_TOKEN)
Now your parser just sees INDENT_TOKEN and DEDENT_TOKEN, which are no different from, say, OPEN_BRACE_TOKEN and CLOSE_BRACE_TOKEN in a C-like language.
Of course, you'd want better error handling: raise some kind of tokenizer error rather than an implicit IndexError, maybe use < instead of != so you can detect that you've gone too far instead of exhausting the stack (for better error recovery if you want to continue to emit further errors instead of bailing at the first one), etc.
For real-life example code (with error handling, and tabs as well as spaces, and backslash newline escaping, and handling non-syntactic indentation inside of parenthesized expressions, etc.), see the tokenize docs and source in the stdlib.
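As a rough illustration of the indent-stack idea with the suggested < comparison and an explicit error, here is a minimal runnable sketch. It is not what CPython's tokenizer does in detail: the function name tokenize_indentation and the tuple-shaped tokens are made up for this example, and it only understands spaces (no tabs, no line continuation, no parentheses).

```python
def tokenize_indentation(lines):
    """Yield ('INDENT',), ('DEDENT',) and ('LINE', text) tuples."""
    indents = [0]  # stack of currently open indentation widths
    for lineno, line in enumerate(lines, start=1):
        stripped = line.lstrip(" ")
        if not stripped.strip():
            continue  # blank lines do not affect block structure
        width = len(line) - len(stripped)
        if width > indents[-1]:
            indents.append(width)
            yield ("INDENT",)
        else:
            # Pop one level per DEDENT, so several blocks can close at once.
            while width < indents[-1]:
                indents.pop()
                yield ("DEDENT",)
            if width != indents[-1]:
                raise IndentationError(
                    "line %d: dedent does not match any outer level" % lineno
                )
        yield ("LINE", stripped.rstrip("\n"))
    # Close any blocks still open at end of input.
    while len(indents) > 1:
        indents.pop()
        yield ("DEDENT",)


if __name__ == "__main__":
    source = ["a\n", "  b\n", "    c\n", "d\n"]
    for token in tokenize_indentation(source):
        print(token)
```

Because 0 always stays at the bottom of the stack and widths are never negative, the while loop cannot exhaust the stack; a mismatched dedent is reported explicitly instead.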

pass string between two python scripts

I need to pass a string value between two python scripts.
It's not an argument but it's a string containing a sentence (with spaces, commas and so on).
example:
one.py has a string variable "hello world, how are you today?"
and I need to pass it to two.py
How can I achieve this result?
It's not an argument but it's a string containing a sentence (with spaces, commas and so on).
Why isn't that an argument?
I don't know how you were planning to run the other script, but pretty much any way of doing so allows you to pass strings with spaces, commas and so on as arguments.
If you're doing things the smart way, it works automatically:
subprocess.check_call([sys.executable, path_to_script2, arg])
If you're doing something like os.system you'll have to quote the argument manually to pass it through the shell… but the easiest answer there is "don't use os.system", so I won't show how to do that unless you ask for it specifically.
Either way, when script2 runs, its sys.argv[1] will be arg, with the spaces and commas and so on preserved.
If the string is too big, you may run into problems with maximum argv length—and, worse, they may be different problems on different platforms.
Also, if you're using Unicode, especially in Python 2.x, there can be some complexities to deal with.
But, for short-ish all-ASCII strings like "hello world, how are you today?", it's all trivial.
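To see the argument-passing approach end to end, here is a small self-contained sketch. Instead of a real two.py it uses a one-line child program passed with -c, which is purely an assumption to keep the example in one file; the names CHILD_CODE and send_to_child are hypothetical.

```python
import subprocess
import sys

# Stand-in for two.py: echoes its first argument back unchanged.
CHILD_CODE = "import sys; sys.stdout.write(sys.argv[1])"


def send_to_child(message):
    """Run the child with message as a single argument and return its output."""
    # Passing a list (no shell) means spaces and commas need no quoting.
    return subprocess.check_output(
        [sys.executable, "-c", CHILD_CODE, message],
        universal_newlines=True,
    )


if __name__ == "__main__":
    print(send_to_child("hello world, how are you today?"))
```

With an actual second script, you would replace "-c", CHILD_CODE with the path to that script, as in the check_call line above.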
I'd suggest using a text document that one script writes to and the other reads from. It should be pretty simple to implement.
Documentation for reading and writing files can be found here:
http://docs.python.org/2/tutorial/inputoutput.html#reading-and-writing-files
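A minimal sketch of the file-based approach might look like the following. The function names write_message and read_message are hypothetical; in practice the first would live in one.py and the second in two.py, and both scripts would have to agree on the path.

```python
import os
import tempfile


def write_message(path, message):
    """What the sending script would do: store the string in a file."""
    with open(path, "w") as handle:
        handle.write(message)


def read_message(path):
    """What the receiving script would do: load the string back."""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    # A temporary directory keeps the demonstration from leaving files behind.
    folder = tempfile.mkdtemp()
    shared_path = os.path.join(folder, "message.txt")
    write_message(shared_path, "hello world, how are you today?")
    print(read_message(shared_path))
    os.remove(shared_path)
    os.rmdir(folder)
```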

^H ^? in python

Some terminals will send ^? as backspace, some other terminals will send ^H.
Most of the terminals can be configured to change their behavior.
I do not want to deal with all the possible combinations but I would like to accept both ^? and ^H as a backspace from python.
doing this
os.system("stty erase '^?'")
I will accept the first option and with
os.system("stty erase '^H'")
I will accept the second one but the first will be no longer available.
I would like to use
raw_input("userinput>>")
to grab the input.
The only way I was able to figure out is implementing my own shell which works not on "raw based input" but on "char based input".
Any better (and quicker) idea?
The built-in function raw_input() (or input() in Python 3) will automatically use the readline library after importing it. This gives you a nice and full-featured line editor, and it is probably your best bet on platforms where it is available, as long as you don't mind Readline having a contagious licence (GPL).
I'm not sure I fully understand your question, but I think you need a way to read line-based text (including some special characters) from the console into your program.
Whatever method you use, if a character has a special meaning that differs between consoles, you are facing a problem that is console-specific, not just system-specific: all text typed in the console is first stored in a buffer, then shown on screen, and only then processed and sent to your program. Another way around this problem is to use a console environment that gives you the raw line.
You can add a wrapper (a decorator) around raw_input(), or whichever input function you use, to process the special characters.
Once that is sorted out, a snippet like this can handle the input:
def pre():
    textline = raw_input()
    # "^?" and "^H" stand for the actual control characters; substitute the specific values.
    # str.replace() returns a new string, so the result must be assigned.
    textline = textline.replace("^?", "^H")
    return textline
For more speed, you could call an OS-specific system function, but in practice I/O in Python is fast enough for common jobs.
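One way to flesh out the post-processing idea is to treat both control characters as "erase the previous character" once the line has been read. This is only a sketch, and it rests on an assumption: that the character the terminal is not configured to use as erase arrives in the returned string as a literal byte (\x08 for ^H, \x7f for ^?). The name apply_backspaces is hypothetical.

```python
# ^H is ASCII backspace (0x08) and ^? is ASCII delete (0x7f).
BACKSPACE_CHARS = ("\x08", "\x7f")


def apply_backspaces(text):
    """Return text with every ^H or ^? removing the character before it."""
    result = []
    for ch in text:
        if ch in BACKSPACE_CHARS:
            if result:
                result.pop()  # erase the previous character, if any
        else:
            result.append(ch)
    return "".join(result)


if __name__ == "__main__":
    print(apply_backspaces("helo\x08lo"))
    print(apply_backspaces("ab\x7fc"))
```

You would call it on the value returned by raw_input("userinput>>"). The erased characters would still have been echoed on screen, so this cleans up the data but not the display.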
To fix ^? on erase do stty erase ^H

Why does python use unconventional triple-quotation marks for comments?

Why didn't python just use the traditional style of comments like C/C++/Java uses:
/**
* Comment lines
* More comment lines
*/
// line comments
// line comments
//
Is there a specific reason for this or is it just arbitrary?
Python doesn't use triple quotation marks for comments. Comments use the hash (a.k.a. pound) character:
# this is a comment
The triple quote thing is a doc string, and, unlike a comment, is actually available as a real string to the program:
>>> def bla():
...     """Print the answer"""
...     print 42
...
>>> bla.__doc__
'Print the answer'
>>> help(bla)
Help on function bla in module __main__:
bla()
    Print the answer
It's not strictly required to use triple quotes, as long as it's a string. Using """ is just a convention (and has the advantage of being multiline).
A number of the answers got many of the points, but don't give the complete view of how things work. To summarize...
# comment is how Python does actual comments (similar to bash, and some other languages). Python only has "to the end of the line" comments, it has no explicit multi-line comment wrapper (as opposed to javascript's /* .. */). Most Python IDEs let you select-and-comment a block at a time, this is how many people handle that situation.
Then there are normal single-line python strings: They can use ' or " quotation marks (eg 'foo' "bar"). The main limitation with these is that they don't wrap across multiple lines. That's what multiline-strings are for: These are strings surrounded by triple single or double quotes (''' or """) and are terminated only when a matching unescaped terminator is found. They can go on for as many lines as needed, and include all intervening whitespace.
Either of these two string types define a completely normal string object. They can be assigned a variable name, have operators applied to them, etc. Once parsed, there are no differences between any of the formats. However, there are two special cases based on where the string is and how it's used...
First, if a string is just written down, with no additional operations applied, and not assigned to a variable, what happens to it? When the code executes, the bare string is basically discarded. So people have found it convenient to comment out large bits of python code using multi-line strings (providing you escape any internal multi-line strings). This isn't that common, or semantically correct, but it is allowed.
The second use is that any such bare strings which follow immediately after a def Foo(), class Foo(), or the start of a module, are treated as a string containing documentation for that object, and stored in the __doc__ attribute of the object. This is the most common case where strings can seem like they are a "comment". The difference is that they are performing an active role as part of the parsed code, being stored in __doc__... and unlike a comment, they can be read at runtime.
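The two special cases can be seen side by side in a short sketch. The three function names are invented for the example.

```python
def documented():
    """Stored in __doc__."""
    return 1


def commented():
    # A real comment: never visible at runtime.
    return 1


def bare_string_later():
    value = 1
    """Not the first statement, so it is evaluated and discarded."""
    return value


if __name__ == "__main__":
    print(documented.__doc__)
    print(commented.__doc__)
    print(bare_string_later.__doc__)
```

Only the string that is the first statement of the function body ends up in __doc__; the other two functions have a __doc__ of None.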
Triple-quotes aren't comments. They're string literals that span multiple lines and include those line breaks in the resulting string. This allows you to use
somestr = """This is a rather long string containing
several lines of text just as you would do in C.
Note that whitespace at the beginning of the line is\
significant."""
instead of
somestr = "This is a rather long string containing\n\
several lines of text just as you would do in C.\n\
Note that whitespace at the beginning of the line is\
significant."
Most scripting languages use # as the comment marker so that the shebang line (#!), which tells the program loader which interpreter to run (as in #!/bin/bash), is skipped automatically. Alternatively, the interpreter could be instructed to skip the first line, but it's far more convenient just to define # as the comment marker, so the shebang is skipped as a consequence.
Guido - the creator of Python, actually weighs in on the topic here:
https://twitter.com/gvanrossum/status/112670605505077248?lang=en
In summary - for multiline comments, just use triple quotes. For academic purposes - yes it technically is a string, but it gets ignored because it is never used or assigned to a variable.

In Python what's the best way to emulate Perl's __END__?

Am I correct in thinking that Python doesn't have a direct equivalent for Perl's __END__?
print "Perl...\n";
__END__
End of code. I can put anything I want here.
One thought that occurred to me was to use a triple-quoted string. Is there a better way to achieve this in Python?
print "Python..."
"""
End of code. I can put anything I want here.
"""
The __END__ block in perl dates from a time when programmers had to work with data from the outside world and liked to keep examples of it in the program itself.
Hard to imagine I know.
It was useful, for example, if you had a moving target like a hardware log file with mutating messages due to firmware updates, where you wanted to compare old and new versions of the line, or keep notes not strictly related to the program's operations ("Code seems slow on day x of month every month"), or, as mentioned above, a reference set of data to run the program against. Telcos are an example of an industry where this was a frequent requirement.
Lastly, Python's cult-like restrictiveness seems to have a real and tiresome effect on the mindset of its advocates. If your only response to a question is "Why would you want to do that when you could do X?" when X is not as useful, please keep quiet++.
The triple-quote form you suggested will still create a python string, whereas Perl's parser simply ignores anything after __END__. You can't write:
"""
I can put anything in here...
Anything!
"""
import os
os.system("rm -rf /")
Comments are more suitable in my opinion.
#__END__
#Whatever I write here will be ignored
#Woohoo !
What you're asking for does not exist.
Proof: http://www.mail-archive.com/python-list@python.org/msg156396.html
A simple solution is to escape any " as \" and do a normal multi line string -- see official docs: http://docs.python.org/tutorial/introduction.html#strings
( Also, atexit doesn't work: http://www.mail-archive.com/python-list@python.org/msg156364.html )
Hm, what about sys.exit(0) ? (assuming you do import sys above it, of course)
As to why it would be useful, sometimes I sit down to do a substantial rewrite of something and want to mark my "good up to this point" place.
By using sys.exit(0) in a temporary manner, I know nothing below that point will get executed, therefore if there's a problem (e.g., server error) I know it had to be above that point.
I like it slightly better than commenting out the rest of the file, just because there are more chances to make a mistake and uncomment something (stray key press at beginning of line), and also because it seems better to insert 1 line (which will later be removed), than to modify X-many lines which will then have to be un-modified later.
But yeah, this is splitting hairs; commenting works great too... assuming your editor supports easily commenting out a region, of course; if not, sys.exit(0) all the way!
I use __END__ all the time for multiples of the reasons given. I've been doing it for so long now that I put it (usually preceded by an exit('0');), along with BEGIN {} / END{} routines, in by force-of-habit. It is a shame that Python doesn't have an equivalent, but I just comment-out the lines at the bottom: extraneous, but that's about what you get with one way to rule them all languages.
Python does not have a direct equivalent to this.
Why do you want it? It doesn't sound like a really great thing to have when there are more consistent ways like putting the text at the end as comments (that's how we include arbitrary text in Python source files. Triple quoted strings are for making multi-line strings, not for non-code-related text.)
Your editor should be able to make using many lines of comments easy for you.
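For the use case mentioned earlier of keeping reference data inside the program itself, one option is to assign a triple-quoted string to a name at the bottom of the file and parse it when needed. This is a sketch, not an equivalent of __END__: the text still has to be a valid string literal. The name EMBEDDED_DATA and the sample log lines are invented for illustration.

```python
def embedded_records(text):
    """Split the embedded text into one list of fields per non-blank line."""
    return [line.split() for line in text.splitlines() if line.strip()]


# Made-up sample data kept at the end of the file, after all the code.
EMBEDDED_DATA = """\
2011-09-10 firmware=1.0 status=ok
2011-09-11 firmware=1.1 status=degraded
"""

if __name__ == "__main__":
    for record in embedded_records(EMBEDDED_DATA):
        print(record)
```

Unlike a bare string, the assigned string is not discarded, so the program can actually run against the data.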
