s='s=%r;print(s%%s)';print(s%s)
I understand that % replaces something in a string with s, but what exactly gets replaced?
More intriguing still: why does print(s%%s) automatically become print(s%s) once s has been substituted into itself?
The "%%" you see in that code is a "conversion specifier" for the older printf-style of string formatting.
Most conversion specifiers tell Python how to convert an argument that is passed into the % format operator (for instance, "%d" says to convert the next argument to a decimal integer before inserting it into the string).
"%%" is different, because it directly converts to a single "%" character without consuming an argument. This conversion is needed in the format string specification, since otherwise any "%" would be taken as the first part of some other code and there would be no easy way to produce a string containing a percent sign.
The code you show is a quine (a program that produces its own code as its output). When it runs print(s%s), it does a string formatting operation where both the format string, and the single argument are the same string, s.
The "%r" in the string is a conversion specifier that does a repr of its argument. repr on a string produces the string with quotes around it. This is where the quoted string comes from in the output.
The "%%" produces the % operator that appears between the two s's in the print call. If only one "%" was included in s, you'd get an error about the formatting operation expecting a second argument (since %s is another conversion specifier).
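To make the two conversions concrete, here is a small sketch that runs the same formatting step by hand instead of printing it. The names out, broken and error are only for illustration and are not part of the original one-liner.

```python
# Same template as the quine, formatted by hand instead of printed.
s = 's=%r;print(s%%s)'
out = s % s  # %r inserts repr(s) with its quotes, %% collapses to one %

# With a single % the template contains a real %s specifier,
# so formatting wants two arguments but only receives one.
broken = 's=%r;print(s%s)'
try:
    broken % broken
    error = None
except TypeError as exc:
    error = str(exc)

print(out)    # s='s=%r;print(s%%s)';print(s%s)
print(error)  # not enough arguments for format string
```

The first printed line is character for character the program itself, which is the whole trick.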
print('% %s' % '')   # wrong
print('%% %s' % '')  # correct, prints '% '
Think of how \\ stands for a single \ in a string literal.
_='_=%r;print (_%%_) ';print (_%_)
(Edit: I have received your input and fixed the code. Thanks for the correction.)
This is the shortest quine you can write in Python (I'm told). A quine is code that prints its own source.
Can someone explain this line of code to me as if I know nothing about Python? I use Python 3.x by the way.
What I'm looking for is a character-by-character explanation of what's going on.
Thanks.
As pointed out in the comments, the correct quine is _='_=%r;print (_%%_) ';print (_%_). Using this, let's begin:
The ; separates two statements on one line, so the following:
_='_=%r;print (_%%_) ';print (_%_)
is equivalent to:
_='_=%r;print (_%%_) '
print (_%_)
In the first line, _ is a valid variable name which is assigned the string '_=%r;print (_%%_) '
Using Python's string formatting, we can inject variables into strings in a printf fashion:
>>> name = 'GNU'
>>> print('%s is Not Unix'%name)
GNU is Not Unix
>>> print('%r is Not Unix'%name)
'GNU' is Not Unix
%s converts its argument with str(), while %r accepts any object and converts it to its representation through the repr() function.
Now imagine you want to print a % as well; a string such as GNU is Not Unix %. If you try the following,
>>> print('%s is Not Unix %'%name)
You will end up with a ValueError, so you would have to escape the % with another %:
>>> print('%s is Not Unix %%'%name)
GNU is Not Unix %
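The difference between %s and %r is easier to see with values whose str() and repr() differ more visibly. This is only a sketch; the sample values (a string containing an apostrophe and an arbitrary date) are my own choice and not from the question.

```python
import datetime

text = "it's"
day = datetime.date(2020, 1, 2)  # arbitrary example date

as_str = '%s' % text        # it's
as_repr = '%r' % text       # "it's"  (repr adds quotes)
day_str = '%s' % day        # 2020-01-02
day_repr = '%r' % day       # datetime.date(2020, 1, 2)

# A lone trailing % is an incomplete specifier and fails at runtime.
try:
    '%s is Not Unix %' % 'GNU'
    failure = None
except ValueError as exc:
    failure = exc

# Doubling it produces a literal percent sign.
escaped = '%s is Not Unix %%' % 'GNU'
```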
Back to the original code: when you evaluate _%_, you substitute the string _ itself for the %r in '_=%r;print (_%%_) ', and the %% becomes a single % because the first % acts as an escape character. Finally, you print the whole result, so you end up with:
_='_=%r;print (_%%_) ';print (_%_)
which is the exact replica of what produced it in the first place i.e. a quine.
I have a piece of code as below:
tupvalue = [('html', 96), ('css', 115), ('map', 82)]
While looking for a way to print one element of the above list in the desired format, I found code like this:
>>> '%s:%d' % tupvalue[0]
'html:96'
I'm wondering how the single value tupvalue[0] is recognised as a tuple of two values by the format specifier '%s:%d'? Please explain this mechanism with a documentation reference.
How can I use a comprehension to format all the values in tupvalue in the required format as in the example shown?
First, the easy question:
How can I use a comprehension to format all the values in tupvalue in the required format as in the example shown?
That's a list comprehension: ['%s:%d' % t for t in tupvalue]
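Spelled out as a runnable sketch (the loop variable names name and count in the second form are my own, purely for readability):

```python
tupvalue = [('html', 96), ('css', 115), ('map', 82)]

# Each element is already a 2-tuple, so it can feed both specifiers directly.
formatted = ['%s:%d' % t for t in tupvalue]

# Unpacking in the loop makes the two fields explicit; the result is the same.
explicit = ['%s:%d' % (name, count) for name, count in tupvalue]

print(formatted)  # ['html:96', 'css:115', 'map:82']
```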
Now, the harder question!
how the single value tupvalue[0] is recognised as a tuple of two values by the format specifier '%s:%d'?
Your intuition that something a bit strange is going on here is correct. Tuples are special-cased in the language for use with string formatting.
>>> '%s:%d' % ('css', 115) # tuple is seen as two elements
'css:115'
>>> '%s:%d' % ['css', 115] # list is just seen as one object!
TypeError: not enough arguments for format string
The percent-style string formatting does not duck-type properly. So, if you actually want to format a tuple, you have to wrap it in another tuple, unlike any other kind of object:
>>> '%s' % []
'[]'
>>> '%s' % ((),)
'()'
>>> '%s' % ()
TypeError: not enough arguments for format string
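One common way to sidestep the special case is to always pass a one-element tuple, so the value itself is never mistaken for the argument list. A minimal sketch, where show is a hypothetical helper name:

```python
def show(value):
    """Format any single object with %s, tuples included."""
    # The trailing comma builds a 1-tuple, so value is always one argument.
    return '%s' % (value,)

print(show([]))            # []
print(show(()))            # ()
print(show(('css', 115)))  # ('css', 115)
```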
The relevant section of the documentation is at section 4.7.2. printf-style String Formatting, where it is mentioned:
If format requires a single argument, values may be a single non-tuple object. Otherwise, values must be a tuple with exactly the number of items specified by the format string
The odd handling of tuples is one of the quirks called out in the note at the beginning of that section of the documentation, and one of the reasons that the newer string formatting method str.format is recommended instead.
Note that the handling of the string formatting happens at runtime†. You can verify this with the abstract syntax tree:
>>> import ast
>>> ast.dump(ast.parse('"%s" % val'))
"Module(body=[Expr(value=BinOp(left=Str(s='%s'), op=Mod(), right=Name(id='val', ctx=Load())))])"
'%s' % val parses to a binary operation on '%s' and val, which is handled like '%s'.__mod__(val); in CPython that's a BINARY_MODULO opcode (BINARY_OP in CPython 3.11 and later). This means it's usually up to the str type to decide what to do when the val received is incorrect*, which occurs only once the expression is evaluated, i.e. once the interpreter has reached that line. So it doesn't really matter whether val is the wrong type or has too few or too many elements: that's a runtime error, not a syntax error.
† Except in some special cases where CPython's peephole optimizer is able to "constant fold" it at compile time.
* Unless val's type subclasses str, in which case type(val).__rmod__ should be able to control the result.
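Both points can be checked without relying on the exact text of ast.dump, which differs between Python versions. The following is a sketch only; Loud is a made-up subclass used to illustrate the __rmod__ footnote.

```python
import ast

# The expression compiles fine: it is just a BinOp with a Mod operator.
expr = ast.parse('"%s:%d" % val', mode='eval')
node = expr.body
is_modulo = isinstance(node, ast.BinOp) and isinstance(node.op, ast.Mod)

# Whether it works is only decided when it is evaluated with a concrete val.
code = compile(expr, '<sketch>', 'eval')
good = eval(code, {'val': ('css', 115)})
try:
    eval(code, {'val': ['css', 115]})
    runtime_error = None
except TypeError as exc:
    runtime_error = exc


class Loud(str):
    """Hypothetical str subclass whose __rmod__ takes over the operation."""

    def __rmod__(self, fmt):
        return 'rmod got %r' % (fmt,)


hijacked = '%s' % Loud('x')  # Loud.__rmod__ is tried before str.__mod__
```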
for each_ID, each_Title in zip(Id, Title):
    url="http://www.zjjsggzy.gov.cn/%E6%96%B0%E6%B5%81%E7%A8%8B/%E6%8B%9B%E6%8A%95%E6%A0%87%E4%BF%A1%E6%81%AF/jyxx_1.html?iq=x&type=%E6%8B%9B%E6%A0%87%E5%85%AC%E5%91%8A&tpid=%s&tpTitle=%s"%(each_ID,each_Title)
"each_ID" and "each_Title" come from Unicode parameters on the website, so why does this raise a "float" error? Isn't %s a string placeholder?
You have loads of % formatters in your string. %E formats a float object. You have several of those in your string, including at the start:
"http://www.zjjsggzy.gov.cn/%E6
#                           ^^
You'd need to double up every single % used in a URL character escape:
"http://www.zjjsggzy.gov.cn/%%E6%%96%%B0%%E6%%B5%%81%%E7%%A8%%8B/..."
That'd be a lot of work; you'd be better off using a different string formatting style. Use str.format():
url = (
    "http://www.zjjsggzy.gov.cn/"
    "%E6%96%B0%E6%B5%81%E7%A8%8B/%E6%8B%9B%E6%8A%95%E6%A0%87%E4%BF%A1%E6%81%AF"
    "/jyxx_1.html?iq=x&type=%E6%8B%9B%E6%A0%87%E5%85%AC%E5%91%8A&"
    "tpid={}&tpTitle={}".format(
        each_ID, each_Title)
)
I broke the string up into multiple chunks to make it easier to read; the {} brackets delineate the placeholders.
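The same failure and both fixes can be reproduced on a much shorter URL. This is a sketch with a made-up address (example.invalid) and a made-up id value, not the real site:

```python
# %E is read as a float conversion, so the string argument is rejected.
template = "http://example.invalid/%E6%96%B0?tpid=%s"
try:
    template % ("42",)
    error = None
except TypeError as exc:
    error = exc

# Fix 1: double every % that belongs to a URL escape.
doubled = "http://example.invalid/%%E6%%96%%B0?tpid=%s" % ("42",)

# Fix 2: str.format() ignores % entirely and only looks at {}.
braces = "http://example.invalid/%E6%96%B0?tpid={}".format("42")

print(doubled)  # http://example.invalid/%E6%96%B0?tpid=42
```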
Try using the format method on the string. The existing '%' characters conflict with your %s placeholders:
for each_ID, each_Title in zip(Id, Title):
    url="http://www.zjjsggzy.gov.cn/%E6%96%B0%E6%B5%81%E7%A8%8B/%E6%8B%9B%E6%8A%95%E6%A0%87%E4%BF%A1%E6%81%AF/jyxx_1.html?iq=x&type=%E6%8B%9B%E6%A0%87%E5%85%AC%E5%91%8A&tpid={}&tpTitle={}".format(each_ID, each_Title)
I'm just starting to fool around with formatting the output of a print statement.
The examples I've seen have a % after the format list and before the arguments, like this:
>>> a=123
>>> print "%d" % a
123
What is the meaning of the % and more important, why is it necessary?
It's the string formatting operator. It tells Python to look at the string to the left and build a new string where %-sequences in the string are replaced with formatted versions of the values from the right-hand side of the operator.
It's not "necessary"; you can print values directly:
>>> print a
123
But it's nice to have printf()-style formatting available, and this is how you do it in Python.
As pointed out in a comment, note that the string formatting operator is not connected to print in any way; it's an operator just like any other. You can format a value into a string without printing it:
>>> a = 123
>>> padded = "%05d" % a
>>> padded
'00123'
In Python the % operator is implemented by calling the method __mod__ on the left-hand argument, falling back to __rmod__ on the right-hand argument if it's not found. So what you have written is equivalent to
a = 123
print "%d".__mod__(a)
Python's string classes simply implement __mod__ to do string formatting.
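A short Python 3 sketch of that dispatch: the operator and the explicit method call give the same string, and any class can give % its own meaning. Shout is a hypothetical class invented for this illustration.

```python
a = 123
via_operator = "%05d" % a
via_method = "%05d".__mod__(a)  # what the operator does for str


class Shout:
    """Hypothetical class, only here to show that % dispatches to __mod__."""

    def __init__(self, text):
        self.text = text

    def __mod__(self, other):
        # Nothing to do with formatting: % means whatever __mod__ says.
        return self.text.upper() + '!' * other


custom = Shout('hi') % 3

print(via_operator)  # 00123
print(custom)        # HI!!!
```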
Also note that this style of string formatting is referred to in the documentation as "old string formatting"; going forward we should switch to the new-style string formatting described here: http://docs.python.org/library/stdtypes.html#str.format
like:
>>> a=123
>>> print "{0}".format(a)
123
See Format String Syntax for a description of the various formatting options that can be specified in format strings.
This method of string formatting is the new standard in Python 3.0, and should be preferred to the % formatting described in String Formatting Operations in new code.
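For comparison, here are a few %-style calls next to what I believe are their str.format() counterparts, written for Python 3. The values are arbitrary examples.

```python
a = 123

old_plain = "%d" % a
new_plain = "{0}".format(a)

old_padded = "%05d" % a
new_padded = "{:05d}".format(a)

old_float = "%8.2f" % 3.14159
new_float = "{:8.2f}".format(3.14159)

# With str.format() a percent sign needs no escaping.
old_percent = "%d%%" % 50
new_percent = "{}%".format(50)

print(new_padded)  # 00123
```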
Alex's answer has the following line when translated to English
print "%2d. %8.2f %8.2f %8.2f" % (
    i, payment, interest, monthPayment)
I am unsure about the line
"%2d. %8.2f %8.2f %8.2f" % #Why do we need the last % here?
It seems to mean the following
apply %2d. to i
apply %8.2f to payment
apply %8.2f to interest
apply %8.2f to monthPayment
The %-words seem to mean the following
1. %2d.: a decimal presentation of two decimals
2-4. %8.2f: a floating point presentation of two decimals
I am not sure why we use the 8 in %8.2f.
How do you understand the challenging line?
The 8 in 8.2 is the width:
"Minimum number of characters to be printed. If the value to be printed is shorter than this number, the result is padded with blank spaces. The value is not truncated even if the result is larger"
The 2 is the number of decimal places.
The final % just links the format string (in quotes) with the tuple of arguments (in parentheses).
It's a bit confusing that they chose a % to do this - there is probably some deep python reason.
edit: Apparently '%' is used simply because '%' is used inside the format - which is IMHO stupid and guaranteed to cause confusion. It's like requiring an extra dot at the end of a floating point number to show that it's floating point!
The last % is an operator that takes the string before it and the tuple after and applies the formatting as you note. See the Python tutorial for more details.
The % is an operator which applies a format string to some values. A simple example would be:
"%s is %s" % ( "Alice", "Happy" )
Which would evaluate to the string "Alice is Happy". The format string that is provided defines how the values you pass are put into the string; the syntax is available here. In short the d is "treat as a decimal number" and the 8.2 is "pad to 8 characters and round to 2 decimal places". In essence it looks like that format in particular is being used so that the answers line up when viewed with a monospace font. :)
In my code example the s means "treat as a string".
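To see why the width matters, here is a sketch that formats two rows with the same format string as in the question. The numbers are invented for illustration; only the format string comes from the original code.

```python
# Hypothetical (i, payment, interest, monthPayment) rows.
rows = [
    (1, 1234.5, 56.789, 1291.289),
    (12, 99.0, 4.5, 103.5),
]

fmt = "%2d. %8.2f %8.2f %8.2f"
lines = [fmt % row for row in rows]

for line in lines:
    print(line)
#  1.  1234.50    56.79  1291.29
# 12.    99.00     4.50   103.50

# The width is a minimum: a longer value is not truncated.
wide = "%8.2f" % 123456789.0  # 123456789.00
```

Every line comes out 30 characters long, so the columns line up in a monospace font.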
The % after a string tells Python to fill in the placeholders in the string on the left side of the '%' operator with the items in the tuple on the right side of the '%' operator.
The '%' operator finds the placeholders in the string by looking for sequences that start with %.
Your confusion is that you think the % operator and the % character in the string are the same.
Try to look at it this way: outside a string, % is an operator; inside a string, it possibly marks a template for substitution.
As usual, a quote of the doc is required - string-formatting:
String and Unicode objects have one unique built-in operation: the % operator (modulo). This is also known as the string formatting or interpolation operator. Given format % values (where format is a string or Unicode object), % conversion specifications in format are replaced with zero or more elements of values. The effect is similar to the using sprintf in the C language.
And the description of the conversion specifier to explain %8.2f
A conversion specifier contains two or more characters and has the following components, which must occur in this order:
1. The '%' character, which marks the start of the specifier.
2. Mapping key (optional), consisting of a parenthesised sequence of characters (for example, (somename)).
3. Conversion flags (optional), which affect the result of some conversion types.
4. Minimum field width (optional). If specified as an '*' (asterisk), the actual width is read from the next element of the tuple in values, and the object to convert comes after the minimum field width and optional precision.
5. Precision (optional), given as a '.' (dot) followed by the precision. If specified as '*' (an asterisk), the actual precision is read from the next element of the tuple in values, and the value to convert comes after the precision.
6. Length modifier (optional).
7. Conversion type.
When the right argument is a dictionary (or other mapping type), the format string includes mapping keys (2). Breaking the example into two steps, we have a dictionary and a format that includes keys from the dictionary (the # is a key):
>>> mydict = {'language':'python', '#':2}
>>> '%(language)s has %(#)03d quote types.' % mydict
'python has 002 quote types.'
>>>
The %8.2f means: reserve 8 character positions for the number held by the corresponding float variable, and show it with a decimal precision of 2.
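The components of a conversion specifier listed in the documentation quote can be tried out one at a time. The following is just a sketch with an arbitrary sample number; it also repeats the mapping-key example so everything runs in one go.

```python
value = 3.14159  # arbitrary sample number

fixed = "%8.2f" % value               # width 8, precision 2
starred = "%*.*f" % (8, 2, value)     # width and precision read from the tuple
left = "%-8.2f|" % value              # '-' flag: left-justify inside the field
zero = "%08.2f" % value               # '0' flag: pad with zeros instead of spaces

# Mapping keys pick values out of a dictionary instead of a tuple.
mydict = {'language': 'python', '#': 2}
by_key = '%(language)s has %(#)03d quote types.' % mydict

print(repr(fixed))  # '    3.14'
print(repr(left))   # '3.14    |'
print(repr(zero))   # '00003.14'
```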