I know of two ways to format a string:
print 'Hi {}'.format(name)
print 'Hi %s' % name
What are the relative dis/advantages of using either?
I also know both can efficiently handle multiple parameters like
print 'Hi %s you have %d cars' % (name, num_cars)
and
print 'Hi {0} and {1}'.format('Nick', 'Joe')
There is not really any difference between the two string formatting solutions.
{} is usually referred to as "new-style" and %s is "old string formatting", but old style formatting isn't going away any time soon.
The new style formatting isn't supported everywhere yet though:
logger.debug("Message %s", 123) # Works
logger.debug("Message {}", 123) # Does not work.
Nevertheless, I'd recommend using .format. It's more feature-complete, but there is not a huge difference anyway.
It's mostly a question of personal taste.
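To make the logging point above concrete, here is a minimal sketch, assuming Python 3 and a throwaway logger name (`format_demo` is made up for illustration). The logging module applies the % operator to the message itself, so a `{}` template has to be formatted before it is handed to the logger.

```python
import io
import logging

# Capture log output in memory so the result can be inspected.
stream = io.StringIO()
logger = logging.getLogger("format_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False

# %-style: logging substitutes the arguments itself, and only if the
# record is actually emitted.
logger.debug("Message %s", 123)

# {}-style: format the string yourself, then pass the finished text.
logger.debug("Message {}".format(123))

output = stream.getvalue()
print(output)
```

The trade-off is that the pre-formatted variant does the formatting work even when the debug level is disabled.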
I use the "old-style" so I can recursively build strings with strings. Consider...
'%s%s%s'
...this represents any possible string combination you can have. When I'm building an output string from N inputs, the pattern above lets me recurse down each branch and return back up.
An example usage is my Search Query testing (Quality Assurance). Starting with %s I can make any possible query.
/.02
I want to know if there's a way to do this:
printf("address is %x", address)
in Python. That is, to integrate special strings that control the format of an output string. Thanks.
Just use the % operator. See the documentation for details.
address = 16
print "address is %x" % address
The modern and preferred[1] way to perform string formatting operations is to use str.format:
print "address is {:x}".format(address)
Although the hex function works equally well in this case:
# [2:] removes the leading '0x'
print "address is", hex(address)[2:]
[1] For those who would like a citation, here is a note from the documentation for % formatting:
The formatting operations described here are obsolete and may go away
in future versions of Python. Use the new String Formatting in new
code.
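As a rough side-by-side of the hex options from this answer, here is a small sketch. The value 255 and the variable names are made up for illustration, and it only builds strings, so it should behave the same on Python 2.6+ and Python 3.

```python
address = 255  # hypothetical value for illustration

old_style = "address is %x" % address            # printf-like
upper_case = "address is %X" % address           # upper-case hex digits
new_style = "address is {:x}".format(address)    # str.format equivalent
# '#' adds the 0x prefix, '010' zero-pads to a total width of 10.
padded = "address is {:#010x}".format(address)
# The built-in format() applies a single specifier without a template.
builtin = "address is " + format(address, "x")

for text in (old_style, upper_case, new_style, padded, builtin):
    print(text)
```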
I've been looking around and I've been unable to find a definitive answer to this question: what's the recommended way to print variables in Python?
So far, I've seen three ways: using commas, using percent signs, or using plus signs:
>>> a = "hello"
>>> b = "world"
>>> print a, "to the", b
hello to the world
>>> print "%s to the %s" % (a, b)
hello to the world
>>> print a + " to the " + b
hello to the world
Each method seems to have its pros and cons.
Commas let you write the variable directly and add spaces, and they automatically perform a string conversion if needed. But I seem to remember that good coding practice says it's best to separate your variables from your text.
Percent signs allow that, though they require a tuple when there's more than one variable, and you have to write the type of the variable (though it seems able to convert even if the variable type isn't the same, like printing a number with %s).
Plus signs seem to be the "worst" as they mix variables and text, and don't convert on the fly; though maybe it is necessary to have more control over your variables from time to time.
I've looked around and it seems some of those methods may be obsolete nowadays. Since they all seem to work and each have their pros and cons, I'm wondering: is there a recommended method, or do they all depend on the context?
Including the values from identifiers inside a string is called string formatting. You can handle string formatting in different ways with various pros and cons.
Using string concatenation (+)
Con: You must manually convert objects to strings
Pro: The objects appear where you want to place them in the string
Con: The final layout may not be clear due to breaking the string literal
Using template strings (i.e. $bash-style substitution):
Pro: You may be familiar with shell variable expansion
Pro: Conversion to string is done automatically
Pro: Final layout is clear.
Con: You cannot specify how to perform the conversion
Using %-style formatting:
Pro: similar to formatting with C's printf.
Pro: conversions are done for you
Pro: you can specify different type of conversions, with some options (e.g. precision for floats)
Pro: The final layout is clear
Pro: You can also specify the name of the elements to substitute as in: %(name)s.
Con: You cannot customize handling of format specifiers.
Con: There are some corner cases that can puzzle you. To avoid them you should always use either tuple or dict as argument.
Using str.format:
Pro: All the pros of %-style formatting (except that it is not similar to printf)
Pro: Similar to .NET String.Format
Pro: You can manually specify numbered fields which allows you to use a positional argument multiple times
Pro: More options in the format specifiers
Pro: You can customize the formatting specifiers in custom types
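The four approaches in the list above can be sketched side by side. This is only an illustration: the names `name`, `price`, `point` and the `Celsius` class are made up, and the `"f"` specifier handled by `Celsius` is an invented convention, not something Python defines.

```python
from string import Template

name, price = "bread", 2.5

# Concatenation: every non-string must be converted by hand.
concatenated = "Item " + name + " costs " + str(price)

# Template strings: automatic conversion, but no control over it.
templated = Template("Item $name costs $price").substitute(name=name, price=price)

# %-style: printf-like specifiers and named fields.
percent = "Item %(name)s costs %(price).2f" % {"name": name, "price": price}

# Corner case: a tuple value must itself be wrapped in a one-element tuple.
point = (1, 2)
wrapped = "point is %s" % (point,)

# str.format: numbered fields can be reused.
formatted = "Item {0} costs {1:.2f}; {0} again".format(name, price)


class Celsius(object):
    """Hypothetical type that interprets its own format specifier."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, spec):
        # "f" is a made-up specifier meaning "show as Fahrenheit".
        if spec == "f":
            return "{:.1f}F".format(self.degrees * 9.0 / 5.0 + 32)
        return "{:.1f}C".format(self.degrees)


custom = "{0} is {0:f}".format(Celsius(100))

for text in (concatenated, templated, percent, wrapped, formatted, custom):
    print(text)
```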
The commas do not do string formatting. They are part of the print statement syntax.
They have a softspace "feature" which is gone in python3 since print is a function now:
>>> print 'something\t', 'other'
something       other
>>> print 'something\tother'
something       other
Note how the above outputs are exactly equivalent even though the first one used comma.
This is because the comma doesn't introduce whitespace in certain situations (e.g. right after a tab or a newline).
In python3 this doesn't happen:
>>> print('something\t', 'other')
something        other
>>> print('something\tother') # note the difference in spacing.
something       other
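If you want the old spacing back in Python 3, the print function takes a `sep` argument. A minimal sketch, assuming Python 3 (the in-memory buffer is only there so the exact characters can be checked):

```python
import io

buf = io.StringIO()

# Default separator: a single space is always inserted between arguments.
print("something\t", "other", file=buf)

# Empty separator: nothing is inserted, so only the tab remains.
print("something\t", "other", sep="", file=buf)

captured = buf.getvalue()
print(repr(captured))
```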
Since Python 2.6 the preferred way of doing string formatting is the str.format method. It was meant to replace %-style formatting, even though there are currently no plans (and I don't think there ever will be) to remove %-style formatting.
string.format() basics
Here are a couple of examples of basic string substitution. The {} is the placeholder for the substituted variables. If no format is specified, the value is inserted and formatted as a string.
s1 = "so much depends upon {}".format("a red wheel barrow")
s2 = "glazed with {} water beside the {} chickens".format("rain", "white")
You can also refer to the variables by numeric position and change their order in the string. This gives some flexibility when formatting: if you made a mistake in the order, you can easily correct it without shuffling all the variables around.
s1 = " {0} is better than {1} ".format("emacs", "vim")
s2 = " {1} is better than {0} ".format("emacs", "vim")
The format() method offers a fair amount of additional features and capabilities. Here are a few useful tips and tricks using .format().
Named Arguments
You can use the new string format as a templating engine and use named arguments, instead of requiring a strict order.
madlib = " I {verb} the {object} off the {place} ".format(verb="took", object="cheese", place="table")
>>> I took the cheese off the table
Reuse Same Variable Multiple Times
The % formatter requires a strict ordering of variables. The .format() method allows you to put them in any order, as we saw above in the basics, and also allows for reuse.
line = "Oh {0}, {0}! wherefore art thou {0}?".format("Romeo")  # avoid shadowing the built-in str
>>> Oh Romeo, Romeo! wherefore art thou Romeo?
Use Format as a Function
You can use .format as a function which allows for some separation of text and formatting from code. For example at the beginning of your program you could include all your formats and then use later. This also could be a nice way to handle internationalization which not only requires different text but often requires different formats for numbers.
email_f = "Your email address was {email}".format
print(email_f(email="bob@example.com"))
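To illustrate the internationalization idea, here is a small sketch that keeps one bound .format method per language. The `FORMATS` table, the language codes, and the wording of the templates are all made up for illustration; a real application would more likely load them from translation files.

```python
# Hypothetical lookup table: one bound .format method per language.
FORMATS = {
    "en": "Total: {amount:,.2f} USD".format,   # thousands separator
    "de": "Summe: {amount:.2f} EUR".format,    # no thousands separator
}


def render_total(language, amount):
    """Pick the stored formatter for the language and apply it."""
    return FORMATS[language](amount=amount)


english = render_total("en", 1234.5)
german = render_total("de", 1234.5)
print(english)
print(german)
```

Keeping the templates in one place means the calling code never touches the layout details.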
Escaping Braces
If you need literal braces when using str.format(), just double them up:
print(" The {} set is often represented as {{0}} ".format("empty"))
>>> The empty set is often represented as {0}
The question is whether you want to print variables (case 1) or output formatted text (case 2). Case 1 is good and easy to use, mostly for debug output.
If you want to say something in a defined way, formatting is the better choice. '+' is not the Pythonic way of string manipulation.
An alternative to % is "{0} to the {1}".format(a,b) and is the preferred way of formatting since Python 3.
Depends a bit on which version.
Python 2 will be simply:
print 'string'
print 345
print 'string'+(str(345))
print ''
Python 3 requires parentheses (wish it didn't personally)
print ('string')
print (345)
print ('string'+(str(345)))
Also the most foolproof method to do it is to convert everything into a variable:
a = 'string'
b = 345
c = str(345)
d = a + c
I have this dirty short line of code printing a formatted date:
print '%f %d' % math.modf(time.time())
Now the modf function returns a tuple of two values, so the format string expects both. For the sake of doing purely this, and not the simple alternative of first storing the output in a variable: is there any existing way of skipping parameters in print, or any way to refer to specific argument indexes?
For example I have 3 arguments in a print:
print '%s %d.%f' % ('Money', 19, .99)
I want to skip the first String-parameter that is parsed. Is there any way to do this?
This is mainly a want-to-know-if-possible question, no need to give alternative solutions :P.
Not in "old school" string formatting.
But the format method of strings does have this ability.
print "{1}".format("1", "2")
The above will skip the "1" and will only print "2".
I hope you don't mind I gave an alternative despite you not asking for it :)
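Applied to the modf case from the question, index selection could look like the sketch below. The fixed number 1234.75 stands in for time.time() so the result is reproducible, and the variable names are made up.

```python
import math

timestamp = 1234.75  # stand-in for time.time(), chosen for reproducibility

# math.modf returns (fractional part, integer part); unpack it with *
# and pick one element by index inside the format string.
only_fraction = "{0:f}".format(*math.modf(timestamp))
only_whole = "{1:.0f}".format(*math.modf(timestamp))

# Skipping the first of three arguments works the same way.
skipped = "{1}.{2:02d}".format("Money", 19, 99)

print(only_fraction)
print(only_whole)
print(skipped)
```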
I was going through http://web2py.com/book/default/chapter/02 and found this:
>>> print 'number is ' + str(3)
number is 3
>>> print 'number is %s' % (3)
number is 3
>>> print 'number is %(number)s' % dict(number=3)
number is 3
The book states that the last notation is more explicit and less error prone, and is to be preferred.
I am wondering what is the advantage of using the last notation.. will it not have a performance overhead?
>>> print 'number is ' + str(3)
number is 3
This is definitely the worst solution and might cause you problems if you do the beginner mistake "Value of obj: " + obj where obj is not a string or unicode object. For many concatenations, it's not readable at all - it's similar to something like echo "<p>Hello ".$username."!</p>"; in PHP (this can get arbitrarily ugly).
print 'number is %s' % (3)
number is 3
Now that is much better. Instead of a hard-to-read concatenation, you see the output format immediately. Coming back to the beginner mistake of outputting values, you can do print "Value of obj: %r" % obj, for example. I personally prefer this in most cases. But note that you cannot use it in gettext-translated strings if you have multiple format specifiers because the order might change in other languages.
As you forgot to mention it here, you can also use the new string formatting method which is similar:
>>> "number is {0}".format(3)
'number is 3'
Next, dict lookup:
>>> print 'number is %(number)s' % dict(number=3)
number is 3
As said before, gettext-translated strings might change the order of positional format specifiers, so this option is the best when working with translations. The performance drop should be negligible - if your program is not all about formatting strings.
As with the positional formatting, you can also do it in the new style:
>>> "number is {number}".format(number=3)
'number is 3'
It's hard to tell which one to take. I recommend you to use positional arguments with the % notation for simple strings and dict lookup formatting for translated strings.
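The translation argument is easier to see with a concrete pair of strings. In this sketch the "translated" template is just an invented English rewording that swaps the order of the fields; real code would obtain it from gettext.

```python
values = {"user": "Ann", "count": 3}

original = "%(user)s sent %(count)d files"
# Hypothetical translation whose grammar needs the fields in reverse order.
reordered = "%(count)d files were sent by %(user)s"

# The same dict works for both templates because fields are looked up by name.
first = original % values
second = reordered % values

# The new-style equivalent behaves the same way.
third = "{count} files were sent by {user}".format(**values)

print(first)
print(second)
print(third)
```

With positional %s specifiers the second template would need the arguments passed in a different order, which the calling code cannot know in advance.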
I can think of a few differences.
The first, to me, is cumbersome if more than one variable is involved. I cannot speak to any performance penalty there. See the additional arguments below.
The second example is positional dependent and it can be easy to change position causing errors. It also does not tell you anything about the variables.
In the third example, the position of the variables is not important because you use a dictionary. This makes it elegant, as it does not rely on the positional ordering of variables.
See the example below:
>>> print 'number is %s %s' % (3,4)
number is 3 4
>>> print 'number is %s %s' % (4,3)
number is 4 3
>>> print 'number is %(number)s %(two)s' % dict(number=3, two=4)
number is 3 4
>>> print 'number is %(number)s %(two)s' % dict(two=4, number=3)
number is 3 4
>>>
Also another part of discussion on this
"+" is the string concatenation operator.
"%" is string formatting.
In this trivial case, string formatting accomplishes the same result as concatenation. Unlike string formatting, string concatenation only works when everything is already a string. So if you forget to convert your variables to strings, concatenation will raise an error.
[Edit: My answer was biased towards templating since the question came from web2py where templates are so commonly involved]
As Ryan says below, the concatenation is faster than formatting.
My suggestion:
Use the first form, concatenation, if you are concatenating just two strings.
Use the second form if there are only a few variables. You can easily see the positions and deal with them.
Use the third form when you are doing templating, i.e. formatting a large piece of text with variable data. The dictionary form helps give meaning to the variables inside the large piece of text.
I am wondering what is the advantage of using the last notation..
Hm, as you said, the last notation is really more explicit and actually is less error prone.
will it not have a performance overhead?
It will have a little performance overhead, but that is minor compared with fetching data from a DB or over a network connection.
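If you would rather measure than guess, a rough timing sketch with the standard timeit module might look like this. The labels and the iteration count are arbitrary choices, and the absolute numbers depend heavily on the machine and the Python version, so treat any ranking as indicative only.

```python
import timeit

setup = "a = 'hello'; b = 'world'"

# Statement text for each approach discussed in this thread.
candidates = {
    "concatenation": "a + ' to the ' + b",
    "percent": "'%s to the %s' % (a, b)",
    "percent_dict": "'%(a)s to the %(b)s' % {'a': a, 'b': b}",
    "str_format": "'{0} to the {1}'.format(a, b)",
}

# Total seconds for 100000 runs of each statement.
timings = {}
for label, stmt in candidates.items():
    timings[label] = timeit.timeit(stmt, setup=setup, number=100000)

# Print fastest first.
for label in sorted(timings, key=timings.get):
    print("%-14s %.4f s" % (label, timings[label]))
```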
It's a bad, unjustified piece of advice.
The third method is cumbersome, violates DRY, and is error prone, except if:
You are writing a framework which doesn't have control over the format string. For example, the logging module, web2py, or gettext.
The format string is extremely long.
The format string is read from a config file.
The problem with the third method should be obvious when you consider that foo appears three times in this code: "%(foo)s" % dict(foo=foo). This is error prone. Most programs should not use the third method, unless they know they need to.
The second method is the simplest method, and is what you generally use in most programs. It is best used when the format string is immediate, e.g. 'values: %s %s %s' % (a, b, c) instead of taken from a variable, e.g. fmt % (a, b, c).
The first method, concatenation, is almost never useful, except perhaps if you're building a string in a loop:
s = ''
for x in l:
    s += str(x)
however, in that case, it's generally better and faster to use str.join():
s = ''.join(str(x) for x in l)
When formatting a string, my string may contain a modulo "%" that I do not wish to have converted. I can escape the string and change each "%" to "%%" as a workaround.
e.g.,
'Day old bread, 50%% sale %s' % 'today!'
output:
'Day old bread, 50% sale today!'
But are there any alternatives to escaping? I was hoping that using a dict would make it so Python would ignore any non-keyword conversions.
e.g.,
'Day old bread, 50% sale %(when)s' % {'when': 'today'}
but Python still sees the first modulo % and gives a:
TypeError: not enough arguments for format string
You could (and should) use the new string .format() method (if you have Python 2.6 or higher) instead:
"Day old bread, 50% sale {0}".format("today")
The manual can be found here.
The docs also say that the old % formatting will eventually be removed from the language, although that will surely take some time. The new formatting methods are way more powerful, so that's a Good Thing.
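A short sketch of how the percent sign behaves under the alternatives to % formatting. Note that str.format moves the escaping problem to braces rather than removing it, and that string.Template is mentioned here only as one more option from the standard library; the variable names are made up.

```python
from string import Template

# str.format leaves % alone...
with_format = "Day old bread, 50% sale {0}".format("today")

# ...but literal braces now have to be doubled instead.
with_braces = "{{50%}} off {when}".format(when="today")

# string.Template also leaves % alone; its special character is $.
with_template = Template("Day old bread, 50% sale $when").substitute(when="today")

print(with_format)
print(with_braces)
print(with_template)
```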
Not really - escaping your % signs is the price you pay for using string formatting. You could use string concatenation instead: 'Day old bread, 50% sale ' + whichday if that helps...
Escaping a '%' as '%%' is not a workaround. If you use string formatting, that is the way to represent a '%' sign. If you don't want that, you can always do something like:
print "Day old bread, 50% sale " + "today"
i.e. not using formatting.
Please note that when using string concatenation, be sure that the variable is a string (and not e.g. None) or use str(varName). Otherwise you get something like 'Can't concatenate str and NoneType'.
You can use regular expressions to replace % with %% wherever % is not followed by (
import re

def format_with_dict(fmt, dictionary):
    # Escape every % that is not followed by "(" so only %(name)s fields remain
    fmt = re.sub(r"%([^\(])", r"%%\1", fmt)
    fmt = re.sub(r"%$", r"%%", fmt)  # There was a % at the end?
    return fmt % dictionary
This way:
print format_with_dict('Day old bread, 50% sale %(when)s', {'when': 'today'})
Will output:
Day old bread, 50% sale today
This method is useful to avoid "not enough arguments for format string" errors.