Replacing a substring of a string with Python

I'd like to get a few opinions on the best way to replace a substring of a string with some other text. Here's an example:
I have a string, a, which could be something like "Hello my name is $name". I also have another string, b, which I want to insert into string a in the place of its substring '$name'.
I assume it would be easiest if the replaceable variable is indicated some way. I used a dollar sign, but it could be a string between curly braces or whatever you feel would work best.
Solution:
Here's how I decided to do it:
from string import Template
# Parentheses let the literal span two lines; adjacent literals are joined.
message = ('You replied to $percentageReplied of your message. '
           'You earned $moneyMade.')
template = Template(message)
print(template.safe_substitute(
    percentageReplied='15%',
    moneyMade='$20'))

Here are the most common ways to do it:
>>> import string
>>> t = string.Template("Hello my name is $name")
>>> print t.substitute(name='Guido')
Hello my name is Guido
>>> t = "Hello my name is %(name)s"
>>> print t % dict(name='Tim')
Hello my name is Tim
>>> t = "Hello my name is {name}"
>>> print t.format(name='Barry')
Hello my name is Barry
The approach using string.Template is easy to learn and should be familiar to bash users. It is suitable for exposing to end-users. This style became available in Python 2.4.
The percent-style will be familiar to many people coming from other programming languages. Some people find this style to be error-prone because of the trailing "s" in %(name)s, because the %-operator has the same precedence as multiplication, and because the behavior of the applied arguments depends on their data type (tuples and dicts get special handling). This style has been supported in Python since the beginning.
The curly-bracket style is only supported in Python 2.6 or later. It is the most flexible style (providing a rich set of control characters and allowing objects to implement custom formatters).
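On the string.Template style specifically: the first snippet in this thread calls safe_substitute, while the comparison above calls substitute. Here is a minimal sketch of how the two differ when a value is missing, written in Python 3 syntax. The template text and variable names are made up for illustration.

```python
from string import Template

t = Template("Hello $name, you owe $amount")

# substitute() insists on a value for every placeholder.
try:
    t.substitute(name="Guido")
    strict_result = "no error"
except KeyError as exc:
    strict_result = "missing placeholder: %s" % exc

# safe_substitute() leaves unknown placeholders untouched instead.
lenient_result = t.safe_substitute(name="Guido")

print(strict_result)   # missing placeholder: 'amount'
print(lenient_result)  # Hello Guido, you owe $amount
```

The lenient variant is probably the better fit when the template text comes from end users, since a typo in a placeholder name does not raise an exception.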

There are a number of ways to do it. The most commonly used rely on the facilities strings already provide: the % operator or, better yet, the newer and recommended str.format().
Example:
a = "Hello my name is {name}"
result = a.format(name=b)
Or more simply
result = "Hello my name is {name}".format(name=b)
You can also use positional arguments:
result = "Hello my name is {}, says {}".format(name, speaker)
Or with explicit indexes:
result = "Hello my name is {0}, says {1}".format(name, speaker)
Which allows you to change the ordering of the fields in the string without changing the call to format():
result = "{1} says: 'Hello my name is {0}'".format(name, speaker)
Format is really powerful. You can use it to decide how wide to make a field, how to write numbers, and other formatting of the sort, depending on what you write inside the brackets.
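To make the "how wide" and "how to write numbers" part concrete, here is a small sketch of a few format specifiers. The values are invented, and the empty {} fields and the "," separator assume Python 2.7 or 3.1 and later.

```python
name = "Guido"
price = 1234.5

# Width and alignment: right-align the name in a 10-character field.
padded = "[{:>10}]".format(name)

# Number formatting: thousands separator and two decimal places.
money = "{:,.2f}".format(price)

# Zero-padded integer and a percentage.
ticket = "{:05d}".format(42)
share = "{:.1%}".format(0.153)

print(padded)  # [     Guido]
print(money)   # 1,234.50
print(ticket)  # 00042
print(share)   # 15.3%
```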
You could also use the str.replace() function, or regular expressions (from the re module) if the replacements are more complicated.

Check out the replace() function in Python. Here is a link:
http://www.tutorialspoint.com/python/string_replace.htm
This should be useful when trying to replace some text that you have specified. For example, in the link they show you this:
str = "this is string example....wow!!! this is really string"
print str.replace("is", "was")
It replaces every occurrence of the substring "is" with "was". Note that this includes the "is" inside "this", which becomes "thwas".
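A short sketch of what replace() does with that example string, next to a regular-expression variant for the case where only the standalone word should change. The variable names are illustrative only.

```python
import re

text = "this is string example....wow!!! this is really string"

# str.replace works on substrings, so the "is" inside "this" changes too.
everywhere = text.replace("is", "was")

# An optional third argument limits the number of replacements.
first_only = text.replace("is", "was", 1)

# A regular expression with word boundaries touches only the word "is".
whole_word = re.sub(r"\bis\b", "was", text)

print(everywhere)  # thwas was string example....wow!!! thwas was really string
print(first_only)  # thwas is string example....wow!!! this is really string
print(whole_word)  # this was string example....wow!!! this was really string
```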

Actually this is already implemented in the module string.Template.

You can do something like:
"My name is {name}".format(name="Name")
It's supported natively in python, as you can see here:
http://www.python.org/dev/peps/pep-3101/

You may also use formatting with % but .format() is considered more modern.
>>> "Your name is %(name)s. age: %(age)i" % {'name' : 'tom', 'age': 3}
'Your name is tom. age: 3'
but it also supports some type checking as known from usual % formatting:
>>> '%(x)i' % {'x': 'string'}
Traceback (most recent call last):
File "<pyshell#40>", line 1, in <module>
'%(x)i' % {'x': 'string'}
TypeError: %d format: a number is required, not str

Related

Store formatted strings, pass in values later?

I have a dictionary with a lot of strings.
Is it possible to store a formatted string with placeholders and pass in actual values later?
I'm thinking of something like this:
d = {
"message": f"Hi There, {0}"
}
print(d["message"].format("Dave"))
The above code obviously doesn't work but I'm looking for something similar.
You used an f-string, so Python already interpolated the 0 in there. Remove the f prefix:
d = {
# no f here
"message": "Hi There, {0}"
}
print(d["message"].format("Dave"))
Hi There, Dave
Issue: mixing f-String with str.format
Technique | Python version
f-String | since 3.6
str.format | since 2.6
Your dict-value contains an f-String which is immediately evaluated.
So the expression inside the curly-braces (was {0}) is directly interpolated (became 0), hence the value assigned became "Hi There, 0".
When .format was then called with the argument "Dave", the argument was ignored because the string no longer contained any {} placeholder. So the string was printed as is:
Hi There, 0
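The difference can be checked side by side. This is a minimal sketch that assumes Python 3.6 or later; the names eager and template are chosen only for illustration.

```python
# The f prefix makes Python evaluate {0} right away: the expression is the
# integer literal 0, so no placeholder survives.
eager = f"Hi There, {0}"

# Without the prefix the braces stay in the string as a placeholder.
template = "Hi There, {0}"

print(eager)                    # Hi There, 0
print(eager.format("Dave"))     # Hi There, 0  (nothing left to fill in)
print(template.format("Dave"))  # Hi There, Dave
```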
Attempt to use f-String
What happens if we use a variable name like name instead of the constant integer 0 ?
Let's try on Python's console (REPL):
>>> d = {"message": f"Hi There, {name}"}
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'name' is not defined
OK, we must define the variable before. Let's assume we did:
>>> name = "Dave"; d = {"message": f"Hi There, {name}"}
>>> print(d["message"])
Hi There, Dave
This works. But it requires the variable or expression inside the curly-braces to be valid at runtime, at location of definition: name is required to be defined before.
The case for str.format
There are good reasons to prefer str.format:
when you need to read templates from external sources (e.g. a file or database)
when the placeholders are configured independently of your source code, rather than tied to its variables
In those cases, indexed placeholders should be preferred to named variables.
Consider a given database column message with value "Hello, {1}. You are {0}.". It can be read and used independently from the implementation (programming-language, surrounding code).
For example
in Java: MessageFormat.format(message, 74, "Eric")
in Python: message.format(74, 'Eric').
See also:
Format a message using MessageFormat.format() in Java
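As a rough sketch of the external-template idea in Python: the JSON text below is only a stand-in for rows read from a file or database, and the render helper is a hypothetical name, not part of any library.

```python
import json

# Stand-in for templates loaded from a file or database (hypothetical content).
raw = '{"greeting": "Hello, {1}. You are {0}.", "farewell": "Bye, {1}!"}'
templates = json.loads(raw)

def render(key, *values):
    """Look up a stored template and fill its indexed placeholders."""
    return templates[key].format(*values)

print(render("greeting", 74, "Eric"))  # Hello, Eric. You are 74.
print(render("farewell", 74, "Eric"))  # Bye, Eric!
```

This sketch assumes the templates are trusted. A str.format template can reach attributes of its arguments through fields such as {0.__class__}, so templates supplied by end users are probably better served by string.Template.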

Can modulo (%s) be placed in string when first creating a variable and then used later on?

I'm quite new to python, so forgive me if this is a silly question. I know how to use the modulo in strings in this fashion
Me = "I'm %s and I like to %s" % ('Mike', 'code')
However, through my searching I haven't found an answer to whether or not it's possible to hardcode modulos into a string, then take advantage of it later.
Example:
REPO_MENU = {'Issues': 'Api.github/repo/%s/branch/%s/issues',
             'Pull Requests': 'Api.github/repo/%s/branch/%s/pull_requests',
             'Commits': 'Api.github/repo/%s/branch/%s/commits',
             '<FILTER>: Branch': 'Api.github/repo/%s/branch/%s'
             }
for key, value in REPO_MENU.items():
    print(value % ('Beta', 'master'))
Will that format work? Is it good practice to use this method? I feel it could be beneficial in a lot of situations.
This does work. You can also use the format function, which works well. For example:
menu1 = {'start': 'hello_{0}_{1}',
         'end': 'goodbye_{0}_{1}'}
menu2 = {'start': 'hello_%s_%s',
         'end': 'goodbye_%s_%s'}
for key, value in menu1.items():
    print(value.format('john', 'smith'))
for key, value in menu2.items():
    print(value % ('john', 'smith'))
% is an operator like any other; when its left-hand operand is a string, it attempts to replace various placeholders with values from its right-hand operand. It doesn't matter if the left-hand operand is a string literal or a more complex expression, as long as it evaluates to a string.
As the other answers have noted, you can definitely perform the string-modulo operation multiple times on the same string. However, if you are using Python 3.6 or later (and if you can, you definitely SHOULD!), I suggest that you use f-strings rather than the string-modulo or .format. They are faster, easier to read, and very convenient:
A formatted string literal or f-string is a string literal that is prefixed with 'f' or 'F'. These strings may contain replacement fields, which are expressions delimited by curly braces {}. While other string literals always have a constant value, formatted strings are really expressions evaluated at run time.
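Note, however, that an f-string is evaluated once, at the point where it appears, so it cannot be stored as a template and filled in later the way a %-style or str.format string can. The names it uses must already be defined, and rebinding them afterwards does not change the resulting string. To get fresh output each time, the f-string has to be re-evaluated, for example inside a function.
E.g.:
>>> def value(flower):
...     return f'A {flower.lower()} by any name would smell as sweet.'
...
>>> print(value('ROSE'))
A rose by any name would smell as sweet.
>>> print(value('Petunia'))
A petunia by any name would smell as sweet.
>>> print(value('Ferrari'))
A ferrari by any name would smell as sweet.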
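You can run this check early, for example in a small launcher file that contains no f-string literals of its own, as a helpful alert for other users (or future-you). The f-string has to be passed to eval() as a string: an f-string literal written directly in the file would make the whole file fail to compile on older versions before the try block ever runs.
try:
    eval("f''")
except SyntaxError:
    print('Python 3.6+ required.')
    raise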
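An alternative that avoids relying on a SyntaxError is to compare sys.version_info directly. This is only a sketch; supports_fstrings is a hypothetical helper name. Like the eval check, it has to run from a file that contains no f-string literals itself.

```python
import sys

def supports_fstrings(version=sys.version_info):
    """Return True if the given version tuple is 3.6 or newer."""
    return tuple(version[:2]) >= (3, 6)

# Fail early with a readable message on older interpreters.
if not supports_fstrings():
    raise RuntimeError("Python 3.6+ required.")
```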

Is there a difference between str function and percent operator in Python

When converting an object to a string in python, I saw two different idioms:
A: mystring = str(obj)
B: mystring = "%s" % obj
Is there a difference between those two? (Reading the Python docs, I would suspect no, because the latter case would implicitly call str(obj) to convert obj to a string.)
If yes, when should I use which?
If no, which one should I prefer in "good" python code? (From the python philosophy "explicit over implicit", A would be considered the better one?)
The second version does more work.
The % operator calls str() on the value it interpolates for %s, but it also has to parse the template string first to find the placeholder.
Unless your template string contains more text, there is no point in asking Python to spend more cycles on the "%s" % obj expression.
However, paradoxically, the str() conversion is, in practice, slower as looking up the name str() and pushing the stack to call the function takes more time than the string parsing:
>>> from timeit import timeit
>>> timeit('str(obj)', 'obj = 4.524')
0.32349491119384766
>>> timeit('"%s" % obj', 'obj = 4.524')
0.27424097061157227
You can recover most of that difference by binding str to a local name first:
>>> timeit('_str(obj)', 'obj = 4.524; _str = str')
0.28351712226867676
To most Python developers, using the string templating option is going to be confusing as str() is far more straightforward. Stick to the function unless you have a critical section that does a lot of string conversions.
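The numbers above depend on the interpreter and machine, so it may be worth repeating the measurement locally. Here is a small sketch that does so; the loop count is an arbitrary choice and no particular ordering of the results is implied.

```python
from timeit import timeit

setup = "obj = 4.524"
statements = ["str(obj)", "'%s' % obj", "'{}'.format(obj)"]

# number is kept small so the sketch finishes quickly; raise it for
# steadier measurements.
timings = {stmt: timeit(stmt, setup, number=100000) for stmt in statements}

# Print the candidates from fastest to slowest on this machine.
for stmt, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print("%-20s %.4f s" % (stmt, seconds))
```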

Are there other ways to format strings other than comma, percent, plus sign?

I've been looking around and I've been unable to find a definitive answer to this question: what's the recommended way to print variables in Python?
So far, I've seen three ways: using commas, using percent signs, or using plus signs:
>>> a = "hello"
>>> b = "world"
>>> print a, "to the", b
hello to the world
>>> print "%s to the %s" % (a, b)
hello to the world
>>> print a + " to the " + b
hello to the world
Each method seems to have its pros and cons.
Commas let you write the variable directly and add spaces, and they automatically perform a string conversion if needed. But I seem to remember that good coding practices say that it's best to separate your variables from your text.
Percent signs allow that, though they require a tuple when there's more than one variable, and you have to write the type of the variable (though it seems able to convert even if the variable type isn't the same, like trying to print a number with %s).
Plus signs seem to be the "worst" as they mix variables and text, and don't convert on the fly; though maybe it is necessary to have more control over your variables from time to time.
I've looked around and it seems some of those methods may be obsolete nowadays. Since they all seem to work and each have their pros and cons, I'm wondering: is there a recommended method, or do they all depend on the context?
Including the values from identifiers inside a string is called string formatting. You can handle string formatting in different ways with various pros and cons.
Using string concatenation (+)
Con: You must manually convert objects to strings
Pro: The objects appear where you want to place them in the string
Con: The final layout may not be clear due to breaking the string literal
Using template strings (i.e. $bash-style substitution):
Pro: You may be familiar with shell variable expansion
Pro: Conversion to string is done automatically
Pro: Final layout is clear.
Con: You cannot specify how to perform the conversion
Using %-style formatting:
Pro: similar to formatting with C's printf.
Pro: conversions are done for you
Pro: you can specify different type of conversions, with some options (e.g. precision for floats)
Pro: The final layout is clear
Pro: You can also specify the name of the elements to substitute as in: %(name)s.
Con: You cannot customize handling of format specifiers.
Con: There are some corner cases that can puzzle you. To avoid them you should always use either tuple or dict as argument.
Using str.format:
All the pros of %-style formatting (except that it is not similar to printf)
Similar to .NET String.Format
Pro: You can manually specify numbered fields which allows you to use a positional argument multiple times
Pro: More options in the format specifiers
Pro: You can customize the formatting specifiers in custom types
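Two of the points in the list can be illustrated briefly: the tuple corner case of %-style formatting, and a type that customizes its own format specifiers. This is a sketch only. The Celsius class and its "f" specifier are invented for the example.

```python
class Celsius(object):
    """Hypothetical type used only to illustrate a custom __format__."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, spec):
        # "f" is a made-up spec meaning "show as Fahrenheit".
        if spec == "f":
            return "%.1f F" % (self.degrees * 9.0 / 5 + 32)
        return "%.1f C" % self.degrees

point = (3, 4)

# %-style corner case: a bare tuple is unpacked as the argument list, so
# "%s" % point raises TypeError. Wrapping it in a one-element tuple is safe.
try:
    "%s" % point
    percent_bare = "no error"
except TypeError:
    percent_bare = "TypeError"
percent_wrapped = "%s" % (point,)

# str.format treats the tuple as one argument and can reuse a field.
format_reuse = "{0} and again {0}".format(point)

# A custom type decides what its own format spec means.
temps = "{0} = {0:f}".format(Celsius(21.5))

print(percent_bare)     # TypeError
print(percent_wrapped)  # (3, 4)
print(format_reuse)     # (3, 4) and again (3, 4)
print(temps)            # 21.5 C = 70.7 F
```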
The commas do not do string formatting. They are part of the print statement syntax.
They have a softspace "feature" which is gone in python3 since print is a function now:
>>> print 'something\t', 'other'
something other
>>> print 'something\tother'
something other
Note how the above outputs are exactly equivalent even though the first one used comma.
This is because the comma doesn't introduce whitespace in certain situations (e.g. right after a tab or a newline).
In python3 this doesn't happen:
>>> print('something\t', 'other')
something other
>>> print('something\tother') # note the difference in spacing.
something other
Since Python 2.6 the preferred way of doing string formatting is the str.format method. It was meant to replace the %-style formatting, even though currently there are no plans (and I don't think there ever will be) to remove %-style formatting.
string.format() basics
Here are a couple of examples of basic string substitution. The {} is the placeholder for the substituted variables. If no format is specified, the value is inserted and formatted as a string.
s1 = "so much depends upon {}".format("a red wheel barrow")
s2 = "glazed with {} water beside the {} chickens".format("rain", "white")
You can also use the numeric position of the variables and change them in the strings, this gives some flexibility when doing the formatting, if you made a mistake in the order you can easily correct without shuffling all variables around.
s1 = " {0} is better than {1} ".format("emacs", "vim")
s2 = " {1} is better than {0} ".format("emacs", "vim")
The format() function offers a fair amount of additional features and capabilities, here are a few useful tips and tricks using .format()
Named Arguments
You can use the new string format as a templating engine and use named arguments, instead of requiring a strict order.
madlib = " I {verb} the {object} off the {place} ".format(verb="took", object="cheese", place="table")
>>> I took the cheese off the table
Reuse Same Variable Multiple Times
The % formatter requires a strict ordering of variables. The .format() method allows you to put them in any order, as we saw above in the basics, and also allows for reuse.
str = "Oh {0}, {0}! wherefore art thou {0}?".format("Romeo")
>>> Oh Romeo, Romeo! wherefore art thou Romeo?
Use Format as a Function
You can use .format as a function which allows for some separation of text and formatting from code. For example at the beginning of your program you could include all your formats and then use later. This also could be a nice way to handle internationalization which not only requires different text but often requires different formats for numbers.
email_f = "Your email address was {email}".format
print(email_f(email="bob@example.com"))
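One way the "formats at the beginning of your program" idea might look is a small catalogue of bound format methods keyed by language. The FORMATS table and the total_line helper are hypothetical, and the German entry only changes the text and digit grouping rather than doing real locale-aware formatting.

```python
# Hypothetical message catalogue: each entry is a bound str.format method.
FORMATS = {
    "en": {"total": "Total: {amount:,.2f} {currency}".format},
    "de": {"total": "Summe: {amount:.2f} {currency}".format},
}

def total_line(lang, amount, currency):
    """Pick the formatter for a language and call it like a function."""
    return FORMATS[lang]["total"](amount=amount, currency=currency)

print(total_line("en", 1234.5, "USD"))  # Total: 1,234.50 USD
print(total_line("de", 1234.5, "EUR"))  # Summe: 1234.50 EUR
```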
Escaping Braces
If you need to use braces when using str.format(), just double up
print(" The {} set is often represented as {{0}} ".format("empty"))
>>> The empty set is often represented as {0}
The question is whether you want to print variables (case 1) or output formatted text (case 2). Case 1 is good and easy to use, mostly for debug output.
If you want to say something in a defined way, formatting is the better choice. '+' is not the Pythonic way of string manipulation.
An alternative to % is "{0} to the {1}".format(a,b), which is the preferred way of formatting since Python 3.
Depends a bit on which version.
Python 2 will be simply:
print 'string'
print 345
print 'string'+(str(345))
print ''
Python 3 requires parentheses (wish it didn't personally)
print ('string')
print (345)
print ('string'+(str(345)))
Also the most foolproof method to do it is to convert everything into a variable:
a = 'string'
b = 345
c = str(345)
d = a + c

String formatting issues and concatenating a string with a number

I'm coming from a c# background, and I do this:
Console.Write("some text" + integerValue);
So the integer automatically gets converted to a string and it outputs.
In python I get an error when I do:
print 'hello' + 10
Do I have to convert to string every time?
How would I do this in python?
String.Format("www.someurl.com/{0}/blah.html", 100);
I'm beginning to really like python, thanks for all your help!
From Python 2.6:
>>> "www.someurl.com/{0}/blah.html".format(100)
'www.someurl.com/100/blah.html'
To support older environments, the % operator has a similar role:
>>> "www.someurl.com/%d/blah.html" % 100
'www.someurl.com/100/blah.html'
If you would like to support named arguments, then you can pass a dict.
>>> url_args = {'num' : 100 }
>>> "www.someurl.com/%(num)d/blah.html" % url_args
'www.someurl.com/100/blah.html'
In general, when types need to be mixed, I recommend string formatting:
>>> '%d: %s' % (1, 'string formatting',)
'1: string formatting'
String formatting coerces objects into strings by using their __str__ methods.[*] There is much more detailed documentation available on Python string formatting in the docs. This behaviour is different in Python 3+, as all strings are unicode.
If you have a list or tuple of strings, the join method is quite convenient. It applies a separator between all elements of the iterable.
>>> ' '.join(['2:', 'list', 'of', 'strings'])
'2: list of strings'
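Since the original question was about mixing strings and numbers, note that join itself only accepts strings. A minimal sketch with made-up list contents:

```python
items = [2, "list", "of", 3.5, "things"]

# join() only accepts strings, so mixed items raise TypeError.
try:
    " ".join(items)
    direct = "no error"
except TypeError:
    direct = "TypeError"

# Converting each element first makes the call safe.
joined = " ".join(str(item) for item in items)

print(direct)  # TypeError
print(joined)  # 2 list of 3.5 things
```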
If you ever need to support a legacy environment (e.g. Python <2.5), you should generally avoid string concatenation. See the article referenced in the comments.
[*] Unicode strings use the __unicode__ method.
>>> u'3: %s' % ':)'
u'3: :)'
>>> "www.someurl.com/{0}/blah.html".format(100)
'www.someurl.com/100/blah.html'
In Python 2.7 and 3.1 or later you can omit the 0 and write {} instead.
Additionally to string formatting, you can always print like this:
print "hello", 10
Works since those are separate arguments and print converts non-string arguments to strings (and inserts a space in between).
For string formatting that includes different types of values, use the % to insert the value into a string:
>>> intvalu = 10
>>> print "hello %i"%intvalu
hello 10
>>>
so in your example:
>>> print "www.someurl.com/%i/blah.html" % 100
www.someurl.com/100/blah.html
In this example I'm using %i as the stand-in. This changes depending on what variable type you need to use. %s would be for strings. There is a list here on the python docs website.
