I have a string: 1x22x1x.
I need to replace every 1 with 2 and vice versa, so the example string would become 2x11x2x. How is this done? I tried
a = "1x22x1x"
b = a.replace('1', '2').replace('2', '1')
print b
output is 1x11x1x
Maybe I should forget about using replace?
Here's a way using the translate method of a string:
>>> a = "1x22x1x"
>>> a.translate({ord('1'):'2', ord('2'):'1'})
'2x11x2x'
>>>
>>> # Just to explain
>>> help(str.translate)
Help on method_descriptor:
translate(...)
S.translate(table) -> str
Return a copy of the string S, where all characters have been mapped
through the given translation table, which must be a mapping of
Unicode ordinals to Unicode ordinals, strings, or None.
Unmapped characters are left untouched. Characters mapped to None
are deleted.
>>>
Note however that I wrote this for Python 3.x. In 2.x, you will need to do this:
>>> from string import maketrans
>>> a = "1x22x1x"
>>> a.translate(maketrans('12', '21'))
'2x11x2x'
>>>
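For Python 3, the table does not have to be written out by hand: str.maketrans builds it from two equal-length strings. A small sketch using the same input (the variable names are only illustrative):

```python
# Python 3: str.maketrans maps each character of the first string
# to the character at the same position in the second string.
a = "1x22x1x"
table = str.maketrans("12", "21")
swapped = a.translate(table)
print(swapped)  # 2x11x2x
```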
Finally, it is important to remember that the translate method is for interchanging characters with other characters. If you want to interchange substrings, you should use the replace method as Rohit Jain demonstrated.
One way is to use a temporary string as an intermediate replacement:
b = a.replace('1', '#temp_replace#').replace('2', '1').replace('#temp_replace#', '2')
But this may fail if your string already contains #temp_replace#. This technique is also described in PEP 378.
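If the placeholder collision is a concern, one possible alternative is to do all replacements in a single pass with re.sub, so no intermediate string is needed. This is only a sketch; swap_substrings and its mapping argument are hypothetical names, not something from the answers here:

```python
import re

def swap_substrings(text, mapping):
    """Replace every key of mapping with its value in one pass over text."""
    # Try longer keys first so that overlapping keys prefer the longest match.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # Each match is looked up once, so replaced text is never re-replaced.
    return pattern.sub(lambda m: mapping[m.group(0)], text)

print(swap_substrings("1x22x1x", {"1": "2", "2": "1"}))  # 2x11x2x
```

Because the text is scanned once, it also works for multi-character keys such as {"cat": "dog", "dog": "cat"}.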
If the "sources" are all one character, you can make a new string:
>>> a = "1x22x1x"
>>> replacements = {"1": "2", "2": "1"}
>>> ''.join(replacements.get(c,c) for c in a)
'2x11x2x'
IOW, make a new string using the get method which accepts a default parameter. somedict.get(c,c) means something like somedict[c] if c in somedict else c, so if the character is in the replacements dictionary you use the associated value otherwise you simply use the character itself.
In python when I try this :-
ac = "Pearl Riverb-Vaccines"
b = ac.strip("-Vaccines")
b = b.strip()
print(b)
The output is :- Pearl Riverb
But when I try this :-
ac = "Pearl Rivera-Vaccines"
b = ac.strip("-Vaccines")
b = b.strip()
print(b)
The output is :- Pearl River
So why is the 'a' missing in the second code?
I have tried every other letter and it is printing but what is the problem with letter 'a' ?
strip() does not respect count or order when it removes characters from the end of your string. The argument you passed it, "-Vaccines", contains an "a", so it will remove the "a" from "Rivera". It does not matter that it already removed an "a" from "Vaccines" and it does not matter that it doesn't come between a V and a c.
Consider another example:
>>> "abcXqrqqqrrrqrqrqrqrqqrr".strip("qr")
'abcX'
Many qs and rs are removed here, even though the argument to strip contains only one of each.
In general, strip is not suitable for removing a static number of characters from the end of a string. One possible alternative is to use regex, which can match a literal character sequence that appears at the end of a string:
>>> import re
>>> ac = "Pearl Rivera-Vaccines"
>>> re.sub("-Vaccines$", "", ac)
'Pearl Rivera'
In his answer, Tom Karzes observes that this approach doesn't readily work on strings that contain characters that have special meanings in a regex. For instance,
>>> import re
>>> s = "foo^bar"
>>> re.sub("^bar$", "", s)
'foo^bar'
^ has a special meaning in regex, so the pattern "^bar$" fails to match the end of the string s. If the string you want to match contains special characters, you should escape it, either manually or with an re.escape call.
>>> import re
>>> s = "foo^bar"
>>> re.sub(r"\^bar$", "", s)
'foo'
>>> re.sub(re.escape("^bar") + "$", "", s)
'foo'
The problem is that the argument to strip isn't used the way you think it is. The argument isn't treated as a sequence of characters, but rather as a set of characters. Any character in the argument string is removed. For example:
"abaca".strip("ac")
Produces:
'b'
since every "a" and "c" at either end of the string has been removed.
If you just want to remove a suffix from a string, you can do something like:
ac = "Pearl Rivera-Vaccines"
s = "-Vaccines"
b = ac
if b.endswith(s):
    b = b[:-len(s)]
This will result in b having the value:
'Pearl Rivera'
Note that this will be faster than using the re module. It will also be more flexible, since it will work with any non-empty string (whereas creating a regular expression will require escaping certain characters).
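On Python 3.9 and later, str.removesuffix does the same check-and-slice in one call. The sketch below wraps it with a fallback for older versions; remove_suffix is a hypothetical helper name:

```python
def remove_suffix(text, suffix):
    """Drop suffix from the end of text once, if it is there."""
    if hasattr(str, "removesuffix"):
        # Available from Python 3.9.
        return text.removesuffix(suffix)
    # Fallback for older versions: same logic as endswith plus slicing.
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text

print(remove_suffix("Pearl Rivera-Vaccines", "-Vaccines"))  # Pearl Rivera
print(remove_suffix("foo^bar", "^bar"))  # foo
```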
I have a string that opens with { and closes with }. These brackets are always the first and last characters, they are always present, and they cannot appear in the middle, as in the following:
{-4,10746,.....,205}
{-3,105756}
What is the most efficient way to remove the brackets to get:
-4,10746,.....,205
-3,105756
s[1:-1] # skip the first and last character
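If you would rather not rely on the braces being present, a guarded version of the slice might look like this (strip_braces is a hypothetical name, and returning the input unchanged when the braces are missing is an assumption about the desired behavior):

```python
def strip_braces(s):
    """Remove one leading '{' and one trailing '}' if both are present."""
    if len(s) >= 2 and s[0] == "{" and s[-1] == "}":
        return s[1:-1]
    # Leave anything else untouched rather than cutting off real data.
    return s

print(strip_braces("{-3,105756}"))  # -3,105756
```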
You can also use replace method.
In [1]: a = 'hello world'
In [3]: a.replace('l','')
Out[3]: 'heo word'
Since the question is not clear, there are two possibilities: it may be a string or a set.
If it is a set this might work:
a= {-4, 205, 10746}
",".join([str(s) for s in a])
output='10746,-4,205'
If it is a string this will work:
a= '{-4, 205, 10746}'
a.replace("{","").replace("}","")
output= '-4, 205, 10746'
Since a set has no order, the output of the set version may come out in a different order.
Here's a rather roundabout way of doing exactly what you need:
l = {-3,105756}
new_l = []
for ch in l:
    if ch!='{' and ch!= '}':
        new_l.append(ch)
for i,val in enumerate(new_l):
    length = len(new_l)
    if(i==length-1):
        print str(val)
    else:
        print str(val)+',',
I'm sure there are numerous single line codes to give you what you want, but this is kind of what goes on in the background, and will also remove the braces irrespective of their positions in the input string.
Just a side note: the answer by @dlask is a good way to solve your issue.
But if what you really want is to convert that string (that looks like a set) to a set object (or some other data structure) , you can also use ast.literal_eval() function -
>>> import ast
>>> s = '{-3,105756}'
>>> sset = ast.literal_eval(s)
>>> sset
{105756, -3}
>>> type(sset)
<class 'set'>
From documentation -
ast.literal_eval(node_or_string)
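Safely evaluate an expression node or a Unicode or Latin-1 encoded string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, numbers, tuples, lists, dicts, booleans, and None.
(That wording is from an older version of the docs. Python 3 also accepts set and bytes literals, which is why the example above returns a set.)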
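To illustrate the "literals only" restriction, here is a small sketch in Python 3. parse_literal is a hypothetical wrapper, and returning None on failure is just one possible choice:

```python
import ast

def parse_literal(text):
    """Return the Python literal in text, or None if it is not a plain literal."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Function calls, names and other non-literal expressions end up here.
        return None

numbers = parse_literal("{-3,105756}")
print(sorted(numbers))  # [-3, 105756]
print(parse_literal("__import__('os')"))  # None
```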
The safest way would be to strip:
'{-4, 205, 10746}'.strip("{}")
How can I convert a specific letter in a string to uppercase, e.g. all the a's in 'ahdhkhkahfkahafafkh'?
I can only seem to find ways to capitalize the first word or upper/lower case the entire string.
You can use str.translate with string.maketrans (this is Python 2; in Python 3 the function is str.maketrans):
>>> import string
>>> table = string.maketrans('a', 'A')
>>> 'abcdefgahajkl'.translate(table)
'AbcdefgAhAjkl'
This really shines if you want to replace 'a' and 'b' with their uppercase versions... then you just change the translation table:
table = string.maketrans('ab', 'AB')
Or, you can use str.replace if you really are only doing a 1 for 1 swap:
>>> 'abcdefgahajkl'.replace('a', 'A')
'AbcdefgAhAjkl'
This method shines when you only have one replacement. It replaces substrings rather than characters, so 'Bat'.replace('Ba', 'Cas') -> 'Cast'.
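For Python 3, where string.maketrans no longer exists, the same idea can be sketched with a dict passed to str.translate. uppercase_letters is a hypothetical helper name:

```python
def uppercase_letters(text, letters):
    """Uppercase only the characters of text that appear in letters."""
    # str.translate accepts a dict mapping code points to replacement strings.
    table = {ord(c): c.upper() for c in letters}
    return text.translate(table)

print(uppercase_letters("ahdhkhkahfkahafafkh", "a"))  # AhdhkhkAhfkAhAfAfkh
print(uppercase_letters("abcdefgahajkl", "ab"))       # ABcdefgAhAjkl
```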
'banana'.replace('a', "A")
From the docs: https://docs.python.org/2/library/string.html#string.replace
>>> a = 'ahdhkhkahfkahafafkh'
>>> "".join(i.upper() if i == 'a' else i for i in a)
'AhdhkhkAhfkAhAfAfkh'
Or
>>> a.replace('a',"A")
'AhdhkhkAhfkAhAfAfkh'
Is there a way to convert a string to lowercase?
"Kilometers" → "kilometers"
Use str.lower():
"Kilometer".lower()
The canonical Pythonic way of doing this is
>>> 'Kilometers'.lower()
'kilometers'
However, if the purpose is to do case insensitive matching, you should use case-folding:
>>> 'Kilometers'.casefold()
'kilometers'
Here's why:
>>> "Maße".casefold()
'masse'
>>> "Maße".lower()
'maße'
>>> "MASSE" == "Maße"
False
>>> "MASSE".lower() == "Maße".lower()
False
>>> "MASSE".casefold() == "Maße".casefold()
True
casefold is a str method in Python 3. In Python 2, you'll want to look at PyICU or py2casefold; several other answers address this.
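As a sketch of what case-insensitive matching with casefold could look like in Python 3 (both function names are hypothetical):

```python
def equal_ignoring_case(left, right):
    """Compare two strings without regard to case, using casefold."""
    return left.casefold() == right.casefold()

def find_ignoring_case(needle, items):
    """Return the items that match needle case-insensitively."""
    key = needle.casefold()
    return [item for item in items if item.casefold() == key]

print(equal_ignoring_case("MASSE", "Maße"))                    # True
print(find_ignoring_case("masse", ["Maße", "MASSE", "Mast"]))  # ['Maße', 'MASSE']
```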
Unicode Python 3
Python 3 handles plain string literals as unicode:
>>> string = 'Километр'
>>> string
'Километр'
>>> string.lower()
'километр'
Python 2, plain string literals are bytes
In Python 2, the below, pasted into a shell, encodes the literal as a string of bytes, using utf-8.
And lower doesn't map any changes that bytes would be aware of, so we get the same string.
>>> string = 'Километр'
>>> string
'\xd0\x9a\xd0\xb8\xd0\xbb\xd0\xbe\xd0\xbc\xd0\xb5\xd1\x82\xd1\x80'
>>> string.lower()
'\xd0\x9a\xd0\xb8\xd0\xbb\xd0\xbe\xd0\xbc\xd0\xb5\xd1\x82\xd1\x80'
>>> print string.lower()
Километр
In scripts, Python will object to non-ascii (as of Python 2.5, and warning in Python 2.4) bytes being in a string with no encoding given, since the intended coding would be ambiguous. For more on that, see the Unicode how-to in the docs and PEP 263
Use Unicode literals, not str literals
So we need a unicode string to handle this conversion, accomplished easily with a unicode string literal, which disambiguates with a u prefix (and note the u prefix also works in Python 3):
>>> unicode_literal = u'Километр'
>>> print(unicode_literal.lower())
километр
Note that the bytes are completely different from the str bytes - the escape character is '\u' followed by the 2-byte width, or 16 bit representation of these unicode letters:
>>> unicode_literal
u'\u041a\u0438\u043b\u043e\u043c\u0435\u0442\u0440'
>>> unicode_literal.lower()
u'\u043a\u0438\u043b\u043e\u043c\u0435\u0442\u0440'
Now if we only have it in the form of a str, we need to convert it to unicode. Python's Unicode type is a universal encoding format that has many advantages relative to most other encodings. We can either use the unicode constructor or str.decode method with the codec to convert the str to unicode:
>>> unicode_from_string = unicode(string, 'utf-8') # "encoding" unicode from string
>>> print(unicode_from_string.lower())
километр
>>> string_to_unicode = string.decode('utf-8')
>>> print(string_to_unicode.lower())
километр
>>> unicode_from_string == string_to_unicode == unicode_literal
True
Both methods convert to the unicode type, and the result is equal to unicode_literal.
Best Practice, use Unicode
It is recommended that you always work with text in Unicode.
Software should only work with Unicode strings internally, converting to a particular encoding on output.
Can encode back when necessary
However, to get the lowercase back in type str, encode the python string to utf-8 again:
>>> print string
Километр
>>> string
'\xd0\x9a\xd0\xb8\xd0\xbb\xd0\xbe\xd0\xbc\xd0\xb5\xd1\x82\xd1\x80'
>>> string.decode('utf-8')
u'\u041a\u0438\u043b\u043e\u043c\u0435\u0442\u0440'
>>> string.decode('utf-8').lower()
u'\u043a\u0438\u043b\u043e\u043c\u0435\u0442\u0440'
>>> string.decode('utf-8').lower().encode('utf-8')
'\xd0\xba\xd0\xb8\xd0\xbb\xd0\xbe\xd0\xbc\xd0\xb5\xd1\x82\xd1\x80'
>>> print string.decode('utf-8').lower().encode('utf-8')
километр
So in Python 2, Unicode can encode into Python strings, and Python strings can decode into the Unicode type.
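The same round trip can be sketched in Python 3, where the byte string is an explicit bytes object. Here bytes.lower() only touches ASCII letters, so the UTF-8 encoded Cyrillic is left as it is until you decode it:

```python
# Python 3: encode to get the UTF-8 bytes that Python 2 str would hold.
data = "Километр".encode("utf-8")

# bytes.lower() only changes ASCII letters, so nothing happens here.
print(data.lower() == data)  # True

# Decode to text, lowercase, then encode back to bytes.
lowered = data.decode("utf-8").lower().encode("utf-8")
print(lowered.decode("utf-8"))  # километр
```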
In Python 2, lower() doesn't work on UTF-8 encoded byte strings that contain non-ASCII text. In this case decode('utf-8') can help:
>>> s='Километр'
>>> print s.lower()
Километр
>>> print s.decode('utf-8').lower()
километр
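Also, you can store the result in a variable:
s = input('UPPER CASE')
lower = s.lower()
Note that lower() returns a new string and leaves the original unchanged:
s = "Kilometer"
print(s.lower())  # kilometer
print(s)          # Kilometer
The lowercase form exists only in the returned value, so assign it if you need to keep it.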
Don't do this; it is not recommended at all:
import string
s='ABCD'
print(''.join([string.ascii_lowercase[string.ascii_uppercase.index(i)] for i in s]))
Output:
abcd
Since no one has mentioned it yet: you can also use swapcase, which turns uppercase letters into lowercase and vice versa. Use it only when you actually want that two-way swap:
s='ABCD'
print(s.swapcase())
Output:
abcd
I would like to provide a summary of the possible methods:
.lower() method.
str.lower()
combination of str.translate() and str.maketrans()
.lower() method
original_string = "UPPERCASE"
lowercase_string = original_string.lower()
print(lowercase_string) # Output: "uppercase"
str.lower()
original_string = "UPPERCASE"
lowercase_string = str.lower(original_string)
print(lowercase_string) # Output: "uppercase"
combination of str.translate() and str.maketrans()
import string
original_string = "UPPERCASE"
lowercase_string = original_string.translate(str.maketrans(string.ascii_uppercase, string.ascii_lowercase))
print(lowercase_string) # Output: "uppercase"
lowercasing
This method not only converts all uppercase letters of the Latin alphabet into lowercase ones, but also shows how such logic is implemented. You can test this code in any online Python sandbox.
def turnIntoLowercase(string):
    lowercaseCharacters = ''
    abc = ['a','b','c','d','e','f','g','h','i','j','k','l','m',
           'n','o','p','q','r','s','t','u','v','w','x','y','z',
           'A','B','C','D','E','F','G','H','I','J','K','L','M',
           'N','O','P','Q','R','S','T','U','V','W','X','Y','Z']
    for character in string:
        if character not in abc:
            lowercaseCharacters += character
        elif abc.index(character) <= 25:
            lowercaseCharacters += character
        else:
            lowercaseCharacters += abc[abc.index(character) - 26]
    return lowercaseCharacters
string = str(input("Enter your string, please: " ))
print(turnIntoLowercase(string = string))
Performance check
Now, let's enter the following string (and press Enter) to make sure everything works as intended:
# Enter your string, please:
"PYTHON 3.11.2, 15TH FeB 2023"
Result:
"python 3.11.2, 15th feb 2023"
If you want to convert a list of strings to lowercase, you can map str.lower:
list_of_strings = ['CamelCase', 'in', 'Python']
list(map(str.lower, list_of_strings)) # ['camelcase', 'in', 'python']
In Python, strings are immutable.
What is the standard idiom to walk through a string character-by-character and modify it?
The only methods I can think of are some genuinely stanky hacks related to joining against a result string.
--
In C:
for(int i = 0; i < strlen(s); i++)
{
s[i] = F(s[i]);
}
This is super expressive and says exactly what I am doing. That is what I am looking for.
Don't use a string, use something mutable like bytearray:
#!/usr/bin/python
s = bytearray("my dog has fleas")
for n in xrange(len(s)):
    s[n] = chr(s[n]).upper()
print s
Results in:
MY DOG HAS FLEAS
Edit:
Since this is a bytearray, you aren't (necessarily) working with characters. You're working with bytes. So this works too:
s = bytearray("\x81\x82\x83")
for n in xrange(len(s)):
    s[n] = s[n] + 1
print repr(s)
gives:
bytearray(b'\x82\x83\x84')
If you want to modify characters in a Unicode string, you'd maybe want to work with memoryview, though that doesn't support Unicode directly.
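The snippets above are Python 2. A rough Python 3 equivalent, assuming ASCII-only data, has to start from a bytes literal and assign integers rather than one-character strings:

```python
# Python 3: bytearray needs bytes (or an explicit encoding), and its items are ints.
s = bytearray(b"my dog has fleas")
for n in range(len(s)):
    # chr/ord round trip is safe here because the data is ASCII only.
    s[n] = ord(chr(s[n]).upper())
text = s.decode("ascii")
print(text)  # MY DOG HAS FLEAS
```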
The Python analog of your C:
for(int i = 0; i < strlen(s); i++)
{
s[i] = F(s[i]);
}
would be:
s = "".join(F(c) for c in s)
which is also very expressive. It says exactly what is happening, but in a functional style rather than a procedural style.
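To make that concrete, here is a sketch with one possible F. The function body is purely illustrative (it rotates lowercase ASCII letters by one position); any function that takes and returns a single character would do:

```python
def F(c):
    """Illustrative per-character function: rotate a-z by one, keep the rest."""
    if "a" <= c <= "z":
        return chr((ord(c) - ord("a") + 1) % 26 + ord("a"))
    return c

s = "my dog has fleas"
# Build a new string and rebind the name, since str itself cannot be changed.
s = "".join(F(c) for c in s)
print(s)  # nz eph ibt gmfbt
```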
You can use the UserString module (Python 2 only; MutableString was removed in Python 3):
>>> import UserString
>>> s = UserString.MutableString('Python')
>>> print s
Python
>>> s[0] = 'c'
>>> print s
cython
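I'd say the most Pythonic way is to use map(). Note that map() on its own returns a list (Python 2) or an iterator (Python 3), so the result still has to be joined back into a string:
s = "".join(map(func, s)) # func has been applied to every character in s
This is the equivalent of writing:
s = "".join(func(c) for c in s)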
The question first states that strings are immutable and then asks for a way to change them in place. This is kind of contradictory. Anyway, as this question pops up at the top of the list when you search for "python string in-place modification", I'm adding the answer for a real in place change.
Strings seem to be immutable when you look at the methods of the string class. But no language with an interface to C can really provide immutable data types. The only question is whether you have to write C code in order to achieve the desired modification.
Here python ctypes is your friend. As it supports getting pointers and includes C-like memory copy functions, a python string can be modified in place like this:
import ctypes
s = 16 * "."
print s
ctypes.memmove(ctypes.c_char_p(s), "Replacement", 11)
print s
Results in:
................
Replacement.....
(Of course, you can calculate the replacement string at runtime by applying a function F to every character of the original string. Different ways how to do this have been shown in the previous answers.)
Note that I do not in any way encourage doing this. However, I had to write a replacement for a class that was mapped from C++ to python and included a method:
int readData(char* data, int length)
(The caller is supposed to provide memory with length bytes and the method then writes the available data -- up to length -- into that memory, returning the number of bytes written.) While this is a perfectly sensible API in C/C++, it should not have been made available as method of a python class or at least the users of the API should be made aware that they may only pass mutable byte arrays as parameter.
As you might expect, "common usage" of the method is as shown in my example (create a string and pass it together with its length as arguments). As I did not really want to write a C/C++ extension I had to come up with a solution for implementing the behavior in my replacement class using python only.
string.translate is probably the closest function to what you're after.
Strings are iterable and can be walked through like lists. Strings also have a number of basic methods such as .replace() that might be what you're looking for. All string methods return a new string. So instead of modifying the string in place you can simply replace its existing value.
>>> mystring = 'robot drama'
>>> mystring = mystring.replace('r', 'g')
>>> mystring
'gobot dgama'
Assigning a particular character to a particular index in a string is not a particularly common operation, so if you find yourself needing to do it, think about whether there may be a better way to accomplish the task. But if you do need to, probably the most standard way would be to convert the string to a list, make your modifications, and then convert it back to a string.
s = 'abcdefgh'
l = list(s)
l[3] = 'r'
s2 = ''.join(l)
EDIT: As posted in bstpierre's answer, bytearray is probably even better for this task than list, as long as you're not working with Unicode strings.
s = 'abcdefgh'
b = bytearray(s)
b[3] = 'r'
s2 = str(b)
>>> mystring = "Th1s 1s my str1ng"
>>> mystring.replace("1", "i")
'This is my string'
If you want to store this new string you'll have to write mystring = mystring.replace("1", "i"). This is because in Python strings are immutable.
If I ever need to do something like that I just convert it to a mutable list
For example... (though it would be easier to use sort (see second example) )
>>> s = "abcdfe"
>>> s = list(s)
>>> s[4] = "e"
>>> s[5] = "f"
>>> s = ''.join(s)
>>> print s
abcdef
>>>
# second example
>>> s = list("abcdfe")
>>> s.sort()
>>> s = ''.join(s)
Here is an example using translate to switch "-" with "." and uppercase "a"s
>>> from string import maketrans
>>> trans_table = maketrans(".-a","-.A")
>>> "foo-bar.".translate(trans_table)
'foo.bAr-'
This is much more efficient than flipping to a bytearray and back if you just need to do single-character replacements.
def modifyIdx(s, idx, newchar):
    return s[:idx] + newchar + s[idx+1:]
You can use StringIO class to receive file-like mutable interface of string.
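A minimal sketch of that idea in Python 3, using io.StringIO and overwriting one character by seeking to its position:

```python
import io

# Load the text into an in-memory, file-like buffer.
buf = io.StringIO("abcdefgh")
buf.seek(3)     # move to the character at index 3
buf.write("r")  # overwrite it, as a file write would
s2 = buf.getvalue()
print(s2)  # abcrefgh
```

The original str object is still untouched; getvalue() returns a new string built from the buffer.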
I did that like this:
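import os
import shutil
import tempfile
...
f_old = open(input_file, 'r')
# mode='w' so that text lines can be written on Python 3 as well
with tempfile.NamedTemporaryFile(mode='w') as tmp:
    for line in f_old:
        tmp.write(line.replace(old_string, new_string))
    f_old.close()
    tmp.flush()
    os.fsync(tmp)
    # copy the rewritten contents over the original file
    shutil.copy2(tmp.name, input_file)
    tmp.close()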
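A variant of the same idea for Python 3 could write the temporary file next to the original and then swap it in with os.replace. This is only a sketch: replace_in_file and its parameters are hypothetical names, and UTF-8 is an assumed default encoding:

```python
import os
import tempfile

def replace_in_file(path, old_string, new_string, encoding="utf-8"):
    """Rewrite the file at path with old_string replaced by new_string."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so the final
    # os.replace stays on one filesystem.
    with open(path, "r", encoding=encoding) as source, \
            tempfile.NamedTemporaryFile("w", encoding=encoding,
                                        dir=directory, delete=False) as tmp:
        for line in source:
            tmp.write(line.replace(old_string, new_string))
        tmp.flush()
        os.fsync(tmp.fileno())
    # Both files are closed here; swap the new contents into place.
    os.replace(tmp.name, path)
```

Usage would be along the lines of replace_in_file("notes.txt", "robot", "gobot"), where the file name is just an example.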
Here's my pythonic solution for In-place string reversal.
Accounts for white spaces too.
Note: It won't match any special characters if included in input_string except for underscore ( '_' )
i/p - "Hello World" => o/p - "olleH dlroW"
import re
def inplace_reversal(input_string):
    list_of_strings = re.findall(r'\s|(\w+)',input_string)
    output_string= ''
    for string in list_of_strings:
        if string == '':
            output_string += ' '
        else:
            output_string += string[::-1]
    return output_string
print(inplace_reversal('__Hello__ __World__ __Hello__ __World__ '))
# Output: __olleH__ __dlroW__ __olleH__ __dlroW__
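A shorter sketch of the same word-by-word reversal uses re.sub with a callback. Unlike the version above, it reverses every run of non-whitespace characters (so punctuation is reversed too) and keeps the original whitespace exactly; reverse_words is a hypothetical name:

```python
import re

def reverse_words(text):
    """Reverse each whitespace-separated word, keeping the whitespace as is."""
    # \S+ matches a run of non-whitespace; the callback returns it reversed.
    return re.sub(r"\S+", lambda m: m.group(0)[::-1], text)

print(reverse_words("Hello World"))  # olleH dlroW
```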