Replace several different characters by only one [duplicate]

This question already has answers here:
how to replace multiple characters in a string?
(3 answers)
Closed 5 years ago.
I'm looking for a way to replace several different characters with a single other character.
For example, we have:
chars_to_be_replaced = "ihgr"
and we want them to be replaced by
new_char = "b"
So that the new string
s = "im hungry"
becomes
s' = "bm bunbby".
I'm well aware you can do this one char at a time with .replace or with regular expressions, but I'm looking for a way to go through the string only once.
Does re.sub go through the string only once? Are there other ways to do this? Thanks

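You can use str.translate(). In Python 3 the translation table is built with str.maketrans (in Python 2 it was string.maketrans):
chars_to_be_replaced = "ihgr"
new_char = "b"
s = "im hungry"
# Map every character in chars_to_be_replaced to new_char
trantab = str.maketrans(chars_to_be_replaced, new_char * len(chars_to_be_replaced))
print(s.translate(trantab))
# bm bunbby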

How about this:
chars_to_be_replaced = "ihgr"
new_char = "b"
my_dict = {k: new_char for k in chars_to_be_replaced}
s = "im hungry"
new_s = ''.join(my_dict.get(x, x) for x in s)
print(new_s) # bm bunbby
''.join(my_dict.get(x, x) for x in s): for each character in the original string, it looks up the replacement in the dictionary; if the character is not a key, the character itself is returned unchanged.
NOTE: You can speed it up (a bit) by passing a list to join instead of a generator:
new_s = ''.join([my_dict.get(x, x) for x in s])
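On the re.sub part of the question: as far as I know, re.sub scans the string once from left to right, so a character class that covers all the characters to replace also does the job in one pass. A minimal sketch, where the helper name replace_chars is made up for illustration and chars_to_be_replaced is assumed to be non-empty:

```python
import re

def replace_chars(s, chars_to_be_replaced, new_char):
    # Hypothetical helper: build a character class such as [ihgr].
    # re.escape protects characters like "-" or "]" inside the class.
    pattern = "[" + re.escape(chars_to_be_replaced) + "]"
    # A function as replacement avoids backslash processing in new_char.
    return re.sub(pattern, lambda match: new_char, s)

print(replace_chars("im hungry", "ihgr", "b"))  # bm bunbby
```

For plain single-character replacements like this one, str.translate is usually the simpler choice; the regex version is mainly useful if the pattern needs to grow later.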

Related

How to remove duplicated characters from a string? [duplicate]

This question already has answers here:
Removing duplicate characters from a string
(15 answers)
Closed 3 years ago.
How can I remove all repeated characters from a string?
e.g:
Input: string = 'Hello'
Output: 'Heo'
This is a different question from Removing duplicate characters from a string, as I don't want to print out the duplicates; I want to delete them.
You can use a generator expression and join, like this:
>>> x = 'Hello'
>>> ''.join(c for c in x if x.count(c) == 1)
'Heo'
You could build a Counter from the string and keep only the characters whose count is 1:
from collections import Counter
string = 'Hello'
c = Counter(string)
''.join([i for i in string if c[i] == 1])
# 'Heo'
A plain loop works as well:
a = 'Hello'
list_a = list(a)
output = []
for i in list_a:
    # Keep only the characters that occur exactly once
    if list_a.count(i) == 1:
        output.append(i)
''.join(output)
In addition to the other answers, a filter is also possible:
s = 'Hello'
result = ''.join(filter(lambda c: s.count(c) == 1, s))
# result - Heo
If you limit your question to cases with only repeated consecutive letters (as your example suggests), you could employ regular expressions:
import re
print(re.sub(r"(.)\1+", "", "hello")) # result = heo
print(re.sub(r"(.)\1+", "", "helloo")) # result = he
print(re.sub(r"(.)\1+", "", "hellooo")) # result = he
print(re.sub(r"(.)\1+", "", "sports")) # result = sports
If you need to apply the regular expression many times, it is worth compiling it beforehand:
prog = re.compile(r"(.)\1+")
print(prog.sub("", "hello"))
To restrict the search for duplicated letters on some subset of characters, you can adjust the regular expression accordingly.
print(re.sub(r"(\S)\1+", "", "hello")) # Search duplicated non-whitespace chars
print(re.sub(r"([a-z])\1+", "", "hello")) # Search for duplicated lowercase letters
Alternatively, an approach using itertools.groupby and a list comprehension could look as follows:
from itertools import groupby
dedup = lambda s: "".join([i for i, g in groupby(s) if len(list(g))==1])
print(dedup("hello")) # result = heo
print(dedup("helloo")) # result = he
print(dedup("hellooo")) # result = he
print(dedup("sports")) # result = sports
Note that on my machine the first method, using regular expressions, was about 8-10 times faster than the second one. (System: Python 3.6.7, MacBook Pro (Mid 2015))
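The answers in this thread implement two different rules that happen to agree on 'Hello': dropping every character that occurs more than once anywhere, and dropping only runs of consecutive repeats. A small sketch comparing the two (the function names are made up for illustration):

```python
import re
from collections import Counter

def drop_repeated_anywhere(s):
    # Remove every character that occurs more than once anywhere in s.
    counts = Counter(s)
    return "".join(ch for ch in s if counts[ch] == 1)

def drop_repeated_runs(s):
    # Remove only runs of two or more identical consecutive characters.
    return re.sub(r"(.)\1+", "", s)

print(drop_repeated_anywhere("Hello"))  # Heo
print(drop_repeated_runs("Hello"))      # Heo
print(drop_repeated_anywhere("abca"))   # bc
print(drop_repeated_runs("abca"))       # abca
```

Which one is right depends on what "repeated" is supposed to mean in the question.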

How to capitalize every other character in a string [duplicate]

This question already has answers here:
Capitalise every other letter in a string in Python? [closed]
(5 answers)
Closed 3 years ago.
I want the program to turn 'mahir' into 'MaHiR'. I get MHR, but how do I keep 'a' and 'i' in their usual places?
I have already tried slicing, but that does not work:
s = 'mahir'
a = list(s)
c = a[0:5:2]
for i in range(len(c)):
    print(c[i].capitalize(), end=" ")
Python strings are immutable: calling c[i].capitalize() returns a new string and does not change c[i], so s is not changed either. To modify a string you must build a new one from it. You can use str.join with a generator expression instead:
s = 'mahir'
s = ''.join(c.upper() if i % 2 == 0 else c for i, c in enumerate(s))
print(s)
Output:
MaHiR
If you want to do it using slicing, you could convert your string to a list since lists are mutable (but the string approach above is better):
s = 'mahir'
l = list(s)
l[::2] = map(str.upper, l[::2])
s = ''.join(l)
print(s)
Output:
MaHiR
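For comparison, here is a sketch that stays close to the loop in the question but collects the pieces instead of printing them. The function name alternate_caps is made up for illustration:

```python
def alternate_caps(s):
    pieces = []
    for i in range(len(s)):
        ch = s[i]
        # str.upper() returns a new string; s itself is never modified.
        pieces.append(ch.upper() if i % 2 == 0 else ch)
    return ''.join(pieces)

print(alternate_caps('mahir'))  # MaHiR
```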

How can we remove a word with a repeated single character?

I am trying to collapse words that consist of a single repeated character using regex in Python, for example:
good => good
gggggggg => g
What I have tried so far is the following:
re.sub(r'([a-z])\1+', r'\1', 'ffffffbbbbbbbqqq')
The problem with the above solution is that it changes good to god, and I only want to change words that consist of a single repeated character.
A better approach here is to use a set:
def modify(s):
    # Create a set from the string
    c = set(s)
    # If the set has only one character, convert the set to a string
    if len(c) == 1:
        return ''.join(c)
    # Otherwise return the original string
    else:
        return s
print(modify('good'))
print(modify('gggggggg'))
If you want to use a regex, mark the start and end of the string in the regex with ^ and $ (inspired by #bobblebubble's comment):
import re
def modify(s):
    # Substitute with a regex that only matches if a single character is repeated,
    # marking the start and end of the string as well
    out = re.sub(r'^([a-z])\1+$', r'\1', s)
    return out
print(modify('good'))
print(modify('gggggggg'))
The output will be
good
g
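If the input is a whole sentence rather than a single word, one possible variation is to use word boundaries (\b) instead of ^ and $. This is only a sketch: the function name is made up, and it assumes words consist of lowercase ASCII letters, as in the question's pattern.

```python
import re

def collapse_repeated_words(text):
    # \b anchors the match to a whole word, so only words made of one
    # repeated letter are collapsed to a single letter.
    return re.sub(r'\b([a-z])\1+\b', r'\1', text)

print(collapse_repeated_words('good gggggggg food'))  # good g food
```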
If you do not want to use a set in your method, this should do the trick:
def simplify(s):
    l = len(s)
    if l > 1 and s.count(s[0]) == l:
        return s[0]
    return s
print(simplify('good'))
print(simplify('abba'))
print(simplify('ggggg'))
print(simplify('g'))
print(simplify(''))
output:
good
abba
g
g
Explanations:
You compute the length of the string.
You count the number of characters that are equal to the first one and compare that count with the string length.
Depending on the result, you return the first character or the whole string.
You can use the Trim method:
Take a look at this example:
"ggggggg".Trim('g');
Update:
For characters in the middle of the string, use this function, thanks to this answer.
In C#:
public static string RemoveDuplicates(string input)
{
    return new string(input.ToCharArray().Distinct().ToArray());
}
in python:
used = set()
unique = [x for x in mylist if x not in used and (used.add(x) or True)]
But I think none of these answers handles a case like aaaaabbbbbcda: this string has an a at the end that does not appear in the result (abcd). For this kind of situation, use this function, which I wrote:
In:
def unique(s):
    used = set()
    ret = list()
    s = list(s)
    for x in s:
        if x not in used:
            ret.append(x)
            # Reset the set so that only the current run is remembered
            used = set()
            used.add(x)
    return ret
print(unique('aaaaabbbbbcda'))
out:
['a', 'b', 'c', 'd', 'a']
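The function above collapses each run of consecutive identical characters into one. The same effect can be sketched with itertools.groupby from the standard library (the name collapse_runs is made up for illustration):

```python
from itertools import groupby

def collapse_runs(s):
    # groupby yields one (key, group) pair per run of equal items;
    # keeping only the keys collapses each run to a single character.
    return [key for key, _group in groupby(s)]

print(collapse_runs('aaaaabbbbbcda'))  # ['a', 'b', 'c', 'd', 'a']
print(''.join(collapse_runs('good')))  # god
```

Note that, like the original regex in the question, this turns good into god, so it answers a slightly different problem than the one asked.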

Python best way to remove char from string by index [duplicate]

This question already has answers here:
Remove char at specific index - python
(8 answers)
Closed 2 months ago.
I'm removing a char from a string like this:
S = "abcd"
Index = 1  # index of the character to remove
ListS = list(S)
ListS.pop(Index)
S = "".join(ListS)
print(S)
# "acd"
I'm sure that this is not the best way to do it.
EDIT
I didn't mention that I need to manipulate strings of length ~10^7.
So it's important to care about efficiency.
Can someone help me? What is the Pythonic way to do it?
You can bypass all the list operations with slicing:
S = S[:1] + S[2:]
or more generally
S = S[:Index] + S[Index + 1:]
Many answers to your question (including ones like this) can be found here: How to delete a character from a string using python?. However, that question is nominally about deleting by value, not by index.
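The slicing approach can be wrapped in a small helper. This is a sketch only; the name remove_at and the choice to raise IndexError for an out-of-range index are assumptions, not part of the answer above:

```python
def remove_at(s, index):
    # Slicing silently ignores out-of-range positions, so check explicitly.
    if not 0 <= index < len(s):
        raise IndexError("index out of range")
    # Build a new string from the parts before and after the index.
    return s[:index] + s[index + 1:]

print(remove_at("abcd", 1))  # acd
```

Each call copies the string once, so a single removal is linear in the length of the string.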
Slicing is the best and easiest approach I can think of. Here are some other alternatives:
>>> s = 'abcd'
>>> def remove(s, indx):
...     # enumerate gives each character's position; s.index(x) would only find the first occurrence
...     return ''.join(x for i, x in enumerate(s) if i != indx)
...
>>> remove(s, 1)
'acd'
>>> def remove(s, indx):
...     return ''.join(x for i, x in filter(lambda pair: pair[0] != indx, enumerate(s)))
...
>>> remove(s, 1)
'acd'
Remember that indexing is zero-based.
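You can replace the character at Index with "". Note that str.replace works by value, so this removes the first occurrence of that character, which is the one at Index only if the character does not appear earlier in the string.
text = "ab1cd1ef"
Index = 3
print(text.replace(text[Index], "", 1))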
def missing_char(str, n):
    n = abs(n)
    front = str[:n]     # up to but not including n
    back = str[n+1:]    # n+1 through end of string
    return front + back
S = "abcd"
Index = 1  # index of the character to remove
# Caveat: replace removes every occurrence of S[Index], not only the one at Index
S = S.replace(S[Index], "")
print(S)
I hope it helps!
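A short sketch of why the replace-based variants and slicing can disagree when the character at the index also occurs elsewhere in the string (the variable names are chosen for illustration):

```python
s = "ab1cd1ef"
index = 5  # points at the second "1"

# replace works by value: it removes the first "1", which is at index 2.
by_value = s.replace(s[index], "", 1)

# Slicing works by position: it removes exactly the character at index 5.
by_index = s[:index] + s[index + 1:]

print(by_value)  # abcd1ef
print(by_index)  # ab1cdef
```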

python printing common items in a string without duplicating [duplicate]

This question already has answers here:
How can I find all common letters in a set of strings?
(2 answers)
Closed 8 years ago.
I need to make a function that takes two string arguments and returns a string with only the characters that are in both of the argument strings. There should be no duplicate characters in the return value.
This is what I have, but I need it to print each character only once, even if it occurs more than once:
def letter(x,z):
    for i in x:
        for f in z:
            if i == f:
                s = str(i)
                print(s)
If the order is not important, you can take the intersection (&) of the sets of characters in each word, then join that set into a single string and return it. Because sets are unordered, the order of the characters in the result may vary.
def makeString(a, b):
    return ''.join(set(a) & set(b))
>>> makeString('sentence', 'santa')
'nts'
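If a reproducible result is wanted from the set intersection, one option is to sort the common characters before joining them. A minimal sketch (the name common_sorted is made up for illustration):

```python
def common_sorted(a, b):
    # The set intersection removes duplicates; sorted() fixes the order.
    return ''.join(sorted(set(a) & set(b)))

print(common_sorted('sentence', 'santa'))  # nst
```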
Try this:
s = set()
def letter(x,z):
    for i in x:
        for f in z:
            if i == f:
                s.add(i)
letter("hello","world")
print("".join(s))
It will print 'ol' (or 'lo', since sets are unordered).
If sets aren't your bag for some reason (perhaps you want to maintain the order in one or other of the strings), try:
def common_letters(s1, s2):
    unique_letters = []
    for letter in s1:
        if letter in s2 and letter not in unique_letters:
            unique_letters.append(letter)
    return ''.join(unique_letters)
print(common_letters('spam', 'arthuprs'))
(Assuming Python 3 for the print().)
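A possible variation that keeps the order of the first string but avoids the repeated list scans: dict.fromkeys removes duplicates while preserving insertion order (guaranteed from Python 3.7 on), and a set gives fast membership tests. The function name is made up for illustration:

```python
def common_letters_ordered(s1, s2):
    lookup = set(s2)  # fast membership test for characters of s2
    # dict.fromkeys(s1) yields each character of s1 once, in first-seen order.
    return ''.join(ch for ch in dict.fromkeys(s1) if ch in lookup)

print(common_letters_ordered('spam', 'arthuprs'))  # spa
print(common_letters_ordered('hello', 'world'))    # lo
```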
