Python, can't replace generator object - python

I need to replace a string's punctuation marks with spaces.
The problem is that I need to do it in one line.
For example, given the string 'H,+-=/e^##%ll-!!..o',
the result should be 'H-----e----ll-----o',
where '-' symbolizes ' ' (a space).
When I do
replace((c for c in string.punctuation),' ')
I get the error:
TypeError: Can't convert 'generator' object to str implicitly
I tried putting it in a list, in a set, even in a dict,
but the error keeps coming back.
How can I get around this?

str.replace() doesn't take a list or generator; it only takes a string, and even then it won't do what you want. The method replaces one whole sequence of characters with another, so even x.replace(string.punctuation, '-') would only replace whole occurrences of the entire string.punctuation string in x with one dash.
Use string.maketrans() and str.translate() instead:
import string
translationmap = string.maketrans(string.punctuation, '-' * len(string.punctuation))
x = x.translate(translationmap)
Demo:
>>> import string
>>> x = 'H,+-=/e^##%ll-!!..o'
>>> translationmap = string.maketrans(string.punctuation, '-' * len(string.punctuation))
>>> x.translate(translationmap)
'H-----e----ll-----o'
str.translate() is hands-down the fastest method to map characters to other characters, or delete characters from a string.
On Python 3, str.translate() (or in Python 2, unicode.translate()) takes a mapping instead:
translationmap = {ord(c): '-' for c in string.punctuation}
x.translate(translationmap)
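Since the question asks for spaces rather than dashes, a Python 3 sketch of the same translate approach might look like the following. The names to_space, text and result are illustrative and not part of the answer above.

```python
import string

# Python 3: str.maketrans builds the mapping; each punctuation
# character is mapped to a single space.
to_space = str.maketrans(string.punctuation, ' ' * len(string.punctuation))

text = 'H,+-=/e^##%ll-!!..o'
result = text.translate(to_space)

# Show the spaces as dashes so they are visible when printed.
print(result.replace(' ', '-'))
```

The whole operation still fits on one line if the table is built inline: text.translate(str.maketrans(string.punctuation, ' ' * len(string.punctuation))).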

Try the following:
import string
''.join(map(lambda x : '-' if x in string.punctuation else x,
'H,+-=/e^##%ll-!!..o'))

You could also use re.sub for this:
>>> from re import sub
>>> sub(r"\W", "-", "H,+-=/e^##%ll-!!..o")
'H-----e----ll-----o'
>>>
\W matches any non-word character.
Note that the above code will keep underscores, because _ counts as a word character. If you don't want them, replace \W with [\W_].

Related

Replacing unknown characters in a string Python 2.7

How can I define a set of legal characters (in a list or a string) and have any other character replaced with, let's say, a '?' character?
Example:
strinput = "abcdefg#~"
legal = '.,/?~abcdefg' #legal characters
while i not in legal:
#Turn i into '?'
print output
Put the legal characters in a set then use in to test each character of the string. Construct the new string using the str.join() method and a conditional expression.
>>> s = "test.,/?~abcdefgh"
>>> legal = set('.,/?~abcdefg')
>>> s = ''.join(char if char in legal else '?' for char in s)
>>> s
'?e??.,/?~abcdefg?'
>>>
If this is a large file, read it in chunks and apply re.sub() as below. A ^ at the start of a character class (square brackets) stands for negation (similar to saying "anything other than").
>>> import re
>>> char = '.,/?~abcdefg'
>>> re.sub(r'[^' + char +']', '?', "test.,/?~abcdefgh")
'?e??.,/?~abcdefg?'
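If the legal characters may include ones that are special inside a character class (such as -, ] or ^), escaping them first is safer. Here is a minimal Python 3 sketch of that idea. The helper name mask_illegal and the extended legal string are assumptions made for illustration.

```python
import re

# Hypothetical legal set that also contains characters which are
# special inside a regex character class.
legal = '.,/?~abcdefg-]^'

# re.escape makes every character literal before it goes into [^...].
pattern = re.compile('[^' + re.escape(legal) + ']')

def mask_illegal(s, placeholder='?'):
    """Replace every character not in `legal` with `placeholder`."""
    return pattern.sub(placeholder, s)

print(mask_illegal('test.,/?~abcdefgh'))
```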

Getting rid of certain characters in a string in python

I have characters in the middle of a string that I want to get rid of. These characters are =, p,, and H. Since they are not the leftmost and the rightmost characters in the string, I cannot use strip(). Is there a function that gets rid of a certain character in any location in a string?
In Python 2, the usual tool for this job is str.translate:
https://docs.python.org/2/library/stdtypes.html#str.translate
>>> 'hello=potato'.translate(None, '=p')
'hellootato'
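The two-argument form above only works on Python 2 byte strings. On Python 3, a rough equivalent is to pass the characters to delete as the third argument of str.maketrans. The names delete_map and cleaned are illustrative.

```python
# Python 3: the third argument of str.maketrans lists the characters
# that translate() should delete.
delete_map = str.maketrans('', '', '=p')

cleaned = 'hello=potato'.translate(delete_map)
print(cleaned)  # hellootato
```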
Check the .replace() method:
>>> 'aaba'.replace('a','').replace('b','')
''
My usual tool for this is the regular expression.
>>> import re
>>> invalidCharacters = r'[=p H]'
>>> mystring = re.sub(invalidCharacters, '', ' poH==hHoPPp p')
>>> mystring
'ohoPP'
If you need to limit the number of characters you remove, see the count argument of re.sub.

How to replace characters in string by the next one?

I would like to replace every character of a string with the next one and the last should become first. Here is an example:
abcdefghijklmnopqrstuvwxyz
should become:
bcdefghijklmnopqrstuvwxyza
Is it possible to do it without using the replace function 26 times?
You can use the str.translate() method to have Python replace characters by other characters in one step.
Use the string.maketrans() function to map ASCII characters to their targets; using string.ascii_lowercase can help here as it saves you typing all the letters yourself:
from string import ascii_lowercase
try:
    # Python 2
    from string import maketrans
except ImportError:
    # Python 3 made maketrans a static method
    maketrans = str.maketrans
cipher_map = maketrans(ascii_lowercase, ascii_lowercase[1:] + ascii_lowercase[:1])
encrypted = text.translate(cipher_map)
Demo:
>>> from string import maketrans
>>> from string import ascii_lowercase
>>> cipher_map = maketrans(ascii_lowercase, ascii_lowercase[1:] + ascii_lowercase[:1])
>>> text = 'the quick brown fox jumped over the lazy dog'
>>> text.translate(cipher_map)
'uif rvjdl cspxo gpy kvnqfe pwfs uif mbaz eph'
Sure, just use string slicing:
>>> s = "abcdefghijklmnopqrstuvwxyz"
>>> s[1:] + s[:1]
'bcdefghijklmnopqrstuvwxyza'
Basically, the operation you want to do is analogous to rotating the position of the characters by one place to the left. So, we can simply take the part of string after the first character, and add the first character to it.
EDIT: I assumed that the OP is asking to rotate a string. That is plausible from the given input: the string has 26 characters, and the OP might have been doing a manual replace for each character. If the post is about creating a cipher, please check @Martjin's answer above.
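Because strings in Python are immutable, you need to build the new characters in a list and then join them back into a string. Here I use modulo so that the last position wraps around to the first.
def convert(text):
    # For each index, take the character one position to the right, wrapping at the end
    new_list = [text[(i + 1) % len(text)] for i in range(len(text))]
    return "".join(new_list)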
Avoid changing the string one character at a time (for example with 26 separate replace calls): because strings are immutable, Python creates a new full copy of the string for every changed character.
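For the plain rotation reading of the question, the operation can be generalized to any offset. The following is a small Python 3 sketch. rotate_left is a hypothetical helper name, and it assumes the goal is to rotate positions rather than to build a cipher.

```python
def rotate_left(s, n=1):
    """Return s rotated n positions to the left (hypothetical helper)."""
    if not s:
        return s          # nothing to rotate, and avoids modulo by zero
    n %= len(s)           # offsets larger than the string wrap around
    return s[n:] + s[:n]  # two slices and one concatenation

print(rotate_left('abcdefghijklmnopqrstuvwxyz'))
```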

Understanding string method strip

After initializing a variable x with the content shown below, I applied strip with a parameter. The result of strip is unexpected: when I try to strip "ios_static_analyzer/", "rity/ios_static_analyzer/" gets stripped.
Kindly help me understand why this is so.
>>> print x
/Users/msecurity/Desktop/testspace/Hy5_Workspace/security/ios_static_analyzer/
>>> print x.strip()
/Users/msecurity/Desktop/testspace/Hy5_Workspace/security/ios_static_analyzer/
>>> print x.strip('/')
Users/msecurity/Desktop/testspace/Hy5_Workspace/security/ios_static_analyzer
>>> print x.strip('ios_static_analyzer/')
Users/msecurity/Desktop/testspace/Hy5_Workspace/secu
>>> print x.strip('analyzer/')
Users/msecurity/Desktop/testspace/Hy5_Workspace/security/ios_static_
>>> print x.strip('_analyzer/')
Users/msecurity/Desktop/testspace/Hy5_Workspace/security/ios_static
>>> print x.strip('static_analyzer/')
Users/msecurity/Desktop/testspace/Hy5_Workspace/security/io
>>> print x.strip('_static_analyzer/')
Users/msecurity/Desktop/testspace/Hy5_Workspace/security/io
>>> print x.strip('s_static_analyzer/')
Users/msecurity/Desktop/testspace/Hy5_Workspace/security/io
>>> print x.strip('os_static_analyzer/')
Users/msecurity/Desktop/testspace/Hy5_Workspace/secu
Quoting from the str.strip docs:
Return a copy of the string with the leading and trailing characters
removed. The chars argument is a string specifying the set of
characters to be removed. If omitted or None, the chars argument
defaults to removing whitespace. The chars argument is not a prefix or
suffix; rather, all combinations of its values are stripped:
So, it removes all the characters in the parameter, from both the sides of the string.
For example,
my_str = "abcd"
print my_str.strip("da") # bc
Note: You can think of it like this: it stops removing characters from each end of the string as soon as it finds a character that is not in the argument string.
To actually remove a particular string, you should use str.replace:
x = "/Users/msecurity/Desktop/testspace/Hy5_Workspace/security/ios_static_analyzer/"
print x.replace('analyzer/', '')
# /Users/msecurity/Desktop/testspace/Hy5_Workspace/security/ios_static_
But replace will remove the matches everywhere:
x = "abcd1abcd2abcd"
print x.replace('abcd', '') # 12
But if you want to remove words only at the beginning and end of the string, you can use a regular expression, like this:
import re
pattern = re.compile("^{0}|{0}$".format("abcd"))
x = "abcd1abcd2abcd"
print pattern.sub("", x) # 1abcd2
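When only a known trailing string should go, a regex is not strictly required. A minimal Python 3 sketch using str.endswith and slicing follows. strip_suffix is a hypothetical helper name, and on Python 3.9 or later the built-in str.removesuffix does the same job.

```python
def strip_suffix(s, suffix):
    """Remove `suffix` from the end of `s` once, if present (hypothetical helper)."""
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s

x = '/Users/msecurity/Desktop/testspace/Hy5_Workspace/security/ios_static_analyzer/'
print(strip_suffix(x, 'ios_static_analyzer/'))
```

Unlike strip, this treats the argument as one substring, so 'security/' is left untouched.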
What you need, I think, is replace:
>>> x.replace('ios_static_analyzer/','')
'/Users/msecurity/Desktop/testspace/Hy5_Workspace/security/'
string.replace(s, old, new[, maxreplace])
Return a copy of string s with all occurrences of substring old replaced by new.
So you can replace your string with nothing and get the desired output.
x.strip(s) removes from the beginning and the end of the string x any character appearing in s. So s is treated as a set of characters, not as a substring to match.
string.strip removes a set of characters given as an argument. The chars argument is not a prefix or suffix; rather, all combinations of its values are stripped.
strip does not remove the string given as argument from the object; it removes the characters in the argument.
In this case, strip sees the string s_static_analyzer/ as an iterable of characters that needs to be stripped.

Using parentheses as delimiter in re or str.split() python

I am trying to split a string such as: add(ten)sub(one) into add(ten) sub(one).
I can't figure out how to match the closing parenthesis. I have used re.sub(r'\\)', '\\) ') and every variation of escaping the parentheses I can think of. It is hard to tell in this font, but I am trying to add a space between these commands so I can split it into a list later.
There's no need to escape ) in the replacement string. ) has a special meaning only in the regex pattern, so it needs to be escaped there in order to match it in the string, but in a normal string it can be used as is.
>>> import re
>>> strs = "add(ten)sub(one)"
>>> re.sub(r'\)(?=\S)',r') ', strs)
'add(ten) sub(one)'
As @StevenRumbalski pointed out in the comments, the above operation can be done more simply using str.replace and str.rstrip:
>>> strs.replace(')',') ').rstrip()
'add(ten) sub(one)'
Another option is to split on the closing parenthesis and add it back to each piece:
d = ')'
my_str = 'add(ten)sub(one)'
result = [t + d for t in my_str.split(d) if len(t) > 0]
# result == ['add(ten)', 'sub(one)']
Create a list of all substrings:
import re
a = 'add(ten)sub(one)'
print re.findall(r'(.+?\(.+?\))', a)
Output:
['add(ten)', 'sub(one)']
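The answers above use Python 2 print statements. A Python 3 sketch of the findall approach could look like this. split_commands is a hypothetical helper name, and the pattern assumes that parentheses are never nested.

```python
import re

def split_commands(s):
    """Split 'name(arg)name(arg)...' into its calls (no nested parentheses)."""
    # One or more non-parenthesis characters, then a parenthesized argument.
    return re.findall(r'[^()]+\([^()]*\)', s)

commands = split_commands('add(ten)sub(one)')
print(commands)            # the individual commands as a list
print(' '.join(commands))  # the space-separated form from the question
```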
