Understanding strip() - python

I am trying the strip() method on this string but it doesn't give the desired output.
s = 'www.yahoo.com'
s = s.rstrip('.com')
print(s) # The desired output is 'www.yahoo' but this is showing 'www.yah'
Along with the solution please provide the reason for current output.

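str.strip('.com') treats its argument as a set of characters (., c, o, m) and removes any of them from the beginning and the end of the string; it does not remove the substring .com. rstrip does the same from the right only, so on 'www.yahoo.com' it strips m, o, c, ., o, o and stops at h, leaving 'www.yah'.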
To remove .com, use str.replace.
>>> s = 'www.yahoo.com'
>>> s.replace('.com', '') # Replace `.com` with empty string.
'www.yahoo'
UPDATE
As Marcin Fabrykowski and David Zwicker pointed out, the above solution turns www.company.com into wwwpany, because str.replace removes every occurrence of .com, not only the trailing one.
To address that, you can use Marcin Fabrykowski's solution, or a regular expression:
>>> import re
>>> re.sub(r'\.com$', '', 'www.company.com')
'www.company'
>>> re.sub(r'\.com$', '', 'www.company.com.com')
'www.company.com'
>>> re.sub(r'(\.com)+$', '', 'www.company.com.com') # To remove multiple trailings.
'www.company'
\.com$ matches .com at the end of the string ($). The . is escaped because an unescaped . has a special meaning in a regular expression (match any character).
NOTE I used r'raw string literal'; r'\.com' == '\\.com'
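To make the character-set behaviour concrete, here is a minimal sketch of what rstrip does with its argument. The helper name rstrip_chars is hypothetical and only mimics the built-in for illustration:

```python
def rstrip_chars(text, chars):
    """Mimic str.rstrip(chars): drop trailing characters while each one is in `chars`."""
    end = len(text)
    while end > 0 and text[end - 1] in chars:
        end -= 1
    return text[:end]

s = 'www.yahoo.com'
print(rstrip_chars(s, '.com'))  # www.yah
print(s.rstrip('moc.'))         # www.yah (the order of the characters does not matter)
```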

You can try:
if x.endswith('.com'): print(x[:-4])
because str.replace removes every occurrence of .com, not just the trailing one:
x = "www.computers.com"
print(x.replace('.com',''))
wwwputers
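The endswith check above can be wrapped in a small reusable function. This is a sketch, and the name strip_suffix is hypothetical; on Python 3.9 and later the built-in str.removesuffix does the same job.

```python
def strip_suffix(text, suffix):
    """Return text without the trailing suffix, or text unchanged if it does not end with it."""
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text

print(strip_suffix('www.yahoo.com', '.com'))      # www.yahoo
print(strip_suffix('www.computers.com', '.com'))  # www.computers
print(strip_suffix('www.yahoo.org', '.com'))      # www.yahoo.org
```

The empty-suffix guard matters because text[:-0] would otherwise return an empty string.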

Related

Regex : replace url inside string

I have
string = 'Server:xxx-zzzzzzzzz.eeeeeeeeeee.frPIPELININGSIZE'
I need a Python regular expression that identifies xxx-zzzzzzzzz.eeeeeeeeeee.fr so that I can apply a substring operation to it.
Expected output:
string : 'Server:PIPELININGSIZE'
The URL is inside a string. I have tried a lot of regular expressions.
Not sure if this helps, because your question was quite vaguely formulated. :)
import re
string = 'Server:xxx-zzzzzzzzz.eeeeeeeeeee.frPIPELININGSIZE'
string_1 = re.search('[a-z.-]+([A-Z]+)', string).group(1)
print(f'string: Server:{string_1}')
Output:
string: Server:PIPELININGSIZE
No regex needed. Just split on the target substring:
string = 'Server:xxx-zzzzzzzzz.eeeeeeeeeee.frPIPELININGSIZE'
last = string.split("fr", 1)[1]
first = string[:string.index(":")]
print(f'{first}:{last}')
Gives:
Server:PIPELININGSIZE
The wording of the question suggests that you wish to find the hostname in the string, but the expected output suggests that you want to remove it. The following regular expression will create a tuple and allow you to do either.
import re
# Named text rather than str, to avoid shadowing the built-in type.
text = "Server:xxx-zzzzzzzzz.eeeeeeeeeee.frPIPELININGSIZE"
p = re.compile(r'^([A-Za-z]+:)(.*?)([A-Z]+)$')
m = p.search(text)
result = m.groups()
# ('Server:', 'xxx-zzzzzzzzz.eeeeeeeeeee.fr', 'PIPELININGSIZE')
Remove the hostname:
print(f'{result[0]} {result[2]}')
# Output: 'Server: PIPELININGSIZE'
Extract the hostname:
print(result[1])
# Output: 'xxx-zzzzzzzzz.eeeeeeeeeee.fr'
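If the goal is only to remove the hostname and get the expected output with no extra space, a single re.sub may be enough. This is a sketch that assumes the hostname follows the first colon and consists only of lowercase letters, digits, dots and hyphens, while the keyword after it starts with an uppercase letter. The variable names are illustrative.

```python
import re

line = 'Server:xxx-zzzzzzzzz.eeeeeeeeeee.frPIPELININGSIZE'

# (?<=:) requires a colon just before the match without consuming it;
# the character class then consumes the lowercase hostname and stops at 'P'.
cleaned = re.sub(r'(?<=:)[a-z0-9.-]+', '', line, count=1)
print(cleaned)  # Server:PIPELININGSIZE
```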

Remove brackets and number inside from string Python

I've seen a lot of examples on how to remove brackets from a string in Python, but I've not seen any that allow me to remove the brackets and a number inside of the brackets from that string.
For example, suppose I've got a string such as "abc[1]". How can I remove the "[1]" from the string to return just "abc"?
I've tried the following:
stringTest = "abc[1]"
stringTestWithoutBrackets = str(stringTest).strip('[]')
but this only outputs the string without the final bracket
abc[1
I've also tried with a wildcard option:
stringTest = "abc[1]"
stringTestWithoutBrackets = str(stringTest).strip('[\w+\]')
but this also outputs the string without the final bracket
abc[1
You could use regular expressions for that, but I think the easiest way would be to use split:
>>> stringTest = "abc[1][2][3]"
>>> stringTest.split('[', maxsplit=1)[0]
'abc'
You can use regex but you need to use it with the re module:
re.sub(r'\[\d+\]', '', stringTest)
If the [<number>] part is always at the end of the string you can also strip via:
stringTest.rstrip('[0123456789]')
Though the latter version might strip beyond the [ if the previous character is in the strip list too. For example in "abc1[5]" the "1" would be stripped as well.
Assuming your string has the format "text[number]" and you only want to keep the "text", then you could do:
stringTest = "abc[1]"
bracketBegin = stringTest.find('[')
stringTestWithoutBrackets = stringTest[:bracketBegin]
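To avoid the over-stripping problem of rstrip mentioned above, the regular expression can be anchored to the end of the string. A minimal sketch, assuming the bracketed parts contain only digits; the function name drop_trailing_indices is hypothetical:

```python
import re

def drop_trailing_indices(text):
    """Remove one or more trailing '[<digits>]' groups; leave everything else intact."""
    return re.sub(r'(\[\d+\])+$', '', text)

print(drop_trailing_indices('abc[1]'))        # abc
print(drop_trailing_indices('abc1[5]'))       # abc1
print(drop_trailing_indices('abc[1][2][3]'))  # abc
print(drop_trailing_indices('a[1]b[2]'))      # a[1]b
```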

Strip removing more characters than expected

Can anyone explain what's going on here:
s = 'REFPROP-MIX:METHANOL&WATER'
s.lstrip('REFPROP-MIX') # this returns ':METHANOL&WATER' as expected
s.lstrip('REFPROP-MIX:') # returns 'THANOL&WATER'
What happened to that 'ME'? Is a colon a special character for lstrip? This is particularly confusing because this works as expected:
s = 'abc-def:ghi'
s.lstrip('abc-def') # returns ':ghi'
s.lstrip('abc-def:') # returns 'ghi'
str.lstrip treats its argument as a set of characters and removes leading characters from the string for as long as each one is in that set. Since all the characters in the left prefix "REFPROP-MIX:ME" are in the argument "REFPROP-MIX:", all those characters are removed. Likewise:
>>> s = 'abcadef'
>>> s.lstrip('abc')
'def'
>>> s.lstrip('cba')
'def'
>>> s.lstrip('bacabacabacabaca')
'def'
str.lstrip does not remove whole strings (of length greater than 1) from the left. If you want to do that, use a regular expression with an anchor ^ at the beginning:
>>> import re
>>> s = 'REFPROP-MIX:METHANOL&WATER'
>>> re.sub(r'^REFPROP-MIX:', '', s)
'METHANOL&WATER'
The method mentioned by @PadraicCunningham is a good workaround for the particular problem as stated.
Just split by the separating character and select the last value:
s = 'REFPROP-MIX:METHANOL&WATER'
res = s.split(':', 1)[-1] # 'METHANOL&WATER'
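When the prefix is known in advance, a plain startswith check removes it as a whole string without a regular expression. This is a sketch with a hypothetical helper name, strip_prefix; on Python 3.9 and later str.removeprefix is the built-in equivalent.

```python
def strip_prefix(text, prefix):
    """Return text without the leading prefix, or text unchanged if it does not start with it."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text

s = 'REFPROP-MIX:METHANOL&WATER'
print(strip_prefix(s, 'REFPROP-MIX:'))  # METHANOL&WATER
print(s.lstrip('REFPROP-MIX:'))         # THANOL&WATER (character-set stripping, for comparison)
```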

Getting rid of certain characters in a string in python

I have characters in the middle of a string that I want to get rid of. These characters are =, p,, and H. Since they are not the leftmost and the rightmost characters in the string, I cannot use strip(). Is there a function that gets rid of a certain character in any location in a string?
The usual tool for this job is str.translate. The two-argument form below is the Python 2 signature:
https://docs.python.org/2/library/stdtypes.html#str.translate
>>> 'hello=potato'.translate(None, '=p')
'hellootato'
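On Python 3 the same idea goes through str.maketrans, whose third argument lists the characters to delete. A minimal sketch; the variable name table is illustrative:

```python
# Build a translation table that maps nothing but deletes '=', 'p' and 'H'.
table = str.maketrans('', '', '=pH')

print('hello=potato'.translate(table))  # hellootato
```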
Check the .replace() method, which can be chained to remove several characters:
>>> 'aaba'.replace('a','').replace('b','')
''
My usual tool for this is the regular expression.
>>> import re
>>> invalidCharacters = r'[=p H]'
>>> mystring = re.sub(invalidCharacters, '', ' poH==hHoPPp p')
>>> mystring
'ohoPP'
If you need to limit how many characters are removed, see the count argument of re.sub.
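As an illustration of the count argument, the sketch below removes only the first two matching characters. The sample text is made up for the example:

```python
import re

pattern = r'[=pH]'
text = 'a=b=c=d'

print(re.sub(pattern, '', text))           # abcd
print(re.sub(pattern, '', text, count=2))  # abc=d
```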

Using parentheses as delimiter in re or str.split() python

I am trying to split a string such as: add(ten)sub(one) into add(ten) sub(one).
I can't figure out how to match the closing parenthesis. I have used re.sub(r'\\)', '\\) ') and every variation of escaping the parentheses I can think of. It is hard to tell in this font, but I am trying to add a space between these commands so I can split the string into a list later.
There's no need to escape ) in the replacement string. ) has a special meaning only in the regex pattern, so it needs to be escaped there in order to match it literally, but in a normal string it can be used as is.
>>> import re
>>> strs = "add(ten)sub(one)"
>>> re.sub(r'\)(?=\S)',r') ', strs)
'add(ten) sub(one)'
As @StevenRumbalski pointed out in the comments, the above operation can be done more simply using str.replace and str.strip:
>>> strs.replace(')',') ').strip()
'add(ten) sub(one)'
d = ')'
my_str = 'add(ten)sub(one)'
# Split on ')' and re-append it to every non-empty piece.
result = [t + d for t in my_str.split(d) if len(t) > 0]
# result == ['add(ten)', 'sub(one)']
Create a list of all substrings
import re
a = 'add(ten)sub(one)'
print(re.findall(r'(.+?\(.+?\))', a))
Output:
['add(ten)', 'sub(one)']
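A slightly stricter variant of the findall approach excludes parentheses from the name and the argument, which also allows empty argument lists. This is a sketch that assumes the calls are not nested; split_calls is a hypothetical name.

```python
import re

def split_calls(text):
    """Return each name(args) unit: non-parenthesis characters followed by one parenthesized group."""
    return re.findall(r'[^()]+\([^()]*\)', text)

print(split_calls('add(ten)sub(one)'))  # ['add(ten)', 'sub(one)']
print(split_calls('noop()add(ten)'))    # ['noop()', 'add(ten)']
```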
