str.strip() strange behavior [duplicate] - python

This question already has answers here:
How do the .strip/.rstrip/.lstrip string methods work in Python?
(4 answers)
Closed 28 days ago.
>>> t1 = "abcd.org.gz"
>>> t1
'abcd.org.gz'
>>> t1.strip("g")
'abcd.org.gz'
>>> t1.strip("gz")
'abcd.org.'
>>> t1.strip(".gz")
'abcd.or'
Why is the 'g' of '.org' gone?

strip(".gz") removes any of the characters ., g and z from the beginning and end of the string.
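One quick way to check this is to reorder or repeat the characters in the argument: since it is treated as a set, the result does not change. The sample strings below are only for illustration.

```python
t1 = "abcd.org.gz"

# The argument is a set of characters, so order and duplicates do not matter.
print(t1.strip(".gz"))     # abcd.or
print(t1.strip("zg."))     # abcd.or
print(t1.strip("..ggzz"))  # abcd.or

# Stripping stops at the first character outside the set, so a 'g' in the
# middle of the string survives while the ones at the ends are removed.
print("gorge.gz".strip(".gz"))  # orge
```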

x.strip(y) will remove all characters that appear in y from the beginning and end of x.
That means
'foo42'.strip('1234567890') == 'foo'
because '4' and '2' both appear in '1234567890'.
Use os.path.splitext if you want to remove the file extension.
>>> import os.path
>>> t1 = "abcd.org.gz"
>>> os.path.splitext(t1)
('abcd.org', '.gz')
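A small sketch of how os.path.splitext behaves on a few inputs, assuming you only care about the text after the last dot. It splits off one extension per call, and returns an empty extension when there is none.

```python
import os.path

name = "abcd.org.gz"
root, ext = os.path.splitext(name)
print(root)  # abcd.org
print(ext)   # .gz

# Only the last extension is split off; call it again to peel another one.
print(os.path.splitext(root))      # ('abcd', '.org')

# A name without a dot comes back unchanged, with an empty extension.
print(os.path.splitext("README"))  # ('README', '')
```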

Python 3.9 added two new string methods, .removeprefix() and .removesuffix(), which remove a given prefix or suffix from a string. Thankfully, this time the method names make it clear what the methods do.
>>> import sys
>>> print(sys.version)
3.9.0
>>> t1 = "abcd.org.gz"
>>> t1.removesuffix('gz')
'abcd.org.'
>>> t1
'abcd.org.gz'
>>> t1.removesuffix('gz').removesuffix('.gz')
'abcd.org.' # No unexpected effect from last removesuffix call
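On interpreters older than 3.9 the same behavior can be approximated with endswith and a slice. The helper name remove_suffix below is hypothetical, not part of the standard library; it is a minimal sketch of the idea.

```python
def remove_suffix(text, suffix):
    """Hypothetical stand-in for str.removesuffix on Python < 3.9."""
    # The 'suffix and' guard matters: text[:-0] would return an empty string.
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text

t1 = "abcd.org.gz"
print(remove_suffix(t1, ".gz"))  # abcd.org
print(remove_suffix(t1, "gz"))   # abcd.org.
print(remove_suffix(t1, ".zip")) # abcd.org.gz
print(remove_suffix(t1, ""))     # abcd.org.gz
```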

The argument given to strip is a set of characters to be removed, not a substring. From the docs:
The chars argument is a string specifying the set of characters to be removed.

As far as I know, strip removes characters only from the beginning or end of a string. If you want to remove something from the whole string, use replace.
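For comparison, here is what replace does with the same string. Note that it works on substrings and removes every occurrence, not only those at the ends; the second sample string is made up for illustration.

```python
t1 = "abcd.org.gz"

# replace() matches the substring anywhere in the string.
print(t1.replace("g", ""))    # abcd.or.z
print(t1.replace(".gz", ""))  # abcd.org

# Every occurrence is removed, which may be more than you want.
print("a.gz.b.gz".replace(".gz", ""))  # a.b
```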

Related

Is there any bug in Python strip() function? [duplicate]

This question already has answers here:
How to use text strip() function?
(2 answers)
Closed 5 years ago.
Creating two strings:
s1 = "sha1:abcd"
s2 = "sha1:wxyz"
Applying .strip() function on both strings:
s1.strip("sha1:")
>>> 'bcd'
s2.strip("sha1:")
>>> 'wxyz'
I expected the following output:
s1.strip("sha1:")
>>> 'abcd'
s2.strip("sha1:")
>>> 'wxyz'
I am aware that strip() function is deprecated. I am just curious to know the issue. I went through official docs, but found no special mentions about ":a" or anything like that.
And also I am aware of other alternatives, we can use split("sha1:") or strip("sha1") followed by strip(":"), gives the desired output.
From help(str.strip):
strip(...)
S.strip([chars]) -> str
Return a copy of the string S with leading and trailing
whitespace removed.
If chars is given and not None, remove characters in chars instead.
Note the wording: "characters in chars", not the string chars as a whole.
Explained in detail in the documentation.
Here is a counter example showing the actual intention of strip:
s1 = "sha1:abcds"
s2 = "sha1:wxyzs"
print(s1.strip("sha1:"))
print(s2.strip("sha1:"))
Output:
bcd
wxyz
strip() removes the characters supplied in its parameter, whether they are found at the start or at the end of the target.
It strips all of the characters s, h, a, 1 and : from the beginning and end of the string, in any order, until it reaches a character that is not in that set.
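If the goal is to separate the "sha1:" label from the digest, splitting on the first colon avoids the character-set behavior entirely. The helper name split_digest is hypothetical; this is only a sketch using str.partition.

```python
def split_digest(value):
    """Hypothetical helper: split 'algo:digest' on the first colon."""
    algo, sep, digest = value.partition(":")
    if not sep:
        # No colon at all: treat the whole value as the digest.
        return "", value
    return algo, digest

print(split_digest("sha1:abcd"))  # ('sha1', 'abcd')
print(split_digest("sha1:wxyz"))  # ('sha1', 'wxyz')
print(split_digest("abcd"))       # ('', 'abcd')
```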

Strip removing more characters than expected

Can anyone explain what's going on here:
s = 'REFPROP-MIX:METHANOL&WATER'
s.lstrip('REFPROP-MIX') # this returns ':METHANOL&WATER' as expected
s.lstrip('REFPROP-MIX:') # returns 'THANOL&WATER'
What happened to that 'ME'? Is a colon a special character for lstrip? This is particularly confusing because this works as expected:
s = 'abc-def:ghi'
s.lstrip('abc-def') # returns ':ghi'
s.lstrip('abc-def:') # returns 'ghi'
str.lstrip removes all the characters in its argument from the string, starting at the left. Since all the characters in the left prefix "REFPROP-MIX:ME" are in the argument "REFPROP-MIX:", all those characters are removed. Likewise:
>>> s = 'abcadef'
>>> s.lstrip('abc')
'def'
>>> s.lstrip('cba')
'def'
>>> s.lstrip('bacabacabacabaca')
'def'
str.lstrip does not remove whole strings (of length greater than 1) from the left. If you want to do that, use a regular expression with an anchor ^ at the beginning:
>>> import re
>>> s = 'REFPROP-MIX:METHANOL&WATER'
>>> re.sub(r'^REFPROP-MIX:', '', s)
'METHANOL&WATER'
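If the prefix comes from a variable, it may contain characters that are special in regular expressions (such as '.' or '+'). A minimal sketch that escapes it first; the function name strip_prefix_re is made up for this example.

```python
import re

def strip_prefix_re(text, prefix):
    """Hypothetical helper: remove a literal prefix using an anchored regex."""
    # re.escape makes characters such as '.', '+' or '(' match literally.
    return re.sub("^" + re.escape(prefix), "", text)

print(strip_prefix_re("REFPROP-MIX:METHANOL&WATER", "REFPROP-MIX:"))  # METHANOL&WATER
print(strip_prefix_re("a.b:rest", "a.b:"))  # rest
print(strip_prefix_re("axb:rest", "a.b:"))  # axb:rest
```

Without re.escape, the '.' in "a.b:" would match any character and the last call would also strip the prefix.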
The method mentioned by @PadraicCunningham is a good workaround for the particular problem as stated.
Just split by the separating character and select the last value:
s = 'REFPROP-MIX:METHANOL&WATER'
res = s.split(':', 1)[-1] # 'METHANOL&WATER'
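The maxsplit argument of 1 and the index -1 each matter here. A short sketch of how they behave on a few made-up inputs:

```python
s = "REFPROP-MIX:METHANOL&WATER"
print(s.split(":", 1)[-1])  # METHANOL&WATER

# With maxsplit=1 everything after the first colon is kept together.
print("a:b:c".split(":", 1)[-1])  # b:c
# Without it, only the part after the last colon is returned.
print("a:b:c".split(":")[-1])     # c

# If the separator is missing, index -1 still works and returns the input.
print("no-separator".split(":", 1)[-1])  # no-separator
```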

Select last chars of string until whitespace in Python [duplicate]

This question already has answers here:
Python: Cut off the last word of a sentence?
(10 answers)
Closed 8 years ago.
Is there any efficient way to select the last characters of a string until there's a whitespace in Python?
For example I have the following string:
str = 'Hello my name is John'
I want to return 'John'. But if the str was:
str = 'Hello my name is Sally'
I want to return 'Sally'
Just split the string on whitespace, and get the last element of the array. Or use rsplit() to start splitting from end:
>>> st = 'Hello my name is John'
>>> st.rsplit(' ', 1)
['Hello my name is', 'John']
>>>
>>> st.rsplit(' ', 1)[1]
'John'
The 2nd argument specifies the maximum number of splits to do. Since you just want the last element, we only need to split once.
As specified in comments, you can just pass None as 1st argument, in which case the default delimiter which is whitespace will be used:
>>> st.rsplit(None, 1)[-1]
'John'
Using -1 as index is safe, in case there is no whitespace in your string.
It really depends what you mean by efficient, but the simplest (efficient use of programmer time) way I can think of is:
str.split()[-1]
This fails for empty strings, so you'll want to check that.
I think this is what you want:
str[str.rfind(' ')+1:]
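This creates a substring of str starting at the character after the right-most space and running to the end of the string. If there is no space, rfind returns -1, so the slice starts at index 0 and the whole string is returned.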
This works for all strings - empty or otherwise (unless it's not a string object, e.g. a None object would throw an error)
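The approaches above differ on edge cases such as empty strings and trailing spaces. Here is a small sketch that guards against them; the function name last_word is hypothetical.

```python
def last_word(text):
    """Hypothetical helper: last whitespace-separated word, or '' if none."""
    words = text.split()  # splits on any run of whitespace
    return words[-1] if words else ""

print(last_word("Hello my name is John"))   # John
print(last_word("Hello my name is Sally"))  # Sally
print(repr(last_word("")))                  # ''
print(repr(last_word("   ")))               # ''
print(last_word("John  "))                  # John
```

By contrast, the rfind-based slice returns an empty string for "John  " because the right-most space is the last character.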

Removing a prefix from a string [duplicate]

This question already has answers here:
Remove a prefix from a string [duplicate]
(6 answers)
Closed 6 months ago.
Trying to strip the "0b1" from the left end of a binary number.
The following code results in stripping all of binary object. (not good)
>>> bbn = '0b1000101110100010111010001'  # converted: bin(2**24 + 2**24//11)
>>> aan = bbn.lstrip("0b1")  # Try stripping all left-end junk at once.
>>> aan  # Oops, all gone.
''
So I did the .lstrip() in two steps:
>>> bbn = '0b1000101110100010111010001'  # Same fraction expansion
>>> aan = bbn.lstrip("0b")  # Had done this before.
>>> aan  # Extra "1" still there.
'1000101110100010111010001'
>>> aan = aan.lstrip("1")  # If at first you don't succeed...
>>> aan  # YES!
'000101110100010111010001'
What's the deal?
Thanks again for solving this in one simple step. (see my previous question)
The strip family of methods treats the argument as a set of characters to be removed. The default set is "all whitespace characters".
You want:
if strg.startswith("0b1"):
    strg = strg[3:]
No. Stripping removes all characters in the sequence passed, not just the literal sequence. Slice the string if you want to remove a fixed length.
In Python 3.9 you can use bbn.removeprefix('0b1').
(Actually this question has been mentioned as part of the rationale in PEP 616.)
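For Python versions before 3.9, the startswith-and-slice idea can be wrapped in a small function so the length is not hard-coded. The name remove_prefix is hypothetical; this is a sketch, not the standard library implementation.

```python
def remove_prefix(text, prefix):
    """Hypothetical helper: drop prefix once if text starts with it."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text

bbn = "0b1000101110100010111010001"
print(remove_prefix(bbn, "0b1"))  # 000101110100010111010001
print(remove_prefix(bbn, "0x"))   # 0b1000101110100010111010001
```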
This is the way lstrip works. It removes any of the characters in the parameter, not necessarily the string as a whole. In the first example, since the input consisted of only those characters, nothing was left.
Lstrip is removing any of the characters in the string. So, as well as the initial 0b1, it is removing all zeros and all ones. Hence it is all gone!
@Harryooo: lstrip only takes characters off the left-hand end. So, because there's only one 1 before the first 0, it removes just that one. If the number started 0b11100101..., calling a.lstrip('0b').lstrip('1') would remove the first three ones, so you'd be left with 00101...
>>> i = 0b1000101110100010111010001
>>> print(bin(i))
0b1000101110100010111010001
>>> print(format(i, '#b'))
0b1000101110100010111010001
>>> print(format(i, 'b'))
1000101110100010111010001
From the standard documentation (see the entry for the built-in function bin()):
bin(x)
Convert an integer number to a binary string prefixed with “0b”. The result is a valid Python expression. If x is not a Python int object, it has to define an __index__() method that returns an integer. Some examples:
>>> bin(3)
'0b11'
>>> bin(-10)
'-0b1010'
If prefix “0b” is desired or not, you can use either of the following ways.
>>> format(14, '#b'), format(14, 'b')
('0b1110', '1110')
>>> f'{14:#b}', f'{14:b}'
('0b1110', '1110')
See also format() for more information.
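Applied to the number from the question, this means the "0b" never has to be stripped at all, and the leading 1 can be dropped by position or by arithmetic. A minimal sketch, assuming the value is 2**24 plus a 24-bit fraction as in the question:

```python
i = 2**24 + 2**24 // 11

digits = format(i, "b")     # no "0b" prefix to strip
fraction_bits = digits[1:]  # drop exactly one leading digit
print(fraction_bits)        # 000101110100010111010001

# Same result arithmetically: clear the top bit and pad to 24 digits.
print(format(i - (1 << 24), "024b"))  # 000101110100010111010001
```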

Why does str.lstrip strip an extra character? [duplicate]

This question already has answers here:
How do I remove a substring from the end of a string?
(24 answers)
Closed 4 years ago.
>>> path = "/Volumes/Users"
>>> path.lstrip('/Volume')
's/Users'
>>> path.lstrip('/Volumes')
'Users'
>>>
I expected the output of path.lstrip('/Volumes') to be '/Users'
lstrip is character-based, it removes all characters from the left end that are in that string.
To verify this, try this:
"/Volumes/Users".lstrip("semuloV/") # also returns "Users"
Since / is part of the string, it is removed.
You need to use slicing instead:
if s.startswith("/Volumes"):
    s = s[8:]
Or, on Python 3.9+ you can use removeprefix:
s = s.removeprefix("/Volumes")
Strip is character-based. If you are trying to do path manipulation you should have a look at os.path
>>> os.path.split("/Volumes/Users")
('/Volumes', 'Users')
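As an alternative sketch for path manipulation, pathlib can express "the part of the path below /Volumes" directly. PurePosixPath is used here so the example behaves the same on any operating system; relative_to raises ValueError if the path is not under the given directory.

```python
from pathlib import PurePosixPath

path = PurePosixPath("/Volumes/Users")
print(path.parts)  # ('/', 'Volumes', 'Users')

# The path relative to /Volumes, without a leading slash.
rest = path.relative_to("/Volumes")
print(rest)  # Users

# Re-add the slash if the '/Users' form from the question is wanted.
print("/" + str(rest))  # /Users
```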
The argument passed to lstrip is taken as a set of characters!
>>> ' spacious '.lstrip()
'spacious '
>>> 'www.example.com'.lstrip('cmowz.')
'example.com'
See also the documentation
You might want to use str.replace()
str.replace(old, new[, count])
# e.g.
'/Volumes/Home'.replace('/Volumes', '', 1)
Return a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.
For paths, you may want to use os.path.split(). It returns a (head, tail) tuple, where tail is the last path component.
>>> os.path.split('/home/user')
('/home', 'user')
To your problem:
>>> path = "/vol/volume"
>>> path.lstrip('/vol')
'ume'
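The example above shows how lstrip() works. It does not remove '/vol' as a unit. It removes characters one at a time from the left for as long as each one is in the set '/', 'v', 'o', 'l', so the second '/vol' goes too and stripping stops at 'u'.
So, in your example, it removed '/Volumes' and then kept going: the next '/' is also in the set, so it was removed as well. It stopped at 'U', because an uppercase 'U' is not one of the characters in '/Volumes'.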
HTH
lstrip doc says:
Return a copy of the string S with leading whitespace removed.
If chars is given and not None, remove characters in chars instead.
If chars is unicode, S will be converted to unicode before stripping
So you are removing every character that is contained in the given string, including both 's' and '/' characters.
Here is a primitive version of lstrip (that I wrote) that might help clear things up for you:
def lstrip(s, chars):
    # Walk from the left until a character outside chars is found.
    for i in range(len(s)):
        if s[i] not in chars:
            return s[i:]
    # Every character was in chars (or s was empty).
    return ''
Thus, you can see that every occurrence of a character in chars is removed until a character that is not in chars is encountered. Once that happens, the deletion stops and the rest of the string is simply returned.
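The same "drop characters while they are in the set" idea can be written with itertools.dropwhile and checked against the built-in. The function name lstrip_chars is made up, and this is only an illustration of the behavior, not how CPython implements it.

```python
from itertools import dropwhile

def lstrip_chars(s, chars):
    """Illustrative re-implementation of s.lstrip(chars)."""
    # Skip leading characters while they are members of chars, keep the rest.
    return "".join(dropwhile(lambda c: c in chars, s))

samples = [("/Volumes/Users", "/Volumes"), ("/vol/volume", "/vol"), ("abc", "abc")]
for text, chars in samples:
    # The sketch agrees with the built-in on each sample.
    assert lstrip_chars(text, chars) == text.lstrip(chars)

print(lstrip_chars("/Volumes/Users", "/Volumes"))  # Users
print(lstrip_chars("/vol/volume", "/vol"))         # ume
```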
