Is there any bug in Python strip() function? [duplicate] - python

This question already has answers here:
How to use text strip() function?
(2 answers)
Closed 5 years ago.
Creating two strings:
s1 = "sha1:abcd"
s2 = "sha1:wxyz"
Applying .strip() function on both strings:
s1.strip("sha1:")
>>> 'bcd'
s2.strip("sha1:")
>>> 'wxyz'
I expected the following output:
s1.strip("sha1:")
>>> 'abcd'
s2.strip("sha1:")
>>> 'wxyz'
I am aware that strip() function is deprecated. I am just curious to know the issue. I went through official docs, but found no special mentions about ":a" or anything like that.
I am also aware of alternatives: split("sha1:"), or strip("sha1") followed by strip(":"), gives the desired output.

From the built-in help:
strip(...)
S.strip([chars]) -> str
Return a copy of the string S with leading and trailing
whitespace removed.
If chars is given and not None, remove characters in chars instead.
Note the wording "characters in chars": each character is considered individually, not the string as a whole.
This is explained in detail in the documentation.

Here is a counter example showing the actual intention of strip:
s1 = "sha1:abcds"
s2 = "sha1:wxyzs"
print(s1.strip("sha1:"))
print(s2.strip("sha1:"))
Output:
bcd
wxyz
strip() removes the characters supplied in its parameter wherever they are found at the start or end of the target.

It strips every occurrence of the characters s, h, a, 1 and : at the beginning and end of the string, not the substring "sha1:" as a whole.
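A minimal sketch of the point made in these answers: the argument to strip() behaves like a set, so its order does not matter, and removing a literal prefix needs a different approach. The helper name drop_prefix is hypothetical and only for illustration.

```python
s1 = "sha1:abcd"

# The argument is treated as a set of characters, so order is irrelevant.
print(s1.strip("sha1:"))  # bcd
print(s1.strip(":1ahs"))  # bcd


def drop_prefix(text, prefix):
    """Remove prefix once, as a whole substring (hypothetical helper)."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


print(drop_prefix("sha1:abcd", "sha1:"))  # abcd
print(drop_prefix("sha1:wxyz", "sha1:"))  # wxyz
```

Slicing by len(prefix) after a startswith() check leaves the string untouched when the prefix is absent.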

Related

Python: .strip() method is not working as expected [duplicate]

This question already has answers here:
How do the .strip/.rstrip/.lstrip string methods work in Python?
(4 answers)
Closed 2 years ago.
I have two strings:
my_str_1 = '200327_elb_72_ch_1429.csv'
my_str_2 = '200327_elb_10_ch_1429.csv'
When I call .strip() method on both of them I get results like this:
>>> print(my_str_1.strip('200327_elb_'))
ch_1429.csv
>>> print(my_str_2.strip('200327_elb_'))
10_ch_1429.csv
I expected the result of print(my_str_1.strip('200327_elb_')) to be '72_ch_1429.csv'. Why isn't that the case? Why aren't these two results consistent? What am I missing?
From the docs:
[...] The chars argument is a string specifying the set of characters to be removed. [...] The chars argument is not a prefix or suffix; rather, all combinations of its values are stripped [...]
This method removes all specified characters that appear at the left or right end of the original string, until a character is reached that is not specified. It does not remove leading/trailing substrings; it considers each character individually.
Clarifying example (from Jon Clements' comment); note that the a characters in the middle are NOT removed:
>>> 'aa3aa3aa'.strip('a')
'3aa3'
>>> 'a4a3aa3a5a'.strip('a54')
'3aa3'
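To see exactly which characters disappear from each end, here is a small sketch. The helper stripped_ends is a hypothetical name; it relies only on str.lstrip and str.rstrip and shows why "72_" vanishes from the first filename but "10_" survives in the second.

```python
def stripped_ends(text, chars):
    """Return (removed_left, kept, removed_right) for text.strip(chars)."""
    left_trimmed = text.lstrip(chars)
    kept = left_trimmed.rstrip(chars)
    removed_left = text[:len(text) - len(left_trimmed)]
    removed_right = left_trimmed[len(kept):]
    return removed_left, kept, removed_right


print(stripped_ends('200327_elb_72_ch_1429.csv', '200327_elb_'))
# ('200327_elb_72_', 'ch_1429.csv', '')
print(stripped_ends('200327_elb_10_ch_1429.csv', '200327_elb_'))
# ('200327_elb_', '10_ch_1429.csv', '')
```

In the first string, 7, 2 and _ are all in the character set, so stripping continues past the intended prefix; in the second, 1 is not in the set, so stripping stops there.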

Python - First and last character in string must be alpha numeric, else delete [duplicate]

This question already has answers here:
How to remove non-alphanumeric characters at the beginning or end of a string
(5 answers)
Closed 6 years ago.
I am wondering how I can implement a string check, where I want to make sure that the first (&last) character of the string is alphanumeric. I am aware of the isalnum, but how do I use this to implement this check/substitution?
So, I have a string like so:
st="-jkkujkl-ghjkjhkj*"
and I would want back:
st="jkkujkl-ghjkjhkj"
Thanks..
Though not exactly what you asked for, str.strip should serve your purpose:
import string
st.strip(string.punctuation)
Out[174]: 'jkkujkl-ghjkjhkj'
You could use regex like shown below:
import re
# \W matches any non-word character; '_' counts as a word character, so it is added explicitly
# If characters from the set [\W_] appear at the start or end, replace them with ''
p = re.compile(r'^[\W_]+|[\W_]+$')
st = "-jkkujkl-ghjkjhkj*"
print(p.sub('', st))
Output:
jkkujkl-ghjkjhkj
Edit:
If your special chars are in the set: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
@Abhijit's answer is much simpler and cleaner.
If you are not sure, then this regex version is better.
You can use following two expressions:
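st = re.sub(r'^\W*', '', st)
st = re.sub(r'\W*$', '', st)
This strips all non-word characters from the beginning and the end of the string, not just the first one at each end. Note that \W does not match underscores, so leading or trailing underscores are kept.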
You could use a regular expression.
Something like this could work;
\w.+?\w
However, I don't know how to do a regexp match in Python.
hint 1: ord() converts a character to its character number
hint 2: lowercase ASCII letters are between 97 and 122 in ord()
hint 3: st[0] returns the first character of the string; st[-1] returns the last
An exact answer to your question may be the following:
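def stringCheck(astring):
    # Strings shorter than two characters have no distinct first and last character.
    if len(astring) < 2:
        return astring if astring.isalnum() else ''
    firstChar = astring[0] if astring[0].isalnum() else ''
    lastChar = astring[-1] if astring[-1].isalnum() else ''
    return firstChar + astring[1:-1] + lastChar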
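Another way to trim both ends can be sketched as follows, assuming "alphanumeric" means whatever str.isalnum() accepts. The function name trim_non_alnum is hypothetical. Unlike a single first/last check, it keeps trimming until it reaches an alphanumeric character on each side.

```python
def trim_non_alnum(text):
    """Drop leading and trailing characters for which str.isalnum() is False."""
    start, end = 0, len(text)
    # Move the start index right past leading non-alphanumeric characters.
    while start < end and not text[start].isalnum():
        start += 1
    # Move the end index left past trailing non-alphanumeric characters.
    while end > start and not text[end - 1].isalnum():
        end -= 1
    return text[start:end]


print(trim_non_alnum("-jkkujkl-ghjkjhkj*"))  # jkkujkl-ghjkjhkj
print(trim_non_alnum("__a_b__"))             # a_b
```

Characters in the middle, such as the inner hyphen, are left alone because only the two ends are scanned.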

Removing a prefix from a string [duplicate]

This question already has answers here:
Remove a prefix from a string [duplicate]
(6 answers)
Closed 6 months ago.
Trying to strip the "0b1" from the left end of a binary number.
The following code results in stripping all of binary object. (not good)
>>> bbn = '0b1000101110100010111010001'  # from bin(2**24 + 2**24//11)
>>> aan = bbn.lstrip("0b1")  # Try stripping all left-end junk at once.
>>> aan  # Oops, all gone.
''
So I did the .lstrip() in two steps:
>>> bbn = '0b1000101110100010111010001'  # Same fraction expansion
>>> aan = bbn.lstrip("0b")  # Had done this before.
>>> aan  # Extra "1" still there.
'1000101110100010111010001'
>>> aan = aan.lstrip("1")  # If at first you don't succeed...
>>> aan  # YES!
'000101110100010111010001'
What's the deal?
Thanks again for solving this in one simple step. (see my previous question)
The strip family of methods treats the argument as a set of characters to be removed. The default set is "all whitespace characters".
You want:
if strg.startswith("0b1"):
    strg = strg[3:]
No. Stripping removes all characters in the sequence passed, not just the literal sequence. Slice the string if you want to remove a fixed length.
In Python 3.9 you can use bbn.removeprefix('0b1').
(Actually this question has been mentioned as part of the rationale in PEP 616.)
This is the way lstrip works. It removes any of the characters in the parameter, not necessarily the string as a whole. In the first example, since the input consisted of only those characters, nothing was left.
Lstrip is removing any of the characters in the string. So, as well as the initial 0b1, it is removing all zeros and all ones. Hence it is all gone!
@Harryooo: lstrip only takes the characters off the left-hand end. So, because there's only one 1 before the first 0, it removes only that one. If the number started 0b11100101..., calling a.lstrip('0b').lstrip('1') would remove the first three ones, so you'd be left with 00101...
>>> i = 0b1000101110100010111010001
>>> print(bin(i))
0b1000101110100010111010001
>>> print(format(i, '#b'))
0b1000101110100010111010001
>>> print(format(i, 'b'))
1000101110100010111010001
From the standard documentation (see the entry for the bin() function):
bin(x)
Convert an integer number to a binary string prefixed with “0b”. The result is a valid Python expression. If x is not a Python int object, it has to define an __index__() method that returns an integer. Some examples:
>>> bin(3)
'0b11'
>>> bin(-10)
'-0b1010'
If prefix “0b” is desired or not, you can use either of the following ways.
>>> format(14, '#b'), format(14, 'b')
('0b1110', '1110')
>>> f'{14:#b}', f'{14:b}'
('0b1110', '1110')
See also format() for more information.
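Putting the format() approach together with the original goal (dropping "0b" and the leading 1), a sketch that avoids the strip family altogether might look like this. The function name bits_after_leading_one is hypothetical, and it assumes a positive integer input.

```python
def bits_after_leading_one(n):
    """Binary digits of a positive int, without the '0b' prefix and the leading 1."""
    if n <= 0:
        raise ValueError("expected a positive integer")
    # format(n, 'b') has no prefix; slicing off index 0 drops the leading 1.
    return format(n, 'b')[1:]


n = 2**24 + 2**24 // 11
print(bin(n))                     # 0b1000101110100010111010001
print(bits_after_leading_one(n))  # 000101110100010111010001
```

Slicing removes exactly one character, so any further leading ones or zeros are preserved.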

str.strip() strange behavior [duplicate]

This question already has answers here:
How do the .strip/.rstrip/.lstrip string methods work in Python?
(4 answers)
Closed 28 days ago.
>>> t1 = "abcd.org.gz"
>>> t1
'abcd.org.gz'
>>> t1.strip("g")
'abcd.org.gz'
>>> t1.strip("gz")
'abcd.org.'
>>> t1.strip(".gz")
'abcd.or'
Why is the 'g' of '.org' gone?
strip(".gz") removes any of the characters ., g and z from the beginning and end of the string.
x.strip(y) will remove all characters that appear in y from the beginning and end of x.
That means
'foo42'.strip('1234567890') == 'foo'
because '4' and '2' both appear in '1234567890'.
Use os.path.splitext if you want to remove the file extension.
>>> import os.path
>>> t1 = "abcd.org.gz"
>>> os.path.splitext(t1)
('abcd.org', '.gz')
Python 3.9 added two new string methods, .removeprefix() and .removesuffix(), which remove a literal substring from the beginning or end of a string, respectively. This time the method names make it clear what the methods do.
>>> import sys
>>> print(sys.version.split()[0])
3.9.0
>>> t1 = "abcd.org.gz"
>>> t1.removesuffix('gz')
'abcd.org.'
>>> t1
'abcd.org.gz'
>>> t1.removesuffix('gz').removesuffix('.gz')
'abcd.org.' # No unexpected effect from last removesuffix call
The argument given to strip is a set of characters to be removed, not a substring. From the docs:
The chars argument is a string specifying the set of characters to be removed.
As far as I know, strip removes characters from the beginning or end of a string only. If you want to remove them from the whole string, use replace.
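The three behaviours discussed in this thread can be compared side by side. This is only a sketch: drop_suffix is a hypothetical helper for Python versions older than 3.9, where str.removesuffix() is not available.

```python
def drop_suffix(text, suffix):
    """Remove suffix once, as a whole substring (hypothetical helper)."""
    # The 'suffix and' guard matters: text[:-0] would return an empty string.
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text


t1 = "abcd.org.gz"
print(t1.rstrip(".gz"))                 # abcd.or   (character set, right end only)
print(drop_suffix(t1, ".gz"))           # abcd.org  (literal suffix, once)
print("a.gz.b.gz".replace(".gz", ""))   # a.b       (every occurrence, anywhere)
```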

Why does str.lstrip strip an extra character? [duplicate]

This question already has answers here:
How do I remove a substring from the end of a string?
(24 answers)
Closed 4 years ago.
>>> path = "/Volumes/Users"
>>> path.lstrip('/Volume')
's/Users'
>>> path.lstrip('/Volumes')
'Users'
>>>
I expected the output of path.lstrip('/Volumes') to be '/Users'
lstrip is character-based: it removes all characters from the left end that are in that string.
To verify this, try this:
"/Volumes/Users".lstrip("semuloV/") # also returns "Users"
Since / is part of the string, it is removed.
You need to use slicing instead:
if s.startswith("/Volumes"):
    s = s[8:]
Or, on Python 3.9+ you can use removeprefix:
s = s.removeprefix("/Volumes")
Strip is character-based. If you are trying to do path manipulation you should have a look at os.path
>>> os.path.split("/Volumes/Users")
('/Volumes', 'Users')
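For path manipulation, the standard library's pathlib module is an alternative to os.path. The sketch below uses PurePosixPath so that it behaves the same on every platform; whether the result should keep a leading slash is an assumption that depends on what the asker needs.

```python
from pathlib import PurePosixPath

path = PurePosixPath("/Volumes/Users")

# The path is split into components rather than characters.
print(path.parts)  # ('/', 'Volumes', 'Users')

# relative_to() removes a whole leading path and raises ValueError if it does not match.
rel = path.relative_to("/Volumes")
print(rel)             # Users
print("/" + str(rel))  # /Users
```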
The argument passed to lstrip is taken as a set of characters!
>>> ' spacious '.lstrip()
'spacious '
>>> 'www.example.com'.lstrip('cmowz.')
'example.com'
See also the documentation
You might want to use str.replace()
str.replace(old, new[, count])
# e.g.
'/Volumes/Home'.replace('/Volumes', '', 1)
Return a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.
For paths, you may want to use os.path.split(). It returns a (head, tail) tuple, where tail is the last path component.
>>> os.path.split('/home/user')
('/home', 'user')
To your problem:
>>> path = "/vol/volume"
>>> path.lstrip('/vol')
'ume'
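The example above shows how lstrip() works. Starting from the left, it removes every character that appears in '/vol' (that is, '/', 'v', 'o' and 'l'), in any order and any number of times, and stops at the first character that is not in that set ('u').
So, in your example, it removed '/Volumes' and then also the following '/', because '/' is one of the characters in the argument. It stopped at 'U', which is not.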
HTH
lstrip doc says:
Return a copy of the string S with leading whitespace removed.
If chars is given and not None, remove characters in chars instead.
If chars is unicode, S will be converted to unicode before stripping
So you are removing every character that is contained in the given string, including both 's' and '/' characters.
Here is a primitive version of lstrip (that I wrote) that might help clear things up for you:
def lstrip(s, chars):
    for i in range(len(s)):
        char = s[i]
        if char not in chars:
            # First character outside the set: keep everything from here on.
            return s[i:]
    # Every character was in chars, so nothing is left.
    return ''
Thus, you can see that every leading character that is in chars is removed until a character that is not in chars is encountered. Once that happens, the deletion stops and the rest of the string is simply returned.
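The same character-set behaviour can also be sketched with itertools.dropwhile and checked against the built-in method. The name lstrip_sketch is hypothetical, and the function is meant only as an illustration, not as a replacement for str.lstrip.

```python
from itertools import dropwhile


def lstrip_sketch(s, chars):
    """Drop leading characters while they are members of chars (illustrative only)."""
    return "".join(dropwhile(lambda c: c in chars, s))


examples = [
    ("/Volumes/Users", "/Volumes"),
    ("/vol/volume", "/vol"),
    ("0b1000101110100010111010001", "0b1"),
]
for text, chars in examples:
    # The sketch agrees with the built-in on each example.
    assert lstrip_sketch(text, chars) == text.lstrip(chars)
    print(repr(lstrip_sketch(text, chars)))
# prints 'Users', 'ume' and '' on separate lines
```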
