`strip` Python method non-obvious behavior [duplicate] - python

This question already has answers here:
How do I remove a substring from the end of a string?
(23 answers)
Closed last year.
I was surprised by the behavior of Python's strip method:
>>> 'https://texample.com'.strip('https://')
'example.com'
It was not obvious to me, because I usually call strip with a one-character argument.
The reason is that, according to the documentation, "the chars argument is a string specifying the set of characters to be removed"
(https://docs.python.org/3/library/stdtypes.html#str.strip).
What is the best way to delete a "head" of a string?

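You have three options:
Use str.replace instead of str.strip, limited to one replacement (note that it matches the substring anywhere in the string, not only at the start):
line = line.replace("https://", "", 1)
Use the startswith method and slice off the prefix:
if line.startswith("https://"):
    line = line[len("https://"):]
Use split on the scheme separator:
if "://" in line:
    scheme, rest = line.split("://", 1)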
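To make the difference concrete, here is a minimal sketch that contrasts strip with removing an exact prefix. The helper name remove_prefix is made up for this illustration; on Python 3.9 and later the built-in str.removeprefix should behave the same way.

```python
def remove_prefix(text, prefix):
    """Hypothetical helper: drop a leading prefix, or return text unchanged."""
    if text.startswith(prefix):
        return text[len(prefix):]
    return text


url = "https://texample.com"

# strip() treats its argument as a set of characters (h, t, p, s, :, /)
# and removes any of them from both ends, so the leading "t" is lost too.
print(url.strip("https://"))           # example.com

# Removing the exact prefix keeps the rest of the string intact.
print(remove_prefix(url, "https://"))  # texample.com

# On Python 3.9+ the built-in method gives the same result.
if hasattr(str, "removeprefix"):
    assert url.removeprefix("https://") == remove_prefix(url, "https://")
```

A string that does not start with the prefix is returned unchanged, which is usually what you want when cleaning mixed input.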

Related

Issue stripping the values of a list in Python [duplicate]

This question already has answers here:
Python string.strip stripping too many characters [duplicate]
(3 answers)
Strip removing more characters than expected
(2 answers)
How to remove the left part of a string?
(21 answers)
Closed 13 days ago.
I have the following list of elements named 'files_temp':
['CDS_SPREAD_AA1EUNBCBM', 'CDS_SPREAD_AA1EUNCCBM', 'CDS_SPREAD_AA1USNBCBM', 'CDS_SPREAD_AA1USNCCBM', 'CDS_SPREAD_AALLN1EUNECBM', 'CDS_SPREAD_AALLN1USNECBM', 'CDS_SPREAD_ABB3EUNECBM', 'CDS_SPREAD_ABB3USNECBM', 'CDS_SPREAD_ABX1EUNCCBM', 'CDS_SPREAD_ABX1USNCCBM', 'CDS_SPREAD_ACAFP1EUBECBM', 'CDS_SPREAD_ACAFP1EUNECBM', 'CDS_SPREAD_ACOM1JPNACBM', 'CDS_SPREAD_ACOM1USNACBM', 'CDS_SPREAD_AEGON1EUBACBM', 'CDS_SPREAD_AEGON1EUNECBM', 'CDS_SPREAD_AEGON1JPBACBM', 'CDS_SPREAD_AEGON1USBACBM', 'CDS_SPREAD_AEGON1USNECBM', 'CDS_SPREAD_AEP1USNBCBM' ...]
I would like to keep only the alphanumeric codes, removing the CDS_SPREAD_ part and tried the following code:
files_temp=[elem.strip('CDS_SPREAD_') for elem in files_temp]
However, besides the CDS_SPREAD_ part it is also removing a part of the alphanumeric code:
['1EUNBCBM', '1EUNCCBM', '1USNBCBM', '1USNCCBM', 'LLN1EUNECBM', 'LLN1USNECBM', 'BB3EUNECBM', 'BB3USNECBM', 'BX1EUNCCBM', 'BX1USNCCBM', 'FP1EUBECBM', 'FP1EUNECBM', 'OM1JPNACBM', 'OM1USNACBM', 'GON1EUBACBM', 'GON1EUNECBM', 'GON1JPBACBM', 'GON1USBACBM', 'GON1USNECBM', '1USNBCBM', '1USNCCBM', 'T1EUNCCBM', 'T1USNBCBM' ...]
For instance, for the first element, in theory I should get AA1EUNBCBM instead of 1EUNBCBM. Would you know why this is happening? I would highly appreciate an alternative to solve the issue as well.
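The strip() method treats its argument as a set of characters and removes any of them from both ends of the string, so the leading letters of your codes that also appear in 'CDS_SPREAD_' (such as A, E, C or P) are removed as well. For your case, you should use the replace() method: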
files_temp=[elem.replace('CDS_SPREAD_', '') for elem in files_temp]
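As an illustration, the sketch below reproduces the problem on three of the names from the question and shows an alternative that removes the prefix only at the start of each element. The constant name PREFIX and the shortened list are assumptions made for the example.

```python
PREFIX = "CDS_SPREAD_"
files_temp = [
    "CDS_SPREAD_AA1EUNBCBM",
    "CDS_SPREAD_AEGON1EUBACBM",
    "CDS_SPREAD_AEP1USNBCBM",
]

# strip() removes any of the characters C, D, S, _, P, R, E, A from both
# ends, so it keeps eating into the code after the prefix is gone.
stripped = [elem.strip(PREFIX) for elem in files_temp]
print(stripped)  # ['1EUNBCBM', 'GON1EUBACBM', '1USNBCBM']

# Slicing off the prefix only when the element starts with it.
codes = [
    elem[len(PREFIX):] if elem.startswith(PREFIX) else elem
    for elem in files_temp
]
print(codes)  # ['AA1EUNBCBM', 'AEGON1EUBACBM', 'AEP1USNBCBM']
```

Unlike replace(), the slicing version cannot touch a later occurrence of the same text inside the code, which may matter if the prefix could reappear in the data.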

Omitting metacharacters in python [duplicate]

This question already has answers here:
How to write string literals in Python without having to escape them?
(6 answers)
Closed last year.
I want to assign a path to a variable a:
a = "D:\misc\testsets\Real"
How can I keep \t from being interpreted as an escape sequence without changing the folder name?
Use raw strings:
a = r"D:\misc\testsets\Real"
Try this (the r prefix denotes a raw string):
a = r"D:\misc\testsets\Real"
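A short sketch of what the r prefix changes. The variable names are chosen for this example only; the with_tab string escapes the other backslashes so that just the \t sequence is interpreted.

```python
# Raw string: backslashes are kept literally.
raw = r"D:\misc\testsets\Real"

# Equivalent normal string with every backslash escaped.
escaped = "D:\\misc\\testsets\\Real"
print(raw == escaped)      # True

# Without the r prefix, \t is turned into a single tab character.
with_tab = "D:\\misc\testsets\\Real"
print("\t" in with_tab)    # True
print("\t" in raw)         # False
```

One caveat worth knowing: a raw string literal cannot end in a single backslash, so a path with a trailing separator needs the escaped form or a joined path instead.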

Re.sub in python (remove last _) [duplicate]

This question already has answers here:
Remove Last instance of a character and rest of a string
(5 answers)
Closed 3 years ago.
I have a string such as:
string="lcl|NC_011588.1_cds_YP_002321424.1_1"
and I would like to keep only: "YP_002321424.1"
So I tried:
string=re.sub(r".*_cds_","",string)
string=re.sub(r"_\d","",string)
But the first _ is removed too.
Note: the numbers are not fixed; they can change.
Does someone have an idea?
"Ordinary" split, as proposed in the other answer, is not enough,
because you also want to strip the trailing _1, so the part to capture
should end after a dot and digit.
Try the following pattern:
(?<=_cds_)\w+\.\d
For a working example see https://regex101.com/r/U2QsFH/1
Don't bother with regexes, a simple
string.split('_cds_')[1]
will be enough
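The two suggestions can be compared on the string from the question. This is only a sketch: the variable names are illustrative, and the rsplit step is an addition to the plain split so that the trailing _1 is dropped as well.

```python
import re

string = "lcl|NC_011588.1_cds_YP_002321424.1_1"

# Regex approach: match word characters after "_cds_", then a dot and a digit.
match = re.search(r"(?<=_cds_)\w+\.\d", string)
accession = match.group(0) if match else None
print(accession)  # YP_002321424.1

# Split approach: a plain split keeps the trailing "_1" ...
tail = string.split("_cds_")[1]
print(tail)       # YP_002321424.1_1

# ... so cut at the last underscore as well.
trimmed = tail.rsplit("_", 1)[0]
print(trimmed)    # YP_002321424.1
```

Note that the pattern ends in a single \d, so it assumes a one-digit number after the dot; \d+ would be the natural change if that number can have more digits.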

Retrieve part of string after underscore [duplicate]

This question already has answers here:
How can I split and parse a string in Python? [duplicate]
(3 answers)
Closed 3 years ago.
I know that in Python you can use slice notation to retrieve a certain part of a string, e.g. me.name[10:] to get everything from index 10 onward.
But how would you retrieve just the part of a string after an underscore (_) using a single expression?
For example, if my string is "stringcharThatChange_myname"
How would I extract just 'myname'? I'm confined to using Python 3.5.1.
You could use split.
test_string = "stringcharThatChange_myname"
print(test_string.split('_')[1]) # myname
Using split method.
"this will be excluded_this is kept".split('_')[1]
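The split('_')[1] form works when the string contains exactly one underscore. As a hedged sketch of the other cases, the hypothetical helper below uses partition and rpartition, and avoids f-strings so it should also run on Python 3.5.

```python
def after_underscore(text, last=False):
    """Hypothetical helper: return the part of text after an underscore.

    last=False cuts at the first underscore, last=True at the last one.
    Returns an empty string when text contains no underscore.
    """
    if last:
        _, sep, tail = text.rpartition("_")
    else:
        _, sep, tail = text.partition("_")
    return tail if sep else ""


print(after_underscore("stringcharThatChange_myname"))  # myname
print(after_underscore("a_b_c"))                         # b_c
print(after_underscore("a_b_c", last=True))              # c
```

If a single expression is required, text.partition("_")[2] gives the part after the first underscore and does not raise when the underscore is missing, whereas split('_')[1] raises IndexError in that case.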

python how to remove http from a string using replace [duplicate]

This question already has answers here:
Why doesn't calling a string method (such as .replace or .strip) modify (mutate) the string?
(3 answers)
Closed 7 years ago.
I would like to remove "http://" in a string that contains a web URL i.e: "http://www.google.com". My code is:
import os
s = 'http://www.google.com'
s.replace("http://","")
print(s)
I am trying to replace http:// with an empty string, but it still prints http://www.google.com
Am I using replace incorrectly here? Thanks for your answer.
Strings are immutable. That means none of their methods change the existing string - rather, they give you back a new one. So, you need to assign the result back to a variable (the same, or a different, one):
s = 'http://www.google.com'
s = s.replace("http://","")
print(s)
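A minimal sketch showing both behaviors side by side, written with Python 3 print syntax:

```python
s = "http://www.google.com"

# The return value is discarded here, so s still refers to the original string.
s.replace("http://", "")
print(s)  # http://www.google.com

# Rebinding the name to the returned string keeps the result.
s = s.replace("http://", "")
print(s)  # www.google.com
```

The same pattern applies to other string methods such as strip, upper and lower: each returns a new string and leaves the original untouched.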
