This question already has answers here:
Python string.strip stripping too many characters [duplicate]
(3 answers)
Strip removing more characters than expected
(2 answers)
How to remove the left part of a string?
(21 answers)
Closed 13 days ago.
I have the following list of elements named 'files_temp':
['CDS_SPREAD_AA1EUNBCBM', 'CDS_SPREAD_AA1EUNCCBM', 'CDS_SPREAD_AA1USNBCBM', 'CDS_SPREAD_AA1USNCCBM', 'CDS_SPREAD_AALLN1EUNECBM', 'CDS_SPREAD_AALLN1USNECBM', 'CDS_SPREAD_ABB3EUNECBM', 'CDS_SPREAD_ABB3USNECBM', 'CDS_SPREAD_ABX1EUNCCBM', 'CDS_SPREAD_ABX1USNCCBM', 'CDS_SPREAD_ACAFP1EUBECBM', 'CDS_SPREAD_ACAFP1EUNECBM', 'CDS_SPREAD_ACOM1JPNACBM', 'CDS_SPREAD_ACOM1USNACBM', 'CDS_SPREAD_AEGON1EUBACBM', 'CDS_SPREAD_AEGON1EUNECBM', 'CDS_SPREAD_AEGON1JPBACBM', 'CDS_SPREAD_AEGON1USBACBM', 'CDS_SPREAD_AEGON1USNECBM', 'CDS_SPREAD_AEP1USNBCBM' ...]
I would like to keep only the alphanumeric codes, removing the CDS_SPREAD_ part and tried the following code:
files_temp=[elem.strip('CDS_SPREAD_') for elem in files_temp]
However, besides the CDS_SPREAD_ part it is also removing a part of the alphanumeric code:
['1EUNBCBM', '1EUNCCBM', '1USNBCBM', '1USNCCBM', 'LLN1EUNECBM', 'LLN1USNECBM', 'BB3EUNECBM', 'BB3USNECBM', 'BX1EUNCCBM', 'BX1USNCCBM', 'FP1EUBECBM', 'FP1EUNECBM', 'OM1JPNACBM', 'OM1USNACBM', 'GON1EUBACBM', 'GON1EUNECBM', 'GON1JPBACBM', 'GON1USBACBM', 'GON1USNECBM', '1USNBCBM', '1USNCCBM', 'T1EUNCCBM', 'T1USNBCBM' ...]
For instance, for the first element, in theory I should get AA1EUNBCBM instead of 1EUNBCBM. Would you know why this is happening? I would highly appreciate an alternative to solve the issue as well.
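The strip() function treats its argument as a set of characters, not as a prefix: it removes every leading and trailing character that appears anywhere in 'CDS_SPREAD_'. That is why the leading A's of AA1EUNBCBM are removed as well. For your case, you should use the replace() function (note that it removes the substring wherever it occurs, not only at the start).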
files_temp=[elem.replace('CDS_SPREAD_', '') for elem in files_temp]
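To make the difference concrete, here is a minimal sketch using two of the names from the question. The helper drop_prefix is a hypothetical name, not something from the original post; on Python 3.9+ the built-in str.removeprefix() does the same job.

```python
PREFIX = "CDS_SPREAD_"

def drop_prefix(name, prefix=PREFIX):
    """Remove prefix only if name actually starts with it (hypothetical helper)."""
    if name.startswith(prefix):
        return name[len(prefix):]
    return name

files_temp = ["CDS_SPREAD_AA1EUNBCBM", "CDS_SPREAD_ABB3EUNECBM"]

# strip() removes any of the characters C, D, S, _, P, R, E, A from both ends
stripped = [elem.strip(PREFIX) for elem in files_temp]

# Slicing off the prefix leaves the alphanumeric code intact
cleaned = [drop_prefix(elem) for elem in files_temp]

print(stripped)  # ['1EUNBCBM', 'BB3EUNECBM']
print(cleaned)   # ['AA1EUNBCBM', 'ABB3EUNECBM']
```

Unlike replace(), this only touches the start of the string, so a code that happened to contain CDS_SPREAD_ further along would be left alone.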
This question already has answers here:
Remove Last instance of a character and rest of a string
(5 answers)
Closed 3 years ago.
I have a string such as:
string="lcl|NC_011588.1_cds_YP_002321424.1_1"
and I would like to keep only: "YP_002321424.1"
So I tried :
string=re.sub(r".*_cds_","",string)
string=re.sub(r"_\d","",string)
But the first _ (the one in YP_) is removed too.
Does anyone have an idea?
Note: The number can change (they are not fixed).
"Ordinary" split, as proposed in the other answer, is not enough,
because you also want to strip the trailing _1, so the part to capture
should end after a dot and digit.
Try the following pattern:
(?<=_cds_)\w+\.\d
For a working example see https://regex101.com/r/U2QsFH/1
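As a rough sketch, the pattern above could be applied in Python with re.search. The variable names are only illustrative, and the pattern assumes a single digit after the dot, as in the example string.

```python
import re

string = "lcl|NC_011588.1_cds_YP_002321424.1_1"

# The lookbehind anchors the match right after "_cds_";
# \w+\.\d then takes the identifier, the dot and one digit.
match = re.search(r"(?<=_cds_)\w+\.\d", string)
result = match.group(0) if match else None

print(result)  # YP_002321424.1
```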
Don't bother with regexes, a simple
string.split('_cds_')[1]
will be enough
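Note that the split alone still leaves the trailing _1 in place. A possible regex-free variant, assuming the unwanted suffix is always the part after the last underscore, is to follow it with rsplit:

```python
string = "lcl|NC_011588.1_cds_YP_002321424.1_1"

# Everything after the "_cds_" marker
after_marker = string.split("_cds_")[1]

# Cut at the last underscore to drop the trailing "_1"
result = after_marker.rsplit("_", 1)[0]

print(after_marker)  # YP_002321424.1_1
print(result)        # YP_002321424.1
```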
This question already has answers here:
Python csv string to array
(10 answers)
In regex, match either the end of the string or a specific character
(2 answers)
Closed 3 years ago.
I need to capture words separated by tabs as illustrated in the image below.
The expression (.*?)[\t|\n] works well, except for the last line where a line feed is missing. Can anyone suggest a modification of the regular expression to also match the last word, i.e. Cheyenne?
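Replace [\t|\n] with (\t|$). Note that $ matches at the end of every line only in multiline mode; without that flag, use (\t|\n|$) instead.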
BTW, [\t|\n] is a character class, so the pipe | is literal here. You probably meant [\t\n].
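A small Python sketch of both points, using made-up sample text (only Cheyenne comes from the question). The last line shows a different technique from the one suggested above: instead of matching up to a delimiter, it matches the words themselves with a negated character class, which sidesteps the missing final line feed.

```python
import re

# Hypothetical sample: tab-separated words, no line feed after the last one
text = "Denver\tBoise\nAustin\tCheyenne"

# Requiring a trailing tab or newline misses the last word
with_delimiter = re.findall(r"(.*?)[\t\n]", text)

# Inside [...] the pipe is a literal character, so this also splits on "|"
pipe_split = re.split(r"[\t|\n]", "a|b\tc")

# Alternative: match runs of characters that are neither tab nor newline
words = re.findall(r"[^\t\n]+", text)

print(with_delimiter)  # ['Denver', 'Boise', 'Austin']
print(pipe_split)      # ['a', 'b', 'c']
print(words)           # ['Denver', 'Boise', 'Austin', 'Cheyenne']
```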
This question already has answers here:
why is python string split() not splitting
(3 answers)
Closed 6 years ago.
I'm trying to split
<team>
into just team, here is the code I'm using:
s = "<team>"
s.split(">")[1]
s
'<team>'
s.split(">")[1].split("<")[0]
s
'<team>'
As you can see, it's still leaving me with
<team>
Does anyone know why?
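The str.split() function returns a new list; it does not modify the string in place (strings are immutable), so s stays '<team>'.
You'll need to store the result in a new variable. Also note that s.split(">")[1] is the empty string here, because nothing follows the ">"; take the part before it and then split on "<":
s = "<team>"
t = s.split(">")[0].split("<")[1]
t
'team'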
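If the angle brackets only ever appear at the two ends, a shorter option is strip(), which removes any of the given characters from both ends. This is a sketch under that assumption; the variable names are illustrative.

```python
s = "<team>"

# split() returns a list and leaves s untouched
parts = s.split(">")

# strip("<>") removes leading and trailing "<" or ">" characters
t = s.strip("<>")

print(parts)  # ['<team', '']
print(s)      # <team>
print(t)      # team
```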
This question already has answers here:
Are there limits to using string.lstrip() in python? [duplicate]
(3 answers)
Closed 8 years ago.
So I have a super long string composed of integers and I am trying to extract and remove the first three numbers in the string, and I have been using the lstrip method (the idea is kinda like pop) but sometimes it would remove more than three.
x="49008410..."
x.lstrip(x[0:3])
"8410..."
I was hoping it would just remove 490 and return 08410 but it's being stubborn -_- .
Also I am running Python 2.7 on Windows... And don't ask why the integers are strings. If that bothers you, just replace them with letters. Same thing! LOL
Instead of removing the first three characters, take everything from the fourth character onward. You can do this with slicing (the : operator).
x="49008410..."
x[3:]
>> "08410..."
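For reference, here is a short sketch of why lstrip removed more than three characters, using a shortened version of the string from the question. lstrip treats its argument as a set of characters and keeps removing from the left while the next character is in that set.

```python
x = "49008410"  # shortened sample; the real string is much longer

# x[0:3] is "490", so lstrip removes every leading 4, 9 or 0,
# including the second 0 that follows the first three characters
via_lstrip = x.lstrip(x[0:3])

# Slicing drops exactly three characters, whatever they are
via_slice = x[3:]

print(via_lstrip)  # 8410
print(via_slice)   # 08410
```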
This question already has answers here:
How do I get a substring of a string in Python? [duplicate]
(16 answers)
Understanding slicing
(38 answers)
Closed 8 years ago.
In Python: How do I write a function that would remove "x" number of characters from the beginning of a string?
For instance if my string was "gorilla" and I want to be able remove two letters it would then return "rilla".
OR if my string was "table" and I wanted to remove the first three letters it would return "le".
Please help and thank you everyone!
You can use slice syntax for this:
s = 'gorilla'
s[2:]
will return
'rilla'
see also Explain Python's slice notation
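Since the question asks for a function, one way to wrap the slice could look like this. The name remove_first is hypothetical, and the sketch assumes n is a non-negative integer.

```python
def remove_first(text, n):
    """Return text without its first n characters (hypothetical helper)."""
    # A slice past the end of the string simply yields an empty string
    return text[n:]

print(remove_first("gorilla", 2))  # rilla
print(remove_first("table", 3))    # le
```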