This question already has answers here:
Best way to strip punctuation from a string
(32 answers)
Closed 1 year ago.
How would I remove punctuation in this function? Would str.maketrans('', '', string.punctuation) work? And where in the function should I write it?
def convert(lst):
    return " ".join(lst).split()
lst = ["Good for the price, but poor Bluetooth connections."]
print(convert(lst))
In general on Stack Overflow, you would have been better off editing your original question with an update on your progress.
string.punctuation is definitely a step in the right direction. You've got a few options including:
str.strip() (only removes characters from the two ends of a string)
str.translate() (with or without maketrans())
As to where? If you do it after you create the list, you'll need to apply it to each element individually. But if you split up the string transformation calls (split, join) on the return line into their own lines, you could do it earlier. Try performing one action at a time, then printing the result to see if you can spot where else you can remove punctuation without having to iterate.
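As a minimal sketch of that suggestion (one possible arrangement, not the only one), the join and split can be separated so that str.translate() runs once on the joined string, before it is split into words:

```python
import string

def convert(lst):
    # Step 1: join the input strings into one string.
    text = " ".join(lst)
    # Step 2: delete every character listed in string.punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Step 3: split on whitespace to get the list of words.
    return text.split()

lst = ["Good for the price, but poor Bluetooth connections."]
print(convert(lst))
# ['Good', 'for', 'the', 'price', 'but', 'poor', 'Bluetooth', 'connections']
```

Doing the translation before the split avoids looping over the individual words afterwards.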
This question already has an answer here:
Why does printing a tuple (list, dict, etc.) in Python double the backslashes?
(1 answer)
Closed 1 year ago.
Is there a way to print a single backslash within a list?
Regarding the first version of your question, I wrote this:
First, the expression x='\' isn't valid Python. Write x='\\' instead, since the backslash is the escape character in Python string literals.
Second, try this:
l=['\\']
print(l)
This will print: ['\\']
But print(l[0]) prints a single backslash: \. Printing a list shows the repr() of each element, and repr() escapes the backslash. So ['\\'] is simply how a list containing one backslash is displayed; the string itself holds only one character.
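A short sketch that makes the difference visible: the list display goes through repr(), while printing the element itself goes through str(). The length check confirms that only one character is stored.

```python
l = ['\\']

print(l)          # list display uses repr() of each element: ['\\']
print(l[0])       # the element itself: \
print(len(l[0]))  # 1, a single backslash character

# To show the elements without repr() escaping, join them into a string.
print("[" + ", ".join(l) + "]")  # [\]
```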
This question already has answers here:
Remove Last instance of a character and rest of a string
(5 answers)
Closed 3 years ago.
I have a string such as:
string="lcl|NC_011588.1_cds_YP_002321424.1_1"
and I would like to keep only: "YP_002321424.1"
So I tried:
string=re.sub(r".*_cds_","",string)
string=re.sub(r"_\d","",string)
But the first _ is removed too.
Does someone have an idea?
Note: The numbers can change (they are not fixed).
"Ordinary" split, as proposed in the other answer, is not enough,
because you also want to strip the trailing _1, so the part to capture
should end after a dot and digit.
Try the following pattern:
(?<=_cds_)\w+\.\d
For a working example see https://regex101.com/r/U2QsFH/1
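In Python, that pattern could be applied with re.search, roughly as follows. This is a sketch using the sample string from the question; the variable names are only illustrative.

```python
import re

string = "lcl|NC_011588.1_cds_YP_002321424.1_1"

# The lookbehind starts the match right after "_cds_"; \w+\.\d then takes
# word characters, a dot, and one digit, which leaves out the trailing "_1".
match = re.search(r"(?<=_cds_)\w+\.\d", string)
result = match.group(0) if match else None

print(result)  # YP_002321424.1
```

Note that \d matches a single digit, so this assumes the version number after the dot is one digit long.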
Don't bother with regexes, a simple
string.split('_cds_')[1]
will be enough
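For the sample string, the split on its own returns "YP_002321424.1_1", so the trailing "_1" still has to be dropped. A regex-free sketch, assuming the unwanted suffix always follows the last underscore, could combine split with rsplit:

```python
string = "lcl|NC_011588.1_cds_YP_002321424.1_1"

# Keep everything after the "_cds_" marker.
after_marker = string.split("_cds_")[1]   # "YP_002321424.1_1"

# Drop the last "_" and whatever follows it.
result = after_marker.rsplit("_", 1)[0]

print(result)  # YP_002321424.1
```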
This question already has answers here:
Python for-in loop preceded by a variable [duplicate]
(5 answers)
Closed 4 years ago.
I'm new to Python, so I was hoping somebody could break down the following statement and explain the purpose of each part.
[digit for digit in string.split() if digit.isdigit()][0]
Obviously for digit in string.split() creates a list of substrings by separating the string into elements at each space.
What confuses me is the digit at the very beginning and the if statement at the very end.
Is the very first digit what will be returned if digit.isdigit()?
Why must this statement be wrapped in a list?
I've never seen a for loop and an if statement combined into one statement like this before, but it reminds me of a particular JS syntax: for (condition) // whatever or if (condition) // whatever. However, in JS you can't combine them into a single statement (i.e. for (condition) if (condition) // whatever).
This is called a list comprehension. You will find plenty of pages explaining how it works. Just ask your favorite search engine.
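To connect the syntax to something more familiar, here is a sketch that expands the comprehension into an ordinary loop. The sample string is made up for illustration. The leading digit is the expression that gets collected, the trailing if filters which items are kept, and the brackets build a list so that [0] can pick its first element.

```python
string = "order 66 shipped in 3 days"  # hypothetical sample input

# The comprehension from the question.
first = [digit for digit in string.split() if digit.isdigit()][0]

# The same logic written as an explicit loop.
digits = []
for digit in string.split():   # iterate over whitespace-separated pieces
    if digit.isdigit():        # keep only pieces made entirely of digits
        digits.append(digit)   # the leading "digit" is what gets collected
first_loop = digits[0]         # [0] takes the first collected item

print(first, first_loop)  # 66 66
```

If no piece of the string is all digits, the list is empty and [0] raises IndexError in both versions.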
This question already has answers here:
Carets in Regular Expressions
(2 answers)
Difference between * and + regex
(7 answers)
Closed 5 years ago.
I apologize for the poorly worded question.
I have a large number of strings like:
"ODLS_ND33283633__PS1185"
The letters up to the first "_" are a header, and the remainder (ND33283633__PS1185) is a unique ID.
I wrote a regex in Python to remove everything up to the first "_", hoping for
"ND33283633__PS1185"
as the end result.
I figured something like:
.*_? or .+?_
Would do the trick, but that was not the case...
I kept trying to write various regex unsuccessfully to accomplish this and finally went online and found another person's answer I was able to use as an example to rewrite as:
^[^_]+_
Which gave me my desired result, but now I have questions which I can't figure out the answer for:
I found that removing the "^" at the front and writing it as:
[^_]+_
caused the regex to remove everything up to the second "_" so the resulting string was:
"_PS1185"
I understand that "^" marks the beginning of the line, but I would like to know why leaving it out removes everything up to the second "_".
My understanding is that [^_]+ matches characters NOT equal to "_" 1 or more number of times, so why would including the "^" at the beginning cause it to stop at the first, while excluding it causes it to stop at the second?
Another thing, when I replaced the "+" symbol with a "*":
[^_]*_
I expected the same result but instead got:
PS1185
I thought that * matches 0 or more, while + matches 1 or more, so they're effectively the same except + is supposed to be more 'strict'. However, seeing these results makes me feel like I don't fully understand how regex is behaving. Is there anyone here that can please explain what is actually going on?
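One likely explanation, assuming the patterns were applied with re.sub (the question does not say which function was used): re.sub replaces every non-overlapping match, not just the first one. With "^" the pattern can only match at the start of the string, so only the header is removed. Without it, the pattern matches again on "ND33283633_". With "*" it can additionally match a lone "_", because zero preceding characters are allowed. The sketch below reproduces the three results and shows two ways to stop after the first match.

```python
import re

s = "ODLS_ND33283633__PS1185"

# Anchored: can only match once, at the start of the string.
anchored = re.sub(r"^[^_]+_", "", s)

# Unanchored with +: matches "ODLS_" and then "ND33283633_".
# The remaining "_" is not matched because + needs at least one non-"_" first.
unanchored_plus = re.sub(r"[^_]+_", "", s)

# Unanchored with *: also matches the lone "_" (zero characters, then "_").
unanchored_star = re.sub(r"[^_]*_", "", s)

# Limiting the number of replacements gives the anchored result again.
first_only = re.sub(r"[^_]+_", "", s, count=1)

# A regex-free alternative: split once on the first "_".
by_split = s.split("_", 1)[1]

print(anchored)         # ND33283633__PS1185
print(unanchored_plus)  # _PS1185
print(unanchored_star)  # PS1185
print(first_only)       # ND33283633__PS1185
print(by_split)         # ND33283633__PS1185
```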
This question already has answers here:
why is python string split() not splitting
(3 answers)
Closed 6 years ago.
I've got a simple block of code which is meant to iterate over a list of strings and split each item in the list into a new list, then call another function on each item:
list_input = take_input()
for item in list_input:
    item.split()
    system_output(item)
The problem is that 'item.split()' doesn't seem to be doing anything. With a print(item) statement in the penultimate line, all that is printed to the console is the contents of item, not the contents split into a new list. I feel like I'm missing something obvious, can anyone help? Thanks!
EDIT: So I've been informed that strings are immutable in Python, and in light of this replaced the 'item.split()' line with 'item = item.split()'. However, I am still running into the same error, even with item redefined as a new variable.
split() does not split the string in place; it returns a new list of substrings, which you have to store in another variable (or reassign) and then use.
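A minimal sketch of the fix. take_input and system_output below are hypothetical stand-ins for the asker's functions, which are not shown in the question. The point is that the return value of split() is stored and then passed on.

```python
def take_input():
    # Hypothetical stand-in for the asker's input function.
    return ["alpha beta", "gamma delta epsilon"]

def system_output(words):
    # Hypothetical stand-in: just print whatever it receives.
    print(words)

results = []
for item in take_input():
    words = item.split()   # split() returns a new list; item is unchanged
    system_output(words)   # pass the list, not the original string
    results.append(words)
```

If the output still looks unsplit after reassigning, it is worth checking that the reassignment happens before the call that uses the value, and that the strings actually contain whitespace to split on.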