Splitting quotes [duplicate] - python

This question already has answers here:
RegEx: Grabbing values between quotation marks
(20 answers)
Closed 6 years ago.
Does anyone have any advice for removing separators of split quotes in a piece of text? I am using Python, and am still a beginner.
For example, "Well," he said, "I suppose I could take a break." In this example, the italicized "he said," is the separator, and needs to be removed. Then, the quote needs to be seen as one string within quotations such as, "Well, I suppose I could take a break." I haven't been able to find code similar to this yet, and was hoping someone may be able to point me in the right direction.
Thanks!

To get only the content inside the double quotes in your string, you can use the re library:
import re
my_string = '"Well," he said, "I suppose I could take a break."'
quoted_string = re.findall(r'".*?"', my_string)
# 'quoted_string' is -> ['"Well,"', '"I suppose I could take a break."']
# Join with a space so the two quoted parts do not run together
new_string = ' '.join(quoted_string).replace('"', '')
# 'new_string' is -> 'Well, I suppose I could take a break.'
You can write the same as a one-liner:
' '.join(re.findall(r'".*?"', my_string)).replace('"', '')
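A capture group can drop the quotation marks during matching, so no replace() call is needed afterwards. The following is only a minimal sketch: join_quoted is a hypothetical helper name, and it assumes straight double quotes with no escaped quotes inside them.

```python
import re

def join_quoted(text):
    # Hypothetical helper: collect every "..." span and join the pieces with single spaces.
    # The group ([^"]*) captures only the text between the quotation marks.
    parts = re.findall(r'"([^"]*)"', text)
    return ' '.join(part.strip() for part in parts)

my_string = '"Well," he said, "I suppose I could take a break."'
print(join_quoted(my_string))
```

If the text contains no quoted spans, this sketch returns an empty string.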

Looking for a way to correctly strip a string [duplicate]

This question already has answers here:
python split() vs rsplit() performance?
(5 answers)
Closed 2 years ago.
I'm using the Spotify API to get song data for a lot of songs. To this end, I need to pass the song URI into an API call. To obtain the song URIs, I'm using another API endpoint. It returns the URI in this form: 'spotify:track:5CQ30WqJwcep0pYcV4AMNc'. I only need the ID part at the end,
so I used 'spotify:track:5CQ30WqJwcep0pYcV4AMNc'.strip("spotify:track") to strip away the first part. Only this did not work as expected, as this call also removes the trailing "c".
I tried to build a regex to strip away the first part, but instructions were too complicated and D**K is now stuck in ceiling fan :'(. Any help would be greatly appreciated.
strip() removes all the leading and trailing characters that appear in the argument string; it doesn't match the string exactly.
You can use replace() to remove an exact string:
'spotify:track:5CQ30WqJwcep0pYcV4AMNc'.replace("spotify:track:", "")
or split it at : characters:
'spotify:track:5CQ30WqJwcep0pYcV4AMNc'.split(":")[-1]
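To make the character-set behaviour of strip() concrete, here is a small sketch that contrasts it with exact prefix removal. strip_prefix is a hypothetical helper written for illustration; on Python 3.9 and later, the built-in str.removeprefix() should do the same job.

```python
uri = 'spotify:track:5CQ30WqJwcep0pYcV4AMNc'

# strip() treats its argument as a set of characters,
# so the trailing 'c' is removed along with the prefix.
stripped = uri.strip('spotify:track')

def strip_prefix(text, prefix):
    # Hypothetical helper: remove prefix only when text really starts with it.
    if text.startswith(prefix):
        return text[len(prefix):]
    return text

track_id = strip_prefix(uri, 'spotify:track:')
print(stripped)   # trailing 'c' is lost
print(track_id)   # full ID is kept
```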
Use a simple regex replace:
import re
txt = 'spotify:track:5CQ30WqJwcep0pYcV4AMNc'
# Raw strings avoid invalid-escape warnings; ":" needs no escaping in a regex
pat_to_strip = [r'^spotify:track', r'MNc$']
pat = f'({")|(".join(pat_to_strip)})'
txt = re.sub(pat, '', txt)
# txt is now ':5CQ30WqJwcep0pYcV4A'
Essentially, patterns starting with ^ are stripped from the beginning, and those ending with $ are stripped from the end. Note that the leading ":" remains because the first pattern stops at "track".
I stripped the last 3 letters just as an example.

Split string on one use of a word and not the other [duplicate]

This question already has answers here:
Python split string based on conditional
(2 answers)
Closed 4 years ago.
I'm fairly new to this and am looking for an effective way to use split() to split a string after a certain word.
I'm working on a voice-controlled filter using the Csound API in Python. Say my input command is "Set cutoff to 440": I'd like to split the string after the word "to", so that I can phrase the command however I like and it will still find the frequency I'm looking for. I hope that makes sense.
So at the moment, my code for this is
string = "set cutoff to 440"
split = string.split("to")
print(split)
and my output is
['set', 'cu', 'ff', '440']
The problem is the 'to' in 'cutoff'. I know I could fix this by just changing cutoff to frequency, but that seems like giving in too easily. My suspicion is that there's a way to do this with regular expressions, but I could easily be wrong. Any advice would be really helpful. I hope my post adheres to all the guidelines; I'm pretty new to Stack Overflow.
An easy way to do this is to split on the word "to" with spaces around it:
string = "set cutoff to 440"
split = string.split(" to ")
print(split)
returns
['set cutoff', '440']
Using regex for this is less efficient than simply splitting on the word surrounded by spaces.
If you want to use regex for other reasons, here is how to do it: you can find all runs of non-whitespace characters:
import re
string = "set cutoff to 440"
split = re.findall(r'\S+',string)
print(split)
returns
['set', 'cutoff', 'to', '440']
from jamylak on this post: Split string based on a regular expression
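If a regex is preferred for the original goal, a word boundary keeps the "to" inside "cutoff" from matching. This is a sketch under the assumption that the command contains the standalone word "to" followed by an integer; the variable names are illustrative.

```python
import re

command = "set cutoff to 440"

# \bto\b matches "to" only as a whole word, so the "to" inside "cutoff" is skipped.
# \s* absorbs the surrounding spaces; maxsplit=1 splits at the first match only.
before, after = re.split(r'\s*\bto\b\s*', command, maxsplit=1)
frequency = int(after)
print(before, frequency)
```

Unlike splitting on " to ", this also works when the spacing around the word varies.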

Negating ")" inside a regex search pattern [duplicate]

This question already has answers here:
Escaping regex string
(4 answers)
Closed 5 years ago.
I'm trying to use regex in python to replace strings. I'd like to replace "SERVERS)" with "SERV" in a string.
example_String = "This is a great SERVERS)"
re.sub("SERVERS)","SERV", example_String)
I expected it to be a straightforward swap, but as I read more into the error, it looks like I need to make the regex pattern treat the ")" as a regular character and not a special regex character.
I'm not very familiar with regex and would appreciate the help!
Edit: I'm importing data from a database (which is user input), and there are quite a few similar issues to the one mentioned in the question.
re.escape() fits the bill perfectly, thanks!
You could go for
import re
example_String = "This is a great SERVERS)"
new_string = re.sub(r"SERVERS\)","SERV", example_String)
print(new_string)
Which yields
This is a great SERV
To be honest, no regular expression is needed, really:
example_String = "This is a great SERVERS)"
new_string = example_String.replace('SERVERS)', 'SERV')
print(new_string)
In the latter, you don't even need to escape anything and it will be faster.
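When the text to match comes from user input, as the question's edit mentions, re.escape() can build the pattern so that characters such as ")" are taken literally. A minimal sketch, with user_text standing in for a value read from the database:

```python
import re

example_string = "This is a great SERVERS)"
user_text = "SERVERS)"  # assumed to come from user input

# re.escape() backslash-escapes regex metacharacters such as ")".
pattern = re.escape(user_text)
new_string = re.sub(pattern, "SERV", example_string)
print(new_string)
```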

Python: how to emphasize a specific sequence of characters within a string when it is printed? [duplicate]

This question already has answers here:
How can I print bold text in Python?
(17 answers)
Closed 7 years ago.
I have a program that prints strings consisting of a random sequence of lowercase letters and spaces.
Is there a way to emphasize a certain target word within that string when I print it, to make the target easier to spot in the output?
For example:
Current code:
>>> print(mystring)
mxxzjvgjaspammttunthcrurny dvszqwkurxcxyfepftwyrxqh
Desired behaviour:
>>> some_function_that_prints_with_emphasis(mystring,'spam')
mxxzjvgjaspammttunthcrurny dvszqwkurxcxyfepftwyrxqh
Other acceptable forms of emphasis would be:
boldening the characters
changing colour
capitalizing
separating with extra characters
any other idea that is easily implemented in Python 3
I'm leaving the requirements deliberately vague because as a beginner I'm not aware of all that Python can do so there might be a simpler way to do this that I've overlooked.
Basically everything you're looking for would be done with the curses module; it's designed to perform advanced control over the terminal to do stuff like change colors, bold text, etc. You need to use the various has_* commands to determine terminal capabilities and choose your preferred emphasis style, but after that the docs page and the linked tutorial should give you all the info you need.
For simpler usage, you can just print out the raw terminal escape codes to add and remove color (you just have to split the line up yourself or use re to perform replacements to add the codes). For example, to highlight 'spam' in a line as blue:
myline = "abc123spamscsdfwerf"
print(myline.replace('spam', '\033[94mspam\033[0m'))
For ease of use, you can use ansicolors to avoid having to manually deal with color escapes and the like.
You can replace the target with its upper-case version:
def emphasis(string, target):
    return string.replace(target, target.upper())
You could also find the substring you're looking for and put a marker under it, like:
needle = "spam"
haystack = "mxxzjvgjaspammttunthcrurny dvszqwkurxcxyfepftwyrxqh"
pos = haystack.find(needle)  # 0-based index, or -1 if not found
print(haystack)
if pos != -1:
    print(" " * pos + "^" * len(needle))
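The ideas above can be combined into one small function that marks every occurrence of the target. This is a sketch only: emphasize is a hypothetical name, and the default markers are ANSI escape codes for blue text, which assumes a terminal that supports them.

```python
def emphasize(text, target, start='\033[94m', end='\033[0m'):
    # Hypothetical helper: wrap every occurrence of target in start/end markers.
    # The defaults are the ANSI codes for blue text and for resetting the style.
    return text.replace(target, start + target + end)

haystack = "mxxzjvgjaspammttunthcrurny dvszqwkurxcxyfepftwyrxqh"
print(emphasize(haystack, 'spam'))                      # coloured in an ANSI terminal
print(emphasize(haystack, 'spam', start='[', end=']'))  # plain-text fallback
```

Passing plain markers such as "[" and "]" gives a fallback for output that is redirected to a file or shown where escape codes are not interpreted.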

How to check for three or more whitespaces in string - Python [duplicate]

This question already has answers here:
Count the number of occurrences of a character in a string
(26 answers)
Closed 7 years ago.
I need a method to check whether there are three or more whitespace characters in a string. Currently I only know how to check whether there is a whitespace character, with this: if " " in someString:. The reason I want to find how many whitespace characters there are is that I need to narrow down the search among decrypted messages.
You see, the encrypted message is a long string of random letters and numbers. I use itertools.product("abcdefghijklmnopqrstuvwxyzæøå", repeat=6) to generate a set of keys, assuming the length of the key is six. In order to find the correct original message, which is a sentence in English, I need to narrow down the search. I figured that checking for three or more whitespace characters in a string is a good way to do so, since a sentence usually contains several.
I really want some tips, rather than the solution, since this is a task I want to figure out by myself! :D
If you are actually interested in the exact count of any character that is considered white space, you could use something like this:
s = " nar\trow down the se\narch."
num_whitespace = len([c for c in s if c in [' ', '\n', '\t']])
print(num_whitespace) # 6
Of course, you can add other whitespace characters to the list as needed.
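For the actual check of "three or more", the count can be compared against a threshold. The sketch below uses str.isspace() instead of an explicit list of characters; has_enough_whitespace and its minimum parameter are hypothetical names chosen for illustration.

```python
def has_enough_whitespace(text, minimum=3):
    # Hypothetical helper: str.isspace() covers spaces, tabs, newlines
    # and other Unicode whitespace characters.
    count = sum(1 for ch in text if ch.isspace())
    return count >= minimum

print(has_enough_whitespace("narrow down the search"))
print(has_enough_whitespace("xk3 9fq"))
```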
