Regex Matching - Python [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
I need to write regex-matching code that returns True if there is exactly one '+' between two words and nothing else. I have written code that checks whether there is only one '+' in the string, but how do I check that it sits between two words?
The code is below:
import re
inputStr = "ali+ahmedafaw+"
inputStr2 = "hello+world+again"
# Collect every '+' in the string, then count them.
plus = re.findall(r'[+]', inputStr)
print(plus)
l_plus = len(plus)
print("The length is", l_plus)
if l_plus <= 1:
    print("True")
else:
    print("False")

It depends on what you mean by a word. If a word is a run of one or more letters, you can put [a-zA-Z]+ on either side of the + character, or use another class such as \w to match word characters (letters, digits, and underscore).
re.search(r'[a-zA-Z]+\+[a-zA-Z]+', input_str)
Note that re.search succeeds if the pattern matches anywhere in the string, so on its own this does not rule out a second +.
If you only want to make sure that the + does not appear at the very start or end of the text, you can use negative lookarounds:
re.search(r'(?<!^)\+(?!$)', input_str)
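To enforce the "and nothing else" part of the question, one option is to match the whole string instead of searching inside it. The sketch below assumes that a word is a run of ASCII letters; the names WORD_PLUS_WORD and is_word_plus_word are hypothetical and do not come from the original answer.

```python
import re

# Assumption: a "word" is one or more ASCII letters.
WORD_PLUS_WORD = re.compile(r'[a-zA-Z]+\+[a-zA-Z]+')


def is_word_plus_word(text):
    """Return True only if the whole string has the form word+word."""
    # fullmatch requires the pattern to cover the entire string,
    # so a second '+' or a leading/trailing '+' makes it fail.
    return WORD_PLUS_WORD.fullmatch(text) is not None


for sample in ("ali+ahmed", "ali+ahmedafaw+", "hello+world+again"):
    print(sample, is_word_plus_word(sample))
```

Swapping [a-zA-Z] for \w would also accept digits and underscores as part of a word.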

Related

Replace a string in a URL with python [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 4 months ago.
I have a text containing a URL that needs to be reworked.
text='dfs:/?url=https://myserver/c12&ofg={"tes":{"id":1812}}'
I need to programmatically replace the id value (1812 in this example, but unknown before execution) with a fixed substring (e.g. 189), so the end result must be
'dfs:/?url=https://myserver/c12&ofg={"tes":{"id":189}}'
As I'm programming in Python, I guess I should use a regular expression (the re module) to replace the value between "id": and }}, but I couldn't find one that works for this use case.

I assume you are always generating the same URL with that pattern, and the value to 'change' is always in {"id":X}. One way to solve this particular problem is with a positive lookbehind + re.sub replacement.
import re
pattern = re.compile(r"(?<=\"id\":)\d+")
string = "dfs:/?url=https://myserver/c12&ofg={\"tes\":{\"id\":1812}}"
print(pattern.sub("desired_value", string))
The output contains desired_value in place of 1812. regex101 gives a good step-by-step explanation, but here is a quick recap of the pattern:
(?<="id":)\d+ matches one or more digits only when they are immediately preceded by "id":. The lookbehind itself consumes no characters, so only the digits are replaced.
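Applied to the exact strings from the question, the lookbehind approach might look like the following. This is a minimal sketch: the helper name replace_id is hypothetical, and it assumes the id is always a run of digits directly after "id":.

```python
import re

# Digits that directly follow "id": (the lookbehind is not part of the match).
ID_VALUE = re.compile(r'(?<="id":)\d+')


def replace_id(text, new_id):
    """Replace the numeric id value in text with new_id."""
    return ID_VALUE.sub(str(new_id), text)


text = 'dfs:/?url=https://myserver/c12&ofg={"tes":{"id":1812}}'
print(replace_id(text, 189))
```

Because only the digits after "id": are matched, other digits in the URL (such as the 12 in c12) are left untouched.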

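What about simply splitting the string twice? E.g.
my_string = 'dfs:/?url=https://myserver/c12&ofg={"tes":{"id":1812}}'
# Split once around the id marker, then once around the closing braces.
head, tail = my_string.split('"id":', 1)
old_id, rest = tail.split('}}', 1)
# Rebuild the string instead of calling replace(), which would also change
# any other occurrence of the old id elsewhere in the URL.
print(head + '"id":' + "189" + '}}' + rest)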

Removing all occurrences of any characters in the word 'dust' in the string [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 11 months ago.
Example:
Input: dustbin
Output: bin
string = 'dustbin'
if 'dust' in string:
    new = string.split('dust')
    listToStr = ''.join(map(str, new))
    print(listToStr)
The above code works fine for this input, but not if the input changes like this:
Input: dustduuuustdustbin
Sample output: bin
Is there a solution to this?

Use a regular expression.
import re
result = re.sub(r'[dust]', '', string)
The regexp [dust] matches any of those characters, and all the matches are replaced with an empty string.
If you want to remove only the whole word dust, with possible repetitions of the letters, the regexp would be r'd+u+s+t+'.
If only u can be repeated, use r'du+st'.
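The three patterns behave the same on the asker's input but differ on other text. The short comparison below illustrates this; the second sample string ("stardust bin") is an invented example, not from the question.

```python
import re

text = "dustduuuustdustbin"

# Character class: removes every d, u, s, or t, wherever it occurs.
by_chars = re.sub(r'[dust]', '', text)
# Whole word "dust", with each letter allowed to repeat.
by_word = re.sub(r'd+u+s+t+', '', text)
# Whole word "dust", with only the u allowed to repeat.
by_word_u = re.sub(r'du+st', '', text)
print(by_chars, by_word, by_word_u)

# Invented sample showing where the approaches diverge.
other = "stardust bin"
chars_removed = re.sub(r'[dust]', '', other)   # also strips the s and t of "star"
word_removed = re.sub(r'd+u+s+t+', '', other)  # strips only "dust"
print(chars_removed)
print(word_removed)
```

Which one is right depends on whether stray d, u, s, or t characters outside the word should survive.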

Use the function below to remove every character of a given set from a string.
re.sub replaces each match of the pattern with an empty string, so only the unmatched text remains.
import re
def remover(text, chars):
    # Build a character class from chars; re.escape guards against
    # characters that are special inside [...], such as ] or ^.
    return re.sub('[' + re.escape(chars) + ']', '', text)
print(remover("dustttttttttbinggg", 'dust'))

Regex - Find 2 words or more using regex [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I have a simple regex to grab a string. It works fine for a single word without spaces, like HelloWorld, but how can I grab text that contains a space or more than one word, like Hello World?
Text File
FAN PS-2 is NOT PRESENT #like this 'NOT PRESENT'
Regex
Value FAN_PS (\S*) # regex
Start
^FAN PS is ${FAN_PS_1}
What should I change in my regex so I can grab more than 1 word?
Thank you.

You have to change your regular expression. The current expression, \S*, matches a run of non-whitespace characters, so it is expected that you only get the first word.
If you change \S* to [\S ]*, which also allows spaces, you will get multiple words. You can even change it to the simpler .*, which matches any characters except a newline, if you do not need to exclude anything.
Read the Python regex reference for information on the different character classes.
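The difference between the three character classes can be checked with Python's re module. This is only a sketch using plain re rather than the asker's template format, and the sample line omits the trailing comment from the question.

```python
import re

line = "FAN PS-2 is NOT PRESENT"

# \S* stops at the first whitespace character.
first_word = re.search(r'is (\S*)', line).group(1)
# [\S ]* also accepts spaces, so it continues across words.
with_spaces = re.search(r'is ([\S ]*)', line).group(1)
# .* accepts anything up to the end of the line.
rest_of_line = re.search(r'is (.*)', line).group(1)

print(repr(first_word))
print(repr(with_spaces))
print(repr(rest_of_line))
```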

Python regex to remove specific pattern from a list of strings [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I have a list of strings with filenames. The filenames follow a specific naming format:
string1_YYYYMMDD_HHMMSS_string2
Here YYYYMMDD and HHMMSS are actual date and time values.
I want to delete all characters that appear after 'string1' in each of the entries. I've been trying to do this with a regex, but to no avail. Could anyone help me with this?

You don't need a regex, just split on the first underscore:
s = 'string1_YYYYMMDD_HHMMSS_string2'
print(s.split('_', 1)[0])
[edit]:
If you can only rely on the last three parts ('_YYYYMMDD_HHMMSS_string2'), and string2 itself contains no underscore, then drop them by slicing:
s = 's_t_r_i_n_g_1_YYYYMMDD_HHMMSS_string2'
print('_'.join(s.split('_')[:-3]))

Using regex:
import re
s = 'string1_YYYYMMDD_HHMMSS_string2'
newstr = re.sub('_.*', '', s)
print(newstr)
Notes:
_.* matches the first _ and every character after it.
re.sub(p, r, s) searches s for p and replaces all matches with r.
Update #1
The asker commented: "string1 may contain additional underscores. I'd like to retain all of string1 and only get rid of the trailing pattern."
In this case you can match the date and time explicitly with the following regex:
_\d{8}_\d{6}_.*
Demo: https://regex101.com/r/jS2gL5/1
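Applied to a list of filenames, the pattern from the update could be used as follows. This is a sketch that assumes the date is always 8 digits and the time 6 digits; the filenames are invented examples.

```python
import re

# Assumption: _YYYYMMDD_HHMMSS_ is 8 digits, then 6 digits, each led by "_".
TRAILER = re.compile(r'_\d{8}_\d{6}_.*$')

# Invented sample filenames; the second has underscores inside string1.
filenames = [
    "report_20160314_153000_final",
    "my_long_name_20151231_235959_v2_draft",
]

# Strip the trailing date/time/string2 part from every entry.
prefixes = [TRAILER.sub('', name) for name in filenames]
print(prefixes)
```

Names that do not contain the date and time pattern are returned unchanged, since re.sub leaves non-matching strings as they are.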

How to remove periods from the middle of sentences [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
Is it possible to remove periods from the middle of a string (sentence), leaving the ending period?
The answers that I have seen basically strip all of the periods, or only the final one, as in:
Remove periods at the end of sentences in python

If I understand correctly, this should do what you want:
import re
string = 'You can. use this to .remove .extra dots.'
# Raw string, so that \. reaches the regex engine unchanged.
string = re.sub(r'\.(?!$)', '', string)
print(string)
It uses a regex to remove every dot except one at the end of the string. (?!$) is a negative lookahead, so the regex matches any dot that is not directly followed by $ (the end of the string, since re.MULTILINE is not set).
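For text that spans several lines, the meaning of $ matters. The sketch below compares the default behaviour with re.MULTILINE, under which $ also matches just before each newline; the sample text is an invented example.

```python
import re

text = "One. two.\nThree. four."

# Default: $ means the end of the whole string, so only the very last dot survives.
end_of_string_only = re.sub(r'\.(?!$)', '', text)

# re.MULTILINE: $ also matches before each newline, so each line keeps its final dot.
per_line = re.sub(r'\.(?!$)', '', text, flags=re.MULTILINE)

print(repr(end_of_string_only))
print(repr(per_line))
```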
