How to use "." as a wildcard inside "string" instead of pattern? - python

I have this:
incompleted_string1 = "Thom"
incompleted_string2 = "s Mueller naive"
entire_string = 'Thom.s Mueller naive' # <= dot means any char!!! I dont know which char is it
pattern = "mas M"
I would like to know whether "mas M" is present inside entire_string. I do not care whether "." stands for "a" or something else. I cannot change the pattern string!
re.findall("mas M", entire_string)
This returns []. I'd like to get "mas M", but True would be enough.
Thank you for your help

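You can replace each character in the pattern with a character class that contains that character and a dot:
bool(re.search("".join([f"[{x}.]" for x in pattern]), entire_string))
The pattern will look like [m.][a.][s.][ .][M.] here, and each class can match either the corresponding letter or a dot. If the pattern may contain characters that are special inside a character class (such as ], ^, - or \), wrap x in re.escape().
A complete example: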
import re
incompleted_string1 = "Thom"
incompleted_string2 = "s Mueller naive"
entire_string = 'Thom.s Mueller naive' # <= dot means any char!!! I dont know which char is it
pattern = "mas M"
print (bool(re.search("".join([f"[{x}.]" for x in pattern]), entire_string)) )
# => True

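Another approach could be to try every variant of the pattern in which one character is replaced by a literal dot. The dot has to be escaped, otherwise it would match any character in entire_string. Note that this only covers a single unknown character:
bool(re.search("|".join([re.escape(pattern)] + [re.escape(pattern[:i]) + r"\." + re.escape(pattern[i+1:]) for i in range(len(pattern))]), entire_string))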
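The character-by-character idea can be wrapped in a small helper. This is a minimal sketch rather than a definitive implementation: the name contains_with_wildcards is hypothetical, and it assumes that every "." in the text stands for one unknown character while the pattern holds only literal characters. Each pattern character goes through re.escape so that regex metacharacters are taken literally.

```python
import re

def contains_with_wildcards(pattern: str, text: str) -> bool:
    """Return True if pattern occurs in text, where a '.' in text may stand for any character.

    Hypothetical helper for illustration; assumes pattern holds only literal characters.
    """
    # Each pattern character may be matched by itself or by a literal dot in the text.
    regex = "".join(f"(?:{re.escape(ch)}|\\.)" for ch in pattern)
    return re.search(regex, text) is not None

print(contains_with_wildcards("mas M", "Thom.s Mueller naive"))  # True
print(contains_with_wildcards("mas X", "Thom.s Mueller naive"))  # False
```

Unlike the one-dot-at-a-time alternation, this version also handles texts with several unknown characters.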

Related

Python conditional replace

I need a conditional replace in a string.
input_str = "a111a11b111b22"
condition: ("b" + any number + "b") becomes ("Z" + any number)
output_str = "a111a11Z11122"
Maybe I need to use [0] and [-1] to remove the "b"s and prepend "Z" to the number,
but I can't find a conditional replace for it.
You should use regular expressions. They are really useful:
import re
input_str = "a111a11b111b22"
output_str = re.sub(r'b(\d+)b', r'Z\1', input_str)
# output_str is "a111a11Z11122"
The regular expression r'b(\d+)b' matches the letter b, followed by one or more digits and another letter b. The parentheses capture the digits so that they can be reused (as \1) in the replacement, which consists of the letter Z followed by \1.
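To see the capture group at work on more than one match, here is a small sketch. The helper name replace_wrapped_digits and the second sample string are assumptions made for illustration only.

```python
import re

def replace_wrapped_digits(text: str) -> str:
    # (\d+) captures the digits between the two "b" characters;
    # \1 in the replacement re-inserts them after "Z".
    return re.sub(r"b(\d+)b", r"Z\1", text)

print(replace_wrapped_digits("a111a11b111b22"))  # a111a11Z11122
print(replace_wrapped_digits("b1b x b22b"))      # Z1 x Z22
```

A "b" that is not followed by digits and a closing "b" is left untouched.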
Try with regex:
import re
input_str = "a111a11b111b22"
output_str = re.sub(r'[b](\d)',r'Z\1',input_str)
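print(output_str)
# prints "a111a11Z111Z22": every "b" followed by a digit becomes "Z", so the closing "b" is converted rather than removed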

Replace subtext of a word

I want to replace this string
ramesh#gmail.com
to
rxxxxh#gxxxl.com
this is what I have done so far
print( re.sub(r'([A-Za-z](.*)[A-Za-z]#)','x', i))
One way to go is to use capturing groups and, in the replacement, return an "x" repeated once for every character in each group that should be masked.
For the second and the fourth group, use a negated character class ([^...]), which matches any character except the ones listed.
\b([A-Za-z])([^#\s]*)([A-Za-z]#[A-Za-z])([^#\s.]*)([A-Za-z])\b
For example
import re
i = "ramesh#gmail.com"
res = re.sub(
    r'\b([A-Za-z])([^#\s]*)([A-Za-z]#[A-Za-z])([^#\s.]*)([A-Za-z])\b',
    lambda x: x.group(1) + "x" * len(x.group(2)) + x.group(3) + "x" * len(x.group(4)) + x.group(5),
    i)
print(res)
Output
rxxxxh#gxxxl.com
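For comparison, the same masking can be sketched without a regular expression. This swaps in plain string slicing, which is not the technique used in the answer above. The names mask_middle and mask_address are hypothetical, and the sketch assumes a single "#" separator and a domain of the form name.tld.

```python
def mask_middle(word: str, fill: str = "x") -> str:
    # Keep the first and last character, replace everything in between.
    if len(word) <= 2:
        return word
    return word[0] + fill * (len(word) - 2) + word[-1]

def mask_address(address: str, sep: str = "#") -> str:
    # partition() returns empty strings for the parts that are missing,
    # so an address without the separator is simply masked as one word.
    local, found_sep, domain = address.partition(sep)
    name, dot, tld = domain.partition(".")
    return mask_middle(local) + found_sep + mask_middle(name) + dot + tld

print(mask_address("ramesh#gmail.com"))  # rxxxxh#gxxxl.com
```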

Insert string before first occurrence of character

So basically I have this string __int64 __fastcall(IOService *__hidden this);, and I need to insert a word in between __fastcall (this could be anything) and (IOService... such as __int64 __fastcall LmaoThisWorks(IOService *__hidden this);.
I've thought about splitting the string but this seems a bit overkill. I'm hoping there's a simpler and shorter way of doing this:
type_declaration_fun = GetType(fun_addr) # Sample: '__int64 __fastcall(IOService *__hidden this)'
if type_declaration_fun:
    print(type_declaration_fun)
    type_declaration_fun = type_declaration_fun.split(' ')
    first_bit = ''
    others = ''
    funky_list = type_declaration_fun[1].split('(')
    for x in range(0, (len(funky_list))):
        if x == 0:
            first_bit = funky_list[0]
        else:
            others = others + funky_list[x]
    type_declaration_fun = type_declaration_fun[0] + ' ' + funky_list[0] + ' ' + final_addr_name + others
    type_declaration_fun = type_declaration_fun + ";"
    print(type_declaration_fun)
The code is not only crap, but it doesn't quite work. Here's a sample output:
void *__fastcall(void *objToFree)
void *__fastcall IOFree_stub_IONetworkingFamilyvoid;
How could I make this work and cleaner?
Notice that there could be nested parentheses and other weird stuff, so you need to make sure that the name is added just before the first parenthesis.
You can use the method replace():
s = 'ABCDEF'
ins = '$'
before = 'DE'
new_s = s.replace(before, ins + before, 1)
print(new_s)
# ABC$DEF
Once you find the index of the character you need to insert before, you can use slicing to create your new string.
string = 'abcdefg'
string_to_insert = '123'
insert_before_char = 'c'
for i in range(len(string)):
    if string[i] == insert_before_char:
        string = string[:i] + string_to_insert + string[i:]
        break
What about this:
s = "__int64 __fastcall(IOService *__hidden this);"
head, tail = s.split("__fastcall", 1)
t = head + "__fastcall" + " anystring" + tail  # split() drops the separator, so add it back
I get:
__int64 __fastcall anystring(IOService *__hidden this);
I hope this is what you want. If not, please comment.
Use regex.
In [1]: import re
pattern = r'(?=\()'
string = '__int64 __fastcall(IOService *__hidden this);'
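re.sub(pattern, 'pizza', string, count=1)
Out[1]: '__int64 __fastcallpizza(IOService *__hidden this);'
The pattern is a positive lookahead that matches the empty position just before a (. Because re.sub replaces every match by default, count=1 restricts the insertion to the first (.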
x = 'high speed'
z = 'new text'
y = x.index('speed')
x = x[:y] + z + x[y:]
print(x)
# high new textspeed
This is a quick example. Note that the character at index y ends up immediately after the inserted text, with no separator added.
Also be aware that this rebinds x to the new string; assign the result to a new variable if you need to keep the original.
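The find-and-slice idea from the answers above can be sketched as one helper applied to the declaration from the question. The name insert_before_first is hypothetical, and returning the text unchanged when the marker is absent is an assumption about the desired behaviour.

```python
def insert_before_first(text: str, marker: str, insertion: str) -> str:
    # str.find returns the index of the first occurrence, or -1 if there is none.
    index = text.find(marker)
    if index == -1:
        return text  # assumption: leave the text unchanged when the marker is missing
    return text[:index] + insertion + text[index:]

declaration = "__int64 __fastcall(IOService *__hidden this);"
print(insert_before_first(declaration, "(", " LmaoThisWorks"))
# __int64 __fastcall LmaoThisWorks(IOService *__hidden this);
```

Because only the first "(" is located, nested or later parentheses in the declaration are not affected.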

Replace character only when character not in parentheses

I have a string like the following:
test_string = "test:(apple:orange,(orange:apple)):test2"
I want to replace ":" with "/" only if it is not contained within any set of parentheses.
The desired output is "test/(apple:orange,(orange:apple))/test2"
How can this be done in Python?
You can use the code below to achieve the expected output:
def solve(args):
    ans = ''
    seen = 0  # current nesting depth of parentheses
    for i in args:
        if i == '(':
            seen += 1
        elif i == ')':
            seen -= 1
        if i == ':' and seen <= 0:
            ans += '/'
        else:
            ans += i
    return ans
test_string = "test:(apple:orange,(orange:apple)):test2"
print(solve(test_string))
With regex module:
>>> import regex
>>> test_string = "test:(apple:orange,(orange:apple)):test2"
>>> regex.sub(r'\((?:[^()]++|(?0))++\)(*SKIP)(*F)|:', '/', test_string)
'test/(apple:orange,(orange:apple))/test2'
\((?:[^()]++|(?0))++\) matches a pair of parentheses recursively (see Recursive Regular Expressions for explanations).
(*SKIP)(*F) avoids replacing the preceding pattern (see Backtracking Control Verbs for explanations).
|: specifies : as the alternate match.
Find the first opening parentheses
Find the last closing parentheses
Replace every ":" with "/" before the first opening parentheses
Don't do anything to the middle part
Replace every ":" with "/" after the last closing parentheses
Put these 3 substrings together
Code:
test_string = "test:(apple:orange,(orange:apple)):test2"
first_opening = test_string.find('(')
last_closing = test_string.rfind(')')
result_string = test_string[:first_opening].replace(':', '/') + test_string[first_opening : last_closing] + test_string[last_closing:].replace(':', '/')
print(result_string)
Output:
test/(apple:orange,(orange:apple))/test2
Warning: as pointed out in the comments, this won't work if there are multiple distinct groups of parentheses.
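For inputs with several separate parenthesised groups, tracking the nesting depth character by character avoids that limitation. A minimal sketch, assuming balanced parentheses; the function name replace_outside_parens and the second test string are hypothetical.

```python
def replace_outside_parens(text: str, old: str = ":", new: str = "/") -> str:
    depth = 0  # how many parentheses are currently open
    out = []
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)  # ignore a stray closing parenthesis
        # Replace only at depth 0, i.e. outside every pair of parentheses.
        out.append(new if ch == old and depth == 0 else ch)
    return "".join(out)

print(replace_outside_parens("test:(apple:orange,(orange:apple)):test2"))
# test/(apple:orange,(orange:apple))/test2
print(replace_outside_parens("a:(b:c):d:(e:f):g"))
# a/(b:c)/d/(e:f)/g
```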

Find substring in string but only if whole words?

What is an elegant way to look for a string within another string in Python, but only if the substring is within whole words, not part of a word?
Perhaps an example will demonstrate what I mean:
string1 = "ADDLESHAW GODDARD"
string2 = "ADDLESHAW GODDARD LLP"
assert string_found(string1, string2) # this is True
string1 = "ADVANCE"
string2 = "ADVANCED BUSINESS EQUIPMENT LTD"
assert not string_found(string1, string2) # this should be False
How can I best write a function called string_found that will do what I need? I thought perhaps I could fudge it with something like this:
def string_found(string1, string2):
    if string2.find(string1 + " "):
        return True
    return False
But that doesn't feel very elegant, and also wouldn't match string1 if it was at the end of string2. Maybe I need a regex? (argh regex fear)
You can use regular expressions with the word-boundary special character \b. From the Python documentation:
Matches the empty string, but only at the beginning or end of a word. A word is defined as a sequence of alphanumeric or underscore characters, so the end of a word is indicated by whitespace or a non-alphanumeric, non-underscore character. Note that \b is defined as the boundary between \w and \W, so the precise set of characters deemed to be alphanumeric depends on the values of the UNICODE and LOCALE flags. Inside a character range, \b represents the backspace character, for compatibility with Python’s string literals.
import re
def string_found(string1, string2):
    if re.search(r"\b" + re.escape(string1) + r"\b", string2):
        return True
    return False
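One caveat: \b only marks a boundary between a word character and a non-word character, so it behaves as expected only when the search string itself starts and ends with a word character. The following sketch uses lookarounds instead, which is a variation and not the approach quoted above; the name string_found_strict and the "C++" sample are assumptions for illustration.

```python
import re

def string_found_strict(needle: str, haystack: str) -> bool:
    # (?<!\w) and (?!\w) require that no word character touches the match,
    # which also works when needle begins or ends with punctuation.
    pattern = r"(?<!\w)" + re.escape(needle) + r"(?!\w)"
    return re.search(pattern, haystack) is not None

print(string_found_strict("ADDLESHAW GODDARD", "ADDLESHAW GODDARD LLP"))      # True
print(string_found_strict("ADVANCE", "ADVANCED BUSINESS EQUIPMENT LTD"))      # False
print(string_found_strict("C++", "I like C++ a lot"))                         # True
# With \b the last case fails, because "+" followed by a space is not a word boundary.
print(re.search(r"\b" + re.escape("C++") + r"\b", "I like C++ a lot"))        # None
```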
If word boundaries are only whitespace for you, you could also get away with prepending and appending a space to each string:
def string_found(string1, string2):
    string1 = " " + string1.strip() + " "
    string2 = " " + string2.strip() + " "
    # find() returns -1 when there is no match (which is truthy), so compare explicitly
    return string2.find(string1) != -1
The simplest and most pythonic way, I believe, is to break the strings down into individual words and scan for a match:
string = "My Name Is Josh"
substring = "Name"
for word in string.split():
    if substring == word:
        print("Match Found")
For a bonus, here's a one-liner:
any(substring == word for word in string.split())
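Comparing against individual words only works when the substring is a single word, whereas the question's "ADDLESHAW GODDARD" example contains two. A possible extension, sketched here under the assumption that words are separated by whitespace, compares runs of consecutive words; the name words_found is hypothetical.

```python
def words_found(needle: str, haystack: str) -> bool:
    needle_words = needle.split()
    hay_words = haystack.split()
    n = len(needle_words)
    if n == 0:
        return False  # assumption: an empty search string never matches
    # Slide a window of n words over the haystack and compare each run of words.
    return any(hay_words[i:i + n] == needle_words
               for i in range(len(hay_words) - n + 1))

print(words_found("ADDLESHAW GODDARD", "ADDLESHAW GODDARD LLP"))      # True
print(words_found("ADVANCE", "ADVANCED BUSINESS EQUIPMENT LTD"))      # False
print(words_found("Name", "My Name Is Josh"))                         # True
```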
Here's a way to do it without a regex (as requested) assuming that you want any whitespace to serve as a word separator.
import string
def find_substring(needle, haystack):
    index = haystack.find(needle)
    if index == -1:
        return False
    if index != 0 and haystack[index-1] not in string.whitespace:
        return False
    L = index + len(needle)
    if L < len(haystack) and haystack[L] not in string.whitespace:
        return False
    return True
And here's some demo code (codepad is a great idea: Thanks to Felix Kling for reminding me)
I'm building off aaronasterling's answer.
The problem with the above code is that it will return false when there are multiple occurrences of needle in haystack, with the second occurrence satisfying the search criteria but not the first.
Here's my version:
import string
def find_substring(needle, haystack):
    search_start = 0
    while (search_start < len(haystack)):
        index = haystack.find(needle, search_start)
        if index == -1:
            return False
        is_prefix_whitespace = (index == 0 or haystack[index-1] in string.whitespace)
        search_start = index + len(needle)
        is_suffix_whitespace = (search_start == len(haystack) or haystack[search_start] in string.whitespace)
        if (is_prefix_whitespace and is_suffix_whitespace):
            return True
    return False
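The multiple-occurrence case can be checked with a compact, self-contained restatement of the same loop. This is a sketch, not the answer's exact code: the name find_whole_substring is hypothetical, the search advances by one character after a rejected occurrence (so overlapping occurrences are considered too), and a non-empty needle is assumed.

```python
import string

def find_whole_substring(needle: str, haystack: str) -> bool:
    start = haystack.find(needle)
    while start != -1:
        end = start + len(needle)
        # An occurrence counts only if it is bounded by whitespace or the string edges.
        left_ok = start == 0 or haystack[start - 1] in string.whitespace
        right_ok = end == len(haystack) or haystack[end] in string.whitespace
        if left_ok and right_ok:
            return True
        start = haystack.find(needle, start + 1)  # try the next occurrence
    return False

# The first occurrence is part of "ADVANCED"; the second one is a whole word.
print(find_whole_substring("ADVANCE", "ADVANCED ADVANCE"))   # True
print(find_whole_substring("ADVANCE", "ADVANCED BUSINESS"))  # False
```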
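One approach using the re (regular expression) module that should accomplish this task is:
import re
string1 = "pizza pony"
string2 = "who knows what a pizza pony is?"
# re.escape() keeps regex metacharacters in string1 literal; the raw string avoids an invalid "\W" escape
# \W needs a non-word character after the match, so a match at the very end of string2 is not found
search_result = re.search(r'\b' + re.escape(string1) + r'\W', string2)
print(search_result.group())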
Excuse me REGEX fellows, but the simpler answer is:
text = "this is the esquisidiest piece never ever writen"
word = "is"
" {0} ".format(text).lower().count(" {0} ".format(word).lower())
The trick here is to pad both the text and the word with a space on each side, so that only whole-word occurrences are counted and matches at the beginning or end of the text are not missed. The expression returns the number of occurrences, so compare it with 0 to get a boolean.
Thanks to Chris Larson's comment, I tested it and updated it as below:
import re
string1 = "massage"
string2 = "muscle massage gun"
try:
    re.search(r'\b' + string1 + r'\W', string2).group()
    print("Found word")
except AttributeError as ae:
    print("Not found")
def string_found(string1, string2):
    # True if the first occurrence of string1 in string2 is delimited by spaces or the string edges
    if string1 not in string2:
        return False
    start = string2.index(string1)
    end = start + len(string1)
    before_ok = start == 0 or string2[start - 1] == " "
    after_ok = end == len(string2) or string2[end] == " "
    return before_ok and after_ok
