Replace character only when character not in parentheses - python

I have a string like the following:
test_string = "test:(apple:orange,(orange:apple)):test2"
I want to replace ":" with "/" only if it is not contained within any set of parentheses.
The desired output is "test/(apple:orange,(orange:apple))/test2"
How can this be done in Python?

You can use the code below to get the expected output:
def solve(args):
    ans = ''
    seen = 0  # current parenthesis nesting depth
    for i in args:
        if i == '(':
            seen += 1
        elif i == ')':
            seen -= 1
        if i == ':' and seen <= 0:
            ans += '/'
        else:
            ans += i
    return ans
test_string = "test:(apple:orange,(orange:apple)):test2"
print(solve(test_string))

With regex module:
>>> import regex
>>> test_string = "test:(apple:orange,(orange:apple)):test2"
>>> regex.sub(r'\((?:[^()]++|(?0))++\)(*SKIP)(*F)|:', '/', test_string)
'test/(apple:orange,(orange:apple))/test2'
\((?:[^()]++|(?0))++\) matches a pair of parentheses recursively
See Recursive Regular Expressions for explanations
(*SKIP)(*F) to avoid replacing the preceding pattern
See Backtracking Control Verbs for explanations
|: to specify : as alternate match

Find the first opening parentheses
Find the last closing parentheses
Replace every ":" with "/" before the first opening parentheses
Don't do anything to the middle part
Replace every ":" with "/" after the last closing parentheses
Put these 3 substrings together
Code:
test_string = "test:(apple:orange,(orange:apple)):test2"
first_opening = test_string.find('(')
last_closing = test_string.rfind(')')
result_string = test_string[:first_opening].replace(':', '/') + test_string[first_opening : last_closing] + test_string[last_closing:].replace(':', '/')
print(result_string)
Output:
test/(apple:orange,(orange:apple))/test2
Warning: as the comments pointed out, this won't work if there are multiple distinct parenthesized groups :(
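To make the limitation in the warning concrete, here is a small sketch that runs the find/rfind approach next to a depth-counting loop on a string with two separate groups. The function names and the sample string are made up for illustration; the depth counter is the same idea as the first answer, not something new.

```python
def replace_outside_first_last(text):
    """find/rfind approach: protects everything between the first '(' and last ')'."""
    first_opening = text.find('(')
    last_closing = text.rfind(')')
    return (text[:first_opening].replace(':', '/')
            + text[first_opening:last_closing]
            + text[last_closing:].replace(':', '/'))


def replace_outside_depth(text):
    """Depth-counting approach: replaces ':' only when nesting depth is zero."""
    depth = 0
    out = []
    for ch in text:
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
        out.append('/' if ch == ':' and depth == 0 else ch)
    return ''.join(out)


sample = "a:(b:c):d:(e:f):g"  # hypothetical input with two distinct groups
print(replace_outside_first_last(sample))  # a/(b:c):d:(e:f)/g
print(replace_outside_depth(sample))       # a/(b:c)/d/(e:f)/g
```

The first result leaves the colons around d untouched because they sit between the first ( and the last ), even though they are not inside any parentheses.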

Related

How can I remove curly brackets as well as the text inside of it?

hello_world{552}.txt
{hix89}abcdefg{47181x00}.exe
How can I output these strings to look like "hello_world.txt,abcdefg.exe"
with double quotation and comma without spacing?
You can use a regex for this:
import re
output = []
test = ['hello_world{552}.txt', 'ggfj{hix89}abcdefg{47181x00}.exe']
for x in test:
    out = re.sub(r'\{\w*\}', '', x)  # drop each {...} group
    output.append(out)
print(output)
final_o = ",".join(output)
print(str(final_o))
Output:
hello_world.txt,ggfjabcdefg.exe
If I understood your problem correctly, you need to get rid of all the brackets and their contents, join the strings with commas, and surround the result with double quotes.
ls = ['hello_world{552}.txt', '{hix89}abcdefg{47181x00}.exe']
def delete_content_of_brackets(ls):
    result = ""
    bracket_is_open = False
    for element in ls:
        for character in element:
            if character == "}":
                bracket_is_open = False
            elif character == "{" or bracket_is_open:
                # skip the opening brace and everything until the closing one
                bracket_is_open = True
            else:
                result += character
        result += ","
    return "\"" + result[:-1]+"\""
print(delete_content_of_brackets(ls))
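For comparison, the same result can be sketched with re.sub and str.join. This is only an illustration; the variable names are made up, and the pattern assumes the braces are never nested.

```python
import re

names = ['hello_world{552}.txt', '{hix89}abcdefg{47181x00}.exe']

# Remove every {...} group; [^{}]* assumes braces are not nested.
cleaned = [re.sub(r'\{[^{}]*\}', '', name) for name in names]

# Join with commas (no spaces) and wrap the whole thing in double quotes.
result = '"' + ','.join(cleaned) + '"'
print(result)  # "hello_world.txt,abcdefg.exe"
```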

How can i remove only the last bracket from a string in python?

How can i remove only the last bracket from a string ?
For example,
INPUT 1:
"hell(h)o(world)"
i want this result:
"hell(h)o"
Input 2 :-
hel(lo(wor)ld)
i want :-
hel
as you can see the middle brackets remain intact only the last bracket got removed.
I tried :-
import re
string = 'hell(h)o(world)'
print(re.sub('[()]', '', string))
output :-
hellhoworld
i figured out a solution :-
i did it like this
string = 'hell(h)o(world)'
if (string[-1] == ")"):
    add=int(string.rfind('(', 0))
    print(string[:add])
output :-
hell(h)o
looking for other optimised solutions/suggestions..
Please see the below if this is useful, Let me know I will optimize further.
string = 'hell(h)o(world)'
count=0
r=''
for i in reversed(string):
    if count <2 and (i == ')' or i=='('):
        count+=1
        pass
    else:
        r+=i
for i in reversed(r):
    print(i, end='')
If you want to remove the last bracket from the string even if it's not at the end of the string, you can try something like this. This will only work if you know you have a substring beginning and ending with parentheses somewhere in the string, so you may want to implement some sort of check for that. You will also need to modify it if you are dealing with nested parentheses.
str = "hell(h)o(world)"
r_str = str[::-1] # creates reverse copy of string
for i in range(len(str)):
    if r_str[i] == ")":
        start = i
    elif r_str[i] == "(":
        end = i+1
        break
x = r_str[start:end][::-1] # substring that we want to remove
str = str.replace(x,'')
print(str)
output:
hell(h)o
If the string is not at the end:
str = "hell(h)o(world)blahblahblah"
output:
hell(h)oblahblahblah
Edit: Here is a modified version that handles nested parentheses. However, please keep in mind that this will not work if there are unbalanced parentheses in the string.
str = "hell(h)o(w(orld))"
r_str = str[::-1]
p_count = 0
for i in range(len(str)):
    if r_str[i] == ")":
        if p_count == 0:
            start = i
        p_count = p_count+1
    elif r_str[i] == "(":
        if p_count == 1:
            end = i+1
            break
        else:
            p_count = p_count - 1
x = r_str[start:end][::-1]
print("x:", x)
str = str.replace(x,'')
print(str)
output:
hell(h)o
Something like this?
string = 'hell(h)o(w(orl)d)23'
new_str = ''
escaped = 0
for char in reversed(string):
    if escaped is not None and char == ')':
        escaped += 1
    if not escaped:
        new_str = char + new_str
    if escaped is not None and char == '(':
        escaped -= 1
        if escaped == 0:
            escaped = None
print(new_str)
This starts skipping characters at the last ) and stops once that level is closed by its matching (.
So a nested () inside it does not affect the result.
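The same right-to-left idea can be sketched as a reusable function that works on indexes instead of rebuilding the string character by character. The name remove_last_group is made up for this example, and returning the input unchanged when the parentheses are unbalanced is an assumption, not something the question specifies.

```python
def remove_last_group(text):
    """Remove the last balanced (...) group, including anything nested in it."""
    close = text.rfind(')')
    if close == -1:
        return text  # nothing to remove
    depth = 0
    # Walk left from the last ')' until its matching '(' is found.
    for i in range(close, -1, -1):
        if text[i] == ')':
            depth += 1
        elif text[i] == '(':
            depth -= 1
            if depth == 0:
                return text[:i] + text[close + 1:]
    return text  # unbalanced: leave the input as it is (assumed behaviour)


for sample in ["hell(h)o(world)", "hel(lo(wor)ld)", "hell(h)o(w(orl)d)23"]:
    print(remove_last_group(sample))
```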
Using re.sub('[()]', '', string) will replace any parenthesis in the string with an empty string.
To match the last set of balanced parentheses, and if you can make use of the regex module from PyPI, you can use a recursive pattern that repeats the first capture group, and assert that there are no more occurrences of either ( or ) to the right:
(\((?:[^()\n]++|(?1))*\))(?=[^()\n]*$)
The pattern matches:
( Capture group 1
\( Match (
(?:[^()\n]++|(?1))* Repeat 0+ times: either match any chars except ( ) or a newline, or recurse into group 1 using (?1)
\) Match )
) Close group 1
(?=[^()\n]*$) Positive lookahead, assert till the end of the string no ( or ) or newline
See a regex demo and a Python demo.
For example
import regex
strings = [
    "hell(h)o(world)",
    "hel(lo(wor)ld)",
    "hell(h)o(world)blahblahblah"
]
pattern = r"(\((?:[^()]++|(?1))*\))(?=[^()]*$)"
for s in strings:
    print(regex.sub(pattern, "", s))
Output
hell(h)o
hel
hell(h)oblahblahblah

Insert string before first occurrence of character

So basically I have this string __int64 __fastcall(IOService *__hidden this);, and I need to insert a word in between __fastcall (this could be anything) and (IOService... such as __int64 __fastcall LmaoThisWorks(IOService *__hidden this);.
I've thought about splitting the string but this seems a bit overkill. I'm hoping there's a simpler and shorter way of doing this:
type_declaration_fun = GetType(fun_addr) # Sample: '__int64 __fastcall(IOService *__hidden this)'
if type_declaration_fun:
    print(type_declaration_fun)
    type_declaration_fun = type_declaration_fun.split(' ')
    first_bit = ''
    others = ''
    funky_list = type_declaration_fun[1].split('(')
    for x in range(0, (len(funky_list))):
        if x == 0:
            first_bit = funky_list[0]
        else:
            others = others + funky_list[x]
    type_declaration_fun = type_declaration_fun[0] + ' ' + funky_list[0] + ' ' + final_addr_name + others
    type_declaration_fun = type_declaration_fun + ";"
    print(type_declaration_fun)
The code is not only crap, but it doesn't quite work. Here's a sample output:
void *__fastcall(void *objToFree)
void *__fastcall IOFree_stub_IONetworkingFamilyvoid;
How could I make this work and cleaner?
Notice that there could be nested parentheses and other weird stuff, so you need to make sure that the name is added just before the first parenthesis.
You can use the method replace():
s = 'ABCDEF'
ins = '$'
before = 'DE'
new_s = s.replace(before, ins + before, 1)
print(new_s)
# ABC$DEF
Once you find the index of the character you need to insert before, you can use slicing to create your new string.
string = 'abcdefg'
string_to_insert = '123'
insert_before_char = 'c'
for i in range(len(string)):
    if string[i] == insert_before_char:
        string = string[:i] + string_to_insert + string[i:]
        break
What about this:
s = "__int64__fastcall(IOService *__hidden this);"
t = s.split("__fastcall",1)[0]+"anystring"+s.split("__fastcall",1)[1]
I get:
__int64__fastcallanystring(IOService *__hidden this);
I hope this is what you want. If not, please comment.
Use regex.
In [1]: import re
pattern = r'(?=\()'
string = '__int64 __fastcall(IOService *__hidden this);'
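re.sub(pattern, 'pizza', string, count=1)
Out[1]: '__int64 __fastcallpizza(IOService *__hidden this);'
The pattern is a positive lookahead that matches the empty position just before a (. Passing count=1 limits the substitution to the first occurrence; without it, the text would be inserted before every ( in the string.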
x='high speed'
z='new text'
y = x.index('speed')
x =x[:y] + z +x[y:]
print(x)
>>> high new textspeed
This is a quick example. Note that the text starting at index y ends up directly after the inserted string, with no separator.
Also be aware that this rebinds x to the new string; assign the result to a different name if you still need the original.
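Since the question asks specifically for the position just before the first parenthesis, str.partition is another option worth sketching. The function name and the choice to put a single space before the inserted name are assumptions made to match the sample output in the question.

```python
def insert_before_first_paren(declaration, name):
    """Insert `name` (preceded by a space) just before the first '('."""
    head, sep, tail = declaration.partition('(')
    if not sep:
        return declaration  # no '(' found: return the input unchanged
    return f"{head} {name}{sep}{tail}"


decl = '__int64 __fastcall(IOService *__hidden this);'
print(insert_before_first_paren(decl, 'LmaoThisWorks'))
# __int64 __fastcall LmaoThisWorks(IOService *__hidden this);
```

Because partition splits only at the first occurrence, any nested or later parentheses stay in the tail untouched.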

How to find the multiple instances of a data between two special characters in python

I am a beginner in Python so please excuse me if my question is too simple. I want to find the multiple instances of data between two special characters in a string and also count the number of instances. So far I have the following code.
import re
count=0
myString="abcde(fghi)defggdfsidf(ijkl)gfders(gkjh)hgstfvd"
startString = '('
endString = ')'
for item in myString:
    portString=myString[myString.find(startString)+len(startString):myString.find(endString)]
    print(portString)
    count=count+1
My desired output is
fghi
ijkl
gkjh
But my code always starts the search from the beginning and produces fghi every time. Can anyone tell me what the problem is?
You can use non-greedy regexes:
import re
count = 0
myString = "abcde(fghi)defggdfsidf(ijkl)gfders(gkjh)hgstfvd"
rx = re.compile(r'\((.*?)\)')  # non-greedy version inside parens
pos = 0
while True:
    m = rx.search(myString[pos:])  # search starting at pos (initially 0)
    if m is None:
        break
    count += 1
    print(m.group(1))
    pos += m.end()  # next search will start past last ')'
The solution above only makes sense if the parentheses are correctly balanced, or if you want each match to run from an opening parenthesis to the next closing one.
If you want to select only parenthesized text that itself contains no opening or closing parentheses, you have to say so in the regex:
count = 0
myString = "abcde(fghi)defg(gdfsidf(ijkl)g(fders(gkjh)hgstfvd"
rx = re.compile(r'\(([^()]*)\)')
pos = 0
while True:
    m = rx.search(myString[pos:])  # search starting at pos (initially 0)
    if m is None:
        break
    count += 1
    print(m.group(1))
    pos += m.end()  # next search will start past last ')'
As an alternative to regex, if you'd prefer to keep the loop, note that str.find() takes an optional parameter telling it where to start looking. Just keep track of where the closing parenthesis is and start again from just after it.
Unfortunately it's not quite that simple, as the loop condition has to change too, so that it stops after the last set of parentheses.
Something like this should do the trick:
count = 0
myString = "abcde(fghi)defggdfsidf(ijkl)gfders(gkjh)hgstfvd"
startString = '('
endString = ')'
endStringIndex = -1  # so that the first search starts at index 0
while True:
    startStringIndex = myString.find(startString, endStringIndex+1)
    endStringIndex = myString.find(endString, endStringIndex+1)
    if (startStringIndex == -1):
        break
    portString = myString[startStringIndex+len(startString):endStringIndex]
    print(portString)
    count += 1
Output:
fghi
ijkl
gkjh
You can use re.findall:
>>> myString = "abcde(fghi)defggdfsidf(ijkl)gfders(gkjh)hgstfvd"
>>> matches = re.findall(r'\((\w+)\)', myString)
>>> count = len(matches)
>>> print('\n'.join(matches))
fghi
ijkl
gkjh
>>> print(count)
3
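If the position of each match is also needed, re.finditer gives access to the match objects. The sketch below is an illustration with made-up variable names; it uses [^()]* instead of \w+ so that content other than word characters is matched too, which is an assumption about what the data may contain.

```python
import re

my_string = "abcde(fghi)defggdfsidf(ijkl)gfders(gkjh)hgstfvd"

# Collect (text, start index of the text) for every (...) group.
found = [(m.group(1), m.start(1))
         for m in re.finditer(r'\(([^()]*)\)', my_string)]

for text, start in found:
    print(start, text)
print(len(found))  # number of instances
```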

How do I remove spaces in a string using a loop in python?

def onlyLetters(s):
    for i in range(len(s)):
        if s[i] == " ":
            s = s[:i] + s[i+1:]
            return s
    return s
Why is my above loop not working? It seems like it's only doing it once.
For example, if i have the string "Hello how are you", it's returning "Hellohow are you". I want it to check the string again and remove another space, and keep doing it until there are no spaces left. How do I fix this code?
Your code is stopping after the first space is replaced because you've told it to. You have return s inside the loop, and when that is reached, the rest of the loop is abandoned since the function exits. You should remove that line entirely.
There's another issue though, related to how you're indexing. When you iterate on range(len(s)) for your indexes, you're going to go to the length of the original string. If you've removed some spaces, however, those last few indexes will no longer be valid (since the modified string is shorter). A similar issue will come up if there are two spaces in a row (as in "foo  bar"). Your code will only be able to remove the first one. After the first space is removed, the second space will move up and be at the same index, but the loop will move on to the next index without seeing it.
You can fix this in two different ways. The easiest fix is to loop over the indexes in reverse order. Removing a space towards the end won't change the indexes of the earlier spaces, and the numerically smallest indexes will always be valid even as the string shrinks.
def onlyLetters(s):
    for i in range(len(s)-1, -1, -1): # loop in reverse order
        if s[i] == " ":
            s = s[:i] + s[i+1:]
    return s
The other approach is to abandon the for loop for the indexes and use a while loop while manually updating the index variable:
def onlyLetters(s):
    i = 0
    while i < len(s):
        if s[i] == " ":
            s = s[:i] + s[i+1:]
        else:
            i += 1
    return s
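A quick way to check both fixes is to compare them with str.replace on a few awkward inputs, including consecutive and trailing spaces. This is just a sketch; the two function names are made up so that both variants can live in one file.

```python
def remove_spaces_reverse(s):
    # Iterate from the last index down so deletions never shift unseen indexes.
    for i in range(len(s) - 1, -1, -1):
        if s[i] == " ":
            s = s[:i] + s[i + 1:]
    return s


def remove_spaces_while(s):
    # Only advance the index when nothing was removed at this position.
    i = 0
    while i < len(s):
        if s[i] == " ":
            s = s[:i] + s[i + 1:]
        else:
            i += 1
    return s


samples = ["Hello how are you", "foo  bar", "  lead and trail  ", ""]
for sample in samples:
    expected = sample.replace(" ", "")
    assert remove_spaces_reverse(sample) == expected
    assert remove_spaces_while(sample) == expected
print("both variants agree with str.replace")
```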
If you want to remove all spaces, use str.replace():
sentence = ' Freeman was here'
sentence.replace(" ", "")
>>> 'Freemanwashere'
If you want to remove leading and ending spaces, use str.strip():
sentence = ' Freeman was here'
sentence.strip()
>>> 'Freeman was here'
If you want to collapse duplicated spaces into one, use str.split() together with str.join():
sentence = ' Freeman was here'
" ".join(sentence.split())
>>> 'Freeman was here'
If you also want to remove all the other strange whitespace characters that exist in Unicode, you can use re.sub with the re.UNICODE argument (in Python 3 this is already the default for str patterns):
text = re.sub(r"\s+", "", text, flags=re.UNICODE)
Or something like this, which handles any whitespace characters that you're not thinking of 😉:
>>> import re
>>> re.sub(r'\s+', '', 'Freeman was here')
'Freemanwashere'
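If you don't want to use anything like replace() or join(), you can do this (it turns punctuation into spaces and then collapses runs of spaces into one):
def filter(input):
    for i in input:
        yield " " if i in " ,.?!;:" else i
def expand(input):
    for i in input:
        # spaces share the key None; every other character gets a unique key
        yield None if i == " " else object(), i
def uniq(input):
    last = object()
    for key, i in input:
        if key == last:
            continue
        last = key  # remember the previous key so repeated spaces are skipped
        yield key, i
def compact(input):
    for key, i in input:
        yield i
yourText = compact(uniq(expand(filter(input()))))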
just use python replace() method for Strings.
s.replace(" ","")
It's a little hard to tell, because the indentation in your example is hard to understand. But it looks to me like after you encounter and remove the very first space, you are returning s. So you only get a chance to remove one space before returning.
Try only returning s once you are out of the for loop.
You can do s.replace(" ", ""). This will take all your spaces (" ") and replace it by nothing (""), effectively removing the spaces.
You're returning from the function after the first occurrence of a space, which means it exits the loop and the function altogether. That's why it only works for the first space.
Try this,
def onlyLetters(s):
    length = len(s)
    i = 0
    while i < length:
        if s[i] == " ":
            s = s[:i] + s[i+1:]
            length = len(s)  # the string got shorter
        else:
            i += 1
    return s
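If you want to keep your own loop, note that deleting characters while iterating over range(len(s)) raises an IndexError once the string has shrunk; building a new string avoids that:
def onlyLetters(s):
    result = ""
    for ch in s:
        if ch != " ":
            result += ch
    return result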
Or, better, you may use:
def onlyLetters(s):
    return s.replace(" ","")
def remove_spaces(text):
    text_with_out_spaces = ''
    while text != '':
        next_character = text[0]
        if next_character != ' ':
            text_with_out_spaces = text_with_out_spaces + next_character
        text = text[1:]
    return text_with_out_spaces
print(remove_spaces("Hi Azzam !! ?"))
