How to add space before a particular string in python - python

I have the following output after removing all spaces from a string
テレビを付けて
テレビつけて
つけて
テレビをオンにして
However, I'm trying to add a space either after the を character, or before つけて after テレビ
The desired output is as follows
テレビを 付けて
テレビ つけて
つけて
テレビを オンにして
I've tried to use some def function but not sure how to finish it off, or even if it'll work.
def teform(txt):
if x = "オンにして":
return " して"
elif y = "つけて":
return " つけて"
elif z = "付けて":
return " 付けて"
else:
return # ...(not sure what goes here)

You cannot use = to compare items, you must use ==.
Apart from the wrong comparing syntax, it seems you don't really know how to approach this – you don't need a loop of any kind. Just use txt.replace to target and change those specific substrings:
def teform(txt):
# add a space after を
txt = txt.replace ('を','を ')
# add a space between テレビ and つけて
txt = txt.replace ('テレビつけて', 'テレビ つけて')
return txt

Related

Python - Possibly Regex - How to replace part of a filepath with another filepath based on a match?

I'm new to Python and relatively new to programming. I'm trying to replace part of a file path with a different file path. If possible, I'd like to avoid regex as I don't know it. If not, I understand.
I want an item in the Python list [] before the word PROGRAM to be replaced with the 'replaceWith' variable.
How would you go about doing this?
Current Python List []
item1ToReplace1 = \\server\drive\BusinessFolder\PROGRAM\New\new.vb
item1ToReplace2 = \\server\drive\BusinessFolder\PROGRAM\old\old.vb
Variable to replace part of the Python list path
replaceWith = 'C:\ProgramFiles\Microsoft\PROGRAM'
Desired results for Python List []:
item1ToReplace1 = C:\ProgramFiles\Micosoft\PROGRAM\New\new.vb
item1ToReplace2 = C:\ProgramFiles\Micosoft\PROGRAM\old\old.vb
Thank you for your help.
The following code does what you ask, note I updated your '' to '\', you probably need to account for the backslash in your code since it is used as an escape character in python.
import os
item1ToReplace1 = '\\server\\drive\\BusinessFolder\\PROGRAM\\New\\new.vb'
item1ToReplace2 = '\\server\\drive\\BusinessFolder\\PROGRAM\\old\\old.vb'
replaceWith = 'C:\ProgramFiles\Microsoft\PROGRAM'
keyword = "PROGRAM\\"
def replacer(rp, s, kw):
ss = s.split(kw,1)
if (len(ss) > 1):
tail = ss[1]
return os.path.join(rp, tail)
else:
return ""
print(replacer(replaceWith, item1ToReplace1, keyword))
print(replacer(replaceWith, item1ToReplace2, keyword))
The code splits on your keyword and puts that on the back of the string you want.
If your keyword is not in the string, your result will be an empty string.
Result:
C:\ProgramFiles\Microsoft\PROGRAM\New\new.vb
C:\ProgramFiles\Microsoft\PROGRAM\old\old.vb
One way would be:
item_ls = item1ToReplace1.split("\\")
idx = item_ls.index("PROGRAM")
result = ["C:", "ProgramFiles", "Micosoft"] + item_ls[idx:]
result = "\\".join(result)
Resulting in:
>>> item1ToReplace1 = r"\\server\drive\BusinessFolder\PROGRAM\New\new.vb"
... # the above
>>> result
'C:\ProgramFiles\Micosoft\PROGRAM\New\new.vb'
Note the use of r"..." in order to avoid needing to have to 'escape the escape characters' of your input (i.e. the \). Also that the join/split requires you to escape these characters with a double backslash.

Generating multiple strings by replacing wildcards

So i have the following strings:
"xxxxxxx#FUS#xxxxxxxx#ACS#xxxxx"
"xxxxx#3#xxxxxx#FUS#xxxxx"
And i want to generate the following strings from this pattern (i'll use the second example):
Considering #FUS# will represent 2.
"xxxxx0xxxxxx0xxxxx"
"xxxxx0xxxxxx1xxxxx"
"xxxxx0xxxxxx2xxxxx"
"xxxxx1xxxxxx0xxxxx"
"xxxxx1xxxxxx1xxxxx"
"xxxxx1xxxxxx2xxxxx"
"xxxxx2xxxxxx0xxxxx"
"xxxxx2xxxxxx1xxxxx"
"xxxxx2xxxxxx2xxxxx"
"xxxxx3xxxxxx0xxxxx"
"xxxxx3xxxxxx1xxxxx"
"xxxxx3xxxxxx2xxxxx"
Basically if i'm given a string as above, i want to generate multiple strings by replacing the wildcards that can be #FUS#, #WHATEVER# or with a number #20# and generating multiple strings with the ranges that those wildcards represent.
I've managed to get a regex to find the wildcards.
wildcardRegex = f"(#FUS#|#WHATEVER#|#([0-9]|[1-9][0-9]|[1-9][0-9][0-9])#)"
Which finds correctly the target wildcards.
For 1 wildcard present, it's easy.
re.sub()
For more it gets complicated. Or maybe it was a long day...
But i think my algorithm logic is failing hard because i'm failing to write some code that will basically generate the signals. I think i need some kind of recursive function that will be called for each number of wildcards present (up to maybe 4 can be present (xxxxx#2#xxx#2#xx#FUS#xx#2#x)).
I need a list of resulting signals.
Is there any easy way to do this that I'm completely missing?
Thanks.
import re
stringV1 = "xxx#FUS#xxxxi#3#xxx#5#xx"
stringV2 = "XXXXXXXXXX#FUS#XXXXXXXXXX#3#xxxxxx#5#xxxx"
regex = "(#FUS#|#DSP#|#([0-9]|[1-9][0-9]|[1-9][0-9][0-9])#)"
WILDCARD_FUS = "#FUS#"
RANGE_FUS = 3
def getSignalsFromWildcards(app, can):
sigList = list()
if WILDCARD_FUS in app:
for i in range(RANGE_FUS):
outAppSig = app.replace(WILDCARD_FUS, str(i), 1)
outCanSig = can.replace(WILDCARD_FUS, str(i), 1)
if "#" in outAppSig:
newSigList = getSignalsFromWildcards(outAppSig, outCanSig)
sigList += newSigList
else:
sigList.append((outAppSig, outCanSig))
elif len(re.findall("(#([0-9]|[1-9][0-9]|[1-9][0-9][0-9])#)", stringV1)) > 0:
wildcard = re.search("(#([0-9]|[1-9][0-9]|[1-9][0-9][0-9])#)", app).group()
tarRange = int(wildcard.strip("#"))
for i in range(tarRange):
outAppSig = app.replace(wildcard, str(i), 1)
outCanSig = can.replace(wildcard, str(i), 1)
if "#" in outAppSig:
newSigList = getSignalsFromWildcards(outAppSig, outCanSig)
sigList += newSigList
else:
sigList.append((outAppSig, outCanSig))
return sigList
if "#" in stringV1:
resultList = getSignalsFromWildcards(stringV1, stringV2)
for item in resultList:
print(item)
results in
('xxx0xxxxi0xxxxx', 'XXXXXXXXXX0XXXXXXXXXX0xxxxxxxxxx')
('xxx0xxxxi1xxxxx', 'XXXXXXXXXX0XXXXXXXXXX1xxxxxxxxxx')
('xxx0xxxxi2xxxxx', 'XXXXXXXXXX0XXXXXXXXXX2xxxxxxxxxx')
('xxx1xxxxi0xxxxx', 'XXXXXXXXXX1XXXXXXXXXX0xxxxxxxxxx')
('xxx1xxxxi1xxxxx', 'XXXXXXXXXX1XXXXXXXXXX1xxxxxxxxxx')
('xxx1xxxxi2xxxxx', 'XXXXXXXXXX1XXXXXXXXXX2xxxxxxxxxx')
('xxx2xxxxi0xxxxx', 'XXXXXXXXXX2XXXXXXXXXX0xxxxxxxxxx')
('xxx2xxxxi1xxxxx', 'XXXXXXXXXX2XXXXXXXXXX1xxxxxxxxxx')
('xxx2xxxxi2xxxxx', 'XXXXXXXXXX2XXXXXXXXXX2xxxxxxxxxx')
long day after-all...

Python writing in two lines instead of one

def end_aluguer(self, nickname):
a = self.obj_aluguer(nickname)
x = "{},{},{},{},{},{}\r".format(int(a.time), a.nickname, a.viatura, a.preco, a.decorrido(), a.update())
y = open("historico.txt", 'a')
y.write(x)
y.close()
Hello, so i have this function and everything is working fine except when it writes in historico.txt it splits into two lines:
1607741371,tiagovski,tartaruga,0.80
,21,0.0
How can i make it to write in only one line?
Thanks in advance for your time
It appears your data hasn't been sanitized fully yet - it hasn't been cleaned.
One of your fields, in this case, a.preco has newline(s) already in it -- hence, it is being written out to the file, resulting in a broken output.
Generally this:
a.preco.strip()
will suffice. But do note that this will also remove any leading and trailing whitespace (which you often want to do anyway).
Otherwise, you can specifically remove just the newline character(s) from the end using rstrip, like this:
a.preco.rstrip('\n')
and this will retain the leading and trailing whitespace.
Also, you may want to do the same with your other fields, depending on where your data is coming from and whether you can rely on it being in a suitable format.
Try this code
def end_aluguer(self, nickname):
a = self.obj_aluguer(nickname)
x = "{},{},{},{},{},{}\r".format(int(a.time), a.nickname, a.viatura, a.preco, a.decorrido(), a.update())
x = ' '.join(x.split('\n')) # remove newline and join into one string
y = open("historico.txt", 'a')
y.write(x)
y.close()

Remove everything but #number in brackets

I have a file where the lines have the form #nr = name(#nr, (#nr), different vars, and names).
I would like to only have the #nr in the brackets to get the form #nr = name(#nr, #nr)
I have tried to solve this in different ways like using regex, startswith() and lists but nothing has worked so far.
Any help is much appreciated.
Edit: Code
for line in f.split():
start = line.find( '(' )
end = line.find( ')' )
if start != -1 and end != -1:
line = ''.join(i for i in x if not i.startswith('#'))
print(line)
Edit 2:
As example I have:
#304= IFCRELDEFINESBYPROPERTIES('0FZ0hKNanFNAQpJ_Iqh4zM',#42,$,$,(#142),#301);
Afterwards I want to have:
#304= IFCRELDEFINESBYPROPERTIES(#42,#142,#301);
This can be solved using regex, though trying to do it with a single find/replace would be more complicated. Instead, you can do it in two steps:
import re
def sub_func(match):
nums = re.findall(r'#\d+', match.group(2))
return match.group(1) + '(' + ','.join(nums) + ');'
text = "#304= IFCRELDEFINESBYPROPERTIES('0FZ0hKNanFNAQpJ_Iqh4zM',#42,$,$,(#142),#301);"
result = re.sub(r'(^[^(]+)\((.*)\);', sub_func, text)
print(result)
# '#304= IFCRELDEFINESBYPROPERTIES(#42,#142,#301);'
So instead of passing a string as the second argument for re.sub, we pass a function instead, where we can process the results of the match with some more regex and reformatting the results before passing it back.

Python: Split between two characters

Let's say I have a ton of HTML with no newlines. I want to get each element into a list.
input = "<head><title>Example Title</title></head>"
a_list = ["<head>", "<title>Example Title</title>", "</head>"]
Something like such. Splitting between each ><.
But in Python, I don't know of a way to do that. I can only split at that string, which removes it from the output. I want to keep it, and split between the two equality operators.
How can this be done?
Edit: Preferably, this would be done without adding the characters back in to the ends of each list item.
# initial input
a = "<head><title>Example Title</title></head>"
# split list
b = a.split('><')
# remove extra character from first and last elements
# because the split only removes >< pairs.
b[0] = b[0][1:]
b[-1] = b[-1][:-1]
# initialize new list
a_list = []
# fill new list with formatted elements
for i in range(len(b)):
a_list.append('<{}>'.format(b[i]))
This will output the given list in python 2.7.2, but it should work in python 3 as well.
You can try this:
import re
a = "<head><title>Example Title</title></head>"
data = re.split("><", a)
new_data = [data[0]+">"]+["<" + i+">" for i in data[1:-1]] + ["<"+data[-1]]
Output:
['<head>', '<title>Example Title</title>', '</head>']
The shortest approach using re.findall() function on extended example:
# extended html string
s = "<head><title>Example Title</title></head><body>hello, <b>Python</b></body>"
result = re.findall(r'(<[^>]+>[^<>]+</[^>]+>|<[^>]+>)', s)
print(result)
The output:
['<head>', '<title>Example Title</title>', '</head>', '<body>', '<b>Python</b>', '</body>']
Based on the answers by other people, I made this.
It isn't as clean as I had wanted, but it seems to work. I had originally wanted to not re-add the characters after split.
Here, I got rid of one extra argument by combining the two characters into a string. Anyways,
def split_between(string, chars):
if len(chars) is not 2: raise IndexError("Argument chars must contain two characters.")
result_list = [chars[1] + line + chars[0] for line in string.split(chars)]
result_list[0] = result_list[0][1:]
result_list[-1] = result_list[-1][:-1]
return result_list
Credit goes to #cforemanand #Ajax1234.
Or even simpler, this:
input = "<head><title>Example Title</title></head>"
print(['<'+elem if elem[0]!='<' else elem for elem in [elem+'>' if elem[-1]!='>' else elem for elem in input.split('><') ]])

Categories

Resources