Remove until specific char in string (Python) - python

I want to remove something from the start and end of a string before writing to the .txt
I'm reading an incoming string from a serial port. I want to write the string to a .txt file, which I can do. I've tried using the rstrip() (also tried strip()) function to remove the 'OK' in the end, but with no luck.
Ideally, I want the program to be dynamic so I can use it for other files. This gives a problem, because the unwanted text in the start and end of the string might vary, so I can't look for specific chars/words to remove.
That said, all unwanted text at the start of the string begins with a '+', as in the example below. (Ideally, the program would check whether the first line starts with a '+' and remove it if it does.)
def write2file():
    print "Listing local files ready for copying:"
    listFiles()
    print 'Enter name of file to copy:'
    name = raw_input()
    pastedFile = []
    tempStr = readAll('AT+URDFILE="' + name + '"') #Calls another function who reads the file locally and return it in a string
    tempStr = tempStr.rstrip('OK') #This is where I try to remove the 'OK' I know is going to be in the end of the string
    pastedFile[:0] = tempStr #Putting the string into a list. Workaround for writing 128 bytes at a time. I know it's not pretty, but it works :-)
    print 'Enter path to file directory'
    path = raw_input()
    myFile = open(join(path, name),"w")
    while len(pastedFile):
        tempcheck = pastedFile[0:128]
        for val in tempcheck:
            myFile.write(val)
        del pastedFile[0:128]
    myFile.close()
I expect the .txt to include all the text from the local file, but remove the OK in the end. When program is run it returns:
+URDFILE: "develop111.txt",606,"** ((content of local file)) OK
The 'OK' I wanted to be removed is still in there.
The text "+URDFILE: "develop111.txt",606," is also unwanted in the final .txt file.
To summarize the problem:
How can I remove the unwanted text in the start and end of a string, before writing it to a .txt file

I assume that your URDFILE response always follows the same pattern, +URDFILE: "filename",filesize,"filedata"\nOK, since it is an AT command. So it should be enough to use ','.join(tempStr.split(',')[2:])[:-3]
Working example:
>>> s = '+URDFILE: "filname",filesize,"filedata, more data"\nOK'
>>> ','.join(s.split(',')[2:])[:-3]
'"filedata, more data"'
or to remove with quotes:
>>> ','.join(s.split(',')[2:])[1:-4]
'filedata, more data'
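If the response really does follow that pattern, the same idea can be wrapped in a small helper. This is only a sketch under that assumption: the name extract_urdfile_payload and the regular expression are made up for illustration and are not part of any AT-command library. It assumes the data field is wrapped in double quotes and followed by OK, and it returns the input unchanged when the text does not look like a +URDFILE response.

```python
import re

# Assumed response shape: +URDFILE: "<name>",<size>,"<data>" followed by OK.
# DOTALL lets the data field span several lines.
URDFILE_RE = re.compile(r'^\+URDFILE: "([^"]*)",(\d+),"(.*)"\s*OK\s*$', re.DOTALL)


def extract_urdfile_payload(response):
    """Return only the data field of a +URDFILE response (hypothetical helper)."""
    match = URDFILE_RE.match(response.strip())
    if match is None:
        # Not in the assumed format: leave the text untouched.
        return response
    return match.group(3)


print(extract_urdfile_payload('+URDFILE: "develop111.txt",606,"filedata, more data"\r\nOK\r\n'))
```

Because the pattern is anchored on the closing quote and the final OK, commas or quotes inside the data do not confuse it the way a fixed split index could.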

Can you try the following:
tempStr = '+URDFILE: "develop111.txt",606,"** ((content of local file)) OK'
tempStr = tempStr.strip()
if tempStr.startswith('+'):
    tempStr = tempStr[1:]
if tempStr.endswith('OK'):
    tempStr = tempStr[:-2]
print(tempStr)
Output:
URDFILE: "develop111.txt",606,"** ((content of local file))
If you want to select the required text then you can use regex for that. Can you try the following:
import re
tempStr = 'URDFILE: "develop111.txt",606,"** 01((content of local file)) OK'
tempStr = tempStr.strip()
if tempStr.startswith('+'):
    tempStr = tempStr[1:]
if tempStr.endswith('OK'):
    tempStr = tempStr[:-2]
# print(tempStr)
new_str = ''.join(re.findall(r'01(.+)', tempStr))
new_str = new_str.strip()
print(new_str)
Output:
((content of local file))
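A note on why rstrip('OK') may have appeared to do nothing in the question: rstrip treats its argument as a set of characters, not as a suffix, and it stops at the first trailing character that is not in that set. If the serial response ends with a line break after OK (a guess, since the raw data is not shown), nothing is removed. The helper name remove_trailing_ok below is hypothetical; it is a minimal sketch of the strip-then-endswith approach used above.

```python
def remove_trailing_ok(text):
    """Drop one trailing 'OK' status token and the whitespace around it."""
    trimmed = text.rstrip()  # remove '\r', '\n' and spaces that follow OK
    if trimmed.endswith('OK'):
        trimmed = trimmed[:-len('OK')].rstrip()
    return trimmed


# rstrip('OK') leaves this string alone because it ends with '\n', not 'O' or 'K'.
print(repr('data OK\r\n'.rstrip('OK')))
# rstrip('OK') strips every trailing 'O' and 'K', which can remove too much.
print(repr('BOOK'.rstrip('OK')))
print(repr(remove_trailing_ok('data OK\r\n')))
```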

Related

Python - Possibly Regex - How to replace part of a filepath with another filepath based on a match?

I'm new to Python and relatively new to programming. I'm trying to replace part of a file path with a different file path. If possible, I'd like to avoid regex as I don't know it. If not, I understand.
For each item in the Python list, I want the part of the path before the word PROGRAM to be replaced with the 'replaceWith' variable.
How would you go about doing this?
Current Python List []
item1ToReplace1 = \\server\drive\BusinessFolder\PROGRAM\New\new.vb
item1ToReplace2 = \\server\drive\BusinessFolder\PROGRAM\old\old.vb
Variable to replace part of the Python list path
replaceWith = 'C:\ProgramFiles\Microsoft\PROGRAM'
Desired results for Python List []:
item1ToReplace1 = C:\ProgramFiles\Microsoft\PROGRAM\New\new.vb
item1ToReplace2 = C:\ProgramFiles\Microsoft\PROGRAM\old\old.vb
Thank you for your help.
The following code does what you ask. Note that I changed your '\' to '\\'; you will probably need to account for the backslash in your own code, since it is used as the escape character in Python.
import os
item1ToReplace1 = '\\server\\drive\\BusinessFolder\\PROGRAM\\New\\new.vb'
item1ToReplace2 = '\\server\\drive\\BusinessFolder\\PROGRAM\\old\\old.vb'
replaceWith = r'C:\ProgramFiles\Microsoft\PROGRAM'  # raw string keeps the backslashes literal
keyword = "PROGRAM\\"
def replacer(rp, s, kw):
    # Split once on the keyword and keep whatever follows it.
    ss = s.split(kw,1)
    if (len(ss) > 1):
        tail = ss[1]
        return os.path.join(rp, tail)
    else:
        return ""
print(replacer(replaceWith, item1ToReplace1, keyword))
print(replacer(replaceWith, item1ToReplace2, keyword))
The code splits each path on your keyword and appends the part after the keyword to the replacement path.
If your keyword is not in the string, your result will be an empty string.
Result:
C:\ProgramFiles\Microsoft\PROGRAM\New\new.vb
C:\ProgramFiles\Microsoft\PROGRAM\old\old.vb
One way would be:
item_ls = item1ToReplace1.split("\\")
idx = item_ls.index("PROGRAM")
result = ["C:", "ProgramFiles", "Microsoft"] + item_ls[idx:]
result = "\\".join(result)
Resulting in:
>>> item1ToReplace1 = r"\\server\drive\BusinessFolder\PROGRAM\New\new.vb"
... # the above
>>> result
'C:\\ProgramFiles\\Microsoft\\PROGRAM\\New\\new.vb'
Note the use of r"..." so that the backslashes in the input do not need to be escaped. The separator passed to split and join is written as "\\" because, in a regular string literal, a single backslash must itself be escaped.
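Since the question mentions a list, here is a minimal sketch that applies the same idea to every item without regex. The function name replace_prefix is hypothetical, and it assumes the keyword PROGRAM appears at most once before the part you want to keep. Unlike the first answer, it returns the path unchanged when the keyword is missing, and it uses plain string concatenation so the result does not depend on the operating system's path separator.

```python
def replace_prefix(path, keyword, replacement):
    """Replace everything up to and including keyword with replacement."""
    head, sep, tail = path.partition(keyword)
    if not sep:
        # Keyword not found: keep the original path.
        return path
    return replacement + tail


paths = [
    r'\\server\drive\BusinessFolder\PROGRAM\New\new.vb',
    r'\\server\drive\BusinessFolder\PROGRAM\old\old.vb',
]
replace_with = r'C:\ProgramFiles\Microsoft\PROGRAM'

updated = [replace_prefix(p, 'PROGRAM', replace_with) for p in paths]
for p in updated:
    print(p)
```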

modified textfile python script

I am totally new to the Python world and am looking for suggestions about my problem. I have three text files: the original file, a file containing updates for the original, and a new output file that receives the result, so that the original is not modified. file1.txt looks like
$ego_vel=x
$ped_vel=2
$mu=3
$ego_start_s=4
$ped_start_x=5
file2.txt looks like
$ego_vel=5
$ped_vel=5
$mu=6
$to_decel=5
outputfile.txt should be like
$ego_vel=5
$ped_vel=5
$mu=6
$ego_start_s=4
$ped_start_x=5
$to_decel=5
the code I tried till now is given below:
import sys
import os
def update_testrun(filename1: str, filename2: str, filename3: str):
    testrun_path = os.path.join(sys.argv[1] + "\\" + filename1)
    list_of_testrun = []
    with open(testrun_path, "r") as reader1:
        for line in reader1.readlines():
            list_of_testrun.append(line)
    # print(list_of_testrun)
    design_path = os.path.join(sys.argv[3] + "\\" + filename2)
    list_of_design = []
    with open(design_path, "r") as reader2:
        for line in reader2.readlines():
            list_of_design.append(line)
    print(list_of_design)
    for i, x in enumerate(list_of_testrun):
        for test in list_of_design:
            if x[:9] == test[:9]:
                list_of_testrun[i] = test
                # list_of_updated_testrun=list_of_testrun
                break
    updated_testrun_path = os.path.join(sys.argv[5] + "\\" + filename3)
def main():
    update_testrun(sys.argv[2], sys.argv[4], sys.argv[6])
if __name__ == "__main__":
    main()
with this code I am able to get output like this
$ego_vel=5
$ped_vel=5
$mu=3
$ego_start_s=4
$ped_start_x=5
$to_decel=5
All the values come out correctly except the $mu value.
Can anyone point out where I am going wrong, and is it possible to share a Python script for my task?
Looks like your problem comes from the if statement:
if x[:9] == test[:9]:
Here you're comparing the first 9 characters of each line. For the other names this is fine, because the slice does not reach past the '=' character, but the $mu line is shorter, so the slice includes the value and you end up evaluating:
if '$mu=3' == '$mu=6'
This obviously evaluates to false so the mu value is not updated.
You could shorten the slice to if x[:4] == test[:4]: as a quick fix, but then $ego_vel and $ego_start_s would also compare equal, since both start with '$ego'. A more robust method is the .split() string method, which lets you split a string around a specific character, in your case '='. For example:
if x.split('=')[0] == test.split('=')[0]:
Would evaluate as:
if '$mu' == '$mu':
Which is True, and it works for the other lines too, regardless of how long the name before the '=' sign is.
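Since the question also asks for a script, here is a minimal sketch of the whole merge built on the same split('=') idea. The function names parse_assignments and merge_assignments are made up for illustration, the lines are given inline instead of being read from the three files, and the ordering relies on dictionaries preserving insertion order (Python 3.7 and later).

```python
def parse_assignments(lines):
    """Map the name before '=' to its full line, skipping blank lines."""
    entries = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        key = line.split('=', 1)[0]
        entries[key] = line
    return entries


def merge_assignments(original_lines, update_lines):
    """Override matching names and append names that only exist in the updates."""
    merged = parse_assignments(original_lines)
    merged.update(parse_assignments(update_lines))
    return list(merged.values())


original = ['$ego_vel=x', '$ped_vel=2', '$mu=3', '$ego_start_s=4', '$ped_start_x=5']
updates = ['$ego_vel=5', '$ped_vel=5', '$mu=6', '$to_decel=5']

for line in merge_assignments(original, updates):
    print(line)
```

To use it with real files, the two lists could be replaced by the lines read from file1.txt and file2.txt, and the merged lines written to the output file with a newline after each one.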

Using Python to remove the date from a series of filenames, when every filename doesn't contain a date?

I'm parsing a directory of files for databasing purposes. We have files with an almost standardized naming convention: almost all are named name_of_file_yyyymmdd.py or this_name_is_longer_yyyymmdd.py. I want to get back name_of_file (name_of_file.py is OK as well). I wrote a loop (this part works), and in the loop I have
module_name = file[:-12]
This would work on the files that match the above style, but there are a few files in the directory which I can't rename with names like noformat.py or who_needs_a_date_suffix.py and as my loop progresses these names get blown up and return things like "" and who_needs_a_da respectively.
All my searches have only returned info assuming all filenames are of a like type.
Since it's been requested here's the loop, that's leading to
module_name = file[:-12]
for k in range(0, len(df_temp.index)):
    file = df_temp.at[k, 'Python_Module_Name']
    print 'Now processing filename ', str((k+1)), ' of ', str(total), ', ', file
    headers = []
    with open(join(location_dir_input, file), 'r') as module:
        for line in module:
            headers.append(line.rstrip('\n'))
    module_name = file[:-12]
    df_temp.at[k, 'Python_Module_Name'] = module_name
return df_temp
You can also do
module_name = re.sub(r'_\d{8}\.py$', '', file)
if you don't want to check the format of the dates.
Update: making the date optional
module_name = re.sub(r'(_\d{8})*\.py$', '', file)
You can remove the ".py" from your result and then use the following code to check that the string you have is in the proper date format:
import datetime
date_string_bad_format = '12252018'
date_string_good_format = '20181225'
date_format = "%Y%m%d"
# Try both sample strings; strptime raises ValueError when the format does not match.
for date_string in (date_string_good_format, date_string_bad_format):
    try:
        datetime.datetime.strptime(date_string, date_format)
        print("This is the correct date string format.")
    except ValueError:
        print("This is the incorrect date string format. It should be YYYYMMDD")
That way you can return True when it is a date file and False when it is not, and decide what you want to do with the file.
I think you could solve your problem by using regular expressions. First you would check if the given file name matches a certain regular expression, and then process the name accordingly based on whether or not it does. A function to check if the filename matches a pattern would look somewhat like this:
import re
def ends_with_date(filename: str) -> bool:
    return re.match(r'.*_\d{8}\.py$', filename) is not None
If the result is True, then you could process the filename using the method you have already described. If not, then you could do something else.
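Putting the two regex suggestions together, a minimal sketch could look like the following. The name module_name_from_filename is hypothetical, and the pattern uses an optional group, (...)?, so a single trailing _yyyymmdd is removed when present and only the .py extension is removed otherwise. It does not check that the eight digits form a real calendar date.

```python
import re

# An optional "_" plus eight digits, followed by ".py" at the end of the name.
DATE_SUFFIX_RE = re.compile(r'(_\d{8})?\.py$')


def module_name_from_filename(filename):
    """Strip the .py extension and, if present, a trailing _yyyymmdd."""
    return DATE_SUFFIX_RE.sub('', filename)


for name in ['name_of_file_20181225.py', 'noformat.py', 'who_needs_a_date_suffix.py']:
    print(module_name_from_filename(name))
```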

How to extract a string from another string without changing the characters' case

There are two variables.
A variable drive was assigned a drive path (string).
A variable filepath was assigned a complete path to a file (string).
drive = '/VOLUMES/TranSFER'
filepath = '/Volumes/transfer/Some Documents/The Doc.txt'
First I need to find out whether the string stored in the drive variable is contained in the string stored in the filepath variable.
If it is, I need to remove the drive string from the filepath string without changing the character case of either variable (no converting to lowercase or uppercase; the case should stay the same).
So a final result should be:
result = '/Some Documents/The Doc.txt'
I could get it done with:
if drive.lower() in filepath.lower(): result = filepath.lower().split( drive.lower() )
But approach like this messes up the letter case (everything is now lowercase)
Please advise, thanks in advance!
EDITED LATER:
I could use my own approach. It appears the IF portion of the statement
if drive.lower() in filepath.lower():
is case-sensitive, and drive in filepath will return False if the case doesn't match.
So it makes sense to lower()-case everything while comparing. But the .split() method seems to split fine regardless of letter case:
if drive.lower() in filepath.lower(): result = filepath.split( drive )
if filepath.lower().startswith(drive.lower() + '/'):
    result = filepath[len(drive):]  # keep the leading '/' to match the desired result
Using str.find:
>>> drive = '/VOLUMES/TranSFER'
>>> filepath = '/Volumes/transfer/Some Documents/The Doc.txt'
>>> i = filepath.lower().find(drive.lower())
>>> if i >= 0:
... result = filepath[:i] + filepath[i+len(drive):]
...
>>> result
'/Some Documents/The Doc.txt'
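One caution about the edit in the question: str.split is case-sensitive, so filepath.split(drive) does not split when the case differs. A minimal sketch of a prefix-only variant follows; the name remove_drive_prefix is hypothetical, and it assumes the drive is expected at the very start of the path. The comparison is done on lowercased copies, while the slice is taken from the original string so its case is preserved.

```python
def remove_drive_prefix(filepath, drive):
    """Return filepath without a leading drive, compared case-insensitively."""
    if filepath.lower().startswith(drive.lower()):
        return filepath[len(drive):]
    # Drive is not a prefix of the path.
    return None


drive = '/VOLUMES/TranSFER'
filepath = '/Volumes/transfer/Some Documents/The Doc.txt'

print(remove_drive_prefix(filepath, drive))
# split compares case-sensitively, so the path comes back in one piece here.
print(filepath.split(drive))
```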

python, string.replace() and \n

(Edit: the script seems to work for others here trying to help. Is it because I'm running python 2.7? I'm really at a loss...)
I have a raw text file of a book I am trying to tag with pages.
Say the text file is:
some words on this line,
1
DOCUMENT TITLE some more words here too.
2
DOCUMENT TITLE and finally still more words.
I am trying to use python to modify the example text to read:
some words on this line,
</pg>
<pg n=2>some more words here too,
</pg>
<pg n=3>and finally still more words.
My strategy is to load the text file as a string. Build search-for and a replace-with strings corresponding to a list of numbers. Replace all instances in string, and write to a new file.
Here is the code I've written:
from sys import argv
script, input, output = argv
textin = open(input,'r')
bookstring = textin.read()
textin.close()
pages = []
x = 1
while x<400:
    pages.append(x)
    x = x + 1
pagedel = "DOCUMENT TITLE"
for i in pages:
    pgdel = "%d\n%s" % (i, pagedel)
    nplus = i + 1
    htmlpg = "</p>\n<p n=%d>" % nplus
    bookstring = bookstring.replace(pgdel, htmlpg)
textout = open(output, 'w')
textout.write(bookstring)
textout.close()
print "Updates to %s printed to %s" % (input, output)
The script runs without error, but it also makes no changes whatsoever to the input text. It simply reprints it character for character.
Does my mistake have to do with the hard return? \n? Any help greatly appreciated.
In Python, strings are immutable, so replace returns the modified string instead of changing the string in place.
You must do:
bookstring = bookstring.replace(pgdel, htmlpg)
You also forgot to call the function close(). See how you have textin.close? You have to call it with parentheses, like open:
textin.close()
Your code works for me, but I might just add some more tips:
input is a built-in function, so perhaps try renaming that variable. Although it works normally, it might not for you.
When running the script, don't forget to put the .txt ending:
$ python myscript.py file1.txt file2.txt
Make sure when testing your script to clear the contents of file2.
I hope these help!
Here's an entirely different approach that uses the re module:
import re
doctitle = False
newstr = ''
page = 1
for line in bookstring.splitlines():
    res = re.match('^\\d+', line)
    if doctitle:
        # Line right after a page number: open the next page and drop the title.
        newstr += '<pg n=' + str(page) + '>' + re.sub('^DOCUMENT TITLE ', '', line)
        doctitle = False
    elif res:
        # Line starting with a page number: close the current page.
        doctitle = True
        page += 1
        newstr += '\n</pg>\n'
    else:
        newstr += line
print newstr
Since no one knows what's going on, it's worth a try.
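One more guess, offered only as a possibility since the input file is not shown: if the text file has Windows line endings, each page number is followed by '\r\n' rather than '\n', and the search string "1\nDOCUMENT TITLE" never matches. The sketch below (Python 3, with the hypothetical name tag_pages) uses a single re.sub that tolerates an optional '\r' and only matches numbers that start a line, so page 1 is not confused with the end of page 11.

```python
import re

# A page number alone at the start of a line, a line break (LF or CRLF),
# then the running title and the space that follows it.
PAGE_BREAK_RE = re.compile(r'^(\d+)\r?\nDOCUMENT TITLE ?', re.MULTILINE)


def tag_pages(text):
    """Replace each page number plus title with closing and opening page tags."""
    def replacement(match):
        next_page = int(match.group(1)) + 1
        return '</pg>\n<pg n=%d>' % next_page
    return PAGE_BREAK_RE.sub(replacement, text)


sample = ('some words on this line,\n'
          '1\n'
          'DOCUMENT TITLE some more words here too.\n'
          '2\n'
          'DOCUMENT TITLE and finally still more words.')
print(tag_pages(sample))
```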
