modified textfile python script - python

I am totally new to the Python world and am looking for suggestions about my problem. I have three text files: the original text file, a text file with updates for the original, and a new text file that should receive the result without modifying the original. file1.txt looks like
$ego_vel=x
$ped_vel=2
$mu=3
$ego_start_s=4
$ped_start_x=5
file2.txt like
$ego_vel=5
$ped_vel=5
$mu=6
$to_decel=5
outputfile.txt should be like
$ego_vel=5
$ped_vel=5
$mu=6
$ego_start_s=4
$ped_start_x=5
$to_decel=5
the code I tried till now is given below:
import sys
import os
def update_testrun(filename1: str, filename2: str, filename3: str):
    testrun_path = os.path.join(sys.argv[1] + "\\" + filename1)
    list_of_testrun = []
    with open(testrun_path, "r") as reader1:
        for line in reader1.readlines():
            list_of_testrun.append(line)
    # print(list_of_testrun)
    design_path = os.path.join(sys.argv[3] + "\\" + filename2)
    list_of_design = []
    with open(design_path, "r") as reader2:
        for line in reader2.readlines():
            list_of_design.append(line)
    print(list_of_design)
    for i, x in enumerate(list_of_testrun):
        for test in list_of_design:
            if x[:9] == test[:9]:
                list_of_testrun[i] = test
                # list_of_updated_testrun=list_of_testrun
                break
    updated_testrun_path = os.path.join(sys.argv[5] + "\\" + filename3)
def main():
    update_testrun(sys.argv[2], sys.argv[4], sys.argv[6])
if __name__ == "__main__":
    main()
with this code I am able to get output like this
$ego_vel=5
$ped_vel=5
$mu=3
$ego_start_s=4
$ped_start_x=5
$to_decel=5
I get all the values correctly except the $mu value.
Can anyone point out where I am going wrong, and is it possible to share a Python script for my task?

Looks like your problem comes from the if statement:
if x[:9] == test[:9]:
Here you're comparing the first 9 characters of each string. For the other keys this is fine because the comparison never reaches past the '=' character, but $mu=3 is shorter than 9 characters, so the whole line (including its trailing newline) is compared:
if '$mu=3\n' == '$mu=6\n'
This evaluates to False, so the mu value is not updated.
Shortening the slice to x[:4] == test[:4] would fix $mu but break other keys, because $ego_vel and $ego_start_s (and likewise $ped_vel and $ped_start_x) share their first four characters. A more robust method is the .split() string method, which lets you split a string around a specific character, in your case '='. For example:
if x.split('=')[0] == test.split('=')[0]:
Would evaluate as:
if '$mu' == '$mu':
Which is True, and it works for the other keys too, regardless of how long the name before the '=' sign is.
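The split-based comparison can be taken one step further by reading file2.txt into a dictionary. The following is only a sketch of that idea, not the asker's script: the function name merge_settings is made up for illustration, the files are represented as lists of lines, and it assumes every setting has the form key=value.

```python
def merge_settings(original_lines, update_lines):
    """Return the original lines with updated values; keys only in the update are appended."""
    # Collect the updates as key -> value (hypothetical helper, assumes key=value lines)
    updates = {}
    for line in update_lines:
        line = line.strip()
        if "=" in line:
            key, value = line.split("=", 1)
            updates[key] = value

    merged = []
    seen = set()
    for line in original_lines:
        line = line.strip()
        if "=" not in line:
            merged.append(line)  # keep lines that are not settings unchanged
            continue
        key, value = line.split("=", 1)
        seen.add(key)
        merged.append(f"{key}={updates.get(key, value)}")

    # Dictionaries keep insertion order, so new keys appear in file2 order
    for key, value in updates.items():
        if key not in seen:
            merged.append(f"{key}={value}")
    return merged


file1 = ["$ego_vel=x", "$ped_vel=2", "$mu=3", "$ego_start_s=4", "$ped_start_x=5"]
file2 = ["$ego_vel=5", "$ped_vel=5", "$mu=6", "$to_decel=5"]
result = merge_settings(file1, file2)
print("\n".join(result))
```

Because the keys are compared whole, the length of the name before '=' no longer matters, and $to_decel is appended because it only exists in file2.txt.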

Related

extracting the data of a specified function in a python file | adding comments to highlight what is removed

I want to extract the code written under a specified function. I am trying to do it like this:
With an example file TestFile.py containing the following function sub():
def sub(self,num1,num2):
    # Subtract two numbers
    answer = num1 - num2
    # Print the answer
    print('Difference = ',answer)
If I run get_func_data.py:
def giveFunctionData(data, function):
    dataRequired = []
    for i in range(0, len(data)):
        if data[i].__contains__(str(function)):
            startIndex = i
            for p in range(startIndex + 1, len(data)):
                dataRequired.append(data[p])
                if data[p].startswith('\n' + 'def'):
                    dataRequired.remove(dataRequired[len(dataRequired) - 1])
                    break
    print(dataRequired)
    return dataRequired
data = []
f = open("TestFile.py", "r")
for everyLine in f:
    if not(everyLine.startswith('#') or everyLine.startswith('\n' + '#')):
        data.append(everyLine)
giveFunctionData(data,'sub') # Extract content in sub() function
I expect to obtain the following result:
answer = num1 - num2
print('Difference = ',answer)
But here I also get the comments written inside the function. Instead of a list, is there a way to get the code as it is written in the file?
Returning a string from your function giveFunctionData()
In your function giveFunctionData you're instantiating the variable dataRequired as a list and returning it after filling it, so of course you're getting a list back.
You'd have to join the list back into a string. One way could be this:
# Join the list into a string (each line read from the file already ends with '\n')
function_content = ''
for line in dataRequired:
    function_content += line
# function_content now contains your desired string
The reason you're still getting comment lines
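Iterating over a file object created via open() yields the lines of the file one at a time, each ending with \n rather than starting with it. As a result, no line ever starts with '\n#', so .startswith('\n' + '#') never matches. The comments inside the function are presumably indented as well, so .startswith('#') does not match them either.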
General comments
There is no need to specify the newline and # character separately like you did in .startswith('\n' + '#'). '\n#' would have been fine.
If you intend for the file to be run as a script, you really should put your code to be run in an if __name__ == "__main__": conditional. See What does if __name__ == "__main__": do?
It might be cleaner to move the reading of the file object into your giveFunctionData() function. It also eliminates having to iterate over it multiple times.
Putting it all together
Note that this script drops every line that contains a '#', so a line with a trailing comment (e.g. some = statement # With comments) is removed entirely rather than comment-stripped.
def giveFunctionData(data, function):
    function_content = ''
    # Tells us whether to append lines to the `function_content` string
    record_content = False
    for line in data:
        if not record_content:
            # Once we find a match, we start recording lines
            if function in line:
                record_content = True
        else:
            # We keep recording until we encounter another function
            if line.startswith('def'):
                break
            elif line.isspace():
                continue
            elif '#' not in line:
                # Add line to `function_content` string
                function_content += line
    return function_content
if __name__ == "__main__":
    # The with statement closes the file once we are done reading it
    with open("TestFile.py") as script:
        output = giveFunctionData(script, 'sub')
    print(output)
I have written code which can do your task. I don't think you need two separate processing parts, one function and one piece of code to fetch the data.
Instead, you can create a single function which accepts 2 arguments, i.e. the file name and the function name. The function should return the code you want.
I have created the function getFunctionCode(filename,funcname). The code is working well.
def getFunctionCode(filename, funcname):
    data = []
    with open(filename) as fp:
        line = fp.readlines()
        startIndex = 0 #From where to start reading body part
        endIndex = 0 #till what line because file may have mult
        for i in range(len(line)): #Finding Starting index
            if(line[i].__contains__(funcname)):
                startIndex = i+1
                break
        for i in range(startIndex,len(line)):
            if(line[i].__contains__('def')): #Find end in case - multiple function
                endIndex = i-1
                break
        else:
            endIndex = len(line)
        for i in range(startIndex,endIndex):
            if(line[i] != None):
                temp = "{}".format(line[i].strip())[0]
                if(temp != '\n' and temp != '#'):
                    data.append(line[i][:-1])
    return(data)
First, I read the file provided in the first argument.
Then I find the index where the function is located. The function name is provided in the second argument.
Starting from that index, I strip each line and check its first character to detect comments (#) and new lines (\n).
Finally, the lines without these are appended.
Here is the file TestFile.py:
def sub(self,num1,num2):
    # Subtract two numbers
    answer = num1 - num2
    # Print the answer
    print('Difference = ',answer)

def add(self,num1,num2):
    # addition of two numbers
    answer = num1 + num2
    # Print the answer
    print('Summation = ',answer)

def mul(self,num1,num2):
    # Product of two numbers
    answer = num1 * num2
    # Print the answer
    print('Product = ',answer)
Execution of function :
getFunctionCode('TestFile.py','sub')
[' answer = num1 - num2', " print('Difference = ',answer)"]
getFunctionCode('TestFile.py','add')
[' answer = num1 + num2', " print('Summation = ',answer)"]
getFunctionCode('TestFile.py','mul')
[' answer = num1 * num2', " print('Product = ',answer)"]
Solution by MoltenMuffins is easier as well.
Your implementation of this function would fail badly if you have multiple functions inside your TestFile.py file and you intend to retrieve the source code of only specific functions from it. It would also fail if you have some variables defined between two function definitions in TestFile.py.
A simpler and more robust way to extract the source code of a function from TestFile.py would be to use the inspect.getsource() method as follows:
# Import necessary packages
import os
import sys
import inspect
import importlib
# This function takes as input your Python .py file and the name of the function in that file for which code is needed
def giveFunctionData(file_path, function_name):
    folder_path = os.path.dirname(os.path.abspath(file_path))
    # Make the folder containing the .py file importable
    sys.path.insert(0, folder_path)
    head, tail = os.path.split(file_path)
    module_name = tail.split('.')[0]
    # Import the module and look up the function by name (no exec/eval on strings needed)
    module = importlib.import_module(module_name)
    function = getattr(module, function_name)
    # Extract the function code with comments
    function_code_with_comments = inspect.getsource(function)
    # Now, filter out the comments from the function code
    function_code_without_comments = ''
    for line in function_code_with_comments.splitlines():
        currentstr = line.lstrip()
        if not currentstr.startswith("#"):
            function_code_without_comments = function_code_without_comments + line + '\n'
    return function_code_without_comments
# Specify the absolute path of your Python file from which function code needs to be extracted
file_path = "Path_To_Testfile.py"
# Specify the name of the function for which code is needed
function_name = "sub"
# Print the function code without comments by calling giveFunctionData(file_path, function_name)
print(giveFunctionData(file_path, function_name))
Because this method relies on Python's own import machinery and inspect instead of parsing the .py file as a string, it works regardless of how the .py file containing the function is formatted. Note that importing the file executes its top-level code.
Cheers!
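For comparison, here is a sketch of a different technique that does not import the file at all: it parses the source with the standard-library ast module and slices out the function body by line number. The function name function_body_without_comments and the sample source are made up for illustration, and it assumes Python 3.8 or later (for end_lineno) and that comments sit on their own lines.

```python
import ast


def function_body_without_comments(source, function_name):
    """Return the body of `function_name` from `source`, minus comment-only and blank lines."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == function_name:
            lines = source.splitlines()
            # lineno and end_lineno are 1-based and inclusive
            body_lines = lines[node.body[0].lineno - 1:node.end_lineno]
            kept = [
                line for line in body_lines
                if line.strip() and not line.strip().startswith("#")
            ]
            return "\n".join(kept)
    raise ValueError("function not found: " + function_name)


# Hypothetical stand-in for the contents of TestFile.py
sample_source = '''def sub(self,num1,num2):
    # Subtract two numbers
    answer = num1 - num2
    # Print the answer
    print('Difference = ',answer)

def add(self,num1,num2):
    # addition of two numbers
    answer = num1 + num2
'''

body = function_body_without_comments(sample_source, "sub")
print(body)
```

Since nothing is imported, no code from the file runs, which may matter when the file has side effects at import time. Trailing comments on code lines are left in place by this sketch.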

Remove until specific char in string (Python)

I want to remove something from the start and end of a string before writing to the .txt
I'm reading an incoming string from a serial port. I want to write the string to a .txt file, which I can do. I've tried using the rstrip() (also tried strip()) function to remove the 'OK' in the end, but with no luck.
Ideally, I want the program to be dynamic so I can use it for other files. This gives a problem, because the unwanted text in the start and end of the string might vary, so I can't look for specific chars/words to remove.
That said, all unwanted text at the start of the string will begin with a '+', like in the example below. (It might be possible to check whether the first line starts with a '+' and remove it if it does. This would be ideal.)
def write2file():
    print "Listing local files ready for copying:"
    listFiles()
    print 'Enter name of file to copy:'
    name = raw_input()
    pastedFile = []
    tempStr = readAll('AT+URDFILE="' + name + '"') #Calls another function who reads the file locally and return it in a string
    tempStr = tempStr.rstrip('OK') #This is where I try to remove the 'OK' I know is going to be in the end of the string
    pastedFile[:0] = tempStr #Putting the string into a list. Workaround for writing 128 bytes at a time. I know it's not pretty, but it works :-)
    print 'Enter path to file directory'
    path = raw_input()
    myFile = open(join(path, name),"w")
    while len(pastedFile):
        tempcheck = pastedFile[0:128]
        for val in tempcheck:
            myFile.write(val)
        del pastedFile[0:128]
    myFile.close()
I expect the .txt to include all the text from the local file, but remove the OK in the end. When program is run it returns:
+URDFILE: "develop111.txt",606,"** ((content of local file)) OK
The 'OK' I wanted to be removed is still in there.
The text "+URDFILE: "develop111.txt",606," is also unwanted in the final .txt file.
To summarize the problem:
How can I remove the unwanted text at the start and end of a string before writing it to a .txt file?
I assume that your URDFILE response always has the same return pattern +URDFILE: "filname",filesize,"filedata"\nOK, as it is an AT command. So it should be enough to use ','.join(tempStr.split(',')[2:])[:-3]
Working example:
>>> s = '+URDFILE: "filname",filesize,"filedata, more data"\nOK'
>>> ','.join(s.split(',')[2:])[:-3]
'"filedata, more data"'
or to remove with quotes:
>>> ','.join(s.split(',')[2:])[1:-4]
'filedata, more data'
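A likely reason rstrip('OK') appeared to do nothing is that rstrip removes a set of trailing characters, and if the response ends with a line break after OK, the last character is not 'O' or 'K', so nothing is stripped. The sketch below is one way to pull the pieces apart with a regular expression instead. It assumes the response layout described above; the function name extract_file_data and the sample reply are made up for illustration.

```python
import re

# Assumed layout: +URDFILE: "<name>",<size>,"<data>" followed by OK
URDFILE_PATTERN = re.compile(r'^\+URDFILE: "([^"]*)",(\d+),"(.*)"\s*OK\s*$', re.DOTALL)


def extract_file_data(response):
    """Split an assumed +URDFILE reply into file name, size and data."""
    match = URDFILE_PATTERN.match(response.strip())
    if match is None:
        raise ValueError("response does not look like a +URDFILE reply")
    name, size, data = match.groups()
    return name, int(size), data


# Made-up reply for illustration; a real reply may differ
reply = '+URDFILE: "develop111.txt",19,"filedata, more data"\r\nOK\r\n'
name, size, data = extract_file_data(reply)
print(name, size, data)

# rstrip only looks at the very end of the string, which here is '\n'
print(repr('data\r\nOK\r\n'.rstrip('OK')))
```

The size field could also be used to check that the extracted data has the expected length before writing it to the file.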
Can you try the following:
tempStr = '+URDFILE: "develop111.txt",606,"** ((content of local file)) OK'
tempStr = tempStr.strip()
if tempStr.startswith('+'):
    tempStr = tempStr[1:]
if tempStr.endswith('OK'):
    tempStr = tempStr[:-2]
print(tempStr)
Output:
URDFILE: "develop111.txt",606,"** ((content of local file))
If you want to select only the required text, you can use a regex for that. The example below assumes the content is preceded by a known marker (here the characters 01, added to the sample string for illustration). Can you try the following:
import re
tempStr = 'URDFILE: "develop111.txt",606,"** 01((content of local file)) OK'
tempStr = tempStr.strip()
if tempStr.startswith('+'):
    tempStr = tempStr[1:]
if tempStr.endswith('OK'):
    tempStr = tempStr[:-2]
# print(tempStr)
new_str = ''.join(re.findall(r'01(.+)', tempStr))
new_str = new_str.strip()
print(new_str)
Output:
((content of local file))

About dict.fromkeys, key from filename, values inside file, using Regex

I'm learning Python, and I'm working on a project that consists of transferring numbers from PDF files into an xlsx file, placing them in their corresponding columns, with rows determined by the row heading.
My idea is to convert the PDF files to txt and build a dictionary from the txt files, whose keys are part of the file name (because it contains part of the row header) and whose values are the numbers I need.
I have already managed to convert the files to txt; now I'm working on the script that builds the dictionary. At the moment it looks like this:
import os
import re
p = re.compile(r'\w+\f+')
'''
I'm not entirely sure at the moment how the .compile of regular expressions works, but I know I'm missing something to indicate that what I want is immediately to the right, I'm also not sure if the keywords will be ignored, I just want take out the numbers
'''
m = p.match('Theese are the keywords' or 'That are immediately to the left' or 'The numbers I want')
def IsinDict(txtDir):
    ToData = ()
    if txtDir == "": txtDir = os.getcwd() + "\\"
    for txt in os.listdir(txtDir):
        ToKey = txt[9:21]
        if ToKey == (r"\w+"):
            Data = open(txt, "r")
            for string in Data:
                ToData += m.group()
    Diccionary = dict.fromkeys(ToKey, ToData)
    return Diccionary
txtDir = "Absolute/Path/OfTheText/Files"
IsinDict(txtDir)
Any contribution is welcome, thanks for your attention.
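Two details in the script above may be worth knowing: dict.fromkeys treats a string as an iterable of characters and gives every key the same value, and a compiled pattern has to be applied to the text of each file (for example with findall) rather than matched once up front. The following is only a sketch of the described idea under several assumptions: the function name numbers_by_file, the number pattern, the demo file name and its contents are all made up, and the slice 9:21 is taken from the question.

```python
import os
import re
import tempfile

# Assumed number format: optional minus sign, digits, optional decimal part
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def numbers_by_file(txt_dir, key_slice=slice(9, 21)):
    """Map part of each .txt file name to the list of numbers found in that file."""
    result = {}
    for filename in sorted(os.listdir(txt_dir)):
        if not filename.endswith(".txt"):
            continue
        key = filename[key_slice]
        # Join the directory and file name so the file is found from any working directory
        with open(os.path.join(txt_dir, filename)) as handle:
            result[key] = NUMBER.findall(handle.read())
    return result


# Small demo with a made-up file; characters 9..16 of the name form the key
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "report01_HEADER-A.txt"), "w") as handle:
        handle.write("Total: 12.5\nCount 3\n")
    result = numbers_by_file(demo_dir, key_slice=slice(9, 17))

print(result)
print(dict.fromkeys("ab", 1))  # one key per character, all with the same value
```

The numbers are returned as strings here; converting them with float() would be a natural next step before writing them to the spreadsheet.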

UDF (User Defined Function) python gives different answer in pig

I want to write a Python UDF for Pig that reads lines from a file like this:
#'prefix.csv'
spol.
LLC
Oy
OOD
and matches them against names; if a match is found, it should be replaced with white space. Here is my Python code:
def list_files2(name, f):
    fin = open(f, 'r')
    for line in fin:
        final = name
        extra = 'nothing'
        if (name != name.replace(line.strip(), ' ')):
            extra = line.strip()
            final = name.replace(line.strip(), ' ').strip()
            return final, extra,'insdie if'
    return final, extra, 'inside for'
Running this code in python,
>print list_files2('LLC nakisa', 'prefix.csv' )
>print list_files2('AG company', 'prefix.csv' )
returns
('nakisa', 'LLC', 'insdie if')
('AG company', 'nothing', 'inside for')
which is exactly what I need. But when I register this code as a UDF in apache pig for this sample list:
nakisa company LLC
three Oy
AG Lans
Test OOD
pig returns wrong answer on the third line:
((nakisa company,LLC,insdie if))
((three,Oy,insdie if))
((A G L a n s,,insdie if))
((Test,OOD,insdie if))
The question is why the UDF enters the if block for the third entry, which does not have any match in the prefix.csv file.
I don't know pig but the way you are checking for a match is strange and might be the cause of your problem.
If you want to check whether a string is a substring of another, Python provides the find method on strings:
if name.find(line.strip()) != -1:
    # find will return the first index of the substring or -1 if it was not found
    # ... do some stuff
Additionally, your code might leave the file handle open. A much better approach to file operations is the with statement. It ensures that the file handle gets closed in any case (except for interpreter crashes).
with open(filename, "r") as file_:
    # Everything within this block can use the opened file.
Last but not least, python provides a module called csv with a reader and a writer, that handle the parsing of the csv file format.
Thus, you could try the following code and check if it returns the correct thing:
import csv
def list_files2(name, filename):
    with open(filename, 'rb') as file_:
        final = name
        extra = "nothing"
        for row in csv.reader(file_):
            # csv.reader yields each row as a list of fields; skip blank lines
            if not row or not row[0].strip():
                continue
            prefix = row[0].strip()
            if name.find(prefix) != -1:
                extra = prefix
                final = name.replace(prefix, " ")
                return final, extra, "inside if"
        return final, extra, "inside for"
Because your file is named prefix.csv I assume you want to do prefix substitution. In this case, you could use startswith instead of find for the check and replace the line final = name.replace(prefix, " ") with final = " " + name[len(prefix):]. This ensures that only a prefix is substituted with the space.
I hope, this helps
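One possible explanation for the third row, offered as an assumption about the data rather than a confirmed diagnosis: if the file seen by Pig contains an empty line, line.strip() is an empty string, and replacing the empty string inserts the replacement between every character. That would produce both the spaced-out name and the empty second field. A minimal sketch that reproduces this and skips empty entries (the function name strip_known_marker and the marker list are made up for illustration):

```python
def strip_known_marker(name, markers):
    """Remove the first marker found in name; return (cleaned name, marker)."""
    for marker in markers:
        marker = marker.strip()
        if not marker:
            continue  # an empty marker would match between every character
        if marker in name:
            return name.replace(marker, " ").strip(), marker
    return name, "nothing"


# Replacing the empty string puts a space around every character
print("AG Lans".replace("", " ").strip())

# Stand-in for the lines of prefix.csv, including an assumed empty line
markers = ["spol.\n", "\n", "LLC\n", "Oy\n", "OOD\n"]
print(strip_known_marker("AG Lans", markers))
print(strip_known_marker("nakisa company LLC", markers))
```

Checking the file for blank lines or a trailing empty line on the cluster side would confirm or rule out this explanation.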

python, string.replace() and \n

(Edit: the script seems to work for others here trying to help. Is it because I'm running python 2.7? I'm really at a loss...)
I have a raw text file of a book I am trying to tag with pages.
Say the text file is:
some words on this line,
1
DOCUMENT TITLE some more words here too.
2
DOCUMENT TITLE and finally still more words.
I am trying to use python to modify the example text to read:
some words on this line,
</pg>
<pg n=2>some more words here too,
</pg>
<pg n=3>and finally still more words.
My strategy is to load the text file as a string. Build search-for and a replace-with strings corresponding to a list of numbers. Replace all instances in string, and write to a new file.
Here is the code I've written:
from sys import argv
script, input, output = argv
textin = open(input,'r')
bookstring = textin.read()
textin.close()
pages = []
x = 1
while x<400:
    pages.append(x)
    x = x + 1
pagedel = "DOCUMENT TITLE"
for i in pages:
    pgdel = "%d\n%s" % (i, pagedel)
    nplus = i + 1
    htmlpg = "</p>\n<p n=%d>" % nplus
    bookstring = bookstring.replace(pgdel, htmlpg)
textout = open(output, 'w')
textout.write(bookstring)
textout.close()
print "Updates to %s printed to %s" % (input, output)
The script runs without error, but it also makes no changes whatsoever to the input text. It simply reprints it character for character.
Does my mistake have to do with the hard return? \n? Any help greatly appreciated.
In python, strings are immutable, and thus replace returns the replaced output instead of replacing the string in place.
You must do:
bookstring = bookstring.replace(pgdel, htmlpg)
You also forgot to call the function close(). See how you have textin.close? You have to call it with parentheses, like open:
textin.close()
Your code works for me, but I might just add some more tips:
input is a built-in function, so perhaps try renaming that variable. Although shadowing it normally works, it might not for you.
When running the script, don't forget to put the .txt ending:
$ python myscript.py file1.txt file2.txt
Make sure when testing your script to clear the contents of file2.
I hope these help!
Here's an entirely different approach that uses the re module:
import re
doctitle = False
newstr = ''
page = 1
for line in bookstring.splitlines():
    res = re.match('^\\d+', line)
    if doctitle:
        newstr += '<pg n=' + str(page) + '>' + re.sub('^DOCUMENT TITLE ', '', line)
        doctitle = False
    elif res:
        doctitle = True
        page += 1
        newstr += '\n</pg>\n'
    else:
        newstr += line
print newstr
Since no one knows what's going on, it's worth a try.
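One guess, not confirmed anywhere in this thread, is that the file has Windows line endings or trailing spaces after the page numbers, so the exact text "1\nDOCUMENT TITLE" never occurs and replace has nothing to match. The sketch below, written for Python 3, uses a single regular expression that tolerates both. The function name tag_pages and the sample text are made up for illustration, and it uses the pg tag from the desired output.

```python
import re

# A page number alone on a line, optional spaces, LF or CRLF, then the running title
PAGE_BREAK = re.compile(
    r"^[ \t]*(\d+)[ \t]*\r?\n[ \t]*DOCUMENT TITLE[ \t]*",
    re.MULTILINE,
)


def tag_pages(text):
    """Replace each 'number + DOCUMENT TITLE' page break with closing and opening tags."""
    def replacement(match):
        next_page = int(match.group(1)) + 1
        return "</pg>\n<pg n=%d>" % next_page
    return PAGE_BREAK.sub(replacement, text)


# Made-up sample with one CRLF break and one break with a trailing space
sample = (
    "some words on this line,\r\n"
    "1\r\n"
    "DOCUMENT TITLE some more words here too.\r\n"
    "2 \n"
    "DOCUMENT TITLE and finally still more words.\n"
)
tagged = tag_pages(sample)
print(tagged)
```

Printing repr(bookstring[:200]) in the original script would show whether "\r\n" or stray spaces are actually present in the file.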
