parse statement string for arguments using regex in Python - python

I have user input statements which I would like to parse for arguments, if possible using regex.
I have read a lot about functools.partial on Stack Overflow, but found nothing there about argument parsing. In the regex questions on Stack Overflow I also could not find how to check for a match while excluding the matched tokens. The Python tokenizer seems too heavy for my purpose.
import functools
import re
import psutil
def getarguments(statement):
    prog = re.compile("([(].*[)])")
    result = prog.search(statement)
    m = result.group()
    # m = '(interval=1, percpu=True)'
    # or m = "('/')"
    # strip the parentheses, ugly but it works
    return statement[result.start()+1:result.end()-1]
stm = 'psutil.cpu_percent(interval=1, percpu=True)'
arg_list = getarguments(stm)
print(arg_list) # returns : interval=1, percpu=True
# But combining single and double quotes like
stm = "psutil.disk_usage('/').percent"
arg_list = getarguments(stm) # in debug value is "'/'"
print(arg_list) # when printed value is : '/'
callfunction = psutil.disk_usage
args = []
args.append(arg_list)
# args.append('/')
funct1 = functools.partial(callfunction, *args)
perc = funct1().percent
print(perc)
This results in an error:
builtins.FileNotFoundError: [Errno 2] No such file or directory: "'/'"
But
callfunction = psutil.disk_usage
args = []
#args.append(arg_list)
args.append('/')
funct1 = functools.partial(callfunction, *args)
perc = funct1().percent
print(perc)
does return (for me) 20.3, which is correct.
So there is a difference somewhere.
The weird thing is that when I view the content in my IDE (WingIDE) the value is shown as "'/'", and when I then view the details it is shown as '/'.
I use Python 3.4.0. What is happening here, and how can I solve it?
Your help is really appreciated.

getarguments("psutil.disk_usage('/').percent") returns '/' with the quote characters included, i.e. a string of three characters. You can check this by printing len(arg_list), for example.
Your IDE adds the ", because by default strings are displayed enclosed in single quotes '. Here the string itself contains ', so the IDE uses double quotes to enclose it.
Note that '/' is not equal to "'/'". The former is a string of 1 character, the latter is a string of 3 characters. So in order to get things right you need to strip the quotes (both double and single ones) in getarguments. You can do it with the following snippet:
if ((s.startswith("'") and s.endswith("'")) or
        (s.startswith('"') and s.endswith('"'))):
    s = s[1:-1]
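As a rough sketch of an alternative to stripping quotes by hand: ast.literal_eval can turn the extracted text into a real Python value. The helper name parse_single_argument is made up for this example, and it assumes the parentheses contain exactly one literal argument.

```python
import ast
import re

def parse_single_argument(statement):
    """Return the single literal argument found between the first '(' and the last ')'."""
    match = re.search(r"\((.*)\)", statement)
    if match is None:
        raise ValueError("no parenthesised argument list found")
    # literal_eval turns the text "'/'" (3 characters) into the string '/' (1 character)
    return ast.literal_eval(match.group(1))

arg = parse_single_argument("psutil.disk_usage('/').percent")
print(repr(arg), len(arg))
```

The value returned this way can be appended to args and passed on to functools.partial as in the question. Keyword arguments such as interval=1 are not literals and would need separate handling.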

Related

Python - Possibly Regex - How to replace part of a filepath with another filepath based on a match?

I'm new to Python and relatively new to programming. I'm trying to replace part of a file path with a different file path. If possible, I'd like to avoid regex as I don't know it. If not, I understand.
I want an item in the Python list [] before the word PROGRAM to be replaced with the 'replaceWith' variable.
How would you go about doing this?
Current Python List []
item1ToReplace1 = \\server\drive\BusinessFolder\PROGRAM\New\new.vb
item1ToReplace2 = \\server\drive\BusinessFolder\PROGRAM\old\old.vb
Variable to replace part of the Python list path
replaceWith = 'C:\ProgramFiles\Microsoft\PROGRAM'
Desired results for Python List []:
item1ToReplace1 = C:\ProgramFiles\Microsoft\PROGRAM\New\new.vb
item1ToReplace2 = C:\ProgramFiles\Microsoft\PROGRAM\old\old.vb
Thank you for your help.
The following code does what you ask. Note that I updated your '\' to '\\'; you need to account for the backslash in your code since it is used as an escape character in Python.
import os
item1ToReplace1 = '\\server\\drive\\BusinessFolder\\PROGRAM\\New\\new.vb'
item1ToReplace2 = '\\server\\drive\\BusinessFolder\\PROGRAM\\old\\old.vb'
# raw string, so the backslashes are not treated as escape characters
replaceWith = r'C:\ProgramFiles\Microsoft\PROGRAM'
keyword = "PROGRAM\\"
def replacer(rp, s, kw):
    # split once on the keyword and keep what follows it
    ss = s.split(kw, 1)
    if len(ss) > 1:
        tail = ss[1]
        # os.path.join inserts a backslash only on Windows
        return os.path.join(rp, tail)
    else:
        return ""
print(replacer(replaceWith, item1ToReplace1, keyword))
print(replacer(replaceWith, item1ToReplace2, keyword))
The code splits on your keyword and puts that on the back of the string you want.
If your keyword is not in the string, your result will be an empty string.
Result:
C:\ProgramFiles\Microsoft\PROGRAM\New\new.vb
C:\ProgramFiles\Microsoft\PROGRAM\old\old.vb
One way would be:
item_ls = item1ToReplace1.split("\\")
idx = item_ls.index("PROGRAM")
result = ["C:", "ProgramFiles", "Microsoft"] + item_ls[idx:]
result = "\\".join(result)
Resulting in:
>>> item1ToReplace1 = r"\\server\drive\BusinessFolder\PROGRAM\New\new.vb"
... # the above
>>> print(result)
C:\ProgramFiles\Microsoft\PROGRAM\New\new.vb
Note the use of r"..." to avoid having to 'escape the escape characters' of your input (i.e. the \). Also note that the strings passed to join/split are not raw strings, so the backslash there has to be written as a double backslash.
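For completeness, here is a regex-free sketch that works on the whole list and does not depend on the operating system's path separator. The function name replace_prefix is hypothetical, and it assumes the paths always use backslashes and that PROGRAM appears as a complete folder name.

```python
def replace_prefix(path, new_prefix, keyword="PROGRAM"):
    """Swap everything up to and including the keyword folder for new_prefix."""
    head, sep, tail = path.partition("\\" + keyword + "\\")
    if not sep:
        return path  # keyword folder not found: leave the path unchanged
    return new_prefix + "\\" + tail

paths = [
    r"\\server\drive\BusinessFolder\PROGRAM\New\new.vb",
    r"\\server\drive\BusinessFolder\PROGRAM\old\old.vb",
]
replace_with = r"C:\ProgramFiles\Microsoft\PROGRAM"

for p in paths:
    print(replace_prefix(p, replace_with))
```

Returning the path unchanged when the keyword is missing is a design choice for this sketch; returning an empty string or raising an error would be just as reasonable.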

Extract arguments from string with python function

I'm looking for a way to extract arguments embedded into python function returned to me as strings.
For example:
'create.copy("Node_A", "Information", False)'
# expected return: ["Node_A", "Information", "False"]
'create.new("Node_B")'
# expected return: ["Node_B"]
'delete("Node_C")'
# expected return: ["Node_C"]
My first approach was regular expressions like this:
re.match(r"("(.+?")")
But it returns None all the time.
How can I get list of this arguments?
BTW: I'm forced to use Python 2.7 and only built-in functions :(
You can parse these expressions using the built-in ast module.
import ast
def get_args(expr):
    tree = ast.parse(expr)
    args = tree.body[0].value.args
    # arg.value relies on ast.Constant nodes, i.e. Python 3.8 or later
    return [arg.value for arg in args]
get_args('create.copy("Node_A", "Information", False)') # ['Node_A', 'Information', False]
get_args('create.new("Node_B")') # ['Node_B']
get_args('delete("Node_C")') # ['Node_C']
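A variant that avoids depending on the node attribute names is to hand each argument node to ast.literal_eval, which accepts AST nodes as well as strings. This is only a sketch: the name get_literal_args is made up, it handles positional literal arguments only, and while it should also run on older interpreters, that is untested here.

```python
import ast

def get_literal_args(expr):
    """Return the positional arguments of a single call expression as Python values."""
    call = ast.parse(expr, mode="eval").body
    if not isinstance(call, ast.Call):
        raise ValueError("expected a function call")
    # literal_eval only evaluates literals, so nothing in expr is executed
    return [ast.literal_eval(arg) for arg in call.args]

print(get_literal_args('create.copy("Node_A", "Information", False)'))
print(get_literal_args('delete("Node_C")'))
```

If the arguments are needed as strings, as in the expected output of the question, str() can be applied to each element afterwards.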
Here is an example without any external modules that is fully compatible with Python 2.7. Slice the string at the positions of the brackets, remove the extra whitespace and split at the commas.
f = 'create.copy("Node_A", "Information", False)'
i_open = f.find('(')
i_close = f.find(')')
print(f[i_open+1: i_close].replace(' ', '').split(','))
Output
['"Node_A"', '"Information"', 'False']
Remark:
not for nested functions.
the closing bracket can also be found by reversing the string
i_close = len(f) - f[::-1].find(')') - 1
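The last closing bracket can also be located with str.rfind, without reversing the string. A small sketch; the nested call in f is a made-up input:

```python
f = 'outer(inner("a"), "b")'
i_open = f.find('(')
i_close = f.rfind(')')  # index of the last closing bracket
inner_text = f[i_open + 1:i_close]
print(inner_text)
```

This only finds the outer pair of brackets. Splitting inner_text at the commas would still break nested calls apart.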
See below (tested using python 3.6)
def get_args(expr):
    args = expr[expr.find('(') + 1:expr.find(')')].split(',')
    return [x.replace('"', '') for x in args]
entries = ['create.copy("Node_A", "Information", False)', "create.new(\"Node_B\")"]
for entry in entries:
    print(get_args(entry))
output
['Node_A', ' Information', ' False']
['Node_B']

Remove everything but #number in brackets

I have a file where the lines have the form #nr = name(#nr, (#nr), different vars, and names).
I would like to only have the #nr in the brackets to get the form #nr = name(#nr, #nr)
I have tried to solve this in different ways like using regex, startswith() and lists but nothing has worked so far.
Any help is much appreciated.
Edit: Code
for line in f.split():
    start = line.find( '(' )
    end = line.find( ')' )
    if start != -1 and end != -1:
        line = ''.join(i for i in x if not i.startswith('#'))
    print(line)
Edit 2:
As example I have:
#304= IFCRELDEFINESBYPROPERTIES('0FZ0hKNanFNAQpJ_Iqh4zM',#42,$,$,(#142),#301);
Afterwards I want to have:
#304= IFCRELDEFINESBYPROPERTIES(#42,#142,#301);
This can be solved using regex, though trying to do it with a single find/replace would be more complicated. Instead, you can do it in two steps:
import re
def sub_func(match):
    nums = re.findall(r'#\d+', match.group(2))
    return match.group(1) + '(' + ','.join(nums) + ');'
text = "#304= IFCRELDEFINESBYPROPERTIES('0FZ0hKNanFNAQpJ_Iqh4zM',#42,$,$,(#142),#301);"
result = re.sub(r'(^[^(]+)\((.*)\);', sub_func, text)
print(result)
# '#304= IFCRELDEFINESBYPROPERTIES(#42,#142,#301);'
So instead of passing a string as the second argument to re.sub, we pass a function, which processes each match with some more regex and reformats the result before passing it back.
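To run the same idea over every line of a file, one option is a small per-line helper that leaves non-matching lines alone. This is a sketch under assumptions: the name keep_refs and the second sample line are made up, and each statement is assumed to sit on one line ending in ");".

```python
import re

LINE_RE = re.compile(r'^([^(]+)\((.*)\);$')

def keep_refs(line):
    """Keep only the #number references inside the outer brackets of one line."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return line  # not of the form name(...); so leave it untouched
    refs = re.findall(r'#\d+', match.group(2))
    return match.group(1) + '(' + ','.join(refs) + ');'

lines = [
    "#304= IFCRELDEFINESBYPROPERTIES('0FZ0hKNanFNAQpJ_Iqh4zM',#42,$,$,(#142),#301);",
    "ENDSEC;",
]
for line in lines:
    print(keep_refs(line))
```

One caveat: a quoted string that happens to contain # followed by digits would also be picked up as a reference.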

Python modify text file by the name of arguments

I have a text file ("input.param"), which serves as an input file for a package. I need to modify the value of one argument. The lines need to be changed are the following:
param1 0.01
model_name run_param1
I need to find the argument param1 and replace its value 0.01 with each of a range of values; model_name must change accordingly for each value of param1. For example, if param1 is changed to 0.03, then model_name becomes 'run_param1_p03'. Below is my attempt so far:
import numpy as np
import os
param1_range = np.arange(0.01,0.5,0.01)
with open('input.param', 'r') as file :
    filedata = file.read()
for p_value in param1_range:
    filedata.replace('param1 0.01', 'param1 ' + str(p_value))
    filedata.replace('model_name run_param1', 'model_name run_param1' + '_p0' + str(int(round(p_value*100))))
    with open('input.param', 'w') as file:
        file.write(filedata)
    os.system('./bin/run_app param/input.param')
However, this is not working. I guess the main problem is that the replace command can not recognize the space. But I do not know how to search the argument param1 or model_name and change their values.
I'm editing this answer to more accurately answer the original question, which it did not adequately do.
The problem is "The replace command can not recognize the space". In order to do this, the re, or regex module, can be of great help. Your document is composed of an entry and its value, separated by spaces:
param1 0.01
model_name run_param1
In regex, a general capture would look like so:
import re
someline = 'param1 0.01'
pattern = re.match(r'^(\S+)\s+(\S+)$', someline)
pattern.groups()
# ('param1', '0.01')
The regex functions as follows:
^ matches the start of the line
\S is any non-whitespace char, i.e. anything other than ' ', '\t', '\r', '\n' and the like
+ indicates one or more, as a greedy search (it keeps going until the pattern stops matching)
\s+ is one or more whitespace chars (\s is the opposite of \S, note the case here)
() indicate groups, or how you want to group your search
$ matches the end of the line
The groups make it fairly easy for you to unpack your arguments into variables if you so choose. To apply this to the code you have already:
import numpy as np
import re
param1_range = np.arange(0.01,0.5,0.01)
filedata = []
with open('input.param', 'r') as file:
    # This will put the lines in a list
    # so you can use ^ and $ in the regex
    for line in file:
        filedata.append(line.strip()) # get rid of trailing newlines
# filedata now looks like:
# ['param1 0.01', 'model_name run_param1']
# It might be easier to use a dictionary to keep all of your param vals
# since you aren't changing the names, just the values
groups = [re.match(r'^(\S+)\s+(\S+)$', x).groups() for x in filedata]
# Now you have a list of tuples which can be fed to dict()
my_params = dict(groups)
# {'param1': '0.01', 'model_name': 'run_param1'}
# Now just use that dict for setting your params
for p_value in param1_range:
    my_params['param1'] = str(p_value)
    my_params['model_name'] = 'run_param1_p0' + str(int(round(p_value*100)))
    # And for the formatting back into the file, you can do some quick padding to get the format you want
    # (written inside the loop so that every value reaches the file)
    with open('somefile.param', 'w') as fh:
        content = '\n'.join([k.ljust(20) + v.rjust(20) for k,v in my_params.items()])
        fh.write(content)
The padding is done using str.ljust and str.rjust methods so you get a format that looks like so:
for k, v in dict(groups).items():
    intstr = k.ljust(20) + v.rjust(20)
    print(intstr)
param1                              0.01
model_name                    run_param1
Though you could arguably leave out the rjust if you felt so inclined.
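Note also that str.replace returns a new string instead of changing filedata in place, so in the question's code its result has to be assigned back. If the original spacing of the file should be preserved, a sketch with re.sub could look like the following. The helper name set_param, the sample text and the two values are made up for illustration, and the zero-padded naming scheme (p03, p10) is an assumption based on the example in the question.

```python
import re

def set_param(text, name, value):
    """Return text with the value on the line starting with name replaced."""
    pattern = re.compile(r'^(' + re.escape(name) + r'\s+)\S+', flags=re.MULTILINE)
    # group 1 holds the name plus the original spacing; only the value changes
    return pattern.sub(lambda m: m.group(1) + value, text)

template = "param1      0.01\nmodel_name  run_param1\n"

for p_value in (0.03, 0.10):  # illustrative values only
    text = set_param(template, 'param1', '%.2f' % p_value)
    text = set_param(text, 'model_name', 'run_param1_p%02d' % round(p_value * 100))
    print(text)
```

Each text produced this way can then be written to the input file before the package is started for that value.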

Get a value from a string in python

Program Details:
I am writing a program for python that will need to look through a text file for the line:
Found mode 1 of 12: EV= 1.5185449E+04, f= 19.612545, T= 0.050988.
Problem:
Then after the program has found that line, it will then store the line into an array and get the value 19.612545, from f = 19.612545.
Question:
So far I have been able to store the line in an array once I have found it. However, I am not sure what to use afterwards to search through the stored string and extract the value of f. Does anyone have suggestions or tips on how to accomplish this?
Depending upon how you want to go at it, CosmicComputer is right to refer you to Regular Expressions. If your syntax is this simple, you could always do something like:
line = 'Found mode 1 of 12: EV= 1.5185449E+04, f= 19.612545, T= 0.050988.'
splitByComma=line.split(',')
fValue = splitByComma[1].replace('f= ', '').strip()
print(fValue)
Results in 19.612545 being printed (still a string though).
Split your line by commas, grab the 2nd chunk, and break out the f value. Error checking and conversions left up to you!
Using regular expressions here is madness. Just use str.find as follows (where string is the name of the variable that holds your string):
index = string.find('f=')
index = index + 2  # skip over "f="
string = string[index:]  # cut off everything before the value
string = string.split(',')  # split the remaining string at the commas
your_value = string[0].strip()  # first field, without the leading space
I know it's ugly, but it's nothing compared with RE.
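For comparison, the regular expression route mentioned above is not that long either. A minimal sketch, assuming the value after f= is always a plain or exponent-style decimal number:

```python
import re

line = 'Found mode 1 of 12: EV= 1.5185449E+04, f= 19.612545, T= 0.050988.'

# \b stops "f=" from matching at the end of a longer name;
# the group accepts an optional sign, digits, a fraction and an exponent
match = re.search(r'\bf=\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)', line)
if match:
    f_value = float(match.group(1))
    print(f_value)
```

The same pattern with EV or T in place of f would pick out the other two values, and the conversion to float is done in one place.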
