Extract arguments from string with python function

I'm looking for a way to extract the arguments of Python function calls that are returned to me as strings.
For example:
'create.copy("Node_A", "Information", False)'
# expected return: ["Node_A", "Information", "False"]
'create.new("Node_B")'
# expected return: ["Node_B"]
'delete("Node_C")'
# expected return: ["Node_C"]
My first approach was a regular expression like this:
re.match(r"("(.+?")")
But it returns None all the time.
How can I get a list of these arguments?
BTW: I'm forced to use Python 2.7 and only built-in functions :(

You can parse these expressions using the built-in ast module.
import ast
def get_args(expr):
    tree = ast.parse(expr)
    args = tree.body[0].value.args
    # literal_eval accepts AST nodes, so this also works on Python 2.7;
    # arg.value only exists on the Constant nodes of Python 3.8+
    return [ast.literal_eval(arg) for arg in args]
get_args('create.copy("Node_A", "Information", False)') # ['Node_A', 'Information', False]
get_args('create.new("Node_B")') # ['Node_B']
get_args('delete("Node_C")') # ['Node_C']
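If the string form shown in the question is wanted (with False returned as "False"), one possible variation is sketched below. The name get_args_as_strings is hypothetical, and the sketch assumes the input is a single call whose positional arguments are all literals.

```python
import ast

def get_args_as_strings(expr):
    """Return the positional arguments of a single call expression as strings."""
    # mode="eval" parses one expression; .body is then the expression node itself
    call = ast.parse(expr, mode="eval").body
    if not isinstance(call, ast.Call):
        raise ValueError("not a function call: %r" % (expr,))
    # literal_eval turns each argument node into its Python value,
    # str() then gives the textual form the question asks for
    return [str(ast.literal_eval(arg)) for arg in call.args]

print(get_args_as_strings('create.copy("Node_A", "Information", False)'))
```

Keyword arguments are ignored here, and a non-literal argument such as a variable name makes literal_eval raise ValueError.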

Here is an example without any external modules that is fully compatible with Python 2.7. Slice the string at the positions of the brackets, remove the extra white-space and split at the commas.
f = 'create.copy("Node_A", "Information", False)'
i_open = f.find('(')
i_close = f.find(')')
print(f[i_open+1: i_close].replace(' ', '').split(','))
Output
['"Node_A"', '"Information"', 'False']
Remarks:
This does not work for nested function calls.
The closing bracket can also be found by searching the reversed string:
i_close = len(f) - f[::-1].find(')') - 1
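The same slicing idea can be sketched with str.rfind, which searches from the right without reversing the string. The helper name split_args is hypothetical, and the sketch assumes no argument contains a comma of its own (a comma inside a quoted string would still be split).

```python
def split_args(call_text):
    """Split the text between the first '(' and the last ')' at commas."""
    i_open = call_text.find('(')
    i_close = call_text.rfind(')')  # last closing bracket, no reversing needed
    inner = call_text[i_open + 1:i_close]
    if not inner.strip():
        return []  # call without arguments
    # strip() removes only the white-space around each argument,
    # so spaces inside quoted strings are preserved
    return [part.strip() for part in inner.split(',')]

print(split_args('create.copy("Node_A", "Information", False)'))
```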

See below (tested using python 3.6)
def get_args(expr):
    args = expr[expr.find('(') + 1:expr.find(')')].split(',')
    return [x.replace('"', '') for x in args]
entries = ['create.copy("Node_A", "Information", False)', "create.new(\"Node_B\")"]
for entry in entries:
    print(get_args(entry))
output
['Node_A', ' Information', ' False']
['Node_B']

Related

Python - Possibly Regex - How to replace part of a filepath with another filepath based on a match?

I'm new to Python and relatively new to programming. I'm trying to replace part of a file path with a different file path. If possible, I'd like to avoid regex as I don't know it. If not, I understand.
I want the part of each item in the Python list that comes before the word PROGRAM to be replaced with the 'replaceWith' variable.
How would you go about doing this?
Current Python List []
item1ToReplace1 = \\server\drive\BusinessFolder\PROGRAM\New\new.vb
item1ToReplace2 = \\server\drive\BusinessFolder\PROGRAM\old\old.vb
Variable to replace part of the Python list path
replaceWith = 'C:\ProgramFiles\Microsoft\PROGRAM'
Desired results for Python List []:
item1ToReplace1 = C:\ProgramFiles\Microsoft\PROGRAM\New\new.vb
item1ToReplace2 = C:\ProgramFiles\Microsoft\PROGRAM\old\old.vb
Thank you for your help.

The following code does what you ask. Note that I changed your '\' to '\\': you need to account for the backslash in your code, since it is used as an escape character in Python.
import os
item1ToReplace1 = '\\server\\drive\\BusinessFolder\\PROGRAM\\New\\new.vb'
item1ToReplace2 = '\\server\\drive\\BusinessFolder\\PROGRAM\\old\\old.vb'
# raw string, so the backslashes are never read as escape sequences
replaceWith = r'C:\ProgramFiles\Microsoft\PROGRAM'
keyword = "PROGRAM\\"
def replacer(rp, s, kw):
    ss = s.split(kw, 1)
    if len(ss) > 1:
        tail = ss[1]
        return os.path.join(rp, tail)
    else:
        return ""
print(replacer(replaceWith, item1ToReplace1, keyword))
print(replacer(replaceWith, item1ToReplace2, keyword))
The code splits the path on your keyword and appends the part after the keyword to the replacement prefix.
If your keyword is not in the string, the result is an empty string.
Result:
C:\ProgramFiles\Microsoft\PROGRAM\New\new.vb
C:\ProgramFiles\Microsoft\PROGRAM\old\old.vb
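Since os.path.join uses the separator of the operating system the script runs on, the output above assumes Windows. A platform-independent variation can be sketched with str.partition and plain concatenation. The name replace_prefix is hypothetical, and returning the path unchanged when the keyword is missing is an assumption, not the behaviour of the answer above.

```python
def replace_prefix(path, keyword, new_prefix):
    """Replace everything up to and including keyword with new_prefix."""
    head, sep, tail = path.partition(keyword)
    if not sep:
        return path  # keyword not found: leave the path untouched
    return new_prefix + tail

replace_with = r'C:\ProgramFiles\Microsoft\PROGRAM'
items = [
    r'\\server\drive\BusinessFolder\PROGRAM\New\new.vb',
    r'\\server\drive\BusinessFolder\PROGRAM\old\old.vb',
]
for item in items:
    # "\\PROGRAM" is a backslash followed by the folder name
    print(replace_prefix(item, "\\PROGRAM", replace_with))
```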

One way would be:
item_ls = item1ToReplace1.split("\\")
idx = item_ls.index("PROGRAM")
result = ["C:", "ProgramFiles", "Microsoft"] + item_ls[idx:]
result = "\\".join(result)
Resulting in:
>>> item1ToReplace1 = r"\\server\drive\BusinessFolder\PROGRAM\New\new.vb"
... # the above
>>> print(result)
C:\ProgramFiles\Microsoft\PROGRAM\New\new.vb
Note the raw string r"..." used for the input, which avoids having to escape every backslash in it. The arguments to split and join are ordinary string literals, so the backslash there has to be written as "\\".

Convert string of lists into list of lists in Python

I'm trying to convert a list of lists passed as string to nested list in python-3.7.5 and I'm missing something. I tried ast but it seems to be throwing an encoding error.
Example:
#!/usr/bin/env python
import ast
sample="[1abcd245,2bcdasdf,3jakdshfkh234234],[234asdfmnkk234]"
print(ast.literal_eval(sample))
ERROR:
return compile(source, filename, mode, PyCF_ONLY_AST)
File "<unknown>", line 1
[1abcd245,2bcdasdf,3jakdshfkh234234],[234asdfmnkk234]
Required output:
[[1abcd245,2bcdasdf,3jakdshfkh234234],[234asdfmnkk234]]
Any suggestions?

You may use the eval() function here, after making two changes to your starting string:
Wrap each list item in double quotes
Wrap the entire input in [...] to make it a formal 2D list
import re
sample = "[1abcd245,2bcdasdf,3jakdshfkh234234],[234asdfmnkk234]"
sample = '[' + re.sub(r'(\w+)', r'"\1"', sample) + ']'
# eval executes arbitrary code, so only use it on trusted input
result = eval(sample)
print(result)
This prints:
[['1abcd245', '2bcdasdf', '3jakdshfkh234234'], ['234asdfmnkk234']]
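The same two preparation steps also work with ast.literal_eval, which accepts only literals and therefore does not execute code from the input. This is a sketch of that substitution, not the answer's original method; the function name parse_nested is hypothetical, and it assumes every list item consists of word characters only.

```python
import ast
import re

def parse_nested(sample):
    """Quote each bare token, wrap the text in brackets, and parse it as a literal."""
    quoted = re.sub(r'(\w+)', r'"\1"', sample)
    return ast.literal_eval('[' + quoted + ']')

sample = "[1abcd245,2bcdasdf,3jakdshfkh234234],[234asdfmnkk234]"
print(parse_nested(sample))
```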

I think the issue is that literal_eval is unable to parse the unquoted strings within the sample you provide. I was able to get the output you wanted by surrounding the sample string with triple quotes, adding quotes to each string within the lists and adding an extra set of brackets:
import ast
sample="""[["1abcd245","2bcdasdf","3jakdshfkh234234"],["234asdfmnkk234"]]"""
print(ast.literal_eval(sample))
The json library parses the same quoted string as well:
import json
json.loads(sample)
Which on my machine gets the desired result! Note that json.loads also needs the items to be quoted; it cannot parse the original unquoted input.

you can try this:
sample="[1abcd245,2bcdasdf,3jakdshfkh234234],[234asdfmnkk234]"
l1 = []
for item in sample.split(","):
    # an opening bracket starts a new inner list
    if item.startswith('['):
        l1.append([])
    # strip the brackets from either end before storing the item
    l1[-1].append(item.strip('[]'))
print(l1)

how to add a list, passed from a subprocess to parent process, to an already existing list in python

I am passing a list from a subprocess to the parent process and in the parent process I want to add this to an already existing list. I did this:
subprocess_script.py:
def func():
    list = []
    list.append('1')
    list.append('2')
    print'Testing the list passing'
    print '>>> list:',list
if __name__ == '__main__':
    func()
parent_script.py:
import subprocess
import sys
list1 = []
list1.append('a')
list1.append('b')
ret = subprocess.Popen([sys.executable,"/Users/user1/home/subprocess_script.py"],stdout=subprocess.PIPE)
ret.wait()
return_code = ret.returncode
out, err = ret.communicate()
if out is not None:
    for line in out.splitlines():
        if not line.startswith('>>>'):
            continue
        value = line.split(':',1)[1].lstrip()
        list1.extend(value)
print 'Final List: ',list1
But when I execute this I do not get the desired output. The final list that I want should be : ['a','b','1','2']. But I get ['a', 'b', '[', "'", '1', "'", ',', ' ', "'", '2', "'", ']'] which is wrong. What am I doing wrong here?

The problem is that after your split and lstrip calls, value is still a string, not a list. You can pause the script by inserting a pdb.set_trace() line and inspect the value like this:
        if not line.startswith('>>>'):
            continue
        import pdb; pdb.set_trace()
        value = line.split(':', 1)[1].lstrip()
        list1.extend(value)
And then run the code:
❯ python main.py
> /private/tmp/example/main.py(19)<module>()
-> value = line.split(':', 1)[1].lstrip()
(Pdb) line
">>> list: ['1', '2']"
(Pdb) line.split(':', 1)[1].lstrip()
"['1', '2']"
You can evaluate that string into a Python list by using the ast.literal_eval function, like this:
(Pdb) import ast
(Pdb) ast.literal_eval(line.split(':', 1)[1].lstrip())
['1', '2']
Now list1 can be extended with this value.
From the Python 2.7 documentation:
ast.literal_eval(node_or_string)
Safely evaluate an expression node or a Unicode or Latin-1 encoded string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, numbers, tuples, lists, dicts, booleans, and None.
This can be used for safely evaluating strings containing Python values from untrusted sources without the need to parse the values oneself. It is not capable of evaluating arbitrarily complex expressions, for example involving operators or indexing.
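Applied to the parsing loop from the question, the fix might look like the sketch below. The function name extend_from_output is hypothetical, and the sample output string stands in for what the subprocess would print.

```python
import ast

def extend_from_output(target, output):
    """Extend target with the list literal found on each '>>>' line of output."""
    for line in output.splitlines():
        if not line.startswith('>>>'):
            continue
        # text after the first ':' is the printed list, e.g. "['1', '2']"
        value = ast.literal_eval(line.split(':', 1)[1].strip())
        target.extend(value)
    return target

# stand-in for the stdout of subprocess_script.py
sample_output = "Testing the list passing\n>>> list: ['1', '2']\n"
print(extend_from_output(['a', 'b'], sample_output))
```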

You are doing it wrongly.
When you run print '>>> list:',list it prints:
>>> list: ['1', '2']
And after value = line.split(':',1)[1].lstrip(), value is the string:
"['1', '2']"
When you extend list1 with value, each character of value is added as a new element of list1, because extend iterates over its argument and iterating over a string yields its characters.
To build value, you need to remove the leading [ as well as the trailing ] and then split the rest on , .
Example code -
value = line.split(':',1)[1].lstrip().lstrip('[').rstrip(']').replace("'",'').replace(" ",'').split(',')
The above code is a very big hack; it is better to use ast.literal_eval, as @logc mentioned in his answer:
import ast
value = ast.literal_eval(line.split(":",1)[1].lstrip())
But be wary: ast.literal_eval evaluates the string (restricted to Python literals) and returns the result, so you should still use it with care on input you do not control.
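The difference between extending with a string and extending with a list can be seen in isolation. This is a small illustration only; the variable names are made up for the example.

```python
text = "['1', '2']"      # what the parent script actually holds: a string
items = ['1', '2']       # what it needs: a list

as_chars = ['a', 'b']
as_chars.extend(text)    # iterates over the string, one character at a time

as_items = ['a', 'b']
as_items.extend(items)   # iterates over the list, one element at a time

print(as_chars)
print(as_items)
```

The first list ends up with one element per character of the string, which is the output reported in the question.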

Use a standard serialization data format, like JSON:
script.py
import json
def func():
    lst = []
    lst.append('1')
    lst.append('2')
    print json.dumps(lst) ## <-- `dumps` dumps to a string
if __name__ == '__main__':
    func()
main.py
import sys
import os
import subprocess
import json
list1 = []
list1.append('a')
list1.append('b')
ret = subprocess.Popen([sys.executable, os.path.join(os.getcwd(), "script.py")], stdout=subprocess.PIPE)
ret.wait()
return_code = ret.returncode
out, err = ret.communicate()
line = next(line for line in out.splitlines())
value = json.loads(line) ## <-- `loads` loads from a string
list1.extend(map(str, value))
print 'Final List: ', list1
The map(str, value) is just aesthetic: it is there to have a uniform list, because in Python 2 json.loads returns Unicode strings by default, and your previous list1 elements are not Unicode strings.
Also, I removed the whole header-printing and line-skipping parts of the code. You are just making your life more difficult with them :)
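For readers on Python 3, the same JSON round trip can be sketched in one file. To keep the example self-contained, the child process is started with -c and an inline program instead of a separate script.py; CHILD_CODE and collect_from_child are hypothetical names used only for this illustration.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for script.py: prints a JSON-encoded list to stdout.
CHILD_CODE = "import json; print(json.dumps(['1', '2']))"

def collect_from_child(existing):
    """Run the child, decode its JSON output, and extend the given list."""
    # check_output waits for the child and returns its stdout as bytes
    out = subprocess.check_output([sys.executable, "-c", CHILD_CODE])
    existing.extend(json.loads(out.decode("utf-8")))
    return existing

print(collect_from_child(['a', 'b']))
```

In Python 3 json.loads already returns str objects, so the map(str, value) step is not needed there.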

parse statement string for arguments using regex in Python

I have user input statements which I would like to parse for arguments, if possible using regex.
I have read a lot about functools.partial on Stack Overflow, but could not find anything on argument parsing. I also could not find how to check for a regex match while excluding the matched tokens. The Python tokenizer seems too heavy for my purpose.
import functools
import re
import psutil
def getarguments(statement):
    prog = re.compile("([(].*[)])")
    result = prog.search(statement)
    m = result.group()
    # m = '(interval=1, percpu=True)'
    # or m = "('/')"
    # strip the parentheses, ugly but it works
    return statement[result.start()+1:result.end()-1]
stm = 'psutil.cpu_percent(interval=1, percpu=True)'
arg_list = getarguments(stm)
print(arg_list) # returns : interval=1, percpu=True
# But combining single and double quotes like
stm = "psutil.disk_usage('/').percent"
arg_list = getarguments(stm) # in debug value is "'/'"
print(arg_list) # when printed value is : '/'
callfunction = psutil.disk_usage
args = []
args.append(arg_list)
# args.append('/')
funct1 = functools.partial(callfunction, *args)
perc = funct1().percent
print(perc)
This results in an error:
builtins.FileNotFoundError: [Errno 2] No such file or directory: "'/'"
But
callfunction = psutil.disk_usage
args = []
#args.append(arg_list)
args.append('/')
funct1 = functools.partial(callfunction, *args)
perc = funct1().percent
print(perc)
does return (for me) 20.3, which is correct.
So there is a difference somewhere.
The weird thing is that when I view the content in my IDE (WingIDE) the result is "'/'", but when I view the details the result is '/'.
I use Python 3.4.0. What is happening here, and how can I solve it?
Your help is really appreciated.

getarguments("psutil.disk_usage('/').percent") returns '/' with the quotes included, i.e. a string of 3 characters. You can check this by printing len(arg_list), for example.
Your IDE adds the ", because by default it encloses strings in single quotes '. Your string itself contains ', so the IDE uses double quotes to enclose it instead.
Note that '/' is not equal to "'/'". The former is a string of 1 character, the latter is a string of 3 characters. So in order to get things right you need to strip the quotes (both double and single ones) in getarguments. You can do it with the following snippet, where s is the extracted argument:
if ((s.startswith('\'') and s.endswith('\'')) or
        (s.startswith('\"') and s.endswith('\"'))):
    s = s[1:-1]
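Wrapped into a helper, the quote stripping could look like the following sketch. The name strip_quotes is hypothetical; the length check is an added assumption so that a lone quote character is left alone.

```python
def strip_quotes(s):
    """Remove one pair of matching single or double quotes around s, if present."""
    if len(s) >= 2 and s[0] == s[-1] and s[0] in ('"', "'"):
        return s[1:-1]
    return s

raw = "'/'"                  # what getarguments returns: 3 characters
print(len(raw))
print(strip_quotes(raw))     # the 1-character path that psutil.disk_usage expects
```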

In pyparsing, how to assign a "no match" key value?

I'd like to make the 'pyparsing' parsing result come out as a dictionary without needing to post-process. For this, I need to define my own key strings. The following is the best I could come up with that produces the desired results.
Line to parse:
%ADD22C,0.35X*%
Code:
import pyparsing as pyp
floatnum = pyp.Regex(r'([\d\.]+)')
comma = pyp.Literal(',').suppress()
cmd_app_def = pyp.Literal('AD').setParseAction(pyp.replaceWith('aperture-definition'))
cmd_app_def_opt_circ = pyp.Group(pyp.Literal('C') +
    comma).setParseAction(pyp.replaceWith('circle'))
circular_apperture = pyp.Group(cmd_app_def_opt_circ +
    pyp.Group(pyp.Empty().setParseAction(pyp.replaceWith('diameter')) + floatnum) +
    pyp.Literal('X').suppress())
<the grammar for the entire line>
The result is:
['aperture-definition', '20', ['circle', ['diameter', '0.35']]]
What I consider a hack here is
pyp.Empty().setParseAction(pyp.replaceWith('diameter'))
which always matches and is empty, but then I assign my desired key name to it.
Is this the best way to do this? Am I abusing pyparsing to do something it's not meant to do?

If you want to name your floatnum as "diameter", you can use named results:
cmd_app_def_opt_circ = pyp.Group(pyp.Literal('C') +
    comma)("circle")
circular_apperture = pyp.Group(cmd_app_def_opt_circ +
    pyp.Group(floatnum)("diameter") +
    pyp.Literal('X').suppress())
In this way, every time the parser encounters floatnum in the circular_apperture context, the result is named diameter. As shown above, you can name circle in the same fashion. Does this work for you?

See comments in the posted code.
import pyparsing as pyp
comma = pyp.Literal(',').suppress()
# use parse actions to do type conversion at parse time, so that results fields
# can immediately be used as ints or floats, without additional int() or float()
# calls
floatnum = pyp.Regex(r'([\d\.]+)').setParseAction(lambda t: float(t[0]))
integer = pyp.Word(pyp.nums).setParseAction(lambda t: int(t[0]))
# define the command keyword - I assume there will be other commands too, they
# should follow this general pattern (define the command keyword, then all the
# options, then define the overall command)
aperture_defn_command_keyword = pyp.Literal('AD')
# define a results name for the matched integer - I don't know what this
# option is, wasn't in your original post
d_option = 'D' + integer.setResultsName('D')
# shortcut for defining a results name is to use the expression as a
# callable, and pass the results name as the argument (I find this much
# cleaner and keeps the grammar definition from getting messy with lots
# of calls to setResultsName)
circular_aperture_defn = 'C' + comma + floatnum('diameter') + 'X'
# define the overall command
aperture_defn_command = aperture_defn_command_keyword("command") + d_option + pyp.Optional(circular_aperture_defn)
# use searchString to skip over '%'s and '*'s, gives us a ParseResults object
test = "%ADD22C,0.35X*%"
appData = aperture_defn_command.searchString(test)[0]
# ParseResults can be accessed directly just like a dict
print appData['command']
print appData['D']
print appData['diameter']
# or if you prefer attribute-style access to results names
print appData.command
print appData.D
print appData.diameter
# convert ParseResults to an actual Python dict, removes all unnamed tokens
print appData.asDict()
# dump() prints out the parsed tokens as a list, then all named results
print appData.dump()
Prints:
AD
22
0.35
AD
22
0.35
{'diameter': 0.34999999999999998, 'command': 'AD', 'D': 22}
['AD', 'D', 22, 'C', 0.34999999999999998, 'X']
- D: 22
- command: AD
- diameter: 0.35
