I need to find tag=ITEM that match 2 criteria, and then get the parent tag=NODE#name based on this find.
Two issues:
I can't find a way for XPath to do an 'and', for example
item = node.findall('./ITEM[#name="toppas_type" and #value="output file list"]')
Getting the parent NODE info without having to explicitely search and save it in advance of finding the ITEM, for example something like
parent_name = item.parent.attrib['name']
This is the code I have now:
node_names = []
for node in tree.findall('NODE[#name="vertices"]/NODE'):
for item in node.findall('./ITEM[#name="toppas_type"]'):
if item.attrib['name'] == 'toppas_type' and item.attrib['value'] == 'output file list':
node_names.append(node.attrib['name'])
...to parse a file like this (snippet only) ...
<?xml version="1.0" encoding="ISO-8859-1"?>
<PARAMETERS version="1.6.2" xsi:noNamespaceSchemaLocation="http://open-ms.sourceforge.net/schemas/Param_1_6_2.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<NODE name="vertices" description="">
<NODE name="23" description="">
<ITEM name="recycle_output" value="false" type="string" description="" required="false" advanced="false" />
<ITEM name="toppas_type" value="tool" type="string" description="" required="false" advanced="false" />
<ITEM name="tool_name" value="FileConverter" type="string" description="" required="false" advanced="false" />
<ITEM name="tool_type" value="" type="string" description="" required="false" advanced="false" />
<ITEM name="x_pos" value="-620" type="double" description="" required="false" advanced="false" />
<ITEM name="y_pos" value="-1380" type="double" description="" required="false" advanced="false" />
</NODE>
<NODE name="24" description="">
<ITEM name="recycle_output" value="false" type="string" description="" required="false" advanced="false" />
<ITEM name="toppas_type" value="output file list" type="string" description="" required="false" advanced="false" />
<ITEM name="x_pos" value="-440" type="double" description="" required="false" advanced="false" />
<ITEM name="y_pos" value="-1480" type="double" description="" required="false" advanced="false" />
<ITEM name="output_folder_name" value="" type="string" description="" required="false" advanced="false" />
</NODE>
<NODE name="33" description="">
<ITEM name="recycle_output" value="false" type="string" description="" required="false" advanced="false" />
<ITEM name="toppas_type" value="merger" type="string" description="" required="false" advanced="false" />
<ITEM name="x_pos" value="-620" type="double" description="" required="false" advanced="false" />
<ITEM name="y_pos" value="-1540" type="double" description="" required="false" advanced="false" />
<ITEM name="round_based" value="false" type="string" description="" required="false" advanced="false" />
</NODE>
<!--(snip)-->
</NODE>
</PARAMETERS>
UPDATE:
#Mathias Müller
Great suggestion - unfortunately when I try to load the XML file, I get an error. I'm not familiar with lxml...so I'm not sure if I'm using it right.
from lxml import etree
root = etree.DTD("/Users/mikes/Documents/Eclipseworkspace/Bioproximity/Assay-Workflows-Mikes/protein_lfq/protein_lfq-1.1.2.toppas")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "src/lxml/dtd.pxi", line 294, in lxml.etree.DTD.__init__ (src/lxml/lxml.etree.c:187024)
lxml.etree.DTDParseError: Content error in the external subset, line 2, column 1
Unfortunately, ElementTree will not accept that xpath in its tree.find(xpath) or tree.findall(xpath)
Perhaps you do not need nested loops at all, a single XPath expression would suffice. I am not exactly sure what you would like the final result to be, but here is an example with lxml:
>>> import lxml.etree
>>> s = '''<NODE name="vertices" description="">
...
... <NODE name="23" description="">
... <ITEM name="recycle_output" value="false" type="string" description="" required="false" advanced="false" />
... <ITEM name="toppas_type" value="tool" type="string" description="" required="false" advanced="false" />
... <ITEM name="tool_name" value="FileConverter" type="string" description="" required="false" advanced="false" />
... <ITEM name="tool_type" value="" type="string" description="" required="false" advanced="false" />
... <ITEM name="x_pos" value="-620" type="double" description="" required="false" advanced="false" />
... <ITEM name="y_pos" value="-1380" type="double" description="" required="false" advanced="false" />
... </NODE>
...
... <NODE name="24" description="">
... <ITEM name="recycle_output" value="false" type="string" description="" required="false" advanced="false" />
... <ITEM name="toppas_type" value="output file list" type="string" description="" required="false" advanced="false" />
... <ITEM name="x_pos" value="-440" type="double" description="" required="false" advanced="false" />
... <ITEM name="y_pos" value="-1480" type="double" description="" required="false" advanced="false" />
... <ITEM name="output_folder_name" value="" type="string" description="" required="false" advanced="false" />
... </NODE>
...
... <NODE name="33" description="">
... <ITEM name="recycle_output" value="false" type="string" description="" required="false" advanced="false" />
... <ITEM name="toppas_type" value="merger" type="string" description="" required="false" advanced="false" />
... <ITEM name="x_pos" value="-620" type="double" description="" required="false" advanced="false" />
... <ITEM name="y_pos" value="-1540" type="double" description="" required="false" advanced="false" />
... <ITEM name="round_based" value="false" type="string" description="" required="false" advanced="false" />
... </NODE>
... <!--(snip)-->
... </NODE>'''
>>> root = lxml.etree.fromstring(s)
>>> root.xpath('/NODE[#name="vertices"]/NODE/ITEM[#name = "toppas_type" and #value = "output file list"]')
[<Element ITEM at 0x102b5f788>]
And if you actually need the name of the parent element, you can move to the parent node with ..:
>>> root.xpath('/NODE[#name="vertices"]/NODE/ITEM[#name = "toppas_type" and #value = "output file list"]/../#name')
['24']
Parsing an XML document from a file
The function etree.DTD is the wrong choice if you would like to parse an XML document from a file. A DTD is not an XML document. Here is how you can do it with lxml:
>>> import lxml.etree
>>> root = lxml.etree.parse("example.xml")
>>> root
<lxml.etree._ElementTree object at 0x106593b00>
Second Update
If the outermost element is PARAMETERS, you need to search like this:
>>> root.xpath('/PARAMETERS/NODE[#name="vertices"]/NODE/ITEM[#name = "toppas_type" and #value = "output file list"]')
[<Element ITEM at 0x106593e18>]
Related
I want to make a program which look through files, finds every incomplete file (without </module> at the end), then it will print last found abnumber in file and delete everyline (including the last with abnumber) after it.
So my file looks like that:
<Module bs="Mainfile_1">
<object id="1000" name="namex" abnumber="1">
<item name="item0" value="100" />
<item name="item00" value="100" />
</object>
<object id="1001" name="namey" abnumber="2">
<item name="item1" value="100" />
<item name="item00" value="100" />
</object>
<object id="1234" name="name1" abnumber="3">
<item name="item1" value="something11:
something11" />
<item name="item2" value="233" />
<item name="item3" value="233" />
<item name="item4" value="something12:
12something" />
</object>
<object id="1238" name="name2" abnumber="4">
<item name="item8" value="something12:
<item name="item9" value="233" />
and at the end it should looks like:
<Module bs="Mainfile_1">
<object id="1000" name="namex" abnumber="1">
<item name="item0" value="100" />
<item name="item00" value="100" />
</object>
<object id="1001" name="namey" abnumber="2">
<item name="item1" value="100" />
<item name="item00" value="100" />
</object>
<object id="1234" name="name1" abnumber="3">
<item name="item1" value="something11:
something11" />
<item name="item2" value="233" />
<item name="item3" value="233" />
<item name="item4" value="something12:
12something" />
</object>
with printed: 4
I started by doing something like that but I feel like I am doing everything wrong:
import os
Mainfile = 'path'
for filename in os.listdir(Mainfile):
lines = filename.readlines()
if not "</Module>" in lines:
with open(filename, 'r+', encoding="utf-8") as file:
line_list = list(file)
line_list.reverse()
for line in line_list:
if line.find('absno') != -1:
print(line)
You can use re to get your result :
<object([\s\S]*?)<\/object> to get correct <object... </object> tag
abnumber=\"([0-9.]+) to get abnnumber for incorrect tag
<Module.*|<object(?:[\s\S]*?)<\/object> to get correct format of xml data
import re
data = """<Module bs="Mainfile_1">
<object id="1000" name="namex" abnumber="1">
<item name="item0" value="100" />
<item name="item00" value="100" />
</object>
<object id="1001" name="namey" abnumber="2">
<item name="item1" value="100" />
<item name="item00" value="100" />
</object>
<object id="1234" name="name1" abnumber="3">
<item name="item1" value="something11:
something11" />
<item name="item2" value="233" />
<item name="item3" value="233" />
<item name="item4" value="something12:
12something" />
</object>
<object id="1238" name="name2" abnumber="4">
<item name="item8" value="something12:
<item name="item9" value="233" />"""
invalid_XML_Tag = re.sub("<object([\s\S]*?)<\/object>", '', data)
abnnumber_value = re.findall("abnumber=\"([0-9.]+)", invalid_XML_Tag)
print("abnumber of invalid tag => {0}".format(abnnumber_value))
correct_xml_format = re.findall("<Module.*|<object(?:[\s\S]*?)<\/object>",data)
print("".join(correct_xml_format))
Output:
abnumber of invalid tag => ['4']
<Module bs="Mainfile_1"><object id="1000" name="namex" abnumber="1">
<item name="item0" value="100" />
<item name="item00" value="100" />
</object><object id="1001" name="namey" abnumber="2">
<item name="item1" value="100" />
<item name="item00" value="100" />
</object><object id="1234" name="name1" abnumber="3">
<item name="item1" value="something11:
something11" />
<item name="item2" value="233" />
<item name="item3" value="233" />
<item name="item4" value="something12:
12something" />
</object>
I use BS4 to parser .xml,i want to get resattribute number,but get none
how to do it ?
source xml
`<digitizer id="1" integrated="true" csrmusttouch="falsehardprox="true"
physidcsrs="false" pnpid="49154" kind="MULTI_TOUCH" maxcsrs="10">
<monitor left="0" top="0" right="1920" bottom="1080" />`
<properties>
<property name="x" logmin="0" logmax="16383" res="621.7457275" unit="cm" hidusage="0x00010030" guid="{598A6A8F-52C0-4BA0-93AF-AF357411A561}" />
<property name="y" logmin="0" logmax="16383" res="983.9639893" unit="cm" hidusage="0x00010031" guid="{B53F9F75-04E0-4498-A7EE-C30DBB5A9011}" />
<property name="status" logmin="0" logmax="15" res="0" unit="DEFAULT" hidusage="0x000d0042, 0x000d003c, 0x000d0044" guid="{6E0E07BF-AFE7-4CF7-87D1-AF6446208418}" />
<property name="time" logmin="0" logmax="2147483647" res="1" unit="DEFAULT" guid="{436510C5-FED3-45D1-8B76-71D3EA7A829D}" />
<property name="contactid" logmin="0" logmax="31" res="1.861861944" unit="cm" hidusage="0x000d0051" guid="{02585B91-049B-4750-9615-DF8948AB3C9C}" />`
Python Code
a = data_xml.find('digitizer',id="1")
b = a.find('properties')
print(b.get('res'))
Result
None
I have taken your data as html
html="""<digitizer id="1" integrated="true" csrmusttouch="falsehardprox="true"
physidcsrs="false" pnpid="49154" kind="MULTI_TOUCH" maxcsrs="10">
<monitor left="0" top="0" right="1920" bottom="1080" />`
<properties>
<property name="x" logmin="0" logmax="16383" res="621.7457275" unit="cm" hidusage="0x00010030" guid="{598A6A8F-52C0-4BA0-93AF-AF357411A561}" />
<property name="y" logmin="0" logmax="16383" res="983.9639893" unit="cm" hidusage="0x00010031" guid="{B53F9F75-04E0-4498-A7EE-C30DBB5A9011}" />
<property name="status" logmin="0" logmax="15" res="0" unit="DEFAULT" hidusage="0x000d0042, 0x000d003c, 0x000d0044" guid="{6E0E07BF-AFE7-4CF7-87D1-AF6446208418}" />
<property name="time" logmin="0" logmax="2147483647" res="1" unit="DEFAULT" guid="{436510C5-FED3-45D1-8B76-71D3EA7A829D}" />
<property name="contactid" logmin="0" logmax="31" res="1.861861944" unit="cm" hidusage="0x000d0051" guid="{02585B91-049B-4750-9615-DF8948AB3C9C}" />"""
from bs4 import BeautifulSoup
soup=BeautifulSoup(html,"html.parser")
Code::
You can find all property tag and then find res value associate to it!
a = soup.find('digitizer',attrs={"id":"1"})
properties=a.find_all("property")
res_lst=[i['res'] for i in properties]
Output::
['621.7457275', '983.9639893', '0', '1', '1.861861944']
Your xml seems poorly formatted, after reformatting it:
<digitizer id="1" integrated="true" csrmusttouch="" falsehardprox="true" physidcsrs="false" pnpid="49154" kind="MULTI_TOUCH" maxcsrs="10">
<monitor left="0" top="0" right="1920" bottom="1080"/>
<properties>
<property name="x" logmin="0" logmax="16383" res="621.7457275" unit="cm" hidusage="0x00010030" guid="{598A6A8F-52C0-4BA0-93AF-AF357411A561}" />
<property name="y" logmin="0" logmax="16383" res="983.9639893" unit="cm" hidusage="0x00010031" guid="{B53F9F75-04E0-4498-A7EE-C30DBB5A9011}" />
<property name="status" logmin="0" logmax="15" res="0" unit="DEFAULT" hidusage="0x000d0042, 0x000d003c, 0x000d0044" guid="{6E0E07BF-AFE7-4CF7-87D1-AF6446208418}" />
<property name="time" logmin="0" logmax="2147483647" res="1" unit="DEFAULT" guid="{436510C5-FED3-45D1-8B76-71D3EA7A829D}" />
<property name="contactid" logmin="0" logmax="31" res="1.861861944" unit="cm" hidusage="0x000d0051" guid="{02585B91-049B-4750-9615-DF8948AB3C9C}" />
You can easily parse it like this:
from bs4 import BeautifulSoup
with open('data.xml') as raw_resuls:
results = BeautifulSoup(raw_resuls, 'lxml')
for element in results.find_all("properties"):
for property_tag in element.find_all("property"):
print(property_tag['res'])
Output:
621.7457275
983.9639893
0
1
1.861861944
You can find more info about parsing attribute values from xml in the tutorial where the code is from.
Edit: Note that I slightly modified the code to fit your question.
I have a problem with change of the atribute at the xml file.
My tree looks like that
<Objects>
<BigObj Version="2.2" Name="Something">
<ItemList>
<Item Name="s_1" Selected="false"/>
<Item Name="s_2" Selected="false"/>
<Item Name="s_3" Selected="true"/>
<Item Name="s_4" Selected="false"/>
</ItemList>
</BigObj >
</Objects>
And i need to check if "s_x"is in list of names and if it is then change the value of Selected to true, if it's not to false (or keep it false)
I've tried to do that with this code:
lslist = ["s_1","s_4"]
for child in root.findall("./Objects/BigObj/ItemList/Item"):
for idx in lslist:
if idx in child.find("Name").text:
child.set('Selected', "true")
else:
child.set('Selected', "false")
But i have an AttributeError: 'NoneType' object has no attribute 'text'
The below works
import xml.etree.ElementTree as ET
lslist = ["s_1", "s_4"]
xml = '''<Objects>
<BigObj Version="2.2" Name="Something">
<ItemList>
<Item Name="s_1" Selected="false"/>
<Item Name="s_2" Selected="false"/>
<Item Name="s_3" Selected="true"/>
<Item Name="s_4" Selected="false"/>
</ItemList>
</BigObj ></Objects>'''
root = ET.fromstring(xml)
items = root.findall('.//Item')
for item in items:
item.attrib['Selected'] = str(item.attrib['Name'] in lslist)
ET.dump(root)
output
<Objects>
<BigObj Version="2.2" Name="Something">
<ItemList>
<Item Name="s_1" Selected="True" />
<Item Name="s_2" Selected="False" />
<Item Name="s_3" Selected="False" />
<Item Name="s_4" Selected="True" />
</ItemList>
</BigObj></Objects>
thanks for taking the time with this one.
i have an xml file with an element called selectionset. the idea is to take that element and modify some of the subelements attributes and tails, that part i have done.
the shady thing for me to get is why when i try to add the new subelements to the original (called selectionsets) its only pushing the last on the list inplist
import xml.etree.ElementTree as etree
from xml.etree.ElementTree import *
from xml.etree.ElementTree import ElementTree
tree=ElementTree()
tree.parse('STRUCTURAL.xml')
root = tree.getroot()
col=tree.find('selectionsets/selectionset')
#find the value needed
val=tree.findtext('selectionsets/selectionset/findspec/conditions/condition/value/data')
setname=col.attrib['name']
listnames=val + " 6"
inplist=["D","E","F","G","H"]
entry=3
catcher=[]
ss=root.find('selectionsets')
outxml=ss
for i in range(len(inplist)):
str(val)
col.set('name',(setname +" "+ inplist[i]))
col.find('findspec/conditions/condition/value/data').text=str(inplist[i]+val[1:3])
#print (etree.tostring(col)) #everything working well til this point
timper=col.find('selectionset')
root[0].append(col)
# new=etree.SubElement(outxml,timper)
#you need to create a tree with element tree before creating the xml file
itree=etree.ElementTree(outxml)
itree.write('Selection Sets.xml')
print (etree.tostring(outxml))
# print (Test_file.selectionset())
#Initial xml
<?xml version="1.0" encoding="UTF-8" ?>
<exchange xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://download.autodesk.com/us/navisworks/schemas/nw-exchange-12.0.xsd" units="ft" filename="STRUCTURAL.nwc" filepath="C:\Users\Ricardo\Desktop\Comun\Taller 3">
<selectionsets>
<selectionset name="Column Location" guid="565f5345-de06-4f5b-aa0f-1ae751c98ea8">
<findspec mode="all" disjoint="0">
<conditions>
<condition test="contains" flags="10">
<category>
<name internal="LcRevitData_Element">Element</name>
</category>
<property>
<name internal="lcldrevit_parameter_-1002563">Column Location Mark</name>
</property>
<value>
<data type="wstring">C-A </data>
</value>
</condition>
</conditions>
<locator>/</locator>
</findspec>
</selectionset>
</selectionsets>
</exchange>
#----Current Output
<selectionsets>
<selectionset guid="565f5345-de06-4f5b-aa0f-1ae751c98ea8" name="Column Location H">
<findspec disjoint="0" mode="all">
<conditions>
<condition flags="10" test="contains">
<category>
<name internal="LcRevitData_Element">Element</name>
</category>
<property>
<name internal="lcldrevit_parameter_-1002563">Column Location Mark</name>
</property>
<value>
<data type="wstring">H-A</data>
</value>
</condition>
</conditions>
<locator>/</locator>
</findspec>
</selectionset>
<selectionset guid="565f5345-de06-4f5b-aa0f-1ae751c98ea8" name="Column Location H">
<findspec disjoint="0" mode="all">
<conditions>
<condition flags="10" test="contains">
<category>
<name internal="LcRevitData_Element">Element</name>
</category>
<property>
<name internal="lcldrevit_parameter_-1002563">Column Location Mark</name>
</property>
<value>
<data type="wstring">H-A</data>
</value>
</condition>
</conditions>
<locator>/</locator>
</findspec>
</selectionset>
<selectionset guid="565f5345-de06-4f5b-aa0f-1ae751c98ea8" name="Column Location H">
<findspec disjoint="0" mode="all">
<conditions>
<condition flags="10" test="contains">
<category>
<name internal="LcRevitData_Element">Element</name>
</category>
<property>
<name internal="lcldrevit_parameter_-1002563">Column Location Mark</name>
</property>
<value>
<data type="wstring">H-A</data>
</value>
</condition>
</conditions>
<locator>/</locator>
</findspec>
</selectionset>
<selectionset guid="565f5345-de06-4f5b-aa0f-1ae751c98ea8" name="Column Location H">
<findspec disjoint="0" mode="all">
<conditions>
<condition flags="10" test="contains">
<category>
<name internal="LcRevitData_Element">Element</name>
</category>
<property>
<name internal="lcldrevit_parameter_-1002563">Column Location Mark</name>
</property>
<value>
<data type="wstring">H-A</data>
</value>
</condition>
</conditions>
<locator>/</locator>
</findspec>
</selectionset>
<selectionset guid="565f5345-de06-4f5b-aa0f-1ae751c98ea8" name="Column Location H">
<findspec disjoint="0" mode="all">
<conditions>
<condition flags="10" test="contains">
<category>
<name internal="LcRevitData_Element">Element</name>
</category>
<property>
<name internal="lcldrevit_parameter_-1002563">Column Location Mark</name>
</property>
<value>
<data type="wstring">H-A</data>
</value>
</condition>
</conditions>
<locator>/</locator>
</findspec>
</selectionset>
<selectionset guid="565f5345-de06-4f5b-aa0f-1ae751c98ea8" name="Column Location H">
<findspec disjoint="0" mode="all">
<conditions>
<condition flags="10" test="contains">
<category>
<name internal="LcRevitData_Element">Element</name>
</category>
<property>
<name internal="lcldrevit_parameter_-1002563">Column Location Mark</name>
</property>
<value>
<data type="wstring">H-A</data>
</value>
</condition>
</conditions>
<locator>/</locator>
</findspec>
</selectionset>
</selectionsets>
Here's what I've been able to put together and it looks like it'll do what you're looking for. Here are the main differences: (1) This will iterate over multiple selectionset items (if you end up with more than one), (2) It creates a deepcopy of the element before modifying the values (I think you were always modifying the original "col"), (3) It appends the new selectionset to the selectionsets tag rather than the root.
Here's the deepcopy documentation
import xml.etree.ElementTree as etree
import copy
tree=etree.ElementTree()
tree.parse('test.xml')
root = tree.getroot()
inplist=["D","E","F","G","H"]
for selectionset in tree.findall('selectionsets/selectionset'):
for i in inplist:
col = copy.deepcopy(selectionset)
col.set('name', '%s %s' % (col.attrib['name'], i))
data = col.find('findspec/conditions/condition/value/data')
data.text = '%s%s' % (i, data.text[1:3])
root.find('selectionsets').append(col)
itree = etree.ElementTree(root)
itree.write('Selection Sets.xml')
I face a problem with Python and Selenium.
I want to click a link, this is my Py code:
toubao_luru_xpath='//div[87]/xml/items/item[2]/item[#path=policynewbiz/inputapplication/chooseproduct.jsp]'
#url=policynewbiz/inputapplication/chooseproduct.jsp
print WebDriverWait(browser,10).until(EC.presence_of_element_located((By.XPATH,toubao_luru_xpath)))
print browser.find_element_by_xpath(toubao_luru_xpath)
#print browser.find_element_by_xpath(toubao_luru_xpath).click()
The error is:
File
"C:\Python27\lib\site-packages\selenium\webdriver\support\wait.py",
line 80, in until
raise TimeoutException(message, screen, stacktrace) selenium.common.exceptions.TimeoutException: Message: Yes,
And this is the HTML code:
<html>
<DIV style="DISPLAY: none"><xml id=__menu>
<items>
<item name="quotation" label="散单报价" >
<item name="input" label="录入" path="policyquotation/createquotation/chooseproduct.jsp" icon="../image/icon/1.gif" visible="false" ></item>
<item name="quotationinput" label="录入" icon="../image/icon/1.gif" visible="false" command="commandCreateQuotationTemplate" ></item>
<item name="quotationinput2014" label="录入" path="policyquotation_v2/chooseproduct2014.jsp" icon="../image/icon/1.gif" ></item>
<item name="enterquotation" label="enterquotation" visible="false" ></item>
<item name="queryQuotation" label="查询" path="policyquotation/qryquotationlist.jsp" icon="../image/icon/2.gif" visible="false" ></item>
<item name="queryQuotation2" label="查询" path="policyquotation_v2/qryquotationlist.jsp" icon="../image/icon/2.gif" ></item>
<item name="packageWork" label="套餐指定" path="policyquotation/package-manage-work.jsp" icon="../image/icon/3.gif" visible="false" ></item>
<item name="querypackage" label="套餐管理" path="policyquotation/query-package-list.jsp" icon="../image/icon/4.gif" visible="false" ></item>
<item name="quotationfollow" label="报价跟进" path="policyquotation/followquotation.jsp" icon="../image/icon/5.gif" visible="false" ></item>
<item name="entererror" label="entererror" path="error.jsp" visible="false" ></item>
<item name="quotationfollownew" label="报价跟进" path="policyquotation_v2/followquotation.jsp" icon="../image/icon/5.gif" visible="false" ></item>
</item>
<item name="application" label="投保" >
<item name="input" label="录入" path="policynewbiz/inputapplication/chooseproduct.jsp" icon="../image/icon/1.gif" ></item>
</html>
The last <item> I want to click
This should work. It's unique given the HTML you provided and I'm assuming that there aren't two links to the same URL so it should be good.
driver.find_element_by_css_selector("item[path='policynewbiz/inputapplication/chooseproduct.jsp']").click()