I have the XML document below and I am trying to parse it. I only need to grab one node from the document: the text of serviceProfile. I'm banging my head against the desk here... I am new to Python.
<?xml version='1.0' encoding='UTF-8'?>
<soapenv:Envelope
xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
<soapenv:Body>
<ns:getUserResponse
xmlns:ns="http://www.cisco.com/AXL/API/11.5">
<return>
<user uuid="{blbhbl-bhblb-kbhb}">
<firstName>fname</firstName>
<displayName>fname lname</displayName>
<middleName/>
<lastName>lname</lastName>
<userid>wooty</userid>
<password/>
<pin/>
<mailid>wooty#woot.com</mailid>
<department/>
<manager/>
<userLocale />
<associatedDevices/>
<primaryExtension/>
<associatedPc/>
<enableCti>false</enableCti>
<digestCredentials/>
<phoneProfiles/>
<defaultProfile/>
<presenceGroupName uuid="{sdsds-sdsds-sdsdsd-sdsdsd-sdsd}">Standard Presence group</presenceGroupName>
<subscribeCallingSearchSpaceName/>
<enableMobility>false</enableMobility>
<enableMobileVoiceAccess>false</enableMobileVoiceAccess>
<maxDeskPickupWaitTime>10000</maxDeskPickupWaitTime>
<remoteDestinationLimit>4</remoteDestinationLimit>
<associatedRemoteDestinationProfiles/>
<associatedTodAccess/>
<status>1</status>
<enableEmcc>false</enableEmcc>
<associatedCapfProfiles/>
<ctiControlledDeviceProfiles/>
<patternPrecedence />
<numericUserId />
<mlppPassword />
<customUserFields/>
<homeCluster>true</homeCluster>
<imAndPresenceEnable>true</imAndPresenceEnable>
<serviceProfile uuid="{dsdsdsd-sdsdsd-sdsd-sdsds-sdsds}">1 IM Presence Only</serviceProfile>
<lineAppearanceAssociationForPresences/>
<directoryUri>blah#wooty.com</directoryUri>
<telephoneNumber>555-555-5555</telephoneNumber>
<title/>
<mobileNumber/>
<homeNumber/>
<pagerNumber/>
<extensionsInfo/>
<selfService />
<userProfile/>
<calendarPresence>false</calendarPresence>
<ldapDirectoryName uuid="{sdsd-sdsdsd-sdsds-sdsds}">someinfo</ldapDirectoryName>
<userIdentity>blah#woot.com</userIdentity>
<nameDialing>blehWoot</nameDialing>
<ipccExtension/>
<convertUserAccount uuid="{sdsd-sdsdsd-sdsds-sdsds}">someinfo</convertUserAccount>
<enableUserToHostConferenceNow>false</enableUserToHostConferenceNow>
<attendeesAccessCode/>
</user>
</return>
</ns:getUserResponse>
</soapenv:Body>
</soapenv:Envelope>
Based on @danielHaley's suggestions, I wrote the following code to retrieve the node.
import xml.etree.ElementTree as ET
# Read the XML response and get the service profile
# (response is the object returned by the earlier HTTP request)
tree = ET.ElementTree(ET.fromstring(response.content))
root = tree.getroot()
serviceprofile = root.find(".//serviceProfile").text
It worked great. Thank you so much for your help.
I am editing XML files and ran into a problem: when a Python script changes a file, the file's structure is lost.
XML file:
<?xml version="1.0" encoding="UTF-8"?>
<main>
<element formatVersion="1.0">
<firstValue>firstText</firstValue>
<secondValue>secondText</secondValue>
<thirdValue>thirdText</thirdValue>
<errors>
<path><![CDATA[path]]></path>
<code_main />
</errors>
<reference>3</reference>
</element>
....
</main>
Using:
tree = ET.parse(xml_file)
tree.write("test.xml", encoding='utf-8', xml_declaration=True)
I lose all comments in the file, and when I compare the original file with the modified one using diff (on Linux), the files are reported as completely different.
Is there a way to change the XML file (my task is to add a subelement to <element>) while leaving the overall structure of the file unchanged, including comments and element order?
The order and the comments are essential in this file.
UPD:
After running the code above, the source XML comes out in the following form:
<?xml version='1.0' encoding='utf-8'?>
<main>
<element formatVersion="1.0">
<firstValue>firstText</firstValue>
<secondValue>secondText</secondValue>
<thirdValue>thirdText</thirdValue>
<errors>
<path>path</path>
<code_main />
</errors>
<reference>3</reference>
</element>
</main>
Pay attention to <path>: the CDATA section is gone.
Comments are not preserved either:
Source:
<main>
<element formatVersion="1.0">
<firstValue>firstText</firstValue>
<secondValue>secondText</secondValue>
<thirdValue>thirdText</thirdValue>
<errors>
<path><![CDATA[path]]></path>
<!--Stt-->
<code_main />
</errors>
<reference>3</reference>
</element>
</main>
Modified:
<main>
<element formatVersion="1.0">
<firstValue>firstText</firstValue>
<secondValue>secondText</secondValue>
<thirdValue>thirdText</thirdValue>
<errors>
<path>path</path>
<code_main />
</errors>
<reference>3</reference>
</element>
</main>
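One possible approach with the standard library, offered as a sketch rather than a definitive fix: since Python 3.8, ET.TreeBuilder accepts insert_comments=True, which keeps comment nodes in the parsed tree so that they survive a write. The sample below is a trimmed copy of the file from the question, and the fourthValue tag is a hypothetical name for the added subelement. ElementTree still turns the CDATA section into plain text, so <path> will continue to differ from the original.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the file from the question.
SOURCE = """<main>
<element formatVersion="1.0">
<errors>
<path><![CDATA[path]]></path>
<!--Stt-->
<code_main />
</errors>
<reference>3</reference>
</element>
</main>"""

# insert_comments=True (Python 3.8+) keeps <!-- ... --> nodes in the tree.
parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
root = ET.fromstring(SOURCE, parser=parser)

# Add a subelement to <element>; "fourthValue" is a hypothetical tag name.
element = root.find("element")
new_child = ET.SubElement(element, "fourthValue")
new_child.text = "fourthText"

result = ET.tostring(root, encoding="unicode")
print(result)
```

If CDATA must survive as well, lxml's etree.XMLParser(strip_cdata=False) may be worth trying. Either way a byte-identical diff is not guaranteed, because the serializer normalizes details such as the quotes in the XML declaration.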
I am making a request to the Salesforce merge API and getting a response like this:
xml_result = '''<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns="urn:partner.soap.sforce.com">
<soapenv:Header>
<LimitInfoHeader>
<limitInfo>
<current>62303</current>
<limit>2680000</limit><type>API REQUESTS</type></limitInfo>
</LimitInfoHeader>
</soapenv:Header>
<soapenv:Body>
<mergeResponse>
<result>
<errors>
<message>invalid record type</message>
<statusCode>INSUFFICIENT_ACCESS_ON_CROSS_REFERENCE_ENTITY</statusCode>
</errors>
<id>003skdjf494244</id>
<success>false</success>
</result>
</mergeResponse>
</soapenv:Body>
</soapenv:Envelope>'''
I'd like to be able to parse this response and if success=false, return the errors, statusCode, and the message text.
I've tried the following:
import xml.etree.ElementTree as ET
root = ET.fromstring(xml_result)
root.find('mergeResponse')
root.find('{urn:partner.soap.sforce.com}mergeResponse')
root.findtext('mergeResponse')
root.findall('{urn:partner.soap.sforce.com}mergeResponse')
...and a bunch of other variations of find, findtext and findall but I can't seem to get these to return any results. Here's where I get stuck. I've tried to follow the ElementTree docs, but I don't understand how to parse the tree for specific elements.
Element.find() finds the first child with a particular tag
https://docs.python.org/2/library/xml.etree.elementtree.html#finding-interesting-elements
Since mergeResponse is a descendant, not a child, you should use XPath-syntax in this case:
root.find('.//{urn:partner.soap.sforce.com}mergeResponse')
will return your node. .// searches all descendants starting with the current node (in this case the root).
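To go one step further and pull out the error details when success is false, something along these lines might work. It is a minimal sketch on a trimmed copy of the response from the question; the "sf" prefix is just a local alias I chose for the default namespace, and the failures list is a hypothetical way of collecting the results.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response from the question (header omitted).
xml_result = """<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns="urn:partner.soap.sforce.com">
<soapenv:Body>
<mergeResponse>
<result>
<errors>
<message>invalid record type</message>
<statusCode>INSUFFICIENT_ACCESS_ON_CROSS_REFERENCE_ENTITY</statusCode>
</errors>
<id>003skdjf494244</id>
<success>false</success>
</result>
</mergeResponse>
</soapenv:Body>
</soapenv:Envelope>"""

# "sf" is an arbitrary local alias for the document's default namespace.
NS = {"sf": "urn:partner.soap.sforce.com"}

root = ET.fromstring(xml_result)
failures = []
for result in root.iterfind(".//sf:result", NS):
    # Only report results whose <success> element says "false".
    if result.findtext("sf:success", namespaces=NS) == "false":
        for error in result.findall("sf:errors", NS):
            failures.append((
                result.findtext("sf:id", namespaces=NS),
                error.findtext("sf:statusCode", namespaces=NS),
                error.findtext("sf:message", namespaces=NS),
            ))

print(failures)
```

Passing a prefix map avoids repeating the {urn:partner.soap.sforce.com} URI in every path.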
I am calling an API by sending an xml request by doing a string formatting like this:
data = '''<?xml version="1.0" encoding="utf-8"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<SOAP-ENV:Body>
<ns2:MultiAvailabilityRequest xmlns:m="http://www.derbysoft.com/doorway" Password="CoolJoe" Token="{token}" UserName="CoolJoe">
<ns2:MultiAvailabilityCriteria NumberOfUnits="{units}">
<ns2:StayDateRange CheckIn="2016-05-02" CheckOut="2016-05-04"/>
<ns2:GuestCounts>
<ns2:GuestCount AdultCount="{adultcount}"/>
</ns2:GuestCounts>
<ns2:HotelCodes>
<ns2:HotelCode>{hotelcode}</ns2:HotelCode>
</ns2:HotelCodes>
</ns2:MultiAvailabilityCriteria>
</ns2:MultiAvailabilityRequest>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>'''.format(token=token, units=units, adultcount=adultcount, hotelcode=hotelcode)
The above code works fine: it picks up the values of the hotel code, token, etc. and shows results based on them.
But I have another requirement where there can be more than one hotel code (two, three, or more), and the required XML then looks like this:
data = '''<?xml version="1.0" encoding="utf-8"?>
<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<SOAP-ENV:Body>
<ns2:MultiAvailabilityRequest xmlns:m="http://www.derbysoft.com/doorway" Password="CoolJoe" Token="{token}" UserName="CoolJoe">
<ns2:MultiAvailabilityCriteria NumberOfUnits="{units}">
<ns2:StayDateRange CheckIn="2016-05-02" CheckOut="2016-05-04"/>
<ns2:GuestCounts>
<ns2:GuestCount AdultCount="{adultcount}"/>
</ns2:GuestCounts>
<ns2:HotelCodes>
<ns2:HotelCode>{hotelcode1}</ns2:HotelCode>
<ns2:HotelCode>{hotelcode2}</ns2:HotelCode>
<ns2:HotelCode>{hotelcode3}</ns2:HotelCode>
</ns2:HotelCodes>
</ns2:MultiAvailabilityCriteria>
</ns2:MultiAvailabilityRequest>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>'''.format(token=token, units=units, adultcount=adultcount)
So my question is: how do I check whether two, or more than two, hotel codes are present? As you can see from the second XML, each hotel code adds a new line like this:
<ns2:HotelCode>{hotelcode1}</ns2:HotelCode>
Any help would be appreciated. Thanks.
Basically, you should split the process into two parts:
First, fill in the hotel codes (it doesn't really matter whether there is one or more):
hotelcode_string = ''.join('<ns2:HotelCode>{hotelcode}</ns2:HotelCode>'.format(hotelcode=code) for code in {item["hotelcode"] for item in hotelcode})
Then put the hotel code section into the XML:
data = '''.... <ns2:HotelCodes>{hotelcode_string}</ns2:HotelCodes>
...'''.format(token=token, units=units, adultcount=adultcount,hotelcode_string=hotelcode_string)
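The two steps above can be sketched as a small helper. This is only an illustration under a few assumptions: build_hotel_codes and TEMPLATE are hypothetical names, the codes arrive as a plain list of strings (not the list of dicts used above), and the values are escaped so that a character like & cannot break the XML. Unlike set(), dict.fromkeys removes duplicates while keeping the input order.

```python
from xml.sax.saxutils import escape


def build_hotel_codes(codes):
    """Return one <ns2:HotelCode> element per distinct code, in input order."""
    unique = list(dict.fromkeys(codes))  # de-duplicate, keep first occurrences
    return "".join(
        "<ns2:HotelCode>{}</ns2:HotelCode>".format(escape(str(code)))
        for code in unique
    )


# Hypothetical stand-in for the full SOAP template from the question.
TEMPLATE = "<ns2:HotelCodes>{hotelcode_string}</ns2:HotelCodes>"

fragment = TEMPLATE.format(
    hotelcode_string=build_hotel_codes(["H1", "H2", "H1", "A&B"])
)
print(fragment)
```

The same call works for one code or for many, so no check on the number of hotel codes is needed.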
I have the following XML document (the real one is much longer):
<?xml version ="1.0" encoding="UTF-8" standalone="no" ?>
<!DOCTYPE fmresultset PUBLIC "-//FMI//DTD fmresultset//EN" "http://localhost:16020/fmi/xml/fmresultset.dtd">
<fmresultset xmlns="http://www.filemaker.com/xml/fmresultset" version="1.0">
<error code="0">
</error>
<product build="11/11/2014" name="FileMaker Web Publishing Engine" version="13.0.5.518">
</product>
I use the following python to extract some of the tag names:
doc = etree.fromstring(resulttxt)
print(doc.attrib)
print(doc.tag)
print(doc[4][0][0].tag)
if doc[4][0][0].tag == 'field':
    print('hi')
What I'm getting though is:
{'version': '1.0'}
{http://www.filemaker.com/xml/fmresultset}fmresultset
{http://www.filemaker.com/xml/fmresultset}field
The xmlns doesn't show up as an attribute of the root tag, even though it is there.
It is also placed in front of each tag name, which makes it difficult to loop through the elements and use conditionals. I want doc.tag to show just the tag, not the namespace plus the tag.
This is day 1 for me using this. Could anyone help out?
You need to handle namespaces; in your case, a default namespace (one declared without a prefix):
from lxml import etree as ET
# Bytes literal: on Python 3, lxml rejects str input that carries an encoding declaration
data = b"""<?xml version ="1.0" encoding="UTF-8" standalone="no" ?>
<!DOCTYPE fmresultset PUBLIC "-//FMI//DTD fmresultset//EN" "http://localhost:16020/fmi/xml/fmresultset.dtd">
<fmresultset xmlns="http://www.filemaker.com/xml/fmresultset" version="1.0">
<error code="0">
</error>
<product build="11/11/2014" name="FileMaker Web Publishing Engine" version="13.0.5.518">
</product>
</fmresultset>
"""
namespaces = {
"myns": "http://www.filemaker.com/xml/fmresultset"
}
tree = ET.fromstring(data)
print(tree.find("myns:product", namespaces=namespaces).attrib.get("name"))
Prints:
FileMaker Web Publishing Engine
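If the goal is simply to compare tag names without the namespace in front, one option is to strip the {uri} part yourself. The sketch below uses the standard library's xml.etree.ElementTree on a shortened copy of the document; local_name is a hypothetical helper name, not part of any library.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the document from the question.
data = """<fmresultset xmlns="http://www.filemaker.com/xml/fmresultset" version="1.0">
<error code="0"></error>
<product build="11/11/2014" name="FileMaker Web Publishing Engine" version="13.0.5.518"></product>
</fmresultset>"""


def local_name(tag):
    """Drop a leading '{namespace-uri}' from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


doc = ET.fromstring(data)
root_name = local_name(doc.tag)
child_names = [local_name(child.tag) for child in doc]

print(root_name, child_names)

for child in doc:
    if local_name(child.tag) == "product":
        print(child.attrib["name"])
```

With lxml, etree.QName(element).localname gives the same local name without manual string handling.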
How can I remove all the attributes of an XML tag, so that I can get from this:
<xml blah blah blah> to just <xml>.
With lxml I know I can remove the whole element, but I didn't find any way to do this for just the attributes of a tag. (I found solutions on Stack Overflow for C#, but I want Python.)
I am opening a GPX (XML) file, and this is my code so far (based on "How do I get the whole content between two xml tags in Python?"):
from lxml import etree
t = etree.parse("1.gpx")
e = t.xpath('//trk')[0]
print(e.text + ''.join(map(etree.tostring, e))).strip()
Another approach I tried was this:
from lxml import etree
TOPOGRAFIX_NS = './/{http://www.topografix.com/GPX/1/1}'
TRACKPOINT_NS = TOPOGRAFIX_NS + 'extensions/{http://www.garmin.com/xmlschemas/TrackPointExtension/v1}TrackPointExtension/{http://www.garmin.com/xmlschemas/TrackPointExtension/v1}'
doc1 = etree.parse("1.gpx")
for node1 in doc1.findall(TOPOGRAFIX_NS + 'trk'):
    node_to_string1 = etree.tostring(node1)
    print(node_to_string1)
But I get the trk tag with the TOPOGRAFIX_NS attributes, which I don't want; it is these tag attributes that I want to remove. I just want to get:
<trk> all the inside content </trk>
Thank you very much!
P.S. The content of the gpx file:
<?xml version="1.0" encoding="UTF-8"?>
<gpx version="1.1" creator="Endomondo.com" xsi:schemaLocation="http://www.topografix.com/GPX/1/1 http://www.topografix.com/GPX/1/1/gpx.xsd http://www.garmin.com/xmlschemas/GpxExtensions/v3 http://www.garmin.com/xmlschemas/GpxExtensionsv3.xsd http://www.garmin.com/xmlschemas/TrackPointExtension/v1 http://www.garmin.com/xmlschemas/TrackPointExtensionv1.xsd" xmlns="http://www.topografix.com/GPX/1/1" xmlns:gpxtpx="http://www.garmin.com/xmlschemas/TrackPointExtension/v1" xmlns:gpxx="http://www.garmin.com/xmlschemas/GpxExtensions/v3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<metadata>
<author>
<name>Blah Blah</name>
<email id="blah" domain="blah.com"/>
</author>
<link href="http://www.endomondo.com">
<text>Endomondo</text>
</link>
<time>2014-01-20T10:50:28Z</time>
</metadata>
<trk>
<name>Galati</name>
<src>http://www.endomondo.com/</src>
<link href="http://www.endomondo.com/workouts/260782567/13005122">
<text>Galati</text>
</link>
<type>MOUNTAIN_BIKING</type>
<trkseg>
<trkpt lat="45.431074" lon="28.021038">
<time>2013-10-20T05:49:04Z</time>
</trkpt>
</trkseg>
</trk>
</gpx>
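One way to get a bare <trk> might be to strip the namespace from every tag before serializing, since the extra "attributes" on trk are namespace declarations rather than ordinary attributes. This is a minimal sketch using the standard library's xml.etree.ElementTree instead of lxml, on a shortened copy of the GPX file; it assumes only element tags (not attributes) are namespaced.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the GPX file from the question.
GPX = """<gpx version="1.1" xmlns="http://www.topografix.com/GPX/1/1">
<trk>
<name>Galati</name>
<trkseg>
<trkpt lat="45.431074" lon="28.021038">
<time>2013-10-20T05:49:04Z</time>
</trkpt>
</trkseg>
</trk>
</gpx>"""

root = ET.fromstring(GPX)
for elem in root.iter():
    # Tags look like '{uri}name'; keep only the local name.
    if isinstance(elem.tag, str) and elem.tag.startswith("{"):
        elem.tag = elem.tag.split("}", 1)[1]

# With the namespace gone, a plain tag name is enough to find the element.
trk = root.find("trk")
trk_xml = ET.tostring(trk, encoding="unicode").strip()
print(trk_xml)
```

Ordinary attributes such as lat and lon on trkpt are kept; only the namespace information is dropped.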