I'm using the following XML:
<feed xmlns:im="http://itunes.apple.com/rss" xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
<title>iTunes Store: Top Free Apps</title>
<link rel="alternate" type="text/html" href="https://itunes.apple.com/WebObjects/MZStore.woa/wa/viewTop?cc=in&id=134581&popId=27"/>
<link rel="self" href="https://itunes.apple.com/IN/rss/topfreeapplications/limit=200/xml"/>
<name>iTunes Store</name>
<rights>Copyright 2008 Apple Inc.</rights>
<id im:id="473941634" im:bundleId="com.one97.paytm">https://itunes.apple.com/in/app/recharge-bill-payment-wallet/id473941634?mt=8&uo=2</id>
<title>Recharge, Bill Payment & Wallet - Paytm Mobile Solutions</title>
<im:name>Recharge, Bill Payment & Wallet</im:name>
<link rel="alternate" type="text/html" href="https://itunes.apple.com/in/app/recharge-bill-payment-wallet/id473941634?mt=8&uo=2"/>
<im:contentType term="Application" label="Application"/>
<category im:id="6024" term="Shopping" scheme="https://itunes.apple.com/in/genre/ios-shopping/id6024?mt=8&uo=2" label="Shopping"/>
<im:artist href="https://itunes.apple.com/in/developer/paytm-mobile-solutions/id473941637?mt=8&uo=2">Paytm Mobile Solutions</im:artist>
<im:price amount="0.00000" currency="INR">Get</im:price>
<im:image height="53">http://is1.mzstatic.com/image/thumb/Purple71/v4/9b/37/bf/9b37bf75-6b4d-9c95-a8a4-ea369f05ae7e/pr_source.png/53x53bb-85.png</im:image>
<im:image height="75">http://is5.mzstatic.com/image/thumb/Purple71/v4/9b/37/bf/9b37bf75-6b4d-9c95-a8a4-ea369f05ae7e/pr_source.png/75x75bb-85.png</im:image>
<im:image height="100">http://is5.mzstatic.com/image/thumb/Purple71/v4/9b/37/bf/9b37bf75-6b4d-9c95-a8a4-ea369f05ae7e/pr_source.png/100x100bb-85.png</im:image>
<rights>© One97 Communications Ltd</rights>
<im:releaseDate label="24 October 2011">2011-10-24T16:18:48-07:00</im:releaseDate>
<content type="html"></content>
I would like to extract the id information for each entry.
The attribute I need is "im:id". With this code:
from xml.dom import minidom
xmldoc = minidom.parse('topIN.xml')
itemlist = xmldoc.getElementsByTagName('link')
I get the attribute names for link:
[u'href', u'type', u'rel']
But when I do the same for id, nothing is returned.
Here is a version using xml.etree.ElementTree:
import xml.etree.ElementTree as ET
tree = ET.parse('topIN.xml')
root = tree.getroot()
ns={'im':"http://itunes.apple.com/rss", 'atom':"http://www.w3.org/2005/Atom"}
for id_ in root.findall('atom:entry/atom:id', ns):
    print(id_.attrib['{' + ns['im'] + '}id'])
Here is a version using lxml:
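from lxml import etree
ns = {'im': "http://itunes.apple.com/rss", 'atom': "http://www.w3.org/2005/Atom"}
root = etree.parse('topIN.xml').getroot()  # same input file as in the question
# '@' selects an attribute in XPath
print('\n'.join(root.xpath('atom:entry/atom:id/@im:id', namespaces=ns)))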
This worked:
from xml.dom import minidom
xmldoc = minidom.parse('topIN.xml')
itemlist = xmldoc.getElementsByTagName('entry')
for s in itemlist:
    print(s.getElementsByTagName('id')[0].attributes['im:id'].value)
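A namespace-aware variant of the minidom approach might look like the sketch below. It looks the attribute up by namespace URI rather than by the literal "im:" prefix, so it should keep working if a feed declares a different prefix. The inline sample is a trimmed stand-in for topIN.xml (one entry, with the ampersand escaped), and the helper name entry_app_ids is made up for illustration.

```python
from xml.dom import minidom

ATOM_NS = "http://www.w3.org/2005/Atom"
IM_NS = "http://itunes.apple.com/rss"

# Trimmed stand-in for topIN.xml; assumes each id element sits inside an entry.
SAMPLE = """<feed xmlns:im="http://itunes.apple.com/rss" xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id im:id="473941634" im:bundleId="com.one97.paytm">https://itunes.apple.com/in/app/recharge-bill-payment-wallet/id473941634?mt=8&amp;uo=2</id>
  </entry>
</feed>"""


def entry_app_ids(xml_text):
    """Collect the im:id attribute of the id element in every Atom entry."""
    doc = minidom.parseString(xml_text)
    ids = []
    for entry in doc.getElementsByTagNameNS(ATOM_NS, "entry"):
        id_node = entry.getElementsByTagNameNS(ATOM_NS, "id")[0]
        # Look the attribute up by namespace URI and local name.
        ids.append(id_node.getAttributeNS(IM_NS, "id"))
    return ids


print(entry_app_ids(SAMPLE))
```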
Is it possible to change the id attribute in a tag in simplekml?
import simplekml
kml = simplekml.Kml()
pnt = kml.newpoint(name='A Point')
pnt.coords = [(1.0, 2.0)]
This will generate the following document
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2" xmlns:gx="http://www.google.com/kml/ext/2.2">
<Document id="1">
<Folder id="2">
<Style id="5">
<IconStyle id="6">
<Icon id="7">
<Placemark id="4">
<name>A Point</name>
<Point id="3">
Note that some tags carry an id attribute that appears to be generated from an internal simplekml counter. I need to change the id assigned to the <Style id="5"> tag so that it reads <Style id="icon-1532-0288D1-nodesc-normal">.
This would change the icon on Google My Maps. How can I do this using simplekml?
According to the Getting Started documentation, https://simplekml.readthedocs.io/en/latest/gettingstarted.html, the id attribute is generated differently for different kinds of objects, e.g. 'feat_1', 'feat_2', 'geom_0':
import simplekml
kml = simplekml.Kml()
pnt = kml.newpoint(name="A Point")
print(kml.kml())
This is what is generated:
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2" xmlns:gx="http://www.google.com/kml/ext/2.2">
<Document id="feat_1">
<Placemark id="feat_2">
<name>A Point</name>
<Point id="geom_0">
<coordinates>0.0, 0.0, 0.0</coordinates>
I looked through the source code, and it looks like, at least in version 1.3.2, they moved away from that scheme and now generate ids by simply counting up:
class Kmlable(object):
    """Enables a subclass to be converted into KML."""
    _globalid = 0
    _currentroot = None
    _compiling = False
    _namespaces = ['xmlns="http://www.opengis.net/kml/2.2"', 'xmlns:gx="http://www.google.com/kml/ext/2.2"']
    def __init__(self):
        self._id = str(Kmlable._globalid)
        Kmlable._globalid += 1
        try:
            from collections import OrderedDict
            self._kml = OrderedDict()
        except ImportError:
            self._kml = {}
For some reason, these ids are exposed as a read-only property:
    @property
    def id(self):
        """The id string (read only)."""
        return self._id
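To make the "count up, then read-only" pattern concrete, here is a small stand-alone sketch. The class Node is a made-up stand-in and not simplekml code. It only illustrates plain Python behaviour: a property without a setter rejects assignment, while the underlying private attribute can still be overwritten. Whether simplekml copes with having _id overwritten on its own objects (for example in generated style references) is not verified here.

```python
class Node(object):
    """Hypothetical stand-in for Kmlable: a class-wide counter hands out ids."""
    _globalid = 0

    def __init__(self):
        self._id = str(Node._globalid)
        Node._globalid += 1

    @property
    def id(self):
        """The id string (read only)."""
        return self._id


a, b = Node(), Node()
print(a.id, b.id)  # ids come from the shared counter

try:
    b.id = "icon-1532-0288D1-nodesc-normal"
    assigned = True
except AttributeError:
    assigned = False  # a property without a setter rejects assignment

# Writing the private attribute directly bypasses the property.
b._id = "icon-1532-0288D1-nodesc-normal"
print(assigned, b.id)
```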
One hack-y way you could change this is something like (in base.py):
class Kmlable(object):
    """Enables a subclass to be converted into KML."""
    #_globalid = 0
    _globalid = 'your_string_here'
    _currentroot = None
    _compiling = False
    _namespaces = ['xmlns="http://www.opengis.net/kml/2.2"', 'xmlns:gx="http://www.google.com/kml/ext/2.2"']
    def __init__(self):
        self._id = str(Kmlable._globalid)
        #Kmlable._globalid += 1
        try:
            from collections import OrderedDict
            self._kml = OrderedDict()
        except ImportError:
            self._kml = {}
But that would set the same id for every object you have in your kml document:
import simplekml
kml = simplekml.Kml()
pnt = kml.newpoint(name="A Point")
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2" xmlns:gx="http://www.google.com/kml/ext/2.2">
<Document id="your_string_here">
<Placemark id="your_string_here">
<name>A Point</name>
<Point id="your_string_here">
<coordinates>0.0, 0.0, 0.0</coordinates>
Hopefully that helps your understanding. Not a full-stop answer, but too long to post in the comments.
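An alternative that leaves the library untouched is to post-process the generated KML text with the standard library. The sketch below is only an illustration under assumptions: the sample document is a hand-written reconstruction of the output shown in the question, the styleUrl element is assumed to reference the style as "#5", and rename_style is a made-up helper name. Whether Google My Maps then picks up the icon has not been checked.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
# Keep the default KML namespace unprefixed when serializing.
ET.register_namespace("", KML_NS)
ET.register_namespace("gx", "http://www.google.com/kml/ext/2.2")

# Hand-written stand-in for the string returned by kml.kml().
SAMPLE = """<kml xmlns="http://www.opengis.net/kml/2.2" xmlns:gx="http://www.google.com/kml/ext/2.2">
  <Document id="1">
    <Style id="5"><IconStyle id="6"><Icon id="7"/></IconStyle></Style>
    <Placemark id="4">
      <name>A Point</name>
      <styleUrl>#5</styleUrl>
      <Point id="3"><coordinates>1.0,2.0,0.0</coordinates></Point>
    </Placemark>
  </Document>
</kml>"""


def rename_style(kml_text, old_id, new_id):
    """Rename one Style id and every styleUrl that points at it."""
    root = ET.fromstring(kml_text)
    for style in root.iter("{%s}Style" % KML_NS):
        if style.get("id") == old_id:
            style.set("id", new_id)
    for url in root.iter("{%s}styleUrl" % KML_NS):
        if url.text == "#" + old_id:
            url.text = "#" + new_id
    return ET.tostring(root, encoding="unicode")


result = rename_style(SAMPLE, "5", "icon-1532-0288D1-nodesc-normal")
print(result)
```

In real use the first argument would be the string from kml.kml() rather than SAMPLE.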
I have some XML and I want to get the text of specific elements (XML tags) using Python.
I have tried a couple of solutions, but none of them worked.
import xml.etree.ElementTree as ET
tree = ET.fromstring(xml)
for node in tree.iter('Model'):
    print(node)
How can I do that?
XML:
<ResponseMessage xsi:nil="true" />
<ErrorCode xsi:nil="true" />
<RequestId> 2012290007705 </RequestId>
<AbsoluteOwner>SIYAPATHA FINANCE PLC</AbsoluteOwner>
<ClassOfVehicle>MOTOR CAR</ClassOfVehicle>
<SpecialConditions xsi:nil="true" />
Edited and improved answer:
import xml.etree.ElementTree as ET
import re
ns = {"veh": "http://schemas.conversesolutions.com/xsd/dmticta/v1"}
tree = ET.parse('test.xml') # save your xml as test.xml
root = tree.getroot()
def get_tag_name(tag):
    # strip the leading '{namespace}' part from a qualified tag name
    return re.sub(r'\{.*\}', '', tag)
for node in root.find(".//veh:return", ns):
    print(get_tag_name(node.tag) + ': ', node.text)
It should produce something like this:
ResponseMessage: None
ErrorCode: None
RequestId: 2012290007705
TransactionCharge: 150
VehicleNumber: GF-0176
EngineNo: GA15-483936F
ClassOfVehicle: MOTOR CAR
YearOfManufacture: 1998
NoOfSpecialConditions: 0
SpecialConditions: None
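For a version that can be run without test.xml, here is a minimal sketch of the same idea. The wrapper element and the exact layout of the sample are assumptions, since the full response is not shown in the question. Only the namespace URI and the field values come from the thread. The helper names local_name and fields are made up.

```python
import xml.etree.ElementTree as ET

NS = {"veh": "http://schemas.conversesolutions.com/xsd/dmticta/v1"}

# Hypothetical, trimmed response; the real document has more fields.
SAMPLE = """<veh:response xmlns:veh="http://schemas.conversesolutions.com/xsd/dmticta/v1"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <veh:return>
    <veh:ErrorCode xsi:nil="true" />
    <veh:RequestId> 2012290007705 </veh:RequestId>
    <veh:ClassOfVehicle>MOTOR CAR</veh:ClassOfVehicle>
  </veh:return>
</veh:response>"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def fields(xml_text):
    """Map each child of veh:return to its stripped text (None if empty)."""
    root = ET.fromstring(xml_text)
    container = root.find(".//veh:return", NS)
    result = {}
    for node in container:
        result[local_name(node.tag)] = node.text.strip() if node.text else None
    return result


print(fields(SAMPLE))
```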
I have the following XML format, and I want to pull out the values for name, region, and status using python's xml.etree.ElementTree module.
However, my attempt to get this information has been unsuccessful so far.
<title type="text"></title>
<content type="application/xml">
<NamespaceDescription xmlns="http://schemas.microsoft.com/netservices/2010/10/servicebus/connect" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<title type="text"></title>
<content type="application/xml">
<NamespaceDescription xmlns="http://schemas.microsoft.com/netservices/2010/10/servicebus/connect" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
My code attempt:
import xml.etree.ElementTree as et
NAMESPACE = '{http://www.w3.org/2005/Atom}'
root = et.fromstring(XML_STRING)
entry_root = root.findall('{0}entry'.format(NAMESPACE))
for child in entry_root:
    content_node = child.find('{0}content'.format(NAMESPACE))
    for content in content_node:
        for desc in content.iter():
            print(desc.tag)
            name = desc.find('{0}Name'.format(NAMESPACE))
            print(name)
desc.tag is giving me the nodes I want to access, but name is returning None. Any ideas what's wrong with my code?
Output of desc.tag:
I don't know why I didn't see this before. But, I was able to get the values.
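SB_NAMESPACE = '{http://schemas.microsoft.com/netservices/2010/10/servicebus/connect}'
root = et.fromstring(XML_STRING)
entry_root = root.findall('{0}entry'.format(NAMESPACE))
for child in entry_root:
    content_node = child.find('{0}content'.format(NAMESPACE))
    for descr in content_node:
        # Name is declared in the Service Bus namespace, not the Atom one
        name_node = descr.find('{0}Name'.format(SB_NAMESPACE))
        print(name_node.text)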
You can use lxml.etree along with default namespace mapping to parse the XML as follows:
content = '''
<title type="text"></title>
<content type="application/xml">
<NamespaceDescription xmlns="http://schemas.microsoft.com/netservices/2010/10/servicebus/connect" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<title type="text"></title>
<content type="application/xml">
<NamespaceDescription xmlns="http://schemas.microsoft.com/netservices/2010/10/servicebus/connect" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
from lxml import etree
tree = etree.XML(content)
ns = {'default': 'http://schemas.microsoft.com/netservices/2010/10/servicebus/connect'}
names = tree.xpath('//default:Name/text()', namespaces=ns)
regions = tree.xpath('//default:Region/text()', namespaces=ns)
statuses = tree.xpath('//default:Status/text()', namespaces=ns)
['instancename', 'instancename2']
['US', 'US2']
['Active', 'Active']
This XPath/namespace functionality can be adapted to output the data in any format you require.
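The same lookup can also be done with the standard library only. The sketch below assumes a feed layout (feed, entry, content, NamespaceDescription with Name, Region and Status children), because the XML in the question is elided. The prefixes atom and sb and the function name describe are arbitrary choices. It also shows why the original attempt returned None: Name belongs to the Service Bus namespace, so it has to be searched with that namespace rather than the Atom one.

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "sb": "http://schemas.microsoft.com/netservices/2010/10/servicebus/connect",
}

# Reconstructed sample; the real feed in the question is only partly shown.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title type="text"></title>
    <content type="application/xml">
      <NamespaceDescription xmlns="http://schemas.microsoft.com/netservices/2010/10/servicebus/connect">
        <Name>instancename</Name>
        <Region>US</Region>
        <Status>Active</Status>
      </NamespaceDescription>
    </content>
  </entry>
  <entry>
    <title type="text"></title>
    <content type="application/xml">
      <NamespaceDescription xmlns="http://schemas.microsoft.com/netservices/2010/10/servicebus/connect">
        <Name>instancename2</Name>
        <Region>US2</Region>
        <Status>Active</Status>
      </NamespaceDescription>
    </content>
  </entry>
</feed>"""


def describe(xml_text):
    """Return one (name, region, status) tuple per entry."""
    root = ET.fromstring(xml_text)
    rows = []
    path = "atom:entry/atom:content/sb:NamespaceDescription"
    for desc in root.findall(path, NS):
        rows.append(tuple(
            desc.findtext("sb:" + field, namespaces=NS)
            for field in ("Name", "Region", "Status")
        ))
    return rows


print(describe(SAMPLE))
```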
Example xml:
<response version-api="2.0">
<book available="20" id="1" tags="">
<author id="1" tags="Joel">Manuel De Cervantes</author>
<book available="14" id="2" tags="Jane">
<title>Catcher in the Rye</title>
<author id="2" tags="">JD Salinger</author>
<book available="13" id="3" tags="">
<author id="3">Lewis Carroll</author>
<book available="5" id="4" tags="Harry">
<author id="4">Manuel De Cervantes</author>
I want to append a string value of my choosing to every attribute called "tags", whether or not the attribute already has a value. The attributes occur at different levels of the XML structure. I tried findall(), but I keep getting the error "IndexError: list index out of range". This is the code I have so far; it is incomplete because I am not sure what else is needed:
splitter = etree.XMLParser(strip_cdata=False)
xmldoc = etree.parse(os.path.join(root, xml_file), splitter ).getroot()
for child in xmldoc:
    if child.tag != 'response':
        allDescendants = list(etree.findall())
        for child in allDescendants:
            if hasattr(child, 'tags'):
                child.attribute["tags"].value = "someString"
findall() is the right API to use. Here is an example:
from lxml import etree
import os
splitter = etree.XMLParser(strip_cdata=False)
xml_file = 'foo.xml'
root = '.'
xmldoc = etree.parse(os.path.join(root, xml_file), splitter ).getroot()
# '@tags' matches every element that has a "tags" attribute, at any depth
for element in xmldoc.findall(".//*[@tags]"):
    element.attrib["tags"] += " KILROY!"
print(etree.tostring(xmldoc))
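If lxml is not available, the same path expression also works with xml.etree.ElementTree from the standard library. The sketch below uses a shortened copy of the example XML with the closing tags filled in, which is an assumption about the original layout. The function name append_to_tags is made up.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example; closing tags are reconstructed.
SAMPLE = """<response version-api="2.0">
  <book available="20" id="1" tags="">
    <author id="1" tags="Joel">Manuel De Cervantes</author>
  </book>
  <book available="14" id="2" tags="Jane">
    <title>Catcher in the Rye</title>
    <author id="2" tags="">JD Salinger</author>
  </book>
  <book available="13" id="3" tags="">
    <author id="3">Lewis Carroll</author>
  </book>
</response>"""


def append_to_tags(xml_text, suffix):
    """Append suffix to every existing "tags" attribute, at any depth."""
    root = ET.fromstring(xml_text)
    for element in root.findall(".//*[@tags]"):
        element.set("tags", element.get("tags") + suffix)
    return root


root = append_to_tags(SAMPLE, " KILROY!")
print([el.get("tags") for el in root.iter() if "tags" in el.attrib])
```

Elements without a "tags" attribute, such as the third author, are left untouched.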
Here is a little xml example:
<?xml version="1.0" encoding="UTF-8"?>
<person id="1">
<city>New York</city>
<person id="2">
Now I need all Persons with a name and city.
I tried:
# coding: utf8
import xml.dom.minidom as dom
tree = dom.parse("test.xml")
for listItems in tree.firstChild.childNodes:
    for personItems in listItems.childNodes:
        if personItems.nodeName == "name" and personItems.nextSibling == "city":
            print(personItems.firstChild.data.strip())
But the output is empty. Without the "and" condition I get all names. How can I check that the next tag after "name" is "city"?
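On the nextSibling check specifically: nextSibling is a node object, so comparing it with the string "city" is never true, and in pretty-printed XML the node directly after name is usually a whitespace text node rather than the city element. A minimal sketch that skips non-element siblings follows. The sample data is a guess at the elided file (the list root and the second person's name are assumptions), and the helper names are made up.

```python
import xml.dom.minidom as minidom

# Assumed shape of test.xml; only "Smith" and "New York" come from the thread.
SAMPLE = """<list>
  <person id="1">
    <name>Smith</name>
    <city>New York</city>
  </person>
  <person id="2">
    <name>Pitt</name>
  </person>
</list>"""


def next_element(node):
    """Return the next sibling that is an element, skipping text nodes."""
    sibling = node.nextSibling
    while sibling is not None and sibling.nodeType != sibling.ELEMENT_NODE:
        sibling = sibling.nextSibling
    return sibling


def names_followed_by_city(xml_text):
    """Collect the text of every name element whose next element is city."""
    doc = minidom.parseString(xml_text)
    names = []
    for name in doc.getElementsByTagName("name"):
        following = next_element(name)
        if following is not None and following.nodeName == "city":
            names.append(name.firstChild.data.strip())
    return names


print(names_followed_by_city(SAMPLE))
```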
You can do this in minidom:
import xml.dom.minidom as minidom
def getChild(n, v):
    # yield the direct children of n whose local name is v
    for child in n.childNodes:
        if child.localName == v:
            yield child
xmldoc = minidom.parse('test.xml')
person = getChild(xmldoc, 'list')
for p in person:
    for v in getChild(p, 'person'):
        attr = v.getAttributeNode('id')
        if attr:
            print(attr.nodeValue.strip())
This prints the id of each person node:
You can use ElementTree (xml.etree.ElementTree) for this:
import xml.etree.ElementTree as ET
tree = ET.parse('a.xml')
root = tree.getroot()
for person in root.findall('person'):
    name = person.find('name')
    city = person.find('city')
    # skip persons that lack either element
    if name is not None and city is not None:
        print(name.text, city.text)
To get the id, use id = person.get('id').
Output: Smith New York
Using lxml, you can use xpath to get in one step what you need:
from lxml import etree
xmlstr = """
<person id="1">
<city>New York</city>
<person id="2">
xml = etree.fromstring(xmlstr)
xp = "//person[city]"
for person in xml.xpath(xp):
    print(etree.tostring(person))
lxml is an external Python package, but it is so useful that I find it is always worth installing.
The XPath expression searches for any (//) person element that has (as declared by the content of []) a city subelement.
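The child-element predicate is also part of the limited XPath subset that xml.etree.ElementTree supports, so a similar filter is possible without lxml. This is a sketch on assumed sample data: the list root and the second person's name are guesses, since the original XML is elided. Chaining [name][city] requires both children to be present.

```python
import xml.etree.ElementTree as ET

# Assumed sample; only "Smith" and "New York" appear in the thread.
SAMPLE = """<list>
  <person id="1">
    <name>Smith</name>
    <city>New York</city>
  </person>
  <person id="2">
    <name>Pitt</name>
  </person>
</list>"""


def persons_with_name_and_city(xml_text):
    """Return (id, name, city) for persons that have both child elements."""
    root = ET.fromstring(xml_text)
    found = []
    for person in root.findall("./person[name][city]"):
        found.append((
            person.get("id"),
            person.findtext("name"),
            person.findtext("city"),
        ))
    return found


print(persons_with_name_and_city(SAMPLE))
```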