I have this xml:
<resources>
<string name="name1">value1</string>
<string name="name2">value2</string>
<string name="name3">value3</string>
<string name="name4">value4</string>
<string name="name5">value5</string>
</resources>
and I want to change the value of every string tag. I've tried with ElementTree but I can't get it to work.
This is what I have, but it doesn't work:
tree = ET.parse(archivo_xml)
root = tree.getroot()
cadena = root.findall('string')
cadena.text = "something"
root.findall() returns a list of elements, not a single element, so assigning to cadena.text raises an AttributeError instead of changing any tag.
Use root.iter('string') to find all the matching tags, then loop over the results and change the text value of each:
for cadena in root.iter('string'):
    cadena.text = "something"
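Here is a self-contained sketch of the same loop, assuming the XML is held in a string rather than in the file `archivo_xml` (the `xml_text` and `updated` names are only for illustration):

```python
import xml.etree.ElementTree as ET

# Assumption: inline sample data standing in for the file from the question.
xml_text = """<resources>
<string name="name1">value1</string>
<string name="name2">value2</string>
</resources>"""

root = ET.fromstring(xml_text)

# iter() walks the whole subtree and yields every <string> element.
for cadena in root.iter('string'):
    cadena.text = "something"

updated = ET.tostring(root, encoding="unicode")
print(updated)
```

When working from `ET.parse(...)` as in the question, calling `tree.write(archivo_xml)` after the loop should save the changes back to the file.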
I have an XML document in Python and need to get the elements of the "items" tag as an iterable list.
I need an iterable list from this XML, for example:
Item 1: Bicycle, value $250, iva_tax: 50.30
Item 2: Skateboard, value $120, iva_tax: 25.0
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<data>
<info>Listado de items</info>
<detalle>
<![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<tienda id="tiendaProd" version="1.1.0">
<items>
<item>
<nombre>Bicycle</nombre>
<valor>250</valor>
<data>
<tax name="iva" value="50.30"></tax>
</data>
</item>
<item>
<nombre>Skateboard</nombre>
<valor>120</valor>
<data>
<tax name="iva" value="25.0"></tax>
</data>
</item>
<item>
<nombre>Motorcycle</nombre>
<valor>900</valor>
<data>
<tax name="iva" value="120.50"></tax>
</data>
</item>
</items>
</tienda>]]>
</detalle>
</data>
I am working with xml.etree.ElementTree, for example:
import xml.etree.ElementTree as ET
xml = ET.fromstring(stringBase64)
ite = xml.find('.//detalle').text
tixml = ET.fromstring(ite)
You can use BeautifulSoup4 (BS4) to do this. Because the items sit inside a CDATA section, the text of <detalle> has to be parsed a second time.
from bs4 import BeautifulSoup
# Read the XML file as a single string
with open("example.xml", "r") as f:
    contents = f.read()
# Create a Soup object for the outer document
soup = BeautifulSoup(contents, 'xml')
# The CDATA payload is plain text, so parse it again to reach the <item> tags
inner = BeautifulSoup(soup.find("detalle").text.strip(), 'xml')
# Find all the <item> tags
item_tags = inner.find_all("item")
# Find the nombre and valor tags within each item
results = {}
for item in item_tags:
    nombre = item.find("nombre").text
    valor = item.find("valor").text
    results[nombre] = valor
# Print the dictionary of name/value pairs from the XML
print(results)
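If you would rather stay with ElementTree, as in the question, the same two-step parse can be sketched as below. This is only a sketch under some assumptions: the document is shortened to two items, and the names `outer_xml`, `inner` and `items` are mine, not from the question. Note the `strip()`: the CDATA text starts with a newline, and the inner XML declaration has to be the very first thing in the string.

```python
import xml.etree.ElementTree as ET

# Assumption: a shortened copy of the document from the question.
outer_xml = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<data>
<info>Listado de items</info>
<detalle>
<![CDATA[<?xml version="1.0" encoding="UTF-8"?>
<tienda id="tiendaProd" version="1.1.0">
<items>
<item>
<nombre>Bicycle</nombre>
<valor>250</valor>
<data>
<tax name="iva" value="50.30"></tax>
</data>
</item>
<item>
<nombre>Skateboard</nombre>
<valor>120</valor>
<data>
<tax name="iva" value="25.0"></tax>
</data>
</item>
</items>
</tienda>]]>
</detalle>
</data>"""

outer = ET.fromstring(outer_xml)

# The CDATA content is plain text to the outer parser, so parse it again.
inner = ET.fromstring(outer.find('detalle').text.strip())

items = []
for item in inner.iter('item'):
    items.append({
        'nombre': item.findtext('nombre'),
        'valor': item.findtext('valor'),
        # Pick the <tax> element whose name attribute is "iva".
        'iva_tax': item.find("data/tax[@name='iva']").get('value'),
    })

for number, entry in enumerate(items, start=1):
    print("Item %d: %s, value $%s, iva_tax: %s"
          % (number, entry['nombre'], entry['valor'], entry['iva_tax']))
```

`items` is an ordinary list of dictionaries, so it can be iterated, indexed or sorted like any other list.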
I have just started working with XML using ElementTree in my current project. My task is to change a subchild's value based on another subchild's value in the same child.
I have written code for that, but I feel there might be a way to improve it in terms of readability and performance.
Here is my code:
import xml.etree.ElementTree as ET
tree = ET.ElementTree(ET.fromstring("<Properties><Property><Name>KENT</Name><Value>99</Value></Property><Property><Name>JOHN</Name><Value>fifthy</Value></Property></Properties>"))
root = tree.getroot()
change_found = False
for item in root:
    for subItem in item:
        if change_found and subItem.tag == "Value":
            subItem.text = "50"
            change_found = False
        if subItem.tag == "Name" and subItem.text == "JOHN":
            change_found = True
print(ET.tostring(root, encoding='utf8', method='xml'))
As you can see from the code, when a subchild has the tag "Name" and the text "JOHN", change_found is set to True. Since the next subchild has the tag "Value", its text is changed (from "fifthy" to "50").
The code works fine, but I believe it can be improved.
You can assume that the structure of each property is always in this order:
<Property>
<Name> Some name </Name>
<Value> Some value </Value>
</Property>
You can also assume that only one Property has a "Name" subchild with the text "JOHN".
If I understand you correctly, you can get there more simply, using xpath:
root = ET.fromstring([your string above])
for p in root.findall('.//Property'):
    if p.find('.//Name').text.strip() == "JOHN":
        p.find('.//Value').text = "50"
print(ET.tostring(root).decode())
Output:
<Properties>
<Property>
<Name>KENT
</Name>
<Value>99
</Value>
</Property>
<Property>
<Name>JOHN
</Name>
<Value>50</Value>
</Property>
</Properties>
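For reference, here is a runnable variant of the same idea using the exact string from the question. It is a sketch, not the only way to do it: it looks only at direct children (`Property`, `Name`, `Value`) instead of using `.//`, and it stops after the first match because the question says there is only one JOHN. The `xml_text` and `result` names are illustrative.

```python
import xml.etree.ElementTree as ET

xml_text = ("<Properties>"
            "<Property><Name>KENT</Name><Value>99</Value></Property>"
            "<Property><Name>JOHN</Name><Value>fifthy</Value></Property>"
            "</Properties>")
root = ET.fromstring(xml_text)

for prop in root.findall('Property'):
    # findtext returns None when <Name> is missing, so fall back to "".
    if (prop.findtext('Name') or "").strip() == "JOHN":
        prop.find('Value').text = "50"
        break  # the question states that only one Property is named JOHN

result = ET.tostring(root, encoding="unicode")
print(result)
```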
I am trying to do the following with Python:
get the "price" value and change it
find "price_qty" and insert a new line with a new tier and a different price based on the "price".
So far I can find the price, change it, and insert a line in roughly the right place, but I can't find a way to create an "item" element with "qty" and "price" attributes there. Nothing has worked so far.
This is my original XML:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<body start="20.04.2014 10:02:60">
<pricelist>
<item>
<name>LEO - red pen</name>
<price>31,4</price>
<price_snc>0</price_snc>
<price_ao>0</price_ao>
<price_qty>
<item qty="150" price="28.20" />
<item qty="750" price="26.80" />
<item qty="1500" price="25.60" />
</price_qty>
<stock>50</stock>
</item>
</pricelist>
</body>
The new XML should look like this:
<pricelist>
<item>
<name>LEO - red pen</name>
<price>31,4</price>
<price_snc>0</price_snc>
<price_ao>0</price_ao>
<price_qty>
<item qty="10" price="31.20" /> <!-- this is the new line -->
<item qty="150" price="28.20" />
<item qty="750" price="26.80" />
<item qty="1500" price="25.60" />
</price_qty>
<stock>50</stock>
</item>
</pricelist>
my code so far:
import xml.etree.cElementTree as ET
from xml.etree.ElementTree import Element, SubElement
tree = ET.ElementTree(file='pricelist.xml')
root = tree.getroot()
pos=0
# price - raise the main price and insert new tier
for elem in tree.iterfind('pricelist/item/price'):
    price = elem.text
    newprice = (float(price.replace(",", ".")))*1.2
    newtier = "NEW TIER"
    SubElement(root[0][pos][5], newtier)
    pos+=1
tree.write('pricelist.xml', "UTF-8")
result:
...
<price_qty>
<item price="28.20" qty="150" />
<item price="26.80" qty="750" />
<item price="25.60" qty="1500" />
<NEW TIER /></price_qty>
thank you for any help.
Don't use fixed indexing. You already have the item element, so why not use it?
import xml.etree.ElementTree as ET
tree = ET.ElementTree(file='pricelist.xml')
root = tree.getroot()
for elem in tree.iterfind('pricelist/item'):
    price = elem.findtext('price')
    newprice = float(price.replace(",", ".")) * 1.2
    # Build the new tier with its attributes and put it first in price_qty
    newtier = ET.Element("item", qty="10", price="%.2f" % newprice)
    elem.find('price_qty').insert(0, newtier)
tree.write('pricelist.xml', "UTF-8")
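To try this without touching a file, here is a small in-memory sketch of the same approach. The XML is a shortened copy of the one in the question, and the `xml_text` and `tiers` names are only for illustration. It shows that `insert(0, ...)` places the new tier before the existing ones.

```python
import xml.etree.ElementTree as ET

# Assumption: a trimmed version of pricelist.xml held in a string.
xml_text = """<body>
<pricelist>
<item>
<name>LEO - red pen</name>
<price>31,4</price>
<price_qty>
<item qty="150" price="28.20" />
<item qty="750" price="26.80" />
</price_qty>
<stock>50</stock>
</item>
</pricelist>
</body>"""

root = ET.fromstring(xml_text)

for item in root.iterfind('pricelist/item'):
    # The price uses a decimal comma, so normalise it before converting.
    price = float(item.findtext('price').replace(",", "."))
    new_tier = ET.Element("item", qty="10", price="%.2f" % (price * 1.2))
    item.find('price_qty').insert(0, new_tier)

tiers = [(t.get('qty'), t.get('price'))
         for t in root.iterfind('pricelist/item/price_qty/item')]
print(tiers)
```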
In the XML below, I want to update the value of "alcohol" to "yes" for every element where age > 21. My problem is that the node is buried inside other nodes. Could someone help me understand how to handle this?
Here's the XML:
<root xmlns="XYZ" usingPalette="">
<grandParent hostName="XYZ">
<parent>
<child name="JohnsDad">
<grandChildren name="John" sex="male" age="22" alcohol="no"/>
</child>
<child name="PaulasDad">
<grandChildren name="Paula" sex="female" age="15" alcoho="no"/>
</child>
</parent>
</grandParent>
</root>
I tried the findall and find methods, following this document (http://pymotw.com/2/xml/etree/ElementTree/parse.html), but they didn't find the node. For example, the following code returns no results:
for node in tree.findall('.//grandParent'):
    print(node)
import xml.etree.ElementTree as ET
tree = ET.parse('data')
# tree.iter() visits every element (getiterator() was removed in Python 3.9)
for node in tree.iter():
    if int(node.attrib.get('age', 0)) > 21:
        node.attrib['alcohol'] = 'yes'
root = tree.getroot()
ET.register_namespace("", "XYZ")
print(ET.tostring(root).decode())
yields
<root xmlns="XYZ" usingPalette="">
<grandParent hostName="XYZ">
<parent>
<child name="JohnsDad">
<grandChildren age="22" alcohol="yes" name="John" sex="male" />
</child>
<child name="PaulasDad">
<grandChildren age="15" alcoho="no" name="Paula" sex="female" />
</child>
</parent>
</grandParent>
</root>
By the way, since the XML uses the namespace "XYZ", you must specify the namespace in your XPath:
for node in tree.findall('.//{XYZ}grandParent'):
    print(node)
That will return the grandParent element, but since you want to inspect all subnodes, I think iterating over the whole tree is easier here.
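If you prefer to target the grandChildren elements directly, findall also accepts a prefix-to-namespace mapping, which avoids writing `{XYZ}` in every path. A minimal sketch, assuming the XML from the question is in a string; the prefix `x` and the names `xml_text`, `ns` and `flags` are arbitrary choices of mine, and I have spelled the second attribute `alcohol` here:

```python
import xml.etree.ElementTree as ET

# Assumption: the question's XML, with the "alcoho" attribute spelled "alcohol".
xml_text = """<root xmlns="XYZ" usingPalette="">
<grandParent hostName="XYZ">
<parent>
<child name="JohnsDad">
<grandChildren name="John" sex="male" age="22" alcohol="no"/>
</child>
<child name="PaulasDad">
<grandChildren name="Paula" sex="female" age="15" alcohol="no"/>
</child>
</parent>
</grandParent>
</root>"""

root = ET.fromstring(xml_text)
ns = {'x': 'XYZ'}  # 'x' is an arbitrary prefix bound to the XYZ namespace

for node in root.findall('.//x:grandChildren', ns):
    if int(node.get('age', 0)) > 21:
        node.set('alcohol', 'yes')

flags = {n.get('name'): n.get('alcohol')
         for n in root.findall('.//x:grandChildren', ns)}
print(flags)
```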
To preserve comments while using xml.etree.ElementTree you could use the custom parser Fredrik Lundh shows here:
import xml.etree.ElementTree as ET
class PIParser(ET.XMLTreeBuilder):
    """
    http://effbot.org/zone/element-pi.htm
    """
    def __init__(self):
        ET.XMLTreeBuilder.__init__(self)
        # assumes ElementTree 1.2.X
        self._parser.CommentHandler = self.handle_comment
        self._parser.ProcessingInstructionHandler = self.handle_pi
        self._target.start("document", {})
    def close(self):
        self._target.end("document")
        return ET.XMLTreeBuilder.close(self)
    def handle_comment(self, data):
        self._target.start(ET.Comment, {})
        self._target.data(data)
        self._target.end(ET.Comment)
    def handle_pi(self, target, data):
        self._target.start(ET.PI, {})
        self._target.data(target + " " + data)
        self._target.end(ET.PI)
tree = ET.parse('data', PIParser())
Note that if you install lxml, you could instead use:
import lxml.etree as ET
parser = ET.XMLParser(remove_comments=False)
tree = ET.parse('data', parser=parser)
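On Python 3.8 or later, the standard library can also keep comments without a custom parser subclass, because TreeBuilder accepts `insert_comments` and `insert_pis` options. The XMLTreeBuilder class used above is not available in recent Python 3 releases, so this may be the easier route there. A minimal sketch, with an illustrative `xml_text` string instead of the 'data' file:

```python
import xml.etree.ElementTree as ET

# Assumption: a tiny sample document standing in for the 'data' file.
xml_text = "<root><!-- keep me --><child>text</child></root>"

# insert_comments / insert_pis are TreeBuilder options added in Python 3.8.
builder = ET.TreeBuilder(insert_comments=True, insert_pis=True)
parser = ET.XMLParser(target=builder)

root = ET.fromstring(xml_text, parser=parser)
output = ET.tostring(root, encoding="unicode")
print(output)
```

The same parser object can be passed to `ET.parse('data', parser=parser)` when reading from a file.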
my XML file is
<list>
<ProfileDefinition>
<string name="ID">nCGhwaZNpy6</string>
<string name="name">02.11.2013 Scott Mobile</string>
<decimal name="AccountID">10954</decimal>
<decimal name="TimeZoneID">-600</decimal>
</ProfileDefinition><ProfileDefinition>
<string name="ID">9JsG57bRUu6</string>
<string name="name">Huggies US-EN &amp; CA-EN Test Town Responsive - Prod</string>
<decimal name="AccountID">10954</decimal>
<decimal name="TimeZoneID">-600</decimal>
</ProfileDefinition><ProfileDefinition>
<string name="ID">I3CJQ4gDkK6</string>
<string name="name">Huggies US-EN Brand Desktop - Prod</string>
<decimal name="AccountID">10954</decimal>
<decimal name="TimeZoneID">-600</decimal></ProfileDefinition>
</list>
my code is
import urllib2
theurl = 'https://ws.webtrends.com/v2/ReportService/profiles/?format=xml'
pagehandle = urllib2.urlopen(theurl)
##########################################################################
from xml.dom.minidom import parseString
file = pagehandle
data = file.read()
file.close()
dom = parseString(data)
xmlTag = dom.getElementsByTagName('string name="ID"')[0].toxml()
xmlData=xmlTag.replace('<string name="ID">','').replace('</string>','')
print xmlTag
print xmlData
I want to get the value of the element with tag name 'string name="ID"',
but I get this error:
Traceback (most recent call last):
File "C:\Users\Vaibhav\Desktop\Webtrends\test.py", line 43, in <module>
xmlTag = dom.getElementsByTagName('string name="ID"')[0].toxml()
IndexError: list index out of range
If I replace
dom.getElementsByTagName('string name="ID"')[0].toxml()
with
dom.getElementsByTagName('string')[0].toxml()
the output is
"nCGhwaZNpy6"
since it is the first element of that list, but the second element is
"02.11.2013 Scott Mobile"
which is also stored in the list, and I don't want that.
There are two kinds of string tag, one with name="ID" and one with name="name".
How can I access only the string tags with name="ID"?
string name="ID" is not a tag name; only string is the tag name.
You have to compare the name attribute value of each string tag.
....
dom = parseString(data)
for s in dom.getElementsByTagName('string'):
    if s.getAttribute('name') == 'ID':
        print(s.childNodes[0].data)
I recommend using lxml or BeautifulSoup.
The following is equivalent code using lxml:
import lxml.html
dom = lxml.html.fromstring(data)
for s in dom.cssselect('string[name=ID]'):
    print(s.text)
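If you want to avoid extra dependencies, the standard library's ElementTree supports a small XPath subset that includes attribute predicates, so the filter can be written in the path itself. A sketch, assuming `data` holds a well-formed version of the XML from the question (shortened here; the `ids` name is illustrative):

```python
import xml.etree.ElementTree as ET

# Assumption: a shortened, well-formed copy of the response from the question.
data = """<list>
<ProfileDefinition>
<string name="ID">nCGhwaZNpy6</string>
<string name="name">02.11.2013 Scott Mobile</string>
</ProfileDefinition>
<ProfileDefinition>
<string name="ID">9JsG57bRUu6</string>
<string name="name">Huggies US-EN Brand Desktop - Prod</string>
</ProfileDefinition>
</list>"""

root = ET.fromstring(data)

# [@name='ID'] keeps only the <string> elements whose name attribute is "ID".
ids = [s.text for s in root.findall(".//string[@name='ID']")]
print(ids)
```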