Customizing dictionary conversion with dicttoxml?

I want to convert a dictionary to XML in Python, and I am using dicttoxml for that. My code looks like this:
>>> import os
>>> from xml.dom.minidom import parseString
>>> from dicttoxml import dicttoxml
>>> d = {}
>>> filepath=os.path.normpath('C:\\Users\\\\Desktop\\abc\\bc.txt')
>>> with open(filepath,'r') as f:
...     for line in f:
...         x=line.split(':')
...         x[-1]=x[-1].strip()
...         a=x[0]
...         b=x[1]
...         d[a]=b
...     xml = dicttoxml(d,custom_root='doc',attr_type=False)
...     xml2 = parseString(xml)
...     xml3 = xml2.toprettyxml()
...     print( xml3 )
The output is like this:
<?xml version="1.0" ?>
<doc>
<key name="product/productId">B000GKXY34</key>
<key name="product/title">Nun Chuck, Novelty Nun Toss Toy</key>
<key name="product/price">17.99</key>
<key name="review/userId">ADX8VLDUOL7BG</key>
<key name="review/profileName">M. Gingras</key>
<key name="review/helpfulness">0/0</key>
<key name="review/score">5.0</key>
<key name="review/time">1262304000</key>
<key name="review/summary">Great fun!</key>
<key name="review/text">Got these last Christmas as a gag gift. They are great fun, but obviously this is not a toy that lasts!</key>
</doc>
But I want each element to be named field instead of key, like this:
<field name="product/productId">B000GKXY34</field>
This is the dictionary generated by the code:
{'product/productId': 'B000GKXY34', 'product/title': 'Nun Chuck, Novelty Nun Toss Toy', 'product/price': '17.99', 'review/userId': 'ADX8VLDUOL7BG', 'review/profileName': 'M. Gingras', 'review/helpfulness': '0/0', 'review/score': '5.0', 'review/time': '1262304000', 'review/summary': 'Great fun!', 'review/text': 'Got these last Christmas as a gag gift. They are great fun, but obviously this is not a toy that lasts!'}
I also want to write that XML to a new file. I am trying the write function, but it's not working:
with open ('filename','w') as output:
    output.write(xml3)

According to dicttoxml documentation:
Define Custom Item Names
Starting in version 1.7, if you don’t want item elements in a list to be called ‘item’, you can specify the element name using a function that takes the parent element name (i.e. the list name) as an argument.
>>> import dicttoxml
>>> obj = {u'mylist': [u'foo', u'bar', u'baz'], u'mydict': {u'foo': u'bar', u'baz': 1}, u'ok': True}
>>> my_item_func = lambda x: 'list_item'
>>> xml = dicttoxml.dicttoxml(obj, item_func=my_item_func)
>>> print(xml)
<?xml version="1.0" encoding="UTF-8"?>
<root>
<mydict type="dict">
<foo type="str">bar</foo>
<baz type="int">1</baz>
</mydict>
<mylist type="list">
<list_item type="str">foo</list_item>
<list_item type="str">bar</list_item>
<list_item type="str">baz</list_item>
</mylist>
<ok type="bool">True</ok>
</root>

From the documentation, we have to use a function to provide the custom key name for the xml.
Reference - dicttoxml github
>>> import os
>>> from xml.dom.minidom import parseString
>>> from dicttoxml import dicttoxml
>>> d = {}
>>> filepath=os.path.normpath('C:\\Users\\\\Desktop\\abc\\bc.txt')
>>> with open(filepath,'r') as f:
...     for line in f:
...         x=line.split(':')
...         x[-1]=x[-1].strip()
...         a=x[0]
...         b=x[1]
...         d[a]=b
...     returnfield = lambda x: 'field'
...     xml = dicttoxml(d,custom_root='doc',attr_type=False, item_func=returnfield)
...     xml2 = parseString(xml)
...     xml3 = xml2.toprettyxml()
...     print( xml3 )
Output is :
<?xml version="1.0" ?>
<doc>
<field name="product/productId">B000GKXY34</field>
<field name="product/title">Nun Chuck, Novelty Nun Toss Toy</field>
<field name="product/price">17.99</field>
<field name="review/userId">ADX8VLDUOL7BG</field>
<field name="review/profileName">M. Gingras</field>
<field name="review/helpfulness">0/0</field>
<field name="review/score">5.0</field>
<field name="review/time">1262304000</field>
<field name="review/summary">Great fun!</field>
<field name="review/text">Got these last Christmas as a gag gift. They are great fun, but obviously this is not a toy that lasts!</field>
</doc>
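
If your installed dicttoxml version does not apply item_func to the fallback key element (the documentation quoted above describes item_func only for list items), a standard-library alternative is to build the elements yourself. The following is a minimal sketch using xml.etree.ElementTree and xml.dom.minidom. The helper name dict_to_field_xml and the file name output.xml are illustrative assumptions, not part of dicttoxml. It also shows writing the result to a file, which the question asked about.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from xml.dom.minidom import parseString


def dict_to_field_xml(d, root_name="doc", field_name="field"):
    """Return pretty-printed XML with one <field name="key">value</field> per entry.

    Hypothetical helper for illustration; not a dicttoxml function.
    """
    root = ET.Element(root_name)
    for key, value in d.items():
        child = ET.SubElement(root, field_name, {"name": key})
        child.text = str(value)
    # Serialize to a str, then re-parse with minidom only to pretty-print.
    return parseString(ET.tostring(root, encoding="unicode")).toprettyxml()


d = {"product/productId": "B000GKXY34", "product/price": "17.99"}
xml3 = dict_to_field_xml(d)
print(xml3)

# Write the XML text to a file. A temporary directory is used here so the
# sketch has no side effects; replace out_path with your own path.
out_path = os.path.join(tempfile.mkdtemp(), "output.xml")
with open(out_path, "w", encoding="utf-8") as output:
    output.write(xml3)
```

When reading "key: value" lines from the input file, line.split(':', 1) splits only at the first colon, so colons inside the value are kept.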

Related

How can I extract elementary values with ElementTree in Python?

I am trying to extract the values of fields identified by an attribute (e.g. 'Filename') from this XML file in Python.
Can you help me?
Here is the 'MC Librarytest.xml' file:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<MPL Version="2.0" Title="Library">
<Item>
<Field Name="Filename">Y:\Styx\08 - Styx - Snowblind8. Snowblind.flac</Field>
<Field Name="Name">Snowblind</Field>
<Field Name="Artist">Styx</Field>
<Field Name="Album">Paradise Theater</Field>
<Field Name="Genre">Rock</Field>
</Item>
<Item>
<Field Name="Filename">Y:\David Gilmour\04 A Boat Lies Waiting.flac</Field>
<Field Name="Name">A Boat Lies Waiting</Field>
<Field Name="Artist">David Gilmour</Field>
<Field Name="Album">Rattle That Lock (Deluxe)</Field>
<Field Name="Genre">Progressive</Field>
</Item>
</MPL>
I tried this:
import xml.etree.ElementTree as ET
xml_file = 'C:/Users/ClientMD/Downloads/MC Librarytest.xml'
tree = ET.parse(xml_file)
root = tree.getroot()
for each in root.findall('.//Field'):
    rating = each.find('.//Filename')
    print ('Nothing' if rating is None else rating.text)
and I get:
Nothing
...
Nothing

Select the Field elements by their Name attribute, like this:
import xml.etree.ElementTree as ET
xml_file = 'C:/Users/ClientMD/Downloads/MC Librarytest.xml'
tree = ET.parse(xml_file)
root = tree.getroot()
for each in root.findall('.//Field[@Name="Filename"]'):
    rating = each.text
    print ('Nothing' if rating is None else rating)
Output
Y:\Styx\08 - Styx - Snowblind8. Snowblind.flac
Y:\David Gilmour\04 A Boat Lies Waiting.flac

If you want to grab more elements and keep them under a single item context - you can use the below
import xml.etree.ElementTree as ET
xml = '''<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<MPL Version="2.0" Title="Library">
<Item>
<Field Name="Filename">Y:\\Styx\\08 - Styx - Snowblind8. Snowblind.flac</Field>
<Field Name="Name">Snowblind</Field>
<Field Name="Artist">Styx</Field>
<Field Name="Album">Paradise Theater</Field>
<Field Name="Genre">Rock</Field>
</Item>
<Item>
<Field Name="Filename">Y:\\David Gilmour\\04 A Boat Lies Waiting.flac</Field>
<Field Name="Name">A Boat Lies Waiting</Field>
<Field Name="Artist">David Gilmour</Field>
<Field Name="Album">Rattle That Lock (Deluxe)</Field>
<Field Name="Genre">Progressive</Field>
</Item>
</MPL>'''
INTERESTING_NAMES = ['Filename','Artist']
data = []
root = ET.fromstring(xml)
for item in root.findall('.//Item'):
    temp = {}
    for name in INTERESTING_NAMES:
        # Select the Field whose Name attribute matches
        temp[name] = item.find(f'Field[@Name="{name}"]').text
    data.append(temp)
print(data)
output
[{'Filename': 'Y:\\Styx\\08 - Styx - Snowblind8. Snowblind.flac', 'Artist': 'Styx'}, {'Filename': 'Y:\\David Gilmour\\04 A Boat Lies Waiting.flac', 'Artist': 'David Gilmour'}]
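
The original attempt printed Nothing because find('.//Filename') looks for a child element whose tag is Filename, whereas here Filename is the value of the Name attribute on a Field element. As a minimal sketch (with the sample data shortened for illustration), every Field of each Item can be collected into a dict keyed by that attribute using Element.get:

```python
import xml.etree.ElementTree as ET

# Shortened sample data, assumed to have the same shape as the library file.
xml = """<MPL Version="2.0" Title="Library">
<Item>
<Field Name="Name">Snowblind</Field>
<Field Name="Artist">Styx</Field>
</Item>
<Item>
<Field Name="Name">A Boat Lies Waiting</Field>
<Field Name="Artist">David Gilmour</Field>
</Item>
</MPL>"""

root = ET.fromstring(xml)

# One dict per <Item>, keyed by each Field's Name attribute.
items = [
    {field.get("Name"): field.text for field in item.findall("Field")}
    for item in root.findall("Item")
]
print(items)
```

This avoids listing the interesting names up front, at the cost of keeping fields you may not need.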

Fetch xml tag values recursively using ElementTree

I have an XML document of this type:
<SCHOOL>
<GROUP name="GetStudInfo">
<DATA>
<NAME type="char">Sahil Jha</NAME>
<STD>11th</STD>
</DATA>
<DATA>
<NAME type="char">Rashmi Kaur</NAME>
<STD>11th</STD>
</DATA>
<DATA>
<NAME type="char">Palak Bisht</NAME>
<STD>11th</STD>
</DATA>
</GROUP>
</SCHOOL>
I need to fetch the values of NAME, STD.
I tried doing this:
e = ET.ElementTree(ET.fromstring(getunitinfo_str))
for elt in e.iter():
    print("{} {}".format(elt.tag, elt.text))
But this was covering other values as well:
Output:
SCHOOL
GROUP
DATA
NAME Sahil Jha
STD 11th
DATA
NAME Rashmi Kaur
STD 11th
DATA
NAME Palak Bisht
STD 11th
{}
Expected O/p:
{'Sahil Jha':'11th', 'Rashmi Kaur':'11th', 'Palak Bisht':'11th'}
But the formatting should be of the type NAME:STD. Where am I going wrong?

As mentioned by @furas, you can use XPath to find all DATA elements and then find
the NAME and STD elements inside each one:
import xml.etree.ElementTree as ET
xml = '''<SCHOOL>
<GROUP name="GetStudInfo">
<DATA>
<NAME type="char">Sahil Jha</NAME>
<STD>11th</STD>
</DATA>
<DATA>
<NAME type="char">Rashmi Kaur</NAME>
<STD>11th</STD>
</DATA>
<DATA>
<NAME type="char">Palak Bisht</NAME>
<STD>11th</STD>
</DATA>
</GROUP>
</SCHOOL>'''
e = ET.fromstring(xml)
# DATA is nested under GROUP, so search all descendants with './/DATA'
for data_tag in e.findall('.//DATA'):
    name = data_tag.find('NAME')
    std = data_tag.find('STD')
    print("{} {}".format(name.text, std.text))
Or you can use a dict comprehension to get the dictionary you want:
my_dict = {
    data_tag.find('NAME').text: data_tag.find('STD').text
    for data_tag in e.findall('.//DATA')
}
print(my_dict)

You need something more than just print(): you need an if/else that checks elt.tag so that you only get NAME and STD.
Because NAME and STD are different tags, you have to remember NAME in a variable and use it when you reach STD:
name = None # default value at start
for elt in e.iter():
    if elt.tag == 'NAME':
        name = elt # remember element
    if elt.tag == 'STD':
        print("{}:{}".format(name.text, elt.text))
Or you could use XPath, as in @qouify's answer.
Minimal working code
getunitinfo_str = '''
<SCHOOL>
<GROUP name="GetStudInfo">
<DATA>
<NAME type="char">Sahil Jha</NAME>
<STD>11th</STD>
</DATA>
<DATA>
<NAME type="char">Rashmi Kaur</NAME>
<STD>11th</STD>
</DATA>
<DATA>
<NAME type="char">Palak Bisht</NAME>
<STD>11th</STD>
</DATA>
</GROUP>
</SCHOOL>
'''
import xml.etree.ElementTree as ET
e = ET.ElementTree(ET.fromstring(getunitinfo_str))
name = None # to remember element
for elt in e.iter():
    if elt.tag == 'NAME':
        name = elt
    if elt.tag == 'STD':
        print("{}:{}".format(name.text, elt.text))

One liner below
import xml.etree.ElementTree as ET
xml = '''<SCHOOL>
<GROUP name="GetStudInfo">
<DATA>
<NAME type="char">Sahil Jha</NAME>
<STD>11th</STD>
</DATA>
<DATA>
<NAME type="char">Rashmi Kaur</NAME>
<STD>116th</STD>
</DATA>
<DATA>
<NAME type="char">Palak Bisht</NAME>
<STD>17th</STD>
</DATA>
</GROUP>
</SCHOOL>'''
root = ET.fromstring(xml)
data = {x.find("NAME").text: x.find("STD").text for x in root.findall('.//DATA')}
print(data)
output
{'Sahil Jha': '11th', 'Rashmi Kaur': '116th', 'Palak Bisht': '17th'}
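
The comprehensions above raise AttributeError if a DATA element lacks NAME or STD, because find returns None. A minimal sketch of a more tolerant variant uses Element.findtext, which accepts a default. The sample data (with one STD removed) and the placeholder string "unknown" are assumptions for illustration:

```python
import xml.etree.ElementTree as ET

# Sample data with the second STD deliberately missing.
xml = """<SCHOOL>
<GROUP name="GetStudInfo">
<DATA>
<NAME type="char">Sahil Jha</NAME>
<STD>11th</STD>
</DATA>
<DATA>
<NAME type="char">Rashmi Kaur</NAME>
</DATA>
</GROUP>
</SCHOOL>"""

root = ET.fromstring(xml)

# findtext returns the element's text, or the default when the element is absent.
data = {
    d.findtext("NAME"): d.findtext("STD", default="unknown")
    for d in root.findall(".//DATA")
    if d.find("NAME") is not None  # skip records without a NAME
}
print(data)
```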

Parsing name/value pairs from XML

I am trying to pull account details from XML files supplied by vendors.
I have one vendor that supplied XML files like:
<Accounts>
<Account>
<AccountNumber>1234567</AccountNumber>
<Balance>$200.00</Balance>
</Account>
<Account>
...
</Account>
</Accounts>
And I can parse this fairly easily using python:
import xml.etree.ElementTree as et
mytree = et.parse(xml_path)
myroot = mytree.getroot()
for acc in myroot.findall('Account'):
    acctnum = acc.find('AccountNumber').text
    balance = acc.find('Balance').text
    print(acctnum, balance)
Which outputs like this:
1234567 $200.00
However another vendor supplies the XML files in something more like name/value pairs, and I am unsure how to easily access that data. It doesn't work the same way as above:
<Accounts>
<Account>
<field name='AccountNumber' value='1234567' />
<field name='Balance' value='$200.00' />
</Account>
<Account>
...
</Account>
</Accounts>
So far I've got this, but would like to be able to access the values separately and easily:
mytree = et.parse(xml_path)
myroot = mytree.getroot()
for field in myroot.findall('Account'):
    for line in field:
        print(line.attrib)
Which outputs something like:
{'name': 'AccountNumber', 'value': '1234567'}
{'name': 'Balance', 'value': '$200.00'}
So my question is this - How can I access the values and assign them to variables (based on the name) so that I can make use of them elsewhere in the script, like I have with acctnum and balance in the first example?

Populate a new data structure (like a dict) from the fields as you iterate, instead of just printing them:
account_d = {}
for field in myroot.findall('Account'):
    for line in field:
        account_d[line.attrib['name']] = line.attrib['value']
# account_d should now be:
# { 'AccountNumber': '1234567', 'Balance': '$200.00' }
You can use a list of tuples too:
account_a = []
for field in myroot.findall('Account'):
    for line in field:
        # append takes one argument, so pass the pair as a tuple
        account_a.append((line.attrib['name'], line.attrib['value']))
# account_a should now be:
# [('AccountNumber', '1234567'), ('Balance', '$200.00')]
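
Note that with more than one Account element, a single shared dict is overwritten on each pass, so only the last account survives. A minimal sketch that keeps one dict per account follows; the second account's values are made up for illustration:

```python
import xml.etree.ElementTree as et

# Sample input; the second account is invented for this sketch.
data = """<Accounts>
<Account>
<field name='AccountNumber' value='1234567' />
<field name='Balance' value='$200.00' />
</Account>
<Account>
<field name='AccountNumber' value='7654321' />
<field name='Balance' value='$50.00' />
</Account>
</Accounts>"""

myroot = et.fromstring(data)

# Build a fresh dict for every <Account> and collect them in a list.
accounts = []
for account in myroot.findall("Account"):
    accounts.append(
        {field.get("name"): field.get("value") for field in account.findall("field")}
    )

for acct in accounts:
    print(acct["AccountNumber"], acct["Balance"])
```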

ElementTree 1.3 has the ability to locate nodes with particular attributes:
from xml.etree import ElementTree as et
data = '''\
<Accounts>
<Account>
<field name='AccountNumber' value='1234567' />
<field name='Balance' value='$200.00' />
</Account>
<Account>
<field name='AccountNumber' value='9999999' />
<field name='Balance' value='$300.00' />
</Account>
</Accounts>'''
tree = et.fromstring(data)
for acc in tree.iterfind('Account'):
    acctnum = acc.find("field[@name='AccountNumber']").attrib['value']
    balance = acc.find("field[@name='Balance']").attrib['value']
    print(acctnum,balance)
1234567 $200.00
9999999 $300.00

You can do it by collecting all the Account element's field attributes into a dictionary and then using the information in it as needed:
accounts.xml sample input file:
<?xml version="1.0"?>
<Accounts>
<Account>
<field name='AccountNumber' value='1234567' />
<field name='Balance' value='$200.00' />
</Account>
<Account>
<field name='AccountNumber' value='8901234' />
<field name='Balance' value='$100.00' />
</Account>
</Accounts>
Code:
import xml.etree.ElementTree as et
xml_path = 'accounts.xml'
mytree = et.parse(xml_path)
myroot = mytree.getroot()
for acct in myroot.findall('Account'):
    info = {field.attrib['name']: field.attrib['value']
            for field in acct.findall('field')}
    acctnum, balance = info['AccountNumber'], info['Balance']
    print(acctnum, balance)
Result:
1234567 $200.00
8901234 $100.00

Question: How can I access the values and assign them to variables (based on the name)
Convert all Accounts to a Dict[AccountNumber] of Dict[field].
The Attribute name becomes the dict Key:
Accounts = {}
for account in root.findall('Account'):
    fields = {}
    for field in account.findall('field'):
        fields[field.attrib['name']] = field.attrib['value']
    print('{a[AccountNumber]} {a[Balance]}'.format(a=fields))
    Accounts[fields['AccountNumber']] = fields
print(Accounts)
Output:
1234567 $200.00
9999999 $300.00
{'9999999': {'AccountNumber': '9999999', 'Balance': '$300.00'}, '1234567': {'AccountNumber': '1234567', 'Balance': '$200.00'}}
Tested with Python: 3.4.2

Store XML values as Python list

I have XML stored as a string "vincontents", formatted as such:
<response>
<data>
<vin>1FT7X2B69CEC76666</vin>
</data>
<data>
<vin>1GNDT13S452225555</vin>
</data>
</response>
I'm trying to use Python's elementtree library to parse out the VIN values into an array or Python list. I'm only interested in the values, not the tags.
def parseVins():
    content = etree.fromstring(vincontents)
    vins = content.findall("data/vin")
    print(vins)
Outputs all of the tag information:
[<Element 'vin' at 0x2d2eef0>, <Element 'vin' at 0x2d2efd0> ....
Any help would be appreciated. Thank you!

Use .text property:
>>> import xml.etree.ElementTree as etree
>>> data = """<response>
... <data>
... <vin>1FT7X2B69CEC76666</vin>
... </data>
... <data>
... <vin>1GNDT13S452225555</vin>
... </data>
... </response>"""
>>> tree = etree.fromstring(data)
>>> [el.text for el in tree.findall('.//data/vin')]
['1FT7X2B69CEC76666', '1GNDT13S452225555']
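
findall returns Element objects, which is why the original code printed tag information; the text lives in each element's .text attribute. As a small alternative sketch, Element.iter walks all descendants with a given tag, so the path to vin does not need to be spelled out:

```python
import xml.etree.ElementTree as etree

data = """<response>
<data>
<vin>1FT7X2B69CEC76666</vin>
</data>
<data>
<vin>1GNDT13S452225555</vin>
</data>
</response>"""

tree = etree.fromstring(data)

# iter("vin") yields every <vin> element at any depth below the root.
vins = [el.text for el in tree.iter("vin")]
print(vins)
```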

Proper way to convert xml to dictionary

I'm not sure whether this is the best way to convert this XML result into a dictionary. Is there a more proper way to convert it to a dict?
xml from http request result:
<Values version="2.0">
<value name="configuration">test</value>
<array name="configurationList" type="value" depth="1">
<value>test</value>
</array>
<value name="comment">Upload this for our robot.</value>
<array name="propertiesTable" type="record" depth="1">
<record javaclass="com.wm.util.Values">
<value name="name">date_to_go</value>
<value name="value">1990</value>
</record>
<record javaclass="com.wm.util.Values">
<value name="name">role</value>
<value name="value">Survivor</value>
</record>
<record javaclass="com.wm.util.Values">
<value name="name">status</value>
<value name="value">living</value>
</record>
<record javaclass="com.wm.util.Values">
<value name="name">user</value>
<value name="value">John Connor</value>
</record>
</array>
<null name="propertiesList"/>
</Values>
Code to convert the xml to dictionary ( which is working properly )
from xml.etree import ElementTree
tree = ElementTree.fromstring(xml)
mom = []
mim = []
configuration = tree.find('value[@name="configuration"]').text
comment = tree.find('value[@name="comment"]').text
prop = (configuration, comment)
mom.append(prop)
for records in tree.findall('./array/record'):
    me = []
    for child in records.iter('value'):
        me.append(child.text)
    mim.append(me)
for key, value in mim:
    mi_dict = dict()
    mi_dict[key] = value
    mom.append(mi_dict)
print(mom)
The result ( working as intended ):
[('test', 'Upload this for our robot.'), {'date_to_go': '1990'}, {'role': 'Survivor'}, {'status': 'living'}, {'user': 'John Connor'}]
EDIT:
Sorry if I wasn't clear: the code described is working as expected, but I'm not sure if this is the proper way (Pythonic or clean) to do it.
Thanks in advance.

I don't think it's too bad. You can make some minor changes to be a bit more Pythonic:
from xml.etree import ElementTree
tree = ElementTree.fromstring(xml)
mom = []
mim = []
configuration = tree.find('value[@name="configuration"]').text
comment = tree.find('value[@name="comment"]').text
prop = (configuration, comment)
mom.append(prop)
for records in tree.findall('./array/record'):
    mim.append([child.text for child in records.iter('value')])
# mim is a list of [name, value] pairs, so unpack each pair directly
mom += [{k: v} for k, v in mim]
print(mom)
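
If a single dictionary is acceptable instead of a list mixing a tuple with one-entry dicts, the conversion can be more direct. The sketch below is one possible shape, not the only proper one: the top-level keys "configuration", "comment" and "properties" are assumptions chosen for illustration, and the sample XML is shortened.

```python
from xml.etree import ElementTree

# Shortened version of the response, assumed to have the same structure.
xml = """<Values version="2.0">
<value name="configuration">test</value>
<value name="comment">Upload this for our robot.</value>
<array name="propertiesTable" type="record" depth="1">
<record javaclass="com.wm.util.Values">
<value name="name">date_to_go</value>
<value name="value">1990</value>
</record>
<record javaclass="com.wm.util.Values">
<value name="name">role</value>
<value name="value">Survivor</value>
</record>
</array>
</Values>"""

tree = ElementTree.fromstring(xml)

result = {
    "configuration": tree.findtext('value[@name="configuration"]'),
    "comment": tree.findtext('value[@name="comment"]'),
    # Each record holds a name/value pair; select both by attribute
    # instead of relying on their order inside the record.
    "properties": {
        record.findtext('value[@name="name"]'): record.findtext('value[@name="value"]')
        for record in tree.findall("./array/record")
    },
}
print(result)
```

Selecting by attribute rather than position means the result stays correct even if a record lists value before name.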
