I'm accessing a third-party API that returns a dictionary that contains both simple values and nested (embedded?) dictionaries. I need to convert this to a CSV file, but I need help extracting and exporting specific values from the nested dictionaries.
Here's a simplified example of what I'm getting back:
accounts = [
    {
        'Id': '0131232',
        'AccountName': 'CompanyX',
        'Active': False,
        'LastModifiedBy': {'type': 'User', 'Id': '', 'Name': 'Joe Smith'}
    },
    {
        'Id': '987654',
        'AccountName': 'CompanyY',
        'Active': True,
        'LastModifiedBy': {'type': 'User', 'Id': '', 'Name': 'Mary Johnson'}
    }
]
I'm trying to export this to a CSV file with the following code:
import csv

with open('output.csv', 'w', newline='') as f:
    dwriter = csv.DictWriter(f, accounts[0].keys())
    dwriter.writeheader()
    dwriter.writerows(accounts)
What I want in the CSV file is the following:
Id,AccountName,Active,LastModifiedBy
0131232,CompanyX,False,Joe Smith
987654,CompanyY,True,Mary Johnson
What I'm getting with my code above is the following:
Id,AccountName,Active,LastModifiedBy
0131232,CompanyX,False,"{'type': 'User', 'Id': '', 'Name': 'Joe Smith'}"
987654,CompanyY,True,"{'type': 'User', 'Id': '', 'Name': 'Mary Johnson'}"
Obviously I need to extract the key-value pair I want from the nested dictionary and assign that value to the higher-level dictionary. My question is how do I do that while still handling the simple values as is?
It seems like this could be done with dictionary comprehension, but I'm not sure I can do a conditional with that.
Alternatively I could go through each record, check each value to see if it's a dictionary, and write out the values I want to a new dictionary, but that feels a little too heavy.
Full disclosure: I'm new to Python, so apologies if I'm missing something obvious.
Thanks!
- Chris
If you don't need accounts for anything else you can do:
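for account in accounts:
    account['LastModifiedBy'] = account['LastModifiedBy']['Name']
Otherwise, copy it first (for example with copy.deepcopy(accounts)) and do the same. A plain accounts.copy() is not enough, because the copied list still shares the inner dictionaries with the original.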
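A dictionary comprehension can carry a conditional, so the same result is possible without modifying accounts. The following is only a sketch: the helper name flatten and its subkey parameter are made up for illustration, and io.StringIO stands in for the real output file.

```python
import csv
import io

accounts = [
    {'Id': '0131232', 'AccountName': 'CompanyX', 'Active': False,
     'LastModifiedBy': {'type': 'User', 'Id': '', 'Name': 'Joe Smith'}},
    {'Id': '987654', 'AccountName': 'CompanyY', 'Active': True,
     'LastModifiedBy': {'type': 'User', 'Id': '', 'Name': 'Mary Johnson'}},
]

def flatten(record, subkey='Name'):
    """Return a new dict; nested dict values are replaced by their `subkey` entry."""
    return {key: value[subkey] if isinstance(value, dict) else value
            for key, value in record.items()}

# Build new rows; the original accounts list is left untouched.
rows = [flatten(account) for account in accounts]

buffer = io.StringIO()  # stands in for open('output.csv', 'w', newline='')
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

This assumes every nested dictionary has a 'Name' key; a record without one would raise KeyError.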
I have a list of dictionaries of the form:
mylist = [{'name': 'Johnny', 'surname': 'Cashew'}, {'name': 'Abraham', 'surname': 'Linfield'}]
and I am trying to dump that to a json this way:
with open('myfile.json', 'w') as f:
    json.dump(mylist, f)
but then the entire list is on one line in the json file, making it hardly readable (my list is in reality very long). Is there a way to dump each element of my list on a new line? I have seen this post that suggests using indent in this way:
with open('myfile.json', 'w') as f:
    json.dump(mylist, f, indent=2)
but then I get each element within the dictionaries on a new line, like that:
[
  {
    'name': 'Johnny',
    'surname': 'Cashew'
  },
  {
    'name': 'Abraham',
    'surname': 'Linfield'
  }
]
whereas what I am hoping to obtain is something like that:
[
{'name': 'Johnny', 'surname': 'Cashew'},
{'name': 'Abraham', 'surname': 'Linfield'}
]
Would someone have a hint? Many thanks!
This is a dirty way of doing it but it works for me
import json
my_list = [{'name': 'John', 'surname': 'Doe'}, {'name': 'Jane', 'surname': 'Doe'}]
with open('names.json', 'w') as f:
    f.write('[\n')
    for i, d in enumerate(my_list):
        # dumps() instead of dump() so that we can write it like a normal str
        f.write(json.dumps(d))
        # check the position, not the value, so a duplicate of the last dict does not end the list early
        f.write(",\n" if i < len(my_list) - 1 else "\n")
    f.write(']')
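A slightly tidier variant of the same idea, as a sketch: serialize each element with json.dumps() and join the pieces with ",\n", which avoids having to detect the last element. The function name dumps_one_per_line is made up for illustration.

```python
import json

my_list = [{'name': 'John', 'surname': 'Doe'}, {'name': 'Jane', 'surname': 'Doe'}]

def dumps_one_per_line(items):
    """Serialize a list as JSON with one top-level element per line."""
    body = ",\n".join(json.dumps(item) for item in items)
    return "[\n" + body + "\n]"

text = dumps_one_per_line(my_list)
print(text)

# The result is still valid JSON and round-trips to the original list.
assert json.loads(text) == my_list
```

Writing it to disk is then a single f.write(text) inside the usual with open(...) block.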
Here is the text file1 content
name = test1,description=None,releaseDate="2020-02-27"
name = test2,description=None,releaseDate="2020-02-28"
name = test3,description=None,releaseDate="2020-02-29"
I want a nested dictionary like this. How to create this?
{ 'test1': {'description':'None','releaseDate':'2020-02-27'},
'test2': {'description':'None','releaseDate':'2020-02-28'},
'test3': {'description':'None','releaseDate':'2020-02-29'}}
After this, I want to pass these values to the following call in a "for" loop over a list of projects.
Example: for project="IJTP2", I want to go through each name in the dictionary like below
project.create(name="test1", project="IJTP2", description=None, releaseDate="2020-02-27")
project.create(name="test2", project="IJTP2", description=None, releaseDate="2020-02-28")
project.create(name="test3", project="IJTP2", description=None, releaseDate="2020-02-29")
Now to the next project:
List of projects is stored in another file as below
IJTP1
IJTP2
IJTP3
IJTP4
I just started working on Python and have never worked on the nested dictionaries.
I assume that:
each line of the file has comma-separated columns
each column has exactly one =, with the key on its left and the value on its right
only the first column is special (name)
Of course, as @Alex Hall mentioned, I recommend JSON or CSV, too.
Anyway, I wrote code for your case.
d = {}
with open('test-200229.txt') as f:
    for line in f:
        (_, name), *rest = (
            tuple(value.strip() for value in column.split('='))
            for column in line.split(',')
        )
        d[name] = dict(rest)
print(d)
output:
{'test1': {'description': 'None', 'releaseDate': '"2020-02-27"'}, 'test2': {'description': 'None', 'releaseDate': '"2020-02-28"'}, 'test3': {'description': 'None', 'releaseDate': '"2020-02-29"'}}
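Note that the dates in the output above still carry their double quotes. A sketch of one way to strip them and then loop over the projects follows. The lines and projects lists stand in for the two files, and create_version is a hypothetical stand-in for the real project.create(...) call, which is not shown in the question.

```python
lines = [
    'name = test1,description=None,releaseDate="2020-02-27"',
    'name = test2,description=None,releaseDate="2020-02-28"',
    'name = test3,description=None,releaseDate="2020-02-29"',
]
projects = ['IJTP1', 'IJTP2']  # would be read from the projects file

def parse_line(line):
    """Split one 'key=value,key=value' line into (name, remaining fields)."""
    pairs = [column.split('=', 1) for column in line.strip().split(',')]
    # strip('"') removes the surrounding double quotes from values such as the date
    fields = {key.strip(): value.strip().strip('"') for key, value in pairs}
    name = fields.pop('name')
    return name, fields

versions = dict(parse_line(line) for line in lines)

def create_version(**kwargs):
    # Hypothetical placeholder for project.create(...); it only echoes its arguments.
    return kwargs

calls = [create_version(name=name, project=project, **fields)
         for project in projects
         for name, fields in versions.items()]
print(versions)
```

The description stays the string 'None' here, as in the desired dictionary; convert it to a real None before the call if the API expects that.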
I have a list of dictionaries which I build from .xml file:
list_1=[{'lat': '00.6849879', 'phone': '+3002201600', 'amenity': 'restaurant', 'lon': '00.2855850', 'name': 'Telegraf'},{'lat': '00.6850230', 'addr:housenumber': '6', 'lon': '00.2844493', 'addr:city': 'XXX', 'addr:street': 'YYY.'},{'lat': '00.6860304', 'crossing': 'traffic_signals', 'lon': '00.2861978', 'highway': 'crossing'}]
My aim is to build a text file with values (not keys) in such order:
lat,lon,'addr:street','addr:housenumber','addr:city','amenity','crossing' etc...
00.6849879,00.2855850, , , ,restaurant, ,'\n'00.6850230,00.2844493,YYY,6,XXX, , ,'\n'00.6860304,00.2861978, , , , ,traffic_signals,'\n'
If a value does not exist, there should be an empty space in its place.
I tried to loop with for loop:
for i in list_1:
    line= i['lat'],i['lon']
    print line
A problem occurs (KeyError) if I add a key that does not exist in some of the dictionaries:
for i in list_1:
    line= i['lat'],i['lon'],i['phone']
    print line
I also tried to loop and use the map() function, but the results do not seem correct:
for i in list_1:
    line=map(lambda x1,x2:x1+','+x2+'\n',i['lat'],i['lon'])
    print line
Also tried:
for i in list_1:
    for k,v in i.items():
        if k=='addr:housenumber':
            print v
With this approach I think there would be too many if/else conditions to write.
It seems like the solution is somewhere close, but I can't figure out what it is or the best way to do it.
I would look to use the csv module, in particular DictWriter. The fieldnames dictate the order in which the dictionary information is written out. Actually writing the header is optional:
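import csv
fields = ['lat','lon','addr:street','addr:housenumber','addr:city','amenity','crossing',...]
with open('<file>', 'w', newline='') as f:
    # extrasaction='ignore' skips keys that are not in fields (e.g. 'phone'); the default raises ValueError
    writer = csv.DictWriter(f, fields, extrasaction='ignore')
    #writer.writeheader() # If you want a header
    writer.writerows(list_1)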
If you really don't want to use the csv module then you can simply iterate over the list of the fields you want in the order you want them:
fields = ['lat','lon','addr:street','addr:housenumber','addr:city','amenity','crossing',...]
for row in list_1:
    print(','.join(row.get(field, '') for field in fields))
If you can't or don't want to use csv you can do something like
order = ['lat','lon','addr:street','addr:housenumber',
         'addr:city','amenity','crossing']
for entry in list_1:
    f.write(", ".join([entry.get(x, "") for x in order]) + "\n")
This will create a list with the values from the entry map in the order present in the order list, and default to "" if the value is not present in the map.
If your output is a csv file, I strongly recommend using the csv module because it will also escape values correctly and other csv file specific things that we don't think about right now.
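As a concrete sketch of the csv route with the sample data: DictWriter raises ValueError by default when a row has keys that are not in fieldnames, so extrasaction='ignore' is passed here, and restval='' fills in missing values. io.StringIO stands in for the output file.

```python
import csv
import io

list_1 = [
    {'lat': '00.6849879', 'phone': '+3002201600', 'amenity': 'restaurant',
     'lon': '00.2855850', 'name': 'Telegraf'},
    {'lat': '00.6850230', 'addr:housenumber': '6', 'lon': '00.2844493',
     'addr:city': 'XXX', 'addr:street': 'YYY.'},
    {'lat': '00.6860304', 'crossing': 'traffic_signals', 'lon': '00.2861978',
     'highway': 'crossing'},
]
fields = ['lat', 'lon', 'addr:street', 'addr:housenumber',
          'addr:city', 'amenity', 'crossing']

buffer = io.StringIO()  # stands in for open('<file>', 'w', newline='')
writer = csv.DictWriter(buffer, fieldnames=fields,
                        restval='',             # value written for missing keys
                        extrasaction='ignore')  # silently drop keys not in fields
writer.writerows(list_1)

output_lines = buffer.getvalue().splitlines()
for line in output_lines:
    print(line)
```

Missing values come out as empty fields between the commas rather than as a space; pass restval=' ' if a literal space is wanted.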
Thanks guys
I found the solution. Maybe it is not so elegant but it works.
I made a list of the node keys I want, looked each of them up in every dictionary of the other list, and collected the values.
key_list=['lat','lon','addr:street','addr:housenumber','amenity','source','name','operator']
node_list=[{'lat': '00.6849879', 'phone': '+3002201600', 'amenity': 'restaurant', 'lon': '00.2855850', 'name': 'Telegraf'},{'lat': '00.6850230', 'addr:housenumber': '6', 'lon': '00.2844493', 'addr:city': 'XXX', 'addr:street': 'YYY.'},{'lat': '00.6860304', 'crossing': 'traffic_signals', 'lon': '00.2861978', 'highway': 'crossing'}]
Solution:
final_list=[]
for i in node_list:  # renamed from list so the built-in is not shadowed
    line=str()
    for ii in key_list:
        if ii in i:
            line=line+str(i[ii])+','
        else:
            line=line+' '+','
    final_list.append(line)
I have a list of dictionaries such as:
values = [{'Name': 'John Doe', 'Age': 26, 'ID': '1279abc'},
{'Name': 'Jane Smith', 'Age': 35, 'ID': 'bca9721'}
]
What I'd like to do is print this list of dictionaries to a tab delimited text file to look something like this:
Name Age ID
John Doe 26 1279abc
Jane Smith 35 bca9721
However, I am unable to wrap my head around simply printing the values, as of right now I'm printing the entire dictionary per row via:
for i in values:
    f.write(str(i))
    f.write("\n")
Perhaps I need to iterate through each dictionary now? I've seen people use something like:
for i, n in iterable:
    pass
But I've never understood this. Anyone able to shed some light into this?
EDIT:
It appears that I could use something like this, unless someone has a more pythonic way (perhaps someone can explain "for i, n in iterable"?):
for dic in values:
    for entry in dic:
        f.write(dic[entry])
This is simple enough to accomplish with a DictWriter. Its purpose is to write comma-separated data, but if we specify our delimiter to be a tab instead of a comma, we can make this work just fine.
from csv import DictWriter
values = [{'Name': 'John Doe', 'Age': 26, 'ID': '1279abc'},
          {'Name': 'Jane Smith', 'Age': 35, 'ID': 'bca9721'}]
keys = values[0].keys()
with open("new-file.tsv", "w", newline="") as f:
    dict_writer = DictWriter(f, keys, delimiter="\t")
    dict_writer.writeheader()
    for value in values:
        dict_writer.writerow(value)
f.write('Name\tAge\tID\n')
for value in values:
    # end each row with a newline so the records do not run together
    f.write('\t'.join([value.get('Name'), str(value.get('Age')), value.get('ID')]) + '\n')
You're probably thinking of the items() method. This will return the key and value for each entry in the dictionary. http://www.tutorialspoint.com/python/dictionary_items.htm
for dic in values:  # values is a list, so call items() on each dictionary in it
    for k,v in dic.items():
        pass
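To shed some light on the "for i, n in iterable" form: it is tuple unpacking. Each element produced by the iterable is itself a pair, and the loop assigns its two parts to the two names. A small sketch with items() and enumerate(), two common sources of such pairs:

```python
values = [{'Name': 'John Doe', 'Age': 26, 'ID': '1279abc'},
          {'Name': 'Jane Smith', 'Age': 35, 'ID': 'bca9721'}]

# dict.items() yields (key, value) pairs; the loop unpacks each pair into two names.
person = values[0]
pairs = []
for key, value in person.items():
    pairs.append((key, value))

# enumerate() yields (index, element) pairs and is unpacked the same way.
numbered = []
for index, row in enumerate(values):
    numbered.append((index, row['Name']))

print(pairs)
print(numbered)
```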
# assuming your dictionary is in values
import csv
with open('out.txt', 'w', newline='') as fout:
    writer = csv.DictWriter(fout, fieldnames=values.keys(), delimiter="\t")
    writer.writeheader()
    writer.writerow(values)  # DictWriter.writerow() takes the dictionary itself
I need to extract information from an xml file, isolate it from the xml tags before and after, store the information in a dictionary, then loop through the dictionary to print a list. I am an absolute beginner so I'd like to keep it as simple as possible and I apologize if how I've described what I'd like to do doesn't make much sense.
Here is what I have so far.
for line in open("/people.xml"):
    if "name" in line:
        print(line)
    if "age" in line:
        print(line)
Current Output:
<name>John</name>
<age>14</age>
<name>Kevin</name>
<age>10</age>
<name>Billy</name>
<age>12</age>
Desired Output
Name Age
John 14
Kevin 10
Billy 12
edit- So using the code below I can get the output:
{'Billy': '12', 'John': '14', 'Kevin': '10'}
Does anyone know how to get from this to a chart with headers like my desired output?
try xmldict (Convert xml to python dictionaries, and vice-versa.):
>>> xmldict.xml_to_dict('''
... <root>
... <persons>
... <person>
... <name first="foo" last="bar" />
... </person>
... <person>
... <name first="baz" last="bar" />
... </person>
... </persons>
... </root>
... ''')
{'root': {'persons': {'person': [{'name': {'last': 'bar', 'first': 'foo'}}, {'name': {'last': 'bar', 'first': 'baz'}}]}}}
# Converting dictionary to xml
>>> xmldict.dict_to_xml({'root': {'persons': {'person': [{'name': {'last': 'bar', 'first': 'foo'}}, {'name': {'last': 'bar', 'first': 'baz'}}]}}})
'<root><persons><person><name><last>bar</last><first>foo</first></name></person><person><name><last>bar</last><first>baz</first></name></person></persons></root>'
or try xmlmapper (list of python dictionary with parent-child relationship):
>>> myxml='''<?xml version='1.0' encoding='us-ascii'?>
<slideshow title="Sample Slide Show" date="2012-12-31" author="Yours Truly" >
<slide type="all">
<title>Overview</title>
<item>Why
<em>WonderWidgets</em>
are great
</item>
<item/>
<item>Who
<em>buys</em>
WonderWidgets1
</item>
</slide>
</slideshow>'''
>>> x=xml_to_dict(myxml)
>>> for s in x:
...     print s
>>>
{'text': '', 'tail': None, 'tag': 'slideshow', 'xmlinfo': {'ownid': 1, 'parentid': 0}, 'xmlattb': {'date': '2012-12-31', 'author': 'Yours Truly', 'title': 'Sample Slide Show'}}
{'text': '', 'tail': '', 'tag': 'slide', 'xmlinfo': {'ownid': 2, 'parentid': 1}, 'xmlattb': {'type': 'all'}}
{'text': 'Overview', 'tail': '', 'tag': 'title', 'xmlinfo': {'ownid': 3, 'parentid': 2}, 'xmlattb': {}}
{'text': 'Why', 'tail': '', 'tag': 'item', 'xmlinfo': {'ownid': 4, 'parentid': 2}, 'xmlattb': {}}
{'text': 'WonderWidgets', 'tail': 'are great', 'tag': 'em', 'xmlinfo': {'ownid': 5, 'parentid': 4}, 'xmlattb': {}}
{'text': None, 'tail': '', 'tag': 'item', 'xmlinfo': {'ownid': 6, 'parentid': 2}, 'xmlattb': {}}
{'text': 'Who', 'tail': '', 'tag': 'item', 'xmlinfo': {'ownid': 7, 'parentid': 2}, 'xmlattb': {}}
{'text': 'buys', 'tail': 'WonderWidgets1', 'tag': 'em', 'xmlinfo': {'ownid': 8, 'parentid': 7}, 'xmlattb': {}}
The code above returns a generator. When you iterate over it, each item is a dict with the keys tag, text, xmlattb and tail, plus additional information in xmlinfo. The root element has a parentid of 0.
Use an XML parser for this. For example,
import xml.etree.ElementTree as ET
doc = ET.parse('people.xml')
names = [name.text for name in doc.findall('.//name')]
ages = [age.text for age in doc.findall('.//age')]
people = dict(zip(names,ages))
print(people)
# {'Billy': '12', 'John': '14', 'Kevin': '10'}
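To get from there to a chart with headers, one option is to keep the people as a list of (name, age) pairs and pad the first column with str.format. This is only a sketch: the layout of people.xml is not shown in the question, so the <people>/<person> structure below is an assumption, and ET.fromstring is used on an inline string instead of reading the file.

```python
import xml.etree.ElementTree as ET

# Assumed file layout; only the <name> and <age> lines appear in the question.
xml_text = """<people>
  <person><name>John</name><age>14</age></person>
  <person><name>Kevin</name><age>10</age></person>
  <person><name>Billy</name><age>12</age></person>
</people>"""

root = ET.fromstring(xml_text)

# A list of tuples keeps the file order and tolerates duplicate names.
people = [(person.findtext('name'), person.findtext('age'))
          for person in root.iter('person')]

# '{:<8}' left-aligns the name in a column 8 characters wide.
table = ['{:<8}{}'.format('Name', 'Age')]
table += ['{:<8}{}'.format(name, age) for name, age in people]
print('\n'.join(table))
```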
It seems to me that this is an exercise in learning how to parse this XML manually rather than simply pulling a library out of the bag to do it for you. If I am wrong, I suggest watching the udacity video by Steve Huffman that can be found here: http://www.udacity.com/view#Course/cs253/CourseRev/apr2012/Unit/362001/Nugget/365002. He explains how to use the minidom module to parse lightweight xml files such as these.
Now, the first point I want to make in my answer is that you don't want to create a python dictionary to print all of these values. A python dictionary is simply a set of keys that correspond to values. Before Python 3.7 there was no guaranteed ordering to them, so traversal in the order they appeared in the file was a pain in the butt, and in any version two people with the same name would overwrite each other. You are trying to print out all of the names together with their corresponding ages, so a data structure like a list of tuples would probably be better suited to collating your data.
It seems like the structure of your xml file is such that each name tag is succeeded by an age tag that corresponds to it. There also seems to only be a single name tag per line. This makes matters fairly simple. I'm not going to write the most efficient or universal solution to this problem, but instead I will try to make the code as simple to understand as I can.
So let's first create a list to store the data:
a_list = []
Now open your file, and initialize a couple of variables to hold each name and age:
with open("/people.xml") as f:
    name, age = None, None  # initialize a name and an age variable to be used during traversal.
    for line in f:
        name = extract_name(line, name)  # This function will be defined later.
        age = extract_age(line)  # So will this one.
        if age:  # once age is defined, we have a complete person to add to our list
            a_list.append((name, age))
            name, age = None, None  # reset both variables; otherwise keep reading lines until age is defined.
Now, for each line in the file, we want to determine whether it contains a user's name. If it does, we want to extract the name. Let's create a function to do this:
def extract_name(a_line, name):  # we pass in the line as well as the name value that we defined before beginning our traversal.
    if name:  # if the name is already set, keep its current value. (we clear it upon encountering the corresponding age.)
        return name
    if "<name>" not in a_line:  # no "<name>" in a_line: return None. otherwise, extract the new name.
        return
    name_pos = a_line.find("<name>") + 6
    end_pos = a_line.find("</name>")
    return a_line[name_pos:end_pos]
Now, we must create a function to parse the line for a user's age. We can do this in a similar way to the previous function, but we know that once we have an age, it will be added into the list immediately. As such, we never need to concern ourselves with age's previous value. The function can therefore look like this:
def extract_age(a_line):
    if "<age>" not in a_line:  # no "<age>" in a_line: return None
        return
    age_pos = a_line.find("<age>") + 5  # else extract age from line and return it.
    end_pos = a_line.find("</age>")
    return a_line[age_pos:end_pos]
Finally, you want to print the list. You might do it as follows:
for item in a_list:
    print('\t'.join(item))
Hope this helped. I haven't tested out my code, so it might still be slightly buggy. The concepts are there, though. :)
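For reference, here is a compact sketch of the same manual approach that runs on a small in-memory sample. It folds the two extract functions into one hypothetical helper, extract_tag, and assumes, as above, that each tag sits on its own line.

```python
def extract_tag(line, tag):
    """Return the text between <tag> and </tag> on this line, or None if absent."""
    open_tag, close_tag = '<%s>' % tag, '</%s>' % tag
    start = line.find(open_tag)
    end = line.find(close_tag)
    if start == -1 or end == -1:
        return None
    return line[start + len(open_tag):end]

def parse_people(lines):
    """Pair each name with the age that follows it, keeping file order."""
    people, name = [], None
    for line in lines:
        name = extract_tag(line, 'name') or name  # remember the latest name seen
        age = extract_tag(line, 'age')
        if age is not None and name is not None:
            people.append((name, age))
            name = None  # reset until the next <name> line
    return people

sample = [
    '<people>',
    '  <name>John</name>', '  <age>14</age>',
    '  <name>Kevin</name>', '  <age>10</age>',
    '  <name>Billy</name>', '  <age>12</age>',
    '</people>',
]

result = parse_people(sample)
print('Name\tAge')
for item in result:
    print('\t'.join(item))
```

With the real file, parse_people(open("/people.xml")) would take the place of the sample list.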
Here's another way using lxml library:
from lxml import objectify
def xml_to_dict(xml_str):
    """ Convert xml to dict, using lxml v3.4.2 xml processing library, see http://lxml.de/ """
    def xml_to_dict_recursion(xml_object):
        dict_object = xml_object.__dict__
        if not dict_object:  # if empty dict returned
            return xml_object
        for key, value in dict_object.items():
            dict_object[key] = xml_to_dict_recursion(value)
        return dict_object
    return xml_to_dict_recursion(objectify.fromstring(xml_str))
xml_string = """<?xml version="1.0" encoding="UTF-8"?><Response><NewOrderResp>
<IndustryType>Test</IndustryType><SomeData><SomeNestedData1>1234</SomeNestedData1>
<SomeNestedData2>3455</SomeNestedData2></SomeData></NewOrderResp></Response>"""
print(xml_to_dict(xml_string))
To preserve the parent node, use this instead:
def xml_to_dict(xml_str):
    """ Convert xml to dict, using lxml v3.4.2 xml processing library, see http://lxml.de/ """
    def xml_to_dict_recursion(xml_object):
        dict_object = xml_object.__dict__
        if not dict_object:  # if empty dict returned
            return xml_object
        for key, value in dict_object.items():
            dict_object[key] = xml_to_dict_recursion(value)
        return dict_object
    xml_obj = objectify.fromstring(xml_str)
    return {xml_obj.tag: xml_to_dict_recursion(xml_obj)}
And if you want to only return a subtree and convert it to dict, you can use Element.find() :
xml_obj.find('.//') # lxml.objectify.ObjectifiedElement instance
See lxml documentation.