Thanks to this other thread, I've successfully written my dictionary to a csv as a beginner using Python:
Writing a dictionary to a csv file with one line for every 'key: value'
dict1 = {0 : 24.7548, 1: 34.2422, 2: 19.3290}
csv looks like this:
0 24.7548
1 34.2422
2 19.3290
Now I'm wondering what the best approach would be to organize several dictionaries with the same keys. I'd like to have the keys as the first column, then the dict values in the columns after that, with a first row that labels each column with its dictionary name.
Sure, there are a lot of threads trying to do similar things, such as: Trouble writing a dictionary to csv with keys as headers and values as columns, but I don't have my data structured in the same way (yet…). Maybe the dictionaries must be merged first.
dict2 = {0 : 13.422, 1 : 9.2308, 2 : 20.132}
dict3 = {0 : 32.2422, 1 : 23.342, 2 : 32.424}
My ideal output:
ID dict1 dict2 dict3
0 24.7548 13.422 32.2422
1 34.2422 9.2308 23.342
2 19.3290 20.132 32.424
I'm not sure, yet, how the column name ID for key names will work its way in there.
Use the csv module and list comprehension:
import csv
dict1 = {0: 33.422, 1: 39.2308, 2: 30.132}
dict2 = {0: 42.2422, 1: 43.342, 2: 42.424}
dict3 = {0: 13.422, 1: 9.2308, 2: 20.132}
dict4 = {0: 32.2422, 1: 23.342, 2: 32.424}
dicts = dict1, dict2, dict3, dict4
with open('my_data.csv', 'wb') as ofile:
    writer = csv.writer(ofile, delimiter='\t')
    writer.writerow(['ID', 'dict1', 'dict2', 'dict3', 'dict4'])
    for key in dict1.iterkeys():
        writer.writerow([key] + [d[key] for d in dicts])
Note that dictionaries are unordered by default, so if you want the keys in ascending order, you have to sort them:
    for key in sorted(dict1.iterkeys(), key=lambda x: int(x)):
        writer.writerow([key] + [d[key] for d in dicts])
If you can't be sure that all dicts have the same keys, a few small changes are needed:
with open('my_data.csv', 'wb') as ofile:
    writer = csv.writer(ofile, delimiter='\t')
    writer.writerow(['ID', 'dict1', 'dict2', 'dict3', 'dict4'])
    # Union of the keys of all dicts (iterating a dict yields its keys)
    keys = set().union(*dicts)
    for key in keys:
        # Missing keys become None, which csv writes as an empty field
        writer.writerow([key] + [d.get(key, None) for d in dicts])
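The snippets in this answer use Python 2 idioms ('wb' mode, iterkeys()). A rough Python 3 sketch of the same idea might look like the following. The function name write_dicts_as_columns, its missing parameter, and the shortened dict3 are assumptions made for illustration, and io.StringIO stands in for a real file so the result can be inspected.

```python
import csv
import io


def write_dicts_as_columns(named_dicts, fileobj, missing=""):
    """Write one row per key and one column per dictionary (tab-separated).

    named_dicts maps a column name to a dictionary (hypothetical helper).
    """
    names = list(named_dicts)
    writer = csv.writer(fileobj, delimiter="\t")
    writer.writerow(["ID"] + names)
    # Union of all keys, sorted so the rows come out in ascending order.
    all_keys = sorted(set().union(*named_dicts.values()))
    for key in all_keys:
        writer.writerow([key] + [named_dicts[n].get(key, missing) for n in names])


dict1 = {0: 24.7548, 1: 34.2422, 2: 19.3290}
dict2 = {0: 13.422, 1: 9.2308, 2: 20.132}
dict3 = {0: 32.2422, 1: 23.342}  # key 2 deliberately left out for the demo

buffer = io.StringIO()  # stand-in for a real file object
write_dicts_as_columns({"dict1": dict1, "dict2": dict2, "dict3": dict3}, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, the buffer could be replaced by open('my_data.csv', 'w', newline=''), which is the file mode the csv module expects in Python 3.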
Use defaultdict(list):
from collections import defaultdict
merged_dict = defaultdict(list)
dict_list = [dict1, dict2, dict3]
for d in dict_list:  # d rather than dict, to avoid shadowing the built-in
    for k, v in d.items():
        merged_dict[k].append(v)
This is what you get:
{0: [24.7548, 13.422, 32.2422], 1: [34.2422, 9.2308, 23.342], 2: [19.329, 20.132, 32.424]}
Then write merged_dict to a csv file as you did previously for a single dict. This time the writerow method of the csv writer will be helpful.
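The writing step is not spelled out above, so here is a minimal sketch of how it could go, assuming Python 3 and assuming every dictionary contains every key (otherwise the values would slide into the wrong columns). The header names and the io.StringIO stand-in for a file are illustrative choices.

```python
import csv
import io
from collections import defaultdict

dict1 = {0: 24.7548, 1: 34.2422, 2: 19.3290}
dict2 = {0: 13.422, 1: 9.2308, 2: 20.132}
dict3 = {0: 32.2422, 1: 23.342, 2: 32.424}

# Merge: each key maps to the list of values from dict1, dict2, dict3.
merged_dict = defaultdict(list)
for d in (dict1, dict2, dict3):
    for k, v in d.items():
        merged_dict[k].append(v)

out = io.StringIO()  # stand-in for open('my_data.csv', 'w', newline='')
writer = csv.writer(out, delimiter="\t")
writer.writerow(["ID", "dict1", "dict2", "dict3"])
for key in sorted(merged_dict):
    # One row per key: the key itself, then its merged values.
    writer.writerow([key] + merged_dict[key])

rows = out.getvalue().splitlines()
print("\n".join(rows))
```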
Here is one way to do it.
from functools import reduce  # reduce is a built-in on Python 2 but lives in functools on Python 3
my_dicts = [dict1, dict2, dict3]
dict_names = range(1, len(my_dicts)+1)
header = "ID," + ",".join(map(lambda x: "dict"+str(x), dict_names)) + "\n"
all_possible_keys = set(reduce(lambda x, y: x + list(y.keys()), my_dicts, []))
with open("file_to_write.csv", "w") as output_file:
    output_file.write(header)
    for k in all_possible_keys:
        print_str = "{},".format(k)
        for d in my_dicts:
            print_str += "{},".format(d.get(k, None))
        print_str += "\n"
        output_file.write(print_str)
It has been some time since I used Python, but here's my suggestion.
In Python, dictionary values can be of any type (as far as I remember, don't flame me if I'm wrong). At least it should be possible to map your keys to lists.
So you can loop over your dictionaries and build a new dictionary 'd': for each key, if the key is already in 'd', append the value to the list stored under that key; otherwise start a new list with that value.
Then you can write out the new dictionary as follows (pseudocode):
for each key, value in dictionary
    write key
    write TAB
    for each v in value
        write v + TAB
    write new line
end for
This doesn't include the header names, but I'm sure they're quite easy to add.
I have a list of dictionaries. Every dictionary has words as keys and the number of times each word appears in a particular document as values. How can I find how many dictionaries a particular word appears in?
Suppose I have the following dictionaries:
dict1 = {'Association':5, 'Rule':2, 'Mining':3}
dict2 = {'Rule':4, 'Mining':1}
dict3 = {'Association':4, 'Mining':3}
Result after counting how many dictionaries a word appears in:
result_dict = {'Association':2, 'Rule':2, 'Mining':3}
Counter is a dict subclass that can be useful here:
from collections import Counter
dicts = [dict1, dict2, dict3]
key_counters = [Counter(dictionary.keys()) for dictionary in dicts]
start_counter = Counter()
result_dict = sum(key_counters, start_counter)
assert result_dict == {'Association': 2, 'Rule': 2, 'Mining': 3}
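As a variation on the same idea, a single Counter can be fed every key of every dictionary in one pass. This is only a sketch; the name document_frequency is chosen here for illustration.

```python
from collections import Counter

dict1 = {'Association': 5, 'Rule': 2, 'Mining': 3}
dict2 = {'Rule': 4, 'Mining': 1}
dict3 = {'Association': 4, 'Mining': 3}
dicts = [dict1, dict2, dict3]

# Iterating a dict yields its keys, so each dictionary contributes
# each of its words exactly once, whatever the stored count is.
document_frequency = Counter(word for d in dicts for word in d)

print(dict(document_frequency))
```

A word that appears in none of the dictionaries simply counts as 0, because Counter returns 0 for missing keys.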
This can be easily done with dict comprehension.
First, make a list out of your dicts:
dict1 = {'Association':5,'Rule':2,'Mining':3}
dict2 = {'Rule':4,'Mining':1}
dict3 = {'Association':4,'Mining':3}
dicts = [dict1, dict2, dict3]
Then, make a set of all the words in the dictionaries with a union (might be a cleaner way to do this, but this worked):
all_words = set().union(*[d.keys() for d in dicts])
Then, count how many dictionaries each word appears in:
{k: sum(1 for d in dicts if k in d) for k in all_words}
This returned the desired output from your example.
I need to systematically access dictionaries that are nested within a list within a dictionary at the 3rd level, like this:
responses = {'1': {'responses': [{1st dict to be retrieved}, {2nd dict to be retrieved}, ...]},
             '2': {'responses': [{1st dict to be retrieved}, {2nd dict to be retrieved}, ...]}, ...}
I need to unnest and transform these nested dicts into dataframes, so the end result should look like this:
responses = {'1': df1,
             '2': df2, ...}
To achieve this, I built a for loop over all keys on the first level. Within that loop, another loop extracts each item from the nested dicts into a new, initially empty dict called responses_dict:
responses_dict = {}
for key in responses.keys():
    for item in responses[key]['responses']:
        responses_dict[key].update(item)
However, I get:
KeyError: '1'
The inner loop works if I use it individually on a key within the dict, but that doesn't really help me since the data comes from an API and has to be updated dynamically every few minutes in production.
The next loop, which transforms the result into dataframes, would look like this:
for key in responses_dict:
    responses_df[key] = pd.DataFrame.from_dict(responses_dict[key], orient='index')
But I haven't gotten to try that out since the first operation fails.
Try this:
from collections import defaultdict
responses_dict = defaultdict(dict) # instead of {}
Then your code will work.
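To see why this helps, here is a small sketch with made-up sample data shaped like the structure in the question (the inner keys 'a', 'b', and so on are assumptions, not real API output).

```python
from collections import defaultdict

# Hypothetical sample data shaped like the structure in the question.
responses = {
    '1': {'responses': [{'a': 1, 'b': 2}, {'c': 3}]},
    '2': {'responses': [{'e': 5}, {'f': 6}]},
}

# A missing key is created as an empty dict on first access,
# so .update() never raises KeyError.
responses_dict = defaultdict(dict)
for key in responses:
    for item in responses[key]['responses']:
        responses_dict[key].update(item)

print(dict(responses_dict))
```

With a plain {} the first responses_dict[key] lookup fails because nothing has been stored under that key yet; defaultdict(dict) fills in the empty dict automatically.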
In fact, responses_dict[key] does not exist yet when key is '1', because responses_dict starts out empty.
So even a plain print(responses_dict[key]) raises the same error: '1' is not a key of that dict, so there is nothing to call update on.
Try the following instead:
responses_dict = {}
for key in responses.keys():
    print(key)
    for item in responses[key]['responses']:
        # setdefault creates an empty dict for a new key; update then merges the item into it
        responses_dict.setdefault(key, {}).update(item)
I prefer passing a dictionary when updating a dictionary.
If you update with an existing key, the value of that key will be updated.
If you update with a new key-value pair, the pair will be added to the dictionary.
>>> d1 = {1: 10, 2: 20}
>>> d1.update({1: 20})
>>> d1
{1: 20, 2: 20}
>>> d1.update({3: 30})
>>> d1
{1: 20, 2: 20, 3: 30}
Try fixing your line with:
responses_dict = {}
for key in responses.keys():
    for item in responses[key]['responses']:
        # Each call replaces the value stored under key, so the last item in the list wins
        responses_dict.update({key: item})
So basically, use a dictionary to update a dictionary; it is more readable and easier.
Try this:
from itertools import chain
import pandas as pd
responses = {'1': {'responses': [{'a': 1, 'b': 2}, {'c': 3, 'd': 4}]},
             '2': {'responses': [{'e': 5}, {'f': 6}]}}
result = {k: pd.DataFrame(chain.from_iterable(v['responses'])) for k, v in responses.items()}
for df in result.values():
    print(df, end='\n\n')
Output:
   0
0  a
1  b
2  c
3  d

   0
0  e
1  f
For some reason my code refuses to convert to uppercase and I can't figure out why. I'm then trying to write the dictionary to a file, with the uppercase dictionary values inserted into a sort of template file.
#!/usr/bin/env python3
import fileinput
from collections import Counter
#take every word from a file and put into dictionary
newDict = {}
dict2 = {}
with open('words.txt', 'r') as f:
    for line in f:
        k,v = line.strip().split(' ')
        newDict[k.strip()] = v.strip()
print(newDict)
choice = input('Enter 1 for all uppercase keys or 2 for all lowercase, 3 for capitalized case or 0 for unchanged \n')
print("Your choice was " + choice)
if choice == 1:
    for k,v in newDict.items():
        newDict.update({k.upper(): v.upper()})
if choice == 2:
    for k,v in newDict.items():
        dict2.update({k.lower(): v})
#find keys and replace with word
print(newDict)
with open("tester.txt", "rt") as fin:
    with open("outwords.txt", "wt") as fout:
        for line in fin:
            fout.write(line.replace('{PETNAME}', str(newDict['PETNAME:'])))
            fout.write(line.replace('{ACTIVITY}', str(newDict['ACTIVITY:'])))
myfile = open("outwords.txt")
txt = myfile.read()
print(txt)
myfile.close()
In Python 3 you cannot do this:
for k,v in newDict.items():
    newDict.update({k.upper(): v.upper()})
because it changes the dictionary while iterating over it, and Python doesn't allow that. (It doesn't happen in Python 2 because items() used to return a copy of the elements as a list.) Besides, even if it worked, it would keep the old keys. It is also slow to create a new dictionary at each iteration.
Instead, rebuild your dict with a dict comprehension:
newDict = {k.upper():v.upper() for k,v in newDict.items()}
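Both points can be checked with a short sketch. The sample data and the names pets, broken, and result are made up for illustration. One more thing worth checking in the question's script: input() returns a string in Python 3, so a test such as choice == 1 is never true, which is why the sketch compares against '1'.

```python
pets = {'petname:': 'rex', 'activity:': 'fetch'}  # hypothetical sample data
choice = '1'  # stands in for input(); input() returns a str in Python 3

# 1) Adding keys while iterating over the live view raises RuntimeError.
broken = dict(pets)
try:
    for k, v in broken.items():
        broken.update({k.upper(): v.upper()})
    raised = False
except RuntimeError:
    raised = True

# 2) Building a new dict avoids the problem and drops the old keys.
if choice == '1':  # compare with the string '1', not the integer 1
    result = {k.upper(): v.upper() for k, v in pets.items()}
elif choice == '2':
    result = {k.lower(): v for k, v in pets.items()}
else:
    result = dict(pets)

print(raised, result)
```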
You should not change dictionary items as you iterate over them. The docs state:
Iterating views while adding or deleting entries in the dictionary may raise a RuntimeError or fail to iterate over all entries.
One way to update your dictionary as required is to pop values and reassign them in a for loop over a snapshot of the keys. For example:
d = {'abc': 'xyz', 'def': 'uvw', 'ghi': 'rst'}
for k in list(d):  # list(d) copies the keys, so d can be modified inside the loop
    d[k.upper()] = d.pop(k).upper()
print(d)
{'ABC': 'XYZ', 'DEF': 'UVW', 'GHI': 'RST'}
An alternative is a dictionary comprehension, as shown by @Jean-FrançoisFabre.
I have two dictionaries. In both dictionaries, the value of each key is a single list. If any element in any list in dictionary 2 is equal to a key of dictionary 1, I want to replace that element with the first element in that dictionary 1 list.
In other words, I have:
dict1 = {'IDa':['newA', 'x'], 'IDb':['newB', 'x']}
dict2 = {1:['IDa', 'IDb']}
and I want:
dict2 = {1:['newA', 'newB']}
I tried:
for ID1, news in dict1.items():
    for x, ID2s in dict2.items():
        for ID in ID2s:
            if ID == ID1:
                print ID1, 'match'
                ID.replace(ID, news[0])
for k, v in dict2.items():
    print k, v
and I got:
IDb match
IDa match
1 ['IDa', 'IDb']
So it looks like everything up to the replace method is working. Is there a way to make this work? To replace an entire string in a value-list with a string in another value-list?
Thanks a lot for your help.
Try this:
dict1 = {'IDa':['newA', 'x'], 'IDb':['newB', 'x']}
dict2 = {1:['IDa', 'IDb']}
for key in dict2.keys():
    dict2[key] = [dict1[x][0] if x in dict1.keys() else x for x in dict2[key]]
print dict2
this will print:
{1: ['newA', 'newB']}
as required.
Explanation
dict.keys() gives us just the keys of a dictionary (i.e. just the left hand side of the colon). When we use for key in dict2.keys(), at present our only key is 1. If the dictionary was larger, it'd loop through all keys.
The following line uses a list comprehension - we know that dict2[key] gives us a list (the right side of the colon), so we loop through every element of the list (for x in dict2[key]) and return the first entry of the corresponding list in dict1 only if we can find the element in the keys of dict1 (dict1[x][0] if x in dict1.keys) and otherwise leave the element untouched ([else x]).
For example, if we changed our dictionaries to be the following:
dict1 = {'IDa':['newA', 'x'], 'IDb':['newB', 'x']}
dict2 = {1:['IDa', 'IDb'], 2:['IDb', 'IDc']}
we'd get the output:
{1: ['newA', 'newB'], 2: ['newB', 'IDc']}
because 'IDc' doesn't exist in the keys of dict1.
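As for why the replace call in the question had no visible effect: strings are immutable, so str.replace returns a new string instead of changing the list element. A minimal Python 3 sketch of an in-place alternative, assuming the same sample data as above, assigns the new value by index:

```python
dict1 = {'IDa': ['newA', 'x'], 'IDb': ['newB', 'x']}
dict2 = {1: ['IDa', 'IDb'], 2: ['IDb', 'IDc']}

# str.replace returns a new string and leaves the original untouched,
# so its result has to be stored somewhere to have any effect.
original = 'IDa'
replaced = original.replace('IDa', 'newA')

# Assigning by index changes the list that dict2 actually holds.
for ids in dict2.values():
    for i, item in enumerate(ids):
        if item in dict1:
            ids[i] = dict1[item][0]

print(dict2)
```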
You could also use a dictionary comprehension, though I was not sure whether those work in Python 2.7 or only in Python 3:
# Python 3
dict2 = {k: [dict1.get(e, [e])[0] for e in v] for k,v in dict2.items()}
Edit: I just checked, and this works in Python 2.7 as well. There, dict2.items() also works, but dict2.iteritems() avoids building an intermediate list:
# Python 2.7
dict2 = {k: [dict1.get(e, [e])[0] for e in v] for k,v in dict2.iteritems()}
This was a fun one!
dict2[1] = [dict1[val][0] if val in dict1 else val for val in dict2[1]]
Or, here is the same logic without a list comprehension:
new_dict = {1: []}
for val in dict2[1]:
    if val in dict1:
        new_dict[1].append(dict1[val][0])
    else:
        new_dict[1].append(val)
dict2 = new_dict
I have a list that contains a few dictionaries:
[{u'TEXT242.txt': u'work'},{u'TEXT242.txt': u'go to work'},{u'TEXT1007.txt': u'report'},{u'TEXT797.txt': u'study'}]
How can I combine the dictionaries that share the same key? For example, u'work' and u'go to work' are both under the key 'TEXT242.txt', so I would like to merge them and remove the duplicated key:
[{u'TEXT242.txt': [u'work', u'go to work']},{u'TEXT1007.txt': u'report'},{u'TEXT797.txt': u'study'}]
The setdefault method of dictionaries is handy here... it can create an empty list when a dictionary key doesn't exist, so that you can always append the value.
dictlist = [{u'TEXT242.txt': u'work'},{u'TEXT242.txt': u'go to work'},{u'TEXT1007.txt': u'report'},{u'TEXT797.txt': u'study'}]
newdict = {}
for d in dictlist:
    for k in d:
        newdict.setdefault(k, []).append(d[k])
from collections import defaultdict
before = [{u'TEXT242.txt': u'work'},{u'TEXT242.txt': u'go to work'},{u'TEXT1007.txt': u'report'},{u'TEXT797.txt': u'study'}]
after = defaultdict(list)
for i in before:
    for k, v in i.items():
        after[k].append(v)
out:
defaultdict(list,
            {'TEXT1007.txt': ['report'],
             'TEXT242.txt': ['work', 'go to work'],
             'TEXT797.txt': ['study']})
This technique is simpler and faster than an equivalent technique using dict.setdefault().
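If the exact shape from the question is needed (a list of single-key dictionaries, with a list only where a key occurred more than once), the grouped result could be converted as sketched below. This assumes Python 3.7 or later, where dictionaries keep insertion order; the names grouped and combined are illustrative.

```python
from collections import defaultdict

before = [{'TEXT242.txt': 'work'}, {'TEXT242.txt': 'go to work'},
          {'TEXT1007.txt': 'report'}, {'TEXT797.txt': 'study'}]

# Group every value under its key, as in the answers above.
grouped = defaultdict(list)
for entry in before:
    for name, text in entry.items():
        grouped[name].append(text)

# Keep a bare value when a key occurred once and a list when it occurred
# several times (this mirrors the output format asked for in the question).
combined = [{name: texts if len(texts) > 1 else texts[0]}
            for name, texts in grouped.items()]

print(combined)
```

Mixing bare strings and lists makes later processing more awkward, so keeping a list for every key, as the grouped dictionary does, may be the easier structure to work with.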