I have a dictionary in the following format:
a = {'file': {'x': [1, 2, 3, 4], 'y': [23, 134, 571, 13]},
'file2': {'x': [1, 2, 3, 5], 'y': [123, 215, 21, 123]}}
Is it possible to convert this dictionary into the following format (the keys are the union of all the x values)?
{'1': {'file': 23, 'file2': 123}, '2': {'file': 134, 'file2': 215}, ...,
'4': {'file': 13, 'file2': '-'}, '5': {'file': '-', 'file2': 123}}
I just cannot figure out how to do it.
Yes, of course it is possible. I think what you want is something like:
interim = {k: {x: y for x, y in zip(v['x'], v['y'])} for k, v in a.items()}
which creates a dictionary mapping the 'x's to the 'y's:
{'file2': {1: 123, 2: 215, 3: 21, 5: 123},
'file': {1: 23, 2: 134, 3: 571, 4: 13}}
then:
out_keys = set().union(*interim.values())
which creates the set of keys for the output:
set([1, 2, 3, 4, 5])
and finally:
output = {k: {k1: v1.get(k, "-") for k1, v1 in interim.items()} for k in out_keys}
which creates your output format:
{1: {'file2': 123, 'file': 23},
2: {'file2': 215, 'file': 134},
3: {'file2': 21, 'file': 571},
4: {'file2': '-', 'file': 13},
5: {'file2': 123, 'file': '-'}}
This is flexible: it handles any number of 'file's, 'x's and 'y's. Note that zip truncates to the shorter of the 'x' and 'y' lists if they aren't the same length.
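The three steps can be put together as one function. This is only a sketch: the name `pivot_by_x` and the `missing` parameter are made up for illustration, and the output keys are sorted just to make the result easier to read.

```python
def pivot_by_x(data, missing="-"):
    """Regroup {name: {'x': [...], 'y': [...]}} by x value."""
    # Step 1: map each x to its y, per file (zip stops at the shorter list).
    interim = {name: dict(zip(v["x"], v["y"])) for name, v in data.items()}
    # Step 2: the union of all x values becomes the set of output keys.
    out_keys = set().union(*interim.values())
    # Step 3: look up every x in every file, filling gaps with `missing`.
    return {
        x: {name: xy.get(x, missing) for name, xy in interim.items()}
        for x in sorted(out_keys)
    }


a = {'file': {'x': [1, 2, 3, 4], 'y': [23, 134, 571, 13]},
     'file2': {'x': [1, 2, 3, 5], 'y': [123, 215, 21, 123]}}
result = pivot_by_x(a)
print(result[4])
print(result[5])
```

Passing a different `missing` value (for example `None`) changes the placeholder without touching the rest of the logic.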
Here's a one-liner, so to speak:
In [1]: %paste
a = {'file': {'x': [1, 2, 3, 4], 'y': [23, 134, 571, 13]},
'file2': {'x': [1, 2, 3, 5], 'y': [123, 215, 21, 123]}}
## -- End pasted text --
In [2]: {x: {fkey:
..: ([y for _, y in zip(fval['x'], fval['y']) if _ == x] or ['-'])[0]
..: for fkey, fval in a.items()}
..: for x in set().union(*[fval['x'] for fval in a.values()])}
Out[2]:
{1: {'file2': 123, 'file': 23},
2: {'file2': 215, 'file': 134},
3: {'file2': 21, 'file': 571},
4: {'file2': '-', 'file': 13},
5: {'file2': 123, 'file': '-'}}
Essentially it is equivalent to jonsharpe's answer, although it doesn't create an intermediate dict.
I have two lists of dictionaries, namely
bandits = [{'health': 15, 'damage': 2, 'id': 0}, {'health': 10, 'damage': 2, 'id': 0}, {'health': 12, 'damage': 2, 'id': 0}]
hero = [{'name': "Arthur", 'health': 50, 'damage': 5, 'id': 0}]
What I would like to do is simulate a hero strike on each member of the bandits list, which consists of subtracting the damage value of hero from the health value of each bandits entry. As an illustration, with the values given above, after the hero has dealt its blow, the bandits list should read
bandits = [{'health': 10, 'damage': 2, 'id': 0}, {'health': 5, 'damage': 2, 'id': 0}, {'health': 7, 'damage': 2, 'id': 0}]
I have tried several things, amongst which
for i, v in enumerate(bandits):
    bandits[i] = {k: (bandits[i][k] - hero[0].get('damage')) for k in bandits[i] if k=='health'}
which yields
bandits = [{'health': 10}, {'health': 5}, {'health': 7}]
i.e. the results for the health are good, but all other key:val pairs in the dictionaries contained in the bandits list are deleted. How can I correct my code?
Depending on the goals/use case, you can iterate over the collection and update the value in place (here the lists are named bandit and knight_data):
bandit = [{'health': 15, 'damage': 2, 'id': 0}, {'health': 10, 'damage': 2, 'id': 0}, {'health': 12, 'damage': 2, 'id': 0}]
knight_data = [{'name': "Arthur", 'health': 50, 'damage': 5, 'id': 0}]
for b in bandit:
    for k in knight_data:
        b['health'] -= k['damage']
Or:
for b in bandit:
    b['health'] -= knight_data[0]['damage']
Don't create new dictionaries, just subtract from the values in the existing dictionaries.
for bandit in bandits:
    bandit['health'] -= hero[0]['damage']
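If the original dictionaries have to stay untouched, the comprehension from the question can be repaired instead of dropped. The sketch below copies every key with `**` unpacking and overrides only 'health'; the name `struck` is an assumption for illustration.

```python
bandits = [{'health': 15, 'damage': 2, 'id': 0},
           {'health': 10, 'damage': 2, 'id': 0},
           {'health': 12, 'damage': 2, 'id': 0}]
hero = [{'name': "Arthur", 'health': 50, 'damage': 5, 'id': 0}]

damage = hero[0]['damage']
# Build new dicts: copy all key/value pairs, then replace 'health' only.
struck = [{**b, 'health': b['health'] - damage} for b in bandits]

print(struck)
print(bandits[0]['health'])  # the original list is not modified
```

This costs one new dictionary per bandit, so the in-place loop remains the simpler choice when mutation is acceptable.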
In my DataFrame I have a column whose cells are lists of dicts. When I do
data.stations.apply(lambda x: x)[5]
the output is:
[{'id': 245855,
'outlets': [{'connector': 13, 'id': 514162, 'power': 0},
{'connector': 3, 'id': 514161, 'power': 0},
{'connector': 7, 'id': 514160, 'power': 0}]},
{'id': 245856,
'outlets': [{'connector': 13, 'id': 514165, 'power': 0},
{'connector': 3, 'id': 514164, 'power': 0},
{'connector': 7, 'id': 514163, 'power': 0}]},
{'id': 245857,
'outlets': [{'connector': 13, 'id': 514168, 'power': 0},
{'connector': 3, 'id': 514167, 'power': 0},
{'connector': 7, 'id': 514166, 'power': 0}]}]
So it looks like 3 dicts in a list.
When I do
data.stations.apply(lambda x: x[0] )[5]
It does what it should:
{'id': 245855,
'outlets': [{'connector': 13, 'id': 514162, 'power': 0},
{'connector': 3, 'id': 514161, 'power': 0},
{'connector': 7, 'id': 514160, 'power': 0}]}
However, when I choose the second or third element, it doesn't work:
data.stations.apply(lambda x: x[1])[5]
This gives an error:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-118-1210ba659690> in <module>()
----> 1 data.stations.apply(lambda x: x[1])[5]
~\AppData\Local\Continuum\Anaconda3\envs\geo2\lib\site-packages\pandas\core\series.py in apply(self, func, convert_dtype, args, **kwds)
2549 else:
2550 values = self.asobject
-> 2551 mapped = lib.map_infer(values, f, convert=convert_dtype)
2552
2553 if len(mapped) and isinstance(mapped[0], Series):
pandas/_libs/src/inference.pyx in pandas._libs.lib.map_infer()
<ipython-input-118-1210ba659690> in <lambda>(x)
----> 1 data.stations.apply(lambda x: x[1])[5]
IndexError: list index out of range
Why? It should just give me the second element.
The reason is probably simply that the lists in the rows are not all the same length. Let's consider an example:
data = pd.DataFrame({'stations':[[{'1':2,'3':4},{'1':2,'3':4},{'1':2,'3':4}],
[{'1':2,'3':4},{'1':2,'3':4}],
[{'1':2,'3':4}],
[{'1':2,'3':4},{'1':2,'3':4},{'1':2,'3':4}]]
})
stations
0 [{'1': 2, '3': 4}, {'1': 2, '3': 4}, {'1': 2, ...
1 [{'1': 2, '3': 4}, {'1': 2, '3': 4}]
2 [{'1': 2, '3': 4}]
3 [{'1': 2, '3': 4}, {'1': 2, '3': 4}, {'1': 2, ...
If you do :
data['stations'].apply(lambda x: x[0])[3]
You will get :
{'1': 2, '3': 4}
But if you do:
data['stations'].apply(lambda x: x[1])[3]
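You will get IndexError: list index out of range, because apply calls the lambda on every row, not only on the one you select afterwards, and the row at index 2 contains only one element, so x[1] does not exist there. Hope that clears up your doubt.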
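One way around the error is to guard the index access. The helper below is a hypothetical sketch (`nth_or_none` is not a pandas function) and assumes a non-negative `n`; it is shown on plain lists so it runs without pandas.

```python
def nth_or_none(items, n):
    """Return items[n], or None when the list is too short (n >= 0 assumed)."""
    return items[n] if len(items) > n else None


# Rows of different lengths, mimicking the 'stations' column.
rows = [
    [{'id': 1}, {'id': 2}, {'id': 3}],
    [{'id': 4}, {'id': 5}],
    [{'id': 6}],
]
second = [nth_or_none(r, 1) for r in rows]
print(second)
```

With a DataFrame, the same helper could be passed through `apply`, for example `data.stations.apply(lambda x: nth_or_none(x, 1))`, so short rows yield `None` instead of raising.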
Every iteration of the loop in this function:
def sum_total(files, local_dir):
    final_dict = {}
    for i in range(len(files)):
        with open(local_dir+files[i], 'r') as f:
            data = f.readlines()
        res = find_by_tag(data)
        print('res: ', res)
        sum_values_from_several_dict_to_one(res)
Generates example output:
{'Critical Tests': {'failed': 1, 'passed': 2, 'total': 5}, 'All Tests': {'failed': 5, 'passed': 0, 'total': 5}}
{'Critical Tests': {'failed': 2, 'passed': 3, 'total': 5}, 'All Tests': {'failed': 10, 'passed': 12, 'total': 12}}
{'Critical Tests': {'failed': 3, 'passed': 4, 'total': 5}, 'All Tests': {'failed': 10, 'passed': 0, 'total': 10}}
EXPECTED OUTPUT:
I would like to sum those values into one dictionary to get output like:
{'Critical Tests': {'failed': 6, 'passed': 9, 'total': 15}, 'All Tests': {'failed': 25, 'passed': 12, 'total': 27}}
The problem is: what should the 'sum_values_from_several_dict_to_one' function look like? This is my code, but it does not work. What should be improved?
def sum_values_from_several_dict_to_one(d1):
    final_dict = {}
    for d in d1 <?>:
        for test, results in d.items():
            if test not in final_dict:
                final_dict[test] = {}
            for key, value in results.items():
                if key in final_dict[test]:
                    final_dict[test][results] += value
                else:
                    final_dict[test][key] = value
    return final_dict
Here you have:
a = {'Critical Tests': {'failed': 1, 'passed': 2, 'total': 5}, 'All Tests': {'failed': 5, 'passed': 0, 'total': 5}}
b = {'Critical Tests': {'failed': 2, 'passed': 3, 'total': 5}, 'All Tests': {'failed': 10, 'passed': 12, 'total': 12}}
def sum_dicts(dict1, dict2):
    # start from a copy of dict1 so keys missing from dict2 are kept
    res = dict(dict1)
    for key, v in dict2.items():
        if key not in res:
            res[key] = v                        # key only in dict2: take it as is
        elif type(v) is dict:
            res[key] = sum_dicts(res[key], v)   # nested dict: recurse
        else:
            res[key] = res[key] + v
    return res

if __name__ == '__main__':
    sol = sum_dicts(a, b)
    print(sol)
Output:
{'All Tests': {'failed': 15, 'total': 17, 'passed': 12}, 'Critical Tests': {'failed': 3, 'total': 10, 'passed': 5}}
EDIT:
Assuming res is a dict you can use it like this:
def sum_total(files, local_dir):
    final_dict = {}
    for i in range(len(files)):
        with open(local_dir+files[i], 'r') as f:
            data = f.readlines()
        res = find_by_tag(data)
        print('res: ', res)
        # accumulate; relies on sum_dicts keeping keys that only one argument has
        final_dict = sum_dicts(final_dict, res)
    return final_dict
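A different technique for the same accumulation is `collections.Counter`, whose `update` method adds counts instead of replacing them. The sketch below assumes every result is a two-level dict of numbers, as in the example output; the function name `sum_results` is made up for illustration.

```python
from collections import Counter, defaultdict


def sum_results(results):
    """Sum a sequence of {test: {field: number}} dicts field by field."""
    totals = defaultdict(Counter)
    for res in results:
        for test, counts in res.items():
            totals[test].update(counts)  # Counter.update adds to existing counts
    # Convert back to plain dicts for printing or serialising.
    return {test: dict(c) for test, c in totals.items()}


runs = [
    {'Critical Tests': {'failed': 1, 'passed': 2, 'total': 5},
     'All Tests': {'failed': 5, 'passed': 0, 'total': 5}},
    {'Critical Tests': {'failed': 2, 'passed': 3, 'total': 5},
     'All Tests': {'failed': 10, 'passed': 12, 'total': 12}},
    {'Critical Tests': {'failed': 3, 'passed': 4, 'total': 5},
     'All Tests': {'failed': 10, 'passed': 0, 'total': 10}},
]
print(sum_results(runs))
```

Because the totals start empty and grow as keys appear, no special case is needed for the first file.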
I have a list of ids sorted in the proper order:
ids = [1, 2, 4, 6, 5, 0, 3]
I also have a list of dictionaries, sorted in some random way:
rez = [{'val': 7, 'id': 1}, {'val': 8, 'id': 2}, {'val': 2, 'id': 3}, {'val': 0, 'id': 4}, {'val': -1, 'id': 5}, {'val': -4, 'id': 6}, {'val': 9, 'id': 0}]
My intention is to sort the rez list so that its order corresponds to ids:
rez = [{'val': 7, 'id': 1}, {'val': 8, 'id': 2}, {'val': 0, 'id': 4}, {'val': -4, 'id': 6}, {'val': -1, 'id': 5}, {'val': 9, 'id': 0}, {'val': 2, 'id': 3}]
I tried:
rez.sort(key = lambda x: ids.index(x['id']))
However, that way is too slow for me, as len(ids) > 150K and each dict actually has a lot of keys (some of the values are strings). Any suggestions on how to do it in the most Pythonic, but still fastest, way?
You don't need to sort because ids specifies the entire ordering of the result. You just need to pick the correct elements by their ids:
rez_dict = {d['id']:d for d in rez}
rez_ordered = [rez_dict[id] for id in ids]
Which gives:
>>> rez_ordered
[{'id': 1, 'val': 7}, {'id': 2, 'val': 8}, {'id': 4, 'val': 0}, {'id': 6, 'val': -4}, {'id': 5, 'val': -1}, {'id': 0, 'val': 9}, {'id': 3, 'val': 2}]
This should be faster than sorting because it can be done in linear time on average, while sort is O(nlogn).
Note that this assumes that there will be one entry per id, as in your example.
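If that assumption does not hold and some ids have no entry, a membership check keeps the lookup from raising `KeyError`. This is a sketch with a shortened `rez`; the names `by_id` and `ordered` are illustrative.

```python
ids = [1, 2, 4, 6, 5, 0, 3]
# Only three of the seven ids have an entry here.
rez = [{'val': 7, 'id': 1}, {'val': 2, 'id': 3}, {'val': 9, 'id': 0}]

by_id = {d['id']: d for d in rez}
# Follow the order of ids, skipping any id without a matching dict.
ordered = [by_id[i] for i in ids if i in by_id]

print(ordered)
```

If several dicts share an id, the dict comprehension keeps only the last one, so duplicates would need a list per id instead.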
I think you are on the right track. The slowness comes from ids.index, which scans the list on every call and makes the sort quadratic overall. To speed it up, you can turn the list into a dictionary first, mapping the ids to their respective indices.
indices = {id_: pos for pos, id_ in enumerate(ids)}
rez.sort(key = lambda x: indices[x['id']])
This way, indices is {0: 5, 1: 0, 2: 1, 3: 6, 4: 2, 5: 4, 6: 3}, and rez is
[{'id': 1, 'val': 7},
{'id': 2, 'val': 8},
{'id': 4, 'val': 0},
{'id': 6, 'val': -4},
{'id': 5, 'val': -1},
{'id': 0, 'val': 9},
{'id': 3, 'val': 2}]
I am trying to replace each list element with the value looked up in a dictionary. How do I do that?
list = [1, 3, 2, 10]
d = {'id': 1, 'val': 30},{'id': 2, 'val': 53}, {'id': 3, 'val': 1}, {'id': 4, 'val': 9}, {'id': 5, 'val': 2}, {'id': 6, 'val': 6}, {'id': 7, 'val': 11}, {'id': 8, 'val': 89}, {'id': 9, 'val': 2}, {'id': 10, 'val': 4}
for i in list:
    for key, v in d.iteritems():
        ???
        ???
so at the end I am expecting:
list = [30, 1, 53, 4]
thank you
# D is the collection of dicts (d in the question), L is the list of ids (list in the question)
D2 = dict((x['id'], x['val']) for x in D)
L2 = [D2[x] for x in L]
td = (
    {'val': 30, 'id': 1},
    {'val': 53, 'id': 2},
    {'val': 1, 'id': 3},
    {'val': 9, 'id': 4},
    {'val': 2, 'id': 5},
    {'val': 6, 'id': 6},
    {'val': 11, 'id': 7},
    {'val': 89, 'id': 8},
    {'val': 2, 'id': 9},
    {'val': 4, 'id': 10}
)
source_list = [1, 3, 2, 10]
final_list = []
for item in source_list:
    for d in td:
        if d['id'] == item:
            final_list.append(d['val'])
print('Source : ', source_list)
print('Final : ', final_list)
Result
Source : [1, 3, 2, 10]
Final : [30, 1, 53, 4]