How to extract values from a nested dict_values object - python

I have a dict_values object. When I read it:
Info = mydata.values()
and display it, it shows as follows:
dict_values([{'loc': -1392.9605288965874, 'scale': 652.5001331690878}])
My goal is to generate a numpy array extracting the values in the dict_values object, so:
array([-1392.9605288965874, 652.5001331690878])
The problem is that I haven't managed to separate the numbers from the text labels, that is, the parameter names ('loc' and 'scale').
How can I do it?

You can call .values() twice and build the array without naming the inner dict's keys:
out = np.array([x for y in mydata.values() for x in y.values()])

For .values() to return that, mydata must itself have been a dict, something like:
In [46]: d={'foobar':{'loc': -1392.9605288965874, 'scale': 652.5001331690878}}
In [47]: d
Out[47]: {'foobar': {'loc': -1392.9605288965874, 'scale': 652.5001331690878}}
Then its values:
In [48]: d.values()
Out[48]: dict_values([{'loc': -1392.9605288965874, 'scale': 652.5001331690878}])
expand that into a list, and select the one item:
In [49]: list(d.values())
Out[49]: [{'loc': -1392.9605288965874, 'scale': 652.5001331690878}]
In [50]: list(d.values())[0]
Out[50]: {'loc': -1392.9605288965874, 'scale': 652.5001331690878}
That's a dict, so we can again get the values:
In [51]: list(d.values())[0].values()
Out[51]: dict_values([-1392.9605288965874, 652.5001331690878])
and make an array from the derived list:
In [53]: np.array(list(list(d.values())[0].values()))
Out[53]: array([-1392.9605289 , 652.50013317])
with just one element in the outer dict, it may be simpler to select that by key rather than values:
In [54]: d['foobar']
Out[54]: {'loc': -1392.9605288965874, 'scale': 652.5001331690878}
The main challenge with nested structures like this is keeping track of what you have at each level.
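The step-by-step unpacking above can be wrapped in a small helper. This is only a sketch, not something from the answers: flatten_values is a hypothetical name, and it assumes the nested containers are dicts and the leaves are plain numbers. The resulting list can be passed to np.array if numpy is available.

```python
def flatten_values(mapping):
    """Return the leaf values of a (possibly nested) dict as a flat list."""
    flat = []
    for value in mapping.values():
        if isinstance(value, dict):
            # Recurse into inner dicts instead of indexing them by key name.
            flat.extend(flatten_values(value))
        else:
            flat.append(value)
    return flat


# Hypothetical data shaped like the question's mydata.
mydata = {'foobar': {'loc': -1392.9605288965874, 'scale': 652.5001331690878}}
values = flatten_values(mydata)
print(values)  # [-1392.9605288965874, 652.5001331690878]
```

Because the helper never refers to 'loc' or 'scale', it keeps working if the inner keys are renamed or more keys are added.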

does this help?
values_dict = list(Info)[0]
values_list = [v for k, v in values_dict.items()]
numpy_array = np.array(values_list)
print(numpy_array)

Related

Can this for loop be reduced to one line?

Python code:
arr = [['name1', 101], ['name2', 234], ['name3', 456]]
nametolookfor = input("Please enter the name: ")
data = 0
for value in arr:
    if value[0] == nametolookfor:
        otherdata = value[1]
I was wondering if the for loop and its contents could be brought down to one line.
I have tried using list comprehensions but can't seem to get it to work.
You can use a list comprehension to filter for array elements where the name matches, then select only the value, and from that retrieve the first:
>>> arr = [['name1', 101], ['name2', 234], ['name3', 456]]
>>> nametolookfor = 'name2'
>>> [v for (n, v) in arr if n == nametolookfor][0]
234
Since your list is a list of key/value pairs where the key is (by definition) unique, you can also make it into a dictionary and have a direct lookup instead:
>>> lookup = dict(arr)
>>> lookup[nametolookfor]
234
Of course, if your arr is static, you can just declare it as a dictionary from the start to save you from having to do the conversion:
>>> lookup = { 'name1': 101, 'name2': 234, 'name3': 456 }
>>> lookup[nametolookfor]
234
I think this should be close to what you want. Using next avoids some boilerplate later to check if the value exists.
next((value[1] for value in arr if value[0] == nametolookfor), None)
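To make the next-with-a-default idea concrete, here is a minimal sketch. lookup_value is a hypothetical helper name, not something from the answer. Note that the generator expression needs its own parentheses whenever next() is also given a default.

```python
def lookup_value(pairs, name, default=None):
    """Return the value paired with the first matching name, or default."""
    # The generator expression must be parenthesized because next()
    # receives a second argument (the default).
    return next((value for key, value in pairs if key == name), default)


arr = [['name1', 101], ['name2', 234], ['name3', 456]]
print(lookup_value(arr, 'name2'))    # 234
print(lookup_value(arr, 'missing'))  # None
```

Unlike indexing a list comprehension with [0], this does not raise IndexError when the name is absent, and it stops scanning at the first match.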
Yes, this is possible with a list comprehension:
[value[1] for value in arr if value[0] == nametolookfor]
Please enter the name: name1
Output: [101]
Assume your list is arr = [['name1', 101], ['name2', 234], ['name3', 456], ['name1', 202], ['name1', 303]] and the input is nametolookfor = 'name1'; then the result will be
[101, 202, 303]
Try this out:
otherdata = dict(arr)[nametolookfor]
Since the list is a list of key/value pairs where the key is unique, you can convert it into a dictionary and have a direct lookup.
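The dictionary approach relies on the names being unique. As a hedged illustration of what happens when they are not (the duplicated 'name1' entry below is made up for the example), dict() silently keeps the last value, while grouping into lists keeps all of them:

```python
from collections import defaultdict

arr = [['name1', 101], ['name2', 234], ['name1', 202]]

# dict() keeps only the last value seen for a repeated key.
last_wins = dict(arr)

# Grouping keeps every value, in the order it appeared.
grouped = defaultdict(list)
for name, value in arr:
    grouped[name].append(value)

print(last_wins['name1'])  # 202
print(grouped['name1'])    # [101, 202]
```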

How to compare index values of list in default dict in python

I have a defaultdict d in Python which contains two lists, as below:
{
'data1': [0.8409093126477928, 0.9609093126477928, 0.642217399079215, 0.577003839123445, 0.7024399719949195, 1.0739533732043967],
'data2': [0.9662666242560285, 0.9235637581239243, 0.8947656867577896, 0.9266919525550584, 1.0220039913024457]
}
In the future there may be many lists in the defaultdict, like data1, data2, data3, data4, etc. I need to compare the lists with each other index by index. So for the defaultdict above I need to check whether data1[0] (0.8409093126477928) is smaller than data2[0] (0.9662666242560285), and likewise for the other indices, and store the key of the winning list for each index in a separate list, like below:
result = ['data1', 'data2', 'data1', 'data1', 'data1']
If one list is longer than the others, we simply need to check whether its extra values are smaller than 1. For example, data1[5] cannot be compared with data2[5] because data2[5] does not exist, so we simply check whether data1[5] is less than 1. If it is, we add data1 to the result; otherwise we ignore it and store nothing.
To solve this I thought of extracting the lists from the defaultdict into separate lists and then using a for loop to compare the values index by index, but when I ran print(d[0]) to print the first list, it printed []. Why is it printing an empty list? How can I compare the values as described above? Please help. Thanks
Edit: as suggested by @ggorlen, replaced the custom iterator with zip_longest.
I would do it like this:
zip_longest yields one item from each list per iteration; for a shorter list it yields the fill value 1 once the iteration goes past its length.
The list comprehension loops over the iterator, finds the index of the first minimum with item.index(min(item)), and then looks up the corresponding key with keys[item.index(min(item))].
If the selected list is shorter than the current index (meaning the minimum was a fill value), the index is either skipped or given the value "NA".
from itertools import zip_longest
keys = list(d.keys())
lengths = list(map(len,d.values()))
result = [keys[item.index(min(item))]
          for i, item in enumerate(zip_longest(*d.values(), fillvalue=1))
          if lengths[item.index(min(item))] > i]
result
If you want to record a placeholder ("NA") instead of skipping the index when no remaining value is less than 1:
result = [keys[item.index(min(item))] if lengths[item.index(min(item))] > i else "NA"
          for i, item in enumerate(zip_longest(*d.values(), fillvalue=1))]
We can use zip_longest from itertools and a variety of loops to achieve the result:
from itertools import zip_longest
result = []
pairs = [[[z, y] for z in x] for y, x in data.items()]
for x in zip_longest(*pairs):
    x = [y for y in x if y]
    if len(x) > 1:
        result.append(min(x, key=lambda x: x[0])[1])
    elif x[0][0] < 1:
        result.append(x[0][1])
print(result) # => ['data1', 'data2', 'data1', 'data1', 'data1']
First we create pairs of every item in each dict value and its key. This makes it easier to get result keys later. We zip_longest and iterate over the lists, filtering out Nones. If we have more than one element to compare, we take the min and append it to the result, else we check the lone element and keep it if its value is less than 1.
A more verifiable example is
data = {
'foo': [1, 0, 1, 0],
'bar': [1, 1, 1, 1, 0],
'baz': [1, 1, 0, 0, 1, 1, 0],
'quux': [0],
}
which produces
['quux', 'foo', 'baz', 'foo', 'bar', 'baz']
Element-wise, "quux" wins round 0, "foo" wins round 1, "baz" 2, "foo" round 3 thanks to key order (tied with "baz"), "bar" for round 4. For round 5, "baz" is the last one standing but isn't below 1, so nothing is taken. For round 6, "baz" is still the last one standing but since 0 < 1, it's taken.
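The round-by-round rules described above can also be packaged as a function. The following is a sketch under stated assumptions rather than the answer's own code: winners and _MISSING are hypothetical names, and a dedicated sentinel object is used as the fill value so that real values such as 0 or 1 are never confused with padding.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel: "this list has no value at this index"


def winners(data, threshold=1):
    """For each index, return the key whose list holds the smallest value.

    When only one list still has a value at an index, its key is kept
    only if that value is below threshold.
    """
    keys = list(data)
    result = []
    for column in zip_longest(*data.values(), fillvalue=_MISSING):
        present = [(value, key) for value, key in zip(column, keys)
                   if value is not _MISSING]
        if len(present) > 1:
            # min() returns the first minimum, so ties go to the earlier key.
            result.append(min(present, key=lambda pair: pair[0])[1])
        elif present[0][0] < threshold:
            result.append(present[0][1])
    return result


data = {
    'foo': [1, 0, 1, 0],
    'bar': [1, 1, 1, 1, 0],
    'baz': [1, 1, 0, 0, 1, 1, 0],
    'quux': [0],
}
print(winners(data))  # ['quux', 'foo', 'baz', 'foo', 'bar', 'baz']
```

Running it on the question's data1/data2 lists gives ['data1', 'data2', 'data1', 'data1', 'data1'], because the lone value data1[5] is not below 1 and is therefore dropped.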
d = {
'd0': [0.1, 1.1, 0.3],
'd1': [0.4, 0.5, 1.4, 0.3, 1.6],
'd2': [],
}
import itertools
import collections
# sort by length of lists, shortest first and longest last
d = sorted(d.items(), key=lambda k:len(k[1]))
# loop through all combinations possible
for (key1, list1), (key2, list2) in itertools.combinations(d, 2):
    result = []
    for v1, v2 in itertools.zip_longest(list1, list2):  # shorter list is padded with None
        # no need to check if v2 is None because of sorting
        if v1 is None:
            result.append(key2 if v2 < 1 else None)
        else:
            result.append(key1 if v1 < v2 else key2)
    # DO stuff with result, keys, list, etc...
    print(f'{key1} vs {key2} = {result}')
Output
d2 vs d0 = ['d0', None, 'd0']
d2 vs d1 = ['d1', 'd1', None, 'd1', None]
d0 vs d1 = ['d0', 'd1', 'd0', 'd1', None]
I sorted them based on the list lengths. This ensures that list1 will always be shorter or of the same length as list2.
For different lengths, the remaining indices will be a mixture of None and key2.
However, when the elements are equal, key2 is added to the result. This might not be the desired behavior.
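On the question's side issue of why print(d[0]) showed []: a defaultdict is indexed by key, not by position, and looking up a missing key creates it using the default factory. The short sketch below assumes d was built as defaultdict(list), as the question's wording suggests; the numbers are shortened stand-ins for the original data.

```python
from collections import defaultdict

d = defaultdict(list)
d['data1'].extend([0.84, 0.96])
d['data2'].extend([0.97, 0.92])

print(d[0])    # []   : key 0 did not exist, so the factory made a new list
print(0 in d)  # True : the lookup above inserted the key as a side effect

# Positional access goes through the values, not through d[0].
first_list = list(d.values())[0]
print(first_list)  # [0.84, 0.96]
```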

Python: modifying single dictionary item containing an array view modifies all items

I have two dictionaries with same keys. Each item is an ndarray.
from numpy import zeros, random
from collections import namedtuple
PhaseAmplitude = namedtuple('PhaseAmplitude','phase amplitude')
dict_keys = {'K1','K2', 'K3'}
J1 = dict.fromkeys(dict_keys, zeros((2,2,2,2)))
U1 = dict.fromkeys(dict_keys, PhaseAmplitude(phase = zeros((2,2)),
                                             amplitude = zeros((2,2))))
for iFld in dict_keys:
    U1[iFld] = U1[iFld]._replace(phase = random.random_sample((2,2)),
                                 amplitude = random.random_sample((2,2)))
I want to modify each item in the first dictionary using the corresponding item in the second one:
for iFld in dict_keys:
    J1[iFld][0,0,:,:] += U1[iFld].phase
    J1[iFld][0,1,:,:] += U1[iFld].amplitude
I expect to end up with J1[iFld][0,0,:,:] equal to U1[iFld].phase and J1[iFld][0,1,:,:] equal to U1[iFld].amplitude, but instead J1[iFld] is the same for every iFld and equals the sum over all keys of U1 (with the phase and amplitude fields kept separate, of course).
To me this looks like a bug, but I've only been using Python for a month or so (switching from MATLAB), so I am not sure.
Question: Is this expected behavior or a bug? What should I change in my code in order to get the behavior I want?
Note: I chose the number of dimensions of dict_keys, J1 and U1 to reflect my particular situation.
This isn't a bug, though it is a pretty common gotcha that shows up in a few different situations. dict.fromkeys creates a new dictionary where all of the values are the same object. This works great for immutable types (e.g. int, str), but for mutable types, you can run into problems.
e.g.:
>>> import numpy as np
>>> d = dict.fromkeys('ab', np.zeros(2))
>>> d
{'a': array([ 0., 0.]), 'b': array([ 0., 0.])}
>>> d['a'][1] = 1
>>> d
{'a': array([ 0., 1.]), 'b': array([ 0., 1.])}
and this is because:
>>> d['a'] is d['b']
True
Use a dict comprehension to build the dictionary in this case:
J1 = {k: zeros((2,2,2,2)) for k in dict_keys}
(or, pre-python2.7):
J1 = dict((k, zeros((2,2,2,2))) for k in dict_keys)
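The same sharing effect can be reproduced without numpy. This sketch uses plain lists as stand-ins for the arrays (an assumption made so that it runs with the standard library only); the behaviour of dict.fromkeys is the same for any mutable value.

```python
keys = ['K1', 'K2', 'K3']

shared = dict.fromkeys(keys, [0, 0])  # one list object, three references
separate = {k: [0, 0] for k in keys}  # a new list per key

shared['K1'][0] += 5
separate['K1'][0] += 5

print(shared)    # {'K1': [5, 0], 'K2': [5, 0], 'K3': [5, 0]}
print(separate)  # {'K1': [5, 0], 'K2': [0, 0], 'K3': [0, 0]}
```

In the question's loop, each += therefore landed on the single shared array, which is why every key ended up holding the sum over all keys.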

Summing up numbers in a defaultdict(list)

I've been experimenting to get this to work and have exhausted every idea and web search; nothing seems to do the trick. I need to sum the numbers in a defaultdict(list) and I only need the final result, but whatever I do, I can only reach the final result by iterating and printing every partial sum along the way. Here is what I've generally been trying:
d = { 'key' : [1,2,3] }
running_total = 0
# Iterate over the values
for value in d.itervalues():
    # Iterate through the list inside each value
    for x in value:
        running_total += x
        print running_total
The result is:
1,3,6
I understand it does this because the print runs on every pass through the for loop. What I don't get is how else I can reach each of these list values without using a loop. Or is there some method I've overlooked?
To be clear, I just want the final number returned, e.g. 6.
EDIT: I neglected a huge factor: the items in the list are timedelta objects, so I have to use .seconds to turn them into integers before adding. The solutions below make sense and I've tried similar ones, but trying to work the .seconds conversion into the sum statement throws an error.
d = { key : [timedelta_Obj1,timedelta_Obj2,timedelta_Obj3] }
I think this will work for you:
sum(td.seconds for sublist in d.itervalues() for td in sublist)
Try this approach:
from datetime import timedelta as TD
d = {'foo' : [TD(seconds=1), TD(seconds=2), TD(seconds=3)],
'bar' : [TD(seconds=4), TD(seconds=5), TD(seconds=6), TD(seconds=7)],
'baz' : [TD(seconds=8)]}
print sum(sum(td.seconds for td in values) for values in d.itervalues())
You could just sum each of the lists in the dictionary, then take one final sum of the returned list.
>>> d = {'foo' : [1,2,3], 'bar' : [4,5,6,7], 'foobar' : [10]}
# sum each value in the dictionary
>>> [sum(d[i]) for i in d]
[10, 6, 22]
# sum each of the sums in the list
>>> sum([sum(d[i]) for i in d])
38
If you don't want to iterate or to use comprehensions you can use this:
d = {'1': [1, 2, 3], '2': [3, 4, 5], '3': [5], '4': [6, 7]}
print(sum(map(sum, d.values())))
If you use Python 2 and your dict has a lot of keys, it's better to use imap (from itertools) and itervalues, which avoid building intermediate lists:
from itertools import imap
print sum(imap(sum, d.itervalues()))
Your question was how to get the value "without using a loop". Well, you can't. But there is one thing you can do: use the high performance itertools.
If you use chain you won't have an explicit loop in your code. chain manages that for you.
>>> data = {'a': [1, 2, 3], 'b': [10, 20], 'c': [100]}
>>> import itertools
>>> sum(itertools.chain.from_iterable(data.itervalues()))
136
If you have timedelta objects you can use the same recipe.
>>> from datetime import timedelta
>>> data = {'a': [timedelta(minutes=1),
...               timedelta(minutes=2),
...               timedelta(minutes=3)],
...         'b': [timedelta(minutes=10),
...               timedelta(minutes=20)],
...         'c': [timedelta(minutes=100)]}
>>> sum(td.seconds for td in itertools.chain.from_iterable(data.itervalues()))
8160
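The answers above are written for Python 2 (itervalues, the print statement). As a hedged Python 3 sketch of the same chain recipe, the timedelta objects can also be added directly by giving sum() a timedelta start value. The sample data, including the one-day entry, is made up to show a difference that matters: .seconds leaves out the days component, whereas total_seconds() includes it.

```python
from datetime import timedelta
from itertools import chain

data = {
    'a': [timedelta(minutes=1), timedelta(minutes=2), timedelta(minutes=3)],
    'b': [timedelta(minutes=10), timedelta(days=1)],
}

# Flatten all lists, then add the timedeltas themselves. The second argument
# to sum() is the start value, needed because the default start is the int 0.
total = sum(chain.from_iterable(data.values()), timedelta())

print(total.total_seconds())  # 87360.0
print(total.seconds)          # 960 (the day is stored separately in total.days)
```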

What is the Pythonic way to use (elements of) a list as keys in a dictionary?

Given the list:
keys = ['Orange','Blue','Green']
and the dictionary
colors = {}
What is the most Pythonic way to use (the elements of) keys as the keys to colors? I'm currently doing the following but want to know if there's a better way of using Python than this.
for key in keys:
colors[key] = []
EDIT: The question originally asked for "the most Pythonic way to use keys as the keys to colors", but the subsequent code snippet indicates that what's actually required is a way to use its elements.
You can use a dict comprehension:
In [1]: keys = ['Orange', 'Blue', 'Green']
In [2]: colors={key: [] for key in keys}
In [3]: colors
Out[3]: {'Blue': [], 'Green': [], 'Orange': []}
For Python 2.6 and earlier, which lack dict comprehensions:
In [4]: colors = dict((key, []) for key in keys)
In [5]: colors
Out[5]: {'Blue': [], 'Green': [], 'Orange': []}
If order matters, use a tuple.
In [113]: keys = ['Orange','Blue','Green']
In [114]: colors = {}
In [115]: colors[tuple(keys)] = 0
In [116]: colors
Out[116]: {('Orange', 'Blue', 'Green'): 0}
If order does not matter, use a frozenset. A frozenset essentially gives you a hashable set, which can't be modified and yet has all the advantages of a set (O(1) lookup, etc)
In [117]: colors = {}
In [118]: colors[frozenset(keys)] = 0
In [119]: colors
Out[119]: {frozenset(['Orange', 'Blue', 'Green']): 0}
If you want each element in keys to be a key in colors:
In [120]: colors = {k:[] for k in keys}
In [121]: colors
Out[121]: {'Blue': [], 'Green': [], 'Orange': []}
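To illustrate the tuple versus frozenset distinction made in this answer, here is a small sketch; the reordered list is made up for the example. A tuple key only matches when the elements appear in the same order, while a frozenset key matches any ordering of the same elements.

```python
keys = ['Orange', 'Blue', 'Green']

by_tuple = {tuple(keys): 0}
by_set = {frozenset(keys): 0}

reordered = ['Green', 'Orange', 'Blue']
print(tuple(reordered) in by_tuple)    # False: tuples compare in order
print(frozenset(reordered) in by_set)  # True: frozensets compare by membership
```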
Dict comprehensions, as mentioned above, are the way to go. In the interest of showing alternatives, you could also combine a defaultdict with fromkeys, but note that fromkeys stores its second argument as-is, so passing list makes every value the list type itself rather than a new empty list:
In [1]: from collections import defaultdict
In [2]: keys = ['Orange','Blue','Green']
In [3]: colors = defaultdict.fromkeys(keys, list)
In [4]: colors['Orange']
Out[4]: <type 'list'>
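The output above shows the list type being stored rather than an empty list. A minimal sketch of the difference between fromkeys and a defaultdict constructed with list as its factory follows; the appended 'tangerine' value is only a made-up example.

```python
from collections import defaultdict

keys = ['Orange', 'Blue', 'Green']

# fromkeys stores the second argument itself as every value.
from_keys = defaultdict.fromkeys(keys, list)
print(from_keys['Orange'] is list)  # True: the class, not an empty list
print(from_keys.default_factory)    # None

# Passing list as the factory creates a fresh list on first access.
colors = defaultdict(list)
for key in keys:
    colors[key]  # touching the key is enough to create its empty list
colors['Orange'].append('tangerine')
print(dict(colors))  # {'Orange': ['tangerine'], 'Blue': [], 'Green': []}
```

With defaultdict(list) the loop over keys is optional: a key that is never pre-created still gets its own empty list the first time it is accessed.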
