Make a dictionary from two lists after applying a function - python

I have two lists of tuples
keys = [(0,1), (2,1)]
values = [('a','b'), ('c','d')]
I want to build a dictionary whose keys come from applying a function f1 to the two elements of each tuple in keys:
dict.keys[i] = f1(keys[i][0], keys[i][1])
The values of the dictionary should be tuples, with a function f2 applied to each element:
dict.values[i] = (f2(values[i][0]), f2(values[i][1]))
What is the most efficient way of doing that in one pass in a pythonic way?

You can do this using a dictionary comprehension.
out = {f1(keys[i][0], keys[i][1]):(f2(values[i][0]),f2(values[i][1])) for i in range(len(keys))}
If you want to avoid using range you can also use zip to accomplish the same thing:
out = {f1(M[0],M[1]):(f2(N[0]), f2(N[1])) for M,N in zip(keys, values)}
Or, more briefly still:
out = {f1(*M):tuple(map(f2, N)) for M,N in zip(keys, values)}
If you're on Python 2.6 or earlier (prior to dictionary comprehensions), you can always use a list comprehension and convert to a dictionary explicitly.
out = dict([(f1(*M), tuple(map(f2, N))) for M,N in zip(keys, values)])
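To see the zip-based version run end to end, here is a minimal sketch. The question never defines f1 and f2, so the two functions below are placeholders picked purely for illustration.

```python
# Hypothetical stand-ins for f1 and f2; the question does not define them.
def f1(a, b):
    return a + b


def f2(s):
    return s.upper()


keys = [(0, 1), (2, 1)]
values = [('a', 'b'), ('c', 'd')]

# One pass over both lists: unpack each key tuple into f1, map f2 over each value tuple.
out = {f1(*k): tuple(map(f2, v)) for k, v in zip(keys, values)}
print(out)  # {1: ('A', 'B'), 3: ('C', 'D')}
```

Keep in mind that if f1 returns the same result for two different key tuples, the later entry silently overwrites the earlier one.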

Python dictionary comprehension to group together equal keys

I have a code snippet that groups together equal keys from a list of dicts and adds each dict with an equal ObjectID to a list under that key.
The code below works, but I am trying to convert it to a dictionary comprehension.
# group together subblocks if they have equal ObjectID
output = {}
subblkDBF : list[dict]
for row in subblkDBF:
    if row["OBJECTID"] not in output:
        output[row["OBJECTID"]] = []
    output[row["OBJECTID"]].append(row)
Using a comprehension is possible, but likely inefficient in this case, since you need to (a) check if a key is in the dictionary at every iteration, and (b) append to, rather than set the value. You can, however, eliminate some of the boilerplate using collections.defaultdict:
from collections import defaultdict
output = defaultdict(list)
for row in subblkDBF:
    output[row['OBJECTID']].append(row)
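Here is the defaultdict version as a self-contained sketch. The sample rows are made up for illustration; only the OBJECTID field matters.

```python
from collections import defaultdict

# Hypothetical sample rows; the real subblkDBF comes from the asker's data.
subblkDBF = [
    {"OBJECTID": 1, "name": "a"},
    {"OBJECTID": 2, "name": "b"},
    {"OBJECTID": 1, "name": "c"},
]

output = defaultdict(list)
for row in subblkDBF:
    # Missing keys are created on first access with an empty list.
    output[row["OBJECTID"]].append(row)

# Convert back to a plain dict so that later lookups of missing keys raise KeyError.
grouped = dict(output)
print([r["name"] for r in grouped[1]])  # ['a', 'c']
```

The final conversion to a plain dict is optional, but it stops an accidental lookup from quietly adding empty groups later on.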
The problem with using a comprehension is that if you really want a one-liner, you have to nest a list comprehension that traverses the entire list multiple times (once for each key):
{k: [d for d in subblkDBF if d['OBJECTID'] == k] for k in set(d['OBJECTID'] for d in subblkDBF)}
Iterating over subblkDBF in both the inner and outer loop leads to O(n^2) complexity, which is pointless, especially given how illegible the result is.
As the other answer shows, these problems go away if you're willing to sort the list first, or better yet, if it is already sorted.
If rows are sorted by Object ID (or all rows with equal Object ID are at least next to each other, no matter the overall order of those IDs) you could write a neat dict comprehension using itertools.groupby:
from itertools import groupby
from operator import itemgetter
output = {k: list(g) for k, g in groupby(subblkDBF, key=itemgetter("OBJECTID"))}
However, if this is not the case, you'd have to sort by the same key first, making this a lot less neat, and less efficient than above or the loop (O(nlogn) instead of O(n)).
key = itemgetter("OBJECTID")
output = {k: list(g) for k, g in groupby(sorted(subblkDBF, key=key), key=key)}
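To make the sorting requirement concrete, here is a small sketch with made-up rows where the equal IDs are not adjacent. Without sorting, groupby yields two separate groups for the same key, and the dict comprehension keeps only the last one.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows: the two rows with OBJECTID 1 are not next to each other.
rows = [
    {"OBJECTID": 1, "name": "a"},
    {"OBJECTID": 2, "name": "b"},
    {"OBJECTID": 1, "name": "c"},
]
key = itemgetter("OBJECTID")

# groupby only merges consecutive items, so key 1 shows up twice and is overwritten.
unsorted_result = {k: [r["name"] for r in g] for k, g in groupby(rows, key=key)}

# Sorting first puts equal keys side by side (sorted is stable, so row order is kept).
sorted_result = {
    k: [r["name"] for r in g]
    for k, g in groupby(sorted(rows, key=key), key=key)
}

print(unsorted_result)  # {1: ['c'], 2: ['b']}
print(sorted_result)    # {1: ['a', 'c'], 2: ['b']}
```

Note that the unsorted version fails silently: no error is raised, rows just go missing.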
You can add an else block to skip the redundant append for new keys and slightly improve performance:
output = {}
subblkDBF : list[dict]
for row in subblkDBF:
    if row["OBJECTID"] not in output:
        output[row["OBJECTID"]] = [row]
    else:
        output[row["OBJECTID"]].append(row)
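As a variation that none of the answers above use, dict.setdefault folds the membership check and the append into a single call. This is only a sketch with made-up rows, and I haven't measured whether it is faster than the if/else version.

```python
# Hypothetical sample rows, as in the question only OBJECTID matters.
rows = [
    {"OBJECTID": 1, "name": "a"},
    {"OBJECTID": 2, "name": "b"},
    {"OBJECTID": 1, "name": "c"},
]

output = {}
for row in rows:
    # setdefault returns the existing list, or stores and returns the new empty one.
    output.setdefault(row["OBJECTID"], []).append(row)

print([r["name"] for r in output[1]])  # ['a', 'c']
```

One trade-off: a fresh empty list is built on every iteration, even when the key already exists and the list is thrown away.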

How can I make a one line generator expression to generate these two different lists

I am writing a program which parses lists like this.
['ecl:gry', 'pid:860033327', 'eyr:2020', 'hcl:#fffffd', 'byr:1937', 'iyr:2017', 'cid:147', 'hgt:183cm']
I want to turn this list into a dictionary of the key value pairs which I have done here:
keys = []
values = []
for string in data:
    pair = string.split(':')
    keys.append(pair[0])
    values.append(pair[1])
zipped = zip(keys, values)
self.dic = dict(zipped)
print(self.dic)
I know that I can use list comprehension to make one of the lists at a time like this
keys = [s.split(':')[0] for s in data]
values = [s.split(':')[1] for s in data]
This requires two loops so the first code example would be better, but is there a way to generate both lists using one generator with unpacking and then zip the two together?
l = ['ecl:gry', 'pid:860033327', 'eyr:2020', 'hcl:#fffffd',
'byr:1937', 'iyr:2017', 'cid:147', 'hgt:183cm']
dict(e.split(':') for e in l)
I did it like this:
self.dic = {}
for e in data:
    k, v = e.split(':')
    self.dic[k] = v
You can do it easily with a dict comprehension:
your_dict = {x.split(':')[0]: x.split(':')[1] for x in data}
You can also avoid calling split twice by passing a generator expression straight to dict:
your_dict = dict(x.split(':') for x in data)
Which seems even cleaner...
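One caveat with the generator version: dict() needs exactly two items per pair, so a value that itself contains a colon would make it fail. A minimal sketch of a guard, assuming the first colon always separates key from value (the 'url' entry is made up to show the case):

```python
# The last entry is hypothetical; it is not in the question's data.
data = ['ecl:gry', 'pid:860033327', 'url:http://example.com']

# maxsplit=1 splits on the first colon only, so every entry yields exactly two pieces.
parsed = dict(entry.split(':', 1) for entry in data)

print(parsed['url'])  # http://example.com
```

For the data shown in the question a plain split(':') is enough, since each entry has a single colon.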

List comprehension for dict in dict

I have the following python dictionary (a dict in a dict):
d = {'k1': {'kk1':'v1','kk2':'v2','kk3':'v3'},'k2':{'kk1':'v4'}}
I can't figure out the list comprehension to get a list of all the values (v1, v2...). If you can also give me an example with a lambda, that would be nice.
The goal is to have values_lst = ['v1','v2','v3','v4']
Thanks
Combine two loops to "flatten" the dict of dicts. Loop over the values of d first, and then loop over the values of the values of d. The syntax might be a bit hard to grasp at first:
values_lst = [v for x in d.values() for v in x.values()]
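If the order of the two for clauses is confusing, it may help to see the comprehension unrolled into ordinary nested loops. Since the question also asks for a lambda, the sketch below adds one possible version using itertools.chain; that part is my own choice, not something from the answer.

```python
from itertools import chain

d = {'k1': {'kk1': 'v1', 'kk2': 'v2', 'kk3': 'v3'}, 'k2': {'kk1': 'v4'}}

# The comprehension's for clauses read left to right, like these nested loops.
values_lst = []
for inner in d.values():
    for v in inner.values():
        values_lst.append(v)

# Lambda variant: map each inner dict to its values, then chain them together.
via_chain = list(chain.from_iterable(map(lambda inner: inner.values(), d.values())))

print(values_lst)  # ['v1', 'v2', 'v3', 'v4']
print(via_chain)   # ['v1', 'v2', 'v3', 'v4']
```

The output order relies on dicts keeping insertion order, which is guaranteed from Python 3.7 on.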

Need to pull multiple values from python list comprehension?

I have a dictionary with tuple keys, like so:
(1111, 3454): 34.55555
(1123, 4665): 67.12
(1111, 9797): 5.09
I need to do a list comprehension that grabs the values for all entries with a matching first element.
Problem is, I ALSO need that second value of the tuple...
interimlist = [v for k,v in mydict.items() if k[0]==item[0]]
Is what I've got right now for pulling the values if the first element of the tuple is correct (item is an iterator variable). I'd like the output to be a list of tuples of (value, second tuple number), so with the example points, would be the following output if item[0] is 1111:
[(34.55555, 3454), (5.09, 9797)]
The dict is not stored with a good structure here. All the keys and values must be iterated in order to do one lookup, so retrieval is O(n).
You should do a once-off re-keying of the data, adding another level of nesting in the dict:
>>> d
{(1111, 3454): 34.55555, (1123, 4665): 67.12, (1111, 9797): 5.09}
>>> d_new = {}
>>> for (k1, k2), v in d.items():
...     if k1 not in d_new:
...         d_new[k1] = {}
...     d_new[k1][k2] = v
And now, O(1) lookups are restored:
>>> d_new[1111]
{3454: 34.55555, 9797: 5.09}
>>> [item[::-1] for item in d_new[1111].items()]
[(34.55555, 3454), (5.09, 9797)]
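For comparison, here is a sketch that puts both approaches side by side: the direct comprehension the question asks for, and the re-keyed lookup. The name target stands in for item[0] from the question, and using defaultdict for the re-keying is my own shortcut.

```python
from collections import defaultdict

mydict = {(1111, 3454): 34.55555, (1123, 4665): 67.12, (1111, 9797): 5.09}
target = 1111  # stands in for item[0] in the question

# Direct answer: unpack the tuple key in the loop; scans every entry, so O(n) per lookup.
interimlist = [(v, k2) for (k1, k2), v in mydict.items() if k1 == target]

# One-off re-keying into a dict of dicts; afterwards a lookup only touches matching entries.
d_new = defaultdict(dict)
for (k1, k2), v in mydict.items():
    d_new[k1][k2] = v
via_nested = [(v, k2) for k2, v in d_new[target].items()]

print(interimlist)  # [(34.55555, 3454), (5.09, 9797)]
print(via_nested)   # [(34.55555, 3454), (5.09, 9797)]
```

If you only ever do a single lookup, the direct comprehension is fine; the re-keying pays off once you query many different first elements.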

Python: Conversion from dictionary to array

I have a Python dictionary (say D) where every key corresponds to some predefined list. I want to create an array with two columns where the first column corresponds to the keys of the dictionary D and the second column corresponds to the sum of the elements in the corresponding lists. As an example, if,
D = {1: [5,55], 2: [25,512], 3: [2, 18]}
Then, the array that I wish to create should be,
A = array( [[1,60], [2,537], [3, 20]] )
I have given a small example here, but I would like to know of a way where the implementation is the fastest. Presently, I am using the following method:
A_List = map( lambda x: [x,sum(D[x])] , D.keys() )
I realize that the output from my method is in the form of a list. I can convert it into an array in another step, but I don't know if that will be a fast method (I presume that the use of arrays will be faster than the use of lists). I will really appreciate an answer where I can know what's the fastest way of achieving this aim.
You can use a list comprehension to create the desired output:
>>> [(k, sum(v)) for k, v in D.items()] # Py2 use D.iteritems()
[(1, 60), (2, 537), (3, 20)]
On my computer, this runs about 50% quicker than the map(lambda:.., D) version.
Note: On Python 3, map returns a lazy iterator rather than a list, so you need list(map(...)) to measure the real time it takes.
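A short sketch of why that matters when timing on Python 3: creating the map object does no work yet, and it can only be consumed once.

```python
D = {1: [5, 55], 2: [25, 512], 3: [2, 18]}

# No sums are computed here; map just builds a lazy iterator.
lazy = map(lambda k: [k, sum(D[k])], D)
print(type(lazy).__name__)  # map

# The work happens when the iterator is consumed.
rows = list(lazy)
print(rows)  # [[1, 60], [2, 537], [3, 20]]

# A second pass yields nothing because the iterator is exhausted.
leftover = list(lazy)
print(leftover)  # []
```

So a benchmark that times only the map(...) call on Python 3 measures object creation, not the summing.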
I hope this helps.
Build a list of the keys of D:
first_column = list(D.keys())
Build a list with the sum of the values under each key:
second_column = [sum(D[key]) for key in D.keys()]
Pair the two columns up row by row:
your_array = list(zip(first_column,second_column))
You can try this also:
a=[]
for i in D.keys():
    a+=[[i,sum(D[i])]]
