How can Python perform the following data format conversion?

I would like to write a function that takes the following data 'd' and generates 'L'. How can I do this?
def func(**d):
    'do something'
    return [....]
source:
d = {'a': 1, 'b': 2, 'c': [3, 4, 5]}
or
d = {'a': 1, 'b': 2, 'c': [3, 4, 5], 'd': [6, 7]}
TO:
L=[{'a':1,'b':2,'c':3},
{'a':1,'b':2,'c':4},
{'a':1,'b':2,'c':5}]
or
L=[{'a': 1, 'b': 2, 'c': 3, 'd': 6},
{'a': 1, 'b': 2, 'c': 3, 'd': 7},
{'a': 1, 'b': 2, 'c': 4, 'd': 6},
{'a': 1, 'b': 2, 'c': 4, 'd': 7},
{'a': 1, 'b': 2, 'c': 5, 'd': 6},
{'a': 1, 'b': 2, 'c': 5, 'd': 7}]

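import copy

d = {'a': 1, 'b': 2, 'c': [3, 4, 5]}
temp_d = copy.deepcopy(d)  # real copy, so pop() does not empty the list in d
L = []
# One pass per key; this matches the list length only because d has 3 keys and 'c' has 3 elements
for _ in temp_d:
    item = {}
    for key in temp_d:
        if isinstance(temp_d[key], list):
            item[key] = temp_d[key].pop(0)  # take the next list element
        else:
            item[key] = temp_d[key]
    L.append(item)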
Basically, what I am doing here is:
I create a copy of the dictionary 'd' named 'temp_d'.
I go through every key in the 'temp_d' dictionary and, for each one, create an empty dictionary 'item'.
I loop again through all the keys in 'temp_d' and check whether the value of the current key is a list. If it is, I add the key to 'item' with the first value of the list, taken with pop(index) (this method removes an element from a list and returns it). If the value isn't a list, I just add the key to 'item' with that value.
After filling the dictionary 'item', I append it to 'L'.
Example in this case:
first key ('a'):
item = {}
first key of second loop ('a'):
is the value of 'a' a list?
no. adds the value.
new item{'a': 1}
second key of second loop ('b'):
is the value of 'b' a list?
no. adds the value.
new item{'a': 1, 'b': 2}
third key of second loop ('c'):
is the value of 'c' a list?
yes. adds the first element of the list, removing it from the list
(the list was [3, 4, 5], now is [4, 5])
new item{'a': 1, 'b': 2, 'c': 3}
appends the item to L
(the 'L' was [], now is [{'a': 1, 'b': 2, 'c': 3}])
etc until the end.

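This will work with Python 3.5 or later:
d = {'a': 1, 'b': 2, 'c': [3, 4, 5]}
def f(**d):
    return [{**d, 'c': i} for i in d.pop('c')]
L = f(**d)  # [{'a': 1, 'b': 2, 'c': 3}, {'a': 1, 'b': 2, 'c': 4}, {'a': 1, 'b': 2, 'c': 5}]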

Your problem can be solved as follows:
from itertools import cycle

def func(indict):
    dictlist = [dict(indict)]  # make copy to not change original dict
    for key in indict:  # loop keys to find lists
        if isinstance(indict[key], list):
            listlength = len(indict[key])
            dictlist = listlength * dictlist  # elements are not unique
            # cycle() pairs values by position, so combinations can repeat when two list lengths share a common factor
            for dictindex, listelement in zip(range(len(dictlist)), cycle(indict[key])):
                dictlist[dictindex] = dict(dictlist[dictindex])  # uniquify
                dictlist[dictindex][key] = listelement  # replace list by list element
    return dictlist
In the general case you can have multiple lists in your dict. My solution assumes you want to unroll all of these.
Looking at the details, the solution starts by adding a copy of your original dict to dictlist. It then loops over the keys, and whenever it finds a list, it multiplies dictlist by the length of that list. This ensures that dictlist contains the correct number of elements.
However, the elements will not be unique, as they will be references to the same underlying dicts.
To fix this, the elements of dictlist are "uniquified" by looping over the list and replacing every element with a copy of itself. In each copy, the list from the original indict is replaced by one of its elements, cycling through the list's elements across dictlist.
I know my explanation is a bit messy. I'm sorry about that, but I find it hard to explain in a short and simple way.
Also, the order of the elements in the list is not identical to what you ask for in the question. Before Python 3.7, the key-value pairs of a dict were not guaranteed to be ordered, so the order in which the lists are unrolled, and therefore the order of the resulting list, was not guaranteed either.
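As a possible alternative for the general case with several lists, here is a minimal sketch built on itertools.product, which enumerates every combination regardless of the list lengths. The function name unroll is hypothetical, and the sketch assumes that only values of type list should be expanded and that Python 3.7+ is used, so dict order is preserved.

```python
from itertools import product

def unroll(d):
    """Return one dict per combination of the list-valued entries of d."""
    keys = list(d)
    # Wrap scalar values in a one-element list so every value can be iterated.
    pools = [v if isinstance(v, list) else [v] for v in d.values()]
    # product() varies the last pool fastest, like nested for-loops.
    return [dict(zip(keys, combo)) for combo in product(*pools)]

result = unroll({'a': 1, 'b': 2, 'c': [3, 4, 5], 'd': [6, 7]})
for row in result:
    print(row)
```

With the second example from the question this yields six dicts, in the same order as the requested 'L'.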

Related

How to effectively transfer a two-dimensional list data to list of dict?

key_list = ['a', 'b']
value_list = [[1, 2], [3, 4],...]
res = [dict(zip(key_list, item)) for item in value_list]
print(res) # [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}, ...]
The code above gives me what I want (a list of dicts): [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}, ...]
but I think this may not be an efficient way to do it.
Is there a more efficient method for this conversion?
If you only need to iterate through res sequentially and it is built from huge lists, you could use a generator expression:
res = (dict(zip(key_list, item)) for item in value_list)
(Note the round parentheses.) This creates the dicts one by one as you request them. Keep in mind that this way you can iterate over res only once, and you cannot index it.
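To illustrate the single-pass behaviour described above, here is a small sketch. The two-row value_list is made-up sample data, not taken from the question.

```python
key_list = ['a', 'b']
value_list = [[1, 2], [3, 4]]  # made-up sample data

res = (dict(zip(key_list, item)) for item in value_list)

first_pass = list(res)   # consumes the generator
second_pass = list(res)  # the generator is already exhausted

print(first_pass)
print(second_pass)
```

The second pass comes back empty, so a generator only fits when a single sequential pass over the data is enough.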

Keys with maximum value in Python dictionary? [duplicate]

If I have a dictionary that maps keys to their corresponding frequency values:
numbers = {'a': 1, 'b': 4, 'c': 1, 'd': 3, 'e': 3}
To find the highest, what I know is:
mode = max(numbers, key=numbers.get)
print(mode)
and that prints:
b
But if I have:
numbers = {'a': 1, 'b': 0, 'c': 1, 'd': 3, 'e': 3}
and apply the 'max' function above, the output is:
d
What I need is:
d,e
Or something similar, displaying both keys.
numbers = {'a': 1, 'b': 0, 'c': 1, 'd': 3, 'e': 3}
max_value = max(numbers.values())
[k for k,v in numbers.items() if v == max_value]
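prints
['d', 'e']
(on Python versions before 3.7, where dict order is not guaranteed, the keys may come out in a different order, e.g. ['e', 'd']).
It loops over all entries via .items(), checks whether each value equals the maximum, and if so adds the key to a list.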
numbers = {'a': 1, 'b': 4, 'c': 1, 'd': 4, 'e': 3}
# max() returns the (key, value) tuple with the largest value in the dictionary
mx_tuple = max(numbers.items(), key=lambda x: x[1])
# mx_tuple[1] is the maximum value; collect every key that has it
max_list = [i[0] for i in numbers.items() if i[1] == mx_tuple[1]]
print(max_list)
This code runs in O(n): O(n) to find the maximum value and O(n) for the list comprehension, so overall it remains O(n).
Note: O(2n) is equivalent to O(n).
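If two passes over the dictionary are undesirable, the same result can be collected in one pass. This is only a sketch; the function name keys_with_max is hypothetical and not part of any answer above.

```python
def keys_with_max(mapping):
    """Return all keys whose value equals the maximum, in a single pass."""
    best = None
    keys = []
    for k, v in mapping.items():
        if best is None or v > best:
            # New maximum found: restart the list of keys.
            best, keys = v, [k]
        elif v == best:
            # Tie with the current maximum: keep this key too.
            keys.append(k)
    return keys

numbers = {'a': 1, 'b': 0, 'c': 1, 'd': 3, 'e': 3}
max_keys = keys_with_max(numbers)
print(max_keys)
```

It is still O(n), so the gain over the two-pass version is at most a constant factor.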
The collections.Counter object is useful for this as well. It gives you a .most_common() method, which returns the keys and counts ordered from most to least common; passing it the number of keys that share the maximum value returns just those:
from collections import Counter
numbers = Counter({'a': 1, 'b': 0, 'c': 1, 'd': 3, 'e': 3})
values = list(numbers.values())
max_value = max(values)
count = values.count(max_value)
numbers.most_common(n=count)
You can use the .items() method and sort by a (count, key) tuple, so that for equal counts the key decides:
d = ['a','b','c','b','c','d','c','d','e','d','b']
from collections import Counter
get_data = Counter(d)
# sort by count, then key
maxmax = sorted(get_data.items(), key=lambda a: (a[1], a[0]))
for elem in maxmax:
    if elem[1] == maxmax[-1][1]:  # the last entry holds the highest count
        print(elem)
Output:
('b', 3)
('c', 3)
('d', 3) # the last one is the one with the "highest" key
To get only the entry with the "highest" key, use maxmax[-1].

Computing the sum of all unique values in a numpy array containing rows of dicts

I have a large numpy array, with each row containing a dict of words, in a format similar to the one below:
data = [{'a': 1, 'c': 2}, {'ba': 3, 'a': 4}, ... ]
Could someone please point me in the right direction for computing, for each unique key, the sum of its values across the dicts in the rows of the numpy array? From the example above, I would hope to obtain something like this:
result = {'a': 5, 'c': 2, 'ba': 3, ...}
At the moment, the only way I can think of is to iterate through each row of the data and then through each key of the dict. If a new key is found, I add it to the new dict and set its value; if the key is already in 'result', I add its value to the existing entry. This seems like an inefficient way to do it, though.
You could use a Counter() and update it with each dictionary contained in data, in a loop:
from collections import Counter
data = [{'a': 1, 'c': 2}, {'ba': 3, 'a': 4}]
c = Counter()
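for d in data:
    c.update(d)
output:
Counter({'a': 5, 'ba': 3, 'c': 2})
Alternative one-liner (as proposed by @AntonVBR in the comments):
sum((Counter(dict(x)) for x in data), Counter())
Note that adding Counters drops keys whose total is zero or negative, whereas update() keeps them.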
A pure Python solution using for-loops:
data = [{'a': 1, 'c': 2}, {'ba': 3, 'a': 4}]
result = {}
for d in data:
    for k, v in d.items():
        if k in result:
            result[k] += v
        else:
            result[k] = v
output:
{'c': 2, 'a': 5, 'ba': 3}

How can I get the values that are common to two dictionaries, even if the keys are different?

Starting from two different dictionaries:
dict_a = {'a': 1, 'b': 3, 'c': 4, 'd': 4, 'e': 6}
dict_b = {'d': 1, 'e': 6, 'a': 3, 'v': 7}
How can I get the common values even if they have different keys? Considering the above dictionaries, I would like to have this output:
common = [1, 3, 6]
Create sets from the values:
list(set(dict_a.values()) & set(dict_b.values()))
This creates the intersection of the unique values of both dictionaries:
>>> dict_a = {'a': 1, 'b': 3, 'c': 4, 'd': 4, 'e': 6}
>>> dict_b = {'d': 1, 'e': 6, 'a': 3, 'v': 7}
>>> list(set(dict_a.values()) & set(dict_b.values()))
[1, 3, 6]
Unfortunately, we can't use dictionary views here (which can act like sets), because dictionary values are not required to be unique. Had you asked for just the keys, or the key-value pairs, the set() calls would not have been necessary.
Try this:
common = [item for item in dict_b.values() if item in dict_a.values()]
The intersection operator & requires two sets, but the method counterpart can work with any iterable, like dict.values(). So here is another version of Martijn Pieters' solution:
list(set(dict_a.values()).intersection(dict_b.values()))
My 2 cents :)
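The set-based and list-comprehension answers above can give different results when a value occurs more than once. The sketch below shows this; the extra key 'w' in dict_b is an assumption added only to create a repeated value, and Python 3.7+ dict ordering is assumed.

```python
dict_a = {'a': 1, 'b': 3, 'c': 4, 'd': 4, 'e': 6}
# 'w' is not in the original question; it is added so that the value 6 repeats.
dict_b = {'d': 1, 'e': 6, 'a': 3, 'v': 7, 'w': 6}

values_a = set(dict_a.values())

# Set intersection: each common value appears once; sorted for a stable order.
via_sets = sorted(values_a.intersection(dict_b.values()))

# List comprehension: keeps dict_b's order and its repeated values.
via_comprehension = [v for v in dict_b.values() if v in values_a]

print(via_sets)
print(via_comprehension)
```

Testing membership against the set values_a, rather than against dict_a.values() directly, also avoids rescanning all of dict_a's values for every item.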
