Python minimum value in dictionary of lists - python

Sorry about the question repost...I should have just edited this question in the first place. Flagged the new one for the mods. Sorry for the trouble
Had to re-write the question due to changed requirements.
I have a dictionary such as the following:
d = {'a': [4, 2], 'b': [3, 4], 'c': [4, 3], 'd': [4, 3], 'e': [4], 'f': [4], 'g': [4]}
I want to get the keys that are associated with the smallest length in the dictionary d, as well as those that have the maximum value.
In this case, the keys with the smallest length (smallest length of lists in this dictionary) should return
'e', 'f', 'g'
And those with the greatest value (the sum of the integers in each list) should return
'b' 'c'
I have tried
min_value = min(dict.itervalues())
min_keys = [k for k in d if dict[k] == min_value]
But that does not give me the result I want.
Any ideas?
Thanks!

Your problem is that your lists contain strings ('2'), and not integers (2). Leave out the quotes, or use the following:
min_value = min(min(map(int, v)) for v in d.values())
min_keys = [k for k, v in d.items() if min_value in map(int, v)]
Similarly, to calculate the keys with the max length:
max_length = max(map(len, d.values()))
maxlen_keys = [k for k,v in d.items() if max_length == len(v)]
Also, it's a bad idea to use dict as a variable name, as doing so shadows the built-in dict.

You can use min() with a key= argument, and specify a key function that compares the way you want.
d = {'a': ['1'], 'b': ['1', '2'], 'c': ['2'], 'd':['1']}
min_value = min(d.values())
min_list = [key for key, value in d.items() if value == min_value]
max_len = len(max(d.values(), key=len))
long_list = [key for key, value in d.items() if len(value) == max_len]
print(min_list)
print(long_list)
Notes:
0) Don't use dict as a variable name; that's the name of the class for dictionary, and if you use it as a variable name you "shadow" it. I just used d for the name here.
1) min_value was easy; no need to use a key= function.
2) max_len uses a key= function, len(), to find the longest value.

How about using sorting and lambdas?
#!/usr/bin/env python
d = {'a': ['1'], 'b': ['1', '2'], 'c': ['8', '1'], 'd':['1'], 'e':['1', '2', '3'], 'f': [4, 1]}
sorted_by_sum_d = sorted(d, key=lambda key: sum(int(item) for item in d[key]))
sorted_by_length_d = sorted(d, key=lambda key: len(d[key]))
print("Sorted by sum of the items in the list : %s" % sorted_by_sum_d)
print("Sorted by length of the items in the list : %s" % sorted_by_length_d)
This would output (the order of keys that tie on sum or length depends on the dictionary's iteration order):
Sorted by sum of the items in the list : ['a', 'd', 'b', 'f', 'e', 'c']
Sorted by length of the items in the list : ['a', 'd', 'c', 'b', 'f', 'e']
Be aware I changed the initial 'd' dictionary (just to make sure it was working)
Then, if you want the item with the biggest sum, you get the last element of the sorted_by_sum_d list.
(I'm not too sure this is what you want, though)
Edit:
If you can ensure that the lists are always going to be lists of integers (or numeric types, for that matter, such as long, float...), there's no need to cast strings to integers. The calculation of the sorted_by_sum_d variable can be done simply using:
d = {'a': [1], 'b': [1, 2], 'c': [8, 1], 'd':[1], 'e':[1, 2, 3], 'f': [4, 1]}
sorted_by_sum_d = sorted(d, key=lambda key: sum(d[key]))

I've found such a simple solution:
min_len = len(min(d.values(), key=len)) # 1
min_keys = [key for key in d if len(d[key]) == min_len] # ['e', 'f', 'g']
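Combining the ideas from the answers above for the dictionary in the question, a minimal sketch could look like the following. It assumes the lists hold integers, and the variable names are only illustrative. Note that with this data 'd' ties with 'b' and 'c' for the largest sum.

```python
d = {'a': [4, 2], 'b': [3, 4], 'c': [4, 3], 'd': [4, 3],
     'e': [4], 'f': [4], 'g': [4]}

# Shortest list length, then every key whose list has that length.
min_len = min(len(v) for v in d.values())
min_len_keys = [k for k, v in d.items() if len(v) == min_len]

# Largest sum, then every key whose list adds up to it.
max_sum = max(sum(v) for v in d.values())
max_sum_keys = [k for k, v in d.items() if sum(v) == max_sum]

print(min_len_keys)  # ['e', 'f', 'g']
print(max_sum_keys)  # ['b', 'c', 'd']
```

Computing the extreme value first and filtering afterwards keeps every tied key, which a single call to min() or max() on the keys would not.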

Related

Filter values of a dictionary of lists, and return key

Take this for example:
d = {1: ['a', 'b', 'c'], 2: ['d', 'e', 'f'], 3: [1, 'i', 'j']}
I want to check if a value x exists inside of any of the lists in the dictionary, if it does, return the key of the list it's in.
So checking if 1 was in any of the lists in d, would return 3 (the key).
I know how to do this in the case that the dictionary values are not an iterable, but I'm having trouble figuring out how to do it when it is an iterable.
You can use list comprehension.
d = {1: ['a', 'b', 'c'], 2: ['d', 'e', 'f'], 3: [1, 'i', 'j']}
def ret_key_base_val(dct, val):
    return [k for k,v in dct.items() if val in v]
result = ret_key_base_val(d, 1)
print(result)
# [3]
Using a list comprehension:
filtered_d = [k for k,v in d.items() if 1 in v]
for k, v in d.items() is your friend here: it gives you easy access to both keys and values inside the loop or comprehension.
If v can be of several types (which makes comprehensions tough to read/follow), check type(v) with some if statements instead.
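For the mixed-type case mentioned above, one possible sketch is shown below. It uses isinstance() with collections.abc.Iterable rather than type(v), and treats strings as single values. The helper name keys_containing and the sample dictionary are made up for illustration.

```python
from collections.abc import Iterable

def keys_containing(dct, target):
    """Return keys whose value equals target or contains it (hypothetical helper)."""
    found = []
    for key, value in dct.items():
        # Strings are iterable too, but are treated as single values here.
        if isinstance(value, Iterable) and not isinstance(value, (str, bytes)):
            if target in value:
                found.append(key)
        elif value == target:
            found.append(key)
    return found

mixed = {1: ['a', 'b', 'c'], 2: 'd', 3: [1, 'i', 'j'], 4: 1}
print(keys_containing(mixed, 1))    # [3, 4]
print(keys_containing(mixed, 'd'))  # [2]
```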

How to merge keys of dictionary which have the same value?

I need to combine two dictionaries by their value, resulting in a new key which is the list of keys with the shared value. All I can find online is how to add two values with the same key or how to simply combine two dictionaries, so perhaps I am just searching in the wrong places.
To give an idea:
dic1 = {'A': 'B', 'C': 'D'}
dic2 = {'D': 'B', 'E': 'F'}
Should result in:
dic3 = {['A', 'D']: 'B', 'C': 'D', 'E': 'F'}
I am not sure why you would need such a data structure, you can probably find a better solution to your problem. However, just for the sake of answering your question, here is a possible solution:
dic1 = {'A':'B', 'C':'D'}
dic2 = {'D':'B', 'E':'F'}
key_list = list(dic2.keys())
val_list = list(dic2.values())
r = {}
for k,v in dic1.items():
    if v in val_list:
        i = val_list.index(v) #get index at value
        k2 = key_list[i] #use index to retrieve the key at value
        r[(k, k2)] = v #make the dict entry
    else:
        r[k] = v
val_list = list(r.values()) #get all the values already processed
for k,v in dic2.items():
    if v not in val_list: #if missing value
        r[k] = v #add new entry
print(r)
output:
{('A', 'D'): 'B', 'C': 'D', 'E': 'F'}
You can't use a list as a key in a Python dictionary, since keys must be hashable and a list is not a hashable object, so I have used a tuple instead.
I would use a defaultdict of lists and build a reversed dict and in the end reverse it while converting the lists to tuples (because lists are not hashable and can't be used as dict keys):
from collections import defaultdict
dic1 = {'A':'B', 'C':'D'}
dic2 = {'D':'B', 'E':'F'}
temp = defaultdict(list)
for d in (dic1, dic2):
    for key, value in d.items():
        temp[value].append(key)
print(temp)
res = {}
for key, value in temp.items():
    if len(value) == 1:
        res[value[0]] = key
    else:
        res[tuple(value)] = key
print(res)
The printout from this (showing the middle step of temp) is:
defaultdict(<class 'list'>, {'B': ['A', 'D'], 'D': ['C'], 'F': ['E']})
{('A', 'D'): 'B', 'C': 'D', 'E': 'F'}
If you are willing to accept 1-element tuples as keys, the second part becomes much simpler:
res = {tuple(value): key for key, value in temp.items()}
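The defaultdict approach can be wrapped in a small reusable function. The sketch below is one way to do that, not the only one. The name merge_by_value is hypothetical, and it assumes all values are hashable so they can serve as keys of the intermediate dictionary.

```python
from collections import defaultdict

def merge_by_value(*dicts):
    """Group keys that share a value (hypothetical helper name)."""
    keys_for = defaultdict(list)
    for dct in dicts:
        for key, value in dct.items():
            if key not in keys_for[value]:  # avoid listing a key twice
                keys_for[value].append(key)
    # Single keys stay as they are; shared values get a tuple of keys.
    return {
        keys[0] if len(keys) == 1 else tuple(keys): value
        for value, keys in keys_for.items()
    }

merged = merge_by_value({'A': 'B', 'C': 'D'}, {'D': 'B', 'E': 'F'})
print(merged)  # {('A', 'D'): 'B', 'C': 'D', 'E': 'F'}
```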

Merge two lists to make dictionary with same key Python

I would like to know how to change this code to NOT using the function zip. I haven’t been taught this function yet and so I want to know if there is an alternative way to retrieve the output I require?
list_one = ['a', 'a', 'c', 'd']
list_two = [1, 2, 3, 4]
dict_1 = {}
for key, value in zip(list_one, list_two):
    if key not in dict_1:
        dict_1[key] = [value]
    else:
        dict_1[key].append(value)
print(dict_1)
I would like the output to be:
{'a': [1, 2], 'd': [4], 'c': [3]}
A simple way to do this:
l1 = ['a', 'a', 'c', 'd']
l2 = [1, 2, 3, 4]
# Dict comprehension to initialize keys: list pairs
dct = {x: [] for x in l1}
# Append value related to key
for i in range(len(l1)):
    dct[l1[i]].append(l2[i])
print(dct)
Output:
{'a': [1, 2], 'c': [3], 'd': [4]}
zip() makes it easy to iterate through multiple lists in parallel. Once you understand what zip() does, it is easy to replicate; the official docs have an explanation and an example.
Below is an example implementation:
list_one = ['a', 'a', 'c', 'd']
list_two = [1,2,3,4]
dict_1={}
for index in range(len(list_one)):
    if list_one[index] not in dict_1:
        dict_1[list_one[index]]=[list_two[index]]
    else:
        dict_1[list_one[index]].append(list_two[index])
print(dict_1)
Output:
{'a': [1, 2], 'c': [3], 'd': [4]}
You can rewrite like this without any library.
list_one = ['a', 'a', 'c', 'd']
list_two = [1,2,3,4]
dict_1={}
if len(list_one) == len(list_two):
    for i in range(len(list_one)):
        if list_one[i] not in dict_1:
            dict_1[list_one[i]]=[list_two[i]]
        else:
            dict_1[list_one[i]].append(list_two[i])
print(dict_1)
Assuming the two lists always have the same length, you can circumvent the use of zip() by iterating over the indices:
dict_1 = {}
for i in range(len(list_one)):
    key = list_one[i]
    value = list_two[i]
    if key not in dict_1:
        dict_1[key] = [value]
    else:
        dict_1[key].append(value)
print(dict_1)
As others have said, this is not a recommended way to do this, because the code using zip() is more readable, and zip() is a built-in function, so there shouldn't be any reason not to use it.
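If the if/else branch feels repetitive, dict.setdefault() can fold it into one line while still avoiding zip(). This is a sketch under the same assumption as above, namely that list_two is at least as long as list_one.

```python
list_one = ['a', 'a', 'c', 'd']
list_two = [1, 2, 3, 4]

dict_1 = {}
for index, key in enumerate(list_one):
    # setdefault returns the existing list for key, or stores and returns a new one.
    dict_1.setdefault(key, []).append(list_two[index])

print(dict_1)  # {'a': [1, 2], 'c': [3], 'd': [4]}
```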

Check if repeating Key or Value exists in Python Dictionary

The following is my dictionary and I need to check if I have repeated key or Value
dict = {' 1': 'a', '2': 'b', '3': 'b', '4': 'c', '5': 'd', '5': 'e'}
This should return false or some kind of indicator which helps me print out that key or value might be repeated. It would be much appreciated if I am able to identify if a key is repeated or a Value (but not required).
Dictionaries can't have duplicate keys, so in case of repeated keys it only keeps the last value, so check values (one-liner is your friend):
print('There are duplicates' if len(set(dict.values())) != len(dict) else 'No duplicates')
Well in a dictionary keys can't repeat so we only have to deal with values.
dict = {...}
# get the values
values = list(dict.values())
And then you can use a set() to check for duplicates:
if len(values) == len(set(values)):
    print("no duplicates")
else:
    print("duplicates")
It's not possible to check if a key repeats in a dictionary, because dictionaries in Python only support unique keys. If you enter the dictionary as is, only the last value will be associated with the redundant key:
In [4]: dict = {' 1': 'a', '2': 'b', '3': 'b', '4': 'c', '5': 'd', '5': 'e'}
In [5]: dict
Out[5]: {' 1': 'a', '2': 'b', '3': 'b', '4': 'c', '5': 'e'}
A one-liner to find repeating values
In [138]: {v: [k for k in d if d[k] == v] for v in set(d.values())}
Out[138]: {'a': [' 1'], 'b': ['2', '3'], 'c': ['4'], 'e': ['5']}
Check all the unique values of the dict with set(d.values()) and then creating a list of keys that correspond to those values.
Note: repeating keys will just be overwritten
In [139]: {'a': 1, 'a': 2}
Out[139]: {'a': 2}
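If only the repeated values themselves are needed, collections.Counter is another option. The sketch below reuses the dictionary from the question as Python actually stores it, with the duplicate '5' key already collapsed.

```python
from collections import Counter

d = {' 1': 'a', '2': 'b', '3': 'b', '4': 'c', '5': 'e'}

# Count how often each value occurs, then keep the ones seen more than once.
counts = Counter(d.values())
repeated_values = [value for value, n in counts.items() if n > 1]
has_duplicates = bool(repeated_values)

print(repeated_values)  # ['b']
print(has_duplicates)   # True
```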
What about
has_dupes = len(d) != len(set(d.values()))
I'm on my phone so I can't test it, but I think it will work.
Well, although keys should be unique according to the documentation, there are still situations where a repeated key can appear.
For example,
>>> import json
>>> a = {1:10, "1":20}
>>> b = json.dumps(a)
>>> b
'{"1": 20, "1": 10}'
>>> c = json.loads(b)
>>> c
{u'1': 10}
>>>
But in general, when Python finds a conflict, it takes the latest value assigned to that key.
For your question, you should use comparison such as
len(dict) == len(set(dict.values()))
because set in python contains an unordered collection of unique and immutable objects, it could automatically get all unique values even when you have duplicate values in dict.values()
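When the data arrives as JSON text, as in the example above, repeated keys can be caught while parsing by passing an object_pairs_hook to json.loads(). The sketch below is one way to do that; reject_duplicate_keys is a hypothetical helper name. It does not help with a dictionary literal in source code, where the duplicate is already gone by the time the dictionary exists.

```python
import json

def reject_duplicate_keys(pairs):
    """object_pairs_hook that raises on a repeated key (hypothetical helper)."""
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError("duplicate key: %r" % (key,))
        result[key] = value
    return result

text = '{"1": 20, "1": 10}'
print(json.loads(text))  # {'1': 10}  (the later pair wins silently)

try:
    json.loads(text, object_pairs_hook=reject_duplicate_keys)
except ValueError as exc:
    message = str(exc)
    print(message)  # duplicate key: '1'
```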

Pythonic way to create a dictionary from a list where the keys are the elements that are found in another list and values are elements between keys

Considering that I have two lists like:
l1 = ['a', 'c', 'b', 'e', 'f', 'd']
l2 = [
'x','q','we','da','po',
'a', 'el1', 'el2', 'el3', 'el4',
'b', 'some_other_el_1', 'some_other_el_2',
'c', 'another_element_1', 'another_element_2',
'd', '', '', 'another_element_3', 'd4'
]
and I need to create a dictionary where the keys are those element from second list that are found in the first and values are lists of elements found between "keys" like:
result = {
'a': ['el1', 'el2', 'el3', 'el4'],
'b': ['some_other_el_1', 'some_other_el_2'],
'c': ['another_element_1', 'another_element_2'],
'd': ['', '', 'another_element_3', 'd4']
}
What's a more pythonic way to do this?
Currently I'm doing this :
# I'm not sure that the first element in the second list
# will also be in the first so I have to create a key
d = {}
k = ''
d[k] = []
for x in l2:
    if x in l1:
        k = x
        d[k] = []
    else:
        d[k].append(x)
But I'm quite positive that this is not the best way to do it and it also doesn't look nice :)
Edit:
I also have to mention that neither list is necessarily ordered, nor must the second list start with an element from the first one.
I don't think you'll do much better if this is the most specific statement of the problem. I mean I'd do it this way, but it's not much better.
import collections
d = collections.defaultdict(list)
s = set(l1)
k = ''
for x in l2:
    if x in s:
        k = x
    else:
        d[k].append(x)
For fun, you can also do this with itertools and 3rd party numpy:
import numpy as np
from itertools import zip_longest, islice
arr = np.where(np.in1d(l2, l1))[0]
res = {l2[i]: l2[i+1: j] for i, j in zip_longest(arr, islice(arr, 1, None))}
print(res)
{'a': ['el1', 'el2', 'el3', 'el4'],
'b': ['some_other_el_1', 'some_other_el_2'],
'c': ['another_element_1', 'another_element_2'],
'd': ['', '', 'another_element_3', 'd4']}
Here is a version using itertools.groupby. It may or may not be more efficient than the plain version from your post, depending on how groupby is implemented, because the for loop has fewer iterations.
from itertools import groupby
from collections import defaultdict, deque
def group_by_keys(keys, values):
    """
    >>> sorted(group_by_keys('abcdef', [
    ...     1, 2, 3,
    ...     'b', 4, 5,
    ...     'd',
    ...     'a', 6, 7,
    ...     'c', 8, 9,
    ...     'a', 10, 11, 12
    ... ]).items())
    [('a', [6, 7, 10, 11, 12]), ('b', [4, 5]), ('c', [8, 9])]
    """
    keys = set(keys)
    result = defaultdict(list)
    current_key = None
    for is_key, items in groupby(values, key=lambda x: x in keys):
        if is_key:
            current_key = deque(items, maxlen=1).pop() # last of items
        elif current_key is not None:
            result[current_key].extend(items)
    return result
This doesn't distinguish between keys that don't occur in values at all (like e and f), and keys for which there are no corresponding values (like d). If this information is needed, one of the other solutions might be better suited.
Updated ... Again
I misinterpreted the question. If you are using large lists then list comprehensions are the way to go and they are fairly simple once you learn how to use them.
I am going to use two list comprehensions.
idxs = [i for i, val in enumerate(l2) if val in l1] + [len(l2)+1]
res = {l2[idxs[i]]: list(l2[idxs[i]+1: idxs[i+1]]) for i in range(len(idxs)-1)}
print(res)
Results:
{'a': ['el1', 'el2', 'el3', 'el4'],
'b': ['some_other_el_1', 'some_other_el_2'],
'c': ['another_element_1', 'another_element_2'],
'd': ['', '', 'another_element_3', 'd4']}
Speed Testing for large lists:
import collections
l1 = ['a', 'c', 'b', 'e', 'f', 'd']
l2 = [
'x','q','we','da','po',
'a', 'el1', 'el2', 'el3', 'el4', *(str(i) for i in range(300)),
'b', 'some_other_el_1', 'some_other_el_2', *(str(i) for i in range(100)),
'c', 'another_element_1', 'another_element_2', *(str(i) for i in range(200)),
'd', '', '', 'another_element_3', 'd4'
]
def run_comp():
    idxs = [i for i, val in enumerate(l2) if val in l1] + [len(l2)+1]
    res = {l2[idxs[i]]: list(l2[idxs[i]+1: idxs[i+1]]) for i in range(len(idxs)-1)}
def run_other():
    d = collections.defaultdict(list)
    k = ''
    for x in l2:
        if x in l1:
            k = x
        else:
            d[k].append(x)
import timeit
print('For Loop:', timeit.timeit(run_other, number=1000))
print("List Comprehension:", timeit.timeit(run_comp, number=1000))
Results:
For Loop: 0.1327093063242541
List Comprehension: 0.09343156142774986
old stuff below
This is rather simple with list comprehensions.
{key: [val for val in l2 if key in val] for key in l1}
Results:
{'a': ['a', 'a1', 'a2', 'a3', 'a4'],
'b': ['b', 'b1', 'b2', 'b3', 'b4'],
'c': ['c', 'c1', 'c2', 'c3', 'c4'],
'd': ['d', 'd1', 'd2', 'd3', 'd4'],
'e': [],
'f': []}
The code below shows what is happening above.
d = {}
for key in l1:
    d[key] = []
    for val in l2:
        if key in val:
            d[key].append(val)
The list comprehension / dictionary comprehension (first piece of code) is usually faster. A comprehension builds the list with a dedicated bytecode instruction, whereas the explicit loop has to look up and call append() on every iteration, which adds overhead that becomes noticeable for large lists.
References:
http://www.pythonforbeginners.com/basics/list-comprehensions-in-python
https://docs.python.org/3.6/tutorial/datastructures.html#list-comprehensions
You can use itertools.groupby:
import itertools
l1 = ['a', 'c', 'b', 'e', 'f', 'd']
l2 = ['x', 'q', 'we', 'da', 'po', 'a', 'el1', 'el2', 'el3', 'el4', 'b', 'some_other_el_1', 'some_other_el_2', 'c', 'another_element_1', 'another_element_2', 'd', '', '', 'another_element_3', 'd4']
groups = [[a, list(b)] for a, b in itertools.groupby(l2, key=lambda x:x in l1)]
final_dict = {groups[i][-1][-1]:groups[i+1][-1] for i in range(len(groups)-1) if groups[i][0]}
Output:
{'a': ['el1', 'el2', 'el3', 'el4'], 'b': ['some_other_el_1', 'some_other_el_2'], 'c': ['another_element_1', 'another_element_2'], 'd': ['', '', 'another_element_3', 'd4']}
Your code is readable, does the job and is reasonably efficient. There's no need to change much!
You could use more descriptive variable names and replace l1 with a set for faster lookup:
keys = {'a', 'c', 'b', 'e', 'f', 'd'}
keys_and_values = [
'x','q','we','da','po',
'a', 'el1', 'el2', 'el3', 'el4',
'b', 'some_other_el_1', 'some_other_el_2',
'c', 'another_element_1', 'another_element_2',
'd', '', '', 'another_element_3', 'd4'
]
current_key = None
result = {}
for x in keys_and_values:
    if x in keys:
        current_key = x
        result[current_key] = []
    elif current_key:
        result[current_key].append(x)
print(result)
# {'a': ['el1', 'el2', 'el3', 'el4'],
# 'c': ['another_element_1', 'another_element_2'],
# 'b': ['some_other_el_1', 'some_other_el_2'],
# 'd': ['', '', 'another_element_3', 'd4']}
def find_index():
    idxs = [l2.index(i) for i in set(l1).intersection(set(l2))]
    idxs.sort()
    idxs += [len(l2)+1]
    res = {l2[idxs[i]]: list(l2[idxs[i]+1: idxs[i+1]]) for i in range(len(idxs)-1)}
    return res
Comparison of methods, using justengel's test:
justengel
run_comp: .455
run_other: .244
mkrieger1
group_by_keys: .160
me
find_index: .068
Note that my method ignores keys that don't appear in l2, and doesn't handle cases where keys appear more than once in l2. Adding in empty lists for keys that don't appear in l2 can be done by {**res, **{key: [] for key in set(l1).difference(set(l2))}}, which raises the time to .105.
Even cleaner than turning l1 into a set, use the keys of the dictionary you're building. Like this
d = {x: [] for x in l1}
k = None
for x in l2:
    if x in d:
        k = x
    elif k is not None:
        d[k].append(x)
This is because (in the worst case) your code would be iterating over all the values in l1 for every value in l2 on the if x in l1: line, because checking if a value is in a list takes linear time. Checking if a value is in a dictionary's keys is constant time in the average case (same with sets, as already suggested by Eric Duminil).
I set k to None and check for it because your code would've returned d with '': ['x','q','we','da','po'], which is presumably not what you want. This assumes l1 can't contain None.
My solution also assumes it's okay for the resulting dictionary to contain keys with empty lists if there are items in l1 that never appear in l2. If that's not okay, you can remove them at the end with
final_d = {k: v for k, v in d.items() if v}
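The difference in lookup cost described above can be observed with a rough timeit sketch like the one below. The container sizes and the probe value are arbitrary assumptions, and the absolute numbers will vary by machine, so no timings are shown.

```python
import timeit

keys_list = [str(i) for i in range(1000)]
keys_set = set(keys_list)
keys_dict = dict.fromkeys(keys_list)

probe = '999'  # last element: worst case for the list scan

# All three containers agree on the answer; only the lookup cost differs.
assert (probe in keys_list) == (probe in keys_set) == (probe in keys_dict)

for name, container in [('list', keys_list), ('set', keys_set), ('dict', keys_dict)]:
    seconds = timeit.timeit(lambda: probe in container, number=10000)
    print(name, seconds)
```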
