finding values in dictionary based on their key - python

I'm trying to find the values of keys based on their first three letters. I have three categories of subjects whose grades I need, stored in a dictionary with the subject as the key and the grade as the value. The categories are ECO, GEO, and INF. Since there are multiple subjects per category, I want to get the values for every key starting with ECO, GEO, or INF.
subject={"INFO100":"A"}
(subject.get("INF"))
With this approach I don't get the value; I have to use the whole key. Is there a workaround? I want the values separately so I can calculate the GPA for each field of study :)

You need to iterate over the pairs, filter on the key, and keep the value:
subject = {"INFO100": "A", "INF0200": "B", "ECO1": "C"}
grades_inf = [v for k, v in subject.items() if k.startswith("INF")]
print(grades_inf) # ['A', 'B']
grades_eco = [v for k, v in subject.items() if k.startswith("ECO")]
print(grades_eco) # ['C']
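Since the goal is a GPA per field of study, the same filter can feed an average directly. The following is only a sketch: the GRADE_POINTS scale and the gpa_for_prefix helper are hypothetical names introduced here, and the point values are an assumption that should be replaced with your institution's scale.

```python
# Hypothetical grade-to-point scale; adjust to the scale you actually use.
GRADE_POINTS = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1, "F": 0}

def gpa_for_prefix(subject, prefix):
    """Average the grade points of all subjects whose key starts with prefix."""
    points = [GRADE_POINTS[grade]
              for name, grade in subject.items()
              if name.startswith(prefix)]
    # Return None when no subject matches, to avoid dividing by zero.
    return sum(points) / len(points) if points else None

subject = {"INFO100": "A", "INFO200": "B", "ECO1": "C"}
for prefix in ("INF", "ECO", "GEO"):
    print(prefix, gpa_for_prefix(subject, prefix))
```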

As said in the comments, the purpose of a dictionary is to have unique keys. Indexing is extremely fast as it uses hash tables. By searching for parts of the keys you need to loop, and you lose the benefit of hashing.
Why don't you store your data in a nested dictionary?
subject={'INF': {"INFO100":"A", "INFO200":"B"},
'OTH': {"OTHER100":"C", "OTHER200":"D"},
}
Then access:
# all subitems
subject['INF']
# given item
subject['INF']['INFO100']
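If the data already exists as a flat dictionary, it can be converted to the nested layout in one pass. This is a minimal sketch that assumes the category is always the first three letters of the key; the names flat and nested are illustrative only.

```python
flat = {"INFO100": "A", "INFO200": "B", "GEO100": "D", "ECO101": "B"}

nested = {}
for name, grade in flat.items():
    # setdefault creates the inner dict the first time a category is seen.
    nested.setdefault(name[:3], {})[name] = grade

print(nested["INF"])             # all INF subjects
print(nested["INF"]["INFO100"])  # one specific subject
```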

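For understanding purposes, you can create a function that returns a list of the matching grades, like:
def getGradesBySubject(grades, search_subject):
    # items() replaces the Python 2 iteritems(); the parameter is renamed to avoid shadowing the built-in dict
    return [grade for subject, grade in grades.items() if subject.startswith(search_subject)]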

I'd suggest using a master dict object that contains a mapping of the three-letter subjects like ECO, GEO, to all subject values. For example:
subject = {"INFO100": "A",
"INFO200": "B",
"GEO100": "D",
"ECO101": "B",
"GEO003": "C",
"INFO101": "C"}
master_dict = {}
for k, v in subject.items():
    master_dict.setdefault(k[:3], []).append(v)
print(master_dict)
# now you can access it like: master_dict['INF']
Output:
{'INF': ['A', 'B', 'C'], 'GEO': ['D', 'C'], 'ECO': ['B']}
If you want to eliminate duplicate grades for a subject, or just as an alternate approach, I'd also suggest a defaultdict:
from collections import defaultdict
subject = {"INFO100": "A",
"INFO300": "A",
"INFO200": "B",
"GEO100": "D",
"ECO101": "B",
"GEO003": "C",
"GEO102": "D",
"INFO101": "C"}
master_dict = defaultdict(set)
for k, v in subject.items():
    master_dict[k[:3]].add(v)
print(master_dict)
defaultdict(<class 'set'>, {'INF': {'B', 'A', 'C'}, 'GEO': {'D', 'C'}, 'ECO': {'B'}})

Related

Iterate adding values to an existing dictionary by using a variety of keys stored as a list

I had trouble coming up with an appropriate title for this, so apologies there.
I have an existing dictionary di_end which already has an order to its keys. I also have some objects which have a property containing the keys for where in di_end the user-entered value will go
Note: the methods setProperty() and property() are from the pyqt library where setProperty() creates a custom property for an object where the first argument is the name of the property and the second argument is the value for that named property and property() just returns the values for whatever name is passed into it as an argument.
Something like this:
a.setProperty('keys', [key1, key2, key3])
b.setProperty('keys', [key4, key5, key6, key7, key8])
c.setProperty('keys', [key9])
objects_list = [a, b, c]
I want to be able to use the keys stored in the object properties to load the value that the user enters into a field to a dictionary
I'd like to iterate the process such that these parts
di_end[a.property(['keys'])[0]][a.property(['keys'])[1]][a.property(['keys'])[2]] = a.value
di_end[b.property(['keys'])[0]][b.property(['keys'])[1]][b.property(['keys'])[2]]\
[b.property(['keys'])[3]][b.property(['keys'])[4]] = b.value
di_end[c.property(['keys'])[0]] = c.value
or
a_li, b_li, c_li = a.property(['keys']), b.property(['keys']), c.property(['keys'])
di_end[a_li[0]][a_li[1]][a_li[2]] = a.value
di_end[b_li[0]][b_li[1]][b_li[2]][b_li[3]][b_li[4]] = b.value
di_end[c_li[0]] = c.value
do not need to be manually typed out and could be performed procedurally. I think I could do this if it was the same amount of keys but I'm not sure how to do it with differing amounts of keys. If they were all the same amounts I'd just do this
a.setProperty('keys', [key1, key2, key3, key4])
b.setProperty('keys', [key5, key6, key7, key8])
c.setProperty('keys', [key9, key10, key11, key12])
objects_list = [a, b, c]
a_li, b_li, c_li = a.property(['keys']), b.property(['keys']), c.property(['keys']) # assuming all have 4 entries each
for count, item in enumerate([a_li, b_li, c_li]):
    di_end[item[0]][item[1]][item[2]][item[3]] = objects_list[count].value
but since there are different amounts of keys for each entry, I'm not sure how to accomplish this.
Edit: Added a note about setProperty() and property()
It's pretty hard to tell what your code is doing here, mostly due to the setProperty and property objects, which seem awkward and not-pythonic.
That being said, this looks like yet another use case for the excellent glom library (pip install glom).
from glom import assign, Path
assign(data, Path(*keys), value)
Note that in your case
data = di_end
keys = a.property('keys')
value = a.value
and an example:
>>> data = {'a': {'b': {'c': 5}}}
>>> keys = ['a', 'b', 'c']
>>> value = 100
>>> assign(data, Path(*keys), value)
{'a': {'b': {'c': 100}}}
>>> data
{'a': {'b': {'c': 100}}}
Note that this will raise a KeyError if the path doesn't already exist. For the generate-dict-on-demand style (like an infinitely nested defaultdict), you need to use Assign instead:
glom(data, Assign(Path(*keys), value))
See https://glom.readthedocs.io/en/latest/mutation.html for more details.
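If adding a dependency is not an option, a similar effect can be sketched with the standard library alone. This is not glom's implementation; set_by_path and get_by_path are hypothetical helper names, and the sketch assumes every intermediate level is (or should become) a plain dict. The sample keys and values are made up for illustration.

```python
from functools import reduce
from operator import getitem

def set_by_path(data, keys, value):
    """Assign value at the nested location described by keys, creating dicts as needed."""
    *parents, last = keys
    # Walk down the parent keys, creating missing levels without overwriting existing ones.
    target = reduce(lambda d, k: d.setdefault(k, {}), parents, data)
    target[last] = value

def get_by_path(data, keys):
    """Return the value stored at the nested location described by keys."""
    return reduce(getitem, keys, data)

di_end = {}
entries = [
    (["k1", "k2", "k3"], "a-value"),              # three keys
    (["k4", "k5", "k6", "k7", "k8"], "b-value"),  # five keys
    (["k9"], "c-value"),                          # a single key
]
for keys, value in entries:
    set_by_path(di_end, keys, value)
print(di_end)
```

Because the walk uses setdefault, two key paths that share a prefix end up in the same sub-dictionary instead of replacing each other.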
It seems you could use recursion here. A simple example is below.
The put function creates the nested dicts and stores the value; the fetch function extracts a value from a nested dict.
def fetch(data, keys):
    if len(keys) == 1:
        return data[keys[0]]
    return fetch(data[keys.pop()], keys)
def put(data, keys, value):
    if len(keys) == 1:
        data[keys[0]] = value
        return
    key = keys.pop()
    data[key] = {}
    put(data[key], keys, value)
di_empty = {}
keys_values = [
    (("a", "b", "c"), "abc"),
    (("b", "c", "d", "e"), "bcde")
]
for selected in keys_values:
    put(di_empty, list(reversed(selected[0])), selected[1])
di_end = dict()
di_end["a"] = {"b": {"c": "abc"}}
di_end["b"] = {"c":{"d": {"e": "bcde"}}}
keys = [("a", "b", "c"), ("b", "c", "d", "e")]
for selected in keys:
    s1 = fetch(di_end, list(reversed(selected)))
    s2 = fetch(di_empty, list(reversed(selected)))
    assert s1 == s2
    print(s1)
    print(s2)

How does this lambda function work exactly?

Can anybody explain to me please what exactly does this lambda function do?
from collections import Counter
def solve(a):
    c = Counter(a)
    return sorted(a, key=lambda k: (-c[k], k))
Thanks beforehand!
A lambda function is just like any other function, it's just expressed in a more compact way - so breaking it down:
lambda k : ( -c[k], k )
Is equivalent to:
def lambdafunction(k):
    return (-c[k], k)
Where c is some in-scope variable - which per your solve function is a Counter.
The Counter maps each element (key) to its count (value). The lambda looks up the count for the element and negates it, then builds a tuple with this negated count as the first entry and the element itself as the second entry. These tuples are then used as sort keys, sorting the elements by frequency, most frequent first, with ties (i.e. where two or more elements share the same frequency) broken by the natural ordering of the elements themselves.
e.g.
alist = ["a", "a", "b", "b", "b", "c", "c", "c", "c", "c", "c", "d", "d"]
solve(alist)
>> ['c', 'c', 'c', 'c', 'c', 'c', 'b', 'b', 'b', 'a', 'a', 'd', 'd']
Internally, there's a Counter which contains the values:
Counter({'a': 2, 'b': 3, 'c': 6, 'd': 2})
The lambda function converts these to tuples, which it associates with each element of the list before sorting them:
( -6, "c" )
( -3, "b" )
( -2, "a" )
( -2, "d" )
So all the "c" items appear at the top of the list, because the internally calculated tuples associated with them ( -6, "c" ) come first.
Using a lambda function like this within the sorted function gives sorted the flexibility to sort using whatever method you like - you define the function used to describe exactly what aspects of the collection you want sorted.
Counter(a) counts how many times each element is present in a, so this sorts a from the most frequent element to the least frequent, and when counts are equal it sorts alphabetically.
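To see the sort keys concretely, it can help to compute them separately before sorting. The sketch below uses a small made-up list; the variable names are illustrative only.

```python
from collections import Counter

a = ["b", "a", "c", "b", "a", "b"]
c = Counter(a)  # counts: b -> 3, a -> 2, c -> 1

# The key the lambda produces for each element, in original list order.
sort_keys = [(-c[k], k) for k in a]
print(sort_keys)

# Tuples compare entry by entry: negated count first, then the element itself.
result = sorted(a, key=lambda k: (-c[k], k))
print(result)
```

Negating the count is what turns Python's ascending sort into "most frequent first" while still leaving the tie-break on the element in ascending order.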

Python dictionary with multiple unique values corresponding to a key

I have 2 lists which correspond to what I would like to be my key:value pairs, for example:
list_1 = [1,1,1,1,1,1,1,1,1,2,2,2,2,2,2] #(key)
list_2 = [x,x,x,y,g,r,t,w,r,r,r,t,f,c,d] #(value)
I've (kind of) been able to create a dictionary via: dict = dict(zip(list_1, [list_2]))
However the problem with this is that it is only picking up '1' as a key and also results in duplicate entries within the list of values for the key.
Can anyone suggest a way to create a dictionary so that only the unique values from list_2 are mapped to their corresponding key?
Thanks
EDIT:
output I'm looking for would be one dictionary keyed by 1 and 2 with lists as values containing only the unique values for each i.e.:
dict = {1: [x,y,g,r,t,w], 2: [r,t,f,c,d]}
This sort of problem is properly solved with a collections.defaultdict(set); the defaultdict gives you easy auto-vivification of sets for each key on demand, and the set uniquifies the values associated with each key:
from collections import defaultdict
mydict = defaultdict(set)
for k, v in zip(list_1, list_2):
    mydict[k].add(v)
You can then convert the result to a plain dict with list values with:
mydict = {k: list(v) for k, v in mydict.items()}
If order of the values must be preserved, on modern Python you can use dicts instead of set (on older Python, you'd use collections.OrderedDict):
mydict = defaultdict(dict)
for k, v in zip(list_1, list_2):
    mydict[k][v] = True # Dummy value; we're using a dict to get an ordered set of the keys
with the conversion to plain dict with list values being unchanged
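Put together as a runnable sketch, the order-preserving variant looks like this. It assumes the values in list_2 are strings (the question shows them unquoted) and relies on dicts keeping insertion order, which is guaranteed from Python 3.7 onward.

```python
from collections import defaultdict

list_1 = [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2]
list_2 = ['x', 'x', 'x', 'y', 'g', 'r', 't', 'w', 'r', 'r', 'r', 't', 'f', 'c', 'd']

grouped = defaultdict(dict)
for k, v in zip(list_1, list_2):
    grouped[k][v] = True  # inner dict acts as an insertion-ordered set

# Convert to a plain dict whose values are lists of unique items in first-seen order.
mydict = {k: list(v) for k, v in grouped.items()}
print(mydict)
```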
If the input is already sorted, itertools.groupby is theoretically slightly more efficient (it's guaranteed O(n), vs. average-case O(n) using dicts), but in practice the defaultdict is typically as fast or faster (the implementation of groupby has some unavoidable inefficiencies). Just for illustration, the groupby solution would be:
from itertools import groupby
from operator import itemgetter
mydict = {k: {v for _, v in grp} for k, grp in groupby(zip(list_1, list_2), key=itemgetter(0))}
# Or preserving order of insertion:
getval = itemgetter(1) # Construct once to avoid repeated construction
mydict = {k: list(dict.fromkeys(map(getval, grp)))
          for k, grp in groupby(zip(list_1, list_2), key=itemgetter(0))}
Since a dictionary's keys form a set, it can't contain the same key twice, but it can hold a key once with a list of values. For that you can use this one-liner:
my_dict = {key:[list_2[i] for i in range(len(list_2)) if list_1[i]==key] for key in set(list_1)}
Or a more classic method
my_dict = {}
for key_id in range(len(list_1)):
    if list_1[key_id] not in my_dict:
        my_dict[list_1[key_id]] = []
    my_dict[list_1[key_id]].append(list_2[key_id])
In both cases the result is
my_dict = {1: ['x', 'x', 'x', 'y', 'g', 'r', 't', 'w', 'r'], 2: ['r', 'r', 't', 'f', 'c', 'd']}
The problem is that your keys are not unique: there are only two distinct keys, 1 and 2. A dictionary can't hold {1: x, 1: y} at the same time, for example, unless you change the key to something new and unique.
For your purpose I would use tuples:
list(set(tuple(zip(list_1, list_2))))
The set keeps only the unique (key, value) pairs, which drops the duplicates.
keys = [1,1,1,1,1,1,1,1,1,2,2,2,2,2,2]
values = ['x','x','x','y','g','r','t','w','r','r','r','t','f','c','d']
result = {}
for key,value in zip(keys,values):
    # create the list the first time a key is seen
    if key not in result:
        result[key] = []
    # append only values not already stored for this key
    if value not in result[key]:
        result[key].append(value)
print(result)
{1: ['x', 'y', 'g', 'r', 't', 'w'], 2: ['r', 't', 'f', 'c', 'd']}
Note:
zip(keys,values) creates an iterable of tuples, each consisting of one element from keys and the corresponding element from values:
(1,'x')
(1,'x')

How do I join a list of values into a Python dictionary?

I am trying to join a list to a dictionary in Python 3 and return the sum of the key values.
So far, I can't join the two, I've tried using get and set and am not succeeding.
I also tried a for loop with set linking listy and dict2, like this:
dict2 = {
1: "A",
2: "B",
3: "C"
}
listy = ['A', 'M', 'B', 'A']
for k in dict2:
if set(listy) & set(dict2[value]):
print(dict2.key)
This is the error I'm getting in IPython:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-291-5a5e2eb8d7f8> in <module>
10
11 for k in dict2:
---> 12 if set(listy) & set(dict2[value]):
13 print(dict2.key)
14
TypeError: unhashable type: 'list'
You can use a list-comprehension:
[x for x in listy if x in set(dict2.values())]
In code:
dict2 = {
1: "A",
2: "B",
3: "C"
}
listy = ['A', 'M', 'B', 'A']
print([x for x in listy if x in set(dict2.values())])
# ['A', 'B', 'A']
Your task will be easier if you flip the keys and values in your dictionary. I assume that there are no duplicate values.
dict2 = {1: "A", 2: "B", 3: "C"}
lookup = {value: key for key, value in dict2.items()}
Now lookup is {'A': 1, 'B': 2, 'C': 3}.
Now you can loop over the list:
listy = ['A', 'M', 'B', 'A']
result = []
for key in listy:
    if key in lookup:
        result.append(key)
Now result is ['A', 'B', 'A']. The code will be shorter with a list comprehension:
result = [key for key in listy if key in lookup]
As far as I understood the question, you want the sum of the keys in dict2 for every entry in listy that has a corresponding value in dict2. Once you have created the lookup dictionary, you can do the following to get the individual values.
[lookup.get(key, 0) for key in listy]
# -> [1, 0, 2, 1]
If a key doesn't appear in the dictionary it gets a default value of 0.
Getting the sum is now easy:
sum(lookup.get(key, 0) for key in listy)
# -> 4
You probably meant to use dict2[k] instead of dict2[value].
Also your dictionary's entries contain single values (not lists) so you can use the in operator:
for example:
# if listy is a dictionary or a small list, you don't need to build a set
for key,value in dict2.items():
    if value in listy:
        print(key)
or :
# If listy is a large list, you should build a set only once
listySet = set(listy)
for key,value in dict2.items():
    if value in listySet:
        print(key)
If you have a lot of code to perform on the "joined" data, you could structure the condition like this:
for key,value in dict2.items():
    if value not in listy: continue
    print(key)
    # ... do more stuff ...
If you're only looking for a sum, you can do it more directly:
# counting sum of dict2 keys matching each value in listy
# select sum(dict2.key) from listy join dict2 where dict2.value = listy.value
# (note that an inverted dict2 would be better suited for that)
result = sum(key*listy.count(value) for key,value in dict2.items())
# counting sum of keys in dict2 that have a value in listy
# select sum(dict2.key) from dict2 where exists listy.value = dict2.value
result = sum(key for key,value in dict2.items() if value in listy)
In short, you have to implement the linking logic that the RDBMS query optimizer normally does for you in SQL.
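Both sums above scan listy once per dictionary entry (listy.count and the in test on a list are linear). A possible variation, sketched here with the standard library's Counter, counts the list a single time up front; the variable names are illustrative.

```python
from collections import Counter

dict2 = {1: "A", 2: "B", 3: "C"}
listy = ['A', 'M', 'B', 'A']

counts = Counter(listy)  # one pass over listy; missing values count as 0

# Join semantics: each key contributes once per matching entry in listy.
sum_with_repeats = sum(key * counts[value] for key, value in dict2.items())

# Exists semantics: each key contributes at most once.
sum_distinct = sum(key for key, value in dict2.items() if value in counts)

print(sum_with_repeats, sum_distinct)
```

With the sample data, 'A' appears twice, so the join-style sum is 1 + 1 + 2 and the exists-style sum is 1 + 2.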

Iterate multiple dict with different keys

The goal
I am trying to iterate over two dicts at the same time, knowing that they have some keys in common (for sure), while some keys may not be shared. What is more, the shared keys could (rarely, but still) be ordered differently. Another issue is that the dicts can have different lengths. In my case the keys are all numerical.
Attempted solutions
Example dicts:
di1 = {1: "a", 2: "b", 3: "c", 5:"e"}
di2 = {1: "a", 2: "b", 4: "d", 5:"e", 6:"f"}
After reading some answers to iterating multiple dicts I tried zip()ing the two dicts:
for i, j in zip( di1, di2 ): print( i, j )
1 1
2 2
3 4
5 5
but this 'cuts' the longer dict; it also iterates over the keys of each dict separately instead of keeping them consistent (always i == j, even if i in di1 and j in di2 would return False).
Given that in my case all keys are numerical I tried the following:
for i in range(max(max(di1), max(di2))+1): print(i)
0
1
2
3
4
5
6
which works (I can pass i as dict key), but:
Doesn't iterate dicts per se, just generates numbers to try to match to given dicts.
Iterates over values even if they are non existent keys in both dicts (i in di1 or i in di2 is False).
This works only if keys are numerical.
Doesn't seem very pythonic.
The question
How do I iterate two (or more) dicts (keys) given that it is enough for the key to exist in at least one of them?
Conditions
Solutions using standard libraries are preferable.
You can assume dict keys are numerical but a more general solution is preferable.
Iteration order is of no importance but additional information on the matter is a bonus.
I'm iterating two dicts.
Both dicts should remain unaltered.
I'm using python 3.6.1
You can iterate over common keys:
for key in di1.keys() & di2.keys():
    print(key)
Or union of keys:
for key in di1.keys() | di2.keys():
    print(key)
You choose. Use dict.viewkeys() in Python 2.
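As a sketch of how the union can be used to read both dicts side by side: the result of | is a set, so its iteration order is not guaranteed, and sorted() is used below only because the keys here are comparable numbers. dict.get returns None for a key that is missing from one of the dicts, and neither dict is modified.

```python
di1 = {1: "a", 2: "b", 3: "c", 5: "e"}
di2 = {1: "a", 2: "b", 4: "d", 5: "e", 6: "f"}

# One row per key present in at least one dict: (key, value in di1, value in di2).
rows = [(key, di1.get(key), di2.get(key))
        for key in sorted(di1.keys() | di2.keys())]

for row in rows:
    print(row)
```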
I would extract the keys from both dicts (.keys()), combine them into a set to remove duplicates, and then iterate over the dicts using this new set of keys.
keys = set(di1.keys()) | set(di2.keys())
for key in keys:
    try:
        di1[key]
        di2[key]
    except KeyError:
        # key is not present in both dicts
        pass
You can use itertools.zip_longest (izip_longest in Python 2) to iterate up to the length of the longer collection, whereas zip stops at the shorter one.
>>> di1 = {1: "a", 2: "b", 3: "c", 5:"e"}
>>> di2 = {1: "a", 2: "b", 4: "d", 5:"e", 6:"f"}
>>>
>>> from itertools import zip_longest
>>> for a, b in zip_longest(di1, di2):
...     print(di1.get(a), di2.get(b))
...
a a
b b
c d
e e
None f
The thing to note here is the use of dict.get(key): once the shorter dict is exhausted, zip_longest fills in None, and dict[key] would raise a KeyError for it. You can also pass a default value as the second argument: dict.get(key, default_value). Note that the keys are still paired by position, not by equality (3 is paired with 4 above).
Hope this helps.
Assuming that the values are consistent between the various dictionaries you can use collections.ChainMap to iterate over multiple dictionaries:
from collections import ChainMap
di1 = {1: "a", 2: "b", 3: "c", 5:"e"}
di2 = {1: "a", 2: "b", 4: "d", 5:"e", 6:"f"}
chained_dicts = ChainMap(di1, di2) # add more dicts as required
for key in chained_dicts:
    print(key, chained_dicts[key])
Output:
1 a
2 b
3 c
4 d
5 e
6 f
Or more simply:
for key, value in ChainMap(di1, di2).items():
    print(key, value)
As mentioned above, this is fine if the values for duplicated keys are the same. Where there is variation the value from the first chained dictionary will be returned.
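The precedence rule can be checked with a small sketch in which the two dicts deliberately disagree on one key. The dicts first and second are made up for illustration.

```python
from collections import ChainMap

first = {1: "a", 2: "b"}
second = {2: "B", 3: "c"}  # key 2 conflicts with first

merged = ChainMap(first, second)

# Lookups search the mappings left to right, so first wins for key 2.
print(merged[2])

# Flattening to a plain dict keeps the same precedence.
flat = dict(merged)
print(flat)
```

Swapping the argument order, ChainMap(second, first), would make second's value win instead.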
