Matching a Reference Dict to a List of Dicts

So I have a reference dict
ref = {"a": 1, "b": 2}
and a list of dicts
dict_list = [{"a": 1, "b": 2, "c": "success!"}, {"a": 1, "b": 3, "c": "nope!"}]
What I want is to find the dict in dict_list that matches the reference and return its value for c (i.e. "success!"). I was able to do this, but I'm not a fan of this solution at all:
In [7]: import pandas as pd
   ...: def f(ref, dict_list):
   ...:     df = pd.DataFrame.from_records(dict_list)
   ...:     return df.loc[(df["a"] == ref["a"]) & (df["b"] == ref["b"])].c[0]
   ...:
   ...: f(ref, dict_list)
Out[7]: 'success!'
If anyone has anything more elegant (ideally in pure Python), that would be great!

Use next:
>>> next((x['c'] for x in dict_list if ref.items() <= x.items()))
'success!'
>>>
Or:
>>> next((x['c'] for x in dict_list if dict(x, **ref) == x))
'success!'
>>>
This returns the value under key c from the first dictionary that contains every key/value pair of ref. It isn't limited to the a and b keys; it works for whatever keys ref holds.
In Python 3, dictionary item views behave like sets, so you can check whether one dictionary's items are a subset of another's with the <= operator (< tests for a proper subset, so it would reject a dictionary that is exactly equal to ref).
For the second case, since dictionaries can't have duplicate keys, merging ref into x only changes x if ref disagrees with it; if the merged dictionary is equal to x, the generator yields the value of c from x. Note that dict(x, **ref) requires the keys of ref to be strings.
Edit:
As @sj95126 mentioned, as of Python 3.9 you could use the dictionary merge operator (|):
>>> next((x['c'] for x in dict_list if x | ref == x))
'success!'
>>>
This is the same logic as dict(x, **ref).
Edit 2:
Obviously, you could do:
>>> next((x['c'] for x in dict_list if [x['a'], x['b']] == list(ref.values())))
'success!'
>>>
This only works for the a and b keys, and it relies on ref listing them in that order.
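The subset idea above can also be wrapped in a small helper that returns a default instead of letting next() raise StopIteration when nothing matches. This is only a minimal sketch; the name find_match and its key and default parameters are assumptions made for illustration, not part of the original answer.

```python
def find_match(ref, dict_list, key="c", default=None):
    """Return candidate[key] for the first dict containing every item of ref."""
    for candidate in dict_list:
        # A candidate matches when each key of ref is present with an equal value.
        if all(k in candidate and candidate[k] == v for k, v in ref.items()):
            return candidate.get(key, default)
    return default


ref = {"a": 1, "b": 2}
dict_list = [{"a": 1, "b": 2, "c": "success!"}, {"a": 1, "b": 3, "c": "nope!"}]

result = find_match(ref, dict_list)          # first matching dict's "c"
no_result = find_match({"a": 9}, dict_list)  # nothing matches, so the default
```

Unlike dict(x, **ref), this version does not require the keys of ref to be strings.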

Related

How to make function unique in python

Okay, so I have to make a function called unique. This is what it should do:
If the input is: s1 = [{1,2,3,4}, {3,4,5}]
unique(s1) should return: {1,2,5} because the 1, 2 and 5 are NOT in both lists.
And if the input is s2 = [{1,2,3,4}, {3,4,5}, {2,6}]
unique(s2) should return: {1,5,6} because those numbers are unique and are in only one list of this collection of 3 lists.
I tried to make something like this:
for x in s1:
    if x not in unique_list:
        unique_list.append(x)
    else:
        unique_list.remove(x)
print(unique_list)
But the problem with this is that it takes a whole list as "x" and not each element from each list.
Anyone that can help me a bit with this?
I am not allowed to import anything.

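Python set() objects have a symmetric_difference() method to find elements in either, but not both sets. You can reduce your list with this. Note that chaining symmetric differences keeps the elements that occur in an odd number of sets, which matches "occurs in exactly one set" only as long as no element appears in three or more sets: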
from functools import reduce
l = [{1,2,3,4}, {3,4,5}, {2,6}]
reduce(set.symmetric_difference, l)
# {1, 5, 6}
You can, of course, do this without reduce by manually looping over the list. The ^ operator produces the symmetric difference:
l = [{1,2,3,4}, {3,4,5}, {2,6}]
final = set()
for s in l:
    final = final ^ s
print(final)
# {1, 5, 6}

In [13]: def f(sets):
    ...:     c = {}
    ...:     for s in sets:
    ...:         for x in s:
    ...:             c[x] = c.setdefault(x, 0) + 1
    ...:     return {x for x, v in c.items() if v == 1}
    ...:
In [14]: f([{1,2}, {2, 3}, {3, 4}])
Out[14]: {1, 4}
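The two approaches above agree on the question's inputs, but they can differ when an element occurs in three sets. The comparison below is a small sketch of that difference without any imports; unique_by_count is a hypothetical name for the counting approach.

```python
def unique_by_count(sets):
    """Keep the elements that occur in exactly one of the sets."""
    counts = {}
    for s in sets:
        for x in s:
            counts[x] = counts.get(x, 0) + 1
    return {x for x, n in counts.items() if n == 1}


sets = [{1, 2}, {1, 3}, {1, 4}]

# Chained symmetric difference: keeps elements seen an odd number of times.
by_xor = set()
for s in sets:
    by_xor ^= s

# Counting: keeps elements seen exactly once.
by_count = unique_by_count(sets)
```

Here 1 occurs in all three sets, so it survives the chained symmetric difference but is dropped by the counting version.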

Find count of characters within the string in Python

I am trying to create a dictionary of each character and the number of times it occurs in a string. Suppose the string is like below
str1 = "aabbaba"
I want to create a dictionary like this
word_count = {'a':4,'b':3}
I am trying to use dictionary comprehension to do this.
I did
dic = {x:dic[x]+1 if x in dic.keys() else x:1 for x in str}
This ends up giving an error saying
File "<stdin>", line 1
dic = {x:dic[x]+1 if x in dic.keys() else x:1 for x in str}
^
SyntaxError: invalid syntax
Can anybody tell me what's wrong with the syntax? Also, how can I create such a dictionary using a dictionary comprehension?

As others have said, this is best done with a Counter.
You can also do:
>>> {e:str1.count(e) for e in set(str1)}
{'a': 4, 'b': 3}
But that traverses the string 1+n times, where n is the number of unique characters: once to create the set, and once per unique character to count its occurrences. In the worst case, when most characters are distinct, that is quadratic in the length of the string, which is a bad result for a long string with a lot of unique characters. A Counter only traverses the string once.
If you want a no-import version that is more efficient than using .count, you can use .setdefault to make a counter:
>>> count={}
>>> for c in str1:
...     count[c]=count.setdefault(c, 0)+1
...
>>> count
{'a': 4, 'b': 3}
That only traverses the string once, no matter how long it is or how many unique characters it has.
You can also use defaultdict if you prefer:
>>> from collections import defaultdict
>>> count=defaultdict(int)
>>> for c in str1:
...     count[c]+=1
...
>>> count
defaultdict(<class 'int'>, {'a': 4, 'b': 3})
>>> dict(count)
{'a': 4, 'b': 3}
But if you are going to import collections, use a Counter!

The ideal way to do this is with collections.Counter:
>>> from collections import Counter
>>> str1 = "aabbaba"
>>> Counter(str1)
Counter({'a': 4, 'b': 3})
You cannot achieve this with a simple dict comprehension, because each value would need to refer to the count accumulated so far, and the dictionary does not exist until the comprehension finishes. As mentioned in Dawg's answer, as a workaround you can call str1.count(e) for each element of set(str1) inside the comprehension. But the time complexity will be n*m, since it traverses the whole string once for each unique element (where m is the number of unique elements), whereas with Counter it will be n.

This is a nice case for collections.Counter:
>>> from collections import Counter
>>> Counter(str1)
Counter({'a': 4, 'b': 3})
It's a dict subclass, so you can work with the object much like a standard dictionary:
>>> c = Counter(str1)
>>> c['a']
4
You can do this without the Counter class as well. A simple and efficient way to do it would be:
>>> d = {}
>>> for x in str1:
...     d[x] = d.get(x, 0) + 1
...
>>> d
{'a': 4, 'b': 3}
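Since Counter is a dict subclass with a few extras, a short sketch of what it adds on top of the plain-dict loop may be useful. The variable names below are illustrative only.

```python
from collections import Counter

counts = Counter("aabbaba")

# most_common(n) returns the n most frequent elements with their counts.
top = counts.most_common(1)

# Looking up a character that never occurred gives 0 instead of raising KeyError,
# and it does not add the key to the counter.
missing = counts["z"]
still_absent = "z" not in counts

# A Counter converts to a plain dict when one is needed.
as_dict = dict(counts)
```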

Note that this is not the correct way to count, since dic is only rebound after the whole comprehension has been evaluated, so repeated characters are never counted more than once (and any other keys already in dic are lost). But it answers the original question of whether if-else is possible in comprehensions and demonstrates how it can be done.
To answer your question: yes, it's possible, but the approach is like this (dic must already exist, e.g. dic = {}):
dic = {x: (dic[x] + 1 if x in dic else 1) for x in str1}
The condition is applied to the value only, not to the key:value mapping.
The above can be made clearer using dict.get:
dic = {x: dic.get(x, 0) + 1 for x in str1}
0 is returned if x is not in dic.
Demo:
In [78]: s = "abcde"
In [79]: dic = {}
In [80]: dic = {x: (dic[x] + 1 if x in dic else 1) for x in s}
In [81]: dic
Out[81]: {'a': 1, 'b': 1, 'c': 1, 'd': 1, 'e': 1}
In [82]: s = "abfg"
In [83]: dic = {x: dic.get(x, 0) + 1 for x in s}
In [84]: dic
Out[84]: {'a': 2, 'b': 2, 'f': 1, 'g': 1}

Search for the position of a specific dictionary in a list of dictionaries

dict1 = {"1":"a", "2":"b", "3":"c"}
for dict2 in all_dict:
    if compare_dicts(dict1, dict2):
        ...
        ...
I need the index of the dict inside all_dict that is exactly the same as dict1.
Does the for loop go over all_dict sequentially, so that I can count the iterations inside the for loop?

You can write a function yielding all indices of matching objects in a list using enumerate():
def findall(lst, value):
    for i, x in enumerate(lst):
        if x == value:
            yield i
You can apply this to your use case like this:
matching_indices = list(findall(all_dict, dict1))
If you are just looking for a single match, the list.index() method is all you need (it raises ValueError if there is no match):
matching_index = all_dict.index(dict1)
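If a missing dictionary is a normal case rather than an error, the enumerate() idea can be combined with next() to return a fallback value. This is a sketch under that assumption; index_of and its default of -1 are hypothetical choices, not something from the original answer.

```python
def index_of(lst, value, default=-1):
    """Return the index of the first element equal to value, or default."""
    return next((i for i, x in enumerate(lst) if x == value), default)


all_dict = [{"1": "a"}, {"1": "a", "2": "b", "3": "c"}, {"x": 0}]
dict1 = {"1": "a", "2": "b", "3": "c"}

found = index_of(all_dict, dict1)            # position of the equal dict
not_found = index_of(all_dict, {"nope": 1})  # falls back to the default
```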

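Use filter:
filter(lambda x: x == dict1, all_dict)
This yields all the dictionaries you're looking for (the matching dictionaries themselves, not their positions). In Python 3, filter returns a lazy iterator, so wrap it in list() to see the results. Example:
>>> all_dict = [{'a':1}, {'b':2}, {'a':1}]
>>> dict1 = {'a':1}
>>> list(filter(lambda x: x == dict1, all_dict))
[{'a': 1}, {'a': 1}]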

Is it possible to "unpack" a dict in one call?

I was looking for a way to "unpack" a dictionary in a generic way and found a relevant question (and answers) which explained various techniques (TL;DR: it is not too elegant).
That question, however, addresses the case where the keys of the dict are not known; the OP wanted to have them added to the local namespace automatically.
My problem is possibly simpler: I get a dict from a function and would like to dissect it on the fly, knowing the keys I will need (I may not need all of them every time). Right now I can only do
def myfunc():
    return {'a': 1, 'b': 2, 'c': 3}
x = myfunc()
a = x['a']
my_b_so_that_the_name_differs_from_the_key = x['b']
# I do not need c this time
while I was looking for the equivalent of
def myotherfunc():
    return 1, 2
a, b = myotherfunc()
but for a dict (which is what is returned by my function). I do not want to use the latter solution for several reasons, one of them being that it is not obvious which variable corresponds to which returned element (the first solution has at least the merit of being readable).
Is such an operation available?

If you really must, you can use an operator.itemgetter() object to extract values for multiple keys as a tuple:
from operator import itemgetter
a, b = itemgetter('a', 'b')(myfunc())
This is still not pretty; I'd prefer the explicit and readable separate lines where you first assign the return value, then extract those values.
Demo:
>>> from operator import itemgetter
>>> def myfunc():
...     return {'a': 1, 'b': 2, 'c': 3}
...
>>> itemgetter('a', 'b')(myfunc())
(1, 2)
>>> a, b = itemgetter('a', 'b')(myfunc())
>>> a
1
>>> b
2
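Two details of itemgetter() are worth keeping in mind when using it this way: with a single key it returns the bare value rather than a one-element tuple, and a missing key raises KeyError. A small sketch, with illustrative variable names:

```python
from operator import itemgetter

d = {"a": 1, "b": 2, "c": 3}

pair = itemgetter("a", "b")(d)   # several keys: a tuple of values
single = itemgetter("a")(d)      # one key: the value itself, not a tuple

try:
    itemgetter("a", "missing")(d)
except KeyError as exc:
    # The exception carries the key that was not found.
    missing_key = exc.args[0]
```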

You could also use map:
def myfunc():
    return {'a': 1, 'b': 2, 'c': 3}
a,b = map(myfunc().get,["a","b"])
print(a,b)

In addition to the operator.itemgetter() method, you can also write your own myotherfunc(). It takes a list of the required keys as an argument and returns a tuple of their corresponding values.
def myotherfunc(keys_list):
    reference_dict = myfunc()
    return tuple(reference_dict[key] for key in keys_list)
>>> a,b = myotherfunc(['a','b'])
>>> a
1
>>> b
2
>>> a,c = myotherfunc(['a','c'])
>>> a
1
>>> c
3

Destructuring-bind dictionary contents

I am trying to 'destructure' a dictionary and associate values with variables names after its keys. Something like
params = {'a':1,'b':2}
a,b = params.values()
But since dictionaries are not ordered, there is no guarantee that params.values() will return values in the order of (a, b). Is there a nice way to do this?

from operator import itemgetter
params = {'a': 1, 'b': 2}
a, b = itemgetter('a', 'b')(params)
Instead of elaborate lambda functions or dictionary comprehensions, you may as well use the standard library.

One way to do this with less repetition than Jochen's suggestion is with a helper function. This gives the flexibility to list your variable names in any order and only destructure a subset of what is in the dict:
pluck = lambda dict, *args: (dict[arg] for arg in args)
things = {'blah': 'bleh', 'foo': 'bar'}
foo, blah = pluck(things, 'foo', 'blah')
Also, instead of joaquin's OrderedDict you could sort the keys and get the values. The only catches are that you need to specify your variable names in alphabetical order and destructure everything in the dict:
sorted_vals = lambda dict: (t[1] for t in sorted(dict.items()))
things = {'foo': 'bar', 'blah': 'bleh'}
blah, foo = sorted_vals(things)

How come nobody posted the simplest approach?
params = {'a':1,'b':2}
a, b = params['a'], params['b']

Python is only able to "destructure" sequences, not dictionaries. So, to write what you want, you will have to map the needed entries to a proper sequence. As for me, the closest match I could find is the (not very sexy):
a,b = [d[k] for k in ('a','b')]
This works with generators too:
a,b = (d[k] for k in ('a','b'))
Here is a full example:
>>> d = dict(a=1,b=2,c=3)
>>> d
{'a': 1, 'b': 2, 'c': 3}
>>> a, b = [d[k] for k in ('a','b')]
>>> a
1
>>> b
2
>>> a, b = (d[k] for k in ('a','b'))
>>> a
1
>>> b
2

Here's another way to do it similarly to how a destructuring assignment works in JS:
params = {'b': 2, 'a': 1}
a, b, rest = (lambda a, b, **rest: (a, b, rest))(**params)
What we did was unpack the params dictionary into keyword arguments (using **), as in Jochen's answer, and then pick those values up in the lambda's signature, where they are bound by key name. As a bonus, we also get a dictionary of whatever is not named in the lambda's signature, so if you had:
params = {'b': 2, 'a': 1, 'c': 3}
a, b, rest = (lambda a, b, **rest: (a, b, rest))(**params)
After the lambda has been applied, the rest variable will now contain:
{'c': 3}
Useful for omitting unneeded keys from a dictionary.
Hope this helps.
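The same keyword-unpacking trick also works with a named function, which makes it possible to give defaults to keys that may be absent. The sketch below assumes a hypothetical helper called take_a_b; a key without a default that is missing from the dictionary raises TypeError.

```python
def take_a_b(a, b=None, **rest):
    """Bind 'a' and 'b' by name; collect any other keys in rest."""
    return a, b, rest


# 'b' is absent, so its default is used; 'c' ends up in rest.
a, b, rest = take_a_b(**{"a": 1, "c": 3})

# 'a' has no default, so leaving it out is an error.
try:
    take_a_b(**{"b": 2})
    missing_a_raises = False
except TypeError:
    missing_a_raises = True
```

As with any ** call, the dictionary keys have to be strings.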

Maybe you really want to do something like this?
def some_func(a, b):
    print(a, b)
params = {'a':1,'b':2}
some_func(**params) # equiv to some_func(a=1, b=2)

If you are afraid of the issues involved in the use of the locals dictionary and you prefer to follow your original strategy, collections.OrderedDict (available since Python 2.7 and 3.1) lets you recover your dictionary items in the order in which they were first inserted.

(Ab)using the import system
The from ... import statement lets us destructure and bind attribute names of an object. Of course, it only works for objects in the sys.modules dictionary, so one could use a hack like this:
import sys, types
mydict = {'a':1,'b':2}
sys.modules["mydict"] = types.SimpleNamespace(**mydict)
from mydict import a, b
A somewhat more serious hack would be to write a context manager to load and unload the module:
with obj_as_module(mydict, "mydict_module"):
    from mydict_module import a, b
By pointing the __getattr__ method of the module directly to the __getitem__ method of the dict, the context manager can also avoid using SimpleNamespace(**mydict).
See this answer for an implementation and some extensions of the idea.
One can also temporarily replace the entire sys.modules dict with the dict of interest, and do import a, b without from.
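For illustration, a context manager along these lines might look like the sketch below. It is not the implementation from the linked answer: obj_as_module is written here from scratch under the assumption that it only needs to register a SimpleNamespace in sys.modules and restore the previous state afterwards, and it relies on CPython's import machinery accepting a non-module object there.

```python
import sys
import types
from contextlib import contextmanager


@contextmanager
def obj_as_module(mapping, name):
    """Temporarily expose mapping's keys as attributes of a fake module."""
    previous = sys.modules.get(name)
    sys.modules[name] = types.SimpleNamespace(**mapping)
    try:
        yield
    finally:
        # Restore whatever was registered under this name before, if anything.
        if previous is None:
            sys.modules.pop(name, None)
        else:
            sys.modules[name] = previous


mydict = {"a": 1, "b": 2}
with obj_as_module(mydict, "mydict_module"):
    from mydict_module import a, b
```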

Warning 1: as stated in the docs, this is not guaranteed to work on all Python implementations:
CPython implementation detail: This function relies on Python stack frame support
in the interpreter, which isn’t guaranteed to exist in all implementations
of Python. If running in an implementation without Python stack frame support
this function returns None.
Warning 2: this function does make the code shorter, but it probably contradicts the Python philosophy of being as explicit as you can. Moreover, it doesn't address the issues pointed out by John Christopher Jones in the comments, although you could make a similar function that works with attributes instead of keys. This is just a demonstration that you can do that if you really want to!
import inspect

def destructure(dict_):
    if not isinstance(dict_, dict):
        raise TypeError(f"{dict_} is not a dict")
    # the parent frame will contain the information about
    # the current line
    parent_frame = inspect.currentframe().f_back
    # so we extract that line (by default the code context
    # only contains the current line)
    (line,) = inspect.getframeinfo(parent_frame).code_context
    # "hello, key = destructure(my_dict)"
    # -> ("hello, key ", "=", " destructure(my_dict)")
    lvalues, _equals, _rvalue = line.strip().partition("=")
    # -> ["hello", "key"]
    keys = [s.strip() for s in lvalues.split(",") if s.strip()]
    if missing := [key for key in keys if key not in dict_]:
        raise KeyError(*missing)
    for key in keys:
        yield dict_[key]
In [5]: my_dict = {"hello": "world", "123": "456", "key": "value"}
In [6]: hello, key = destructure(my_dict)
In [7]: hello
Out[7]: 'world'
In [8]: key
Out[8]: 'value'
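This solution allows you to pick some of the keys, not all, like in JavaScript. It's also safe for user-provided dictionaries. It does rely on the source of the calling line being available and on the whole assignment fitting on that one line.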

With Python 3.10, you can do:
d = {"a": 1, "b": 2}
match d:
    case {"a": a, "b": b}:
        print(f"A is {a} and b is {b}")
but it adds two extra levels of indentation, and you still have to repeat the key names.

Look for other answers, as this doesn't guard against an unexpected key order in the dictionary. I will update this with a correct version sometime soon.
Try this:
data = {'a':'Apple', 'b':'Banana','c':'Carrot'}
keys = data.keys()
a,b,c = [data[k] for k in keys]
result:
a == 'Apple'
b == 'Banana'
c == 'Carrot'

Well, if you want these in a class you can always do this:
class AttributeDict(dict):
    def __init__(self, *args, **kwargs):
        super(AttributeDict, self).__init__(*args, **kwargs)
        self.__dict__.update(self)
d = AttributeDict(a=1, b=2)
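A short usage sketch of this class follows. It also shows one limitation of copying the items in __init__: keys added after construction are not visible as attributes. The variable names are illustrative.

```python
class AttributeDict(dict):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Copy the items present at construction time into the instance attributes.
        self.__dict__.update(self)


d = AttributeDict(a=1, b=2)
first = d.a                       # attribute access for an initial key

d["c"] = 3                        # added later, through the dict interface
has_c_attribute = hasattr(d, "c") # the attribute copy was not updated
```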

Based on @ShawnFumo's answer I came up with this:
def destruct(dict): return (t[1] for t in sorted(dict.items()))
d = {'b': 'Banana', 'c': 'Carrot', 'a': 'Apple' }
a, b, c = destruct(d)
(Notice the order of items in dict)

An old topic, but I found this to be a useful method:
data = {'a':'Apple', 'b':'Banana','c':'Carrot'}
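for key in data.keys():
    locals()[key] = data[key]
This method loops over every key in your dictionary, creates a variable with that name, and assigns it the value stored under that key. Note that it only works at module level, where locals() is the same dictionary as globals(); inside a function, writing to locals() does not create local variables.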
Testing:
print(a)
print(b)
print(c)
Output
Apple
Banana
Carrot

An easy and simple way to destruct dict in python:
params = {"a": 1, "b": 2}
a, b = [params[key] for key in ("a", "b")]
print(a, b)
# Output:
# 1 2

I don't know whether it's good style, but
locals().update(params)
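will do the trick at module level. You then have a, b and whatever else was in your params dict available as corresponding variables. Inside a function it does not work, because changes to the dictionary returned by locals() are not reflected in the function's local variables.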
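The behaviour of locals() inside a function can be checked with a small experiment. In this sketch, try_update and unpacked_value are hypothetical names chosen so that they do not clash with any existing global.

```python
def try_update(params):
    # Inside a function, updating the locals() dictionary does not create variables.
    locals().update(params)
    try:
        # The name is not a local of this function, so it is looked up as a
        # global and not found.
        return unpacked_value
    except NameError:
        return "NameError"


outcome = try_update({"unpacked_value": 1})
```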

Since dictionaries are guaranteed to keep their insertion order in Python >= 3.7, you can nowadays just do this, provided the dict contains exactly these keys, inserted in this order:
params = {'a': 1, 'b': 2}
a, b = params.values()
print(a)
print(b)
Output:
1
2
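Because this relies on position rather than on key names, it can be worth seeing how it behaves when the dictionary was built differently. A small sketch with illustrative variable names:

```python
# Same keys, different insertion order: the values follow insertion order,
# so the names end up bound to the "wrong" values.
params = {"b": 2, "a": 1}
a, b = params.values()
swapped = (a, b)

# An extra key makes the unpacking fail outright.
try:
    a, b = {"a": 1, "b": 2, "c": 3}.values()
    extra_key_raises = False
except ValueError:
    extra_key_raises = True

# Looking the values up by key does not depend on order or on extra keys.
a, b = (params[k] for k in ("a", "b"))
by_key = (a, b)
```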
