Comparing Python Dictionaries - python

I created two dictionaries. Each is based on a different query of the same database. Each dictionary has a key and four fields from the database. I want to find all rows of dict_x that are also in dict_y.
for row in dict_x:
    if dict_y.values() not in dict_x.values():
        del dict_x[row]
print 'Length dict_x', len(dict_x)
This returns the error
TypeError: 'type' object does not support item deletion

This works as long as the elements of each list are always in the same order.
dict_x = {'hi': ['hello', 'hi'], 'bye': ['good bye', 'bye']}
dict_y = {'hi': ['hello', 'hi']}
dict_z = dict()
for key, row in dict_x.items():
    if row in dict_y.values():
        dict_z[key] = row
print(dict_z)
If the elements won't be in the same order then you'll have to do this:
dict_x = {'hi': ['hi', 'hello'], 'bye': ['good bye', 'bye']}
dict_y = {'hi': ['hello', 'hi']}
dict_z = dict()
for x_key, x_row in dict_x.items():
    for y_key, y_row in dict_y.items():
        # Compare as sets so that element order does not matter.
        if set(x_row) == set(y_row):
            dict_z[x_key] = y_row
print(dict_z)
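If dict_y is large, the nested loop above compares every pair of rows. A minimal sketch of an alternative, assuming the row elements are hashable and sortable and that two rows match when they hold the same elements in any order: normalize each row once and use a set lookup. The helper name normalize is made up for this illustration.

```python
def normalize(row):
    """Return an order-insensitive, hashable form of a row (assumes sortable elements)."""
    return tuple(sorted(row))

dict_x = {'hi': ['hi', 'hello'], 'bye': ['good bye', 'bye']}
dict_y = {'hi': ['hello', 'hi']}

# Normalize every row of dict_y once so each later lookup is a set membership test.
rows_in_y = {normalize(row) for row in dict_y.values()}

# Keep only the rows of dict_x whose normalized form also appears in dict_y.
dict_z = {key: row for key, row in dict_x.items() if normalize(row) in rows_in_y}
print(dict_z)  # {'hi': ['hi', 'hello']}
```

Unlike the loop version, this keeps the row as it appears in dict_x rather than the one from dict_y.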

>>> dict_a = {'a':[1,1,2,3]}
>>> dict_b = {'b':[1,2]}
>>> for a_key, b_key in zip(dict_a.keys(), dict_b.keys()):
...     print [i for i in dict_a[a_key] if i in set(dict_b[b_key])]
...
[1, 1, 2]

The steps to solve the problem would be to
Invert the key value pairs of the dictionaries
Identify the common intersecting keys
Loop through the keys and check if their values match
The code could look something like this:
# Inverting assumes the values are hashable (e.g. tuples, not lists).
dict_x = {v: k for k, v in dict_x.items()}
dict_y = {v: k for k, v in dict_y.items()}
for key in dict_x.keys() & dict_y.keys():
    print(key, dict_x[key])
    print(key, dict_y[key])
Here is the dict comprehension equivalent in Python 3:
result_set = {key: dict_x[key] for key in dict_x.keys() & dict_y.keys() if dict_x[key] == dict_y[key]}
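To see what the comprehension keeps, here is a small sketch with made-up rows (the keys and field values are invented for illustration). It assumes both dictionaries use the same database key, so the comparison runs directly on the shared keys.

```python
# Hypothetical query results: key -> four fields from the database.
dict_x = {
    1: ('ann', 'NY', 30, 'a@x.org'),
    2: ('bob', 'LA', 41, 'b@x.org'),
    3: ('cy', 'SF', 25, 'c@x.org'),
}
dict_y = {
    2: ('bob', 'LA', 41, 'b@x.org'),
    3: ('cy', 'SF', 26, 'c@x.org'),   # same key, one field differs
    4: ('dee', 'TX', 52, 'd@x.org'),
}

# dict.keys() returns a set-like view, so & gives the keys present in both.
common_keys = dict_x.keys() & dict_y.keys()

# Keep a row only when both queries returned identical fields for that key.
result_set = {key: dict_x[key] for key in common_keys if dict_x[key] == dict_y[key]}
print(result_set)  # {2: ('bob', 'LA', 41, 'b@x.org')}
```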

This might help. It returns False if the dictionaries are equal and True if they are not. I know that's the other way around from what you might expect.
from typing import Any, Dict
def compare_dict(
    dict_1: Dict[Any, Any], dict_2: Dict[Any, Any]
):
    # Keys present in dict_2 but missing from dict_1.
    new_key = any([False if key in dict_1 else True for key in dict_2])
    # Keys present in dict_1 but missing from dict_2.
    delete_key = any(
        [False if key in dict_2 else True for key in dict_1]
    )
    if new_key or delete_key:
        return True
    else:
        values_mismatch_flag = any(
            [
                True if v != dict_1[k] else False
                for k, v in dict_2.items()
            ]
        )
        if values_mismatch_flag:
            return True
        return False
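For plain dictionaries, Python's built-in comparison already covers this check: two dicts compare equal when they have the same keys and each key maps to an equal value. A minimal sketch, where dicts_differ is a hypothetical name chosen for this example:

```python
from typing import Any, Dict

def dicts_differ(dict_1: Dict[Any, Any], dict_2: Dict[Any, Any]) -> bool:
    """Return True if the dictionaries differ in keys or values, False if they are equal."""
    return dict_1 != dict_2

print(dicts_differ({'a': 1}, {'a': 1}))          # False
print(dicts_differ({'a': 1}, {'a': 2}))          # True
print(dicts_differ({'a': 1}, {'a': 1, 'b': 2}))  # True
```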

Related

How to search all list elements?

my_dict={'reportName': 'sale_order', 'extract_extractType': 'sales', 'extract_stages_load_output_tableName': 'extract_table_name',
'extract_stages_load_clientMarket_tableFilters_0': 'client_mkt_version_id|1',
'extract_stages_load_clientMarket_tableFilters_1': 'client_mkt_short_name|HCV_NOVEL',
'extract_stages_load_clientMarket_tableFilters_2': 'client_Id|161'}
Case 1:
kv=['extract', 'output']
If all list elements are present in a dictionary key, I want to display that key and its value as a dict.
Output: 'extract_stages_load_output_tableName': 'extract_table_name'
Case 2:
If the list elements refer to a tableFilters entry:
kv=['extract','clientMarket','tableFilters','client_mkt_short_name']
Output: 'extract_stages_load_clientMarket_tableFilters_1': 'HCV_NOVEL'
The same applies to the remaining filters:
{'extract_stages_load_clientMarket_tableFilters_0': 1,
'extract_stages_load_clientMarket_tableFilters_1': 'HCV_NOVEL',
'extract_stages_load_clientMarket_tableFilters_2': 161}
def get_key(key, kv):
    return len(set(kv) - set(key.split('_'))) == 0
{k: v for k, v in my_dict.items() if get_key(k, kv)}
{'extract_stages_load_output_tableName': 'extract_table_name'}
Edit: a version that matches substrings of the key instead of whole underscore-separated words:
def get_key(key, kv):
    return all([_kv in key for _kv in kv])
{k: v for k, v in my_dict.items() if get_key(k, kv)}
{'extract_stages_load_output_tableName': 'extract_table_name'}
In case you are not searching for specific words in underscore-separated strings:
my_dict={'reportName': 'sale_order', 'extract_extractType': 'sales',
'extract_stages_load_output_tableName': 'extract_table_name', 'numeric_value': 10}
kv=['extract', 'output']
for key, value in my_dict.items():
    if isinstance(value, str):  # in case value is not a string
        kv_string = key + '|' + value
        contains = True
        for item in kv:
            if kv_string.find(item) == -1:
                contains = False
        if contains:
            print(f"{key}: {value}")
    else:
        print("-w: non-string value in dict")
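None of the snippets above covers case 2 from the question, where the last search word matches the name part of a "name|value" filter string. A rough sketch of one way to handle both cases, assuming filter values always use a single '|' separator and that search words match whole underscore-separated tokens of the key. The function name find_matches is hypothetical, and the filter value is returned as a string without any type conversion.

```python
my_dict = {
    'reportName': 'sale_order',
    'extract_extractType': 'sales',
    'extract_stages_load_output_tableName': 'extract_table_name',
    'extract_stages_load_clientMarket_tableFilters_0': 'client_mkt_version_id|1',
    'extract_stages_load_clientMarket_tableFilters_1': 'client_mkt_short_name|HCV_NOVEL',
    'extract_stages_load_clientMarket_tableFilters_2': 'client_Id|161',
}

def find_matches(data, kv):
    """Return entries whose key tokens (plus, optionally, filter name) cover every word in kv."""
    result = {}
    for key, value in data.items():
        key_tokens = set(key.split('_'))
        # Split "name|value" filter strings; sep is '' when there is no '|'.
        name, sep, payload = str(value).partition('|')
        # Words that the key alone does not account for.
        remaining = [word for word in kv if word not in key_tokens]
        if not remaining:
            result[key] = value      # case 1: every word is a key token
        elif sep and remaining == [name]:
            result[key] = payload    # case 2: the leftover word names the filter
    return result

print(find_matches(my_dict, ['extract', 'output']))
print(find_matches(my_dict, ['extract', 'clientMarket', 'tableFilters', 'client_mkt_short_name']))
```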

Get specific key of a nested iterable and check if its value exists in a list

I am trying to access a specific key in a nested dictionary, then match its value to a string in a list. If a string in the list contains the string in the dictionary value, I want to override the dictionary value with the list value. Below is an example.
my_list = ['string1~', 'string2~', 'string3~', 'string4~', 'string5~', 'string6~']
my_iterable = {'A':'xyz',
'B':'string6',
'C':[{'B':'string4', 'D':'123'}],
'E':[{'F':'321', 'B':'string1'}],
'G':'jkl'
}
The key I'm looking for is B, the objective is to override string6 with string6~, string4 with string4~, and so on for all B keys found in the my_iterable.
I have written a function to compute the Levenshtein distance between two strings, but I am struggling to write an efficient way to override the values of the keys.
def find_and_replace(key, dictionary, original_list):
    for k, v in dictionary.items():
        if k == key:
            # function to check if original_list item contains v
            yield v
        elif isinstance(v, dict):
            for result in find_and_replace(key, v, original_list):
                yield result
        elif isinstance(v, list):
            for d in v:
                if isinstance(d, dict):
                    for result in find_and_replace(key, d, original_list):
                        yield result
If I call
updated_dict = find_and_replace('B', my_iterable, my_list)
I want updated_dict to return the below:
{'A':'xyz',
'B':'string6~',
'C':[{'B':'string4~', 'D':'123'}],
'E':[{'F':'321', 'B':'string1~'}],
'G':'jkl'
}
Is this the right approach to the most efficient solution, and how can I modify it to return a dictionary with the updated values for B?
You can use the code below. I have assumed that the structure of the input dict stays the same throughout execution.
# Input List
my_list = ['string1~', 'string2~', 'string3~', 'string4~', 'string5~', 'string6~']
# Input Dict
# Removed duplicate key "B" from the dict
my_iterable = {'A':'xyz',
'B':'string6',
'C':[{'B':'string4', 'D':'123'}],
'E':[{'F':'321', 'B':'string1'}],
'G':'jkl',
}
# setting search key
search_key = "B"
# Main code
for i, v in my_iterable.items():
    if i == search_key:
        if not isinstance(v,list):
            search_in_list = [item for item in my_list if v in item]
            if search_in_list:
                my_iterable[i] = search_in_list[0]
    else:
        try:
            for j, k in v[0].items():
                if j == search_key:
                    search_in_list = [l for l in my_list if k in l]
                    if search_in_list:
                        v[0][j] = search_in_list[0]
        except (AttributeError, IndexError, KeyError, TypeError):
            # v is not a list whose first element is a dict; skip it
            continue
# print output
print (my_iterable)
# Result -> {'A': 'xyz', 'B': 'string6~', 'C': [{'B': 'string4~', 'D': '123'}], 'E': [{'F': '321', 'B': 'string1~'}], 'G': 'jkl'}
The above could be optimized further with a list comprehension or by wrapping it in a function.
I hope this helps!
In some cases, if your nesting is fairly complex, you can treat the dictionary as a JSON string and do all sorts of replacements. It's probably not what people would call very Pythonic, but it gives you a little more flexibility.
import re, json
my_list = ['string1~', 'string2~', 'string3~', 'string4~', 'string5~', 'string6~']
my_iterable = {'A':'xyz',
'B':'string6',
'C':[{'B':'string4', 'D':'123'}],
'E':[{'F':'321', 'B':'string1'}],
'G':'jkl'}
json_str = json.dumps(my_iterable, ensure_ascii=False)
for val in my_list:
    json_str = re.sub(re.compile(f"""("[B]":\\W?")({val[:-1]})(")"""), r"\1" + val + r"\3", json_str)
my_iterable = json.loads(json_str)
print(my_iterable)
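The question asked for a function that returns the updated dictionary. A minimal recursive sketch of that, assuming "match" means plain substring containment (as in the answers above, not Levenshtein distance) and that only dicts and lists need to be traversed. The name replace_values is hypothetical. It builds a new structure and leaves the input untouched.

```python
def replace_values(node, key, candidates):
    """Return a copy of node in which every string stored under `key` is replaced
    by the first candidate that contains it (unchanged if no candidate does)."""
    if isinstance(node, dict):
        updated = {}
        for k, v in node.items():
            if k == key and isinstance(v, str):
                # next() with a default keeps the original value when nothing matches.
                updated[k] = next((c for c in candidates if v in c), v)
            else:
                updated[k] = replace_values(v, key, candidates)
        return updated
    if isinstance(node, list):
        return [replace_values(item, key, candidates) for item in node]
    return node  # scalars are returned as-is

my_list = ['string1~', 'string2~', 'string3~', 'string4~', 'string5~', 'string6~']
my_iterable = {'A': 'xyz',
               'B': 'string6',
               'C': [{'B': 'string4', 'D': '123'}],
               'E': [{'F': '321', 'B': 'string1'}],
               'G': 'jkl'}

updated_dict = replace_values(my_iterable, 'B', my_list)
print(updated_dict)
```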

Add Key, Value pair to new dict

I have an existing set of key/value pairs in a dictionary called total_list. I want to check whether the length of each key == 1 in total_list and, if so, add that key and its value to a new dictionary. This is the code that I've come up with.
from collections import defaultdict
total_list = {104370544: [31203.7, 1234], 106813775: [187500.0], 106842625: [60349.8]}
diff_so = defaultdict(list)
for key, val in total_list:
    if len(total_list[key]) == 1:
        diff_so[key].append[val]
        total_list.pop[key]
But I keep getting the error
"cannot unpack non-iterable int object".
Is there any way to fix this code so that it runs properly?
Assuming that by a key of length 1 the OP means a string of one character,
you can do this:
total_list = [{'abc':"1", 'bg':"7", 'a':"7"}]
new_dict = {}
for i in total_list:
    for k,v in i.items():
        if len(k) == 1:
            new_dict[str(k)] = v
        else:
            pass
print(new_dict)
Output:
{'a': '7'}
After edit:
total_list = {104370544: [31203.7, 1234], 106813775: [187500.0], 106842625: [60349.8]}
new_dict = {}
for k,v in total_list.items():
    if len(v) == 1:
        new_dict[k] = v
    else:
        pass
print(new_dict)
Output:
{106813775: [187500.0], 106842625: [60349.8]}
You just need a dictionary comprehension
diff_so = {k: v for k, v in total_list.items() if len(v) == 1}
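The code in the question also tried to pop the moved entries out of total_list. Removing keys from a dict while iterating over it raises a RuntimeError, so one option is to build both dictionaries with comprehensions instead. A small sketch, assuming the goal is to split total_list into single-value entries and the rest:

```python
total_list = {104370544: [31203.7, 1234], 106813775: [187500.0], 106842625: [60349.8]}

# Entries whose value list holds exactly one item.
diff_so = {k: v for k, v in total_list.items() if len(v) == 1}

# Everything else stays in total_list (rebuilt rather than mutated in place).
total_list = {k: v for k, v in total_list.items() if len(v) != 1}

print(diff_so)     # {106813775: [187500.0], 106842625: [60349.8]}
print(total_list)  # {104370544: [31203.7, 1234]}
```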

Unflatten nested Python dictionary

What would be the cleanest way to convert this
{"a.b.c[0].key1": 1, "a.b.c[1].key2": 2, "a.b.c[3].key3": 3}
Into this
{"a": {"b": {"c": [{"key1": 1}, {"key2": 2}, None, {"key3": 3}]}}}
The dictionary keys may be anything.
The length of the list may vary.
The depth of the dictionary may vary.
If there are missing values in the list, the value must be None.
If values are repeated, the last one declared is the one that counts.
I came up with the following working example.
I was wondering whether there is a better solution.
import re
def unflatten(data):
    if type(data) != dict:
        return None
    regex = r'\.?([^.\[\]]+)|\[(\d+)\]'
    result_holder = {}
    for key,value in data.items():
        cur = result_holder
        prop = ""
        results = re.findall(regex, key)
        for result in results:
            prop = int(prop) if type(cur) == list else prop
            if (type(cur) == dict and cur.get(prop)) or (type(cur) == list and len(cur) > prop):
                cur = cur[prop]
            else:
                if type(cur) == list:
                    if type(prop) is int:
                        while len(cur) <= prop:
                            cur.append(None)
                cur[prop] = list() if result[1] else dict()
                cur = cur[prop]
            prop = result[1] or result[0]
        prop = int(prop) if type(cur) == list else prop
        if type(cur) == list:
            if type(prop) is int:
                while len(cur) <= prop:
                    cur.append(None)
                print(len(cur), prop)  # debug output
        cur[prop] = data[key]
    return result_holder[""] or result_holder
You can use recursion:
d = {"a.b.c[0].key1": 1, "a.b.c[1].key2": 2, "a.b.c[3].key3": 3}
from itertools import groupby
import re
def group_data(data):
    new_results = [[a, [i[1:] for i in b]] for a, b in groupby(sorted(data, key=lambda x:x[0]), key=lambda x:x[0])]
    arrays = [[a, list(b)] for a, b in groupby(sorted(new_results, key=lambda x:x[0].endswith(']')), key=lambda x:x[0].endswith(']'))]
    final_result = {}
    for a, b in arrays:
        if a:
            _chars = [[c, list(d)] for c, d in groupby(sorted(b, key=lambda x:re.findall(r'^\w+', x[0])[0]), key=lambda x:re.findall(r'^\w+', x[0])[0])]
            _key = _chars[0][0]
            final_result[_key] = [[int(re.findall(r'\d+', c)[0]), d[0]] for c, d in _chars[0][-1]]
            _d = dict(final_result[_key])
            final_result[_key] = [group_data([_d[i]]) if i in _d else None for i in range(min(_d), max(_d)+1)]
        else:
            for c, d in b:
                final_result[c] = group_data(d) if all(len(i) >1 for i in d) else d[0][0]
    return final_result
print(group_data([[*a.split('.'), b] for a, b in d.items()]))
Output:
{'a': {'b': {'c': [{'key1': 1}, {'key2': 2}, None, {'key3': 3}]}}}
A recursive function would probably be much easier to work with and more elegant.
This is partly pseudocode, but it may help you get thinking.
I haven't tested it, but I'm pretty sure it should work so long as you don't have any lists that are directly elements of other lists. So you can have dicts of dicts, dicts of lists, and lists of dicts, but not lists of lists.
def unflatten(data):
    resultDict = {}
    for e in data:
        insertElement(e.split("."), data[e], resultDict)
    return resultDict
def insertElement(path, value, subDict):
    if (path[0] is of the form "foo[n]"):
        key, index = parseListNotation(path[0])
        if (key not in subDict):
            subDict[key] = []
        if (index >= subDict[key].len()):
            subDict[key].expandUntilThisSize(index)
        if (subDict[key][index] == None):
            subDict[key][index] = {}
        # recurse on the rest of the path
        subDict[key][index] = insertElement(path[1:], value, subDict[key][index])
    else:
        key = path[0]
        if (path.length == 1):
            subDict[key] = value
        else:
            if (key not in subDict):
                subDict[key] = {}
            # recurse on the rest of the path
            subDict[key] = insertElement(path[1:], value, subDict[key])
    return subDict
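A runnable sketch of the pseudocode above might look like the following. It is one possible reading, not a tested drop-in: it assumes keys never contain '.', that list segments always look like name[n], and (as stated above) that there are no lists directly inside lists. The names LIST_SEGMENT and insert_element are made up for this example.

```python
import re

# Matches a path segment such as "c[3]" and captures the key and the index.
LIST_SEGMENT = re.compile(r'^(?P<key>[^\[\]]+)\[(?P<index>\d+)\]$')

def insert_element(path, value, sub_dict):
    """Insert value into sub_dict by following path, a list of key segments."""
    head, rest = path[0], path[1:]
    match = LIST_SEGMENT.match(head)
    if match:
        key, index = match.group('key'), int(match.group('index'))
        items = sub_dict.setdefault(key, [])
        # Pad with None so that items[index] exists (no-op if already long enough).
        items.extend([None] * (index + 1 - len(items)))
        if not rest:
            items[index] = value
        else:
            if items[index] is None:
                items[index] = {}
            insert_element(rest, value, items[index])
    elif not rest:
        sub_dict[head] = value  # last segment: store the value
    else:
        insert_element(rest, value, sub_dict.setdefault(head, {}))
    return sub_dict

def unflatten(data):
    result = {}
    for flat_key, value in data.items():
        insert_element(flat_key.split('.'), value, result)
    return result

d = {"a.b.c[0].key1": 1, "a.b.c[1].key2": 2, "a.b.c[3].key3": 3}
print(unflatten(d))
# {'a': {'b': {'c': [{'key1': 1}, {'key2': 2}, None, {'key3': 3}]}}}
```

Because later keys overwrite earlier ones at the same path, a repeated flat key resolves to the last value declared.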
The idea is to build the dictionary from the inside out. E.g.:
For the first element, first create the dictionary
{key1: 1}
Then assign that to an element of a new dictionary
{c: [None]}, c[0] = {key1: 1}
Then assign that dictionary to the next element b in a new dict, like
{b: {c: [{key1: 1}]}}
Assign that result to a in a new dict
{a: {b: {c: [{key1: 1}]}}}
And lastly return that full dictionary, to use to add the next value.
If you're not familiar with recursive functions, I'd recommend practicing with some simpler ones, and then writing one that does what you want but for input that's only dictionaries.
General path of a dictionary-only recursive function:
Given a path that's a list of attributes of nested dictionaries ([a, b, c, key1] in your example, if c weren't a list):
Start(path, value):
    If there's only one item in your path, build a dictionary setting that key to your value, and you're done.
    If there's more than one, build a dictionary using the first element as a key, and set the value to the output of Start(rest of the path, value).
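As a concrete version of that dictionary-only recursion, here is a minimal sketch. The function name build_nested is hypothetical, and it assumes the path is a non-empty list of keys.

```python
def build_nested(path, value):
    """Build nested dictionaries from a non-empty list of keys, innermost value last."""
    if len(path) == 1:
        # Base case: a single key maps straight to the value.
        return {path[0]: value}
    # Recursive case: the first key maps to whatever the rest of the path builds.
    return {path[0]: build_nested(path[1:], value)}

print(build_nested(['a', 'b', 'c', 'key1'], 1))
# {'a': {'b': {'c': {'key1': 1}}}}
```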
Here is another variation on how to achieve the desired results. It's not as pretty as I would like, so I expect there is a more elegant way. It probably uses more regex than is really necessary, and the break used to handle the final key suggests that the loop logic could be improved to eliminate that sort of manual intervention. That said, hopefully this helps you refine your approach.
import re
def unflatten(data):
    results = {}
    list_rgx = re.compile(r'[^\[\]]+\[\d+\]')
    idx_rgx = re.compile(r'\d+(?=\])')
    key_rgx = re.compile(r'[^\[]+')
    for text, value in data.items():
        cur = results
        keys = text.split('.')
        idx = None
        for i, key in enumerate(keys):
            stop = (i == len(keys) - 1)
            if idx is not None:
                val = value if stop else {}
                if len(cur) > idx:
                    cur[idx] = {key: val}
                else:
                    for x in range(len(cur), idx + 1):
                        cur.append({key: val}) if x == idx else cur.append(None)
                if stop:
                    break
                else:
                    cur[idx].get(key)
                    idx = None
            if stop:
                cur[key] = value
                break
            elif re.match(list_rgx, key):
                idx = int(re.search(idx_rgx, key).group())
                key = re.search(key_rgx, key).group()
                cur.setdefault(key, [])
            else:
                cur.setdefault(key, {})
            cur = cur.get(key)
    print(results)
Output:
d = {"a.b.c[0].key1": 1, "a.b.c[1].key2": 2, "a.b.c[3].key3": 3}
unflatten(d)
# {'a': {'b': {'c': [{'key1': 1}, {'key2': 2}, None, {'key3': 3}]}}}

Find a string as value in a dictionary of dictionaries and return its key

I need to write a function that does the following:
Find a string as a value in a dictionary of dictionaries and return its key
(the 1st key if found in the main dictionary, the 2nd key if found in a sub-dictionary).
Source Code
Here is the function I tried to implement. It does not work correctly: I can't find out how to convert the list into a dictionary, and the following error occurs:
for v, k in l:
ValueError: need more than 1 value to unpack
def GetKeyFromDictByValue(self, dictionary, value_to_find):
    """"""
    key_list = [k for (k, v) in dictionary.items() if v == value_to_find]
    if key_list.__len__() is not 0:
        return key_list[0]
    else:
        l = [s for s in dictionary.values() if ":" in str(s)]
        d = defaultdict(list)
        for v, k in l:
            d[k].append(v)
        print d
dict = {'a': {'a1': 'a2'}, "aa": "aa1", 'aaa': {'aaa1': 'aaa2'}}
print GetKeyFromDictByValue(dict, "a2")
I must do this on Python 2.5
You created a list of only the dictionary values, but then try to loop over it as if it already contains both keys and values of those dictionaries. Perhaps you wanted to loop over each matched dictionary?
l = [v for v in dictionary.values() if ":" in str(v)]
d = defaultdict(list)
for subdict in l:
    for k, v in subdict.items():
        pass  # handle each key/value pair of the nested dictionary here
I'd instead flatten the structure:
def flatten(dictionary):
    for key, value in dictionary.iteritems():
        if isinstance(value, dict):
            # recurse
            for res in flatten(value):
                yield res
        else:
            yield key, value
then just search:
def GetKeyFromDictByValue(self, dictionary, value_to_find):
    for key, value in flatten(dictionary):
        if value == value_to_find:
            return key
Demo:
>>> sample = {'a': {'a1': 'a2'}, "aa": "aa1", 'aaa': {'aaa1': 'aaa2'}}
>>> GetKeyFromDictByValue(None, sample, "a2")
'a1'
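The answer above targets Python 2 (iteritems), which matches the Python 2.5 requirement in the question. For readers on Python 3, a sketch of the same flatten-then-search idea could look like this. The name get_key_by_value is made up for the example, and the explicit None for "not found" is an assumption about the desired behavior.

```python
def flatten(dictionary):
    """Yield (key, value) pairs for every non-dict value, recursing into nested dicts."""
    for key, value in dictionary.items():
        if isinstance(value, dict):
            yield from flatten(value)
        else:
            yield key, value

def get_key_by_value(dictionary, value_to_find):
    """Return the innermost key whose value equals value_to_find, or None if absent."""
    for key, value in flatten(dictionary):
        if value == value_to_find:
            return key
    return None

sample = {'a': {'a1': 'a2'}, "aa": "aa1", 'aaa': {'aaa1': 'aaa2'}}
print(get_key_by_value(sample, "a2"))   # a1
print(get_key_by_value(sample, "aa1"))  # aa
```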
