Search dictionary keys with regex


I have defined a dictionary with string keys:
my_dict = {'dummy': 0, 'K1::foo(bar::z(x,u))': 1, 'K2::foo()': 2}
I want to search the keys for a pattern (not an exact match), so that if 'foo' in my_dict: would evaluate to true.
yax = 'foo'
if yax in my_dict:
    # Should definitely go here
    value = my_dict[yax]
    print(value)
else:
    # Just for error checking that the given name doesn't exist in dictionary
    print("Given value does not exist")
But the above code goes to the else branch.
In the example, foo appears in two keys. That doesn't matter; the first match is OK. As another example, if I search for bar, the if statement should be true, too.

First, know that it is not a good approach to have to search through dictionary keys. The purpose of a dictionary is to enable O(1) access to the values using hashed keys.
That said, you can loop over the keys.
Searching any substring:
d = {'dummy': 0, 'K1::foo(bar::z(x,u))': 1, 'K2::foo()': 2}
[k for k in d if 'foo' in k]
Searching an independent word:
import re
[k for k in d if re.search(r'\bfoo\b', k)]
output: ['K1::foo(bar::z(x,u))', 'K2::foo()']
As a dictionary comprehension:
{k:v for k,v in d.items() if 'foo' in k}
output: {'K1::foo(bar::z(x,u))': 1, 'K2::foo()': 2}
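Since the question only needs the first matching key, the loop can stop early instead of building a full list. The following is a minimal sketch of that idea; the helper name first_match is made up for illustration, and it assumes Python 3.7+ so that dictionary iteration follows insertion order.

```python
import re

def first_match(d, pattern):
    """Return (key, value) for the first key matching pattern, or None.

    first_match is a hypothetical helper name, not part of any library.
    """
    regex = re.compile(pattern)
    # next() stops at the first hit; the second argument is the fallback
    return next(((k, v) for k, v in d.items() if regex.search(k)), None)

d = {'dummy': 0, 'K1::foo(bar::z(x,u))': 1, 'K2::foo()': 2}

hit = first_match(d, r'\bfoo\b')
if hit is not None:
    key, value = hit
    print(value)
else:
    print("Given value does not exist")
```

Returning None for "no match" keeps the caller's if/else structure from the question intact.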

Related

Get specific key of a nested iterable and check if its value exists in a list

I am trying to access a specific key in a nested dictionary, then match its value to a string in a list. If a string in the list contains the dictionary value, I want to override the dictionary value with the list value. Below is an example.
my_list = ['string1~', 'string2~', 'string3~', 'string4~', 'string5~', 'string6~']
my_iterable = {'A':'xyz',
'B':'string6',
'C':[{'B':'string4', 'D':'123'}],
'E':[{'F':'321', 'B':'string1'}],
'G':'jkl'
}
The key I'm looking for is B, the objective is to override string6 with string6~, string4 with string4~, and so on for all B keys found in the my_iterable.
I have written a function to compute the Levenshtein distance between two strings, but I am struggling to write an efficient way to override the values of the keys.
def find_and_replace(key, dictionary, original_list):
    for k, v in dictionary.items():
        if k == key:
            # function to check if original_list item contains v
            yield v
        elif isinstance(v, dict):
            for result in find_and_replace(key, v, original_list):
                yield result
        elif isinstance(v, list):
            for d in v:
                if isinstance(d, dict):
                    for result in find_and_replace(key, d, original_list):
                        yield result
If I call
updated_dict = find_and_replace('B', my_iterable, my_list)
I want updated_dict to return the below:
{'A':'xyz',
'B':'string6~',
'C':[{'B':'string4~', 'D':'123'}],
'E':[{'F':'321', 'B':'string1~'}],
'G':'jkl'
}
Is this the right approach to the most efficient solution, and how can I modify it to return a dictionary with the updated values for B?

You can use the code below. I have assumed that the structure of the input dict stays the same throughout: nested values are lists whose first element is a dict.
# Input List
my_list = ['string1~', 'string2~', 'string3~', 'string4~', 'string5~', 'string6~']
# Input Dict
# Removed duplicate key "B" from the dict
my_iterable = {'A':'xyz',
'B':'string6',
'C':[{'B':'string4', 'D':'123'}],
'E':[{'F':'321', 'B':'string1'}],
'G':'jkl',
}
# setting search key
search_key = "B"
# Main code
for i, v in my_iterable.items():
    if i == search_key:
        if not isinstance(v, list):
            search_in_list = [s for s in my_list if v in s]
            if search_in_list:
                my_iterable[i] = search_in_list[0]
    else:
        try:
            for j, k in v[0].items():
                if j == search_key:
                    search_in_list = [s for s in my_list if k in s]
                    if search_in_list:
                        v[0][j] = search_in_list[0]
        except (AttributeError, IndexError, KeyError, TypeError):
            # Value is not a non-empty list of dicts; skip it
            continue
# print output
print (my_iterable)
# Result -> {'A': 'xyz', 'B': 'string6~', 'C': [{'B': 'string4~', 'D': '123'}], 'E': [{'F': '321', 'B': 'string1~'}], 'G': 'jkl'}
The above could be tidied up further with list comprehensions or by wrapping it in a function.
I hope this helps!
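If the nesting depth is not fixed, a recursive walk is one way to generalise the loop above. This is only a sketch under stated assumptions: the function name replace_values is invented for illustration, the structure is assumed to consist of dicts, lists and strings, and the data is updated in place.

```python
def replace_values(obj, key, candidates):
    """Recursively replace obj[key] with the first candidate containing it.

    replace_values is a hypothetical helper; it mutates obj in place
    and also returns it for convenience.
    """
    if isinstance(obj, dict):
        for k, v in obj.items():
            if k == key and isinstance(v, str):
                # First list entry that contains the current value, if any
                match = next((c for c in candidates if v in c), None)
                if match is not None:
                    obj[k] = match  # same key, so the dict size is unchanged
            else:
                replace_values(v, key, candidates)
    elif isinstance(obj, list):
        for item in obj:
            replace_values(item, key, candidates)
    return obj

my_list = ['string1~', 'string2~', 'string3~', 'string4~', 'string5~', 'string6~']
my_iterable = {'A': 'xyz',
               'B': 'string6',
               'C': [{'B': 'string4', 'D': '123'}],
               'E': [{'F': '321', 'B': 'string1'}],
               'G': 'jkl'}

updated_dict = replace_values(my_iterable, 'B', my_list)
print(updated_dict)
```

Unlike the generator in the question, this returns the whole updated dictionary rather than yielding the matched values.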

In some cases, if your nesting is fairly complex, you can treat the dictionary as a JSON string and do all sorts of replacements on it. It's probably not what people would call very pythonic, but it gives you a little more flexibility.
import re, json
my_list = ['string1~', 'string2~', 'string3~', 'string4~', 'string5~', 'string6~']
my_iterable = {'A':'xyz',
'B':'string6',
'C':[{'B':'string4', 'D':'123'}],
'E':[{'F':'321', 'B':'string1'}],
'G':'jkl'}
json_str = json.dumps(my_iterable, ensure_ascii=False)
for val in my_list:
    json_str = re.sub(re.compile(f"""("[B]":\\W?")({val[:-1]})(")"""), r"\1" + val + r"\3", json_str)
my_iterable = json.loads(json_str)
print(my_iterable)

Merge two dictionaries' values by keys

Hi I want to merge two dictionaries' values if the keys are the same.
DIC_01
{'A': ['Zero'],
'B': ['Zero'],
'C': ['Zero'],
'D': ['Zero']}
DIC_02
{'A': [2338.099365234375,
-3633.070068359375,
-73.45938873291016],
'D':[2839.291015625,
-2248.350341796875,
1557.59423828125]}
Ideal output
{'A': [[2338.099365234375,
-3633.070068359375,
-73.45938873291016],['Zero']],
'D': [[2839.291015625,
-2248.350341796875,
1557.59423828125],['Zero']]}
Output for the Keys that cannot be found
{'B': ['Zero'],'C': ['Zero']}
I tried
NO_MATCH={}
for k in DIC_01.keys():
    DOC={}
    for k2 in DIC_02.keys():
        if k == k2:
            DOC = k.values().update(k2.values())
        else:
            NO_MATCH.update(DIC_01)
There is nothing in DOC and all the dictionary elements end up in NO_MATCH, with no error message. I don't know where it goes wrong, and I think there must be a better way to do this.
Thank you!

Edited: You can declare two separate dictionaries and iterate through the keys of both input dictionaries. On each iteration, check whether the key exists in both DIC_01 and DIC_02; if it does, concatenate the two corresponding lists.
match, no_match = {}, {}
for i in {**DIC_01,**DIC_02}.keys():
    if i in DIC_01 and i in DIC_02:
        match[i] = DIC_01[i] + DIC_02[i]
    else:
        no_match[i] = DIC_01.get(i,[]) + DIC_02.get(i,[])

It's not particularly fancy, but this should do what you're looking for:
def merge(a, b):
    out = {}
    for key in a.keys() | b.keys():
        if key in a and key in b:
            out[key] = [a[key], b[key]]
        elif key in a:
            out[key] = a[key]
        else:
            out[key] = b[key]
    return out
where a and b are dicts. The | takes the union of the two key sets.

As for why your code goes wrong:
NO_MATCH={}
for k in DIC_01.keys():
    DOC={} # (1)
    for k2 in DIC_02.keys(): # (2)
        if k == k2:
            DOC = k.values().update(k2.values()) # (3)
        else:
            NO_MATCH.update(DIC_01) # (4)
(1) Not there! Everything you define inside a loop will be redefined every time the loop goes around.
(2) This goes to the else block even if there's a matched key. For instance, in your case, it compares A in DIC_01 with A in DIC_02: "ok, A matched". BUT then it proceeds to compare A in DIC_01 with D in DIC_02: "ok, A not found in DIC_02, not matched, add to NO_MATCH", which is wrong, because A IS a matched key and A IS in DIC_02.
(3) Not sure how you didn't get an error here; k and k2 are keys, not dictionaries, so calling .values() on them looks very wrong.
(4) This line adds the entire DIC_01 to NO_MATCH, which is wrong!
FIX:
MATCH = {}
NO_MATCH = {}
# This goes through all keys in DIC_01. If a key is also found in DIC_02,
# it's a "matched" key so it adds that key to the MATCH variable. If it's
# not in DIC_02, it's a "no matched" key -> add key to NO_MATCH variable.
for k in DIC_01.keys():
    if k in DIC_02.keys():
        MATCH[k] = [DIC_01[k], DIC_02[k]]
    else:
        NO_MATCH[k] = DIC_01[k]
# BUT...We are still missing the keys that are only in DIC_02. So we need
# another loop
for k in DIC_02.keys():
    if k not in DIC_01.keys():
        NO_MATCH[k] = DIC_02[k]
# This is the same as the loop above, but only the "not matched" branch is needed.
BETTER WAY
Some dictionary comprehensions would keep things clean.
MATCH = {key:[DIC_01[key], DIC_02[key]] for key in DIC_01 if key in DIC_02}
unmatch_1 = {key:DIC_01[key] for key in DIC_01 if key not in DIC_02}
unmatch_2 = {key:DIC_02[key] for key in DIC_02 if key not in DIC_01}
NO_MATCH = {**unmatch_1, **unmatch_2}
EXPLANATION:
MATCH = {key:[DIC_01[key], DIC_02[key]] for key in DIC_01 if key in DIC_02}
In plain English: create a new dictionary called MATCH. For every key in DIC_01, if the key is also in DIC_02, create the same key in MATCH and assign it a list holding both values of that key, from DIC_01 and DIC_02.
unmatch_1 = {key:DIC_01[key] for key in DIC_01 if key not in DIC_02}
For every key in DIC_01 that is not in DIC_02, create the key and assign the associated value from DIC_01.
unmatch_2 = {key:DIC_02[key] for key in DIC_02 if key not in DIC_01}
For every key in DIC_02 that is not in DIC_01, create the key and assign the associated value from DIC_02.
NO_MATCH = {**unmatch_1, **unmatch_2}
This is a compact way of merging two dictionaries (Python 3.5 and up only).
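To check the approach against the data from the question, here is a small self-contained sketch. The function name split_by_match is made up for illustration; the dictionaries are passed as (DIC_02, DIC_01) only so that the numeric list comes first, as in the ideal output shown in the question.

```python
def split_by_match(first, second):
    """Return (match, no_match) for two dicts.

    split_by_match is a hypothetical helper: keys present in both dicts
    map to [first_value, second_value]; all other keys keep their value.
    """
    match = {k: [first[k], second[k]] for k in first if k in second}
    # Merge both dicts, then drop every key that was already matched
    no_match = {k: v for k, v in {**first, **second}.items() if k not in match}
    return match, no_match

DIC_01 = {'A': ['Zero'], 'B': ['Zero'], 'C': ['Zero'], 'D': ['Zero']}
DIC_02 = {'A': [2338.099365234375, -3633.070068359375, -73.45938873291016],
          'D': [2839.291015625, -2248.350341796875, 1557.59423828125]}

match, no_match = split_by_match(DIC_02, DIC_01)
print(match)
print(no_match)  # {'B': ['Zero'], 'C': ['Zero']}
```

Swapping the argument order changes only the order of the two values inside each matched list.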

This looks like a great use for ChainMap:
>>> from collections import ChainMap
>>> a={'A': ['Zero'],
... 'B': ['Zero'],
... 'C': ['Zero'],
... 'D': ['Zero']}
>>> b={'A': [2338.099365234375,
... -3633.070068359375,
... -73.45938873291016],
... 'D':[2839.291015625,
... -2248.350341796875,
... 1557.59423828125]}
>>> merged = ChainMap(b, a)
>>> merged['A']
[2338.099365234375, -3633.070068359375, -73.45938873291016]
>>> merged['C']
['Zero']
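Key precedence follows the order in which the dictionaries are passed to ChainMap, and a lookup returns only the first value found rather than combining both. So if you can't control that order, or if the ['Zero'] placeholders are spread across both dictionaries, this approach won't help.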

If dict2 value = dict1 key, replace entire dict2 value with dict1 value

I have two dictionaries. In both dictionaries, the value of each key is a single list. If any element in any list in dictionary 2 is equal to a key of dictionary 1, I want to replace that element with the first element in that dictionary 1 list.
In other words, I have:
dict1 = {'IDa':['newA', 'x'], 'IDb':['newB', 'x']}
dict2 = {1:['IDa', 'IDb']}
and I want:
dict2 = {1:['newA', 'newB']}
I tried:
for ID1, news in dict1.items():
    for x, ID2s in dict2.items():
        for ID in ID2s:
            if ID == ID1:
                print ID1, 'match'
                ID.replace(ID, news[0])
for k, v in dict2.items():
    print k, v
and I got:
IDb match
IDa match
1 ['IDa', 'IDb']
So it looks like everything up to the replace method is working. Is there a way to make this work? To replace an entire string in a value-list with a string in another value-list?
Thanks a lot for your help.

Try this:
dict1 = {'IDa':['newA', 'x'], 'IDb':['newB', 'x']}
dict2 = {1:['IDa', 'IDb']}
for key in dict2.keys():
    dict2[key] = [dict1[x][0] if x in dict1.keys() else x for x in dict2[key]]
print(dict2)
this will print:
{1: ['newA', 'newB']}
as required.
Explanation
dict.keys() gives us just the keys of a dictionary (i.e. just the left hand side of the colon). When we use for key in dict2.keys(), at present our only key is 1. If the dictionary was larger, it'd loop through all keys.
The following line uses a list comprehension - we know that dict2[key] gives us a list (the right side of the colon), so we loop through every element of the list (for x in dict2[key]) and return the first entry of the corresponding list in dict1 only if we can find the element in the keys of dict1 (dict1[x][0] if x in dict1.keys) and otherwise leave the element untouched ([else x]).
For example, if we changed our dictionaries to be the following:
dict1 = {'IDa':['newA', 'x'], 'IDb':['newB', 'x']}
dict2 = {1:['IDa', 'IDb'], 2:['IDb', 'IDc']}
we'd get the output:
{1: ['newA', 'newB'], 2: ['newB', 'IDc']}
because 'IDc' doesn't exist in the keys of dict1.
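As for why the attempt in the question left dict2 unchanged: Python strings are immutable, so str.replace returns a new string and never modifies the list element. The sketch below (written for Python 3, unlike the Python 2 code in the question) demonstrates that and shows an in-place alternative that assigns back into the list by index.

```python
dict1 = {'IDa': ['newA', 'x'], 'IDb': ['newB', 'x']}
dict2 = {1: ['IDa', 'IDb'], 2: ['IDb', 'IDc']}

# str.replace returns a new string; the original is left untouched
s = 'IDa'
replaced = s.replace('IDa', 'newA')
print(s, replaced)  # IDa newA

# Assign back into the list by index to actually change dict2
for ids in dict2.values():
    for i, item in enumerate(ids):
        if item in dict1:
            ids[i] = dict1[item][0]

print(dict2)  # {1: ['newA', 'newB'], 2: ['newB', 'IDc']}
```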

You could also use a dictionary comprehension. I was not sure at first whether these work in Python 2.7 or only in Python 3:
# Python 3
dict2 = {k: [dict1.get(e, [e])[0] for e in v] for k,v in dict2.items()}
Edit: I just checked, and this works in Python 2.7 as well. There, dict2.items() can be replaced by dict2.iteritems() to avoid building an intermediate list:
# Python 2.7
dict2 = {k: [dict1.get(e, [e])[0] for e in v] for k,v in dict2.iteritems()}

This was a fun one!
dict2[1] = [dict1[val][0] if val in dict1 else val for val in dict2[1]]
Or, here is the same logic without list comprehension:
new_dict = {1: []}
for val in dict2[1]:
    if val in dict1:
        new_dict[1].append(dict1[val][0])
    else:
        new_dict[1].append(val)
dict2 = new_dict

Adding nonzero items from a dictionary to another dictionary

I have a set of reactions (keys) with values (0.0 or 100) stored in mydict.
Now I want to place non zero values in a new dictionary (nonzerodict).
def nonzero(cmod):
    mydict = cmod.getReactionValues()
    nonzerodict = {}
    for key in mydict:
        if mydict.values() != float(0):
            nonzerodict[nz] = mydict.values
            print nz
Unfortunately this is not working.
My questions:
Am I iterating over a dictionary correctly?
Am I adding items to the new dictionary correctly?

You are testing if the list of values is not equal to float(0). Test each value instead, using the key to retrieve it:
if mydict[key] != 0:
    nonzerodict[key] = mydict[key]
You are iterating over the keys correctly, but you could also iterate over the key-value pairs:
for key, value in mydict.iteritems():
    if value != 0:
        nonzerodict[key] = value
Note that with floating point values, chances are you'll have very small values, close to zero, that you may want to filter out too. If so, test if the value is close to zero instead:
if abs(value) > 1e-9:
You can do the whole thing in a single dictionary expression:
def nonzero(cmod):
    return {k: v for k, v in cmod.getReactionValues().iteritems() if abs(v) > 1e-9}
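The snippets above use iteritems(), which exists only in Python 2. For Python 3, a sketch of the same filter could look like this. Two assumptions are made for the sake of a runnable example: the function takes a plain dict instead of calling cmod.getReactionValues(), and the reaction names and the tol parameter are invented.

```python
def nonzero(values, tol=1e-9):
    """Return a new dict holding only entries whose magnitude exceeds tol.

    values stands in for the result of cmod.getReactionValues();
    tol is an illustrative tolerance, not something from the question.
    """
    return {k: v for k, v in values.items() if abs(v) > tol}

# Made-up reaction values for demonstration
reactions = {'R1': 0.0, 'R2': 100, 'R3': 1e-12, 'R4': -100.0}
print(nonzero(reactions))  # {'R2': 100, 'R4': -100.0}
```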

It's simple; you can do it this way:
>>> d = {'a':4,'b':2, 'c':0}
>>> dict((k,v) for k,v in d.iteritems() if v!=0)
{'a': 4, 'b': 2}

Replace the if condition in your code with:
if mydict[key]:
    nonzerodict[key] = mydict[key]
Your solution can be further simplified as:
def nonzero(cmod):
    mydict = cmod.getReactionValues()
    nonzerodict = {key: value for key, value in mydict.iteritems() if value}
    return nonzerodict

Dividing dictionary into nested dictionaries, based on the key's name on Python 3.4

I have the following dictionary (short version, real data is much larger):
dict = {'C-STD-B&M-SUM:-1': 0, 'C-STD-B&M-SUM:-10': 4.520475, 'H-NSW-BAC-ART:-9': 0.33784000000000003, 'H-NSW-BAC-ART:0': 0, 'H-NSW-BAC-ENG:-59': 0.020309999999999998, 'H-NSW-BAC-ENG:-6': 0,}
I want to divide it into smaller nested dictionaries, depending on a part of the key name.
Expected output would be:
# fixed closing brackets
dict1 = {'C-STD-B&M-SUM': {'-1': 0, '-10': 4.520475}}
dict2 = {'H-NSW-BAC-ART': {'-9': 0.33784000000000003, '0': 0}}
dict3 = {'H-NSW-BAC-ENG': {'-59': 0.020309999999999998, '-6': 0}}
Logic behind is:
dict1: if the part of the key name is 'C-STD-B&M-SUM', add to dict1.
dict2: if the part of the key name is 'H-NSW-BAC-ART', add to dict2.
dict3: if the part of the key name is 'H-NSW-BAC-ENG', add to dict3.
Partial code so far:
def divide_dictionaries(dict):
    c_std_bem_sum = {}
    for k, v in dict.items():
        if k[0:13] == 'C-STD-B&M-SUM':
            c_std_bem_sum = k[14:17], v
What I'm trying to do is to create the nested dictionaries that I need and then I'll create the dictionary and add the nested one to it, but I'm not sure if it's a good way to do it.
When I run the code above, the variable c_std_bem_sum becomes a tuple, with only two values that are changed at each iteration. How can I make it be a dictionary, so I can later create another dictionary, and use this one as the value for one of the keys?

One way to approach it would be to do something like
d = {'C-STD-B&M-SUM:-1': 0, 'C-STD-B&M-SUM:-10': 4.520475, 'H-NSW-BAC-ART:-9': 0.33784000000000003, 'H-NSW-BAC-ART:0': 0, 'H-NSW-BAC-ENG:-59': 0.020309999999999998, 'H-NSW-BAC-ENG:-6': 0,}
def divide_dictionaries(somedict):
    out = {}
    for k,v in somedict.items():
        head, tail = k.split(":")
        subdict = out.setdefault(head, {})
        subdict[tail] = v
    return out
which gives
>>> dnew = divide_dictionaries(d)
>>> import pprint
>>> pprint.pprint(dnew)
{'C-STD-B&M-SUM': {'-1': 0, '-10': 4.520475},
'H-NSW-BAC-ART': {'-9': 0.33784000000000003, '0': 0},
'H-NSW-BAC-ENG': {'-59': 0.020309999999999998, '-6': 0}}
A few notes:
(1) We're using nested dictionaries instead of creating separate named dictionaries, which aren't convenient.
(2) We used setdefault, which is a handy way to say "give me the value in the dictionary, but if there isn't one, add this to the dictionary and return it instead.". Saves an if.
(3) We can use .split(":") instead of hardcoding the width, which isn't very robust -- at least assuming that's the delimiter, anyway!
(4) It's a bad idea to use dict, the name of a builtin type, as a variable name.
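As a variation on note (2), collections.defaultdict can stand in for setdefault, and str.partition splits on the first colon only, so it does not raise if a key happens to contain more than one. This is a sketch of an alternative, not the answer's own method; the parameter names flat and sep are invented for illustration.

```python
from collections import defaultdict

def divide_dictionaries(flat, sep=":"):
    """Group a flat dict into nested dicts keyed by the part before sep."""
    out = defaultdict(dict)  # missing heads get an empty dict automatically
    for key, value in flat.items():
        # partition splits at the first occurrence of sep only
        head, _, tail = key.partition(sep)
        out[head][tail] = value
    return dict(out)  # convert back to a plain dict for the caller

d = {'C-STD-B&M-SUM:-1': 0, 'C-STD-B&M-SUM:-10': 4.520475,
     'H-NSW-BAC-ART:-9': 0.33784000000000003, 'H-NSW-BAC-ART:0': 0,
     'H-NSW-BAC-ENG:-59': 0.020309999999999998, 'H-NSW-BAC-ENG:-6': 0}

dnew = divide_dictionaries(d)
print(dnew['C-STD-B&M-SUM'])  # {'-1': 0, '-10': 4.520475}
```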

That's because you're setting your dictionary and overriding it with a tuple:
>>> a = 1, 2
>>> print(a)
(1, 2)
Now for your example:
def divide_dictionaries(somedict):
    c_std_bem_sum = {}
    for k, v in somedict.items():
        if k[0:13] == 'C-STD-B&M-SUM':
            new_key = k[14:17]  # sure you don't want [14:], open ended?
            c_std_bem_sum[new_key] = v
    return c_std_bem_sum
Basically, this grabs the three characters after the prefix, as you have it (using [14:None] or [14:] would grab the rest of the string), and then uses that as the new key for the dictionary.
