Is there a way to match to a key partially? - python

I am working with a set of keywords and a dictionary in Python, and I want to match the keywords against the keys in the dictionary, but only partially. For example, if I have a keyword like 'wide-plank floors', I would like to match it to the key 'floor'. Is there a way to check for a partial match?

Q: Is there a way to match to a key partially?
Short answer: No.
Reason:
How Dictionaries Work in Python
Keys are “hashed.” Python dictionaries are implemented as a hash table behind the scenes. The dictionary uses each key’s hash function to change some of the key’s information into an integer known as a hash value.
So the only way to do a "partial search" (e.g. check the key against a "wildcard pattern") is to loop through each of the keys until you find a match.
Of course, once you find a matching key (by inspecting each key in turn), then you can use the dictionary to look up the corresponding value.
But you can't use the dictionary itself for a "wildcard lookup".
Hope that helps.

You can create a filtered dictionary:
dx = {k: v for k, v in original.items() if "floor" in k}
Then you can assume the keys in dx all share the property of containing 'floor', and just work with the values. If you need to do something more complex, replace if "floor" in k with if f(k), where f() is some function you write to decide if a key k should be included in the final set you're working with.

Something like this would work:
for keyword in keyword_dict:
    if search_term in keyword:
        print(keyword_dict[keyword])
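Note that in the question's example the direction is reversed: the dictionary key ('floor') is a substring of the keyword ('wide-plank floors'), not the other way around. A minimal sketch of that direction, assuming plain substring matching is good enough; the names prices, keywords and match_keys are made up for illustration:

```python
# Hypothetical data; none of these names come from the original question.
prices = {"floor": 12.5, "window": 40.0, "door": 75.0}
keywords = ["wide-plank floors", "bay windows", "granite counters"]


def match_keys(keyword, mapping):
    """Return every key of `mapping` that occurs as a substring of `keyword`."""
    return [key for key in mapping if key in keyword]


# Each keyword still requires a linear scan over all keys.
matches = {kw: match_keys(kw, prices) for kw in keywords}
print(matches)
```

Once a matching key is found, the value lookup itself (prices[key]) is an ordinary constant-time dictionary access.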

Related

Error Changing Dictionary Keys

I've two defaultdicts I eventually want to merge, but first I need to make their keys match. According to some threads I've seen here, I can use pop() to replace keys in a dictionary. But that only updates the existing dictionary, whereas I want to create a new dictionary with the new keys. So something like:
existing_dict_one -> new_dict_one
This is what I've so far:
def split_tabs(x):
    """
    Function to split tab-separated strings, used to break up the keys that are separated by tabs.
    """
    return x.split('\t')

def create_dict(old_dict):
    """
    Function to create a new defaultdict from an existing defaultdict, just with
    different keys.
    """
    new_dict = old_dict.copy() # Create a copy of old_dict to house the new keys, but with the same values.
    for key, value in new_dict.iteritems():
        umi = split_tabs(key)[0] # Change key to be UMI, which is the 0th index of the tab-delimited key.
        # new_key = key.replace(key, umi)
        new_dict[umi] = new_dict.pop(key)
    return new_dict
However, I'm getting the following error
RuntimeError: dictionary changed size during iteration
and I don't know how to fix it. Does anyone know how to correct it? I'd like to use the variable "umi" as the new key.
I'd like to post the variable "key" and dictionary "old_dict" I'm using for testing this code, but it's messy and takes up a lot of space. So here's a pastebin link that contains them instead.
Note that "umi" comes from variable "key" which is separated by tabs. So I split "key" and get the first object as "umi".
Just use a dict comprehension for this:
new_dict = {split_tabs(key)[0]: value for key, value in old_dict.iteritems()}
Trying to modify a dictionary while iterating over it is not a good idea in general.
If you use .items() instead of .iteritems() in Python 2, you won't have that problem, because .items() returns a list that is disconnected from the dictionary. In Python 3 the equivalent is list(new_dict.items()).
Also, old_dict.copy() is only a shallow copy: if the dictionary values are mutable and the two dictionaries must not share them, use copy.deepcopy(old_dict) instead.
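For Python 3, here is a minimal sketch of the same idea that keeps the result a defaultdict and leaves old_dict untouched. It assumes the values are lists and that several long keys may share one UMI (in which case their values are merged rather than overwritten); the sample keys are invented, not taken from the asker's pastebin:

```python
from collections import defaultdict


def split_tabs(x):
    """Split a tab-separated string into its fields."""
    return x.split("\t")


# Hypothetical stand-in for the asker's data.
old_dict = defaultdict(list)
old_dict["AAAC\tchr1\t100"].append("read1")
old_dict["GGTA\tchr2\t200"].append("read2")
old_dict["AAAC\tchr3\t300"].append("read3")

# Build a separate dictionary instead of popping keys while iterating.
new_dict = defaultdict(list)
for key, value in old_dict.items():
    umi = split_tabs(key)[0]      # first tab-delimited field
    new_dict[umi].extend(value)   # merge values that share a UMI

print(dict(new_dict))
```

Because nothing is added to or removed from old_dict inside the loop, the "dictionary changed size during iteration" error cannot occur.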

How can I access a dictionary element indexed with a string?

I want to access an element of a dictionary using a string.
For example, I have a dictionary like this:
data = {"masks": {"id": "valore"}}
I have a string campo="masks,id" that I want to split with campo.split(','). This gives ['masks', 'id'], and with that I want to access the element data["masks"]["id"].
This dictionary is only an example; my real dictionaries are more complex. The point is that I want to access the element data["masks"]["id"] with the input string "masks,id", the element data["masks"] with the string "masks", the element data["masks"]["id"]["X"] with the input string "masks,id,X", and so on.
How can I do this?
I wouldn't recommend the following method, because a Python dict is not meant to be accessed this way, but here is a snippet that gets the job done. Note that the variable nested_dict changes type along the way (it ends up holding a string), which Python allows at your own risk.
The code iterates over the keys and, at each iteration, fetches the child dictionary if the key is present. If the key is not found, .get() returns an empty dictionary instead.
data = {"masks": {"id": "valore"}}
text = "masks, id"
nested_keys = text.split(", ")
nested_dict = data
for key in nested_keys:
    nested_dict = nested_dict.get(key, {})
if isinstance(nested_dict, str):
    print(nested_dict)
The point is that you are coming up with requirements that do not match the capability of the python-built-in dictionaries.
If you want to have nested maps that do this kind of automated "splitting" of a single key string like "masks, id, X" then ... you will have to implement that yourself.
In other words: the answer is - the built-in dictionary can't do that for you.
So, the "real" thing to do here: step back and carefully look into your requirements to understand exactly what you want to do; and why you want to do that. And going from there look for the best design to support that.
From an implementation side, I think what you "need" would roughly look like:
check if the provided "key" matches "key1,key2,key3"
if so, split that key into its sub-keys
then check if the "outer" dictionary has a value for key1
then check, if the value for key1 is a dictionary
then check if that "inner" dictionary has a value for key2
...
and so on.
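The steps above could be sketched roughly as follows. This is only an illustration: get_path is a made-up helper name, the comma separator comes from the question, and raising KeyError for a missing key or a non-dict intermediate value is one possible design choice among several.

```python
def get_path(data, path, sep=","):
    """Follow `path` ("key1,key2,...") through nested dictionaries.

    Raises KeyError if a key is missing or if a non-dict value is
    reached before the path is exhausted.
    """
    current = data
    for key in path.split(sep):
        key = key.strip()              # tolerate "masks, id"
        if not isinstance(current, dict):
            raise KeyError(key)        # cannot descend any further
        current = current[key]         # KeyError if the key is absent
    return current


data = {"masks": {"id": "valore"}}
print(get_path(data, "masks,id"))
print(get_path(data, "masks"))
```

Returning a default value instead of raising would be an equally reasonable variant, depending on how the caller wants to treat missing paths.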

Full re-map of keys and values in a dictionary, set, list, etc?

I have a defaultdict(set) of various keys (sets), values (set of tuples), etc.
I also have a dictionary of various key (tuples), and values (strings).
For example maybe everything is in Japanese, with Japanese keys and values.
I have a mapping of Japanese -> English and want to update everything in my objects to the new keys and values. The overall structure is the same; only the keys and values are named differently.
I can do this manually by looping over everything and popping/replacing but this is boring. I am curious if there is a more Pythonic way.
Assuming you have a function named convert_keys or similar that you can pass in a Japanese key and get out the equivalent English key, as well as a convert_values function that does the same but with values, you might try something like this:
englishDict = dict(map(lambda x: (convert_keys(x[0]),convert_values(x[1])),japaneseDict.items()))
It's still essentially the same algorithm as popping/replacing but with map operating on each key/item pair you can condense it down to one line.
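Since the question mentions keys and values that are themselves tuples or sets, a recursive helper may be closer to what is needed. The following is a sketch under assumptions: ja_to_en and translate are hypothetical names, every translatable item is a string, and a defaultdict comes back as a plain dict.

```python
# Hypothetical translation table, for illustration only.
ja_to_en = {"犬": "dog", "猫": "cat", "動物": "animal"}


def translate(obj):
    """Recursively translate strings inside dicts, lists, tuples and sets."""
    if isinstance(obj, str):
        return ja_to_en.get(obj, obj)          # unknown strings pass through
    if isinstance(obj, dict):
        return {translate(k): translate(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple, set, frozenset)):
        return type(obj)(translate(item) for item in obj)
    return obj                                  # numbers, None, etc.


japanese = {("犬", "猫"): "動物", "犬": {("猫", "犬")}}
english = translate(japanese)
print(english)
```

If the defaultdict behaviour must be preserved, the result can be wrapped again, for example with defaultdict(set, translate(d)).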

Efficiently iterating a dictionary in Python

So here's the problem, I'm importing a dictionary with anywhere from 6000 to 12000 keys. Then using a nested for algorithm to group them into a list inside of another dictionary. I'm using the following code to check if the key is in the dictionary:
for key in range(sizeOfOriginalKeys):
    if key in key_data:
As you might imagine, this is taking forever since the sorting algorithm is fairly complex. I would like to only iterate through the keys in 'key_data', without doing 1000 to 11999 checks if there is that key in the dictionary. Is there a way to make a list of current keys? Then iterate through them? Or at least something more efficient than what I'm currently doing?
Current Code after Kevin's suggestion:
for key in key_data:
    currentKey = key_data[key].name
    if key_data[currentKey].prefList[currentPref] == currentGroup:
        key_data[currentKey].currentScore = getDotProduct()
        group_data[currentGroup].keyList.append(key_data[currentKey])
        group_data[currentGroup].sortKeys()
        del key_data[currentKey]
The key names are integers.
At the end of the sorting algorithm I delete the key if it has been sorted into a group.
Now I get an error: dictionary changed size during iteration.
Thoughts?
You're trying too hard:
for key in key_data:
You can try
for key, value in key_data.items():
    print(key)
    print(value)
This way you can access the value without calling key_data[key].
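Regarding the "dictionary changed size during iteration" error from the updated question: one common way around it is to iterate over a snapshot of the keys, so that deleting from the dictionary inside the loop is safe. A minimal sketch with invented stand-in data (the real key_data holds objects, not strings):

```python
# Hypothetical stand-in: integer keys mapped to a group label.
key_data = {1: "a", 2: "b", 3: "a", 4: "c"}
current_group = "a"
group = []

# list(key_data) copies the keys up front, so the loop is not
# affected when entries are deleted from key_data.
for key in list(key_data):
    if key_data[key] == current_group:
        group.append(key)
        del key_data[key]

print(group)
print(key_data)
```

The snapshot costs one extra list of keys, which is negligible for 6000 to 12000 entries.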

Finding a key in a dictionary without knowing its full name

I have a dictionary with a key called ev#### where #### is some number that I do not know ahead of time. There is only one of this type of key in the dictionary and no other key starts with ev.
What's the cleanest way to access that key without knowing what the #### is?
You can try this list comprehension:
result = [v for k, v in d.iteritems() if k.startswith('ev')][0]
Or this approach using a generator expression:
result = next(v for k, v in d.iteritems() if k.startswith('ev'))
Note that these will both require a linear scan of the items in the dictionary, unlike an ordinary key lookup, which runs in constant time on average (assuming a good hash function). The generator expression, however, can stop as soon as it finds the key. The list comprehension will always scan the entire dictionary.
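In Python 3, .iteritems() no longer exists and .items() is used instead. The built-in next() also accepts a default as its second argument, which avoids a StopIteration when no key matches. A small sketch with made-up data:

```python
# Hypothetical dictionary with exactly one key starting with "ev".
d = {"name": "run7", "ev1234": 42}

# Stops at the first match; returns None if nothing matches.
result = next((v for k, v in d.items() if k.startswith("ev")), None)
print(result)

# The same pattern on a dictionary without such a key.
missing = next((v for k, v in {"name": "run7"}.items() if k.startswith("ev")), None)
print(missing)
```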
If there is only one such value in the dictionary, I would say it's better to use an approach similar to this:
for k, v in d.iteritems():
    if k.startswith('ev'):
        result = v
        break
else:
    raise KeyError() # or set to default value
That way you don't have to loop through every item in the dictionary, only until you find the key. Compared with always scanning the whole dictionary, this should speed up the lookup by about 2x on average.
Store the item in the dictionary without the ev prefix in the first place.
If you also need to access it with the prefix, store it both ways.
If there can be multiple prefixes for a given number, use a second dictionary that stores the actual keys associated with each number as a list or sub-dictionary, and use that to find the available keys in the main dictionary matching the number.
If you can't easily do this when the dictionary is initially created (e.g. someone else's code is giving you the dict and you can't change it), and you will be doing a lot of lookups of this sort, it is probably worthwhile to iterate over the dict once and make the second dict, or use a dict to cache the lookups, or something of that sort, to avoid iterating the keys each time.
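The "iterate once and build a second dict" idea could look roughly like this. It is a sketch that assumes every relevant key is a run of letters followed by a run of digits; by_number and the sample data are invented for illustration:

```python
import re

# Hypothetical dictionary handed over by someone else's code.
d = {"ev1234": "payload", "xy1234": "other", "name": "run7"}

# One linear pass: map each number to the full keys that contain it.
by_number = {}
for key in d:
    m = re.fullmatch(r"([A-Za-z]+)(\d+)", key)
    if m:
        by_number.setdefault(m.group(2), []).append(key)

# Later lookups by number are ordinary dictionary accesses.
full_keys = by_number["1234"]
print(full_keys)
print(d[full_keys[0]])
```

The index has to be rebuilt or updated whenever keys are added to or removed from d.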
