Finding a key in a dictionary without knowing its full name - python

I have a dictionary with a key called ev#### where #### is some number that I do not know ahead of time. There is only one of this type of key in the dictionary and no other key starts with ev.
What's the cleanest way to access that key without knowing what the #### is?

You can try this list comprehension:
result = [v for k, v in d.iteritems() if k.startswith('ev')][0]
Or this approach using a generator expression:
result = next(v for k, v in d.iteritems() if k.startswith('ev'))
Note that both of these require a linear scan of the items in the dictionary, unlike an ordinary key lookup, which runs in constant time on average (assuming a good hash function). The generator expression, however, can stop as soon as it finds the key, whereas the list comprehension always scans the entire dictionary. Both snippets use Python 2's iteritems(); in Python 3, write d.items() instead.
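A minimal sketch of the generator approach with a fallback value, assuming Python 3 (where items() replaces iteritems()). The helper name first_value_with_prefix and the sample dictionary are hypothetical, made up for illustration:

```python
def first_value_with_prefix(d, prefix, default=None):
    """Return the value of the first key starting with prefix, or default."""
    # next() stops at the first match; its second argument is returned
    # instead of raising StopIteration when no key matches.
    return next((v for k, v in d.items() if k.startswith(prefix)), default)


d = {"name": "test", "ev1234": 42, "size": 3}
result = first_value_with_prefix(d, "ev")
missing = first_value_with_prefix(d, "zz", default="not found")
```

Passing a default is a design choice: without it, a dictionary that lacks the key raises StopIteration, which is easy to mistake for the end of some enclosing iteration.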

If there is only one such value in the dictionary, I would say it's better to use an approach similar to this:
for k, v in d.iteritems():
    if k.startswith('ev'):
        result = v
        break
else:
    raise KeyError()  # or set to default value
That way you only loop until you find the key rather than through every item in the dictionary, which on average should roughly halve the number of items examined compared with a full scan.

Store the item in the dictionary without the ev prefix in the first place.
If you also need to access it with the prefix, store it both ways.
If there can be multiple prefixes for a given number, use a second dictionary that stores the actual keys associated with each number as a list or sub-dictionary, and use that to find the available keys in the main dictionary matching the number.
If you can't easily do this when the dictionary is initially created (e.g. someone else's code is giving you the dict and you can't change it), and you will be doing a lot of lookups of this sort, it is probably worthwhile to iterate over the dict once and make the second dict, or use a dict to cache the lookups, or something of that sort, to avoid iterating the keys each time.
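As a rough sketch of the "second dictionary" idea, assuming Python 3.7+ and string keys made of letters followed by digits (that key shape is an assumption, as are the helper name index_by_number and the sample data):

```python
import re
from collections import defaultdict


def index_by_number(d):
    """Map each trailing number to the list of full keys that end with it."""
    index = defaultdict(list)
    for key in d:
        # Assumed key shape: a run of letters followed by a run of digits.
        match = re.fullmatch(r"([A-Za-z]+)(\d+)", key)
        if match:
            index[match.group(2)].append(key)
    return index


d = {"ev1234": "event", "id1234": "identifier", "name": "test"}
index = index_by_number(d)          # one linear pass over the keys
keys_for_1234 = index["1234"]       # later lookups avoid rescanning d
values = [d[k] for k in keys_for_1234]
```

The index has to be rebuilt or updated whenever keys are added to or removed from the main dictionary.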

Related

Is there a way to match to a key partially?

I am working with a set of key words and a dictionary in python and am looking to match the keywords with the keys in the dictionary but just partially. For example if I have a key word like wide-plank floors I would like to match it to some key 'floor'. Is there a way I can just check for partial?
Q: Is there a way to match to a key partially?
Short answer: No.
Reason:
How Dictionaries Work in Python: keys are "hashed." Python dictionaries are implemented as a hash table behind the scenes. The dictionary uses each key's hash function to change some of the key's information into an integer known as a hash value.
So the only way to do a "partial search" (e.g. check the key against a "wildcard pattern") is to loop through each of the keys until you find a match.
Of course, once you find a matching key (by inspecting each key in turn), then you can use the dictionary to look up the corresponding value.
But you can't use the dictionary itself for a "wildcard lookup".
Hope that helps.
You can create a filtered dictionary:
dx = {k: v for k, v in original.items() if "floor" in k}
Then you can assume the keys in dx all share the property of containing 'floor', and just work with the values. If you need to do something more complex, replace if "floor" in k with if f(k), where f() is some function you write to decide if a key k should be included in the final set you're working with.
Something like this would work:
for keyword in keyword_dict:
    if search_term in keyword:
        print(keyword_dict[keyword])
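The question asks for the reverse direction of the snippets above: finding dictionary keys that appear inside a longer keyword. A minimal sketch, assuming Python 3 and string keys (the helper name keys_found_in and the sample data are hypothetical):

```python
def keys_found_in(keyword, d):
    """Return the dictionary keys that occur as substrings of keyword."""
    # Still a linear scan over the keys; hashing cannot help with partial matches.
    return [k for k in d if k in keyword]


d = {"floor": "flooring", "roof": "roofing"}
matches = keys_found_in("wide-plank floors", d)
values = [d[k] for k in matches]   # ordinary hashed lookups once the keys are known
```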

Get reference to Python dict key

In Python (3.7 and above) I would like to obtain a reference to a dict key. More precisely, let d be a dict where the keys are strings. In the following code, the value of k is potentially stored at two distinct locations in memory (one pointed to by the dict and one pointed to by k), whereas the value of v is stored at only one location (the one pointed to by the dict).
# d is a dict
# k is a string dynamically constructed, in particular not from iterating over d's keys
if k in d:
    v = d[k]
    # Now store k and v in other data structures
In my case, the dict is very large and the string keys are very long. To keep memory usage down I would like to replace k with a pointer to the corresponding string used by d before storing k in other data structures. Is there a straightforward way of doing this, that is using the keys of the dict as a string pool?
(Footnote: this may seem like premature optimisation, and perhaps it is, but being an old-school C programmer I sleep better at night doing "memory tricks". Joking aside, I genuinely would like to know the answer out of curiosity, and I am indeed going to run my code on a Raspberry Pi and will probably face memory issues.)
Where does the key k come from? Is it dynamically constructed by something like str.join, + , slicing another string, bytes.decode etc? Is it read from a file or input()? Did you get it from iterating over d at some point? Or does it originate from a literal somewhere in your source code?
In the last two cases, you don't need to worry about it since it is going to be a single instance anyway.
If not, you could use sys.intern to intern your keys. If a == b then sys.intern(a) is sys.intern(b).
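A small sketch of the interning approach, assuming the keys of d were themselves passed through sys.intern when d was built; otherwise the dictionary still holds its own separate copy of each string. The sample strings are made up for illustration:

```python
import sys

# Keys are interned when the dictionary is built (assumption of this sketch).
d = {sys.intern("".join(["long", "-", "key"])): 1}

# k is constructed separately at run time, so it starts out as a distinct object.
k = "-".join(["long", "key"])
k = sys.intern(k)   # replace k with the canonical interned instance

stored_key = next(iter(d))
same_value = k == stored_key
shared = k is stored_key   # both names now refer to one string object
```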
Another possible solution, in case you might want to garbage collect the strings at some point or you want to intern some non-string values, like tuples of strings, you could do the following:
# create this dictionary once after `d` has all the right keys
canonical_keys = {key: key for key in d}
k = canonical_keys.get(k, k) # use the same instance if possible
I recommend reading up on Python's data model.

Efficiently iterating a dictionary in Python

So here's the problem, I'm importing a dictionary with anywhere from 6000 to 12000 keys. Then using a nested for algorithm to group them into a list inside of another dictionary. I'm using the following code to check if the key is in the dictionary:
for key in range(sizeOfOriginalKeys):
    if key in key_data:
As you might imagine, this is taking forever since the sorting algorithm is fairly complex. I would like to only iterate through the keys in 'key_data', without doing 1000 to 11999 checks if there is that key in the dictionary. Is there a way to make a list of current keys? Then iterate through them? Or at least something more efficient than what I'm currently doing?
Current Code after Kevin's suggestion:
for key in key_data:
    currentKey = key_data[key].name
    if key_data[currentKey].prefList[currentPref] == currentGroup:
        key_data[currentKey].currentScore = getDotProduct()
        group_data[currentGroup].keyList.append(key_data[currentKey])
        group_data[currentGroup].sortKeys()
        del key_data[currentKey]
The key names are integers.
At the end of the sorting algorithm I delete the key, if it's been sorted into a group.
Now I get an error: dictionary changed size during iteration.
Thoughts?
You're trying too hard:
for key in key_data:
You can try
for key, value in key_data.items():
    print(key)
    print(value)
This way you can access the value without calling key_data[key].
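The "dictionary changed size during iteration" error comes from deleting entries while looping over the dictionary itself. One common way around it, sketched here under the assumption that a one-off copy of the keys is affordable, is to iterate over a snapshot of the keys. The sample data and the grouped list are hypothetical stand-ins for the real objects:

```python
key_data = {1: "a", 2: "b", 3: "a"}
grouped = []

# list(key_data) copies the keys up front, so deleting from key_data
# inside the loop no longer changes the object being iterated.
for key in list(key_data):
    if key_data[key] == "a":
        grouped.append(key)
        del key_data[key]
```

With 6000 to 12000 keys the copy is small; an alternative is to collect the keys to delete in a separate list and remove them after the loop.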

Skipping to Next item in Dictionary

When iterating through a dictionary, I want to skip an item if it has a particular key. I tried something like mydict.next(), but I got an error message 'dict' object has no attribute 'next'
for key, value in mydict.iteritems():
    if key == 'skipthis':
        mydict.next()
    # for others do some complicated process
I am using Python 2.7 if that matters.
Use continue:
for key, value in mydict.iteritems():
    if key == 'skipthis':
        continue
Also see:
Are break and continue bad programming practices?
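I think you meant to call mydict.iteritems().next(), but that creates a new iterator rather than advancing the one the for loop is using, so it would not skip anything. You should just filter the items before iterating.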
To filter your list, you could use a generator expression:
r = ((k, v) for k, v in mydict.iteritems() if k != 'skipthis')
for k, v in r:
    # do something complicated to filtered items
    pass
Because this is a generator expression, it has the property of only traversing the original dict once, leading to a boost in performance over other alternatives which iterate the dictionary, and optionally copy elements to a new one or delete existing elements from it. Generators can also be chained, which can be a powerful concept when iterating.
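The chaining mentioned above can be sketched as follows, assuming Python 3 (so items() rather than iteritems()); the sample dictionary and the doubling step are made up for illustration:

```python
mydict = {"skipthis": 0, "a": 1, "b": 2, "c": 3}

# First stage: drop the unwanted key. Nothing is evaluated yet.
without_skipped = ((k, v) for k, v in mydict.items() if k != "skipthis")

# Second stage: consumes the first generator lazily, one item at a time.
doubled = ((k, v * 2) for k, v in without_skipped)

# The dictionary is traversed only once, here, when the chain is consumed.
result = dict(doubled)
```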
More info on generator expressions:
http://www.python.org/dev/peps/pep-0289/
Another alternative is this:
for key, value in mydict.iteritems():
    if key != 'skipthis':
        # Do whatever
        pass
It does the same thing as skipping the key with continue. The code under the if statement will only run if the key is not 'skipthis'.
The advantage of this method is that it is cleaner and saves lines. It is also a little easier to read, in my opinion.
You should ask yourself why you need to do this. One unit of code should do one thing, so in this case the dict should have been 'cleaned' before the loop reaches it.
Something along these lines:
def dict_cleaner(my_dict):
    # make a dict of stuff you want your loop to deal with
    clean_dict = {k: v for k, v in my_dict.iteritems() if k != 'skipthis'}
    return clean_dict

for key, value in dict_cleaner(mydict).iteritems():
    # Do the stuff the loop actually does, no worrying about selecting items from it.
    pass

Most efficient way to update attribute of one instance

I'm creating an arbitrary number of instances (using for loops and ranges). At some event in the future, I need to change an attribute for only one of the instances. What's the best way to do this?
Right now, I'm doing the following:
1) Manage the instances in a list.
2) Iterate through the list to find a key value.
3) Once I find the right object within the list (i.e. key value = value I'm looking for), change whatever attribute I need to change.
for Instance in ListofInstances:
    if Instance.KeyValue == SearchValue:
        Instance.AttributeToChange = 10
This feels really inefficient: I'm basically iterating over the entire list of instances, even though I only need to change an attribute in one of them.
Should I be storing the Instance references in a structure more suitable for random access (e.g. dictionary with KeyValue as the dictionary key?) Is a dictionary any more efficient in this case? Should I be using something else?
Thanks,
Mike
Should I be storing the Instance references in a structure more suitable for random access (e.g. dictionary with KeyValue as the dictionary key?)
Yes, if you are mapping from a key to a value (which you are in this case), such that one typically accesses an element via its key, then a dict rather than a list is better.
Is a dictionary any more efficient in this case?
Yes, it is much more efficient. A dictionary takes O(1) on average to lookup an item by its key whereas a list takes O(n) to lookup an item by its key, which is what you are currently doing.
Using a Dictionary
# Construct the dictionary
d = {}
# Insert items into the dictionary
d[key1] = value1
d[key2] = value2
# ...
# Checking if an item exists
if key in d:
    # Do something requiring d[key]
    # such as updating an attribute:
    d[key].attr = val
As you mention, you need to keep an auxiliary dictionary with the key value as the key and the instance (or list of instances with that value for their attribute) as the value(s) -- way more efficient. Indeed, there's nothing more efficient than a dictionary for such uses.
It depends on what the other needs of your program are. If all you ever do with these objects is access the one with that particular key value, then sure, a dictionary is perfect. But if you need to preserve the order of the elements, note that a plain dict only guarantees insertion order from Python 3.7 onwards; on older versions you could store them in both a dict and a list, or use collections.OrderedDict, which provides both keyed access and order preservation. Alternatively, if more than one object can have the same key value, then you can't store both of them in a single dict at the same time, at least not directly. (You could have a dict of lists or something similar.)
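A minimal sketch of the "dict of lists" idea, kept alongside the original list so creation order is still available. The Item class and its attribute names are hypothetical stand-ins for the asker's instances:

```python
from collections import defaultdict


class Item:
    """Hypothetical stand-in for the instances in the question."""

    def __init__(self, key_value):
        self.key_value = key_value
        self.attribute_to_change = 0


instances = [Item(i % 3) for i in range(6)]   # the list keeps creation order

# Auxiliary index: key value -> list of instances sharing that value.
by_key = defaultdict(list)
for inst in instances:
    by_key[inst.key_value].append(inst)

# Update only the matching instances, without scanning the whole list.
for inst in by_key[1]:
    inst.attribute_to_change = 10
```

Both structures hold references to the same objects, so an update made through by_key is visible through instances as well.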
