I have created a dictionary in Python; here is some sample code from it.
filesAndHashes = dict()
...
>>>print filesAndHashes
{
"/home/rob/Desktop/test.txt":"1c52fe8fbb1463d541c2d971d9890c24",
"/home/rob/Desktop/file.dat":"6386ba70e82f11aa027bfc9874cd58cb",
"/home/rob/Desktop/test2.exe":"5b73c2a88fab97f558a07d40cc1e9d8e"
}
So all this is, is a file path and the MD5 of the file.
What I want to do now: I have found some MD5s of interest and put them in a list, and I want to search the dictionary for each MD5 in my list and get back the file path for each hash.
Also, the way the program works, there will never be an MD5 in my list that isn't in the dictionary, so I'm not worried about error checking for that.
Please feel free to ask for more information.
Thanks.
You have a path -> hash mapping, but you need a hash -> path mapping. Assuming the hashes are unique, reverse the dictionary:
>>> filesAndHashes = {'foo': '123', 'bar': '456'}
>>> hashesAndFiles = {hash:fname for fname,hash in filesAndHashes.iteritems()}
>>> hashesAndFiles
{'123': 'foo', '456': 'bar'}
Now just iterate over your list and report matches:
>>> hashes = ['456']
>>> for hash in hashes:
...     filename = hashesAndFiles[hash]
...     print(filename)
...
bar
If you cannot rule out duplicate hashes (which is possible in theory), use a defaultdict that maps each hash to a list of file names.
>>> from collections import defaultdict
>>> hashesAndFiles = defaultdict(list)
>>>
>>> filesAndHashes = {'foo': '123', 'bar': '456', 'baz': '456'}
>>> for fname, hash in filesAndHashes.items():
...     hashesAndFiles[hash].append(fname)
...
>>> hashesAndFiles
defaultdict(<type 'list'>, {'123': ['foo'], '456': ['baz', 'bar']})
>>>
>>> hashes = ['456']
>>> for hash in hashes:
...     for filename in hashesAndFiles[hash]:
...         print(filename)
...
baz
bar
Catch KeyErrors as needed (from your question I assumed you don't expect any nonexistent hashes in your list).
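The sessions above are Python 2 (iteritems, <type 'list'>). As a rough Python 3 sketch of the same idea, using the sample data from the question: the names files_and_hashes, paths_by_hash, wanted and found are illustrative, not from the original post, and the list of wanted hashes is made up for the example.

```python
from collections import defaultdict

# Sample path -> MD5 mapping taken from the question.
files_and_hashes = {
    "/home/rob/Desktop/test.txt": "1c52fe8fbb1463d541c2d971d9890c24",
    "/home/rob/Desktop/file.dat": "6386ba70e82f11aa027bfc9874cd58cb",
    "/home/rob/Desktop/test2.exe": "5b73c2a88fab97f558a07d40cc1e9d8e",
}

# Invert once: MD5 -> list of paths (a list, in case two files share a hash).
paths_by_hash = defaultdict(list)
for path, md5 in files_and_hashes.items():
    paths_by_hash[md5].append(path)

# Hypothetical list of hashes of interest.
wanted = [
    "5b73c2a88fab97f558a07d40cc1e9d8e",
    "1c52fe8fbb1463d541c2d971d9890c24",
]

# .get() avoids inserting empty entries into the defaultdict for unknown hashes.
found = {md5: paths_by_hash.get(md5, []) for md5 in wanted}
for md5, paths in found.items():
    print(md5, paths)
```

Building the inverted mapping once costs a single pass over the dictionary; each lookup afterwards is a plain dictionary access rather than a scan over all values.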
Reverse the dictionary so that the keys are the hashes, since you want to search by hash.
Then simply look up each key in the reversed dictionary with filesAndHashes_reversed.get(hash_value, None):
filesAndHashes_reversed = {value: key for key, value in filesAndHashes.iteritems()}
hash_list = [hash_1, hash_2, hash_3]
for hash in hash_list:
    # .get() returns None when the hash is not a key
    if filesAndHashes_reversed.get(hash) is None:
        print("Not Found")
    else:
        print(filesAndHashes_reversed.get(hash))
You are probably not using the right approach, but first I'll answer the question as asked.
To find the FIRST match you can do this:
def find_item(md5hash):
    # a is the path -> hash dictionary
    for k, v in a.iteritems():
        if v == md5hash:
            return k
Note that this is the first match. In theory it is possible to have multiple entries with the same hash but the OP has said that the hashes are expected to be unique. But in that case why not use them as the key? This makes it easy to search for them:
hashes_and_files = dict()
hashes_and_files["1c52fe8fbb1463d541c2d971d9890c24"]="/home/rob/Desktop/test.txt"
hashes_and_files["6386ba70e82f11aa027bfc9874cd58cb"]="/home/rob/Desktop/file.dat"
hashes_and_files["5b73c2a88fab97f558a07d40cc1e9d8e"]="/home/rob/Desktop/test2.exe"
# finding is trivial
find_hash = "5b73c2a88fab97f558a07d40cc1e9d8e"
file_name = hashes_and_files[find_hash]
I'm not sure if this is even possible but it's worth a shot asking.
I want to be able to access the value from indexing one of the values.
The first thing that came to mind was this but of course, it didn't work.
dict = {['name1', 'name2'] : 'value1'}
print(dict.get('name1'))
You can use a tuple (as it's immutable) as a dict key if you need to access it by a pair (or more) of strings (or other immutable values):
>>> d = {}
>>> d[("foo", "bar")] = 6
>>> d[("foo", "baz")] = 8
>>> d
{('foo', 'bar'): 6, ('foo', 'baz'): 8}
>>> d[("foo", "baz")]
8
>>>
This isn't "a key having multiple names", though, it's just a key that happens to be built of multiple strings.
Edit
As discussed in the comments, the end goal is to have multiple keys for each (static) value. That can be succinctly accomplished with an inverted dict first, which is then "flipped" using dict.fromkeys():
def foobar():
    pass

def spameggs():
    pass

func_to_names = {
    foobar: ("foo", "bar", "fb", "foobar"),
    spameggs: ("spam", "eggs", "se", "breakfast"),
}
name_to_func = {}
for func, names in func_to_names.items():
    name_to_func.update(dict.fromkeys(names, func))
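To see the flipped mapping in use, here is a small sketch. The return values of the two functions are made up so that the calls produce something visible, and the comprehension is just an alternative spelling of the update()/fromkeys() loop above, not a replacement the original answer proposed.

```python
def foobar():
    return "foobar called"      # placeholder body for illustration

def spameggs():
    return "spameggs called"    # placeholder body for illustration

func_to_names = {
    foobar: ("foo", "bar", "fb", "foobar"),
    spameggs: ("spam", "eggs", "se", "breakfast"),
}

# One entry per alias, each pointing at the same function object.
name_to_func = {
    name: func
    for func, names in func_to_names.items()
    for name in names
}

print(name_to_func["fb"]())         # foobar called
print(name_to_func["breakfast"]())  # spameggs called
```

Every alias maps to the same function object, so this only suits values that are shared deliberately; rebinding one alias later does not affect the others.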
If we tried it your way, using:
# Creating a dictionary
myDict = {[1, 2]: 'Names'}
print(myDict)
We get an output of:
TypeError: unhashable type: 'list'
To get around this, we can use this method:
# Creating an empty dictionary
myDict = {}
# Adding list as value
myDict["key1"] = [1, 2]
myDict["key2"] = ["Jim", "Jeff", "Jack"]
print(myDict)
I have a list here:
my_list=["Alex:1990:London",
"Tony:1993:NYC",
"Kate:2001:Beijing",
"Tony:2001:LA",
"Alex:1978:Shanghai"]
How can I get the target dictionary my_target_dict from my_list in the easiest way?
my_target_dict={
"Alex":["Alex:1990:London", "Alex:1978:Shanghai"],
"Tony":["Tony:1993:NYC", "Tony:2001:LA"],
"Kate":["Kate:2001:Beijing"]
}
Use a defaultdict:
>>> from collections import defaultdict
>>> my_list=["Alex:1990:London", "Tony:1993:NYC", "Kate:2001:Beijing", "Tony:2001:LA", "Alex:1978:Shanghai"]
>>> d = defaultdict(list)
>>> for item in my_list:
...     name, *_ = item.partition(":")
...     d[name].append(item)
...
>>> d
defaultdict(<class 'list'>, {'Alex': ['Alex:1990:London', 'Alex:1978:Shanghai'], 'Tony': ['Tony:1993:NYC', 'Tony:2001:LA'], 'Kate': ['Kate:2001:Beijing']})
>>> d["Alex"]
['Alex:1990:London', 'Alex:1978:Shanghai']
You can use this comprehension to unwrap the lists that contain only a single item:
>>> {k:v if len(v) > 1 else v[0] for k,v in d.items()}
{'Alex': ['Alex:1990:London', 'Alex:1978:Shanghai'], 'Tony': ['Tony:1993:NYC', 'Tony:2001:LA'], 'Kate': 'Kate:2001:Beijing'}
In case you intend to work strictly with lists and dictionaries alone, try this:
my_target_dict = dict()
for value in my_list:
    key = value.split(':')[0]
    if key in my_target_dict:
        my_target_dict[key].append(value)
    else:
        my_target_dict[key] = [value]
print(my_target_dict)
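The if/else in the loop above can also be collapsed with dict.setdefault, which still uses only plain lists and dictionaries. This is a sketch of that variant using the question's data; it is an alternative spelling, not part of the original answer.

```python
my_list = [
    "Alex:1990:London",
    "Tony:1993:NYC",
    "Kate:2001:Beijing",
    "Tony:2001:LA",
    "Alex:1978:Shanghai",
]

my_target_dict = {}
for value in my_list:
    # Everything before the first ':' is the name used as the key.
    key = value.split(":", 1)[0]
    # setdefault returns the existing list, or stores and returns a new one.
    my_target_dict.setdefault(key, []).append(value)

print(my_target_dict)
```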
This is my solution for you:
my_list=["Alex:1990:London", "Tony:1993:NYC", "Kate:2001:Beijing", "Tony:2001:LA", "Alex:1978:Shanghai"]
dict = {}
for idx, content in enumerate(my_list):
    name = content[:(content.index(':'))]
    if name not in dict:
        dict[name] = []
    dict[name].append(my_list[idx])
First, in case you don't know enumerate: it yields the index together with the content of each element of the list.
Second, extract each person's name with basic string slicing. I use name = content[:(content.index(':'))] to take the string from the start up to the first ":".
Third, check whether the key already exists in the dict, and create an empty list only if it does not. Otherwise, you would wipe out the elements already collected in that key's list.
Last but not least, append the element to the list stored under that key.
Your final result:
{'Alex': ['Alex:1990:London', 'Alex:1978:Shanghai'], 'Tony': ['Tony:1993:NYC', 'Tony:2001:LA'], 'Kate': ['Kate:2001:Beijing']}
If you are a beginner (as it seems) and don't want to use Python's collections module, you can do the implementation from scratch (it is important to understand the background work that collections does for you).
Once you are familiar with this, you can move on to the collections module, which is beautiful, as it has many classes such as defaultdict, OrderedDict, etc. that can speed up your work.
Here is what I have tried (do not forget to read the comments).
I have written a function named get_my_target_dict() which takes my_list and returns my_target_dict. This is the modular implementation (which you should prefer).
re is a module for working with regular expressions. Here it is used to handle strings like "Alex: 1990 : London" (i.e. with spaces around :), in case any were entered that way by mistake.
import re

def get_my_target_dict(my_list):
    my_target_dict = {}  # dictionary

    for string in my_list:
        # "Alex:1990:London" => ["Alex", "1990", "London"]
        # "Alex : 1990: London" => ["Alex", "1990", "London"]
        items = re.split(r"\s*:\s*", string)  # `\s*` is to match spaces around `:`
        print(items)

        # Alex, Tony etc.
        key = items[0]
        if key in my_target_dict:
            my_target_dict[key].append(string)
        else:
            my_target_dict[key] = [string]

    return my_target_dict

if __name__ == "__main__":
    my_list = ["Alex:1990:London",
               "Tony:1993:NYC",
               "Kate:2001:Beijing",
               "Tony:2001:LA",
               "Alex:1978:Shanghai"]

    # Call get_my_target_dict(), pass my_list & get my_target_dict
    my_target_dict = get_my_target_dict(my_list)
    print(my_target_dict)
    # {'Alex': ['Alex:1990:London', 'Alex:1978:Shanghai'], 'Tony': ['Tony:1993:NYC', 'Tony:2001:LA'], 'Kate': ['Kate:2001:Beijing']}

    # Pretty printing dictionary
    import json
    print(json.dumps(my_target_dict, indent=4))
    # {
    #     "Alex": [
    #         "Alex:1990:London",
    #         "Alex:1978:Shanghai"
    #     ],
    #     "Tony": [
    #         "Tony:1993:NYC",
    #         "Tony:2001:LA"
    #     ],
    #     "Kate": [
    #         "Kate:2001:Beijing"
    #     ]
    # }
I'm working on a small framework and I've found a place where it would be beneficial to save a dictionary key as a variable.
The problem I have is that the dictionary may have any number of layers, so it's not just a case of storing the final key. For example in the below I am accessing ['dig']['result'], but that could equally be ['output'] or ['some']['thing']['strange']
if result:
    if self.cli_args.json:
        pprint(result)
    else:
        print result['dig']['result']
I could save the key as a string and use eval() in something such as:
key="['test']"
test_dict = { "test" : "This works" }
eval("test_dict" + key)
>>> 'This works'
But eval is really dirty right? :-)
Is there a nice / pythonic way to do this?
To handle an arbitrary depth of key nesting, you can iterate over a sequence (e.g. tuple) of the keys:
>>> d = {'a': {'b': {'c': 'd'}}}
>>> d['a']['b']['c']
'd'
>>> keys = ('a', 'b', 'c') # or just 'abc' for this trivial example
>>> content = d
>>> for k in keys:
...     content = content[k]
...
>>> content
'd'
>>> def access(o, path):
...     for k in path.split('/'):
...         o = o[k]
...     return o
...
>>> access({'a': {'b': {'c': 'd'}}},'a/b/c')
'd'
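The same key-by-key walk can be written with functools.reduce and operator.getitem, which lets the key path be stored as an ordinary tuple in a variable. This is a sketch only: get_nested, key_path and the contents of result are illustrative names and data, not taken from the question's framework.

```python
from functools import reduce
from operator import getitem

def get_nested(mapping, keys):
    """Follow keys one level at a time; raises KeyError if a level is missing."""
    return reduce(getitem, keys, mapping)

# Hypothetical result structure for illustration.
result = {"dig": {"result": "some answer"}, "output": "ok"}

# The "saved key" is just a tuple, so it can be of any depth.
key_path = ("dig", "result")
print(get_nested(result, key_path))     # some answer
print(get_nested(result, ("output",)))  # ok
```

Unlike the 'a/b/c' string form, a tuple path also works when keys contain slashes or are not strings at all.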
Is a dictionary the right type for data where I want to look up entries based on an index, e.g.
dictlist = {}
dictlist['itemid' + '1'] = {'name':'AAA', 'class':'Class1', 'nonstandard':'whatever'}
dictlist['itemid' + '2'] = {'name':'BBB', 'class':'Class2', 'maynotbehere':'optional'}
dictlist['itemid' + '3'] = {'name':'CCC', 'class':'Class3', 'regular':'or not'}
I can now address a specific item, e.g.
finditem='itemid2'
dictitem = {}
try:
    dictitem[finditem] = dictlist[finditem]
    print dictitem
except KeyError:
    print "Nothing there"
Is that the right way to create such a lookup table in python?
If I now wanted to print the data, but only the Item ID, and an associated dictionary with only name and class "properties", how can I do that?
I am looking for something that will create a new dictionary by copying the desired properties only, or else present a limited view of the existing dictionary, as if the unspecified properties were not there. So for example
view(dictlist, 'name', 'class')
will return a dictionary that displays a restricted view of the list, showing only the name and class keys. I have tried
view = {}
for item in dictlist:
    view[item] = {dictlist[item]['name'], dictlist[item]['class']}
print view
Which returns
{'itemid1': set(['AAA', 'Class1']), 'itemid3': set(['Class3', 'CCC']), 'itemid2': set(['Class2', 'BBB'])}
Instead of
{'itemid1': {'name':'AAA', 'class':'Class1'}, 'itemid3': {'name':'CCC', 'class':'Class3'}, 'itemid2': {'name':'BBB', 'class':'Class2'} }
Note that {'foo', 'bar'} is a set literal, not a dictionary literal, as it does not have the key: value syntax required for a dictionary:
>>> type({'foo', 'bar'})
<class 'set'>
>>> type({'foo': 'bar'})
<class 'dict'>
You need to be more careful with your syntax generally; I have no idea what the random closing square brackets ] are doing in the output you claim you want, and it's missing a closing brace }.
You could extend your current code to do keys and values as follows:
for item in dictlist:
    view[item] = {'name': dictlist[item]['name'],
                  'class': dictlist[item]['class']}
but a more generic function would look like:
def view(dictlist, *keys):
    output = {}
    for item in dictlist:
        output[item] = {}
        for key in keys:
            output[item][key] = dictlist[item].get(key)
    return output
Note the use of dict.get to handle missing keys gracefully:
>>> d = {'foo': 'bar'}
>>> d.get('foo')
'bar' # returns the value if key present, or
>>> d.get('baz')
>>> # returns None by default
or, using a "dictionary comprehension":
def view(dictlist, *keys):
    return {k1: {k2: v2 for k2, v2 in v1.items() if k2 in keys}
            for k1, v1 in dictlist.items()}
(This will exclude missing keys from the output, whereas the previous code will include them with None value - which is preferable will depend on your use case.)
Note the use of *keys to take an arbitrary number of positional arguments:
>>> def test(d, *keys):
...     print(keys)
...
>>> test({}, "foo", "bar", "baz")
('foo', 'bar', 'baz')
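As a quick check against the question's data, here is a sketch that applies a view function to dictlist. This variant is written as a comprehension but keeps the dict.get behaviour, so a requested key that is missing shows up as None; the variable name restricted is illustrative.

```python
def view(dictlist, *keys):
    # For each item, keep only the requested keys; missing ones become None.
    return {
        item_id: {key: props.get(key) for key in keys}
        for item_id, props in dictlist.items()
    }

dictlist = {
    "itemid1": {"name": "AAA", "class": "Class1", "nonstandard": "whatever"},
    "itemid2": {"name": "BBB", "class": "Class2", "maynotbehere": "optional"},
    "itemid3": {"name": "CCC", "class": "Class3", "regular": "or not"},
}

restricted = view(dictlist, "name", "class")
print(restricted["itemid2"])  # {'name': 'BBB', 'class': 'Class2'}

# Asking for a key that only some items have yields None for the others.
print(view(dictlist, "regular")["itemid1"])  # {'regular': None}
```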
I have a dictionary with three-level nesting, e.g.:
d = {
    'sp1': {
        'a1': {'c1': 2, 'c2': 3},
        'a2': {'c3': 1, 'c4': 4}
    },
    'sp2': {
        'a1': {'c1': 3, 'c2': 3},
        'a2': {'c3': 2, 'c4': 0}
    }
}
All 2nd-level dictionaries contain the same elements, so I want to change it to
d2 = {'a1':{'c1':{'sp1':2,'sp2':3}, 'c2':{'sp1':3,'sp2':3}}}
i.e. essentially switch nesting order. But when I write code like
d2 = {}
d2['a1']['c1']['sp1'] = 2
It just throws a KeyError for the first missing key, here 'a1'. How do I perform such an operation?
If you are doing it manually like the snippet you tried, this is how you should be doing it:
>>> d = {
... 'sp1':{
... 'a1':{'c1':2,'c2':3},
... 'a2':{'c3':1,'c4':4}
... },
... 'sp2':{
... 'a1':{'c1':3,'c2':3},
... 'a2':{'c3':2,'c4':0}
... }
... }
>>>
>>> e = {}
>>> e['a1'] = {}
>>> e['a1']['c1'] = {}
>>> e['a1']['c1']['sp1'] = d['sp1']['a1']['c1']
>>> e['a1']['c2'] = {}
>>> e['a1']['c2']['sp1'] = d['sp1']['a1']['c2']
>>> e['a2'] = {}
>>> e['a2']['c1'] = {}
>>> e['a2']['c2'] = {}
>>> e['a1']['c1']['sp2'] = d['sp2']['a1']['c1']
>>> e['a1']['c2']['sp2'] = d['sp2']['a1']['c2']
>>> e
{'a1': {'c1': {'sp1': 2, 'sp2': 3}, 'c2': {'sp1': 3, 'sp2': 3}}, 'a2': {'c1': {}, 'c2': {}}}
>>>
But it is unclear why you are doing this. As OmnipotentEntity suggested in the comments, maybe you need to use a different data structure to store the data.
To do this, you can use defaultdict, which allows you to define a default initialization action on a dict.
In your case, you want a reverse-order recursive defaultdict, with a classmethod
reverse_recursive_make() which unrolls and reverses the key order:
when passed in a key-value pair or None, returns a (toplevel) dict
when passed in a dict, recurses into each of the {k:v} pairs
I'm not going to write the code for that because what you want can be much more easily achieved with SQL, like I commented.
FOOTNOTE: your version with lambdas (comment below) is perfect.
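The lambda version referred to in that comment is not reproduced here. One way a nested defaultdict with a lambda could swap the nesting order is sketched below; the names swapped and plain are illustrative, and this is a guess at the general shape rather than the asker's actual code.

```python
from collections import defaultdict

d = {
    'sp1': {'a1': {'c1': 2, 'c2': 3}, 'a2': {'c3': 1, 'c4': 4}},
    'sp2': {'a1': {'c1': 3, 'c2': 3}, 'a2': {'c3': 2, 'c4': 0}},
}

# Two auto-creating levels: swapped[a][c] springs into existence as {}.
swapped = defaultdict(lambda: defaultdict(dict))
for sp, level2 in d.items():
    for a, level3 in level2.items():
        for c, value in level3.items():
            # Move the outermost key (sp) to the innermost position.
            swapped[a][c][sp] = value

# Convert back to plain dicts for printing or comparison.
plain = {a: dict(inner) for a, inner in swapped.items()}
print(plain['a1'])
```

Because the intermediate levels are created on first access, no KeyError is raised the way it is with d2['a1']['c1']['sp1'] = 2 on an empty plain dict.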
(If you insist on using dicts, and not some other data structure)
something like this should work
d_final = {}
for k in d.keys():
    d2 = d[k]
    for k2 in d2.keys():
        d3 = d2[k2]
        for k3 in d3.keys():
            # get (or create) d_final[k2][k3], then store the value under k
            d4 = d_final.setdefault(k2, {}).setdefault(k3, {})
            d4[k] = d3[k3]
I may have my indexing off a little, but that should be about right.