Python `dict` indexed by tuple: Getting a slice of the pie

Python `dict` indexed by tuple: Getting a slice of the pie - python

Let's say I have
my_dict = {
("airport", "London"): "Heathrow",
("airport", "Tokyo"): "Narita",
("hipsters", "London"): "Soho"
}
What is an efficient (no scanning of all keys), yet elegant way to get all airports out of this dictionary, i.e. expected output ["Heathrow", "Narita"]. In databases that can index by tuples, it's usually possible to do something like
airports = my_dict.get(("airport",*))
(but usually only with the 'stars' sitting at the rightmost places in the tuple since the index usually is only stored in one order).
Since I imagine Python to index dictionary with tuple keys in a similar way (using the keys's inherent order), I imagine there might be a method I could use to slice the index this way?
Edit1: Added expected output
Edit2: Removed last phrase. Added '(no scanning of all keys)' to the conditions to make it clearer.

The way your data is currently organized doesn't allow efficient lookup - essentially you have to scan all the keys.
Dictionaries are hash tables behind the scenes, and the only way to access a value is to get the hash of the key - and for that, you need the whole key.
Use a nested hierarchy like this, so you can do a direct O(1) lookup:
my_dict = {
"airport": {
"London": "Heathrow",
"Tokyo": "Narita",
},
"hipsters": {
"London": "Soho"
}
}

Check "airport" is present in the every key in the dictionary.
Demo:
>>> [value for key, value in my_dict.items() if "airport" in key]
['Narita', 'Heathrow']
>>>
Yes, Nested dictionary will be better option.
>>> my_dict = {
... "airport": {
... "London": "Heathrow",
... "Tokyo": "Narita",
... },
... "hipsters": {
... "London": "Soho"
... }
... }
>>>
>>> if "airport" in my_dict:
... result = my_dict["airport"].values()
... else:
... result = []
...
>>> print result
['Heathrow', 'Narita']
>>>

What I'd like to avoid, if possible, is to go through all dictionary keys and filter them down.
Why? Why do you think Python is doing the equivalent of a DB full table scan? Filtering a dictionary does not mean sequential scanning it.
Python:
[value for key, value in my_dict.items() if key[0] == "airport"]
Output:
['Narita', 'Heathrow']

Related

Nested Dictionary (JSON): Merge multiple keys stored in a list to access its value from the dict

I have a JSON with an unknown number of keys & values, I need to store the user's selection in a list & then access the selected key's value; (it'll be guaranteed that the keys in the list are always stored in the correct sequence).
Example
I need to access the value_key1-2.
mydict = {
'key1': {
'key1-1': {
'key1-2': 'value_key1-2'
},
},
'key2': 'value_key2'
}
I can see the keys & they're limited so I can manually use:
>>> print(mydict['key1']['key1-1']['key1-2'])
>>> 'value_key1-2'
Now after storing the user's selections in a list, we have the following list:
Uselection = ['key1', 'key1-1', 'key1-2']
How can I convert those list elements into the similar code we used earlier?
How can I automate it using Python?

You have to loop the list of keys and update the "current value" on each step.
val = mydict
try:
for key in Uselection:
val = val[key]
except KeyError:
handle non-existing keys here
Another, more 'posh' way to do the same (not generally recommended):
from functools import reduce
val = reduce(dict.get, Uselection, mydict)

Check for string in list items using list as reference

I want to replace items in a list based on another list as reference.
Take this example lists stored inside a dictionary:
dict1 = {
"artist1": ["dance pop","pop","funky pop"],
"artist2": ["chill house","electro house"],
"artist3": ["dark techno","electro techno"]
}
Then, I have this list as reference:
wish_list = ["house","pop","techno"]
My result should look like this:
dict1 = {
"artist1": ["pop"],
"artist2": ["house"],
"artist3": ["techno"]
}
I want to check if any of the list items inside "wishlist" is inside one of the values of the dict1. I tried around with regex, any.
This was an approach with just 1 list instead of a dictionary of multiple lists:
check = any(item in artist for item in wish_list)
if check == True:
artist_genres.clear()
artist_genres.append()
I am just beginning with Python on my own and am playing around with the SpotifyAPI to clean up my favorite songs into playlists. Thank you very much for your help!

The idea is like this,
dict1 = { "artist1" : ["dance pop","pop","funky pop"],
"artist2" : ["house","electro house"],
"artist3" : ["techno","electro techno"] }
wish_list = ["house","pop","techno"]
dict2={}
for key,value in dict1.items():
for i in wish_list:
if i in value:
dict2[key]=i
break
print(dict2)

A regex is not needed, you can get away by simply iterating over the list:
wish_list = ["house","pop","techno"]
dict1 = {
"artist1": ["dance pop","pop","funky pop"],
"artist2": ["chill house","electro house"],
"artist3": ["dark techno","electro techno"]
}
dict1 = {
# The key is reused as-is, no need to change it.
# The new value is the wishlist, filtered based on its presence in the current value
key: [genre for genre in wish_list if any(genre in item for item in value)]
for key, value in dict1.items() # this method returns a tuple (key, value) for each entry in the dictionary
}
This implementation relies a lot on list comprehensions (and also dictionary comprehensions), you might want to check it if it's new to you.

How to properly keep structure when removing keys in JSON using python?

I'm using this as a reference: Elegant way to remove fields from nested dictionaries
I have a large number of JSON-formatted data here and we've determined a list of unnecessary keys (and all their underlying values) that we can remove.
I'm a bit new to working with JSON and Python specifically (mostly did sysadmin work) and initially thought it was just a plain dictionary of dictionaries. While some of the data looks like that, several more pieces of data consists of dictionaries of lists, which can furthermore contain more lists or dictionaries with no specific pattern.
The idea is to keep the data identical EXCEPT for the specified keys and associated values.
Test Data:
to_be_removed = ['leecher_here']
easy_modo =
{
'hello_wold':'konnichiwa sekai',
'leeching_forbidden':'wanpan kinshi',
'leecher_here':'nushiyowa'
}
lunatic_modo =
{
'hello_wold':
{'
leecher_here':'nushiyowa','goodbye_world':'aokigahara'
},
'leeching_forbidden':'wanpan kinshi',
'leecher_here':'nushiyowa',
'something_inside':
{
'hello_wold':'konnichiwa sekai',
'leeching_forbidden':'wanpan kinshi',
'leecher_here':'nushiyowa'
},
'list_o_dicts':
[
{
'hello_wold':'konnichiwa sekai',
'leeching_forbidden':'wanpan kinshi',
'leecher_here':'nushiyowa'
}
]
}
Obviously, the original question posted there isn't accounting for lists.
My code, modified appropriately to work with my requirements.
from copy import deepcopy
def remove_key(json,trash):
"""
<snip>
"""
keys_set = set(trash)
modified_dict = {}
if isinstance(json,dict):
for key, value in json.items():
if key not in keys_set:
if isinstance(value, dict):
modified_dict[key] = remove_key(value, keys_set)
elif isinstance(value,list):
for ele in value:
modified_dict[key] = remove_key(ele,trash)
else:
modified_dict[key] = deepcopy(value)
return modified_dict
I'm sure something's messing with the structure since it doesn't pass the test I wrote since the expected data is exactly the same, minus the removed keys. The test shows that, yes it's properly removing the data but for the parts where it's supposed to be a list of dictionaries, it's only getting returned as a dictionary instead which will have unfortunate implications down the line.
I'm sure it's because the function returns a dictionary but I don't know to proceed from here in order to maintain the structure.
At this point, I'm needing help on what I could have overlooked.

When you go through your json file, you only need to determine whether it is a list, a dict or neither. Here is a recursive way to modify your input dict in place:
def remove_key(d, trash=None):
if not trash: trash = []
if isinstance(d,dict):
keys = [k for k in d]
for key in keys:
if any(key==s for s in trash):
del d[key]
for value in d.values():
remove_key(value, trash)
elif isinstance(d,list):
for value in d:
remove_key(value, trash)
remove_key(lunatic_modo,to_be_removed)
remove_key(easy_modo,to_be_removed)
Result:
{
"hello_wold": {
"goodbye_world": "aokigahara"
},
"leeching_forbidden": "wanpan kinshi",
"something_inside": {
"hello_wold": "konnichiwa sekai",
"leeching_forbidden": "wanpan kinshi"
},
"list_o_dicts": [
{
"hello_wold": "konnichiwa sekai",
"leeching_forbidden": "wanpan kinshi"
}
]
}
{
"hello_wold": "konnichiwa sekai",
"leeching_forbidden": "wanpan kinshi"
}

Retrieve a key-value pair from a dict as another dict

I have a dictionary that looks like:
{u'message': u'Approved', u'reference': u'A71E7A739E24', u'success': True}
I would like to retrieve the key-value pair for reference, i.e. { 'reference' : 'A71E7A739E24' }.
I'm trying to do this using iteritems which does return k, v pairs, and then I'm adding them to a new dictionary. But then, the resulting value is unicode rather than str for some reason and I'm not sure if this is the most straightforward way to do it:
ref = {}
for k, v in charge.iteritems():
if k == 'reference':
ref['reference'] = v
print ref
{'reference': u'A71E7A739E24'}
Is there a built-in way to do this more easily? Or, at least, to avoid using iteritems and simply return:
{ 'reference' : 'A71E7A739E24' }

The trouble with using iteritems is that you increase lookup time to O(n) where n is dictionary size, because you are no longer using a hash table

If you only need to get one key-value pair, it's as simple as
ref = { key: d[key] }
If there may be multiple pairs that are selected by some condition,
either use dict from iterable constructor (the 2nd version is better if your condition depends on values, too):
ref = dict(k,d[k] for k in charge if <condition>)
ref = dict(k,v for k,v in charge.iteritems() if <condition>)
or (since 2.7) a dict comprehension (which is syntactic sugar for the above):
ref = {k,d[k] for k in charge if <condition>}
<same as above>

I dont understand the question:
is this what you are trying to do:
ref={'reference',charge["reference"]}

Accessing elements of Python dictionary by index

Consider a dict like
mydict = {
'Apple': {'American':'16', 'Mexican':10, 'Chinese':5},
'Grapes':{'Arabian':'25','Indian':'20'} }
How do I access for instance a particular element of this dictionary?
for instance, I would like to print the first element after some formatting the first element of Apple which in our case is 'American' only?
Additional information
The above data structure was created by parsing an input file in a python function. Once created however it remains the same for that run.
I am using this data structure in my function.
So if the file changes, the next time this application is run the contents of the file are different and hence the contents of this data structure will be different but the format would be the same.
So you see I in my function I don't know that the first element in Apple is 'American' or anything else so I can't directly use 'American' as a key.

Given that it is a dictionary you access it by using the keys. Getting the dictionary stored under "Apple", do the following:
>>> mydict["Apple"]
{'American': '16', 'Mexican': 10, 'Chinese': 5}
And getting how many of them are American (16), do like this:
>>> mydict["Apple"]["American"]
'16'

If the questions is, if I know that I have a dict of dicts that contains 'Apple' as a fruit and 'American' as a type of apple, I would use:
myDict = {'Apple': {'American':'16', 'Mexican':10, 'Chinese':5},
'Grapes':{'Arabian':'25','Indian':'20'} }
print myDict['Apple']['American']
as others suggested. If instead the questions is, you don't know whether 'Apple' as a fruit and 'American' as a type of 'Apple' exist when you read an arbitrary file into your dict of dict data structure, you could do something like:
print [ftype['American'] for f,ftype in myDict.iteritems() if f == 'Apple' and 'American' in ftype]
or better yet so you don't unnecessarily iterate over the entire dict of dicts if you know that only Apple has the type American:
if 'Apple' in myDict:
if 'American' in myDict['Apple']:
print myDict['Apple']['American']
In all of these cases it doesn't matter what order the dictionaries actually store the entries. If you are really concerned about the order, then you might consider using an OrderedDict:
http://docs.python.org/dev/library/collections.html#collections.OrderedDict

As I noticed your description, you just know that your parser will give you a dictionary that its values are dictionary too like this:
sampleDict = {
"key1": {"key10": "value10", "key11": "value11"},
"key2": {"key20": "value20", "key21": "value21"}
}
So you have to iterate over your parent dictionary. If you want to print out or access all first dictionary keys in sampleDict.values() list, you may use something like this:
for key, value in sampleDict.items():
print value.keys()[0]
If you want to just access first key of the first item in sampleDict.values(), this may be useful:
print sampleDict.values()[0].keys()[0]
If you use the example you gave in the question, I mean:
sampleDict = {
'Apple': {'American':'16', 'Mexican':10, 'Chinese':5},
'Grapes':{'Arabian':'25','Indian':'20'}
}
The output for the first code is:
American
Indian
And the output for the second code is:
American
EDIT 1:
Above code examples does not work for version 3 and above of python; since from version 3, python changed the type of output of methods keys and values from list to dict_values. Type dict_values is not accepting indexing, but it is iterable. So you need to change above codes as below:
First One:
for key, value in sampleDict.items():
print(list(value.keys())[0])
Second One:
print(list(list(sampleDict.values())[0].keys())[0])

I know this is 8 years old, but no one seems to have actually read and answered the question.
You can call .values() on a dict to get a list of the inner dicts and thus access them by index.
>>> mydict = {
... 'Apple': {'American':'16', 'Mexican':10, 'Chinese':5},
... 'Grapes':{'Arabian':'25','Indian':'20'} }
>>>mylist = list(mydict.values())
>>>mylist[0]
{'American':'16', 'Mexican':10, 'Chinese':5},
>>>mylist[1]
{'Arabian':'25','Indian':'20'}
>>>myInnerList1 = list(mylist[0].values())
>>>myInnerList1
['16', 10, 5]
>>>myInnerList2 = list(mylist[1].values())
>>>myInnerList2
['25', '20']

As a bonus, I'd like to offer kind of a different solution to your issue. You seem to be dealing with nested dictionaries, which is usually tedious, especially when you have to check for existence of an inner key.
There are some interesting libraries regarding this on pypi, here is a quick search for you.
In your specific case, dict_digger seems suited.
>>> import dict_digger
>>> d = {
'Apple': {'American':'16', 'Mexican':10, 'Chinese':5},
'Grapes':{'Arabian':'25','Indian':'20'}
}
>>> print(dict_digger.dig(d, 'Apple','American'))
16
>>> print(dict_digger.dig(d, 'Grapes','American'))
None

You can use mydict['Apple'].keys()[0] in order to get the first key in the Apple dictionary, but there's no guarantee that it will be American. The order of keys in a dictionary can change depending on the contents of the dictionary and the order the keys were added.

You can't rely on order of dictionaries, but you may try this:
mydict['Apple'].items()[0][0]
If you want the order to be preserved you may want to use this:
http://www.python.org/dev/peps/pep-0372/#ordered-dict-api

Simple Example to understand how to access elements in the dictionary:-
Create a Dictionary
d = {'dog' : 'bark', 'cat' : 'meow' }
print(d.get('cat'))
print(d.get('lion'))
print(d.get('lion', 'Not in the dictionary'))
print(d.get('lion', 'NA'))
print(d.get('dog', 'NA'))
Explore more about Python Dictionaries and learn interactively here...

Few people appear, despite the many answers to this question, to have pointed out that dictionaries are un-ordered mappings, and so (until the blessing of insertion order with Python 3.7) the idea of the "first" entry in a dictionary literally made no sense. And even an OrderedDict can only be accessed by numerical index using such uglinesses as mydict[mydict.keys()[0]] (Python 2 only, since in Python 3 keys() is a non-subscriptable iterator.)
From 3.7 onwards and in practice in 3,6 as well - the new behaviour was introduced then, but not included as part of the language specification until 3.7 - iteration over the keys, values or items of a dict (and, I believe, a set also) will yield the least-recently inserted objects first. There is still no simple way to access them by numerical index of insertion.
As to the question of selecting and "formatting" items, if you know the key you want to retrieve in the dictionary you would normally use the key as a subscript to retrieve it (my_var = mydict['Apple']).
If you really do want to be able to index the items by entry number (ignoring the fact that a particular entry's number will change as insertions are made) then the appropriate structure would probably be a list of two-element tuples. Instead of
mydict = {
'Apple': {'American':'16', 'Mexican':10, 'Chinese':5},
'Grapes':{'Arabian':'25','Indian':'20'} }
you might use:
mylist = [
('Apple', {'American':'16', 'Mexican':10, 'Chinese':5}),
('Grapes', {'Arabian': '25', 'Indian': '20'}
]
Under this regime the first entry is mylist[0] in classic list-endexed form, and its value is ('Apple', {'American':'16', 'Mexican':10, 'Chinese':5}). You could iterate over the whole list as follows:
for (key, value) in mylist: # unpacks to avoid tuple indexing
if key == 'Apple':
if 'American' in value:
print(value['American'])
but if you know you are looking for the key "Apple", why wouldn't you just use a dict instead?
You could introduce an additional level of indirection by cacheing the list of keys, but the complexities of keeping two data structures in synchronisation would inevitably add to the complexity of your code.

With the following small function, digging into a tree-shaped dictionary becomes quite easy:
def dig(tree, path):
for key in path.split("."):
if isinstance(tree, dict) and tree.get(key):
tree = tree[key]
else:
return None
return tree
Now, dig(mydict, "Apple.Mexican") returns 10, while dig(mydict, "Grape") yields the subtree {'Arabian':'25','Indian':'20'}. If a key is not contained in the dictionary, dig returns None.
Note that you can easily change (or even parameterize) the separator char from '.' to '/', '|' etc.

mydict = {
'Apple': {'American':'16', 'Mexican':10, 'Chinese':5},
'Grapes':{'Arabian':'25','Indian':'20'} }
for n in mydict:
print(mydict[n])

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Python `dict` indexed by tuple: Getting a slice of the pie - python

Related

Nested Dictionary (JSON): Merge multiple keys stored in a list to access its value from the dict

Check for string in list items using list as reference

How to properly keep structure when removing keys in JSON using python?

Retrieve a key-value pair from a dict as another dict

Accessing elements of Python dictionary by index

Categories

Resources