I wrote code that reads 9 keys from an API response.
The authors, isbn_one, isbn_two, thumbnail and page_count fields may not always be present, and if any of them is missing I would like its value to be None. Unfortunately, if statements, even nested ones, don't work, because they lead to a lot of branching. I also tried try/except KeyError and so on, but each key raises its own error and it is not known which key to assign None to. Here is an example of the logic when the thumbnail is missing:
th = result['volumeInfo'].get('imageLinks')
if th is not None:
    book_exists_thumbnail = {
        'thumbnail': result['volumeInfo']['imageLinks']['thumbnail']
    }
    dnew = {**book_data, **book_exists_thumbnail}
    book_import.append(dnew)
else:
    book_exists_thumbnail_n = {
        'thumbnail': None
    }
    dnew_none = {**book_data, **book_exists_thumbnail_n}
    book_import.append(dnew_none)
With if statements, as soon as one condition is met, e.g. for the thumbnail, the rest are not even checked.
With try and except it's similar. The keys also include the ISBNs, but those sit in a list of dictionaries, so I need to use something like this:
import collections

isbn_zer = result['volumeInfo']['industryIdentifiers']
dic = collections.defaultdict(list)
for d in isbn_zer:
    for k, v in d.items():
        dic[k].append(v)
Output data: [{'type': 'ISBN_10', 'identifier': '8320717507'}, {'type': 'ISBN_13', 'identifier': '9788320717501'}]
I no longer know what to use to check each key separately and, when a key or one of the ISBN identifiers is missing, assign None to it. I have already tried many ideas.
The rest of the code:
book_import = []
if request.method == 'POST':
    filter_ch = BookFilterForm(request.POST)
    if filter_ch.is_valid():
        cd = filter_ch.cleaned_data
        filter_choice = cd['choose_v']
        filter_search = cd['search']
        search_url = "https://www.googleapis.com/books/v1/volumes?"
        params = {
            'q': '{}{}'.format(filter_choice, filter_search),
            'key': settings.BOOK_DATA_API_KEY,
            'maxResults': 2,
            'printType': 'books'
        }
        r = requests.get(search_url, params=params)
        results = r.json()['items']
        for result in results:
            book_data = {
                'title': result['volumeInfo']['title'],
                'authors': result['volumeInfo']['authors'][0],
                'publish_date': result['volumeInfo']['publishedDate'],
                'isbn_one': result['volumeInfo']['industryIdentifiers'][0]['identifier'],
                'isbn_two': result['volumeInfo']['industryIdentifiers'][1]['identifier'],
                'page_count': result['volumeInfo']['pageCount'],
                'thumbnail': result['volumeInfo']['imageLinks']['thumbnail'],
                'country': result['saleInfo']['country']
            }
            book_import.append(book_data)
else:
    filter_ch = BookFilterForm()
return render(request, "BookApp/book_import.html", {'book_import': book_import,
                                                    'filter_ch': filter_ch})
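One way to check each key separately, sketched here under some assumptions, is to read every optional field with dict.get() and a default, so that a missing key yields None without any branching. The helper name extract_book is hypothetical, and mapping isbn_one to ISBN_10 and isbn_two to ISBN_13 is an assumption based on the sample identifiers shown above.

```python
def extract_book(result):
    """Build one book record, using None for any field the API omitted."""
    info = result.get('volumeInfo', {})
    # Map identifier type -> value, e.g. {'ISBN_10': '8320717507', ...}
    identifiers = {
        d.get('type'): d.get('identifier')
        for d in info.get('industryIdentifiers', [])
    }
    # A missing or empty authors list falls back to a single None
    authors = info.get('authors') or [None]
    return {
        'title': info.get('title'),
        'authors': authors[0],
        'publish_date': info.get('publishedDate'),
        'isbn_one': identifiers.get('ISBN_10'),   # assumed mapping
        'isbn_two': identifiers.get('ISBN_13'),   # assumed mapping
        'page_count': info.get('pageCount'),
        'thumbnail': info.get('imageLinks', {}).get('thumbnail'),
        'country': result.get('saleInfo', {}).get('country'),
    }


# A result with no authors, no thumbnail and only one ISBN
sample = {
    'volumeInfo': {
        'title': 'Example',
        'industryIdentifiers': [
            {'type': 'ISBN_13', 'identifier': '9788320717501'},
        ],
    },
}
book = extract_book(sample)
print(book)
```

Because each field is looked up independently, one missing key does not stop the others from being read, and the loop body reduces to book_import.append(extract_book(result)).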
I am struggling to figure out the best way to loop over the output of an API call. The output of this API is a graph connection and I am a little out of my element. I need to obtain the IDs from the API output and hold them in a dict, or some other form that I can pass to another API call.
Note that the original output is a graph connection: print(type(api_response)) shows it as a list; however, print(type(api_response[0])) returns a
This is the original output from the api call:
[{'_from': None, 'to': {'id': '5c9941fcdd2eeb6a6787916e', 'type': 'user'}}, {'_from': None, 'to': {'id': '5cc9055fcc5781152ca6eeb8', 'type': 'user'}}, {'_from': None, 'to': {'id': '5d1cf102c94c052cf1bfb3cc', 'type': 'user'}}]
This is the code that I have up to this point.....
import string

api_response = api_instance.graph_user_group_members_list(group_id, content_type, accept, limit=limit, skip=skip, x_org_id=x_org_id)

def extract_id(result):
    result = str(result).split(' ')
    for i, r in enumerate(result):
        if 'id' in r:
            id = (result[i+1].translate(str.maketrans('', '', string.punctuation)))
            print(id)
            return id

extract_id(api_response)

def extract_id(result):
    result = str(result).split(' ')
    for i, r in enumerate(result):
        if 'id' in r:
            id = (result[i+8].translate(str.maketrans('', '', string.punctuation)))
            print(id)
            return id

extract_id(api_response)

def extract_id(result):
    result = str(result).split(' ')
    for i, r in enumerate(result):
        if 'id' in r:
            id = (result[i+15].translate(str.maketrans('', '', string.punctuation)))
            print(id)
            return id

extract_id(api_response)
I have been able to extract the IDs with a function, but only by going through a string. I need a scalable solution that lets me pass these IDs along to another API call.
I have tried to use a for loop, but because it is one string and i+1 defines the ID's position, it is redundant and just outputs one of the IDs multiple times.
Each of these functions gives me the correct output; however, this is not scalable and is just not a solution. Please help guide me.
So to solve the response-as-a-string issue I would suggest using Python's built-in json module. Specifically, the function json.loads() can convert a string to a dict or a list of dicts. From there you can iterate over the list or dict and check whether the key is equal to 'id'. Here's an example based on what you said the response looks like.
import json

s = "[{'_from': None, 'to': {'id': '5c9941fcdd2eeb6a6787916e', 'type': 'user'}}, {'_from': None, 'to': {'id': '5cc9055fcc5781152ca6eeb8', 'type': 'user'}}, {'_from': None, 'to': {'id': '5d1cf102c94c052cf1bfb3cc', 'type': 'user'}}]"
# json uses double quotes and null; there is probably a better way to do this though
s = s.replace("\'", '\"').replace('None', 'null')
response = json.loads(s)  # list of dicts
for d in response:
    for key, value in d['to'].items():
        if key == 'id':
            print(value)  # or whatever else you want to do
# 5c9941fcdd2eeb6a6787916e
# 5cc9055fcc5781152ca6eeb8
# 5d1cf102c94c052cf1bfb3cc
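As a possible alternative to rewriting the quotes by hand, the string shown in the question is already valid Python literal syntax, so ast.literal_eval from the standard library can parse it directly. This is only a sketch and assumes the string form of the response really looks like the sample above; the variable names are illustrative.

```python
import ast

s = ("[{'_from': None, 'to': {'id': '5c9941fcdd2eeb6a6787916e', 'type': 'user'}}, "
     "{'_from': None, 'to': {'id': '5cc9055fcc5781152ca6eeb8', 'type': 'user'}}, "
     "{'_from': None, 'to': {'id': '5d1cf102c94c052cf1bfb3cc', 'type': 'user'}}]")

# literal_eval understands single quotes and None, so no replace() is needed
response = ast.literal_eval(s)

# Collect every ID in one pass, ready to hand to another API call
ids = [edge['to']['id'] for edge in response]
print(ids)
# ['5c9941fcdd2eeb6a6787916e', '5cc9055fcc5781152ca6eeb8', '5d1cf102c94c052cf1bfb3cc']
```

The list comprehension scales with the number of entries, so no per-position offsets such as i+1, i+8 or i+15 are needed.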
For some post-processing, I need to flatten a structure like this
{'foo': {
'cat': {'name': 'Hodor', 'age': 7},
'dog': {'name': 'Mordor', 'age': 5}},
'bar': { 'rat': {'name': 'Izidor', 'age': 3}}
}
into this dataset:
[{'foobar': 'foo', 'animal': 'dog', 'name': 'Mordor', 'age': 5},
{'foobar': 'foo', 'animal': 'cat', 'name': 'Hodor', 'age': 7},
{'foobar': 'bar', 'animal': 'rat', 'name': 'Izidor', 'age': 3}]
So I wrote this function:
import copy

def flatten(data, primary_keys):
    out = []
    keys = copy.copy(primary_keys)
    keys.reverse()

    def visit(node, primary_values, prim):
        if len(prim):
            p = prim.pop()
            for key, child in node.items():
                primary_values[p] = key
                visit(child, primary_values, copy.copy(prim))
        else:
            new = copy.copy(node)
            new.update(primary_values)
            out.append(new)

    visit(data, {}, keys)
    return out

out = flatten(a, ['foobar', 'animal'])
I was not really satisfied because I have to use copy.copy to protect my inputs. Obviously, when using flatten one does not want the inputs to be altered.
Then I thought about one alternative that uses more global variables (at least global to flatten) and uses an index instead of directly passing primary_keys to visit. However, this does not really help me to get rid of the ugly initial copy:
keys = copy.copy(primary_keys)
keys.reverse()
So here is my final version:
def flatten(data, keys):
    # Note: copy.copy is shallow, so the nested leaf dicts are still shared with the caller
    data = copy.copy(data)
    keys = copy.copy(keys)
    keys.reverse()
    out = []
    values = {}

    def visit(node, id):
        if id:
            id -= 1
            for key, child in node.items():
                values[keys[id]] = key
                visit(child, id)
        else:
            node.update(values)
            out.append(node)

    visit(data, len(keys))
    return out
Is there a better implementation (that can avoid the use of copy.copy)?
Edit: modified to account for variable dictionary depth.
By using the merge function from my previous answer (below), you can avoid calling update which modifies the caller. There is then no need to copy the dictionary first.
def flatten(data, keys):
    out = []
    values = {}

    def visit(node, id):
        if id:
            id -= 1
            for key, child in node.items():
                # Index from the front so keys never has to be reversed
                values[keys[len(keys) - 1 - id]] = key
                visit(child, id)
        else:
            out.append(merge(node, values))  # use merge instead of update

    visit(data, len(keys))
    return out
The keys input only needed protecting because keys.reverse() reverses the list in place. The version above never modifies keys, so no copy is needed.
Previous answer
How about list comprehension?
def merge(d1, d2):
    return dict(list(d1.items()) + list(d2.items()))

# A single comprehension with two for clauses yields one flat list
[merge({'foobar': key, 'animal': sub_key}, sub_sub_dict)
 for key, sub_dict in a.items()
 for sub_key, sub_sub_dict in sub_dict.items()]
The tricky part was merging the dictionaries without using update (which returns None).
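For Python 3.5 and later, the same idea can be sketched without copy or a separate merge helper, because {**d1, **d2} builds a new dictionary and leaves both inputs untouched. The recursive generator below is an illustrative variant, not the code from either post; it handles any depth given by the length of keys.

```python
def flatten(data, keys):
    """Flatten nested dicts; keys names the levels from the outside in."""
    def visit(node, depth, path):
        if depth == len(keys):
            # Build a fresh dict, so neither node nor keys is modified
            yield {**dict(zip(keys, path)), **node}
        else:
            for key, child in node.items():
                yield from visit(child, depth + 1, path + (key,))
    return list(visit(data, 0, ()))


a = {'foo': {'cat': {'name': 'Hodor', 'age': 7},
             'dog': {'name': 'Mordor', 'age': 5}},
     'bar': {'rat': {'name': 'Izidor', 'age': 3}}}

rows = flatten(a, ['foobar', 'animal'])
for row in rows:
    print(row)
```

The path is carried as an immutable tuple, so there is no shared state between branches and nothing to protect with a copy.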
Given the following data received from a web form:
for key in request.form.keys():
print key, request.form.getlist(key)
group_name [u'myGroup']
category [u'social group']
creation_date [u'03/07/2013']
notes [u'Here are some notes about the group']
members[0][name] [u'Adam']
members[0][location] [u'London']
members[0][dob] [u'01/01/1981']
members[1][name] [u'Bruce']
members[1][location] [u'Cardiff']
members[1][dob] [u'02/02/1982']
How can I turn it into a dictionary like this? It's eventually going to be used as JSON but as JSON and dictionaries are easily interchanged my goal is just to get to the following structure.
event = {
    'group_name': 'myGroup',
    'notes': 'Here are some notes about the group',
    'category': 'social group',
    'creation_date': '03/07/2013',
    'members': [
        {
            'name': 'Adam',
            'location': 'London',
            'dob': '01/01/1981'
        },
        {
            'name': 'Bruce',
            'location': 'Cardiff',
            'dob': '02/02/1982'
        }
    ]
}
Here's what I have managed so far. Using the following list comprehension I can easily make sense of the ordinary fields:
event = [ (key, request.form.getlist(key)[0]) for key in request.form.keys() if key[0:7] != "members" ]
but I'm struggling with the members list. There can be any number of members. I think I need to separately create a list for them and add that to a dictionary with the non-iterative records. I can get the member data like this:
tmp_members = [(key, request.form.getlist(key)) for key in request.form.keys() if key[0:7]=="members"]
Then I can pull out the list index and field name:
members_arr = []
members_orig = [(key, request.form.getlist(key)[0]) for key in request.form.keys() if key[0:7] == "members"]
for i in members_orig:
    p1 = i[0].index('[')
    p2 = i[0].index(']')
    members_index = i[0][p1+1:p2]
    p1 = i[0].rfind('[')
    members_field = i[0][p1+1:-1]
But how do I add this to my data structure? The following won't work because I could be trying to process members[1][name] before members[0][name].
members_arr[int(members_index)] = {members_field : i[1]}
This seems very convoluted. Is there a simpler way of doing this, and if not, how can I get this working?
You could store the data in a dictionary and then use the json library.
import json
json_data = json.dumps(my_dict)  # my_dict is the dictionary you built
print(json_data)
This will print a JSON string.
See the documentation for the json library.
Yes, convert it to a dictionary, then use json.dumps(), with some optional parameters, to print out the JSON in the format you need:
eventdict = {
    'group_name': 'myGroup',
    'notes': 'Here are some notes about the group',
    'category': 'social group',
    'creation_date': '03/07/2013',
    'members': [
        {'name': 'Adam',
         'location': 'London',
         'dob': '01/01/1981'},
        {'name': 'Bruce',
         'location': 'Cardiff',
         'dob': '02/02/1982'}
    ]
}
import json
print(json.dumps(eventdict, indent=4))
The order of the key:value pairs is not always consistent, but if you're just looking for pretty-looking JSON that can be parsed by a script, while remaining human-readable, this should work. You can also sort the keys alphabetically, using:
print(json.dumps(eventdict, indent=4, sort_keys=True))
The following python functions can be used to create a nested dictionary from the flat dictionary. Just pass in the html form output to decode().
def get_key_name(key):
    first_pos = key.find('[')
    return key[:first_pos]

def get_subkey_name(key):
    '''Used with lists of dictionaries only'''
    first_pos = key.rfind('[')
    last_pos = key.rfind(']')
    return key[first_pos:last_pos+1]

def get_key_index(key):
    first_pos = key.find('[')
    last_pos = key.find(']')
    return key[first_pos:last_pos+1]

def decode(idic):
    odic = {}  # Initialise an empty dictionary
    # Scan all the top level keys
    for key in idic:
        # Nested entries have [] in their key
        if '[' in key and ']' in key:
            if key.rfind('[') == key.find('[') and key.rfind(']') == key.find(']'):
                print(key, 'is a nested list')
                key_name = get_key_name(key)
                key_index = int(get_key_index(key).replace('[', '', 1).replace(']', '', 1))
                # Append can't be used because we may not get the list in the correct order.
                try:
                    odic[key_name][key_index] = idic[key][0]
                except KeyError:  # List doesn't yet exist
                    odic[key_name] = [None] * (key_index + 1)
                    odic[key_name][key_index] = idic[key][0]
                except IndexError:  # List is too short
                    odic[key_name] = odic[key_name] + ([None] * (key_index - len(odic[key_name]) + 1))
                    # TO DO: This could be a function
                    odic[key_name][key_index] = idic[key][0]
            else:
                key_name = get_key_name(key)
                key_index = int(get_key_index(key).replace('[', '', 1).replace(']', '', 1))
                subkey_name = get_subkey_name(key).replace('[', '', 1).replace(']', '', 1)
                try:
                    odic[key_name][key_index][subkey_name] = idic[key][0]
                except KeyError:  # Dictionary doesn't yet exist
                    print("KeyError")
                    # The dictionaries must not be bound to the same object
                    odic[key_name] = [{} for _ in range(key_index + 1)]
                    odic[key_name][key_index][subkey_name] = idic[key][0]
                except IndexError:  # List is too short
                    # The dictionaries must not be bound to the same object
                    odic[key_name] = odic[key_name] + [{} for _ in range(key_index - len(odic[key_name]) + 1)]
                    odic[key_name][key_index][subkey_name] = idic[key][0]
        else:
            # This can be added to the output dictionary directly
            print(key, 'is a simple key value pair')
            odic[key] = idic[key][0]
    return odic
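A shorter variant of the same approach can be sketched with a regular expression, collecting the rows by index first and sorting them at the end so that the order in which the keys arrive does not matter. The names decode_form and NESTED_KEY are hypothetical, the input is assumed to map each form key to a list of values (as getlist returns), and only keys of the form name[index][field] are treated as nested; missing indices are simply skipped rather than padded.

```python
import re

# Matches keys such as "members[0][name]"
NESTED_KEY = re.compile(r'^(\w+)\[(\d+)\]\[(\w+)\]$')


def decode_form(flat):
    """flat maps each form key to a list of values, e.g. {'notes': ['hi']}."""
    result = {}
    groups = {}  # name -> {index -> {field: value}}
    for key, values in flat.items():
        match = NESTED_KEY.match(key)
        if match is None:
            result[key] = values[0]
        else:
            name, index, field = match.groups()
            groups.setdefault(name, {}).setdefault(int(index), {})[field] = values[0]
    # Sorting by index makes the arrival order of the keys irrelevant
    for name, rows in groups.items():
        result[name] = [rows[i] for i in sorted(rows)]
    return result


form = {
    'group_name': ['myGroup'],
    'members[1][name]': ['Bruce'],
    'members[0][name]': ['Adam'],
    'members[0][location]': ['London'],
    'members[1][location]': ['Cardiff'],
}
event = decode_form(form)
print(event)
```

Using a dict keyed by index as the intermediate structure avoids the KeyError and IndexError handling, at the cost of one extra pass to turn it into a list.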
Is there a way in Python to serialize a dictionary that is using a tuple as key?
e.g.
a = {(1, 2): 'a'}
simply using json.dumps(a) raises this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.6/json/__init__.py", line 230, in dumps
return _default_encoder.encode(obj)
File "/usr/lib/python2.6/json/encoder.py", line 367, in encode
chunks = list(self.iterencode(o))
File "/usr/lib/python2.6/json/encoder.py", line 309, in _iterencode
for chunk in self._iterencode_dict(o, markers):
File "/usr/lib/python2.6/json/encoder.py", line 268, in _iterencode_dict
raise TypeError("key {0!r} is not a string".format(key))
TypeError: key (1, 2) is not a string
You can't serialize that as JSON; JSON has a much less flexible idea about what counts as a dict key than Python.
You could transform the mapping into a sequence of key, value pairs, something like this:
import json
def remap_keys(mapping):
    return [{'key': k, 'value': v} for k, v in mapping.items()]
...
json.dumps(remap_keys({(1, 2): 'foo'}))
>>> '[{"key": [1, 2], "value": "foo"}]'
from json import loads, dumps
from ast import literal_eval
x = {(0, 1): 'la-la la', (0, 2): 'extricate'}
# save: convert each tuple key to a string before saving as json object
s = dumps({str(k): v for k, v in x.items()})
# load in two stages:
# (i) load json object
obj = loads(s)
# (ii) convert loaded keys from string back to tuple
d = {literal_eval(k): v for k, v in obj.items()}
See https://stackoverflow.com/a/12337657/2455413.
JSON only supports strings as keys. You'll need to choose a way to represent those tuples as strings.
You could just use str((1,2)) as key because json only expects the keys as strings but if you use this you'll have to use a[str((1,2))] to get the value.
json can only accept strings as dict keys.
What you can do is replace the tuple keys with strings, like so:
with open("file", "w") as f:
    k = dic.keys()
    v = dic.values()
    k1 = [str(i) for i in k]
    json.dump(json.dumps(dict(zip(k1, v))), f)
And then, when you want to read it, you can change the keys back to tuples using
with open("file", "r") as f:
    data = json.load(f)
    dic = json.loads(data)
    k = dic.keys()
    v = dic.values()
    # eval runs arbitrary code, so only use it on files you trust
    k1 = [eval(i) for i in k]
    dic = dict(zip(k1, v))
This solution:
Avoids the security risk of eval().
Is short.
Is copy-pastable as save and load functions.
Keeps the structure of tuple as the key, in case you are editing the JSON by hand.
Adds ugly \" to the tuple representation, which is worse than the other str()/eval() methods here.
Can only handle tuples as keys at the first level for nested dicts (as of this writing no other solution here can do better)
import json

def json_dumps_tuple_keys(mapping):
    string_keys = {json.dumps(k): v for k, v in mapping.items()}
    return json.dumps(string_keys)

def json_loads_tuple_keys(string):
    mapping = json.loads(string)
    return {tuple(json.loads(k)): v for k, v in mapping.items()}

m = {(0,"a"): "first", (1, "b"): [9, 8, 7]}
print(m)  # {(0, 'a'): 'first', (1, 'b'): [9, 8, 7]}
s = json_dumps_tuple_keys(m)
print(s)  # {"[0, \"a\"]": "first", "[1, \"b\"]": [9, 8, 7]}
m2 = json_loads_tuple_keys(s)
print(m2)  # {(0, 'a'): 'first', (1, 'b'): [9, 8, 7]}
print(m==m2)  # True
Here is one way to do it. It will require the key to be json decoded after the main dictionary is decoded and the whole dictionary re-sequenced, but it is doable:
import json

def jsonEncodeTupleKeyDict(data):
    ndict = dict()
    # creates new dictionary with the original tuple converted to json string
    for key, value in data.items():
        nkey = json.dumps(key)
        ndict[nkey] = value
    # now encode the new dictionary and return that
    return json.dumps(ndict)

def main():
    tdict = dict()
    for i in range(10):
        key = (i, "data", 5*i)
        tdict[key] = i*i
    try:
        print(json.dumps(tdict))
    except TypeError as e:
        print("JSON Encode Failed!", e)
    print(jsonEncodeTupleKeyDict(tdict))

if __name__ == '__main__':
    main()
I make no claim to any efficiency of this method. I needed this for saving some joystick mapping data to a file. I wanted to use something that would create a semi-human readable format so it could be edited if needed.
You cannot serialize a dict with tuple keys to JSON directly, but you can convert each tuple to a string and recover it after you have deserialized the file.
with_tuple = {(0.1, 0.1): 3.14}  # valid in Python, but not serializable to JSON
{(0.1, 0.1): 3.14}
However, you can use
with_string = {str((0.1, 0.1))[1:-1]: 3.14}  # the slice [1:-1] removes the parentheses surrounding the tuple
{'0.1, 0.1': 3.14}  # This is serializable
With a bit of cheating, you can recover the original tuple (after having deserialized the whole file) by treating each key (a str) separately:
tuple(json.loads("[" + '0.1, 0.1' + "]"))  # recovers the tuple from the string
(0.1, 0.1)
Converting a string to a tuple with json.loads is a bit of overkill, but it works. Encapsulate it and you are done.
Peace out and happy coding!
Nicolas
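The encapsulation suggested above might look like the following pair of helpers. This is a minimal sketch with hypothetical function names, and it assumes every key is a tuple of at least two numbers: string elements or one-element tuples would not survive the str()/json.loads round trip.

```python
import json


def dumps_tuple_keys(mapping):
    # str((0.1, 0.1))[1:-1] gives '0.1, 0.1' (the parentheses are stripped)
    return json.dumps({str(k)[1:-1]: v for k, v in mapping.items()})


def loads_tuple_keys(text):
    # Wrap each key in brackets so json.loads parses it as a list,
    # then turn that list back into a tuple
    return {tuple(json.loads("[" + k + "]")): v
            for k, v in json.loads(text).items()}


original = {(0.1, 0.1): 3.14, (1, 2): 'a'}
text = dumps_tuple_keys(original)
restored = loads_tuple_keys(text)
print(text)
print(restored == original)
```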
Here are two functions you could use to convert a dict_having_tuple_as_key into a json_array_having_key_and_value_as_keys and then convert it back again.
import json
def json_dumps_dict_having_tuple_as_key(dict_having_tuple_as_key):
    if not isinstance(dict_having_tuple_as_key, dict):
        raise Exception('Error using json_dumps_dict_having_tuple_as_key: The input variable is not a dictionary.')
    list_of_dicts_having_key_and_value_as_keys = [{'key': k, 'value': v} for k, v in dict_having_tuple_as_key.items()]
    json_array_having_key_and_value_as_keys = json.dumps(list_of_dicts_having_key_and_value_as_keys)
    return json_array_having_key_and_value_as_keys

def json_loads_dictionary_split_into_key_and_value_as_keys_and_underwent_json_dumps(json_array_having_key_and_value_as_keys):
    list_of_dicts_having_key_and_value_as_keys = json.loads(json_array_having_key_and_value_as_keys)
    # Every entry must contain both "key" and "value"
    if not all('key' in diz and 'value' in diz for diz in list_of_dicts_having_key_and_value_as_keys):
        raise Exception('Error using json_loads_dictionary_split_into_key_and_value_as_keys_and_underwent_json_dumps: at least one dictionary in list_of_dicts_having_key_and_value_as_keys is missing key "key" or key "value".')
    dict_having_tuple_as_key = {}
    for dict_having_key_and_value_as_keys in list_of_dicts_having_key_and_value_as_keys:
        dict_having_tuple_as_key[tuple(dict_having_key_and_value_as_keys['key'])] = dict_having_key_and_value_as_keys['value']
    return dict_having_tuple_as_key
usage example:
my_dict = {
('1', '1001', '2021-12-21', '1', '484'): {"name": "Carl", "surname": "Black", "score": 0},
('1', '1001', '2021-12-22', '1', '485'): {"name": "Joe", "id_number": 134, "percentage": 11}
}
my_json = json_dumps_dict_having_tuple_as_key(my_dict)
print(my_json)
[{"key": ["1", "1001", "2021-12-21", "1", "484"], "value": {"name": "Carl", "surname": "Black", "score": 0}}, {"key": ["1", "1001", "2021-12-22", "1", "485"], "value": {"name": "Joe", "id_number": 134, "percentage": 11}}]
my_dict_reconverted = json_loads_dictionary_split_into_key_and_value_as_keys_and_underwent_json_dumps(my_json)
print(my_dict_reconverted)
{('1', '1001', '2021-12-21', '1', '484'): {'name': 'Carl', 'surname': 'Black', 'score': 0},
('1', '1001', '2021-12-22', '1', '485'): {'name': 'Joe', 'id_number': 134, 'percentage': 11}}
# proof of working 1
my_dict == my_dict_reconverted
True
# proof of working 2
my_dict == json_loads_dictionary_split_into_key_and_value_as_keys_and_underwent_json_dumps(
json_dumps_dict_having_tuple_as_key(my_dict)
)
True
(Using concepts expressed by @SingleNegationElimination to answer @Kvothe's comment)
Here's a complete example to encode/decode nested dictionaries with tuple keys and values to/from JSON. A tuple key becomes a string in JSON.
Values of type tuple or set are converted to lists.
from ast import literal_eval

def JSdecoded(item, dict_key=False):
    if isinstance(item, list):
        return [JSdecoded(e) for e in item]
    elif isinstance(item, dict):
        # Every key is assumed to be a stringified tuple written by JSencoded
        return {literal_eval(key): JSdecoded(value) for key, value in item.items()}
    return item

def JSencoded(item, dict_key=False):
    if isinstance(item, tuple):
        if dict_key:
            return str(item)
        else:
            return list(item)
    elif isinstance(item, list):
        return [JSencoded(e) for e in item]
    elif isinstance(item, dict):
        return {JSencoded(key, True): JSencoded(value) for key, value in item.items()}
    elif isinstance(item, set):
        return list(item)
    return item
usage
import json
pydata = [
{ ('Apple','Green') : "Tree",
('Orange','Yellow'):"Orchard",
('John Doe', 1945) : "New York" }
]
jsstr= json.dumps(JSencoded(pydata), indent='\t')
print(jsstr)
#[
# {
# "('Apple', 'Green')": "Tree",
# "('Orange', 'Yellow')": "Orchard",
# "('John Doe', 1945)": "New York"
# }
#]
data = json.loads(jsstr) #string keys
newdata = JSdecoded(data) #tuple keys
print(newdata)
#[{('Apple', 'Green'): 'Tree', ('Orange', 'Yellow'): 'Orchard', ('John Doe', 1945): 'New York'}]
def stringify_keys(d):
    if isinstance(d, dict):
        return {str(k): stringify_keys(v) for k, v in d.items()}
    if isinstance(d, (list, tuple)):
        return type(d)(stringify_keys(v) for v in d)
    return d

json.dumps(stringify_keys(mydict))