I have the following data. When I used json_flatten, I was able to extract most of it, except for deliveryMethod.items and languages.items.
I also tried pd.json_normalize(a, record_path='deliveryMethod.items'), but it doesn't seem to work.
a = {'ID': '1', 'Name': 'ABC', 'Center': 'Center For Education', 'providerNameAr': 'ABC', 'city': {'id': 1, 'cityEn': 'LA', 'regionId': 0, 'region': None}, 'cityName': None, 'LevelNumber': 'ABCD', 'activityStartDate': '09/01/2020', 'activityEndDate': '09/02/2020', 'activityType': {'lookUpId': 2, 'lookUpEn': 'Course', 'code': None, 'parent': None, 'hasParent': False}, 'deliveryMethod': {'items': [{'lookUpId': 2, 'lookUpEn': 'online', 'code': None, 'parent': None, 'hasParent': False}]}, 'languages': {'items': [{'lookUpId': 1, 'lookUpEn': 'English', 'code': None, 'parent': None, 'hasParent': False}]}, 'activityCategory': {'lookUpId': 1, 'lookUpEn': 'Regular', 'code': None, 'parent': None, 'hasParent': False}, 'address': 'LA', 'phoneNumber': '-11111', 'emailAddress': 'ABCS#Gmail.com', 'isAllSpeciality': True, 'requestId': 23, 'parentActivityId': None, 'sppData': None}
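One possible way to reach the values inside deliveryMethod.items and languages.items is a small recursive flattener that also walks into lists. The sketch below uses only the standard library; the function name flatten and the dotted key format are my own choices, not part of json_flatten or pandas, and the record is a trimmed-down copy of the data above. On the pandas side, record_path is documented as taking a string or a list of strings, so the dotted string is probably read as a single key; record_path=['deliveryMethod', 'items'] may be what is needed, though that is untested here.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts and lists into one dict with dotted keys.

    Hypothetical helper for illustration. List elements are addressed
    by their index; empty dicts and lists contribute no keys.
    """
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            flat.update(flatten(value, f"{prefix}{index}."))
    else:
        # Leaf value: drop the trailing dot from the accumulated prefix.
        flat[prefix[:-1]] = obj
    return flat


# Trimmed-down version of the record from the question.
record = {
    "ID": "1",
    "city": {"id": 1, "cityEn": "LA"},
    "deliveryMethod": {"items": [{"lookUpId": 2, "lookUpEn": "online"}]},
    "languages": {"items": [{"lookUpId": 1, "lookUpEn": "English"}]},
}

flat = flatten(record)
print(flat["deliveryMethod.items.0.lookUpEn"])  # online
print(flat["languages.items.0.lookUpEn"])       # English
```

Each list element gets its own index in the key, so a record with several delivery methods would produce deliveryMethod.items.0..., deliveryMethod.items.1... and so on.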
I found some elegant code that builds a list by iterating through each element of another JSON list:
results = [
(
t["vintage"]["wine"]["winery"]["name"],
t["vintage"]["year"],
t["vintage"]["wine"]["id"],
f'{t["vintage"]["wine"]["name"]} {t["vintage"]["year"]}',
t["vintage"]["wine"]["statistics"]["ratings_average"],
t["vintage"]["wine"]["statistics"]["ratings_count"],
t["price"]["amount"],
t["vintage"]["wine"]["region"]["name"],
t["vintage"]["wine"]["style"]["name"], #<--------------issue here
)
for t in r.json()["explore_vintage"]["matches"]
]
The problem is that sometimes there is no "name" element because "style" is null in the JSON (None once parsed into Python). See the second-to-last line of the JSON sample below.
Is there a simple way to handle this error?
Error:
matches[23]["vintage"]["wine"]["style"]["name"]
Traceback (most recent call last):
File "<ipython-input-94-59447d0d4859>", line 1, in <module>
matches[23]["vintage"]["wine"]["style"]["name"]
TypeError: 'NoneType' object is not subscriptable
Perhaps something like:
iferror(t["vintage"]["wine"]["style"]["name"], "DoesNotExist")
JSON:
{'id': 4026076,
'name': 'Shiraz - Petit Verdot',
'seo_name': 'shiraz-petit-verdot',
'type_id': 1,
'vintage_type': 0,
'is_natural': False,
'region': {'id': 685,
'name': 'South Eastern Australia',
'name_en': '',
'seo_name': 'south-eastern',
'country': {'code': 'au',
'name': 'Australia',
'native_name': 'Australia',
'seo_name': 'australia',
'sponsored': False,
'currency': {'code': 'AUD',
'name': 'Australian Dollars',
'prefix': '$',
'suffix': None},
'regions_count': 120,
'users_count': 867353,
'wines_count': 108099,
'wineries_count': 13375,
'most_used_grapes': [{'id': 1,
'name': 'Shiraz/Syrah',
'seo_name': 'shiraz-syrah',
'has_detailed_info': True,
'wines_count': 536370},
{'id': 2,
'name': 'Cabernet Sauvignon',
'seo_name': 'cabernet-sauvignon',
'has_detailed_info': True,
'wines_count': 780931},
{'id': 5,
'name': 'Chardonnay',
'seo_name': 'chardonnay',
'has_detailed_info': True,
'wines_count': 586874}],
'background_video': None},
'class': {'typecast_map': {'background_image': {}, 'class': {}}},
'background_image': {'location': '//images.vivino.com/regions/backgrounds/0iT8wuQXRWaAmEGpPjZckg.jpg',
'variations': {'large': '//thumbs.vivino.com/region_backgrounds/0iT8wuQXRWaAmEGpPjZckg_1280x760.jpg',
'medium': '//thumbs.vivino.com/region_backgrounds/0iT8wuQXRWaAmEGpPjZckg_600x356.jpg'}}},
'winery': {'id': 74363,
'name': 'Barramundi',
'seo_name': 'barramundi',
'status': 0,
'background_image': None},
'taste': {'structure': None,
'flavor': [{'group': 'black_fruit', 'stats': {'count': 16, 'score': 2987}},
{'group': 'oak', 'stats': {'count': 11, 'score': 1329}},
{'group': 'red_fruit', 'stats': {'count': 10, 'score': 1413}},
{'group': 'spices', 'stats': {'count': 6, 'score': 430}},
{'group': 'non_oak', 'stats': {'count': 5, 'score': 126}},
{'group': 'floral', 'stats': {'count': 3, 'score': 300}},
{'group': 'earth', 'stats': {'count': 3, 'score': 249}},
{'group': 'microbio', 'stats': {'count': 2, 'score': 66}},
{'group': 'vegetal', 'stats': {'count': 1, 'score': 100}},
{'group': 'dried_fruit', 'stats': {'count': 1, 'score': 100}}]},
'statistics': {'status': 'Normal',
'ratings_count': 1002,
'ratings_average': 3.5,
'labels_count': 11180,
'vintages_count': 25},
'style': None,
'has_valid_ratings': True}
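For the null "style" case above, one option is a small lookup helper in the spirit of the iferror idea. This is only a sketch: dig is a hypothetical name, not a built-in, and the two sample records are made up to mirror the shape of the JSON in the question.

```python
def dig(obj, *keys, default=None):
    """Follow keys into nested dicts/lists; return default if any step fails."""
    for key in keys:
        try:
            obj = obj[key]
        except (KeyError, IndexError, TypeError):
            # KeyError: missing dict key; IndexError: list too short;
            # TypeError: obj is None (or otherwise not subscriptable).
            return default
    return obj


# Made-up sample records shaped like the matches in the question.
with_style = {"vintage": {"wine": {"style": {"name": "Some Style"}}}}
without_style = {"vintage": {"wine": {"style": None}}}

found = dig(with_style, "vintage", "wine", "style", "name", default="DoesNotExist")
missing = dig(without_style, "vintage", "wine", "style", "name", default="DoesNotExist")
print(found)    # Some Style
print(missing)  # DoesNotExist
```

In the list comprehension, the last tuple element would then become dig(t, "vintage", "wine", "style", "name", default="DoesNotExist"). A shorter inline alternative for this one field is (t["vintage"]["wine"]["style"] or {}).get("name", "DoesNotExist").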
I have a list as follows:
data = [
{'items': [
{'key': u'3', 'id': 1, 'name': u'Typeplaatje'},
{'key': u'2', 'id': 2, 'name': u'Aanduiding van het chassisnummer '},
{'key': u'1', 'id': 3, 'name': u'Kilometerteller: Kilometerstand '},
{'key': u'5', 'id': 4, 'name': u'Inschrijvingsbewijs '},
{'key': u'4', 'id': 5, 'name': u'COC of gelijkvormigheidsattest '}
], 'id': 2, 'key': u'B', 'name': u'Onderdelen'},
{'items': [
{'key': u'10', 'id': 10, 'name': u'Koppeling'},
{'key': u'7', 'id': 11, 'name': u'Differentieel '},
{'key': u'9', 'id': 12, 'name': u'Cardanhoezen '},
{'key': u'8', 'id': 13, 'name': u'Uitlaat '},
{'key': u'6', 'id': 15, 'name': u'Batterij'}
], 'id': 2, 'key': u'B', 'name': u'Onderdelen'}
]
And I want to sort items by key.
Thus the wanted result is as follows:
res = [
{'items': [
{'key': u'1', 'id': 3, 'name': u'Kilometerteller: Kilometerstand '},
{'key': u'2', 'id': 2, 'name': u'Aanduiding van het chassisnummer '},
{'key': u'3', 'id': 1, 'name': u'Typeplaatje'},
{'key': u'4', 'id': 5, 'name': u'COC of gelijkvormigheidsattest '},
{'key': u'5', 'id': 4, 'name': u'Inschrijvingsbewijs '},
], 'id': 2, 'key': u'B', 'name': u'Onderdelen'},
{'items': [
{'key': u'6', 'id': 15, 'name': u'Batterij'},
{'key': u'7', 'id': 11, 'name': u'Differentieel '},
{'key': u'8', 'id': 13, 'name': u'Uitlaat '},
{'key': u'9', 'id': 12, 'name': u'Cardanhoezen '},
{'key': u'10', 'id': 10, 'name': u'Koppeling'}
], 'id': 2, 'key': u'B', 'name': u'Onderdelen'}
]
I've tried as follows:
res = []
for item in data:
new_data = {
'id': item['id'],
'key': item['key'],
'name': item['name'],
'items': sorted(item['items'], key=lambda k : k['key'])
}
res.append(new_data)
print(res)
The first list is sorted correctly, but the second one is not.
What am I doing wrong, and is there a better way of doing it?
Your sort is wrong in the second case because the keys are strings, and strings are compared character by character: '10' starts with '1', so it sorts before '6'. A slight modification to your key function does the trick:
'items': sorted(item['items'], key=lambda k: int(k['key']))
I convert with int because you want to sort the keys as numbers. Here it is in your code:
res = []
for item in data:
new_data = {
'id': item['id'],
'key': item['key'],
'name': item['name'],
'items': sorted(item['items'], key=lambda k : int(k['key']) )
}
res.append(new_data)
print(res)
And here's the result:
[{'id': 2,
'items': [{'id': 3, 'key': '1', 'name': 'Kilometerteller: Kilometerstand '},
{'id': 2, 'key': '2', 'name': 'Aanduiding van het chassisnummer '},
{'id': 1, 'key': '3', 'name': 'Typeplaatje'},
{'id': 5, 'key': '4', 'name': 'COC of gelijkvormigheidsattest '},
{'id': 4, 'key': '5', 'name': 'Inschrijvingsbewijs '}],
'key': 'B',
'name': 'Onderdelen'},
{'id': 2,
'items': [{'id': 15, 'key': '6', 'name': 'Batterij'},
{'id': 11, 'key': '7', 'name': 'Differentieel '},
{'id': 13, 'key': '8', 'name': 'Uitlaat '},
{'id': 12, 'key': '9', 'name': 'Cardanhoezen '},
{'id': 10, 'key': '10', 'name': 'Koppeling'}],
'key': 'B',
'name': 'Onderdelen'}]
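The difference between string ordering and numeric ordering can be seen on the bare key values. The sketch below also includes a fallback key function for data where some keys are not numeric; numeric_first is a hypothetical helper of mine, not something from the question.

```python
keys = ['10', '7', '9', '8', '6']

as_strings = sorted(keys)           # compared character by character
as_numbers = sorted(keys, key=int)  # compared by numeric value
print(as_strings)  # ['10', '6', '7', '8', '9']
print(as_numbers)  # ['6', '7', '8', '9', '10']


def numeric_first(value):
    """Sort decimal strings numerically, then everything else as text.

    int() raises ValueError on a non-numeric string, so the tuple's
    first element keeps the two groups from being compared directly.
    """
    if value.isdecimal():
        return (0, int(value))
    return (1, value)


mixed = sorted(['10', 'B', '2', 'A'], key=numeric_first)
print(mixed)  # ['2', '10', 'A', 'B']
```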
You need to replace the old items in data with items sorted numerically by key rather than as strings. So use int(x['key']) in the sort key, like this:
>>> data
[{'items': [{'key': '1', 'id': 3, 'name': 'Kilometerteller: Kilometerstand '}, {'key': '2', 'id': 2, 'name': 'Aanduiding van het chassisnummer '}, {'key': '3', 'id': 1, 'name': 'Typeplaatje'}, {'key': '4', 'id': 5, 'name': 'COC of gelijkvormigheidsattest '}, {'key': '5', 'id': 4, 'name': 'Inschrijvingsbewijs '}], 'id': 2, 'key': 'B', 'name': 'Onderdelen'}, {'items': [{'key': '6', 'id': 15, 'name': 'Batterij'}, {'key': '7', 'id': 11, 'name': 'Differentieel '}, {'key': '8', 'id': 13, 'name': 'Uitlaat '}, {'key': '9', 'id': 12, 'name': 'Cardanhoezen '}, {'key': '10', 'id': 10, 'name': 'Koppeling'}], 'id': 2, 'key': 'B', 'name': 'Onderdelen'}]
>>>
>>> for item in data:
... item['items'] = sorted(item['items'], key=lambda x: int(x['key']))
...
>>> import pprint
>>> pprint.pprint(data)
[{'id': 2,
'items': [{'id': 3, 'key': '1', 'name': 'Kilometerteller: Kilometerstand '},
{'id': 2, 'key': '2', 'name': 'Aanduiding van het chassisnummer '},
{'id': 1, 'key': '3', 'name': 'Typeplaatje'},
{'id': 5, 'key': '4', 'name': 'COC of gelijkvormigheidsattest '},
{'id': 4, 'key': '5', 'name': 'Inschrijvingsbewijs '}],
'key': 'B',
'name': 'Onderdelen'},
{'id': 2,
'items': [{'id': 15, 'key': '6', 'name': 'Batterij'},
{'id': 11, 'key': '7', 'name': 'Differentieel '},
{'id': 13, 'key': '8', 'name': 'Uitlaat '},
{'id': 12, 'key': '9', 'name': 'Cardanhoezen '},
{'id': 10, 'key': '10', 'name': 'Koppeling'}],
'key': 'B',
'name': 'Onderdelen'}]
A list comes with a handy method called sort, which sorts the list in place. I'd use that to your advantage:
for d in data:
d['items'].sort(key=lambda x: int(x['key']))
Results:
[{'id': 2,
'items': [{'id': 3, 'key': '1', 'name': 'Kilometerteller: Kilometerstand '},
{'id': 2, 'key': '2', 'name': 'Aanduiding van het chassisnummer '},
{'id': 1, 'key': '3', 'name': 'Typeplaatje'},
{'id': 5, 'key': '4', 'name': 'COC of gelijkvormigheidsattest '},
{'id': 4, 'key': '5', 'name': 'Inschrijvingsbewijs '}],
'key': 'B',
'name': 'Onderdelen'},
{'id': 2,
'items': [{'id': 15, 'key': '6', 'name': 'Batterij'},
{'id': 11, 'key': '7', 'name': 'Differentieel '},
{'id': 13, 'key': '8', 'name': 'Uitlaat '},
{'id': 12, 'key': '9', 'name': 'Cardanhoezen '},
{'id': 10, 'key': '10', 'name': 'Koppeling'}],
'key': 'B',
'name': 'Onderdelen'}]
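A short sketch of how the in-place list.sort differs from sorted, using a cut-down list of my own rather than the full data from the question. The point to watch is that sort returns None, so its result should not be assigned.

```python
items = [{'key': '10'}, {'key': '6'}, {'key': '9'}]

# list.sort reorders the existing list and returns None.
returned = items.sort(key=lambda x: int(x['key']))
print(returned)  # None
print(items)     # [{'key': '6'}, {'key': '9'}, {'key': '10'}]

original = [{'key': '10'}, {'key': '6'}, {'key': '9'}]

# sorted builds a new list and leaves the original untouched.
ordered = sorted(original, key=lambda x: int(x['key']))
print(original[0])  # {'key': '10'}
print(ordered[0])   # {'key': '6'}
```

If other code still holds a reference to the unsorted list and relies on its order, sorted is the safer choice; otherwise sort avoids building a second list.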
In Python, I am trying to turn a list of separate JSON data:
[[{'id': 1, 'name': 'pencil', 'description': '2b or not 2b, that is the question'}], [{'id': 2, 'name': 'oil pastel', 'description': None}], [{'id': 3, 'name': 'gouache', 'description': None}], [{'id': 4, 'name': 'paper', 'description': None}]]
into one piece of JSON data:
{'id': 1, 'name': 'pencil', 'description': '2b or not 2b, that is the question'}, {'id': 2, 'name': 'oil pastel', 'description': None}, {'id': 3, 'name': 'gouache', 'description': None}, {'id': 4, 'name': 'paper', 'description': None}, {'id': 5, 'name': 'coloured pencil', 'description': None}
I've been struggling with it for a few hours. Does anyone have any ideas?
Use a simple list comprehension:
[y for x in list_of_lists for y in x]
Output:
[{'description': '2b or not 2b, that is the question', 'id': 1, 'name': 'pencil'}, {'description': None, 'id': 2, 'name': 'oil pastel'}, {'description': None, 'id': 3, 'name': 'gouache'}, {'description': None, 'id': 4, 'name': 'paper'}]
Use itertools.chain:
>>> import itertools
>>> list(itertools.chain.from_iterable(j))
Or a list comprehension:
>>> [x[0] for x in j]  # Assuming there is exactly one item in each list
Both give the same output:
[{'id': 1,
'name': 'pencil',
'description': '2b or not 2b, that is the question'},
{'id': 2, 'name': 'oil pastel', 'description': None},
{'id': 3, 'name': 'gouache', 'description': None},
{'id': 4, 'name': 'paper', 'description': None}]
Using functools.reduce with operator.iadd:
j = [[{'id': 1, 'name': 'pencil', 'description': '2b or not 2b, that is the question'}], [{'id': 2, 'name': 'oil pastel', 'description': None}], [{'id': 3, 'name': 'gouache', 'description': None}], [{'id': 4, 'name': 'paper', 'description': None}]]
import functools
import operator
functools.reduce(operator.iadd,j,[])
Output:
[{'id': 1,
'name': 'pencil',
'description': '2b or not 2b, that is the question'},
{'id': 2, 'name': 'oil pastel', 'description': None},
{'id': 3, 'name': 'gouache', 'description': None},
{'id': 4, 'name': 'paper', 'description': None}]
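The flattening approaches above can be compared side by side. The sketch uses a shortened input of my own, with an empty sublist added to show a case that x[0] would not handle, and ends with json.dumps on the assumption that "one piece of JSON data" means a single JSON string.

```python
import functools
import itertools
import json
import operator

nested = [
    [{'id': 1, 'name': 'pencil', 'description': None}],
    [],  # an empty sublist: x[0] would raise IndexError here
    [{'id': 2, 'name': 'oil pastel', 'description': None}],
]

by_comprehension = [item for sub in nested for item in sub]
by_chain = list(itertools.chain.from_iterable(nested))
# The [] initializer matters: without it, reduce would start from
# nested[0] and iadd would extend that sublist in place.
by_reduce = functools.reduce(operator.iadd, nested, [])

print(by_comprehension == by_chain == by_reduce)  # True

# Serialize the flat list to one JSON string; None becomes null.
as_json = json.dumps(by_chain)
print(as_json)
```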
I have a list of dictionaries as follows:
[
{'id': 16419, 'name': 'Audi'},
{'id': 13, 'name': 'BMW'},
{'id': 31, 'name': 'Honda'},
{'id': 50060, 'name': 'KTM'},
{'id': 54, 'name': 'Opel'},
{'id': 55, 'name': 'Peugeot'},
{'id': 50083, 'name': 'PGO'},
{'id': 16350, 'name': 'Skoda'},
{'id': 68, 'name': 'Suzuki'},
{'id': 2120, 'name': 'Triumph'},
{'id': 16328, 'name': 'Others'},
{'id': 16396, 'name': 'Seat'},
{'id': 14979, 'name': 'Opel'},
{'id': 6, 'name': 'Volkswagen'}
]
What I want to do is order it so that dictionaries with certain name values appear at the beginning of the list.
For example, I want Volkswagen, Audi, BMW, Opel and Peugeot to appear first in the list.
Thus the wanted result should be something like this:
[
{'id': 6, 'name': 'Volkswagen'},
{'id': 16419, 'name': 'Audi'},
{'id': 13, 'name': 'BMW'},
{'id': 54, 'name': 'Opel'},
{'id': 55, 'name': 'Peugeot'},
{'id': 31, 'name': 'Honda'},
{'id': 50060, 'name': 'KTM'},
{'id': 50083, 'name': 'PGO'},
{'id': 16350, 'name': 'Skoda'},
{'id': 68, 'name': 'Suzuki'},
{'id': 2120, 'name': 'Triumph'},
{'id': 16328, 'name': 'Others'},
{'id': 16396, 'name': 'Seat'},
{'id': 14979, 'name': 'Opel'},
]
Any idea how to do that?
You can use an appropriate key function for your sorting. This one orders by the given names first (in the given order). All other brands come after that and, because sorted is stable, keep their original relative order:
>>> rank = {x: i for i, x in enumerate(['Volkswagen', 'Audi', 'BMW', 'Opel', 'Peugeot'])}
# {'Volkswagen': 0, 'Audi': 1, ...}
>>> sorted(lst, key=lambda x: rank.get(x['name'], len(rank)))
[{'id': 6, 'name': 'Volkswagen'},
{'id': 16419, 'name': 'Audi'},
{'id': 13, 'name': 'BMW'},
{'id': 54, 'name': 'Opel'},
{'id': 14979, 'name': 'Opel'},
{'id': 55, 'name': 'Peugeot'},
{'id': 31, 'name': 'Honda'},
{'id': 50060, 'name': 'KTM'},
{'id': 50083, 'name': 'PGO'},
{'id': 16350, 'name': 'Skoda'},
{'id': 68, 'name': 'Suzuki'},
{'id': 2120, 'name': 'Triumph'},
{'id': 16328, 'name': 'Others'},
{'id': 16396, 'name': 'Seat'}]
You can use a dictionary to define a custom sorting order.
dicts = [
{'id': 16419, 'name': 'Audi'},
{'id': 13, 'name': 'BMW'},
{'id': 31, 'name': 'Honda'},
{'id': 50060, 'name': 'KTM'},
{'id': 54, 'name': 'Opel'},
{'id': 55, 'name': 'Peugeot'},
{'id': 50083, 'name': 'PGO'},
{'id': 16350, 'name': 'Skoda'},
{'id': 68, 'name': 'Suzuki'},
{'id': 2120, 'name': 'Triumph'},
{'id': 16328, 'name': 'Others'},
{'id': 16396, 'name': 'Seat'},
{'id': 14979, 'name': 'Opel'},
{'id': 6, 'name': 'Volkswagen'}
]
brand_order = ['Volkswagen', 'Audi', 'BMW', 'Opel', 'Peugeot']
order = dict(zip(brand_order, range(len(brand_order))))
dicts_sorted = sorted(dicts, key=lambda d: order.get(d['name'], float('inf')))
print(dicts_sorted)
Output:
[{'id': 6, 'name': 'Volkswagen'},
{'id': 16419, 'name': 'Audi'},
{'id': 13, 'name': 'BMW'},
{'id': 54, 'name': 'Opel'},
{'id': 14979, 'name': 'Opel'},
{'id': 55, 'name': 'Peugeot'},
{'id': 31, 'name': 'Honda'},
{'id': 50060, 'name': 'KTM'},
{'id': 50083, 'name': 'PGO'},
{'id': 16350, 'name': 'Skoda'},
{'id': 68, 'name': 'Suzuki'},
{'id': 2120, 'name': 'Triumph'},
{'id': 16328, 'name': 'Others'},
{'id': 16396, 'name': 'Seat'}]
Falling back to float('inf') ensures that whatever is not in order comes last.
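If the brands outside the preferred list should also have a defined order, the key can return a tuple. This is a sketch under my own assumptions: a shortened input list, and name then id as tie-breakers, neither of which the question asks for.

```python
brand_order = ['Volkswagen', 'Audi', 'BMW', 'Opel', 'Peugeot']
order = {name: rank for rank, name in enumerate(brand_order)}

dicts = [
    {'id': 16328, 'name': 'Others'},
    {'id': 14979, 'name': 'Opel'},
    {'id': 31, 'name': 'Honda'},
    {'id': 16419, 'name': 'Audi'},
    {'id': 54, 'name': 'Opel'},
    {'id': 6, 'name': 'Volkswagen'},
]


def sort_key(d):
    # Preferred brands first by rank; everything else shares the last
    # rank and is ordered by name, then by id as a final tie-breaker.
    return (order.get(d['name'], len(order)), d['name'], d['id'])


result = [(d['name'], d['id']) for d in sorted(dicts, key=sort_key)]
print(result)
# [('Volkswagen', 6), ('Audi', 16419), ('Opel', 54), ('Opel', 14979),
#  ('Honda', 31), ('Others', 16328)]
```

Tuples are compared element by element, so the rank decides first and the later elements only matter when the rank is equal.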