How to unpack a nested array in python

My data is a mix of lists and dictionaries, and I need to create a data frame from my_stats.
I can get to the event data of the first match at my_stats[0]['stats']['data'] with:
df = pd.DataFrame(my_stats[0]["stats"]["data"])
area key ... Scrum Errors Confirm Try
0 C1 Kick Off ... NaN NaN
1 NaN Passive Tackle ... NaN NaN
2 D1 Rucks ... NaN NaN
3 D1 Lineouts ... NaN NaN
4 NaN Neutral Tackle ... NaN NaN
But I need the data frame to show the game _id as its first column.
Here is some of the data for two of the matches.
Please assist.
my_stats = [{'_id': 'GLEvHIL2020031419A', 'stats': {'data': [{'area': 'C1', 'key': 'Kick Off', 'possession': 'against', 'second': 0, 'endSecond': 6, 'time': '2020-03-14T12:00:06', 'Kick Off Fielded': 'Unsuccessful'}, {'key': 'Passive Tackle', 'possession': 'against', 'second': 9, 'time': '2020-03-14T12:00:09', 'value': 9, 'subMetric': 3, 'rtp': []}, {'area': 'D1', 'key': 'Rucks', 'possession': 'against', 'second': 9, 'endSecond': 19, 'time': '2020-03-14T12:00:19'}, d': 1175, 'time': '2020-03-14T12:19:35', 'value': 8, 'subMetric': None, 'rtp': []}, {'area': 'C2', 'key': 'Rucks', 'possession': 'against', 'second': 1176, 'endSecond': 1178, 'time': '2020-03-14T12:19:38'}, {'key': 'Defender in Position', 'possession': 'against', 'second': 1176, 'time': '2020-03-14T12:19:36', 'value': 6, 'subMetric': None, 'rtp': []}, {'key': 'Defender in Position', 'possession': 'against', 'second': 1177, 'time': '2020-03-14T12:19:37', 'value': 12, 'subMetric': None, 'rtp': []}, {'key': 'Tackle Assist', 'possession': 'against', 'second': 1184, 'time': '2020-03-14T12:19:44', 'value': 7, 'subMetric': None, 'rtp': []}]}}, {'_id': 'HJSvMON2020031419A', 'stats': {'data': [{'area': 'C2', 'key': 'Kick Off', 'possession': 'against', 'second': 1, 'endSecond': 5, 'time': '2020-03-14T12:00:05', 'Kick Off Fielded': 'Successful'}, {'area': 'C2', 'key': 'Kick Off', 'possession': 'against', 'second': 2, 'endSecond': 5, 'time': '2020-03-14T12:00:05', 'Kick Off Fielded': 'Successful'}, {'key': 'Kick Fielded Successfully', 'possession': 'for', 'second': 4, 'time': '2020-03-14T12:00:04', 'value': 7, 'subMetric': None, 'rtp': []}, {'key': 'Effective Ruck', 'possession': 'for', 'second': 6, 'time': '2020-03-14T12:00:06', 'value': 3, 'subMetric': None, 'rtp': []}, {'key': 'Effective Ruck', 'possession': 'for', 'second': 6, 'time': '2020-03-14T12:00:06', 'value': 2, 'subMetric': None, 'rtp': []}, {'area': 'C2', 'key': 'Rucks', 'possession': 'for', 'second': 6, 'endSecond': 10, 'time': '2020-03-14T12:00:10'}, 
{'area': 'C2', 'key': 'Rucks', 'possession': 'against', 'second': 6, 'endSecond': 9, 'time': '2020-03-14T12:00:09'}, {'key': 'Good Pass', 'possession': 'for', 'second': 10, 'time': '2020-03-14T12:00:10', 'value': 9, 'subMetric': None, 'rtp': [{'key': 'Good Pass', 'possession': 'for', 'second': 16, 'time': '2020-03-14T12:00:16', 'value': 9, 'subMetric': None}]}, ]}}]

Here's a solution:
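import pandas as pd

df = pd.DataFrame(my_stats)

def pd_unnest_dict(df, col):
    # Expand a column of dicts into one column per key
    exploded_col = df[col].apply(pd.Series)
    return pd.concat([df.drop(columns=col), exploded_col], axis=1)

df = pd_unnest_dict(df, 'stats')  # 'stats' becomes a 'data' column (list of event dicts)
df = df.explode('data')           # one row per event; '_id' is repeated on each row
df = pd_unnest_dict(df, 'data')   # one column per event field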
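An alternative sketch, not part of the answer above: flattening the nested structure in plain Python first puts _id at the front of every row, so it should become the first column when the rows are handed to pd.DataFrame. The function name flatten_matches and the shortened sample data are illustrative assumptions.

```python
def flatten_matches(matches):
    """Return one flat dict per event, with the match '_id' as the first key."""
    rows = []
    for match in matches:
        for event in match["stats"]["data"]:
            # '_id' goes first so it ends up as the first column
            rows.append({"_id": match["_id"], **event})
    return rows


# Shortened sample in the same shape as my_stats
sample = [
    {"_id": "GLEvHIL2020031419A",
     "stats": {"data": [{"area": "C1", "key": "Kick Off"},
                        {"key": "Passive Tackle", "value": 9}]}},
    {"_id": "HJSvMON2020031419A",
     "stats": {"data": [{"area": "C2", "key": "Kick Off"}]}},
]

rows = flatten_matches(sample)
for row in rows:
    print(row)
```

With pandas available, pd.DataFrame(rows) can then build the frame; fields missing from an event come out as NaN.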

Related

How to do error-handling of JSON Parser Loop

I found some elegant code that builds a list by iterating through each element of another JSON list:
results = [
    (
        t["vintage"]["wine"]["winery"]["name"],
        t["vintage"]["year"],
        t["vintage"]["wine"]["id"],
        f'{t["vintage"]["wine"]["name"]} {t["vintage"]["year"]}',
        t["vintage"]["wine"]["statistics"]["ratings_average"],
        t["vintage"]["wine"]["statistics"]["ratings_count"],
        t["price"]["amount"],
        t["vintage"]["wine"]["region"]["name"],
        t["vintage"]["wine"]["style"]["name"],  # <-------------- issue here
    )
    for t in r.json()["explore_vintage"]["matches"]
]
The problem is that sometimes the JSON has no "name" element because "style" is null (None once parsed into Python). See the second-to-last line of the JSON sample below.
Is there a simple way to handle this error?
Error:
matches[23]["vintage"]["wine"]["style"]["name"]
Traceback (most recent call last):
  File "<ipython-input-94-59447d0d4859>", line 1, in <module>
    matches[23]["vintage"]["wine"]["style"]["name"]
TypeError: 'NoneType' object is not subscriptable
Perhaps something like:
iferror(t["vintage"]["wine"]["style"]["name"], "DoesNotExist")
JSON:
{'id': 4026076,
'name': 'Shiraz - Petit Verdot',
'seo_name': 'shiraz-petit-verdot',
'type_id': 1,
'vintage_type': 0,
'is_natural': False,
'region': {'id': 685,
'name': 'South Eastern Australia',
'name_en': '',
'seo_name': 'south-eastern',
'country': {'code': 'au',
'name': 'Australia',
'native_name': 'Australia',
'seo_name': 'australia',
'sponsored': False,
'currency': {'code': 'AUD',
'name': 'Australian Dollars',
'prefix': '$',
'suffix': None},
'regions_count': 120,
'users_count': 867353,
'wines_count': 108099,
'wineries_count': 13375,
'most_used_grapes': [{'id': 1,
'name': 'Shiraz/Syrah',
'seo_name': 'shiraz-syrah',
'has_detailed_info': True,
'wines_count': 536370},
{'id': 2,
'name': 'Cabernet Sauvignon',
'seo_name': 'cabernet-sauvignon',
'has_detailed_info': True,
'wines_count': 780931},
{'id': 5,
'name': 'Chardonnay',
'seo_name': 'chardonnay',
'has_detailed_info': True,
'wines_count': 586874}],
'background_video': None},
'class': {'typecast_map': {'background_image': {}, 'class': {}}},
'background_image': {'location': '//images.vivino.com/regions/backgrounds/0iT8wuQXRWaAmEGpPjZckg.jpg',
'variations': {'large': '//thumbs.vivino.com/region_backgrounds/0iT8wuQXRWaAmEGpPjZckg_1280x760.jpg',
'medium': '//thumbs.vivino.com/region_backgrounds/0iT8wuQXRWaAmEGpPjZckg_600x356.jpg'}}},
'winery': {'id': 74363,
'name': 'Barramundi',
'seo_name': 'barramundi',
'status': 0,
'background_image': None},
'taste': {'structure': None,
'flavor': [{'group': 'black_fruit', 'stats': {'count': 16, 'score': 2987}},
{'group': 'oak', 'stats': {'count': 11, 'score': 1329}},
{'group': 'red_fruit', 'stats': {'count': 10, 'score': 1413}},
{'group': 'spices', 'stats': {'count': 6, 'score': 430}},
{'group': 'non_oak', 'stats': {'count': 5, 'score': 126}},
{'group': 'floral', 'stats': {'count': 3, 'score': 300}},
{'group': 'earth', 'stats': {'count': 3, 'score': 249}},
{'group': 'microbio', 'stats': {'count': 2, 'score': 66}},
{'group': 'vegetal', 'stats': {'count': 1, 'score': 100}},
{'group': 'dried_fruit', 'stats': {'count': 1, 'score': 100}}]},
'statistics': {'status': 'Normal',
'ratings_count': 1002,
'ratings_average': 3.5,
'labels_count': 11180,
'vintages_count': 25},
'style': None,
'has_valid_ratings': True}
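One possible way to get the iferror-like behaviour asked for above is a small helper that walks the nested keys and falls back to a default when a step is missing or the value is None. This is only a sketch: the name dig and its default parameter are made up for illustration, and the record below is a cut-down stand-in for the JSON sample.

```python
def dig(obj, *keys, default=None):
    """Follow keys/indexes into nested dicts and lists.

    Returns default as soon as a step is missing or the current value
    cannot be subscripted (for example, it is None).
    """
    for key in keys:
        try:
            obj = obj[key]
        except (KeyError, IndexError, TypeError):
            return default
    return obj


# Cut-down record in the shape of the sample above: 'style' is None
t = {"vintage": {"wine": {"name": "Shiraz - Petit Verdot",
                          "region": {"name": "South Eastern Australia"},
                          "style": None}}}

print(dig(t, "vintage", "wine", "region", "name"))
print(dig(t, "vintage", "wine", "style", "name", default="DoesNotExist"))
```

In the list comprehension, the last tuple element would then read dig(t, "vintage", "wine", "style", "name", default="DoesNotExist").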

Is one of the numbers in this list in between the two given integers?

I have a list with barline ticks and midi notes that can overlap the barlines. So I made a list of 'barlineticks':
barlinepos = [0, 768.0, 1536.0, 2304.0, 3072.0, 3840.0, 4608.0, 5376.0, 6144.0, 6912.0, 0, 576.0, 1152.0, 1728.0, 2304.0, 2880.0, 3456.0, 4032.0, 4608.0, 5184.0, 5760.0, 6336.0, 6912.0, 7488.0]
And a MidiFile:
{'type': 'time_signature', 'numerator': 4, 'denominator': 4, 'time': 0, 'duration': 768, 'ID': 0}
{'type': 'set_tempo', 'tempo': 500000, 'time': 0, 'ID': 1}
{'type': 'track_name', 'name': 'Tempo Track', 'time': 0, 'ID': 2}
{'type': 'track_name', 'name': 'New Instrument', 'time': 0, 'ID': 3}
{'type': 'note_on', 'time': 0, 'channel': 0, 'note': 48, 'velocity': 100, 'ID': 4, 'duration': 956}
{'type': 'time_signature', 'numerator': 3, 'denominator': 4, 'time': 768, 'duration': 6911, 'ID': 5}
{'type': 'note_on', 'time': 768, 'channel': 0, 'note': 46, 'velocity': 100, 'ID': 6, 'duration': 575}
{'type': 'note_off', 'time': 956, 'channel': 0, 'note': 48, 'velocity': 0, 'ID': 7}
{'type': 'note_off', 'time': 1343, 'channel': 0, 'note': 46, 'velocity': 0, 'ID': 8}
{'type': 'end_of_track', 'time': 7679, 'ID': 9}
I want to check whether a MIDI note overlaps a barline. Every note_on message has a 'time' and a 'duration' value, so I have to check whether any of the barline ticks in the list falls inside the note's range (from 'time' to 'time' + 'duration'). I tried:
if barlinepos in range(0, 956):
    print(True)
Of course this doesn't work, because barlinepos is a list. How can I check whether any of the values in the list gives True?

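A simple iteration solves this. Only note_on messages are checked, since most other messages have no 'duration':
for msg in midifile:
    if msg["type"] != "note_on":
        continue
    start, end = msg["time"], msg["time"] + msg["duration"]
    # True if any barline tick lies within the note's span (inclusive)
    print(any(start <= tick <= end for tick in barlinepos))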
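A variation on the same idea, as a sketch only: return the ticks a note actually crosses instead of printing a flag. The function name barlines_inside is hypothetical, and the strict comparison (a note that merely starts or ends on a barline does not count as overlapping) is an assumption about what "overlap" should mean here. The barline list is shortened from the question.

```python
def barlines_inside(note, barlines):
    """Return the barline ticks that fall strictly inside the note's span."""
    start = note["time"]
    end = start + note["duration"]
    # A set removes ticks that appear twice in the list
    return sorted({tick for tick in barlines if start < tick < end})


barlinepos = [0, 768.0, 1536.0, 2304.0, 0, 576.0, 1152.0, 1728.0, 2304.0]
note = {"type": "note_on", "time": 0, "note": 48, "duration": 956}

crossed = barlines_inside(note, barlinepos)
print(crossed)        # [576.0, 768.0]
print(bool(crossed))  # True
```

Switching to start <= tick <= end would reproduce the inclusive check used in the loop above.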

Cannot convert Dictionary to JSON object using json.dumps() in python

I am trying to convert a dictionary to JSON using json.dumps():
def create_custom(json_input):
    custom = dict()
    custom['list'] = dict()
    custom['list']['Elements'] = json_input['nodes']
    custom['list']['links'] = json_input['links']
    return custom

JsonData = create_custom(json_graph.node_link_data(G))
for i, j in enumerate(Elements):
    JsonData['list']['Elements'][i]['Shape'] = j['Shape']
The code above is not complete, but the final output I get is a dictionary.
Output
{'list': {'Elements': [{'text': 'Task 1', 'Shape': 'Decision', 'id': 0},
{'text': 'Task 2', 'Shape': 'Decision', 'id': 1},
{'text': 'Task 3', 'Shape': 'Decision', 'id': 2},
{'text': 'Task 4', 'Shape': 'Decision', 'id': 3},
{'text': 'Task 5', 'Shape': 'Rectangle', 'id': 4},
{'text': 'Task 6', 'Shape': 'Decision', 'id': 5}],
'links': [{'source': 0, 'target': 1, 'key': 0},
{'source': 0, 'target': 4, 'key': 0},
{'source': 1, 'target': 2, 'key': 0},
{'source': 1, 'target': 3, 'key': 0},
{'source': 2, 'target': 1, 'key': 0},
{'source': 2, 'target': 4, 'key': 0},
{'source': 3, 'target': 1, 'key': 0},
{'source': 3, 'target': 4, 'key': 0},
{'source': 4, 'target': 5, 'key': 0},
{'source': 5, 'target': 4, 'key': 0},
{'source': 5, 'target': 1, 'key': 0}]}}
When I convert the above output to a JSON string with
json.dumps(JsonData)
I get an error:
~\AppData\Local\Continuum\anaconda3\lib\json\encoder.py in default(self, o)
177
178 """
--> 179 raise TypeError(f'Object of type {o.__class__.__name__} '
180 f'is not JSON serializable')
181
TypeError: Object of type int64 is not JSON serializable
I went through many answers, but they talk about NumPy arrays and the like.
Where am I going wrong?

I could not reproduce the error; json.dumps worked fine for me on the dictionary as posted.
Code I tried:
import json
JsonData={'list': {'Elements': [{'text': 'Task 1', 'Shape': 'Decision', 'id': 0},
{'text': 'Task 2', 'Shape': 'Decision', 'id': 1},
{'text': 'Task 3', 'Shape': 'Decision', 'id': 2},
{'text': 'Task 4', 'Shape': 'Decision', 'id': 3},
{'text': 'Task 5', 'Shape': 'Rectangle', 'id': 4},
{'text': 'Task 6', 'Shape': 'Decision', 'id': 5}],
'links': [{'source': 0, 'target': 1, 'key': 0},
{'source': 0, 'target': 4, 'key': 0},
{'source': 1, 'target': 2, 'key': 0},
{'source': 1, 'target': 3, 'key': 0},
{'source': 2, 'target': 1, 'key': 0},
{'source': 2, 'target': 4, 'key': 0},
{'source': 3, 'target': 1, 'key': 0},
{'source': 3, 'target': 4, 'key': 0},
{'source': 4, 'target': 5, 'key': 0},
{'source': 5, 'target': 4, 'key': 0},
{'source': 5, 'target': 1, 'key': 0}]}}
print(type(JsonData))
print(json.dumps(JsonData))
print(type(json.dumps(JsonData)))
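The error message names int64, which is not a built-in Python type. A common source is NumPy or pandas integers, and retyping the dictionary by hand (as above) replaces them with plain ints, which may be why the error does not reproduce. Assuming that is the cause here (the question does not confirm it), one sketch is to give json.dumps a default hook that converts such values. The helper name to_builtin is illustrative.

```python
import json


def to_builtin(obj):
    """Fallback for json.dumps: convert NumPy-style values to built-in types."""
    # NumPy scalars and arrays both provide .tolist(), which returns
    # plain Python ints/floats/lists.
    if hasattr(obj, "tolist"):
        return obj.tolist()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


# Plain dictionaries are unaffected: the hook is only called for values
# the encoder cannot handle on its own.
print(json.dumps({"text": "Task 1", "id": 0}, default=to_builtin))
```

In the question's code this would be json.dumps(JsonData, default=to_builtin).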

How to sort a list of dictionaries the right way in Python

I have a list as follows:
data = [
{'items': [
{'key': u'3', 'id': 1, 'name': u'Typeplaatje'},
{'key': u'2', 'id': 2, 'name': u'Aanduiding van het chassisnummer '},
{'key': u'1', 'id': 3, 'name': u'Kilometerteller: Kilometerstand '},
{'key': u'5', 'id': 4, 'name': u'Inschrijvingsbewijs '},
{'key': u'4', 'id': 5, 'name': u'COC of gelijkvormigheidsattest '}
], 'id': 2, 'key': u'B', 'name': u'Onderdelen'},
{'items': [
{'key': u'10', 'id': 10, 'name': u'Koppeling'},
{'key': u'7', 'id': 11, 'name': u'Differentieel '},
{'key': u'9', 'id': 12, 'name': u'Cardanhoezen '},
{'key': u'8', 'id': 13, 'name': u'Uitlaat '},
{'key': u'6', 'id': 15, 'name': u'Batterij'}
], 'id': 2, 'key': u'B', 'name': u'Onderdelen'}
]
And I want to sort items by key.
Thus the wanted result is as follows:
res = [
{'items': [
{'key': u'1', 'id': 3, 'name': u'Kilometerteller: Kilometerstand '},
{'key': u'2', 'id': 2, 'name': u'Aanduiding van het chassisnummer '},
{'key': u'3', 'id': 1, 'name': u'Typeplaatje'},
{'key': u'4', 'id': 5, 'name': u'COC of gelijkvormigheidsattest '},
{'key': u'5', 'id': 4, 'name': u'Inschrijvingsbewijs '},
], 'id': 2, 'key': u'B', 'name': u'Onderdelen'},
{'items': [
{'key': u'6', 'id': 15, 'name': u'Batterij'},
{'key': u'7', 'id': 11, 'name': u'Differentieel '},
{'key': u'8', 'id': 13, 'name': u'Uitlaat '},
{'key': u'9', 'id': 12, 'name': u'Cardanhoezen '},
{'key': u'10', 'id': 10, 'name': u'Koppeling'}
], 'id': 2, 'key': u'B', 'name': u'Onderdelen'}
]
I've tried as follows:
res = []
for item in data:
    new_data = {
        'id': item['id'],
        'key': item['key'],
        'name': item['name'],
        'items': sorted(item['items'], key=lambda k: k['key'])
    }
    res.append(new_data)
print(res)
The first one is sorted fine, but the second one is not.
What am I doing wrong, and is there a better way of doing it?

Your sort is wrong in the second case because the keys are strings, and strings are compared character by character, so '10' (which starts with '1') sorts before '6'. A slight modification to your sort key does the trick:
'items': sorted(item['items'], key=lambda k: int(k['key']))
Converting to int makes the keys sort as numbers. Here it is in your code:
res = []
for item in data:
    new_data = {
        'id': item['id'],
        'key': item['key'],
        'name': item['name'],
        'items': sorted(item['items'], key=lambda k: int(k['key']))
    }
    res.append(new_data)
print(res)
And here's the result:
[{'id': 2,
'items': [{'id': 3, 'key': '1', 'name': 'Kilometerteller: Kilometerstand '},
{'id': 2, 'key': '2', 'name': 'Aanduiding van het chassisnummer '},
{'id': 1, 'key': '3', 'name': 'Typeplaatje'},
{'id': 5, 'key': '4', 'name': 'COC of gelijkvormigheidsattest '},
{'id': 4, 'key': '5', 'name': 'Inschrijvingsbewijs '}],
'key': 'B',
'name': 'Onderdelen'},
{'id': 2,
'items': [{'id': 15, 'key': '6', 'name': 'Batterij'},
{'id': 11, 'key': '7', 'name': 'Differentieel '},
{'id': 13, 'key': '8', 'name': 'Uitlaat '},
{'id': 12, 'key': '9', 'name': 'Cardanhoezen '},
{'id': 10, 'key': '10', 'name': 'Koppeling'}],
'key': 'B',
'name': 'Onderdelen'}]
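A minimal illustration of the difference, using just the key strings from the second group in the question:

```python
keys = ['10', '7', '9', '8', '6']

as_text = sorted(keys)               # compared character by character
as_numbers = sorted(keys, key=int)   # compared by numeric value

print(as_text)     # ['10', '6', '7', '8', '9']
print(as_numbers)  # ['6', '7', '8', '9', '10']
```

The first group in the question only has single-digit keys, which is why the string sort happened to look right there.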

You need to replace the old items in data with the items sorted numerically by key rather than as strings. So use int(x['key']) as the sort key, like this:
>>> data
[{'items': [{'key': '1', 'id': 3, 'name': 'Kilometerteller: Kilometerstand '}, {'key': '2', 'id': 2, 'name': 'Aanduiding van het chassisnummer '}, {'key': '3', 'id': 1, 'name': 'Typeplaatje'}, {'key': '4', 'id': 5, 'name': 'COC of gelijkvormigheidsattest '}, {'key': '5', 'id': 4, 'name': 'Inschrijvingsbewijs '}], 'id': 2, 'key': 'B', 'name': 'Onderdelen'}, {'items': [{'key': '6', 'id': 15, 'name': 'Batterij'}, {'key': '7', 'id': 11, 'name': 'Differentieel '}, {'key': '8', 'id': 13, 'name': 'Uitlaat '}, {'key': '9', 'id': 12, 'name': 'Cardanhoezen '}, {'key': '10', 'id': 10, 'name': 'Koppeling'}], 'id': 2, 'key': 'B', 'name': 'Onderdelen'}]
>>>
>>> for item in data:
...     item['items'] = sorted(item['items'], key=lambda x: int(x['key']))
...
>>> import pprint
>>> pprint.pprint(data)
[{'id': 2,
'items': [{'id': 3, 'key': '1', 'name': 'Kilometerteller: Kilometerstand '},
{'id': 2, 'key': '2', 'name': 'Aanduiding van het chassisnummer '},
{'id': 1, 'key': '3', 'name': 'Typeplaatje'},
{'id': 5, 'key': '4', 'name': 'COC of gelijkvormigheidsattest '},
{'id': 4, 'key': '5', 'name': 'Inschrijvingsbewijs '}],
'key': 'B',
'name': 'Onderdelen'},
{'id': 2,
'items': [{'id': 15, 'key': '6', 'name': 'Batterij'},
{'id': 11, 'key': '7', 'name': 'Differentieel '},
{'id': 13, 'key': '8', 'name': 'Uitlaat '},
{'id': 12, 'key': '9', 'name': 'Cardanhoezen '},
{'id': 10, 'key': '10', 'name': 'Koppeling'}],
'key': 'B',
'name': 'Onderdelen'}]

A list comes with a handy method called sort, which sorts the list in place. I'd use that to your advantage:
for d in data:
    d['items'].sort(key=lambda x: int(x['key']))
Results:
[{'id': 2,
'items': [{'id': 3, 'key': '1', 'name': 'Kilometerteller: Kilometerstand '},
{'id': 2, 'key': '2', 'name': 'Aanduiding van het chassisnummer '},
{'id': 1, 'key': '3', 'name': 'Typeplaatje'},
{'id': 5, 'key': '4', 'name': 'COC of gelijkvormigheidsattest '},
{'id': 4, 'key': '5', 'name': 'Inschrijvingsbewijs '}],
'key': 'B',
'name': 'Onderdelen'},
{'id': 2,
'items': [{'id': 15, 'key': '6', 'name': 'Batterij'},
{'id': 11, 'key': '7', 'name': 'Differentieel '},
{'id': 13, 'key': '8', 'name': 'Uitlaat '},
{'id': 12, 'key': '9', 'name': 'Cardanhoezen '},
{'id': 10, 'key': '10', 'name': 'Koppeling'}],
'key': 'B',
'name': 'Onderdelen'}]
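To make the choice between the two approaches concrete, here is a small sketch with made-up items: sorted returns a new list and leaves the original alone, while list.sort reorders the existing list and returns None.

```python
items = [{'key': '10'}, {'key': '6'}, {'key': '8'}]

new_list = sorted(items, key=lambda x: int(x['key']))  # builds a new list
print([d['key'] for d in new_list])  # ['6', '8', '10']
print([d['key'] for d in items])     # ['10', '6', '8'] (unchanged)

returned = items.sort(key=lambda x: int(x['key']))     # sorts in place
print(returned)                      # None
print([d['key'] for d in items])     # ['6', '8', '10']
```

Sorting in place avoids rebuilding each outer dictionary, at the cost of modifying data itself.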

How do I fetch the first key-value pair (i.e., 'type': 45) from a dictionary whose key holds a list of dictionaries

{'alarms': [{'date': '20170925T235525-0700',
'id': 8,
'ip': '172.26.70.4',
'severity': 4,
'type': 45},
{'date': '20170925T235525-0700',
'id': 7,
'ip': '172.26.70.4',
'severity': 4,
'type': 45},
{'date': '20170925T235525-0700',
'id': 6,
'ip': '172.26.70.4',
'severity': 4,
'type': 45},
{'date': '20170925T220858-0700',
'id': 5,
'ip': '172.26.70.4',
'severity': 6,
'type': 44},
{'date': '20170925T220857-0700',
'id': 4,
'ip': '172.26.70.4',
'severity': 6,
'type': 44},
{'date': '20170925T220857-0700',
'id': 3,
'ip': '172.26.70.4',
'severity': 6,
'type': 44},
{'date': '20170925T220856-0700',
'id': 2,
'severity': 6,
'type': 32},
{'date': '20170925T220850-0700', 'id': 1, 'severity': 6, 'type': 1},
{'date': '20170925T220850-0700',
'id': 0,
'severity': 6,
'type': 33}]}
I need to fetch the first key-value pair (i.e., 'type': 45).
Kindly guide me; I am trying this on Python 2.7.

Your data is a dictionary where the "alarms" key is associated with a list of dictionaries.
The dictionary you want is the first element of that list, so you can fetch it with:
data['alarms'][0]
where data is the variable that stores this structure. So:
>>> data['alarms'][0]
{'date': '20170925T235525-0700', 'severity': 4, 'id': 8, 'ip': '172.26.70.4', 'type': 45}

You will need to do:
def return_correct_dict(data):
    for d in data['alarms']:
        if d.get('type', "") == 45:
            return d

You have a dictionary containing a list of dictionaries.
Suppose your dictionary is stored in a variable named data (avoid naming it dict, which would shadow the built-in type).
data['alarms'][0]['type'] will give you the value 45.
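The answers above can be combined into one small helper, sketched here under the assumption that the structure always has the shape shown in the question. The name first_alarm and its alarm_type parameter are hypothetical, and the sample is shortened.

```python
def first_alarm(data, alarm_type=None):
    """Return the first alarm dict, or the first one with the given type.

    Returns None when there is no matching alarm.
    """
    alarms = data.get("alarms") or []
    if alarm_type is None:
        return alarms[0] if alarms else None
    return next((a for a in alarms if a.get("type") == alarm_type), None)


data = {"alarms": [
    {"date": "20170925T235525-0700", "id": 8, "severity": 4, "type": 45},
    {"date": "20170925T220858-0700", "id": 5, "severity": 6, "type": 44},
]}

print(first_alarm(data)["type"])    # 45
print(first_alarm(data, 44)["id"])  # 5
print(first_alarm(data, 99))        # None
```

Returning None instead of raising keeps the caller in control when the list is empty or no alarm has the requested type.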
