Json to Pandas, include "Parents"

I have a list of 150+ neighbourhoods and am using the Foursquare API to retrieve nearby venues within a 500 m radius of each one. Each neighbourhood is expected to return 10-20 nearby venues.
A snippet of the JSON result returned by Foursquare is shown below.
With results['response']['groups'][0]['items'], I am able to retrieve the nearby venue information and turn it into a table. However, results['response']['groups'][0]['items'] does not contain the neighbourhood (headerFullLocation in the JSON) of the associated venues.
Q: How can I link the neighbourhood (headerFullLocation) to its associated nearby venues and add it as a column to the table? Thanks for the advice.
{'suggestedFilters': {'header': 'Tap to show:',
'filters': [{'name': 'Open now', 'key': 'openNow'}]},
'headerLocation': 'Alexandra Park',
'headerFullLocation': 'Alexandra Park, Toronto',
'headerLocationGranularity': 'neighborhood',
'totalResults': 138,
'suggestedBounds': {'ne': {'lat': 43.6545000045, 'lng': -79.39379244047241},
'sw': {'lat': 43.645499995499996, 'lng': -79.4062075595276}},
'groups': [{'type': 'Recommended Places',
'name': 'recommended',
'items': [{'reasons': {'count': 0,
'items': [{'summary': 'This spot is popular',
'type': 'general',
'reasonName': 'globalInteractionReason'}]},
'venue': {'id': '5644dbaa498e7f7534154326',
'name': 'Maker Pizza',
'contact': {},
'location': {'address': '59 Cameron St',
'lat': 43.6504011331197,
'lng': -79.39804047841302,
'labeledLatLngs': [{'label': 'display',
'lat': 43.6504011331197,
'lng': -79.39804047841302}],
'distance': 164,
'postalCode': 'M5T 2H1',
'cc': 'CA',
'city': 'Toronto',
'state': 'ON',
'country': 'Canada',
'formattedAddress': ['59 Cameron St', 'Toronto ON M5T 2H1', 'Canada']},
'categories': [{'id': '4bf58dd8d48988d1ca941735',
'name': 'Pizza Place',
'pluralName': 'Pizza Places',
'shortName': 'Pizza',
'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/pizza_',
'suffix': '.png'},
'primary': True}],
'verified': False,
'stats': {'tipCount': 0,
'usersCount': 0,
'checkinsCount': 0,
'visitsCount': 0},
'beenHere': {'count': 0,
'lastCheckinExpiredAt': 0,
'marked': False,
'unconfirmedCount': 0},
'photos': {'count': 0, 'groups': []},
'hereNow': {'count': 0, 'summary': 'Nobody here', 'groups': []}},

Why don't you just do venues['Neighbourhood'] = response['headerFullLocation']? I am assuming you send a separate request for each neighbourhood and plan to concatenate the resulting venue dataframes at the end.
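As a rough sketch of that suggestion, assuming one Foursquare response per neighbourhood shaped like the snippet in the question, each request's rows can be tagged with headerFullLocation before everything is combined. The helper name venue_rows and the output keys are hypothetical, and the sample response is trimmed to the fields used here.

```python
def venue_rows(results):
    """Return one flat dict per venue, tagged with the neighbourhood of the request."""
    response = results["response"]
    neighbourhood = response["headerFullLocation"]
    rows = []
    for item in response["groups"][0]["items"]:
        venue = item["venue"]
        categories = venue.get("categories") or []
        rows.append({
            "Neighbourhood": neighbourhood,
            "Venue": venue["name"],
            "Latitude": venue["location"]["lat"],
            "Longitude": venue["location"]["lng"],
            # A venue may come back without categories, so guard the lookup.
            "Category": categories[0]["name"] if categories else None,
        })
    return rows


# Trimmed sample response (assumption: same layout as the snippet in the question).
sample = {
    "response": {
        "headerFullLocation": "Alexandra Park, Toronto",
        "groups": [{
            "items": [{
                "venue": {
                    "name": "Maker Pizza",
                    "location": {"lat": 43.6504011331197, "lng": -79.39804047841302},
                    "categories": [{"name": "Pizza Place"}],
                }
            }]
        }],
    }
}

# One response per neighbourhood; the rows are simply appended to a single list.
all_rows = []
for results in [sample]:
    all_rows.extend(venue_rows(results))
```

Passing all_rows to pd.DataFrame should then give a single table with a Neighbourhood column next to each venue.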

Related

How to do error-handling of JSON Parser Loop

I found some elegant code that builds a list by iterating through each element of another JSON list:
results = [
(
t["vintage"]["wine"]["winery"]["name"],
t["vintage"]["year"],
t["vintage"]["wine"]["id"],
f'{t["vintage"]["wine"]["name"]} {t["vintage"]["year"]}',
t["vintage"]["wine"]["statistics"]["ratings_average"],
t["vintage"]["wine"]["statistics"]["ratings_count"],
t["price"]["amount"],
t["vintage"]["wine"]["region"]["name"],
t["vintage"]["wine"]["style"]["name"], #<--------------issue here
)
for t in r.json()["explore_vintage"]["matches"]
]
The problem is that sometimes the JSON has no "name" element because "style" is null (which Python parses as None). See the second-to-last line of the JSON sample below.
Is there a simple way to handle this error?
Error:
matches[23]["vintage"]["wine"]["style"]["name"]
Traceback (most recent call last):
File "<ipython-input-94-59447d0d4859>", line 1, in <module>
matches[23]["vintage"]["wine"]["style"]["name"]
TypeError: 'NoneType' object is not subscriptable
Perhaps something like:
iferror(t["vintage"]["wine"]["style"]["name"], "DoesNotExist")
JSON:
{'id': 4026076,
'name': 'Shiraz - Petit Verdot',
'seo_name': 'shiraz-petit-verdot',
'type_id': 1,
'vintage_type': 0,
'is_natural': False,
'region': {'id': 685,
'name': 'South Eastern Australia',
'name_en': '',
'seo_name': 'south-eastern',
'country': {'code': 'au',
'name': 'Australia',
'native_name': 'Australia',
'seo_name': 'australia',
'sponsored': False,
'currency': {'code': 'AUD',
'name': 'Australian Dollars',
'prefix': '$',
'suffix': None},
'regions_count': 120,
'users_count': 867353,
'wines_count': 108099,
'wineries_count': 13375,
'most_used_grapes': [{'id': 1,
'name': 'Shiraz/Syrah',
'seo_name': 'shiraz-syrah',
'has_detailed_info': True,
'wines_count': 536370},
{'id': 2,
'name': 'Cabernet Sauvignon',
'seo_name': 'cabernet-sauvignon',
'has_detailed_info': True,
'wines_count': 780931},
{'id': 5,
'name': 'Chardonnay',
'seo_name': 'chardonnay',
'has_detailed_info': True,
'wines_count': 586874}],
'background_video': None},
'class': {'typecast_map': {'background_image': {}, 'class': {}}},
'background_image': {'location': '//images.vivino.com/regions/backgrounds/0iT8wuQXRWaAmEGpPjZckg.jpg',
'variations': {'large': '//thumbs.vivino.com/region_backgrounds/0iT8wuQXRWaAmEGpPjZckg_1280x760.jpg',
'medium': '//thumbs.vivino.com/region_backgrounds/0iT8wuQXRWaAmEGpPjZckg_600x356.jpg'}}},
'winery': {'id': 74363,
'name': 'Barramundi',
'seo_name': 'barramundi',
'status': 0,
'background_image': None},
'taste': {'structure': None,
'flavor': [{'group': 'black_fruit', 'stats': {'count': 16, 'score': 2987}},
{'group': 'oak', 'stats': {'count': 11, 'score': 1329}},
{'group': 'red_fruit', 'stats': {'count': 10, 'score': 1413}},
{'group': 'spices', 'stats': {'count': 6, 'score': 430}},
{'group': 'non_oak', 'stats': {'count': 5, 'score': 126}},
{'group': 'floral', 'stats': {'count': 3, 'score': 300}},
{'group': 'earth', 'stats': {'count': 3, 'score': 249}},
{'group': 'microbio', 'stats': {'count': 2, 'score': 66}},
{'group': 'vegetal', 'stats': {'count': 1, 'score': 100}},
{'group': 'dried_fruit', 'stats': {'count': 1, 'score': 100}}]},
'statistics': {'status': 'Normal',
'ratings_count': 1002,
'ratings_average': 3.5,
'labels_count': 11180,
'vintages_count': 25},
'style': None,
'has_valid_ratings': True}
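One possible way to get the iferror-like behaviour described above is a small helper that walks the nested keys and falls back to a default when any level is missing or None. This is a sketch only: the name dig is hypothetical, and the sample record is trimmed to the fields needed to show the idea.

```python
def dig(data, *keys, default=None):
    """Walk nested dicts key by key; return default if a level is missing or None."""
    current = data
    for key in keys:
        # A None (JSON null) or non-dict value cannot be indexed any further.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return default if current is None else current


# Trimmed sample match (assumption: same nesting as the JSON in the question).
t = {"vintage": {"wine": {"name": "Shiraz - Petit Verdot", "style": None}}}

style_name = dig(t, "vintage", "wine", "style", "name", default="DoesNotExist")
wine_name = dig(t, "vintage", "wine", "name")
```

Inside the list comprehension, the failing entry could then be written as dig(t, "vintage", "wine", "style", "name", default="DoesNotExist") while the other entries stay unchanged.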

Need help translating a nested dictionary into a pandas dataframe

Looking into translating the following nested dictionary which is an API pull from Yelp into a pandas dataframe to run visualization on:
Top 50 Pizzerias in Chicago
{'businesses': [{'alias': 'pequods-pizzeria-chicago',
'categories': [{'alias': 'pizza', 'title': 'Pizza'}],
'coordinates': {'latitude': 41.92187, 'longitude': -87.664486},
'display_phone': '(773) 327-1512',
'distance': 2158.7084581522413,
'id': 'DXwSYgiXqIVNdO9dazel6w',
'image_url': 'https://s3-media1.fl.yelpcdn.com/bphoto/8QJUNblfCI0EDhOjuIWJ4A/o.jpg',
'is_closed': False,
'location': {'address1': '2207 N Clybourn Ave',
'address2': '',
'address3': '',
'city': 'Chicago',
'country': 'US',
'display_address': ['2207 N Clybourn Ave',
'Chicago, IL 60614'],
'state': 'IL',
'zip_code': '60614'},
'name': "Pequod's Pizzeria",
'phone': '+17733271512',
'price': '$$',
'rating': 4.0,
'review_count': 6586,
'transactions': ['restaurant_reservation', 'delivery'],
'url': 'https://www.yelp.com/biz/pequods-pizzeria-chicago?adjust_creative=wt2WY5Ii_urZB8YeHggW2g&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=wt2WY5Ii_urZB8YeHggW2g'},
{'alias': 'lou-malnatis-pizzeria-chicago',
'categories': [{'alias': 'pizza', 'title': 'Pizza'},
{'alias': 'italian', 'title': 'Italian'},
{'alias': 'sandwiches', 'title': 'Sandwiches'}],
'coordinates': {'latitude': 41.890357,
'longitude': -87.633704},
'display_phone': '(312) 828-9800',
'distance': 4000.9990531720227,
'id': '8vFJH_paXsMocmEO_KAa3w',
'image_url': 'https://s3-media3.fl.yelpcdn.com/bphoto/9FiL-9Pbytyg6usOE02lYg/o.jpg',
'is_closed': False,
'location': {'address1': '439 N Wells St',
'address2': '',
'address3': '',
'city': 'Chicago',
'country': 'US',
'display_address': ['439 N Wells St',
'Chicago, IL 60654'],
'state': 'IL',
'zip_code': '60654'},
'name': "Lou Malnati's Pizzeria",
'phone': '+13128289800',
'price': '$$',
'rating': 4.0,
'review_count': 6368,
'transactions': ['pickup', 'delivery'],
'url': 'https://www.yelp.com/biz/lou-malnatis-pizzeria-chicago?adjust_creative=wt2WY5Ii_urZB8YeHggW2g&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=wt2WY5Ii_urZB8YeHggW2g'},
....]
I've tried the code below and variations of it, but haven't had any luck.
df = pd.DataFrame.from_dict(topresponse)
I'm really new to coding, so any advice would be helpful.

response["businesses"] is a list of records, so:
df = pd.DataFrame.from_records(response["businesses"])

I'm trying to normalize the documents column within the dataframe

[{
'processingTechniques': ['Hot rolling'],
'summary': 'Metals Long Products Rebar in Coil',
'applications': ['CONCRETE REINFORCEMENT', 'METAL DOWNSTREAM INDUSTRIALS', 'CUT AND BEND', 'EPOXY COATING '],
'regions': ['MEA'],
'description': 'Metals Long Products Rebar in Coil',
'industrySegments': None,
'grade_id': '89a63243-74c7-e611-8197-06b69393ae39',
'name': '40',
'documents': [{
'documentType': 'TDS',
'title': 'Rebar in Coil_40_Global_Technical_Data_Sheet',
'url': '/en/products/documents/rebar-in-coil_40_global_technical_data_sheet/en',
'language': 'English',
'region': 'Global',
'revision': '20210812',
'id': '2bc4102f-8df7-e611-819b-06b69393ae39'
}]
}, {
'processingTechniques': ['Hot rolling'],
'summary': 'Metals Long Products Rebar in Coil',
'applications': ['CONCRETE REINFORCEMENT', 'METAL DOWNSTREAM INDUSTRIALS', 'CUT AND BEND', 'EPOXY COATING '],
'regions': ['MEA'],
'description': 'Metals Long Products Rebar in Coil',
'industrySegments': None,
'grade_id': 'dddd0468-79c7-e611-8197-06b69393ae39',
'name': '460B',
'documents': [{
'documentType': 'TDS',
'title': 'Rebar in Coil_460B_MEA_Technical_Data_Sheet',
'url': '/en/products/documents/rebar-in-coil_460b_mea_technical_data_sheet/en',
'language': 'English',
'region': 'MEA',
'revision': '20210812',
'id': '0e63bc98-c343-e811-80fd-005056857ef3'
}]
}, {
'processingTechniques': ['Hot rolling'],
'summary': 'Metals Long Products Rebar in Coil',
'applications': ['CONCRETE REINFORCEMENT', 'METAL DOWNSTREAM INDUSTRIALS', 'CUT AND BEND', 'EPOXY COATING '],
'regions': ['MEA'],
'description': 'Metals Long Products Rebar in Coil',
'industrySegments': None,
'grade_id': 'cd695035-76c7-e611-8197-06b69393ae39',
'name': '60',
'documents': [{
'documentType': 'TDS',
'title': 'Rebar in Coil_60_MEA_Technical_Data_Sheet',
'url': '/en/products/documents/rebar-in-coil_60_mea_technical_data_sheet/en',
'language': 'English',
'region': 'MEA',
'revision': '20210812',
'id': '733946d8-c343-e811-80fd-005056857ef3'
}]
}, {
'processingTechniques': ['Hot rolling'],
'summary': 'Metals Long Products Rebar in Coil',
'applications': ['CONCRETE REINFORCEMENT', 'METAL DOWNSTREAM INDUSTRIALS', 'CUT AND BEND', 'EPOXY COATING '],
'regions': ['MEA'],
'description': 'Metals Long Products Rebar in Coil',
'industrySegments': None,
'grade_id': 'c99a8cc9-79c7-e611-8197-06b69393ae39',
'name': 'B500B',
'documents': [{
'documentType': 'TDS',
'title': 'Rebar in Coil_B500B_MEA_Technical_Data_Sheet',
'url': '/en/products/documents/rebar-in-coil_b500b_mea_technical_data_sheet/en',
'language': 'English',
'region': 'MEA',
'revision': '20210812',
'id': 'bc25a637-c443-e811-80fd-005056857ef3'
}]
}]
The code I wrote to normalize the documents and convert this JSON to a dataframe is:
gr2 = pd.json_normalize(result, ['documents'], meta = ['regions', 'name', 'description', 'grade_id', 'processingTechniques','summary', 'applications'])
gr2['product_id'] = prod_id
gr2.head()
result is the JSON shown above.
After running the above code, I get an error.
Can anyone help me with this? I just want documents to be normalized along with the other columns.

Python Decision Tree: Creating Relationship using Dictionary from Row data

I have hierarchical data (more than 10 generations) that records who each person's parent is. I would like to represent it as a nested dict of dicts. Is there any way to achieve this?
Sample input (list of dicts / dataframe):
[{'Name': 'Oli Bob', 'Location': 'United Kingdom', 'Parent': nan}, {'Name': 'Mary May', 'Location': 'Germany', 'Parent': 'Oli Bob'}, {'Name': 'Christine Lobowski', 'Location': 'France', 'Parent': 'Oli Bob'}, {'Name': 'Brendon Philips', 'Location': 'USA', 'Parent': 'Oli Bob'}, {'Name': 'Margret Marmajuke', 'Location': 'Canada', 'Parent': 'Brendon Philips'}, {'Name': 'Frank Harbours', 'Location': 'Russia', 'Parent': 'Brendon Philips'}, {'Name': 'Todd Philips', 'Location': 'United Kingdom', 'Parent': 'Frank Harbours'}, {'Name': 'Jamie Newhart', 'Location': 'India', 'Parent': nan}, {'Name': 'Gemma Jane', 'Location': 'China', 'Parent': nan}, {'Name': 'Emily Sykes', 'Location': 'South Korea', 'Parent': 'Emily Sykes'}, {'Name': 'James Newman', 'Location': 'Japan', 'Parent': nan}]
same data in table form
Desired Output
[
{name:"Oli Bob", location:"United Kingdom", _children:[
{name:"Mary May", location:"Germany"},
{name:"Christine Lobowski", location:"France"},
{name:"Brendon Philips", location:"USA",_children:[
{name:"Margret Marmajuke", location:"Canada"},
{name:"Frank Harbours", location:"Russia",_children:[{name:"Todd Philips", location:"United Kingdom"}]},
]},
]},
{name:"Jamie Newhart", location:"India"},
{name:"Gemma Jane", location:"China", _children:[
{name:"Emily Sykes", location:"South Korea"},
]},
{name:"James Newman", location:"Japan"},
];
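A minimal sketch of one way to build such a nested structure from Name/Parent rows is given below. It assumes names are unique, treats a missing (None or NaN) parent as a root, and also treats a row whose parent is unknown or is the row itself as a root. The function names are hypothetical, and the sample rows are a subset of the input above.

```python
import math


def is_missing(value):
    """True for None or a float NaN, the usual 'no parent' markers."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def build_tree(rows):
    """Turn flat Name/Location/Parent rows into a list of nested root nodes."""
    # First pass: one node per person, keyed by name (assumes unique names).
    nodes = {r["Name"]: {"name": r["Name"], "location": r["Location"]} for r in rows}
    roots = []
    # Second pass: attach each node to its parent, or keep it as a root.
    for r in rows:
        node = nodes[r["Name"]]
        parent = r["Parent"]
        if is_missing(parent) or parent not in nodes or parent == r["Name"]:
            roots.append(node)
        else:
            nodes[parent].setdefault("_children", []).append(node)
    return roots


nan = float("nan")
rows = [
    {"Name": "Oli Bob", "Location": "United Kingdom", "Parent": nan},
    {"Name": "Mary May", "Location": "Germany", "Parent": "Oli Bob"},
    {"Name": "Brendon Philips", "Location": "USA", "Parent": "Oli Bob"},
    {"Name": "Frank Harbours", "Location": "Russia", "Parent": "Brendon Philips"},
    {"Name": "Todd Philips", "Location": "United Kingdom", "Parent": "Frank Harbours"},
    {"Name": "Jamie Newhart", "Location": "India", "Parent": nan},
]
tree = build_tree(rows)
```

Because every node is created before any linking happens, the depth of the hierarchy does not matter and the rows do not need to be sorted by generation.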

Extract specific region from image using segmentation in python

I have a JSON file in which the annotation is stored as below:
{'licenses': [{'name': '', 'id': 0, 'url': ''}], 'info': {'contributor': '', 'date_created': '', 'description': '', 'url': '', 'version': '', 'year': ''}, 'categories': [{'id': 1, 'name': 'book', 'supercategory': ''}, {'id': 2, 'name': 'ceiling', 'supercategory': ''}, {'id': 3, 'name': 'chair', 'supercategory': ''}, {'id': 4, 'name': 'floor', 'supercategory': ''}, {'id': 5, 'name': 'object', 'supercategory': ''}, {'id': 6, 'name': 'person', 'supercategory': ''}, {'id': 7, 'name': 'screen', 'supercategory': ''}, {'id': 8, 'name': 'table', 'supercategory': ''}, {'id': 9, 'name': 'wall', 'supercategory': ''}, {'id': 10, 'name': 'window', 'supercategory': ''}, {'id': 11, 'name': '__background__', 'supercategory': ''}], 'images': [{'id': 1, 'width': 848, 'height': 480, 'file_name': '153058384000.png', 'license': 0, 'flickr_url': '', 'coco_url': '', 'date_captured': 0}], 'annotations': [{'id': 1, 'image_id': 1, 'category_id': 7, 'segmentation': [[591.81, 146.75, 848.0, 119.83, 848.0, 289.18, 606.39, 288.06]], 'area': 38747.0, 'bbox': [591.81, 119.83, 256.19, 169.35], 'iscrowd': 0, 'attributes': {'occluded': False}}]}
I want to select a specific region from the image using the 'segmentation': [[591.81, 146.75, 848.0, 119.83, 848.0, 289.18, 606.39, 288.06]] field within the annotation in the above JSON file.
The image I am using is below
I tried with OpenCV and PIL, but I didn't get a usable result.
Note: segmentation may have more than 8 coordinates
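As a first step, and only as a sketch, the flat segmentation list can be regrouped into (x, y) pairs, which works for any even number of coordinates. The helper names below are hypothetical. The bounding box computed from the pairs can be compared with the bbox field of the annotation as a sanity check.

```python
def polygon_points(flat):
    """Group a flat [x0, y0, x1, y1, ...] list into (x, y) tuples."""
    if len(flat) % 2 != 0:
        raise ValueError("segmentation must contain an even number of values")
    return list(zip(flat[0::2], flat[1::2]))


def bounding_box(points):
    """Return (x, y, width, height) of the axis-aligned box around the polygon."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)


# Segmentation taken from the annotation in the question.
segmentation = [[591.81, 146.75, 848.0, 119.83, 848.0, 289.18, 606.39, 288.06]]
points = polygon_points(segmentation[0])
x, y, w, h = bounding_box(points)
```

A list of (x, y) tuples like points is the form accepted by PIL's ImageDraw.Draw(mask).polygon(points, fill=255). With OpenCV, the same pairs would first need to be converted to an int32 NumPy array before being passed to cv2.fillPoly. Either mask can then be used to keep only the pixels inside the region.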
