Data manipulation of two list of dictionaries - python

I have the following two lists:
retrieved_sessions = [
{'start_time': '2020-01-17T08:30:00.000Z', 'availability': '5'},
{'start_time': '2020-01-17T09:30:00.000Z', 'availability': '7'},
{'start_time': '2020-01-17T10:30:00.000Z', 'availability': '6'},
{'start_time': '2020-01-17T11:30:00.000Z', 'availability': '5'},
{'start_time': '2020-01-17T12:30:00.000Z', 'availability': '0'},
{'start_time': '2020-01-17T13:30:00.000Z', 'availability': '2'},
{'start_time': '2020-01-17T14:30:00.000Z', 'availability': '13'}
]
all_sessions = [
{'start_time': '2020-01-17T09:00:00.000Z', 'availability': None},
{'start_time': '2020-01-17T09:30:00.000Z', 'availability': None},
{'start_time': '2020-01-17T10:00:00.000Z', 'availability': None},
{'start_time': '2020-01-17T10:30:00.000Z', 'availability': None},
{'start_time': '2020-01-17T11:00:00.000Z', 'availability': None},
{'start_time': '2020-01-17T11:30:00.000Z', 'availability': None},
{'start_time': '2020-01-17T12:00:00.000Z', 'availability': None},
{'start_time': '2020-01-17T12:30:00.000Z', 'availability': None},
{'start_time': '2020-01-17T13:00:00.000Z', 'availability': None},
{'start_time': '2020-01-17T13:30:00.000Z', 'availability': None},
{'start_time': '2020-01-17T14:00:00.000Z', 'availability': None},
{'start_time': '2020-01-17T14:30:00.000Z', 'availability': None},
{'start_time': '2020-01-17T15:00:00.000Z', 'availability': None},
{'start_time': '2020-01-17T15:30:00.000Z', 'availability': None}
]
I was wondering, what would be the best way to update the dictionary in all_sessions with the availability from the corresponding retrieved_sessions dictinary, using the start_time as the primary key lookup/matching field?
Expected output:
all_sessions = [
{'start_time': '2020-01-17T09:00:00.000Z', 'availability': None},
{'start_time': '2020-01-17T09:30:00.000Z', 'availability': '7'},
{'start_time': '2020-01-17T10:00:00.000Z', 'availability': None},
{'start_time': '2020-01-17T10:30:00.000Z', 'availability': '6'},
{'start_time': '2020-01-17T11:00:00.000Z', 'availability': None},
{'start_time': '2020-01-17T11:30:00.000Z', 'availability': '5'},
{'start_time': '2020-01-17T12:00:00.000Z', 'availability': None},
{'start_time': '2020-01-17T12:30:00.000Z', 'availability': '0'},
{'start_time': '2020-01-17T13:00:00.000Z', 'availability': None},
{'start_time': '2020-01-17T13:30:00.000Z', 'availability': '2'},
{'start_time': '2020-01-17T14:00:00.000Z', 'availability': None},
{'start_time': '2020-01-17T14:30:00.000Z', 'availability': '13'},
{'start_time': '2020-01-17T15:00:00.000Z', 'availability': None},
{'start_time': '2020-01-17T15:30:00.000Z', 'availability': None}
]
I have tried the following loop within a loop:
for available in availability:
for session in sessions:
if session['start_time'] == available['start_time']:
session['availability'] = available['availability']
N.B. Data is coming from a SOAP API, hence the wierd '1'/'2' instead of e.g., 1, 2. etc

You can make helper dict where the key is start_time and value is availability and then replace correspondent values in all_sessions
d = {s['start_time']: s['availability'] for s in retrieved_sessions}
for s in all_sessions:
s['availability'] = d.get(s['start_time'])
from pprint import pprint
pprint(all_sessions)
Prints:
[{'availability': None, 'start_time': '2020-01-17T09:00:00.000Z'},
{'availability': '7', 'start_time': '2020-01-17T09:30:00.000Z'},
{'availability': None, 'start_time': '2020-01-17T10:00:00.000Z'},
{'availability': '6', 'start_time': '2020-01-17T10:30:00.000Z'},
{'availability': None, 'start_time': '2020-01-17T11:00:00.000Z'},
{'availability': '5', 'start_time': '2020-01-17T11:30:00.000Z'},
{'availability': None, 'start_time': '2020-01-17T12:00:00.000Z'},
{'availability': '0', 'start_time': '2020-01-17T12:30:00.000Z'},
{'availability': None, 'start_time': '2020-01-17T13:00:00.000Z'},
{'availability': '2', 'start_time': '2020-01-17T13:30:00.000Z'},
{'availability': None, 'start_time': '2020-01-17T14:00:00.000Z'},
{'availability': '13', 'start_time': '2020-01-17T14:30:00.000Z'},
{'availability': None, 'start_time': '2020-01-17T15:00:00.000Z'},
{'availability': None, 'start_time': '2020-01-17T15:30:00.000Z'}]

Create a new dict with start_time as keys and availability as values.
Then iterate over copy.deepcopy(all_sessions).items(), matching on start_time and adding to the original all_sessions as you go. The deepcopy is because you can't iterate over a dict and modify it in the same pass.
The two nested for loops in your example are O(n*m) - IOW it's slow. Introducing a dict this way will speed things up to O(n+m).
Note that keying on times is prone to problems from duplicate times.

Related

having trouble passing multiple dictionaries for the argument record_path in json_normalize

I'm having troubles completely unnesting this json from an Api.
[{'id': 1,
'name': 'Buzz',
'tagline': 'A Real Bitter Experience.',
'first_brewed': '09/2007',
'description': 'A light, crisp and bitter IPA brewed with English and American hops. A small batch brewed only once.',
'image_url': 'https://images.punkapi.com/v2/keg.png',
'abv': 4.5,
'ibu': 60,
'target_fg': 1010,
'target_og': 1044,
'ebc': 20,
'srm': 10,
'ph': 4.4,
'attenuation_level': 75,
'volume': {'value': 20, 'unit': 'litres'},
'boil_volume': {'value': 25, 'unit': 'litres'},
'method': {'mash_temp': [{'temp': {'value': 64, 'unit': 'celsius'},
'duration': 75}],
'fermentation': {'temp': {'value': 19, 'unit': 'celsius'}},
'twist': None},
'ingredients': {'malt': [{'name': 'Maris Otter Extra Pale',
'amount': {'value': 3.3, 'unit': 'kilograms'}},
{'name': 'Caramalt', 'amount': {'value': 0.2, 'unit': 'kilograms'}},
{'name': 'Munich', 'amount': {'value': 0.4, 'unit': 'kilograms'}}],
'hops': [{'name': 'Fuggles',
'amount': {'value': 25, 'unit': 'grams'},
'add': 'start',
'attribute': 'bitter'},
{'name': 'First Gold',
'amount': {'value': 25, 'unit': 'grams'},
'add': 'start',
'attribute': 'bitter'},
{'name': 'Fuggles',
'amount': {'value': 37.5, 'unit': 'grams'},
'add': 'middle',
'attribute': 'flavour'},
{'name': 'First Gold',
'amount': {'value': 37.5, 'unit': 'grams'},
'add': 'middle',
'attribute': 'flavour'},
{'name': 'Cascade',
'amount': {'value': 37.5, 'unit': 'grams'},
'add': 'end',
'attribute': 'flavour'}],
'yeast': 'Wyeast 1056 - American Ale™'},
'food_pairing': ['Spicy chicken tikka masala',
'Grilled chicken quesadilla',
'Caramel toffee cake'],
'brewers_tips': 'The earthy and floral aromas from the hops can be overpowering. Drop a little Cascade in at the end of the boil to lift the profile with a bit of citrus.',
'contributed_by': 'Sam Mason <samjbmason>'},
{'id': 2,
'name': 'Trashy Blonde',
'tagline': "You Know You Shouldn't",
'first_brewed': '04/2008',
'description': 'A titillating, neurotic, peroxide punk of a Pale Ale. Combining attitude, style, substance, and a little bit of low self esteem for good measure; what would your mother say? The seductive lure of the sassy passion fruit hop proves too much to resist. All that is even before we get onto the fact that there are no additives, preservatives, pasteurization or strings attached. All wrapped up with the customary BrewDog bite and imaginative twist.',
'image_url': 'https://images.punkapi.com/v2/2.png',
'abv': 4.1,
'ibu': 41.5,
'target_fg': 1010,
'target_og': 1041.7,
'ebc': 15,
'srm': 15,
'ph': 4.4,
'attenuation_level': 76,
'volume': {'value': 20, 'unit': 'litres'},
'boil_volume': {'value': 25, 'unit': 'litres'},
'method': {'mash_temp': [{'temp': {'value': 69, 'unit': 'celsius'},
'duration': None}],
'fermentation': {'temp': {'value': 18, 'unit': 'celsius'}},
'twist': None},
'ingredients': {'malt': [{'name': 'Maris Otter Extra Pale',
'amount': {'value': 3.25, 'unit': 'kilograms'}},
{'name': 'Caramalt', 'amount': {'value': 0.2, 'unit': 'kilograms'}},
{'name': 'Munich', 'amount': {'value': 0.4, 'unit': 'kilograms'}}],
'hops': [{'name': 'Amarillo',
'amount': {'value': 13.8, 'unit': 'grams'},
'add': 'start',
'attribute': 'bitter'},
{'name': 'Simcoe',
'amount': {'value': 13.8, 'unit': 'grams'},
'add': 'start',
'attribute': 'bitter'},
{'name': 'Amarillo',
'amount': {'value': 26.3, 'unit': 'grams'},
'add': 'end',
'attribute': 'flavour'},
{'name': 'Motueka',
'amount': {'value': 18.8, 'unit': 'grams'},
'add': 'end',
'attribute': 'flavour'}],
'yeast': 'Wyeast 1056 - American Ale™'},
'food_pairing': ['Fresh crab with lemon',
'Garlic butter dipping sauce',
'Goats cheese salad',
'Creamy lemon bar doused in powdered sugar'],
'brewers_tips': 'Be careful not to collect too much wort from the mash. Once the sugars are all washed out there are some very unpleasant grainy tasting compounds that can be extracted into the wort.',
'contributed_by': 'Sam Mason <samjbmason>'}]
I was able to unnest it to a level using json_normalize
import requests
import pandas as pd
url = "https://api.punkapi.com/v2/beers"
requests.get(url).json()
data = requests.get(url).json()
pd.json_normalize(data)
this is an image of the output after using json_normalize
now to unnest the column 'method.mash_temp' I included record_path
pd.json_normalize(
data,
record_path =['method', 'mash_temp'],
meta=['id', 'name']
)
but I am having troubles adding the other columns('ingredients.malt', 'ingredients.hops') with list of dictionaries in the record_path argument.

How to Transform data in a nested json?

I have the following data, and when I used json_flatten i was able to extract most of the data except for deliveryMethod.items and languages.items.
I also tried to use pd.json_normalize(a, record_path= 'deliveryMethod.items') but it doesn't seem to be working.
a = {'ID': '1', 'Name': 'ABC', 'Center': 'Center For Education', 'providerNameAr': 'ABC', 'city': {'id': 1, 'cityEn': 'LA', 'regionId': 0, 'region': None}, 'cityName': None, 'LevelNumber': 'ABCD', 'activityStartDate': '09/01/2020', 'activityEndDate': '09/02/2020', 'activityType': {'lookUpId': 2, 'lookUpEn': 'Course', 'code': None, 'parent': None, 'hasParent': False}, 'deliveryMethod': {'items': [{'lookUpId': 2, 'lookUpEn': 'online' 'code': None, 'parent': None, 'hasParent': False}]}, 'languages': {'items': [{'lookUpId': 1, 'lookUpEn': 'English', 'code': None, 'parent': None, 'hasParent': False}]}, 'activityCategory': {'lookUpId': 1, 'lookUpEn': 'Regular', 'code': None, 'parent': None, 'hasParent': False}, 'address': 'LA', 'phoneNumber': '-11111', 'emailAddress': 'ABCS#Gmail.com', 'isAllSpeciality': True, 'requestId': 23, 'parentActivityId': None, 'sppData': None}

Iterating over JSON data and printing. (or creating Pandas DataFrame from JSON file)

I’m trying to use Python print specific values from a JSON file that I pulled from an API. From what I understand, I am pulling it as a JSON file that has a list of dictionaries of players, with a nested dictionary for each player containing their data (i.e. name, team, etc.).
I’m running into issues printing the values within the JSON file, as each character is printing on a separate line.
The end result I am trying to get to is a Pandas DataFrame containing all the values from the JSON file, but I can’t even seem to iterate through the JSON file correctly.
Here is my code:
url = "https://api-football-v1.p.rapidapi.com/v3/players"
querystring = {"league":"39","season":"2020", "page":"2"}
headers = {
"X-RapidAPI-Host": "api-football-v1.p.rapidapi.com",
"X-RapidAPI-Key": "xxxxxkeyxxxxx"
}
response = requests.request("GET", url, headers=headers, params=querystring).json()
response_dump = json.dumps(response)
for item in response_dump:
for player_item in item:
print(player_item)
This is the output when I print the JSON response (first two items):
{'get': 'players', 'parameters': {'league': '39', 'page': '2', 'season': '2020'}, 'errors': [], 'results': 20, 'paging': {'current': 2, 'total': 37}, 'response': [{'player': {'id': 301, 'name': 'Benjamin Luke Woodburn', 'firstname': 'Benjamin Luke', 'lastname': 'Woodburn', 'age': 23, 'birth': {'date': '1999-10-15', 'place': 'Nottingham', 'country': 'England'}, 'nationality': 'Wales', 'height': '174 cm', 'weight': '72 kg', 'injured': False, 'photo': 'https://media.api-sports.io/football/players/301.png'}, 'statistics': [{'team': {'id': 40, 'name': 'Liverpool', 'logo': 'https://media.api-sports.io/football/teams/40.png'}, 'league': {'id': 39, 'name': 'Premier League', 'country': 'England', 'logo': 'https://media.api-sports.io/football/leagues/39.png', 'flag': 'https://media.api-sports.io/flags/gb.svg', 'season': 2020}, 'games': {'appearences': 0, 'lineups': 0, 'minutes': 0, 'number': None, 'position': 'Attacker', 'rating': None, 'captain': False}, 'substitutes': {'in': 0, 'out': 0, 'bench': 3}, 'shots': {'total': None, 'on': None}, 'goals': {'total': 0, 'conceded': 0, 'assists': None, 'saves': None}, 'passes': {'total': None, 'key': None, 'accuracy': None}, 'tackles': {'total': None, 'blocks': None, 'interceptions': None}, 'duels': {'total': None, 'won': None}, 'dribbles': {'attempts': None, 'success': None, 'past': None}, 'fouls': {'drawn': None, 'committed': None}, 'cards': {'yellow': 0, 'yellowred': 0, 'red': 0}, 'penalty': {'won': None, 'commited': None, 'scored': 0, 'missed': 0, 'saved': None}}]}, {'player': {'id': 518, 'name': 'Meritan Shabani', 'firstname': 'Meritan', 'lastname': 'Shabani', 'age': 23, 'birth': {'date': '1999-03-15', 'place': 'München', 'country': 'Germany'}, 'nationality': 'Germany', 'height': '185 cm', 'weight': '78 kg', 'injured': False, 'photo': 'https://media.api-sports.io/football/players/518.png'}, 'statistics': [{'team': {'id': 39, 'name': 'Wolves', 'logo': 'https://media.api-sports.io/football/teams/39.png'}, 'league': {'id': 39, 'name': 'Premier League', 'country': 'England', 'logo': 'https://media.api-sports.io/football/leagues/39.png', 'flag': 'https://media.api-sports.io/flags/gb.svg', 'season': 2020}, 'games': {'appearences': 0, 'lineups': 0, 'minutes': 0, 'number': None, 'position': 'Midfielder', 'rating': None, 'captain': False}, 'substitutes': {'in': 0, 'out': 0, 'bench': 3}, 'shots': {'total': None, 'on': None}, 'goals': {'total': 0, 'conceded': 0, 'assists': None, 'saves': None}, 'passes': {'total': None, 'key': None, 'accuracy': None}, 'tackles': {'total': None, 'blocks': None, 'interceptions': None}, 'duels': {'total': None, 'won': None}, 'dribbles': {'attempts': None, 'success': None, 'past': None}, 'fouls': {'drawn': None, 'committed': None}, 'cards': {'yellow': 0, 'yellowred': 0, 'red': 0}, 'penalty': {'won': None, 'commited': None, 'scored': 0, 'missed': 0, 'saved': None}}]},
This is the data type of each layer of the JSON file, from when I iterated through it with a For loop:
print(type(response)) <class 'dict'>
print(type(response_dump)) <class 'str'>
print(type(item)) <class 'str'>
print(type(player_item)) <class 'str'>
You do not have to json.dumps() in my opinion, just use the JSON from response to iterate:
for player in response['response']:
print(player)
{'player': {'id': 301, 'name': 'Benjamin Luke Woodburn', 'firstname': 'Benjamin Luke', 'lastname': 'Woodburn', 'age': 23, 'birth': {'date': '1999-10-15', 'place': 'Nottingham', 'country': 'England'}, 'nationality': 'Wales', 'height': '174 cm', 'weight': '72 kg', 'injured': False, 'photo': 'https://media.api-sports.io/football/players/301.png'}, 'statistics': [{'team': {'id': 40, 'name': 'Liverpool', 'logo': 'https://media.api-sports.io/football/teams/40.png'}, 'league': {'id': 39, 'name': 'Premier League', 'country': 'England', 'logo': 'https://media.api-sports.io/football/leagues/39.png', 'flag': 'https://media.api-sports.io/flags/gb.svg', 'season': 2020}, 'games': {'appearences': 0, 'lineups': 0, 'minutes': 0, 'number': None, 'position': 'Attacker', 'rating': None, 'captain': False}, 'substitutes': {'in': 0, 'out': 0, 'bench': 3}, 'shots': {'total': None, 'on': None}, 'goals': {'total': 0, 'conceded': 0, 'assists': None, 'saves': None}, 'passes': {'total': None, 'key': None, 'accuracy': None}, 'tackles': {'total': None, 'blocks': None, 'interceptions': None}, 'duels': {'total': None, 'won': None}, 'dribbles': {'attempts': None, 'success': None, 'past': None}, 'fouls': {'drawn': None, 'committed': None}, 'cards': {'yellow': 0, 'yellowred': 0, 'red': 0}, 'penalty': {'won': None, 'commited': None, 'scored': 0, 'missed': 0, 'saved': None}}]}
{'player': {'id': 518, 'name': 'Meritan Shabani', 'firstname': 'Meritan', 'lastname': 'Shabani', 'age': 23, 'birth': {'date': '1999-03-15', 'place': 'München', 'country': 'Germany'}, 'nationality': 'Germany', 'height': '185 cm', 'weight': '78 kg', 'injured': False, 'photo': 'https://media.api-sports.io/football/players/518.png'}, 'statistics': [{'team': {'id': 39, 'name': 'Wolves', 'logo': 'https://media.api-sports.io/football/teams/39.png'}, 'league': {'id': 39, 'name': 'Premier League', 'country': 'England', 'logo': 'https://media.api-sports.io/football/leagues/39.png', 'flag': 'https://media.api-sports.io/flags/gb.svg', 'season': 2020}, 'games': {'appearences': 0, 'lineups': 0, 'minutes': 0, 'number': None, 'position': 'Midfielder', 'rating': None, 'captain': False}, 'substitutes': {'in': 0, 'out': 0, 'bench': 3}, 'shots': {'total': None, 'on': None}, 'goals': {'total': 0, 'conceded': 0, 'assists': None, 'saves': None}, 'passes': {'total': None, 'key': None, 'accuracy': None}, 'tackles': {'total': None, 'blocks': None, 'interceptions': None}, 'duels': {'total': None, 'won': None}, 'dribbles': {'attempts': None, 'success': None, 'past': None}, 'fouls': {'drawn': None, 'committed': None}, 'cards': {'yellow': 0, 'yellowred': 0, 'red': 0}, 'penalty': {'won': None, 'commited': None, 'scored': 0, 'missed': 0, 'saved': None}}]}
or
for player in response['response']:
print(player['player'])
{'id': 301, 'name': 'Benjamin Luke Woodburn', 'firstname': 'Benjamin Luke', 'lastname': 'Woodburn', 'age': 23, 'birth': {'date': '1999-10-15', 'place': 'Nottingham', 'country': 'England'}, 'nationality': 'Wales', 'height': '174 cm', 'weight': '72 kg', 'injured': False, 'photo': 'https://media.api-sports.io/football/players/301.png'}
{'id': 518, 'name': 'Meritan Shabani', 'firstname': 'Meritan', 'lastname': 'Shabani', 'age': 23, 'birth': {'date': '1999-03-15', 'place': 'München', 'country': 'Germany'}, 'nationality': 'Germany', 'height': '185 cm', 'weight': '78 kg', 'injured': False, 'photo': 'https://media.api-sports.io/football/players/518.png'}
To get a DataFrame simply call pd.json_normalize() - Cause your question is not that clear I am not sure wiche information is needed and how to displayed. This is predestinated to ask a new question with exact that focus.:
pd.json_normalize(response['response'])
EDIT
Based on your comment and improvment:
pd.concat([pd.json_normalize(response,['response'])\
,pd.json_normalize(response,['response','statistics'])], axis=1)\
.drop(['statistics'], axis=1)
player.id
player.name
player.firstname
player.lastname
player.age
player.birth.date
player.birth.place
player.birth.country
player.nationality
player.height
player.weight
player.injured
player.photo
team.id
team.name
team.logo
league.id
league.name
league.country
league.logo
league.flag
league.season
games.appearences
games.lineups
games.minutes
games.number
games.position
games.rating
games.captain
substitutes.in
substitutes.out
substitutes.bench
shots.total
shots.on
goals.total
goals.conceded
goals.assists
goals.saves
passes.total
passes.key
passes.accuracy
tackles.total
tackles.blocks
tackles.interceptions
duels.total
duels.won
dribbles.attempts
dribbles.success
dribbles.past
fouls.drawn
fouls.committed
cards.yellow
cards.yellowred
cards.red
penalty.won
penalty.commited
penalty.scored
penalty.missed
penalty.saved
0
301
Benjamin Luke Woodburn
Benjamin Luke
Woodburn
23
1999-10-15
Nottingham
England
Wales
174 cm
72 kg
False
https://media.api-sports.io/football/players/301.png
40
Liverpool
https://media.api-sports.io/football/teams/40.png
39
Premier League
England
https://media.api-sports.io/football/leagues/39.png
https://media.api-sports.io/flags/gb.svg
2020
0
0
0
Attacker
False
0
0
3
0
0
0
0
0
0
0
1
518
Meritan Shabani
Meritan
Shabani
23
1999-03-15
München
Germany
Germany
185 cm
78 kg
False
https://media.api-sports.io/football/players/518.png
39
Wolves
https://media.api-sports.io/football/teams/39.png
39
Premier League
England
https://media.api-sports.io/football/leagues/39.png
https://media.api-sports.io/flags/gb.svg
2020
0
0
0
Midfielder
False
0
0
3
0
0
0
0
0
0
0

How to do error-handling of JSON Parser Loop

I found some elegant code that builds a list by iterating through each element of another JSON list:
results = [
(
t["vintage"]["wine"]["winery"]["name"],
t["vintage"]["year"],
t["vintage"]["wine"]["id"],
f'{t["vintage"]["wine"]["name"]} {t["vintage"]["year"]}',
t["vintage"]["wine"]["statistics"]["ratings_average"],
t["vintage"]["wine"]["statistics"]["ratings_count"],
t["price"]["amount"],
t["vintage"]["wine"]["region"]["name"],
t["vintage"]["wine"]["style"]["name"], #<--------------issue here
)
for t in r.json()["explore_vintage"]["matches"]
]
The problem is that sometimes the JSON doesn't have a "name" element because the "style" is null (or None in JSON world). See the second-last line below for the JSON sample.
Is there a simple way to handle this error?
Error:
matches[23]["vintage"]["wine"]["style"]["name"]
Traceback (most recent call last):
File "<ipython-input-94-59447d0d4859>", line 1, in <module>
matches[23]["vintage"]["wine"]["style"]["name"]
TypeError: 'NoneType' object is not subscriptable
Perhaps something like:
iferror(t["vintage"]["wine"]["style"]["name"], "DoesNotExist")
JSON:
{'id': 4026076,
'name': 'Shiraz - Petit Verdot',
'seo_name': 'shiraz-petit-verdot',
'type_id': 1,
'vintage_type': 0,
'is_natural': False,
'region': {'id': 685,
'name': 'South Eastern Australia',
'name_en': '',
'seo_name': 'south-eastern',
'country': {'code': 'au',
'name': 'Australia',
'native_name': 'Australia',
'seo_name': 'australia',
'sponsored': False,
'currency': {'code': 'AUD',
'name': 'Australian Dollars',
'prefix': '$',
'suffix': None},
'regions_count': 120,
'users_count': 867353,
'wines_count': 108099,
'wineries_count': 13375,
'most_used_grapes': [{'id': 1,
'name': 'Shiraz/Syrah',
'seo_name': 'shiraz-syrah',
'has_detailed_info': True,
'wines_count': 536370},
{'id': 2,
'name': 'Cabernet Sauvignon',
'seo_name': 'cabernet-sauvignon',
'has_detailed_info': True,
'wines_count': 780931},
{'id': 5,
'name': 'Chardonnay',
'seo_name': 'chardonnay',
'has_detailed_info': True,
'wines_count': 586874}],
'background_video': None},
'class': {'typecast_map': {'background_image': {}, 'class': {}}},
'background_image': {'location': '//images.vivino.com/regions/backgrounds/0iT8wuQXRWaAmEGpPjZckg.jpg',
'variations': {'large': '//thumbs.vivino.com/region_backgrounds/0iT8wuQXRWaAmEGpPjZckg_1280x760.jpg',
'medium': '//thumbs.vivino.com/region_backgrounds/0iT8wuQXRWaAmEGpPjZckg_600x356.jpg'}}},
'winery': {'id': 74363,
'name': 'Barramundi',
'seo_name': 'barramundi',
'status': 0,
'background_image': None},
'taste': {'structure': None,
'flavor': [{'group': 'black_fruit', 'stats': {'count': 16, 'score': 2987}},
{'group': 'oak', 'stats': {'count': 11, 'score': 1329}},
{'group': 'red_fruit', 'stats': {'count': 10, 'score': 1413}},
{'group': 'spices', 'stats': {'count': 6, 'score': 430}},
{'group': 'non_oak', 'stats': {'count': 5, 'score': 126}},
{'group': 'floral', 'stats': {'count': 3, 'score': 300}},
{'group': 'earth', 'stats': {'count': 3, 'score': 249}},
{'group': 'microbio', 'stats': {'count': 2, 'score': 66}},
{'group': 'vegetal', 'stats': {'count': 1, 'score': 100}},
{'group': 'dried_fruit', 'stats': {'count': 1, 'score': 100}}]},
'statistics': {'status': 'Normal',
'ratings_count': 1002,
'ratings_average': 3.5,
'labels_count': 11180,
'vintages_count': 25},
'style': None,
'has_valid_ratings': True}

Extract specific region from image using segmentation in python

I am having a JSON file where the annotation is stored as below
{'licenses': [{'name': '', 'id': 0, 'url': ''}], 'info': {'contributor': '', 'date_created': '', 'description': '', 'url': '', 'version': '', 'year': ''}, 'categories': [{'id': 1, 'name': 'book', 'supercategory': ''}, {'id': 2, 'name': 'ceiling', 'supercategory': ''}, {'id': 3, 'name': 'chair', 'supercategory': ''}, {'id': 4, 'name': 'floor', 'supercategory': ''}, {'id': 5, 'name': 'object', 'supercategory': ''}, {'id': 6, 'name': 'person', 'supercategory': ''}, {'id': 7, 'name': 'screen', 'supercategory': ''}, {'id': 8, 'name': 'table', 'supercategory': ''}, {'id': 9, 'name': 'wall', 'supercategory': ''}, {'id': 10, 'name': 'window', 'supercategory': ''}, {'id': 11, 'name': '__background__', 'supercategory': ''}], 'images': [{'id': 1, 'width': 848, 'height': 480, 'file_name': '153058384000.png', 'license': 0, 'flickr_url': '', 'coco_url': '', 'date_captured': 0}], 'annotations': [{'id': 1, 'image_id': 1, 'category_id': 7, 'segmentation': [[591.81, 146.75, 848.0, 119.83, 848.0, 289.18, 606.39, 288.06]], 'area': 38747.0, 'bbox': [591.81, 119.83, 256.19, 169.35], 'iscrowd': 0, 'attributes': {'occluded': False}}]}
I want to select a specific region from the image using the ''segmentation': [[591.81, 146.75, 848.0, 119.83, 848.0, 289.18, 606.39, 288.06]]' field within annotation in the above json file.
The image I am using is below
I tried with Opencv and PIL, but I didn't get effective output
Note: segmentation may have more than 8 coordinates

Categories

Resources