Select certain elements of response - python

I can't work out how to select only some elements of my Twitch API response.
Here is the code, with its results, that makes a successful request to Twitch. It is not reproducible because client_id is personal information, but the results are included.
import requests
# All online streamers
client_id = "...confidential"
limit = "2"
def request_dataNewAPI(limit):
    headers = {"Client-ID": client_id, "Accept": "application/vnd.twitchtv.v5+json"}
    url = "https://api.twitch.tv/helix/streams?first=" + limit
    r = requests.get(url, headers=headers).json()
    return r
# For a bad user login name or an offline stream, the response will be:
# {'data': [], 'pagination': {}}
table1 = request_dataNewAPI(limit)
The output is:
New API
{'data': [{'id': '34472839600', 'user_id': '12826', 'user_name': 'Twitch', 'game_id': '509663', 'community_ids': ['f261cf73-cbcc-4b08-af72-c6d2020f9ed4'], 'type': 'live', 'title': 'The 1st Ever 3rd or 4th Pre Pre Show! Part 6', 'viewer_count': 19555, 'started_at': '2019-06-10T02:01:20Z', 'language': 'en', 'thumbnail_url': 'https://static-cdn.jtvnw.net/previews-ttv/live_user_twitch-{width}x{height}.jpg', 'tag_ids': ['d27da25e-1ee2-4207-bb11-dd8d54fa29ec', '6ea6bca4-4712-4ab9-a906-e3336a9d8039']}, {'id': '34474693232', 'user_id': '39298218', 'user_name': 'dakotaz', 'game_id': '33214', 'community_ids': [], 'type': 'live', 'title': '𝙖𝙙𝙫𝙚𝙣𝙩𝙪𝙧𝙚 𝙩𝙞𝙢𝙚 | code: dakotaz in itemshop & GFUEL', 'viewer_count': 15300, 'started_at': '2019-06-10T06:37:02Z', 'language': 'en', 'thumbnail_url': 'https://static-cdn.jtvnw.net/previews-ttv/live_user_dakotaz-{width}x{height}.jpg', 'tag_ids': ['6ea6bca4-4712-4ab9-a906-e3336a9d8039']}], 'pagination': {'cursor': 'eyJiIjpudWxsLCJhIjp7Ik9mZnNldCI6Mn19'}}
The problem is that I want to select only the list of 'user_name' of active streamers. I tried the following:
print(table1['data']['user_name'])
gives "TypeError: list indices must be integers or slices, not str".
print(table1['data'])
gives the whole array of data:
[{'id': '34472839600', 'user_id': '12826', 'user_name': 'Twitch', 'game_id': '509663', 'community_ids': ['f261cf73-cbcc-4b08-af72-c6d2020f9ed4'], 'type': 'live', 'title': 'The 1st Ever 3rd or 4th Pre Pre Show! Part 6', 'viewer_count': 19555, 'started_at': '2019-06-10T02:01:20Z', 'language': 'en', 'thumbnail_url': 'https://static-cdn.jtvnw.net/previews-ttv/live_user_twitch-{width}x{height}.jpg', 'tag_ids': ['d27da25e-1ee2-4207-bb11-dd8d54fa29ec', '6ea6bca4-4712-4ab9-a906-e3336a9d8039']}, {'id': '34474693232', 'user_id': '39298218', 'user_name': 'dakotaz', 'game_id': '33214', 'community_ids': [], 'type': 'live', 'title': '𝙖𝙙𝙫𝙚𝙣𝙩𝙪𝙧𝙚 𝙩𝙞𝙢𝙚 | code: dakotaz in itemshop & GFUEL', 'viewer_count': 15300, 'started_at': '2019-06-10T06:37:02Z', 'language': 'en', 'thumbnail_url': 'https://static-cdn.jtvnw.net/previews-ttv/live_user_dakotaz-{width}x{height}.jpg', 'tag_ids': ['6ea6bca4-4712-4ab9-a906-e3336a9d8039']}]
As a final result, I would like to have something like:
'user_name': {name1, name2}

The problem is that I want to select only the list of 'user_name' of active streamers
... print(table1['data']['user_name'])
... gives "TypeError: list indices must be integers or slices, not str".
You receive a TypeError because table1['data'] is a list, not a dict, so you must index it with an int, not a str (although dict keys can be ints as well).
Use a list comprehension:
user_names = [x['user_name'] for x in table1['data']]
This will give you the list of strings representing user names.
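As a small sketch of the comprehension above, here is the same idea run against a trimmed copy of the response from the question (only two fields per stream are kept, and the wrapping under a 'user_name' key is just one guess at the desired final shape):

```python
# Trimmed stand-in for the response shown in the question.
table1 = {
    "data": [
        {"user_name": "Twitch", "viewer_count": 19555},
        {"user_name": "dakotaz", "viewer_count": 15300},
    ],
    "pagination": {"cursor": "eyJiIjpudWxsLCJhIjp7Ik9mZnNldCI6Mn19"},
}

# table1["data"] is a list of dicts, so iterate over it and pick one key per dict.
user_names = [stream["user_name"] for stream in table1["data"]]

# Wrap the list under a key to get something close to the requested result.
result = {"user_name": user_names}
print(result)  # {'user_name': ['Twitch', 'dakotaz']}
```

If the response is {'data': [], 'pagination': {}}, the comprehension simply yields an empty list.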

Related

Trouble with inconsistent keys in a nested JSON

I am using the GoCardless API and I'm trying to print all the customer records. The code iterates through each record and I'm successfully pulling data. The issue I have is with custom fields: these are listed as metadata, but the code falls over if the field isn't completed. Instead of the dictionary containing the key with a null value, the key just doesn't exist, so the lookup fails.
I think what I need to do is add an if statement which prints NULL whenever it can't find the key, but I can't get that to work.
import gocardless_pro
import pandas
import Access_key
import os
client = gocardless_pro.Client(
    access_token = Access_key.accesskey_live,
    environment = 'live'
)
customers = client.customers.list().records
print(client.customers.list().records[0].__dict__)
data = {'id_customer' : [customer.id for customer in customers],
        'First Name' : [customer.given_name for customer in customers],
        'Family Name' : [customer.family_name for customer in customers],
        'address_line1' : [customer.address_line1 for customer in customers],
        'address_line2' : [customer.address_line2 for customer in customers],
        'address_line3' : [customer.address_line3 for customer in customers],
        'City' : [customer.city for customer in customers],
        'Region' : [customer.region for customer in customers],
        'Post Code' : [customer.postal_code for customer in customers],
        'id_tenant' : [customer.metadata["t"] for customer in customers],
        'Created date' : [customer.created_at for customer in customers]
        }
df = pandas.DataFrame(data)
print(df)
The output is inconsistent: only in some cases is there a metadata field "t". The two outputs look like:
Output 1 - with metadata field "t"
'attributes': {'id': 'xxx', 'created_at': '2022-10-12T13:31:41.205Z', 'email': 'xxx', 'given_name': 'xxx', 'family_name': 'xxx', 'company_name': None, 'address_line1': 'xxx', 'address_line2': None, 'address_line3': None, 'city': 'xxxx', 'region': None, 'postal_code': 'xxx', 'country_code': 'GB', 'language': 'en', 'swedish_identity_number': None, 'danish_identity_number': None, 'phone_number': None, 'metadata': {'T': 'xxx'}}, 'api_response': <gocardless_pro.api_response.ApiResponse object at >'
Output 2 - without metadata field
'attributes': {'id': 'xxx', 'created_at': '2022-10-12T13:31:41.205Z', 'email': 'xxx', 'given_name': 'xxx', 'family_name': 'xxx', 'company_name': None, 'address_line1': 'xxx', 'address_line2': None, 'address_line3': None, 'city': 'xxxx', 'region': None, 'postal_code': 'xxx', 'country_code': 'GB', 'language': 'en', 'swedish_identity_number': None, 'danish_identity_number': None, 'phone_number': None, 'metadata': {}}, 'api_response': <gocardless_pro.api_response.ApiResponse object at >
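One possible way to handle the missing key, sketched here with plain dictionaries standing in for the customer records (the real gocardless_pro objects are not used, and the ids and tenant value are made up): dict.get returns a default instead of raising KeyError, so no if statement is needed. This assumes metadata is an ordinary dict, as the printed attributes suggest.

```python
# Hypothetical stand-ins mimicking the two 'metadata' shapes shown above.
customers = [
    {"id": "CU1", "metadata": {"t": "tenant-1"}},
    {"id": "CU2", "metadata": {}},
]

# .get("t") returns None when the key is absent, so every row gets a value.
data = {
    "id_customer": [c["id"] for c in customers],
    "id_tenant": [c["metadata"].get("t") for c in customers],
}
print(data)  # {'id_customer': ['CU1', 'CU2'], 'id_tenant': ['tenant-1', None]}
```

In the original comprehension, the equivalent change would be customer.metadata.get("t"). Note also that dictionary keys are case-sensitive: the first output above shows the key as 'T', which a lookup for "t" will not find.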

Python get data with JSON response

I'm making a call to an API which returns a JSON response, and I am then trying to retrieve certain data from within that response.
{'data': {'9674': {'category': 'token',
'contract_address': [{'contract_address': '0x2a3bff78b79a009976eea096a51a948a3dc00e34',
'platform': {'coin': {'id': '1027',
'name': 'Ethereum',
'slug': 'ethereum',
'symbol': 'ETH'},
'name': 'Ethereum'}}],
'date_added': '2021-05-10T00:00:00.000Z',
'date_launched': '2021-05-10T00:00:00.000Z',
'description': 'Wilder World (WILD) is a cryptocurrency '
'launched in 2021and operates on the '
'Ethereum platform. Wilder World has a '
'current supply of 500,000,000 with '
'83,683,300.17 in circulation. The last '
'known price of Wilder World is 2.28165159 '
'USD and is down -6.79 over the last 24 '
'hours. It is currently trading on 21 active '
'market(s) with $2,851,332.76 traded over '
'the last 24 hours. More information can be '
'found at https://www.wilderworld.com/.',
'id': 9674,
'is_hidden': 0,
'logo': 'https://s2.coinmarketcap.com/static/img/coins/64x64/9674.png',
'name': 'Wilder World',
'notice': '',
'platform': {'id': 1027,
'name': 'Ethereum',
'slug': 'ethereum',
'symbol': 'ETH',
'token_address': '0x2a3bff78b79a009976eea096a51a948a3dc00e34'},
'self_reported_circulating_supply': 19000000,
'self_reported_tags': None,
'slug': 'wilder-world',
'subreddit': '',
'symbol': 'WILD',
'tag-groups': ['INDUSTRY',
'CATEGORY',
'INDUSTRY',
'CATEGORY',
'CATEGORY',
'CATEGORY',
'CATEGORY'],
'tag-names': ['VR/AR',
'Collectibles & NFTs',
'Gaming',
'Metaverse',
'Polkastarter',
'Animoca Brands Portfolio',
'SkyVision Capital Portfolio'],
'tags': ['vr-ar',
'collectibles-nfts',
'gaming',
'metaverse',
'polkastarter',
'animoca-brands-portfolio',
'skyvision-capital-portfolio'],
'twitter_username': 'WilderWorld',
'urls': {'announcement': [],
'chat': [],
'explorer': ['https://etherscan.io/token/0x2a3bff78b79a009976eea096a51a948a3dc00e34'],
'facebook': [],
'message_board': ['https://medium.com/#WilderWorld'],
'reddit': [],
'source_code': [],
'technical_doc': [],
'twitter': ['https://twitter.com/WilderWorld'],
'website': ['https://www.wilderworld.com/']}}},
'status': {'credit_count': 1,
'elapsed': 7,
'error_code': 0,
'error_message': None,
'notice': None,
'timestamp': '2022-01-20T21:33:04.832Z'}}
The data I am trying to get is 'logo': 'https://s2.coinmarketcap.com/static/img/coins/64x64/9674.png', but this sits within [data][9674][logo].
But as this script is running in the background for other objects, I won't know what the number [9674] is for other requests.
So is there a way to get that number automatically?
[data] will always be consistent.
I'm using this to get the data back:
session = Session()
session.headers.update(headers)
response = session.get(url, params=parameters)
pprint.pprint(json.loads(response.text)['data']['9674']['logo'])
You can try this:
session = Session()
session.headers.update(headers)
response = session.get(url, params=parameters)
resp = json.loads(response.text)
pprint.pprint(resp['data'][next(iter(resp['data']))]['logo'])
where next(iter(resp['data'])) returns the first key in the resp['data'] dict. In your example it is '9674'.
With .keys() you get a view of all keys in a dictionary. In Python 3 this view cannot be indexed, so convert it to a list first.
So you can use keys = list(json.loads(response.text)['data'].keys()) to get the keys in the data dict.
If you know there is always exactly one entry in 'data', you could use json.loads(response.text)['data'][keys[0]]['logo']. Otherwise you would need to iterate over all keys in the list and check which one you need.
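A short sketch of both approaches, using a trimmed copy of the response from the question (only 'name' and 'logo' are kept). Iterating over .items() or .values() avoids needing to know the numeric key at all:

```python
# Trimmed stand-in for the parsed JSON response.
resp = {
    "data": {
        "9674": {
            "name": "Wilder World",
            "logo": "https://s2.coinmarketcap.com/static/img/coins/64x64/9674.png",
        }
    },
    "status": {"error_code": 0},
}

# Map every id under 'data' to its logo, whatever the ids happen to be.
logos = {coin_id: info["logo"] for coin_id, info in resp["data"].items()}

# When only one entry is expected, take the first value directly.
first_logo = next(iter(resp["data"].values()))["logo"]

print(logos)
print(first_logo)
```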

How to Match two APIs to update one API dataset using Python

I want to be able to GET information from API 1, match it with API 2, and update API 2's information with the data from API 1. I am trying to figure out the most efficient/automated way to accomplish this, as it also needs to be updated at an interval of every 10 minutes.
I can query and get the results from API 1. This is what my code looks like:
import json
import requests
myToken = '52c32f6588004cb3ab33b0ff320b8e4f'
myUrl = 'https://api1.com/api/v1/devices.json'
head = {'Authorization': 'Token {}'.format(myToken)}
response = requests.get(myUrl, headers=head)
r = json.loads(response.content)
r
The payload looks like this from API 1
{ "device" : {
"id": 153,
"battery_status" : 61,
"serial_no": "5QBYGKUI05",
"location_lat": "-45.948917",
"location_lng": "29.832179",
"location_address": "800 Laurel Rd, Lansdale, PA 192522,USA"}
}
I want to be able to take this information and match by "serial_no" and update all the other pieces of information for the corresponding device in API 2
I query the data for API 2 and this is what my code looks like
params = {
"location":'cf6707e3-f0ae-4040-a184-737b21a4bbd1',
"dateAdded":'ge:11/23/2020'}
url = requests.get('https://api2.com/api/assets',auth=('api2', '123456'), params=params)
r = json.loads(url.content)
r['items']
The JSON payload looks like this
[{'id': '064ca857-3783-460e-a7a2-245e054dcbe3',
'name': 'Apple Laptop 1',
'model': {'id': '50f5993e-2abf-49c8-86e0-8743dd58db6f',
'name': 'MacBook Pro'},
'manufacturer': {'id': 'f56244e2-76e3-46da-97dd-f72f92ca0779',
'name': 'APPLE'},
'room': {'id': '700ff2dc-0118-46c6-936a-01f0fa88c620',
'name': 'Storage Room 1',
'thirdPartyId': ''},
'location': {'id': 'cf6707e3-f0ae-4040-a184-737b21a4bbd1',
'name': 'Iron Mountain',
'thirdPartyId': ''},
'position': 'NonMounted',
'containerAsset': {'id': '00000000-0000-0000-0000-000000000000',
'name': None},
'baseAsset': {'id': '064ca857-3783-460e-a7a2-245e054dcbe3',
'name': 'Apple Laptop 1'},
'description': None,
'status': {'id': 'df9906d8-2856-45e3-9cba-bd7a1ac4971f',
'name': 'Production'},
'serialNumber': '5QBYGKUI06',
'tagNumber': None,
'alternateTagNumber': None,
'verificationStatus': {'id': 'cb3560a9-eef5-47b9-b033-394d3a09db18',
'name': 'Verified'},
'requiresRFID': False,
'requiresHangTag': False,
'bottomPosition': 0.0,
'leftPosition': 0.0,
'rackPosition': 'Front',
'labelX': None,
'labelY': None,
'verifyNameInRear': False,
'verifySerialNumberInRear': False,
'verifyBarcodeInRear': False,
'isNonDataCenter': False,
'rotate': False,
'customer': {'id': '00000000-0000-0000-0000-000000000000', 'name': None},
'thirdPartyId': '',
'temperature': None,
'dateLastScanned': None,
'placement': 'Floor',
'lastScannedLabelX': None,
'lastScannedLabelY': None,
'userDefinedValues': [{'userDefinedKeyId': '79e77a1e-4030-4308-a8ff-9caf40c04fbd',
'userDefinedKeyName': 'Longitude ',
'value': '-75.208917'},
{'userDefinedKeyId': '72c8056e-9b7d-40ac-9270-9f5929097e82',
'userDefinedKeyName': 'Address',
'value': '800 Laurel Rd, New York ,NY 19050, USA'},
{'userDefinedKeyId': '31aeeb91-daef-4364-8dd6-b0e3436d6a51',
'userDefinedKeyName': 'Battery Level',
'value': '67'},
{'userDefinedKeyId': '22b7ce4f-7d3d-4282-9ecb-e8ec2238acf2',
'userDefinedKeyName': 'Latitude',
'value': '35.932179'}]}
The documentation provided by API 2 tells me they only support PUT for updates as of right now, but I would also like to know how I would do this using PATCH, as it will be available in the future. So the data payload that I need to successfully PUT is this:
payload = {'id': '064ca857-3783-460e-a7a2-245e054dcbe3',
'name': 'Apple Laptop 1',
'model': {'id': '50f5993e-2abf-49c8-86e0-8743dd58db6f',
'name': 'MacBook Pro'},
'manufacturer': {'id': 'f56244e2-76e3-46da-97dd-f72f92ca0779',
'name': 'APPLE'},
'room': {'id': '700ff2dc-0118-46c6-936a-01f0fa88c620',
'name': 'Storage Room 1',
'thirdPartyId': ''},
'status': {'id': 'df9906d8-2856-45e3-9cba-bd7a1ac4971f',
'name': 'Production'},
'serialNumber': '5QBYGKUI06',
'verificationStatus': {'id': 'cb3560a9-eef5-47b9-b033-394d3a09db18',
'name': 'Verified'},
'requiresRFID': 'False',
'requiresHangTag': 'False',
'userDefinedValues': [{'userDefinedKeyId': '79e77a1e-4030-4308-a8ff-9caf40c04fbd',
'userDefinedKeyName': 'Longitude ',
'value': '-75.248920'},
{'userDefinedKeyId': '72c8056e-9b7d-40ac-9270-9f5929097e82',
'userDefinedKeyName': 'Address',
'value': '801 Laurel Rd, New York, Ny 192250, USA'},
{'userDefinedKeyId': '31aeeb91-daef-4364-8dd6-b0e3436d6a51',
'userDefinedKeyName': 'Battery Level',
'value': '67'},
{'userDefinedKeyId': '22b7ce4f-7d3d-4282-9ecb-e8ec2238acf2',
'userDefinedKeyName': 'Latitude',
'value': '29.782177'}]}
So a part of this is figuring out how I can query the portions of the JSON data that I need for the update.
I am able to update the information using this line:
requests.put('https://api2.com/api/assets/064ca857-3783-460e-a7a2-245e054dcbe3',auth=('API2', '123456'), data=json.dumps(payload))
but I need it to update dynamically, so I don't think the hard-coded id parameter in that line will work from an automation/efficiency standpoint. If anybody has any ideas, or resources to point me in the right direction to learn more about this process (I don't really know what it is even called), that would be greatly appreciated.
Not entirely sure what you are trying to do here, but if you want to pull information nested in the responses you can do this.
Serial number from API 1
r['device']['serial_no']
Serial number for API 2
either r[0]['serialNumber'] or r['items'][0]['serialNumber'] depending on what you are showing
To modify the payload serial number, for example
payload['serialNumber'] = '123456abcdef'
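Building on the lookups above, here is a rough sketch of the matching step. Several things are assumptions rather than facts from the question: the sample asset is trimmed and its serial number is changed so that it matches the device, the mapping from API 1 fields to the 'userDefinedKeyName' entries is a guess, and sending back a modified copy of the whole asset may not be what API 2 expects. The function name build_payload is hypothetical.

```python
import copy
import json

def build_payload(device, asset):
    """Return a copy of the API 2 asset with values taken from the API 1 device."""
    # Assumed mapping between API 1 fields and API 2 user-defined key names.
    new_values = {
        "Latitude": device["location_lat"],
        "Longitude": device["location_lng"],
        "Address": device["location_address"],
        "Battery Level": str(device["battery_status"]),
    }
    payload = copy.deepcopy(asset)  # leave the original asset untouched
    for entry in payload["userDefinedValues"]:
        # strip() because the question shows 'Longitude ' with a trailing space
        name = entry["userDefinedKeyName"].strip()
        if name in new_values:
            entry["value"] = new_values[name]
    return payload

# Trimmed, hypothetical sample data in the shapes shown in the question.
device = {"id": 153, "battery_status": 61, "serial_no": "5QBYGKUI05",
          "location_lat": "-45.948917", "location_lng": "29.832179",
          "location_address": "800 Laurel Rd, Lansdale, PA 192522,USA"}
assets = [{"id": "064ca857-3783-460e-a7a2-245e054dcbe3",
           "serialNumber": "5QBYGKUI05",
           "userDefinedValues": [
               {"userDefinedKeyName": "Longitude ", "value": "-75.208917"},
               {"userDefinedKeyName": "Battery Level", "value": "67"}]}]

# Index the API 2 assets by serial number so each device lookup is a dict access.
by_serial = {asset["serialNumber"]: asset for asset in assets}
match = by_serial.get(device["serial_no"])
payload = build_payload(device, match) if match is not None else None

if payload is not None:
    # The id comes from the matched asset instead of being hard-coded.
    url = "https://api2.com/api/assets/" + payload["id"]
    body = json.dumps(payload)
    # requests.put(url, auth=..., data=body) would send the update; omitted here.
    print(url)
```

Running this every 10 minutes is a separate scheduling concern (for example a cron job or a loop with a sleep), which this sketch does not cover.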

Accessing keys/values in a paginated/nested dictionary

I know that somewhat related questions have been asked here: Accessing key, value in a nested dictionary and here: python accessing elements in a dictionary inside dictionary among other places but I can't quite seem to apply the answers' methodology to my issue.
I'm getting a KeyError trying to access the keys within response_dict, which I know is due to it being nested/paginated and me going about this the wrong way. Can anybody help and/or point me in the right direction?
import requests
import json
URL = "https://api.constantcontact.com/v2/contacts?status=ALL&limit=1&api_key=<redacted>&access_token=<redacted>"
#make my request, store it in the requests object 'r'
r = requests.get(url = URL)
#status code to prove things are working
print (r.status_code)
#print what was retrieved from the API
print (r.text)
#visual aid
print ('---------------------------')
#decode json data to a dict
response_dict = json.loads(r.text)
#show how the API response looks now
print(response_dict)
#just for confirmation
print (type(response_dict))
print('-------------------------')
# HERE LIES THE ISSUE
print(response_dict['first_name'])
And my output:
200
{"meta":{"pagination":{}},"results":[{"id":"1329683950","status":"ACTIVE","fax":"","addresses":[{"id":"4e19e250-b5d9-11e8-9849-d4ae5275509e","line1":"222 Fake St.","line2":"","line3":"","city":"Kansas City","address_type":"BUSINESS","state_code":"","state":"OK","country_code":"ve","postal_code":"19512","sub_postal_code":""}],"notes":[],"confirmed":false,"lists":[{"id":"1733488365","status":"ACTIVE"}],"source":"Site Owner","email_addresses":[{"id":"1fe198a0-b5d5-11e8-92c1-d4ae526edd6c","status":"ACTIVE","confirm_status":"NO_CONFIRMATION_REQUIRED","opt_in_source":"ACTION_BY_OWNER","opt_in_date":"2018-09-11T18:18:20.000Z","email_address":"rsmith#fake.com"}],"prefix_name":"","first_name":"Robert","middle_name":"","last_name":"Smith","job_title":"I.T.","company_name":"FBI","home_phone":"","work_phone":"5555555555","cell_phone":"","custom_fields":[],"created_date":"2018-09-11T15:12:40.000Z","modified_date":"2018-09-11T18:18:20.000Z","source_details":""}]}
---------------------------
{'meta': {'pagination': {}}, 'results': [{'id': '1329683950', 'status': 'ACTIVE', 'fax': '', 'addresses': [{'id': '4e19e250-b5d9-11e8-9849-d4ae5275509e', 'line1': '222 Fake St.', 'line2': '', 'line3': '', 'city': 'Kansas City', 'address_type': 'BUSINESS', 'state_code': '', 'state': 'OK', 'country_code': 've', 'postal_code': '19512', 'sub_postal_code': ''}], 'notes': [], 'confirmed': False, 'lists': [{'id': '1733488365', 'status': 'ACTIVE'}], 'source': 'Site Owner', 'email_addresses': [{'id': '1fe198a0-b5d5-11e8-92c1-d4ae526edd6c', 'status': 'ACTIVE', 'confirm_status': 'NO_CONFIRMATION_REQUIRED', 'opt_in_source': 'ACTION_BY_OWNER', 'opt_in_date': '2018-09-11T18:18:20.000Z', 'email_address': 'rsmith#fake.com'}], 'prefix_name': '', 'first_name': 'Robert', 'middle_name': '', 'last_name': 'Smith', 'job_title': 'I.T.', 'company_name': 'FBI', 'home_phone': '', 'work_phone': '5555555555', 'cell_phone': '', 'custom_fields': [], 'created_date': '2018-09-11T15:12:40.000Z', 'modified_date': '2018-09-11T18:18:20.000Z', 'source_details': ''}]}
<class 'dict'>
-------------------------
Traceback (most recent call last):
File "C:\Users\rkiek\Desktop\Python WIP\Chris2.py", line 20, in <module>
print(response_dict['first_name'])
KeyError: 'first_name'
first_name = response_dict["results"][0]["first_name"]
Although I think you could answer this question yourself by reading some documentation, I will explain what is going on here. The dict for the man named "Robert" sits inside a list, which is the value under the key "results". So first you need to access the value of "results", which is a Python list.
Then you can use a loop to iterate through each of the elements within the list, and treat each individual element as a regular dictionary object.
results = response_dict["results"]
results = response_dict.get("results", None)
# Use either of the two lines above: the first raises a KeyError if there is no "results" key, the second returns None
# results is now a list, according to the data you posted
for item in results:
    print(item.get("first_name", None))
    # each item in the list can be treated as a normal dictionary
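The same loop can be written as a comprehension. This sketch uses a trimmed copy of the response from the question, and defaults to an empty list so that a response without a "results" key does not break the iteration:

```python
# Trimmed stand-in for the decoded response.
response_dict = {
    "meta": {"pagination": {}},
    "results": [
        {"id": "1329683950", "first_name": "Robert", "last_name": "Smith"},
    ],
}

# .get("results", []) yields an empty list when the key is missing,
# so the comprehension is safe to run either way.
first_names = [item.get("first_name") for item in response_dict.get("results", [])]
print(first_names)  # ['Robert']
```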

Accessing YAML data in Python

I have a YAML file that parses into an object, e.g.:
{'name': [{'proj_directory': '/directory/'},
{'categories': [{'quick': [{'directory': 'quick'},
{'description': None},
{'table_name': 'quick'}]},
{'intermediate': [{'directory': 'intermediate'},
{'description': None},
{'table_name': 'intermediate'}]},
{'research': [{'directory': 'research'},
{'description': None},
{'table_name': 'research'}]}]},
{'nomenclature': [{'extension': 'nc'},
{'handler': 'script'},
{'filename': [{'id': [{'type': 'VARCHAR'}]},
{'date': [{'type': 'DATE'}]},
{'v': [{'type': 'INT'}]}]},
{'data': [{'time': [{'variable_name': 'time'},
{'units': 'minutes since 1-1-1980 00:00 UTC'},
{'latitude': [{'variable_n...
I'm having trouble accessing the data in python and regularly see the error TypeError: list indices must be integers, not str
I want to be able to access all elements corresponding to 'name' so to retrieve each data field I imagine it would look something like:
import yaml
settings_stream = open('file.yaml', 'r')
settingsMap = yaml.safe_load(settings_stream)
yaml_stream = True
print 'loaded settings for: ',
for project in settingsMap:
    print project + ', ' + settingsMap[project]['project_directory']
and I would expect each element would be accessible via something like ['name']['categories']['quick']['directory']
and something a little deeper would just be:
['name']['nomenclature']['data']['latitude']['variable_name']
or am I completely wrong here?
The brackets, [], indicate that you have lists of dicts, not just a dict.
For example, settingsMap['name'] is a list of dicts.
Therefore, you need to select the correct dict in the list using an integer index, before you can select the key in the dict.
So, given your current data structure, you'd need to use:
settingsMap['name'][1]['categories'][0]['quick'][0]['directory']
Or, revise the underlying YAML data structure.
For example, if the data structure looked like this:
settingsMap = {
    'name': {
        'proj_directory': '/directory/',
        'categories': {
            'quick': {'directory': 'quick',
                      'description': None,
                      'table_name': 'quick'},
            'intermediate': {'directory': 'intermediate',
                             'description': None,
                             'table_name': 'intermediate'},
            'research': {'directory': 'research',
                         'description': None,
                         'table_name': 'research'}},
        'nomenclature': {
            'extension': 'nc',
            'handler': 'script',
            'filename': {'id': {'type': 'VARCHAR'},
                         'date': {'type': 'DATE'},
                         'v': {'type': 'INT'}},
            'data': {'time': {'variable_name': 'time',
                              'units': 'minutes since 1-1-1980 00:00 UTC'}}}}}
then you could access the same value as above with
settingsMap['name']['categories']['quick']['directory']
# quick
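If changing the YAML file is not an option, the loaded structure could instead be converted after parsing. The helper below is a hypothetical sketch (collapse is not part of PyYAML): it recursively merges every list of dicts into a single dict, so it assumes keys are not repeated within a list, since later duplicates would overwrite earlier ones. The sample is a trimmed copy of the structure from the question.

```python
def collapse(node):
    """Recursively merge lists of dicts into plain nested dicts."""
    if isinstance(node, dict):
        return {key: collapse(value) for key, value in node.items()}
    if isinstance(node, list) and node and all(isinstance(i, dict) for i in node):
        merged = {}
        for item in node:
            for key, value in item.items():
                merged[key] = collapse(value)  # a repeated key would be overwritten
        return merged
    if isinstance(node, list):
        return [collapse(i) for i in node]
    return node

# Trimmed copy of the parsed YAML shown in the question.
settingsMap = {
    'name': [
        {'proj_directory': '/directory/'},
        {'categories': [
            {'quick': [{'directory': 'quick'},
                       {'description': None},
                       {'table_name': 'quick'}]},
        ]},
    ]
}

flat = collapse(settingsMap)
print(flat['name']['categories']['quick']['directory'])  # quick
```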
