Generating specific fields from JSON objects

Generating specific fields from JSON objects - python

I have a rather massive JSON objects and I'm trying to generate a specific JSON using only certain elements from it.
My current code looks like this:
data = get.req("/v2/users")
data = json.loads(data)
print (type(data)) # This returns <class 'list'>
print (data) # Below for what this returns
all_data = []
for d in data:
login_value = d['login']
if login_value.startswith('fe'):
continue
s = get.req("/v2/users/" + str(login_value)) # Sending another request with each
# login from the first request
all_data.append(s)
print (all_data) # Below for what this looks like this
print (data) before json.loads is str for the information, it returns data like this:
[
{
"id": 68663,
"login": "test1",
"url": "https://x.com/test"
},
{
"id": 67344,
"login": "test2",
"url": "https://x.com/test"
},
{
"id": 66095,
"login": "hi",
"url": "https://x.com/test"
}
]
print (all_data) returns a similar result to this for every user a request was sent for the first time
[b'{"id":68663,"email":"x#gmail.com","login":"xg","phone":"hidden","fullname":"xg gx","image_url":"https://imgur.com/random.png","mod":false,"points":5,"activity":0,"groups":['skill': 'archery']}
And this repeats for every user.
What I'm attempting to do is filtering by a few fields from all those results I received, so the final JSON I have will look something like this
[
{
"email": "x#gmail.com",
"fullname": "xg gf",
"points": 5,
"image_url", "https://imgur.com/random.pmg"
},
{
... similar json for the next user and so on
}
]
I feel as if the way I'm iterating over the data might be inefficient, so if you could guide me to a better way it would be wonderful.

In order for you to fetch login value, you have to iterate over data atleast once and for fetching details for every user, you have to make one call and that is exactly what you have done.
After you receive the user details instead of appending the whole object to the all_data list just take the field you need and construct a dict of it and then append it to all_data.
So your code has time complexity of O(n) which is best I understand.
Edit :
For each user you are receiving a byte response like below.
byte_response = [ b'{"id":68663,"email":"x#gmail.com","login":"xg","phone":"hidden","fullname":"xg gx","image_url":"https://imgur.com/random.png","mod":false,"points":5,"activity":0,"groups":[]}']
I'm not sure why would you get a response in a list [], but if it like that then take byte_response[0] so that we have the actual byte data like below.
byte_response = b'{"id":68663,"email":"x#gmail.com","login":"xg","phone":"hidden","fullname":"xg gx","image_url":"https://imgur.com/random.png","mod":false,"points":5,"activity":0,"groups":[]}'
response_decoded = byte_response.decode("utf-8") #decode it
import json
json_object_in_dict_form = json.loads(response_decoded) #convert it into dictionary
and then...
json_object_in_dict_form['take the field u want']

you can write:
data = get.req("/v2/users")
data = json.loads(data)
all_data = []
for d in data:
...
s = get.req("/v2/users/" + str(login_value))
new_data = {
'email': s['email'],
'fullname': s['fullname'],
'points': s['points'],
'image_url': s['image_url']
}
all_data.append(new_data)
print (all_data)
or you can make it fancy using an array with the fields you need:
data = get.req("/v2/users")
data = json.loads(data)
all_data = []
fields = ['email', 'fullname', 'point', 'image_url']
for d in data:
...
s = get.req("/v2/users/" + str(login_value))
new_data = dict()
for field in fields:
new_data[field] = s[field]
all_data.append(new_data)
print (all_data)

Related

How to extract and print a list of values from my JSON data?

I'm working in Python (3.8) and I've successfully called an API gotten it to print the JSON within command line after running the Python file. Now, I want to be able to print a particular list of information (like all of the names from the JSON), and later on save that list as its own set of data, but I'm hitting a block.
Example JSON I'm working with:
{
"data": {
"employees": [
{
"fields": {
"name": "Buddy",
"superheroName": "Syndrome",
"workEmail": "syndrome#example.com",
}
},
{
"fields": {
"name": "Helen Parr",
"superheroName": "Elastigirl",
"workEmail": "elastigirl#example.com",
}
}
]
}
I’ve tried the following so far and I was able to get “data” to print, but anytime I try to print another “layer” and get to say...“employees” or “fields” even, I hit a wall.
url = "my API url"
response = requests.get(url)
if response.status_code != 200:
print('Error with status code {}'.format(response.status_code))
exit()
jsonResponse = response.json()
jsonPretty = json.dumps(jsonResponse, indent=4, sort_keys=True)
jsonDictionary = json.loads(jsonPretty)
keys = jsonDictionary.keys()
for key in jsonDictionary.keys():
print(key)
Ideally, could someone share insight into how I can access the 'name' JSON value and get Python to print it as a list like the following, for example:
Buddy
Helen Parr

JSON files are basically nested dictionaries. jsonDictionary only contains one key and one entry under that key: data and another dictionary with the rest your result respectively.
If you wanted to access the name fields specifically:
employeesDict = jsonDictionary['data']
feildsDictList = employeesDict['employees']
firstFieldsDict = fieldsDictList[0]
secondFieldsDict = fieldsDictList[1]
firstName = firstFieldsDict['name']
secondNAme = secondFieldsDict['name']

You can access it like this (make sure it's already a dictionary):
for i in h['data']['employees']:
print(i['fields']['name'])
This way you can access the names with i['fields']['name']

How can I filter API GET Request on multiple variables?

I am really struggling with this one. I'm new to python and I'm trying to extract data from an API.
I have managed to run the script below but I need to amend it to filter on multiple values for one column, lets say England and Scotland. Is there an equivelant to the SQL IN operator e.g. Area_Name IN ('England','Scotland').
from requests import get
from json import dumps
ENDPOINT = "https://api.coronavirus.data.gov.uk/v1/data"
AREA_TYPE = "nation"
AREA_NAME = "england"
filters = [
f"areaType={ AREA_TYPE }",
f"areaName={ AREA_NAME }"
]
structure = {
"date": "date",
"name": "areaName",
"code": "areaCode",
"dailyCases": "newCasesByPublishDate",
}
api_params = {
"filters": str.join(";", filters),
"structure": dumps(structure, separators=(",", ":")),
"latestBy": "cumCasesByPublishDate"
}
formats = [
"json",
"xml",
"csv"
]
for fmt in formats:
api_params["format"] = fmt
response = get(ENDPOINT, params=api_params, timeout=10)
assert response.status_code == 200, f"Failed request for {fmt}: {response.text}"
print(f"{fmt} data:")
print(response.content.decode())

I have tried the script, and dict is the easiest type to handle in this case.
Given your json data output
data = {"length":1,"maxPageLimit":1,"data":[{"date":"2020-09-17","name":"England","code":"E92000001","dailyCases":2788}],"pagination":{"current":"/v1/data?filters=areaType%3Dnation%3BareaName%3Dengland&structure=%7B%22date%22%3A%22date%22%2C%22name%22%3A%22areaName%22%2C%22code%22%3A%22areaCode%22%2C%22dailyCases%22%3A%22newCasesByPublishDate%22%7D&latestBy=cumCasesByPublishDate&format=json&page=1","next":null,"previous":null,"first":"/v1/data?filters=areaType%3Dnation%3BareaName%3Dengland&structure=%7B%22date%22%3A%22date%22%2C%22name%22%3A%22areaName%22%2C%22code%22%3A%22areaCode%22%2C%22dailyCases%22%3A%22newCasesByPublishDate%22%7D&latestBy=cumCasesByPublishDate&format=json&page=1","last":"/v1/data?filters=areaType%3Dnation%3BareaName%3Dengland&structure=%7B%22date%22%3A%22date%22%2C%22name%22%3A%22areaName%22%2C%22code%22%3A%22areaCode%22%2C%22dailyCases%22%3A%22newCasesByPublishDate%22%7D&latestBy=cumCasesByPublishDate&format=json&page=1"}}
You can try something like this:
countries = ['England', 'France', 'Whatever']
return [country for country in data where country['name'] in countries]
I presume the data list is the only interesting key in the data dict since all others do not have any meaningful values.

How do I convert my tuple into the format so that it is acceptable for the JSON format in Python

I currently have this method in python code :
#app.route('/getData', methods = ['GET'])
def get_Data():
c.execute("SELECT abstract,category,date,url from Data")
data = c.fetchall()
resp = jsonify(data)
resp.status_code = 200
return resp
The output I get from this is:
[
[
"2020-04-23 15:32:13",
"Space",
"https://www.bisnow.com/new-jersey",
"temp"
],
[
"2020-04-23 15:32:13",
"Space",
"https://www.bisnow.com/events/new-york",
"temp"
]
]
However, I want the output to look like this:
[
{
"abstract": "test",
"category": "journal",
"date": "12-02-2020",
"link": "www.google.com"
},
{
"abstract": "test",
"category": "journal",
"date": "12-02-2020",
"link": "www.google.com"
}
]
How do I convert my output into an expected format?

As #jonrsharpe indicates, you simply cannot expect the tuple coming from this database query to turn into a dictionary in the JSON output. Your data variable does not contain the information necessary to construct the response you desire.
It will depend on your database but my recommendation would be to find a way to retrieve dicts from your database query instead of tuples, in which case the rest of your code should work as is. For instance, for sqlite, you could define your cursor c like this:
import sqlite3
connection = sqlite3.connect('dbname.db') # database connection details here...
connection.row_factory = sqlite3.Row
c = connection.cursor()
Now, if your database for some reason cannot support a dictionary cursor, you need to roll your own dictionary after retrieving the database query results. For your example, something like this:
fieldnames = ('abstract', 'category', 'date', 'link')
numfields = len(fieldnames)
data = []
for row in c.fetchall():
for idx in range(0, numfields - 1):
dictrow[fields[idx]] = row[idx]
data.append(dictrow)
I iterate over a list of field labels, which do not have to match your database columns but do have to be in the same order, and creating a dict by pairing the label with the datum from the db tuple in the same position. This passage would replace the single line data = c.fetchall() in OP.

error in accessing a nested json within a list in a json in python

I have this sample json data that is coming as a response from a POST request- response is the name of below sample json data variable:
{
"statusCode": "OK",
"data": [
{
"id": -199,
"result": {
"title": "test1",
"group": "test_grp2"
}
},
{
"id": -201,
"result": {
"title": "test2",
"group": "test_grp2"
}
}
]
}
Now what I want to do is form a list of list where each list will have id,title,and group values from above json data. Here is what my code looks like:
def get_list():
# code to get the response from POST request
json_data = json.loads(response.text)
group_list = []
if json_data['statusCode'] == 'OK':
for data in json_data['data']:
print(data)
for result_data in data['result']:
title = result_data['title']
group = result_data['group']
group_list.append([data['id'],title,group])
return(group_list)
When I execute this I get error as list indices must be int not str at line title = result_data['title]. How can I form the list of list above?

data['result'] in your case is a dictionary. When you iterate over it using for result_data in data['result'], you get the keys - in other words, result_data is a string and it does not support string indexing.
Instead, you meant:
for data in json_data['data']:
result_data = data['result']
title = result_data['title']
group = result_data['group']
Or, alternatively, you can make use of a list comprehension:
group_list = [(data['id'], data['result']['title'], data['result']['group'])
for data in json_data['data']]
print(group_list)

You can extract the list of lists directly using a list comprehension:
>>> [[d['id'], d['result']['title'], d['result']['group']] for d in json_data['data']]
[[-199, 'test1', 'test_grp2'], [-201, 'test2', 'test_grp2']]

API calls in Python, 2 different JSON Strings

I'm trying to call API's from various cryptocurrency exchanges in PYTHON.
This is the API JSON-string that gets returned by the following URL (https://api.mintpal.com/v1/market/stats/uro/BTC)
[
{
"market_id": "210",
"coin": "Uro",
"code": "URO",
"exchange": "BTC",
"last_price": "0.00399700",
"yesterday_price": "0.00353011",
"change": "+13.23",
"24hhigh": "0.00450000",
"24hlow": "0.00353010",
"24hvol": "6.561",
"top_bid": "0.00374001",
"top_ask": "0.00399700"
}
]
I'm interested in getting the "Last Price", I print it using the following code.
import urllib2
import json
url = 'https://api.mintpal.com/v1/market/stats/URO/BTC'
json_obj = urllib2.urlopen(url)
URO_data = json.load(json_obj)
for item in URO_data:
URO_last_price = item['last_price']
print URO_last_price
So far so good. However, I'm trying to do the same thing for the Bittrex exchange using the following URL (https://bittrex.com/api/v1.1/public/getmarketsummary?market=btc-uro)
The JSON String returned looks as follows:
{
"success": true,
"message": "",
"result": [
{
"MarketName": "BTC-URO",
"High": 0.00479981,
"Low": 0.00353505,
"Volume": 30375.93454693,
"Last": 0.00391656,
"BaseVolume": 120.61056568,
"TimeStamp": "2014-07-29T17:54:35.897",
"Bid": 0.00393012,
"Ask": 0.00395967,
"OpenBuyOrders": 182,
"OpenSellOrders": 182,
"PrevDay": 0.00367999,
"Created": "2014-05-15T05:46:29.917"
}
]
}
This JSON string has a different structure then the one before, and I can't use my first code to get the value "LAST". However, I can work around it by printing 'string index', but that's not a solution.
url = 'https://bittrex.com/api/v1.1/public/getticker?market=btc-uro'
json_obj = urllib2.urlopen(url)
URO_data = json.load(json_obj)
URO_String = str(URO_data)
last_price = URO_String[79:89]
URO_LastPrice = float(last_price)
print last_price
I want to get the value of "Last" in the second JSON string.

The Json is best thought of as a bunch of layers, and you need to get the information you need by going layer by layer.
The first layer you need to get is "result" with:
URO_data["result"]
REsult is an array, so you need to get the first item in the array by index
URO_data["result"][0]
From that object, you need to get the Last entry
URO_data["result"][0]["Last"]
That should work, but I haven't tested it.

You can access it as bellow,
>>>
>>> url = 'https://bittrex.com/api/v1.1/public/getticker?market=btc-uro'
>>> json_obj = urllib2.urlopen(url)
>>> URO_data = json.load(json_obj)
>>> URO_data['result']['Last']
0.00396129
>>>

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Generating specific fields from JSON objects - python

Related

How to extract and print a list of values from my JSON data?

How can I filter API GET Request on multiple variables?

How do I convert my tuple into the format so that it is acceptable for the JSON format in Python

error in accessing a nested json within a list in a json in python

API calls in Python, 2 different JSON Strings

Categories

Resources