Extract values from JSON using Python [duplicate] - python

I'm trying to extract values from the following JSON and display each device on its own row, in the format below. Any ideas on how to do it? Thanks in advance.
device1 : 232323 Completed
device2 : 345848 Completed etc
I tried the code below, which gives me the following output:
[{'cid': 631084346, 'data': [[{'text': 'device1'}], [{'text': '732536:Completed.'}], [{'text': '1'}]], 'id': 1750501182}, {'cid': 1891684718, 'data': [[{'text': 'device2'}], [{'text': '732536:Completed.'}], [{'text': '1'}]], 'id': 2218910703}
api_readable = api_response.json()
computer_name = str(api_readable['data']['result_sets'][0]['rows'])
# computer_name = re.sub("[{},':]", "", computer_name)
print(computer_name)
Sample JSON used:
{
"data": {
"max_available_age": "",
"now": "2020/01/09 22:43:58 GMT-0000",
"result_sets": [{
"age": 0,
"archived_question_id": 0,
"columns": [{
"hash": 3409330187,
"name": "Computer Name",
"type": 1
},
{
"hash": 1792443391,
"name": "Action Statuses",
"type": 1
},
{
"hash": 0,
"name": "Count",
"type": 3
}
],
"error_count": 0,
"estimated_total": 129,
"expire_seconds": 3660,
"filtered_row_count": 11,
"filtered_row_count_machines": 11,
"id": 2880628,
"issue_seconds": 0,
"item_count": 11,
"mr_passed": 129,
"mr_tested": 129,
"no_results_count": 0,
"passed": 128,
"question_id": 2880628,
"report_count": 233,
"row_count": 11,
"row_count_machines": 11,
"rows": [{
"cid": 631084346,
"data": [
[{
"text": "device1"
}],
[{
"text": "732536:Completed."
}],
[{
"text": "1"
}]
],
"id": 1750501182
},
{
"cid": 1891684718,
"data": [
[{
"text": "device2"
}],
[{
"text": "732536:Completed."
}],
[{
"text": "1"
}]
],
"id": 2218910703
}

api_readable['data']['result_sets'][0]['rows'] is a list, so you can simply loop over it and print the items you want.
for row in api_readable['data']['result_sets'][0]['rows']:
    print(f"{row['data'][0][0]['text']} : {row['data'][1][0]['text']}")
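The loop above prints the status text exactly as the API returns it ("732536:Completed."). A minimal sketch of getting closer to the "device1 : 232323 Completed" layout from the question follows. The helper name format_rows is hypothetical, and it assumes the second cell always has the form "<id>:<status>." as in the sample.

```python
def format_rows(rows):
    """Return one 'name : id status' line per row of the result set."""
    lines = []
    for row in rows:
        name = row["data"][0][0]["text"]
        status_text = row["data"][1][0]["text"]  # e.g. "732536:Completed."
        # Split on the first colon only; drop the trailing period of the status.
        action_id, _, status = status_text.partition(":")
        lines.append(f"{name} : {action_id} {status.rstrip('.')}".strip())
    return lines


# Rows copied from the sample output in the question.
rows = [
    {"cid": 631084346,
     "data": [[{"text": "device1"}], [{"text": "732536:Completed."}], [{"text": "1"}]],
     "id": 1750501182},
    {"cid": 1891684718,
     "data": [[{"text": "device2"}], [{"text": "732536:Completed."}], [{"text": "1"}]],
     "id": 2218910703},
]

for line in format_rows(rows):
    print(line)
```

With a live response, the same function would be called as format_rows(api_readable['data']['result_sets'][0]['rows']).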

Related

How to handle JSON list value in dataframe

I receive this json from an API call:
data = {'List': [{'id': 12403,
'name': 'myname',
'code': 'mycode',
'description': '',
'createdBy': '',
'createdDate': '24-Jun-2008 15:03:59 CDT',
'lastModifiedBy': '',
'lastModifiedDate': '24-Jun-2008 15:03:59 CDT'}]}
I want to handle this data and move it into a dataframe. When I attempt this with json_normalize it's basically putting my list value into a single cell in my dataframe.
My attempt:
import pandas as pd
df = pd.json_normalize(data)
Current output:
List
0 [{'id': 12403, 'name': 'myname', 'code': 'mycode...
Desired output:
Question
What's the best way to work with a list value from JSON to pandas dataframe?
Update
{
"Count": 38,
"Items": [
{
"Actions": [
"edit_",
"remove_",
"attachments_",
"cancel",
"continue",
"auditTrail",
"offline_",
"changeUser",
"linkRecord",
"resendNotification"
],
"Columns": [
{
"Label": "Workflow Name",
"Name": "__WorkflowName__",
"Value": "VOAPTSQA00000735"
},
{
"Label": "Workflow Description",
"Name": "__WorkflowDescription__",
"Value": "Vendor Outsourcing Contract Request (APTSQA | SAP Integration)"
},
{
"Label": "Current Assignee",
"Name": "__CurrentAssignee__",
"Value": "Vendor Outsourcing Integration User"
},
{
"Label": "Last Updated",
"Name": "__DateLastUpdated__",
"Value": "9/7/2022 12:22:14 PM"
},
{
"Label": "Created",
"Name": "__DateCreated__",
"Value": "9/7/2022 12:20:55 PM"
},
{
"Label": "Date Signed",
"Name": "__DateSigned__",
"Value": ""
},
{
"Label": "Completed",
"Name": "__DateCompleted__",
"Value": ""
},
{
"Label": "Status",
"Name": "__Status__",
"Value": "In RFP"
},
{
"Label": "Document ID",
"Name": "__DocumentIdentifier__",
"Value": ""
},
{
"Label": "End Date",
"Name": "__EndDate__",
"Value": "12/31/2033 12:00:00 AM"
},
{
"Label": "Stage Progress",
"Name": "__FormProgress__",
"Value": "0"
},
{
"Label": "Next Signer",
"Name": "__NextSigner__",
"Value": ""
}
],
"ResultSetId": "784a1b83-4d83-4b80-87a3-9c1293baa7d8",
"TaskId": "784a1b83-4d83-4b80-87a3-9c1293baa7d8",
"TokenId": "cdd53c33-803d-4a63-9abd-47b733b55e89"
}
Adding context for my comment about a nested list of key-value pairs: when I normalize this JSON, the whole Columns list ends up as a single value in one cell.
The values of interest are under the List key, so select that key first:
df = pd.json_normalize(data['List'])
Output:
      id    name    code description createdBy               createdDate lastModifiedBy          lastModifiedDate
0  12403  myname  mycode                        24-Jun-2008 15:03:59 CDT                 24-Jun-2008 15:03:59 CDT
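For the nested Columns list shown in the update, one possible approach is to pivot each item's Label/Name/Value entries into a flat dictionary before building the dataframe. The sketch below uses only plain Python; the helper name columns_to_row is hypothetical, the sample is a shortened copy of the update, and it assumes every entry in Columns has Name and Value keys.

```python
def columns_to_row(item):
    """Flatten one item: each Columns entry becomes a Name -> Value pair."""
    row = {col["Name"]: col["Value"] for col in item["Columns"]}
    row["TaskId"] = item["TaskId"]  # keep an identifier next to the pivoted values
    return row


# Shortened copy of the payload from the update (two of the twelve columns).
response = {
    "Count": 38,
    "Items": [
        {
            "Actions": ["edit_", "remove_"],
            "Columns": [
                {"Label": "Workflow Name", "Name": "__WorkflowName__",
                 "Value": "VOAPTSQA00000735"},
                {"Label": "Status", "Name": "__Status__", "Value": "In RFP"},
            ],
            "ResultSetId": "784a1b83-4d83-4b80-87a3-9c1293baa7d8",
            "TaskId": "784a1b83-4d83-4b80-87a3-9c1293baa7d8",
            "TokenId": "cdd53c33-803d-4a63-9abd-47b733b55e89",
        }
    ],
}

rows = [columns_to_row(item) for item in response["Items"]]
print(rows)
```

The resulting list of flat dictionaries can then be passed to pd.DataFrame(rows), which gives one row per item and one column per Name.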

Continuous API data stream

I'm new to Python and I've dived into web API data collection, mainly in relation to my work. I'm trying to read data (in JSON format) from a wireless network and save it on the go. My code so far looks like this:
import requests
import json
apiKey = "<API_KEY_HERE>"
f = open('wifi_data.json','a')
s = requests.Session()
s.headers = {'X-API-Key': apiKey}
r = s.get('<URL_HERE>', stream=True)
for line in r.iter_lines():
    if line:
        f.write(str(json.dumps(json.loads(line), indent=4, sort_keys=True)))
My problem is that this generates multiple JSON objects/dictionaries written back to back, and I can't figure out how to save them in proper JSON format so I can further process and analyse the data.
My JSON looks like this:
{
"eventType": "KEEP_ALIVE",
"partnerTenantId": "",
"recordTimestamp": 1651149179364,
"recordUid": "event-f0c01f8e",
"spacesTenantId": "",
"spacesTenantName": ""
}{
"eventType": "IOT_TELEMETRY",
"iotTelemetry": {
"detectedPosition": {
"confidenceFactor": 48.0,
"lastLocatedTime": 1651149175000,
"latitude": 0.0,
"locationId": "location-fa765bec",
"longitude": 0.0,
"mapId": "",
"xPos": 19.6,
"yPos": 70.3
},
"deviceInfo": {
"companyId": "4c00",
"deviceId": "04:ee:03:53:60:4a",
"deviceMacAddress": "04:ee:03:53:60:4a",
"deviceName": "",
"deviceType": "IOT_BLE_DEVICE",
"firmwareVersion": "",
"group": [],
"label": "",
"manufacturer": "",
"rawDeviceId": "",
"serviceUuid": ""
},
"deviceRtcTime": -1,
"iBeacon": {
"advertizedTxPower": -70,
"beaconMacAddress": "04:ee:03:53:60:4a",
"major": 83,
"minor": 24650,
"uuid": "20cae8a0a9cf11e3a5e20800200c9a66"
},
"location": {
"apCount": 0,
"inferredLocationTypes": [
"CMXZONE"
],
"locationId": "location-fa765bec",
"name": "Reception",
"parent": {
"apCount": 8,
"floorNumber": 1,
"inferredLocationTypes": [
"FLOOR"
],
"locationId": "location-549356d4",
"name": "DNA Spaces Lab",
"parent": {
"apCount": 8,
"inferredLocationTypes": [
"NETWORK",
"BUILDING"
],
"locationId": "location-2f64620f",
"name": "SJC-19",
"parent": {
"apCount": 8,
"inferredLocationTypes": [
"CAMPUS"
],
"locationId": "location-36f8282c",
"name": "DNASpacesLab",
"parent": {
"apCount": 12,
"inferredLocationTypes": [
"ROOT"
],
"locationId": "location-65fae68e",
"name": "DNASpacesLAB",
"sourceLocationId": ""
},
"sourceLocationId": "f0918a66-8394-4f3b-ae7e-27fdca30da43"
},
"sourceLocationId": "c957ba87-4502-4ae9-81f3-246629a82711"
},
"sourceLocationId": "bb49a8cc-069a-4017-a1ce-c1eb9689eaa2"
},
"sourceLocationId": "83fbf9ed-f601-44c6-a60d-150ba5e9b149"
},
"maxDetectedRssi": -77,
"rawHeader": 0,
"rawPayload": "AgEGGv9MAAIVIMrooKnPEeOl4ggAIAyaZgBTYEq6",
"sequenceNum": 0
},
"partnerTenantId": "dnaspaceslab",
"recordTimestamp": 1651149179370,
"recordUid": "event-3039fb26",
"spacesTenantId": "spaces-tenant-464026d0",
"spacesTenantName": "DNASpacesLAB"
}
As you can see, this is not proper JSON, but I'm unsure how to format it correctly while the data streams from the API.
Any help is appreciated.
Thanks
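One common way to store a stream like this is the JSON Lines layout: each event is written as a single compact JSON object on its own line, so the file can be appended to indefinitely and read back line by line. The sketch below is only an illustration under that assumption. The helper names append_events and read_events are hypothetical, and a list of byte strings stands in for r.iter_lines() so the example runs without network access.

```python
import io
import json


def append_events(lines, out):
    """Write each non-empty raw line as one compact JSON object per line."""
    count = 0
    for line in lines:
        if not line:  # skip empty lines, as in the original loop
            continue
        event = json.loads(line)
        # No indent: one object per line is what makes the file parseable later.
        out.write(json.dumps(event, sort_keys=True) + "\n")
        count += 1
    return count


def read_events(source):
    """Parse a JSON Lines source back into a list of dictionaries."""
    return [json.loads(line) for line in source if line.strip()]


# Stand-in for r.iter_lines(); the values are taken from the sample events.
raw_lines = [
    b'{"eventType": "KEEP_ALIVE", "recordUid": "event-f0c01f8e"}',
    b"",
    b'{"eventType": "IOT_TELEMETRY", "recordUid": "event-3039fb26"}',
]

buffer = io.StringIO()  # a real run would use open('wifi_data.jsonl', 'a')
written = append_events(raw_lines, buffer)
buffer.seek(0)
events = read_events(buffer)
print(written, [event["eventType"] for event in events])
```

The trade-off is that the file as a whole is not one JSON document, so it has to be read line by line (as read_events does) rather than with a single json.load call.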

How to extract JSON from a nested JSON file?

I am calling an API and getting a response like the below.
{
"status": 200,
"errmsg": "OK",
"data": {
"total": 12,
"items": [{
"id": 11,
"name": "BBC",
"priority": 4,
"levelStr": "All",
"escalatingChainId": 3,
"escalatingChain": {
"inAlerting": false,
"throttlingAlerts": 20,
"enableThrottling": true,
"name": "Example123",
"destination": [],
"description": "",
"ccdestination": [],
"id": 3,
"throttlingPeriod": 10
}
},
{
"id": 21,
"name": "CNBC",
"priority": 4,
"levelStr": "All",
"escalatingChainId": 3,
"escalatingChain": {
"inAlerting": false,
"throttlingAlerts": 20,
"enableThrottling": true,
"name": "Example456",
"destination": [],
"description": "",
"ccdestination": [],
"id": 3,
"throttlingPeriod": 10
}
}
]
}
}
I need to clean up this JSON a bit and produce a simpler JSON like the one below, where escalatingChainName is the name in the escalatingChain object, so that I can write it to a CSV file.
{
"items": [{
"id": 11,
"name": "BBC",
"priority": 4,
"levelStr": "All",
"escalatingChainId": 3,
"escalatingChainName": "Example123"
},
{
"id": 21,
"name": "CNBC",
"priority": 4,
"levelStr": "All",
"escalatingChainId": 3,
"escalatingChainName": "Example456"
}
]
}
Is there a JSON function that I can use to copy only the necessary key-value or nested key-values to a new JSON object?
With the below code, I am able to get the details list.
json_response = response.json()
items = json_response['data']
details = items['items']
I can print individual list items using
for x in details:
    print(x)
How do I take it from here to pull only the necessary fields, like id, name, priority and the name from escalatingChain, to create a new list or JSON?
There is no existing function that will do what you want, so you'll need to write one. Fortunately that's not too hard in this case — basically you just create a list of new items by extracting the pieces of data you want from the existing ones.
import json
json_response = """\
{
"status": 200,
"errmsg": "OK",
"data": {
"total": 12,
"items": [{
"id": 11,
"name": "BBC",
"priority": 4,
"levelStr": "All",
"escalatingChainId": 3,
"escalatingChain": {
"inAlerting": false,
"throttlingAlerts": 20,
"enableThrottling": true,
"name": "Example123",
"destination": [],
"description": "",
"ccdestination": [],
"id": 3,
"throttlingPeriod": 10
}
},
{
"id": 21,
"name": "CNBC",
"priority": 4,
"levelStr": "All",
"escalatingChainId": 3,
"escalatingChain": {
"inAlerting": false,
"throttlingAlerts": 20,
"enableThrottling": true,
"name": "Example456",
"destination": [],
"description": "",
"ccdestination": [],
"id": 3,
"throttlingPeriod": 10
}
}
]
}
}
"""
response = json.loads(json_response)
cleaned = []
for item in response['data']['items']:
    cleaned.append({'id': item['id'],
                    'name': item['name'],
                    'priority': item['priority'],
                    'levelStr': item['levelStr'],
                    'escalatingChainId': item['escalatingChainId'],
                    'escalatingChainName': item['escalatingChain']['name']})
print('cleaned:')
print(json.dumps(cleaned, indent=4))
You can try:
data = {
"status": 200,
"errmsg": "OK",
"data": {
"total": 12,
"items": [{
"id": 11,
"name": "BBC",
"priority": 4,
"levelStr": "All",
"escalatingChainId": 3,
"escalatingChain": {
"inAlerting": False,
"throttlingAlerts": 20,
"enableThrottling": True,
"name": "Example123",
"destination": [],
"description": "",
"ccdestination": [],
"id": 3,
"throttlingPeriod": 10
}
},
{
"id": 21,
"name": "CNBC",
"priority": 4,
"levelStr": "All",
"escalatingChainId": 3,
"escalatingChain": {
"inAlerting": False,
"throttlingAlerts": 20,
"enableThrottling": True,
"name": "Example456",
"destination": [],
"description": "",
"ccdestination": [],
"id": 3,
"throttlingPeriod": 10
}
}
]
}
}
for single_item in data["data"]["items"]:
    print(single_item["id"])
    print(single_item["name"])
    print(single_item["priority"])
    print(single_item["levelStr"])
    print(single_item["escalatingChain"]["inAlerting"])
    # and so on
There are two ways of approaching this with Python list and dictionary comprehensions, depending on whether you're dealing with a variable or a .json file.
Where the data variable is an already-defined (nested) dictionary:
# keys you want
to_keep = ['id', 'name', 'priority', 'levelStr', 'escalatingChainId',
           'escalatingChainName']
new_data = [{k: v for k, v in low_dict.items() if k in to_keep}
            for low_dict in data['data']['items']]
# for each item (k), turn every nested dictionary (key v) into a '<key>Name' entry holding its 'name'
escalations = [{v + 'Name': k[v]['name']} for k in data['data']['items']
               for v in k if type(k[v]) == dict]
# merge both lists of dictionaries to produce a flattened list of dictionaries
new_data = [{**new, **escl} for new, escl in zip(new_data, escalations)]
Or (since you refer to the json package), if you have saved the response as a .json file:
import json
with open('response.json', 'r') as handl:
    data = json.load(handl)
to_keep = ['id', 'name', 'priority', 'levelStr', 'escalatingChainId',
           'escalatingChainName']
new_data = [{k: v for k, v in low_dict.items() if k in to_keep}
            for low_dict in data['data']['items']]
escalations = [{v + 'Name': k[v]['name']} for k in data['data']['items']
               for v in k if type(k[v]) == dict]
new_data = [{**new, **escl} for new, escl in zip(new_data, escalations)]
Both produce output:
[{'id': 11,
'name': 'BBC',
'priority': 4,
'levelStr': 'All',
'escalatingChainId': 3,
'escalatingChainName': 'Example123'},
{'id': 21,
'name': 'CNBC',
'priority': 4,
'levelStr': 'All',
'escalatingChainId': 3,
'escalatingChainName': 'Example456'}]
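Since the stated goal is a CSV file, the flattened list can be handed to csv.DictWriter from the standard library. The following is a minimal sketch: the helper name write_items_csv is hypothetical, and an in-memory buffer stands in for a real file so the result can be inspected directly.

```python
import csv
import io

FIELDS = ["id", "name", "priority", "levelStr",
          "escalatingChainId", "escalatingChainName"]


def write_items_csv(items, out):
    """Write a header row plus one CSV row per flattened item."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(items)


# The flattened output shown above.
cleaned = [
    {"id": 11, "name": "BBC", "priority": 4, "levelStr": "All",
     "escalatingChainId": 3, "escalatingChainName": "Example123"},
    {"id": 21, "name": "CNBC", "priority": 4, "levelStr": "All",
     "escalatingChainId": 3, "escalatingChainName": "Example456"},
]

buffer = io.StringIO()
write_items_csv(cleaned, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, the buffer can be replaced with a file opened as open('items.csv', 'w', newline=''), where the file name is only an example.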

Adding additional information to aggregated results

Below is an example of what I'm trying, but I'm having trouble figuring out the $addFields part.
I'd like to add the created_at and amount fields from the entry in $$items that matches the $expr in the pipeline. Currently it returns every created_at in $$items.
User model
class User(BaseModel):
    __schema__ = DictField(dict(
        user_id=StringField(required=True),
        name=StringField(required=True),
        item_list=ListField(default=[])
    ))
class InventoryAggregateManipulator(Manipulator):
    def transform_outgoing(self, doc, model):
        cur = User.aggregate([
            {"$lookup": {
                "from": "item",
                "let": {"items": "$item_list"},
                "pipeline": [
                    {"$match": {"$expr": {"$in": ["$_id", "$$items.id"]}}},
                    {"$project": {"name": "$name", 'rate': "$rate", "payout": "$payout"}},
                    {"$addFields": {"last_run": "$$items.created_at"}}
                ],
                "as": "inventory"
            }},
            {"$match": {"_id": doc['_id']}}
        ])
        for doc in cur:
            return doc
example item collection:
[
{
"_id": 1,
"name": "some_item",
"rate": 60,
"payout": 15
},
{
"_id": 2,
"name": "another_item",
"rate": 30,
"payout": 20
}
]
example user:
{
'_id': 1,
'user_id': 1234,
'name': 'user123',
'item_list':[
{
"id": 1,
"created_at": datetime,
"amount": 3
},
{
"id": 2,
"created_at": datetime,
"amount": 5
}
]
}
expected result
{
'user_id': 1234,
'name': 'user123',
'item_list':[
{
"id": 1,
"created_at": datetime,
"amount": 3
},
{
"id": 2,
"created_at": datetime,
"amount": 5
}
],
'inventory':[
{
"name": "some_item",
"rate": 60,
"payout": 15,
"last_run": [datetime from item_list],
"amount": [amount from item_list]
},
{
"name": "another_item",
"rate": 30,
"payout": 20,
"last_run": [datetime from item_list],
"amount": [amount from item_list]
}
]
}
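To make the expected result concrete, here is a small sketch that performs the same join in plain Python after the documents have been fetched. It does not fix the $addFields stage itself; it only illustrates the per-item matching the pipeline is meant to do. The function name attach_item_details is hypothetical, and fixed datetime values stand in for the datetime placeholders in the example.

```python
from datetime import datetime


def attach_item_details(user, items):
    """Pair each item document with its matching entry in user['item_list']."""
    by_id = {entry["id"]: entry for entry in user["item_list"]}
    inventory = []
    for item in items:
        entry = by_id.get(item["_id"])
        if entry is None:  # item not owned by this user
            continue
        inventory.append({
            "name": item["name"],
            "rate": item["rate"],
            "payout": item["payout"],
            "last_run": entry["created_at"],  # only the matching entry's value
            "amount": entry["amount"],
        })
    return inventory


# Sample data mirroring the example collections above.
items = [
    {"_id": 1, "name": "some_item", "rate": 60, "payout": 15},
    {"_id": 2, "name": "another_item", "rate": 30, "payout": 20},
]
user = {
    "_id": 1,
    "user_id": 1234,
    "name": "user123",
    "item_list": [
        {"id": 1, "created_at": datetime(2020, 1, 1), "amount": 3},
        {"id": 2, "created_at": datetime(2020, 1, 2), "amount": 5},
    ],
}

inventory = attach_item_details(user, items)
print(inventory)
```

Each inventory entry ends up with a single last_run and amount taken from the item_list entry whose id equals the item's _id, which is the shape described in the expected result.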

Merging two json files using python

I'm new to Python and I want to merge two JSON files.
There should not be any duplicates:
if the value and name are the same, I will add the keys together and keep a single record; otherwise, I will keep the record as is.
File 1:
[ {
"key": 1,
"name": "test",
"value": "NY"
},
{
"key": 1,
"name": "test",
"value": "CA"
},
{
"key": 1,
"name": "test",
"value": "MA"
},
{
"key": 1,
"name": "test",
"value": "MA"
}
]
File 2:
[ {
"key": 1,
"name": "test",
"value": "NJ"
},
{
"key": 1,
"name": "test",
"value": "CA"
},
{
"key": 1,
"name": "test",
"value": "TX"
},
{
"key": 1,
"name": "test",
"value": "MA"
}
]
and the merged file output should be:
[
{
"key": 1,
"name": "test",
"value": "NY"
},
{
"key": 3,
"name": "test",
"value": "MA"
},
{
"key": 1,
"name": "test",
"value": "NJ"
},
{
"key": 2,
"name": "test",
"value": "CA"
},
{
"key": 1,
"name": "test",
"value": "TX"
}
]
order of the record does not matter.
I have tried several approaches, like merging the files and then iterating over them, or parsing both files separately, but I'm facing issues, being new to Python.
This should help.
# -*- coding: utf-8 -*-
f1 = [ {
"key": 1,
"value": "NY"
},
{
"key": 1,
"value": "CA"
},
{
"key": 1,
"value": "MA"
}
]
f2 = [ {
"key": 1,
"value": "NJ"
},
{
"key": 1,
"value": "CA"
},
{
"key": 1,
"value": "TX"
}
]
check = [i["value"] for i in f1]  # values that already exist in f1
for i in f2:
    if i['value'] not in check:
        f1.append(i)
print(f1)
Output:
[{'value': 'NY', 'key': 1}, {'value': 'CA', 'key': 1}, {'value': 'MA', 'key': 1}, {'value': 'NJ', 'key': 1}, {'value': 'TX', 'key': 1}]
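The answer above drops duplicates but leaves the keys unchanged, whereas the question asks for the keys of matching records to be added together. A minimal sketch of that variant follows. The function name merge_records is hypothetical, and it assumes two records match when both name and value are equal.

```python
def merge_records(*record_lists):
    """Merge records, summing 'key' for records sharing the same name and value."""
    merged = {}
    for records in record_lists:
        for rec in records:
            group = (rec["name"], rec["value"])
            if group in merged:
                merged[group]["key"] += rec["key"]
            else:
                merged[group] = dict(rec)  # copy so the inputs are not modified
    return list(merged.values())


# Same records as File 1 and File 2 in the question.
file1 = [{"key": 1, "name": "test", "value": v} for v in ("NY", "CA", "MA", "MA")]
file2 = [{"key": 1, "name": "test", "value": v} for v in ("NJ", "CA", "TX", "MA")]

result = merge_records(file1, file2)
totals = {rec["value"]: rec["key"] for rec in result}
print(totals)
```

When the data lives in files, each list can be loaded with json.load and the merged list written back with json.dump.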
