How do I access nested elements inside a JSON array in Python?

I want to iterate over the below json array to extract all the referenceValues and the corresponding paymentIDs into one
{
    "payments": [{
        "paymentID": "xxx",
        "externalReferences": [{
            "referenceKind": "TRADE_ID",
            "referenceValue": "xxx"
        }, {
            "referenceKind": "ID",
            "referenceValue": "xxx"
        }]
    }, {
        "paymentID": "xxx",
        "externalReferences": [{
            "referenceKind": "ID",
            "referenceValue": "xxx"
        }]
    }]
}
The below piece only extracts in case of a single payment and single externalreferences. I want to be able to do it for multiple payments and multiple externalreferences as well.
payment_ids = []
for notification in notifications:
    payments = [(payment[0], payment["externalReferences"][0]["referenceValue"])
                for payment in notification[0][0]]
    if payments[0][1] in invoice_ids:
        payment_ids.extend([payment[0] for payment in payments])

Looking at your structure, you first have to iterate over every dictionary in payments, then over each payment's external references. The code below pairs each reference value with its payment ID in a dictionary and appends that dictionary to a list:
refVals = []  # List of all reference values
for payment in data["payments"]:
    for reference in payment["externalReferences"]:
        refVals.append({  # Dictionary of the current values
            "referenceValue": reference["referenceValue"],  # The current reference value
            "paymentID": payment["paymentID"]  # The current payment ID
        })
print(refVals)
This code should output a list of dictionaries, each holding one reference value and its payment ID (assuming you have read your JSON into the data variable).
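If the end goal is the payment_ids list from the question, the same nested iteration can feed a filter. The following is only a sketch: the IDs are made up, and invoice_ids is assumed to be a collection of reference values to match against, as the question's snippet suggests.

```python
import json

# Sample data with made-up IDs, shaped like the JSON in the question.
raw = """
{
  "payments": [
    {"paymentID": "P1",
     "externalReferences": [
       {"referenceKind": "TRADE_ID", "referenceValue": "T-100"},
       {"referenceKind": "ID", "referenceValue": "INV-1"}]},
    {"paymentID": "P2",
     "externalReferences": [
       {"referenceKind": "ID", "referenceValue": "INV-2"}]}
  ]
}
"""
data = json.loads(raw)


def collect_references(data):
    """Return (paymentID, referenceValue) pairs for every reference of every payment."""
    return [
        (payment["paymentID"], reference["referenceValue"])
        for payment in data.get("payments", [])
        for reference in payment.get("externalReferences", [])
    ]


pairs = collect_references(data)

# Hypothetical invoice IDs to match against (an assumption, not from the question's data).
invoice_ids = {"INV-1", "INV-2"}

# Keep each payment ID once if any of its reference values is a known invoice ID.
payment_ids = sorted({pid for pid, ref in pairs if ref in invoice_ids})

print(pairs)
print(payment_ids)
```

Using .get with a default keeps the loop from failing on a payment that has no externalReferences key, which may or may not be possible in your data.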

Related

how to check whether the comma-separated value in the database is present in JSON data or not using python

I need to check JSON data against the comma-separated e_codes stored in a table.
How can I keep only the data whose e_code is present in a user's e_codes?
In the database:
id  email      age  e_codes
1.  abc#gmail  19   123456,234567,345678
2.  xyz#gmail  31   234567,345678,456789
This is my JSON data
[
    {
        "ct": 1,
        "e_code": 123456
    },
    {
        "ct": 2,
        "e_code": 234567
    },
    {
        "ct": 3,
        "e_code": 345678
    },
    {
        "ct": 4,
        "e_code": 456789
    },
    {
        "ct": 5,
        "e_code": 456710
    }
]
If efficiency is not an issue, you could loop through the table, split each value into a list with case['e_codes'].split(','), and then, for each code, loop through the JSON to see whether it is present.
This can be slow if the table, the JSON, or the lists of codes are long.
It might be better to first create a lookup dictionary in which the codes are the keys. The codes in the JSON are integers while the values split from the table are strings, so the keys are converted to strings here:
lookup = {}
for e in my_json:
    lookup[str(e['e_code'])] = 1
You can then check how many of the codes in your table are actually in the JSON:
## Let's assume that the "e_codes" cell of the
## current line is data['e_codes'][i], where i is the line number
for i in lines:
    match = [0, 0]  # [codes found in the JSON, codes checked]
    for code in data['e_codes'][i].split(','):
        match[0] += lookup.get(code.strip(), 0)
        match[1] += 1
    if match[1] > 0:
        share_present = match[0] / match[1]
For each case, you get a share_present, which is 1.0 if all codes appear in the JSON, 0.0 if none of them do, and some value in between to indicate the share of codes that were present. Depending on your threshold for keeping a case, you can set a filter to True or False based on this value.
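As a self-contained illustration of the lookup idea, here is a sketch that uses a set instead of a dictionary and computes the share per row. The rows list stands in for the database table; the third row and the function name share_present are made up for the example.

```python
# Rows mirroring the table in the question; the third row is hypothetical.
rows = [
    {"id": 1, "email": "abc#gmail", "age": 19, "e_codes": "123456,234567,345678"},
    {"id": 2, "email": "xyz#gmail", "age": 31, "e_codes": "234567,345678,456789"},
    {"id": 3, "email": "hypothetical#gmail", "age": 40, "e_codes": "111111,456789"},
]

my_json = [
    {"ct": 1, "e_code": 123456},
    {"ct": 2, "e_code": 234567},
    {"ct": 3, "e_code": 345678},
    {"ct": 4, "e_code": 456789},
    {"ct": 5, "e_code": 456710},
]

# Codes present in the JSON, stored as strings so they compare equal
# to the pieces produced by str.split(',').
json_codes = {str(entry["e_code"]) for entry in my_json}


def share_present(e_codes, known_codes):
    """Fraction of the comma-separated codes that appear in known_codes."""
    codes = [c.strip() for c in e_codes.split(",") if c.strip()]
    if not codes:
        return 0.0
    return sum(c in known_codes for c in codes) / len(codes)


shares = {row["id"]: share_present(row["e_codes"], json_codes) for row in rows}

# Example threshold: keep only rows whose codes are all present in the JSON.
fully_covered = [row["id"] for row in rows if shares[row["id"]] == 1.0]

print(shares)
print(fully_covered)
```

A set gives the same constant-time membership test as the dictionary, without needing a dummy value for each key.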

Querying for values of embedded documents in MongoDB with PyMongo

I have a document in MongoDB that looks like this
{
"_id": 0,
"cash_balance": 50,
"holdings": [
{
"name": "item1",
"code": "code1",
"quantity": 300
},
{
"name": "item2",
"code": "code2",
"quantity": 100
}
]
}
I would like to query for this particular document and get the quantity value of the object inside the holdings array whose code matches "code1". It can be assumed that there will be a match.
data = collection.find_one({"_id": 0, "holdings.code": "code1"}, {"holdings.$.quantity": 1})
{ "_id": 0, "holdings": [{"name": "item1", "code": "code1", "quantity": 300}] }
Running the above code gives me this output and I can get the quantity value by using:
data["holdings"][0]["quantity"]
300
However this seems to be a rather roundabout way of getting a single value. Is there a way I can query for the value of a particular key matching the code query without getting the holdings array containing the required object?
Try using the aggregate method with $unwind.
$unwind does the following:
Deconstructs an array field from the input documents to output a document for each element. Each output document is the input document with the value of the array field replaced by the element.
MongoDB documentation for $unwind
I created a playground example for you.
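Since the playground link is not reproduced here, the following is a rough sketch of what such a pipeline could look like. The stage layout and the helper names build_quantity_pipeline and get_quantity are assumptions for illustration, not the contents of the original playground; collection is assumed to be a PyMongo collection.

```python
def build_quantity_pipeline(doc_id, code):
    """Aggregation stages that return only the quantity of the matching holding."""
    return [
        # Narrow down to the document first so $unwind touches as little as possible.
        {"$match": {"_id": doc_id, "holdings.code": code}},
        # Emit one document per element of the holdings array.
        {"$unwind": "$holdings"},
        # Keep only the unwound element with the requested code.
        {"$match": {"holdings.code": code}},
        # Return just the quantity as a top-level field.
        {"$project": {"_id": 0, "quantity": "$holdings.quantity"}},
    ]


def get_quantity(collection, doc_id, code):
    """Run the pipeline on a collection and return the quantity, or None if no match."""
    cursor = collection.aggregate(build_quantity_pipeline(doc_id, code))
    first = next(iter(cursor), None)
    return first["quantity"] if first is not None else None


pipeline = build_quantity_pipeline(0, "code1")
print(pipeline)
```

With the document from the question, get_quantity(collection, 0, "code1") would be expected to yield the quantity directly, without indexing into a holdings array in Python.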

Map JSON values into another JSON object in Python

Background: I have a set of JSON responses (Response A) and want to send only their values in a different JSON layout (format B), which uses different field names.
JSON Response A
{
    "res_comment": "work description",
    "res_id": "62",
    "res_priority": "P2",
    "res_qid": "INC0140315"
}
....... etc x n
Looking to map the values in JSON Response A to the below JSON format B.
JSON format B
{
    "ServiceIssueCategoryID": "62",      # res_id
    "ServicePriorityCode": "P2",         # res_priority
    "ServiceNowID_KUT": "INC0140315",    # res_qid
    "ServiceRequestTextCollection": [
        {
            "TypeCode": "10004",         # value does not change
            "Text": "work description"   # res_comment
        }
    ]
}
## update
To be clear:
I have made a single map of Values (between one set of curly brackets) to Format B. Then I used the below to update Format B:
with open("sapFormat.json", "r+") as jsonFile:
    data = json.load(jsonFile)
    tmp1 = data['ServiceIssueCategoryID']
    data["ServiceIssueCategoryID"] = res_id
    .... other fields
This works fine, but I have hundreds of these individual res_id values and I am not sure how to loop through them one at a time. The function that pulls the values returns all of them at once (e.g. 100 res_id values); I need to take one res_id, update format B, move to the next res_id, update again, and so on.
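One possible way to handle this, sketched under the assumption that the responses are available as a list of dictionaries shaped like Response A: build a fresh copy of format B for each response instead of rewriting a single file. The names FORMAT_B_TEMPLATE and to_format_b are made up, and the second response is invented sample data.

```python
import copy
import json

# Template for format B; per the question, only TypeCode keeps a fixed value.
FORMAT_B_TEMPLATE = {
    "ServiceIssueCategoryID": None,
    "ServicePriorityCode": None,
    "ServiceNowID_KUT": None,
    "ServiceRequestTextCollection": [{"TypeCode": "10004", "Text": None}],
}


def to_format_b(response_a):
    """Map one Response A dictionary onto a fresh copy of format B."""
    payload = copy.deepcopy(FORMAT_B_TEMPLATE)  # deep copy so the nested list is not shared
    payload["ServiceIssueCategoryID"] = response_a["res_id"]
    payload["ServicePriorityCode"] = response_a["res_priority"]
    payload["ServiceNowID_KUT"] = response_a["res_qid"]
    payload["ServiceRequestTextCollection"][0]["Text"] = response_a["res_comment"]
    return payload


responses_a = [
    {"res_comment": "work description", "res_id": "62",
     "res_priority": "P2", "res_qid": "INC0140315"},
    # Hypothetical second response, only to show the loop.
    {"res_comment": "another description", "res_id": "63",
     "res_priority": "P3", "res_qid": "INC0000000"},
]

# One format B payload per response, processed one at a time.
payloads = [to_format_b(response) for response in responses_a]

print(json.dumps(payloads[0], indent=2))
```

Each payload can then be sent or written out individually inside the same loop, so no shared file has to be updated in place.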

Update single row formatting for entire sheet

I want to just apply a formatting from a JSON Entry. The first thing I did was make my desirable format on my spreadsheet for the second row of all columns. I then retrieved them with a .get request (from A2 to AO3).
request = google_api.service.spreadsheets().get(
    spreadsheetId=ss_id,
    ranges="Tab1!A2:AO3",
    includeGridData=True).execute()
The next thing I did was collect each of the formats for each column and record them in a dictionary.
my_dictionary_of_formats = {}
row_values = row_1['sheets'][0]['data'][0]['rowData'][0]['values']
for column in range(0, len(row_values)):
    my_dictionary_of_formats[column] = row_values[column]['effectiveFormat']
Now I have a dictionary of all the effective formats for all my columns. I'm having trouble applying that format to all rows in each column. I tried a batchUpdate request:
cell_data = {
    "effectiveFormat": my_dictionary_of_formats[0]}
row_data = {
    "values": [
        cell_data
    ]
}
update_cell = {
    "rows": [
        row_data
    ],
    "fields": "*",
    "range":
        {
            "sheetId": input_master.tab_id,
            "startRowIndex": 2,
            "startColumnIndex": 0,
            "endColumnIndex": 1
        }
}
request_body = {
    "requests": [
        {"updateCells": update_cell}],
    "includeSpreadsheetInResponse": True,
    "responseIncludeGridData": True}
service.spreadsheets().batchUpdate(spreadsheetId=my_id, body=request_body).execute()
This wiped out everything and I'm not sure why. I don't think I understand the fields='*' attribute.
TL;DR
I want to apply a format to all rows in a single column. Much like if I used the "Paint Format" tool on the second row, first column and dragged it all the way down to the last row.
-----Update
Hi, thanks to the comments this was my solution:
### collect all formats from second row
import json

row_2 = google_api.service.spreadsheets().get(
    spreadsheetId=spreadsheet_id,
    ranges="tab1!A2:AO2",
    includeGridData=True).execute()
my_dictionary = {}
row_values = row_2['sheets'][0]['data'][0]['rowData'][0]['values']
for column in range(0, len(row_values)):
    my_dictionary[column] = row_values[column]
# json.dump (not json.dumps) writes to a file object
with open('config/format.json', 'w') as format_file:
    json.dump(my_dictionary, format_file)

### Part 2, apply formats
requests = []
with open('config/format.json') as format_file:
    my_dict = json.load(format_file)
for column in my_dict:
    requests.append(
        {
            "repeatCell": {
                "range": {
                    "sheetId": tab_id,
                    "startRowIndex": str(1),
                    "startColumnIndex": str(column),
                    "endColumnIndex": str(int(column) + 1)
                },
                "cell": {
                    "userEnteredFormat": my_dict[column]
                },
                'fields': "userEnteredFormat({})".format(",".join(my_dict[column].keys()))
            }
        })
body = {"requests": requests}
google_api.service.spreadsheets().batchUpdate(spreadsheetId=s.spreadsheet_id, body=body).execute()
When you include fields as a part of the request, you indicate to the API endpoint that it should overwrite the specified fields in the targeted range with the information found in your uploaded resource. fields="*" correspondingly is interpreted as "This request specifies the entire data and metadata of the given range. Remove any previous data and metadata from the range and use what is supplied instead."
Thus, anything not specified in your updateCells requests will be removed from the range supplied in the request (e.g. values, formulas, data validation, etc.).
You can learn more in the guide to batchUpdate.
For an updateCells request, the fields parameter is described as follows:
The fields of CellData that should be updated. At least one field must be specified. The root is the CellData; 'row.values.' should not be specified. A single "*" can be used as short-hand for listing every field.
If you then view the resource description of CellData, you observe the following fields:
"userEnteredValue"
"effectiveValue"
"formattedValue"
"userEnteredFormat"
"effectiveFormat"
"hyperlink"
"note"
"textFormatRuns"
"dataValidation"
"pivotTable"
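Thus, the proper fields specification for your request is likely to be fields="effectiveFormat", since this is the only field you supply in your row_data property. Note, however, that the API documents effectiveFormat as read-only, so when writing a format you would supply it as userEnteredFormat and use fields="userEnteredFormat" instead.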
Consider also using the repeatCell request if you are just specifying a single format.
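To make the repeatCell suggestion concrete, here is a sketch that only builds the request body; it does not call the API. The helper name build_repeat_cell_request, the sheet ID, and the two sample formats are assumptions for illustration.

```python
def build_repeat_cell_request(sheet_id, column_index, cell_format, start_row_index=1):
    """Build a repeatCell request that applies cell_format down one column.

    endRowIndex is left out on purpose: a missing index leaves the range
    unbounded on that side, so the format runs to the last row of the sheet.
    """
    return {
        "repeatCell": {
            "range": {
                "sheetId": sheet_id,
                "startRowIndex": start_row_index,  # 0-based, so 1 is the second row
                "startColumnIndex": column_index,
                "endColumnIndex": column_index + 1,
            },
            "cell": {"userEnteredFormat": cell_format},
            # Only the format is overwritten; values, notes, validation etc. are kept.
            "fields": "userEnteredFormat",
        }
    }


# Hypothetical formats keyed by column index, standing in for what was read from row 2.
formats_by_column = {
    0: {"textFormat": {"bold": True}},
    1: {"horizontalAlignment": "CENTER"},
}

body = {
    "requests": [
        build_repeat_cell_request(sheet_id=0, column_index=col, cell_format=fmt)
        for col, fmt in sorted(formats_by_column.items())
    ]
}

print(body)
```

The resulting body is what would be passed as body= to spreadsheets().batchUpdate(...). Because the field mask names only userEnteredFormat, existing cell values should be left alone, unlike with fields="*".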

Create a data frame from a complex nested dictionary?

I have a large, deeply nested JSON file saved in .txt format. I need to access some specific key pairs and create a data frame or another transformed JSON object for further use. Here is a small sample with 2 key pairs.
[
    {
        "ko_id": [819752],
        "concepts": [
            {
                "id": ["11A71731B880:http://ontology.intranet.com/Taxonomy/116#en"],
                "uri": ["http://ontology.intranet.com/Taxonomy/116"],
                "language": ["en"],
                "prefLabel": ["Client coverage & relationship management"]
            }
        ]
    },
    {
        "ko_id": [819753],
        "concepts": [
            {
                "id": ["11A71731B880:http://ontology.intranet.com/Taxonomy/116#en"],
                "uri": ["http://ontology.intranet.com/Taxonomy/116"],
                "language": ["en"],
                "prefLabel": ["Client coverage & relationship management"]
            }
        ]
    }
]
The following code loads the data as a list, but I probably need to access the data as a dictionary. I need the "ko_id", "uri" and "prefLabel" from each key pair, put into a pandas data frame or a dictionary for further analysis.
with open('sample_data.txt') as data_file:
    json_sample = js.load(data_file)
The following code gives me the exact values of the first element, but I do not know how to put it together and build the final algorithm to create the data frame.
print(sample_dict["ko_id"][0])
print(sample_dict["concepts"][0]["prefLabel"][0])
print(sample_dict["concepts"][0]["uri"][0])
frames = []  # one small data frame per record
for record in sample_dict:
    df = pd.DataFrame(record['concepts'])
    df['ko_id'] = record['ko_id'][0]
    frames.append(df)
final_df = pd.concat(frames, ignore_index=True)
You can pass the data to pandas.DataFrame using a generator:
import pandas as pd
import json as js

with open('sample_data.txt') as data_file:
    json_sample = js.load(data_file)

df = pd.DataFrame(data=((key["ko_id"][0],
                         key["concepts"][0]["prefLabel"][0],
                         key["concepts"][0]["uri"][0]) for key in json_sample),
                  columns=("ko_id", "prefLabel", "uri"))
Output:
>>> df
    ko_id                                  prefLabel                                        uri
0  819752  Client coverage & relationship management  http://ontology.intranet.com/Taxonomy/116
1  819753  Client coverage & relationship management  http://ontology.intranet.com/Taxonomy/116
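The generator above reads only the first entry of each "concepts" list. If a record can hold several concepts (an assumption; the sample shows one each), a nested loop can produce one row per concept. This sketch builds plain dictionaries; the function name flatten_records and the second concept in the sample are made up.

```python
def flatten_records(records):
    """Return one row per (record, concept) pair with ko_id, prefLabel and uri."""
    rows = []
    for record in records:
        ko_id = record["ko_id"][0]  # every value in the sample is a one-element list
        for concept in record.get("concepts", []):
            rows.append({
                "ko_id": ko_id,
                "prefLabel": concept["prefLabel"][0],
                "uri": concept["uri"][0],
            })
    return rows


json_sample = [
    {
        "ko_id": [819752],
        "concepts": [
            {"uri": ["http://ontology.intranet.com/Taxonomy/116"],
             "prefLabel": ["Client coverage & relationship management"]},
            # Hypothetical second concept, added only to show the nested loop.
            {"uri": ["http://ontology.intranet.com/Taxonomy/117"],
             "prefLabel": ["Hypothetical second concept"]},
        ],
    },
    {
        "ko_id": [819753],
        "concepts": [
            {"uri": ["http://ontology.intranet.com/Taxonomy/116"],
             "prefLabel": ["Client coverage & relationship management"]},
        ],
    },
]

rows = flatten_records(json_sample)
for row in rows:
    print(row)
```

The resulting list of dictionaries can be passed straight to pd.DataFrame(rows) to get the same three columns as in the output above.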
