A Python dictionary with repeated fields

I'm constructing a dictionary with Python to use with a SOAP API.
My SOAP API takes an input like this:
<dataArray>
<AccountingYearData>
<Handle>
<Year>string</Year>
</Handle>
<Year>string</Year>
<FromDate>dateTime</FromDate>
<ToDate>dateTime</ToDate>
<IsClosed>boolean</IsClosed>
</AccountingYearData>
<AccountingYearData>
<Handle>
<Year>string</Year>
</Handle>
<Year>string</Year>
<FromDate>dateTime</FromDate>
<ToDate>dateTime</ToDate>
<IsClosed>boolean</IsClosed>
</AccountingYearData>
</dataArray>
See this for the full string:
https://api.e-conomic.com/secure/api1/EconomicWebService.asmx?op=AccountingYear_CreateFromDataArray
Notice how the AccountingYearData field appears multiple times.
How can I create a Python dict with this data?
If I do this:
data = {
'dataArray':{
'AccountingYearData':{
'Handle':{'Year':'2017'},
'Year':'2017',
'FromDate':'2017-01-01',
'ToDate':'2017-12-31',
'IsClosed':'False'
},
'AccountingYearData':{
'Handle':{'Year':'2017'},
'Year':'2017',
'FromDate':'2017-01-01',
'ToDate':'2017-12-31',
'IsClosed':'False'
}
}
}
I get:
>>> type(data)
<type 'dict'>
>>> data
{
'dataArray': {
'AccountingYearData': {
'IsClosed': 'False',
'FromDate': '2017-01-01',
'Handle': {'Year': '2017'},
'ToDate': '2017-12-31',
'Year': '2017'
}
}
}
It's as expected, I think, but not what I need.

A dict can hold each key only once, so the second 'AccountingYearData' entry overwrites the first. The "dataArray" name already hints at the fix: if you have a list of items, store them in a list:
data = {
    'dataArray': [
        {
            'AccountingYearData': {
                'Handle': {'Year': '2017'},
                'Year': '2017',
                'FromDate': '2017-01-01',
                'ToDate': '2017-12-31',
                'IsClosed': 'False'
            },
        },
        {
            'AccountingYearData': {
                'Handle': {'Year': '2017'},
                'Year': '2017',
                'FromDate': '2017-01-01',
                'ToDate': '2017-12-31',
                'IsClosed': 'False'
            },
        },
    ]
}
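To see why the original attempt lost an entry, and how the list form keeps both, here is a minimal sketch in Python 3. The accounting_year helper and the 2018 values are made up for illustration, and whether your SOAP client expects exactly this shape is an assumption to check against its documentation.

```python
# A dict literal with a repeated key silently keeps only the last value.
collapsed = {
    'AccountingYearData': {'Year': '2017'},
    'AccountingYearData': {'Year': '2018'},
}
print(len(collapsed))  # 1


def accounting_year(year):
    """Build one AccountingYearData entry (hypothetical helper)."""
    return {
        'AccountingYearData': {
            'Handle': {'Year': year},
            'Year': year,
            'FromDate': f'{year}-01-01',
            'ToDate': f'{year}-12-31',
            'IsClosed': 'False',
        }
    }


# A list can hold any number of entries that share the same key.
data = {'dataArray': [accounting_year(y) for y in ('2017', '2018')]}
print(len(data['dataArray']))  # 2
```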

Related

How do I get the value of a dict item within a list, within a dict?

How do I get the value of a dict item within a list, within a dict in Python? Please see the following code for an example of what I mean.
I use the following lines of code in Python to get data from an API.
res = requests.get('https://api.data.amsterdam.nl/bag/v1.1/nummeraanduiding/', params)
data = res.json()
data then returns the following Python dictionary:
{
'_links': {
'next': {
'href': None
},
'previous': {
'href': None
},
'self': {
'href': 'https://api.data.amsterdam.nl/bag/v1.1/nummeraanduiding/'
}
},
'count': 1,
'results': [
{
'_display': 'Maple Street 99',
'_links': {
'self': {
'href': 'https://api.data.amsterdam.nl/bag/v1.1/nummeraanduiding/XXXXXXXXXXXXXXXX/'
}
},
'dataset': 'bag',
'landelijk_id': 'XXXXXXXXXXXXXXXX',
'type_adres': 'Hoofdadres',
'vbo_status': 'Verblijfsobject in gebruik'
}
]
}
Using Python, how do I get the value for 'landelijk_id', represented by the Xs?
This should work:
>>> data['results'][0]['landelijk_id']
"XXXXXXXXXXXXXXXX"
You can just chain those [] for each child you need to access.
I'd recommend using the jmespath package to make handling nested dictionaries easier: https://pypi.org/project/jmespath/
import jmespath
import requests
res = requests.get('https://api.data.amsterdam.nl/bag/v1.1/nummeraanduiding/', params)
data = res.json()
print(jmespath.search('results[].landelijk_id', data))
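If you would rather not add a dependency, a list comprehension gives roughly the same result as the 'results[].landelijk_id' expression. This is a sketch on a trimmed-down copy of the response from the question, not a full replacement for jmespath.

```python
# Sample shaped like the API response in the question (most fields trimmed).
data = {
    'count': 1,
    'results': [
        {'_display': 'Maple Street 99', 'landelijk_id': 'XXXXXXXXXXXXXXXX'},
    ],
}

# One ID per entry in 'results'; entries without the key are skipped.
ids = [
    item['landelijk_id']
    for item in data.get('results', [])
    if 'landelijk_id' in item
]
print(ids)

# Guard against an empty result list before taking the first ID.
first_id = ids[0] if ids else None
print(first_id)
```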

JSON Parsing in Python - help extracting dictionaries inside a list

I've searched and there's a similar problem here, but the solution states to fix the JSON. I really can't fix the JSON, as it is produced by a REST API.
{
"__metadata": {
"uri": "http://website:6405/biprws/v1/cmsquery?page=1&pagesize=50"
},
"first": {
"__deferred": {
"uri": "http://website:6405/biprws/v1/cmsquery?page=1&pagesize=50"
}
},
"last": {
"__deferred": {
"uri": "http://website:6405/biprws/v1/cmsquery?page=1&pagesize=50"
}
},
"entries": [
{
"SI_ID": 31543,
"SI_NAME": "Some Client",
"SI_PARENTID": 31414,
"SI_PATH": {
"SI_FOLDER_NAME1": "COR OPS",
"SI_FOLDER_ID1": 31414,
"SI_FOLDER_OBTYPE1": 1,
"SI_FOLDER_NAME2": "CLIENT",
"SI_FOLDER_ID2": 28178,
"SI_FOLDER_OBTYPE2": 1,
"SI_NUM_FOLDERS": 2
}
}
]
}
I need to get the folder names from SI_PATH, but that is where I am having issues. I can access "entries" fine because the whole JSON is loaded as a dict, but the problem comes after that: "entries" is just a list with a len of 1.
import json
data = json.load(open('file.json'))
print(type(data))
print(data['entries'])
print(type(data['entries']))
Sample output below:
<class 'dict'>
[{'SI_ID': 31543, 'SI_NAME': 'Some Client', 'SI_PARENTID': 31414, 'SI_PATH': {'SI_FOLDER_NAME1': 'COR OPS', 'SI_FOLDER_ID1': 31414, 'SI_FOLDER_OBTYPE1': 1, 'SI_FOLDER_NAME2': 'CLIENT', 'SI_FOLDER_ID2': 28178, 'SI_FOLDER_OBTYPE2': 1, 'SI_NUM_FOLDERS': 2}}]
<class 'list'>
I can use pandas to put the 'entries' into a DataFrame and pull in the SI_PATH values, but I'm not sure how to access each of them.
import pandas as pd
f = pd.DataFrame(data['entries'])
print(f['SI_PATH'].values)
Output of this:
[{'SI_FOLDER_NAME1': 'COR OPS', 'SI_FOLDER_ID1': 31414, 'SI_FOLDER_OBTYPE1': 1, 'SI_FOLDER_NAME2': 'CLIENT', 'SI_FOLDER_ID2': 28178, 'SI_FOLDER_OBTYPE2': 1, 'SI_NUM_FOLDERS': 2}]
But I'm unsure how to access the items individually from this point. If possible, I really want to stick with just importing json.
Since there is only one item in the list that is data['entries']:
print(data['entries'][0]['SI_ID'])
Prints:
31543
Since it is a list of dicts, why not loop over it:
for items in data['entries']:
    print(items.get("SI_ID"))
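For the folder names themselves, the same indexing goes one level deeper into SI_PATH. Below is a sketch that assumes the keys always follow the SI_FOLDER_NAME1, SI_FOLDER_NAME2, ... pattern and that SI_NUM_FOLDERS is accurate; folder_names is a made-up helper, and the payload is a trimmed copy of the one in the question.

```python
import json

# Trimmed copy of the response from the question.
payload = '''
{
  "entries": [
    {
      "SI_ID": 31543,
      "SI_PATH": {
        "SI_FOLDER_NAME1": "COR OPS",
        "SI_FOLDER_ID1": 31414,
        "SI_FOLDER_NAME2": "CLIENT",
        "SI_FOLDER_ID2": 28178,
        "SI_NUM_FOLDERS": 2
      }
    }
  ]
}
'''
data = json.loads(payload)


def folder_names(entry):
    """Return the folder names of one entry, in numbered order (hypothetical helper)."""
    path = entry.get("SI_PATH", {})
    count = path.get("SI_NUM_FOLDERS", 0)
    return [path[f"SI_FOLDER_NAME{i}"] for i in range(1, count + 1)]


# One list of folder names per entry.
names = [folder_names(entry) for entry in data["entries"]]
print(names)  # [['COR OPS', 'CLIENT']]
```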

Validating arbitrary dict keys with strict schemas with Cerberus

I am trying to validate JSON, the schema for which specifies a list of dicts with arbitrary string keys, the corresponding values of which are dicts with a strict schema (i.e, the keys of the inner dict are strictly some string, here 'a'). From the Cerberus docs, I think that what I want is the 'keysrules' rule. The example in the docs seems to only show how to use 'keysrules' to validate arbitrary keys, but not their values. I wrote the below code as an example; the best I could do was assume that 'keysrules' would support a 'schema' argument for defining a schema for these values.
keysrules = {
'myDict': {
'type': 'dict',
'keysrules': {
'type': 'string',
'schema': {
'type': 'dict',
'schema': {
'a': {'type': 'string'}
}
}
}
}
}
keysRulesTest = {
'myDict': {
'arbitraryStringKey': {
'a': 'arbitraryStringValue'
},
'anotherArbitraryStringKey': {
'shouldNotValidate': 'arbitraryStringValue'
}
}
}
from cerberus import Validator
def test_rules():
    v = Validator(keysrules)
    if not v.validate(keysRulesTest):
        print(v.errors)
        assert(0)
This example does validate, and I would like it to not validate on 'shouldNotValidate', because that key should be 'a'. Does the flexibility implied by 'keysrules' (i.e, keys governed by 'keysrules' have no constraint other than {'type': 'string'}) propagate down recursively to all schemas underneath it? Or have I made some different error? How can I achieve my desired outcome?
I didn't want keysrules, I wanted valuesrules:
keysrules = {
'myDict': {
'type': 'dict',
'valuesrules': {
'type': 'dict',
'schema': {
'a': {'type': 'string'}
}
}
}
}
keysRulesTest = {
'myDict': {
'arbitraryStringKey': {
'a': 'arbitraryStringValue'
},
'anotherArbitraryStringKey': {
'shouldNotValidate': 'arbitraryStringValue'
}
}
}
from cerberus import Validator
def test_rules():
    v = Validator(keysrules)
    if not v.validate(keysRulesTest):
        print(v.errors)
        assert(0)
This produces my desired outcome.
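To make the idea concrete without Cerberus, here is a plain-Python sketch of the check: the outer keys are unconstrained, but every value must be a dict containing only the allowed key with a string value. This is only an illustration, not how Cerberus is implemented, and validate_values is a made-up name.

```python
def validate_values(mapping, allowed_keys=frozenset({'a'})):
    """Return {outer_key: [problem, ...]} for every value that breaks the rules."""
    errors = {}
    for outer_key, inner in mapping.items():
        problems = []
        if not isinstance(inner, dict):
            problems.append('must be a dict')
        else:
            for key, value in inner.items():
                if key not in allowed_keys:
                    problems.append(f'unknown field: {key}')
                elif not isinstance(value, str):
                    problems.append(f'{key} must be a string')
        if problems:
            errors[outer_key] = problems
    return errors


my_dict = {
    'arbitraryStringKey': {'a': 'arbitraryStringValue'},
    'anotherArbitraryStringKey': {'shouldNotValidate': 'arbitraryStringValue'},
}
errors = validate_values(my_dict)
print(errors)
```

Only the entry whose inner dict uses a key other than 'a' is reported; the arbitrary outer key names never cause an error.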

MongoDB watch() aggregation match by field value

When I use the watch() function on my collection, I pass an aggregation pipeline to filter what comes through. I was able to get operationType to work correctly, but I also only want to include documents in which the city field is equal to Vancouver. The current syntax I am using does not work:
change_stream = client.mydb.mycollection.watch([
{
'$match': {
'operationType': { '$in': ['replace', 'insert'] },
'fullDocument': {'city': {'$eq': 'Vancouver'} }
}
}
])
And for reference, this is what the dictionary that I'm aggregating looks like:
{'_id': {'_data': '825F...E0004'},
'clusterTime': Timestamp(1595565179, 2),
'documentKey': {'_id': ObjectId('70fc7871...')},
'fullDocument': {'_id': ObjectId('70fc7871...'),
'city': 'Vancouver'},
'ns': {'coll': 'notification', 'db': 'pipeline'},
'operationType': 'replace'}
I found I just have to use a dot to access the nested dictionary:
change_stream = client.mydb.mycollection.watch([
{
'$match': {
'operationType': { '$in': ['replace', 'insert'] },
'fullDocument.city': 'Vancouver'
}
}
])
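As far as I understand it, the nested form in the question asks MongoDB to compare the whole fullDocument value against the literal sub-document, which a real document never equals, while the dotted key reaches into a single field. If the filter values vary, the pipeline can be built by a small function. This is a sketch: build_watch_pipeline is a made-up helper, not part of PyMongo, and it only builds the list, so it runs without a database connection.

```python
def build_watch_pipeline(operation_types, city):
    """Return a change-stream pipeline filtering on operation type and city."""
    return [
        {
            '$match': {
                'operationType': {'$in': list(operation_types)},
                # Dot notation reaches into the embedded fullDocument.
                'fullDocument.city': city,
            }
        }
    ]


pipeline = build_watch_pipeline(['replace', 'insert'], 'Vancouver')
print(pipeline)

# With a connected client, as in the question:
# change_stream = client.mydb.mycollection.watch(pipeline)
```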

Accessing a json object nested in a json array with Python 3.x

Given the JSON payload below, how do I get the value of 'hotspot' using Python 3.x? The top level seems to be a dict with one key-value pair: 'Recs' is the key and the value is a Python list. I have loaded the JSON payload into a Python object using json.loads(payload).
json payload:
{
'Recs': [{
'eSrc': 'big-a1',
'reqPs': {
'srcIP': '11.111.11.111'
},
'a1': {
'a1Ver': '1.0',
'obj': {
'eTag': '38f028e',
'sz': 1217,
'seq': '02391D2',
'hotspot': 'web/acme/srv/dev/8dd'
},
'confId': 'acme-contains',
'pipe': {
'name': 'acme.dev',
'oId': {
'pId': 'BDAD'
}
}
}
}]
}
{ indicates a dict and [ indicates a list, so 'hotspot' is at:
my_json['Recs'][0]['a1']['obj']['hotspot']
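If any level might be missing, the same walk can be wrapped in a small helper that returns a default instead of raising. This is a sketch: dig is a made-up name, and the payload is a trimmed, valid-JSON copy of the one in the question.

```python
import json

# Trimmed copy of the payload, written as valid JSON (double quotes).
payload = '{"Recs": [{"a1": {"obj": {"sz": 1217, "hotspot": "web/acme/srv/dev/8dd"}}}]}'
my_json = json.loads(payload)


def dig(obj, *path, default=None):
    """Follow dict keys and list indexes in order; return default if a step is missing."""
    for step in path:
        try:
            obj = obj[step]
        except (KeyError, IndexError, TypeError):
            return default
    return obj


hotspot = dig(my_json, 'Recs', 0, 'a1', 'obj', 'hotspot')
print(hotspot)  # web/acme/srv/dev/8dd

# There is no second record, so the default is returned.
missing = dig(my_json, 'Recs', 1, 'a1')
print(missing)  # None
```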
