find occurrences count in json object - python

{
"id": 1,
"resourceAttributes": {
"siteId": "100"
},
},
{
"id": 2,
"resourceAttributes": {
"siteId": "200"
},
},
{
"id": 3,
"resourceAttributes": {
"siteId": "100"
},
},
I have JSON like the above, and I want to produce output like this:
SiteId100 occurance 2
SiteId200 occurance 1
The output is based on how often each siteId value occurs: if siteId 100 appears in 2 objects, I need to show it with a count of 2. I am currently collecting all possible site ids, removing duplicates, and then counting each one individually, but that does not seem like a neat solution.

You need to use collections.Counter. Here's the full code:
from collections import Counter
data = [
{
"id": 1,
"resourceAttributes": {
"siteId": "100"
},
},
{
"id": 2,
"resourceAttributes": {
"siteId": "200"
},
},
{
"id": 3,
"resourceAttributes": {
"siteId": "100"
},
}
]
c = Counter(s.get("resourceAttributes").get("siteId") for s in data)
for k, v in c.items():
    print("siteId{} occurance: {}".format(k, v))
output:
siteId100 occurance: 2
siteId200 occurance: 1
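If the data arrives as a JSON string and some objects might lack resourceAttributes or siteId, a slightly more defensive variant of the Counter approach could look like the sketch below. The fourth record and the helper name site_id_of are made up for illustration; they are not part of the question.

```python
import json
from collections import Counter

# Hypothetical input: the objects from the question wrapped in a JSON array,
# plus one extra record without resourceAttributes to show the missing-key case.
raw = '''[
  {"id": 1, "resourceAttributes": {"siteId": "100"}},
  {"id": 2, "resourceAttributes": {"siteId": "200"}},
  {"id": 3, "resourceAttributes": {"siteId": "100"}},
  {"id": 4}
]'''

records = json.loads(raw)


def site_id_of(record):
    """Return the record's siteId, or None when it is missing."""
    return record.get("resourceAttributes", {}).get("siteId")


# Count only the records that actually carry a siteId.
site_counts = Counter(
    site_id
    for site_id in (site_id_of(record) for record in records)
    if site_id is not None
)

# most_common() yields (value, count) pairs, highest count first.
for site_id, count in site_counts.most_common():
    print(f"siteId{site_id} occurrence: {count}")
```

Passing a default of {} to the first get() is what keeps the chained lookup from raising AttributeError on records that have no resourceAttributes.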

Related

Python - having trouble selecting single value from json data

I have the following code from which I want to select a singular piece of data from the JSON.
j = {
"data": [
{
"astronomicalDawn": "2023-01-16T04:58:21+00:00",
"astronomicalDusk": "2023-01-16T17:00:31+00:00",
"civilDawn": "2023-01-16T06:38:18+00:00",
"civilDusk": "2023-01-16T15:20:34+00:00",
"moonFraction": 0.36248449454701365,
"moonPhase": {
"closest": {
"text": "Third quarter",
"time": "2023-01-14T22:34:00+00:00",
"value": 0.75
},
"current": {
"text": "Waning crescent",
"time": "2023-01-16T06:00:00+00:00",
"value": 0.7943440617174506
}
},
"moonrise": "2023-01-16T01:01:55+00:00",
"moonset": "2023-01-16T09:53:57+00:00",
"nauticalDawn": "2023-01-16T05:46:36+00:00",
"nauticalDusk": "2023-01-16T16:12:16+00:00",
"sunrise": "2023-01-16T07:28:07+00:00",
"sunset": "2023-01-16T14:30:45+00:00",
"time": "2023-01-16T06:00:00+00:00"
},
{
"astronomicalDawn": "2023-01-17T04:57:26+00:00",
"astronomicalDusk": "2023-01-17T17:02:07+00:00",
"civilDawn": "2023-01-17T06:37:07+00:00",
"civilDusk": "2023-01-17T15:22:26+00:00",
"moonFraction": 0.26001046334874545,
"moonPhase": {
"closest": {
"text": "Third quarter",
"time": "2023-01-14T21:31:00+00:00",
"value": 0.75
},
"current": {
"text": "Waning crescent",
"time": "2023-01-17T06:00:00+00:00",
"value": 0.8296778757434323
}
},
"moonrise": "2023-01-17T02:38:30+00:00",
"moonset": "2023-01-17T10:01:03+00:00",
"nauticalDawn": "2023-01-17T05:45:35+00:00",
"nauticalDusk": "2023-01-17T16:13:58+00:00",
"sunrise": "2023-01-17T07:26:40+00:00",
"sunset": "2023-01-17T14:32:54+00:00",
"time": "2023-01-17T06:00:00+00:00"
}
],
"meta": {
"cost": 1,
"dailyQuota": 10,
"lat": 58.7984,
"lng": 17.8081,
"requestCount": 1,
"start": "2023-01-16 06:00"
}
}
print(j['data']['moonPhase'])
This gives me the following error:
TypeError: list indices must be integers or slices, not str
The error refers to the very last line of the code. Changing that line to print(j['data']) works, though.
What am I doing wrong? I am trying to select the moonPhase data. Thank you.
Try:
print(j['data'][0]['moonPhase'])
or
print(j['data'][1]['moonPhase'])
Explanation: The data property of the json object contains a list. There are two items in the list (item 0 and item 1). You must first select an item using [0] or [1] before selecting the moonPhase property of one of the objects in the list.
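If you want the value from every entry rather than a single index, looping over the list is one option. This is a minimal sketch using a trimmed-down stand-in for the response in the question (only the keys needed here are kept):

```python
# Trimmed-down stand-in for the API response shown in the question.
j = {
    "data": [
        {"moonPhase": {"current": {"text": "Waning crescent",
                                   "time": "2023-01-16T06:00:00+00:00"}}},
        {"moonPhase": {"current": {"text": "Waning crescent",
                                   "time": "2023-01-17T06:00:00+00:00"}}},
    ]
}

# j["data"] is a list, so iterate over it instead of indexing it with a string.
current_phases = [
    (item["moonPhase"]["current"]["time"], item["moonPhase"]["current"]["text"])
    for item in j["data"]
]

for time, text in current_phases:
    print(time, text)
```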
Edit: If you want to select only items where the moonPhase is in the future, try:
from datetime import datetime
print([
    item['moonPhase']
    for item in j['data']
    if (
        datetime.fromisoformat(
            item['moonPhase']['current']['time']
        ).timestamp() >
        datetime.now().timestamp()
    )
])
Output (only the second item was still in the future when this was run):
[{'closest': {'text': 'Third quarter', 'time': '2023-01-14T21:31:00+00:00', 'value': 0.75}, 'current': {'text': 'Waning crescent', 'time': '2023-01-17T06:00:00+00:00', 'value': 0.8296778757434323}}]
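Because the timestamps in the data carry a UTC offset, another way to write the filter is to compare timezone-aware datetime objects directly. The sketch below wraps that in a hypothetical helper, future_moon_phases, whose optional now argument exists only so the result is reproducible; the input is a trimmed-down stand-in for the question's data.

```python
from datetime import datetime, timezone


def future_moon_phases(items, now=None):
    """Return the moonPhase dicts whose current time lies after `now`."""
    if now is None:
        # Use an aware "now" so it can be compared with the aware timestamps.
        now = datetime.now(timezone.utc)
    return [
        item["moonPhase"]
        for item in items
        if datetime.fromisoformat(item["moonPhase"]["current"]["time"]) > now
    ]


# Trimmed-down stand-in for j['data'] from the question.
items = [
    {"moonPhase": {"current": {"text": "Waning crescent",
                               "time": "2023-01-16T06:00:00+00:00"}}},
    {"moonPhase": {"current": {"text": "Waning crescent",
                               "time": "2023-01-17T06:00:00+00:00"}}},
]

# A fixed reference time between the two entries, chosen for illustration.
fixed_now = datetime(2023, 1, 16, 12, 0, tzinfo=timezone.utc)
result = future_moon_phases(items, now=fixed_now)
print(result)
```

Called without now, the helper uses the current time, so with the 2023 data above it would return an empty list today.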

how do I access this json data in python?

Hi, I'm pretty new to coding and I'm trying to write a Python program that reads a JSON file and saves part of its data (not everything, just what I want) to another file. I googled how to parse the data, but there's something I don't understand.
Here is part of the JSON file:
`
{
"profileRevision": 548789,
"profileId": "campaign",
"profileChangesBaseRevision": 548789,
"profileChanges": [
{
"changeType": "fullProfileUpdate",
"profile": {
"_id": "2da4f079f8984cc48e84fc99dace495d",
"created": "2018-03-29T11:02:15.190Z",
"updated": "2022-10-31T17:34:43.284Z",
"rvn": 548789,
"wipeNumber": 9,
"accountId": "63881e614ef543b2932c70fed1196f34",
"profileId": "campaign",
"version": "refund_teddy_perks_september_2022",
"items": {
"8ec8f13f-6bf6-4933-a7db-43767a055e66": {
"templateId": "Quest:heroquest_loadout_constructor_2",
"attributes": {
"quest_state": "Claimed",
"creation_time": "min",
"last_state_change_time": "2019-05-18T16:09:12.750Z",
"completion_complete_pve03_diff26_loadout_constructor": 300,
"level": -1,
"item_seen": true,
"sent_new_notification": true,
"quest_rarity": "uncommon",
"xp_reward_scalar": 1
},
"quantity": 1
},
"6940c71b-c74b-4581-9f1e-c0a87e246884": {
"templateId": "Worker:workerbasic_sr_t01",
"attributes": {
"gender": "2",
"personality": "Homebase.Worker.Personality.IsDreamer",
"level": 1,
"item_seen": true,
"squad_slot_idx": -1,
"portrait": "WorkerPortrait:IconDef-WorkerPortrait-Dreamer-F02",
"building_slot_used": -1,
"set_bonus": "Homebase.Worker.SetBonus.IsMeleeDamageLow"
}
}
}
]
}
`
I can access profileChanges. I wrote this to create another JSON file containing only the profileChanges data:
import json

# Read and parse the source file.
with open("file.json", "r") as myjsonfile:
    obj = json.load(myjsonfile)
ciso = obj['profileChanges']
for i in ciso:
    print(i)
# Write only the profileChanges list to a second file.
with open("file2", "w") as outfile:
    json.dump(ciso, outfile, indent=1)
The issue is that I can't access "profile" (inside profileChanges) in the same way when parsing the new file, and I have no idea how to do it.
Elements of parsed JSON are accessed by key for dicts and by integer index for lists; please look at the example below:
a = [
{
"friends": [
{
"id": 0,
"name": "Reba May"
}
],
"greeting": "Hello, Doris Gallagher! You have 2 unread messages.",
"favoriteFruit": "strawberry"
},
]
b = a[0]['friends'][0]['id']  # b == 0
I've added a couple of closing braces to make your snippet valid json:
import json
s = '''{
"profileRevision": 548789,
"profileId": "campaign",
"profileChangesBaseRevision": 548789,
"profileChanges": [
{
"changeType": "fullProfileUpdate",
"profile": {
"_id": "2da4f079f8984cc48e84fc99dace495d",
"created": "2018-03-29T11:02:15.190Z",
"updated": "2022-10-31T17:34:43.284Z",
"rvn": 548789,
"wipeNumber": 9,
"accountId": "63881e614ef543b2932c70fed1196f34",
"profileId": "campaign",
"version": "refund_teddy_perks_september_2022",
"items": {
"8ec8f13f-6bf6-4933-a7db-43767a055e66": {
"templateId": "Quest:heroquest_loadout_constructor_2",
"attributes": {
"quest_state": "Claimed",
"creation_time": "min",
"last_state_change_time": "2019-05-18T16:09:12.750Z",
"completion_complete_pve03_diff26_loadout_constructor": 300,
"level": -1,
"item_seen": true,
"sent_new_notification": true,
"quest_rarity": "uncommon",
"xp_reward_scalar": 1
},
"quantity": 1
},
"6940c71b-c74b-4581-9f1e-c0a87e246884": {
"templateId": "Worker:workerbasic_sr_t01",
"attributes": {
"gender": "2",
"personality": "Homebase.Worker.Personality.IsDreamer",
"level": 1,
"item_seen": true,
"squad_slot_idx": -1,
"portrait": "WorkerPortrait:IconDef-WorkerPortrait-Dreamer-F02",
"building_slot_used": -1,
"set_bonus": "Homebase.Worker.SetBonus.IsMeleeDamageLow"
}
}
}
}
}
]
}
'''
d = json.loads(s)
print(d['profileChanges'][0]['profile']['version'])
This prints refund_teddy_perks_september_2022
Explanation:
d is a dict
d['profileChanges'] is a list of dicts
d['profileChanges'][0] is the first dict in the list
d['profileChanges'][0]['profile'] is a dict
d['profileChanges'][0]['profile']['version'] is the value of version key in the profile dict in the first entry of the profileChanges list.
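Since the goal in the question was to save only selected data, the same indexing can be combined with a loop over the items dict. This is a minimal sketch on a trimmed-down version of the JSON; which fields to keep (version and each templateId) and the output key templateIds are assumptions made for illustration.

```python
import json

# Trimmed-down stand-in for the profile JSON in the question.
s = '''{
  "profileChanges": [
    {
      "changeType": "fullProfileUpdate",
      "profile": {
        "version": "refund_teddy_perks_september_2022",
        "items": {
          "8ec8f13f-6bf6-4933-a7db-43767a055e66": {
            "templateId": "Quest:heroquest_loadout_constructor_2",
            "quantity": 1
          },
          "6940c71b-c74b-4581-9f1e-c0a87e246884": {
            "templateId": "Worker:workerbasic_sr_t01"
          }
        }
      }
    }
  ]
}'''

d = json.loads(s)
profile = d["profileChanges"][0]["profile"]

# "items" is a dict keyed by item id, so iterate over its key/value pairs.
template_ids = {
    item_id: item["templateId"]
    for item_id, item in profile["items"].items()
}

# Serialize only the selected data; this string could then be written to a file.
selected = json.dumps(
    {"version": profile["version"], "templateIds": template_ids},
    indent=1,
)
print(selected)
```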

How to parse JSON results by condition?

I have some JSON and a Python script that displays a list of companies on the screen.
How can I display all regions for a given company id?
{
"data": [
{
"id": 1,
"attributes": {
"name": "Company1",
"regions": {
"data": [
{
"id": 1,
"attributes": {
"name": "Region 1",
}
},
{
"id": 2,
"attributes": {
"name": "Region 2",
}
},
]
}
}
},
{
"id": 2,
"attributes": {
"name": "Company2",
"regions": {
"data": [
{
"id": 1,
"attributes": {
"name": "Region 1",
}
},
{
"id": 2,
"attributes": {
"name": "Region 2",
}
}
]
}
}
},
],
}
Script for all companies.
import os
import json
import requests
BASE_URL = 'localhost'
res = requests.get(BASE_URL)
res_content = json.loads(res.content)
for holding in res_content['data']:
    print(holding['id'], holding['attributes']['name'])
How can I do the same to display the regions for a given company id?
Example: display all regions for Company 1.
The code below iterates through the list of company dictionaries, looking for the one whose 'name' is 'Company1'. Once it finds that dictionary, it iterates through the list stored under 'regions' and prints the 'name' of each region in that list.
You can try this:
for company in res_content['data']:
    if company['attributes']['name'] == 'Company1':
        for region in company['attributes']['regions']['data']:
            print(region['attributes']['name'])
You just need to delve further down into the res_content object:
for holding in res_content['data']:
    print(holding['id'], holding['attributes']['name'])
    data = holding['attributes']['regions']['data']
    for d in data:
        print(' ', d['attributes']['name'])
Output:
1 Company1
  Region 1
  Region 2
2 Company2
  Region 1
  Region 2
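Since the question asks for the regions of a company by id, the lookup could also be wrapped in a small function. This is only a sketch: regions_for_company is a hypothetical helper, and res_content below is a hand-written stand-in for the parsed response rather than the result of a real request.

```python
def regions_for_company(payload, company_id):
    """Return the region names of the company with the given id, or []."""
    for company in payload["data"]:
        if company["id"] == company_id:
            return [
                region["attributes"]["name"]
                for region in company["attributes"]["regions"]["data"]
            ]
    # No company with that id was found.
    return []


# Stand-in for the parsed JSON from the question.
res_content = {
    "data": [
        {"id": 1, "attributes": {"name": "Company1", "regions": {"data": [
            {"id": 1, "attributes": {"name": "Region 1"}},
            {"id": 2, "attributes": {"name": "Region 2"}},
        ]}}},
        {"id": 2, "attributes": {"name": "Company2", "regions": {"data": [
            {"id": 1, "attributes": {"name": "Region 1"}},
            {"id": 2, "attributes": {"name": "Region 2"}},
        ]}}},
    ]
}

for name in regions_for_company(res_content, 1):
    print(name)
```

Returning an empty list for an unknown id is a design choice made here so callers can loop over the result without a separate check; raising an error would be equally reasonable.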

Count documents and total sum of their array property, grouped by multiple properties of a subdocument

Having N documents of the following type:
{
"source": {
"domain": "1",
"type": "2"
},
"assets": [1, 2]
}
{
"source": {
"domain": "3",
"type": "4"
},
"assets": [3, 4, 5]
}
How can I get a total count of assets among all documents, grouped by domain + type?
In the above case, a query should return that domain:1 + type:2 has 2 combined assets in 1 document, while domain:3 + type:4 has 3 combined assets in 1 document.
Note that domain:1 + type:2 != domain:2 + type:1.
My first attempt was
collection.aggregate([
{
"$project": {
"_id": 0,
"arraySize":{"$size":"$assets"}
}
},
{
"$group": {
"_id": {"$concat": ["$source.domain", "-", "$source.type"]},
"totalArraysSize":{"$sum":"$arraySize"}
}
},
])
But it only returns [{'_id': None, 'totalArraysSize': 616}], with no grouping.
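I managed to write up a solution after experimenting with the syntax. The first attempt returned a single None group because its $project stage kept only arraySize, so $source.domain and $source.type no longer existed when $group ran. Computing the array size inside $group avoids that: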
collection.aggregate([{
"$group": {
"_id": {"source": "$source.domain", "type": "$source.type"},
"asset_count":{"$sum":{"$size":"$assets"}},
"total_count":{"$sum": 1},
}
}])
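To make clear what that $group stage computes, here is a plain-Python sketch of the same grouping over the two example documents. It is not MongoDB code and is not meant to replace the pipeline; it only mirrors the semantics, with asset_count and total_count named after the fields in the pipeline above.

```python
from collections import defaultdict

# The two example documents from the question.
documents = [
    {"source": {"domain": "1", "type": "2"}, "assets": [1, 2]},
    {"source": {"domain": "3", "type": "4"}, "assets": [3, 4, 5]},
]

# One accumulator per (domain, type) pair, like the compound _id in $group.
totals = defaultdict(lambda: {"asset_count": 0, "total_count": 0})

for doc in documents:
    key = (doc["source"]["domain"], doc["source"]["type"])
    totals[key]["asset_count"] += len(doc["assets"])  # {"$sum": {"$size": "$assets"}}
    totals[key]["total_count"] += 1                   # {"$sum": 1}

for (domain, type_), stats in totals.items():
    print(f"domain:{domain} + type:{type_} -> {stats}")
```

Using a tuple as the key keeps domain:1 + type:2 distinct from domain:2 + type:1, which a naive string concatenation without a separator would not guarantee.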

DeepDiff ignore with regex

I have two objects:
d1 = [ { "id": 3, "name": "test", "components": [ { "id": 1, "name": "test" }, { "id": 2, "name": "test2" } ] } ]
d2 = [ { "id": 4, "name": "test", "components": [ { "id": 2, "name": "test" }, { "id": 3, "name": "test2" } ] } ]
As you can see, everything stays the same, but the id property changes on both root object and also inside components.
I'm using DeepDiff to compare d1 and d2 and trying to ignore comparison of id objects. However, I'm not sure how to achieve this. I tried the following which didn't seem to work.
excluded_paths = "root[\d+\]['id']"
diff = DeepDiff(d1, d2, exclude_paths=excluded_paths)
You can try using exclude_obj_callback:
from deepdiff import DeepDiff
def exclude_obj_callback(obj, path):
    return "id" in path
d1 = [ { "id": 3, "name": "test", "components": [ { "id": 1, "name": "test" }, { "id": 2, "name": "test2" } ] } ]
d2 = [ { "id": 4, "name": "test", "components": [ { "id": 2, "name": "test" }, { "id": 3, "name": "test2" } ] } ]
print(DeepDiff(d1, d2, exclude_obj_callback=exclude_obj_callback))
The callback returns True for every element whose path contains the string "id", and DeepDiff skips those elements. Be careful with this, since it may exclude other objects that you didn't mean to. A way around this could be to use less generic key names, for example "component_id".
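To illustrate the over-matching risk, the sketch below compares the substring check with an anchored regular expression, using only the standard re module on path strings written in the style DeepDiff reports. The 'width' path is a made-up example. DeepDiff also documents an exclude_regex_paths parameter that such a pattern could be passed to, but that call is not exercised here, so treat it as an assumption to verify against the library's documentation.

```python
import re

# Path strings in the style DeepDiff uses; the 'width' entry is hypothetical.
paths = [
    "root[0]['id']",
    "root[0]['components'][1]['id']",
    "root[0]['components'][1]['name']",
    "root[0]['width']",
]

# The substring test also catches 'width', because it contains "id".
substring_excluded = [p for p in paths if "id" in p]

# An anchored pattern matches only paths whose final key is exactly 'id'.
id_key = re.compile(r"\['id'\]$")
regex_excluded = [p for p in paths if id_key.search(p)]

print(substring_excluded)
print(regex_excluded)
```

The same pattern could be used inside exclude_obj_callback by returning bool(id_key.search(path)) instead of the substring test.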
