Iterating deeply nested pandas json object? - python

I have a pretty big json object which is of the format
[
{
"A":"value",
"TIME":1551052800000,
"C":35,
"D":36,
"E":34,
"F":35,
"G":33
},
{
"B":"value",
"TIME":1551052800000,
"C":36,
"D":56,
"E":44,
"F":75,
"G":38
}, ...
...
]
Converted to json with the help of pandas
df.to_json(orient='records')
I want to loop through the JSON body, update a specific key in each object, and send the result back to the client through my API.
I want to do something like
for i = 0
    object[i]["TIME"] = updateCalculations
return i
I am new to Python and have tried this. It iterates through the object, but the values are not updated, and the recursion takes a long time.

First, pd.read_sql_query returns pd.DataFrame and not json.
As per your question:
Say you have a sample function update_calculation:
def update_calculation(time):
    # Placeholder: replace with the real calculation
    return time
You could then update the TIME column like so:
df["TIME"] = df["TIME"].apply(update_calculation)
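If the records are already a plain list of dictionaries (for example after json.loads on the output of df.to_json(orient='records')), the same update can be sketched without pandas. This is only a minimal sketch: update_calculation below is a hypothetical placeholder that shifts the timestamp by one hour, on the assumption that TIME is a Unix timestamp in milliseconds, and the sample values are shortened from the question.

```python
import json

ONE_HOUR_MS = 60 * 60 * 1000  # assumption: TIME is a Unix timestamp in milliseconds


def update_calculation(time_ms):
    # Hypothetical calculation: shift the timestamp forward by one hour.
    return time_ms + ONE_HOUR_MS


def update_times(json_text):
    # Parse the JSON array, update the TIME key of every record,
    # and serialise the result again.
    records = json.loads(json_text)
    for record in records:
        record["TIME"] = update_calculation(record["TIME"])
    return json.dumps(records)


body = ('[{"A": "value", "TIME": 1551052800000, "C": 35}, '
        '{"B": "value", "TIME": 1551052800000, "C": 36}]')
updated = json.loads(update_times(body))
print(updated[0]["TIME"])
```

A single flat loop like this visits each record once, so no recursion is needed for an array of flat objects.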

Related

Error in loading json with os and pandas package [duplicate]

I have some difficulty in importing a JSON file with pandas.
import pandas as pd
map_index_to_word = pd.read_json('people_wiki_map_index_to_word.json')
This is the error that I get:
ValueError: If using all scalar values, you must pass an index
The file structure is simplified like this:
{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}
It is from the machine learning course of University of Washington on Coursera. You can find the file here.
Try
ser = pd.read_json('people_wiki_map_index_to_word.json', typ='series')
That file only contains key value pairs where values are scalars. You can convert it to a dataframe with ser.to_frame('count').
You can also do something like this:
import json
with open('people_wiki_map_index_to_word.json', 'r') as f:
    data = json.load(f)
Now data is a dictionary. You can pass it to a dataframe constructor like this:
df = pd.DataFrame({'count': data})
You can do as @ayhan mentions, which will give you a column-based format.
Or you can enclose the object in [ ] as shown below to get a row format, which is convenient if you are loading multiple values and planning to use a matrix for your machine learning models.
df = pd.DataFrame([data])
I think what is happening is that the data in
map_index_to_word = pd.read_json('people_wiki_map_index_to_word.json')
is being read as a string instead of a json
{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}
is actually
'{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}'
Since a string is a scalar, pandas asks for an index. To load it as JSON, you have to convert it to a dict, which is exactly what the other response does.
The best way is to call json.loads on the string to convert it to a dict, then load that into pandas:
with open('people_wiki_map_index_to_word.json') as f:
    myfile = f.read()
jsonData = json.loads(myfile)
# Wrap the dict in a list so that the scalar values form a single row
df = pd.DataFrame([jsonData])
{
"biennials": 522004,
"lb915": 116290
}
df = pd.read_json('values.json')
pd.read_json expects a list of values for each key, like this:
{
"biennials": [522004],
"lb915": [116290]
}
Because the file has a scalar for each key instead, it returns an error saying
If using all scalar values, you must pass an index.
You can resolve this by specifying the typ argument in pd.read_json. The documented values are 'frame' (the default) and 'series':
map_index_to_word = pd.read_json('Datasets/people_wiki_map_index_to_word.json', typ='series')
For newer pandas (0.19.0 and later), set the lines parameter to True.
The file is then read as one JSON object per line.
import pandas as pd
map_index_to_word = pd.read_json('people_wiki_map_index_to_word.json', lines=True)
It fixed the following errors I encountered, especially when some of the JSON files have only one value:
ValueError: If using all scalar values, you must pass an index
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
ValueError: Trailing data
For example
cat values.json
{
"name": "Snow",
"age": "31"
}
df = pd.read_json('values.json')
Chances are you might end up with this
ValueError: If using all scalar values, you must pass an index
Pandas looks for a list or dictionary in each value. Something like
cat values.json
{
"name": ["Snow"],
"age": ["31"]
}
So try doing this. Later on, to convert to HTML, use to_html():
df = pd.DataFrame([pd.read_json(report_file, typ='series')])
result = df.to_html()
I solved this by converting it into an array like so
[{"biennials": 522004, "lb915": 116290, "shatzky": 127647, "woode": 174106, "damfunk": 133206, "nualart": 153444, "hatefillot": 164111, "missionborn": 261765, "yeardescribed": 161075, "theoryhe": 521685}]
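The wrapping step described above can also be done in code rather than by editing the file by hand. The following is a rough sketch using only the standard json module; wrap_in_array is a hypothetical helper name, and the sample text is a shortened version of the file contents from the question.

```python
import json


def wrap_in_array(json_text):
    # Parse the text; if the top-level value is a single object,
    # wrap it in a list so it becomes an array with one record.
    parsed = json.loads(json_text)
    if isinstance(parsed, dict):
        parsed = [parsed]
    return json.dumps(parsed)


raw = '{"biennials": 522004, "lb915": 116290}'
wrapped = wrap_in_array(raw)
print(wrapped)
```

Text that is already an array passes through unchanged, so the helper can be applied to a mix of files.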

python for loop giving error 'list' object is not callable

I have the following piece of Python code, which is supposed to access a JSON file stored in an S3 bucket, extract the organisation from each JSON object, and place them in a list. A for loop should then go through the list and prepend each organisation name to the suffix /bitgear/IO-Air to create the appropriate IoT topic.
for uuid_index, uuid in enumerate(uuid_list):
    s3 = boto3.resource('s3')
    client = boto3.client('s3')
    bucket = '3deo-sensor-data'
    key = 'simulated/config/IoT-sim-config.json'
    obj = s3.Object(bucket, key)
    data = obj.get()['Body'].read().decode('utf-8')
    json_data = json.loads(data)
    dframe = pd.DataFrame(json_data, columns=['organisation'])
    org = dframe.values.tolist()
    for orgs in org():
        TOPIC = org[orgs] + '/bitgear/IO-Air'
However, I am getting the error 'list' object is not callable
Here is a snippet of how the JSON file looks:
[
{
"uuid": "1597c163-6fbf-4f46-8ff6-1e9eb4f07e34",
"organisation": "port_36",
"device_vendor": "bitgear",
"device_type": "IO-Air",
"client_id": "AQ_2"
},
{
"uuid": "cde2107e-8736-47de-9e87-2033c3063589",
"organisation": "hchjffsd2immvavb7jiqtedp",
"device_vendor": "bitgear",
"device_type": "IO-Air",
"client_id": "IoT_Sim_1"
}
]
Can anyone advise how to best form my desired IoT topic using the organisation name?
You don't need pandas for this. json.loads() deserialises a string to a Python object, so the loaded JSON becomes a list of dictionaries and you can use:
import json
import boto3

topics = []
for uuid_index, uuid in enumerate(uuid_list):
    s3 = boto3.resource('s3')
    client = boto3.client('s3')
    bucket = '3deo-sensor-data'
    key = 'simulated/config/IoT-sim-config.json'
    obj = s3.Object(bucket, key)
    data = obj.get()['Body'].read().decode('utf-8')
    json_data = json.loads(data)
    # Each element of json_data is a dictionary with an "organisation" key
    for data in json_data:
        topics.append(data["organisation"] + '/bitgear/IO-Air')
Which leaves topics containing:
['port_36/bitgear/IO-Air', 'hchjffsd2immvavb7jiqtedp/bitgear/IO-Air']
You can then iterate through each of the topics in the topics list.
Change this part of your code to:
for orgs in org:
    # dframe.values.tolist() yields one-element lists, so take the first item
    TOPIC = orgs[0] + '/bitgear/IO-Air'
Looking at your code, I think it would be much cleaner to first convert the whole JSON to a dataframe with all its columns:
df = pd.DataFrame.from_records(json_data)
This gives you a dataframe with all the required fields. You can then append a string to every value in the "organisation" column:
df['organisation'] = df['organisation'].astype(str) + '/bitgear/IO-Air'
To get the column values as a list, use the .tolist() method:
df['organisation'].tolist()
for uuid_index, uuid in list(enumerate(uuid_list)):
Try this as the first line of your code snippet. It creates a list of two-item tuples, each holding the index and the uuid, which seems to be what you are trying to do.
First of all, org = dframe.values.tolist() returns a list, which you can iterate over but not call with (). So the for loop should be for orgs in org:.
The other thing I noticed is the following:
for orgs in org():
    TOPIC = org[orgs] + '/bitgear/IO-Air'
When you iterate over a list you get the values directly, not the indices, so you can't do org[orgs]; in your case, orgs is the element of the list itself.
On the other hand, if you want the index as well, you can do for idx, value in enumerate(org):
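To make the points above concrete, here is a small self-contained sketch that builds the topics from the JSON text alone. It assumes the text has already been downloaded (the S3 call is left out), build_topics is a hypothetical helper name, and the sample data is copied from the question.

```python
import json

# Sample data copied from the question; in the real code it would come from S3.
data = '''[
  {"uuid": "1597c163-6fbf-4f46-8ff6-1e9eb4f07e34", "organisation": "port_36",
   "device_vendor": "bitgear", "device_type": "IO-Air", "client_id": "AQ_2"},
  {"uuid": "cde2107e-8736-47de-9e87-2033c3063589", "organisation": "hchjffsd2immvavb7jiqtedp",
   "device_vendor": "bitgear", "device_type": "IO-Air", "client_id": "IoT_Sim_1"}
]'''


def build_topics(json_text, suffix='/bitgear/IO-Air'):
    # json.loads returns a list of dictionaries; iterate over it directly
    # (no call parentheses) and read the organisation from each entry.
    return [entry['organisation'] + suffix for entry in json.loads(json_text)]


topics = build_topics(data)
# enumerate gives both the index and the value when the index is needed.
for idx, topic in enumerate(topics):
    print(idx, topic)
```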

Python: json to json array and append date in the beginning

I have this json :
[{"UGC_TECH_PLATEFORME": "youtube", "UGC_TECH_ID": "UCu93VC-rD_TolBF4Pe5yz_Q"}]
And I'd like to get this one:
[{"2020-09-23":{"UGC_TECH_PLATEFORME": "youtube", "UGC_TECH_ID": "UCu93VC-rD_TolBF4Pe5yz_Q"}}]
I guess I should use the append method, but I really can't figure out how.
Thanks.
As @jizhihaoSAMA answered, if you want to store the object under today's date:
from datetime import datetime
yourjson[0] = {
    str(datetime.now().date()): yourjson[0]
}
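A slightly more general sketch of the same idea wraps every record in the array rather than only the first. key_by_date is a hypothetical helper name, and the optional day parameter is an assumption added so that a fixed date such as the one in the question can be passed in.

```python
import json
from datetime import date


def key_by_date(records, day=None):
    # Wrap each record in a single-key dict whose key is the ISO date (YYYY-MM-DD).
    # When no day is given, today's date is used.
    day = day or date.today()
    return [{day.isoformat(): record} for record in records]


yourjson = json.loads(
    '[{"UGC_TECH_PLATEFORME": "youtube", "UGC_TECH_ID": "UCu93VC-rD_TolBF4Pe5yz_Q"}]'
)
result = key_by_date(yourjson, date(2020, 9, 23))
print(json.dumps(result))
```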

How to work with multiple of the same key?

I have a large dictionary that contains weather data. You can take a look at it here
This weather data is for multiple days, and I want to get all of the values from one key. How would I do this?
Here is a simplified version of the dictionary:
'data': { 'day1' : {'weather_description': 'cloudy'},
          'day2' : {'weather_description': 'clear'}
        }
I tried to use this code:
import requests
r = requests.get('data website')
res = r.json()
print(res['weather_description'])
You need a loop to get them all.
for day, data in res['data'].items():
    print(f"Weather on {day} was {data['weather_description']}")
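If you want the values collected in a list instead of printed, a comprehension over the inner dictionaries works. This is a minimal sketch: the sample dictionary mirrors the simplified one in the question, using the key spelling from the answer, and collect is a hypothetical helper name.

```python
res = {
    'data': {
        'day1': {'weather_description': 'cloudy'},
        'day2': {'weather_description': 'clear'},
    }
}


def collect(nested, key):
    # Return the value stored under `key` in every inner dictionary that has it.
    return [inner[key] for inner in nested.values() if key in inner]


descriptions = collect(res['data'], 'weather_description')
print(descriptions)
```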

Convert pandas json for highstock data format in a angularjs controller

I'm racking my brain over how to convert JSON to the highstock array format for a series chart.
In my code I do the following:
df2.reset_index().to_json(orient='records')
which results in this (for example):
[{"date":1456185600000,"adj_close":94.69},
{"date":1456099200000,"adj_close":96.88},
{"date":1455840000000,"adj_close":96.04},
{"date":1455753600000,"adj_close":96.26}]
I then receive that JSON in my AngularJS controller as an array of objects, and I don't know how to transform it into the highstock format, for example:
[
[1456185600000, 94.69],
[1456099200000, 96.88],
[1455840000000, 96.04],
[1455753600000, 96.26]
]
What should I do to make this conversion?
d = [{"date":1456185600000,"adj_close":94.69},
{"date":1456099200000,"adj_close":96.88},
{"date":1455840000000,"adj_close":96.04},
{"date":1455753600000,"adj_close":96.26}]
Then a list comprehension like this
[[di['date'], di['adj_close']] for di in d]
gives you the desired output:
[[1456185600000, 94.69],
[1456099200000, 96.88],
[1455840000000, 96.04],
[1455753600000, 96.26]]
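If the conversion is done on the Python side before the response is sent, the comprehension can be wrapped in a small function. The sketch below is illustrative only: to_highstock and its x_key and y_key parameters are hypothetical names, and the payload is a shortened copy of the sample data.

```python
import json


def to_highstock(records, x_key='date', y_key='adj_close'):
    # Turn a list of record dictionaries into [x, y] pairs,
    # preserving the order of the input.
    return [[record[x_key], record[y_key]] for record in records]


payload = ('[{"date":1456185600000,"adj_close":94.69},'
           '{"date":1456099200000,"adj_close":96.88}]')
series = to_highstock(json.loads(payload))
print(json.dumps(series))
```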
You could use the Array.map() method to reformat the objects:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/map
var objectArray = [
  {"date":1456185600000,"adj_close":94.69},
  {"date":1456099200000,"adj_close":96.88},
  {"date":1455840000000,"adj_close":96.04},
  {"date":1455753600000,"adj_close":96.26}];
var highStockSeries = objectArray.map(function(object) {
  return [object.date, object.adj_close];
});
console.log(highStockSeries);
<div id="result">look in the JS console for your array</div>
