import json
import pandas as pd
with open('sample.json', 'r') as f:
data = json.load(f)
df = pd.DataFrame(data)
print(df.to_string()[str('GET'):int('502')])
i need (GET AND 502,404 ERROR LIST) FROM THE LOCAL JSON USING PYTHON PANDAS
I am trying to re-write a json file to add missing data values. but i cant seem to get the code to re-write the data on the json file. Here is the code to fill in missing data:
import pandas as pd
import json
data_df = pd.read_json("Data_test.json")
#replacing empty strings with nan
df2 = data_df.mask(data_df == "")
#filling the nan with data from above.
df2["Food_cat"].fillna(method="ffill", inplace=True,)
"Data_test.json" is the file with the list of dictionary and I am trying to either edit this json file or create a new one with the filled in data that was missing.
I have tried using
with open('complete_data', 'w') as f:
json.dump(df2, f)
but it does not seem to work. is there a way to edit the current data or create a new json file with the completed data?
this is the original data, I would like to keep this format.
Try to do this
import pandas as pd
import json
data_df = pd.read_json("Data_test.json")
#replacing empty strings with nan
df2 = data_df.mask(data_df == "")
#filling the nan with data from above.
df2["Food_cat"].fillna(method="ffill", inplace=True,)
df2.to_json('path_of_file.json')
Tell me if it works.
I am trying to read this json file in python using this code (I want to have all the data in a data frame):
import numpy as np
import pandas as pd
import json
from pandas.io.json import json_normalize
df = pd.read_json('short_desc.json')
df.head()
Data frame head screenshot
using this code I am able to convert only the first row to separated columns:
json_normalize(df.short_desc.iloc[0])
First row screenshot
I want to do the same for whole df using this code:
df.apply(lambda x : json_normalize(x.iloc[0]))
but I get this error:
ValueError: If using all scalar values, you must pass an index
What I am doing wrong?
Thank you in advance
After reading the json file with json.load, you can use pd.DataFrame.from_records. This should create the DataFrame you are looking for.
wih open('short_desc.json') as f:
d = json.load(f)
df = pd.DataFrame.from_records(d)
I am using python 3.6 and trying to download json file (350 MB) as pandas dataframe using the code below. However, I get the following error:
data_json_str = "[" + ",".join(data) + "]
"TypeError: sequence item 0: expected str instance, bytes found
How can I fix the error?
import pandas as pd
# read the entire file into a python array
with open('C:/Users/Alberto/nutrients.json', 'rb') as f:
data = f.readlines()
# remove the trailing "\n" from each line
data = map(lambda x: x.rstrip(), data)
# each element of 'data' is an individual JSON object.
# i want to convert it into an *array* of JSON objects
# which, in and of itself, is one large JSON object
# basically... add square brackets to the beginning
# and end, and have all the individual business JSON objects
# separated by a comma
data_json_str = "[" + ",".join(data) + "]"
# now, load it into pandas
data_df = pd.read_json(data_json_str)
From your code, it looks like you're loading a JSON file which has JSON data on each separate line. read_json supports a lines argument for data like this:
data_df = pd.read_json('C:/Users/Alberto/nutrients.json', lines=True)
Note
Remove lines=True if you have a single JSON object instead of individual JSON objects on each line.
Using the json module you can parse the json into a python object, then create a dataframe from that:
import json
import pandas as pd
with open('C:/Users/Alberto/nutrients.json', 'r') as f:
data = json.load(f)
df = pd.DataFrame(data)
If you open the file as binary ('rb'), you will get bytes. How about:
with open('C:/Users/Alberto/nutrients.json', 'rU') as f:
Also as noted in this answer you can also use pandas directly like:
df = pd.read_json('C:/Users/Alberto/nutrients.json', lines=True)
if you want to convert it into an array of JSON objects, I think this one will do what you want
import json
data = []
with open('nutrients.json', errors='ignore') as f:
for line in f:
data.append(json.loads(line))
print(data[0])
The easiest way to read json file using pandas is:
pd.read_json("sample.json",lines=True,orient='columns')
To deal with nested json like this
[[{Value1:1},{value2:2}],[{value3:3},{value4:4}],.....]
Use Python basics
value1 = df['column_name'][0][0].get(Value1)
Please the code below
#call the pandas library
import pandas as pd
#set the file location as URL or filepath of the json file
url = 'https://www.something.com/data.json'
#load the json data from the file to a pandas dataframe
df = pd.read_json(url, orient='columns')
#display the top 10 rows from the dataframe (this is to test only)
df.head(10)
Please review the code and modify based on your need. I have added comments to explain each line of code. Hope this helps!
I am having a bit of trouble trying to load a JSON file into a pandas data frame. I have gone through different iterations of code listed below with no luck:
import pandas as pd
fileName = "ipl2015_auction_data.json"
jsonData = pd.read_json(fileName, orient="records")
ERROR: ValueError: arrays must all be same length
import pandas as pd
fileName = "ipl2015_auction_data.json"
#jsonData = pd.read_json(fileName, orient="records")
with open(fileName, "r") as jsonFile:
data = jsonFile.read()
df = pd.DataFrame(data)
print df.head()
ERROR: pandas.core.common.PandasError: DataFrame constructor not properly called!
I can't figure out what I am doing wrong. Here is the JSON data I am trying to load.
SOLUTION:
It turns out there was an issue with my pandas installation. Re-installing it did the trick.