How to load a json file in jupyter notebook using pandas? - python

I am trying to load a JSON file in my Jupyter notebook:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import json
%matplotlib inline
with open("pud.json") as datafile:
    data = json.load(datafile)
dataframe = pd.DataFrame(data)
I am getting the following error:
JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Please help

If you want to load a JSON file, use pandas.read_json:
pandas.read_json("pud.json")
This loads the JSON directly into a DataFrame. Note that read_json still needs the file to contain valid JSON, so it will also fail if the file is empty or malformed.
The function signature is shown below (the exact parameters vary between pandas versions):
pandas.read_json(path_or_buf=None, orient=None, typ='frame', dtype=True, convert_axes=True, convert_dates=True, keep_default_dates=True, numpy=False, precise_float=False, date_unit=None, encoding=None, lines=False, chunksize=None, compression='infer')
You can get more information about the parameters here:
http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_json.html

Another way using json!
import pandas as pd
import json
with open('File_location.json') as f:
    data = json.load(f)
df = pd.DataFrame(data)

with open('pud.json', 'r') as file:
    variable_name = json.load(file)
The JSON file will be loaded as a Python dictionary (or as a list, if the top-level JSON value is an array).

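The code you have written is fine. The problem is that the .json file you are loading does not contain valid JSON. "Expecting value: line 1 column 1 (char 0)" means the parser failed at the very first character, which typically happens when the file is empty or holds something other than JSON (an HTML page, for example). Please check the file.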
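One way to check the file is to look at what it actually contains before parsing it. The following is only a minimal sketch using the standard library; the helper name describe_json_problem is made up for illustration and is not part of json or pandas.

```python
import json
from pathlib import Path


def describe_json_problem(path):
    """Return a short description of why `path` may fail to parse as JSON."""
    text = Path(path).read_text(encoding="utf-8")
    if not text.strip():
        return "file is empty"
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # Report where parsing stopped and how the file begins.
        return (
            f"invalid JSON at line {err.lineno}, column {err.colno}; "
            f"file starts with {text[:40]!r}"
        )
    return "valid JSON"


# Usage (assumes pud.json is in the notebook's working directory):
# print(describe_json_problem("pud.json"))
```

If this reports an empty file, or the first characters look like HTML, the file was probably saved or downloaded incorrectly.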

Related

Loading a parquet file from a GitHub repository

I tried to read a parquet (.parq) file I have stored in a GitHub project, using the following script:
import pandas as pd
import numpy as np
import ipywidgets as widgets
import datetime
from ipywidgets import interactive
from IPython.display import display, Javascript
import warnings
warnings.filterwarnings('ignore')
parquet_file = r'https://github.com/smaanan/sev.en_commodities/blob/main/random_deals.parq'
df = pd.read_parquet(parquet_file, engine='auto')
and it gave me this error:
ArrowInvalid: Could not open Parquet input source '': Parquet
magic bytes not found in footer. Either the file is corrupted or this
is not a parquet file.
Does anyone know what this error message means and how I can load the file in my GitHub repository? Thank you in advance.
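The /blob/ URL returns GitHub's HTML page for the file rather than the file's bytes, which is why the Parquet reader cannot find its magic bytes. You should use the URL under the domain raw.githubusercontent.com instead.
For your example: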
parquet_file = 'https://raw.githubusercontent.com/smaanan/sev.en_commodities/main/random_deals.parq'
df = pd.read_parquet(parquet_file, engine='auto')
You can read Parquet files directly from a web URL like this. However, when reading a data file from a GitHub repository, you need to make sure it is the raw file URL:
url = 'https://github.com/smaanan/sev.en_commodities/blob/main/random_deals.parq?raw=true'
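If you have many such links, the rewrite from a /blob/ page URL to the raw URL can be done in code. This is a rough sketch only: github_blob_to_raw and looks_like_parquet are hypothetical helper names, and the URL layout assumed is the owner/repo/blob/branch/path shape seen in the question.

```python
from urllib.parse import urlsplit


def github_blob_to_raw(url):
    """Convert a github.com '/blob/' page URL to its raw.githubusercontent.com form."""
    parts = urlsplit(url)
    segments = parts.path.strip("/").split("/")
    # Assumed shape: <owner>/<repo>/blob/<branch>/<path...>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError(f"not a GitHub blob URL: {url}")
    owner, repo, _, *rest = segments
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])


def looks_like_parquet(data):
    """Parquet files begin and end with the 4-byte marker b'PAR1'."""
    return len(data) >= 8 and data[:4] == b"PAR1" and data[-4:] == b"PAR1"


raw_url = github_blob_to_raw(
    "https://github.com/smaanan/sev.en_commodities/blob/main/random_deals.parq"
)
```

The second helper is a quick sanity check on downloaded bytes: an HTML page will not pass it, which matches the "magic bytes not found" error above.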

How to import the json file in python

I am trying to import a JSON file into Python and then export it to an Excel file using the following code:
import pandas as pd
df = pd.read_json('pub_settings.json')
df.to_excel('pub_settings.xlsx')
but I am getting the following error:
Can anyone please tell me what I am doing wrong?
First, load the JSON file as a dictionary using the following code:
import json
with open("") as f:
    data = json.load(f)
Then you can use the library at the following link to convert it to xlsx:
https://pypi.org/project/tablib/0.9.3/

How do you import a ndjson file in jupyter notebook

I have tried the code below but it's not working
import json
with open("/Users/elton/20210228test2.ndjson") as f:
    test2data = ndjson.load(f)
This works for me: import ndjson instead of json. Your code calls ndjson.load but only imports json. See more here: https://pypi.org/project/ndjson/
import ndjson
# load from file-like objects
with open('data.ndjson') as f:
    data = ndjson.load(f)
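If you would rather not install an extra package, the same thing can be approximated with the standard json module, since an ndjson file is just one JSON document per line. This is only a sketch; load_ndjson is a made-up helper name and the sample lines are invented for illustration. (pandas.read_json also has a lines=True option for this format.)

```python
import json


def load_ndjson(lines):
    """Parse newline-delimited JSON: one JSON document per non-blank line."""
    return [json.loads(line) for line in lines if line.strip()]


# A file object iterates over its lines, so it can be passed directly:
# with open("data.ndjson") as f:
#     data = load_ndjson(f)

# Invented sample input standing in for the lines of a file.
sample = ['{"id": 1}\n', '\n', '{"id": 2}\n']
records = load_ndjson(sample)
```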

Can't save data from yfinance into a CSV file

I found a library that lets me get data from Yahoo Finance very efficiently. It's a wonderful library.
The problem is, I can't save the data into a CSV file.
I've tried converting the data to a pandas DataFrame, but I think I'm doing it incorrectly and I'm getting a bunch of NaNs.
I tried using NumPy to save directly into a CSV file and that's not working either.
import yfinance as yf
import csv
import numpy as np
urls = [
    'voo',
    'msft'
]
for url in urls:
    tickerTag = yf.Ticker(url)
    print(tickerTag.actions)
    np.savetxt('DivGrabberTest.csv', tickerTag.actions, delimiter = '|')
I can print the data on console and it's fine. Please help me save it into a csv. Thank you!
tickerTag.actions is already a pandas DataFrame, so you can call to_csv on it directly. If you want to store the ticker results for each url in different CSV files, you can do:
for url in urls:
    tickerTag = yf.Ticker(url)
    tickerTag.actions.to_csv("tickertag{}.csv".format(url))
If you want them all to be in the same CSV file, you can do:
import pandas as pd
tickerlist = [yf.Ticker(url).actions for url in urls]
pd.concat(tickerlist).to_csv("tickersconcat.csv")
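With a plain concat, the combined file no longer says which rows belong to which ticker. One possible way around that is to pass a dict to pd.concat so the symbols become an extra index level. The sketch below uses small hand-made frames with invented numbers in place of yf.Ticker(symbol).actions, so it runs without network access; the column names are an assumption about what the actions frame looks like.

```python
import pandas as pd

# Hypothetical stand-ins for yf.Ticker(symbol).actions (values are invented).
actions_by_symbol = {
    "voo": pd.DataFrame(
        {"Dividends": [1.0, 1.1], "Stock Splits": [0.0, 0.0]},
        index=pd.to_datetime(["2021-03-26", "2021-06-29"]),
    ),
    "msft": pd.DataFrame(
        {"Dividends": [0.56], "Stock Splits": [0.0]},
        index=pd.to_datetime(["2021-05-19"]),
    ),
}

# Passing a dict to pd.concat uses its keys as an outer index level,
# so every row stays labelled with its ticker symbol.
combined = pd.concat(actions_by_symbol, names=["symbol", "date"])

# With no path, to_csv returns the CSV text; pass a filename to write a file.
csv_text = combined.to_csv()
```

With real data you would build the dict as {url: yf.Ticker(url).actions for url in urls} and call combined.to_csv("tickersconcat.csv").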

Python: read a csv file generated dynamically by an API?

I want to read into pandas the csv generated by this URL:
https://www.alphavantage.co/query?function=FX_DAILY&from_symbol=EUR&to_symbol=USD&apikey=demo&datatype=csv
How should this be done?
I believe you can just read it with pd.read_csv, which accepts a URL as well as a file path:
import pandas as pd
URL = 'https://www.alphavantage.co/query?function=FX_DAILY&from_symbol=EUR&to_symbol=USD&apikey=demo&datatype=csv'
df = pd.read_csv(URL)
