Convert NOAA JSON value to data frame in python - python

The NOAA API returns the hourly weather forecast for Boston as JSON. Here is the link: https://api.weather.gov/gridpoints/BOX/70,76
(The JSON is too long to reproduce here; please follow the link to see it.)
I want to convert some of the weather variables into a data frame for further calculations.
The expected format for temperature is shown below. I will use the same format for precipitation, snowfall, humidity, etc.
expected dataframe format
I cannot figure out how to convert it into the data frame I want. Any help would be appreciated.
Here is my best attempt so far, but it still does not extract validTime and value from temperature:
import requests
import pandas as pd
response = requests.get("https://api.weather.gov/gridpoints/BOX/70,76")
# create new variable forecast
forecast = response.json()
df1 = pd.DataFrame.from_records(forecast['properties']).reset_index()
df2 = df1.loc[:1, ['temperature', 'quantitativePrecipitation', 'snowfallAmount', 'relativeHumidity', 'windGust', 'windSpeed', 'visibility']]
df2
current output
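One possible way to get at validTime and value is sketched below. It assumes that each variable under properties holds a values list whose entries look like {"validTime": "<start>/<duration>", "value": ...}; the sample data and the helper name variable_to_rows are made up for illustration and are not part of the NOAA API.

```python
from datetime import datetime

# Hypothetical, abbreviated sample shaped like the "properties" object of the
# gridpoints response; the real payload has many more variables and entries.
sample_forecast = {
    "properties": {
        "temperature": {
            "uom": "wmoUnit:degC",
            "values": [
                {"validTime": "2021-07-01T06:00:00+00:00/PT1H", "value": 21.5},
                {"validTime": "2021-07-01T07:00:00+00:00/PT2H", "value": 22.0},
            ],
        }
    }
}


def variable_to_rows(forecast, name):
    """Flatten one weather variable into a list of row dicts (hypothetical helper)."""
    rows = []
    for entry in forecast["properties"][name]["values"]:
        # validTime is assumed to be an ISO 8601 interval: "<start>/<duration>"
        start, _, duration = entry["validTime"].partition("/")
        rows.append({
            "validTime": datetime.fromisoformat(start),
            "duration": duration,
            name: entry["value"],
        })
    return rows


temperature_rows = variable_to_rows(sample_forecast, "temperature")
for row in temperature_rows:
    print(row)
```

With the live response, pd.DataFrame(variable_to_rows(forecast, 'temperature')) should give one row per forecast interval, and the same call with 'relativeHumidity', 'snowfallAmount', etc. would cover the other variables.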

Related

Time format changes when I read an Excel file into Python

I read an Excel file into Python like this:
import pandas as pd
data = pd.read_excel('copy.xlsx')
data
Some of my time data was read correctly, but other parts were not. The problem is in these columns: in_time, call_time, process_in_time, out_time.
Why is this happening?
And how can I handle and normalize this time data?

Is there a way in Python to pull multiple tickers / symbols from Binance and extract the necessary information from them?

I would like to pull multiple tickers from Binance. I have managed to do so and to write them into a CSV file. However, I am having trouble extracting only the OHLCV data from the columns so that I can then wrap ta-lib around it.
For example, I would like to keep the OHLCV data from each row for XRPBTC and NEOBTC, which are stored in columns, and either write it into a new file or wrap ta-lib around it directly. This works fine for a single ticker, but I am having trouble doing it for multiple tickers.
As I understand it, each cell holds a list. Can I split these lists to keep only the OHLCV data from each row and each column and write it into a new file? Is there an easier way of splitting a list?
screenshot of the data
Link to relevant binance documentation Klines candlestick data
import pandas as pd
import numpy as np
import csv
import talib as ta
from binance.client import Client
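# Placeholder credentials; the original post did not show how client was created
client = Client("YOUR_API_KEY", "YOUR_API_SECRET")
candlesticks = ['XRPBTC', 'NEOBTC']  # unable to split for each row in multiple columns
data = pd.DataFrame()
for candlestick in candlesticks:
    # each cell of the column ends up holding one full kline list
    data[candlestick] = client.get_historical_klines(candlestick, Client.KLINE_INTERVAL_15MINUTE, "1 Jul, 2021")
data.to_csv("XRPNEO15M.csv")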
print(data)
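A minimal sketch of one way to keep only the OHLCV fields, assuming each kline is a list laid out as in the Binance klines documentation (open time in milliseconds first, then open, high, low, close and volume as strings). The sample row and the helper name klines_to_ohlcv are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical sample row in the documented kline layout:
# [open time, open, high, low, close, volume, close time, ...]
sample_klines = [
    [1625097600000, "0.00002", "0.000021", "0.000019", "0.0000205", "12345.0",
     1625098499999, "0.25", 150, "6000.0", "0.12", "0"],
]

OHLCV_FIELDS = ("open", "high", "low", "close", "volume")


def klines_to_ohlcv(symbol, klines):
    """Return one dict per kline holding only symbol, open time and OHLCV."""
    rows = []
    for kline in klines:
        row = {
            "symbol": symbol,
            # the open time is a millisecond epoch timestamp
            "open_time": datetime.fromtimestamp(kline[0] / 1000, tz=timezone.utc),
        }
        # positions 1..5 hold open, high, low, close, volume as strings
        for field, raw in zip(OHLCV_FIELDS, kline[1:6]):
            row[field] = float(raw)
        rows.append(row)
    return rows


ohlcv = klines_to_ohlcv("XRPBTC", sample_klines)
print(ohlcv[0])
```

Calling this once per symbol and concatenating the resulting lists gives a long-format table (one row per symbol and candle) that can be passed to pd.DataFrame, which avoids storing whole lists inside DataFrame cells.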

Python: Saving AJAX response data to .json and save this to pandas DataFrame

Hello, and thank you for taking the time to read this.
I am looking to extract company information from a particular stock exchange and then save this information to a pandas DataFrame.
Each firm has its own webpage, identified by the "KodeEmiten" code at the end of the URL. These codes are saved in a column of the first DataFrame:
df = pd.DataFrame.from_dict(data['data'])
My goal is to use these codes to call each company's page individually and get the JSON for each:
for i in range(len(df)):
    requests.get(f'https://www.idx.co.id/umbraco/Surface/ListedCompany/GetCompanyProfilesDetail?emitenType=&kodeEmiten={df.loc[i, "KodeEmiten"]}').json()
While this works, I can't save the results to a new DataFrame because of "list index out of range" and incorrect keyword errors. The XHR response contains significantly more information than I need, and I believe the differing structures cause the errors when I try to save them to a new DataFrame. I'm really only interested in the data under these keys of the XHR response:
AnakPerusahaan, Direktur, Komisaris, PemegangSaham
So my question is two-in-one:
a) How can I extract only the information under those specific keys (all of them are tables)?
b) How can I save those to a new DataFrame (or even a list, I don't really mind)?
import requests
import pandas as pd
import json
import time
# gets broad data of main page of the stock exchange
sxow = requests.get('https://www.idx.co.id/umbraco/Surface/ListedCompany/GetCompanyProfiles?draw=1&columns%5B0%5D%5Bdata%5D=KodeEmiten&columns%5B0%5D%5Bname%5D&columns%5B0%5D%5Bsearchable%5D=true&columns%5B0%5D%5Borderable%5D=false&columns%5B0%5D%5Bsearch%5D%5Bvalue%5D&columns%5B0%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B1%5D%5Bdata%5D=KodeEmiten&columns%5B1%5D%5Bname%5D&columns%5B1%5D%5Bsearchable%5D=true&columns%5B1%5D%5Borderable%5D=false&columns%5B1%5D%5Bsearch%5D%5Bvalue%5D&columns%5B1%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B2%5D%5Bdata%5D=NamaEmiten&columns%5B2%5D%5Bname%5D&columns%5B2%5D%5Bsearchable%5D=true&columns%5B2%5D%5Borderable%5D=false&columns%5B2%5D%5Bsearch%5D%5Bvalue%5D&columns%5B2%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B3%5D%5Bdata%5D=TanggalPencatatan&columns%5B3%5D%5Bname%5D&columns%5B3%5D%5Bsearchable%5D=true&columns%5B3%5D%5Borderable%5D=false&columns%5B3%5D%5Bsearch%5D%5Bvalue%5D&columns%5B3%5D%5Bsearch%5D%5Bregex%5D=false&start=0&length=700&search%5Bvalue%5D&search%5Bregex%5D=false&_=155082600847')
data = sxow.json()  # parse the JSON response into a Python dict
df = pd.DataFrame.from_dict(data['data'])  # create a DataFrame from the parsed records
# add: compare file contents and overwrite original if same
cdate = time.strftime("%Y%m%d")  # current date as a string: year|month|day
df.to_excel(f"{cdate}StockExchange_Overview.xlsx")  # write the DataFrame to an Excel file, can't overwrite existing file
for i in range(len(df)):
    requests.get(f'https://www.idx.co.id/umbraco/Surface/ListedCompany/GetCompanyProfilesDetail?emitenType=&kodeEmiten={df.loc[i, "KodeEmiten"]}').json()
    # This is where I'm completely stuck
You don't need to convert the result to a DataFrame first. You can loop through the JSON object directly and append each KodeEmiten to the URL to fetch the details for every company.
For example:
import requests
import pandas as pd
import json
import time
# gets broad data of main page of the stock exchange
sxow = requests.get('https://www.idx.co.id/umbraco/Surface/ListedCompany/GetCompanyProfiles?draw=1&columns%5B0%5D%5Bdata%5D=KodeEmiten&columns%5B0%5D%5Bname%5D&columns%5B0%5D%5Bsearchable%5D=true&columns%5B0%5D%5Borderable%5D=false&columns%5B0%5D%5Bsearch%5D%5Bvalue%5D&columns%5B0%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B1%5D%5Bdata%5D=KodeEmiten&columns%5B1%5D%5Bname%5D&columns%5B1%5D%5Bsearchable%5D=true&columns%5B1%5D%5Borderable%5D=false&columns%5B1%5D%5Bsearch%5D%5Bvalue%5D&columns%5B1%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B2%5D%5Bdata%5D=NamaEmiten&columns%5B2%5D%5Bname%5D&columns%5B2%5D%5Bsearchable%5D=true&columns%5B2%5D%5Borderable%5D=false&columns%5B2%5D%5Bsearch%5D%5Bvalue%5D&columns%5B2%5D%5Bsearch%5D%5Bregex%5D=false&columns%5B3%5D%5Bdata%5D=TanggalPencatatan&columns%5B3%5D%5Bname%5D&columns%5B3%5D%5Bsearchable%5D=true&columns%5B3%5D%5Borderable%5D=false&columns%5B3%5D%5Bsearch%5D%5Bvalue%5D&columns%5B3%5D%5Bsearch%5D%5Bregex%5D=false&start=0&length=700&search%5Bvalue%5D&search%5Bregex%5D=false&_=155082600847')
data = sxow.json()  # parse the JSON response into a Python dict
list_of_json = []
for nested_json in data['data']:
    list_of_json.append(requests.get('https://www.idx.co.id/umbraco/Surface/ListedCompany/GetCompanyProfilesDetail?emitenType=&kodeEmiten=' + nested_json['KodeEmiten']).json())
    time.sleep(1)  # pause between requests
list_of_json will contain all the JSON results you requested.
Here nested_json is the loop variable that iterates over the array of JSON objects, one per KodeEmiten.
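This is a slight variation on @bigbounty's approach:
Since the aim is to collect the responses in a list and then use that list later in the script, a list comprehension is a more compact option. Any speed gain is likely small, because the HTTP requests dominate the run time, and note that this version drops the one-second pause between requests.
i.e., with url holding the GetCompanyProfilesDetail address up to and including kodeEmiten=:
list_of_json = [requests.get(url + nested_json["KodeEmiten"]).json() for nested_json in data["data"]]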
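For part (a) of the question, a minimal sketch of pulling out only the four wanted keys is given here. It assumes each of those keys maps to a list of dicts (the question describes them as tables) and that a key may be missing for some companies. The sample data, the inner field name "Nama" and the helper name extract_tables are hypothetical.

```python
WANTED_KEYS = ("AnakPerusahaan", "Direktur", "Komisaris", "PemegangSaham")

# Hypothetical, heavily trimmed detail responses keyed by KodeEmiten;
# the field names inside each table are invented for illustration.
sample_details = {
    "AAAA": {"Direktur": [{"Nama": "Person A"}], "Komisaris": [], "Other": 1},
    "BBBB": {"PemegangSaham": [{"Nama": "Holder B"}]},
}


def extract_tables(detail, code):
    """Keep only the wanted tables and tag every record with its company code."""
    tables = {}
    for key in WANTED_KEYS:
        # .get avoids a KeyError when a company has no such table
        records = detail.get(key) or []
        tables[key] = [{"KodeEmiten": code, **record} for record in records]
    return tables


# Collect the records of all companies, one combined list per table
combined = {key: [] for key in WANTED_KEYS}
for code, detail in sample_details.items():
    for key, rows in extract_tables(detail, code).items():
        combined[key].extend(rows)

print(combined["Direktur"])
```

For part (b), each list in combined could then be passed to pd.DataFrame, giving one DataFrame per table instead of forcing the differently shaped tables into a single frame.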

Converting CSV to HTML keeping format

My objective: convert a DataFrame to HTML that is sent as a daily email.
Current method: converting the DataFrame to CSV and then to HTML.
Problem: I created my DataFrame with as_index=True set, but when I save it to a CSV this formatting is lost:
Example DataFrame:
When I save this DataFrame using to_csv(), the formatting of the index is lost (ABC is now written three times down the index instead of once, as I want it).
I want the CSV to keep the same formatting. Is that possible?
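Skip the CSV step and call pandas' to_html() directly on the DataFrame. With the default settings it should render a repeated index label such as ABC only once, spanning its rows.
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_html.html
Hope this helps.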

Convert API call into a csv file with selected attributes in columns

I have accessed public API data using the link given below.
import json
import urllib.request
import csv
# urllib.urlopen only exists in Python 2; Python 3 uses urllib.request.urlopen
data = urllib.request.urlopen("https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson&starttime=2016-10-01&endtime=2016-10-02").read()
output = json.loads(data)
print(output)
I need help putting the obtained data into a CSV file. The following attributes should be the columns of the CSV file:
• Latitude (Hint: treat the first entry in the coordinates attribute as Latitude)
• Longitude (Hint: treat the second entry in the coordinates attribute as Longitude)
• Title: this should include the earthquake description
• Place: the location of the earthquake
• Mag: magnitude of the earthquake
The result should then be converted into a pandas DataFrame.
You can do this directly using pd.read_csv() and by requesting CSV data in the HTTP request:
import pandas as pd
url_csv = 'https://earthquake.usgs.gov/fdsnws/event/1/query?format=csv&starttime=2016-10-01&endtime=2016-10-02'
df = pd.read_csv(url_csv, usecols=['latitude', 'longitude', 'place', 'mag'])
Notice that I have changed the URL to request the data in CSV format by setting format=csv, and that pd.read_csv() accepts a URL for the data. usecols selects those columns to retain.
The CSV file does not contain the title column. However, that column seems to be composed of the magnitude and location columns, so, although you might want to avoid adding duplicate data, it can be constructed and added to the DataFrame like this:
df['title'] = 'M ' + df['mag'].map(str) + ' - ' + df['place']
There is also a pd.read_json() function in pandas, but I wasn't able to get it to work easily. If you can figure it out, you should be able to extract the required data without manually composing the title column.
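If the GeoJSON response is preferred over the CSV endpoint, the nested fields can also be flattened by hand, as in the sketch below. It assumes the response follows the GeoJSON layout, with a features list whose items carry properties (title, place, mag) and geometry.coordinates. Note that GeoJSON orders coordinates as longitude, latitude, which is the reverse of the hint in the question, so the code maps them that way. The sample feature and the helper name features_to_rows are hypothetical.

```python
import csv
import io

# Hypothetical single-feature sample in GeoJSON layout
sample_output = {
    "features": [
        {
            "properties": {
                "mag": 4.5,
                "place": "10km N of Exampletown",
                "title": "M 4.5 - 10km N of Exampletown",
            },
            "geometry": {"type": "Point", "coordinates": [-120.5, 35.25, 10.0]},
        }
    ]
}

FIELDNAMES = ["latitude", "longitude", "title", "place", "mag"]


def features_to_rows(geojson):
    """Flatten each feature into a dict with the five requested columns."""
    rows = []
    for feature in geojson["features"]:
        # GeoJSON positions are [longitude, latitude, (optional) depth]
        longitude, latitude = feature["geometry"]["coordinates"][:2]
        props = feature["properties"]
        rows.append({
            "latitude": latitude,
            "longitude": longitude,
            "title": props["title"],
            "place": props["place"],
            "mag": props["mag"],
        })
    return rows


rows = features_to_rows(sample_output)

# Write to an in-memory buffer here; open("quakes.csv", "w", newline="")
# could be used instead to write a file (the file name is made up).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The same rows list can be handed to pd.DataFrame(rows) to get the DataFrame, with the title taken straight from the response rather than composed manually.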
