Format the extracted covid vaccine data from website - python

I am trying to load the "Vaccine data" from this URL into a pandas DataFrame:
https://www.mygov.in/sites/default/files/covid/vaccine/covid_vaccine_timeline.json
Here is the parent website:
https://www.mygov.in/
Sample output:
{"vaccine_data":[{"day":"2021-03-01","india_dose1":12256337,"india_dose2":2597799,"india_total_doses":14854136,"india_last_dose1":null,"india_last_dose2":null,"india_last_total_doses":null,"vacc_st_data":[{"st_name":"Andaman and Nicobar","state_id":"1","covid_state_name":"Andaman and Nicobar","covid_state_id":"35","dose1":"6581","dose2":"2556","total_doses":"9137","last_dose1":"","last_dose2":"","last_total_doses":""},{"st_name":"Andhra Pradesh","state_id":"2","covid_state_name":"Andhra Pradesh","covid_state_id":"28","dose1":"541202","dose2":"142431","total_doses":"683633","last_dose1":"","last_dose2":"","last_total_doses":""},{"st_name":"Arunachal Pradesh","state_id":"3","covid_state_name":"Arunachal Pradesh","covid_state_id":"12","dose1":"27572","dose2":"7309","total_doses":"34881","last_dose1":"","last_dose2":"","last_total_doses":""},{"st_name":"Assam","state_id":"4","covid_state_name":"Assam","covid_state_id":"18","dose1":"201640","dose2":"29159","total_doses":"230799","last_dose1":"","last_dose2":"","last_total_doses":""},{"st_name":"Bihar","state_id":"4","covid_state_name":"Bihar","covid_state_id":"10","dose1":"562270","dose2":"81079","total_doses":"643349","last_dose1":"","last_dose2":"","last_total_doses":""},{"st_name":"Chandigarh","state_id":"6","covid_state_name":"Chandigarh","covid_state_id":"4","dose1":"22424","dose2":"1899","total_doses":"24323","last_dose1":"","last_dose2":"","last_total_doses":""},
test = pd.read_json("/Users/dsg281/Downloads/vacin.json")
I am trying to extract the data in the below format in my data frame

import pandas as pd
import requests

req = requests.get("https://www.mygov.in/sites/default/files/covid/vaccine/covid_vaccine_timeline.json")
vaccine_data = req.json()["vaccine_data"]  # parse the response only once
# Each entry covers one day; print its per-state records as a DataFrame
for day in vaccine_data:
    df = pd.json_normalize(day["vacc_st_data"])
    print(df)

Does that help?
import pandas as pd
test = pd.read_json("https://www.mygov.in/sites/default/files/covid/vaccine/covid_vaccine_timeline.json")
for day in test["vaccine_data"]:
    print(day)
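
If the goal is one row per state per day, a minimal sketch using pd.json_normalize with its record_path and meta arguments might look like the following. The payload is a trimmed copy of the sample output above (two states, a few fields), and converting the dose columns to numbers is an assumption about what the final frame should contain.

```python
import pandas as pd

# Trimmed copy of the sample output above (assumption: the live feed has the same shape)
payload = {
    "vaccine_data": [
        {
            "day": "2021-03-01",
            "india_total_doses": 14854136,
            "vacc_st_data": [
                {"st_name": "Andaman and Nicobar", "dose1": "6581",
                 "dose2": "2556", "total_doses": "9137"},
                {"st_name": "Andhra Pradesh", "dose1": "541202",
                 "dose2": "142431", "total_doses": "683633"},
            ],
        }
    ]
}

# One row per state record, with the parent "day" repeated on each row
df = pd.json_normalize(
    payload["vaccine_data"],
    record_path="vacc_st_data",
    meta=["day"],
)

# The per-state counts arrive as strings; convert them to numbers
dose_cols = ["dose1", "dose2", "total_doses"]
df[dose_cols] = df[dose_cols].apply(pd.to_numeric, errors="coerce")

print(df)
```

With the real feed, payload would be replaced by the parsed response (for example req.json()), and every day in the timeline would contribute its own block of state rows.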

Related

Reset index and present data in table format

I am using the following code:
import os
import time
import pandas as pd
from mftool import Mftool

mf = Mftool()
data = mf.get_scheme_historical_nav('138564', as_Dataframe=True)
data = data.rename_axis("Date", index=False)
The code above returns a DataFrame in which Date has been set as the index, but I want to:
keep 'Date' as a column in my df instead of having it as the index
change the date format from dd-mm-yyyy to yyyy-mm-dd
Can anybody help? Thank you!
I tried the following, but it did not help:
data = data.set_index(to_datetime(data['Date']))
data.d['Date'] = pd.to_datetime(data['Date'])
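
One possible approach, sketched here under assumptions, is to name the index, move it into a regular column with reset_index, and then reparse the date strings. The sample frame below is a hypothetical stand-in for the mftool result (its column name "nav", its index name and its values are made up), and index_to_date_column is an illustrative helper, not part of mftool or pandas.

```python
import pandas as pd

# Hypothetical stand-in for the mftool result: values indexed by dd-mm-yyyy strings
data = pd.DataFrame(
    {"nav": [10.5, 10.7, 10.6]},
    index=pd.Index(["01-03-2021", "02-03-2021", "03-03-2021"], name="date"),
)


def index_to_date_column(frame):
    """Return a copy with the index moved into a 'Date' column in yyyy-mm-dd form."""
    out = frame.rename_axis("Date").reset_index()
    # Parse dd-mm-yyyy explicitly, then render as yyyy-mm-dd strings
    parsed = pd.to_datetime(out["Date"], format="%d-%m-%Y")
    out["Date"] = parsed.dt.strftime("%Y-%m-%d")
    return out


result = index_to_date_column(data)
print(result)
```

If real datetime values are more useful than strings, the strftime step can be dropped so that the column keeps the datetime64 dtype.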

How to extract data from an api using python and convert it into a pandas data frame

I want to load the data from an API into a pandas data frame. How may I do that? The following is my code snippet:
import requests
import json
response_API = requests.get('https://data.spiceai.io/eth/v0.1/gasfees?period=1d')
#print(response_API.status_code)
data = response_API.text
parse_json = json.loads(data)
Almost there. The JSON is clean, so you can pass it directly to a DataFrame:
import pandas as pd
import requests

response_API = requests.get('https://data.spiceai.io/eth/v0.1/gasfees?period=1d')
data = response_API.json()  # parse the JSON body into Python objects
df = pd.DataFrame(data)

Filter data from a created list

I am working with the Covid data set from GitHub, and I would like to filter it down to the countries that appear in the EU_members list below.
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/owid-covid-data.csv')
df = df[df.continent == 'Europe']
# From here I want to just pick those countries that appear in the following list:
EU_members = ['Austria','Italy','Belgium','Latvia','Bulgaria','Lithuania','Croatia','Luxembourg','Cyprus','Malta','Czechia','Netherlands','Denmark','Poland','Estonia',
'Portugal','Finland','Romania','France','Slovakia','Germany','Slovenia','Greece','Spain','Hungary','Sweden','Ireland']
# I have tried something like this but it is not what I expected:
df.location.str.find('EU_members')
You can use .isin():
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/owid-covid-data.csv')
EU_members = ['Austria','Italy','Belgium','Latvia','Bulgaria','Lithuania','Croatia','Luxembourg','Cyprus','Malta','Czechia','Netherlands','Denmark','Poland','Estonia',
'Portugal','Finland','Romania','France','Slovakia','Germany','Slovenia','Greece','Spain','Hungary','Sweden','Ireland']
df_out = df[df['location'].isin(EU_members)]
df_out.to_csv('data.csv')
This creates data.csv.
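
One thing to watch for with a hand-written list such as EU_members: if a comma is missing between two string literals, Python silently joins them into one string, and .isin() then matches neither country. The sketch below illustrates this on a toy frame; the rows and the new_cases values are placeholders, not OWID data.

```python
import pandas as pd

# Toy frame standing in for the OWID data (values are placeholders)
df = pd.DataFrame({
    "location": ["Austria", "Belgium", "Latvia", "Norway"],
    "new_cases": [1, 2, 3, 4],
})

# Missing comma: adjacent string literals are concatenated by Python
broken = ["Austria", "Belgium" "Latvia"]
fixed = ["Austria", "Belgium", "Latvia"]

# isin() builds a boolean mask that is True where location is in the list
eu_rows = df[df["location"].isin(fixed)]
broken_rows = df[df["location"].isin(broken)]

print(broken)
print(eu_rows)
print(broken_rows)
```

Comparing len(EU_members) with the expected number of countries is a quick way to catch this kind of slip.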

JSON from API call to pandas dataframe

I'm trying to make an API call and save the response as a DataFrame.
The problem is that I need the data from the 'result' column, and I have not managed to extract it.
Basically, I just want to save the API response as a CSV file so that I can work with it.
P.S. When I do this with a web-based "JSON to CSV converter", it produces what I want (example: https://konklone.io/json/).
import requests
import pandas as pd
import json
res = requests.get("http://api.etherscan.io/api?module=account&action=txlist&address=0xddbd2b932c763ba5b1b7ae3b362eac3e8d40121a&startblock=0&endblock=99999999&sort=asc&apikey=YourApiKeyToken")
j = res.json()
j
df = pd.DataFrame(j)
df.head()
Try this
import requests
import pandas as pd
import json
res = requests.get("http://api.etherscan.io/api?module=account&action=txlist&address=0xddbd2b932c763ba5b1b7ae3b362eac3e8d40121a&startblock=0&endblock=99999999&sort=asc&apikey=YourApiKeyToken")
j = res.json()
# print(j)
filename ="temp.csv"
df = pd.DataFrame(j['result'])
print(df.head())
df.to_csv(filename)
Looks like you need:
df = pd.DataFrame(j["result"])
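
To show why indexing into "result" helps, here is a small offline sketch. The payload is hypothetical: only the "result" key comes from the question, while the other keys, the record fields and the values are assumptions made for illustration. result_to_frame is an illustrative helper, not part of pandas or requests.

```python
import pandas as pd

# Hypothetical payload: metadata at the top level, records nested under "result"
j = {
    "status": "1",
    "message": "OK",
    "result": [
        {"blockNumber": "100", "value": "5"},
        {"blockNumber": "101", "value": "7"},
    ],
}


def result_to_frame(payload, key="result"):
    """Build a DataFrame from the list of records stored under `key`."""
    records = payload.get(key)
    # An API may return an error string here instead of a list of records
    if not isinstance(records, list):
        raise ValueError(f"expected a list under {key!r}, got {type(records).__name__}")
    return pd.DataFrame(records)


df = result_to_frame(j)
csv_text = df.to_csv(index=False)  # pass a filename instead to write to disk
print(csv_text)
```

Passing the whole dictionary to pd.DataFrame would instead repeat the top-level values on every row and leave each record as a dict inside a single 'result' column.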

Convert a cell into different columns & rows of the DataFrame

I am trying to scrape a table from a site. I tried converting data_row to a pandas DataFrame, but all the data ends up lumped into a single cell. Could you please help me convert data_row into a pandas DataFrame with "Business Mileage," "Charitable Mileage," "Medical Mileage," and "Moving Mileage" as rows and "2016," "2015," "2014," "2013," "2012," "2011," and "2010" as columns?
from urllib.request import urlopen  # Python 3 replacement for urllib2
import pandas as pd
from bs4 import BeautifulSoup

r = urlopen('http://www.smbiz.com/sbrl003.html#cmv')
soup = BeautifulSoup(r, 'html.parser')
print(soup.prettify())
data_row = soup.find_all('pre')[0:1]
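
A rough sketch of one way to split such preformatted text into rows and columns follows. It assumes that the text inside the pre tag has a header line of years followed by one line per label, with fields separated by two or more spaces. The layout and all numbers below are placeholders, not the contents of the actual page.

```python
import re

import pandas as pd

# Hypothetical <pre> text; the numbers are placeholders, not real mileage rates
pre_text = """\
                      2016   2015   2014
Business mileage       1.0    2.0    3.0
Charitable mileage     4.0    5.0    6.0
"""

lines = [ln for ln in pre_text.splitlines() if ln.strip()]
years = lines[0].split()  # header line: one column name per year

rows = {}
for ln in lines[1:]:
    # Split on runs of 2+ spaces so single spaces inside a label survive
    label, *values = re.split(r"\s{2,}", ln.strip())
    rows[label] = [float(v) for v in values]

# Labels become the index, years become the columns
table = pd.DataFrame.from_dict(rows, orient="index", columns=years)
print(table)
```

With the scraped page, pre_text would come from something like data_row[0].get_text(), and the splitting rule would need adjusting to the real spacing.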
