Format the extracted covid vaccine data from website - python
I am trying to load the "vaccine_data" from this URL into a pandas DataFrame:
https://www.mygov.in/sites/default/files/covid/vaccine/covid_vaccine_timeline.json
Here is the parent website
https://www.mygov.in/
Sample output
{"vaccine_data":[{"day":"2021-03-01","india_dose1":12256337,"india_dose2":2597799,"india_total_doses":14854136,"india_last_dose1":null,"india_last_dose2":null,"india_last_total_doses":null,"vacc_st_data":[{"st_name":"Andaman and Nicobar","state_id":"1","covid_state_name":"Andaman and Nicobar","covid_state_id":"35","dose1":"6581","dose2":"2556","total_doses":"9137","last_dose1":"","last_dose2":"","last_total_doses":""},{"st_name":"Andhra Pradesh","state_id":"2","covid_state_name":"Andhra Pradesh","covid_state_id":"28","dose1":"541202","dose2":"142431","total_doses":"683633","last_dose1":"","last_dose2":"","last_total_doses":""},{"st_name":"Arunachal Pradesh","state_id":"3","covid_state_name":"Arunachal Pradesh","covid_state_id":"12","dose1":"27572","dose2":"7309","total_doses":"34881","last_dose1":"","last_dose2":"","last_total_doses":""},{"st_name":"Assam","state_id":"4","covid_state_name":"Assam","covid_state_id":"18","dose1":"201640","dose2":"29159","total_doses":"230799","last_dose1":"","last_dose2":"","last_total_doses":""},{"st_name":"Bihar","state_id":"4","covid_state_name":"Bihar","covid_state_id":"10","dose1":"562270","dose2":"81079","total_doses":"643349","last_dose1":"","last_dose2":"","last_total_doses":""},{"st_name":"Chandigarh","state_id":"6","covid_state_name":"Chandigarh","covid_state_id":"4","dose1":"22424","dose2":"1899","total_doses":"24323","last_dose1":"","last_dose2":"","last_total_doses":""},
test = pd.read_json("/Users/dsg281/Downloads/vacin.json")
I am trying to extract the data in the below format in my data frame
import pandas as pd
import requests
req=requests.get("https://www.mygov.in/sites/default/files/covid/vaccine/covid_vaccine_timeline.json")
data = req.json()["vaccine_data"]  # parse the response once
for day in data:
    df = pd.json_normalize(day["vacc_st_data"])  # one frame per day
    print(df)
Does that help?
import pandas as pd
test = pd.read_json("https://www.mygov.in/sites/default/files/covid/vaccine/covid_vaccine_timeline.json")
for day in test["vaccine_data"]:
    print(day)
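If the goal is a single table rather than one frame per day, a possible approach is to let pd.json_normalize flatten the nested "vacc_st_data" list and carry the "day" value along as a column. The sketch below is only an illustration: it uses a hand-copied subset of the sample output (two states, a few fields) instead of calling the URL, and the choice to convert the dose columns to numbers is an assumption about what the final frame should look like.

```python
import pandas as pd

# Subset of the sample output from the question (one day, two states),
# inlined so the sketch runs without network access.
payload = {
    "vaccine_data": [
        {
            "day": "2021-03-01",
            "india_dose1": 12256337,
            "india_dose2": 2597799,
            "india_total_doses": 14854136,
            "vacc_st_data": [
                {"st_name": "Andaman and Nicobar", "state_id": "1",
                 "dose1": "6581", "dose2": "2556", "total_doses": "9137"},
                {"st_name": "Andhra Pradesh", "state_id": "2",
                 "dose1": "541202", "dose2": "142431", "total_doses": "683633"},
            ],
        }
    ]
}

# One row per state per day; 'day' is copied from the parent record.
df = pd.json_normalize(
    payload["vaccine_data"],
    record_path="vacc_st_data",
    meta=["day"],
)

# The per-state counts arrive as strings, so convert them (assumed desirable).
dose_cols = ["dose1", "dose2", "total_doses"]
df[dose_cols] = df[dose_cols].apply(pd.to_numeric, errors="coerce")
df["day"] = pd.to_datetime(df["day"])

print(df)
```

With the live data, payload would be replaced by the parsed JSON response; empty strings such as those in "last_dose1" would become NaN under errors="coerce".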
Related
Reset index and present data in table format
I am using the following code:
import pandas as pd
from mftool import Mftool
import os
import time
mf = Mftool()
data = mf.get_scheme_historical_nav('138564', as_Dataframe=True)
data = data.rename_axis("Date", index=False)
The code above returns data with Date set as the index, but I want to keep 'Date' as a regular column in my df instead of the index. I also want to change the date format from dd-mm-yyyy to yyyy-mm-dd. Can anybody help? Thank you!
I tried the following, but it did not help:
data = data.set_index(to_datetime(data['Date']))
data.d['Date'] = pd.to_datetime(data['Date'])
How to extract data from an api using python and convert it into a pandas data frame
I want to load the data from an API into a pandas data frame. How may I do that? The following is my code snippet:
import requests
import json
response_API = requests.get('https://data.spiceai.io/eth/v0.1/gasfees?period=1d')
#print(response_API.status_code)
data = response_API.text
parse_json = json.loads(data)
Almost there. The JSON is clean, so you can pass it directly to a DataFrame:
import requests
import pandas as pd
response_API = requests.get('https://data.spiceai.io/eth/v0.1/gasfees?period=1d')
data = response_API.json()
df = pd.DataFrame(data)
Filter data from a created list
I am working on my Covid data set from GitHub and I would like to filter it down to the countries that appear in this EU_members list.
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/owid-covid-data.csv')
df = df[df.continent == 'Europe']
# From here I want to just pick those countries that appear in the following list:
EU_members = ['Austria','Italy','Belgium','Latvia','Bulgaria','Lithuania','Croatia','Luxembourg','Cyprus','Malta','Czechia','Netherlands','Denmark','Poland','Estonia','Portugal','Finland','Romania','France','Slovakia','Germany','Slovenia','Greece','Spain','Hungary','Sweden','Ireland']
# I have tried something like this but it is not what I expected:
df.location.str.find('EU_members')
You can use .isin():
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/owid-covid-data.csv')
EU_members = ['Austria','Italy','Belgium','Latvia','Bulgaria','Lithuania','Croatia','Luxembourg','Cyprus','Malta','Czechia','Netherlands','Denmark','Poland','Estonia','Portugal','Finland','Romania','France','Slovakia','Germany','Slovenia','Greece','Spain','Hungary','Sweden','Ireland']
df_out = df[df['location'].isin(EU_members)]  # keep rows whose location is in the list
df_out.to_csv('data.csv')
This creates data.csv.
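One general pitfall with hand-typed lists passed to .isin() is a missing comma between two string literals: Python concatenates adjacent literals, so two names silently become one element that matches nothing. The sketch below illustrates this on a tiny made-up frame (the "new_cases" values are invented for the example and are not OWID data).

```python
import pandas as pd

# Tiny made-up frame standing in for the real data set; only 'location' matters.
df = pd.DataFrame({
    "location": ["Belgium", "Latvia", "Norway", "Italy"],
    "new_cases": [10, 20, 30, 40],
})

# Adjacent string literals are concatenated, so the missing comma below
# yields the single element 'BelgiumLatvia'.
buggy = ['Austria', 'Italy', 'Belgium''Latvia']
fixed = ['Austria', 'Italy', 'Belgium', 'Latvia']

buggy_hits = df[df["location"].isin(buggy)]["location"].tolist()
fixed_hits = df[df["location"].isin(fixed)]["location"].tolist()

print(buggy_hits)
print(fixed_hits)
```

Checking len() of the list against the expected number of entries is a cheap way to catch this kind of slip.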
JSON from API call to pandas dataframe
I'm trying to make an API call and save the response as a dataframe. The problem is that I need the data from the 'result' column, and I have not managed to get it. I'm basically just trying to save the API response as a CSV file in order to work with it. P.S. When I do this with a "JSON to CSV converter" from the web, it comes out as I wish (example: https://konklone.io/json/).
import requests
import pandas as pd
import json
res = requests.get("http://api.etherscan.io/api?module=account&action=txlist&address=0xddbd2b932c763ba5b1b7ae3b362eac3e8d40121a&startblock=0&endblock=99999999&sort=asc&apikey=YourApiKeyToken")
j = res.json()
j
df = pd.DataFrame(j)
df.head()
Try this:
import requests
import pandas as pd
import json
res = requests.get("http://api.etherscan.io/api?module=account&action=txlist&address=0xddbd2b932c763ba5b1b7ae3b362eac3e8d40121a&startblock=0&endblock=99999999&sort=asc&apikey=YourApiKeyToken")
j = res.json()
# print(j)
filename = "temp.csv"
df = pd.DataFrame(j['result'])
print(df.head())
df.to_csv(filename)
Looks like you need:
df = pd.DataFrame(j["result"])
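To see why indexing into 'result' first makes the difference, here is a small sketch with a hypothetical response. The overall shape (a few top-level fields plus a 'result' list of records) follows what the question describes; the field names and values inside it are made up for illustration and are not taken from the real API.

```python
import pandas as pd

# Hypothetical parsed response: metadata plus a 'result' list of records.
# All field names and values here are invented for the example.
j = {
    "status": "1",
    "message": "OK",
    "result": [
        {"blockNumber": "100", "value": "5"},
        {"blockNumber": "101", "value": "7"},
    ],
}

# Passing the whole dict keeps each record as a dict inside a 'result' column.
wrapped = pd.DataFrame(j)

# Passing the list of records gives one column per record field.
flat = pd.DataFrame(j["result"])

print(wrapped.columns.tolist())
print(flat.columns.tolist())
```

The second frame is the one that matches what a JSON-to-CSV converter typically produces, and it can be written out with to_csv as in the answer above.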
Convert a cell into different columns & rows of the DataFrame
I am trying to scrape a table out of a site. I have tried to convert data_row to a Pandas DataFrame; however, all the data are lumped into one cell of the DataFrame. Would you please help me convert data_row into a Pandas DataFrame with "Business Mileage," "Charitable Mileage," "Medical mileage," and "Moving mileage" as rows and "2016," "2015," "2014," "2013," "2012," "2011," and "2010" as columns?
from bs4 import BeautifulSoup
import urllib2
import pandas as pd
r = urllib2.urlopen('http://www.smbiz.com/sbrl003.html#cmv')
soup = BeautifulSoup(r)
print soup.prettify()
data_row = soup.findAll('pre')[0:1]