Reset index and present data in table format - python

I am using the following code:
import os
import time
import pandas as pd
from mftool import Mftool
mf = Mftool()
data = mf.get_scheme_historical_nav('138564', as_Dataframe=True)
data = data.rename_axis("Date", index=False)
The code above returns the data in the following format:
[image not included]
Clearly, Date has been set as the index, but I want to:
1. keep 'Date' as a regular column in my df instead of using it as the index
2. change the date format from dd-mm-yyyy to yyyy-mm-dd
Can anybody help? Thank you!
I tried the following, but it did not help:
data = data.set_index(pd.to_datetime(data['Date']))
data['Date'] = pd.to_datetime(data['Date'])
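One possible approach is sketched below on a small stand-in frame, not on the real mftool output. The `nav` column name and the sample values are assumptions made for illustration; the only thing taken from the question is that the dates sit in the index as dd-mm-yyyy strings. The idea is to name the index, move it into a column with `reset_index`, and then parse it with an explicit format.

```python
import pandas as pd

# Stand-in for the frame returned by mf.get_scheme_historical_nav(...);
# the 'nav' column and the values are invented for this sketch.
data = pd.DataFrame(
    {"nav": ["10.5", "10.7", "10.6"]},
    index=pd.Index(["03-01-2023", "02-01-2023", "01-01-2023"], name="date"),
)

# Name the index 'Date', then turn it into an ordinary column.
data = data.rename_axis("Date").reset_index()

# Parse dd-mm-yyyy strings; datetime values display as yyyy-mm-dd.
data["Date"] = pd.to_datetime(data["Date"], format="%d-%m-%Y")

print(data)
```

If the dates are needed as text rather than datetime values, `data["Date"].dt.strftime("%Y-%m-%d")` should give yyyy-mm-dd strings.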

Related

calculate IGRF with pyigrf and export to csv

I want to calculate some values with the pyIGRF library, but when I export the result to CSV, every row has the same values.
import pyIGRF
import os
import numpy as np
import pandas as pd
os.chdir('D:/IGRF')
df=pd.read_csv('igrf.csv')
print(df)
df.head()
for i in range(0,len(df)):
    x=[Inc,Dec,Hi,Xn,Yn,Zn,totalmag]=pyIGRF.igrf_value(df['lan'][i],df['lat'][i],df['alt'][i],2022)
    for j in x:
        df['Inc']=Inc
        df['Dec']=Dec
        df['total']=totalmag
print(df)
import csv
df.to_csv('IGRF_end.csv')
I think the loop needs some changes, but I couldn't work out what they should be.
One problem that I see when looking at the pyIGRF documentation is that you've got the order of the inputs wrong. It should be:
pyIGRF.igrf_value(lat, lon, alt, date)
You've switched the latitude and longitude.
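Separately from the argument order, the identical rows are likely caused by the loop in the question: `df['Inc']=Inc` assigns one scalar to the whole column on every pass, so after the loop each row holds the values from the last iteration. A minimal sketch of a row-by-row alternative follows. To keep it runnable without the library, `field_values` is a hypothetical stand-in for `pyIGRF.igrf_value` that just returns seven numbers. The sample coordinates are invented, and the output names follow the order used in the question, which should be checked against the pyIGRF documentation.

```python
import pandas as pd


def field_values(lat, lon, alt, year):
    """Hypothetical stand-in for pyIGRF.igrf_value: returns seven numbers."""
    return (lat + lon, lat - lon, 0.0, 0.0, 0.0, 0.0, lat * lon + alt + year)


# Column names ('lat', 'lan', 'alt') follow the question; the values are invented.
df = pd.DataFrame({"lat": [10.0, 20.0], "lan": [50.0, 60.0], "alt": [0.0, 1.0]})

# Compute one result tuple per row instead of overwriting whole columns.
results = [
    field_values(row.lat, row.lan, row.alt, 2022)
    for row in df.itertuples(index=False)
]

# Names in the order used in the question (an assumption, not verified here).
names = ["Inc", "Dec", "Hi", "Xn", "Yn", "Zn", "total"]
fields = pd.DataFrame(results, columns=names, index=df.index)

# Keep only the three columns the question writes out.
df = df.join(fields[["Inc", "Dec", "total"]])
print(df)
```

With the real library, swapping `field_values` for `pyIGRF.igrf_value` and finishing with `df.to_csv('IGRF_end.csv')` should give one distinct set of values per row.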

How to combine 2 columns in pandas DataFrame?

Hello! I have a CSV table and I am trying to use it in Python to create Gantt charts. The columns in the CSV file hold the parts of a time: for example, start1 is the hours and start2 is the minutes. I then use pd.to_datetime(data["start1"], format="%H") to parse the values, and do the same for start2.
Here is the problem: how can I combine these two columns in a pandas DataFrame to get one column in "%H-%M" format, such as data["start"]? Here is the code:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from datetime import timedelta
#import data
data = pd.read_csv('TEST.csv')
#convert data str to "datetime" data
data["start1"] = pd.to_datetime(data["start1"], format="%H")
data["start2"] = pd.to_datetime(data["start2"], format="%M")
data["end1"] = pd.to_datetime(data["end1"], format="%H")
data["end2"] = pd.to_datetime(data["end2"], format="%M")
Try:
data["start"] = pd.to_datetime(data["start1"].astype(str).str.pad(2, fillchar="0") +
                               data["start2"].astype(str).str.pad(2, fillchar="0"),
                               format="%H%M")
data["end"] = pd.to_datetime(data["end1"].astype(str).str.pad(2, fillchar="0") +
                             data["end2"].astype(str).str.pad(2, fillchar="0"),
                             format="%H%M")
Before you convert the columns to datetime, you can add an additional column like this:
# cast to str first so the '-' concatenation also works for integer columns
data["start"] = data["start1"].astype(str) + '-' + data["start2"].astype(str)
data["start"] = pd.to_datetime(data["start"], format="%H-%M")
# then do the other conversions.
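To make the combination concrete, here is a small self-contained sketch on invented data. It assumes the hour and minute columns are read from the CSV as integers; the values and the extra `start_label` column are made up for illustration. It builds a zero-padded "HH-MM" string, parses it once, and then formats it back for display.

```python
import pandas as pd

# Invented sample standing in for TEST.csv; hours and minutes as integers.
data = pd.DataFrame({"start1": [9, 14], "start2": [5, 30]})

# Build 'HH-MM' text from the two columns, zero-padding each part.
start_text = (
    data["start1"].astype(str).str.zfill(2)
    + "-"
    + data["start2"].astype(str).str.zfill(2)
)

# Parse into datetime values (the date part defaults to 1900-01-01).
data["start"] = pd.to_datetime(start_text, format="%H-%M")

# A text column in "%H-%M" form, e.g. for axis labels.
data["start_label"] = data["start"].dt.strftime("%H-%M")
print(data)
```

The datetime column is convenient for arithmetic such as durations, while the label column only affects how the time is shown.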

Format the extracted covid vaccine data from website

I am trying to load the "vaccine_data" from this URL into a pandas DataFrame:
https://www.mygov.in/sites/default/files/covid/vaccine/covid_vaccine_timeline.json
Here is the parent website
https://www.mygov.in/
Sample output
{"vaccine_data":[{"day":"2021-03-01","india_dose1":12256337,"india_dose2":2597799,"india_total_doses":14854136,"india_last_dose1":null,"india_last_dose2":null,"india_last_total_doses":null,"vacc_st_data":[{"st_name":"Andaman and Nicobar","state_id":"1","covid_state_name":"Andaman and Nicobar","covid_state_id":"35","dose1":"6581","dose2":"2556","total_doses":"9137","last_dose1":"","last_dose2":"","last_total_doses":""},{"st_name":"Andhra Pradesh","state_id":"2","covid_state_name":"Andhra Pradesh","covid_state_id":"28","dose1":"541202","dose2":"142431","total_doses":"683633","last_dose1":"","last_dose2":"","last_total_doses":""},{"st_name":"Arunachal Pradesh","state_id":"3","covid_state_name":"Arunachal Pradesh","covid_state_id":"12","dose1":"27572","dose2":"7309","total_doses":"34881","last_dose1":"","last_dose2":"","last_total_doses":""},{"st_name":"Assam","state_id":"4","covid_state_name":"Assam","covid_state_id":"18","dose1":"201640","dose2":"29159","total_doses":"230799","last_dose1":"","last_dose2":"","last_total_doses":""},{"st_name":"Bihar","state_id":"4","covid_state_name":"Bihar","covid_state_id":"10","dose1":"562270","dose2":"81079","total_doses":"643349","last_dose1":"","last_dose2":"","last_total_doses":""},{"st_name":"Chandigarh","state_id":"6","covid_state_name":"Chandigarh","covid_state_id":"4","dose1":"22424","dose2":"1899","total_doses":"24323","last_dose1":"","last_dose2":"","last_total_doses":""},
test = pd.read_json("/Users/dsg281/Downloads/vacin.json")
I am trying to extract the data in the below format in my data frame
import pandas as pd
import requests
req=requests.get("https://www.mygov.in/sites/default/files/covid/vaccine/covid_vaccine_timeline.json")
for i in range(len(req.json()["vaccine_data"])):
    df=pd.json_normalize(req.json()["vaccine_data"][i]['vacc_st_data'])
    print(df)
Does that help?
import pandas as pd
test = pd.read_json("https://www.mygov.in/sites/default/files/covid/vaccine/covid_vaccine_timeline.json")
for day in test["vaccine_data"]:
    print(day)
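If the goal is a single table with one row per state per day, `pd.json_normalize` with `record_path` and `meta` may be worth trying. The sketch below uses a shortened inline copy of the sample output from the question instead of downloading the file, so the `payload` variable and the choice of which fields to keep are assumptions.

```python
import pandas as pd

# Shortened copy of the sample output in the question (two states, one day).
payload = {
    "vaccine_data": [
        {
            "day": "2021-03-01",
            "india_total_doses": 14854136,
            "vacc_st_data": [
                {"st_name": "Andaman and Nicobar", "dose1": "6581",
                 "dose2": "2556", "total_doses": "9137"},
                {"st_name": "Andhra Pradesh", "dose1": "541202",
                 "dose2": "142431", "total_doses": "683633"},
            ],
        }
    ]
}

# One row per entry in 'vacc_st_data', with the day-level fields repeated.
df = pd.json_normalize(
    payload["vaccine_data"],
    record_path="vacc_st_data",
    meta=["day", "india_total_doses"],
)

# The per-state counts arrive as strings, so convert them to numbers.
for col in ["dose1", "dose2", "total_doses"]:
    df[col] = pd.to_numeric(df[col])

print(df)
```

With the live data, `payload` would be the parsed response, for example `requests.get(url).json()`, and the same call should then cover every day in the feed at once.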

How to filter for dates range in timeseries or dataframe using python

I'm still new to Python and trying to learn this, so I appreciate any help.
Right now, when I connect to Alpha Vantage, I get the full range of data for all dates.
I found some good guides, but I keep getting empty DataFrames or errors.
This is how the code looks so far:
import pandas as pd
from pandas import DataFrame
import datetime
from datetime import datetime as dt
from alpha_vantage.timeseries import TimeSeries
import numpy as np
stock_ticker = 'SPY'
api_key = open('/content/drive/My Drive/Colab Notebooks/key').read()
ts = TimeSeries (key=api_key, output_format = "pandas")
data_daily, meta_data = ts.get_daily_adjusted(symbol=stock_ticker, outputsize ='full')
#data_date_changed = data[:'2019-11-29']
data = pd.DataFrame(data_daily)
df.loc[datetime.date(year=2014,month=1,day=1):datetime.date(year=2015,month=2,day=1)]
The answer for this is
stock_ticker = 'SPY'
api_key = 'apikeyddddd'
ts = TimeSeries (key=api_key, output_format = "pandas")
data_daily, meta_data = ts.get_daily_adjusted(symbol=stock_ticker, outputsize ='full')
test = data_daily[(data_daily.index > '2014-01-01') & (data_daily.index <= '2017-08-15')]
print(data_daily)
print(test)
import datetime
df.loc[datetime.date(year=2014,month=1,day=1):datetime.date(year=2014,month=2,day=1)]
In my experience, loc is most reliable when you pass datetime objects. Note that with a DatetimeIndex, pandas does also accept date strings such as '2014-01-01' inside loc.
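A minimal sketch of both filtering styles on synthetic data follows. The `prices` frame and its `close` column are invented stand-ins for `data_daily`. The sketch assumes the index arrives newest-first, which is one situation where a slice with ascending bounds can come back empty, so the index is sorted before slicing.

```python
import pandas as pd

# Synthetic daily data, deliberately ordered newest-first.
idx = pd.date_range("2013-12-30", periods=40, freq="D")
prices = pd.DataFrame({"close": range(40)}, index=idx).sort_index(ascending=False)

# Sort ascending so that label slicing with start < end behaves as expected.
prices = prices.sort_index()

# Option 1: label slice on the DatetimeIndex (both ends inclusive).
window = prices.loc["2014-01-01":"2014-01-15"]

# Option 2: boolean mask, as in the answer above.
mask = (prices.index >= "2014-01-01") & (prices.index <= "2014-01-15")
same_window = prices[mask]

print(window.head())
```

Both options should select the same rows here; the mask form does not depend on the sort order, while the slice form is shorter once the index is sorted.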

Reading Json file and converting it to columns in python

I am trying to read this json file in python using this code (I want to have all the data in a data frame):
import numpy as np
import pandas as pd
import json
from pandas.io.json import json_normalize
df = pd.read_json('short_desc.json')
df.head()
Data frame head screenshot
Using this code, I am able to convert only the first row into separate columns:
json_normalize(df.short_desc.iloc[0])
First row screenshot
I want to do the same for whole df using this code:
df.apply(lambda x : json_normalize(x.iloc[0]))
but I get this error:
ValueError: If using all scalar values, you must pass an index
What am I doing wrong?
Thank you in advance
After reading the JSON file with json.load, you can use pd.DataFrame.from_records. This should create the DataFrame you are looking for.
with open('short_desc.json') as f:
    d = json.load(f)
df = pd.DataFrame.from_records(d)
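As for the error in the question, `df.apply` works column by column by default, so the lambda receives a whole column and not a single row. If each cell of `short_desc` holds one dictionary (an assumption, since the file itself is not shown), another option is to pass the whole column to `pd.json_normalize` as a list. The sample records below are invented for illustration.

```python
import pandas as pd

# Invented stand-in for the frame read from short_desc.json:
# one dictionary per cell in the 'short_desc' column.
df = pd.DataFrame(
    {
        "short_desc": [
            {"id": 1, "text": "first", "meta": {"who": "a"}},
            {"id": 2, "text": "second", "meta": {"who": "b"}},
        ]
    }
)

# Flatten every dictionary at once; nested keys become dotted column names.
flat = pd.json_normalize(df["short_desc"].tolist())
print(flat)
```

If the cells hold lists of dictionaries instead, the lists would need to be concatenated first, so the shape of the data is worth checking before relying on this.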
