How to remove the timezone from yfinance data? - python

I download data with the yfinance package and convert it into a pandas DataFrame.
However, I am unable to save the DataFrame to an Excel file:
ValueError: Excel does not support datetimes with timezones. Please
ensure that datetimes are timezone unaware before writing to Excel.
This is what the DataFrame looks like. It should have 8 columns, but Spyder says it has 7.
Below is my code:
import yfinance as yf
import pandas as pd
stock = yf.Ticker("BABA")
# get stock info
stock.info
# get historical market data
hist = stock.history(start="2021-03-25",end="2021-05-20",interval="15m")
hist = pd.DataFrame(hist)
# pd.to_datetime(hist['Datetime'])
# hist['Datetime'].dt.tz_localize(None)
hist.to_excel(excel_writer= "D:/data/python projects/stock_BABA2.xlsx")

You can remove the time zone information from a DatetimeIndex using DatetimeIndex.tz_localize(), as follows:
hist.index = hist.index.tz_localize(None)

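You can also convert time zones using tz_convert(). Passing None converts the index to UTC and then drops the time zone (whereas tz_localize(None) keeps the local wall-clock times). In your situation it should work with: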
hist.index = hist.index.tz_convert(None)
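
The difference between the two calls is easiest to see on a small example. This is a minimal sketch that uses a hand-built index instead of a real yfinance download; the dates, the "America/New_York" zone and the `Close` values are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical stand-in for an intraday yfinance frame (tz-aware index).
idx = pd.date_range("2021-03-25 09:30", periods=3, freq="15min",
                    tz="America/New_York")
hist = pd.DataFrame({"Close": [1.0, 2.0, 3.0]}, index=idx)

# Drop the time zone but keep the local wall-clock times.
local_naive = hist.index.tz_localize(None)

# Convert to UTC first, then drop the time zone.
utc_naive = hist.index.tz_convert(None)

print(local_naive[0])  # 2021-03-25 09:30:00
print(utc_naive[0])    # 2021-03-25 13:30:00
```

Either result is timezone-naive and can be assigned back to `hist.index` before calling `to_excel`; which one to use depends on whether the spreadsheet should show exchange-local times or UTC.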

Related

Can't parse a date from an excel file using Pandas

Data: pandas DataFrame, read from Excel
Month Sales
01-01-17 1009
01-02-17 1004
..
01-12-19 2244
Code:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from statsmodels.tsa.stattools import adfuller
import datetime
CHI = pd.read_excel('D:\DS\TS.xls', index="Month")
CHI['Month'] = pd.to_datetime(CHI['Month']).dt.date
CHI['NetSalesUSD'] = pd.to_numeric(CHI['NetSalesUSD'], errors='coerce')
result = adfuller(CHI)
Error received:
float() argument must be a string or a number, not 'datetime.date'
I tried converting to integer, but I am still not able to get the results. Any suggestions?
I think the issue here is Excel.
Excel likes to show dates as Month-Day for some reason.
Try changing the date format to Short Date in Excel, then save and run your Python script again.
It looks like Pandas is not recognizing the date format by default. You can instruct Pandas to use a custom date parser. See the Pandas documentation for more details.
In your case, it would look something like this:
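import datetime
def parse_custom_date(x):
    # pd.datetime was removed from pandas; use the standard library instead
    return datetime.datetime.strptime(x, '%b-%y')
data_copy = pd.read_excel(
    r'D:\DS\DATA.xls',  # raw string so the backslashes are taken literally
    'CHI',
    index_col='Month',  # read_excel has no `index` argument
    parse_dates=['Month'],
    date_parser=parse_custom_date,  # deprecated since pandas 2.0
)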
Note that your date format does not appear to have day of the month, so this would assume the first day of the month.
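
The error message in the question also suggests a second problem: the whole DataFrame, including the date column, is handed to adfuller, which expects a one-dimensional numeric series. The sketch below shows one way to prepare such a series with pandas only. The sample rows are made up, and the day-month-year format is an assumption based on the data shown in the question.

```python
import pandas as pd

# Hypothetical stand-in for the Excel sheet from the question.
chi = pd.DataFrame({
    "Month": ["01-01-17", "01-02-17", "01-03-17"],
    "NetSalesUSD": ["1009", "1004", "n/a"],
})

# Assumption: the strings are day-month-year (monthly data dated the 1st).
chi["Month"] = pd.to_datetime(chi["Month"], format="%d-%m-%y")
chi["NetSalesUSD"] = pd.to_numeric(chi["NetSalesUSD"], errors="coerce")

# Move the dates into the index so only numbers remain in the series.
sales = chi.set_index("Month")["NetSalesUSD"].dropna()

# `sales` is now a 1-D float series, the kind of input that would be
# passed as adfuller(sales) instead of the whole DataFrame.
print(sales)
```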

How to change index frequency in a time series

I am using the yfinance library to import data for a given stock. See code below:
import yfinance as yf
from datetime import datetime as dt
import pandas as pd
# Naming Constants
stock = "AAPL"
start_date = "2014-01-01"
end_date = "2018-01-01"
# Importing all the data into a dataFrame
stock_data = yf.download(stock, start=start_date, end=end_date)
When I call print(stock_data.index) I have the following:
DatetimeIndex(['2014-01-02', '2014-01-03', '2014-01-06', '2014-01-07', '2014-01-08', '2014-01-09', '2014-01-10', '2014-01-13', '2014-01-14', '2014-01-15',
...
'2017-12-15', '2017-12-18', '2017-12-19', '2017-12-20', '2017-12-21', '2017-12-22', '2017-12-26', '2017-12-27', '2017-12-28', '2017-12-29'],
dtype='datetime64[ns]', name='Date', length=1007, freq=None)
I wish to switch the frequency argument from None to daily since every Date refers to a trading day.
When I say stock_data.index.freq = 'B' I get the following error:
ValueError: Inferred frequency None from passed values does not conform to passed frequency B
And if I put stock_data = stock_data.asfreq('B'), it will change the frequency but it will add certain lines that were not there originally and fills them with NA values.
In other words, what is the offset ALIAS used for trading days?
You can find the list of aliases in the pandas documentation here: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases
The error with stock_data.index.freq = 'B' indicates that your time series does not have business-day frequency: the index skips market holidays, which 'B' includes, so its frequency is undefined (None).
With
stock_data = stock_data.asfreq('B')
you are re-indexing your time series to business-day frequency: the missing timestamps are added, and the missing stock data values are set to NaN. Now you need to decide how to replace them, so have a look at the documentation for pandas.DataFrame.asfreq. You could replace all NaNs with a fixed value like -999, but in general what you want to do with stock data is take the last valid value at a given point in time, that is, forward-fill the gaps:
stock_data = stock_data.asfreq('B', method='ffill')
It's always worth reading the docs.
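
To make the effect of asfreq concrete, here is a small sketch with a hand-built index around New Year's Day 2014, a weekday on which the market was closed. The prices are invented for illustration; only the behaviour of `asfreq` is the point.

```python
import pandas as pd

# Trading days only: 2014-01-01 (a Wednesday holiday) is missing.
idx = pd.DatetimeIndex(
    ["2013-12-30", "2013-12-31", "2014-01-02", "2014-01-03"], name="Date"
)
stock_data = pd.DataFrame({"Close": [10.0, 11.0, 12.0, 13.0]}, index=idx)

# Re-index to business days: the holiday appears as a NaN row.
with_gaps = stock_data.asfreq("B")

# Same re-indexing, but carry the last valid value forward.
filled = stock_data.asfreq("B", method="ffill")

print(with_gaps)
print(filled.index.freq)
```

After either call the index carries the business-day frequency, at the cost of one extra row per skipped holiday.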

Python MetaTrader5 indicators

I'm using Metatrader5 module for python and this is my code
'''
#python
from datetime import datetime
import MetaTrader5 as mt5
# display data on the MetaTrader 5 package
print("MetaTrader5 package author: ", mt5.__author__)
print("MetaTrader5 package version: ", mt5.__version__)
# import the 'pandas' module for displaying data obtained in the tabular form
import pandas as pd
pd.set_option('display.max_columns', 500) # number of columns to be displayed
pd.set_option('display.width', 1500) # max table width to display
# import pytz module for working with time zone
import pytz
# establish connection to MetaTrader 5 terminal
if not mt5.initialize():
    print("initialize() failed")
    mt5.shutdown()
# set time zone to UTC
timezone = pytz.timezone("Etc/UTC")
# create 'datetime' object in UTC time zone to avoid the implementation of a local time zone offset
utc_from = datetime(2020, 1, 10, tzinfo=timezone)
# get 10 EURUSD H4 bars starting from 01.10.2020 in UTC time zone
rates = mt5.copy_rates_from("EURUSD", mt5.TIMEFRAME_H4, utc_from, 10)
# shut down connection to the MetaTrader 5 terminal
mt5.shutdown()
# display each element of obtained data in a new line
print("Display obtained data 'as is'")
for rate in rates:
    print(rate)
# create DataFrame out of the obtained data
rates_frame = pd.DataFrame(rates)
# convert time in seconds into the datetime format
rates_frame['time'] = pd.to_datetime(rates_frame['time'], unit='s')
# display data
print("\nDisplay dataframe with data")
print(rates_frame)
'''
My question is: is there any easy way to calculate stock indicators like RSI, MFI and others using this module?
No. It's possible using other modules, though.
Here is a method using another module that could achieve it:
https://www.mql5.com/en/articles/5691
Alternatively, you can pull the data from MT5 and throw it in TA-lib for analysis. TA-lib consumes the data and provides values for the indicators outside MT5.
Check out TA-lib: https://mrjbq7.github.io/ta-lib/
Since your data will be in a pandas DataFrame, I would check out pandas-ta, https://pypi.org/project/pandas-ta, which provides a wide range of technical indicators. Also, that's a lot of code to pull your data; this is what I use:
import MetaTrader5 as mt
import pandas as pd
from datetime import datetime
mt.initialize()
df = pd.DataFrame( mt.copy_rates_range( '#MNQ', #micro nasd100
mt.TIMEFRAME_D1,
datetime( 2022, 1, 1 ),
datetime.now() ) )
# manipulate as you please
mt.shutdown()
I didn't like the GMT+2 timezone used by MetaTrader at first, but I've found it's easier to get used to, as the date change is timed to the daily futures market open at 5pm Central, which in GMT+2 is 00:00 of the next day.
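
For the RSI part of the question, an indicator can also be computed directly with pandas once the rates are in a DataFrame. The following is a minimal sketch of a common RSI formulation with Wilder-style smoothing; it is not part of the MetaTrader5 module. The `rsi` helper is hypothetical, and the `close` column name and sample prices are assumptions.

```python
import pandas as pd

def rsi(close: pd.Series, period: int = 14) -> pd.Series:
    """Relative Strength Index with Wilder-style exponential smoothing."""
    delta = close.diff()
    gain = delta.clip(lower=0)        # upward moves, zero otherwise
    loss = (-delta).clip(lower=0)     # downward moves as positive numbers
    avg_gain = gain.ewm(alpha=1 / period, min_periods=period,
                        adjust=False).mean()
    avg_loss = loss.ewm(alpha=1 / period, min_periods=period,
                        adjust=False).mean()
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)

# Hypothetical stand-in for a frame built from mt5.copy_rates_from(...).
rates_frame = pd.DataFrame({"close": [1.10, 1.12, 1.11, 1.13, 1.12, 1.15]})
rates_frame["rsi"] = rsi(rates_frame["close"], period=3)
print(rates_frame)
```

The first few rows are NaN until `period` price changes are available. Libraries such as TA-Lib or pandas-ta offer ready-made versions and may differ slightly in how they seed the averages.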

Convert csv column from Epoch time to human readable minutes

I have a pandas.DataFrame indexed by time, as seen below. The time is in Epoch time. When I graph the second column these time values display along the x-axis. I want a more readable time in minutes:seconds.
In [13]: print df.head()
Time
1481044277379 0.581858
1481044277384 0.581858
1481044277417 0.581858
1481044277418 0.581858
1481044277467 0.581858
I have tried some pandas functions, and some methods for converting the whole column, I visited: Pandas docs, this question and the cool site.
I am using pandas 0.18.1
If you read your data with read_csv you can use a custom dateparser:
import datetime
import pandas as pd
#example.csv
'''
Time,Value
1481044277379,0.581858
1481044277384,0.581858
1481044277417,0.581858
1481044277418,0.581858
1481044277467,0.581858
'''
def dateparse(time_in_ms):
    # The file stores epoch milliseconds; fromtimestamp() expects seconds
    return datetime.datetime.fromtimestamp(float(time_in_ms) / 1000)
# Let the parser handle "Time"; only "Value" needs an explicit dtype
dtype = {"Value": float}
# Note: date_parser is deprecated since pandas 2.0
df = pd.read_csv("example.csv", dtype=dtype, parse_dates=["Time"], date_parser=dateparse)
print(df)
You can convert an epoch timestamp to HH:MM with:
import datetime as dt
hours_mins = dt.datetime.fromtimestamp(1347517370).strftime('%H:%M')
Adding a column to your pandas.DataFrame can be done as:
df['H_M'] = pd.Series([dt.datetime.fromtimestamp(int(ts)).strftime('%H:%M')
for ts in df['timestamp']]).values
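
Since the values in the question are epoch milliseconds, pandas can also convert the index in one step with `pd.to_datetime(..., unit="ms")`. This is a minimal sketch that rebuilds two rows of the question's data by hand; the `Value` column name is an assumption, and the resulting times are in UTC.

```python
import pandas as pd

# Two rows from the question, indexed by epoch milliseconds.
df = pd.DataFrame(
    {"Value": [0.581858, 0.581858]},
    index=pd.Index([1481044277379, 1481044277384], name="Time"),
)

# Interpret the integers as milliseconds since the epoch (UTC).
df.index = pd.to_datetime(df.index, unit="ms")

# Minutes:seconds labels, e.g. for axis ticks.
labels = df.index.strftime("%M:%S")
print(labels[0])  # 11:17
```

With a DatetimeIndex in place, plotting functions can usually format the x-axis themselves, so the string labels are only needed when a fixed text format is required.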

Extract date from Pandas DataFrame

I want to download adjusted close prices and their corresponding dates from yahoo, but I can't seem to figure out how to get dates from pandas DataFrame.
I was reading an answer to this question
from pandas.io.data import DataReader
from datetime import datetime
goog = DataReader("GOOG", "yahoo", datetime(2000,1,1), datetime(2012,1,1))
print goog["Adj Close"]
and this part works fine; however, I need to extract the dates that correspond to the prices.
For example:
adj_close = np.array(goog["Adj Close"])
Gives me a 1-D array of adjusted closing prices, I am looking for 1-D array of dates, such that:
date = # what do I do?
adj_close[0] corresponds to date[0]
When I do:
>>> goog.keys()
Index([Open, High, Low, Close, Volume, Adj Close], dtype=object)
I see that none of the keys will give me anything similar to the date, but I think there has to be a way to create an array of dates. What am I missing?
You can get it from goog.index, which is stored as a DatetimeIndex.
To get a Series of dates, you can do
goog.reset_index()['Date']
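
If plain arrays are wanted, as in the question, the index can be converted directly. The sketch below uses a small hand-built frame in place of the downloaded data; the prices and dates are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame returned by DataReader.
goog = pd.DataFrame(
    {"Adj Close": [100.0, 101.5, 99.8]},
    index=pd.DatetimeIndex(["2012-01-03", "2012-01-04", "2012-01-05"],
                           name="Date"),
)

adj_close = goog["Adj Close"].to_numpy()  # 1-D array of prices
dates = goog.index.to_numpy()             # 1-D array of datetime64 values

# Position i in one array corresponds to position i in the other.
print(dates[0], adj_close[0])
```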
import pandas as pd
# pandas.io.data was moved to the separate pandas-datareader package
from pandas_datareader.data import DataReader
symbols_list = ['GOOG', 'IBM']
d = {}
for ticker in symbols_list:
    d[ticker] = DataReader(ticker, "yahoo", '2014-01-01')
# pd.Panel was removed in pandas 0.25; concatenating the dict gives
# (ticker, field) columns instead
prices = pd.concat(d, axis=1)
df_adj_close = prices.xs('Adj Close', axis=1, level=1)  # also works with 'Open', 'High', 'Low', 'Close' and 'Volume'
# the dates of the adjusted closes from the dataframe containing adjusted closes on multiple stocks
df_adj_close.index
# create a dataframe that has data on only one stock symbol
df_individual = prices['GOOG']
# the dates from the dataframe of just 'GOOG' data
df_individual.index
