I am trying to find the MACD (Moving Average Convergence Divergence) for a few stocks. I am using the pandas_ta, yfinance and pandas libraries. But when I try to add the MACD values to the dataframe I get this error:
IndexError: iloc cannot enlarge its target object
My code is:
import pandas as pd
import pandas_ta as ta
import yfinance as yf
import datetime as dt
import matplotlib.pyplot as plt
start=dt.datetime.today()-dt.timedelta(365)
end=dt.datetime.today()
zscore=pd.DataFrame()
rsi=pd.DataFrame()
tickers=['2060.SR' , '2160.SR', '3002.SR', '4007.SR', '3005.SR', '3004.SR' , '2150.SR']
macd=pd.DataFrame()
for i in tickers:
    df = pd.DataFrame(yf.download(i, start=start, end=end, interval="1mo"))
    df.columns = map(str.lower, df.columns)
    macd = df.ta.macd()
Can someone let me know where my mistake is and how to solve this error? Thanks.
I am not sure which line gave you this error, but please note that in the loop you are not adding data; you are overwriting it on every iteration:
for i in tickers:
    df = pd.DataFrame(yf.download(i, start=start, end=end, interval="1mo"))
If you want to append, do the following:
agg_df = pd.DataFrame()
for i in tickers:
    df = pd.DataFrame(yf.download(i, start=start, end=end, interval="1mo"))
    agg_df = agg_df.append(df)

df = df.merge(macd, on="Date")
I used df.append(row) in the past, but it has been deprecated since pandas 1.4. The approach that seems most logical to me is:
df.loc[len(df)] = ['list', 'of', 'elements'] # len(df.columns)
other methods are provided here: https://sparkbyexamples.com/pandas/how-to-append-row-to-pandas-dataframe/
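Putting the two suggestions above together for the original MACD question, here is a minimal sketch that collects each ticker's result with pd.concat instead of overwriting macd on every pass. The MultiIndex-column check and the switch to daily bars are my assumptions (MACD's default slow length is 26 bars, so roughly 12 monthly bars from one year are not enough history); the tickers are the ones from the question:
import datetime as dt

import pandas as pd
import pandas_ta as ta
import yfinance as yf

start = dt.datetime.today() - dt.timedelta(365)
end = dt.datetime.today()
tickers = ['2060.SR', '2160.SR', '3002.SR', '4007.SR', '3005.SR', '3004.SR', '2150.SR']

macd_per_ticker = {}
for ticker in tickers:
    df = yf.download(ticker, start=start, end=end)     # daily bars give enough rows for MACD(12, 26, 9)
    if isinstance(df.columns, pd.MultiIndex):           # newer yfinance versions nest a ticker level
        df.columns = df.columns.get_level_values(0)
    df.columns = [c.lower() for c in df.columns]        # pandas_ta looks for a 'close' column
    result = df.ta.macd()                               # MACD, histogram and signal columns
    if result is not None:
        macd_per_ticker[ticker] = result

# one frame keyed by ticker instead of a single, repeatedly overwritten macd
macd = pd.concat(macd_per_ticker, names=['Ticker', 'Date'])
print(macd.tail())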
I have a Python program using Pandas, which reads two dataframes, obtained in the following links:
Casos-positivos-diarios-en-San-Nicolas-de-los-Garza-Promedio-movil-de-7-dias: https://datamexico.org/es/profile/geo/san-nicolas-de-los-garza#covid19-evolucion
Denuncias-segun-bien-afectado-en-San-Nicolas-de-los-GarzaClic-en-el-grafico-para-seleccionar: https://datamexico.org/es/profile/geo/san-nicolas-de-los-garza#seguridad-publica-denuncias
What I currently want to do is a groupby on the "covid" dataframe so that rows with the same date are summed. However, no method has worked; each attempt prints an error indicating that I should be using the syntax for a "PeriodIndex". Does anyone have a suggestion or solution? Thanks in advance.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib notebook
#csv for the covid cases
covid = pd.read_csv('Casos-positivos-diarios-en-San-Nicolas-de-los-Garza-Promedio-movil-de-7-dias.csv')
#csv for complaints
comp = pd.read_csv('Denuncias-segun-bien-afectado-en-San-Nicolas-de-los-GarzaClic-en-el-grafico-para-seleccionar.csv')
#cleaning data in both dataframes
#keeping only the relevant columns
covid = covid[['Month','Daily Cases']]
comp = comp[['Month','Affected Legal Good', 'Value']]
#changing the labels from spanish to english
comp['Affected Legal Good'].replace({'Patrimonio': 'Heritage', 'Familia':'Family', 'Libertad y Seguridad Sexual':'Sexual Freedom and Safety', 'Sociedad':'Society', 'Vida e Integridad Corporal':'Life and Bodily Integrity', 'Libertad Personal':'Personal Freedom', 'Otros Bienes Jurídicos Afectados (Del Fuero Común)':'Other Affected Legal Assets (Common Jurisdiction)'}, inplace=True, regex=True)
#changing the month types to dates
covid['Month'] = pd.to_datetime(covid['Month'])
covid['Month'] = covid['Month'].dt.to_period('M')
covid
You can simply use a groupby statement to sum the daily cases per month before converting the 'Month' column to a period:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib notebook
#csv for the covid cases
covid = pd.read_csv('Casos-positivos-diarios-en-San-Nicolas-de-los-Garza-Promedio-movil-de-7-dias.csv')
covid = covid.groupby(['Month'])['Daily Cases'].sum()
covid = covid.reset_index()
# #changing the month types to dates
covid['Month'] = pd.to_datetime(covid['Month'])
covid['Month'] = covid['Month'].dt.to_period('M')
covid
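If the 'Month' column has already been converted to a PeriodIndex (which is what the error in the question seems to hint at), a small variant is to convert first and then group on the period values. This is only a sketch; the column and file names are taken from the question:
import pandas as pd

covid = pd.read_csv('Casos-positivos-diarios-en-San-Nicolas-de-los-Garza-Promedio-movil-de-7-dias.csv')
covid = covid[['Month', 'Daily Cases']]

# convert to monthly periods first, then aggregate per period
covid['Month'] = pd.to_datetime(covid['Month']).dt.to_period('M')
monthly = covid.groupby('Month', as_index=False)['Daily Cases'].sum()
print(monthly.head())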
So I started learning how to work with data in Python. I wanted to load multiple securities, but I get an error that I cannot fix for some reason. Could someone tell me what the problem is?
import numpy as np
import pandas as pd
from pandas_datareader import data as wb
import matplotlib.pyplot as plt
tickers = ['PG', 'MSFT', 'F', 'GE']
mydata = pd.DataFrame()
for t in tickers:
    mydata[t] = wb.DataReader(t, data_source='yahoo', start='1955-1-1')
You need two fixes here:
1) 1955 is too early for this data source; try 1971 or later.
2) Your data from wb.DataReader(t, data_source='yahoo', start='1971-1-1') comes back as a dataframe with multiple series, so you cannot save it to mydata[t] as a single series. Use a dictionary as in the other answer, or save only the closing prices:
mydata[t] = wb.DataReader(t, data_source='yahoo', start='2010-1-1')['Close']
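For completeness, a minimal sketch of the corrected loop, assuming the Yahoo endpoint of pandas_datareader still responds (it has been unreliable in recent years):
import pandas as pd
from pandas_datareader import data as wb

tickers = ['PG', 'MSFT', 'F', 'GE']
mydata = pd.DataFrame()

for t in tickers:
    # keep only the closing price so each ticker fits into a single column
    mydata[t] = wb.DataReader(t, data_source='yahoo', start='2010-1-1')['Close']

print(mydata.head())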
First of all please do not share information as images unless absolutely necessary.
Now here is a solution to your problem. You are using the year 1955, but data may not be available that far back (or there may be some other issue); when you select a more recent start year it works. Another thing: it returns the data as a dataframe, so you cannot assign it to a DataFrame column like a dictionary entry. Instead of making a DataFrame, make a dictionary and store all the dataframes in it.
Here is the improved code; choose the year carefully:
import numpy as np
import pandas as pd
from pandas_datareader import data as wb
import matplotlib.pyplot as plt
from datetime import datetime as dt
tickers = ['PG', 'MSFT', 'F', 'GE']
mydata = {}
for t in tickers:
    mydata[t] = wb.DataReader(t, data_source='yahoo', start=dt(2019, 1, 1), end=dt.now())
Output
mydata['PG']
High Low Open Close Volume Adj Close
Date
2018-12-31 92.180000 91.150002 91.629997 91.919998 7239500.0 88.877655
2019-01-02 91.389999 89.930000 91.029999 91.279999 9843900.0 88.258835
2019-01-03 92.500000 90.379997 90.940002 90.639999 9820200.0 87.640022
2019-01-04 92.489998 90.370003 90.839996 92.489998 10565700.0 89.428787
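If a single table of closing prices is still wanted afterwards, a short follow-up sketch (reusing the mydata dictionary built above) concatenates the 'Close' column of each ticker:
import pandas as pd

# take the 'Close' series from each per-ticker dataframe and align them by date
close_prices = pd.concat({t: frame['Close'] for t, frame in mydata.items()}, axis=1)
print(close_prices.head())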
I am trying to search a pandas df I made which has a tuple as an index. The first part of the tuple is a date and the second part is a forex pair. I've tried a few things but I can't seem to search using a date-formatted string as part of a tuple with .loc or .ix
My df looks like this:
Open Close
(11-01-2018, AEDAUD) 0.3470 0.3448
(11-01-2018, AEDCAD) 0.3415 0.3408
(11-01-2018, AEDCHF) 0.2663 0.2656
(11-01-2018, AEDDKK) 1.6955 1.6838
(11-01-2018, AEDEUR) 0.2277 0.2261
Here is the complete code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
forex_11 = pd.read_csv('FOREX_20180111.csv', sep=',', parse_dates=['Date'])
forex_12 = pd.read_csv('FOREX_20180112.csv', sep=',', parse_dates=['Date'])
time_format = '%d-%m-%Y'
forex = forex_11.append(forex_12, ignore_index=False)
forex['Date'] = forex['Date'].dt.strftime(time_format)
GBP = forex[forex['Symbol'] == "GBPUSD"]
forex.index = list(forex[['Date', 'Symbol']].itertuples(index=False, name=None))
forex_open_close = pd.DataFrame(np.array(forex[['Open','Close']]), index=forex.index)
forex_open_close.columns = ['Open', 'Close']
print(forex_open_close.head())
print(forex_open_close.ix[('11-01-2018', 'GBPUSD')])
How do I get the row which has index ('11-01-2018', 'GBPUSD') ?
Can you try putting the tuple in a list using brackets?
Like this:
print(forex_open_close.ix[[('11-01-2018', 'GBPUSD')]])
I would recommend using the Pandas multiIndex. In your case you could do the following:
tuples = list(data[['Date', 'Symbol']].itertuples(index=False, name=None))
data.index = pd.MultiIndex.from_tuples(tuples, names=['Date', 'Symbol'])
# And then to index
data.loc[('11-01-2018', 'AEDCAD')]
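As a usage note, .ix was removed in pandas 1.0, so the lookup the question asks for can be sketched with .loc on the forex_open_close frame built above; promoting the tuple index to a MultiIndex is my addition:
import pandas as pd

# the existing index is a plain Index of tuples; promote it to a MultiIndex
forex_open_close.index = pd.MultiIndex.from_tuples(
    forex_open_close.index, names=['Date', 'Symbol']
)

# .loc with a tuple selects a single row of a MultiIndex
print(forex_open_close.loc[('11-01-2018', 'GBPUSD')])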
I am trying to combine a series of stock tick data based on the dates.
But it won't work. Please help.
import pandas as pd
import tushare as ts
def get_all_tick(stockID):
    dates = pd.date_range('2016-01-01', periods=5, freq='D')
    append_data = []
    for i in dates:
        stock_tick = pd.DataFrame(ts.get_tick_data(stockID, date=i))
        stock_tick.sort('volume', inplace=True, ascending=False)
        stock_tick = stock_tick[:10]
        stock_tick.sort('time', inplace=True, ascending=False)
        append_data.append(stock_tick.iterrows())
get_all_tick('300243')
I figured it out myself.
def get_all_tick(stockID):
    .........
    df = pd.DataFrame()
    for i in get_date:
        stock_tick = ts.get_tick_data(stockID, date=i)
        stock_tick['Date'] = i
        stock_tick.sort('volume', inplace=True, ascending=False)
        stock_tick = stock_tick[:10]
        stock_tick.sort('time', inplace=True, ascending=False)
        df = df.append(stock_tick)
    df.to_excel('tick.xlsx', sheet_name='Sheet1')
get_all_tick('300243')
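On current pandas (DataFrame.append was deprecated in 1.4 and removed in 2.0), the same aggregation can be sketched with a list and pd.concat; the tushare call is kept from the snippet above, while sort_values and the empty-result guard are my assumptions:
import pandas as pd
import tushare as ts

def get_all_tick(stock_id, dates):
    frames = []
    for day in dates:
        stock_tick = ts.get_tick_data(stock_id, date=day)   # one dataframe of ticks per day
        if stock_tick is None or stock_tick.empty:           # skip days with no data
            continue
        stock_tick['Date'] = day
        # DataFrame.sort was removed long ago; sort_values is the current API
        stock_tick = stock_tick.sort_values('volume', ascending=False).head(10)
        stock_tick = stock_tick.sort_values('time', ascending=False)
        frames.append(stock_tick)
    return pd.concat(frames, ignore_index=True)

df = get_all_tick('300243', pd.date_range('2016-01-01', periods=5, freq='D'))
df.to_excel('tick.xlsx', sheet_name='Sheet1')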
Does anyone know if you can get the 52 week high in pandas from either yahoo or google finance? Thanks.
It is possible; check out the pandas documentation. Here's an example (note that pandas.io.data has since been split out into the separate pandas_datareader package):
import pandas.io.data as web
import datetime
symbol = 'aapl'
end = datetime.datetime.now()
start = end - datetime.timedelta(weeks=52)
df = web.DataReader(symbol, 'yahoo', start, end)
highest_high = df['High'].max()
One can also use yfinance (data from Yahoo):
pip install yfinance
import yfinance as yf
stock = "JNJ"
dataframe = yf.download(stock, period="1y", auto_adjust=True, prepost=True, threads=True)
max = dataframe['High'].max()
You could also use other libraries such as yahoo_fin. This one sometimes works better; it depends on what you want to do, but it's good to keep other possibilities in mind :)
import yfinance as yf
import yahoo_fin.stock_info as si
stock = 'AAPL'
df = yf.download(stock, period="1y")
print("$",round(df['High'].max(), 2))
df2 = si.get_data(stock, interval="1mo")
print("$",round(df2['high'].tail(12).max(), 2))
Output:
$ 182.94
$ 182.94
You can use the info attribute to return lots of aggregated data such as the P/E ratio, 52-week high, etc. It comes back as a dictionary:
import yfinance as yf

ticker = "AAPL"
data = yf.Ticker(ticker).info          # info is a plain dictionary
print(data["fiftyTwoWeekHigh"])
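If the info endpoint is temporarily unavailable (it relies on Yahoo's web data and breaks from time to time), the figure can also be approximated from a year of price history, as in the earlier answers; a minimal sketch:
import yfinance as yf

hist = yf.Ticker("AAPL").history(period="1y")   # one year of daily bars
print(round(hist['High'].max(), 2))             # 52-week high from the raw highs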