Is there a replacement for this syntax? - python

With the latest pandas update I can't use this function, which I used in a DataCamp course (dayofweek doesn't seem to exist anymore):
days_of_week = pd.get_dummies(dataframe.index.dayofweek,
prefix='weekday',
drop_first=True)
How can I change the syntax of my 'formula' to get the same results?
Sorry about the silly question, but I've spent a lot of time on this and I'm stuck...
Thanks in advance!
I already tried passing just the dataframe index, but get_dummies then doesn't give me the days of the week.
I also tried a DatetimeIndex, but I'm getting the formulation wrong there as well.
The dataframe is fairly big, and I need the output to give me the weekdays because I'm dealing with stock prices.

Try weekday instead of dayofweek.
So
days_of_week = pd.get_dummies(dataframe.index.weekday,
prefix='weekday',
drop_first=True)
See docs below:
pandas.Series.dt.weekday
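A minimal sketch of the suggestion above, under the assumption that the index holds date strings rather than timestamps. The weekday attribute is only available on a DatetimeIndex, so the sketch converts the index with pd.to_datetime first. The price values and the "close" column are made up for illustration.

```python
import pandas as pd

# Hypothetical stock-price frame; the index holds date strings, not timestamps.
dataframe = pd.DataFrame(
    {"close": [101.2, 102.5, 101.9, 103.0, 104.1]},
    index=["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04", "2024-01-05"],
)

# Convert the index to a DatetimeIndex so that .weekday is available.
dataframe.index = pd.to_datetime(dataframe.index)

# weekday is 0 for Monday through 6 for Sunday; drop_first removes the first level.
days_of_week = pd.get_dummies(dataframe.index.weekday,
                              prefix="weekday",
                              drop_first=True)

# get_dummies returns a fresh integer index here, so reattach the dates.
days_of_week.index = dataframe.index
print(days_of_week)
```

With Monday to Friday data, the Monday column is the dropped baseline, so a Monday row has no indicator set.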

Related

Pandas convert date

Hey guys, I need some help: my dates are not in the correct format.
I wrote a function to convert all the date columns. It works, but it raises a SettingWithCopyWarning.
(https://i.stack.imgur.com/5xlT4.png)
(https://i.stack.imgur.com/hZe9f.png)
(https://i.stack.imgur.com/iglZB.png)
Can you tell me how to solve this? I've tried several approaches.
If your code is working and doing its job, you can always silence the warning by adding this at the top. I would not recommend it in a very large project.
import warnings
from pandas.errors import SettingWithCopyWarning  # pandas < 1.5: from pandas.core.common
warnings.simplefilter(action="ignore", category=SettingWithCopyWarning)
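Rather than silencing the warning, it can often be avoided. A minimal sketch, assuming the warning comes from assigning into a filtered slice of another DataFrame (the screenshots are not reproduced here, so the frame and column names below are hypothetical):

```python
import pandas as pd

# Hypothetical raw data with dates stored as strings.
raw = pd.DataFrame({
    "id": [1, 2, 3],
    "date": ["2021-01-05", "2021-02-10", "2021-03-15"],
})

# Take an explicit copy of the filtered rows, so later assignments
# clearly target a new frame instead of a view of `raw`.
subset = raw[raw["id"] > 1].copy()

# Convert the date column on the copy.
subset["date"] = pd.to_datetime(subset["date"])
print(subset.dtypes)
```

The original frame is left untouched, so its date column still holds strings.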

Python / Bloomberg api - historical security data incl. weekends (sat. and sun.)

I'm currently working with blpapi and trying to get a bdh of an index including weekends. (I'll later need to match this df with another date vector.)
I'm already using
con.bdh([Index],['PX_LAST'],'19910102', today.strftime('%Y%m%d'), [("periodicitySelection", "DAILY")])
but this returns only weekdays (Mon to Fri). I know how this works in Excel with the Bloomberg function builder, but I'm not sure about the wording within blpapi.
Since I always need the first of each month,
con.bdh([Index],['PX_LAST'],'19910102', today.strftime('%Y%m%d'), [("periodicitySelection", "MONTHLY")])
won't work either, because it returns the 28th, 30th, 31st and so on.
Can anyone help here? Thanks!
You can use a combination of:
"nonTradingDayFillOption", "ALL_CALENDAR_DAYS" # include all days
"nonTradingDayFillMethod", "PREVIOUS_VALUE" # fill non-trading days with previous value
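One way these two options might be combined with the call from the question, as a sketch only. It assumes con is the already-open connection object from the question and that its bdh method accepts the options as a list of (name, value) tuples in the fifth position, as the question's own call does. The helper name fetch_all_calendar_days is hypothetical.

```python
from datetime import date

# Request options: daily data, all calendar days, gaps filled with the last value.
ALL_DAYS_OPTIONS = [
    ("periodicitySelection", "DAILY"),
    ("nonTradingDayFillOption", "ALL_CALENDAR_DAYS"),
    ("nonTradingDayFillMethod", "PREVIOUS_VALUE"),
]


def fetch_all_calendar_days(con, index, start="19910102"):
    """Request PX_LAST for every calendar day from `start` until today.

    `con` is assumed to be the connection object used in the question;
    nothing here opens or configures a Bloomberg session.
    """
    end = date.today().strftime("%Y%m%d")
    return con.bdh([index], ["PX_LAST"], start, end, ALL_DAYS_OPTIONS)
```

If the result comes back with a DatetimeIndex, the first of each month could then be picked out of the daily rows with something like df[df.index.day == 1].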

Converting Datetimeindex of a dataframe to week numbers

I am very new to Python and cannot seem to solve this problem on my own. I have a dataset that I already converted to a pandas DataFrame. It has a DatetimeIndex in the format yyyy-mm-dd HH:MM:SS, with one timestamp per minute. The attached figure shows the already interpolated dataframe.
Now I want to convert the DatetimeIndex to week numbers so that I can plot HVAC Actual, Chiller power, etc. against their week number. The index was already set to the time, but I got an error saying that 'Time' was not found in the columns. I tried to set the index again as in the code below and then create a new column using dt.week:
building_interpolated = building_interpolated.set_index('Time')
building_interpolated['Week number'] = building_interpolated['Time'].dt.week
If I am correct, this should create a new column called Week number containing the week number. However, I still get an error saying that ['Time'] is not in the columns (see figure below).
Anyone who can help me?
Regards, nooby Boaz ;)
df.index = df.index.to_series().dt.isocalendar().week
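The error in the question arises because, once 'Time' is the index, it is no longer a column, so building_interpolated['Time'] cannot be found. The line above replaces the index with the week numbers. If the timestamps should be kept, a variant that adds a separate column could look like the sketch below. The sample values are made up; isocalendar() is used because dt.week was deprecated in its favour.

```python
import pandas as pd

# Hypothetical stand-in for the interpolated frame, indexed by 'Time'.
building_interpolated = pd.DataFrame(
    {"HVAC Actual": [10.0, 12.5, 11.0]},
    index=pd.DatetimeIndex(
        ["2021-01-04 00:00:00", "2021-01-11 00:01:00", "2021-01-18 00:02:00"],
        name="Time",
    ),
)

# Read the ISO week number from the index itself and keep the timestamps.
building_interpolated["Week number"] = building_interpolated.index.isocalendar().week

# Example follow-up: average each measurement per week for plotting.
weekly_mean = building_interpolated.groupby("Week number").mean()
print(weekly_mean)
```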

python groupby to dataframe (just groupby to data no additional functions) to export to excel

I am at a total loss as to why this is impossible to find but I really just want to be able to groupby and then export to excel. Don't need counts, or sums, or anything else and can only find examples including these functions. Tried removing those functions and the whole code just breaks.
Anyways:
Have a set of monthly metrics - metric name, volumes, date, productivity, and fte need. Simple calcs got the data looking nice, good to go. Currently it is grouped in 1 month sections so all metrics from Jan are one after the other etc. Just want to change the grouping so first section is individual metrics from Jan to Dec and so on for each one.
Initial data I want to export to excel (returns not a dataframe error)
dfcon = pd.concat([PmDf,ReDf])
dfcon['Need'] = dfcon['Volumes'] / (dfcon['Productivity']*21*8*.80)
dfcon[['Date','Current Team','Metric','Productivity','Volumes','Need']]
dfg = dfcon.groupby(['Metric','Date'])
dfg.to_excel(r'S:\FilePATH\GroupBy.xlsx', sheet_name='pandas_group', index = 0)
The error I get here is: 'DataFrameGroupBy' object has no attribute 'to_excel' (I have tried a variety of conversions to dataframes and closest I can get is a correct grouping displaying counts only for each one, which I do not need in the slightest)
I have also tried:
dfcon.sort('Metric').to_excel(r'S:\FILEPATH\Grouped_Output.xlsx', sheet_name='FTE Need', index = 0)
this returns the error: AttributeError: 'DataFrame' object has no attribute 'sort'
Any help you can give to get this to be able to be exported grouped in excel would be great. I am at my wits end here after over an hour of googling. I am also self taught so feel like I may be missing something very, very basic/simple so here I am!
Thank you for any help you can provide!
Ps: I know I can just sort after in excel but would rather learn how to make this work in python!
I am pretty sure sort() no longer exists on DataFrames. Try sort_values() instead.
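A minimal sketch of that suggestion, assuming the goal is simply to reorder the rows so that each metric's months sit together. A groupby object has no to_excel method, but a sorted DataFrame does. The sample rows are made up and the output file name is hypothetical.

```python
import pandas as pd

# Hypothetical monthly metrics, initially ordered month by month.
dfcon = pd.DataFrame({
    "Metric": ["B", "A", "B", "A"],
    "Date": pd.to_datetime(["2021-01-01", "2021-01-01", "2021-02-01", "2021-02-01"]),
    "Volumes": [10, 20, 30, 40],
})

# Sort by metric first, then by date within each metric.
dfsorted = dfcon.sort_values(["Metric", "Date"]).reset_index(drop=True)
print(dfsorted)

# The sorted frame can then be exported; this needs an Excel writer
# such as openpyxl to be installed, so it is left commented out here.
# dfsorted.to_excel("Grouped_Output.xlsx", sheet_name="FTE Need", index=False)
```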

Finding time difference between columns

I am currently working with a dataset which has two DateTime columns: ACTUAL_SHIPMENT_DTM and SHIPMENT_CONFIRMED_DTM.
I am trying to find the difference in time between the two columns. I have tried the following code but the output is giving me the time difference of one column based on the rows. Basically I want a new column to be populated with the time difference of (ACTUAL_SHIPMENT_DTM - SHIPMENT_CONFIRMED_DTM).
Golden['Cycle_TIme'] = Golden.groupby('ACTUAL_SHIPMENT_DTM')['SHIPMENT_CONFIRMED_DTM'].diff().dt.total_seconds()
Can anyone see errors in my code or guide me to proper documentation?
Lol, I underestimated myself and asked a question way too soon. Well, if anyone wants to know how to find the time difference between two columns, here is my example code (Golden is the DataFrame):
Golden['Cycle_TIme'] = Golden["SHIPMENT_CONFIRMED_DTM"] - Golden["ACTUAL_SHIPMENT_DTM"]
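A self-contained sketch of the same idea with made-up timestamps. It subtracts in the order stated in the question (ACTUAL_SHIPMENT_DTM minus SHIPMENT_CONFIRMED_DTM); swapping the operands, as in the line above, only flips the sign. The new column names are illustrative.

```python
import pandas as pd

# Hypothetical shipments; both columns are parsed as datetimes.
Golden = pd.DataFrame({
    "ACTUAL_SHIPMENT_DTM": pd.to_datetime(
        ["2021-03-01 12:00:00", "2021-03-02 09:30:00"]),
    "SHIPMENT_CONFIRMED_DTM": pd.to_datetime(
        ["2021-03-01 10:00:00", "2021-03-02 09:00:00"]),
})

# Row-wise difference between the two columns gives a timedelta column.
Golden["Cycle_Time"] = Golden["ACTUAL_SHIPMENT_DTM"] - Golden["SHIPMENT_CONFIRMED_DTM"]

# Convert the timedeltas to seconds if a plain number is needed.
Golden["Cycle_Time_Seconds"] = Golden["Cycle_Time"].dt.total_seconds()
print(Golden)
```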
