Plotting multilayered grouped df with xlim

I have hourly values for 2019 in a DataFrame with a time index. When I restrict the x-axis of my plot to a subset of dates (January only), the plot goes blank.
I group a specific column, 'SE3', along the row axis by the month and hour of the time index. The grouping looks good.
Now I want to plot. The plot looks plausible, but I want to zoom in on one month only. Based on another post on Stack Overflow, I use set_xlim.
After that, my plot does not show anything.
#Grouping of DF
df['SE3'].groupby([df.index.month, df.index.hour]).mean().round(2).head()
Picture of grouped DF1
#Plotting and setting new, shorter in time x-axis
ax=df['SE3'].groupby([df.index.month, df.index.hour]).mean().round(2).plot()
ax.set_xlim(pd.Timestamp('2019-01-01 01:00:00'), pd.Timestamp('2019-01-31 23:00:00'))
The expected result is the same plot, but now showing only January. Instead, the graph goes blank. However, the Out cell shows
(737060.0416666666, 737090.9583333334), which seems to be date data.
[Image: plot without set_xlim]
[Image: plot with set_xlim (empty)]
My final aim, once I understand why my plot is blank, is to show hourly averages for each month, like this:
[Image: hourly averages for each month]
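A likely explanation, sketched under assumptions: after grouping by month and hour, the Series no longer has a time index but a two-level (month, hour) index, so pandas draws it against positions 0 to 287. Timestamp limits passed to set_xlim are converted to day numbers (the 737060 values shown above) and fall far outside that range, which would leave the axes empty. The sketch below uses made-up random data in place of the real 'SE3' column; the names grouped, january and by_month are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up hourly data for 2019, standing in for the real 'SE3' column.
idx = pd.date_range("2019-01-01 00:00", "2019-12-31 23:00", freq="h")
rng = np.random.default_rng(0)
df = pd.DataFrame({"SE3": rng.random(len(idx))}, index=idx)

# Same grouping as in the question: the result is indexed by (month, hour).
grouped = df["SE3"].groupby([df.index.month, df.index.hour]).mean().round(2)

# Zoom in on January by selecting month 1 from the first index level,
# instead of passing Timestamps to set_xlim.
january = grouped.loc[1]  # Series indexed by hour 0..23

# For the final aim: one line per month, hours on the x-axis.
by_month = grouped.unstack(level=0)  # rows: hour, columns: month

fig, (ax_jan, ax_all) = plt.subplots(1, 2, figsize=(12, 4))
january.plot(ax=ax_jan, title="January")
by_month.plot(ax=ax_all, title="Hourly mean per month")
```

Selecting on the index keeps the x-axis in hours, so no date limits are needed. In an interactive session the `matplotlib.use("Agg")` line can be dropped and `plt.show()` called at the end.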

Related

Setting custom tooltip on 3d surface plot in plotly python

I’ve been trying for a while to set custom tooltips on a 3d surface plot, but cannot figure it out. I can do something very simple, like make the tooltip the same for each point, but I’m having trouble putting different values for each point in the tooltip, when the fields aren't being graphed.
In my example, I have a dataset of 53 rows (weeks) and 7 columns (days of the week) that I’m graphing on a 3d surface plot, by passing the dataframe in the Z parameter. It’s a year’s worth of data, so each day has its own numeric value that’s being graphed. I’m trying to label each point with the actual date (hence the custom tooltip, since I'm not passing the date itself to the graph), but cannot seem to align the tooltip values correctly.
I tried a simple example to create a "tooltip array" of the same shape as the dataframe, but when I test whether I’m getting the shape right, by using a repeated word, I get an even weirder error where it uses the character values in the word as tooltips (e.g., c or _). Does anyone have any thoughts or suggestions? I can post more code, but tried to replicate my error with a simpler example.
labels=np.array([['test_label']*7]*53)
fig = go.Figure(data=[
go.Surface(z=Z, text=labels, hoverinfo='text'
)],)
fig.show()

I created sample data similar to the data in your image: a data frame with randomly generated values for each date of one year, plus the week number and the day of the week, pivoted into the Z data. I also added a date-only column and pivoted it the same way to build the labels, so the hover text displays the date.
import numpy as np
import plotly.graph_objects as go
import pandas as pd
df = pd.DataFrame({'date':pd.to_datetime(pd.date_range('2021-01-01','2021-12-31',freq='1d')),'value':np.random.rand(365)})
df['day_of_week'] = df['date'].dt.weekday
df['week'] = df['date'].dt.isocalendar().week
df['date2'] = df['date'].dt.date
Z = df[['week','day_of_week','value']].pivot(index='week', columns='day_of_week')
labels = df[['week','day_of_week','date2']].pivot(index='week', columns='day_of_week').fillna('')
fig = go.Figure(data=[
go.Surface(z=Z,
text=labels,
hoverinfo='text'
)]
)
fig.update_layout(autosize=False, width=800, height=600)
fig.show()

Plot specific markers on plot in Python

I have a plot with time on the x-axis and percentage values on the y-axis. The plot is drawn from a dataframe. As I need to review many plots, it would be good to insert some markers in a different color.
For example, each graph starts the timeline from 08:00 and finishes at 20:00. I would need a red marker at 12:00.
I have tried the following:
graph_df is a df that contains two columns: one with time and one with percentage data.
df = graph_df.loc[graph_df['time'] == "12:00"]
graph_df.plot(x="time", y="percentage", linewidth=1, kind='line')
plt.plot(df['time'], df['percentage'], 'o-', color='red')
plt.show()
plt.savefig(graph_name)
If I am using this section of the code, I am getting the marker at the correct percentage for 12:00, but always at the start of the timeline. In my case, the red dot is marked at 08:00, but with the right percentage associated.
Any idea why it's not correctly marked?
Thank you.

Converting the strings to datetime objects should work.
Replace your first line with the following two lines:
graph_df["time"] = pd.to_datetime(graph_df["time"]).dt.time
df = graph_df[graph_df["time"].apply(lambda time: time.strftime("%H:%M"))=='12:00']
That should do the job.
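A plausible reason for the misplaced dot is that, with string x values, the line is drawn against row positions rather than real times, so a separately plotted "12:00" string ends up at the first position. The sketch below is one way to put both the line and the marker on a shared time axis. It assumes the times are "HH:MM" strings and uses a made-up graph_df; the name noon is illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for graph_df: hourly readings from 08:00 to 20:00.
times = [f"{h:02d}:00" for h in range(8, 21)]
graph_df = pd.DataFrame({"time": times, "percentage": range(10, 75, 5)})

# Parse the strings so both plots share one real time axis.
# The date part defaults to 1900-01-01 and is hidden by the formatter below.
graph_df["time"] = pd.to_datetime(graph_df["time"], format="%H:%M")
noon = graph_df[graph_df["time"].dt.strftime("%H:%M") == "12:00"]

fig, ax = plt.subplots()
ax.plot(graph_df["time"], graph_df["percentage"], linewidth=1)
ax.plot(noon["time"], noon["percentage"], "o", color="red")  # red marker at 12:00
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
```

Because both calls receive datetime values, the marker is positioned by time and not by row order.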

Plot a graph in matplotlib with two different scales on one axis

I'm trying to plot a graph with time data on X-Axis. My data has daily information, but I want to create something that has two different date scales on X-Axis.
I want the axis to start at 2005 and run to 2014 in years, and after 2014 I want the data to continue by the months of 2015. Is this possible? If so, how can I create this kind of plot?
Thanks.
I provided an image below:

Yes, you can. Use the following pattern: since both series have dates on the x-axis, the second one is simply plotted to the right of the first.
For a dataframe:
import numpy
import pandas as pd
import matplotlib.pyplot as plt
data = numpy.array([45,63,83,91,101])
# weekly samples starting in 2005
df1 = pd.DataFrame(data, index=pd.date_range('2005-10-09', periods=5, freq='W'), columns=['events'])
# monthly samples starting in 2015 (newer pandas versions spell this frequency 'ME')
df2 = pd.DataFrame(numpy.arange(10,21,2), index=pd.date_range('2015-01-09', periods=6, freq='M'), columns=['events'])
plt.plot(df1.index, df1.events)
plt.plot(df2.index, df2.events)
plt.show()
You can change the parameters according to your convenience.
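If the two periods really need different tick scales, an alternative sketch (not the approach in the answer above) is to draw two side-by-side axes that share the y-axis, with yearly ticks on the left and monthly ticks on the right. The daily series below is made up, and the names early, late, ax_years and ax_months are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up daily series covering 2005 through 2015.
idx = pd.date_range("2005-01-01", "2015-12-31", freq="D")
values = pd.Series(np.random.default_rng(0).random(len(idx)).cumsum(), index=idx)

early = values.loc[:"2014-12-31"]  # shown with yearly ticks
late = values.loc["2015-01-01":]   # shown with monthly ticks

fig, (ax_years, ax_months) = plt.subplots(
    1, 2, sharey=True, figsize=(12, 4),
    gridspec_kw={"width_ratios": [2, 1], "wspace": 0.03},
)
ax_years.plot(early.index, early.values)
ax_months.plot(late.index, late.values)

# Different date locators and formatters give each panel its own scale.
ax_years.xaxis.set_major_locator(mdates.YearLocator())
ax_years.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))
ax_months.xaxis.set_major_locator(mdates.MonthLocator())
ax_months.xaxis.set_major_formatter(mdates.DateFormatter("%b"))
ax_years.set_xlim(early.index[0], early.index[-1])
ax_months.set_xlim(late.index[0], late.index[-1])
```

The width ratio controls how much horizontal space the 2015 months get relative to the ten earlier years.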

Seaborn Heatmap Not Placing Data on Axes Properly

I'm new to using Seaborn and usually only use Matplotlib.pyplot.
With the recent COVID developments I was asked by a supervisor to put together estimates of how changes to the student population & expenses we need to fund affected student fees (I work in a college budgeting office). I've been able to put together my scenario analysis, but am now trying to visualize these results in a heatmap.
What I'd like to be able to do is have the:
x-axis be my population change rates,
y-axis be my expense change rates,
cmap be my new student fees depending on the x & y axis.
What my code is currently doing is:
x-axis is displaying the new student fee category (not sure how to describe this - see picture)
y-axis is displaying the population change and expense change (population, expenses)
cmap is displaying accurately
Essentially, my code is stacking each scenario on top of the others along the y-axis.
Here is a picture of what is currently being produced, which is not correct:
I've attached a link to a Colab Jupyter notebook with my code, and below is a snippet of the section giving me problems.
# Create Pandas DF of Scenario Analysis
df = pd.DataFrame(list(zip(Pop, Exp, NewStud, NewTotal)),
index = [i for i in range(0,len(NewStud))],
columns=['Population_Change', 'Expense_Change', 'New_Student_Activity_Fee', 'New_Total_Fee'])
# Group this scenario analysis
df = df.groupby(['Population_Change', 'Expense_Change'], sort=False).max()
# Create Figure
fig = plt.figure(figsize=(15,8))
ax = plt.subplot(111)
# Drop New Student Activity Fee Column. Analyze Only New Total Fee
df = df.drop(['New_Student_Activity_Fee'], axis=1)
########################### Not Working As Desired
sb.heatmap(df)
###########################

Your DataFrame is not in the right shape for seaborn.heatmap(). For example, as a result of the groupby operation, you have Population_Change and Expense_Change as a MultiIndex, which would only be used for labelling by the plotting function.
So instead of the groupby, first drop the superfluous column, and then do this:
df = df.pivot(index='Expense_Change', columns='Population_Change', values='New_Total_Fee')
Then seaborn.heatmap(df) should work as expected.
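To illustrate the reshaping step, here is a minimal sketch with made-up numbers. It only shows what the pivoted table looks like; the column names follow the question, while the values and the name grid are invented for illustration.

```python
import pandas as pd

# Made-up scenario table: one row per (population change, expense change) pair.
df = pd.DataFrame({
    "Population_Change": [-0.1, -0.1, 0.0, 0.0, 0.1, 0.1],
    "Expense_Change": [0.0, 0.05, 0.0, 0.05, 0.0, 0.05],
    "New_Total_Fee": [110.0, 116.0, 100.0, 105.0, 91.0, 95.0],
})

# Long format -> 2-D grid: rows become the y-axis, columns the x-axis,
# and the cell values are what the heatmap colors encode.
grid = df.pivot(index="Expense_Change", columns="Population_Change",
                values="New_Total_Fee")
print(grid)
```

Passing grid to seaborn.heatmap would then put population change on the x-axis and expense change on the y-axis. Note that pivot requires each (index, column) pair to be unique; if scenarios repeat, pivot_table with aggfunc="max" would mirror the max() in the original groupby.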

Extra set of bars on plot in Pandas?

I want to create a plot using Pandas to show the standard deviations of item prices on specific week days (in my case there are 6 relevant days of the week, each shown as 0-5 on the x axis).
It seems to work; however, next to each standard deviation bar there is another, smaller bar whose value is literally 0-5.
I think this means that I'm accidentally also plotting the day of the week.
How can I get rid of these smaller bars and only show the standard deviation bars?
sales_std=sales_std[['WeekDay','price']].groupby(['WeekDay']).std().reset_index()
Here is where I try to plot the graph:
p = sales_std.plot(figsize=(15,5), legend=False, kind="bar", rot=45,
                   color="orange", fontsize=16, yerr=sales_std);
p.set_title("Standard Deviation", fontsize=18);
p.set_xlabel("WeekDay", fontsize=18);
p.set_ylabel("Price", fontsize=18);
p.set_ylim(0,100);
Resulting Bar Plot:

You are plotting both WeekDay and price at the same time (i.e. plotting the entire DataFrame). To show bars for price only, plot the price Series with WeekDay as its index (so no reset_index() is required after groupby()).
# you don't need `reset_index()` in your code
sales_std=sales_std[['WeekDay','price']].groupby(['WeekDay']).std()
sales_std['price'].plot(kind='bar')
Note: I intentionally omitted graph-styling parts of your code to focus on fixing the issue.
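The difference can be checked on a small made-up table. In this sketch the data and the names sales, with_reset and without_reset are illustrative, not taken from the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Made-up prices for three week days.
sales = pd.DataFrame({
    "WeekDay": [0, 0, 1, 1, 2, 2],
    "price": [10.0, 14.0, 20.0, 26.0, 5.0, 5.0],
})

# reset_index() turns WeekDay back into a numeric column, so a bar plot of
# the whole frame draws a WeekDay bar next to each price bar.
with_reset = sales[["WeekDay", "price"]].groupby("WeekDay").std().reset_index()
print(list(with_reset.columns))

# Without reset_index(), WeekDay stays in the index and only price is a column.
without_reset = sales[["WeekDay", "price"]].groupby("WeekDay").std()
print(list(without_reset.columns))

# Plot the price Series alone: one bar per week day.
ax = without_reset["price"].plot(kind="bar", color="orange", rot=45)
ax.set_xlabel("WeekDay")
ax.set_ylabel("Price")
```

The styling arguments from the question (figsize, fontsize, title, ylim) can be added back to this call unchanged.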
