Plotly scatter plot with past shadow - python

I have a Plotly express animated scatter plot, which plots variable 1 (INFLATION) against variable 2 (GROWTH) for every year (TIME) and every country (LOCATION) and then iterates through the years (that's the animation).
def create_plot(self):
    fig = px.scatter(
        self.yoy2,
        x="GROWTH",
        y="INFLATION",
        color="LOCATION",
        animation_frame="TIME",
        #animation_group="TIME",
        size='GDP',
        text="LOCATION",
        hover_name="LOCATION",
        range_y=[-2,12],
        range_x=[-15,15],
        labels={
            "GROWTH": "Growth measured in QGDP PC_CHGPP",
            "INFLATION": "INFLATION measured in AGRWTH",
            "TIME": "Year"},
        title="Growth versus Inflation plotter for OECD countries.",
        template='none')
I would like to add historic data as shadows to the scatter. So for example there is the scatter of 2021 showing the variables growth and inflation for every country, then I would like to also show the scatter of 2020 with the same colours, and every country point of 2021 connected to their country point in 2020 with a line (so that you see the year over year evolution).
I tried to draw another scatter on top with a shifted dataframe but that did not work.
And sadly there is no function in plotly itself for this.
How could I achieve this?
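One way to approximate this, offered as a sketch rather than a built-in Plotly feature: copy each year's rows into the following year's animation frame, then draw one line per country, so every frame holds the current point and the previous one joined by a segment. The helper names `add_trail` and `create_trail_plot`, the `FRAME` column, and the sample values are hypothetical, and the sketch assumes `TIME` holds integer years.

```python
import pandas as pd


def add_trail(df, lag=1):
    """Return a copy of df in which frame t also holds the rows of years t-lag..t-1.

    Assumes TIME holds integer years. FRAME is a new, hypothetical column
    meant to be used as animation_frame instead of TIME.
    """
    parts = []
    for k in range(lag + 1):
        part = df.copy()
        part["FRAME"] = part["TIME"] + k  # rows of year t also appear in frame t + k
        parts.append(part)
    trail = pd.concat(parts, ignore_index=True)
    # Drop frames that have no "current" year in the data (e.g. last year + 1).
    trail = trail[trail["FRAME"].isin(df["TIME"].unique())]
    # Sort so frames play in order and each country's line runs from old to new.
    return trail.sort_values(["FRAME", "LOCATION", "TIME"]).reset_index(drop=True)


def create_trail_plot(df, lag=1):
    import plotly.express as px  # imported here so add_trail works without Plotly

    fig = px.line(
        add_trail(df, lag),
        x="GROWTH",
        y="INFLATION",
        color="LOCATION",
        line_group="LOCATION",
        animation_frame="FRAME",
        hover_name="LOCATION",
        range_x=[-15, 15],
        range_y=[-2, 12],
    )
    # Show the points as well as the connecting segment, in the initial
    # traces and in every animation frame.
    fig.update_traces(mode="lines+markers")
    for frame in fig.frames:
        for trace in frame.data:
            trace.mode = "lines+markers"
    return fig


# Made-up sample values, only to show the shape of the result.
sample = pd.DataFrame({
    "LOCATION": ["A", "A", "A", "B", "B", "B"],
    "TIME": [2019, 2020, 2021, 2019, 2020, 2021],
    "GROWTH": [1.0, -3.0, 4.0, 2.0, -5.0, 6.0],
    "INFLATION": [1.5, 0.5, 3.0, 2.0, 1.0, 4.0],
})
trail = add_trail(sample)
```

Calling `create_trail_plot(sample).show()` should then animate over `FRAME`. This sketch does not fade the older point; a larger `lag` simply extends the trail by more years.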

Related

Matplotlib plotting data that doesn't exist

I am trying to plot three lines on one figure. I have three years of data for three sites, and I am simply trying to plot them on the same x-axis and y-axis. The first two lines span all three years of data, while the third dataset is usually sparser. Using the object-oriented axes interface of matplotlib, when I plot my third set of data, I get points at the end of the graph that lie outside the range of that dataset. My third dataset is structured as tuples of dates and values, such as:
data=
[('2019-07-15', 30.6),
('2019-07-16', 20.88),
('2019-07-17', 16.94),
('2019-07-18', 11.99),
('2019-07-19', 13.76),
('2019-07-20', 16.97),
('2019-07-21', 19.9),
('2019-07-22', 25.56),
('2019-07-23', 18.59),
...
('2020-08-11', 8.33),
('2020-08-12', 10.06),
('2020-08-13', 12.21),
('2020-08-15', 6.94),
('2020-08-16', 5.51),
('2020-08-17', 6.98),
('2020-08-18', 6.17)]
The data ends in August 2020, yet the graph includes points at the end of 2020. This happens with all my sites; the first two datasets, taken from knowndf below, stay the same throughout.
Here is the problematic graph.
And here is what I have for the plotting:
fig, ax=plt.subplots(1,1,figsize=(15,12))
fig.tight_layout(pad=6)
ax.plot(knowndf['DATE'], knowndf['Value1'],'b',alpha=0.7)
ax.plot(knowndf['DATE'], knowndf['Value2'],color='red',alpha=0.7)
ax.plot(*zip(*data), 'g*', markersize=8) #when i plot this set of data i get nonexistent points
ax.tick_params(axis='x', rotation=45) #rotating for aesthetic
ax.set_xticks(ax.get_xticks()[::30]) #only want every 30th tick instead of every daily tick
I've tried ax.twinx(), but that gives me two y-axes, which doesn't help since I want to use the same x-axis and y-axis for all three sites. I've also tried not using the axes approach, but I need some of the features that come with axes. Please help!
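A likely cause, assuming the dates in both knowndf['DATE'] and data are plain strings: Matplotlib treats strings as categories, so any date in data that does not occur in knowndf['DATE'] is appended as a new category at the right edge of the axis. A minimal sketch of the usual remedy, converting everything to real datetimes before plotting, follows. The stand-in knowndf values are invented, and data is shortened to a few of the tuples shown above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Invented stand-in for the two dense datasets, with dates stored as strings.
known_dates = pd.date_range("2019-07-15", "2020-08-18", freq="D").strftime("%Y-%m-%d")
knowndf = pd.DataFrame({"DATE": known_dates})
knowndf["Value1"] = 10.0
knowndf["Value2"] = 20.0

# A few of the (date string, value) tuples from the question.
data = [("2019-07-15", 30.6), ("2019-07-16", 20.88), ("2020-08-18", 6.17)]

# Convert both date columns to real datetimes so they share one time axis.
knowndf["DATE"] = pd.to_datetime(knowndf["DATE"])
sparse = pd.DataFrame(data, columns=["DATE", "Value"])
sparse["DATE"] = pd.to_datetime(sparse["DATE"])

fig, ax = plt.subplots(1, 1, figsize=(15, 12))
ax.plot(knowndf["DATE"], knowndf["Value1"], "b", alpha=0.7)
ax.plot(knowndf["DATE"], knowndf["Value2"], color="red", alpha=0.7)
ax.plot(sparse["DATE"], sparse["Value"], "g*", markersize=8)

# One tick per month instead of slicing the tick list by position.
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m"))
ax.tick_params(axis="x", rotation=45)
```

With datetime values on the x-axis, each point is placed by its actual date, so the sparse dataset can no longer add positions beyond its last date.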

Plotting multilayered grouped df with xlim

I have a plot with hourly values for 2019. When plotting with a sub-set of dates (January only) on the x-axis, my plot goes blank.
I have a DF that I group on the row-axis based on Months and Hours from the time index, for a specific column 'SE3'. The grouping looks good.
Now, I want to plot. The plot looks potentially good, but I want to zoom in on one month only. Based on another post on stackoverflow, I use set_xlim.
Then my plot does not show anything.
#Grouping of DF
df['SE3'].groupby([df.index.month, df.index.hour]).mean().round(2).head()
Picture of grouped DF1
#Plotting and setting new, shorter in time x-axis
ax=df['SE3'].groupby([df.index.month, df.index.hour]).mean().round(2).plot()
ax.set_xlim(pd.Timestamp('2019-01-01 01:00:00'), pd.Timestamp('2019-01-31 23:00:00'))
The expected result is the same plot, but only for January. Instead the graph goes blank. However, the output shows
(737060.0416666666, 737090.9583333334), which seems to be date data.
Picture without set_xlim
Picture with set_xlim (empty)
My final aim, once I understand why my plot is blank, is to show hourly averages for each month.
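The grouped Series is indexed by (month, hour) pairs rather than by timestamps, so its x-axis runs over row positions and a Timestamp limit lands far outside the data, which would explain the blank plot. A hedged sketch of one alternative: move the month level into columns, so each month becomes its own line over the 24 hours. The synthetic df below is invented for illustration.

```python
import pandas as pd

# Invented hourly data for 2019; SE3 simply equals the hour of day here.
index = pd.date_range("2019-01-01", periods=365 * 24, freq="60min")
df = pd.DataFrame({"SE3": index.hour.to_numpy(dtype=float)}, index=index)

# Mean per (month, hour), then pivot the month level into columns:
# rows are hours 0..23, columns are months 1..12.
hourly = (
    df["SE3"]
    .groupby([df.index.month, df.index.hour])
    .mean()
    .round(2)
    .unstack(level=0)
)
hourly.index.name = "hour"
hourly.columns.name = "month"

# Hourly averages for January only.
january = hourly[1]
```

`hourly.plot()` should then draw one line per month against the hour of day, and `january.plot()` shows January alone without any call to set_xlim.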

How to set day of year as ticklabels for several years?

I'm currently working with some temperature data from a sensor that was active for about 4 months (from December 2018 to March 2019). I'm trying to plot the data; however, my time series currently goes from 350 to 430. How do I make the x-axis ticks start over at 0 once it reaches 365? Or, how can I add ticks that represent months starting at December and going to March?
Current graph:
Assuming matplotlib.pyplot is imported as plt, you can relabel the x-axis ticks so that they wrap around at 365:
xticks = plt.xticks()[0]
plt.xticks(xticks, (xticks % 365).astype(int))
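For the month-based labels asked about, a small sketch is to turn the running day counts into real dates first. It assumes that day 1 is 1 January 2018 and that the count simply keeps running into 2019; the helper name `day_count_to_date` is hypothetical.

```python
from datetime import date, timedelta


def day_count_to_date(day_count, start_year=2018):
    """Map a running day count (1 = 1 January of start_year) to a calendar date."""
    return date(start_year, 1, 1) + timedelta(days=int(day_count) - 1)


# Running counts like those on the current x-axis (350 to 430).
counts = [350, 365, 366, 400, 430]
dates = [day_count_to_date(c) for c in counts]

# Short month/day labels for use as tick labels.
labels = [d.strftime("%b %d") for d in dates]
```

Plotting against `dates` instead of the raw counts lets Matplotlib place month ticks itself, for example with `matplotlib.dates.MonthLocator`.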

Extra set of bars on plot in Pandas?

I want to create a plot using Pandas to show the standard deviations of item prices on specific week days (in my case there are 6 relevant days of the week, each shown as 0-5 on the x axis).
It seems to work; however, there is another set of smaller bars next to each standard deviation bar, with values that are literally 0-5 as well.
I think this means that I'm accidentally also plotting the day of the week.
How can I get rid of these smaller bars and only show the standard deviation bars?
sales_std=sales_std[['WeekDay','price']].groupby(['WeekDay']).std().reset_index()
Here is where I try to plot the graph:
p = sales_std.plot(figsize=(15,5), legend=False, kind="bar", rot=45,
                   color="orange", fontsize=16, yerr=sales_std);
p.set_title("Standard Deviation", fontsize=18);
p.set_xlabel("WeekDay", fontsize=18);
p.set_ylabel("Price", fontsize=18);
p.set_ylim(0,100);
Resulting Bar Plot:
You are plotting both WeekDay and price at the same time (i.e. plotting the entire DataFrame). To show bars for price only, plot the price Series with WeekDay as its index, which means no reset_index() is needed after groupby().
# you don't need `reset_index()` in your code
sales_std=sales_std[['WeekDay','price']].groupby(['WeekDay']).std()
sales_std['price'].plot(kind='bar')
Note: I intentionally omitted graph-styling parts of your code to focus on fixing the issue.

Python Bokeh: How do I keep the y-axis tick marks stable?

Here I've taken some gapminder.org data and looped through the years to create a series of charts (which I converted to an animated gif in imageio) by modifying the Making Interactive Visualizations with Bokeh notebook.
The problem is that when the Middle Eastern countries float to the top in the 1970s, the y-axis tick marks (and the legend) get perturbed. I'm keeping as many things as possible out of the year loop when I build the plots, so my y-axis code looks like this:
# Personal income (GDP per capita)
y_low = int(math.floor(income_df.min().min()))
y_high = int(math.ceil(income_df.max().max()))
y_data_range = DataRange1d(y_low-0.5*y_low, 1000000*y_high)
# ...
for year in columns_list:
    # ...
    # Build the plot
    plot = Plot(
        # Children per woman (total fertility)
        x_range=x_data_range,
        # Personal income (GDP per capita)
        y_range=y_data_range,
        y_scale=LogScale(),
        plot_width=800,
        plot_height=400,
        outline_line_color=None,
        toolbar_location=None,
        min_border=20,
    )
    # Build the axes
    xaxis = LinearAxis(ticker=SingleIntervalTicker(interval=x_interval),
                       axis_label="Children per woman (total fertility)",
                       **AXIS_FORMATS)
    yaxis = LogAxis(ticker=LogTicker(),
                    axis_label="Personal income (GDP per capita)",
                    **AXIS_FORMATS)
    plot.add_layout(xaxis, 'below')
    plot.add_layout(yaxis, 'left')
As you can see, I've bumped up the data range by a factor of 10^6 with no effect. Is there some parameter I need to add to keep my y-axis tick marks (and legend) stable?
Don't use a DataRange1d; that is what actually does the auto-ranging. If you know up front the full range that you always want to show, use a Range1d:
Plot(y_range=Range1d(low, high), ...)
or, for convenience, this will also work:
Plot(y_range=(low, high), ...)
