Adding specific dots to a series plot in Python


I have a time series plot and I would like to add a red dot at specific time indexes. Below is some sample code:
dt_index = pd.to_datetime(['2020-01-01','2020-02-01','2020-03-01','2020-04-01','2020-05-01'])
series = pd.Series([1.1,2.2,3.3,4.5,6.7], index = dt_index)
dots_to_add = pd.to_datetime(['2020-01-01','2020-04-01'])
series.plot()
Using dots_to_add as an index, how would I add a red dot to the line?

A dot in a plot is called a marker.
import pandas as pd
import matplotlib.pyplot as plt
dt_index = pd.to_datetime(['2020-01-01','2020-02-01','2020-03-01','2020-04-01','2020-05-01'])
series = pd.Series([1.1,2.2,3.3,4.5,6.7], index = dt_index)
dots_to_add = pd.to_datetime(['2020-01-01','2020-04-01'])
series.plot(marker='o')
plt.show()
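With marker='o' alone, every point gets a marker in the same color as the line. The marker color can be set separately (matplotlib lines accept markerfacecolor and markeredgecolor, and pandas forwards extra keyword arguments to matplotlib), but that still draws a marker on every point.
To control the dots independently of the line, you can draw a scatter plot on top of it instead: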
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
dt_index = pd.to_datetime(['2020-01-01','2020-02-01','2020-03-01','2020-04-01','2020-05-01'])
series = pd.Series([1.1,2.2,3.3,4.5,6.7], index = dt_index)
dots_to_add = pd.to_datetime(['2020-01-01','2020-04-01'])
series.plot()
plt.scatter(series.index, series, color='r')
plt.show()
If you only want dots at the positions listed in dots_to_add, you can loop over it and call plt.scatter() once per dot:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
dt_index = pd.to_datetime(['2020-01-01','2020-02-01','2020-03-01','2020-04-01','2020-05-01'])
series = pd.Series([1.1,2.2,3.3,4.5,6.7], index = dt_index)
dots_to_add = pd.to_datetime(['2020-01-01','2020-04-01'])
series.plot()
for dot in dots_to_add:
    plt.scatter(dot, series[dot], color='r')
plt.show()
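As a possible alternative to the loop above, the red dots can be drawn in the same call that draws the line. This is a minimal sketch, assuming that pandas forwards the markevery, markerfacecolor and markeredgecolor keyword arguments to matplotlib unchanged; the name positions is introduced here for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit when working interactively
import matplotlib.pyplot as plt
import pandas as pd

dt_index = pd.to_datetime(['2020-01-01', '2020-02-01', '2020-03-01', '2020-04-01', '2020-05-01'])
series = pd.Series([1.1, 2.2, 3.3, 4.5, 6.7], index=dt_index)
dots_to_add = pd.to_datetime(['2020-01-01', '2020-04-01'])

# Integer positions of the requested timestamps within the series index
# ("positions" is a hypothetical helper name, not part of the original code).
positions = series.index.get_indexer(dots_to_add).tolist()

# One call: the line keeps its default color, red markers appear only at `positions`.
ax = series.plot(marker='o', markerfacecolor='r', markeredgecolor='r',
                 markevery=positions)
fig = ax.get_figure()
```

When running interactively, drop the matplotlib.use("Agg") line and finish with plt.show().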

Related

Bar plot for multidimensional columns using pandas

I want to plot my dataframe (df) as a bar plot based on the time columns, where each bar represents the value_counts() of each letter that appears in the column.
Expected output: (image not included)
date,00:00:00,01:00:00,02:00:00,03:00:00,04:00:00
2002-02-01,Y,Y,U,N,N
2002-02-02,U,N,N,N,N
2002-02-03,N,N,N,N,N
2002-02-04,N,N,N,N,N
2002-02-05,N,N,N,N,N
When I select an individual time column, I can do it as below:
import pandas as pd
import numpy as np
from datetime import datetime
import matplotlib.pyplot as plt
df = pd.read_csv('df.csv')
df = df['04:00:00'].value_counts()
df.plot(kind='bar')
plt.show()
How can I plot all the columns on the same bar plot, as shown in the expected output?

One possible solution is:
pd.DataFrame({t: df[t].value_counts() for t in df.columns if t != "date"}).T.plot.bar()
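To see what the one-liner above plots, it may help to look at the intermediate table it builds. The following is a small sketch using the sample data from the question; the names csv_text and counts are introduced here for illustration, and the fillna(0) step is an addition that replaces the NaN entries produced when a letter never occurs in a column.

```python
from io import StringIO

import pandas as pd

csv_text = """date,00:00:00,01:00:00,02:00:00,03:00:00,04:00:00
2002-02-01,Y,Y,U,N,N
2002-02-02,U,N,N,N,N
2002-02-03,N,N,N,N,N
2002-02-04,N,N,N,N,N
2002-02-05,N,N,N,N,N"""
df = pd.read_csv(StringIO(csv_text))

# One value_counts() Series per time column; the dict-of-Series constructor
# aligns them on the letters, and .T puts the time columns on the rows.
counts = pd.DataFrame({t: df[t].value_counts() for t in df.columns if t != "date"}).T

# Letters that never occur in a column come out as NaN; treat them as zero.
counts = counts.fillna(0).astype(int)
print(counts)
```

Calling counts.plot.bar() on this table then draws one group of bars per time column, with one bar per letter.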

Here is an approach via seaborn's catplot:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
from io import StringIO
df_str = '''date,00:00:00,01:00:00,02:00:00,03:00:00,04:00:00
2002-02-01,Y,Y,U,N,N
2002-02-02,U,N,N,N,N
2002-02-03,N,N,N,N,N
2002-02-04,N,N,N,N,N
2002-02-05,N,N,N,N,N'''
df = pd.read_csv(StringIO(df_str))
df_long = df.set_index('date').melt(var_name='hour', value_name='kind')
g = sns.catplot(kind='count', data=df_long, x='kind', palette='mako',
                col='hour', col_wrap=5, height=3, aspect=0.5)
for ax in g.axes.flat:
    ax.set_xlabel(ax.get_title())  # use the title as xlabel
    ax.grid(True, axis='y')
    ax.set_title('')
    if len(ax.get_ylabel()) == 0:
        sns.despine(ax=ax, left=True)  # remove left axis for interior subplots
        ax.tick_params(axis='y', size=0)
plt.tight_layout()
plt.show()

Avoiding overlapping plots in seaborn bar plot

I have the following code where I am trying to plot a bar plot in seaborn. (This is sample data; both the x and y variables are continuous.)
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
xvar = [1,2,2,3,4,5,6,8]
yvar = [3,6,-4,4,2,0.5,-1,0.5]
year = [2010,2011,2012,2010,2011,2012,2010,2011]
df = pd.DataFrame()
df['xvar'] = xvar
df['yvar']=yvar
df['year']=year
df
sns.set_style('whitegrid')
fig,ax=plt.subplots()
fig.set_size_inches(10,5)
sns.barplot(data=df,x='xvar',y='yvar',hue='year',lw=0,dodge=False)
It results in the following plot (image not included).
Two questions here:
I want the two bars at x = 2 to be plotted side by side rather than overlapping the way they do now.
In the original data I have a lot of x-labels. Is there a way to set the xticks to a specific frequency? For instance, in the chart above I only want to see 1, 3 and 6 as x-labels.
Note: If I set dodge=True, the bars become very thin with the original data.

For the first question, get the patches of the bar chart and halve the width of the two target patches. The x position of one of them is also shifted by half a bar width so that the two bars sit side by side.
For the second question, pass a list of tick labels, built either by slicing or by hand, with empty strings for the ticks you want to hide.
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
xvar = [1,2,2,3,4,5,6,8]
yvar = [3,6,-4,4,2,0.5,-1,0.5]
year = [2010,2011,2012,2010,2011,2012,2010,2011]
df = pd.DataFrame({'xvar':xvar,'yvar':yvar,'year':year})
fig,ax = plt.subplots(figsize=(10,5))
sns.set_style('whitegrid')
g = sns.barplot(data=df, x='xvar', y='yvar', hue='year', lw=0, dodge=False)
for idx, patch in enumerate(ax.patches):
    current_width = patch.get_width()
    current_pos = patch.get_x()
    if idx == 8 or idx == 15:
        patch.set_width(current_width/2)
        if idx == 15:
            patch.set_x(current_pos+(current_width/2))
ax.set_xticklabels([1,'',3,'','',6,''])
plt.show()
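The hand-written label list above could also be generated from the data, which may help when there are many x values. This is a minimal sketch; the names keep, categories and labels are hypothetical, and it assumes the bars appear in sorted order of the distinct xvar values, as they do in the plot above.

```python
import pandas as pd

xvar = [1, 2, 2, 3, 4, 5, 6, 8]
df = pd.DataFrame({'xvar': xvar})

keep = {1, 3, 6}  # hypothetical: the x values whose labels should stay visible

# Distinct x values in the order assumed for the bars (sorted ascending).
categories = sorted(df['xvar'].unique())

# Keep the label where wanted, use an empty string everywhere else.
labels = [str(x) if x in keep else '' for x in categories]
print(labels)
```

The resulting list can then be passed to ax.set_xticklabels(labels) in place of the manually typed one.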

Percentage in axis y histogram Matplotlib

I have the following plot (image not included).
I need to add the percentage inside each bar (image of the desired result not included).
My code is the following:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sb
pkmn = pd.read_csv('data/Pokemon.csv')
pkmn.head()
order = pkmn['Generation'].value_counts().index
order
pkmngen = pkmn['Generation'].value_counts()
plt.figure(figsize=(6,4))
sb.countplot(data=pkmn, y='Generation', color = sb.color_palette()[4], order=order, )
plt.xticks(rotation=90)
plt.show()
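One possible approach, sketched here with plain matplotlib rather than countplot, is to compute each bar's share of the total and write it inside the bar with Axes.bar_label. This assumes matplotlib 3.4 or newer (where bar_label is available), and the generation Series is made-up data standing in for pkmn['Generation'], since Pokemon.csv is not available here.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit when working interactively
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for pkmn['Generation'].
generation = pd.Series([1, 1, 1, 2, 2, 3, 3, 3, 3, 4])

counts = generation.value_counts()      # sorted from most to least frequent
percent = 100 * counts / counts.sum()   # share of each generation in percent

fig, ax = plt.subplots(figsize=(6, 4))
bars = ax.barh(counts.index.astype(str), counts.values)

# Write the percentage in the middle of each horizontal bar.
ax.bar_label(bars, labels=[f"{p:.1f}%" for p in percent], label_type="center")

ax.invert_yaxis()  # most frequent generation at the top
ax.set_xlabel("count")
ax.set_ylabel("Generation")
```

When running interactively, drop the matplotlib.use("Agg") line and finish with plt.show().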

Seaborn - Display Last Value / Label

I would like to create a plot that displays the last value of each line, but I cannot get the last value onto the chart. Do you have an idea how to solve my problem? Thank you!
Input: a DataFrame and its line plot (images not included).
Desired output: a cross marking the last value of each column (image not included).
# import eikon as ek
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
import os
import seaborn as sns; sns.set()
import pylab
from scipy import *
from pylab import *
fichier = "P:/GESTION_RPSE/GES - Gestion Epargne Salariale/Dvp Python/Florian/Absolute Performance/PLOT.csv"
df = pd.read_csv(fichier)
df = df.drop(columns=['Unnamed: 0'])
# sns.set()
plt.figure(figsize=(16, 10))
df = df.melt('Date', var_name='Company', value_name='Value')
#palette = sns.color_palette("husl",12)
ax = sns.lineplot(x="Date", y="Value", hue='Company', data=df).set_title("LaLaLa")
plt.show()

Do you just want to put an 'X' at the end of your lines?
If so, you could pass markevery=[-1] to the call to lineplot(). However, there are a few caveats:
You have to use style= instead of hue=; otherwise no markers are drawn.
Filled markers work better than unfilled markers (like "x"). You can just use markers=True to get the default markers, or pass a list such as markers=['s', 'd', 'o', ...].
Code:
import matplotlib.pyplot as plt
import seaborn as sns
fmri = sns.load_dataset("fmri")
fig, ax = plt.subplots()
ax = sns.lineplot(x="timepoint", y="signal", style="event", data=fmri,
                  ci=None, markers=True, markevery=[-1], markersize=10)
plt.show()
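The same markevery idea also works with plain matplotlib, and the last value can be written next to the marker with annotate. The following is a minimal sketch under stated assumptions: the small DataFrame and its company columns "A" and "B" are made up as a stand-in for PLOT.csv, and last_values is a name introduced here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit when working interactively
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical wide-format data standing in for PLOT.csv.
df = pd.DataFrame({
    "Date": pd.date_range("2021-01-01", periods=4, freq="D"),
    "A": [1.0, 1.5, 1.2, 1.8],
    "B": [2.0, 1.9, 2.4, 2.1],
})

fig, ax = plt.subplots()
last_values = {}
for company in ["A", "B"]:
    # markevery=[-1] draws the marker on the last point of the line only.
    ax.plot(df["Date"], df[company], marker="X", markevery=[-1], label=company)
    last_values[company] = df[company].iloc[-1]
    # Write the last value just to the right of the final marker.
    ax.annotate(f"{last_values[company]:.1f}",
                xy=(df["Date"].iloc[-1], last_values[company]),
                xytext=(5, 0), textcoords="offset points")
ax.legend()
```

When running interactively, drop the matplotlib.use("Agg") line and finish with plt.show().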

Pandas line plot suppresses half of the xticks, how to stop it?

I am trying to make a line plot in which every element of the index appears as an xtick.
import pandas as pd
ind = ['16-12', '17-01', '17-02', '17-03', '17-04',
'17-05','17-06', '17-07', '17-08', '17-09', '17-10', '17-11']
data = [1,3,5,2,3,6,4,7,8,5,3,8]
df = pd.DataFrame(data,index=ind)
df.plot(kind='line',x_compat=True)
However, the resulting plot skips every second element of the index (image not included).
My call to plot includes the x_compat=True parameter, which the pandas documentation suggests should stop the automatic tick configuration, but it seems to have no effect.

You need to set a ticker locator on the axis and then pass that axis when plotting.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
ind = ['16-12', '17-01', '17-02', '17-03', '17-04',
'17-05','17-06', '17-07', '17-08', '17-09', '17-10', '17-11']
data = [1,3,5,2,3,6,4,7,8,5,3,8]
df = pd.DataFrame(data,index=ind)
ax2 = plt.axes()
ax2.xaxis.set_major_locator(ticker.MultipleLocator(1))
df.plot(kind='line', ax=ax2)
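Another option, sketched below, is to set the tick positions and labels explicitly after plotting. It assumes that pandas plots a string index against the integer positions 0 to len(index) - 1, so one tick per position covers every element.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit when working interactively
import matplotlib.pyplot as plt
import pandas as pd

ind = ['16-12', '17-01', '17-02', '17-03', '17-04',
       '17-05', '17-06', '17-07', '17-08', '17-09', '17-10', '17-11']
data = [1, 3, 5, 2, 3, 6, 4, 7, 8, 5, 3, 8]
df = pd.DataFrame(data, index=ind)

ax = df.plot(kind='line')

# One tick per index element, labelled with the index value itself.
ax.set_xticks(range(len(df.index)))
ax.set_xticklabels(list(df.index))
```

When running interactively, drop the matplotlib.use("Agg") line and finish with plt.show().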
